While I’ve been between full time gigs, I’ve had the chance to do some consulting with a couple of companies, and I wanted to describe a bit of what I’m seeing.
One of the more interesting companies I’ve worked with recently is Rocket Fuel, Inc.
The founders include George John, engineering lead behind Yahoo!’s BT (behavioral targeting); Richard Frankel, product lead behind most of Yahoo!’s targeting products (and an early employee of NetGravity; if you remember that one, you know how long he’s been in this game); and Abhinav Gupta, who built many of the analytic data systems powering BT at Yahoo!.
They very quickly added talent including another co-worker of mine, Jarvis Mak, who built up their analytics and client services team, and he was kind enough to allow me to help out in their NYC office.
What does Rocket Fuel do? We all know that you can use various data points to buy ad inventory off of the exchanges, but just throwing data at the problem isn’t enough. Instead, you need to decide which data is most predictive of the behavior of interest. There are pre-packaged “BT” categories from many vendors, but those are built to be general models. The RF team has built the next generation of those models, using many more data points but also including much more flexibility in their approach. In effect, they build a custom-configured model for each campaign, designed to optimize the advertiser’s specific behavioral goal and to optimize the spend across multiple exchanges and tactics. Their approach also allows for very rapid updating of the models: not just rapid rescoring, but literally rapidly rebuilding to take into account the most recent behavioral data they’ve encountered. That might range from just coefficient updates to actual model vs. model comparisons to account for new variables. Finally, they are hitting moving targets: as the exchanges move to real-time bidding, they can optimize the offer in real time to recognize changes in the market.
There are a few things they are doing these days that exemplify the modern data-marketing approach:
They keep ALL the data. Every impression was stored with every piece of data they could link to it. They did a nice job of using HBase for some processing and a Hadoop/Hive system for the rest. Some modeling was experimented with in R or MATLAB, but the heavy-duty stuff was quickly productionized on Hadoop. And because all the data was there, pretty much any question I had could be answered using all the data that occurred during a campaign without too much waiting for the query to finish. Yes, Hive is a queue-oriented system, but these guys had some great UDFs (user-defined functions) which reduced some of the MapReduce phases, especially around joins. Knowing that all the data was available, it became kind of fun to kick off queries around, say, every cookie that received more than X of our ads over the last month to see just which cookies we were seeing too many of, and start to dig into why. While I’ve had this elsewhere, often the databases were just not designed to deal with this volume… Rocket Fuel was built around it.
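To make the "every cookie that received more than X of our ads" question concrete, here is a minimal sketch in Python. It is not Rocket Fuel's code (theirs ran as Hive queries over far more data); the function name, the `(cookie_id, ad_id)` record layout, and the sample log are all assumptions made up for illustration.

```python
from collections import Counter

def over_exposed_cookies(impressions, max_ads):
    """Return {cookie_id: impression_count} for cookies seen more than max_ads times.

    `impressions` is an iterable of (cookie_id, ad_id) pairs; this layout is a
    hypothetical stand-in for a real impression log.
    """
    counts = Counter(cookie_id for cookie_id, _ad_id in impressions)
    return {cookie: n for cookie, n in counts.items() if n > max_ads}

# Hypothetical impression log for one campaign.
impressions = [
    ("c1", "ad_a"), ("c1", "ad_b"), ("c1", "ad_a"),
    ("c2", "ad_a"),
    ("c3", "ad_b"), ("c3", "ad_b"),
]

heavy = over_exposed_cookies(impressions, max_ads=2)
print(heavy)
```

In Hive terms this is roughly a GROUP BY on the cookie with a HAVING on the count; the interesting part is what you do next, which is dig into why those cookies are being hit so often.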
It’s not enough to just be an automated system when dealing with marketing. The recommendation guys learned this long ago: If you just do recommendations without recognizing the difficulty of interfacing with content systems and supply/inventory systems, you don’t have much success; similarly, most of the successful email marketing companies for the largest brands (e-Dialog, Responsys) have built great service teams on top of their powerful tech. While Rocket Fuel’s success is driven a lot by the effectiveness of their models, different clients need different levels of help with everything from conversion pixel placement to how to understand their results.
Honestly, I was surprised. I expected to find a magic machine pumping out predictive models that automatically scored everything and had great performance and used APIs on the exchanges to just get ads up and out to the right people. The founders would be modeling and coding, and the rest would just be automated. Instead, Jarvis Mak has had to build a good team across the country of account reps who assist with campaign design and delivery, and analysts who assist with strategic consulting, pre- and in-campaign forecasting and evaluation, and post-campaign recommendations. Together, they help the marketers move from tactical to more strategic targeting. There are also people working on deals to get access to content pools at pre-negotiated pricing or that aren’t easily available in the exchanges, and people working on dealing with trafficking issues (can you believe this is still a problem after all of this time?) or data quality. All the tech we have today doesn’t eliminate the need for these services, so if you are building your startup and assume it will just work and mint money while you sleep, well, if it involves something online, you should assume it will need more care and feeding from humans than you ever imagined. And as I’ve chatted with other data-centric companies small and large, I’m seeing a similar pattern. Sure, the tech does a lot… but you still need the people.
They experimented. A lot. One advantage of being a small company (well, growing fast, and so not as small as they used to be) is that they can really try to innovate in things like targeting for brand impact or targeting for social-network growth. Even within a client’s campaign, they have the ability to throw in extra variables into small holdout groups to see how their presence impacts model performance; if they win, they migrate to more of the campaign; if not, no harm done.
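For a sense of what "if they win, they migrate to more of the campaign" can look like in practice, here is a minimal sketch. I don't know what test Rocket Fuel actually used; this swaps in a plain two-proportion z-test, and the function names, the 1.645 threshold (roughly 95% one-sided), and the numbers are all assumptions for illustration.

```python
import math

def z_score(control, challenger):
    """Two-proportion z-score for challenger vs. control conversion rates.

    Each argument is a (conversions, impressions) pair. Assumes both groups
    have at least one impression.
    """
    c_conv, c_imps = control
    h_conv, h_imps = challenger
    pooled = (c_conv + h_conv) / (c_imps + h_imps)
    se = math.sqrt(pooled * (1 - pooled) * (1 / c_imps + 1 / h_imps))
    if se == 0:
        return 0.0  # no conversions (or all conversions) in both groups
    return (h_conv / h_imps - c_conv / c_imps) / se

def should_expand(control, challenger, threshold=1.645):
    """True if the holdout model (with the extra variable) beats control."""
    return z_score(control, challenger) > threshold

# Hypothetical numbers: main model vs. a small holdout with one extra variable.
control = (200, 100_000)    # 0.2% conversion rate
challenger = (30, 10_000)   # 0.3% conversion rate
print(should_expand(control, challenger))
```

The holdout is kept small on purpose: if the extra variable doesn't help, very little of the campaign was exposed to it.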
Their business model is a mix of services, and that in itself is an experiment. Should they have an API to allow others to use them as a full DSP? Should they expose more external reporting to advertisers who want more visibility? Should they add more people to the client services team? They have different clients with different relationships, to see which is most effective. They haven’t had to pivot really hard, but they’ve certainly taken on certain client relationships that they recognize were not scalable, and minimized them. But without experimenting with the business model, it can be hard to know what’s both profitable and scalable.
That agile nature was also applied to the back-end systems. I’ve mentioned the rapid model updating, but they also were rapidly rolling out improvements to all sorts of systems. For example, their Hive install leveraged the Cloudera distro for things like the Hue user interface… but they had built some really clever custom functions which eliminated the need to do some joins, speeding up both the analyst queries and the data feeds for some of the R&D models. These functions were written as rapidly iterating agile projects, so they met analysts’ needs almost immediately. Having real analytic-system coders who are familiar with Hadoop and its family is a completely different (and much more enjoyable) experience than dealing with the usual db programmer with 20 years of PL/SQL on simple, small transactional data.
Rocket Fuel was very, very cloud based for basic business apps. With Google Apps to manage communication and basic documents, SugarSync to manage file sharing, wikis to manage knowledge sharing, and Salesforce to manage the sales process, nothing major for day-to-day work had to be stored on any local storage. Instead, much of the day-to-day work could run off of light and cheap boxes with Chrome. Now, of course folks used MS Office and Project and whatever when they had to… but I was surprised at how rarely that came up. Interestingly, when I asked about putting some of the data or ops systems in a cloud, they mentioned that in their constant cost-benefit evaluation, having the hardware under their control allowed them to reduce latency in a way that the cloud vendors couldn’t meet at scale. I suspect that will change in the future, but it jibes with what I’ve heard from others: cloud PaaS/SaaS are great for proof-of-concept, but when you really need speed, you need to control your HW.
As a startup, they skimped on some things… but they made coming to work pretty nice. From incredible daily lunches in Redwood Shores to a fully stocked fridge and pantry in NYC, from poker-and-RockBand afternoons to kayak breaks, they recognized that they had to move fast but could have fun at the same time. Yes, it’s one thing to say “Google does that too!”... but the Goog has a zillion dollars; it’s another to say “every startup starts out doing that stuff to get talent”... but how many keep doing that even as they grow well past fitting the entire company in one room?
So, Rocket Fuel was a pretty cool place. But they are in a tough industry. Everyone in this space seems to tell the same story of data-driven marketing: “We use data to optimize the ad by showing the right creative to the right person at the right time at the right price”. When you get deeper into it, you can start to differentiate, but the average marketer can easily get confused about the differences between X+1, Rocket Fuel, DataXu, Interclick, Turn, or the zillion others.
Also, there are still massive inventory issues: lots of exchanges and places to buy, but still limitations on some of the “good stuff” or high-quality inventory. For example, many advertisers like expandables (ads which can grow outside of their original unit size), but those are hard to get on the exchanges, so the full power of the algorithms is not always brought to bear (for any of these companies). Agencies building their own trading desks and linking those to the publisher private exchanges may also remove some of the better inventory, but that remains to be seen; the agency trading desks are very new, and they vary in sophistication.
Another issue: part of the value in this ad network business is an arbitrage play. If you can buy inventory for $0.01, but your models show that it’s being shown to a person worth $0.05 to your advertiser, then you can pocket the difference. This falls apart if inventory rises in price (for certain types of creative, or high quality or focused content, the inventory can be expensive) or if advertisers don’t want to pay more for targeting (some advertisers can afford to pay more than others), and also falls apart if the arbitrager can’t show the additional value to defend the higher price. All players in the exchanges have this as part of their playbook, though the better players are trying to build their business on more than just this pricing play. The AdExchanger blog is a good place to see how folks talk about the exchanges and how the online ad business is changing.
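The arithmetic behind that arbitrage is simple enough to sketch. This is only an illustration of the $0.01 vs. $0.05 example above; the function name and the idea of an explicit advertiser price cap are my own assumptions, not anyone's actual pricing logic.

```python
from decimal import Decimal

def margin_per_impression(inventory_cost, modeled_value, advertiser_max_price):
    """Margin the arbitrager keeps on one impression.

    The arbitrager can charge at most what the impression is worth to the
    advertiser AND at most what the advertiser is willing to pay (the price
    cap is a hypothetical parameter), then pays the exchange for the inventory.
    """
    charge = min(modeled_value, advertiser_max_price)
    return charge - inventory_cost

cent = Decimal("0.01")

# The example from the text: buy at $0.01, impression worth $0.05.
print(margin_per_impression(cent, 5 * cent, 5 * cent))

# Inventory gets expensive: the margin goes negative.
print(margin_per_impression(6 * cent, 5 * cent, 5 * cent))

# Advertiser won't pay extra for targeting: the margin disappears.
print(margin_per_impression(cent, 5 * cent, cent))
```

Decimal is used instead of floats so that one-cent amounts subtract exactly.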
Now, many of you know that I’m also a big fan of X+1 (warning: they have an autorunning talker so turn down the volume if you visit).
There are similarities and differences between the companies, beyond the fact that X+1 is East Coast and Rocket Fuel is West Coast (like rappers, everyone has a favorite side of the country). I very much like how X+1 tries to optimize the entire experience, from ad selection all the way through to landing page optimization (like an Optimost light). I think that’s very compelling, since it uses all the data available to make the entire experience consistent, and also gives some additional attribution capabilities. Rocket Fuel has chosen to focus on the ad optimization side only, and to their credit, does a great job of it. Also, X+1 offers a full DSP suite, while Rocket Fuel has tended towards more internal management of campaigns. Talent at both companies is top notch, and it’s hard to pick one over the other on that front. If you are considering using an audience optimization service, these should both be on your short list.
BTW, Rocket Fuel and X+1 are both hiring (see the Rocket Fuel careers and X+1 careers pages), and if you like working with big data, big optimization problems, and big clients… either of these is a great choice, and tell them I sent you!
* * *