To the untrained eye, all the suggestions we’re inundated with online might seem essentially the same. After all, there’s no big difference between “you might like,” “your friends liked” and “other people who bought this also bought,” right? Actually, there is, and the right approach to making recommendations can make or break a web business.
The trick to doing recommendations right, according to Eventbrite Director of Data Engineering Vipul Sharma, is rooted in both business and data architecture. Companies must know who their audience is, what types of data they can collect and how they can best use that data to discern what consumers really want. Or, to put it more succinctly, companies have to understand data science.
Amazon is a good example of the approaches a company might take. The e-commerce giant, Sharma explained, used to use a hierarchical model to recommend additional purchases to shoppers. Products were indexed in such a way that, for example, the system would always recommend batteries to someone buying a camera. However, as the product catalog grew and Amazon picked up its analytical abilities, it moved to the current model of recommending purchases based on what other people who bought the same thing also bought.
Now, Amazon is able to present more accurate suggestions because it’s using real-world purchase data instead of static indexes that make assumptions about what someone should buy. Presenting this information as “Customers Who Bought This Item Also Bought” is not only a way to sell more stuff, it also provides shoppers the peace of mind of knowing they’re pairing items that — if the wisdom of the crowd is to be believed — go well together.
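To make the "also bought" idea concrete, here is a minimal sketch in Python. It is not Amazon's actual system; the function names (`build_also_bought`, `recommend`) and the sample orders are invented for illustration. It simply counts how often two items show up in the same order and ranks by that count.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_also_bought(orders):
    """For each item, count how often every other item shares an order with it."""
    also_bought = defaultdict(Counter)
    for order in orders:
        # set() so an item repeated within one order is only counted once
        for item, other in permutations(set(order), 2):
            also_bought[item][other] += 1
    return also_bought


def recommend(also_bought, item, k=3):
    """Return up to k items most often bought alongside `item`."""
    counts = also_bought.get(item, Counter())
    # Sort by count (highest first), breaking ties alphabetically
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical purchase data, made up for this example
orders = [
    ["camera", "batteries", "memory card"],
    ["camera", "memory card"],
    ["camera", "tripod", "memory card"],
    ["batteries", "flashlight"],
]

table = build_also_bought(orders)
print(recommend(table, "camera", k=2))
```

Unlike a static index that always pairs cameras with batteries, the ranking here shifts as real orders come in: in this toy data, memory cards outrank batteries for camera buyers.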
Of course, Amazon also uses a process called collaborative filtering to make recommendations to members even before they start looking at individual items. These recommendations aren’t based strictly on products that are frequently bought together, but also on how shoppers with similar purchase histories and interests tend to behave. Amazon actually explains the process pretty clearly on its Recommendations FAQ page:
We determine your interests by examining the items you’ve purchased, items you’ve told us you own, items you’ve rated, and items you’ve told us you like. We then compare your activity on our site with that of other customers, and using this comparison, are able to recommend other items that may interest you.
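The comparison the FAQ describes can be sketched, very roughly, as user-based collaborative filtering. The code below is an assumption-laden illustration, not Amazon's method: it measures similarity between customers with Jaccard overlap of their item sets (one common choice among many), and the names `jaccard`, `recommend_for` and the sample histories are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two item sets: 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_for(user, histories, k=3):
    """Score items the user lacks by the similarity of the customers who have them."""
    mine = histories[user]
    scores = {}
    for other, theirs in histories.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0.0:
            continue  # customers with nothing in common contribute nothing
        for item in theirs - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first, ties broken alphabetically
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical purchase histories, made up for this example
histories = {
    "ana": {"camera", "tripod", "lens"},
    "ben": {"camera", "tripod", "flash"},
    "cy": {"novel", "lamp"},
    "dee": {"camera", "lens", "flash", "bag"},
}

print(recommend_for("ana", histories))
```

Here "ana" is pointed toward the flash before the bag because two similar shoppers own the flash, while "cy," who shares nothing with her, has no influence at all.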
But Amazon does have one trick up its sleeve that many other companies don’t: Shoppers spend a lot of time on its site and many of them might actually be willing to put in a little effort to get more-accurate recommendations. So, Amazon is able to build even better recommendations for users by asking them questions about recommended items: Do they own it, do they like it, and is the purchase that spurred the recommendation worth using for future recommendations? The better the information a company has about what users actually want, the better recommendations (or ads) it can show them.

Everyone does it different
However, Sharma explained, every web company has its own unique methods for making recommendations. Facebook, for example, relies on users’ social graphs to make recommendations based on what someone’s friends also like (it also relies on users’ stated interests primarily to serve ads). It’s a platform built on the idea of connecting with friends, so it assumes users care what their friends are up to and are interested in.
Netflix takes a different approach (its Facebook Connect feature notwithstanding), focusing its recommendation efforts around items. Its algorithms are about calculating the relationships among pieces of content based on factors such as genre, actors, ratings and even the sequence in which users typically watch stuff. Personalization in this case is more interest graph than social graph — Netflix knows what you like (or at least what you’ve watched) and suggests new content that’s somehow similar to it or related to it. As I explain here, a graph is essentially a method for storing data based on their relation to each other.
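As a rough illustration of relating content by its attributes, the sketch below builds a tiny item graph in Python. It is not Netflix's algorithm: the titles, the attribute tags and the functions `build_item_graph` and `similar_titles` are all invented, and the edge weight (a simple count of shared attributes) is an assumption chosen for brevity.

```python
from itertools import combinations


def build_item_graph(catalog):
    """Link every pair of titles that share at least one attribute.

    The weight of each edge is the number of attributes the two titles share.
    """
    graph = {title: {} for title in catalog}
    for a, b in combinations(catalog, 2):
        shared = len(catalog[a] & catalog[b])
        if shared:
            graph[a][b] = shared
            graph[b][a] = shared
    return graph


def similar_titles(graph, title, k=2):
    """Return up to k neighbors of `title`, strongest connection first."""
    neighbors = graph.get(title, {})
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical catalog, made up for this example
catalog = {
    "Space Saga": {"genre:sci-fi", "actor:Lee", "rating:PG-13"},
    "Star Drift": {"genre:sci-fi", "actor:Lee", "rating:R"},
    "Moon Heist": {"genre:sci-fi", "actor:Cruz", "rating:PG-13"},
    "Quiet Farm": {"genre:drama", "actor:Ng", "rating:PG"},
}

graph = build_item_graph(catalog)
print(similar_titles(graph, "Star Drift", k=1))
```

Nothing about the viewer's friends enters the calculation. The suggestions follow only from how the titles relate to one another, which is the sense in which this is an interest graph rather than a social graph.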
(Some, however, have suggested that clustering users based on their interests first might make content recommendations more accurate, because, they argue, what others with the same interests are watching is probably more relevant than how the content itself is related.)
In June, my colleague Stacey Higginbotham wrote about the mysteries behind Apple’s Genius recommendations in iTunes. The methods actually don’t seem too out of the ordinary, but a Quora post explaining them did get pulled …
In order to accurately recommend events for its members to attend, Eventbrite has to use a combination of all these tactics. Its model takes into account what events someone has attended in order to discern and graph their interests, Sharma explained, but also builds what he calls an “implicit social graph.” The people in this graph aren’t necessarily a user’s friends, but rather are people who frequently attend the same conferences.
“We internally treat them as your friends and believe they’re part of a social graph because you share an interest with them,” he said. For Eventbrite’s purposes of recommending new events a user might like, if someone is always attending country music festivals or big data conferences, his peers are the people attending those events.
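One way to picture an implicit social graph is the Python sketch below. Sharma did not describe Eventbrite's implementation, so everything here is an assumption: the names `build_implicit_graph` and `recommend_events`, the `min_shared` threshold of two events, and the sample attendance data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_implicit_graph(attendance, min_shared=2):
    """Treat two people as peers if they attended at least `min_shared` of the same events.

    attendance maps event name -> set of attendees.
    Returns a dict mapping each person -> {peer: number of shared events}.
    """
    shared = Counter()
    for attendees in attendance.values():
        for a, b in combinations(sorted(attendees), 2):
            shared[(a, b)] += 1
    graph = {}
    for (a, b), count in shared.items():
        if count >= min_shared:
            graph.setdefault(a, {})[b] = count
            graph.setdefault(b, {})[a] = count
    return graph


def recommend_events(user, attendance, graph, k=3):
    """Suggest events the user has not attended but their implicit peers have."""
    scores = Counter()
    peers = graph.get(user, {})
    for event, attendees in attendance.items():
        if user in attendees:
            continue
        for peer, weight in peers.items():
            if peer in attendees:
                scores[event] += weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [event for event, _ in ranked[:k]]


# Hypothetical attendance records, made up for this example
attendance = {
    "Big Data Summit": {"ana", "ben", "cy"},
    "Hadoop Meetup": {"ana", "ben"},
    "Spark Workshop": {"ben", "cy"},
    "Country Fest": {"dee", "cy"},
}

graph = build_implicit_graph(attendance)
print(recommend_events("ana", attendance, graph))
```

In this toy data "ana" and "ben" never declared a friendship, yet two shared events are enough to link them, so the workshop "ben" attended becomes a suggestion for "ana."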
Eventbrite actually does use Facebook Connect, Sharma noted, but clarified that “it’s useful, but it’s not really great for us.” People don’t update their interests too often, he explained, and someone’s personal interests don’t necessarily align with their professional interests. And although Sharma didn’t mention it, others in the business of building interest graphs have noted that the interests we express publicly in front of our friends and family don’t always align with the interests we reveal through our actions.
Another wrinkle for companies, such as Eventbrite, that are more services than platforms is that users probably aren’t willing to stick around and answer questions to help the sites build better models. Amazon and Netflix users might provide express feedback on recommendations, and Sharma noted that even Facebook can assume users are less interested in particular friends when they block certain updates or actions from them. Because users aren’t addicted, he said (or perhaps because they don’t attach the same importance to events as to the movies they watch), Eventbrite’s approach to learning what users want has to be frictionless.
And while the talk today is all about social graphs and interest graphs, Sharma thinks the future of online suggestions is object graphs. No longer will it be good enough just to know a user is into music, but companies will also have to know what instrument she plays. This will be more difficult for companies that can’t simply ask users about their specific interests, but Sharma said it’s now so easy (relatively speaking) to collect lots of data both internally and from other web services, and analyze it deeply, that this level of specificity should be possible even for small data science teams.
Personalization, after all, is a popular method by which web services prove their worth. Netflix claims that 75 percent of what people watch comes from some sort of a recommendation. As for Eventbrite, Sharma said, “We are not a subscription product, but when people love the product, they stay with the product.”
Feature image courtesy of Shutterstock user Ivelin Radkov.