
Quest for $1,000,000: The Netflix Prize

Bob Bell, AT&T Labs-Research. July 15, 2009. Joint work with Chris Volinsky, AT&T Labs-Research, and Yehuda Koren, Yahoo! Research.

Presentation Transcript


  1. Quest for $1,000,000: The Netflix Prize • Bob Bell, AT&T Labs-Research • July 15, 2009 • Joint work with Chris Volinsky, AT&T Labs-Research, and Yehuda Koren, Yahoo! Research

  2. Recommender Systems • Personalized recommendations of items (e.g., movies) to users • Increasingly common • To deal with explosive number of choices on the internet • Netflix • Amazon • Many others

  3. Content Based Systems • A pre-specified list of attributes • Score each item on all attributes • User interest obtained for the same attributes • Direct solicitation, or • Estimated based on user purchases or ratings

  4. Pandora • Music recommendation system • Songs rated on 400+ attributes • Music genome project • Roots, instrumentation, lyrics, vocals • Two types of user feedback • Seed songs • Thumbs up/down for recommended songs

  5. Drawbacks of Content Based Systems • Effort to score all items on many attributes • Best attributes may be unknown • Some attributes may be unscorable • Need for direct solicitation of data from users in some systems

  6. Collaborative Filtering (CF) • Does not require content information about items or solicitation of users • Infers user-item relationships from purchases or ratings • Used by Amazon and Netflix

  7. “We’re quite curious, really. To the tune of one million dollars.” – Netflix Prize rules • Goal to improve on Netflix’s existing movie recommendation technology • Prize • Based on reduction in root mean squared error (RMSE) on test data • $1,000,000 grand prize for a 10% drop • Or, $50,000 Progress Prize for the best result each year • Contest began October 2, 2006
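The prize criterion on slide 7 is a relative drop in RMSE. A minimal sketch of that calculation, assuming hypothetical toy ratings (not Netflix data) and illustrative function names that do not come from the slides:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def relative_improvement(baseline_rmse, new_rmse):
    """Fractional drop in RMSE relative to a baseline (0.10 means a 10% drop)."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical toy data for illustration only.
actual = [4, 3, 5, 1]
baseline = [3, 3, 3, 3]   # e.g. always predict 3 stars
improved = [4, 3, 4, 2]

base = rmse(baseline, actual)
new = rmse(improved, actual)
print(base, new, relative_improvement(base, new))
```

Under this reading, a team qualifies for the grand prize when `relative_improvement` against Netflix's own system on the test data reaches 0.10.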

  8. Data Details • Training data • 100 million ratings (from 1 to 5 stars) • 6 years (2000-2005) • 480,000 users • 17,770 “movies” • Test data • Last few ratings of each user • User, movie, date given • Ratings withheld (for most of test data) • Teams are allowed daily feedback on their RMSE

  9. Higher Mean Rating in Test Data

  10. Something Happened in Early 2004

  11. Movies Rated Most Often

  12. Most Active Users

  13. Ratings per Movie in Training Data • Avg #ratings/movie: 5627

  14. Ratings per User in Training Data • Avg #ratings/user: 208
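The averages on slides 13 and 14 are consistent with the rounded counts given on slide 8. A quick arithmetic check, assuming those rounded figures (the exact dataset counts are not stated in the slides):

```python
# Rounded counts from slide 8; the exact values are an assumption here.
n_ratings = 100_000_000
n_movies = 17_770
n_users = 480_000

print(n_ratings // n_movies)  # 5627 ratings per movie on average
print(n_ratings // n_users)   # 208 ratings per user on average
```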

  15. Progress after 2 Months

  16. Progress after 8 Months

  17. Nearest Neighbor (NN) Methods • Most common CF tool • Predict rating for a specific user-item pair based on ratings of similar items by the same user • Or vice versa (ratings of the same item by similar users) • Requires no “content” about items or users • Easy to apply • Easy to explain to users • But not as powerful as other methods
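The item-based variant on slide 17 can be sketched as follows. This is a minimal illustration, assuming cosine similarity over co-rating users and a similarity-weighted average; the slides do not say which similarity measure or weighting the authors used, and the data and names below are hypothetical.

```python
import math

def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two items over the users who rated both.
    Each argument maps user id -> rating."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[u] * ratings_b[u] for u in common)
    norm_a = math.sqrt(sum(ratings_a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings_b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict_rating(user, item, item_ratings, k=2):
    """Weighted average of the user's own ratings on the k items most
    similar to `item`. Returns None when no neighbor is available."""
    candidates = []
    for other, ratings in item_ratings.items():
        if other != item and user in ratings:
            sim = cosine_similarity(item_ratings[item], ratings)
            if sim > 0:
                candidates.append((sim, ratings[user]))
    if not candidates:
        return None
    top = sorted(candidates, reverse=True)[:k]
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical toy data: item -> {user: rating}.
item_ratings = {
    "A": {"u1": 5, "u2": 4},
    "B": {"u1": 5, "u2": 4, "u3": 5},
    "C": {"u1": 1, "u2": 5, "u3": 2},
}
print(predict_rating("u3", "A", item_ratings))
```

Item B is rated almost identically to A by the users they share, so u3's rating of B dominates the prediction for A. The prediction is also easy to explain to the user ("because you liked B").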

  18. Latent Factor Models • Explain ratings by a set of latent factors (attributes) • Factors are learned from the data • No need for pre-specification • Neural networks • SVD (Singular Value Decomposition) • AKA matrix factorization • Dominant method used by leaders of competition

  19. Item Factors • Each item summarized by a d-dimensional vector q_i • Potential factors • Comedy vs. drama • Amount of action • Depth of character development • Totally uninterpretable • Choose d much smaller than number of items or users • e.g., d = 50 << 18,000 or 480,000

  20. User Factors • Similarly, each user summarized by p_u • Same number of factors • User factors measure interest in corresponding item factors • Predicted rating for Item i by User u • Inner product of q_i and p_u
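The prediction rule on slide 20 is just an inner product of two d-dimensional vectors. A minimal sketch with d = 2, where the factor values are hypothetical and chosen only for illustration:

```python
def predict(q_i, p_u):
    """Predicted rating as the inner product of item and user factor vectors."""
    if len(q_i) != len(p_u):
        raise ValueError("item and user vectors must share the same dimension d")
    return sum(q * p for q, p in zip(q_i, p_u))

# Hypothetical d = 2 factor vectors (values invented for this example).
q_movie = [1.5, 0.5]   # item factors q_i
p_user = [2.0, 1.0]    # user factors p_u
print(predict(q_movie, p_user))  # 3.5
```

A user whose factors point in the same direction as an item's factors gets a high predicted rating; opposing directions lower it.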

  21. [Figure: movies placed on two latent factors. One axis runs from serious to escapist, the other from geared towards females to geared towards males. Movies shown: Braveheart, Amadeus, The Color Purple, Lethal Weapon, Sense and Sensibility, Ocean’s 11, The Lion King, Dumb and Dumber, The Princess Diaries, Independence Day]

  22. [Figure: same two-factor movie map as slide 21, with users Dave and Gus added]

  23. Challenges in Using SVD • Need lots of factors (large d)

  24. Challenges in Using SVD • Need lots of factors (large d) • Easy to overfit

  25. The Fundamental Challenge • How can we estimate as much signal as possible where there are sufficient data, without overfitting where data are scarce?
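The slides do not say how the authors resolved this tension. One common approach, shown here purely as an assumption, is to penalize large factor values (L2 regularization) while fitting the factors by stochastic gradient descent. Users or items with few ratings are then shrunk more strongly toward zero. All names, data, and hyperparameters below are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_factors(ratings, n_users, n_items, d=2, lam=0.05, lr=0.02,
                  epochs=200, seed=0):
    """Fit user factors p and item factors q by SGD on squared error,
    with an L2 penalty of strength `lam` on both sets of factors.
    `ratings` is a list of (user index, item index, rating) triples."""
    rng = random.Random(seed)
    p = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(p[u], q[i])
            for f in range(d):
                pu, qi = p[u][f], q[i][f]
                # Gradient step: fit the rating, but pull factors toward zero.
                p[u][f] += lr * (err * qi - lam * pu)
                q[i][f] += lr * (err * pu - lam * qi)
    return p, q

# Hypothetical toy data: user 1 has only a single rating (scarce data).
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0)]
p, q = train_factors(ratings, n_users=2, n_items=2)
print(dot(p[1], q[0]))  # fitted prediction for user 1's only rating
```

With `lam=0` the model can reproduce every training rating, including noise; a larger `lam` trades some training fit for stability where data are scarce.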

  26. [Figure: same two-factor movie map as slide 21, with user Gus]

  27. [Figure: same two-factor movie map as slide 21, with user Gus]

  28. [Figure: same two-factor movie map as slide 21, with user Gus at a different position]

  29. [Figure: same two-factor movie map as slide 21, with user Gus at a different position]

  30. Challenges in Using SVD • Need lots of factors (large d) • Easy to overfit • User behavior may change over time • Ratings go up or down • Interests may change • Composition of account may change, for example, with addition of a new rater

  31. [Figure: same two-factor movie map as slide 21, with user Gus]

  32. [Figure: same two-factor movie map as slide 21, with user Gus at a different position]

  33. [Figure: same two-factor movie map as slide 21, with the user now labeled “Gus +”]

  34. Challenges in Using SVD • Need lots of factors (large d) • Easy to overfit • User behavior may change over time • Misses some types of patterns

  35. Neither SVD nor NN is Perfect • SVD is poorly suited to fully capture strong “local” relationships • e.g., among sequels • NN ignores cumulative effect of many small signals • May be ineffective for items with no close neighbors • Each method complements the other

  36. The Wisdom of Crowds (of Models) • “All models are wrong; some are useful” – G. Box • Our best entry during Year 1 was a linear combination of 107 sets of predictions • Nearest neighbors, SVD, neural nets, and others • Many variations of model structure and parameter settings • Years 2 and 3 • Individual models are more comprehensive and much more accurate • Combining many models still helps • Five models suffice to beat Year 1 score
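Slide 36 describes the final entry as a linear combination of many prediction sets but does not say how the weights were chosen. A minimal sketch for two models, assuming the weights are fit by ordinary least squares (no intercept) on ratings held out from training; the function names and data are hypothetical.

```python
import math

def blend_weights(pred_a, pred_b, actual):
    """Least-squares weights (w_a, w_b) minimizing
    sum((actual - w_a * pred_a - w_b * pred_b) ** 2)."""
    aa = sum(a * a for a in pred_a)
    bb = sum(b * b for b in pred_b)
    ab = sum(a * b for a, b in zip(pred_a, pred_b))
    ay = sum(a * y for a, y in zip(pred_a, actual))
    by = sum(b * y for b, y in zip(pred_b, actual))
    det = aa * bb - ab * ab
    if det == 0:
        raise ValueError("predictions are collinear; weights are not unique")
    # Closed-form solution of the 2x2 normal equations.
    return (ay * bb - by * ab) / det, (by * aa - ay * ab) / det

def blend(pred_a, pred_b, w_a, w_b):
    """Linear combination of two prediction sets."""
    return [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]

def rmse(predicted, actual):
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(predicted, actual)) / len(actual))

# Hypothetical held-out ratings and two models' predictions for them.
actual = [4, 3, 5, 1]
model_a = [3.5, 3.5, 4.0, 2.0]   # e.g. a nearest-neighbor model
model_b = [4.5, 2.0, 4.5, 1.5]   # e.g. a latent factor model

w_a, w_b = blend_weights(model_a, model_b, actual)
combined = blend(model_a, model_b, w_a, w_b)
print(rmse(model_a, actual), rmse(model_b, actual), rmse(combined, actual))
```

On the data used to fit the weights, the blend cannot have higher RMSE than either model alone, since putting all weight on one model is itself a valid linear combination. The gain on new data depends on how different the models' errors are.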

  37. Progress after 1 Year

  38. Is this Any Way to do Science? • Wide participation • Submissions from 5,000 teams • 8,300 posts on the Netflix Prize forum • Generation and dissemination of new methods • Presentations/workshops in academic conferences • Journal publications • Reasons for success • Well designed by Netflix • Industrial strength data set • Opportunity to build on work of others • Collegial spirit of competitors

  39. The Race is On

  40. Thank You! • rbell@research.att.com • www.netflixprize.com • …/leaderboard • …/community • Click BellKor on Leaderboard for details
