slide1

Lessons from the Netflix Prize

Yehuda Koren

The BellKor team

(with Robert Bell & Chris Volinsky)

movie #15868

movie #7614

movie #3250

slide2

Recommender systems

We Know What You Ought To Be Watching This Summer

Collaborative filtering
  • Recommend items based on past transactions of users
  • Analyze relations between users and/or items
  • Specific data characteristics are irrelevant
    • Domain-free: user/item attributes are not necessary
    • Can identify elusive aspects
Movie rating data

Training data

Test data

Netflix Prize
  • Training data
    • 100 million ratings
    • 480,000 users
    • 17,770 movies
    • 6 years of data: 2000-2005
  • Test data
    • Last few ratings of each user (2.8 million)
    • Evaluation criterion: root mean squared error (RMSE)
    • Netflix Cinematch RMSE: 0.9514
  • Competition
    • 2700+ teams
    • $1 million grand prize for 10% improvement on Cinematch result
    • $50,000 2007 progress prize for 8.43% improvement
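The evaluation criterion fits in a few lines of code; a minimal sketch with toy numbers (not Netflix data):

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error: the Netflix Prize evaluation criterion."""
    assert len(predictions) == len(actuals)
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared_errors) / len(predictions))

# Hypothetical predictions against three held-out ratings:
print(round(rmse([3.5, 4.0, 2.0], [4, 4, 3]), 4))  # 0.6455
```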
Overall rating distribution
  • A third of ratings are 4s
  • Average rating is 3.68

From TimelyDevelopment.com

#ratings per movie
  • Avg #ratings/movie: 5627
#ratings per user
  • Avg #ratings/user: 208
Average movie rating by movie count
  • Better movies receive more ratings

From TimelyDevelopment.com

Important RMSEs

RMSE scale, from erroneous to accurate:
  • Global average: 1.1296
  • User average: 1.0651
  • Movie average: 1.0533
  • (personalization begins below this point)
  • Cinematch: 0.9514; baseline
  • BellKor: 0.8693; 8.63% improvement
  • Grand Prize: 0.8563; 10% improvement
  • Inherent noise: ????

Challenges
  • Size of data
    • Scalability
    • Keeping data in memory
  • Missing data
    • 99 percent missing
    • Very imbalanced
  • Avoiding overfitting
  • Test and training data differ significantly

movie #16322

The BellKor recommender system
  • Use an ensemble of complementary predictors
  • Two half-tuned models are worth more than a single, fully tuned model
The BellKor recommender system
  • Use an ensemble of complementary predictors
  • Two half-tuned models are worth more than a single, fully tuned model
  • But: Many seemingly different models expose similar characteristics of the data, and won’t mix well
  • Concentrate efforts along three axes...
slide17

The three dimensions of the BellKor system

The first axis:

  • Multi-scale modeling of the data
  • Combine top level, regional modeling of the data, with a refined, local view:
    • k-NN: Extracting local patterns
    • Factorization: Addressing regional effects

(Diagram: Global effects, Factorization, k-NN, from the coarsest to the finest scale.)

Multi-scale modeling – 1st tier
  • Mean rating: 3.7 stars
  • The Sixth Sense is 0.5 stars above avg
  • Joe rates 0.2 stars below avg

Baseline estimation: Joe will rate The Sixth Sense 4 stars

Global effects:
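The first-tier arithmetic is just additive offsets; a minimal sketch using the slide's numbers:

```python
# Baseline (global-effects) estimate: overall mean plus movie and user offsets.
global_mean = 3.7    # mean rating
movie_offset = 0.5   # The Sixth Sense is 0.5 stars above average
user_offset = -0.2   # Joe rates 0.2 stars below average

baseline = global_mean + movie_offset + user_offset
print(baseline)  # 4.0: Joe's baseline estimate for The Sixth Sense
```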

Multi-scale modeling – 2nd tier
  • Both The Sixth Sense and Joe are placed high on the “Supernatural Thrillers” scale

Adjusted estimate: Joe will rate The Sixth Sense 4.5 stars

Factors model:

Multi-scale modeling – 3rd tier
  • Joe didn’t like related movie Signs
  • Final estimate: Joe will rate The Sixth Sense 4.2 stars

Neighborhood model:

slide21

The three dimensions of the BellKor system

The second axis:

  • Quality of modeling
  • Make the best out of a model
  • Strive for:
    • Fundamental derivation
    • Simplicity
    • Avoid overfitting
    • Robustness against #iterations, parameter setting, etc.
  • Optimizing is good, but don’t overdo it!

(Diagram axes: global, local, quality.)

slide22

The three dimensions of the BellKor system

  • The third dimension will be discussed later...
  • Next: Moving the multi-scale view along the quality axis

(Diagram axes: global, local, ???, quality.)

Local modeling through k-NN
  • Earliest and most popular collaborative filtering method
  • Derive unknown ratings from those of “similar” items (movie-movie variant)
  • A parallel user-user flavor: rely on ratings of like-minded users (not in this talk)
slide24

k-NN (users × movies rating matrix; each cell holds either an unknown rating or a rating between 1 and 5)

Walkthrough, built up over slides 24-28:
  • Estimate the rating of movie 1 by user 5
  • Neighbor selection: identify movies similar to movie 1 that user 5 rated
  • Compute similarity weights: s13 = 0.2, s16 = 0.3
  • Predict by taking the weighted average: (0.2*2 + 0.3*3)/(0.2 + 0.3) = 2.6
Properties of k-NN
  • Intuitive
  • No substantial preprocessing is required
  • Easy to explain reasoning behind a recommendation
  • Accurate?
k-NN on the RMSE scale

RMSE scale, from erroneous to accurate:
  • Global average: 1.1296
  • User average: 1.0651
  • Movie average: 1.0533
  • k-NN (common practice): 0.96
  • Cinematch: 0.9514
  • k-NN (improved): 0.91
  • BellKor: 0.8693
  • Grand Prize: 0.8563
  • Inherent noise: ????

k-NN - Common practice
  • Define a similarity measure between items: sij
  • Select neighbors -- N(i;u): items most similar to i, that were rated by u
  • Estimate unknown rating, rui, as the weighted average:

rui = bui + Σj∈N(i;u) sij·(ruj − buj) / Σj∈N(i;u) sij

(bui denotes a baseline estimate for rui)

k-NN - Common practice
  • Define a similarity measure between items: sij
  • Select neighbors -- N(i;u): items similar to i, rated by u
  • Estimate unknown rating, rui, as the weighted average:

Problems:

  • Similarity measures are arbitrary; no fundamental justification
  • Pairwise similarities isolate each neighbor; neglect interdependencies among neighbors
  • Taking a weighted average is restricting; e.g., when neighborhood information is limited
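The common-practice estimator can be sketched in a few lines (here without the baseline term, reusing the earlier walkthrough's illustrative numbers):

```python
def knn_predict(ratings, sims):
    """Weighted-average k-NN estimate over the neighbors N(i;u)."""
    return sum(s * r for s, r in zip(sims, ratings)) / sum(sims)

# Neighbors of movie 1 rated by user 5: ratings 2 and 3,
# with similarity weights s13 = 0.2 and s16 = 0.3.
print(round(knn_predict([2, 3], [0.2, 0.3]), 2))  # 2.6
```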
Interpolation weights
  • Use a weighted sum rather than a weighted average:

(We allow weights that need not sum to 1)

  • Model relationships between item i and its neighbors
  • Can be learnt through a least squares problem from all other users that rated i:

(In the least-squares problem, the inner products among movie-rating columns are mostly unknown and must be estimated.)

Interpolation weights
  • Interpolation weights derived based on their role; no use of an arbitrary similarity measure
  • Explicitly account for interrelationships among the neighbors

Challenges:

  • Deal with missing values
  • Avoid overfitting
  • Efficient implementation
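A minimal sketch of learning interpolation weights by ridge-regularized least squares. It assumes fully observed neighbor ratings, sidestepping the missing-value and efficiency challenges listed above; the function names and regularization constant are illustrative:

```python
import numpy as np

def interpolation_weights(R_neighbors, r_target, lam=0.1):
    """Solve for weights w minimizing
    ||r_target - R_neighbors @ w||^2 + lam * ||w||^2
    over the users who rated the target item; w need not sum to 1."""
    k = R_neighbors.shape[1]
    A = R_neighbors.T @ R_neighbors + lam * np.eye(k)  # inner products among neighbors
    b = R_neighbors.T @ r_target                       # inner products with the target
    return np.linalg.solve(A, b)

# Toy data: 3 users' ratings of 2 neighbor items, and of the target item.
R = np.array([[4.0, 3.0], [5.0, 4.0], [3.0, 2.0]])
r = np.array([4.0, 5.0, 3.0])
w = interpolation_weights(R, r)
pred = np.array([2.0, 3.0]) @ w  # weighted *sum* for a new user's neighbor ratings
```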
slide35

Latent factor models

(Illustration: movies and users placed on two latent dimensions; one axis runs from "geared towards females" to "geared towards males", the other from "serious" to "escapist". Movies shown include The Color Purple, Sense and Sensibility, Amadeus, Braveheart, Lethal Weapon, Ocean’s 11, The Princess Diaries, The Lion King, Dumb and Dumber, and Independence Day; users Dave and Gus are placed among them.)

Latent factor models

(Illustration: the users × items rating matrix approximated by a rank-3 SVD, the product of a user-factor matrix and an item-factor matrix.)

Estimate unknown ratings as inner-products of factors:

(Illustration, built up over three slides: a missing "?" entry of the users × items rating matrix is estimated as 2.4, the inner product of the corresponding user-factor and item-factor vectors from the rank-3 SVD approximation.)

Latent factor models

Properties:

  • SVD isn’t defined when entries are unknown ⇒ use specialized methods
  • Very powerful model ⇒ can easily overfit; sensitive to regularization
  • Probably most popular model among contestants
    • 12/11/2006: Simon Funk describes an SVD based method
    • 12/29/2006: Free implementation at timelydevelopment.com
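A rough sketch of that style of SGD-trained factorization, in the spirit of Simon Funk's post rather than his exact code; all hyperparameters and the toy data are illustrative:

```python
import random

def sgd_factorize(ratings, n_users, n_items, k=3, lr=0.01, reg=0.02, epochs=100):
    """Learn user factors p_u and item factors q_i so that p_u . q_i ~ r_ui,
    by stochastic gradient descent with L2 regularization (which this very
    powerful model needs to avoid overfitting)."""
    random.seed(0)
    p = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(p[u], q[i]))
            for f in range(k):
                pf, qf = p[u][f], q[i][f]
                p[u][f] += lr * (err * qf - reg * pf)
                q[i][f] += lr * (err * pf - reg * qf)
    return p, q

# Toy (user, item, rating) triples, nothing like the real Netflix data:
triples = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)]
p, q = sgd_factorize(triples, n_users=2, n_items=2, k=2)
```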


Factorization on the RMSE scale

RMSE scale, from erroneous to accurate:
  • Global average: 1.1296
  • User average: 1.0651
  • Movie average: 1.0533
  • Cinematch: 0.9514
  • Factorization: 0.93 to 0.89
  • BellKor: 0.8693
  • Grand Prize: 0.8563
  • Inherent noise: ????

Our approach
  • User factors: Model a user u as a vector pu ~ Nk(μ, Σ)
  • Movie factors: Model a movie i as a vector qi ~ Nk(γ, Λ)
  • Ratings: Measure “agreement” between u and i: rui ~ N(puTqi, ε2)
  • Maximize model’s likelihood:
    • Alternate between recomputing user-factors, movie-factors and model parameters
    • Special cases:
      • Alternating Ridge regression
      • Nonnegative matrix factorization
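One way the "alternating ridge regression" special case can be sketched: with the movie factors held fixed, each user's factor vector is an independent ridge regression over that user's observed ratings. This is a simplified illustration under assumed names, not the BellKor implementation:

```python
import numpy as np

def update_user_factors(R, mask, Q, lam=0.1):
    """One alternating step: solve a ridge regression per user, with the
    movie-factor matrix Q (n_items x k) held fixed.
    R: (n_users, n_items) ratings; mask marks which entries are observed."""
    n_users, k = R.shape[0], Q.shape[1]
    P = np.zeros((n_users, k))
    for u in range(n_users):
        rated = mask[u] > 0
        Qu = Q[rated]                        # factors of the movies u rated
        A = Qu.T @ Qu + lam * np.eye(k)      # ridge normal equations
        P[u] = np.linalg.solve(A, Qu.T @ R[u, rated])
    return P

# Toy example: 3 users x 4 movies, 0 marking a missing rating.
R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0]])
mask = (R > 0).astype(int)
rng = np.random.default_rng(0)
Q = rng.normal(scale=0.1, size=(4, 2))
P = update_user_factors(R, mask, Q)
```

Alternating such steps between user and movie factors recomputes each side in closed form, which is what makes the ridge-regression special case attractive.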
Combining multi-scale views

(Diagram: the scales are combined either by residual fitting (global effects, then factorization for regional effects, then k-NN for local effects) or by a weighted average; factorization and k-NN can also be merged into a unified model.)

Localized factorization model
  • Standard factorization: user u is a linear function parameterized by pu: rui = puTqi
  • Allow the user factors pu to depend on the item being predicted: rui = pu(i)Tqi
  • Vector pu(i) models behavior of u on items like i
slide47

Seek alternative perspectives of the data

  • Can exploit movie titles and release year
  • But the movie side is pretty much covered anyway...
  • It’s about the users!
  • Turning to the third dimension...

(Diagram axes: global, local, ???, quality.)

slide48

The third dimension of the BellKor system

  • A powerful source of information:Characterize users by which movies they rated, rather than how they rated
  • ⇒ A binary representation of the data

(Illustration: the users × movies rating matrix alongside its binary who-rated-what counterpart.)
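The binary view is a one-liner to construct; a sketch assuming 0 encodes "no rating" (toy matrix, not the real data):

```python
import numpy as np

# Ratings matrix: rows are users, columns are movies, 0 means "not rated".
ratings = np.array([[5, 0, 3],
                    [0, 0, 4],
                    [1, 2, 0]])

# Binary "who rated what" view: keep only whether a rating exists.
binary = (ratings > 0).astype(int)
print(binary.tolist())  # [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
```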

slide49

The third dimension of the BellKor system

Great news for recommender systems:

  • Works even better on real life datasets (?)
  • Improve accuracy by exploiting implicit feedback
  • Implicit behavior is abundant and easy to collect:
    • Rental history
    • Search patterns
    • Browsing history
    • ...
  • Implicit feedback allows predicting personalized ratings for users that never rated!

movie #17270

slide50

The three dimensions of the BellKor system

Where do you want to be?

  • All over the global-local axis
  • Relatively high on the quality axis
  • Can we go in between on the explicit-implicit axis?
  • Yes! Relevant methods:
    • Conditional RBMs [ML@UToronto]
    • NSVD [Arek Paterek]
    • Asymmetric factor models

(Diagram axes: global, local; explicit (ratings), implicit (binary); quality.)

Lessons

What it takes to win:

  • Think deeper – design better algorithms
  • Think broader – use an ensemble of multiple predictors
  • Think different – model the data from different perspectives

At the personal level:

  • Have fun with the data
  • Work hard, with staying power
  • Good teammates

Rapid progress of science:

  • Availability of large, real life data
  • Challenge, competition
  • Effective collaboration

movie #13043

slide52

Yehuda Koren

AT&T Labs – Research

yehuda@att.com

BellKor homepage: www.research.att.com/~volinsky/netflix/