Collaborative Filtering - PowerPoint PPT Presentation

Presentation Transcript

  1. Collaborative Filtering. Rong Jin, Dept. of Computer Science and Engineering, Michigan State University

  2. Information Filtering • Basic filtering question: will user U like item X? • Two different ways of answering it: • Look at what U likes → characterize X → content-based filtering • Look at who likes X → characterize U → collaborative filtering

  3. Collaborative Filtering (Resnick et al., 1994) Make recommendation decisions for a specific user based on the judgments of users with similar interests.

  5. A General Strategy (Resnick et al., 1994) • Identify the training users that share similar interests with the test user. • Predict the ratings of the test user as the average of the ratings by the training users with similar interests (illustrated in the sketch below).
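
As a concrete rendering of this strategy, the sketch below measures similarity with the Pearson correlation over co-rated items, as in Resnick et al. (1994), and predicts with a similarity-weighted average of the nearest neighbors' ratings (a common refinement of the plain average). All names are invented for illustration:

    import numpy as np

    def pearson_sim(ra, rb):
        """Pearson correlation between two users over co-rated items.
        ra, rb: 1-D rating arrays with np.nan marking unrated items."""
        mask = ~np.isnan(ra) & ~np.isnan(rb)
        if mask.sum() < 2:
            return 0.0
        a = ra[mask] - ra[mask].mean()
        b = rb[mask] - rb[mask].mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        return float(a @ b / denom) if denom > 0 else 0.0

    def predict_rating(ratings, test_user, item, k=10):
        """Similarity-weighted average of the k most similar users' ratings."""
        sims = []
        for u in range(ratings.shape[0]):
            if u != test_user and not np.isnan(ratings[u, item]):
                sims.append((pearson_sim(ratings[test_user], ratings[u]), u))
        sims.sort(reverse=True)                  # most similar users first
        top = sims[:k]
        num = sum(s * ratings[u, item] for s, u in top)
        den = sum(abs(s) for s, _ in top)
        return num / den if den > 0 else np.nan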

  7. Important Problems in Collaborative Filtering • How to estimate users’ similarity when rating data is sparse? • Most users rate only a few items • How to identify the interests of a test user who has rated only a few items? • Most users are too impatient to rate many items • How to combine collaborative filtering with content filtering? • For movie ratings, both content information and user ratings are available

  8. Problem I: How to Estimate Users’ Similarity Based on Sparse Rating Data?

  9. Sparse Data Problem (Breese et al., 1998) Most users rate only a small number of items, leaving most items unrated.

  11. Flexible Mixture Model (FMM) (Si & Jin, 2003) • Cluster training users of similar interests • Cluster items with similar ratings

  13. Flexible Mixture Model (FMM) (Si & Jin, 2003) [Figure: rating matrix with items grouped into Movie Type I, Movie Type II, and Movie Type III] • Introduce rating uncertainty • Unknown ratings are gone! • Cluster both users and items simultaneously

  14. Flexible Mixture Model (FMM) (Si & Jin, 2003) [Graphical model: cluster variables Zu (user class) and Zo (item class); observed variables U (user), O (item), R (rating)] An Expectation-Maximization (EM) algorithm can be used to identify the clustering structure for both users and items.
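
A minimal sketch of how that EM fit might look, assuming the factorization p(u,o,r) = Σzu Σzo p(zu) p(zo) p(u|zu) p(o|zo) p(r|zu,zo) suggested by the graphical model, with one latent class pair per observed rating; all function and variable names are illustrative, not from the paper:

    import numpy as np

    def fmm_em(triples, n_users, n_items, n_ratings, m=4, n=4, iters=50, seed=0):
        """EM for a Flexible-Mixture-style model.
        triples: list of (user, item, rating) with ratings in 0..n_ratings-1.
        m, n: number of user classes (Zu) and item classes (Zo)."""
        rng = np.random.default_rng(seed)
        p_zu = np.full(m, 1.0 / m)                            # p(Zu)
        p_zo = np.full(n, 1.0 / n)                            # p(Zo)
        p_u = rng.dirichlet(np.ones(n_users), size=m)         # p(U|Zu), (m, n_users)
        p_o = rng.dirichlet(np.ones(n_items), size=n)         # p(O|Zo), (n, n_items)
        p_r = rng.dirichlet(np.ones(n_ratings), size=(m, n))  # p(R|Zu,Zo)

        for _ in range(iters):
            cu = np.zeros((m, n_users))
            co = np.zeros((n, n_items))
            cr = np.zeros((m, n, n_ratings))
            for u, o, r in triples:
                # E-step: posterior q(zu, zo | u, o, r) for this triple
                q = (p_zu[:, None] * p_zo[None, :]
                     * p_u[:, u][:, None] * p_o[:, o][None, :]
                     * p_r[:, :, r])
                q /= q.sum()
                # accumulate expected counts for the M-step
                cu[:, u] += q.sum(axis=1)
                co[:, o] += q.sum(axis=0)
                cr[:, :, r] += q
            # M-step: re-estimate all tables from soft counts (tiny smoothing)
            cu += 1e-12; co += 1e-12; cr += 1e-12
            p_zu = cu.sum(axis=1) / cu.sum()
            p_zo = co.sum(axis=1) / co.sum()
            p_u = cu / cu.sum(axis=1, keepdims=True)
            p_o = co / co.sum(axis=1, keepdims=True)
            p_r = cr / cr.sum(axis=2, keepdims=True)
        return p_zu, p_zo, p_u, p_o, p_r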

  15. Rating Variance (Jin et al., 2003a) • The Flexible Mixture Model is based on the assumption that users of similar interests will have similar ratings for the same items • But different users of similar interests may have different rating habits

  21. Decoupling Model (DM) (Jin et al., 2003b) [Graphical model: hidden variables Zu (user class), Zo (item class), Zpref (whether users like items), and ZR (rating class); observed variables U (user), O (item), R (rating)]

  22. Empirical Studies • EachMovie dataset: • 2,000 users and 1,682 movie items • Avg. # of rated items per user: 130 • Rating range: 0-5 • Evaluation protocol: • 400 training users and 1,600 testing users • Number of items rated by a test user: 5, 10, or 20 • Evaluation metric: MAE • MAE: mean absolute error between true ratings and predicted ratings • The smaller the MAE, the better the performance
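
MAE, as used throughout these experiments, is just the average absolute gap between predicted and true ratings; a minimal illustration (function name is mine):

    import numpy as np

    def mae(predicted, actual):
        """Mean absolute error between predicted and true ratings."""
        return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(actual))))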

  23. Baseline Approaches • Ignore unknown ratings: • Vector similarity (Breese et al., 1998) • Fill in unknown ratings for individual users with their average ratings: • Personality diagnosis (Pennock et al., 2000) • Pearson correlation coefficient (Resnick et al., 1994) • Only cluster users: • Aspect model (Hofmann & Puzicha, 1999)
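
One plausible reading of the vector-similarity baseline, given that the slide says unknown ratings are simply ignored, is plain cosine similarity over co-rated items; this sketch and its names are illustrative:

    import numpy as np

    def vector_sim(ra, rb):
        """Cosine similarity over items rated by both users (np.nan = unrated)."""
        mask = ~np.isnan(ra) & ~np.isnan(rb)
        a, b = ra[mask], rb[mask]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom > 0 else 0.0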

  24. Experimental Results

  25. Summary • The sparse data problem is important in collaborative filtering • The Flexible Mixture Model (FMM) is effective: it clusters both users and items simultaneously • The Decoupling Model (DM) provides additional improvement by taking into account rating variance among users of similar interests

  26. Problem II: How to Identify Users’ Interests Based on a Few Rated Items?

  27. Identify Users’ Interests • To identify the interests of a user, the system needs to ask the user to rate a few items • Given that a user is only willing to rate a few items, which ones should be selected to solicit ratings?

  32. Active Learning Approaches (Ross & Zemel, 2002) • Selective sampling: ask the user to rate the items that are most informative about his or her interests • A general strategy (sketched below): • Define a loss function that represents the uncertainty in determining the user’s interests • Choose the item whose rating will result in the largest reduction in the loss function
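
A model-agnostic sketch of that general strategy, using the entropy of the user-class posterior as an example loss; the posterior update (update) and predicted rating probabilities (rating_prob) are assumed callables, and all names are illustrative:

    import numpy as np

    def entropy(p):
        """Uncertainty of a distribution over user classes."""
        p = np.clip(p, 1e-12, 1.0)
        return float(-(p * np.log(p)).sum())

    def select_item(candidates, update, rating_prob):
        """Pick the item whose rating is expected to shrink the loss most
        (minimizing expected loss = maximizing expected loss reduction).
        update(x, r) -> posterior p(z | D + (x, r));
        rating_prob(x, r) -> predicted probability the user rates x as r."""
        best, best_loss = None, np.inf
        for x in candidates:
            exp_loss = sum(rating_prob(x, r) * entropy(update(x, r))
                           for r in range(6))   # ratings 0..5, as in EachMovie
            if exp_loss < best_loss:
                best, best_loss = x, exp_loss
        return best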

  33. Active Learning Approach (I)(Jin & Si, 2004) • Select the items that have the largest variance in the ratings by the most similar users
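
Approach I boils down to a variance computation over the neighbors' ratings; a minimal sketch, with all names (ratings, neighbor_ids, candidates) invented for illustration:

    import numpy as np

    def pick_high_variance_item(ratings, neighbor_ids, candidates):
        """Among candidate items, pick the one whose ratings by the most
        similar users (neighbor_ids) vary the most (np.nan = unrated)."""
        def var(item):
            r = ratings[neighbor_ids, item]
            r = r[~np.isnan(r)]
            return r.var() if r.size > 1 else -np.inf
        return max(candidates, key=var)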

  34. Active Learning Approach (II) (Jin & Si, 2004) • Consider all the training users when selecting items • Weight training users by their similarities when computing the “uncertainty” of items

  35. A Bayesian Approach for Active Learning (Jin & Si, 2004) • Flexible Mixture Model: the key is to determine the user class for the test user • Let D be the ratings already provided by test user y: D = {(x1, r1), …, (xk, rk)} • Let θ be the distribution of user classes for test user y estimated from D: θ = {θz = p(z|y) | 1 ≤ z ≤ m}

  38. A Bayesian Approach for Active Learning (Jin & Si, 2004) • Let θx,r be the distribution of user classes for test user y estimated from D + (x, r) • If the true user class distribution θtrue of the test user were given, we would select the item x* that minimizes the expected loss incurred by θx,r relative to θtrue • Take into account the uncertainty in rating prediction

  39. A Bayesian Approach for Active Learning (Jin & Si, 2004) • Two types of uncertainties: • Uncertainty in the user class distribution θ • Uncertainty in rating prediction • But in reality we never know the true user class distribution θtrue of the test user • Replace θtrue with the posterior distribution p(θ|D)
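
Combining the two sources of uncertainty, the selection criterion takes roughly the following shape, where L is the loss function from slide 32; this is a hedged reconstruction from the surrounding slides, since the transcript does not preserve the original formula:

    x^{*} \;=\; \arg\min_{x}\; \mathbb{E}_{\theta \sim p(\theta \mid D)}
        \!\left[\; \sum_{r} p(r \mid x, \theta)\, L\!\left(\theta_{x,r}\right) \right]

The outer expectation captures the uncertainty in the user class distribution; the inner sum over r captures the uncertainty in rating prediction.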

  40. Computational Issues • Estimating p(θ|D) is computationally expensive • Calculating the expectation is also expensive

  41. Approximate Posterior Distribution (Jin & Si, 2004) • Approximate p(θ|D) by the Laplace approximation • Expand the log-likelihood function around its maximum point θ*
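
In generic form (a standard statement of the Laplace approximation, not copied from the paper), expanding the log-posterior to second order around its mode θ* gives a Gaussian:

    \log p(\theta \mid D) \;\approx\; \log p(\theta^{*} \mid D)
        \;-\; \tfrac{1}{2}\, (\theta - \theta^{*})^{\top} H\, (\theta - \theta^{*}),
    \qquad H \;=\; -\,\nabla^{2} \log p(\theta \mid D)\Big|_{\theta = \theta^{*}}

    \Rightarrow\quad p(\theta \mid D) \;\approx\; \mathcal{N}\!\left(\theta;\; \theta^{*},\; H^{-1}\right)

Under this Gaussian, the expectation on the next slide becomes computable in closed form.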

  42. Compute Expectation (Jin & Si, 2004) • The expectation can be computed analytically using the approximate posterior distribution p(θ|D)

  43. Empirical Studies • EachMovie dataset • 400 training users and 1,600 test users • Each test user initially provides 3 rated items • 5 iterations, with 4 items selected in each iteration • Evaluation metric: Mean Absolute Error (MAE)

  44. Baseline Approaches • The random selection method: randomly select 4 items in each iteration • The model entropy method: select items that result in the largest reduction in the entropy of the distribution p(θ|D) • Considers only the uncertainty in the model distribution • The prediction entropy method: select items that result in the largest reduction in the uncertainty of rating prediction • Considers only the uncertainty in rating prediction

  45. Experimental Results

  46. Summary • Active learning is effective for identifying users’ interests • It is important to take into account every source of uncertainty when applying active learning methods

  47. Problem III: How to Combine Collaborative Filtering with Content Filtering?

  48. Collaborative Filtering + Content Info.

  50. Linear Combination (Good et al., 1999) • Build separate prediction models from the content information and the collaborative information • Linearly combine their predictions (see the sketch below)
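
A minimal sketch of such a linear combination, with the mixing weight alpha as an assumed hyperparameter that would typically be tuned on held-out ratings:

    def combined_prediction(r_content, r_collab, alpha=0.5):
        """Linear blend of a content-based and a collaborative prediction.
        alpha = 1.0 trusts content only; alpha = 0.0 trusts collaboration only."""
        return alpha * r_content + (1 - alpha) * r_collab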