Collaborative Filtering

Collaborative Filtering. Rong Jin, Dept. of Computer Science and Engineering, Michigan State University. Information Filtering. Basic filtering question: Will user U like item X? Two different ways of answering it: look at what U likes → characterize X → content-based filtering

Presentation Transcript


  1. Collaborative Filtering Rong Jin Dept. of Computer Science and Engineering Michigan State University

  2. Information Filtering • Basic filtering question: Will user U like item X? • Two different ways of answering it • Look at what U likes → characterize X → content-based filtering • Look at who likes X → characterize U → collaborative filtering

  3. Collaborative Filtering (Resnick et al., 1994) Make recommendation decisions for a specific user based on the judgments of users with similar interests.

  5. A General Strategy (Resnick et al., 1994) • Identify the training users that share similar interests with the test user. • Predict the ratings of the test user as the average of the ratings given by the training users with similar interests.
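
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method from the cited paper: it assumes each user is a dict mapping item to rating, uses Pearson correlation over co-rated items as the similarity, and takes a similarity-weighted, mean-centred average rather than the plain average named on the slide. The function names `pearson` and `predict` are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict(test_user, train_users, item):
    """Predict the test user's rating of `item` from similar training users.

    Assumption: deviations from each training user's mean rating are
    averaged with Pearson weights, then added to the test user's mean.
    """
    test_mean = sum(test_user.values()) / len(test_user)
    num = den = 0.0
    for ratings in train_users:
        if item not in ratings:
            continue  # this training user gives no evidence about the item
        w = pearson(test_user, ratings)
        mean = sum(ratings.values()) / len(ratings)
        num += w * (ratings[item] - mean)
        den += abs(w)
    return test_mean if den == 0 else test_mean + num / den
```

Falling back to the test user's own mean when no neighbor has rated the item is a design choice of this sketch; with sparse data that fallback is hit often, which is the problem the following slides address.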

  7. Important Problems in Collaborative Filtering • How to estimate users’ similarity if rating data is sparse? • Most users only rate a few items • How to identify the interests of a test user if he/she only provides ratings for a few items? • Most users are too impatient to rate many items • How to combine collaborative filtering with content filtering? • For movie ratings, both the content information and the user ratings are available

  8. Problem I: How to Estimate Users’ Similarity based on Sparse Rating Data?

  9. Sparse Data Problem (Breese et al., 1998) Most users only rate a small number of items and leave most items unrated

  10. Flexible Mixture Model (FMM) (Si & Jin, 2003) • Cluster training users of similar interests

  11. Flexible Mixture Model (FMM) (Si & Jin, 2003) • Cluster training users of similar interests • Cluster items with similar ratings

  12. Flexible Mixture Model (FMM) (Si & Jin, 2003) • [Figure labels: Movie Type I, Movie Type II, Movie Type III] • Unknown ratings are gone!

  13. Flexible Mixture Model (FMM) (Si & Jin, 2003) • [Figure labels: Movie Type I, Movie Type II, Movie Type III] • Introduce rating uncertainty • Unknown ratings are gone! • Cluster both users and items simultaneously

  14. Flexible Mixture Model (FMM) (Si & Jin, 2003) • Zu: user class; Zo: item class; U: user; O: item; R: rating • Cluster variables: Zu, Zo • Observed variables: U, O, R • An Expectation Maximization (EM) algorithm can be used to identify the clustering structure of both users and items
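
For reference, the variables listed on this slide suggest the following factorization of the joint probability. It is written here as a sketch reconstructed from the variable definitions (Zu generates U, Zo generates O, and the rating R depends on both classes); the lowercase symbols are notation introduced for this note, not taken from the slide.

```latex
P(u, o, r) \;=\; \sum_{z_u} \sum_{z_o}
    P(z_u)\, P(z_o)\, P(u \mid z_u)\, P(o \mid z_o)\, P(r \mid z_u, z_o)
```

Under this reading, EM alternates between computing the posterior over the pair $(z_u, z_o)$ for each observed triple $(u, o, r)$ and re-estimating the five factors from those posteriors.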

  15. Rating Variance (Jin et al., 2003a) • The Flexible Mixture Model is based on the assumption that users of similar interests will have similar ratings for the same items • But, different users of similar interests may have different rating habits

  18. Decoupling Model (DM) (Jin et al., 2003b) • Zu: user class; Zo: item class; U: user; O: item; R: rating • Hidden variables: Zu, Zo • Observed variables: U, O, R

  19. Decoupling Model (DM) (Jin et al., 2003b) • Zu: user class; Zo: item class; U: user; O: item; R: rating • Zpref: whether users like items • Hidden variables: Zu, Zo, Zpref • Observed variables: U, O, R

  20. Decoupling Model (DM) (Jin et al., 2003b) • Zu: user class; Zo: item class; U: user; O: item; R: rating • Zpref: whether users like items • ZR: rating class • Hidden variables: Zu, Zo, Zpref, ZR • Observed variables: U, O, R

  22. Empirical Studies • EachMovie dataset: • 2000 users and 1682 movie items • Avg. # of rated items per user is 130 • Rating range: 0-5 • Evaluation protocol • 400 training users, and 1600 testing users • Numbers of items rated by a test user: 5, 10, 20 • Evaluation metric: MAE • MAE: mean absolute error between true ratings and predicted ratings • The smaller the MAE, the better the performance
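
The MAE metric named on this slide can be written directly. The sketch below assumes the true and predicted ratings arrive as two equal-length sequences in the same order; the function name is hypothetical.

```python
def mean_absolute_error(true_ratings, predicted_ratings):
    """Mean absolute difference between true and predicted ratings."""
    if not true_ratings or len(true_ratings) != len(predicted_ratings):
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ratings, predicted_ratings))
    return total / len(true_ratings)
```

For example, true ratings `[4, 2, 5]` against predictions `[3, 2, 3]` give absolute errors 1, 0, and 2, so the MAE is 1.0.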

  23. Baseline Approaches • Ignore unknown ratings • Vector similarity (Breese et al., 1998) • Fill in unknown ratings for individual users with their average ratings • Personality diagnosis (Pennock et al., 2000) • Pearson correlation coefficient (Resnick et al., 1994) • Only cluster users • Aspect model (Hofmann & Puzicha, 1999)
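
As an illustration of the first baseline, here is a minimal sketch of vector (cosine) similarity that ignores unknown ratings. It assumes each user is a dict mapping item to rating, sums the dot product over co-rated items only, and normalizes by each user's full rating vector; the exact normalization used in the experiments is not given on the slide, so treat this as one common variant.

```python
from math import sqrt

def vector_similarity(a, b):
    """Cosine similarity between two sparse rating dicts.

    Unrated items contribute nothing to the dot product, which is
    equivalent to treating unknown ratings as zero.
    """
    dot = sum(a[i] * b[i] for i in a if i in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two users with no co-rated items get similarity 0.0, which shows why this baseline degrades when rating data is sparse.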

  24. Experimental Results

  25. Summary • The sparse data problem is important to collaborative filtering • Flexible Mixture Model (FMM) is effective • Cluster both users and items simultaneously • Decoupling Model (DM) provides additional improvement for collaborative filtering • Take into account rating variance among users of similar interests

  26. Problem II: How to Identify Users’ Interests Based on a Few Rated Items?

  27. Identify Users’ Interests • To identify the interests of a user, the system needs to ask the user to rate a few items • Given that a user is willing to rate only a few items, which items should the system ask the user to rate?

  32. Active Learning Approaches (Ross & Zemel, 2002) • Selective sampling • Ask a user to rate the items that best distinguish between users’ interests • A general strategy • Define a loss function that represents the uncertainty in determining users’ interests • Choose the item whose rating will result in the largest reduction in the loss function

  33. Active Learning Approach (I) (Jin & Si, 2004) • Select the items that have the largest variance in the ratings by the most similar users
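
A minimal sketch of this selection rule, assuming the most similar users have already been found and are passed in as a list of item-to-rating dicts. The function name and the requirement of at least two ratings per item are assumptions of this sketch.

```python
from statistics import pvariance

def select_by_neighbor_variance(neighbors, candidates):
    """Return the candidate item whose ratings vary most among the neighbors."""
    best_item, best_var = None, -1.0
    for item in candidates:
        ratings = [r[item] for r in neighbors if item in r]
        if len(ratings) < 2:
            continue  # variance is not informative with fewer than two ratings
        var = pvariance(ratings)
        if var > best_var:
            best_item, best_var = item, var
    return best_item
```

The intuition is that an item on which similar users disagree tells the system more about the test user than an item everyone rates the same.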

  34. Active Learning Approach (II) (Jin & Si, 2004) • Consider all the training users when selecting items • Weight training users by their similarities when computing the “uncertainty” of items
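
The weighted variant can be sketched as follows. It assumes a non-negative similarity weight per training user (how those similarities are computed is not specified on the slide) and uses a weighted variance as the "uncertainty" of an item; both function names are hypothetical.

```python
def weighted_variance(values, weights):
    """Variance of `values` under non-negative `weights` (0.0 if weights sum to 0)."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    mean = sum(w * v for v, w in zip(values, weights)) / total
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total

def select_by_weighted_variance(train_users, similarities, candidates):
    """Pick the candidate item with the largest similarity-weighted rating variance."""
    def uncertainty(item):
        pairs = [(u[item], s) for u, s in zip(train_users, similarities) if item in u]
        if not pairs:
            return 0.0
        values, weights = zip(*pairs)
        return weighted_variance(values, weights)
    return max(candidates, key=uncertainty)
```

Setting a user's weight to zero removes that user from the computation, so Approach (I) is roughly the special case where only the nearest neighbors get non-zero weight.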

  35. A Bayesian Approach for Active Learning (Jin & Si, 2004) • Flexible Mixture Model • Key is to determine the user class for a test user • Let D be the ratings already provided by test user y • D = {(x1, r1), …, (xk, rk)} • Let θ be the distribution of user class for test user y estimated based on D • θ = {θz = p(z|y) | 1 ≤ z ≤ m}
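
One plausible way to estimate θ from D is maximum likelihood with EM over the mixture weights, holding the trained model fixed. The slide does not say how θ is estimated, so the sketch below is an assumption: it takes a hypothetical table `likelihoods[i][z]` = p(r_i | x_i, z) that a trained model would supply, and maximizes the product over i of Σ_z θ_z p(r_i | x_i, z).

```python
def estimate_theta(likelihoods, iterations=50):
    """EM for user-class weights theta given per-rating class likelihoods.

    likelihoods[i][z] is assumed to hold p(r_i | x_i, z) for the i-th
    rated item; these values are inputs, not computed here.
    """
    m = len(likelihoods[0])
    theta = [1.0 / m] * m  # start from the uniform distribution
    for _ in range(iterations):
        counts = [0.0] * m
        for lik in likelihoods:
            joint = [theta[z] * lik[z] for z in range(m)]
            total = sum(joint)
            if total == 0:
                raise ValueError("a rating has zero likelihood under every class")
            for z in range(m):
                counts[z] += joint[z] / total  # E-step: class responsibility
        theta = [c / len(likelihoods) for c in counts]  # M-step: average
    return theta
```

With only a handful of ratings in D this estimate is very uncertain, which is what motivates treating θ with a distribution in the slides that follow.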

  36. A Bayesian Approach for Active Learning (Jin & Si, 2004) • When the user class distribution θtrue of the test user is given, we will select the item x* that

  37. A Bayesian Approach for Active Learning (Jin & Si, 2004) • When the user class distribution θtrue of the test user is given, we will select the item x* that • Let θx,r be the distribution of user class for test user y estimated based on D + (x,r)

  38. A Bayesian Approach for Active Learning (Jin & Si, 2004) • When the user class distribution θtrue of the test user is given, we will select the item x* that • Let θx,r be the distribution of user class for test user y estimated based on D + (x,r) • Take into account the uncertainty in rating prediction

  39. A Bayesian Approach for Active Learning (Jin & Si, 2004) • But, in reality, we never know the true user class distribution θtrue of the test user • Replace θtrue with the distribution p(θ|D) • Two types of uncertainties • Uncertainty in user class distribution θ • Uncertainty in rating prediction

  40. Computational Issues • Estimating p(θ|D) is computationally expensive • Calculating the expectation is also expensive

  41. Approximate Posterior Distribution (Jin & Si, 2004) • Approximate p(θ|D) by Laplace approximation • Expand the log-likelihood function around its maximum point θ*
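
In its generic textbook form, the expansion mentioned on this slide looks as follows. This is a sketch of the standard Laplace approximation under a flat prior, not the exact parameterization used in the cited paper; $H$ is notation introduced here.

```latex
\log p(D \mid \theta) \;\approx\; \log p(D \mid \theta^{*})
    \;-\; \tfrac{1}{2}\,(\theta - \theta^{*})^{\top} H \,(\theta - \theta^{*}),
\qquad
H \;=\; -\,\nabla^{2}_{\theta} \log p(D \mid \theta)\,\Big|_{\theta = \theta^{*}}
```

The first-order term vanishes because $\theta^{*}$ is a maximum, so exponentiating gives a posterior that is approximately Gaussian around $\theta^{*}$ with covariance $H^{-1}$, which is cheap to evaluate compared with the exact $p(\theta \mid D)$.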

  42. Compute Expectation (Jin & Si, 2004) • Expectation can be computed analytically using the approximate posterior distribution p(θ|D)

  43. Empirical Studies • EachMovie dataset • 400 training users, and 1600 test users • For each test user • Initially provides 3 rated items • 5 iterations, and 4 items are selected for each iteration • Evaluation metric • Mean Absolute Error (MAE)

  44. Baseline Approaches • The random selection method • Randomly select 4 items for each iteration • The model entropy method • Select items that result in the largest reduction in the entropy of distribution p(θ|D) • Only considers the uncertainty in model distribution • The prediction entropy method • Select items that result in the largest reduction in the uncertainty of rating prediction • Only considers the uncertainty in rating prediction
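
The entropy-based baselines can be illustrated with a simplified sketch. It measures the entropy of a point estimate of the user class distribution rather than of the full posterior p(θ|D), and it assumes two hypothetical inputs per candidate item: the predicted rating distribution p(r | x) and the user class distribution re-estimated from D + (x, r) for each possible rating r. Since the current entropy is the same for every candidate, the largest expected reduction corresponds to the smallest expected entropy after the rating.

```python
from math import log

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * log(p) for p in dist if p > 0)

def expected_entropy_after(rating_probs, updated_thetas):
    """Expected entropy of the user class distribution after asking one item.

    rating_probs[r]   : assumed predicted probability p(r | x)
    updated_thetas[r] : assumed class distribution re-estimated from D + (x, r)
    """
    return sum(p * entropy(updated_thetas[r]) for r, p in rating_probs.items())

def select_by_model_entropy(candidates):
    """candidates maps item -> (rating_probs, updated_thetas)."""
    return min(candidates, key=lambda x: expected_entropy_after(*candidates[x]))
```

The prediction entropy baseline follows the same pattern with the entropy taken over predicted rating distributions instead of the user class distribution.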

  45. Experimental Results

  46. Summary • Active learning is effective for identifying users’ interests • It is important to take into account both types of uncertainty (in the user class distribution and in the rating prediction) when applying active learning methods

  47. Problem III: How to Combine Collaborative Filtering with Content Filtering?

  48. Collaborative Filtering + Content Info.

  50. Linear Combination (Good et al., 1999) • Build separate prediction models for content information and collaborative information • Linearly combine their predictions
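
A linear combination of the two predictions can be as simple as the sketch below. The mixing weight `alpha` and its restriction to the range 0 to 1 are assumptions of this sketch; the slide does not say how the weights are chosen.

```python
def combine(content_pred, collab_pred, alpha=0.5):
    """Blend a content-based and a collaborative prediction.

    alpha is a hypothetical weight on the content-based prediction; in
    practice it would be tuned on held-out ratings.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * content_pred + (1.0 - alpha) * collab_pred
```

For instance, `combine(4.0, 2.0, alpha=0.25)` weights the collaborative prediction three times as heavily as the content-based one and returns 2.5.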
