
CSE 482


Presentation Transcript


  1. CSE 482 Lecture 15 (Collaborative Filtering)

  2. Outline • What is a recommender system? • What is collaborative filtering? • What are the collaborative filtering techniques?

  3. Information Overload

  4. Recommender Systems • Automated systems that make recommendations based on the preferences of users • Motivation from the user’s perspective • Lots of online products, books, movies, etc. • Help me narrow the choices available… • Motivation from the business’ perspective: “If I have 3 million customers on the web, I should have 3 million stores on the web.” (Jeff Bezos, CEO of Amazon.com)

  5. Book Recommendation

  6. Movie Recommendation

  7. Collaborative Filtering • The technology behind most recommender systems • The process of filtering information by soliciting judgments from others to overcome the information overload problem • "Based on the premise that people looking for information should be able to make use of what others have already found and evaluated." (Maltz & Ehrlich, 1995)

  8. Another Application: Netflix $1M Prize • Task: given customer ratings on some movies, predict customer ratings on other movies • If John rates “Mission Impossible” a 5, “Over the Hedge” a 3, and “Back to the Future” a 4, how would he rate “Harry Potter”, …?

  9. Collaborative Filtering • Collaborative filtering techniques are used to predict how well a user will like an item that he/she has not rated, given a set of historical preference judgments for a community of users

  10. Technique: Nearest Neighbor • User-Based Nearest Neighbor • Given a user u, generate a prediction for an item i by using the ratings for i from users in u’s neighborhood • Need to define similarity measure and neighborhood size

  11. Technique: Nearest Neighbor • User-Based Nearest Neighbor • Given a user u, generate a prediction for an item i by using the ratings for i from users in u’s neighborhood • Neighbor = users with similar interests • pred(u, i) = r̄(u) + [ Σn sim(u, n) × (r(n, i) − r̄(n)) ] / Σn |sim(u, n)|, where r̄(u) is the average rating of user u and r̄(n) is the average rating of neighbor n
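The user-based prediction can be sketched in plain Python. The dict-of-dicts data layout and all function/variable names below are illustrative, not from the lecture; `sim[u][n]` is assumed to be a precomputed user-user similarity:

```python
def mean_rating(user_ratings):
    """Average of one user's ratings (dict: item -> rating)."""
    return sum(user_ratings.values()) / len(user_ratings)

def predict_user_based(ratings, sim, u, i, neighbors):
    """User-based CF: start from u's average rating, then add the
    similarity-weighted, mean-centered ratings for item i from u's
    neighbors who actually rated i."""
    r_u = mean_rating(ratings[u])
    num = den = 0.0
    for n in neighbors:
        if i in ratings[n]:
            num += sim[u][n] * (ratings[n][i] - mean_rating(ratings[n]))
            den += abs(sim[u][n])
    return r_u if den == 0 else r_u + num / den
```

Mean-centering compensates for users who rate systematically high or low; if no neighbor has rated i, the sketch falls back to u's own average.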

  12. Technique: Nearest Neighbor • Item-Based Nearest Neighbor • Given a user u, generate a prediction for an item i by using a weighted sum of the user u’s ratings for items that are most similar to i.

  13. Technique: Nearest Neighbor • Item-Based Nearest Neighbor • pred(u, i) = Σj sim(i, j) × r(u, j) / Σj |sim(i, j)|, where the sum runs over the items j rated by user u that are most similar to i
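The item-based variant can be sketched the same way; the data layout and names are again illustrative, with `item_sim[i][j]` a precomputed item-item similarity:

```python
def predict_item_based(user_ratings, item_sim, i, k=2):
    """Item-based CF: weighted average of the user's own ratings on the
    k items most similar to the target item i. `user_ratings` maps
    item -> this user's rating."""
    scored = [(item_sim[i][j], r) for j, r in user_ratings.items()
              if j in item_sim.get(i, {})]
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:k]
    den = sum(abs(s) for s, _ in top)
    return 0.0 if den == 0 else sum(s * r for s, r in top) / den
```

Note the contrast with the user-based version: here only one user's ratings are consulted, which is why item-item similarities can be precomputed offline.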

  14. Similarity Measure • Numerical measure of how alike two data instances are. • Higher when the instances are more alike • Examples of similarity measures

  15. Jaccard Similarity • Let x and y be a pair of binary 0/1 vectors • Mij: number of elements in which x = i and y = j • Jaccard(x, y) = M11 / (M01 + M10 + M11) • Example: Jaccard(John, Mary) = 0.25
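A direct translation of the definition; the John/Mary vectors from the slide are not recoverable, so the vectors in the usage line are made up to reproduce the stated 0.25:

```python
def jaccard(x, y):
    """Jaccard similarity of two binary 0/1 vectors:
    M11 / (M01 + M10 + M11), where Mab counts positions with x = a, y = b.
    Matching zeros (M00) are deliberately ignored."""
    m11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    m10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    m01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    return m11 / (m01 + m10 + m11)

print(jaccard([1, 0, 0, 1], [0, 1, 1, 1]))  # -> 0.25
```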

  16. Cosine Similarity • If d1 and d2 are two document vectors, then cos(d1, d2) = (d1 • d2) / (||d1|| ||d2||), where • indicates the vector dot product and ||d|| is the length of vector d • Example: d1 = 3 2 0 5 0 0 0 2 0 0, d2 = 1 0 0 0 0 0 0 1 0 2 • d1 • d2 = 3*1 + 2*0 + 0*0 + 5*0 + 0*0 + 0*0 + 0*0 + 2*1 + 0*0 + 0*2 = 5 • ||d1|| = (3*3 + 2*2 + 0*0 + 5*5 + 0*0 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0)^0.5 = (42)^0.5 = 6.481 • ||d2|| = (1*1 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 1*1 + 0*0 + 2*2)^0.5 = (6)^0.5 = 2.449 • cos(d1, d2) = 5 / (6.481 × 2.449) = 0.3150
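The worked example above checks out in a few lines of Python:

```python
import math

def cosine(d1, d2):
    """Cosine similarity: dot(d1, d2) / (||d1|| * ||d2||)."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2)

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine(d1, d2), 4))  # -> 0.315
```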

  17. Gaussian Radial Basis Function • Let x and y be the feature vectors for 2 data instances • Example: σ = 0.1
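The formula on this slide was shown as an image. The sketch below uses the common parameterization exp(−‖x − y‖² / (2σ²)), which is an assumption; whichever form the slide used, identical vectors get similarity 1 and similarity decays with squared distance:

```python
import math

def gaussian_rbf(x, y, sigma=0.1):
    """Gaussian RBF similarity: exp(-||x - y||^2 / (2 * sigma^2)).
    NOTE: this parameterization is assumed, not taken from the slide.
    Small sigma makes the similarity drop off quickly with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))
```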

  18. Python Example

  19. Python Example

  20. Python Example

  21. Python Example

  22. Technique: Matrix Factorization • Items are not independent and have inherent groupings • Movies can be grouped based on genres • Books can be grouped based on their topic areas • The groups can be treated as “latent” features of the data • Given: ratings matrix R (users x items)

  23. Technique: Matrix Factorization • Movie ratings What if genre is not the optimal grouping (since some movies may belong to multiple genres)? Can we automatically find an appropriate grouping of features?

  24. Technique: Matrix Factorization • Given: ratings matrix R (users x items) • Goal is to factorize R into a product of two latent matrices, U and M, such that the following quantity is minimized: Σ_{(i,j) ∈ Ω(R)} (R_ij − (UM^T)_ij)² • where Ω(R) is the set of non-missing ratings in R

  25. Technique: Matrix Factorization • Given: ratings matrix R (users x items) • Goal: decompose R into a product of matrices U and M^T (the superscript T denotes the matrix transpose) that best approximates R • U is the user feature matrix (users x features), M is the item feature matrix (items x features), and U × M^T is the predicted ratings matrix

  26. Technique: Matrix Factorization • Given: an incomplete matrix R and parameter k • Alternating least-squares (ALS) algorithm • Randomly initialize U, M, and the missing values in R • Repeat until convergence • Find M such that ||R − UM^T||_F is minimized • Find U such that ||R^T − MU^T||_F is minimized • For each missing value R_ij, replace it with the corresponding value (UM^T)_ij

  27.–31. Example • ratings matrix R (users x items) • The slides show U, M, and the predicted matrix UM^T after 1, 50, 100, 200, and 500 ALS iterations (the matrix values were displayed as images on the original slides)

  32. Cold-Start Problem • What will you recommend to a new user who has not provided any ratings? • Utilize side information to make the recommendation • Examples: demographic and item content information • How to incorporate side information? • Use factorization machines or, more generally, cast rating prediction as a regression problem on the user features and item features
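The "cast into a regression problem" idea can be sketched as follows: concatenate user-side and item-side features into one input vector and regress the rating on it, so a brand-new user is handled through his/her features alone. This is a plain linear least-squares stand-in for a factorization machine (which would add pairwise feature interactions); all data and names are made up:

```python
import numpy as np

def fit_side_info_model(user_feats, item_feats, ratings):
    """Regression on side information: each training row concatenates a
    user's features with an item's features (plus a bias term); the
    target is the observed rating. `ratings` is a list of
    (user, item, rating) triples."""
    X = np.array([np.concatenate([user_feats[u], item_feats[i], [1.0]])
                  for u, i, _ in ratings])
    y = np.array([r for _, _, r in ratings])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict_rating(w, user_feat, item_feat):
    """Works even for a new user with zero past ratings,
    as long as the user's side features are known."""
    return float(np.concatenate([user_feat, item_feat, [1.0]]) @ w)
```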
