
Artificial Intelligence with Web Applications

Dell Zhang, Birkbeck, University of London, 2010/11. CI Chapter 2: Making Recommendations. Recommendation Systems. State of the Art: An Ensemble; The Neighbourhood Approach; User-based Collaborative Filtering.


Presentation Transcript


  1. Artificial Intelligence with Web Applications • Dell Zhang • Birkbeck, University of London • 2010/11

  2. CI Chapter 2: Making Recommendations

  3. Recommendation Systems

  4. State of the Art • An Ensemble • The Neighbourhood Approach • User-based Collaborative Filtering • Item-based Collaborative Filtering • The Latent-Factor Approach • ……

  5. The Neighbourhood Approach • The nearest-neighbour method works on the principle that a person tends to give similar ratings to similar movies. • Joe likes the three movies on the left, so to make a prediction for him, find users who also liked those movies and see what other movies they liked. • Here the three other viewers all liked Saving Private Ryan, so that is the top recommendation. Two of them liked Dune, so that’s ranked second, and so on.

  6. The Latent-Factor Approach • A second, complementary method scores both a given movie and viewer according to latent factors, themselves inferred from the ratings given to all the movies by all the viewers. • The factors define a space that at once measures the characteristics of movies and the viewer’s interest in those characteristics. • Here we would expect the fellow in the southeast corner of the graph to love Norbit, to hate Dreamgirls, and, perhaps, to rate Braveheart about average.

  7. Collaborative Filtering (user-based)

  8. Collecting Preferences • Possible mappings of user actions to numerical scores

  9. Collecting Preferences • A set of movie critics and their ratings of movies • 'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5, 'The Night Listener': 3.0} • 'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5, 'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5} • 'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0, 'Superman Returns': 3.5, 'The Night Listener': 4.0} • 'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'The Night Listener': 4.5, 'Superman Returns': 4.0, 'You, Me and Dupree': 2.5} • 'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Just My Luck': 2.0, 'Superman Returns': 3.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.0} • 'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'The Night Listener': 3.0, 'Superman Returns': 5.0, 'You, Me and Dupree': 3.5} • 'Toby': {'Snakes on a Plane':4.5,'You, Me and Dupree':1.0,'Superman Returns':4.0}

  10. Finding Similar Users • Euclidean Distance Score • Pearson Correlation Score

  11. Euclidean Distance Score • The straight-line distance between two points in a multidimensional space, which is the kind of distance you measure with a ruler. • Given the Euclidean distance d between two points, their similarity score can be defined as 1/(d+1), so identical preferences score 1 and the score approaches 0 as the distance grows.
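
The 1/(d+1) score above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the names `prefs` and `sim_distance` are assumptions, the two critics' ratings are copied from the "Collecting Preferences" slide, and returning 0 when two people share no movies is a choice made here.

```python
from math import sqrt

# Ratings for two critics, copied from the "Collecting Preferences" slide.
prefs = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                  'Just My Luck': 3.0, 'Superman Returns': 3.5,
                  'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                     'Just My Luck': 1.5, 'Superman Returns': 5.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
}


def sim_distance(prefs, person1, person2):
    """Similarity in (0, 1] derived from Euclidean distance over shared movies."""
    shared = [m for m in prefs[person1] if m in prefs[person2]]
    if not shared:
        return 0.0  # assumption: nothing in common means no evidence of similarity
    d = sqrt(sum((prefs[person1][m] - prefs[person2][m]) ** 2 for m in shared))
    return 1 / (d + 1)


print(sim_distance(prefs, 'Lisa Rose', 'Gene Seymour'))  # about 0.294
```

Only movies rated by both people enter the distance, so each pair of critics is compared in its own subspace of the preference space.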

  12. Euclidean Distance Score • The movie critics in (Euclidean) preference space

  13. Pearson Correlation Score • A measure of how correlated two variables are (how well their values fit on a straight line) • It is a value between –1 and 1 • 1: the variables are perfectly correlated • 0: the variables are not correlated • –1: the variables are perfectly inversely correlated • It tends to give better results than the Euclidean distance score when the data isn’t well normalized, e.g., when one critic routinely rates lower than another
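
A minimal sketch of the Pearson score over the movies two people have both rated, assuming the same nested-dict layout as the critics data. The names `prefs` and `sim_pearson` are assumptions, and the 'Lisa minus one' critic is hypothetical, added only to show the normalization point.

```python
from math import sqrt

# Two critics from the "Collecting Preferences" slide.
prefs = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                  'Just My Luck': 3.0, 'Superman Returns': 3.5,
                  'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                     'Just My Luck': 1.5, 'Superman Returns': 5.0,
                     'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
}
# Hypothetical critic (not in the slides): Lisa's ratings shifted down by one point.
prefs['Lisa minus one'] = {m: r - 1.0 for m, r in prefs['Lisa Rose'].items()}


def sim_pearson(prefs, person1, person2):
    """Pearson correlation of two people's ratings over their shared movies."""
    shared = [m for m in prefs[person1] if m in prefs[person2]]
    n = len(shared)
    if n == 0:
        return 0.0  # assumption: no overlap is treated as "not correlated"
    x = [prefs[person1][m] for m in shared]
    y = [prefs[person2][m] for m in shared]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    if denom == 0:
        return 0.0  # a constant rater has no defined correlation
    return cov / denom


print(sim_pearson(prefs, 'Lisa Rose', 'Gene Seymour'))    # about 0.396
print(sim_pearson(prefs, 'Lisa Rose', 'Lisa minus one'))  # 1.0
```

The shifted critic correlates perfectly with Lisa Rose even though every rating differs by a point, whereas a distance-based score would count the two as different.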

  14. Pearson Correlation Score • Comparing two movie critics on a scatter plot: low positive correlation

  15. Pearson Correlation Score • Comparing two movie critics on a scatter plot: high positive correlation

  16. Pearson Correlation Score

  17. User-based Collaborative Filtering • Step 1: Look for users whose rating patterns resemble those of the active user (the user for whom the prediction is made) • e.g., using the k-Nearest Neighbours algorithm • Step 2: Use the ratings from those like-minded users to calculate a prediction for the active user.
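
The two steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: the names `critics`, `top_matches` and `recommend` are made up here, Pearson correlation is used as the similarity, neighbours with zero or negative similarity are ignored, and `k=None` means "use every other user".

```python
from math import sqrt

# Ratings copied from the "Collecting Preferences" slide.
critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                  'Superman Returns': 3.5, 'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5,
                     'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
    'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                         'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'The Night Listener': 4.5,
                     'Superman Returns': 4.0, 'You, Me and Dupree': 2.5},
    'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Just My Luck': 2.0,
                     'Superman Returns': 3.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
    'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'The Night Listener': 3.0,
                      'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
    'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0, 'Superman Returns': 4.0},
}


def sim_pearson(prefs, p1, p2):
    """Pearson correlation over the items both users have rated."""
    shared = [m for m in prefs[p1] if m in prefs[p2]]
    n = len(shared)
    if n == 0:
        return 0.0
    x = [prefs[p1][m] for m in shared]
    y = [prefs[p2][m] for m in shared]
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom else 0.0


def top_matches(prefs, person, k=None, similarity=sim_pearson):
    """Step 1: rank the other users by similarity and keep the k nearest."""
    scores = [(other, similarity(prefs, person, other))
              for other in prefs if other != person]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores if k is None else scores[:k]


def recommend(prefs, person, k=None, similarity=sim_pearson):
    """Step 2: predict unseen items as a similarity-weighted mean of neighbours' ratings."""
    totals, sim_sums = {}, {}
    for other, sim in top_matches(prefs, person, k, similarity):
        if sim <= 0:
            continue  # assumption: dissimilar users contribute nothing
        for item, rating in prefs[other].items():
            if item not in prefs[person]:
                totals[item] = totals.get(item, 0.0) + sim * rating
                sim_sums[item] = sim_sums.get(item, 0.0) + sim
    ranked = [(item, totals[item] / sim_sums[item]) for item in totals]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


for movie, score in recommend(critics, 'Toby'):
    print(f'{movie}: {score:.2f}')
```

With every positively correlated critic used as a neighbour, this ranks The Night Listener first (about 3.35), then Lady in the Water (about 2.83) and Just My Luck (about 2.53). Passing a small `k` restricts the prediction to Toby's nearest neighbours.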

  18. User-based Collaborative Filtering • The user-based recommendations for Toby

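  19. Weighted Mean • A type of average that has a weight for every observation being averaged: each observation is multiplied by its weight, and the sum of these products is divided by the sum of the weights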
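
As a small sketch, a weighted mean can be written as below. The function name `weighted_mean` and the example numbers are hypothetical, chosen only for illustration; in collaborative filtering the values would be neighbours' ratings and the weights their similarity scores.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError('weights must not sum to zero')
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: two ratings, the second from a neighbour three times as similar.
print(weighted_mean([3.0, 4.0], [1.0, 3.0]))  # 3.75
```

The result is pulled towards the rating with the larger weight: a plain average of the two ratings would be 3.5.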

  20. Finding Similar Items • Just switch users and items in the previous slides!

  21. Finding Similar Items • Comparing two movies on a scatter plot: negative correlation

  22. Finding Similar Items

  23. Item-based Collaborative Filtering • Step 1: Build an item-item matrix determining relationships between pairs of items. • The similarities between items will not change as often as the similarities between users. • Step 2: Infer the taste of the active user, using his top-rated items and the above item-item matrix.
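
A rough sketch of the two item-based steps, under several assumptions: the function names are made up here, the item-item matrix uses the 1/(d+1) Euclidean score, and Step 2 weights all of the user's rated items rather than only the top-rated ones. The ratings are copied from the "Collecting Preferences" slide.

```python
from math import sqrt

critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                  'Superman Returns': 3.5, 'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5, 'Just My Luck': 1.5,
                     'Superman Returns': 5.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 3.5},
    'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                         'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0, 'The Night Listener': 4.5,
                     'Superman Returns': 4.0, 'You, Me and Dupree': 2.5},
    'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'Just My Luck': 2.0,
                     'Superman Returns': 3.0, 'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
    'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0, 'The Night Listener': 3.0,
                      'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
    'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0, 'Superman Returns': 4.0},
}


def transform_prefs(prefs):
    """Swap users and items: {user: {item: rating}} -> {item: {user: rating}}."""
    result = {}
    for user, ratings in prefs.items():
        for item, rating in ratings.items():
            result.setdefault(item, {})[user] = rating
    return result


def sim_distance(prefs, a, b):
    """1/(d+1) similarity over the keys that both rows have in common."""
    shared = [k for k in prefs[a] if k in prefs[b]]
    if not shared:
        return 0.0
    d = sqrt(sum((prefs[a][k] - prefs[b][k]) ** 2 for k in shared))
    return 1 / (d + 1)


def build_item_similarities(prefs):
    """Step 1: precompute the similarity between every pair of items."""
    items = transform_prefs(prefs)
    return {a: {b: sim_distance(items, a, b) for b in items if b != a}
            for a in items}


def recommend_items(prefs, item_sims, user):
    """Step 2: score unrated items by a similarity-weighted mean of the user's own ratings."""
    rated = prefs[user]
    totals, sim_sums = {}, {}
    for item, rating in rated.items():
        for other, sim in item_sims[item].items():
            if other in rated or sim <= 0:
                continue
            totals[other] = totals.get(other, 0.0) + sim * rating
            sim_sums[other] = sim_sums.get(other, 0.0) + sim
    ranked = [(other, totals[other] / sim_sums[other]) for other in totals]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


item_sims = build_item_similarities(critics)  # can be computed ahead of time and reused
for movie, score in recommend_items(critics, item_sims, 'Toby'):
    print(f'{movie}: {score:.2f}')
```

Because `item_sims` depends only on the ratings table and not on the active user, it can be rebuilt occasionally and reused for every request.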

  24. Item-based Collaborative Filtering • The item-based recommendations for Toby

  25. User-based vs Item-based • Efficiency • Item-based filtering is significantly faster than user-based filtering on large datasets because many calculations can be performed in advance, but it carries the additional overhead of maintaining the item similarity table. • Effectiveness • Item-based filtering usually outperforms user-based filtering on sparse datasets, while the two perform about equally on dense datasets. • Simplicity • User-based filtering is simpler to implement because it doesn’t have the extra steps.
