
Combining Content-based and Collaborative Filtering

Gabriela Polčicová, Pavol Návrat. Department of Computer Science and Engineering, Slovak University of Technology. polcicova@dcs.elf.stuba.sk, navrat@elf.stuba.sk


Presentation Transcript


  1. Combining Content-based and Collaborative Filtering. Gabriela Polčicová, Pavol Návrat. Department of Computer Science and Engineering, Slovak University of Technology. polcicova@dcs.elf.stuba.sk, navrat@elf.stuba.sk

  2. Overview • Information Filtering and its Types • Combined Method • Experiment with Information Filtering Methods • Conclusions

  3. Information Filtering (1)
     • Delivery of relevant information to the people who need it
     • Types of Information Filtering
       • Content-based: for textual documents
       • Collaborative: for communities of users
     • Interests
       • information about interests is stored in profiles
       • opinions about documents are expressed as ratings
     • Ratings {i, j, r_ij}
       • for user i and item j, r_ij is the value of the rating

  4. Information Filtering (2)
     [Diagram: the filter]
     • Input: rated items {user, item, value} and unrated items {user, item}
     • Stages: learning interests, estimating the value of rating, choosing recommendations
     • Output: recommendations {user, item, estimation}

  5. Content-based Filtering (1) • Basic idea • recommending documents based on content and properties of document • Profile • consists of keywords with assigned weights • only documents matching profile are recommended • Recommendations • based on objective measurable properties

  6. Content-based Filtering (2)
     [Diagram]
     • Documents rated by the user (documents of interest): documents and ratings build the PROFILE
     • PROFILE: keywords and phrases with weights
     • Documents unrated by the user: documents matching the profile => recommended documents

  7. Collaborative Filtering (1) • Basic idea • automating “word of mouth” • leverage opinions of like-minded users while making decisions • Schema • collecting users’ opinions • searching for like-minded users • making recommendations

  8. Collaborative Filtering (2)
     [Diagram: the profile of the current user is compared with the profiles of users 1 to 5]
     • Documents from like-minded users' profiles => recommended documents

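  9. Collaborative Filtering (3)
     • Similarity measure: Pearson Correlation Coefficient
       $$k_{ci} = \frac{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)^2 \, \sum_{j \in I_{ci}} (r_{ij} - \bar{r}_i)^2}}$$
     • Recommendations computation: weighted sum of ratings
       $$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci}}{\sum_{i \in U_{cj}} |k_{ci}|}$$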
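The two formulas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the slide does not define the index sets, so the sketch takes I_ci to be the items rated by both users and U_cj to be the other users who rated item j, and it computes each user's mean over all of that user's ratings. The names `mean`, `pearson`, and `predict` are hypothetical.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings_c, ratings_i):
    """Coefficient k_ci between two users, each given as {item: rating}.

    Assumption: sums run over the items both users rated (I_ci), and the
    means are taken over all of each user's ratings.
    """
    common = set(ratings_c) & set(ratings_i)
    mean_c = mean(ratings_c.values())
    mean_i = mean(ratings_i.values())
    num = sum((ratings_c[j] - mean_c) * (ratings_i[j] - mean_i) for j in common)
    den = sqrt(
        sum((ratings_c[j] - mean_c) ** 2 for j in common)
        * sum((ratings_i[j] - mean_i) ** 2 for j in common)
    )
    return num / den if den else 0.0


def predict(c, item, ratings):
    """Estimate the rating of user c for item; ratings is {user: {item: rating}}.

    Returns None when no other user with a non-zero coefficient rated the item.
    """
    mean_c = mean(ratings[c].values())
    num = den = 0.0
    for i, ratings_i in ratings.items():
        if i == c or item not in ratings_i:
            continue  # only users in U_cj contribute
        k = pearson(ratings[c], ratings_i)
        num += (ratings_i[item] - mean(ratings_i.values())) * k
        den += abs(k)
    return mean_c + num / den if den else None


ratings = {
    "c": {"A": 1, "B": 5, "C": 3},
    "u": {"A": 2, "B": 4, "C": 3, "D": 5},
}
estimate = predict("c", "D", ratings)
```

With a single neighbour the coefficient cancels out, so the estimate is the current user's mean plus the neighbour's deviation from their own mean on item D.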

  10. Combining Content-based and Collaborative Filtering (1)
      • Computing estimates for missing ratings with the Content-based Filtering method, for each user
      • Searching for like-minded users
        • computing coefficient kci between the current and the i-th user (from ratings only)
        • computing coefficient kci' between the current and the i-th user (from both ratings and estimates)
      • New recommendations computation
        • weighted sum of ratings and estimates: ratings are weighted by kci, estimates by kci'

  11. Datasets for Experiments
      • Data:
        • EachMovie - users' ratings for movies, www.research.digital.com/SRC/eachmovie/
        • IMDB - textual information for CBF (movies' descriptions), www.imdb.com/
      • Datasets:
        • A - ratings from the period up to Mar 1, 1996 (810 ratings from 71 users)
        • B - ratings from the period up to Mar 15, 1996 (2407 ratings from 131 users)
        • C - ratings from the period up to Apr 1, 1996 (12290 ratings from 651 users)

  12. EachMovie Data and Constant Method • Constant Method rcj = 5

  13. Experiments with Combination of Content-based and Collaborative Filtering (2)
      [Diagram: experiment workflow]
      • Dataset
      • Divide dataset into training set (90%) and test set (10%)
      • Apply filtering methods and evaluate their performance
        • Content-based Filtering method (test, training sets) => recommendations
        • Collaborative Filtering method (test, training sets) => recommendations
        • Combined Filtering method (test, training sets) => recommendations
        • Constant method (test set) => recommendations
      • Evaluation of methods' performance
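The 90% / 10% division could look like the sketch below. The slide does not say how ratings were assigned to the two sets, so a seeded random shuffle over all rating triples is an assumption here, and `split_ratings` is a hypothetical helper name.

```python
import random


def split_ratings(ratings, test_fraction=0.1, seed=0):
    """Split a list of (user, item, value) triples into (training, test).

    Assumption: a plain random split over all ratings; the slides only
    state the 90% / 10% proportions.
    """
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    shuffled = list(ratings)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 10 users x 10 items, every rating equal to 3.
all_ratings = [(user, item, 3) for user in range(10) for item in range(10)]
training, test = split_ratings(all_ratings)
```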

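  14. Metrics
      • Coverage = percentage of items for which the method is able to compute estimates
      • Accuracy
        $$\mathrm{Accuracy} = \frac{|R \cap L| + |\bar{R} \cap \bar{L}|}{|L| + |\bar{L}|}$$
      • F-measure
        $$F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad \mathrm{Precision} = \frac{|R \cap L|}{|R|}, \qquad \mathrm{Recall} = \frac{|R \cap L|}{|L|}$$
      • NMAE
        $$\mathrm{NMAE} = \frac{\sum |r_{ij} - \hat{r}_{ij}|}{n \cdot s}$$
      R - set of recommended items, L - set of liked items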
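A minimal sketch of these metrics on Python sets and dicts follows. The slide does not define n and s, so the code assumes n is the number of rating pairs that have an estimate and s is the size of the rating scale; the complements of R and L are taken with respect to an explicit set of all test items. All function names are hypothetical.

```python
def precision_recall_f(recommended, liked):
    """Precision, recall and F-measure for sets R (recommended) and L (liked)."""
    hits = len(recommended & liked)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(liked) if liked else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def accuracy(recommended, liked, all_items):
    """Share of items that are recommended and liked, or neither."""
    not_recommended = all_items - recommended
    not_liked = all_items - liked
    correct = len(recommended & liked) + len(not_recommended & not_liked)
    return correct / len(all_items)


def nmae(actual, estimated, scale):
    """Mean absolute error divided by the scale s (assumed meaning of n and s)."""
    errors = [abs(actual[key] - estimated[key]) for key in actual if key in estimated]
    return sum(errors) / (len(errors) * scale)


def coverage(actual, estimated):
    """Fraction of test pairs for which an estimate exists."""
    return sum(1 for key in actual if key in estimated) / len(actual)


R = {"A", "B", "C"}
L = {"B", "C", "D"}
items = {"A", "B", "C", "D", "E"}
p, r, f = precision_recall_f(R, L)
acc = accuracy(R, L, items)
err = nmae({1: 5, 2: 3}, {1: 4, 2: 5}, scale=5)
cov = coverage({1: 5, 2: 3, 3: 4}, {1: 4, 2: 5})
```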

  15. Results of Experiments

  16. Conclusions
      • Combination of content-based and collaborative filtering might help in initial phase
      Future work
      • Weighting of coefficients
      • Comparing method with additional methods

  17. Content-based Filtering - Vector Representation of Documents and Profiles
      • Document j contains the keywords: computer, machine, learning
      • Dictionary: D = ( … , computer, … , learning, … , machine, … )
      • Document vector of TF-IDF weights: W_j = (0, … , 0, 0.5, 0, … , 0, 0.3, 0, … , 0, 0.2, 0, … , 0)
      • Profile
        $$\mathrm{profile}_i = \sum_{j=1}^{n} r_j \, w_{ij}$$
      • Similarity
        $$\mathrm{Sim}(W, \mathrm{Profile}) = \frac{W \cdot \mathrm{Profile}}{|W| \, |\mathrm{Profile}|}$$
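The profile and similarity formulas can be illustrated with sparse vectors stored as dicts. This sketch takes the TF-IDF weights as given (it reuses the 0.5, 0.3, 0.2 values from the slide) and does not compute TF-IDF itself; the rating value 2 and the names `build_profile` and `cosine` are made up for illustration. How the similarity is turned into a rating estimate is not shown on the slide and is left out here.

```python
from math import sqrt


def build_profile(rated_docs):
    """profile_i = sum over rated documents j of r_j * w_ij.

    rated_docs is a list of (rating, {term: tfidf_weight}) pairs.
    """
    profile = {}
    for rating, weights in rated_docs:
        for term, weight in weights.items():
            profile[term] = profile.get(term, 0.0) + rating * weight
    return profile


def cosine(vec_a, vec_b):
    """Normalized dot product of two sparse vectors ({term: weight})."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = sqrt(sum(weight ** 2 for weight in vec_a.values()))
    norm_b = sqrt(sum(weight ** 2 for weight in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Document weights from the slide; the rating 2 is an illustrative value.
doc = {"computer": 0.5, "learning": 0.3, "machine": 0.2}
profile = build_profile([(2, doc)])
same_direction = cosine(doc, profile)      # profile is a scaled copy of doc
unrelated = cosine(doc, {"film": 1.0})     # no shared keywords
```

Storing only non-zero weights mirrors the mostly-zero vector W_j on the slide and keeps the dot product proportional to the number of keywords rather than the dictionary size.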

  18. Collaborative Filtering - Example
      [Table: ratings of items A to G by the current user and by other users]

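  19. Combining Content-based and Collaborative Filtering (2)
      • Similarity measure: Pearson Correlation Coefficient
        $$k'_{ci} = \frac{\sum_{j \in I'_{ci}} (r^{CBF}_{cj} - \bar{r}_c)(r^{CBF}_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I'_{ci}} (r^{CBF}_{cj} - \bar{r}_c)^2 \, \sum_{j \in I'_{ci}} (r^{CBF}_{ij} - \bar{r}_i)^2}}$$
      • Recommendations computation: weighted sum of ratings and estimates
        $$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci} + \sum_{i \in U'_{cj}} (r^{CBF}_{ij} - \bar{r}_i)\, k'_{ci}}{\sum_{i \in U_{cj}} |k_{ci}| + \sum_{i \in U'_{cj}} |k'_{ci}|}$$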
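One possible reading of the combined computation is sketched below. Several details are assumptions, since the slides do not spell them out: U_cj is taken as the users with an actual rating for item j, U'_cj as the users who only have a CBF estimate for it, k'_ci is computed on each user's ratings filled in with their CBF estimates, and the means come from actual ratings only. The CBF estimates are passed in as plain numbers, and `predict_combined` is a hypothetical name.

```python
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(vec_a, vec_b, mean_a, mean_b):
    """Pearson-style coefficient over the items present in both vectors."""
    common = set(vec_a) & set(vec_b)
    num = sum((vec_a[j] - mean_a) * (vec_b[j] - mean_b) for j in common)
    den = sqrt(
        sum((vec_a[j] - mean_a) ** 2 for j in common)
        * sum((vec_b[j] - mean_b) ** 2 for j in common)
    )
    return num / den if den else 0.0


def predict_combined(c, item, ratings, cbf):
    """Estimate user c's rating for item from ratings and CBF estimates.

    ratings: {user: {item: rating}}; cbf: {user: {item: estimate}} holding
    estimates for items the user has not rated (assumed layout).
    """
    # Ratings filled in with CBF estimates; actual ratings take precedence.
    filled = {u: {**cbf.get(u, {}), **ratings[u]} for u in ratings}
    means = {u: mean(ratings[u].values()) for u in ratings}
    num = den = 0.0
    for i in ratings:
        if i == c:
            continue
        if item in ratings[i]:                 # i in U_cj: weight k_ci
            k = pearson(ratings[c], ratings[i], means[c], means[i])
            value = ratings[i][item]
        elif item in cbf.get(i, {}):           # i in U'_cj: weight k'_ci
            k = pearson(filled[c], filled[i], means[c], means[i])
            value = cbf[i][item]
        else:
            continue
        num += (value - means[i]) * k
        den += abs(k)
    return means[c] + num / den if den else None


ratings = {
    "c": {"A": 1, "B": 5, "C": 3},
    "u": {"A": 2, "B": 4, "C": 3, "D": 5},
    "v": {"A": 1, "B": 5},
}
cbf = {"c": {"D": 3.0}, "v": {"D": 4.0}}
ratings_only = predict_combined("c", "D", ratings, {})
combined = predict_combined("c", "D", ratings, cbf)
```

In this toy data user v never rated item D, so v contributes only once a CBF estimate is supplied; the combined estimate then lies between the two neighbours' suggestions.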

  20. Experiments with Combination of Content-based and Collaborative Filtering (1) • Content-based Filtering Method (CBF) • documents and profiles: vector representation - weighted keywords (TF-IDF) • estimation computation: normalized dot product of document and profile vectors • Collaborative Filtering (CF) • Pearson correlation coefficient • weighted sum of ratings • Combination of CF and CBF • Pearson correlation coefficients • weighted sum of ratings and CBF estimations • Constant Method (rcj = 5)
