The Research Progress of Recommender Systems in Social Tagging Systems

Dr. Guandong Xu • Intelligent Web and Information system, Department of Computer Science, Aalborg University

Outline • Why recommender systems • The state-of-the-art of recommender systems • Social tagging systems • Tag-based recommender system • Personalized recommendation • Tag recommendation • User profiling • Open research questions • Conclusion • Appendix: Our recent work on group approaches

Why recommender systems • The Internet computing era • Information overload • Low precision: the retrieved information is not what you need • Low recall: the relevant information is not returned exhaustively

Why recommender systems • No personalization • Different users receive the same search results • Hence the need for personalization or recommendation

Why Recommender System GroupLens: An open architecture for collaborative filtering of netnews, Resnick, P.; Iacovou, N.; Sushak, M.; Bergstrom, P.; Riedl, J. , 1994 ACM Conference on Computer Supported

Why Recommender Systems • Recommender systems (also called recommendation systems or recommendation engines) are an information filtering technique • They use a content-based approach or a collaborative filtering approach • They recommend information items (films, television, video on demand, music, books, news, images, web pages, etc.) that users are likely to be interested in

Traditional Recommendation Methodology

Recommender System Categories • Content-based recommendations: recommend items similar to those the user preferred • Collaborative recommendations: recommend items preferred by other users with similar taste • Hybrid approaches: content-based & collaborative

Example for Content-based approach • Considering a recommendation scenario • Page 1: “Department of Computer Science at Aalborg University…”. • Page 2: “Department of Health Science and Technology…” • Search query: “Computer Science” • R1={“Department”, “Computer Science”, “Aalborg University”, …} • R2={“Department”, “Health Science”, “Technology”, …} • Use TF-IDF (term frequency-inverse document frequency) • Result: R1

Example for Content-based approach (illustrated) • R1: “…Department of Computer Science at Aalborg University…” • R2: “…Department of Health Science and Technology…” • R1={“Department”, “Computer Science”, “Aalborg University”, …} • R2={“Department”, “Health Science”, “Technology”, …} • Query: “Computer Science” • Ranking: TF-IDF (term frequency-inverse document frequency) • Result: R1
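The R1/R2 example can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the function names `tf_idf_vectors` and `score` are hypothetical, each page is reduced to the three terms listed on the slide (the elided "…" terms are left out), and the weighting is the plain tf × log(N/df) variant.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

def score(query_terms, vector):
    """Sum the TF-IDF weights of the query terms found in the document."""
    return sum(vector.get(term, 0.0) for term in query_terms)

# Term lists R1 and R2 from the slide (assumption: multi-word phrases are single terms).
r1 = ["department", "computer science", "aalborg university"]
r2 = ["department", "health science", "technology"]
vectors = tf_idf_vectors([r1, r2])
query = ["computer science"]
scores = [score(query, v) for v in vectors]
best = max(range(len(scores)), key=scores.__getitem__)  # index 0 stands for R1
```

"Department" occurs in both pages, so its IDF is log(2/2) = 0 and it does not help to discriminate; only "Computer Science" contributes, which is why R1 is returned.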

Principle of Collaborative filtering • Two kinds of approaches: • User-based: select the K most similar users (KNN); also called memory-based • Item-based: select the closest item set; also called model-based
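A minimal sketch of the user-based (KNN) variant, under stated assumptions: similarity is cosine over the items two users have both rated, the prediction is a similarity-weighted mean, and the users, items, and ratings are invented for illustration. Real systems usually mean-centre ratings and tune K.

```python
import math

def cosine(a, b):
    """Cosine similarity restricted to the items both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(ratings, user, item, k=1):
    """Predict a rating as the similarity-weighted mean over the k nearest users."""
    neighbours = [
        (cosine(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    top = sorted(neighbours, reverse=True)[:k]
    total_sim = sum(sim for sim, _ in top)
    if total_sim == 0:
        return None  # no usable neighbour
    return sum(sim * rating for sim, rating in top) / total_sim

# Hypothetical rating data: bob agrees with alice, carol does not.
ratings = {
    "alice": {"A": 5, "B": 4},
    "bob":   {"A": 5, "B": 4, "C": 5},
    "carol": {"A": 1, "B": 2, "C": 1},
}
prediction = predict(ratings, "alice", "C", k=1)  # driven by bob, the nearest user
blended = predict(ratings, "alice", "C", k=2)     # mixes bob's and carol's ratings
```

With k=1 the prediction follows the single most similar user; a larger k trades that sharpness for robustness.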

Example for Collaborative Filtering • Example 2: Amazon.com • The algorithm generates recommendations of the form "customers who bought this book also bought other books", i.e. it relies on customers with preferences similar to the user's.
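The "also bought" idea can be illustrated with simple co-purchase counting. This is only a sketch under assumptions: the baskets and item names are invented, and it is not a description of Amazon's actual algorithm.

```python
from collections import Counter
from itertools import permutations

def also_bought(baskets):
    """For each item, count how often every other item was bought by the same customer."""
    co_counts = {}
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            co_counts.setdefault(a, Counter())[b] += 1
    return co_counts

# Hypothetical purchase histories, one set per customer.
baskets = [
    {"book_x", "book_y"},
    {"book_x", "book_y", "book_z"},
    {"book_x", "book_z"},
    {"book_y"},
]
co = also_bought(baskets)
# Items most often bought together with book_y, most frequent first.
suggestions = [item for item, _ in co["book_y"].most_common()]
```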

Recommendations

Similarities and Differences • Both compute similarity between vectors • Content-based recommendations: vectors of TF-IDF weights • Collaborative filtering recommendations: vectors of the actual user-specified ratings

Limitations

Some Extending Capabilities (1/2) • Comprehensive Understanding of Users and Items • Extensions for Model-Based Recommendation Techniques • Multidimensionality of Recommendations • Extend the 2-dimensional User × Item space to a multi-dimensional one with further dimensions, e.g. User, Movie, Time, Place

Some Extending Capabilities (2/2) • Multi-criteria Ratings • Restaurant (food, décor, service, price) • Non-intrusiveness • Flexibility • User’s flexibility • Effectiveness of Recommendations • Metrics related

Insights into recommender systems • A closer look at recommender systems from different perspectives

What to do with data: implementation • Two kinds of problem with data: • Information Retrieval (IR): static content, dynamic query -> modeling content (organized with an index) • Information Filtering (IF): dynamic content, static query -> modeling query (organized as filters) • Recommendation lies between IR and IF, since the content varies slowly and the queries depend on few parameters. Methods from both IR and IF are therefore used to reduce computation at query time.

General purpose • Top-k filtering: list of "best" items (main usage) or anti-spam • Item correlation: find similar items • Rating prediction: predict the rating for any pair of a user and an item (more general)

Degree of personalization • Generic: everyone receives the same recommendations • Demographic: everyone in the same category receives the same recommendations • Contextual: recommendations depend only on the current activity • Persistent: recommendations depend on long-term interests

What the Data Can Be • Context of the current page (current request, item currently explored, and structured content about this context) • History of the current user on the system (explicit or implicit ratings) • History of all users on the system • History of the current user on multiple systems, the whole web, or even on their own computer • History of all users on multiple systems, the whole web, or even their own computers

How to design a Recommender System • Explicit data • Rating data (rate a film on Netflix, Like or Dislike on YouTube) • Implicit data • Logs (users’ activities, i.e. implicit feedback) • A recommender system is built on users’ data

Emergence of New Recommendation Approaches • Collaborative Filtering (Social Recommender) • Compared with the traditional content-based approach • Recommendation from friends • Daily recommendation from friends • News feeds, Facebook, re-tweets • Recommendation over social media (blogs, YouTube) • Recommendation by using social data • Social network • Social tagging

Multi-Relational Social Data We are in a big social network. Node: facet Hyper-edge: relationship http://www.dasfa.net/wiki/index.php?title=Image:Metafac.png

Recommendation from friends-Facebook

Social recommendation by social media

Social relationship is powerful • SF vs. CF: the Social Filtering approach outperforms the CF approach in the experiments • G. Groh et al., Recommendations in taste related domains, GROUP’07, November 4–7, 2007, Sanibel Island, Florida, USA

Recommender System Overview • Input: query, time, location… • Data: user-item ratings, social relations, social tagging… • Algorithms: user-item KNN; clustering-based; graph-based; matrix factorization; information diffusion; probabilistic model; … • Output: information items, tags, merchandise/ads, persons, communities…

Tags are personal annotations • Users attach tags to resources • Tags serve as metadata and as an index • A tag is an expression of a user’s personal opinion • A tag is an implicit rating or vote on the tagged information resource or item

Tagging Types • Self-tagging • Users can only tag their own contributions • Permission-based • Users decide who can tag their resources • Free-for-all • Any user can tag any resource

Tagging support • Blind tagging • User cannot see the other tags assigned to the resource they’re tagging • Viewable tagging • Users can see the other tags assigned to the resource they’re tagging • Suggestive tagging • User sees suggested tags for the resource they’re tagging

Aggregation of Tags • Bag-model • The same tag can be assigned to a resource multiple times (Delicious) • Set-model • A tag can be applied only once to a resource (Flickr)
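The difference between the two aggregation models maps directly onto two data structures. A minimal sketch with invented tag assignments on a single resource:

```python
from collections import Counter

# Hypothetical (user, tag) assignments on one resource.
assignments = [("u1", "python"), ("u2", "python"), ("u3", "tutorial")]

# Bag-model: multiplicity is kept, so a count shows how many times a tag was applied.
bag = Counter(tag for _, tag in assignments)

# Set-model: each tag is stored at most once per resource.
tag_set = {tag for _, tag in assignments}
```

The bag-model keeps frequency information that can later act as an implicit vote; the set-model discards it.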

Temporal Behavior of Tags • Tag convergence • The tags assigned to a certain Web resource tend to stabilize, with a few tags becoming the majority. • Tag divergence • Tag sets do not converge to a smaller group of more stable tags, and the tag distribution changes continually. • Tag periodicity • Tags evolve and decay with time.

Tag-based Recommender System • [Figure: users, tags, and resources, annotated with example tag sets such as t1,t2,t3 and t1,t8,t7, are the input to a tag-based RS]

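Extension of User-Item (Tso-Sutter et al. 2008) • User tags are treated as items; item tags are treated as users • The three-dimensional relation <user, tag, item> is reduced to three two-dimensional relations: <user, tag>, <item, tag>, <user, item>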
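The reduction from the three-dimensional relation to three two-dimensional ones can be sketched as a projection over the triples. This is an illustrative sketch only: the function name `project` and the sample data are assumptions, and the relations are kept binary (a pair is either present or not).

```python
from collections import defaultdict

def project(triples):
    """Reduce <user, tag, item> triples to three binary two-dimensional relations."""
    user_tag = defaultdict(set)
    item_tag = defaultdict(set)
    user_item = defaultdict(set)
    for user, tag, item in triples:
        user_tag[user].add(tag)    # user tags, usable as extra "items" of the user
        item_tag[item].add(tag)    # item tags, usable as extra "users" of the item
        user_item[user].add(item)  # the classic user-item relation
    return user_tag, item_tag, user_item

# Hypothetical tag assignments.
triples = [("u1", "jazz", "song1"), ("u1", "live", "song1"), ("u2", "jazz", "song2")]
user_tag, item_tag, user_item = project(triples)
```

Once projected, standard user-based or item-based collaborative filtering can be run on the extended matrices.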

Folksonomy model • Definition: A folksonomy is a quadruple F := (U, T, R, Y), where U, T, R are finite sets of users, tags, and resources, and Y defines a relation, the tag assignment, between these sets, that is, Y ⊆ U × T × R. • Converting the folksonomy into an undirected graph: F = (U, T, R, Y) is first converted into an undirected tripartite graph G_F = (V, E), with V = U ∪ T ∪ R and E = {{u, t}, {t, r}, {u, r} | (u, t, r) ∈ Y}. • Each edge {u, t} is weighted with |{r ∈ R : (u, t, r) ∈ Y}|, each edge {t, r} with |{u ∈ U : (u, t, r) ∈ Y}|, and each edge {u, r} with |{t ∈ T : (u, t, r) ∈ Y}|. • Employ: adapted PageRank algorithm
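The graph construction can be written down almost literally from the definition. A minimal sketch, assuming Y is given as (u, t, r) tuples; the function name `folksonomy_graph` and the kind prefixes "U", "T", "R" (used to keep the three vertex sets disjoint) are illustrative choices.

```python
from collections import Counter

def folksonomy_graph(Y):
    """Build the weighted undirected tripartite graph G_F = (V, E) from tag assignments Y."""
    weights = Counter()
    # Y is a relation, so duplicates are dropped; every distinct triple then adds
    # exactly one r to {u, t}, one u to {t, r}, and one t to {u, r}.
    for u, t, r in set(Y):
        weights[frozenset([("U", u), ("T", t)])] += 1
        weights[frozenset([("T", t), ("R", r)])] += 1
        weights[frozenset([("U", u), ("R", r)])] += 1
    vertices = {v for edge in weights for v in edge}
    return vertices, weights

# Hypothetical tag assignments.
Y = [("u1", "t1", "r1"), ("u1", "t1", "r2"), ("u2", "t1", "r1")]
V, E = folksonomy_graph(Y)
w_u1_t1 = E[frozenset([("U", "u1"), ("T", "t1")])]  # u1 used t1 on two resources
```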

FolkRank (Hotho et al., ESWC 2006) • PageRank: works on directed graphs. A page is important if there are many pages linking to it, and if those pages are important themselves. • FolkRank: a resource which is tagged with important tags by important users becomes important itself. (The same holds, symmetrically, for tags and users.) • The folksonomy graph used by FolkRank is undirected. • It can recommend a set of related users and resources for a given tag.
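The ranking idea can be sketched as a PageRank-style iteration on the undirected graph, run once with a preference vector that boosts the tag of interest and once with a uniform baseline, and then taking the difference. This is a simplified sketch, not the authors' exact formulation: the update rule w <- d·A·w + (1 − d)·p, the damping value, the fixed iteration count, the boost size, and all function names are assumptions.

```python
def build_adj(triples):
    """Symmetric weighted adjacency {node: {neighbour: weight}} from (u, t, r) triples."""
    adj = {}
    for u, t, r in set(triples):
        for a, b in ((u, t), (t, r), (u, r)):
            adj.setdefault(a, {}).setdefault(b, 0)
            adj.setdefault(b, {}).setdefault(a, 0)
            adj[a][b] += 1
            adj[b][a] += 1
    return adj

def adapted_pagerank(adj, preference, d=0.7, iterations=50):
    """Iterate w <- d * A w + (1 - d) * p, where A spreads weight along weighted edges."""
    nodes = list(adj)
    w = {n: 1.0 / len(nodes) for n in nodes}
    degree = {n: sum(adj[n].values()) for n in nodes}
    for _ in range(iterations):
        w = {
            n: d * sum(w[m] * adj[m][n] / degree[m] for m in adj[n])
               + (1 - d) * preference.get(n, 0.0)
            for n in nodes
        }
    return w

def folkrank(adj, boosted, d=0.7):
    """Differential ranking: weights with a boosted node minus uniform-baseline weights."""
    nodes = list(adj)
    uniform = {n: 1.0 / len(nodes) for n in nodes}
    pref = {n: 1.0 for n in nodes}
    pref[boosted] += len(nodes)  # assumed boost size
    total = sum(pref.values())
    pref = {n: v / total for n, v in pref.items()}
    w0 = adapted_pagerank(adj, uniform, d)
    w1 = adapted_pagerank(adj, pref, d)
    return {n: w1[n] - w0[n] for n in nodes}

# Hypothetical folksonomy; node names carry a kind prefix to keep U, T, R disjoint.
triples = [("u:ann", "t:python", "r:doc1"), ("u:bob", "t:java", "r:doc2")]
ranks = folkrank(build_adj(triples), boosted="t:python")
```

In this toy graph the resource tagged with the boosted tag (`r:doc1`) ends up with a higher differential weight than the unrelated resource (`r:doc2`).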

Difference highlights • Documents that are of potential interest to a user can be suggested to them. • When a user applies a certain tag, other related tags can be suggested. • FolkRank additionally considers the tagging behavior of other users. • Other users that work on related topics can be made explicit, thus improving knowledge transfer within organizations and fostering the formation of communities.

Tensor Factorization • Symeonidis et al. 2008 • Rendle et al. 2009

Tensor factorization • HOSVD (Symeonidis et al., TKDE 2010) • Basic idea: optimize the square loss • Other optimization measures, e.g. AUC (Area Under Curve) (Rendle et al., SIGKDD 2009)
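The formula for the square loss is not part of this transcript. One common way to write a square-loss objective for a factorized user-tag-resource tensor is sketched below; all symbols are assumptions introduced here: y_{u,t,r} is the observed tensor entry (1 if (u, t, r) ∈ Y, else 0), ŷ_{u,t,r}(θ) is the entry reconstructed by the model, and θ stands for the model parameters.

```latex
\hat{\theta} \;=\; \operatorname*{argmin}_{\theta}
  \sum_{(u,t,r) \,\in\, U \times T \times R}
  \bigl( \hat{y}_{u,t,r}(\theta) - y_{u,t,r} \bigr)^{2}
```

A ranking-oriented criterion such as AUC replaces this element-wise error with a measure of how well observed tags are ordered above unobserved ones.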

Others • The GroupMe! system (Abel et al. 2007) • PLSA (Probabilistic Latent Semantic Analysis) (Wetzker et al. 2009) • Tag-based profile construction • Naïve (Szomszor et al. 2007), co-occurrence (Michlmayr and Cayzer 2007) and adaptation approach (Dorigo and Caro 1999) • WebDCC (Web Document Conceptual Clustering) (Godoy and Amandi 2006) • Music recommendation system (Uitdenbogerd and van Schnydel 2002) • …

The limitations • Tags carry little semantics and have many variations • The correlation between sets of tags • Uncontrolled vocabulary: users tag in their own ways • Redundancy and ambiguity in the tag database • Tags may express a judgment rather than describe the document • Tags in different languages, e.g. Vienna vs. Wien

Data Quality • How to manage the cold start problem (new user, new item) or, more generally, data sparsity? • The system must behave differently for users with few ratings (e.g. fall back to non-personalized recommendations) • The system may use bot users to rate new items according to their content
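The special behavior for users with few ratings can be sketched as a popularity fallback. All specifics here are assumptions for illustration: the threshold `MIN_RATINGS`, the function name `recommend`, the `personalized` callable standing in for any personalized recommender, and the sample data.

```python
from collections import Counter

MIN_RATINGS = 3  # hypothetical threshold for "few ratings"

def recommend(user, ratings, personalized, k=2):
    """Use the personalized recommender when possible, else fall back to popularity."""
    rated = ratings.get(user, {})
    if len(rated) >= MIN_RATINGS:
        return personalized(user)[:k]
    # Cold start: rank items by how many users rated them, skipping already-rated ones.
    popularity = Counter(item for r in ratings.values() for item in r)
    return [item for item, _ in popularity.most_common() if item not in rated][:k]

# Hypothetical data: "new" has no ratings yet.
ratings = {
    "old": {"a": 5, "b": 4, "c": 3},
    "u2":  {"a": 4, "b": 2},
    "u3":  {"a": 3},
    "new": {},
}
personalized = lambda user: ["x", "y", "z"]  # stand-in for a real personalized model
cold = recommend("new", ratings, personalized)
warm = recommend("old", ratings, personalized)
```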

Confidence and display • How to improve confidence in the recommender system? • By providing good recommendations! • By providing information about each recommendation (e.g. ratings, explanations) • How to display recommendations? • The recommended item must be easy for the user to identify and evaluate • Ratings must be easy to understand and meaningful • Explanations must provide a quick way for the user to evaluate the recommendation

Interaction (1/2) • How to interact with the user? • You may ask the user to correct a prediction • You must update your rating matrix with the corrected prediction and update your recommendations accordingly • You may want to learn the key parameters of your algorithm using the feedback • You may ask the user to provide feedback on the explanation • You may ask the user to provide more context for the current task (e.g. by using categories)

Interaction (2/2) • How to manage scalability? • Applications usually need real-time prediction computation • The computation time has to scale with the number of users and items • How to manage temporal changes? • You cannot rerun your algorithms each time a modification occurs • The off-line computation must be robust to small modifications and scheduled accordingly • The on-line computation must benefit from modifications • The computation must be done incrementally when possible • The system may "forget" older information
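Incremental computation and "forgetting" can be combined in a constant-time update. The sketch below keeps an exponentially decayed mean rating per item; the class name `DecayedMean` and the decay factor are assumptions, and this is one possible scheme among many.

```python
class DecayedMean:
    """Running score updated incrementally, with exponential forgetting of old ratings."""

    def __init__(self, decay=0.9):
        self.decay = decay        # hypothetical forgetting factor in (0, 1]
        self.weighted_sum = 0.0
        self.weight = 0.0

    def update(self, rating):
        # Down-weight everything seen so far, then add the new rating: O(1) per event.
        self.weighted_sum = self.decay * self.weighted_sum + rating
        self.weight = self.decay * self.weight + 1.0
        return self.value()

    def value(self):
        return self.weighted_sum / self.weight if self.weight else None

# Hypothetical rating stream for one item: two old low ratings, one recent high rating.
forgetting = DecayedMean(decay=0.5)
plain = DecayedMean(decay=1.0)  # decay of 1.0 reduces to the ordinary mean
for rating in (1, 1, 5):
    forgetting.update(rating)
    plain.update(rating)
```

With forgetting enabled, the recent rating dominates, so the score moves towards 5 faster than the ordinary mean does.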
