Recommending Twitter Users to Follow Using Content and Collaborative Filtering Approaches

John Hannon, Mike Bennett, Barry Smyth. CLARITY Centre for Sensor Web Technologies, University College Dublin.

Presentation Transcript
John Hannon, Mike Bennett, Barry Smyth

CLARITY Centre for Sensor Web Technologies

University College Dublin

Recommending Twitter Users to Follow Using Content and Collaborative Filtering Approaches

Outline

1. Problem

2. Related work & Innovation

3. Method & Experiment

4. Result & Analysis

Problem
  • The paper addresses an important recommendation problem: for a given target user, UT, which other users might be recommended as followers/followees, based on a large dataset of Twitter users and their tweets.
  • The motivation of the paper is to demonstrate the potential for effective and efficient followee recommendation.
Related Work
  • Analysis of Twitter’s real-time data.
    • Kwak et al.: reciprocity and homophily among Twitter users, information diffusion.
  • User-generated content, such as reviews, is used as an additional source in recommender systems.
    • The use of user-generated movie reviews from IMDb as part of a movie recommender system.
  • Research to help users find and connect with people online.
    • Information such as co-authorship is used to identify similar users.
    • Freyne and Geyer et al. have done much work on relationship building.
      • Make recommendations to new users during their sign-up process.
      • Recommend topics for self-descriptions in online user profiles.
Innovation
  • Twitter’s potential as a powerful source of profiling data. This is a novel take on profiling and recommendation in itself.
  • Focus on noisy, unstructured micro-blogging data.
  • The novel contribution of the paper is to show that, noisy as Twitter data is, it can still provide a useful recommendation signal.
Approach
  • How users are profiled
    • Content-based techniques which rely on the content of tweets.
    • Collaborative filtering approaches based on the followees and followers of users.
  • How these profiles can be used to suggest interesting users to follow.
    • The Lucene platform is used to develop the framework.
Profiling Users on Twitter
  • 5 basic profiling strategies:

(1) Representing users by their own tweets (tweets(UT));

(2) By the tweets of their followees (followeetweets(UT));

(3) By the tweets of their followers (followertweets(UT));

(4) By the ids of their followees (followees(UT));

(5) By the ids of their followers (followers(UT)).
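
The five strategies above can be sketched as one function that returns the raw profile "document" for a user. This is a minimal illustration on toy data, not the authors' code: the `tweets` and `follows` structures, the user names, and the whitespace tokenisation are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy data: user id -> that user's tweets.
tweets = {
    "alice": ["lucene indexing tips", "recsys paper deadline"],
    "bob": ["football tonight"],
    "carol": ["lucene query syntax"],
}
# Hypothetical follow edges as (follower, followee) pairs.
follows = [("alice", "bob"), ("carol", "alice"), ("alice", "carol")]

followees = defaultdict(set)  # user -> users they follow
followers = defaultdict(set)  # user -> users who follow them
for src, dst in follows:
    followees[src].add(dst)
    followers[dst].add(src)


def profile(user, source):
    """Return the profile of `user` as a list of terms for one source."""
    if source == "tweets":
        texts = tweets.get(user, [])
    elif source == "followeetweets":
        texts = [t for f in sorted(followees[user]) for t in tweets.get(f, [])]
    elif source == "followertweets":
        texts = [t for f in sorted(followers[user]) for t in tweets.get(f, [])]
    elif source == "followees":
        return sorted(followees[user])  # the ids themselves act as terms
    elif source == "followers":
        return sorted(followers[user])
    else:
        raise ValueError(f"unknown profile source: {source}")
    # Naive tokenisation: lower-case and split on whitespace.
    return [term for text in texts for term in text.lower().split()]


print(profile("alice", "followees"))
print(profile("alice", "followertweets"))
```

For the two id-based sources the "terms" are user ids, so the same indexing and retrieval machinery can be reused for the collaborative strategies.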

Indexing & Recommendation
  • Using Lucene’s indexing features, we can represent each user, UT, as a weighted term vector, profile(UT, source).
  • profile(UT, source) = {w1, …, wn}
  • Term weighting function: TF-IDF
  • Query-based retrieval and profile-based recommendation are then implemented using Lucene's standard retrieval function. For profile-based recommendation, the target user's profile document serves as the search query.
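
The idea of using a profile as a query can be illustrated without Lucene. The sketch below is a stand-in, not Lucene's scoring: it assumes a plain TF-IDF weighting (term frequency times log of inverse document frequency) and cosine similarity, and the function names and toy profiles are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: user -> list of terms. Returns user -> {term: TF-IDF weight}."""
    n = len(docs)
    # Document frequency: in how many profiles does each term occur?
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for user, terms in docs.items():
        tf = Counter(terms)
        vectors[user] = {t: tf[t] * math.log(n / df[t]) for t in tf}
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, vectors, k=3):
    """Use the target user's profile as the query; return the top-k other users."""
    query = vectors[target]
    scored = [(cosine(query, vec), user) for user, vec in vectors.items() if user != target]
    scored = [(s, u) for s, u in scored if s > 0]  # drop users with no overlap
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in scored[:k]]


docs = {
    "alice": ["lucene", "indexing", "recsys"],
    "bob": ["football", "tonight"],
    "carol": ["lucene", "query", "recsys"],
    "dave": ["football", "lucene"],
}
vectors = tfidf_vectors(docs)
print(recommend("alice", vectors))
```

Terms shared by nearly every profile get a weight close to zero, so rare shared terms dominate the ranking.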
Experiment—dataset
  • Imported 20,000 users directly using the Twitter API as the dataset. The dataset is split into two sets of users: one containing 1,000 users to act as test users, and a larger training set of 19,000 users.
9 different sources of profile information
  • S1: tweets(UT)
  • S2: followeetweets(UT)
  • S3: followertweets(UT)
  • S4: tweets(UT), followeetweets(UT), followertweets(UT)
  • S5: followees(UT)
  • S6: followers(UT)
  • S7: followees(UT), followers(UT)
  • S8: the scoring function is based on a combination of the content and collaborative strategies S1 and S6.
  • S9: the scoring function is based on the position of the user in each of the recommendation lists.
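
The slides do not give the exact scoring functions behind S8 and S9, so the following is only a sketch of two common ways to build such ensembles. The max-normalised score sum (for an S8-style combination) and the Borda-style position points (for an S9-style combination) are assumptions, as are the function names and toy scores.

```python
def normalise(scores):
    """Scale scores into [0, 1] by dividing by the largest score."""
    top = max(scores.values(), default=0.0)
    return {user: (s / top if top else 0.0) for user, s in scores.items()}


def combine_scores(*score_maps):
    """S8-style sketch: sum normalised scores from several strategies."""
    total = {}
    for scores in score_maps:
        for user, s in normalise(scores).items():
            total[user] = total.get(user, 0.0) + s
    return sorted(total, key=lambda u: (-total[u], u))


def combine_positions(*ranked_lists):
    """S9-style sketch: award more points the higher a user appears in each list."""
    points = {}
    for ranked in ranked_lists:
        for pos, user in enumerate(ranked):
            points[user] = points.get(user, 0) + (len(ranked) - pos)
    return sorted(points, key=lambda u: (-points[u], u))


# Hypothetical retrieval scores from a content strategy and a collaborative one.
content = {"carol": 0.8, "dave": 0.4, "erin": 0.2}
collab = {"dave": 6.0, "erin": 3.0, "frank": 1.5}

print(combine_scores(content, collab))
print(combine_positions(["carol", "dave", "erin"], ["dave", "erin", "frank"]))
```

Position-based combination ignores the raw score scales, which helps when the individual strategies produce scores that are not directly comparable.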
Recommendation Precision
  • Our basic measure of recommendation performance is the average percentage overlap between a given recommendation list and the target user's actual followees-list.
  • We can also see that relevant recommendations tend to be clustered towards the top of recommendation lists, since the precision of all strategies is seen to decline with increasing recommendation-list size. Interestingly, the collaborative strategies perform better than the content strategies.
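
The overlap measure described above can be sketched as follows. This is an illustrative reading of "average percentage overlap", assuming precision is computed over the top-k recommendations and then averaged over test users; the function names and toy data are hypothetical.

```python
def precision_at_k(recommended, actual_followees, k):
    """Percentage of the top-k recommendations the user actually follows."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for user in top_k if user in actual_followees)
    return 100.0 * hits / len(top_k)


def mean_precision(recommendations, followees, k):
    """Average precision_at_k over all test users."""
    values = [
        precision_at_k(recs, followees.get(user, set()), k)
        for user, recs in recommendations.items()
    ]
    return sum(values) / len(values) if values else 0.0


# Hypothetical test user: relevant users sit at the top of the list.
recs = ["a", "c", "b", "d"]
truth = {"a", "c", "z"}
print(precision_at_k(recs, truth, 2))
print(precision_at_k(recs, truth, 4))
```

When the relevant users are concentrated at the top of the list, as in this toy case, precision falls as k grows, which matches the trend the slide reports.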
Ranking Effectiveness
  • The position of relevant recommendations is also an important consideration, especially since we know that users focus the lion's share of their attention on items at the top of results or recommendation-lists.
A live-user trial
  • What is the shortcoming of the off-line evaluation?
  • It’s unwise to discount the non-overlapping recommendations as definitively not relevant to the target user.

User Recommendation: an average of 6.9 users per recommendation-list.

User Search: an average of 4.9 of the suggested users per search.

Conclusion

Advantage:

  • User-generated content is used as a source of profiling data.
  • Tweets are not preprocessed.

My idea:

  • Users’ tweets should be preprocessed, for example by extracting tags from them. The tags may be more important than the rest of the content.
  • Other information, such as the groups a user joins, is also worth taking into account.
  • Users can be divided into celebrities and ordinary people. Different strategies should be applied to different kinds of users.
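
The first suggestion above, extracting tags from tweets, might look like the sketch below. It assumes that "tag" means a Twitter hashtag and that a simple regular expression is good enough; the function name and sample tweet are hypothetical.

```python
import re

# Assumption: a tag is "#" followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")


def extract_tags(tweet):
    """Return the hashtags in a tweet, lower-cased, without the leading '#'."""
    return [tag.lower() for tag in HASHTAG.findall(tweet)]


print(extract_tags("Great talk on #RecSys and #lucene!"))
```

The extracted tags could then be indexed as an additional profile source alongside the raw tweet text.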
Barry Smyth: Centre Director
  • His research interests include personalization, recommender systems, case-based reasoning, machine learning, and information retrieval.
  • Mr. John Hannon, Ph.D. student
  • Mike Bennett is a postdoctoral researcher and interaction designer.