Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis

Presentation Transcript

  1. Semantic Modelling of User Interests Based on Cross-Folksonomy Analysis • Martin Szomszor, Harith Alani, Kieron O’Hara, Nigel Shadbolt (University of Southampton) • Iván Cantador (Universidad Autónoma de Madrid) • TAGora: Semiotic Dynamics of Online Social Communities (EU-IST-2006-034721)

  2. Outline • Introduction and Motivation • Why is your folksonomy interaction useful? • How could it be exploited? • Architecture • Matching user accounts • Collecting Data • Tag Filtering • Profile Building • Experiment and Evaluation • Conclusions and Future Work

  3. Introduction http://news.bbc.co.uk/ http://slashdot.org/ Dream Theater Metallica Rush delicious.com

  4. Increasing number of online identities • A recent Ofcom study found that UK adults have on average 1.6 profiles. 39% of those that have a profile have at least 2 • Many predict that in the near future, individuals will have in excess of 10 profiles • [Ofcom 2008] Social Networking: A quantitative and qualitative research report into attitudes, behaviours and use.

  5. The Big Picture Profile of Interests delicious.com

  6. Personalisation • Profiles could be exported to other sites to improve recommendation quality • Profile of Interests • Better user experience • Profiles could be used to support personalised searching • delicious.com

  7. Consolidation and Integration cuba cuba hotels holiday travel 2008 currency http://dbpedia.org/resource/Cuba http://dbpedia.org/resource/Travel http://dbpedia.org/resource/Holiday http://dbpedia.org/resource/Category:Tourism

  8. User Tagging delicious.com

  9. Tag Clouds delicious.com

  10. Tagging Variation • Filtered Tags • Raw Tags • [1] Szomszor, M., Cantador, I. and Alani, H. (2008). Correlating User Profiles from Multiple Folksonomies. In: ACM Conference on Hypertext and Hypermedia, 2008, Pittsburgh, Pennsylvania.

  11. Architecture for Building Profiles of Interests

  12. Account Correlation • Using Google’s Social Graph API http://users.ecs.soton.ac.uk/mns2 account homepage delicious.com
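
The slide links accounts through a shared homepage. As a rough illustration only, the sketch below groups accounts by the homepage each one declares. It does not call the Social Graph API; the input pairs, the account names, and the helper names `normalise_url` and `correlate_accounts` are assumptions made for this example, not part of the original system.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalise_url(url: str) -> str:
    """Drop the scheme and trailing slash and lower-case the host, so that
    trivially different spellings of one homepage compare equal."""
    parts = urlsplit(url.strip())
    return parts.netloc.lower() + parts.path.rstrip("/")


def correlate_accounts(claims):
    """Group account URLs by the homepage each account declares.

    `claims` is an iterable of (account_url, homepage_url) pairs.
    Only homepages claimed by accounts on at least two different
    sites are returned.
    """
    by_homepage = defaultdict(set)
    for account_url, homepage_url in claims:
        by_homepage[normalise_url(homepage_url)].add(account_url)
    return {
        home: sorted(accounts)
        for home, accounts in by_homepage.items()
        if len({urlsplit(a).netloc.lower() for a in accounts}) >= 2
    }


# Hypothetical input: the account names are invented for illustration.
claims = [
    ("http://delicious.com/example_user", "http://users.ecs.soton.ac.uk/mns2"),
    ("http://www.flickr.com/photos/example_user", "http://users.ecs.soton.ac.uk/mns2/"),
    ("http://delicious.com/someone_else", "http://example.org/~someone"),
]
linked = correlate_accounts(claims)
```

Here `linked` holds a single homepage key that maps to the two example accounts; the third account is dropped because no account on a second site claims the same homepage.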

  13. Data Collection • Delicious • Custom Python scripts • Flickr • Using public API • Only public information is harvested

  14. Tag Filtering Process

  15. Creating User Profiles • Three stage process: • Identify Wikipedia page • London is matched with http://en.wikipedia.org/wiki/London • Extract Category list • Host cities of the Summer Olympic Games | Host cities of the Commonwealth Games | London | 1st century establishments | British capitals | Capitals in Europe | Port cities and towns in the United Kingdom • Select representative Categories • Only choose categories that match the tag string • Excludes spurious categories such as: • Host cities of the Summer Olympic Games • Needs more sources
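
The third stage above (select representative categories) can be sketched as follows. This is a minimal illustration, assuming that "match the tag string" means equality after lower-casing and stripping non-alphanumeric characters; the slide does not spell out the exact rule. The function names are hypothetical, and the category list is the one quoted on the slide for London.

```python
import re


def normalise(text: str) -> str:
    """Lower-case and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def select_representative(tag: str, categories: list[str]) -> list[str]:
    """Keep the categories whose normalised name equals the normalised tag.

    Assumption: exact match after normalisation; the original matching
    rule may differ.
    """
    key = normalise(tag)
    return [c for c in categories if normalise(c) == key]


# Stage 2 output for the London page, as listed on the slide.
london_categories = [
    "Host cities of the Summer Olympic Games",
    "Host cities of the Commonwealth Games",
    "London",
    "1st century establishments",
    "British capitals",
    "Capitals in Europe",
    "Port cities and towns in the United Kingdom",
]

selected = select_representative("london", london_categories)
```

Under this rule only the category "London" survives for the tag `london`, and maintenance categories such as "Needs more sources" would never be selected unless a tag had that exact name.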

  16. Profile of Interest

  17. Experiment Setup • Bootstrapped using 667,141 delicious profiles obtained in previous work • Only accounts with a matching Flickr profile and > 50 distinct tags were added • Final list contains 1,392 users

  18. Evaluation • Four evaluation procedures: • The performance of tag filtering and of matching tags to Wikipedia entries • The difference between the most common categories found in delicious and Flickr • The amount learnt from merging profiles from the two folksonomies • The accuracy of matching tags to Wikipedia categories

  19. Tag Filtering and Matching

  20. Global Category View • What are the differences in the interests that are learnt from each domain?

  21. Learning More About Users • How much more can we learn by using multiple profiles?
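
One way to make this question concrete is to merge per-site profiles and count the categories that only the second site contributes. The sketch below is illustrative only: it assumes a profile is a mapping from category to tag frequency, and the example profiles are invented, not data from the study.

```python
from collections import Counter


def merge_profiles(*profiles: Counter) -> Counter:
    """Sum per-category frequencies across folksonomies."""
    merged = Counter()
    for profile in profiles:
        merged.update(profile)
    return merged


def new_concepts(base: Counter, other: Counter) -> set:
    """Categories present in `other` but absent from `base`."""
    return set(other) - set(base)


# Hypothetical per-site profiles (category -> number of matching tags).
delicious_profile = Counter({"Travel": 4, "Cuba": 2, "Programming": 7})
flickr_profile = Counter({"Travel": 3, "London": 5, "Holiday": 1})

merged = merge_profiles(delicious_profile, flickr_profile)
gained = new_concepts(delicious_profile, flickr_profile)
```

In this toy case the Flickr profile adds two categories ("London" and "Holiday") that the Delicious profile lacks, and the shared category "Travel" accumulates the counts from both sites.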

  22. Category Matching • How good is the category matching? • Take 100 random users and choose 1 Delicious tag and 1 Flickr tag • Classify tag into one of 3 classes: • Correct • Unresolved (not matched to any category) • Ambiguous (Disambiguation required)
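
The sample size and class shares can be checked with a little arithmetic. The sketch below assumes that the 22.5% unresolved and 13% ambiguous figures quoted on the Future Work slide refer to this sample of 100 users with two tags each, and that the three classes cover every sampled tag; the transcript states neither assumption explicitly.

```python
from fractions import Fraction

users = 100
tags_per_user = 2  # one Delicious tag and one Flickr tag per user
sample_size = users * tags_per_user

# Shares quoted on the Future Work slide (assumed to describe this sample).
unresolved_share = Fraction("22.5") / 100
ambiguous_share = Fraction(13, 100)

# Assumption: Correct, Unresolved and Ambiguous are exhaustive classes.
correct_share = 1 - unresolved_share - ambiguous_share

unresolved_count = unresolved_share * sample_size
ambiguous_count = ambiguous_share * sample_size
correct_count = correct_share * sample_size
```

Under these assumptions all three counts come out as whole numbers of tags, which is consistent with a 200-tag sample, and the implied share of correctly matched tags is the remainder after subtracting the two reported percentages.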

  23. Conclusions • We have proposed a novel method for creating Profiles of Interest by exploiting an individual’s tagging activities across two popular folksonomy sites • Frequently used tags often, but not always, specify areas of interest • Common delicious tags are daily, toread, howto • Flickr tags often include names of people • Expanding the analysis across folksonomies increases the amount learnt • On average, 15 new concepts per user

  24. Future Work • Improve page matching • 22.5% of sample tags unresolved • Handle disambiguation • 13% of sample tags refer to ambiguous terms • Co-occurrence networks • Category hierarchy • Increase network coverage • Already have the data to include Last.fm • Understand which tags actually specify an interest of the individual • Filter out categories such as ‘Surname’