
The Social Web or how flickr changed my life



  1. The Social Web, or how flickr changed my life. Kristina Lerman, USC Information Sciences Institute, http://www.isi.edu/~lerman

  2. Web 1.0

  3. Web 2.0

  4. Elements of Social Web
  • Users contribute content: images (Flickr, Zoomr), news stories (Digg, Reddit), bookmarks (Delicious, Bibsonomy), videos (YouTube, Vimeo), …
  • Users add metadata to content
    • Tags: annotate content with freely chosen keywords
    • Discussion: leave comments
    • Evaluation: active through voting or passive through views & favorites
  • Users create social networks
    • Add other users as friends/contacts
    • Sites provide an easy interface to track friends’ activities
  • Transparency: publicly navigable content and metadata
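A minimal data-model sketch of these elements in Python; the field names are my illustration, not drawn from the talk or any particular site's API:

    # A minimal sketch of the data a social site exposes: user-contributed
    # content, user-added metadata, and a social network. Field names are
    # illustrative assumptions.
    from dataclasses import dataclass, field

    @dataclass
    class Item:
        url: str                                           # the contributed content
        submitter: str
        tags: set[str] = field(default_factory=set)        # freely chosen keywords
        comments: list[str] = field(default_factory=list)  # discussion
        votes: int = 0                                     # active evaluation
        views: int = 0                                     # passive evaluation

    @dataclass
    class User:
        name: str
        contacts: set[str] = field(default_factory=set)    # friends the user tracks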

  5. Anatomy of a Flickr page (screenshot): submitter, tags, discussion, image stats

  6. User profile

  7. User’s tags
  • Tags are keyword-based metadata added to content
  • Help users organize their own data
  • Facilitate searching and browsing for information
  • Freely chosen by the user

  8. User’s favorite images (by other photographers)

  9. So what? By exposing human activity, the Social Web allows users to exploit the intelligence and opinions of others to solve problems
  • New way of interacting with information: Social Information Processing
  • Exploit collective effects: word of mouth to amplify good information
  • Amenable to analysis: design optimal social information processing systems
  Challenge for AI: harness the power of collective intelligence to solve information processing problems

  10. Outline for the rest of the talk. User-contributed metadata can be used to solve the following information processing problems:
  • Discovery: collectively added tags used for information discovery
  • Personalization: user-added metadata, in the form of tags and social networks, used to personalize search results
  • Recommendation: social networks for information filtering
  • Dynamics of collaboration: mathematical study of a collaborative rating system

  11. Discovery · personalization · recommendation · dynamics of collaboration. With: Anon Plangrasopchok

  12. Information discovery
  • Goal: automatically find resources that provide some functionality (weather conditions, flight tracking, geocoding, …)
  • Simpler goal: find resources that provide the same functionality as a seed, e.g., http://flytecomm.com
    • Improve robustness of information integration applications
    • Increase coverage of the applications
  • Approach: leverage user-contributed tags to discover new resources similar to the seed (a baseline sketch follows this list)
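As a baseline illustration of the tag-based approach (my sketch, not the model used in the talk), each resource can be represented by a vector of its tag counts and candidates ranked by cosine similarity to the seed:

    # Rank candidate resources by the cosine similarity of their tag-count
    # vectors to the seed's. A deliberately simple stand-in for the
    # probabilistic model described on the following slides.
    from collections import Counter
    from math import sqrt

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def rank_by_tag_similarity(seed_tags: Counter, candidates: dict):
        """candidates maps resource URL -> Counter of its tags."""
        return sorted(candidates, key=lambda url: cosine(seed_tags, candidates[url]), reverse=True)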

  13. Anatomy of a Delicious page (screenshot): resource, popular tags, user notes, user tags

  14. Probabilistic approach
  • Find a compressed description of each source
  • Extract “latent topics” in a collection of sources using a probabilistic generative model
  • Compute pair-wise similarity between the seed and each source using the compressed descriptions
  [Pipeline diagram: sources (tags, users) → probabilistic model → compressed description → compute source similarity → similar sources (sorted); a simplified sketch follows]
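A simplified sketch of this pipeline using off-the-shelf LDA; the talk's ITM model additionally conditions on users, so this stand-in only compresses the tag co-occurrence structure:

    # Compress each source's tag counts into a topic mixture, then rank
    # sources by the similarity of their mixtures to the seed's. X is a
    # sources-by-tags count matrix; row `seed_row` is the seed.
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.metrics.pairwise import cosine_similarity

    def similar_to_seed(X: np.ndarray, seed_row: int, n_topics: int = 20):
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        theta = lda.fit_transform(X)            # per-source topic mixtures
        sims = cosine_similarity(theta[seed_row:seed_row + 1], theta).ravel()
        return np.argsort(-sims)                # source indices, most similar first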

  15. Alternative models (plate diagrams): ITM [Plangrasopchok & Lerman, IIWeb’07], pLSA [Hofmann, UAI’99], MWA [Wu+, WWW’06]

  16. Datasets
  • Seed resources: flytecomm, geocoder, wunderground
  • For each seed, retrieve its 20 popular tags
  • For each tag, retrieve other resources annotated with the same tag
  • For each resource, retrieve all resource-user-tag triples
  (a sketch of this crawl follows)
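A sketch of that crawl loop; the three fetch_* callables stand in for Delicious API or scraping code and are assumptions, not a real client library:

    # Collect (resource, user, tag) triples around a seed, following the
    # recipe above: seed -> its popular tags -> resources sharing those
    # tags -> all annotations of those resources.
    def crawl(seed_url, fetch_popular_tags, fetch_resources_tagged, fetch_annotations):
        triples = []
        for tag in fetch_popular_tags(seed_url)[:20]:      # 20 popular tags of the seed
            for resource in fetch_resources_tagged(tag):
                for user, user_tag in fetch_annotations(resource):
                    triples.append((resource, user, user_tag))
        return triples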

  17. Experimental results: number of sources found with functionality similar to the seed
  • pLSA – ignores users
  • MWA – naïve Bayes
  • ITM – our model (user interests and source topics)
  • Google – ‘find similar pages’
  Plangrasopchok & Lerman, “Exploiting Social Annotation for Resource Discovery,” AAAI IIWeb workshop, 2007

  18. Summary and future work
  • Exploit the tagging activities of different users to find data sources similar to a seed
  • Future work: extend the probabilistic model to learn topic hierarchies (aka folksonomies), e.g.:
    • Travel
      • Flights: booking, status
      • Hotels: booking, reviews
      • Car rentals
      • Destinations

  19. discovery · Personalization · recommendation · dynamics of collaboration. With: Anon Plangrasopchok & Michael Wong

  20. Image search on Flickr. Tag search finds all images tagged with a given keyword … it is prone to ambiguity
  • Beetle: insect; car model
  • Tiger: Panthera tigris; house cat; shark (tiger shark); Mac OS X; flower (tiger lily)
  • Newborn: baby; kitten; puppy; etc.

  21. Plain tag search: relevance results for the top 500 images retrieved by tag search (manually labeled using the first sense of each keyword)

  22. Personalizing search results. Users express their tastes and preferences through the metadata they create:
  • Contacts they add to their social networks
  • Tags they add to their own images
  • Images they mark as their favorites
  • Groups they join
  Use this metadata to improve image search results!
  • Personalizing by tags
  • Personalizing by contacts: restrict the results of an image search to images submitted by user u’s friends (Level 1 contacts); see the sketch below
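A minimal sketch of the contact-based filter; the function and field names are my assumptions:

    # Keep only search results submitted by the user's contacts. With
    # levels=2, contacts-of-contacts (Level 2) are allowed as well.
    def personalize_by_contacts(results, contacts_of, user, levels=1):
        """results: iterable of (image, submitter); contacts_of: user -> set of names."""
        allowed = set(contacts_of(user))          # Level 1 contacts
        if levels >= 2:
            for c in list(allowed):
                allowed |= contacts_of(c)         # add Level 2 contacts
        return [(img, sub) for img, sub in results if sub in allowed]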

  23. Personalizing by contacts: results. Restricting to Level 1 + Level 2 contacts yields a 9%–16% average improvement in precision

  24. Personalizing by tags
  • Users often add descriptive metadata to images: tags, titles, image descriptions, adding images to groups
  • Personalizing by tags:
    • Find (hidden) topics of interest to the user
    • Find images in the search results related to these topics

  25. Probabilistic topic model
  • Tagging as a stochastic process:
    • User u posts an image i
    • Based on u’s interests, topics z are chosen
    • Tag t is selected based on z
  • Probabilistic topic model:
    • Use EM to estimate p(t|z) and p(z|u) from data
    • Used to find topics in each search set of 4500 images
  [Plate diagram: U → Z → T, with Nt tags per image over I images; an EM sketch follows]
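A hedged numpy sketch of the EM updates for a pLSA-style simplification of this model, treating users as "documents" over their aggregated tag counts; the talk's model also has a per-image plate, which this sketch folds away:

    # EM for p(t|u) = sum_z p(t|z) p(z|u). N is a users-by-tags count matrix.
    import numpy as np

    def em_topic_model(N: np.ndarray, n_topics: int, n_iter: int = 100, seed: int = 0):
        rng = np.random.default_rng(seed)
        n_users, n_tags = N.shape
        p_t_z = rng.dirichlet(np.ones(n_tags), size=n_topics)   # p(t|z), shape (Z, T)
        p_z_u = rng.dirichlet(np.ones(n_topics), size=n_users)  # p(z|u), shape (U, Z)
        for _ in range(n_iter):
            # E-step: responsibilities p(z|u,t) proportional to p(t|z) p(z|u)
            joint = p_z_u[:, :, None] * p_t_z[None, :, :]       # (U, Z, T)
            resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
            # M-step: reweight responsibilities by observed counts, renormalize
            weighted = N[:, None, :] * resp                     # (U, Z, T)
            p_t_z = weighted.sum(axis=0)
            p_t_z /= p_t_z.sum(axis=1, keepdims=True) + 1e-12
            p_z_u = weighted.sum(axis=2)
            p_z_u /= p_z_u.sum(axis=1, keepdims=True) + 1e-12
        return p_t_z, p_z_u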

  26. [Chart: p(t|z) for the “tiger” image set; 4500 images, model trained with 10 topics]

  27. Personalizing by tags: results. Precision of the N top-ranked search results for “newborn” and “beetle”, compared to plain search
  • 4 users chosen to be interested in the first sense of each search term
  • Plain search – Flickr’s ordering of search results
  Lerman et al., “Personalizing Image Search Results on Flickr,” AAAI ITWP workshop, 2007

  28. Summary & future work
  • Image search results improve for an individual user, as long as the user has expressed interest in the topic of the search
  • Future work:
    • Lots of other metadata to exploit: favorites, groups, image titles and descriptions
    • Discover relevant synonyms to expand the search
    • Topics that are new to the user?
    • Exploit collective knowledge to find communities of interest, and identify authorities within those communities

  29. discovery · personalization · Recommendation · dynamics of collaboration. With: Dipsy Kapoor

  30. Social news aggregation on Digg
  • Users submit stories
  • Users vote on (digg) stories
  • Stories are promoted to the front page based on the votes they receive
  • A collaborative front page emerges from the opinions of many users, not a few editors
  • Users create social networks by adding others as friends; the Friends Interface makes it easy to track friends’ activities:
    • Stories friends submitted
    • Stories friends dugg (voted on)

  31. Top users
  • Digg ranks users based on how many of their stories were promoted to the front page; the user with the most promoted stories is ranked #1, …
  • Top 1000 users dataset, collected by scraping Digg (now available through the API):
    • Usage statistics: user rank; how many stories each user submitted, dugg, commented on
    • Social networks (a graph sketch follows):
      • Friends: outgoing links. A → B := B is a friend of A
      • Reverse friends: incoming links. A → B := A is a reverse friend of B
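The friend relation is a directed graph; a small sketch (names are illustrative) of how the two link directions relate:

    # An edge A -> B means "B is a friend of A"; B's reverse friends are
    # then everyone with an edge pointing at B.
    from collections import defaultdict

    friends = defaultdict(set)          # outgoing: friends[A] = users A follows
    reverse_friends = defaultdict(set)  # incoming: reverse_friends[B] = users following B

    def add_friend(a: str, b: str):
        """A adds B as a friend (edge A -> B)."""
        friends[a].add(b)
        reverse_friends[b].add(a)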

  32. Digg datasets
  • To see how votes change in time: tracked 2858 stories submitted over a period of more than a day in May 2006; only 98 of these stories were promoted to the front page
  • To see how users vote on stories: for ~200 front-page stories, collected the names of the users who voted on (dugg) each story

  33. [Chart: dynamics of votes on top users’ stories]

  34. ‘Interestingness’ distribution. Two story samples: 50 stories from 14 users (ave. max votes = 600) vs. 48 stories from 45 users (ave. max votes = 1050). Top users are not submitting the most “interesting” stories

  35. Social filtering as recommendation. Social filtering explains why top users are so successful:
  • Users express their preferences by creating social networks
  • They use these networks, through the Friends Interface, to find new stories to read
  • Claim 1: users digg stories their friends submit
  • Claim 2: users digg stories their friends digg

  36. [Graph: social network of the top 1000 Digg users]

  37. How the Friends Interface works (screenshots): ‘see stories my friends submitted’ (submitter view) and ‘see stories my friends dugg’

  38. Users digg stories submitted by friends
  [Chart: number of diggs coming from the submitter’s friends vs. number of reverse friends]
  The probability of that many friends digging a story by chance is P = 0.005 (a sketch of one such test follows)
  Lerman, “Social Browsing & Information Filtering in Social Media,” submitted to JCMC
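One way to compute such a chance probability is a hypergeometric tail; this modeling choice is my assumption, not necessarily the paper's test. If a story's n voters were drawn at random from U active users, of whom F are the submitter's friends, how likely is it that at least k voters are friends?

    # P(at least k of n random voters fall among the F friends), with
    # voters drawn without replacement from U users.
    from scipy.stats import hypergeom

    def p_friends_by_chance(U: int, F: int, n: int, k: int) -> float:
        return hypergeom.sf(k - 1, U, F, n)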

  39. ‘Tyranny of the minority’. Top users submit the lion’s share of front-page stories
  • Explained by social filtering: top users have bigger, more active social networks
  • Conspiracy, an alternative explanation of top-user success: top users were accused of colluding to automatically promote each other’s stories
  • The resulting uproar led Digg to change its story promotion algorithm to discount votes coming from friends
  • This led to greater front-page diversity, but also unintended consequences

  40. Design of collaborative rating systems
  • Designing a collaborative rating system, which exploits the emergent behavior of many independent evaluators, is difficult:
    • Small changes can have big consequences
    • There are few tools to predict system behavior beyond execution and simulation
  • Can we explore the effects of promotion algorithms before they are implemented?

  41. discovery · personalization · recommendation · Dynamics of collaboration. With: Dipsy Kapoor

  42. Analysis as a design tool. Mathematical analysis can help understand and predict the emergent behavior of collaborative information systems
  • Study the choice of promotion algorithm before it is implemented
  • Study the effect of design choices on system behavior: story timeliness, interestingness, user participation, incentives to join social networks, etc.

  43. Dynamics of collaborative rating. A story is characterized by:
  • Interestingness r: the probability that a story will receive a vote when seen by a user
  • Visibility:
    • Visibility on the upcoming stories page: decreases with time as new stories are submitted
    • Visibility on the front page: decreases with time as new stories are promoted
    • Visibility through the Friends Interface: stories friends submitted, stories friends dugg (voted on)

  44. Mathematical model
  • A mathematical model describes how the number of votes m(t) changes in time
  • Solve the resulting rate equation (a hedged sketch follows)
  • Solutions are parametrized by S and r; other parameters are estimated from data
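A hedged LaTeX sketch of such a rate equation, assuming (per slide 43) that votes accrue in proportion to interestingness times the story's total visibility; the exact functional forms are in the cited paper:

    \frac{dm}{dt} = r \left( v_{\mathrm{upcoming}}(t) + v_{\mathrm{front}}(t) + v_{\mathrm{friends}}(t) \right)

Since visibility through the Friends Interface grows with the submitter's social network size S, the solutions are naturally parametrized by S and r.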

  45. Dynamics of votes [chart: model vs. data]
  Lerman, “Social Information Processing in Social News Aggregation,” IEEE Internet Computing (in press), 2007

  46. Exploring the parameter space
  [Charts: minimum S required for a story with a given r to be promoted, and time taken for a story with given r and S to be promoted to the front page, both for a fixed promotion threshold]

  47. Dynamics of user influence
  • Digg ranks users according to how many front-page stories they have had
  • A model of the dynamics of user influence tracks:
    • F: the number of stories promoted to the front page
    • S: the growth of the user’s social network
  [Chart: rank trajectories of user1 … user6]

  48. Model of rank dynamics
  • F: number of stories promoted to the front page
  • M: number of stories submitted over Δt = one week
  • The user’s promotion success rate grows with S(t)
  • The user’s social network S grows as:
    • others discover him through new front-page stories (~ΔF)
    • others discover him through the Top Users list (~g(F))
  • Solve the coupled equations; estimate b, c, and g(F) from data (a hedged sketch follows)
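A hedged LaTeX sketch of one consistent reading of these bullets; the paper's exact equations may differ. Per week, promoted stories grow with the submission rate times a success rate that increases with S, while the network grows through both discovery channels:

    \Delta F = M \, p_{\mathrm{promote}}\bigl(S(t)\bigr), \qquad \Delta S = b\,\Delta F + c\,g(F)

with b, c, and the Top Users list term g(F) estimated from data.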

  49. Solutions, part 1 [charts: user2 data vs. model; user6 data vs. model]
  Lerman, “Dynamics of Collaborative Rating of Information,” KDD/SNA workshop, 2007

  50. Solutions, part 2 [charts: user1 data vs. model; user5 data vs. model]
  Lerman, “Dynamics of Collaborative Rating of Information,” KDD/SNA workshop, 2007
