Ralf Schenkel

„IP“ is not always „Internet Protocol“
A long and a very short example for IP problems in Web 2.0 research

Joint work with Tom Crecelius, Mouna Kacimi, Sebastian Michel, Thomas Neumann, Josiane Parreira, Marc Spaniol, Gerhard Weikum

Social Tagging Networks

Common examples:

  • Flickr (images)
  • YouTube (videos)
  • del.icio.us (bookmarks)
  • Librarything (books)
  • Discogs (CDs)
  • CiteULike (papers)
  • Facebook
  • Myspace (media)

Definition: Social Tagging Network

Website where people

  • publish + tag information
  • review + rate information
  • publish their interests
  • maintain network of friends
  • interact with friends

Dagstuhl Perspectives Workshop Web 2.0

Some Statistics

Flickr: (as of Nov 2007)

  • 2+ billion photos

Facebook: (as of Apr 2007)

  • 1.8 billion photos
  • 31 million active users
  • 100,000 new users per day

Myspace: (as of Apr 2007)

  • 135 million users (6th largest country on Earth)
  • 2+ billion images (150,000 req/s), millions added daily
  • 25 million songs
  • 60TB videos

Huge volume of highly dynamic data

Showcase: librarything.com

[Screenshot: librarything.com page with callouts for Tags, Ratings, Others, and Books]

librarything.com: Social Interaction

[Screenshot: librarything.com page with callouts for Similar Users, Comments, and Explicit Friends]

librarything.com: Tag Clouds

librarything.com: Search

Search results independent of the querying user (and the social context)

Outline
  • Introduction
  • Modelling Social Tagging Networks
    • Graph Model
    • Different Information Needs
  • Effective Query Scoring
  • Efficient Query Evaluation
  • Summary & Further Challenges

Social Network Model

[Figure: graph with three layers labelled USERS, TAGS, and ITEMS; visible labels: travelChina, queueingtheory, travelNorway]


Social Network Model

[Figure: graph with three layers labelled USERS, TAGS, and ITEMS; visible labels: travel, queues, probability, tripvldb, travelChina, queueingtheory, travelNorway, harrypotter]
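
The three-layer model on this slide can be pictured as a small data structure. The following is only a minimal sketch, not the authors' implementation: the class `TaggingNetwork`, its method names, the user names, and the pairing of tags with items are all assumptions made for illustration; only the labels (travel, queues, probability, travelChina, travelNorway, queueingtheory) are taken from the slide.

```python
from collections import defaultdict


class TaggingNetwork:
    """Users, tags, and items connected by friendship and tagging edges."""

    def __init__(self):
        self.friends = defaultdict(set)  # user -> set of befriended users
        self.taggings = set()            # (user, tag, item) triples

    def add_friendship(self, user_a, user_b):
        # Friendship is treated as symmetric here (an assumption).
        self.friends[user_a].add(user_b)
        self.friends[user_b].add(user_a)

    def add_tagging(self, user, tag, item):
        self.taggings.add((user, tag, item))

    def tags_of_item(self, item):
        # All tags that any user attached to the item.
        return {t for (_, t, i) in self.taggings if i == item}

    def taggers(self, tag, item):
        # All users who attached this tag to this item.
        return {u for (u, t, i) in self.taggings if t == tag and i == item}


# Hypothetical users; which tag belongs to which item is assumed.
net = TaggingNetwork()
net.add_friendship("alice", "bob")
net.add_tagging("alice", "travel", "travelChina")
net.add_tagging("bob", "travel", "travelNorway")
net.add_tagging("bob", "probability", "queueingtheory")
net.add_tagging("alice", "queues", "queueingtheory")
```

Storing taggings as (user, tag, item) triples keeps all three layers queryable from one set; a real system at the scale quoted earlier would need indexed storage instead.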

Information Need 1: Global

[Figure: the social network graph (USERS, TAGS, ITEMS) with the additional label "harry potter"]

Tags by all users equally important

Information Need 2: Similar Users

[Figure: the social network graph (USERS, TAGS, ITEMS) with the additional labels "travel" and "?"]

Tags by users with similar tags/items more important

Information Need 3: Trusted Friends

[Figure: the social network graph (USERS, TAGS, ITEMS) with the additional labels "probability" and "?"]

Tags by closely related users more important

Wishlist for Social-Aware Social Search
  • Search results depend on
    • Global popularity of items
    • Collection context of the querying user (books, tags)
    • Social context of the querying user (trusted friends)
  • Automatic tag expansion (beyond synonyms)
  • Scalable query processing
  • Explanation of results

(similar wishlist for social recommendations)

Fast Forward…

Imagine a 20-minute talk about quantified friendship measures, personalized scoring models, dynamic tag expansion, scalable query processing, …

Essence:

  • Context-aware personalized search
  • Tags from closely related users are more important
  • Different kinds of „relatedness“ possible

[SIGIR 2008]
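
To make the essence concrete, here is a small illustrative sketch of tag-based scoring in which each tagging counts with a weight that depends on who made it. This is not the scoring model of the cited paper: the weight functions, the Jaccard overlap used for "similar users", the user names, and the sample taggings are all assumptions chosen for illustration.

```python
# Hypothetical taggings: (user, item, tag).
taggings = [
    ("alice", "travelChina", "travel"),
    ("bob", "travelNorway", "travel"),
    ("bob", "queueingtheory", "probability"),
    ("carol", "travelNorway", "travel"),
    ("carol", "travelChina", "travel"),
    ("dave", "queueingtheory", "probability"),
]
friends = {"dave": {"bob"}}  # hypothetical explicit friendships


def global_weight(querier):
    # Information need 1: tags by all users count equally.
    return lambda user: 1.0


def similarity_weight(querier):
    # Information need 2: weight by tag-set overlap (Jaccard, an assumption).
    own = {t for (u, _, t) in taggings if u == querier}

    def weight(user):
        other = {t for (u, _, t) in taggings if u == user}
        union = own | other
        return len(own & other) / len(union) if union else 0.0

    return weight


def friend_weight(querier):
    # Information need 3: only tags by explicit friends count.
    return lambda user: 1.0 if user in friends.get(querier, set()) else 0.0


def score(item, query_tags, weight):
    # Sum the weights of all taggings of the item that match the query.
    return sum(weight(u) for (u, i, t) in taggings
               if i == item and t in query_tags)


for make_weight in (global_weight, similarity_weight, friend_weight):
    w = make_weight("dave")
    print(make_weight.__name__,
          score("travelChina", {"travel"}, w),
          score("travelNorway", {"travel"}, w))
```

With these sample taggings both items tie under the global weight, while the two personalized weights prefer travelNorway for the querying user, because that item was tagged by a user related to them.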

Experimental Evaluation: Effectiveness

Systematic evaluation of result quality difficult

Three possible setups:

  • Manual queries + human assessments
  • Queries + assessments derived from external info (e.g., DMOZ categories)
  • Automated assessments from context of user
    • Items tagged by friends
    • Items tagged in the future
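
The "items tagged in the future" setup can be sketched as a temporal split: taggings before a cutoff date form the searchable collection, and items the user tags only afterwards are treated as relevant. This is one possible reading of the bullet, not a procedure confirmed by the slides; the function `temporal_split`, the event tuples, and the dates are hypothetical.

```python
from datetime import date

# Hypothetical tagging events: (user, item, tag, day).
events = [
    ("alice", "travelChina", "travel", date(2007, 3, 1)),
    ("bob", "travelNorway", "travel", date(2007, 5, 9)),
    ("alice", "queueingtheory", "probability", date(2007, 6, 2)),
    ("alice", "travelNorway", "travel", date(2008, 1, 15)),
    ("alice", "travelChina", "trip", date(2008, 2, 1)),
]


def temporal_split(events, user, cutoff):
    """Return (history, relevant) for one user and a cutoff date.

    history:  all events strictly before the cutoff (the test collection).
    relevant: items the user tagged on or after the cutoff and had not
              tagged before it.
    """
    history = [e for e in events if e[3] < cutoff]
    seen = {item for (u, item, _, _) in history if u == user}
    later = {item for (u, item, _, day) in events
             if u == user and day >= cutoff}
    return history, later - seen


history, relevant = temporal_split(events, "alice", date(2008, 1, 1))
print(len(history), relevant)
```

Items the user had already tagged before the cutoff are excluded from the relevant set, so that re-tagging a known item does not count as a successful prediction.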

?

Prototype Implementation

Not on the Web!

[SIGIR Demo 2008], [VLDB Demo 2008]

Preliminary User Study

LibraryThing user study: [Data Engineering Bulletin, June 2008]

  • 6 librarything users with reasonably large library and friend sets
  • Overall 49 queries
  • Crawled (part of) librarything: ~1.3 million books, ~15 million tags, ~12,000 users, ~18,000 friends
  • Measured NDCG[10]
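
For readers unfamiliar with the measure, the sketch below computes NDCG at a rank cutoff, assuming that NDCG[10] denotes NDCG over the top 10 results. The slides do not say which DCG variant was used; this version uses linear gains with a log2 rank discount, and it normalizes against the ideal ordering of the returned list only, which is a simplification.

```python
import math


def dcg(gains, k):
    # Discounted cumulative gain over the top-k graded relevance values.
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


def ndcg(gains, k=10):
    # Normalize by the DCG of the same gains in the best possible order.
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades of a ranked result list (0 = irrelevant).
print(ndcg([3, 2, 1]))           # already ideally ordered
print(ndcg([0, 2, 3, 0, 1]))     # relevant results ranked too low
```

A value of 1.0 means the ranking cannot be improved by reordering; lower values indicate that highly rated results appear too far down the list.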

[Figure: results chart; recoverable labels: "Authors of the paper", "(1-α)", "(content)", "(graph)"]

We need a benchmark collection, but…
  • Everybody „has“ data from Flickr, librarything
  • Data contains private information by definition
  • Data cannot be successfully anonymized (AOL)
  • Data must not be anonymized (we need the users to assess results)
  • Data must be large scale (a few volunteers are not enough)
  • Collection must be completely available offline for stability of results (including images, …)

Online Information is Volatile
  • Huge amount of information available online only today
  • Easily lost (hardware failure, software failure, human failure, deletion, attack, …)
  • Easily inaccessible (does anybody know Interleaf?)
  • Easily manipulated
  • How will historians learn about the 21st century?

Strong need for long-term preservation of the evolving Web
