Automated Tag Clustering: Improving Search and Exploration in the TagSpace May 2006 Grigory Begelman Technion Israel Institute of Technology Computer Science Dpt [email protected] Philipp Keller Citrin Informatik GmbH [email protected] Frank Smadja RawSugar [email protected]

Collaborative Tagging Workshop, WWW2006

Automated Tag Clustering: Improving Search and Exploration in the TagSpace, May 2006

Grigory Begelman, Technion Israel Institute of Technology, Computer Science Dpt, [email protected]

Philipp Keller, Citrin Informatik GmbH, [email protected]

Frank Smadja, RawSugar, [email protected]


Problem 1: Searching the TagSpace

How would you tag this? How would you search for it?

Tags: Ikura, Uni, Ebi, Sushi, Nigiri, Japanese food, lunch in Tokyo, Ezobafun-uni, Kitamurashiuni, Murasakiuni, Akazaebi, Tenagaebi, etc.


Problem 2: Exploring the TagSpace

Callouts on the slide image: Locations; Restaurant Type; morphology; Not a restaurant!


Problem 3: Exploring the TagSpace

Not usable!


What is Missing? Tag Relations

  • “Tag relations improve searchability and exploration.”

  • Similar tags:

      • Spelling and morphology: macos <-> mac_os <-> mac os; tagging <-> tags <-> tagged

      • Synonyms: macos <-> tiger; films <-> movies; new york <-> nyc

      • Related: cooking <-> recipes; software development <-> programming

  • Tag groups or subtags:

      • Location -> san francisco, london, new york, etc.

      • Food -> sushi, sashimi, pizza, etc.

      • Programming -> html, java, css, etc.

Goal: discover them by mining the tag space
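As a rough illustration of the simplest case in the list above (spelling variants such as macos, mac_os and mac os), the sketch below groups tags that become identical once case, whitespace, underscores and hyphens are removed. The function names and the normalization rule are assumptions made for this example, not something the slides specify. It does not cover morphology (tagging, tags, tagged), synonyms, or related tags, which need evidence from how tags are actually used.

```python
import re
from collections import defaultdict

def normalize_tag(tag: str) -> str:
    """Collapse case, whitespace, underscores and hyphens into one key.

    Hypothetical rule chosen for this sketch only.
    """
    return re.sub(r"[\s_\-]+", "", tag.lower())

def group_spelling_variants(tags):
    """Return {normalized key: sorted surface forms} for keys with 2+ forms."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    # Keep only keys that actually merged more than one surface form.
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}

variants = group_spelling_variants(
    ["macos", "mac_os", "mac os", "Mac-OS", "nyc", "new york"]
)
print(variants)
```

Note that nyc and new york stay separate here: string normalization cannot find synonyms, which is why the slides propose mining the tag space.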


Related Work

Tagger’s nightmare!!

Top-down predefined taxonomy: rigid, not scalable, expensive



RawSugar – Tag Hierarchy Guided Navigation

Food groups

Origins groups

Locations groups


RawSugar Tag Hierarchy

  • Key idea: Some users (4%) define tag hierarchies – (food>sushi, european>spanish, …)

  • We mine this tag space to learn simple tag-relations (ISA relations and RELATED) using probabilities.

  • At search time: We apply this learned knowledge to group tags from results.
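The slides do not say how the probabilities are computed, so the following is only a minimal sketch under stated assumptions: each user contributes a set of (parent, child) pairs such as food>sushi, and a parent is accepted for a child when enough users agree and the share of users choosing that parent is high enough. The names `learn_isa`, `min_users` and `min_prob` are hypothetical.

```python
from collections import defaultdict

def learn_isa(user_hierarchies, min_users=2, min_prob=0.5):
    """Estimate ISA relations from user-defined hierarchy fragments.

    user_hierarchies: dict mapping user -> iterable of (parent, child) pairs.
    Returns {(parent, child): estimated P(parent | child)}.
    Assumed scoring, not taken from the slides.
    """
    pair_users = defaultdict(set)   # (parent, child) -> users asserting it
    child_users = defaultdict(set)  # child -> users who gave it any parent
    for user, pairs in user_hierarchies.items():
        for parent, child in pairs:
            pair_users[(parent, child)].add(user)
            child_users[child].add(user)

    relations = {}
    for (parent, child), users in pair_users.items():
        prob = len(users) / len(child_users[child])
        if len(users) >= min_users and prob >= min_prob:
            relations[(parent, child)] = prob
    return relations

fragments = {
    "user1": [("food", "sushi"), ("european", "spanish")],
    "user2": [("food", "sushi")],
    "user3": [("japan", "sushi"), ("food", "pizza")],
}
isa = learn_isa(fragments)
print(isa)
```

With this toy data only food>sushi survives, because two of the three users who placed sushi somewhere put it under food; every other pair is asserted by a single user.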


RawSugar – Guided Search Combining Hierarchy Fragments

[Figure: hierarchy fragments defined by five users (User 1 to User 5). Tags shown: food, europe, cooking, recipes, UK, Scotland, Edinburgh, Spain, Asian, Chinese, Italy, Thai, food, Southwest, vegetarian, California, Sushi, Bay Area, San Francisco, Texas.]


Related work

Rashmi Sinha: “Tag Sorting: Another tool in an information architect's toolbox” http://www.rashmisinha.com/archives/05_02/tag-sorting.html

Emanuele Quintarelli: “Hierarchical taxonomies from flat tag spaces” http://www.infospaces.it/wordpress/topics/information-architecture/91

Paul Heymann (Stanford): “Tag Hierarchies” http://i.stanford.edu/~heymann/taghierarchy.html

Brooks, Montanez, University of San Francisco: “Improved Annotation of the Blogosphere via Autotagging and Hierarchical Clustering” http://www.cs.usfca.edu/~brooks/papers/brooks-montanez-www06.pdf

Siderean fac.etio.us: “Faceted search on delicious tags” http://www.siderean.com/delicious/facetious.jsp

Marti Hearst: “Clustering vs. Faceted Search” http://bailando.sims.berkeley.edu/papers/cacm06.pdf

And more …


1. Get tag metadata

2. Build tag relation graph

3. Compute similarity

4. Cluster
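The transcript keeps only the titles of these four steps, so the sketch below is one possible reading, not the authors' method. It assumes tag metadata arrives as a mapping from page to tags, uses co-occurrence counts as the relation graph, Jaccard similarity as the edge weight, and connected components of the thresholded graph as a deliberately simple stand-in for a real clustering algorithm. All function names and the threshold value are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(tagged_pages):
    """Steps 1-2: count tag usage and tag-pair co-occurrence per page."""
    tag_count = defaultdict(int)
    pair_count = defaultdict(int)
    for tags in tagged_pages.values():
        unique = sorted(set(tags))
        for tag in unique:
            tag_count[tag] += 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] += 1
    return tag_count, pair_count

def jaccard_edges(tag_count, pair_count, threshold):
    """Step 3: keep edges whose Jaccard similarity reaches the threshold."""
    edges = {}
    for (a, b), both in pair_count.items():
        sim = both / (tag_count[a] + tag_count[b] - both)
        if sim >= threshold:
            edges[(a, b)] = sim
    return edges

def cluster(tags, edges):
    """Step 4 (stand-in): connected components via union-find."""
    parent = {t: t for t in tags}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for a, b in edges:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for t in tags:
        groups[find(t)].add(t)
    return sorted(sorted(g) for g in groups.values())

pages = {  # toy data, invented for this example
    "p1": ["sushi", "japanese", "food"],
    "p2": ["sushi", "japanese"],
    "p3": ["java", "programming"],
    "p4": ["java", "programming", "css"],
    "p5": ["food", "recipes"],
}
tag_count, pair_count = build_cooccurrence(pages)
edges = jaccard_edges(tag_count, pair_count, threshold=0.6)
clusters = cluster(tag_count, edges)
print(clusters)
```

Even on this toy data the grouping is sensitive to the threshold: at 0.6 css, food and recipes remain singletons, while at 0.5 css joins java and programming and food joins recipes.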

Results/Problems: Definition of “internet”

Results/Problems: Ambiguity

Results/Problems: Clustering needs a lot of tuning

Possible application: Group popular bookmarks
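The slide gives no detail on how bookmarks would be grouped. One simple possibility, sketched here purely as an assumption, is to let each bookmark's tags vote for the cluster they belong to and file the bookmark under the winning cluster. The `tag_to_cluster` mapping, the cluster labels and the example URLs are all made up for illustration.

```python
from collections import Counter, defaultdict

def group_bookmarks(bookmarks, tag_to_cluster):
    """Assign each bookmark to the cluster that most of its tags fall into.

    Bookmarks with no clustered tag go to the "unclustered" group.
    """
    groups = defaultdict(list)
    for url, tags in bookmarks.items():
        votes = Counter(tag_to_cluster[t] for t in tags if t in tag_to_cluster)
        label = votes.most_common(1)[0][0] if votes else "unclustered"
        groups[label].append(url)
    return dict(groups)

tag_to_cluster = {  # hypothetical output of a tag clustering step
    "sushi": "food",
    "recipes": "food",
    "java": "programming",
    "css": "programming",
}
bookmarks = {
    "http://example.com/a": ["sushi", "recipes", "css"],
    "http://example.com/b": ["java", "css"],
    "http://example.com/c": ["travel"],
}
grouped = group_bookmarks(bookmarks, tag_to_cluster)
print(grouped)
```

Majority voting forces each bookmark into a single group; a bookmark tagged with an ambiguous tag would need a softer assignment than this sketch provides.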

Tags that belong to the same clusters -

