Automated Tag Clustering: Improving Search and Exploration in the TagSpace (May 2006)

Presentation Transcript


Collaborative Tagging Workshop, WWW2006

Automated Tag Clustering: Improving Search and Exploration in the TagSpace

May 2006

Grigory Begelman, Technion Israel Institute of Technology, Computer Science Dpt, gbeg@cs.technion.ac.il

Philipp Keller, Citrin Informatik GmbH, phred@citrin.ch

Frank Smadja, RawSugar, frank@rawsugar.com


Problem 1: Searching the TagSpace

How would you tag this?

How would you search for it?

Tags: Ikura, Uni, Ebi, Sushi, Nigiri, Japanese food, lunch in Tokyo, Ezobafun-uni, Kitamurashiuni, Murasakiuni, Akazaebi, Tenagaebi, etc.


Problem 2: Exploring the TagSpace

[Figure callouts: Locations; Restaurant Type; morphology; Not a restaurant!]


Problem 3: Exploring the TagSpace

Not usable!


What is Missing? Tag relations

  • “Tag Relations improve searchability and exploration.”

  • Similar tags:

      • Spelling and morphology: macos <-> mac_os <-> mac os; tagging <-> tags <-> tagged

      • Synonyms: macos <-> tiger; films <-> movies; new york <-> nyc

      • Related: cooking <-> recipes; software development <-> programming

  • Tag groups or subtags:

      • Location -> san francisco, london, new york, etc.

      • Food -> sushi, sashimi, pizza, etc.

      • Programming -> html, java, css, etc.

Goal: Discover them by mining the tag space
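
As a small illustration of the "spelling" relation listed above, the following Python sketch groups tags that differ only in case, spaces, underscores, or hyphens. It is not the method from the talk; the function names `normalize` and `group_variants` are made up for this example, and it makes no attempt at morphology (tagging/tags/tagged) or synonyms (films/movies), which need more than string cleanup.

```python
import re
from collections import defaultdict


def normalize(tag: str) -> str:
    """Collapse case, whitespace, underscores, and hyphens into one key."""
    return re.sub(r"[\s_\-]+", "", tag.lower())


def group_variants(tags):
    """Group tags that share the same normalized key (hypothetical helper)."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalize(tag)].append(tag)
    return dict(groups)


groups = group_variants(["macos", "mac_os", "mac os", "Mac-OS", "sushi"])
print(groups)
```

Here the four Mac OS spellings land in one group keyed by `macos`, while `sushi` stays on its own.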


Related Work

Tagger’s nightmare!!

Top Down Predefined taxonomy

Rigid - Not scalable - Expensive


Flickr – Clusters


RawSugar – Tag Hierarchy Guided Navigation

Food groups

Origins groups

Locations groups


RawSugar Tag Hierarchy

  • Key idea: Some users (4%) define tag hierarchies (food > sushi, european > spanish, …)

  • We mine this tag space to learn simple tag relations (ISA and RELATED) using probabilities.

  • At search time, we apply this learned knowledge to group the tags in the results.
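
The slide does not say how the probabilities are computed, so the following Python sketch is only one plausible reading. It assumes each user's hierarchy is available as (parent, child) pairs and estimates the ISA probability of a pair as the share of users who placed the child under that parent, among all users who placed the child anywhere. The data, the `threshold` parameter, and the function names are hypothetical; only food > sushi and european > spanish come from the slide.

```python
from collections import defaultdict

# Hypothetical input: each user's hierarchy fragment as (parent, child) pairs.
user_hierarchies = {
    "user1": [("food", "sushi"), ("european", "spanish")],
    "user2": [("food", "sushi")],
    "user3": [("japanese", "sushi")],
}


def isa_probabilities(hierarchies):
    """Estimate P(parent | child) as the share of users choosing that parent."""
    votes = defaultdict(lambda: defaultdict(set))
    for user, pairs in hierarchies.items():
        for parent, child in pairs:
            votes[child][parent].add(user)
    probs = {}
    for child, parents in votes.items():
        voters = set().union(*parents.values())
        for parent, users in parents.items():
            probs[(parent, child)] = len(users) / len(voters)
    return probs


def group_tags(result_tags, probs, threshold=0.5):
    """Place each result tag under its most probable parent, if any qualifies."""
    groups = defaultdict(list)
    for tag in result_tags:
        candidates = [(p, parent) for (parent, child), p in probs.items()
                      if child == tag]
        best = max(candidates, default=None)
        if best is not None and best[0] >= threshold:
            groups[best[1]].append(tag)
        else:
            groups["(ungrouped)"].append(tag)
    return dict(groups)


probs = isa_probabilities(user_hierarchies)
grouped = group_tags(["sushi", "spanish", "lunch"], probs)
print(grouped)
```

With this toy data, sushi is grouped under food (two of three users), spanish under european, and lunch stays ungrouped because no user placed it in a hierarchy.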


RawSugar – Guided Search: Combining Hierarchy Fragments

[Figure: hierarchy fragments contributed by User 1, User 2, User 3, User 4, and User 5. Node labels: food, europe, cooking, recipes, UK, Scotland, Edinburgh, Spain, Asian, Chinese, Italy, Thai, food, Southwest, vegetarian, California, Sushi, Bay Area, San Francisco, Texas.]


Related work

Rashmi Sinha: “Tag Sorting: Another tool in an information architect's toolbox” http://www.rashmisinha.com/archives/05_02/tag-sorting.html

Emanuele Quintarelli: “Hierarchical taxonomies from flat tag spaces” http://www.infospaces.it/wordpress/topics/information-architecture/91

Paul Heymann (Stanford): “Tag Hierarchies” http://i.stanford.edu/~heymann/taghierarchy.html

Brooks, Montanez, University of San Francisco: “Improved Annotation of the Blogosphere via Autotagging and Hierarchical Clustering” http://www.cs.usfca.edu/~brooks/papers/brooks-montanez-www06.pdf

Siderean fac.etio.us: “Faceted search on delicious tags” http://www.siderean.com/delicious/facetious.jsp

Marti Hearst: “Clustering vs. Faceted Search” http://bailando.sims.berkeley.edu/papers/cacm06.pdf

And more …


1. Get tag metadata

2. Build tag relation graph
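
The slide image is not part of this transcript, so the exact graph construction is not recoverable here. A common choice, assumed in the Python sketch below, is a co-occurrence graph: tags are nodes, and the weight of an edge counts how many bookmarks carry both tags. The `bookmarks` data and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tag metadata: bookmark id -> set of tags assigned to it.
bookmarks = {
    "page1": {"sushi", "japanese food", "tokyo"},
    "page2": {"sushi", "japanese food"},
    "page3": {"java", "programming"},
}


def build_cooccurrence_graph(items):
    """Return edge weights: number of items that carry both tags of a pair."""
    edges = Counter()
    for tags in items.values():
        # Sort so each unordered pair always maps to the same (a, b) key.
        for a, b in combinations(sorted(tags), 2):
            edges[(a, b)] += 1
    return edges


graph = build_cooccurrence_graph(bookmarks)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b}: {weight}")
```

Storing edges under sorted pairs keeps the graph undirected without needing a separate graph library.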

3. Compute similarity
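
The transcript does not preserve which similarity measure the slide showed. As an assumption for illustration, the Python sketch below uses Jaccard similarity over the sets of bookmarks that carry each tag, which normalizes raw co-occurrence so that very popular tags do not look similar to everything. The `bookmarks` data and the `jaccard` helper are hypothetical.

```python
# Hypothetical tag metadata: bookmark id -> set of tags assigned to it.
bookmarks = {
    "page1": {"sushi", "japanese food", "tokyo"},
    "page2": {"sushi", "japanese food"},
    "page3": {"java", "programming"},
}


def jaccard(tag_a, tag_b, items):
    """Jaccard similarity of the sets of items carrying each tag."""
    items_a = {item for item, tags in items.items() if tag_a in tags}
    items_b = {item for item, tags in items.items() if tag_b in tags}
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


print(jaccard("sushi", "japanese food", bookmarks))  # 1.0
print(jaccard("sushi", "tokyo", bookmarks))          # 0.5
print(jaccard("sushi", "java", bookmarks))           # 0.0
```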

4. Cluster
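
The clustering algorithm itself is not named in this transcript. The Python sketch below uses a deliberately simple stand-in, not the authors' method: drop edges whose similarity falls below a threshold, then treat each connected component of the remaining graph as a cluster. The similarity values and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict

# Hypothetical pairwise tag similarities in [0, 1].
similarities = {
    ("sushi", "japanese food"): 0.9,
    ("sushi", "sashimi"): 0.7,
    ("java", "programming"): 0.8,
    ("sushi", "java"): 0.05,
}


def cluster(sims, threshold=0.5):
    """Connected components of the graph after removing weak edges."""
    adjacency = defaultdict(set)
    for (a, b), score in sims.items():
        adjacency[a]  # register both nodes even if the edge is dropped
        adjacency[b]
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


clusters = cluster(similarities)
print(clusters)
```

The result depends heavily on the threshold: at 0.5 the food tags and the programming tags form two clusters, while a threshold of 0.05 would merge them into one through the weak sushi/java edge.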

Results/Problems: Definition of “internet”

Results/Problems: Ambiguity

Results/Problems: Clustering needs a lot of tuning

Possible application: Group popular bookmarks

Some good clusters found


Tags that belong to the same clusters
