



Collaborative Tagging Workshop, WWW2006

Automated Tag Clustering: Improving Search and Exploration in the TagSpace

May 2006

Grigory Begelman, Technion Israel Institute of Technology, Computer Science Department, gbeg@cs.technion.ac.il

Philipp Keller, Citrin Informatik GmbH, phred@citrin.ch

Frank Smadja, RawSugar, frank@rawsugar.com



Problem 1: Searching the TagSpace

How would you tag this?

How would you search for it?

Tags: Ikura, Uni, Ebi, Sushi, Nigiri, Japanese food, lunch in Tokyo, Ezobafun-uni, Kitamurashiuni, Murasakiuni, Akazaebi, Tenagaebi, etc.



Problem 2: Exploring the TagSpace

[Slide image; callouts: Locations, Restaurant type, Morphology, "Not a restaurant!"]



Problem 3: Exploring the TagSpace

Not usable!



What is Missing? Tag Relations

  • “Tag relations improve searchability and exploration.”

  • Similar tags:

      • Spelling and morphology: macos <-> mac_os <-> mac os; tagging <-> tags <-> tagged

      • Synonyms: macos <-> tiger; films <-> movies; new york <-> nyc

      • Related: cooking <-> recipes; software development <-> programming

  • Tag groups or subtags:

      • Location -> san francisco, london, new york, etc.

      • Food -> sushi, sashimi, pizza, etc.

      • Programming -> html, java, css, etc.

Goal: Discover them by mining the tag space
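For contrast with mining, the simplest relation on this slide, spelling variants such as macos, mac_os and mac os, can be illustrated with a plain normalization key. This is a minimal sketch and not a method taken from the talk: the function names `spelling_key` and `group_spelling_variants` are hypothetical, and the rule (lowercase, drop spaces, underscores and hyphens) is an assumption. It does not handle morphology (tagging, tags, tagged), synonyms or related tags, which is where mining the tag space comes in.

```python
import re
from collections import defaultdict


def spelling_key(tag):
    """Hypothetical helper: collapse case and separators into one key."""
    # Assumption: spaces, underscores and hyphens carry no meaning in a tag.
    return re.sub(r"[\s_-]+", "", tag.lower())


def group_spelling_variants(tags):
    """Group tags that share a key; keep only keys with several variants."""
    groups = defaultdict(set)
    for tag in tags:
        groups[spelling_key(tag)].add(tag)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }
```

A rule like this is cheap but blind: it cannot tell that films and movies mean the same thing, so it only covers the first bullet under "Similar tags".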



Related Work

Tagger’s nightmare!!

Top-down predefined taxonomy

Rigid, not scalable, expensive



Flickr – Clusters



RawSugar – Tag Hierarchy Guided Navigation

[Slide image; callouts: Food groups, Origins groups, Locations groups]



RawSugar Tag Hierarchy

  • Key idea: Some users (4%) define tag hierarchies (food > sushi, european > spanish, …).

  • We mine this tag space to learn simple tag relations (IS-A and RELATED) using probabilities.

  • At search time, we apply this learned knowledge to group the tags found in the results.
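The slide does not say how the probabilities are computed. As one possible reading, the sketch below estimates, for each child tag, the share of users who placed it under a given parent, and keeps the most common parent when that share is high enough. The function name `learn_isa`, the input layout and the `min_prob` threshold are all assumptions made for illustration, not RawSugar's actual method.

```python
from collections import Counter, defaultdict


def learn_isa(user_hierarchies, min_prob=0.5):
    """Estimate IS-A relations from user-defined (parent, child) pairs.

    user_hierarchies maps a user id to the (parent, child) pairs that
    user defined, e.g. ("food", "sushi") for food > sushi.
    Returns {child: (parent, estimated probability)}.
    """
    votes = defaultdict(Counter)
    for pairs in user_hierarchies.values():
        # Each user votes at most once per (parent, child) pair.
        for parent, child in set(pairs):
            votes[child][parent] += 1

    relations = {}
    for child, parents in votes.items():
        total = sum(parents.values())
        parent, count = parents.most_common(1)[0]
        prob = count / total  # share of votes for the top parent
        if prob >= min_prob:
            relations[child] = (parent, prob)
    return relations
```

With only a small share of users defining hierarchies, a threshold like `min_prob` trades coverage against noise; its value here is arbitrary.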



RawSugar – Guided Search: Combining Hierarchy Fragments

[Diagram: hierarchy fragments contributed by Users 1 to 5. Node labels: food, europe, cooking, recipes, UK, Scotland, Edinburgh, Spain, Asian, Chinese, Italy, Thai, Southwest, vegetarian, California, Sushi, Bay Area, San Francisco, Texas]


Related Work

Rashmi Sinha: “Tag Sorting: Another tool in an information architect's toolbox” http://www.rashmisinha.com/archives/05_02/tag-sorting.html

Emanuele Quintarelli: “Hierarchical taxonomies from flat tag spaces” http://www.infospaces.it/wordpress/topics/information-architecture/91

Paul Heymann (Stanford): “Tag Hierarchies” http://i.stanford.edu/~heymann/taghierarchy.html

Brooks, Montanez (University of San Francisco): “Improved Annotation of the Blogosphere via Autotagging and Hierarchical Clustering” http://www.cs.usfca.edu/~brooks/papers/brooks-montanez-www06.pdf

Siderean fac.etio.us: “Faceted search on delicious tags” http://www.siderean.com/delicious/facetious.jsp

Marti Hearst: “Clustering vs. Faceted Search” http://bailando.sims.berkeley.edu/papers/cacm06.pdf

And more …


1. Get tag metadata



2. Build tag relation graph



3. Compute similarity



4. Cluster
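The four steps above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: the transcript does not name a similarity measure or a clustering method, so the sketch swaps in Jaccard similarity over co-occurrence counts and connected components of a thresholded graph. The function name `cluster_tags` and the `min_similarity` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cluster_tags(bookmarks, min_similarity=0.5):
    """bookmarks: iterable of tag collections, one per tagged page."""
    # 1. Tag metadata: how often each tag is used, alone and in pairs.
    tag_count = Counter()
    pair_count = Counter()
    for tags in bookmarks:
        tags = set(tags)
        tag_count.update(tags)
        pair_count.update(combinations(sorted(tags), 2))

    # 2 and 3. Tag relation graph: keep an edge when the Jaccard similarity
    # (pages with both tags / pages with either tag) reaches the threshold.
    # Assumption: Jaccard is a stand-in; the slides name no measure.
    neighbours = {tag: set() for tag in tag_count}
    for (a, b), both in pair_count.items():
        similarity = both / (tag_count[a] + tag_count[b] - both)
        if similarity >= min_similarity:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # 4. Cluster: connected components of the thresholded graph
    # (a simple stand-in for a real clustering algorithm).
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(neighbours[tag] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

Connected components are sensitive to the threshold: one weak edge that passes it merges two otherwise separate clusters, which is a small-scale version of the tuning problem raised later in the talk.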



Results/Problems: Definition of “internet”



Results/Problems: Ambiguity



Results/Problems: Clustering needs a lot of tuning



Possible application: Group popular bookmarks


Some good clusters found



Tags that belong to the same clusters -

