Towards Ontology Learning from Folksonomies


Towards Ontology Learning from Folksonomies. Jie Tang*, Ho-fung Leung#, Qiong Luo+, Dewei Chen*, and Jibin Gong*. *Dept. of Computer Science and Technology, Tsinghua University; #Dept. of Computer Science and Engineering, The Chinese U. of Hong Kong; +Dept. of Computer Science, Hong Kong U. of Science and Technology


Towards Ontology Learning from Folksonomies

Jie Tang*, Ho-fung Leung#, Qiong Luo+,

Dewei Chen*, and Jibin Gong*

*Dept. of Computer Science and Technology, Tsinghua University

#Dept. of Computer Science and Engineering, The Chinese U. of Hong Kong

+Dept. of Computer Science, Hong Kong U. of Science and Technology

July 14th, 2009

Motivation
  • The Semantic Web aims to provide a Web environment in which each Web document is annotated with machine-readable metadata (e.g., concepts from an ontology).
    • Manual annotation tools, e.g., Protégé (Noy et al., IS’01)
    • Automatic annotation methods using ML, e.g., iASA (Tang et al., JoDS’05), TCRF (Tang et al., ISWC’06)
  • Folksonomy provides a way to annotate the Web, but a very free one.
    • It poses a big challenge in reliability and consistency due to the lack of terminological control.
  • This work aims to learn an ontology from folksonomies.
Motivating Example
  • Several key challenges:
    • How to define this problem in a principled way?
    • How to model the synonym/hypernym/homonym between tags?
    • How to construct the hierarchical ontology according to the modeling results?
Our Solution

  • Use topics to model tags and documents.
  • Define four divergence measures to estimate the difference between tags.
  • Present an algorithm to construct the hierarchical structure from the tags.

[Figure labels: tags, documents, Topic, Divergence]

Outline
  • Related Work
  • Our Approach
    • Modeling Folksonomy
    • Divergence Estimation
    • Hierarchical Structure Construction
  • Experiments
  • Conclusion & Future Work

Previous Work
  • Ontology learning from text
    • WebOntEx (Han and Elmasri, 03)
    • Protégé plug-in (Buitelaar et al., 99)
    • (Maedche and Staab, 2001; Sleeman et al., 03); etc.
  • Folksonomy integration
    • Learning syno-/hyper-nym between tags (Li et al., 07)
    • Clustering tags (Specia and Motta, 2007)
    • Learning hierarchical relations between tags (Zhou et al., 07)
    • Non-taxonomic relations (Mori et al., 06); etc.
  • Topic models
    • PLSI (Hofmann, 1999); LDA (Blei et al., 03); Author-topic model (Steyvers et al., 04); etc.

[Figure labels: tags, Topic, documents]
How to model tags and documents?
  • Input: Assume that a tag t_i is used to annotate multiple documents and that a document d contains a vector w_d of N_d words. A set of tags together with their annotated documents then forms the input.
  • Modeling: How should each document and each tag be represented, and how should the relationship between documents and tags be characterized?

[Figure labels: words, tags, topic; Tag-Topic (TT) Models]
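As a minimal sketch of the input described on this slide, the snippet below stores each document as a word list plus its tags and derives, for every tag, the documents it annotates and the pooled word counts. The toy corpus and the helper names `build_tag_index` and `tag_word_counts` are hypothetical and are not taken from the presentation.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: document id -> (words w_d, tags assigned by users).
documents = {
    "d1": (["clustering", "model", "inference", "clustering"], ["data mining", "clustering"]),
    "d2": (["ranking", "query", "model"], ["ir"]),
    "d3": (["mining", "clustering", "pattern"], ["data mining"]),
}


def build_tag_index(docs):
    """Map each tag t_i to the ids of the documents it annotates."""
    index = defaultdict(list)
    for doc_id, (_words, tags) in docs.items():
        for tag in tags:
            index[tag].append(doc_id)
    return dict(index)


def tag_word_counts(docs, tag_index):
    """Pool the words of all documents annotated with each tag."""
    counts = {}
    for tag, doc_ids in tag_index.items():
        pooled = Counter()
        for doc_id in doc_ids:
            pooled.update(docs[doc_id][0])
        counts[tag] = pooled
    return counts


tag_index = build_tag_index(documents)
word_counts = tag_word_counts(documents, tag_index)
```

Pooled counts of this kind are only one plausible starting point; a topic model would replace the raw counts with per-tag topic distributions.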

Generative Story of Tagging

Generative process (example from the figure):

  • Document: "Latent Dirichlet Co-clustering"
  • Abstract: "We present a generative model for clustering documents and terms. Our model is a four-level hierarchical Bayesian model. We present efficient inference techniques based on Markov Chain Monte Carlo. We report results in document modeling, document and terms clustering …"
  • Tags: Data mining, clustering, probabilistic model
  • Topic labels shown: NLP, IR, ML, DM
  • P(w|z) for one topic: mining 0.23, clustering 0.19, classification 0.17, …
  • P(w|z) for another topic: model 0.23, learning 0.19, boost 0.17, …

Tag-Topic (TT) Models

Generative process:

[Figure labels: words, tags, Topic]
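The formal generative process is not preserved in this transcript. The sketch below shows one plausible reading, modeled on the author-topic model cited under Previous Work: for each word, pick one of the document's tags, draw a topic from that tag's topic distribution, then draw a word from that topic's word distribution. The uniform tag choice, the topic names `topic_a` and `topic_b`, and the `theta` values are assumptions; the word weights reuse the truncated P(w|z) lists from the example figure.

```python
import random

# Assumed P(topic | tag); the values are illustrative only.
theta = {
    "data mining": {"topic_a": 0.8, "topic_b": 0.2},
    "probabilistic model": {"topic_a": 0.1, "topic_b": 0.9},
}

# P(word | topic): top words from the example figure (truncated lists,
# so the weights do not sum to 1; random.choices renormalises them).
phi = {
    "topic_a": {"mining": 0.23, "clustering": 0.19, "classification": 0.17},
    "topic_b": {"model": 0.23, "learning": 0.19, "boost": 0.17},
}


def generate_document(doc_tags, theta, phi, n_words, rng):
    """Sample n_words (tag, topic, word) triples for one tagged document."""
    triples = []
    for _ in range(n_words):
        tag = rng.choice(doc_tags)  # assumption: tag chosen uniformly
        topics, topic_weights = zip(*theta[tag].items())
        topic = rng.choices(topics, weights=topic_weights)[0]
        vocab, word_weights = zip(*phi[topic].items())
        word = rng.choices(vocab, weights=word_weights)[0]
        triples.append((tag, topic, word))
    return triples


rng = random.Random(0)
sample = generate_document(["data mining", "probabilistic model"], theta, phi, 5, rng)
```

Inference runs this story in reverse: given observed words and tags, it estimates `theta` and `phi`, for example by Gibbs sampling.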

Topic Smoothing

The new objective function combines two parts:

  • the log-likelihood of the tag-topic (TT) model
  • a smoothing term

Divergence Estimation

Four divergence measures are computed from the estimated topic distributions, using posterior probabilities derived from the topic modeling results:

  • Tag divergence
  • Hypernym-divergence
  • Merging-divergence
  • Keep-divergence
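The formulas for the four measures are not preserved in this transcript. As a generic illustration of comparing two tags through their estimated topic distributions, the sketch below computes the Kullback-Leibler and Jensen-Shannon divergences. These are standard stand-ins and are not claimed to be the measures defined in the presentation; the example distributions are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def js_divergence(p, q):
    """Symmetric Jensen-Shannon divergence, bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical P(topic | tag) vectors for three tags.
data_mining = [0.7, 0.2, 0.1]
clustering = [0.6, 0.3, 0.1]
movies = [0.05, 0.05, 0.9]

near = js_divergence(data_mining, clustering)
far = js_divergence(data_mining, movies)
```

KL divergence is asymmetric, which is one reason an asymmetric relation such as hypernymy may call for a different measure than a symmetric relation such as synonymy.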

Hierarchical Structure Construction

  • Terms that each correspond to a divergence
  • A penalty on the complexity of the generated hierarchy

Step 1.

Step 2.
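The construction criterion and the two steps are not preserved in this transcript. The sketch below only illustrates the trade-off the slide names: a candidate hierarchy is scored by the divergences along its edges plus a penalty that grows with its complexity. Counting internal nodes as the complexity measure, the function name `hierarchy_cost`, and all numbers are assumptions made for illustration.

```python
def hierarchy_cost(parent_of, divergence, penalty_weight):
    """Score a candidate hierarchy; lower is better.

    parent_of maps each non-root tag to its parent tag.
    divergence maps (child, parent) pairs to a non-negative number.
    """
    edge_cost = sum(divergence[(child, parent)] for child, parent in parent_of.items())
    internal_nodes = set(parent_of.values())  # assumed proxy for complexity
    return edge_cost + penalty_weight * len(internal_nodes)


# Hypothetical divergences between three tags.
divergence = {
    ("data mining", "science"): 0.2,
    ("clustering", "science"): 0.5,
    ("clustering", "data mining"): 0.1,
}

# Two candidate hierarchies over the same tags.
flat = {"data mining": "science", "clustering": "science"}
deep = {"data mining": "science", "clustering": "data mining"}

prefers_deep = hierarchy_cost(deep, divergence, 0.1) < hierarchy_cost(flat, divergence, 0.1)
prefers_flat = hierarchy_cost(flat, divergence, 1.0) < hierarchy_cost(deep, divergence, 1.0)
```

With a small penalty the deeper hierarchy wins because its edges fit better; with a large penalty the flatter one wins, which is the role a complexity term plays in any such criterion.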

Data Sets and Evaluation Measures
  • Data sets
    • PAPER: 4,841 papers and their associated tags (8,071 unique tags and a total of 37,010 tags) from CITEULIKE
    • MOVIE: 4,009 movies and their tags (18,559 unique tags and a total of 142,498 tags) from IMDB
  • Evaluation Measures
    • Accuracy (against ODP or human judgement)
    • Case study
  • Baseline
    • Hierarchical clustering
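The slide does not spell out how accuracy is computed. One simple reading, sketched below, is the fraction of learned parent-child relations that also appear in a reference set, such as ODP or relations approved by human judges. The function name `hierarchy_accuracy` and the example edges are hypothetical.

```python
def hierarchy_accuracy(learned_edges, reference_edges):
    """Fraction of learned (child, parent) edges found in the reference."""
    learned = set(learned_edges)
    if not learned:
        return 0.0
    return len(learned & set(reference_edges)) / len(learned)


# Hypothetical learned hierarchy and reference judgments.
learned_edges = [
    ("clustering", "data mining"),
    ("classification", "data mining"),
    ("data mining", "movies"),
]
reference_edges = [
    ("clustering", "data mining"),
    ("classification", "data mining"),
    ("data mining", "computer science"),
]

accuracy = hierarchy_accuracy(learned_edges, reference_edges)
```

Under this definition the same function can score the hierarchical clustering baseline, as long as its output is expressed as (child, parent) edges.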
Case Study: Movie

[Figure panels: By clustering; By TT; By TT with smoothing]

Case Study: Paper

[Figure panels: By TT; By TT with smoothing]

Conclusion
  • Formalize a novel problem of ontology learning from folksonomies.
  • Exploit a probabilistic topic model to model the tags and their annotated documents and propose four divergence measures.
  • Present an algorithm to construct the hierarchical structure from tags.
  • Experimental results on two different types of real-world data sets show that our method can effectively learn the ontological hierarchy from social tags.
Future Work
  • Discover non-taxonomic relationships between tags
  • Ontology learning from noisy tags
  • Incremental ontology learning from the dynamic tagging space
  • Applications:
    • Personalized tag recommendation
    • Social tagging—guiding the tagging process

Thanks!

Q&A

Homepage: http://keg.cs.tsinghua.edu.cn/persons/tj/

Open resources will be available soon at:

http://arnetminer.org/resources
