
Towards Ontology Learning from Folksonomies

Jie Tang*, Ho-fung Leung#, Qiong Luo+,

Dewei Chen*, and Jibin Gong*

*Dept. of Computer Science and Technology, Tsinghua University

#Dept. of Computer Science and Engineering, The Chinese U. of Hong Kong

+Dept. of Computer Science, Hong Kong U. of Science and Technology

July 14, 2009


Motivation

  • The Semantic Web aims to provide a Web environment in which each Web document is annotated with machine-readable metadata (e.g., concepts from an ontology).

    • Manual annotation tools, e.g., Protégé (Noy et al., IS’01)

    • Automatic annotation methods using ML, e.g., iASA (Tang et al., JoDS’05), TCRF (Tang et al., ISWC’06)

  • Folksonomies provide another way to annotate the Web, but a completely free-form one.

    • The lack of terminological control poses a big challenge for reliability and consistency.

  • This work aims to learn an ontology from folksonomies.



Motivating Example

  • Several key challenges:

    • How to define this problem in a principled way?

    • How to model the synonym, hypernym, and homonym relations between tags?

    • How to construct the hierarchical ontology according to the modeling results?


Our Solution

  • Use topics to model tags and documents.

  • Define four divergence measures to estimate the difference between tags.

  • Present an algorithm to construct the hierarchical structure from the tags.

[Diagram labels: tags, documents, topic, divergence]


Outline

  • Related Work

  • Our Approach

    • Modeling Folksonomy

    • Divergence Estimation

    • Hierarchical Structure Construction

  • Experiments

  • Conclusion & Future Work


Previous Work

  • Ontology learning from text

    • WebOntEx (Han and Elmasri, 2003);

    • Protégé plug-in (Buitelaar et al., 1999);

    • (Maedche and Staab, 2001; Sleeman et al., 2003); etc.

  • Folksonomy integration

    • Learning synonym/hypernym relations between tags (Li et al., 2007);

    • Clustering tags (Specia and Motta, 2007);

    • Learning hierarchical relations between tags (Zhou et al., 2007);

    • Non-taxonomic relations (Mori et al., 2006); etc.

  • Topic models

    • PLSI (Hofmann, 1999); LDA (Blei et al., 2003); Author-topic model (Steyvers et al., 2004); etc.



How to model tags and documents?

  • Input: Assume that a tag t_i is used to annotate multiple documents and that a document d contains a vector w_d of N_d words. A set of tags together with their annotated documents can then be represented formally (the formula is not preserved in this transcript).

  • Modeling: How should each document and each tag be represented, and how should the relationship between documents and tags be characterized?

[Diagram: words, tags, and topics, linked by the Tag-Topic (TT) models]
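As a minimal sketch of the input described on this slide, the tagged collection can be held as a list of documents, each carrying its word vector and its tags, from which a tag-to-documents index is derived. The names `TaggedDocument` and `index_by_tag` and the toy data are assumptions made for illustration; they do not come from the original slides.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaggedDocument:
    """One document d: its word vector w_d and the tags annotating it."""
    words: List[str]
    tags: List[str]


def index_by_tag(docs: List[TaggedDocument]) -> Dict[str, List[int]]:
    """Map each tag t_i to the indices of the documents it annotates."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        for tag in set(doc.tags):  # count a tag once per document
            index[tag].append(doc_id)
    return dict(index)


# Toy collection (hypothetical data).
docs = [
    TaggedDocument(["topic", "model", "clustering"], ["data mining", "clustering"]),
    TaggedDocument(["markov", "chain", "inference"], ["probabilistic model"]),
    TaggedDocument(["cluster", "documents", "terms"], ["clustering"]),
]
tag_index = index_by_tag(docs)
```

Here `len(docs[d].words)` plays the role of N_d, and `tag_index[t]` lists the documents annotated with tag t.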


Generative Story of Tagging

Generative process, illustrated with an example document:

  • Document: "Latent Dirichlet Co-clustering"

    • Abstract: "We present a generative model for clustering documents and terms. Our model is a four-level hierarchical Bayesian model. We present efficient inference techniques based on Markov Chain Monte Carlo. We report results in document modeling, document and term clustering …"

  • Tags: data mining, clustering, probabilistic model

  • Topics (labeled NLP, IR, ML, and DM in the diagram), each with a word distribution P(w|z):

    • One topic: mining 0.23, clustering 0.19, classification 0.17, …

    • Another topic: model 0.23, learning 0.19, boost 0.17, …


Tag-Topic (TT) Models

Generative process:

[Diagram: Tag-Topic (TT) models linking words, tags, and topics]
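The transcript does not preserve the formulas of the generative process, so the following is only a sketch under a stated assumption: that the TT model follows an author-topic-style story in which, for every word position, one of the document's tags is picked uniformly, a topic is drawn from that tag's topic distribution, and a word is drawn from that topic's word distribution P(w|z). The names `theta`, `phi`, `sample_from`, and `generate_document`, as well as the tag-topic weights, are hypothetical; the word weights reuse the numbers shown in the generative-story example.

```python
import random


def sample_from(dist, rng):
    """Draw one key from a {outcome: weight} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    # random.choices treats weights as relative, so they need not sum to 1.
    return rng.choices(outcomes, weights=weights, k=1)[0]


def generate_document(tags, theta, phi, n_words, rng):
    """Generate n_words words for a document annotated with `tags`.

    Assumed story: pick a tag uniformly, then a topic from theta[tag],
    then a word from phi[topic], which plays the role of P(w|z).
    """
    words = []
    for _ in range(n_words):
        tag = rng.choice(tags)
        topic = sample_from(theta[tag], rng)
        words.append(sample_from(phi[topic], rng))
    return words


# Toy parameters (hypothetical topic names and tag-topic weights).
phi = {
    "z1": {"mining": 0.23, "clustering": 0.19, "classification": 0.17},
    "z2": {"model": 0.23, "learning": 0.19, "boost": 0.17},
}
theta = {
    "data mining": {"z1": 0.9, "z2": 0.1},
    "probabilistic model": {"z1": 0.2, "z2": 0.8},
}
rng = random.Random(0)
doc = generate_document(["data mining", "probabilistic model"], theta, phi, 8, rng)
```

Learning reverses this story: given observed words and tags, inference estimates `theta` and `phi`, and the per-tag topic distributions are what the later divergence measures compare.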


Topic Smoothing

The new objective function (the formula is not preserved in this transcript) combines two parts:

  • The log-likelihood of the tag-topic (TT) model

  • A smoothing term


Divergence Estimation

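Four divergence measures are defined (their formulas are not preserved in this transcript):

  • Tag divergence

  • Hypernym-divergence

  • Merging-divergence

  • Keep-divergence

The slide annotates the formulas' inputs as the estimated topic distribution and the posterior probability derived from the topic modeling results.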
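Because the definitions of the four divergences are not recoverable from the transcript, the sketch below swaps in two standard measures, Kullback-Leibler and Jensen-Shannon divergence, purely to illustrate how two tags can be compared through their estimated topic distributions. It should not be read as the authors' definitions; the function names and the toy distributions are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical topic distributions P(z | tag) over three topics.
theta = {
    "clustering": [0.70, 0.20, 0.10],
    "cluster": [0.65, 0.25, 0.10],
    "comedy": [0.05, 0.05, 0.90],
}
near = js_divergence(theta["clustering"], theta["cluster"])
far = js_divergence(theta["clustering"], theta["comedy"])
```

In this toy setting `near` is much smaller than `far`, which is the kind of signal that could mark two tags as candidates for merging. An asymmetric measure such as `kl_divergence` could, in principle, help separate a general tag from a specific one, but the transcript does not confirm that the hypernym-divergence is defined this way.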


Hierarchical Structure Construction

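The objective annotated on the slide has two kinds of terms:

  • Terms that each correspond to a divergence

  • A penalty on the complexity of the generated hierarchy

The algorithm proceeds in two steps (Step 1 and Step 2); their details are not preserved in this transcript.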



Data Sets and Evaluation Measures

  • Data sets

    • PAPER: 4,841 papers and their associated tags (8,071 unique tags; 37,010 tags in total) from CiteULike

    • MOVIE: 4,009 movies and their tags (18,559 unique tags; 142,498 tags in total) from IMDb

  • Evaluation Measures

    • Accuracy (against ODP or human judgement)

    • Case study

  • Baseline

    • Hierarchical clustering
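The slides name the baseline only as hierarchical clustering, without giving the linkage or the distance. As a rough sketch of what such a baseline could look like, the code below runs average-linkage agglomerative clustering with Euclidean distance over per-tag vectors; both choices, the function names, and the toy vectors are assumptions.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(vectors[a], vectors[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def agglomerate(vectors):
    """Merge the two closest clusters until one remains; return the merge tree.

    Leaves are tag names; an internal node is a (left, right) tuple.
    """
    # Each entry pairs a subtree with the flat tuple of tags it contains.
    clusters = [(tag, (tag,)) for tag in sorted(vectors)]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(
                clusters[ij[0]][1], clusters[ij[1]][1], vectors
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = rest + [merged]
    return clusters[0][0]


# Hypothetical tag vectors (e.g., topic or co-occurrence profiles).
vectors = {
    "clustering": [0.70, 0.20, 0.10],
    "classification": [0.60, 0.30, 0.10],
    "comedy": [0.05, 0.05, 0.90],
    "drama": [0.10, 0.05, 0.85],
}
tree = agglomerate(vectors)
```

A tree built this way is always binary and contains unlabeled internal nodes, so it groups related tags without saying which tag is the more general one.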



Case Study—Movie

[Figures: movie tag hierarchies produced by clustering, by TT, and by TT with smoothing]


Case Study—Paper

[Figures: paper tag hierarchies produced by TT and by TT with smoothing]



Conclusion

  • Formalize a novel problem of ontology learning from folksonomies.

  • Exploit a probabilistic topic model to model the tags and their annotated documents and propose four divergence measures.

  • Present an algorithm to construct the hierarchical structure from tags.

  • Experimental results on two different types of real-world data sets show that our method can effectively learn the ontological hierarchy from social tags.


Future Work

  • Discover non-taxonomic relationships between tags

  • Ontology learning from noisy tags

  • Incremental ontology learning from the dynamic tagging space

  • Applications:

    • Personalized tag recommendation

    • Social tagging—guiding the tagging process



Thanks!

Q&A

Homepage: http://keg.cs.tsinghua.edu.cn/persons/tj/

Open resources will be available soon at:

http://arnetminer.org/resources