A Metric-based Framework for Automatic Taxonomy Induction


Hui Yang and Jamie Callan
Language Technologies Institute, Carnegie Mellon University
ACL 2009, Singapore

A Metric-based Framework for Automatic Taxonomy Induction


Roadmap

  • Introduction

  • Related Work

  • Metric-Based Taxonomy Induction Framework

  • The Features

  • Experimental Results

  • Conclusions


Introduction

  • Semantic taxonomies, such as WordNet, play an important role in solving knowledge-rich problems

  • Limitations of Manually-created Taxonomies

    • Rarely complete

    • Difficult to include new terms from emerging/changing domains

    • Time-consuming to create, which may make them infeasible for specialized domains and personalized tasks


Introduction

  • Automatic Taxonomy Induction is a solution to

    • Augment existing resources

    • Quickly produce new taxonomies for specialized domains and personalized tasks

  • Subtasks in Automatic Taxonomy Induction

    • Term extraction

    • Relation formation

  • This paper focuses on Relation Formation


Related Work

  • Clustering-based Approaches

  • Hierarchically cluster terms based on the similarities of their meanings, usually represented by feature vectors

  • Have only been applied to extract is-a and sibling relations

  • Strength: Allows discovery of relations that do not explicitly appear in text; higher recall

  • Weaknesses: Generally fail to produce coherent clusters for small corpora [Pantel and Pennacchiotti 2006]; Hard to label non-leaf nodes

  • Pattern-based Approaches

  • Define lexical-syntactic patterns for relations, and use these patterns to discover instances

  • Have been applied to extract is-a, part-of, sibling, synonym, causal, etc. relations

  • Strength: Highly accurate

  • Weakness: Sparse coverage of patterns


A unified solution

Metric-based Taxonomy Induction

  • Combine strengths of both approaches in a unified framework

    • Flexibly incorporate heterogeneous features

    • Use lexical-syntactic patterns as one type of feature in a clustering framework


THE FRAMEWORK

  • A novel framework, which

    • Incrementally clusters terms

    • Transforms taxonomy induction into a multi-criteria optimization

    • Uses heterogeneous features

  • Optimization based on two criteria

    • Minimization of taxonomy structures → Minimum Evolution Assumption

    • Modeling of term abstractness → Abstractness Assumption


Let’s Begin with Some Important Definitions

  • A Taxonomy is a data model

T = (C, R | D)

    • C: the Concept Set
    • R: the Relationship Set
    • D: the Domain
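The T = (C, R | D) model above can be sketched as a small container type. This is an illustrative rendering only, with hypothetical names, not code from the paper:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the T = (C, R | D) data model; names are hypothetical.
@dataclass
class Taxonomy:
    domain: str                                   # D: the domain, e.g. "game equipment"
    concepts: set = field(default_factory=set)    # C: the concept set
    relations: set = field(default_factory=set)   # R: e.g. is-a edges (parent, child)

t = Taxonomy(domain="game equipment")
t.concepts |= {"game equipment", "ball", "basketball"}
t.relations |= {("game equipment", "ball"), ("ball", "basketball")}
```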


More Definitions

A Full Taxonomy:

[Figure: a tree rooted at "game equipment" with children "ball" and "table"; the ball terms (basketball, volleyball, soccer) and table terms (table-tennis table, snooker table) hang below them.]

AssignedTermSet = {game equipment, ball, table, basketball, volleyball, soccer, table-tennis table, snooker table}

UnassignedTermSet = {}


More Definitions

A Partial Taxonomy:

[Figure: the same tree, with only the assigned terms attached so far.]

AssignedTermSet = {game equipment, ball, table, basketball, volleyball}

UnassignedTermSet = {soccer, table-tennis table, snooker table}


More Definitions

Ontology Metric

[Figure: the example taxonomy with edge distances of 1, 1.5, and 2; the ontology metric d between two concepts accumulates the distances along the path connecting them, giving e.g. d = 1, d = 2, and d = 4.5 (= 2 + 1.5 + 1) for the pairs shown, among them "ball" and "table".]


Assumptions

Minimum Evolution Assumption: The optimal ontology is the one that introduces the least information change!


Illustration

Minimum Evolution Assumption

[Figure, animated across several slides: candidate terms such as "ball" and "table" and the root "Game Equipment" are attached one at a time, illustrating incremental taxonomy construction under the Minimum Evolution Assumption.]

Assumptions

Abstractness Assumption: Each abstraction level has its own Information function


Assumptions

Abstractness Assumption

[Figure: the example taxonomy annotated with a separate information function per abstraction level: Info3(.) at the root ("Game Equipment"), Info2(.) at the middle level ("ball", "table"), and Info1(.) at the leaves.]


Multiple Criterion Optimization

  • Minimum evolution objective function

  • Abstractness objective function

  • A scalarization variable combines the two into a single objective
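The combined objective shown on the original slide did not survive transcription. A plausible reconstruction as a standard linear scalarization, writing O_ME and O_Abs for the two objective functions and λ for the scalarization variable (the symbols and exact form are assumptions, not copied from the paper):

```latex
\min_{T}\; \lambda\, O_{\mathrm{ME}}(T) \;+\; (1-\lambda)\, O_{\mathrm{Abs}}(T),
\qquad 0 \le \lambda \le 1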


Estimating Ontology Metric

  • Assume the ontology metric is a linear interpolation of some underlying feature functions

  • Use Ridge Regression to estimate the interpolation weights and predict the ontology metric
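The two steps above can be sketched in a few lines: fit ridge weights over feature values for term pairs whose distances are known (e.g. pairs from an existing taxonomy), then predict the metric for new pairs. All numbers below are made up for illustration, and the closed-form solver is one standard way to fit ridge regression, not necessarily the paper's:

```python
import numpy as np

# Rows: feature-function values for training term pairs (illustrative values).
X = np.array([[0.2, 1.0, 0.0],
              [0.9, 0.1, 1.0],
              [0.5, 0.5, 0.3],
              [0.1, 0.8, 0.2]])
# Known ontology-metric distances for those pairs (illustrative values).
y = np.array([1.0, 4.5, 2.0, 1.5])

alpha = 0.1  # ridge regularization strength (assumed, not from the paper)
# Closed-form ridge solution: w = (X^T X + alpha I)^{-1} X^T y
w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def ontology_metric(features):
    """Predicted distance for a new term pair: a linear interpolation
    of its feature-function values, as assumed on the slide."""
    return float(np.asarray(features) @ w)
```

With the weights in hand, `ontology_metric` can score any candidate term pair during incremental clustering.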


THE FEATURES

  • Our framework allows a wide range of features to be used

  • Input for the Feature Functions: Two terms

  • Output: A numeric score to measure semantic distance between these two terms

  • We can use the following types of feature functions, but are not restricted to only these:

    • Contextual Features

    • Term Co-occurrence

    • Lexical-Syntactic Patterns

    • Syntactic Dependency Features

    • Word Length Difference

    • Definition Overlap, etc.
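All such features can share one interface: take two terms and return a number. A minimal sketch with two toy features; the function names are hypothetical, and real features would use corpus statistics (contexts, co-occurrence, patterns) rather than surface strings:

```python
# Each feature function maps a term pair to a numeric score (hypothetical names).
def word_length_difference(cx: str, cy: str) -> float:
    # Longer terms tend to be more specific, e.g. "table-tennis table" vs "table".
    return float(abs(len(cx) - len(cy)))

def shared_word_count(cx: str, cy: str) -> float:
    # Toy stand-in for richer features such as co-occurrence or pattern matches.
    return float(len(set(cx.split()) & set(cy.split())))

FEATURES = [word_length_difference, shared_word_count]

def feature_vector(cx: str, cy: str) -> list:
    """Collect all feature scores for one term pair."""
    return [f(cx, cy) for f in FEATURES]
```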


Experimental Results

  • Task: Reconstruct taxonomies from WordNet and ODP

    • Not the entire WordNet or ODP, but fragments of WordNet or ODP

  • Ground Truth: 50 hypernym taxonomies from WordNet; 50 hypernym taxonomies from ODP; 50 meronym taxonomies from WordNet.

  • Auxiliary Datasets: 1000 Google documents per term or per term pair; 100 Wikipedia documents per term.

  • Evaluation Metrics: F1-measure (averaged by Leave-One-Out Cross Validation).



Performance of taxonomy induction

  • Compare our system (ME) with other state-of-the-art systems

    • HE: 6 is-a patterns [Hearst 1992]

    • GI: 3 part-of patterns [Girju et al. 2003]

    • PR: a probabilistic framework [Snow et al. 2006]

    • ME: our metric-based framework


Performance of taxonomy induction

  • Our system (ME) consistently gives the best F1 for all three tasks.

  • Systems using heterogeneous features (ME and PR) achieve a significant absolute F1 gain (>30%)


Features vs. relations

  • This is the first study of the impact of using different features on taxonomy induction for different relations

  • Co-occurrence and lexico-syntactic patterns are good for is-a, part-of, and sibling relations

  • Contextual and syntactic dependency features are only good for the sibling relation


Features vs. abstractness

  • This is the first study of the impact of using different features on taxonomy induction for terms at different abstraction levels

  • Contextual, co-occurrence, lexical-syntactic patterns, and syntactic dependency features work well for concrete terms;

  • Only co-occurrence works well for abstract terms


Conclusions

  • This paper presents a novel metric-based taxonomy induction framework, which

    • Combines strengths of pattern-based and clustering-based approaches

    • Achieves better F1 than 3 state-of-the-art systems

  • The first study on the impact of using different features on taxonomy induction for different types of relations and for terms at different abstraction levels


Conclusions

  • This work is a general framework, which

    • Allows a wider range of features

    • Allows different metric functions at different abstraction levels

  • This work has the potential to learn more complex taxonomies than previous approaches




FORMAL FORMULATION OF TAXONOMY INDUCTION

  • The Task of Taxonomy Induction:

    • The construction of a full ontology T given a set of concepts C and an initial partial ontology T0

    • Keep adding concepts from C into T0

      • Note T0 could be empty

    • Until a full ontology is formed


GOAL OF TAXONOMY INDUCTION

  • Find the optimal full ontology such that the information change since T0 is minimal, i.e.,

  • Note that this is by the Minimum Evolution Assumption
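The equation on this slide was lost in transcription. Combining it with the "Information in a Taxonomy" definition later in the deck, a plausible reconstruction (the symbols are assumptions) is:

```latex
T^{*} \;=\; \operatorname*{arg\,min}_{T}\; \bigl|\, \mathrm{Info}(T) - \mathrm{Info}(T_{0}) \,\bigr|,
\qquad
\mathrm{Info}(T) \;=\; \sum_{(c_x,\, c_y)\, \in\, T} d(c_x, c_y)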


Get to the Goal

  • Goal: find the full ontology that minimizes the information change since T0

    • Since the optimal set of concepts is always C

    • Concepts are added incrementally


Get to the Goal

  • Plug in the definition of information change

  • Transform into a minimization problem: the minimum evolution objective function


Explicitly Model Abstractness

  • Model abstractness for each level by a least-squares fit

  • Plug in the definition of the amount of information for an abstraction level: the abstractness objective function



More Definitions

Information in a Taxonomy T

[Figure: the example taxonomy again, with edge distances of 1, 1.5, and 2 and pairwise metric values d = 1, d = 2, and d = 4.5; the information in T is obtained from the ontology metric values over its concept pairs.]


More Definitions

Information in a Level L

[Figure: within a single abstraction level, the information is obtained from the metric values between the concepts at that level, e.g. d = 1 and d = 2 between the nodes shown.]


EXAMPLES OF FEATURES

  • Contextual Features

    • Global Context KL-Divergence = KL-Divergence(1000 Google Documents for Cx, 1000 Google Documents for Cy);

    • Local Context KL-Divergence = KL-Divergence(Left two and Right two words for Cx, Left two and Right two words for Cy).

  • Term Co-occurrence

    • Point-wise Mutual Information (PMI), computed from counts, where a count is:

      the # of sentences containing the term(s);

      or the # of documents containing the term(s);

      or n as in “Results 1-10 of about n for …” in Google.
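As a concrete sketch, PMI can be computed from any of the count sources listed above (sentences, documents, or web hits). This uses the textbook PMI definition; the paper's exact normalization may differ:

```python
import math

def pmi(count_xy: int, count_x: int, count_y: int, total: int) -> float:
    """Standard point-wise mutual information from raw counts.

    Counts may be numbers of sentences or documents containing the
    term(s), or web hit counts, as listed on the slide.
    """
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))

# Terms that co-occur more often than chance get a positive score.
score = pmi(count_xy=50, count_x=100, count_y=100, total=1000)
```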


EXAMPLES OF FEATURES

  • Syntactic Dependency Features

  • Minipar Syntactic Distance = Average length of syntactic paths in syntactic parse trees for sentences containing the terms;

  • Modifier Overlap = # of overlaps between modifiers of the terms; e.g., red apple, red pear;

  • Object Overlap = # of overlaps between objects of the terms when the terms are subjects; e.g., A dog eats apple; A cat eats apple;

  • Subject Overlap = # of overlaps between subjects of the terms when the terms are objects; e.g., A dog eats apple; A dog eats pear;

  • Verb Overlap = # of overlaps between verbs of the terms when the terms are subjects/objects; e.g., A dog eats apple; A cat eats pear.
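The four overlap features above share one counting pattern. A minimal sketch, assuming the modifiers, subjects, objects, and verbs have already been extracted with a dependency parser such as Minipar:

```python
def overlap(contexts_x, contexts_y) -> int:
    """Count shared items between two terms' extracted dependency contexts.

    The same counter serves modifier, subject, object, and verb overlap;
    only the upstream extraction step differs.
    """
    return len(set(contexts_x) & set(contexts_y))

# Modifier overlap: "red apple" and "red pear" share the modifier "red".
modifier_score = overlap(["red"], ["red"])
# Verb overlap: "a dog eats apple" / "a cat eats pear" share the verb "eats".
verb_score = overlap(["eats"], ["eats"])
```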


EXAMPLES OF FEATURES

  • Lexical-Syntactic Patterns


EXAMPLES OF FEATURES

  • Miscellaneous Features

    • Definition Overlap = # of non-stopword overlaps between definitions of two terms.

    • Word Length Difference

