Ontologies

German Rigau i Claramunt

http://www.lsi.upc.es/~rigau

TALP Research Center

Departament de Llenguatges i Sistemes Informàtics

Universitat Politècnica de Catalunya


Ontologies: Outline

WordNet (Miller et al. 90, Fellbaum 98)

EuroWordNet (Vossen et al. 98)

Spanish WordNet

Combining Methods (Atserias et al. 97)

Mapping hierarchies (Daudé et al. 01)

Mikrokosmos (Viegas et al. 96)

Cyc (Mahesh et al. 96)

WordNet 2 (Harabagiu 98)

MindNet (Richardson et al. 97)

ThoughtTreasure (Mueller 00)

Meaning ...


WordNet & EuroWordNet


WordNet & EuroWordNet: WordNet

Princeton University (Miller et al. 1990)

Lexicalized concepts (words, lexias)

Linked to one another by semantic relations

synonymy

antonymy

hypernymy-hyponymy

meronymy

entailment

cause

...


WordNet & EuroWordNet: Semantic Relations of WN1.5

Synonymy

Lexicalized concepts (SYNSETS)

Weak notion of synonymy: synonymy in context

Synset: a set of words or lexias that express a concept in a given context

Hypernymy / Hyponymy

Class-to-subclass relation


WordNet & EuroWordNet: Semantic Relations of WN1.5

Meronymy

Component part

{hand}{arm}

Member of a collection

{person}{people}

Substance

{newspaper}{paper}


WordNet & EuroWordNet: Semantic Relations of WN1.5

Antonymy

{big}{small}

Cause

{kill}{die}

Entailment

{divorce}{marry}

Derivation

{presidential}{president}

Similarity

{good}{positive}


WordNet & EuroWordNet: WordNet Example

<conveyance>

<vehicle>

<doorlock>

<car door>

<motor vehicle, automobile,...>

<cruiser, squad car, patrol car, ...>

<cab, taxi, hack, ...>


WordNet & EuroWordNet: EuroWordNet

Project LE-2 4003

EU Telematics Application Programme

Semantic networks for several languages

Integrated and interconnected

English: University of Sheffield

Dutch: University of Amsterdam

Italian: I.L.C., Pisa

Spanish: UB, UPC, UNED

Computers and the Humanities

(special issue, 1998)

http://www.hum.uva.nl/~ewn/


WordNet & EuroWordNet: EuroWordNet Extensions

EWN2

German, French, Czech, Swedish, Estonian

ITEM project

Spanish, Catalan, Basque

CREL (Centre de Referència d’Enginyeria Lingüística)

Catalan (UB, UPC)


WordNet & EuroWordNet: Applications

Development of basic resources

Cross-lingual information processing

- Multilingual information retrieval systems (e.g., Internet)

- Lexical-semantic module of language engineering systems

  • Information extraction

  • Machine translation


WordNet & EuroWordNet: Design Requirements

Preservation of the semantic relations specific to each language

Maximum compatibility among the different resources

Relative independence of the WordNets

in the construction process

in the final result


WordNet & EuroWordNet: Components of EuroWordNet

Core

The ILI

The Top Concept Ontology (TCO)

Domain Ontology (DO)

Periphery

Specific WordNets


WordNet & EuroWordNet: Interlingual Index of EuroWordNet

Unstructured collection of elements

Linked to

at least one synset of an EWN wordnet

an element of the TCO or DO

Associated with WN 1.5 synsets


WordNet & EuroWordNet: Top Concept Ontology of EuroWordNet

Hierarchy of language-independent concepts

semantic distinctions: object, place, dynamic, …

abstract (not lexical)

Superimposed on the ILI

Three types of entities:

First order: concrete entities

Second order: static or dynamic situations

Third order: abstract propositions


WordNet & EuroWordNet: Top Concept Ontology of EuroWordNet


WordNet & EuroWordNet: Domain Ontology of EuroWordNet

Hierarchy of domain labels

Reduction of polysemy

Domains:

Traffic:

Road traffic, air traffic

International news

Mycology

Medicine


WordNet & EuroWordNet: Relations of EuroWordNet

Richer than WN

Between:

synsets (monolingual modules)

ILI records (multilingual):

{actuar-1} EQ-SYNONYM {‘behave in a certain manner’}

ILI records and the TCO or DO


WordNet & EuroWordNet: Interlingual Relations of EuroWordNet


WordNet & EuroWordNet: Relations of EuroWordNet


Spanish WordNet: Building Process


Spanish WordNet: General Methodology

1) Mapping to WN1.5

manual work

automatic derivation of equivalents, using bilingual dictionaries

2) Manual correction

3) Re-structuring


Spanish WordNet: Main Steps: First Core (Manual Translation)

Nouns:

A) WN1.5’s Tops File plus first level of hyponyms (about 800 synsets).

B) The rest of EWN’s Common Base Concepts (which were not in our set).

C) Manual translation of synsets intermediate between (A) and (B) following the WN1.5 hierarchy (thus building a compact taxonomy equivalent to WN1.5 without gaps).

Verbs:

Manual translation of EWN’s Base Concepts (about 150 synsets)


Spanish WordNet: Main Steps: Subset 1 (Semi-automatic)

Nouns:

Applying automatic methods using bilingual dictionaries

Manual validation of several subsets to check whether each link is correct

Deriving a Confidence Score (CS) for every automatic method (heuristic)

Selecting synset-word pairs above 85% CS

Some manual correction of this Subset 1 (mainly, filling gaps)

Verbs:

3600 English verbs connected to WN1.5 senses and ambiguously translated to Spanish are manually inspected and disambiguated
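The noun selection step can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's tooling: the method names, the validation samples, the synset identifiers and both helper functions are hypothetical. Only the idea comes from the slide: estimate a Confidence Score per automatic method from a manually validated sample, then keep the synset-word pairs proposed by methods that reach the 85% threshold (treated here as "at or above", which is also an assumption).

```python
def confidence_score(validated):
    """Fraction of manually validated links that were judged correct."""
    return sum(validated) / len(validated)


def select_pairs(proposals, validation, threshold=0.85):
    """Keep the pairs proposed by methods whose CS reaches the threshold."""
    scores = {method: confidence_score(sample)
              for method, sample in validation.items()}
    selected = set()
    for method, pairs in proposals.items():
        if scores[method] >= threshold:
            selected.update(pairs)
    return scores, selected


# Hypothetical validation samples: True means the checked link was correct.
validation = {
    "method_a": [True] * 9 + [False],      # 9 of 10 correct
    "method_b": [True] * 3 + [False] * 2,  # 3 of 5 correct
}

# Hypothetical (synset, Spanish word) pairs proposed by each method.
proposals = {
    "method_a": [("synset_001", "casa"), ("synset_002", "perro")],
    "method_b": [("synset_003", "banco")],
}

scores, selected = select_pairs(proposals, validation)
```

With these toy inputs only the pairs from method_a survive, because method_b falls below the threshold.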


Spanish WordNet: Main Steps: Subset 1 (Results 1)


Spanish WordNet: Main Steps: Subset 1 (Results 2)


Spanish WordNet: Main Steps: Subset 2

Main goals

enhance the quality of Subset 1 by manual revision

extend it by manual building of synsets

4 Sub-tasks


Spanish WordNet: Main Steps: Subset 2

1) Manually covering those gaps in the hyponymy chains that are covered by other languages

2) Manual cleaning of some automatically generated variants.

(a) pairs of synsets which are adjacent in the hyponymy chain and share at least one variant.

deleting redundant variants

re-locating to either pre-existing or newly created synsets

(b) multi-word expressions present in synsets.

deleting non-lexicalized expressions


Spanish WordNet: Main Steps: Subset 2

3) Manual addition of new vocabulary which has been considered relevant.

It mainly comes from the Catalan WordNet: since we are building both wordnets in parallel, we detected those synsets which were built for Catalan and not for Spanish

4) Manual addition of cross-part-of-speech relations between nominal and verbal synsets.

This work has been based mainly on noun-verb pairs obtained by means of morphological criteria. (Work carried out by UNED, Madrid.)


Spanish WordNet: Main Steps: Subset 2 (Results)


Spanish WordNet: Main Steps: Beyond Subset 2

Massive Manual Checking (from Nov’98)

Using WEI

Variants automatically generated

Filling gaps in the hierarchy

New vocabulary

New Adjectives


Spanish WordNet: Main Steps: Beyond Subset 2


Spanish WordNet: Main Steps: Parole Coverage


Spanish WordNet: Current Figures

Spanish, Catalan, Basque, (English)

http://nipadio.lsi.upc.es/wei2.html


Combining Multiple Methods for the Automatic Construction of Multilingual WordNets


Combining Multiple Methods ...: Outline

Ten class methods

Four monosemic criteria

Four polysemic criteria

Two hybrid criteria

Three conceptual distance methods

CD1: using pairwise word co-occurrences

CD2: using headword and genus

CD3: using bilingual Spanish entries with multiple translations


Combining Multiple Methods ...: Ten class methods

Four Classes

[Figure: diagrams of the four classes, with nodes labelled SW and EW]


Combining Multiple Methods ...: Ten class methods

Four monosemic criteria

[Figure: one diagram per criterion, with nodes labelled SW, EW and Synset]


Combining Multiple Methods ...: Ten class methods

Four polysemic criteria

[Figure: one diagram per criterion, with nodes labelled SW, EW and Synset+]


Combining Multiple Methods ...: Ten class methods

Variant criterion

<..., EW, ..., EW, ...> SW

Field criterion

<..., headword-EW, ..., Ind-EW, ...> SW


Combining Multiple Methods ...: Ten class methods

Results


Combining Multiple Methods ...: Conceptual Distance methods

Conceptual Distance (Agirre et al. 94)

length of the shortest path

specificity of the concepts

  • using WordNet

  • Bilingual dictionary
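The two ingredients named on the slide, path length and concept specificity, can be illustrated on a toy hierarchy. This is a sketch under loud assumptions: the taxonomy below is a hypothetical tree loosely inspired by the abbey example, and weighting each node on the path by 1/depth is one plausible way of letting specificity enter the measure; it is not presented here as the exact formula of Agirre et al.

```python
from fractions import Fraction

# Hypothetical toy taxonomy (child -> parent); not real WordNet data.
PARENT = {
    "entity": None,
    "artifact": "entity",
    "building": "artifact",
    "house": "artifact",
    "church": "building",
    "monastery": "house",
}


def chain_to_root(node):
    """Return the node followed by all of its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth in the tree, counting the root as depth 1."""
    return len(chain_to_root(node))


def path_between(a, b):
    """Shortest path in a tree: climb from a to the lowest common ancestor, then descend to b."""
    up_a = chain_to_root(a)
    up_b = chain_to_root(b)
    common = next(n for n in up_a if n in up_b)
    left = up_a[: up_a.index(common) + 1]
    right = up_b[: up_b.index(common)]
    return left + right[::-1]


def conceptual_distance(a, b):
    """Return (number of edges, sum of 1/depth over the nodes on the path)."""
    path = path_between(a, b)
    edges = len(path) - 1
    weighted = sum(Fraction(1, depth(n)) for n in path)
    return edges, weighted
```

Because deeper (more specific) nodes contribute less, two concepts joined low in the hierarchy come out closer than two concepts joined near the root, even when the number of edges is the same.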


Combining Multiple Methods ...: Conceptual Distance methods

Three conceptual distance methods

CD1: using pairwise word co-occurrences

CD2: using headword and genus

CD3: using bilingual Spanish entries with multiple translations


Combining Multiple Methods ...: Conceptual Distance methods (Example CD2)

[Figure: WordNet hierarchy fragment containing the synsets listed below]

<entity>

<object, ...>

<artifact, artefact>

<structure, construction>

<house, lodging>

<building, edifice>

<place of worship, ...>

<religious residence, cloister>

<church, church building>

<monastery>

<convent>

<abbey>

<abbey>

<abbey>

abadía_1_2 Iglesia o monasterio regido por un abad o abadesa

(abbey, a church or a monastery ruled by an abbot or an abbess)


Combining Multiple Methods ...: Conceptual Distance methods (Example CD2)

[Figure: the same hierarchy fragment, with one <abbey> synset marked 06 ARTIFACT]

<entity>

<object, ...>

<artifact, artefact>

<structure, construction>

<house, lodging>

<building, edifice>

<place of worship, ...>

<religious residence, cloister>

<church, church building>

<monastery>

<convent>

<abbey>

<abbey>

<abbey> 06 ARTIFACT

abadía_1_2 Iglesia o monasterio regido por un abad o abadesa

(abbey, a church or a monastery ruled by an abbot or an abbess)


Combining Multiple Methods ...: Three CD methods

Results


Combining Multiple Methods ...: Combining methods

Results


Combining Multiple Methods ...: Resulting Spanish WordNets


Mapping Conceptual Hierarchies Using Relaxation Labelling


Mapping Conceptual Hierarchies using Relaxation Labelling: Outline

Setting

Relaxation Labelling Algorithm

Constraints

Experiments & Results I (multilingual)

Experiments & Results II (monolingual)

Further work


Mapping Conceptual Hierarchies using Relaxation Labelling: Setting

[Figure: diagram with nodes labelled C1 to C6, shown on two consecutive slides]


Mapping Conceptual Hierarchies using Relaxation Labelling: Setting

Connecting already existing hierarchies

Relaxation labelling algorithm

Constraints

Between

Spanish taxonomy automatically derived from an MRD (Rigau et al. 98)

WordNet

using a bilingual MRD


Mapping Conceptual Hierarchies using Relaxation Labelling: Setting

animal

(Tops <animal, animate_being, ...>)

(person <beast, brute, ...>)

(person <dunce, blockhead, ...>)

ave

(animal <bird>)

(artifact <bird, shuttle, ...>)

(food <fowl, poultry, ...>)

(person <dame, doll, ...>)

faisán

(animal <pheasant>)

(food <pheasant>)

rapaz

(animal <bird>)

(artifact <bird, shuttle, ...>)

(food <fowl, poultry, ...>)

(person <dame, doll, ...>)


Mapping Conceptual Hierarchies using Relaxation Labelling: Relaxation Labelling Algorithm

Iterative algorithm for function optimization based on local information

It can deal with any kind of constraints

variables (senses of the taxonomy)

labels (synsets)

Finds a weight assignment for each possible label for each variable

weights for the labels of the same variable add up to one

the weight assignment satisfies the set of constraints to the maximum possible extent


Mapping Conceptual Hierarchies using Relaxation Labelling: Relaxation Labelling Algorithm

1) Start with a random weight assignment

2) Compute the support value for each label of each variable (according to the constraints)

3) Increase the weights of the labels more compatible with the context and decrease those of the less compatible labels.

4) If a stopping/convergence criterion is satisfied, stop;

otherwise go to step 2.
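The four steps can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the update rule (multiply each weight by 1 + support, then renormalise per variable), the uniform rather than random starting weights (chosen so the run is reproducible), the constraint format and the toy data based on the animal/ave example are all assumptions made for this sketch.

```python
def relax(labels, constraints, iterations=50, tolerance=1e-6):
    """labels: {variable: [label, ...]}
    constraints: list of (variable, label, other_variable, other_label, weight);
    each tuple says that (other_variable, other_label) supports (variable, label).
    """
    # Step 1: initial weight assignment (uniform here, for determinism).
    weights = {v: {l: 1.0 / len(ls) for l in ls} for v, ls in labels.items()}
    for _ in range(iterations):
        updated = {}
        largest_change = 0.0
        for v, ls in labels.items():
            # Step 2: support of each label, computed from the current weights.
            support = {l: 0.0 for l in ls}
            for var, lab, other_var, other_lab, c in constraints:
                if var == v:
                    support[lab] += c * weights[other_var][other_lab]
            # Step 3: reward supported labels, then renormalise to sum to one.
            raw = {l: weights[v][l] * (1.0 + support[l]) for l in ls}
            total = sum(raw.values())
            updated[v] = {l: raw[l] / total for l in ls}
            largest_change = max(
                largest_change,
                max(abs(updated[v][l] - weights[v][l]) for l in ls),
            )
        weights = updated
        # Step 4: stop once the weights no longer move.
        if largest_change < tolerance:
            break
    return weights


# Toy data: two Spanish senses and some of their candidate synsets.
labels = {
    "animal": ["Tops:animal", "person:beast"],
    "ave": ["animal:bird", "artifact:bird", "food:fowl"],
}
# One hypothetical hypernymy-compatibility constraint, stated in both directions.
constraints = [
    ("ave", "animal:bird", "animal", "Tops:animal", 1.0),
    ("animal", "Tops:animal", "ave", "animal:bird", 1.0),
]

result = relax(labels, constraints)
best = {v: max(ws, key=ws.get) for v, ws in result.items()}
```

In this toy run the two mutually supporting labels reinforce each other at every iteration, so their weights grow while the unsupported candidates shrink.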


Mapping Conceptual Hierarchies using Relaxation Labelling: Constraints

Rely on the taxonomy structure

Coded with three characters (XYZ)

X: Spanish taxonomy, I (immediate) or A (ancestor)

Y: English taxonomy, I (immediate) or A (ancestor)

Z: Relation, E (hypernym), O (hyponym), B (both)

Examples:

IIE

AAB

[Figure: one diagram per example constraint]
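The three-character coding can be made concrete with a small decoder. This is only an illustrative sketch: the function name and the dictionary keys are hypothetical, and the reading of the first two characters as "Spanish side" and "English side" follows the slide's legend.

```python
SCOPE = {"I": "immediate", "A": "ancestor"}
RELATION = {"E": "hypernym", "O": "hyponym", "B": "both"}


def decode_constraint(code):
    """Expand a code such as 'IIE' into its three components."""
    if len(code) != 3:
        raise ValueError("a constraint code has exactly three characters")
    spanish, english, relation = code
    return {
        "spanish_taxonomy": SCOPE[spanish],
        "english_taxonomy": SCOPE[english],
        "relation": RELATION[relation],
    }


# Enumerate every code the scheme allows: 2 x 2 x 3 = 12 constraints.
ALL_CODES = [s + e + r for s in "IA" for e in "IA" for r in "EOB"]
```

For example, decode_constraint("AAB") describes a constraint that looks at ancestors in both taxonomies and at both hypernyms and hyponyms.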


Mapping Conceptual Hierarchies using Relaxation Labelling: Hierarchical Constraints

NAACL’2001

II Constraints: IIE, IIO, IIB

AI Constraints: AIE, AIO, AIB

IA Constraints: IAE, IAO, IAB

AA Constraints: AAE, AAO, AAB

[Figures: one diagram per constraint, across four slides]


Combining Multiple Methods ... RANLP’97: Eight class methods

Four monosemic criteria

[Figure: one diagram per criterion, with nodes labelled SW, EW and Synset]

Values shown beside the Synset nodes, alongside the labels Prec. and Cov.:

85% 4%

92% 5%

89% 1%

89% 2%

Combining Multiple Methods ... RANLP’97: Eight class methods

Four polysemic criteria

[Figure: one diagram per criterion, with nodes labelled SW, EW and Synset+]

Values shown beside the Synset+ nodes, alongside the labels Prec. and Cov.:

80% 8%

75% 2%

58% 17%

61% 60%


Combining Multiple Methods ... RANLP’97: Experiments & Results

Poly             TOK, FOK     TOK, FNOK    total
animal           279 (90%)    30 (91%)     209 (90%)
food             166 (94%)    3 (100%)     169 (94%)
cognition        198 (67%)    27 (90%)     225 (69%)
communication    533 (77%)    40 (97%)     573 (78%)

all              TOK, FOK     TOK, FNOK    total
animal           424 (93%)    62 (95%)     486 (90%)
food             166 (94%)    83 (100%)    249 (96%)
cognition        200 (67%)    245 (90%)    445 (82%)
communication    536 (77%)    234 (97%)    760 (81%)


Combining Multiple Methods ... RANLP’97: Experiments & Results

piel

(substance <skin, fur, peel>)

marta

(substance <sable, marte, coal_back>)

visón

(substance <mink, mink_coat>)


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Generalized Constraints

All relationships

also-see, similar-to, attribute, antonym, etc.

[Figure: diagram with links labelled R]

A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Generalized Constraints

Non-structural constraints

W: number of word coincidences

G: word coincidences in glosses

F: number of frame coincidences (verbs)


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: POS mapping dependencies

Nouns

Adjectives

Adverbs

Verbs


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Constraints for Verbs

Structural constraints

hyper/hyponymy

antonymy

also-see

Non-structural constraints

W, G and F


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Constraints for Adjectives

Structural constraints

Adj-to-Adj

antonymy, similar-to and also-see

Adj-to-Verb

participle-of

Adj-to-Noun

pertains and attribute

Non-structural constraints

W and G


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Constraints for Adverbs

Structural constraints

Adv-to-Adv

antonymy

Adv-to-Adj

derived

Non-structural constraints

W and G


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Example extra-POS

WN1.6

00843344a

evangelical evangelistic

WN1.5

Similar to

02025107a

evangelical evangelistic

00842521a

enthusiastic

pertainym

02025107a

evangelical

04237485n

Gospel Gospels evangel

pertainym

04853575n

Gospel Gospels evangel


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Example extra-POS

WN1.5

WN1.6

00057615r

impossibly absurdly

00294844r

impossibly

derived from

derived from

antonym

01393725a

impossible

01752468a

impossible

00294658a

possibly


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Results

  • Basic constraint set: structural constraints

    • Nouns: AA hyper/hyponym

    • Verbs: AA hyper/hyponym, II also-see

    • Adjectives: II antonymy, similar-to, also-see

    • Adverbs: II antonymy


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Results

  • Basic constraint set: structural constraints

Precision - recall

     Coverage    Ambiguous        Overall
N    99.7%       94.9% - 99.6%    97.6% - 99.8%
V    96.9%       93.5% - 99.2%    94.6% - 99.2%
A    94.1%       82.8% - 98.9%    89.5% - 99.4%
R    80.8%       97.5% - 100%     99.0% - 100%


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Results

  • Basic constraint set + W, G and F for verbs

Precision - recall

     Coverage    Ambiguous        Overall
N    99.9%       97.5% - 97.7%    98.8% - 98.9%
V    99.8%       99.4% - 99.7%    99.3% - 99.6%
A    98.9%       96.5% - 98.8%    97.9% - 99.3%
R    99.5%       97.5% - 100%     99.0% - 100%


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Results

  • Basic + extra-POS relationships

Precision - recall

     Coverage    Ambiguous        Overall
N    -           -                -
V    -           -                -
A    95.8%       95.8% - 98.9%    90.9% - 99.4%
R    88.0%       69.2% - 94.2%    97.9% - 98.1%


A Complete WN1.5 to WN1.6 Mapping ... ACL’00, NAACL’01: Results

  • Basic + extra-POS relationships + WGF

Precision - recall

     Coverage    Ambiguous        Overall
N    99.9%       97.5% - 97.7%    98.8% - 98.9%
V    99.8%       99.4% - 99.7%    99.3% - 99.6%
A    99.0%       96.5% - 99.1%    97.9% - 99.5%
R    99.6%       98.3% - 100%     99.3% - 100%


Mapping Conceptual Hierarchies using Relaxation Labelling: Conclusions

First complete mapping between WordNet versions

Combining structural and non-structural information

Robust approach based on local information, but with global effects

Incremental POS approach

http://www.lsi.upc.es/~nlp

90 downloads (since November 2000)


Mapping Conceptual Hierarchies using Relaxation Labelling: Further Work

Mapping other structures

WN-EDR, WN-LDOCE, etc.

Other language taxonomies to EuroWordNet

SpanishEWN to WN1.6

symmetrical philosophy rather than source-target


Mikrokosmos


Mikrokosmos Outline

Introduction

Representational Issues

The Lexicon

The Ontology

Acquisition Process

Lexicon Acquisition

Guidelines

Ontology/Lexicon Trade-off

Semantics in Action


Mikrokosmos Introduction

Knowledge-Based Machine Translation (KBMT)

CRL, NMSU

5,000 concepts

Events

Objects

Properties

7,000 Spanish word senses

40,000 word senses

after expansion with productive Lexical Rules

comprar -> comprador, comprable, ...

Text Meaning Representation


Mikrokosmos Representational Issues: The Lexicon

Typed Feature Structures (Pollard and Sag 87)

language-dependent

10 zones

phonology

orthography

morphology

syntactic (subcategorization)

semantic (Lexical Semantic Representation)

syntax-semantic linking

stylistics

paradigmatic

syntagmatic


Mikrokosmos Representational Issues: The Lexicon

Adquirir-V1

syn: subj: cat: NP

     obj: cat: NP

sem: acquire

     agent: HUMAN

     theme: OBJECT

Adquirir-V2

syn: subj: cat: NP

     obj: cat: NP

sem: acquire

     agent: HUMAN

     theme: INFORMATION


Mikrokosmos Representational Issues: The Ontology

Taxonomic, multi-hierarchical

14 local or inherited links on average

language-impartial

EVENTS, OBJECTS, PROPERTIES

Methodology & Guidelines


Mikrokosmos Representational Issues: The Ontology

ACQUIRE

DEFINITION “The transfer of possession event where the agent transfers an object to its possession”

IS-A TRANSFER-POSSESSION

SOURCE HUMAN PLACE

THEME OBJECT (NOT HUMAN)

AGENT ANIMAL (DEFAULT HUMAN)

DESTINATION ANIMAL PLACE (DEFAULT HUMAN)

INHERITED

BENEFICIARY HUMAN


Mikrokosmos Acquisition Process: The Lexicon

Multi-lingual

French, English, Japanese, Russian, Spanish, etc.

Multi-media

Multi-process

Analysis

Generation (mono and multilingual)

MT

Summarization

IE

Speech Processing

Tools

corpus-search, lookup dictionary, ontology browser


Mikrokosmos Acquisition Process: The Ontology

Guidelines

1) Do not add instances as concepts

Instances do not have their own instances

Concepts do not have a fixed position in space/time

2) Do not decompose concepts further

3) Use close concepts

4) Do not add EVENTs with particular arguments

5) Do not add concepts with instance-specific aspects or temporal relations

6) Do not add language-specific concepts

7) Do not add ontological concepts for collections


Mikrokosmos Acquisition Process: Ontology/Lexicon Trade-off

Daily negotiations

lexicon acquirers

ontology acquirers

Possibilities

one-to-one mapping

lexicon unspecification

lexicon-ontology balance


Mikrokosmos Acquisition Process: Ontology/Lexicon Trade-off

One-to-one mapping

Problems

Lexical: every word in a language is a concept

Conceptual: cuire in French is not ambiguous

PREPARE-FOOD

INST: COOKING-EQUIPMENT

COOK

INST: STOVE

BAKE

INST: OVEN

cook : cuire sur le feu

bake : cuire au four


Mikrokosmos Acquisition Process: Ontology/Lexicon Trade-off

Lexicon Unspecification

Problems

BAKE is not in the ontology

PREPARE-FOOD

INST: COOKING-EQUIPMENT

bake : cuire au four

INST: OVEN

cook : cuire sur le feu


Mikrokosmos Acquisition Process: Ontology/Lexicon Trade-off

Lexicon-Ontology Balance

PREPARE-FOOD

INST: COOKING-EQUIPMENT

BAKE

INST: OVEN

FRY

INST: STOVE

INST: FRYING-PAN

cook : cuire

bake


Mikrokosmos Semantics in Action

El grupo Roche, a través de su compañía en España, adquirió Doctor Andreu.

(The Roche group, through its company in Spain, acquired Doctor Andreu.)

El grupo Roche adquirió Doctor Andreu a través de su compañía en España.

(The Roche group acquired Doctor Andreu through its company in Spain.)

La adquisición de Doctor Andreu por el grupo Roche fue hecha a través de su compañía en España.

(The acquisition of Doctor Andreu by the Roche group was made through its company in Spain.)

ACQUIRE-1 Agent: ORGANIZATION-1

Theme: ORGANIZATION-2

Instrument: ORGANIZATION-3

ORGANIZATION-1 Object-Name: Grupo Roche

ORGANIZATION-2 Object-Name: Doctor Andreu

ORGANIZATION-3 Location: España


Mikrokosmos Semantics in Action

Onto-Search: Ontological search mechanism to check constraints

check-onto(ACQUIRE, EVENT) = 1

since ACQUIRE is a type of EVENT

check-onto(ORGANIZATION, HUMAN) = 0.9

since ORGANIZATION HAS-MEMBER HUMAN
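The two check-onto examples suggest a graded compatibility test, which can be sketched as follows. This is not the Mikrokosmos implementation: the tiny ontology fragment, the function name check_onto and the score of 0.0 for unrelated concepts are assumptions made for illustration. The slide only gives the scores 1 for an IS-A path and 0.9 for a HAS-MEMBER link, and the link from TRANSFER-POSSESSION to EVENT is assumed here.

```python
# Hypothetical ontology fragment: child -> parent along IS-A links.
IS_A = {
    "ACQUIRE": "TRANSFER-POSSESSION",
    "TRANSFER-POSSESSION": "EVENT",  # assumed link, not stated on the slides
}

# Non-taxonomic links as (source, relation, target) triples.
OTHER_LINKS = {
    ("ORGANIZATION", "HAS-MEMBER", "HUMAN"),
}


def check_onto(concept, target):
    """Score how well `concept` satisfies a constraint asking for `target`."""
    # Full score when target is reachable by climbing IS-A links.
    node = concept
    while node is not None:
        if node == target:
            return 1.0
        node = IS_A.get(node)
    # Slightly lower score when a direct non-taxonomic link exists.
    for source, _relation, destination in OTHER_LINKS:
        if source == concept and destination == target:
            return 0.9
    # Assumption: unrelated concepts score zero.
    return 0.0
```

Returning a graded score instead of a yes/no answer lets an analyser prefer strict matches while still accepting near misses such as an organization filling a slot that expects a human.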


Mikrokosmos Semantics in Action

1) a-través-de -> INSTRUMENT, LOCATION

adquirir requires a PHYSICAL-OBJECT

2) en -> LOCATION, TEMPORAL

España is not a TEMPORAL-OBJECT

3) adquirir -> ACQUIRE, LEARN

Doctor Andreu is not an INFORMATION

4) Doctor Andreu -> ORGANIZATION, HUMAN

the Theme of ACQUIRE is not HUMAN

5) compañía -> CORPORATION, SOCIAL-EVENT

ORGANIZATIONs typically fill the INSTRUMENT slot of ACQUIRE acts


Mikrokosmos Experiment: WSD

Text                1       2       3       4       Mean
words               347     385     370     353     364
words/sentence      16.5    24.0    26.4    20.8    21.4
open-class words    183     167     177     177     176
ambiguous words     57      42      57      35      48
syntax              21      19      20      12      18
correct             51      41      45      34      43
%                   97      99      93      99      97


Mikrokosmos Experiment: WSD

Text                Mean    Mean Unseen
words               364     390
words/sentence      21.4    26
open-class words    176     104
ambiguous words     48      26
syntax              18      9
correct             43      23
%                   97      97


WordNet2


WordNet2 Outline

Introduction

Text Inferences

Defining Features

Plausible inferences

Inference Rules

Semantic Paths

What WordNet cannot do


WordNet2 Introduction

(Harabagiu 98)

Common-sense reasoning requires extensive knowledge

~100 million concepts and relations

WordNet

represents almost all English words

100,000 synsets

linked by semantic relations

WordNet2

each synset has a gloss that, when disambiguated, may increase the number of relations

WordNet glosses into semantic networks

NEW RELATIONS


WordNet2 Text Inferences

German was hungry

He opened the refrigerator

hungry (feeling a need or desire to eat)

eat (take in solid food)

refrigerator (an appliance in which foods can be stored at low temperature)


Transform each concept’s gloss into a graph where concepts are nodes and lexical relations are links

<culture> (all the knowledge shared by society)

<share> --AGENT--> <society>

<doctor> (licensed medical practitioner)

<medical practitioner> --ATTRIBUTE--> <licensed>

WordNet2 Defining Features
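The gloss-to-graph idea on this slide can be sketched as a list of labeled links. This is only an illustration in Python: the triple layout and the `build_graph` helper are assumptions, and no gloss parsing or disambiguation is attempted. The two links are the ones given on the slide.

```python
from collections import defaultdict

# Each link is (source concept, relation, target concept).
links = [
    ("share", "AGENT", "society"),                      # from the gloss of <culture>
    ("medical practitioner", "ATTRIBUTE", "licensed"),  # from the gloss of <doctor>
]


def build_graph(triples):
    """Index links by source concept: concept -> [(relation, target), ...]."""
    graph = defaultdict(list)
    for source, relation, target in triples:
        graph[source].append((relation, target))
    return dict(graph)


graph = build_graph(links)
print(graph["share"])
```

Storing the relation label on each link keeps the graph usable for the inference rules and path searches described on the following slides.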


WordNet2 Defining Features

[Figure: gloss graph for the concept pilot. Nodes: pilot, person, qualified, guide, ship, water, difficult. Link labels: GLOSS, ATTRIBUTE (twice), PURPOSE, OBJECT, LOCATION.]


Rule 1                       Rule 2
VC1 IS-A VC2                 VC1 IS-A VC2
VC2 IS-A VC3                 VC2 ENTAIL VC3
-------------------------    -------------------------
VC1 IS-A VC3                 VC1 ENTAIL VC3

Rule 3                       Rule 4
VC1 IS-A VC2                 VC1 IS-A VC2
VC2 R_IS-A VC3               VC2 R_ENTAIL VC3
-------------------------    -------------------------
VC1 PLAUSIBLE (not VC3)      VC1 EXPLAINS VC3

16 + 1 rules

WordNet2 Inference Rules
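The four rules shown on this slide all combine a pair of links that share a middle concept, so they can be sketched as a lookup table. This is a minimal Python illustration, not the system's implementation: the label `PLAUSIBLE-NOT` is an assumed encoding of "PLAUSIBLE (not VC3)", the remaining rules are not listed on the slide and are left out, and the verb facts are illustrative examples rather than slide content.

```python
# (relation of premise 1, relation of premise 2) -> relation of the conclusion.
RULES = {
    ("IS-A", "IS-A"): "IS-A",               # Rule 1
    ("IS-A", "ENTAIL"): "ENTAIL",           # Rule 2
    ("IS-A", "R_IS-A"): "PLAUSIBLE-NOT",    # VC1 PLAUSIBLE (not VC3)
    ("IS-A", "R_ENTAIL"): "EXPLAINS",       # VC1 EXPLAINS VC3
}


def infer(facts):
    """One pass: for every VC1 r1 VC2 and VC2 r2 VC3, apply the matching rule."""
    derived = set()
    for vc1, r1, vc2 in facts:
        for other, r2, vc3 in facts:
            if other == vc2 and (r1, r2) in RULES:
                derived.add((vc1, RULES[(r1, r2)], vc3))
    return derived


# Illustrative facts (assumed for the example).
FACTS = {
    ("limp", "IS-A", "walk"),
    ("walk", "IS-A", "move"),
    ("walk", "ENTAIL", "step"),
}
print(sorted(infer(FACTS)))
```

Running `infer` again on the union of facts and derived links would chain the rules further; a single pass is shown to keep the sketch short.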


0) Create and load the KB

1) Place markers on KB concepts

2) Propagate markers

The algorithm avoids cycles

3) Detect collisions

Each marker collision corresponds to a path

4) Extract Inferences

WordNet2 Semantic Paths
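The steps above can be sketched as a breadth-first marker propagation. This is a simplified illustration in Python under stated assumptions: the slides do not specify the propagation strategy, links are treated as untyped here, and the toy knowledge base (loosely based on the hungry/refrigerator glosses) and the function names are hypothetical.

```python
from collections import deque


def propagate(graph, start):
    """Spread a marker from start; returns {concept: path from start}."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in paths:  # already marked: skip, which avoids cycles
                paths[neighbor] = paths[node] + [neighbor]
                queue.append(neighbor)
    return paths


def collisions(graph, start_a, start_b):
    """Concepts reached by both markers, each with the path joining the starts."""
    from_a = propagate(graph, start_a)
    from_b = propagate(graph, start_b)
    return {
        node: from_a[node] + from_b[node][-2::-1]
        for node in from_a
        if node in from_b
    }


# Toy KB (assumed): the food <-> eat links form a cycle on purpose.
KB = {
    "hungry": ["eat"],
    "eat": ["food"],
    "food": ["eat"],
    "refrigerator": ["appliance", "food"],
}
print(collisions(KB, "hungry", "refrigerator")["food"])
```

Each collision yields one candidate path between the two marked concepts; extracting an inference from it would additionally require the relation labels and the inference rules.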


Inference sequence

German was hungry

German felt a desire to eat

German felt a desire to take in food

COLLISION: German=he felt a desire to take food, stored in an appliance, which he opened

He opened an appliance where food is stored

He opened the refrigerator

WordNet2 Semantic Paths


Major WordNet limitations:

1) The lack of compound concepts

2) The small number of causation and entailment relations

3) The lack of preconditions for verbs

4) The absence of case relations

WordNet2 What WordNet cannot do


ThoughtTreasure


A comprehensive platform for

NLP (English, French)

commonsense reasoning

A hotel room has a bed, night table, ...

People have fingernails

soda is a drink

one hangs up at the end of a phone call

the sky is blue

dogs bark

someone who is 16 years old is a teenager

ThoughtTreasure Overview


25,000 concepts organized into a hierarchy

EVIAN -> FLAT-WATER -> DRINKING-WATER

55,000 words (English, French)

food <-> aliment <-> FOOD

50,000 assertions about concepts

green-pea is green

100 scripts

ThoughtTreasure Overview
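The hierarchy and lexicon entries on this slide can be pictured with two small tables. This Python sketch is only an illustration and does not reflect ThoughtTreasure's actual file format or API: the dictionary layout, the language codes and the helper names are assumptions, while the concept and word names come from the slide.

```python
# Parent links from the slide: EVIAN -> FLAT-WATER -> DRINKING-WATER.
PARENT = {"EVIAN": "FLAT-WATER", "FLAT-WATER": "DRINKING-WATER"}

# Lexical entries, after food <-> aliment <-> FOOD: (language, word) -> concept.
LEXICON = {("en", "food"): "FOOD", ("fr", "aliment"): "FOOD"}


def ancestors(concept):
    """Follow parent links upward and return the chain of ancestors."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, candidate):
    """True if candidate is the concept itself or one of its ancestors."""
    return concept == candidate or candidate in ancestors(concept)


print(ancestors("EVIAN"))
print(is_a("EVIAN", "DRINKING-WATER"))
```

Mapping words of both languages to one concept identifier is what lets a single assertion, such as green-pea is green, serve English and French alike.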


Text agents for recognizing names, phone numbers, etc.

mechanisms for learning new words

An X-phile is someone who likes X

a syntactic parser

a NL generator

a semantic parser

an anaphoric parser

planning agents for achieving goals

understanding agents

ThoughtTreasure Overview


Who created Bugs Bunny?

1.0 (create human-interrogative-pronoun Bugs-Bunny)

0.9 (create rock-group-the-Who Bugs-Bunny)

1.0 (create Tex-Avery Bugs-Bunny)

0.1 (not (create rock-group-the-Who Bugs-Bunny))

ThoughtTreasure Example


Meaning


Knowledge Bases

Automatic enrichment of EWN (verb models, etc.)

Mixed approach (KB + ML)

Q/A

Problem

structural and lexical ambiguity

Approach

automatically locate examples of senses (Leacock et al. 98, Mihalcea and Moldovan 99)

Large-scale WSD (Boosting, SVM, transductive methods …)

Knowledge acquisition (Ribas 95, McCarthy 01)

Meaning Overview


<evento> (event)
  <evento social> (social event)
    <competición, concurso> (competition, contest)
      <partido_1> (match)
        <semifinal> (semifinal)
        <cuartos_de_final> (quarterfinals)

<agrupación grupo colectivo> (grouping, group, collective)
  <grupo_social> (social group)
    <organización> (organization)
      <partido_2, partido_político> (political party)
        <partido_laborista> (labour party)

Meaning: Exploiting EWN Semantic Relations


Meaning: Exploiting EWN Semantic Relations

partido 1

All the parties [partidos] are calling for legal reforms for TV3.

The right plans to unite in a single party [partido].

The deputy reiterated that neither he nor UDC, “as a party” [como partido], has received money from Pellerols.

partido 2

But Spain brought intensity, rhythm and courage to the match [partido].

The national coach believes that today's match [partido] against Italy will show what Spain is capable of.

Racing has not won at home for six matches [partidos].


Meaning: Exploiting EWN Semantic Relations

partido 1

We will never negotiate with a political party [partido político] that favours Taiwan's independence.

Once again, the diversion of funds earmarked for occupational training toward the financing of a political party [partido político] is in the news.

These laws were passed thanks to a general consensus among the political parties [partidos políticos].

partido 2

Rivera asks for the fans' support to get the semifinals [semifinales] on track.

Only Valero Ribera's team can settle a semifinal [semifinal] the way it did yesterday in a completely devoted Palau Blaugrana.

Racing won the quarterfinals [cuartos de final] at home.
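The examples on this slide contain no bare occurrence of the ambiguous word; they are found through related EWN entries such as partido político, semifinal and cuartos de final. A minimal Python sketch of that idea follows. The sense keys (named after a hypernym rather than a sense number), the substring matching and the function name are all assumptions made for illustration; the relatives come from the EWN hierarchy slide and the test sentence is one of the original Spanish corpus examples.

```python
# Relatives of each sense of "partido", taken from the EWN hierarchy slide.
RELATIVES = {
    "partido (competición)": ["semifinal", "cuartos de final"],
    "partido (organización)": ["partido político", "partido laborista"],
}


def senses_by_relatives(sentence):
    """Return the senses of 'partido' whose relatives occur in the sentence."""
    text = sentence.lower()
    return [
        sense
        for sense, words in RELATIVES.items()
        if any(word in text for word in words)
    ]


print(senses_by_relatives("El Racing ganó los cuartos de final en su campo."))
```

Sentences retrieved this way can then serve as automatically labeled training examples for the corresponding sense. Plain substring matching misses inflected forms such as partidos políticos, so a real system would need lemmatization.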


Meaning Architecture

[Figure: Meaning architecture diagram. Boxes: a Web Corpus and an EWN for each of English, Italian, Spanish, Basque and Catalan, plus a Multilingual Central Repository. Arrow labels: WSD, ACQ, UPLOAD, PORT.]

