An Ontology for Domain-oriented Semantic Similarity Search on XML Data. Anja Theobald, University of the Saarland, Germany. [email protected] http://www-dbs.cs.uni-sb.de/. (BTW) February 25 – 28, 2003, Leipzig, Germany.

Presentation Transcript

An Ontology for Domain-oriented Semantic Similarity Search on XML Data

Anja Theobald
University of the Saarland, Germany
[email protected]
http://www-dbs.cs.uni-sb.de/

(BTW) February 25 – 28, 2003, Leipzig, Germany


Motivation

Query on Web Data:

  • Ranking based on content data and structure (XML, …)
  • Using ontologies for similarity search
  • Grouping results by their topics

[Figure: query results grouped by topic: movie, astronomy, sports]


Outline

0. Why do we need ranked retrieval and ontologies?

1. XXL Search Engine

2. Ontologies - a Linguistic Challenge

3. Graph-based Ontology

4. Quantification: Edge Weights

5. Similarity of Ontology Nodes

6. Ontology-based Query Processing


XXL Search Engine

XML Document:

  <galaxy>
    <object>
      <description>sun</description>
      <appearance>…light and heat…</appearance>
      <location>…</location>
    </object>
    <history> … </history>
  </galaxy>

[Figure: XXL architecture. Components shown: WWW, Crawler; Path Indexer, EPI Handler, EPI; Content Indexer, ECI Handler, ECI; Name Ontology Indexer, Name Ontology Handler, NOI; Content Ontology Indexer, Content Ontology Handler, COI; Query Processor; Visual XXL]

XXL Query:

  SELECT * FROM INDEX
  WHERE #.~universe AS U
  AND U.#.~appearance AS A
  AND U.#.S ~ "star"


Ontologies – a linguistic challenge

  • ontology:
    ...representational vocabulary of words including hierarchical relationships and associative relationships between these words [Gruber93]...

[Figure: triangle relating word, sense, and object. word: star; sense: "...a celestial body of hot gases..."; object: (image). Edge labels: symbolized, refers to, stands for]


Word – Sense – Synset

words $w \in \Sigma^*$

+ word senses

$\Rightarrow\ U = \{(w,s) \mid w \in \Sigma^*,\ s \in S:\ \text{word } w \text{ has sense } s\}$

+ synonym relationship

$\Rightarrow\ synset(s) = \{ w \mid (w,s) \in U \}$
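The two definitions above can be sketched in Python as a set of (word, sense) pairs. This is only an illustration: the sense identifiers such as `star#1` and the sample pairs are assumptions modelled on the "star" example, not data taken from an actual thesaurus.

```python
# Hypothetical sample of U = {(w, s) | word w has sense s}.
# Sense identifiers such as "star#1" are made up for this sketch.
U = {
    ("star", "star#1"),                      # a celestial body of hot gases
    ("star", "star#4"),                      # a plane figure with 5 or more points
    ("celestial body", "celestial_body#1"),
    ("heavenly body", "celestial_body#1"),
}


def synset(s, pairs=U):
    """Return synset(s) = {w | (w, s) in U}: all words sharing sense s."""
    return {w for (w, sense) in pairs if sense == s}


def senses(w, pairs=U):
    """Return all senses of word w; more than one means w is ambiguous."""
    return {sense for (word, sense) in pairs if word == w}


print(sorted(synset("celestial_body#1")))
print(sorted(senses("star")))
```

A word with several senses, like "star" here, belongs to several synsets, which is why a synset alone does not identify an ontology node.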


Disambiguation: Synset – Category

$\Rightarrow\ synset(s) = \{ w \mid (w,s) \in U \}$   // $U = \{(w,s) \mid \text{word } w \text{ has sense } s\}$

+ hypernym relationship

$\Rightarrow\ category(s) = \{ synset(s') \mid synset(s') \text{ is hypernym of } synset(s) \}$

[Figure: two hypernym chains for the word "star", listed from general to specific.
  sense 1: (astronomy) a celestial body of hot gases…
    entity, physical thing > object, physical object > natural object > celestial body, heavenly body > star
  sense 4: a plane figure with 5 or more points…
    abstraction > attribute > shape, form > figure > plane figure, 2-dim. figure > star
  In both chains synset(s) = {star}; only the sense s differs.]
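A minimal sketch of how the category separates the two "star" synsets. The sense identifiers and the `HYPERNYMS_OF` table are hypothetical, and the sketch assumes that only direct hypernyms are collected, which matches the node labels used in the example ontology but is not stated on the slide.

```python
# Hypothetical synsets keyed by made-up sense identifiers.
SYNSET = {
    "star#1": ("star",),
    "star#4": ("star",),
    "celestial_body#1": ("celestial body", "heavenly body"),
    "plane_figure#1": ("plane figure", "2-dim. figure"),
}

# Assumed direct-hypernym relation between senses.
HYPERNYMS_OF = {
    "star#1": ["celestial_body#1"],
    "star#4": ["plane_figure#1"],
}


def category(s):
    """category(s): the set of synsets that are (direct) hypernyms of synset(s)."""
    return {SYNSET[parent] for parent in HYPERNYMS_OF.get(s, [])}


def node(s):
    """An ontology node pairs a synset with its category to make it unambiguous."""
    return (SYNSET[s], frozenset(category(s)))


print(node("star#1"))
print(node("star#4"))
```

Both nodes share the synset `("star",)`, yet they compare as different because their categories differ.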




Example Ontology

[Figure: ontology graph. Each node is written as synset [category]:
  • entity, physical thing [entity, physical thing]
  • group, grouping [group, grouping]
  • abstraction [abstraction]
  • food [substance, matter]
  • universe, cosmos [collection,...]
  • star [plane figure, 2-dim figure]
  • milk [foodstuff, ...]
  • natural object [object,...]
  • galaxy, ... [collection,...]
  • cows' milk [milk]
  • star [celestial body,...]
  • hexagram [star]
  • milky way [galaxy,...]
  • Beta Centauri [star]
  • sun [star]
Edge weights shown in the figure: 0.71, 0.83, 0.94]


Graph-based Ontology

  • Ontology $G = (V, E)$
      $x = (synset(s), category(s)) \in V$
      $e = (x, y, type, weight) \in E$

  • Construction:
      • word:
          ... extracted from a document
          ... extracted from an existing thesaurus
          (interchangeable!)
      • category, type:
      • weight:
          ... expresses semantic similarity of connected words

  • Use:
      • sim:
          ... expresses semantic similarity of ontology nodes
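The graph definition above maps naturally onto a small data structure. The following is a sketch under assumptions: the class names, the `neighbors` helper, and the edge type labels `"related"` and `"hyponym"` are hypothetical; only the tuple shapes (synset, category) and (x, y, type, weight) come from the slide, and the two weights are the ones shown on the edge-weight slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    synset: tuple      # words sharing one sense
    category: tuple    # words of the hypernym synset


@dataclass(frozen=True)
class Edge:
    x: Node
    y: Node
    type: str          # relationship label; the label values are assumed
    weight: float      # semantic similarity of the connected words


@dataclass
class Ontology:
    V: set = field(default_factory=set)
    E: set = field(default_factory=set)

    def add_edge(self, x, y, edge_type, weight):
        """Insert both end nodes and the weighted, typed edge between them."""
        self.V.update((x, y))
        self.E.add(Edge(x, y, edge_type, weight))

    def neighbors(self, x):
        """Yield (node, weight) pairs adjacent to x, ignoring edge direction."""
        for e in self.E:
            if e.x == x:
                yield e.y, e.weight
            elif e.y == x:
                yield e.x, e.weight


galaxy = Node(("galaxy", "extragalactic nebula"), ("collection", "aggregation"))
star = Node(("star",), ("celestial body", "heavenly body"))
sun = Node(("sun",), ("star",))

g = Ontology()
g.add_edge(galaxy, star, "related", 0.172)
g.add_edge(star, sun, "hyponym", 0.113)
print(dict(g.neighbors(star)))
```

Frozen dataclasses are used so that nodes and edges are hashable and can live in the sets V and E.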


Quantification: Edge Weight

semantic similarity of connected synsets according to their concepts

  • vector space measures / probabilistic measures
  • DICE coefficient: …using web search engines for word frequencies…

Example:

  galaxy, extragalactic nebula [collection, aggregation, accumulation, assemblage]
    | [0.172]
  star [celestial body, heavenly body]
    | [0.113]
  sun [star]

  $X := (coll \lor \dots \lor ass) \land (galaxy \lor extr\dots)$
  $Y := (cel \lor heav) \land (star)$
  $X \cap Y := X \land Y$
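As a rough illustration of the edge-weight idea, the sketch below builds the Boolean queries X and Y and applies the standard Dice coefficient, 2·|X ∩ Y| / (|X| + |Y|), to hit counts. The slide does not show the exact formula or any counts, so the formula choice, the `OR`/`AND` query syntax, and the hit counts are all assumptions; the counts were picked only so that the result matches the weight 0.172 from the example.

```python
def boolean_query(category_words, synset_words):
    """Build (c1 OR c2 ...) AND (w1 OR w2 ...) for a node (synset, category)."""
    cat = " OR ".join(f'"{w}"' for w in category_words)
    syn = " OR ".join(f'"{w}"' for w in synset_words)
    return f"({cat}) AND ({syn})"


def dice(freq_x, freq_y, freq_xy):
    """Standard Dice coefficient from the frequencies of X, Y, and X AND Y."""
    if freq_x + freq_y == 0:
        return 0.0
    return 2.0 * freq_xy / (freq_x + freq_y)


x = boolean_query(
    ["collection", "aggregation", "accumulation", "assemblage"],
    ["galaxy", "extragalactic nebula"],
)
y = boolean_query(["celestial body", "heavenly body"], ["star"])
xy = f"({x}) AND ({y})"

# Made-up hit counts standing in for search engine results.
hits = {x: 1000, y: 3000, xy: 344}
weight = dice(hits[x], hits[y], hits[xy])
print(y)
print(round(weight, 3))
```

In a real setting the `hits` dictionary would be filled by querying a search engine, which is outside the scope of this sketch.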






Similarity of Ontology Nodes

[Figure: ontology graph with weighted edges. Nodes, written as synset [category]: entity [entity], group [group], protein [macromolecule], universe [collection], milk [liquid], natural object [object], galaxy [collection], star [celestial body], cows' milk [milk], milky way [galaxy], Beta Centauri [star], sun [star]. Edge weights shown: 0.1, 0.1, 0.3, 0.6, 0.2, 0.5, 0.6, 0.8]

sim(milky way, sun), path length $|p| = 3$:

$\tfrac{3}{3} \cdot 0.6 + \tfrac{2}{3} \cdot 0.5 + \tfrac{1}{3} \cdot 0.8 = 1.2$
$\tfrac{3}{3} \cdot 0.8 + \tfrac{2}{3} \cdot 0.5 + \tfrac{1}{3} \cdot 0.6 \approx 1.3$

sim(milky way, sun) = 0.42

sim(milky way, cows' milk) = 0.2
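The two weighted sums for sim(milky way, sun) can be reproduced with a few lines of Python. The position weighting (n − i)/n is read directly off the sums on the slide. How the two directions are combined is not shown there; the sketch assumes they are averaged and divided by the path length, which happens to yield both 0.42 and 0.2, but this normalization is a guess and not a confirmed definition.

```python
def directed_sum(weights):
    """Sum of (n - i)/n * w_i over a path with edge weights w_0 .. w_(n-1)."""
    n = len(weights)
    return sum((n - i) / n * w for i, w in enumerate(weights))


def path_similarity(weights):
    """Assumed combination: average both directions, divide by path length."""
    n = len(weights)
    forward = directed_sum(weights)
    backward = directed_sum(weights[::-1])
    return (forward + backward) / (2 * n)


# Edge weights in the order used by the first sum for sim(milky way, sun).
milky_way_to_sun = [0.6, 0.5, 0.8]

print(round(directed_sum(milky_way_to_sun), 2))        # 1.2
print(round(directed_sum(milky_way_to_sun[::-1]), 2))  # 1.33
print(round(path_similarity(milky_way_to_sun), 2))     # 0.42
print(round(path_similarity([0.2]), 2))                # 0.2
```

Edges close to the start node count fully and later edges count less, so the two directions of the same path generally give different sums.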


Ontology-based Query Processing

XXL Query:

  ... WHERE #.~universe AS U
  AND U.#.~appearance AS A
  AND U.#.S ~ "star"

XML Documents:

  <galaxy>
    <object>
      <description>sun</description>
      <appearance>…light and heat…</appearance>
      <location>…</location>
    </object>
    <history> … </history>
  </galaxy>

XXL Query Representation:

[Figure: query graph. Node ~universe with two wildcard (%) edges, one leading to ~appearance and one to ~ "star"]
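To make the matching target concrete, the sample document above can be flattened into element paths with their text content, which is roughly the information a path query is evaluated against. This sketch uses Python's standard `xml.etree.ElementTree` and is not a description of how XXL indexes documents; the helper name `element_paths` is hypothetical and the elided "…" passages are kept as they appear on the slide.

```python
import xml.etree.ElementTree as ET

# The sample document from the slide, with closing tags written out.
DOC = """<galaxy>
  <object>
    <description>sun</description>
    <appearance>…light and heat…</appearance>
    <location>…</location>
  </object>
  <history>…</history>
</galaxy>"""


def element_paths(elem, prefix=""):
    """Yield (path, text) for an element and all of its descendants."""
    path = f"{prefix}/{elem.tag}"
    yield path, (elem.text or "").strip()
    for child in elem:
        yield from element_paths(child, path)


paths = dict(element_paths(ET.fromstring(DOC)))
for path, text in paths.items():
    print(path, repr(text))
```

The element names along each path are what `~universe` and `~appearance` are compared with, and the text content is what `~ "star"` is compared with.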




Ontology-based Query Processing

XXL Query:

  ... WHERE #.~universe AS U
  AND U.#.~appearance AS A
  AND U.#.S ~ "star"

XXL Query Representation:

[Figure: query graph. Node ~universe with two wildcard (%) edges, one leading to ~appearance and one to ~ "star"]

XML Data Graph:

  galaxy
    object
      description: sun
      location
      appearance: "…light and heat…"
    history

Local scores shown in the figure:

  sim(universe, galaxy) = 0.94
  wildcard (%) steps = 1.0 each
  sim(app, app) = 1.0
  sim(star, sun) * tfidf(sun) = 0.43

(result graph) = 0.4
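The score of the result graph is consistent with multiplying the local scores along the match: 0.94 · 1.0 · 1.0 · 1.0 · 0.43 ≈ 0.4. The slide does not state the combination rule, so the product used below is an assumption inferred from the numbers, and the dictionary keys are merely descriptive labels.

```python
import math

# Local scores as read off the figure; the labels are descriptive only.
local_scores = {
    "sim(universe, galaxy)": 0.94,
    "wildcard step 1": 1.0,
    "wildcard step 2": 1.0,
    "sim(appearance, appearance)": 1.0,
    "sim(star, sun) * tfidf(sun)": 0.43,
}

# Assumed combination rule: multiply all local scores.
result_score = math.prod(local_scores.values())
print(round(result_score, 4))  # 0.4042
print(round(result_score, 1))  # 0.4
```

With a product, any single weak match lowers the score of the whole result graph, which gives a natural ranking among alternative matches.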


- END -

Thank you very much!

Are there any questions?

