
Anja Theobald University of the Saarland, Germany theobald@cs.uni-sb.de http://www-dbs.cs.uni-sb.de/

An Ontology for Domain-oriented Semantic Similarity Search on XML Data. (BTW) February 25 – 28, 2003, Leipzig, Germany.


Presentation Transcript


  1. An Ontology for Domain-oriented Semantic Similarity Search on XML Data
     Anja Theobald, University of the Saarland, Germany
     theobald@cs.uni-sb.de, http://www-dbs.cs.uni-sb.de/
     (BTW) February 25 – 28, 2003, Leipzig, Germany

  2. Motivation
     Query on Web data (example topics: movie, astronomy, sports):
     - Ranking based on content data and structure (XML, …)
     - Using ontologies for similarity search
     - Grouping results by their topics

  3. Outline
     0. Why do we need ranked retrieval and ontologies?
     1. XXL Search Engine
     2. Ontologies - a Linguistic Challenge
     3. Graph-based Ontology
     4. Quantification: Edge Weights
     5. Similarity of Ontology Nodes
     6. Ontology-based Query Processing

  4. XXL Search Engine
     [Architecture diagram; component labels: WWW, Crawler, Path Indexer, Content Indexer, Name Ontology Indexer, Content Ontology Indexer, EPI Handler, ECI Handler, Name Ontology Handler, Content Ontology Handler, EPI, ECI, NOI, COI, Query Processor, Visual XXL]
     XML Document:
       <galaxy>
         <object>
           <description>sun</>
           <appearance>…light and heat…</>
           <location>…</>
           …
         </object>
         <history> … </>
         …
       </galaxy>
     XXL Query:
       SELECT * FROM INDEX
       WHERE #.~universe AS U
       AND U.#.~appearance AS A
       AND U.#.S ~ "star"

  5. Ontologies – a linguistic challenge
     [Diagram: triangle linking word ("star"), sense ("...a celestial body of hot gases...") and object; edge labels: "symbolized", "refers to", "stands for"]
     - ontology: ...representational vocabulary of words including hierarchical relationships and associative relationships between these words [Gruber93]...

  6. Word – Sense – Synset
     words: w ∈ Σ*
     + word senses ⇒ U = {(w,s) | w ∈ Σ*, s ∈ S: word w has sense s}
     + synonym relationship ⇒ synset(s) = {w | (w,s) ∈ U}

  7. Disambiguation: Synset – Category
     synset(s) = {w | (w,s) ∈ U}   // U = {(w,s) | word w has sense s}
     + hypernym relationship ⇒ category(s) = {synset(s') | synset(s') is hypernym of synset(s)}
     Example, synset(s): star
     - sense 1: (astronomy) a celestial body of hot gases…
       hypernyms: star → celestial body, heavenly body → natural object → object, physical object → entity, physical thing
     - sense 4: a plane figure with 5 or more points…
       hypernyms: star → plane figure, 2-dim. figure → figure → shape, form → attribute → abstraction

  8. Disambiguation: Synset – Category
     synset(s) = {w | (w,s) ∈ U}   // U = {(w,s) | word w has sense s}
     + hypernym relationship ⇒ category(s) = {synset(s') | synset(s') is hypernym of synset(s)}
     Example, synset(s): star
     - sense 1: (astronomy) a celestial body of hot gases…
       hypernyms: star → celestial body, heavenly body → natural object → object, physical object → entity, physical thing
     - sense 4: a plane figure with 5 or more points…
       hypernyms: star → plane figure, 2-dim. figure → figure → shape, form → attribute → abstraction
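The synset and category definitions on the two slides above can be sketched in a few lines of Python. This is only an illustration: the sense identifiers (`star#1`, `star#4`, ...) and the `HYPERNYM` table are hypothetical toy data, and `category` here returns just the synset of the direct hypernym, which is one possible reading of the slide's definition.

```python
# Hypothetical toy data: U holds (word, sense) pairs; sense ids are made up.
U = {
    ("star", "star#1"),
    ("celestial body", "celestial_body#1"),
    ("heavenly body", "celestial_body#1"),
    ("star", "star#4"),
    ("plane figure", "plane_figure#1"),
    ("2-dim. figure", "plane_figure#1"),
}

# Hypothetical hypernym links between senses: child sense -> parent sense.
HYPERNYM = {
    "star#1": "celestial_body#1",
    "star#4": "plane_figure#1",
}


def synset(s):
    """synset(s) = { w | (w, s) in U }: all words sharing sense s."""
    return frozenset(w for (w, sense) in U if sense == s)


def category(s):
    """Synset of the direct hypernym of s (assumption: direct parent only)."""
    parent = HYPERNYM.get(s)
    return synset(parent) if parent is not None else frozenset()


# The word "star" is disambiguated by the category attached to each sense.
astronomy_star = (synset("star#1"), category("star#1"))
geometry_star = (synset("star#4"), category("star#4"))
```

Both senses share the synset {star}; only the category tells them apart, which is why the ontology nodes on the later slides are written as synset [category].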

  9. Example Ontology
     [Ontology graph; nodes written as synset [category]:]
     - entity, physical thing [entity, physical thing]
     - group, grouping [group, grouping]
     - abstraction [abstraction]
     - food [substance, matter]
     - universe, cosmos [collection,...]
     - star [plane figure, 2-dim figure]
     - milk [foodstuff, ...]
     - natural object [object,...]
     - galaxy, ... [collection,...]
     - cows' milk [milk]
     - star [celestial body,...]
     - hexagram [star]
     - milky way [galaxy,...]
     - Beta Centauri [star]
     - sun [star]
     Edge weights shown in the graph: [0.71], [0.83], [0.94]

  10. Graph-based Ontology
     Ontology G = (V, E)
       x = (synset(s), category(s)) ∈ V
       e = (x, y, type, weight) ∈ E
     Construction:
     - word: ... extracted from a document
     - category, type: ... extracted from an existing thesaurus (interchangeable!)
     - weight: ... expresses semantic similarity of connected words
     Use:
     - sim: ... expresses semantic similarity of ontology nodes
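One way to hold the graph G = (V, E) from this slide in memory is sketched below. The class names, the field name `kind` (standing in for the slide's "type") and the edge label "hyponym" are assumptions made for illustration; the weight 0.113 for the star/sun edge is taken from the edge-weight slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """Ontology node x = (synset(s), category(s))."""
    synset: frozenset
    category: frozenset


@dataclass(frozen=True)
class Edge:
    """Ontology edge e = (x, y, type, weight); 'kind' plays the role of 'type'."""
    x: Node
    y: Node
    kind: str
    weight: float


@dataclass
class Ontology:
    """G = (V, E), stored as a node set and an edge list."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, x, y, kind, weight):
        # Adding an edge also registers both endpoints as nodes.
        self.nodes.update((x, y))
        self.edges.append(Edge(x, y, kind, weight))


star = Node(frozenset({"star"}), frozenset({"celestial body", "heavenly body"}))
sun = Node(frozenset({"sun"}), frozenset({"star"}))

g = Ontology()
g.add_edge(star, sun, "hyponym", 0.113)  # edge label is a hypothetical name
```

Frozen dataclasses with frozenset fields are hashable, so nodes can be kept in a set and two nodes with the same synset but different categories stay distinct.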

  11. Quantification: Edge Weight
     - semantic similarity of connected synsets according to their concepts
     - vector space measures / probabilistic measures
     - DICE coefficient: …using web search engines for word frequencies…
     Example:
       galaxy, extragalactic nebula [collection, aggregation, accumulation, assemblage]
         X := (coll ∨ … ∨ ass) ∧ (galaxy ∨ extr…)
       star [celestial body, heavenly body]
         Y := (cel ∨ heav) ∧ (star)
       X ∩ Y := X ∧ Y
       Edge weights: galaxy to star [0.172], star to sun [star] [0.113]
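The slide names the Dice coefficient but does not spell it out. A minimal sketch using the standard definition, 2·|X ∩ Y| / (|X| + |Y|), is given below; the three frequency arguments stand for hit counts of the queries X, Y and X ∧ Y, and the numbers in the example are hypothetical, not values from the talk.

```python
def dice(freq_x, freq_y, freq_xy):
    """Standard Dice coefficient from three frequencies.

    freq_x  -- number of documents matching query X
    freq_y  -- number of documents matching query Y
    freq_xy -- number of documents matching both (X AND Y)
    """
    total = freq_x + freq_y
    if total == 0:
        # Neither term occurs anywhere: treat the similarity as zero.
        return 0.0
    return 2.0 * freq_xy / total


# Hypothetical hit counts, for illustration only.
weight = dice(freq_x=1000, freq_y=500, freq_xy=150)
```

The result lies between 0 (the terms never co-occur) and 1 (they always co-occur), which makes it usable directly as an edge weight.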

  12. Similarity of Ontology Nodes
     [Ontology graph; nodes written as synset [category]: entity [entity], group [group], protein [macromolecule], universe [collection], milk [liquid], natural object [object], galaxy [collection], star [celestial body], cows' milk [milk], milky way [galaxy], Beta Centauri [star], sun [star]; edge weights shown: 0.1, 0.1, 0.3, 0.6, 0.2, 0.5, 0.6, 0.8]
     sim(milky way, sun), path length |p| = 3:
       3/3 · 0.6 + 2/3 · 0.5 + 1/3 · 0.8 = 1.2

  13. Similarity of Ontology Nodes
     [Same ontology graph as on slide 12]
     sim(milky way, sun), path length |p| = 3:
       3/3 · 0.6 + 2/3 · 0.5 + 1/3 · 0.8 = 1.2
       3/3 · 0.8 + 2/3 · 0.5 + 1/3 · 0.6 = 1.3

  14. Similarity of Ontology Nodes
     [Same ontology graph as on slide 12]
     sim(milky way, sun), path length |p| = 3:
       3/3 · 0.6 + 2/3 · 0.5 + 1/3 · 0.8 = 1.2
       3/3 · 0.8 + 2/3 · 0.5 + 1/3 · 0.6 = 1.3
     sim(milky way, sun) = 0.42
     sim(milky way, cows' milk) = 0.2
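The two weighted sums for the path between milky way and sun (1.2 and 1.3) can be reproduced with the short sketch below. It assumes, based only on the arithmetic shown, that the i-th edge of a path of length n is weighted by (n - i + 1)/n and that the path is traversed once from each end. The slides do not say how these sums are turned into the final value 0.42, so that step is deliberately left out.

```python
def weighted_path_sum(weights):
    """Sum edge weights with linearly decreasing position factors.

    For a path of n edges, the first edge counts n/n, the second (n-1)/n,
    and so on down to 1/n for the last edge.
    """
    n = len(weights)
    return sum((n - i) / n * w for i, w in enumerate(weights))


# Edge weights along the path, read off the formulas on the slide.
path = [0.6, 0.5, 0.8]

one_direction = weighted_path_sum(path)          # 3/3*0.6 + 2/3*0.5 + 1/3*0.8
other_direction = weighted_path_sum(path[::-1])  # 3/3*0.8 + 2/3*0.5 + 1/3*0.6
```

The second sum is 1.33 before rounding, which the slide shows as 1.3.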

  15. Ontology-based Query Processing
     XXL Query:
       ... WHERE #.~universe AS U
       AND U.#.~appearance AS A
       AND U.#.S ~ "star"
     XXL Query Representation:
       [Query graph: ~universe, connected via % to ~appearance and via % to ~ "star"]
     XML Documents:
       …
       <galaxy>
         <object>
           <description>sun</>
           <appearance>…light and heat…</appearance>
           <location>…</>
           …
         </object>
         <history> … </>
         …
       </galaxy>
       …

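  16. Ontology-based Query Processing
     XXL Query:
       ... WHERE #.~universe AS U
       AND U.#.~appearance AS A
       AND U.#.S ~ "star"
     XXL Query Representation:
       [Query graph: ~universe, connected via % to ~appearance and via % to ~ "star"]
     XML Data Graph:
       [Data graph: galaxy with children object and history; object with children description, appearance and location; description contains "sun"; appearance contains "…light and heat…"]
     Scores shown in the graph:
     - galaxy: 0.94 = sim(universe, galaxy)
     - appearance: 1.0 = sim(app, app)
     - sun: 0.43 = sim(star, sun) * tfidf(sun)
     - remaining labels on the matched paths: 1.0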

  17. Ontology-based Query Processing
     [Same query, query representation and XML data graph as on slide 16]
     Scores shown in the graph:
     - galaxy: 0.94 = sim(universe, galaxy)
     - appearance: 1.0 = sim(app, app)
     - sun: 0.43 = sim(star, sun) * tfidf(sun)
     - remaining labels on the matched paths: 1.0
     (result graph) = 0.4
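The slide gives the local scores and the overall value 0.4 for the result graph, but not the rule that combines them. The sketch below assumes the local scores are simply multiplied; that assumption is consistent with the numbers shown (0.94 · 1.0 · 1.0 · 1.0 · 0.43 ≈ 0.4) but is not confirmed by the slides.

```python
from math import prod

# Local scores read off the slide; the pairing of the 1.0 values with
# specific nodes or edges is not recoverable and does not affect the product.
local_scores = [
    0.94,  # sim(universe, galaxy)
    1.0,
    1.0,   # sim(app, app)
    1.0,
    0.43,  # sim(star, sun) * tfidf(sun)
]

# Assumption: the result graph's score is the product of its local scores.
result_score = prod(local_scores)
```

Under this reading, a single weak match anywhere in the result graph pulls the whole score down, which fits a ranking that rewards results matching every query condition well.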

  18. - END - Thank you very much! Are there any questions?
