
Design and Evaluation of Semantic Similarity Measures for Concepts Stemming from the Same or Different Ontologies

Euripides G.M. Petrakis

Giannis Varelas

Angelos Hliaoutakis

Paraskevi Raftopoulou

WMS'06, Chania, Crete

Semantic Similarity
  • Relates to computing the conceptual similarity between terms which are not necessarily lexically similar
    • “car”-“automobile”-“vehicle”,
    • “drug”- “medicine”
  • A tool for making knowledge commonly understandable in applications such as information retrieval (IR) and information communication in general

Methodology
  • Terms from different communicating sources are represented by ontologies
  • Map two terms to an ontology and compute their relationship in that ontology
  • Terms from different ontologies: Discover linguistic relationships or affinities between terms in different ontologies

Contributions
  • We investigate several Semantic Similarity Methods and we evaluate their performance
    • http://www.intelligence.tuc.gr/similarity
  • We propose a novel semantic similarity measure for comparing concepts from different ontologies

Ontologies
  • Tools of information representation on a subject
  • Hierarchical categorization of terms from general to most specific terms
    • object → artifact → construction → stadium
  • Domain Ontologies representing knowledge of a domain
    • e.g., MeSH medical ontology
  • General Ontologies representing common sense knowledge about the world
    • e.g., WordNet

WordNet
  • A vocabulary and a thesaurus offering a hierarchical categorization of natural language terms
    • More than 100,000 terms
  • Nouns, verbs, adjectives and adverbs are grouped into synonym sets (synsets)
  • Synsets represent terms or concepts with similar meaning
    • stadium, bowl, arena, sports stadium – (a large structure for open-air sports or entertainments)

WordNet Hierarchies
  • The synsets are also organized into senses
    • Senses: Different meanings of the same term
  • The synsets are related to other synsets higher or lower in the hierarchy by different types of relationships e.g.
    • Hyponym/Hypernym (Is-A relationships)
    • Meronym/Holonym (Part-Of relationships)
  • Nine noun and several verb Is-A hierarchies

MeSH
  • MeSH: ontology for medical and biological terms by the U.S. National Library of Medicine (NLM)
  • Organized in IS-A hierarchies
    • More than 15 taxonomies, more than 22,000 terms
  • No part-of relationships
  • The terms are organized into synsets called “entry terms”

Semantic Similarity Methods
  • Map terms to an ontology and compute their relationship in that ontology
  • Four main categories of methods:
    • Edge counting: path length between terms
    • Information content: as a function of their probability of occurrence in a corpus
    • Feature based: similarity between their properties (e.g., definitions) or based on their relationships to other similar terms
    • Hybrid: combine the above ideas
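
As a rough illustration of the edge counting idea, the sketch below counts IS-A edges between two terms through their lowest common ancestor. The taxonomy is a small hypothetical fragment written for this example, not real WordNet or MeSH data, and `PARENT`, `ancestors` and `path_length` are illustrative names.

```python
# Toy IS-A taxonomy (child -> parent). Hypothetical data, not taken from WordNet.
PARENT = {
    "artifact": "object",
    "construction": "artifact",
    "stadium": "construction",
    "instrumentality": "artifact",
    "conveyance": "instrumentality",
    "vehicle": "conveyance",
}


def ancestors(term):
    """Return the chain [term, parent, ..., root] following IS-A links."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_length(a, b):
    """Count the IS-A edges between a and b via their lowest common ancestor."""
    steps_from_a = {node: i for i, node in enumerate(ancestors(a))}
    for steps_from_b, node in enumerate(ancestors(b)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError("terms share no common ancestor")


print(path_length("stadium", "vehicle"))
```

A shorter path means the terms are treated as more similar; how the path length is turned into a similarity score differs from method to method.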

Example
  • Edge counting distance between “conveyance” and “ceramic” is 2
  • An information content method would associate the two terms with their common subsumer and with their probabilities of occurrence in a corpus
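
A minimal sketch of an information content measure follows, using a Resnik-style score (the information content of the common subsumer) as one possible instance of this category. The slides do not say which formula the example uses, and both the taxonomy and the probabilities in `PROB` are invented for illustration.

```python
import math

# Toy IS-A taxonomy (child -> parent). Hypothetical data.
PARENT = {
    "conveyance": "artifact",
    "ceramic": "artifact",
    "vehicle": "conveyance",
    "car": "vehicle",
    "bicycle": "vehicle",
}

# Assumed probability of meeting each concept (or one of its descendants) in a corpus.
PROB = {
    "artifact": 1.0,
    "conveyance": 0.4,
    "ceramic": 0.1,
    "vehicle": 0.3,
    "car": 0.2,
    "bicycle": 0.05,
}


def ancestors(term):
    """Return the chain [term, parent, ..., root] following IS-A links."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def common_subsumer(a, b):
    """Return the most specific concept that subsumes both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("terms share no common subsumer")


def information_content(term):
    """Rare concepts carry more information: IC = -log p(term)."""
    return -math.log(PROB[term])


def resnik_similarity(a, b):
    """Resnik-style similarity: information content of the common subsumer."""
    return information_content(common_subsumer(a, b))


print(common_subsumer("car", "bicycle"), resnik_similarity("car", "bicycle"))
```

Terms whose only common subsumer is the root get a score of zero, because the root has probability 1 and therefore carries no information.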

X-Similarity
  • Relies on matching between synsets and term description sets
  • A,B: synsets or term description sets
  • Do the same with all IS-A, Part-Of relationships and take their maximum
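
The matching formula from this slide did not survive in the transcript. The sketch below assumes that matching between two sets A and B is measured as set overlap, |A ∩ B| / |A ∪ B|, and that the per-relationship scores are combined by taking their maximum, as the last bullet describes. The function names and the sample data are hypothetical.

```python
def set_overlap(a, b):
    """Assumed matching score: |A intersect B| / |A union B| (0.0 for two empty sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def max_relationship_overlap(related_a, related_b):
    """Compare the terms related to each concept, one relationship type at a time.

    related_a, related_b: dicts mapping a relationship type (e.g. "is-a",
    "part-of") to the set of terms reached through that relationship.
    Returns the maximum overlap over the relationship types both concepts have.
    """
    shared_types = related_a.keys() & related_b.keys()
    return max(
        (set_overlap(related_a[r], related_b[r]) for r in shared_types),
        default=0.0,
    )


# Invented example data for two concepts.
a = {"is-a": {"x", "y"}, "part-of": {"p"}}
b = {"is-a": {"y", "z"}, "part-of": {"p"}}
print(max_relationship_overlap(a, b))
```

The same `set_overlap` function can be applied directly to two synsets or to two term description sets.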

Example
  • S(Hypothyroidism, Hyperthyroidism) = 0.387

Evaluation
  • The most popular methods are evaluated
  • All methods are applied to a set of 38 term pairs
  • Their similarity values are correlated with scores obtained by humans
  • The higher a method's correlation with the human scores, the better the method
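
The evaluation step can be sketched as follows, assuming Pearson's correlation coefficient (the slides do not name the coefficient that was used). The scores in the example are made up and are not the 38 term pairs of the experiment.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (spread_x * spread_y)


# Invented scores for four term pairs: one similarity method vs. human judges.
method_scores = [0.9, 0.4, 0.7, 0.1]
human_scores = [3.8, 1.5, 3.0, 0.4]
print(pearson(method_scores, human_scores))
```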

Evaluation on WordNet

Evaluation on MeSH

Cross Ontology Measures
  • We used 40 MeSH term pairs
  • One of the terms in each pair is also a WordNet term
  • We measured correlation with scores obtained by experts

Comments
  • Edge counting/Info. Content methods work by exploiting structure information
  • Good methods take the position of the terms into account
    • Higher similarity for terms which are close together but lower in the hierarchy, e.g., [Li et al. 2003]
  • X-Similarity performs at least as well as other Feature-Based methods
  • X-Similarity also outperforms other Cross-Ontology methods
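
To illustrate how a measure can take term position into account, the sketch below assumes the commonly cited form of the [Li et al. 2003] measure, exp(-α·l) · tanh(β·h), where l is the path length between the terms and h is the depth of their common subsumer. The formula and the default values of `alpha` and `beta` are assumptions here and are not taken from the slides.

```python
import math


def li_similarity(path_length, subsumer_depth, alpha=0.2, beta=0.6):
    """Assumed Li et al. style score: exp(-alpha * l) * tanh(beta * h).

    path_length (l): number of edges between the two terms.
    subsumer_depth (h): depth of their common subsumer in the hierarchy.
    alpha, beta: tuning parameters; the defaults are assumptions.
    """
    if path_length < 0 or subsumer_depth < 0:
        raise ValueError("path length and depth must be non-negative")
    return math.exp(-alpha * path_length) * math.tanh(beta * subsumer_depth)


# Same path length, but the second pair sits deeper in the hierarchy.
print(li_similarity(2, 1), li_similarity(2, 5))
```

With the path length held fixed, the score grows as the common subsumer gets deeper, which matches the observation that terms lower in the hierarchy are more similar than equally distant terms near the root.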

Conclusions
  • Semantic similarity methods approximate the human notion of similarity, reaching correlations of up to 83%
  • Cross ontology similarity is a difficult problem that requires further investigation
  • Work towards integrating semantic similarity within the IntelliSearch information retrieval system for Web documents
    • http://www.intelligence.tuc.gr/intellisearch

Try our system on the Web

http://www.intelligence.tuc.gr/similarity

Implementation:

Giannis Varelas

Spyros Argyropoulos
