
Large Scale Integration of Senses for the Semantic Web

Jorge Gracia, Mathieu d’Aquin, Eduardo Mena. Computer Science and Systems Engineering Department (DIIS), University of Zaragoza, Spain; Knowledge Media Institute (KMi), Open University, United Kingdom.


Presentation Transcript


  1. Large Scale Integration of Senses for the Semantic Web
     Jorge Gracia, Mathieu d’Aquin, Eduardo Mena
     Computer Science and Systems Engineering Department (DIIS), University of Zaragoza, Spain
     Knowledge Media Institute (KMi), Open University, United Kingdom
     18th International World Wide Web Conference, Madrid, Spain, 20th-24th April 2009

  2. Outline
     • Introduction
     • Method
     • Optimization study
     • Experiments
     • Conclusions
     WWW 2009

  3. Introduction
     • Current Semantic Web
       • Favoured by the growing number of ontologies already available on the Web
       • Hampered by the high heterogeneity that this growing semantic content introduces
     • The redundancy problem
       • Too many different semantic descriptions, coming from different sources, describe the same intended meaning
     • Our proposal
       • A method to cluster the ontology terms found on the Semantic Web according to the meaning they are intended to represent

  4. Introduction

  5. Introduction

  6. Introduction
     • Redundancy problem: many representations of the same meaning
     [Figure labels: apple; Watson; The Semantic Web]

  7. Introduction
     • Proposed solution: pool of cross-ontology integrated senses
     [Figure labels: apple, “clustered” into The Tree, The Fruit, The Company; Watson; The Semantic Web]

  8. Introduction
     [Figure labels: Question Answering; Scarlet; Ontology Matching; Folksonomy Enrichment; Watson; Semantic Browsing; QueryGen; Semantic Query Generation; Multiontology Semantic Disambiguator; Ontology Evolution; The Semantic Web]

  9. Method
     [Flowchart. OFF-LINE: Extraction (ontology terms from Watson, grouped into keyword maps); Synonym expansion (keyword maps become synonym maps); Sense clustering for each synonym map (similarity computation with CIDER; integration when similarity > threshold; repeated while more ontology terms remain), producing the senses. RUN-TIME: when the integration degree is to be modified, a rising threshold leads to Disintegration and otherwise to Integration, followed by Senses Clustering.]

  10. Method
      • Keyword maps: ontology terms with identical labels
      [Figure labels: apple (repeated); Watson]

  11. Method
      • Synonym maps: ontology terms with synonymous labels
      [Figure labels: apple (repeated); apple tree; Apple Inc.; manzana; Watson]
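The keyword-map and synonym-map steps on the two slides above can be sketched roughly as follows. This is only an illustration: the function names, the example URIs, the hand-written synonym table, and the choice to treat labels as identical after lowercasing are all assumptions, not details taken from the presentation.

```python
from collections import defaultdict


def build_keyword_maps(terms):
    """Group ontology terms by label (hypothetical helper).

    `terms` is a list of (term_uri, label) pairs. Labels are compared after
    trimming and lowercasing, which is an assumption of this sketch.
    """
    maps = defaultdict(list)
    for uri, label in terms:
        maps[label.strip().lower()].append(uri)
    return dict(maps)


def build_synonym_map(keyword, keyword_maps, synonyms):
    """Merge the keyword map of `keyword` with those of its synonyms."""
    labels = [keyword] + synonyms.get(keyword, [])
    merged = []
    for label in labels:
        merged.extend(keyword_maps.get(label.lower(), []))
    return merged


# Made-up terms and synonym table, for illustration only.
terms = [
    ("http://example.org/food#apple", "apple"),
    ("http://example.org/botany#Apple", "Apple"),
    ("http://example.org/es#manzana", "manzana"),
    ("http://example.org/botany#appleTree", "apple tree"),
]
synonyms = {"apple": ["manzana", "apple tree"]}

keyword_maps = build_keyword_maps(terms)
synonym_map = build_synonym_map("apple", keyword_maps, synonyms)
```

Here `keyword_maps["apple"]` holds the two terms labelled "apple", and `synonym_map` gathers all four terms as candidates for the later clustering step.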

  12. Method
      • Agglomerative clustering
      [Figure labels: terms a, b, c, d, e; merged terms a’, a’’; CIDER]
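A generic threshold-based agglomerative clustering can be sketched as below. It is not the authors' implementation: Jaccard overlap between hand-written feature sets stands in for the CIDER similarity measure, and the term names, feature sets, and merge rule (union of features) are assumptions made for the example.

```python
def jaccard(a, b):
    """Stand-in similarity (NOT CIDER): overlap of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_senses(terms, threshold, similarity=jaccard):
    """Greedy agglomerative clustering of ontology terms into senses.

    `terms` maps a term name to a set of descriptive features (hypothetical).
    The most similar pair of senses is merged repeatedly until no pair
    scores above `threshold`.
    """
    senses = [({name}, set(features)) for name, features in terms.items()]
    while len(senses) > 1:
        best, best_pair = threshold, None
        for i in range(len(senses)):
            for j in range(i + 1, len(senses)):
                score = similarity(senses[i][1], senses[j][1])
                if score > best:
                    best, best_pair = score, (i, j)
        if best_pair is None:
            break  # nothing left above the threshold
        i, j = best_pair
        merged = (senses[i][0] | senses[j][0], senses[i][1] | senses[j][1])
        senses = [s for k, s in enumerate(senses) if k not in (i, j)]
        senses.append(merged)
    return [members for members, _ in senses]


# Made-up descriptions of five "apple" terms.
terms = {
    "food#apple": {"fruit", "food", "edible"},
    "recipes#apple": {"fruit", "food", "ingredient"},
    "botany#apple": {"tree", "plant", "orchard"},
    "garden#apple_tree": {"tree", "plant", "garden"},
    "biz#apple": {"company", "computer"},
}
senses = cluster_senses(terms, threshold=0.3)
```

With these toy inputs a threshold of 0.3 groups the terms into three senses (fruit, tree, company), while a higher threshold such as 0.6 leaves all five terms separate, which mirrors the idea that raising the threshold disintegrates senses.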

  13. Method
      • Sense maps: semantically equivalent terms grouped together
      [Figure labels: apple (repeated); apple tree; Apple Inc.; manzana; CIDER; senses The Tree, The Fruit, The Company]

  14. Method
      [Figure labels: Falling threshold (Integration); Rising threshold (Disintegration); Optimal threshold]

  15. Optimization study
      • Integration level varies with the similarity threshold
        Integration Level = 1 - (# finalSenses / # initialOntologyTerms)
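The integration level formula is simple enough to check by hand. The sketch below uses a hypothetical function name and plugs in the counts from the "turkey" example given later in the deck (58 ontology terms, 9 integrated senses); the resulting value of roughly 0.84 is derived here and is not a figure reported on the slides.

```python
def integration_level(final_senses, initial_terms):
    """Integration Level = 1 - (# finalSenses / # initialOntologyTerms)."""
    if initial_terms <= 0:
        raise ValueError("initial_terms must be positive")
    return 1 - final_senses / initial_terms


# Counts taken from the "turkey" example: 58 terms integrated into 9 senses.
level = integration_level(final_senses=9, initial_terms=58)
print(f"{level:.2f}")
```

A level of 0 means no terms were merged (as many senses as initial terms), and values approaching 1 mean that almost all terms collapsed into very few senses.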

  16. Optimization study
      • Which similarity threshold is the best one?
      • Three ways to explore:
        • Experimenting with ontology matching benchmarks
          • Obtained a lower bound of 0.13 for the optimal threshold
        • Contrasting with human opinion
          • Range of good values between 0.2 and 0.3
        • Optimizing response time, because:
          • It reduces the response time of the overall system
          • It is compatible with the other two ways
          • It is not always feasible to have enough people to ask, or enough reference alignments

  17. Optimization study
      • Response time varies with the threshold
      • Optimal value around 0.22

  18. Experiments
      • Scalability study
        • 9156 keywords, 73169 different ontology terms to be clustered
        • Processing time is linear in the number of ontology terms

  19. Experiments
      • Scalability study
        • Processing time is independent of ontology size

  20. Experiments
      • Illustrative example
        • Keyword = turkey
        • Synonym map = turkey, Türkei, Türkiye
        • No. of ontology terms = 58
        • No. of integrated senses = 9 (threshold = 0.27)

  21. Experiments
      • More examples (threshold = 0.19)

  22. Experiments
      • Positive facts
        • Terms from different versions of the same ontology are easily detected
        • Very different meanings are not wrongly integrated (e.g., “plant” as “living organism” with “plant” as “industrial building”)
      • Negative facts
        • It is hard to fully integrate all terms with the same meaning (because their semantic descriptions can be very different)

  23. Conclusions
      • Conclusions
        • Redundancy of semantic descriptions on the Web can be significantly reduced
        • Our integration technique scales when used on a large body of knowledge
        • The proposed method is flexible enough to adapt the integration level to the needs of client applications
      • Future work
        • More advanced prototype
        • More extensive human-based evaluation
        • Study and evaluation of the impact on other systems

  24. END of presentation. Thank you!
