
Leveraging Data and Structure in Ontology Integration

Presentation Transcript


  1. Leveraging Data and Structure in Ontology Integration Octavian Udrea¹, Lise Getoor¹, Renée J. Miller² ¹University of Maryland, College Park ²University of Toronto

  2. Contents • Motivation and goals • Short overview of OWL Lite • The ILIADS method • Experimental evaluation

  3. ILIADS • Goal: • Produce high-quality integration via a flexible method able to adapt to a wide variety of ontology sizes and structures • Method: • Combining statistical and logical inference • Use schema (structure) and data (instances) effectively • Solution: • Integrated Learning In Alignment of Data and Schema (ILIADS)

  4. Contributions • Show how to combine statistical and logical inference effectively • Show that a small amount of inference yields a large gain in alignment quality • Show that the parameters needed to perform inference over data and structure are robust • Provide a thorough evaluation on 30 pairs of real-world ontologies (with ground truth)

  5. Contents • Motivation and goals • Short overview of OWL Lite • The ILIADS method • Experimental evaluation

  6. Example OWL Lite ontologies (discoveredBy, owl:inverseOf, discoverer); (discoveredBy, rdf:type, owl:FunctionalProperty) (discoveredBy, owl:inverseOf, discoverer); (associatedWith, rdf:type, owl:TransitiveProperty) (resultsFrom, rdfs:subPropertyOf, associatedWith)

  7. Example OWL Lite ontologies • An entity can be a: • Class (discoveredBy, owl:inverseOf, discoverer); (discoveredBy, rdf:type, owl:FunctionalProperty) (discoveredBy, owl:inverseOf, discoverer); (associatedWith, rdf:type, owl:TransitiveProperty) (resultsFrom, rdfs:subPropertyOf, associatedWith)

  8. Example OWL Lite ontologies • An entity can be a: • Class • Instance (discoveredBy, owl:inverseOf, discoverer); (discoveredBy, rdf:type, owl:FunctionalProperty) (discoveredBy, owl:inverseOf, discoverer); (associatedWith, rdf:type, owl:TransitiveProperty) (resultsFrom, rdfs:subPropertyOf, associatedWith)

  9. Example OWL Lite ontologies • An entity can be a: • Class • Instance • Property (discoveredBy, owl:inverseOf, discoverer); (discoveredBy, rdf:type, owl:FunctionalProperty) (discoveredBy, owl:inverseOf, discoverer); (associatedWith, rdf:type, owl:TransitiveProperty) (resultsFrom, rdfs:subPropertyOf, associatedWith)

  10. Example OWL Lite ontologies (discoveredBy, owl:inverseOf, discoverer) (discoveredBy, rdf:type, owl:FunctionalProperty) (discoveredBy, owl:inverseOf, discoverer) (associatedWith, rdf:type, owl:TransitiveProperty) (resultsFrom, rdfs:subPropertyOf, associatedWith)
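
The example axioms above can also be written down concretely as RDF triples. The following sketch (not part of the original slides) does so in Python with the rdflib library; the http://example.org/med# namespace is a hypothetical stand-in for the example ontologies' namespaces.

# Illustrative sketch only: the slide's example axioms as RDF triples built
# with rdflib. The namespace below is a hypothetical placeholder.
from rdflib import Graph, Namespace, OWL, RDF, RDFS

EX = Namespace("http://example.org/med#")
g = Graph()

# First ontology: discoveredBy is the inverse of discoverer and is functional.
g.add((EX.discoveredBy, OWL.inverseOf, EX.discoverer))
g.add((EX.discoveredBy, RDF.type, OWL.FunctionalProperty))

# Second ontology: associatedWith is transitive; resultsFrom specializes it.
g.add((EX.associatedWith, RDF.type, OWL.TransitiveProperty))
g.add((EX.resultsFrom, RDFS.subPropertyOf, EX.associatedWith))

print(g.serialize(format="turtle"))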

  11. Inference in OWL Lite

  12. Inference in OWL Lite

  13. Inference in OWL Lite

  14. The integration problem

  15. The integration problem

  16. The integration problem

  17. The integration problem

  18. Contents • Motivation and goals • Short overview of OWL Lite • The ILIADS method • Experimental evaluation

  19. State of the art • Robust statistical methods • Well-known similarity measures • Used for matching data (entities) and schema • May use graph structure of schema • Logical inference • Not combined with statistical inference • Basis for most schema mapping and ontology integration methods • Approaches integrate schema, but not data

  20. Issues • How to combine statistical inference with logical inference • Taking data, structure, etc. into account makes the combination no longer obvious • In particular, how to quantify the results of logical inference in a similarity-like form? • How to perform logical inference in a tractable manner • For OWL Lite, reasoning over the entire ontology is EXPTIME-complete in the worst case

  21. The ILIADS algorithm repeat until no more candidates • Compute local similarities • Select promising candidates • For each candidate • Select relationship • Perform logical inference • Update score with the inference similarity • Select the candidate with the best score end

  22. The ILIADS algorithm repeat until no more candidates • Compute local similarities • Select promising candidates • For each candidate • Select relationship • Perform logical inference • Update score with the inference similarity • Select the candidate with the best score end
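
As a rough guide to how the steps fit together, here is a minimal Python sketch of the loop above. Every helper function is a hypothetical placeholder for a step detailed on the following slides, not the authors' implementation; the greedy choice of the best-scoring candidate per iteration follows the listing.

# Minimal sketch of the ILIADS loop as listed on the slide. The helpers
# (local_similarities, promising_candidates, select_relationship,
# bounded_inference, update_score) are hypothetical placeholders for the
# steps described on the following slides, not the authors' code.
def iliads(onto1, onto2, lam_t, lam_r):
    alignment = []                                   # accepted alignment axioms
    while True:
        sims = local_similarities(onto1, onto2, alignment)   # step 1: local similarities
        candidates = promising_candidates(sims, lam_t)       # step 2: promising candidates
        if not candidates:
            break                                    # repeat until no more candidates
        scored = []
        for (e1, e2), _ in candidates:
            axiom = select_relationship(e1, e2, lam_r)       # e.g. equivalent vs. subclass
            consequences = bounded_inference(axiom, onto1, onto2, alignment)
            score = update_score(sims[e1, e2], consequences, sims)
            scored.append((score, axiom))
        best_score, best_axiom = max(scored, key=lambda s: s[0])
        alignment.append(best_axiom)                 # keep the best-scoring candidate
    return alignment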

  23. Computing local similarities • sim_lexical: Jaro-Winkler and WordNet • sim_structural: Jaccard similarity over graph neighborhoods • sim_extensional: Jaccard similarity over extensions (instance sets) • Parameters: λx, λs, λe • Different weights for classes, instances and properties (see the sketch below)
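
A plausible way such a weighted local similarity could be computed is sketched below; the exact combination function used by ILIADS is defined in the paper, and both the jellyfish package (for Jaro-Winkler) and the entity attributes used here (label, neighbors, extension) are assumptions for illustration.

# Plausible sketch of a weighted local similarity, not the paper's exact
# formula. The jellyfish package and the entity attributes (label,
# neighbors, extension) are assumptions for illustration.
import jellyfish

def jaccard(a, b):
    """Standard Jaccard overlap |A ∩ B| / |A ∪ B| between two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a or b else 0.0

def local_similarity(e1, e2, lam_x, lam_s, lam_e):
    sim_lexical = jellyfish.jaro_winkler_similarity(e1.label, e2.label)
    sim_structural = jaccard(e1.neighbors, e2.neighbors)   # neighboring entities in the graph
    sim_extensional = jaccard(e1.extension, e2.extension)  # instance sets (for classes)
    return lam_x * sim_lexical + lam_s * sim_structural + lam_e * sim_extensional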

  24. The ILIADS algorithm repeat until no more candidates • Compute local similarities • Select promising candidates • For each candidate • Select relationship • Perform logical inference • Update score with the inference similarity • Select the candidate with the best score end

  25. Selecting promising candidates • Select candidates with sim(e,e’) > λt • Use a policy based on entity type to order, e.g.: • Class alignments first • Instance alignments first • Alternate between classes and instances
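
A small sketch of this selection step, assuming one of the ordering policies above (class alignments first); entity_type() is a hypothetical helper returning the type of a candidate pair.

# Sketch of candidate selection: keep pairs above the threshold lambda_t and
# order them by a fixed entity-type policy. entity_type() is hypothetical.
TYPE_PRIORITY = {"class": 0, "instance": 1, "property": 2}

def promising_candidates(sims, lam_t):
    above = [(pair, s) for pair, s in sims.items() if s > lam_t]
    # Order by entity type first, then by descending similarity within a type.
    return sorted(above, key=lambda ps: (TYPE_PRIORITY[entity_type(ps[0])], -ps[1]))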

  26. The ILIADS algorithm repeat until no more candidates • Compute local similarities • Select promising candidates • For each candidate • Select relationship • Perform logical inference • Update score with the inference similarity • Select the candidate with the best score end

  27. Selecting relationship • Must decide on the relationship type • subClassOf vs. equivalentClass • subPropertyOf vs. equivalentProperty • The determination is difficult, especially under OWL's open-world semantics • Use a simple extension-based test with a threshold λr (sketched below)
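
One way such an extension-based test could look is sketched below; the exact criterion used by ILIADS is defined in the paper, so the containment ratios and the fallback here are assumptions for illustration.

# Hedged sketch of an extension-based choice between equivalence and
# subsumption; the containment ratios and the default are assumptions.
def select_relationship(c1, c2, lam_r):
    ext1, ext2 = set(c1.extension), set(c2.extension)
    if not ext1 or not ext2:
        return "equivalentClass"        # no instance data: default to equivalence
    overlap = len(ext1 & ext2)
    cover1 = overlap / len(ext1)        # fraction of c1's instances also in c2
    cover2 = overlap / len(ext2)        # fraction of c2's instances also in c1
    if cover1 >= lam_r and cover2 >= lam_r:
        return "equivalentClass"
    if cover1 >= lam_r:
        return "subClassOf"             # c1 appears to be contained in c2
    if cover2 >= lam_r:
        return "superClassOf"           # c2 appears to be contained in c1
    return "equivalentClass"            # neither containment dominates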

  28. Selecting relationship

  29. Selecting relationship

  30. The ILIADS algorithm repeat until no more candidates • Compute local similarities • Select promising candidates • For each candidate • Represent candidate relationship • Perform logical inference • Update score with the inference similarity • Select the candidate with the best score end

  31. Performing logical inference For the candidate pair (e, e'): • Select an axiom to apply • The logical consequences are the pairs of entities (e_i, e_j) that have just become equivalent • Repeat a small number of times (5) to maintain tractability
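
A minimal sketch of how this bounded inference step could be structured is below; apply_axioms() is a hypothetical placeholder for one round of OWL Lite rule application over both ontologies plus the current alignment.

# Sketch of bounded forward inference as described on the slide: starting from
# the candidate axiom, repeatedly apply OWL Lite rules, collect the entity
# pairs that become equivalent, and stop after a fixed number of rounds
# (5 on the slide). apply_axioms() is a hypothetical placeholder.
def bounded_inference(candidate_axiom, onto1, onto2, alignment, max_rounds=5):
    known = set(alignment) | {candidate_axiom}
    consequences = set()
    for _ in range(max_rounds):
        new_pairs = apply_axioms(known, onto1, onto2) - known
        if not new_pairs:
            break                        # fixpoint reached before the round limit
        consequences |= new_pairs
        known |= new_pairs
    return consequences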

  32. Performing logical inference

  33. Performing logical inference

  34. Performing logical inference

  35. Performing logical inference (TheodorEscherich, owl:sameAs, T.S. Escherich) is a logical consequence of the candidate (E-ColiPoisoning, owl:sameAs, E-Coli)

  36. The ILIADS algorithm repeat until no more candidates • Compute local similarities • Select promising candidates • For each candidate • Represent candidate relationship • Perform logical inference • Update score with the inference similarity • Select the candidate with the best score end

  37. Updating score For the candidate pair (e,e’): • Initial local similarity sim(e,e’) • Inference similarity over all consequences: • Updated similarity:
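
The score-update formulas shown on the original slide are not reproduced in this transcript. The sketch below is one plausible instantiation, not the authors' definition: the inference similarity is taken as the average local similarity of the inferred consequences, and the mixing weight alpha is an assumption.

# The slide's formulas are not reproduced in the transcript; this is one
# plausible instantiation, not the authors' exact definition. The inference
# similarity averages the local similarities of the inferred consequences,
# and the mixing weight alpha is an assumption.
def update_score(initial_sim, consequences, sims, alpha=0.5):
    if consequences:
        inference_sim = sum(sims[pair] for pair in consequences) / len(consequences)
    else:
        inference_sim = initial_sim      # no consequences: fall back to the local score
    return alpha * initial_sim + (1 - alpha) * inference_sim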

  38. Updating score

  39. Updating score

  40. Consistency • The constructed alignment is not guaranteed to be consistent • ILIADS can only detect inconsistencies that appear within the few logical inference steps it performs • Pellet is used to check consistency after ILIADS runs • Experimentally, inconsistent ontologies occur in less than 0.5% of runs

  41. Contents • Motivation and goals • Short overview of OWL Lite • The ILIADS method • Experimental evaluation

  42. Experimental framework • 30 pairs of real-world ontologies • From 194 to over 20,000 triples • From a variety of domains: medical, geographic, economic, biological • Ground truth provided by human reviewers • Multiple iterations to ensure the best human-provided alignment • Datasets available: • http://www.cs.umd.edu/linqs/projects/iliads

  43. Experimental framework • Evaluation: precision, recall and F1 quality • F1 = 2 * Precision * Recall / (Precision + Recall) • 7 independent runs • ILIADS Variations: • ILIADS-tailored uses the best set of parameters for each pair of ontologies • ILIADS-fixed uses one set of parameters for all pairs of ontologies • Used to evaluate robustness of the parameters

  44. Experimental framework • ILIADS compared to two leading tools: • FCA-merge [Stumme and Maedche, IJCAI 2001] • uses formal concept analysis and an external document corpus • COMA++ [Aumueller et al., SIGMOD 2005] • implements multiple match strategies, including fragment and reuse-based matching

  45. Precision/recall

  46. Precision/recall

  47. Precision/recall

  48. Precision/recall

  49. Precision/recall comparison

  50. Precision/recall for ontologies with substantial instance data
