
Inexact Matching of Ontology Graphs Using Expectation-Maximization

Prashant Doshi, Christopher Thomas. LSDIS Lab, Dept. of Computer Science, University of Georgia.


Presentation Transcript


  1. Inexact Matching of Ontology Graphs Using Expectation-Maximization • Prashant Doshi, Christopher Thomas • LSDIS Lab, Dept. of Computer Science, University of Georgia

  2. Motivating Example [Figure: Weapons ontology 1 (model), Weapons ontology 2 (data), and a candidate ontology match]

  3. Motivating Example [Figure: Weapons ontology 1 (model), Weapons ontology 2 (data), and a candidate ontology match]

  4. Ontology Matching • Problem: match the nodes and edges (if labeled) of different ontologies • An essential step in ontology engineering • Types of match: • Exact matches: isomorphisms with edge consistency (a bijection), e.g. GLUE (Doan02), BayesOWL (Ding05), FALCON-AO (Hu05), OMEN (Mitra05) • Inexact matches: homomorphisms with edge consistency (many-one or many-many), e.g. this approach (many-one)

  5. Overview of Our Approach • Exploit structural and lexical similarity: graph structure, node and edge labels • Formulation within the iterative Expectation-Maximization (EM) scheme • Suitable for taxonomies, but can be used for edge-labeled ontologies via reification • May converge to local maxima [Figure: match quality over the space of matches]

  6. Edge-Labeled Ontology Graphs: Reification • Reified bipartite graph (Hayes & Gutierrez 04) • Each distinct edge label becomes a node • Dummy nodes are introduced to preserve the relations [Figure: edge-labeled graph]
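
The reification step on this slide can be sketched roughly as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function name `reify`, the `("label", ...)` and `("stmt", ...)` node encodings, and the choice of one dummy node per labeled edge are all hypothetical. The slide only states that distinct edge labels become nodes and that dummy nodes preserve the relations.

```python
def reify(triples):
    """Convert labeled edges (subject, label, object) into an unlabeled graph.

    Assumed scheme (hypothetical, for illustration only):
      - each distinct edge label becomes one shared node ("label", name)
      - each labeled edge gets its own dummy node ("stmt", index) that ties
        together the subject, the label node, and the object
    """
    edges = set()
    label_nodes = set()
    for index, (subject, label, obj) in enumerate(triples):
        label_node = ("label", label)
        dummy = ("stmt", index)
        label_nodes.add(label_node)
        edges.add((subject, dummy))     # subject -> dummy
        edges.add((dummy, label_node))  # dummy -> shared label node
        edges.add((dummy, obj))         # dummy -> object
    return edges, label_nodes


if __name__ == "__main__":
    example = [("Tank", "hasPart", "Gun"), ("Ship", "hasPart", "Gun")]
    reified_edges, labels = reify(example)
    for edge in sorted(reified_edges, key=str):
        print(edge)
```

Every edge in the result connects a dummy node to a non-dummy node, so the output is bipartite, and a matcher that only understands unlabeled graphs can then treat edge labels like any other node label.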

  7. Background: EM • Developed by Dempster, Laird and Rubin (1977) • Maximum likelihood estimate of an underlying model from observed data (X) in the presence of missing values (Y) • E-step: evaluate the likelihood of different models (M^{n+1}) given a seed model (M^n) • M-step: choose the best model and use it in the next iteration • Generalized M-step: select a model that is better than the current one
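
For reference, the three steps on this slide are usually written as below. This is the standard textbook form of EM and generalized EM, given here as an assumption about notation; the transcript does not preserve the slide's own equations.

```latex
% E-step: expected complete-data log-likelihood of a candidate model
Q(M^{n+1} \mid M^{n}) = \mathbb{E}_{Y \mid X, M^{n}}\left[ \log P(X, Y \mid M^{n+1}) \right]

% M-step: pick the maximizing model
M^{n+1}_{*} = \arg\max_{M^{n+1}} Q(M^{n+1} \mid M^{n})

% Generalized M-step: any model that improves on the current one suffices
Q(M^{n+1} \mid M^{n}) \ge Q(M^{n} \mid M^{n})
```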

  8. Graph Matching Using GEM • Treat the match assignments as the model • Mixture model • Given a data node, the correspondence with some model node is a hidden variable

  9. E-Step • The E-step equation (not preserved in this transcript) simplifies considerably • It involves finding the lexical similarity between the node labels • We use the generalized M-step

  10. String Similarity Measures • String distance metrics (Cohen et al. 03): exact string match, substring match, N-gram score, sequence alignment score (Smith & Waterman 81) • Example: S1: Modern Naval Ship, S2: Naval Warship [Figure: alignment pattern 000000 11111 0001111]
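
Two of the listed measures can be sketched as follows. This is a minimal illustration under assumptions: the slide does not give the exact N-gram formula or the alignment scoring parameters, so the Dice coefficient over character trigrams and the `match=2`, `mismatch=-1`, `gap=-1` scores are hypothetical choices.

```python
def ngrams(text, n=3):
    """Set of character n-grams of the lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_score(a, b, n=3):
    """Dice coefficient over character n-grams (an assumed N-gram score)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    a, b = a.lower(), b.lower()
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                      # start a new local alignment
                previous[j - 1] + step, # align a[i-1] with b[j-1]
                previous[j] + gap,      # gap in b
                current[j - 1] + gap,   # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    s1, s2 = "Modern Naval Ship", "Naval Warship"
    print(ngram_score(s1, s2), smith_waterman(s1, s2))
```

Because the alignment is local, the shared run "Naval " alone already contributes a positive score, which is why a measure like this tolerates the extra word "Modern" better than an exact or substring match does.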

  11. Model Sampling • Model space is large • Random sampling from the model space • Combine sampling with intuitive heuristics, e.g. the Map-Parent heuristic [Figure: candidate models M^{n+1} sampled from the current model M^n]
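
The sampling idea can be sketched as a generalized EM loop over many-one mappings. Everything here is an assumption made for illustration: the function `gem_match`, its parameters, the neighbor move (reassigning one data node at random), and the caller-supplied scoring function `q(candidate, current)` standing in for Q(M^{n+1} | M^n). The Map-Parent heuristic from the slide is not implemented.

```python
import random


def gem_match(data_nodes, model_nodes, q, samples=20, iterations=50, seed=0):
    """Generalized-EM style search over many-one node mappings.

    q(candidate, current) is a caller-supplied stand-in for Q(M^{n+1} | M^n).
    Returns the final mapping and the list of accepted Q values.
    """
    rng = random.Random(seed)
    # Seed model: every data node is mapped to some model node (many-one).
    current = {d: rng.choice(model_nodes) for d in data_nodes}
    history = [q(current, current)]
    for _ in range(iterations):
        baseline = q(current, current)
        # Sample candidate models by reassigning one data node at random.
        candidates = []
        for _ in range(samples):
            candidate = dict(current)
            candidate[rng.choice(data_nodes)] = rng.choice(model_nodes)
            candidates.append(candidate)
        best = max(candidates, key=lambda m: q(m, current))
        best_score = q(best, current)
        # Generalized M-step: accept only a model better than the current one.
        if best_score > baseline:
            current = best
            history.append(best_score)
    return current, history


if __name__ == "__main__":
    nodes = ["tank", "ship", "missile"]
    # Toy score: number of data nodes mapped to an identically named model node.
    toy_q = lambda cand, cur: sum(1 for d, m in cand.items() if d == m)
    mapping, scores = gem_match(nodes, nodes, toy_q)
    print(mapping, scores)
```

Accepting any improving sample, rather than searching for the global maximizer, keeps each iteration cheap, at the cost that the search can settle in a local maximum.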

  12. Simple Example • Q(M^1' | M^0) = 52.56 • Q(M^1 | M^0) = 51.57 [Figure: seed model M^0 and candidate models M^1 and M^1']

  13. Computational Complexity • Complexity of the E-step is O((|V_d| |V_m|)^2) • In the M-step, if we generate K samples within a sample set, the worst-case complexity is O(K (|V_d| |V_m|)^2)

  14. Performance • Weapons ontologies from the I3CON repository • Matching heuristics speed up convergence

  15. Lexical Match • Recall = 77.8% • Precision = 63.6%

  16. GEM Match • Recall = 100% • Precision = 90%
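
For readers unfamiliar with the two metrics, a minimal sketch of how precision and recall are computed over sets of correspondences follows. The function name and the example counts are assumptions: the slides report only percentages, and the counts below (9 reference correspondences; 7 of 11 found correct, then 9 of 10 found correct) are hypothetical values chosen because they reproduce those percentages.

```python
def precision_recall(found, reference):
    """Precision and recall of a set of found correspondences."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical correspondences, encoded as (data_node, model_node) pairs.
    reference = {(i, i) for i in range(9)}

    # 7 correct plus 4 spurious pairs: consistent with 63.6% / 77.8%.
    lexical = {(i, i) for i in range(7)} | {(100 + i, 0) for i in range(4)}
    # 9 correct plus 1 spurious pair: consistent with 90% / 100%.
    gem = reference | {(100, 0)}

    for name, found in (("lexical", lexical), ("GEM", gem)):
        p, r = precision_recall(found, reference)
        print(f"{name}: precision={p:.1%} recall={r:.1%}")
```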

  17. Discussion • A principled technique for inexact matching of ontology schemas using Generalized EM • Considers structural and label similarity • Produces the most likely match • Many-one correspondence allows mapping between clusters of different semantic granularity • Computational complexity is an issue • More efficient ways to cover the model space are needed

  18. Thank you • Questions?
