Valerie Cross, Cosmin Stroe, Xueheng Hu, Pramit Silwal, Maryam Panahiazar, Isabel F. Cruz, Priti Parikh, Amit Sheth

Aligning the Parasite Experiment Ontology and the Ontology for Biomedical Investigations Using AgreementMaker Valerie Cross, Cosmin Stroe, Xueheng Hu, Pramit Silwal, Maryam Panahiazar, Isabel F. Cruz, Priti Parikh, Amit Sheth crossv@muohio.edu July 29, 2011 ICBO @ Buffalo, NY

Outline • Task: Align PEO and OBI Ontologies • OAEI Investigation • AgreementMaker Overview • Enhancements to AgreementMaker • Experimental Results • Conclusions and Future Work

Parasite Experiment Ontology (PEO) http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology • models provenance metadata associated with experiment protocols used in parasite research. • extends the upper-level Provenir ontology (http://knoesis.wright.edu/provenir/provenir.owl). • PEO (v 1.0) includes Proteome, Microarray, Gene Knockout, and Strain Creation experiment terms, along with other terms that are used in pathways. • has 110 classes and 27 properties, and uses concepts from the Parasite Life Cycle ontology. • [Figure: snapshot of PEO]

Ontology for Biomedical Investigations (OBI) http://purl.obolibrary.org/obo/obi • describes biological and clinical investigations. • includes a set of 'universal' terms applicable across various biological and technological domains, and domain-specific terms relevant only to a given domain. • supports the consistent annotation of biomedical investigations, regardless of the particular field of study. • represents the design of an investigation, the protocols and instrumentation used, the material used, the data generated, and the type of analysis performed on it. • is being built under the Basic Formal Ontology (BFO).

Ontology Alignment Evaluation Initiative (OAEI) http://oaei.ontologymatching.org • Annual international competition to evaluate ontology alignment techniques, with multiple tracks • Benchmark tests • Biomedical track (Mouse and NCI Human Anatomies) • Conference track (15 ontologies) • A “side effect” of the competition is a collection of published ontology sets, each consisting of two ontologies and the correct mappings between them as determined by experts • Results measured by • Recall, precision, and F-measure (combines recall and precision) • Runtime • Other
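The recall, precision, and F-measure criteria listed above can be illustrated with a minimal sketch. It assumes an alignment is represented as a set of (source entity, target entity) pairs and uses the balanced F-measure (harmonic mean of precision and recall); the function name and the toy mappings are invented for illustration and are not taken from OAEI data or tooling.

```python
def evaluate_alignment(found, reference):
    """Return (precision, recall, f_measure) for a set of mappings.

    `found` and `reference` are iterables of (source, target) pairs.
    This representation is an assumption made for this sketch.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)  # mappings that the experts also list
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data (hypothetical entity names, not real ontology terms).
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
found = {("a1", "b1"), ("a2", "b2"), ("a5", "b9")}

precision, recall, f_measure = evaluate_alignment(found, reference)
print(precision, recall, f_measure)
```

In this toy case two of the three proposed mappings are correct (precision 2/3) and two of the four reference mappings are found (recall 1/2).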

OAEI 2010 http://oaei.ontologymatching.org/2010/results/anatomy/index.html#corrections

OAEI Anatomy Track • #1 The matcher has to be applied with its standard settings. • #2 An alignment has to be generated that favors precision over recall. • #3 An alignment has to be generated that favors recall over precision. • #4 A partial reference alignment has to be used as additional input.

AgreementMaker - OA System Univ. of Illinois Chicago, ADVIS Lab, Dr. Isabel F. Cruz and Cosmin Stroe • Motivation • Automatic methods are required to match large ontologies • Several features of the ontologies have to be considered • Users need to trust the mappings and to be directly involved in the loop • System’s capabilities • Wide range of matching methods • Capability to smartly combine multiple strategies • Multi-purpose user interface to allow evaluation and manual interaction with the matchings • Extensible architecture to allow reuse and composition of the matching modules

Architecture of a Matcher

Existing Matchers • First layer (conceptual) • BSM (Basic Similarity Matcher) • PSM (Parametric String-Based Matcher) • ASM (Advanced Similarity Matcher) • VMM (Vector-based Multi-term Matcher) • Second layer (structural) • DSI (Descendant's Similarity Inheritance) • SSC (Sibling's Similarity Contribution) • Third layer (aggregation) • LWC (Linear Weighted Combination)

LWC
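As a rough illustration of what a linear weighted combination does, the sketch below merges the similarity matrices produced by several matchers into one matrix using a normalized weighted sum. The slides do not say how AgreementMaker chooses its weights, so the fixed weights, the function name, and the toy matrices here are assumptions for illustration only.

```python
def linear_weighted_combination(matrices, weights):
    """Combine similarity matrices cell by cell with normalized weights.

    Each matrix is a list of rows; entry [i][j] is the similarity between
    source concept i and target concept j according to one matcher.
    """
    if len(matrices) != len(weights) or not matrices:
        raise ValueError("need one weight per matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows, cols = len(matrices[0]), len(matrices[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for matrix, weight in zip(matrices, weights):
        for i in range(rows):
            for j in range(cols):
                combined[i][j] += (weight / total) * matrix[i][j]
    return combined


# Toy similarity matrices from two hypothetical first-layer matchers.
matcher_a = [[1.0, 0.0], [0.2, 0.6]]
matcher_b = [[0.5, 0.1], [0.0, 0.8]]

combined = linear_weighted_combination([matcher_a, matcher_b], [3, 1])
print(combined)
```

With weights 3 and 1, the first matcher contributes 75% of each combined score and the second 25%.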

Lexicon Extensions to Matchers • AgreementMaker version 0.22 extended these string-based matchers by integrating two lexicons (2010 OAEI): • the Ontology Lexicon, built from synonym and definition annotations existing in the ontologies themselves, and • the WordNet Lexicon, created by starting with the ontology lexicon and adding any non-duplicated synonyms/definitions found in WordNet • Result: BSMlex, PSMlex, and VMMlex.

Initial Experiments • AgreementMaker (ver. 0.22) with the OAEI 2010 anatomy configuration produced only two mappings. • Found inconsistencies in how PEO and OBI describe their entities. • Identifiers: PEO URIs use a textual fragment identifier (e.g., http://knoesis.wright.edu/ParasiteExperiment.owl#transfection), while OBI's entities use numerical identifiers (e.g., http://purl.obolibrary.org/obo/OBI_0600060). • Labels: PEO uses the rdfs:label field on 19.1% of its classes, and that use does not follow the specification guidelines because the field contains a PLO identifier. OBI uses the rdfs:label field to hold a descriptive string on almost 100% of its classes. • Comments: PEO uses the comment field on 99% of its classes to provide a definition. OBI uses the comment field on only about 4% of its classes. • Some common annotations exist between PEO and OBI, BUT for each one either PEO or OBI has low coverage. • OBI has high coverage for label annotations. • PEO has high coverage for comment annotations. • This heterogeneity, combined with matchers that compare the same annotations to each other (i.e., class ID with class ID, label with label, etc.), resulted in almost no alignment.
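Coverage figures like the ones on this slide can be computed by counting how many classes carry a given annotation. The following is a minimal sketch under the assumption that each ontology has already been loaded into a plain dictionary mapping class identifiers to their annotations; the data below is a toy stand-in and does not reproduce the PEO or OBI percentages.

```python
def annotation_coverage(classes, annotation):
    """Fraction of classes that have a non-empty value for `annotation`."""
    if not classes:
        return 0.0
    annotated = sum(
        1 for annotations in classes.values() if annotations.get(annotation)
    )
    return annotated / len(classes)


# Toy ontology (hypothetical classes and annotation values).
toy_classes = {
    "class_a": {"rdfs:comment": "toy definition a"},
    "class_b": {"rdfs:comment": "toy definition b", "rdfs:label": "toy label b"},
    "class_c": {"rdfs:comment": "toy definition c"},
    "class_d": {},
}

comment_coverage = annotation_coverage(toy_classes, "rdfs:comment")
label_coverage = annotation_coverage(toy_classes, "rdfs:label")
print(comment_coverage, label_coverage)
```

Running such a count on both ontologies before matching makes it easy to see which annotation fields are populated well enough to be worth comparing.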

Annotation Profiling • allows the user to select and combine different annotations of the source and target ontologies to be used in the alignment process.
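The idea of annotation profiling can be sketched as follows: instead of comparing only identically named fields, the user supplies one list of annotation fields for the source ontology and another for the target, and every selected source field is compared with every selected target field. This is only an illustration of the concept, not AgreementMaker's implementation; the token-overlap (Jaccard) similarity, the function names, and the toy entities (including the target identifier) are assumptions.

```python
def jaccard(text_a, text_b):
    """Token-overlap similarity between two strings (stand-in measure)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def profiled_similarity(source, target, source_fields, target_fields):
    """Best similarity over all selected (source field, target field) pairs."""
    best = 0.0
    for source_field in source_fields:
        for target_field in target_fields:
            source_text = source.get(source_field)
            target_text = target.get(target_field)
            if source_text and target_text:
                best = max(best, jaccard(source_text, target_text))
    return best


# Toy entities: the source has a textual ID, the target a numeric ID plus label.
source_entity = {"id": "transfection"}
target_entity = {"id": "target_0001", "rdfs:label": "transfection"}

# Same-field comparison (ID with ID) finds nothing in common.
same_field = profiled_similarity(source_entity, target_entity, ["id"], ["id"])
# A user-chosen profile compares the source ID with the target label.
profiled = profiled_similarity(source_entity, target_entity, ["id"], ["rdfs:label"])
print(same_field, profiled)
```

The example mirrors the situation described for PEO and OBI, where informative text sits in different annotation fields on each side.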

Provenance Information Added

Customization of Lexicon Matchers • The lexicon builders for BSMlex, PSMlex, and VMMlex use fixed names for the synonym and definition annotations (hasSynonym and hasDefinition). • The lexicon builder was modified to exploit the synonym annotations in PEO and OBI by letting the user choose the annotation names used to create the lexicons. • OBI does not use hasSynonym; instead it uses the IAO annotation properties IAO_0000111 (“editor preferred term") and IAO_0000118 (“alternative term"), which serve the same function as synonyms in OBI. • PEO does not use synonyms, but in most cases uses the comment annotation for a definition. • Result: BSMlex+, PSMlex+, and VMMlex+.
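A lexicon builder with user-selectable annotation names might look roughly like the sketch below. It assumes each ontology is available as a dictionary from class identifier to lists of annotation values; the function name, the toy classes, and the annotation values are hypothetical, while the property names IAO_0000111 and IAO_0000118 follow the OBI conventions named on the slide.

```python
def build_lexicon(classes, synonym_fields, definition_fields):
    """Collect synonyms and definitions from user-chosen annotation fields."""
    lexicon = {}
    for class_id, annotations in classes.items():
        synonyms = []
        for field in synonym_fields:
            for value in annotations.get(field, []):
                if value not in synonyms:  # skip duplicated synonyms
                    synonyms.append(value)
        definitions = [
            value
            for field in definition_fields
            for value in annotations.get(field, [])
        ]
        lexicon[class_id] = {"synonyms": synonyms, "definitions": definitions}
    return lexicon


# Toy OBI-like class: synonyms live in IAO annotation properties.
obi_like = {
    "class_1": {
        "IAO_0000111": ["transfection"],
        "IAO_0000118": ["cell transfection", "transfection"],
    }
}
# Toy PEO-like class: the comment annotation holds the definition.
peo_like = {"transfection": {"rdfs:comment": ["toy definition text"]}}

obi_lexicon = build_lexicon(obi_like, ["IAO_0000111", "IAO_0000118"], [])
peo_lexicon = build_lexicon(peo_like, [], ["rdfs:comment"])
print(obi_lexicon)
print(peo_lexicon)
```

Passing the field names as arguments, rather than hard-coding hasSynonym and hasDefinition, is what lets one builder serve ontologies with different annotation conventions.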

BioPortal Mappings http://bioportal.bioontology.org/mappings http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology

Experimental Results

Overlapping of Matchers

Conclusions and Future Work • Experimental results in the biomedical domain demonstrate the problem of heterogeneous annotations of ontologies. • Validated the past approach of extending matching algorithms with lexicons: the best results were produced by a matcher that uses lexicons, BSMlex+. • Investigate including more lexicons, such as UMLS, to achieve better results. • Heterogeneity was managed by increasing the flexibility of state-of-the-art matching algorithms, i.e., annotation profiling, mapping provenance information, and custom lexicons, which support a domain expert in this process. • The current approach relies on the user to select the relevant annotations to be used in the matching process. • More work needs to be done, specifically to automatically identify semantically compatible annotations by applying established ontology evaluation metrics. • A wide variety of semantic similarity measures has already been added to AgreementMaker for future use in semantic matching, not just lexical matching, of concepts between ontologies.

THANK YOU! QUESTIONS?
