Ontology-driven Provenance Management in eScience:
This presentation is the property of its rightful owner.
Sponsored Links
1 / 25

Ontology-driven Provenance Management in eScience: An Application in Parasite Research PowerPoint PPT Presentation


  • 52 Views
  • Uploaded on
  • Presentation posted in: General

Ontology-driven Provenance Management in eScience: An Application in Parasite Research. Satya S. Sahoo 1 , D. Brent Weatherly 2 , Raghava Mutharaju 1 , Pramod Anantharam 1 , Amit Sheth 1 , Rick L. Tarleton 2. 1 Kno.e.sis Center, Wright State University;

Download Presentation

Ontology-driven Provenance Management in eScience: An Application in Parasite Research

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Ontology driven provenance management in escience an application in parasite research

Ontology-driven Provenance Management in eScience:

An Application in Parasite Research

Satya S. Sahoo1, D. Brent Weatherly2, Raghava Mutharaju1, Pramod Anantharam1, Amit Sheth1, Rick L. Tarleton2

1Kno.e.sis Center, Wright State University;

2 Center for Tropical and Emerging Diseases, University of Georgia

ODBASE2009

Vilamoura, Algarve-Portugal

November 05, 2009


Parasite research creation of parasite strains

Parasite Research: Creation of Parasite Strains

New Parasite Strains


Provenance in parasite research

Provenance in Parasite Research

Gene Name

Related Queries from Biologists

  • Q2: List all groups in the lab that used a Target Region Plasmid?

  • Q3: Which researcher created a new strain of the parasite (with ID = 66)?

  • An experiment was not successful – has this experiment been conducted earlier? What were the results?

Gene Knockout and Strain Creation*

Sequence

Extraction

3‘ & 5’

Region

Drug Resistant Plasmid

Gene Name

Plasmid

Construction

Knockout Construct Plasmid

T.Cruzi sample

?

Transfection

Transfected Sample

Drug

Selection

Cloned Sample

Selected Sample

Cell

Cloning

Cloned

Sample

*T.cruzi Semantic Problem Solving Environment Project, Courtesy of D.B. Weatherly and Flora Logan, Tarleton Lab, University of Georgia


Provenance management in science

Provenance Management in Science

  • Provenance from the French word “provenir” describes the lineage or history of a data entity

  • For Verification and Validation of Data Integrity, Process Quality, and Trust

  • Issues in Provenance Management

    • Provenance Modeling

    • A Dedicated Query Infrastructure

    • Practical Provenance Management Systems


Outline

Outline

  • Provenance Modeling: Provenir →Parasite Experiment ontology

  • Provenance Query Infrastructure

  • Provenance Query Engine

  • Evaluation Results

  • Query Optimization: Materialized Provenance Views


Ontologies for provenance modeling

Ontologies for Provenance Modeling

  • Advantages of using Ontologies

    • Formal Description: Machine Readability, Consistent Interpretation

    • Use Reasoning: Knowledge Discovery over Large Datasets

  • Problem: A gigantic, monolithic Provenance Ontology! – not feasible

  • Solution: Modular Approach using a Foundational Ontology

FOUNDATIONAL

ONTOLOGY

PARASITE

EXPERIMENT

GLYCOPROTEIN

EXPERIMENT

OCEANOGRAPHY


Provenir ontology

Provenir Ontology

Gene Name

Sequence

Extraction

3‘ & 5’

Region

Drug Resistant Plasmid

AGENT

Plasmid

Construction

Knockout Construct Plasmid

T.Cruzi sample

has_agent

Transfection

Transfection Machine

DATA

Transfected Sample

Drug

Selection

participates_in

Selected Sample

PROCESS

Cell

Cloning

Cloned

Sample


Provenir ontology schema

Provenir Ontology Schema

SPATIAL

THEMATIC

TEMPORAL

is_a

is_a

is_a

located_in

PARAMETER

DATA COLLECTION

is_a

is_a

AGENT

has_temporal_value

DATA

participates_in

has_agent

PROCESS

preceded_by


Domain specific provenance parasite experiment ontology

Domain-specific Provenance: Parasite Experiment ontology

PROVENIR

ONTOLOGY

agent

has_agent

is_a

is_a

data

parameter

has_participant

is_a

data_collection

is_a

process

is_a

spatial_parameter

temporal_parameter

domain_parameter

is_a

is_a

is_a

is_a

is_a

is_a

transfection_machine

location

is_a

drug_selection

is_a

is_a

sample

has_participant

Time:DateTimeDescritption

transfection

cell_cloning

is_a

transfection_buffer

strain_creation_

protocol

Tcruzi_sample

PARASITE

EXPERIMENT

ONTOLOGY

has_parameter

*Parasite Experiment ontology available at: http://wiki.knoesis.org/index.php/Trykipedia


Outline1

Outline

  • Provenance Modeling: Provenir →Parasite Experiment ontology

  • Provenance Query Infrastructure

  • Provenance Query Engine

  • Evaluation Results

  • Query Optimization: Materialized Provenance Views


Provenance query classification

Provenance Query Classification

Classified Provenance Queries into Three Categories

  • Type 1: Querying for Provenance Metadata

    • Example: Which gene was used create the cloned sample with ID = 66?

  • Type 2: Querying for Specific Data Set

    • Example: Find all knockout construct plasmids created by researcher Michelle using “Hygromycin” drug resistant plasmid betweenApril 25, 2008 and August 15, 2008

  • Type 3: Operations on Provenance Metadata

    • Example: Were the two cloned samples 65 and 46 prepared under similar conditions – compare the associated provenance information


Provenance query operators

Provenance Query Operators

Four Query Operators – based on Query Classification

  • provenance () – Closure operation, returns the complete set of provenance metadata for input data entity

  • provenance_context() - Given set of constraints defined on provenance, retrieves datasets that satisfy constraints

  • provenance_compare () - adapt the RDF graph equivalence definition

  • provenance_merge () - Two sets of provenance information are combined using the RDF graph merge


Answering provenance queries using provenance operator

Answering Provenance Queries using provenance () Operator


Outline2

Outline

  • Provenance Modeling: Provenir →Parasite Experiment ontology

  • Provenance Query Infrastructure

  • Provenance Query Engine

  • Evaluation Results

  • Query Optimization: Materialized Provenance Views


Provenance query engine

Provenance Query Engine

  • Available as API for integration with provenance management systems

  • Layer on top of a RDF Data Store (Oracle 10g), requires support for:

    • Rule-based reasoning

    • SPARQL query execution

  • Input:

    • Type of provenance query operator : provenance ()

    • Input value to query operator: cloned sample 66

    • User details to connect to underlying RDF store


Outline3

Outline

  • Provenance Modeling: Provenir →Parasite Experiment ontology

  • Provenance Query Infrastructure

  • Provenance Query Engine

  • Evaluation Results

  • Query Optimization: Materialized Provenance Views


Evaluation results

Evaluation Results

  • Queries expressed in SPARQL

  • Datasets using real experiment data


Evaluation results1

Evaluation Results


Outline4

Outline

  • Provenance Modeling: Provenir →Parasite Experiment ontology

  • Provenance Query Infrastructure

  • Provenance Query Engine

  • Evaluation Results

  • Query Optimization: Materialized Provenance Views


Query optimization materialized provenance views

Query Optimization: Materialized Provenance Views

  • Materializes a single logical unit of provenance

  • Does not require query-rewriting

  • View updates: addressed by characteristics of provenance

  • Created using a memoization approach


Provenance query engine architecture

Provenance Query Engine Architecture

QUERY OPTIMIZER

TRANSITIVE CLOSURE


Evaluation results using materialized provenance views

Evaluation Results using Materialized Provenance Views


Provenance management system for parasite research

Provenance Management System for Parasite Research


Acknowledgement

Acknowledgement

  • Flora Logan – The Wellcome Trust Sanger Institute, Cambridge, UK

  • Priti Parikh – Kno.e.sis Center, Wright State University

  • Roger Barga– Microsoft Research, Redmond

  • Jonathan Goldstein – Microsoft Research, Redmond


Thank you

Thank you

Contact email: [email protected]

Web:

http://knoesis.wright.edu/researchers/satya/


  • Login