1 / 17

Provenance Management Framework

Provenance Management Framework. Satya S. Sahoo Kno.e.sis Center, Wright State University In Collaboration with Tarleton Lab, University of Georgia and Microsoft Research http://knoesis.wright.edu/research/semsci/application_domain/sem_prov /. Outline. Provenance – Introduction

wyman
Download Presentation

Provenance Management Framework

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Provenance Management Framework Satya S. Sahoo Kno.e.sis Center, Wright State University In Collaboration with Tarleton Lab, University of Georgia and Microsoft Research http://knoesis.wright.edu/research/semsci/application_domain/sem_prov/

  2. Outline • Provenance – Introduction • Provenance Representation • Classification of Provenance Queries & Query Operators • Provenance Query Engine • T.cruzi SPSE Provenance Management System

  3. Provenance – Introduction Gene Name • Provenance from the French word “provenir” describes the lineage or history of a data entity • For Verification and Validation of Data Integrity, Process Quality, and Trust • Application of Provenance Metadata beyond verification and validation –eScience Data Management Gene Knockout and Strain Creation* Sequence Extraction 3‘ & 5’ Region Drug Resistant Plasmid Gene Name Plasmid Construction Knockout Construct Plasmid T.Cruzi sample ? Transfection Transfected Sample Drug Selection Cloned Sample Selected Sample Cell Cloning Cloned Sample *T.cruzi Semantic Problem Solving Environment Project, Courtesy of D.B. Weatherly and Flora Logan, Tarleton Lab, University of Georgia

  4. Outline • Provenance – Introduction • Provenance Representation • Classification of Provenance Queries & Query Operators • Provenance Query Engine • T.cruzi SPSE Provenance Management System

  5. Provenir ontology Gene Name • A Common Provenance Model defined in OWL-DL – Provenir ontology • Provenance Metadata as RDF – allows use of Semantic Web Reasoning Framework • A Suite of Domain-specific Provenance ontologies - Provenir as Common Reference Model • Three Base Classes – 8 specialized Sub-classes, Eleven Foundational Relations – reuse of Relation Ontology Sequence Extraction contained_in 3‘ & 5’ Region Drug Resistant Plasmid AGENT Plasmid Construction Knockout Construct Plasmid T.Cruzi sample has_agent Transfection Transfection Machine DATA Transfected Sample Drug Selection participates_in Selected Sample PROCESS Cell Cloning Cloned Sample

  6. Domain-specific Provenance: Parasite Experiment ontology PROVENIR ONTOLOGY agent has_agent is_a is_a data parameter has_participant is_a data_collection is_a process is_a spatial_parameter temporal_parameter domain_parameter is_a is_a is_a is_a is_a is_a transfection_machine location is_a drug_selection is_a is_a sample has_participant Time:DateTimeDescritption transfection cell_cloning is_a transfection_buffer strain_creation_ protocol Tcruzi_sample PARASITE EXPERIMENT ONTOLOGY has_parameter *Parasite Experiment ontology available at: http://wiki.knoesis.org/index.php/Trykipedia

  7. Outline • Provenance – Introduction • Provenance Representation • Classification of Provenance Queries & Query Operators • Provenance Query Engine • T.cruzi SPSE Provenance Management System

  8. Provenance Query Classification Classified Provenance Queries into Three Categories • Type 1: Querying for Provenance Metadata • Example: Which gene was used create the cloned sample with ID = 65? • Type 2: Querying for Specific Data Set • Example: Find all knockout construct plasmids created by researcher Michelle using “Hygromycin” drug resistant plasmid betweenApril 25, 2008 and August 15, 2008 • Type 3: Operations on Provenance Metadata • Example: Were the two cloned samples 65 and 46 prepared under similar conditions – compare the associated provenance information

  9. Provenance Query Operators Four Query Operators – based on Query Classification • provenance () – Closure operation, returns the complete set of provenance metadata for input data entity • provenance_context() - Given set of constraints defined on provenance, retrieves datasets that satisfy constraints • provenance_compare () - adapt the RDF graph equivalence definition • provenance_merge () - Two sets of provenance information are combined using the RDF graph merge

  10. Outline • Provenance – Introduction • Provenance Representation • Classification of Provenance Queries & Query Operators • Provenance Query Engine • T.cruzi SPSE Provenance Management System

  11. Provenance Query Engine • Support Provenance Query Operators over a RDF store • Provenance Query Engine based on Jena plug-in for Oracle RDF store (support for SPARQL specification) • Developed as an API, compatible with any RDF store with support for Rules • Maps Query Operators to Domain-specific Provenance ontology – uses RDFS Entailment Rules • Query Optimization: Defined a new class of materialized views called Materialized Provenance Views (MPV) • MPV defined by Provenir ontology

  12. Outline • Provenance – Introduction • Provenance Representation • Classification of Provenance Queries & Query Operators • Provenance Query Engine • T.cruzi SPSE Provenance Management System

  13. T.cruzi SPSE Provenance Management System

  14. Conclusions • A Common Model of Provenance – Interoperable, Consistent Interpretation and well-defined Semantics • Categorization of Provenance Queries – Query Operators • Provenance Query Engine • Application of Provenance Metadata beyond Verification and Validation – eScience Data Management PROVENANCE ALGEBRA MATERIALIZED PROVENANCE VIEW

  15. Acknowledgement • D. Brent Weatherly – Tarleton Lab, University of Georgia • Flora Logan – The Wellcome Trust Sanger Institute • Roger Barga– Microsoft Research • Jonathan Goldstein – Microsoft Research • RaghavaMutharaju– Kno.e.sis Center, Wright State University • PramodAnantharam- Kno.e.sis Center, Wright State University

  16. More Resources at: • Satya S. Sahoo et. al, "Where did you come from...Where did you go?" An Algebra and RDF Query Engine for Provenance, (http://knoesis.wright.edu/library/resource.php?id=00706) • Trykipedia: A Wiki-based public resource for Parasite Researchers http://knoesis.wright.edu/trykipedia • Provenance Management Framework: http://knoesis.wright.edu/research/semsci/application_domain/sem_prov/ • T.cruzi Semantic Problem Solving Environment: http://knoesis.wright.edu/research/semsci/application_domain/sem_life_sci/tcruzi_pse/

More Related