Research Traceability using Provenance Services for Biomedical Analysis

This presentation discusses the use of provenance services for traceability in biomedical analysis. It covers the requirements from users, the bigger picture of user expectations, and the implementation of a provenance service called CRISTAL. The presentation highlights the importance of a robust provenance system in ensuring confidence in research results.

Presentation Transcript


  1. Research Traceability using Provenance Services for Biomedical Analysis Dr Peter Bloodsworth CCCS Research Centre UWE, Bristol, UK peter.bloodsworth@cern.ch HealthGrid Presentation: 29th of June 2010

  2. Talk Structure • The neuGRID Project. • Requirements from Users. • The Bigger Picture. • A Provenance Service. • CRISTAL. • Conclusion.

  3. The neuGRID Consortium • Provincia Lombardo Veneta Fatebenefratelli, ITALY • Neuralyse Europe (Prodema Medical), SWITZERLAND • University of the West of England, Bristol, UK • Maat Gknowledge, SPAIN • Vrije Universiteit Medical Centre, THE NETHERLANDS • Karolinska Institutet, SWEDEN • HealthGrid, FRANCE • CF consulting s.r.l., ITALY

  4. Project Objectives • To build a new user-friendly Grid-based research e-Infrastructure. • Collection/archiving of large amounts of imaging data. • Paired with computationally intensive data analyses. • To enable EU neuroscientists to carry out cutting-edge research. • Imaging of degenerative brain diseases.

  6. neuGRID Provenance Requirements Provenance in neuGRID relates to: 1. Data provenance (source, quality control applied and other facets). 2. Workflow provenance (author, versioning, certification, etc.). 3. Analysis result provenance (data set, workflow chosen, settings, errors, etc.).
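
The three kinds of provenance listed on this slide can be pictured as linked records. The following is only a minimal sketch under stated assumptions: the class and field names are hypothetical and are not taken from neuGRID or CRISTAL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProvenance:
    # Hypothetical fields: where the data came from and which QC steps ran.
    source: str
    quality_checks: List[str] = field(default_factory=list)


@dataclass
class WorkflowProvenance:
    # Hypothetical fields mirroring "author, versioning, certification".
    author: str
    version: str
    certified: bool = False


@dataclass
class AnalysisResultProvenance:
    # A result points back at the data set and the workflow that produced it.
    data: DataProvenance
    workflow: WorkflowProvenance
    settings: Dict[str, object] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)

    def chain(self) -> List[str]:
        """Return a readable trail from input data to analysis result."""
        qc = ", ".join(self.data.quality_checks) or "none"
        cert = "certified" if self.workflow.certified else "uncertified"
        return [
            f"data from {self.data.source} (QC: {qc})",
            f"workflow v{self.workflow.version} by {self.workflow.author} ({cert})",
            f"{len(self.settings)} setting(s), {len(self.errors)} error(s)",
        ]
```

Keeping the data and workflow records inside the result record is one way to let a single query answer "which data, which workflow, which settings" for any output.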

  7. The Bigger Picture • Real-world end users care about doing their research and getting their results. • They don’t care about the grid, certificates or virtual organisations. • They don’t want to learn grid-speak. • They don’t all want to do the same things in the same way. • They expect services that help them to do their work. • They expect reliability and a high level of integration between services.

  8. The neuGRID Provenance Service

  9. The Provenance Architecture • Provenance API • Translator • CRISTAL Core • Provenance DB

  10. Service Wrapper • Provides a web service-based interface to the Provenance Service. • Consists of methods for: creating workflows; creating workflow instances; storing workflow provenance; retrieving workflow provenance.
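
The four operations named on this slide can be illustrated with an in-memory stand-in. This is a sketch only: the real wrapper is a web service whose method names and signatures are not given in the talk, so every name below is an assumption.

```python
import itertools
from typing import Dict, List


class InMemoryProvenanceService:
    """Hypothetical stand-in for the web-service wrapper (not the real API)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._workflows: Dict[int, str] = {}      # workflow id -> definition
        self._instances: Dict[int, int] = {}      # instance id -> workflow id
        self._records: Dict[int, List[dict]] = {}  # instance id -> provenance

    def create_workflow(self, definition: str) -> int:
        workflow_id = next(self._ids)
        self._workflows[workflow_id] = definition
        return workflow_id

    def create_workflow_instance(self, workflow_id: int) -> int:
        if workflow_id not in self._workflows:
            raise KeyError(f"unknown workflow {workflow_id}")
        instance_id = next(self._ids)
        self._instances[instance_id] = workflow_id
        self._records[instance_id] = []
        return instance_id

    def store_workflow_provenance(self, instance_id: int, record: dict) -> None:
        # Copy the record so later changes by the caller do not alter history.
        self._records[instance_id].append(dict(record))

    def retrieve_workflow_provenance(self, instance_id: int) -> List[dict]:
        # Return a copy so callers cannot modify the stored list.
        return list(self._records[instance_id])
```

A caller would create a workflow once, create an instance for each run, and then store and retrieve records against that instance id.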

  11. Translator • To prevent lock-in to a specific workflow format, the Provenance Service includes an adaptor-based translator that converts user workflows into the CRISTAL workflow format. • Acts as a bridge between users and the CRISTAL core. CRISTAL Core • Provenance management is handled internally by CRISTAL. • Workflows need to be translated between the user format and the CRISTAL format.
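
An adaptor-based translator of this kind can be sketched as a registry that maps each user workflow format to a parsing function. The two input formats and the list-of-activity-names target below are invented for illustration; the actual CRISTAL workflow format is not described in the talk.

```python
from typing import Callable, Dict, List

# Assumption: the common target is simply an ordered list of activity names.
CristalWorkflow = List[str]
Adaptor = Callable[[str], CristalWorkflow]


def parse_arrow_format(text: str) -> CristalWorkflow:
    """Hypothetical user format: steps separated by '->'."""
    return [step.strip() for step in text.split("->") if step.strip()]


def parse_line_format(text: str) -> CristalWorkflow:
    """Hypothetical user format: one step per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


class Translator:
    def __init__(self) -> None:
        self._adaptors: Dict[str, Adaptor] = {}

    def register(self, fmt: str, adaptor: Adaptor) -> None:
        # Supporting a new workflow format only requires a new adaptor.
        self._adaptors[fmt] = adaptor

    def to_cristal(self, fmt: str, text: str) -> CristalWorkflow:
        try:
            adaptor = self._adaptors[fmt]
        except KeyError:
            raise ValueError(f"no adaptor registered for format {fmt!r}") from None
        return adaptor(text)
```

Because the core only ever sees the common representation, adding or dropping a user format does not touch the provenance logic.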

  12. CRISTAL was designed to track the development of LHC detector components at CERN.

  13. CRISTAL in neuGRID Overview [Diagram labels: CRISTAL; Workflow steps; Analysis data; Histories; CRISTAL Process & Data Tracking; Analysis Suite; Researcher; Provenance Data; Input Data; LORIS; Derived Data] A Complete Analysis Knowledge Base

  14. CRISTAL Main Functions • Complete capture of system functionality in workflows. • As every action is represented by a workflow activity, every operation is recorded and stored in a replayable way. • Every piece of data, including descriptions, is versioned, so all previous states of items are available. • Several interfaces exist to bridge to other components for database storage, job distribution, definition management, etc.
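
The idea that every operation is recorded as an activity and every earlier state remains available can be shown with a small event-log sketch. This is not CRISTAL's implementation or API; the class and method names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


class VersionedItem:
    """Hypothetical item whose state is derived by replaying its activities."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Each event is (activity name, changes applied by that activity).
        self._events: List[Tuple[str, Dict[str, Any]]] = []

    def record(self, activity: str, changes: Dict[str, Any]) -> int:
        """Append an activity and return the new version number (from 1)."""
        self._events.append((activity, dict(changes)))
        return len(self._events)

    def state_at(self, version: int) -> Dict[str, Any]:
        """Rebuild the state as of a version by replaying events in order."""
        if not 0 <= version <= len(self._events):
            raise ValueError(f"version {version} does not exist")
        state: Dict[str, Any] = {}
        for _, changes in self._events[:version]:
            state.update(changes)
        return state

    def history(self) -> List[str]:
        return [activity for activity, _ in self._events]
```

Since nothing is overwritten, asking for an old version is just a shorter replay of the same log.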

  15. Service Architecture

  16. Further Developments • Composite jobs. If some tasks are clustered together, they should be executed by CRISTAL as a composite activity. • In composite jobs, each sub-job should send feedback to CRISTAL as soon as it completes its execution. • The Glueing Service should hold user-related information to map users to jobs and provenance data. • The Querying Service should query both CRISTAL provenance and LORIS data. • The translation component in the pipeline service should map user workflows to CRISTAL workflows. The translation should be two-way.
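
The composite-job behaviour proposed here, where each sub-job reports back as soon as it finishes rather than when the whole cluster is done, might look like the sketch below. The names are hypothetical, and the `report` callback simply stands in for whatever channel would carry feedback to CRISTAL.

```python
from typing import Callable, Dict

# report(job_name, status, output): hypothetical feedback channel.
Report = Callable[[str, str, str], None]


class CompositeActivity:
    """Runs clustered sub-jobs and reports on each one as it completes."""

    def __init__(self, name: str, sub_jobs: Dict[str, Callable[[], str]],
                 report: Report) -> None:
        self.name = name
        self._sub_jobs = sub_jobs
        self._report = report

    def run(self) -> Dict[str, str]:
        statuses: Dict[str, str] = {}
        for job_name, job in self._sub_jobs.items():
            try:
                output = job()
                status = "done"
            except Exception as exc:
                # A failed sub-job is also provenance, so it is reported too.
                output = str(exc)
                status = "failed"
            # Feedback is sent immediately, before the next sub-job starts.
            self._report(job_name, status, output)
            statuses[job_name] = status
        return statuses
```

This sketch runs sub-jobs sequentially for simplicity; a distributed version would send the same per-job feedback from wherever each sub-job executes.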

  17. Conclusions • A robust provenance system is necessary if users are to have confidence in and use the neuGRID infrastructure for their research. • Provenance is important throughout neuGRID, from data input through to analysis output. • Errors that occur at any stage may affect the final results. • Provenance can be thought of as a chain of evidence that spans: • Data provenance (source, quality control applied and other facets). • Workflow provenance (source, versioning, certification, etc.). • Analysis result provenance (data set, workflow chosen, settings, errors, etc.). • We need CRISTAL, a resource that is both powerful and flexible in the way that it captures provenance data.

  18. Question Time None like this please!!
