Metadata Management in the Taverna Workflow System

Khalid Belhajjame¹, Katy Wolstencroft¹, Franck Tanoh¹, Alan Williams¹, Tom Oinn² and Carole Goble¹. ¹ University of Manchester, Manchester, UK. ² EMBL European Bioinformatics Institute, Hinxton, UK.


Presentation Transcript


  1. Metadata Management in the Taverna Workflow System • Khalid Belhajjame¹, Katy Wolstencroft¹, Franck Tanoh¹, Alan Williams¹, Tom Oinn² and Carole Goble¹ • ¹ University of Manchester, Manchester, UK • ² EMBL European Bioinformatics Institute, Hinxton, UK

  2. Outline • Taverna • Metadata for Describing Workflow Entities • Metadata for Describing Workflow Provenance • Metadata Curation • Applications

  3. Taverna • Taverna allows scientists to access analysis tools hosted on a variety of different platforms in a unified fashion • It makes these tools interoperate through a dataflow model • It interoperates across: • Different scales – local machine, web service, grid • Different implementations – EGEE, Naregi, etc. • It hides the mechanics from the end user where possible

  4. Workflow Model

  5. Dispatch logic • Taverna 2 opens up the per-processor dispatch logic • Each processor contains a single stack of arbitrarily many dispatch layers • Dispatch layers can ignore, pass unmodified, block, modify or act on any message, and can communicate with adjacent layers • Dispatch layer composition allows for complex control flow within a given processor • DispatchLayer is an extensibility point: use it to implement dynamic binding, caching, recursive behaviour…? • Figure labels: Job Queue & Service List; Single Job & Service List; single dispatch layer; DispatchLayer implementation; job specification messages from layer above; fault message; result message; data and error messages from layer below
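The layered dispatch idea on this slide can be illustrated with a small sketch. This is not Taverna's Java API: the class names (`Job`, `Result`, `Retry`, `Cache`, `Invoke`), the `receive_job` method and the toy service `flaky_upper` are all hypothetical, chosen only to show how job messages travel down a stack and result or fault messages travel back up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Job:
    """A job specification message travelling down the stack."""
    inputs: Tuple


@dataclass
class Result:
    """A result or fault message travelling back up the stack."""
    value: object = None
    fault: Optional[str] = None


class DispatchLayer:
    """Hypothetical base layer: passes messages through unmodified."""

    def __init__(self) -> None:
        self.below: Optional["DispatchLayer"] = None

    def receive_job(self, job: Job) -> Result:
        return self.below.receive_job(job)


class Retry(DispatchLayer):
    """Acts on fault messages from below by resending the job."""

    def __init__(self, max_attempts: int) -> None:
        super().__init__()
        self.max_attempts = max_attempts

    def receive_job(self, job: Job) -> Result:
        result = Result(fault="not attempted")
        for _ in range(self.max_attempts):
            result = self.below.receive_job(job)
            if result.fault is None:
                break
        return result


class Cache(DispatchLayer):
    """Blocks jobs whose result is already known."""

    def __init__(self) -> None:
        super().__init__()
        self.store: Dict[Tuple, Result] = {}

    def receive_job(self, job: Job) -> Result:
        if job.inputs in self.store:
            return self.store[job.inputs]
        result = self.below.receive_job(job)
        if result.fault is None:
            self.store[job.inputs] = result
        return result


class Invoke(DispatchLayer):
    """Bottom layer: calls the service and wraps errors as faults."""

    def __init__(self, service: Callable) -> None:
        super().__init__()
        self.service = service

    def receive_job(self, job: Job) -> Result:
        try:
            return Result(value=self.service(*job.inputs))
        except Exception as exc:
            return Result(fault=str(exc))


def build_stack(layers: List[DispatchLayer]) -> DispatchLayer:
    """Link each layer to the one below it and return the top layer."""
    for upper, lower in zip(layers, layers[1:]):
        upper.below = lower
    return layers[0]


calls = {"count": 0}


def flaky_upper(text: str) -> str:
    """Toy service (an assumption) that fails on its first call only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("service unavailable")
    return text.upper()


stack = build_stack([Cache(), Retry(max_attempts=3), Invoke(flaky_upper)])
first = stack.receive_job(Job(inputs=("acgt",)))   # fails once, then retried
second = stack.receive_job(Job(inputs=("acgt",)))  # answered by the cache
```

Because each layer only talks to its neighbour, behaviours such as retry and caching can be reordered or swapped without touching the invocation code, which is the sense in which a dispatch layer acts as an extensibility point.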

  6. Dispatch Stack Model

  7. Dispatch Stack: Example Instance

  8. Data Management

  9. Metadata • “Metadata: data that describe other data to enhance its usefulness” (Les Carr, Wendy Hall, Sean Bechhofer and Carole A. Goble: Conceptual linking: ontology-based open hypermedia. WWW 2001: 334-342) • Metadata in Taverna • Metadata for describing workflow entities • What is the added value of a given workflow? • What task does a given service perform? • Which services can be associated with a processor? • Metadata for describing workflow provenance • How did the execution of a given workflow go? • What is the semantics of a data product? • How many invocations of a given service failed?
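One of the provenance questions above, "How many invocations of a given service failed?", can be sketched as a query over invocation records. The record layout (`Invocation` and its fields) and the service names are assumptions made for illustration; the slides do not specify how Taverna stores provenance.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Invocation:
    """Hypothetical provenance record for one service invocation."""
    workflow_run: str
    processor: str
    service: str
    succeeded: bool


def failed_invocations(records: Iterable[Invocation]) -> Counter:
    """Count failed invocations per service."""
    return Counter(r.service for r in records if not r.succeeded)


# Illustrative records only; not real provenance data.
records = [
    Invocation("run-1", "search", "blast", True),
    Invocation("run-1", "annotate", "go_lookup", True),
    Invocation("run-2", "search", "blast", False),
    Invocation("run-3", "search", "blast", False),
]

failures = failed_invocations(records)
```

A `Counter` returns zero for services with no recorded failure, so `failures["go_lookup"]` can be read without a prior membership check.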

  10. Metadata for Describing Workflow Entities (cont.)

  11. Figure labels: Workflow Instance; Process/Workflow; Data products

  12. Process • Extend the provenance model to capture the internal behaviour of processors’ enactments

  13. Data products • There are four kinds of data entities that a workflow processor can take as input or produce as output: literals, data documents (each composed of a collection of reference schemes), error documents, and lists of data entities.
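The four kinds of data entity named on this slide can be modelled as a small recursive type. This is a sketch under assumptions: the class and field names are hypothetical, and the helpers `depth` and `contains_error` are added only to show why treating lists as entities is convenient.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Literal:
    """A value carried directly, e.g. a string or a number."""
    value: Union[str, int, float, bool]


@dataclass
class ReferenceScheme:
    """One way of locating a piece of data, e.g. a URL or a file path."""
    kind: str
    location: str


@dataclass
class DataDocument:
    """A data item described by a collection of reference schemes."""
    schemes: List[ReferenceScheme]


@dataclass
class ErrorDocument:
    """Stands in for a data item that could not be produced."""
    message: str


@dataclass
class EntityList:
    """A list whose items are themselves data entities."""
    items: List["Entity"] = field(default_factory=list)


Entity = Union[Literal, DataDocument, ErrorDocument, EntityList]


def depth(entity: Entity) -> int:
    """Nesting depth: 0 for a single entity, 1 for a flat list, and so on."""
    if isinstance(entity, EntityList):
        return 1 + max((depth(item) for item in entity.items), default=0)
    return 0


def contains_error(entity: Entity) -> bool:
    """True if an error document appears anywhere inside the entity."""
    if isinstance(entity, ErrorDocument):
        return True
    if isinstance(entity, EntityList):
        return any(contains_error(item) for item in entity.items)
    return False


# Illustrative output of a processor; the URL is a placeholder.
output = EntityList([
    Literal("P12345"),
    EntityList([
        DataDocument([ReferenceScheme("url", "http://example.org/seq.fasta")]),
        ErrorDocument("service timed out"),
    ]),
])
```

Here `depth(output)` is 2 and `contains_error(output)` is true, since the error document sits inside the nested list.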

  14. Data products (cont.)

  15. Metadata Curation

  16. Applications • Workflow artefacts • Guiding the design of workflows • Detecting and resolving mismatches • Workflow discovery • Abstracting workflow specifications • Provenance • Semantics-based querying and browsing of workflow executions • Smart storage

  17. Conclusions • Taverna • New extensions • Metadata • For describing workflow entities • For describing provenance • Applications • Example applications that will benefit from collected metadata

  18. Mismatch Detection in Workflows Automatic detection of mismatches and support for retrieving the mapping appropriate for their correction
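A minimal sketch of how mismatch detection could work when ports carry semantic annotations: a link is flagged when the concept produced by the source port is not subsumed by the concept the sink port expects, and a registry is consulted for a mapping that could repair it. The toy ontology, the port names and the mapping name are all hypothetical; the slides do not describe the actual algorithm.

```python
from typing import Dict, List, Optional, Tuple

# Toy ontology (an assumption): child concept -> parent concept.
PARENT: Dict[str, str] = {
    "ProteinSequence": "Sequence",
    "DNASequence": "Sequence",
    "UniprotAccession": "Accession",
}

# Hypothetical registry of mappings keyed by (produced, expected) concept.
MAPPINGS: Dict[Tuple[str, str], str] = {
    ("UniprotAccession", "ProteinSequence"): "fetch_sequence_by_accession",
}

# (source port, produced concept, sink port, expected concept)
Link = Tuple[str, str, str, str]


def is_subsumed_by(concept: str, ancestor: str) -> bool:
    """True if concept equals ancestor or is one of its descendants."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


def detect_mismatches(links: List[Link]) -> List[Tuple[str, str, Optional[str]]]:
    """Return (source, sink, suggested mapping or None) per mismatched link."""
    report = []
    for source, produced, sink, expected in links:
        if not is_subsumed_by(produced, expected):
            report.append((source, sink, MAPPINGS.get((produced, expected))))
    return report


links: List[Link] = [
    ("identify.accession", "UniprotAccession", "blast.query", "ProteinSequence"),
    ("blast.hits", "ProteinSequence", "align.input", "Sequence"),
    ("align.out", "DNASequence", "go.query", "UniprotAccession"),
]

mismatches = detect_mismatches(links)
```

The second link is accepted because a protein sequence is a kind of sequence; the first is flagged with a suggested mapping, and the third is flagged with none available.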

  19. myExperiment.org • A community social network • A market place • A gateway to other publishing environments • A platform for launching workflows • Encapsulated myExperiment Objects • Mindful publication

  20. Abstracting Workflow Specifications

  21. Ontology Metadata for Describing Workflow Entities • Workflow entities • Workflow/subworkflow specifications • Processors that compose workflows • Services that perform the tasks • Encoding descriptions • Free-text descriptions or keywords (tags) • Semantic concepts • Figure example: free text “This workflow augments protein identification results with GO terms”; concepts: Protein identification, Homology search, Gene ontology term
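The two ways of encoding descriptions (free text or tags, and semantic concepts) can be sketched as an annotation record attached to each workflow entity. The class names, the entity names and the tags are assumptions; the free-text description and the three concepts are taken from the slide's example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Annotation:
    """Hypothetical metadata record for a workflow entity."""
    description: str = ""
    tags: Set[str] = field(default_factory=set)
    concepts: Set[str] = field(default_factory=set)


@dataclass
class WorkflowEntity:
    name: str
    kind: str  # "workflow", "processor" or "service"
    annotation: Annotation


def find_by_concept(entities: List[WorkflowEntity], concept: str) -> List[str]:
    """Names of entities annotated with the given semantic concept."""
    return [e.name for e in entities if concept in e.annotation.concepts]


entities = [
    WorkflowEntity(
        name="annotate_with_go",  # hypothetical workflow name
        kind="workflow",
        annotation=Annotation(
            description="This workflow augments protein identification "
                        "results with GO terms",
            tags={"proteomics"},
            concepts={"Protein identification", "Homology search",
                      "Gene ontology term"},
        ),
    ),
    WorkflowEntity(
        name="blast",  # hypothetical service name
        kind="service",
        annotation=Annotation(tags={"alignment"},
                              concepts={"Homology search"}),
    ),
]

homology_entities = find_by_concept(entities, "Homology search")
```

Searching by concept rather than by free text is what makes discovery independent of how each author happened to word a description.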