Presentation Transcript


    Slide 1: Paul Fisher, University of Manchester

    An Introduction to Web Services and Scientific Workflows

    Slide 2: Overview

    Current analysis techniques
    Issues with manual analyses
    Web Services
    Workflows as scientific protocols
    Workflow sharing, re-use, and repurposing in myExperiment
    Service discovery with Feta and BioCatalogue
    Later – practical session for hands-on work

    Slide 3: Manual analysis techniques

    Nucleic Acids Research (2009) – over 1170 databases
    Specialist software applications
    Navigating between software resources
    Cut and paste of data
    Screen scraping of web pages
    Scripting in Perl / Java / Python / C++ [insert another language here]
    Well over 200 from there. 139 different databases for biopathways alone. Lincoln Stein describes it as a "bio-nation" – like Italy in the 19th century.

    Slide 4: Manual Methods of data analysis

    Many web resources. There are also issues with the many and varied resources used to identify and characterise the candidate genes. The web is a massive resource of information, with access to databases and applications, and many resources have to be used in the characterisation of these genes, each with its own user interface and data. Data from one resource is regularly cut and pasted into another web interface for processing, and a typical search can encompass 20 resources. For all 200+ genes this means a lot of wasted time and effort, and human error can introduce many mistakes. Hyperlinks are also regularly used in bioinformatics, through what is called 'link integration': researchers can quickly progress through multiple web pages without recording what they have done or where they have been. This leads to the problem of recording experimental provenance.

    Slide 5: Issues in analysis techniques

    Slide 6: Manual Methods of data analysis

    Navigating through hyperlinks
    No explicit methods
    Human error
    Tedious and repetitive
    (Speaker notes as for Slide 4.)

    Slide 7: Implicit methods

    Implicit methodologies. One of the major problems with current bioinformatics investigations is the lack of recording of experimental methods. These include the software applications used, the parameters used and, as mentioned before, the use of hyperlinks in web pages. This has been a big issue not just in bioinformatics but also in biology and, as we can see here, Nature is fed up with researchers not publishing their methods. By not stating how the experiments were run, they cannot be replicated.

    Slide 8:Huge amounts of data

    200+ Genes Region on chromosome Microarray 1000+ Genes How do I look at ALL the genes systematically? Vast amounts of data We now have technologies to help with identification of candidate genes But, there is still a problem with the amount of data generated by these high-throughput methods QTL based investigation can easily produce over 200 genes per chromosome region, if there are 5 regions then there are a lot of genes to look at Microarray gene expression studies map entire genomes to the chip (mouse 23,000 genes) Those genes that show a change in their expression under the studied state are chosen These can be in the 1000’s Numbers of genes quickly overwhelms researchers How do researches look at these candidate genes then? Vast amounts of data We now have technologies to help with identification of candidate genes But, there is still a problem with the amount of data generated by these high-throughput methods QTL based investigation can easily produce over 200 genes per chromosome region, if there are 5 regions then there are a lot of genes to look at Microarray gene expression studies map entire genomes to the chip (mouse 23,000 genes) Those genes that show a change in their expression under the studied state are chosen These can be in the 1000’s Numbers of genes quickly overwhelms researchers How do researches look at these candidate genes then?

    Slide 9: Hypothesis-Driven Analyses

    200 genes. Pick the genes involved in immunological processes: 40 genes. Pick the genes that I am most familiar with: 2 genes. A biased view – 'cherry-picked' genes.
    Hypothesis-based approach. Researchers follow a hypothesis-based approach, which typically involves filtering or triaging the genes based on a predefined hypothesis. We found that they tended to filter the gene lists because they were focussed on their "favourite" genes – immunologists picked the immunology genes. In fact this triaging can lead to expert-driven hypothesis generation, and some of the crucial data can be missed because it is not known that processes other than those hypothesised are involved in a major way. In the trypanosomiasis use case (discussed later), researchers assumed it had to do with the immune response; it is now known that other processes are involved that would not have been picked up using this method, including a cholesterol response.

    Issues with current approaches:
    Scale of analysis task overwhelms researchers – lots of data
    User bias and premature filtering of datasets – cherry picking
    Hypothesis-driven approach to data analysis
    Constant changes in data – problems with re-analysis of data
    Implicit methodologies (hyper-linking through web pages)
    Error proliferation from any of the listed issues – notably human error
    Solution: Automate

    Slide 10: Issues with current approaches

    However, there are issues with the current approaches: the sheer scale of data makes it overwhelming for researchers to re-run experiments or to use full datasets. Bias is introduced from two sources: premature filtering of datasets, and bias introduced by field experts. Constant change in data means constant re-runs of experiments to check for new and updated information. Methodologies are implicit, due to the number of resources used in such investigations. Because of these issues, omissions and slips are regularly encountered, especially from human error. Automation is a solution – using web services and workflows; myGrid and Taverna are the software options chosen.

    Slide 11: Automate using the Two W's

    Web Services – technology and standards for exposing code and data resources by a means that can be consumed by a third party remotely. Describes how to interact with the service, e.g. its parameters.
    Workflows – a general technique for describing and executing a process. Describes what you want to do, including the services to use.
    The Two W's. Web services are basically computer programs that have been exposed to the internet so that they can be executed from any location. There are many ways of exposing these programs, and many ways of providing access to them through XML-based schemas – we are not going to go into these here. Once exposed, web services can be connected together, in programming languages like Java, to perform multiple tasks. This connection of web services is what we term workflows. Workflows are the explicit declaration of a task, including what is to be achieved and the means of achieving it (in the context of the services used). Workflows also capture what resources were used, what programs were run with what parameters, and how the data is analysed.

    Slide 12: Web Services

    [Diagram: client applications exchanging HTTP requests and responses with a remote application via SOAP, with the service interface described by WSDL]

    Slide 13: Web Service Description Language

    The Web Service Description Language (WSDL) is used to provide a computer program with enough information on how to execute, or provide data to, a remote resource. It is an XML-based language and can be used from most mainstream programming languages, including Perl, C++ and Java. It tells external programs how to call a remote service – it exposes the function calls.
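    For illustration only (not from the original slides): a minimal sketch of consuming a WSDL-described SOAP service from Python using the zeep library. The WSDL URL and the runBlast operation are placeholders, not a real service.

```python
# Minimal sketch: consuming a WSDL-described SOAP service from Python.
# Assumes the zeep package; the WSDL URL and the runBlast operation are
# placeholders, not a real service.
from zeep import Client

# zeep reads the WSDL and generates a callable proxy for each operation.
client = Client("http://example.org/services/blast?wsdl")

# Dump the operations and types the WSDL describes (handy for exploring a service).
client.wsdl.dump()

# Call an operation with the parameters the WSDL declares (hypothetical here).
result = client.service.runBlast(program="blastn",
                                 database="em_rel",
                                 sequence="ACGTACGT")
print(result)
```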

    Slide 14: Programmatic Interfaces to Services (Web Services, not Web Sites)

    [Diagram: your script, workflow, or application finding services (SeqFetch, BLAT, BLAST, GO) through a service registry and their interface description documents (WSDL, WADL)]
    European Bioinformatics Institute: API submissions rose to 3,166,901 in 2007. A technology and standard for exposing code, a database, or any kind of application with an API so that it can be consumed by a third party remotely, together with a description of how to interact with it. Not just Web Services described by SOAP – Java applications, REST interfaces, Perl scripts and other code can be exposed too. Services can easily be chained together, each has a description interface, and their number is increasing; many are REST over HTTP, local or in the cloud – and they can be very scruffy indeed.
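    As an illustrative sketch (not a real endpoint), calling a REST-style sequence-fetch service over HTTP from a script replaces the cut-and-paste through a web page; the URL and parameters below are placeholders.

```python
# Minimal sketch: calling a REST-style service over HTTP from a script,
# instead of cutting and pasting through a web page.
# The URL and parameters are illustrative placeholders, not a real endpoint.
import requests

params = {"db": "uniprot", "id": "P12345", "format": "fasta"}
response = requests.get("https://example.org/dbfetch", params=params, timeout=30)
response.raise_for_status()

fasta = response.text          # the fetched sequence record
print(fasta.splitlines()[0])   # e.g. the FASTA header line
```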

    Slide 15: What types of service?

    WSDL Web Services
    BioMart
    R-processor
    BioMoby
    Soaplab
    Local Java services
    Beanshell
    Workflows

    Slide 16: Workflows

    A collection of tasks chained together to perform one overall operation – e.g. the 'morning ritual' workflow: get up, have a wash, get dressed, eat breakfast, clean teeth, go to lectures.
    A high-level description of your experiment: inputs, programs, outputs (and intermediate inputs and outputs). The workflow is the model of the experiment – the methods section in your publication.

    Slide 18: What is a Workflow?

    Workflows provide a general technique for describing and enacting a process. A workflow describes what you want to do and how you want to do it, and specifies how bioinformatics processes fit together, with the processes represented as web services – for example: remove repeats, find genes, find orthologues (see the sketch below).
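    A minimal sketch (not from the slides) of the remove-repeats / find-genes / find-orthologues example as chained steps: each function stands in for a remote service call, and the workflow is simply the explicit wiring of the steps.

```python
# Minimal sketch of a workflow as chained service calls: each step's output
# feeds the next step's input. The function bodies are placeholders standing
# in for real web-service calls (e.g. repeat masking, gene prediction, BLAST).
from typing import List, Dict

def remove_repeats(genomic_sequence: str) -> str:
    """Mask repetitive regions (placeholder for a repeat-masking service)."""
    return genomic_sequence  # a real step would call a remote service here

def find_genes(masked_sequence: str) -> List[str]:
    """Predict genes in the masked sequence (placeholder for a gene finder)."""
    return ["gene_1", "gene_2"]

def find_orthologues(genes: List[str]) -> Dict[str, List[str]]:
    """Look up orthologues for each predicted gene (placeholder service call)."""
    return {gene: ["orthologue_of_" + gene] for gene in genes}

def workflow(genomic_sequence: str) -> Dict[str, List[str]]:
    # The workflow itself is just the explicit wiring of the steps -
    # it records what was done, with what inputs, and in what order.
    masked = remove_repeats(genomic_sequence)
    genes = find_genes(masked)
    return find_orthologues(genes)

print(workflow("ACGTACGT..."))
```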

    Taverna

    Slide 20: The Taverna Workflow Workbench

    Slide 21: What is Taverna?

    "Taverna enables the interoperation between databases and tools by providing a toolkit for composing, executing and managing workflow experiments." – Someone (sometime)
    OR "Allows you to build and run workflows." – Paul Fisher (2009)
    Access to local and remote resources and analysis tools, automation of data flow between services, iteration over large data sets... and so on. http://www.mygrid.org.uk/

    Taverna Workflow Workbench

    Slide 23: Who uses Taverna?

    Over 60,000 downloads. Systems biology, medical image analysis, heart simulations, high-throughput screening, genotype/phenotype studies, health informatics, astronomy, chemoinformatics. "NOT FOR BIOLOGISTS!!!!" – Prof. Andy Brass; designed for informaticians – computer-savvy people.
    Commercial use by Apple Corp and BioTeam; users across the USA and Europe; EU EMBRACE NoE; Soaplab services supported by the European Bioinformatics Institute. Service providers: BioMOBY, BluePrint, EMBOSS, BioMART... Middleware developers: Semantic Grid technologies, workflow, SIMDAT, EGEE, GridSphere, SCEC. Life science users and tool developers: Virginia Bioinformatics Institute, SDSC, UC Davis, USC, VL-e, Purdue, caBIG, EMBRACE. Commercial: Lexicon Genetics, Apple, BioTeam. High profile – three citations in the Science May 2004 Distributed Computing issue; international invitations.

    Slide 24: What do Scientists use Taverna for?

    Data gathering and annotating – distributed data and knowledge
    Building models and knowledge management – populating SBML or hypothesis generation
    Data analysis – distributed analysis tools and high throughput

    Slide 25: Data Gathering

    Collecting evidence from lots of places Accessing local and remote databases, extracting info and displaying a unified view to the user Lots of outputs!!

    Slide 26: Annotation Pipelines

    Genome annotation pipelines: the workflow assembles evidence for predicted genes / potential functions, and a human expert can 'review' this evidence before submission to the genome database.
    Data warehouse pipelines: e-Fungi – model organism warehouse; ISPIDER – proteomics warehouse.
    Annotating the up/down-regulated genes in a microarray experiment.

    Slide 27: Systems Biology Model Construction

    Automatic reconstruction of genome-scale yeast metabolism from distributed data in the life sciences, to create and manipulate Systems Biology Markup Language (SBML) models. Want to change something? Just do it!

    Integration of microarray data with SBML: read enzyme names from the SBML model; query maxdLoad2 using the enzyme names; calculate colours based on gene expression level; create a new SBML model with the newly coloured nodes.
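    A minimal sketch (under stated assumptions) of the first and last steps of such a pipeline using the python-libsbml bindings; the expression lookup is stubbed out because the maxdLoad2 access details are not given in the slides, and the colour rule is purely illustrative.

```python
# Minimal sketch: read enzyme (species) names from an SBML model, attach a
# colour derived from an expression value, and write a new model out.
# Assumes the python-libsbml package; the expression lookup is a stub standing
# in for the maxdLoad2 query described on the slide.
import libsbml

def expression_level(name: str) -> float:
    """Placeholder for querying an expression database by enzyme name."""
    return 1.0

def colour_for(level: float) -> str:
    """Map an expression level to a display colour (illustrative rule)."""
    return "#ff0000" if level > 1.5 else "#00ff00"

document = libsbml.readSBMLFromFile("yeast_model.xml")
model = document.getModel()

for i in range(model.getNumSpecies()):
    species = model.getSpecies(i)
    colour = colour_for(expression_level(species.getName()))
    # Record the colour as an XHTML note on the species.
    species.appendNotes("<body xmlns='http://www.w3.org/1999/xhtml'>"
                        f"<p>colour: {colour}</p></body>")

libsbml.writeSBMLToFile(document, "yeast_model_coloured.xml")
```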

    Slide 29: Data Analysis

    Access to local and remote analysis tools. You start with your own data or public data of interest, and you need to analyse it to extract biological knowledge.

    http://www.genomics.liv.ac.uk/tryps/trypsindex.html

    Slide 30: Trypanosomiasis in Africa

    Andy Brass, Steve Kemp + many others; Wellcome Trust.
    In order to test our hypotheses we need some use-case data, which is where the Wellcome Trust-funded trypanosomiasis project comes in. This is a project set up as a collaboration between a number of universities and research centres, including Manchester, Liverpool, Edinburgh, and the International Livestock Research Institute in Nairobi. So far we have evaluated the workflows with a use case: investigating African trypanosomiasis in the mouse model. Genetic mapping in the lab had identified 3 QTL regions which contribute to the resistance of these mice to trypanosomiasis, and we have so far run our workflows over 1 of these QTL regions. We also obtained extremely good quality microarray data from the same project, looking into the early response to infection in the mice.

    A Systematic Strategy for Large-Scale Analysis of Genotype-Phenotype Correlations: Identification of Candidate Genes Involved in African Trypanosomiasis. Fisher et al. (2007), Nucleic Acids Research, doi:10.1093/nar/gkm623.
    Explicitly discusses the methods we used for the trypanosomiasis use case, discusses the results for Daxx and shows the mutation, and covers sharing of workflows for re-use and re-purposing.

    Slide 31: Shameless Plug

    This is the paper we recently had published in NAR. It describes all of the issues found from carrying out a systematic analysis of QTL and microarray data, the results we obtained from the trypanosomiasis project, and links to the workflows themselves, held in the myGrid repository (and in the new myExperiment social networking site for scientists).

    Slide 32: Trichuris muris

    Mouse whipworm infection – a parasite model of the human parasite Trichuris trichiura.
    Understanding phenotype: comparing resistant vs. susceptible strains – microarrays.
    Understanding genotype: mapping quantitative traits – classical genetics, QTL (regions of chromosome).
    The 'worm lady' is Joanne Pennock, who as far as I know works for Prof. Richard K. Grencis. Trichuris muris, the mouse whipworm, is a useful parasite model of the human parasite Trichuris trichiura. Whipworms derive their name from their characteristic morphology; adults occupy the large intestine with their anterior ends embedded in the cells lining the intestine, and transmission occurs by ingestion of contaminated material. Jo didn't know about the tools; she didn't know how to do it properly. REUSE: identified sex-dependent biological pathways involved in the mouse model. The correlation of sex dependence with the ability of mice to expel the parasite had previously been hypothesised, but had not been verified using conventional manual analysis techniques.

    Slide 33: Recycling, Reuse, Repurposing

    Here's the science! Identified a candidate gene (Daxx) for trypanosomiasis resistance. Manual analysis of the microarray and QTL data had failed to identify this gene as a candidate. Unbiased analysis, confirmed by the wet lab.
    Here's the e-Science! The trypanosomiasis mouse workflow was reused without change for Trichuris muris infection in mice, and identified biological pathways involved in sex dependence – a previous two-year manual study of candidate genes had failed to do this. The workflows are now being run over colitis / inflammatory bowel disease in mice (without change).
    Recycling, reuse and repurposing. From using workflows we were able to improve the way people perform the science on combined QTL and microarray analyses – that is the science part. The e-Science part is the building, sharing, reusing and repurposing of the workflows. To illustrate this, we subsequently ran some additional use-case data through exactly the same workflow built for sleeping sickness: Trichuris muris infection in mice, an intestinal parasite that causes inflammation in the intestine and other symptoms, and also a good model for looking into colitis and inflammatory bowel disease. This analysis involved taking 5 QTL regions, which again took roughly 45 minutes to run. The important point here is that the same workflow was re-run over separate data without the workflow ever needing to be changed – a major benefit to other scientists who are interested in QTL or microarray analyses and who work on the mouse model. Workflows can also be built by changing a single service (BioMart) to handle different organisms (e.g. we also built one for cow and human using the same pathway-driven approach). A single service change gives an explicit and systematic method for QTL and microarray data analysis for any species held in BioMart.

    Was the Workflow Approach Successful?

    Scale of analysis task overwhelms researchers – lots of data: handled by computers.
    User bias and premature filtering of datasets – cherry picking: all data processed systematically.
    Hypothesis-driven approach to data analysis: computers know nothing of hypotheses and so process the data independent of any prior judgments.
    Constant changes in data – problems with re-analysis of data: the saved workflow can be re-run at any point, over new data sets.
    Implicit methodologies (hyper-linking through web pages): the methodology has been captured in the workflow itself.

    Slide 34: Issues with current approaches

    (Speaker notes as for Slide 10.)

    Slide 35: Social Networking for Scientists

    Slide 36: Recycling, Reuse, Repurposing

    http://www.myexperiment.org/ – share, search, re-use, re-purpose, execute, communicate, record.
    myExperiment. Now, this is where the selfish scientist in me comes out: I want to get recognition for my work and any project I am a part of, and I want people to use my methods and workflows. So, in order to do that, I need to share them with the scientific community. To do this I can use myExperiment (and paper publication). This is a new project that enables scientists to share, re-use, edit and run workflows, and also to communicate with one another – it is a social networking site for scientists.

    Slide 37: Sharing Experiments

    myGrid supports the in silico experimental process for individual scientists. How do you share your results / experiments / experiences with your research group, collaborators, and the scientific community? How do you compare your results with others produced by e.g. Kepler / Triana?

    Slide 39: Just Enough Sharing...

    myExperiment can provide a central location for workflows from one community/group. myExperiment allows you to say:
    Who can look at your workflow
    Who can download your workflow
    Who can modify your workflow
    Who can run your workflow

    Remote Execution of Workflows

    Slide 41: Service Discovery

    Slide 42: Finding Services

    There are over 3500 distributed services – how do we find an appropriate one? We need to annotate services by their functions (and not their names!). The services might be distributed, but a registry of service descriptions can be central and queried, annotated with terms from the myGrid ontology. Questions we can ask: find me all the services that perform a multiple sequence alignment and accept protein sequences in FASTA format as input. It is the same problem as finding the web resources.
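    A minimal sketch (not the Feta or BioCatalogue API) of the idea: annotated service descriptions in a central registry can be filtered by what a service does and what it accepts, rather than by its name. The records and field names are illustrative.

```python
# Minimal sketch of querying a central registry of annotated service
# descriptions by what a service *does* rather than what it is called.
# The records and field names are illustrative, not the Feta/BioCatalogue schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ServiceDescription:
    name: str
    performs_task: str                # term from a task vocabulary/ontology
    input_formats: List[str] = field(default_factory=list)

registry = [
    ServiceDescription("clustalw_ddbj", "multiple_sequence_alignment", ["fasta_protein"]),
    ServiceDescription("blastn_ebi", "pairwise_alignment", ["fasta_nucleotide"]),
    ServiceDescription("muscle_local", "multiple_sequence_alignment", ["fasta_protein"]),
]

def find_services(task: str, input_format: str) -> List[ServiceDescription]:
    """Find services by their annotated function, not by their name."""
    return [s for s in registry
            if s.performs_task == task and input_format in s.input_formats]

# "Find me all the services that perform a multiple sequence alignment
#  and accept protein sequences in FASTA format as input."
for service in find_services("multiple_sequence_alignment", "fasta_protein"):
    print(service.name)
```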

    Slide 43: myGrid Ontology

    Logically separated into two parts:
    Service ontology – physical and operational features of web services
    Domain ontology – vocabulary for core bioinformatics data, data types and their relationships
    The ontology is developed in OWL. (Mention the OWL / RDFS split here.)

    Slide 44: myGrid ontology

    Example: BLAST (from the DDBJ)
    Performs task: Alignment
    Uses method: Similarity Search Algorithm
    Uses resources: DNA/protein sequence databases
    Inputs: biological sequence, database name, BLAST program
    Outputs: BLAST report

    Slide 45: Feta Search Result

    [Diagram: curation lifecycle – descriptions seeded by developers, then refined and validated through expert curation, community curation, and automated curation]

    Slide 46: BioCatalogue – joint Manchester-EBI

    Slide 47: Summary

    Taverna workflows: combine local and remote resources and analysis tools, automate multi-step processes, and iterate over large data sets.
    myExperiment: provides reusable protocols for in silico science, enables sharing of workflows and expertise, and provides an alternative way of running workflows – not everyone who runs workflows wants to build workflows or see workflows running.

    Slide 48: Acknowledgements

    Carole Goble, Norman Paton, Robert Stevens, Anil Wipat, David De Roure, Steve Pettifer
    OMII-UK: Tom Oinn, Katy Wolstencroft, Daniele Turi, June Finch, Stuart Owen, David Withers, Stian Soiland, Franck Tanoh, Matthew Gamble, Alan Williams, Ian Dunlop
    Research: Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, Antoon Goderis, Alastair Hampshire, Qiuwei Yu, Wang Kaixuan
    Current contributors: Matthew Pocock, James Marsh, Khalid Belhajjame, PsyGrid project, Bergen people, EMBRACE people
    User advocates and their bosses: Simon Pearce, Claire Jennings, Hannah Tipney, May Tassabehji, Andy Brass, Paul Fisher, Peter Li, Simon Hubbard, Tracy Craddock, Doug Kell, Marco Roos, Matthew Pocock, Mark Wilkinson
    Past contributors: Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizauskas, Kevin Glover, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Juri Papay, Savas Parastatidis, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Martin Senger, Nick Sharman, Victor Tan, Paul Watson, and Chris Wroe
    Industrial: Dennis Quan, Sean Martin, Michael Niemi (IBM), Chimatica
    Funding: EPSRC, Wellcome Trust
    http://www.mygrid.org.uk
    http://www.myexperiment.org
