Scientific Workflows in e-Science

Dr Zhiming Zhao

([email protected])

System and Network Engineering,

University of Amsterdam

Virtual Laboratory for e-Science


Outline

  • Background

  • Scientific workflow management system

  • Virtual Laboratory for e-Science

  • Our approach

  • Challenges and research lines

  • Activities

Problem solving: a typical scenario in scientific research

[Figure: diagram; surviving labels: Define problems, Data analysis]

  • Analysis

  • Hypothesis

  • Related work

  • Propose experiments

  • Define steps

  • Prototype computing systems

  • Perform experiments

  • Data collection

  • Presentation

  • Dissemination

  • Visualization

  • Validation

  • Adjust experiment

  • Refine hypothesis

  • Activities:

    • Are iterative, dynamic, and human-centered

    • Require different levels of resources

Example scenarios

  • In problem analysis

    • Identify domains, search key problems, find typical methods, and review related work

  • In scientific experiments: scientific computing & data processing

    • Define dependencies between computing and data processing tasks, and schedule their runtime behavior

  • In data analysis

    • Visualize and compare the results of different parameters, keep meaningful configurations, and continue experiments

    • Search related work, compare results

  • In dissemination

    • Document experiments, present results, cite, and publish
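
The scientific-experiments scenario above (define dependencies between tasks, then schedule them) can be illustrated with a small sketch. It assumes Python 3.9+ for the standard `graphlib` module; the task names are invented for illustration and do not come from the slides.

```python
from graphlib import TopologicalSorter

# Hypothetical experiment: each task maps to the set of tasks it depends on.
dependencies = {
    "collect_data": set(),
    "preprocess": {"collect_data"},
    "simulate": {"preprocess"},
    "analyze": {"preprocess"},
    "visualize": {"simulate", "analyze"},
}

def schedule(deps):
    """Return batches of tasks; every task in a batch has all dependencies met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make output stable
        batches.append(ready)
        sorter.done(*ready)
    return batches

if __name__ == "__main__":
    for step, batch in enumerate(schedule(dependencies), start=1):
        print(f"step {step}: {', '.join(batch)}")
```

Tasks that land in the same batch do not depend on each other, so a scheduler could run them in parallel on different resources.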

Computer support for problem solving

[Figure: diagram; surviving labels: "data sharing &", "Remote resource"]

  • Problem Solving Environment (PSE) (E. Gallopoulos et al., IEEE Computational Science & Engineering, 1994):

    • Organizes different software components/tools

    • Allows a user to assemble these tools at a high level of abstraction

    • Controls the runtime behavior of experiments

    • Examples: MATLAB, Ptolemy, etc.

Scientific workflow management systems: a new guise of PSE!

Traditional PSE: organizes and executes resources locally!

Inside a Scientific Workflow Management System

In our view, a SWMS at least implements:

  • A model for describing workflows;

  • An engine for executing/managing workflows;

  • Different levels of support for a user to compose, execute and control a workflow.
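
As a rough illustration of these three ingredients, the sketch below pairs a tiny workflow model with an engine and a user-level control hook. All class and function names (`Task`, `Workflow`, `Engine`, `approve`) are hypothetical and chosen for this example; it does not reflect how any particular SWMS is implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Task:
    name: str
    run: Callable[[Dict[str, object]], object]  # receives results of earlier tasks
    needs: List[str] = field(default_factory=list)

@dataclass
class Workflow:
    """Model: a collection of named tasks with dependencies."""
    tasks: Dict[str, Task] = field(default_factory=dict)

    def add(self, name, run, needs=()):
        self.tasks[name] = Task(name, run, list(needs))
        return self

class Engine:
    """Engine: runs a task once every task it needs has finished."""

    def __init__(self, approve=lambda task_name: True):
        # User-level control: `approve` lets a person (or a script) skip a task.
        self.approve = approve

    def execute(self, workflow):
        results, skipped = {}, set()
        pending = dict(workflow.tasks)
        while pending:
            progressed = False
            for name, task in list(pending.items()):
                if any(dep in skipped for dep in task.needs):
                    skipped.add(name)  # an upstream task was skipped
                elif all(dep in results for dep in task.needs):
                    if self.approve(name):
                        results[name] = task.run(results)
                    else:
                        skipped.add(name)
                else:
                    continue  # dependencies not finished yet
                del pending[name]
                progressed = True
            if not progressed:
                raise ValueError("cyclic or missing dependency: " + ", ".join(pending))
        return results

wf = (Workflow()
      .add("load", lambda r: [1, 2, 3])
      .add("square", lambda r: [x * x for x in r["load"]], needs=["load"])
      .add("total", lambda r: sum(r["square"]), needs=["square"]))

if __name__ == "__main__":
    print(Engine().execute(wf))
```

Passing a different `approve` callback, for example one that asks the user before each step, changes what runs without touching the workflow description, which is the separation between model, engine, and user support that the list describes.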

[Figure: diagram; surviving labels: Workflow (based on a certain model), User support, Engine-level control, Resource-level control]

Scientific Workflows in e-Science

Experiment processes

Workflows for administration, e.g., AAA (authentication, authorization, and accounting), and other issues.

Workflows vary in:

  • Phases of experiments: design, runtime control, dissemination;

  • Abstractions of resources: concrete and abstract;

  • Levels of activity detail: computing, data access, search/matching, human activities.

Abstract workflows

Executable (concrete) workflows
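
One way to picture the step from an abstract to an executable workflow is a binding pass that assigns each abstract task to a concrete resource. The sketch below is only an illustration under assumptions: the task names, host names, and the "most free slots" selection rule are all invented here and are not taken from the slides or from any specific system.

```python
# Abstract workflow: what to do, without naming resources.
abstract_workflow = [
    {"task": "align_sequences", "requires": "cpu"},
    {"task": "render_volume", "requires": "gpu"},
]

# Hypothetical resource catalogue; host names are placeholders.
resources = [
    {"host": "node-a.example.org", "provides": "cpu", "free_slots": 0},
    {"host": "node-b.example.org", "provides": "cpu", "free_slots": 4},
    {"host": "node-c.example.org", "provides": "gpu", "free_slots": 1},
]

def make_concrete(abstract, catalogue):
    """Bind every abstract task to a host that offers the required capability."""
    slots = {r["host"]: r["free_slots"] for r in catalogue}
    concrete = []
    for step in abstract:
        candidates = [
            r["host"] for r in catalogue
            if r["provides"] == step["requires"] and slots[r["host"]] > 0
        ]
        if not candidates:
            raise LookupError(f"no free resource for {step['task']}")
        host = max(candidates, key=lambda h: slots[h])  # host with most free slots
        slots[host] -= 1
        concrete.append({"task": step["task"], "host": host})
    return concrete

if __name__ == "__main__":
    for binding in make_concrete(abstract_workflow, resources):
        print(binding)
```

The abstract description stays valid when the catalogue changes; only the binding step needs to be repeated.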

Diversity in SWMS

  • Taverna:

    • Web services based language: Scufl

    • Engine: FreeFluo

    • Graphical visualization of workflows

  • Triana:

    • Components

    • Task graph

    • Data/control flow

  • Kepler:

    • Actor, director

    • MoML

    • Execution models

  • Pegasus:

    • Based on DAGMan

    • VDL

    • DAG

  • DAGMan:

    • Computing tasks

    • DAG

Virtual Laboratory for e-Science

[Figure: layered diagram; surviving labels: Application layer, Generic e-Science framework layer, Grid layer; application areas: Dutch telescience, Bio diversity, Data intensive science]

Effectively reuse existing workflow management systems, and provide a generic e-Science framework for different application domains.

A generic framework can

  • Improve the reuse of workflow components and of workflows across different experiments

  • Reduce the learning cost for different systems

  • Allow application users to work in a consistent environment when the underlying infrastructure changes

Previous work: VLAM-G environment

  • VLAM-G

    • A Grid-enabled PSE

    • Data intensive applications

    • Visual interface

    • Two levels of workflow support

    • Human interaction support


  • Process-Flow Template (PFT)

    • Graphical representation of data elements and processing steps in an experimental procedure.

  • Study

    • Descriptions of experimental steps represented as an instance of a PFT with references to experiment topologies.

  • Experiment Topology

    • Graphical representation of self-contained data processing modules attached to each other in a workflow.

Lessons learned

  • How to introduce a new PSE to a domain scientist?

    • Because it has a beautiful architecture?

    • Or because it allows scientists to keep their current work style?

  • How to use existing work?

    • Do scientists need one system or more options?

  • How to include the user in the computing loop?

    • Dynamic workflows and human-in-the-loop computing are important.

Z. Zhao et al., “Scientific workflow management: between generality and applicability”, QSIC 2005, Australia

Workflow support in VL-e

  • Recommend suitable workflow systems for different application domains:

    • Analyze typical application use cases

    • Define small projects with different application domains

    • Review existing workflow systems

    • Recommend four workflow systems: Triana, Taverna, Kepler, and VLAM-G

  • In the long term

    • Extend VLAM-G and develop our own generic workflow framework

A workflow bus paradigm


[Figure: diagram; surviving labels: Sub workflow 1, Sub workflow 2, Sub workflow 3, Workflow bus]

A workflow bus is a special workflow system for executing meta workflows, in which sub workflows will be executed by different engines.

Z. Zhao et al., “Workflow bus for e-Science”, in IEEE Int’l Conf. e-Science 2006, Amsterdam
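
To make the idea of a meta workflow concrete, here is a minimal sketch of a bus that hands each sub workflow to a named engine. It is an assumption-laden illustration, not the design from the cited paper: the `WorkflowBus` class, the sequential chaining of sub workflows, and the stand-in engine functions are all hypothetical, and none of them call the real Taverna or Triana interfaces.

```python
from typing import Callable, Dict, List, Tuple

class WorkflowBus:
    """Runs a meta workflow whose steps are sub workflows bound to named engines."""

    def __init__(self):
        self._engines: Dict[str, Callable[[str, object], object]] = {}

    def register(self, engine_name, runner):
        # `runner(sub_workflow, data)` stands in for a real engine adapter.
        self._engines[engine_name] = runner

    def run(self, meta_workflow: List[Tuple[str, str]], data):
        log = []
        for engine_name, sub_workflow in meta_workflow:
            if engine_name not in self._engines:
                raise KeyError(f"no engine registered as {engine_name!r}")
            # Assumption: output of one sub workflow feeds the next one.
            data = self._engines[engine_name](sub_workflow, data)
            log.append((engine_name, sub_workflow))
        return data, log

# Stand-in engines: plain functions, not the actual Taverna or Triana APIs.
def fake_taverna(sub_workflow, data):
    return [item.upper() for item in data]

def fake_triana(sub_workflow, data):
    return sorted(data)

bus = WorkflowBus()
bus.register("taverna", fake_taverna)
bus.register("triana", fake_triana)
meta = [("taverna", "sub workflow 1"), ("triana", "sub workflow 2")]

if __name__ == "__main__":
    result, log = bus.run(meta, ["b", "a"])
    print(result, log)
```

The meta workflow only names engines and sub workflows, so a new engine can be supported by registering one more adapter.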

Applications of workflow bus

  • Use case 1:

    • A user has a workflow in Taverna

    • Some functionality is missing in Taverna but can be provided by Triana

    • The user can develop the workflow in both systems and run it via the workflow bus

  • Use case 2:

    • A user wants to execute a Taverna or Triana workflow in multiple instances with different input data
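
Use case 2 amounts to running the same workflow several times over different inputs. The sketch below shows one possible shape of that pattern using Python's standard `concurrent.futures`; `run_workflow` is a hypothetical placeholder that only does arithmetic, where a real setup would submit a Taverna or Triana workflow through the bus.

```python
from concurrent.futures import ThreadPoolExecutor

def run_workflow(params):
    """Placeholder for one workflow instance; here it just multiplies two numbers."""
    return {"params": params, "result": params["scale"] * params["value"]}

def run_instances(inputs, max_workers=4):
    # Executor.map keeps results in input order even though instances overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_workflow, inputs))

# Hypothetical input sets, one per workflow instance.
inputs = [{"scale": s, "value": 10} for s in (1, 2, 3)]

if __name__ == "__main__":
    for outcome in run_instances(inputs):
        print(outcome)
```

Keeping each instance's parameters next to its result makes it easier to compare runs afterwards.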

Ongoing research

  • Web services in data-intensive applications

  • Execution models for Grid workflows

  • Including PSE in scientific workflows

  • Industrial standards in scientific workflows

Relevance between our research and Elsevier’s work

  • Same context when viewed at the scale of the entire lifecycle of e-Science experiments

  • Different focuses

    • We focus on the runtime behavior of scientific experiments, e.g., Grid computing, data- and compute-intensive applications, and scheduling of computing tasks

    • Elsevier highlights data search and integration on well-structured databases, research preparation, and literature search and management

  • Different characteristics in workflows

    • In our workflows, processing and managing dynamic runtime data are the key patterns

    • In Elsevier workflows, storing, replicating, accessing, matching, and integrating static data might be more common

  • Facing similar challenges:

    • Semantics based data search and integration

    • Workflow provenance

    • Collaborative interaction (workflow development, resource sharing, knowledge transfer)

    • Modeling user profiles


Activities

  • Int’l workshop on “Workflow systems in e-Science”, organized by Zhiming Zhao and Adam Belloum, in the context of ICCS06, Reading University, May 28, 2006.

    • Proceedings are published in LNCS, Springer-Verlag.

    • A special issue will be published in Scientific Programming Journal.


  • Workshop on “Scientific workflows and industrial workflow standards in e-Science”, organized by Adam Belloum and Zhiming Zhao, in the context of the IEEE e-Science and Grid Computing conference, Amsterdam, December 2006.

    • Pegasus, Dr. Ewa Deelman (Department of Computer Science, University of Southern California)

    • BPEL, Dr. Dieter König (IBM Research Germany Development Laboratory)

    • Kepler, Dr. Bertram Ludäscher (Department of Computer Science, University of California, Davis)

    • Taverna, Prof. Peter Rice (European Bioinformatics Institute)

    • WS and Semantic issues, Dr. Steve Ross-Talbot (CEO and co-founder of Pi4 Technologies)

    • Triana, Dr. Ian J. Taylor (Department of Computer Science, Cardiff University)



  • Virtual Laboratory for e-Science:

  • System and Network Engineering, Faculty of Science, University of Amsterdam:

  • Z. Zhao; A. Belloum; H. Yakali; P.M.A. Sloot and L.O. Hertzberger: Dynamic Workflow in a Grid Enabled Problem Solving Environment, in Proceedings of the 5th International Conference on Computer and Information Technology (CIT2005), pp. 339-345. IEEE Computer Society Press, Shanghai, China, September 2005.

  • Z. Zhao; A. Belloum; A. Wibisono; F. Terpstra; P.T. de Boer; P.M.A. Sloot and L.O. Hertzberger: Scientific workflow management: between generality and applicability, in Proceedings of the International Workshop on Grid and Peer-to-Peer based Workflows in conjunction with the 5th International Conference on Quality Software, pp. 357-364. IEEE Computer Society Press, Melbourne, Australia, September 19th-21st 2005.

  • Z. Zhao; A. Belloum; P.M.A. Sloot and L.O. Hertzberger: Agent technology and scientific workflow management in an e-Science environment, in Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI05), pp. 19-23. IEEE Computer Society Press, Hong Kong, China, November 14th-16th 2005.