Applications and Requirements for Scientific Workflow Introduction

May 1, 2006, NSF. Geoffrey Fox, Indiana University.


Presentation Transcript


  1. Applications and Requirements for Scientific Workflow Introduction. May 1, 2006, NSF. Geoffrey Fox, Indiana University

  2. Major Themes
     • What is different now and why
     • Scientific workflow is now in the realm of possibility
     • What are the application requirements rather than CS requirements
     • Prioritize; identify new issues and which old requirements have been satisfied
     • Ground these in scenarios or in application descriptions that lead to these requirements
     • Phrase as transformative research: does the term “scientific workflow” conjure up the innovative future or perhaps a bureaucratic past?

  3. Applications
     • Extreme weather (LEAD)
     • Bioinformatics (myGrid, BIRN); high throughput screening
     • Virtual Observatory in Astronomy
     • Particle Physics
     • Generic Data Analysis
     • Earthquake Science
     • Ocean Data Assimilation
     • Note: most of the following topics come from computer science; one needs to identify the driving higher-level application requirement
     • Preserve the mapping of application requirements to computer science topics

  4. Topics – Application/Component Specific
     • [Evangelinos] Support Ocean Data assimilation
     • Matlab, Fortran, Parallel simulations
     • Dataflow standards for “large I/O”
     • Metascheduling
     • Customization of execution parameters (provenance)
     • [AGray] Need workflow components supporting powerful data analysis across fields
     • [Gil] Support workflows needed in “open access” data accompanying scientific publication
     • [Hendler] Support information management as well as computation

  5. Topics – Overarching
     • [Ellisman] What do we mean by workflow? The word means different things to different people; should we use different terms? Need a better word (distributed scientific method)
     • [JMyers, Barga] Categorize workflows and study use; evaluate and compare; identify common patterns
     • [Discussion] What has changed? The data deluge is one critical change; is data a curse or a blessing?
     • [Ellisman] What is the “scientific method” (versus “Google method”) and its implication for workflow
     • [Barga] What’s wrong with commercial solutions?
     • [Laszewski] Support common Grid patterns
     • [Fox] Build benchmark set analogous to NAS in parallel computing
     • [Fahringer] Include all costs (e.g. Web Service security, SOAP) in performance models
     • [Deelman1] Support restructuring and planning for performance optimization
     • [JMyers] Manage workflows like content
     • [Ackerman, KMyers, Scacchi, Deelman2] Support full people (scientific process) workflow including social and organizational issues

  6. Topics – Desired Qualities
     • [Goble] Support users who are often under-resourced
     • [Discussion] Multiple classes of users: “power”, “common case”, “education”; do users know what they want or not?
     • Note: industry workflow captures WELL understood business processes
     • [Several] Workflows will be re-used and shared
     • [Ellisman] Enable reproducible science
     • [Livny] Support high quality software
     • [Laszewski] Balance between features, performance, and completeness
     • [Goble] Easily assemble workflows, find services and adapt previous workflows
     • [Goble] The workflow has to reflect the science, not the service invocation interface
     • [Goble] Automated workflow design is unlikely, unpopular, and undesirable, as scientists know which services they want
     • [Goble] Support all services that users want, whether they have a WSDL interface or not

  7. Topics – Desired Features
     • [Several] Workflows should be scalable, fault-tolerant, restartable, adaptive and repeatable; support multi-administration heterogeneous resources
     • [Discussion] What do application scientists mean by the above qualities?
     • [Livny] Why is size important? Complexity counts
     • [Altintas] Support end (instruments) to end (interactive data analysis) science
     • [Szalay] Interactive analysis as well as batch
     • [Gannon] Workflows triggered by events without user interaction
     • [Knoblock] Techniques for rapidly constructing models of new sources or services so that they can be rapidly and correctly integrated
     • [Knoblock] Support for dynamically integrating data across multiple data sources (i.e., databases or web services) that were not designed to work together
     • [Curbera] Support reasoning about correctness and composability
     • [Livny] What is the meaning of correctness and reproducibility (e.g. random numbers)?
     • [Gil] Support collections of workflows addressing common scientific questions
     • [Discussion] Need to support workflows of heterogeneous workflows of different types; note industry worries about linking intra-enterprise systems across enterprises

  8. Topics – Detailed Technology
     • [Laszewski] Extend the workflow language through a set of core libraries, such as fault tolerance and checkpointing
     • [Goble] Need a higher level language than BPEL
     • [Goble] There will be no one workflow language or workflow system, as there is no one word processor, programming language or operating system
     • [Ellisman, Livny] Role of portals (science gateways) as “common case” user interface versus distributed programming for “power user”
     • [Altintas] User interface customizable for different domains
     • [Deelman2] Virtual data to capture efficiently past and future actions
     • [Curbera] Integrate internet-scale execution (REST) and enterprise service bus (ESB)
     • [Discussion] Web 2.0 like Google Maps; industry distinction between interoperability and implementation

  9. Topics – Provenance
     • [Freire] Support computational (workflow) steering and provenance generation
     • [Goble] Workflows must allow effective management of resultant data and provenance
     • [Barga, Moreau] Define provenance of execution generally, even across multiple paradigms
     • [Altintas] Track provenance of workflow design, execution, and intermediate and final results
     • [Gannon] Initializations of workflow components depend on each other
     • [Seth] Design provenance supporting customization of adaptable workflows
