
Workflow Topics for the Next-Generation SDM-Center

UC DAVIS Department of Computer Science. San Diego Supercomputer Center. Workflow Topics for the Next-Generation SDM-Center. Ilkay Altintas (altintas@SDSC.edu), Bertram Ludäscher (ludaesch@UCDAVIS.edu). Sir Walter Raleigh. SciDAC SDM AHM, Oct 5-6, 2005, NCSU, Raleigh, NC.



  1. UC DAVIS Department of Computer Science San Diego Supercomputer Center Workflow Topics for the Next-Generation SDM-Center Ilkay Altintas altintas@SDSC.edu Bertram Ludäscher ludaesch@UCDAVIS.edu Sir Walter Raleigh SciDAC SDM AHM Oct 5-6, 2005, NCSU Raleigh, NC

  2. Overview • Kepler/SPA: • What we have (The GOOD) • What we don’t (yet) have (The BAD) • What we really need?? (The UGLY) => Things we might do; prioritization

  3. Macro Definitions … • #define KEPLER KEPLER/SPA • #define KEPLER KEPLER*SPA • By the end: • #define SPA KEPLERHPC

  4. What we have – The GOOD • Big Heritage from Ptolemy II • Vergil GUI for design and (some) execution monitoring • Actor-Oriented Modeling & Design • Director / Actor Separation • Models of Computation: PN, SDF, DE, .. • Nested Workflows & Hierarchical Modeling • Research Results on Modeling Complex Systems • modal models, mobile models, reconfig’able models, model lifecycle management, higher-order actors, … • head-start for CCA Extensions, e.g. • SciRUN-2 Extensions (Steve P. et al.) • Self-Managing, Dynamically-Adaptive, Autonomous Components (Manish et al.)

  5. What we have – The GOOD • Kepler Extensions (to Ptolemy II) • Mostly: loosely coupled, e.g. WS (web service) workflows • Many generic actors • ssh, scp, cmd-line, SRB, Globus, … • new R expression actor • Many custom actors • e.g. in PIW, TSI-1, TSI-2, GEON, SEEK, Resurgence, … • Several ad-hoc extensions & (initial) research, e.g. • External job scheduling (e.g. NIMROD, …) • Director extensions (fault tolerance via WS “retry”) • WF-Templates (structured combination of dataflow & control-flow: fault-tolerance, reusability) • Higher-order functions (map/3, iterate-over-array, …: simpler control-flow, optimization potential, …)

  6. Some KEPLER Actors (out of 160+ … and counting…)

  7. What we have – The GOOD • Kepler Extensions (Cont’d) • Some generic extensions • Metadata-based (EML/ADN) Dataset Search • Concept-based Actor Search (OWL) • Documentation Framework • Authentication & Authorization Framework (GAMA from GEON) • Improved component/WF archival & plug-in (KAR,…) • Provenance Recorder (“Listener”) PS … a growing open-source developers community … … and some scientific users … (TSI-1/2, PIW, GEON, SEEK, … )

  8. Concept-based Actor Search • Implemented as proof-of-concept • Additional operations slated for next Kepler Release (data search, port-based actor search, etc.) • Biggest Challenges • Building/searching a repository … • Making changes to MoML (see KAR) • GUI changes • Ontology management • [Figure labels: Workflow Components (MoML/KAR); Ontologies (OWL); Default + Other; Semantic Annotations; instance expressions; urn ids]

  9. The GOOD: Kepler Archives • Purpose: Encapsulate WF data and actors in an archive file • … inlined or by reference • … version control => More robust workflow exchange => Easy management of semantic annotations => Plug-in architecture (Drop in and use) => Easy documentation updates • A jar-like archive file (.kar) including a manifest • All entities have unique ids (LSID) • Custom object manager and class loader • UI and API to create, define, search and load .kar files
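
The "jar-like archive file with a manifest" idea can be sketched with Python's standard `zipfile` module. This is only an illustration: the manifest layout, the `lsid:` field name, and the file name `MultiplyDivide.xml` are assumptions made for this sketch, not the actual .kar format.

```python
import io
import zipfile

MANIFEST_PATH = "META-INF/MANIFEST.MF"  # jar convention; assumed here, not confirmed for .kar

def build_archive(entries):
    """Pack {lsid: (filename, xml_text)} into an in-memory zip with a manifest.

    The manifest fields are hypothetical and only mimic the jar style.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        manifest_lines = ["Manifest-Version: 1.0", ""]
        for lsid, (filename, _) in sorted(entries.items()):
            manifest_lines += [f"Name: {filename}", f"lsid: {lsid}", ""]
        zf.writestr(MANIFEST_PATH, "\n".join(manifest_lines))
        for _, (filename, xml_text) in sorted(entries.items()):
            zf.writestr(filename, xml_text)
    return buf.getvalue()

def find_by_lsid(archive_bytes, lsid):
    """Resolve an entity id to its archived content via the manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        manifest = zf.read(MANIFEST_PATH).decode("utf-8")
        name = None
        for line in manifest.splitlines():
            if line.startswith("Name: "):
                name = line[len("Name: "):]
            elif line == f"lsid: {lsid}" and name is not None:
                return zf.read(name).decode("utf-8")
    return None

kar = build_archive({
    "urn:lsid:localhost:actor:80:1": (
        "MultiplyDivide.xml",
        '<entity name="Multiply or Divide"/>',
    ),
})
print(find_by_lsid(kar, "urn:lsid:localhost:actor:80:1"))
```

Looking entries up by id rather than by file name is what lets a workflow refer to an actor "by reference" while the archive stays free to reorganize its contents.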

  10. KAR File Example

    <entity name="Multiply or Divide" class="ptolemy.kernel.ComponentEntity">
      <property name="entityId" value="urn:lsid:localhost:actor:80:1" class="org.kepler.moml.NamedObjId"/>
      <property name="documentation" class="org.kepler.moml.DocumentationAttribute"></property>
      <property name="class" value="ptolemy.actor.lib.MultiplyDivide" class="ptolemy.kernel.util.StringAttribute">
        <property name="id" value="urn:lsid:localhost:class:955:1" class="ptolemy.kernel.util.StringAttribute"/>
      </property>
      <property name="multiply" class="org.kepler.moml.PortAttribute">
        <property name="direction" value="input" class="ptolemy.kernel.util.StringAttribute"/>
        <property name="dataType" value="unknown" class="ptolemy.kernel.util.StringAttribute"/>
        <property name="isMultiport" value="true" class="ptolemy.kernel.util.StringAttribute"/>
      </property>
      <property name="divide" class="org.kepler.moml.PortAttribute">
        <property name="direction" value="input" class="ptolemy.kernel.util.StringAttribute"/>
        <property name="dataType" value="unknown" class="ptolemy.kernel.util.StringAttribute"/>
        <property name="isMultiport" value="true" class="ptolemy.kernel.util.StringAttribute"/>
      </property>
      <property name="output" class="org.kepler.moml.PortAttribute">
        <property name="direction" value="output" class="ptolemy.kernel.util.StringAttribute"/>
        <property name="dataType" value="unknown" class="ptolemy.kernel.util.StringAttribute"/>
        <property name="isMultiport" value="false" class="ptolemy.kernel.util.StringAttribute"/>
      </property>
      <property name="semanticType00" value="http://seek.ecoinformatics.org/ontology#ArithmeticMathOperationActor" class="org.kepler.sms.SemanticType"/>
    </entity>
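
To show how such an entity description could be read by a tool, here is a small sketch that uses Python's standard `xml.etree.ElementTree` to extract the entity id and the port declarations. The XML is an abbreviated copy of the example above; the helper names `describe_ports` and `entity_id` are made up for this illustration and are not part of Kepler.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the entity shown on the slide.
MOML = """
<entity name="Multiply or Divide" class="ptolemy.kernel.ComponentEntity">
  <property name="entityId" value="urn:lsid:localhost:actor:80:1" class="org.kepler.moml.NamedObjId"/>
  <property name="multiply" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="input" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="isMultiport" value="true" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
  <property name="output" class="org.kepler.moml.PortAttribute">
    <property name="direction" value="output" class="ptolemy.kernel.util.StringAttribute"/>
    <property name="isMultiport" value="false" class="ptolemy.kernel.util.StringAttribute"/>
  </property>
</entity>
"""

PORT_CLASS = "org.kepler.moml.PortAttribute"

def describe_ports(moml_text):
    """Map each port name to its nested settings (direction, isMultiport, ...)."""
    root = ET.fromstring(moml_text.strip())
    ports = {}
    for prop in root.findall("property"):
        if prop.get("class") != PORT_CLASS:
            continue
        ports[prop.get("name")] = {
            child.get("name"): child.get("value")
            for child in prop.findall("property")
        }
    return ports

def entity_id(moml_text):
    """Return the value of the top-level entityId property, if present."""
    root = ET.fromstring(moml_text.strip())
    for prop in root.findall("property"):
        if prop.get("name") == "entityId":
            return prop.get("value")
    return None

print(entity_id(MOML))
print(describe_ports(MOML))
```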

  11. Kepler Object Manager • Designed to access local and distributed objects • Objects: data, metadata, annotations, actor classes, supporting libraries, native libraries, etc. archived in kar files • Advantages: • Reduce the size of Kepler distribution • Only ship the core set of generic actors and domains • Easy exchange of full or partial workflows for collaborations • Publish full workflows with their bound data • Becomes a provenance system for derived data objects => Separate SPA workflow repository and distribution

  12. Provenance Framework • Provenance • Track origin and derivation information about scientific workflows, their runs and derived information (datasets, metadata…) • Need for Provenance • Association of process and results • reproduce results • “explain & debug” results (via lineage tracing, parameter settings, …) • optimize: “Smart Re-Runs” • Types of Provenance Information: • Data provenance • Intermediate and end results including files and db references • Process (=workflow instance) provenance • Keep the wf definition with data and parameters used in the run • Error and execution logs • Workflow design provenance (quite different) • WF design is a (little supported) process (art, magic, …) • for free via cvs: edit history • need more “structure” (e.g. templates) for individual & collaborative workflow design
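
One way to picture a "smart re-run" is a cache keyed by everything that determines a step's result: the step name, its parameters, and its inputs. The sketch below is an assumption about how that could look, not Kepler's design; `SmartRerunCache` and `scale` are hypothetical names, and inputs are assumed to be JSON-serializable.

```python
import hashlib
import json

class SmartRerunCache:
    """Skip a workflow step when the same step ran before with the same inputs."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # how often a step really had to run

    @staticmethod
    def _key(step_name, params, inputs):
        # The recorded provenance (step, parameters, inputs) doubles as the cache key.
        payload = json.dumps([step_name, params, inputs], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, step_name, func, params, inputs):
        key = self._key(step_name, params, inputs)
        if key not in self._results:
            self.executions += 1
            self._results[key] = func(inputs, **params)
        return self._results[key]

def scale(values, factor):
    return [v * factor for v in values]

cache = SmartRerunCache()
first = cache.run("scale", scale, {"factor": 2}, [1, 2, 3])
second = cache.run("scale", scale, {"factor": 2}, [1, 2, 3])  # reused, not recomputed
third = cache.run("scale", scale, {"factor": 3}, [1, 2, 3])   # parameter changed: recomputed
print(first, second, third, cache.executions)
```

The sketch only works for deterministic steps; a step with side effects or external state would need extra provenance in the key.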

  13. Kepler Provenance Recording Utility • Parametric and customizable • Different report formats • Variable levels of detail • Verbose-all, verbose-some, medium, on error • Multiple cache destinations • Saves information on • User name, Date, Run, etc… Joint work with Oscar Barney

  14. Provenance: Next Steps • .kar file generation, registration and search for provenance information • Possible data/metadata formats • Automatic report generation from accumulated data • A relational schema for the provenance info in addition to the existing XML • Smart re-runs
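
A relational schema for provenance could be as small as two tables: one row per run, and one row per data item an actor consumed or produced. The sketch below uses Python's standard `sqlite3`; all table, column, actor, and file names are hypothetical and chosen only to illustrate lineage queries.

```python
import sqlite3

# Hypothetical schema: not the schema used by Kepler's provenance recorder.
SCHEMA = """
CREATE TABLE run (
    run_id     INTEGER PRIMARY KEY,
    workflow   TEXT NOT NULL,
    user_name  TEXT NOT NULL,
    started_at TEXT NOT NULL
);
CREATE TABLE artifact (
    artifact_id INTEGER PRIMARY KEY,
    run_id      INTEGER NOT NULL REFERENCES run(run_id),
    actor       TEXT NOT NULL,
    role        TEXT NOT NULL CHECK (role IN ('input', 'output')),
    data_ref    TEXT NOT NULL
);
"""

def inputs_of(conn, data_ref):
    """One lineage step: which inputs did the producing actor read?"""
    rows = conn.execute(
        """
        SELECT DISTINCT i.data_ref
        FROM artifact AS o
        JOIN artifact AS i
          ON i.run_id = o.run_id AND i.actor = o.actor AND i.role = 'input'
        WHERE o.role = 'output' AND o.data_ref = ?
        ORDER BY i.data_ref
        """,
        (data_ref,),
    ).fetchall()
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO run VALUES (1, 'demo-wf', 'alice', '2005-10-05T09:00:00')")
conn.executemany(
    "INSERT INTO artifact (run_id, actor, role, data_ref) VALUES (?, ?, ?, ?)",
    [
        (1, "Align", "input", "raw.fasta"),
        (1, "Align", "output", "aligned.aln"),
        (1, "BuildTree", "input", "aligned.aln"),
        (1, "BuildTree", "output", "tree.nwk"),
    ],
)
print(inputs_of(conn, "tree.nwk"))
```

Repeating `inputs_of` on each returned reference walks the derivation chain backwards, which is the "explain & debug" use of provenance.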

  15. The Future • From GOOD via BAD to UGLY • The good news (about ‘bad’ and ‘ugly’) • Lots of interesting challenges! • … so ‘ugly’ is actually good!

  16. What we don’t (yet) have … THE BAD • Much is still to do (or still ongoing) • Detached execution • many options; depend on requirements • Kepler WF repository w/ dynamic actor plug-in • Smart Reruns • avoid doing (old) work twice • Smarter Reruns (too smart?) • reuse previous results for speed-up of (new) work • NIMROD Director, CONDOR Director … • Task manager / monitor • Support for WF design & reuse • Semantic extensions • “Design Patterns”, Templates

  17. What we don’t have … THE BAD cont’d • Vertical SDM Integration • Workflow layer could be used to embed other SDM components and glue them together • Scope & Architecture unclear • Data Mining tools => new WF actors • Parallel-R => new WF actors !? • SEA, Bitmap tools => new !? • MPI-IO => alternative to current Kepler data access!? • … • Not only a technical problem • e.g. need for driving use-cases that require combination of several SDM layers together

  18. Challenges • Easier said … • “We’re not going to reinvent the wheel …” • “We just use XYZ …” • XYZ in {CCA, HDF5, PnetCDF, Ccaffeine, Condor, MPI-IO, parallel-R, …} • … than done … • Incompatible, isolated solutions and frameworks • Can’t use workflow/actor/director A with B • Coming up with a coherent, overall architecture is hard!

  19. HTC Example (using: NIMROD) • need to make Kepler NIMROD/Condor/… “aware” • similar need for HPC support

  20. Another Distribution Approach • Source: Daniel Lázaro Cuadrado, Aalborg University • Simulation is orchestrated in a centralized manner • [Figure labels: Servers; Service Locator (Peer Discovery); Client; Computer Network]

  21. What we don’t have … THE UGLY • Workflow Design & (Re-)Usability • Difficult Marriage of Dataflow and Control-flow • e.g. PIW, TSI-1/2, GEON-A-type-WF, … • WF development, deployment, maintenance, use • from (Mess…) to Art to Commodity (=> next presentation) • support for the whole WF life-cycle • Fault Tolerance • current embedding of control-flow into dataflow yields non-maintainable workflows! • Close Coupling of Components for HPC • CCA-style • MPI-style • Memory-to-Memory (on single nodes) • large, efficient data transfer • …

  22. WF-Design: Adapters for Semantic & Structural Incompatibility • Adapters may: • be abstract (no impl.) • be concrete • bridge a semantic gap • fix a structural mismatch • be generated automatically (e.g., Taverna’s “list mismatch”) • be reused components (based on signatures) • [Figure labels: C, D, C1, C2, D1, D2; map, f1, f2; S, T, [S], [T], [[S]], [[T]]]
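
An automatically generated adapter for a "list mismatch" can be sketched as a higher-order `map`: a function from S to T is lifted once to work on [S], and twice to work on [[S]]. The helper names below (`lift`, `lift_to_depth`, `nesting_depth`, `apply_with_adapter`, `word_length`) are hypothetical, and the depth detection assumes uniformly nested lists.

```python
def lift(f):
    """Adapter: turn f: S -> T into a function [S] -> [T]."""
    def lifted(items):
        return [f(item) for item in items]
    return lifted

def lift_to_depth(f, depth):
    """Apply lift repeatedly, e.g. depth 2 gives [[S]] -> [[T]]."""
    for _ in range(depth):
        f = lift(f)
    return f

def nesting_depth(value):
    """Count list levels by following the first element (assumes uniform nesting)."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        if not value:
            break
        value = value[0]
    return depth

def apply_with_adapter(f, value):
    """Insert as many map adapters as the incoming data requires, then apply."""
    return lift_to_depth(f, nesting_depth(value))(value)

def word_length(word):
    """Plays the role of f: S -> T with S = str and T = int."""
    return len(word)

print(apply_with_adapter(word_length, "ab"))
print(apply_with_adapter(word_length, ["a", "bcd"]))
print(apply_with_adapter(word_length, [["a"], ["bc", "def"]]))
```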

  23. Additional Design Primitives for Semantic Types • Extended Transformations (Starting Workflow => Resulting Workflow) • t9: Actor Semantic Type Refinement • t10: Port Semantic Type Refinement • t11: Annotation Constraint Refinement • t12: I/O Constraint Strengthening • t13: Data Connection Refinement • t14: Adapter Insertion • t15: Actor Replacement • t16: Workflow Combination (Map) • [Figure labels: T; C, D; s1, s2, t1, t2; f, f1, f2]

  24. Workflow Design Primitives • End-to-End Workflow Design and Implementation • Viewed as a series of primitive “transformations” • Each takes a WF and produces a new WF • Can be combined to form design “strategies” • [Figure: W0 => W1 => W2 => … => Wm => Wn via transformations t, from Workflow Design to Workflow Implementation; strategies: Top-Down, Bottom-Up, Task Driven, Data Driven, Structure Driven, Semantic Driven, Input Driven, Output Driven]
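
The view of design as a series of transformations, each taking a workflow and producing a new one, maps directly onto function composition. The sketch below is a toy model under that assumption: the `Workflow` class and the transformations `add_actor`, `connect_actors`, and `insert_adapter` are hypothetical and far simpler than the primitives named in the slides.

```python
from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)
class Workflow:
    actors: tuple = ()
    links: tuple = ()  # (source_actor, target_actor) pairs

def add_actor(name):
    """Transformation: W -> W with one more actor."""
    def transform(wf):
        return Workflow(wf.actors + (name,), wf.links)
    return transform

def connect_actors(src, dst):
    """Transformation: W -> W with a new data connection."""
    def transform(wf):
        if src not in wf.actors or dst not in wf.actors:
            raise ValueError(f"unknown actor in link {src} -> {dst}")
        return Workflow(wf.actors, wf.links + ((src, dst),))
    return transform

def insert_adapter(src, dst, adapter):
    """Transformation: replace the link src -> dst by src -> adapter -> dst."""
    def transform(wf):
        if (src, dst) not in wf.links:
            raise ValueError(f"no link {src} -> {dst}")
        kept = tuple(link for link in wf.links if link != (src, dst))
        return Workflow(wf.actors + (adapter,), kept + ((src, adapter), (adapter, dst)))
    return transform

def strategy(*transformations):
    """Combine primitive transformations into one design strategy."""
    def run(wf):
        return reduce(lambda acc, t: t(acc), transformations, wf)
    return run

w0 = Workflow()
design = strategy(
    add_actor("Reader"),
    add_actor("Plotter"),
    connect_actors("Reader", "Plotter"),
    insert_adapter("Reader", "Plotter", "ListToArray"),
)
wn = design(w0)
print(wn)
```

Because each workflow value is immutable, every intermediate W1, W2, … remains available, which is convenient for recording design provenance.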

  25. Fault Tolerance & Maintenance Challenges

  26. Workflow Templates and Patterns • New Ingredients • Proposed Layered Architecture • work w/ Anne Ngu, Shawn Bowers, Terence Critchlow

  27. Use Ideas from Fault Tolerant Shell • Good ideas in ftsh; some might be (semi-)low-hanging fruit for Kepler … • Source: Douglas Thain, Miron Livny, “The Ethernet Approach to Grid Computing”
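
The core retry idea, attempting an operation a bounded number of times and backing off between failures, can be sketched in a few lines. This is loosely inspired by that idea and is not ftsh syntax or a Kepler API; `try_times` and `flaky_transfer` are hypothetical names, and the doubling backoff is an assumption of this sketch.

```python
import time

def try_times(action, attempts, base_delay=0.1, sleep=time.sleep):
    """Run action() up to `attempts` times, doubling the delay after each failure.

    The last error is re-raised if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # a real system would catch narrower error types
            last_error = error
            if attempt < attempts:
                sleep(delay)
                delay *= 2
    raise last_error

calls = {"count": 0}

def flaky_transfer():
    """Stand-in for a remote call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "transferred"

delays = []  # record the waits instead of actually sleeping
result = try_times(flaky_transfer, attempts=5, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

Keeping the retry policy outside the action is the point: the workflow step stays a plain function, and fault handling does not have to be wired into the dataflow.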

  28. Kepler Coupling Components & Codes • Types of Coupling … • Loosely coupled (“1st Phase”) • Web Services (SPA, GEON, SEEK, …), • ssh actors, … • + reusability (behavioral polymorphism) • + scalability (# components) • – efficiency • Tight(er) coupling (“2nd Phase”) • Via CCA (SciRUN-2, Ccaffeine, …) (Cipres uses CORBA) • HPC needs: code-coupling as efficient & flexible as possible (e.g. Scott’s challenges…) • memory-to-memory (single node or shared memory), • MPI (multiple-nodes) • optimizations for transfer of data & control (streaming, socket-based connections)

  29. Accord-CCA: Ccaffeine w/ Self-Managed Behavior cf. w/ mobile models, reconfiguration in Ptolemy II … begging for a Kepler design and implementation … Source: Hua Liu and Manish Parashar

  30. Different “Directors” for Different Concerns • Example: • Ptolemy Directors – “factoring out” the concern of workflow “orchestration” (MoC) • common aspects of overall execution not left to the actors • Similarly: • “Black Box” (“flight recorder”) • a kind of “recording central” to avoid wiring 100’s of components to recording-actor(s) • “Red Box” (error handling, fault tolerance) • use ftsh ideas; templates • “Yellow Box” (type checking) • for workflow design • “Blue Box” (shipping-and-handling) • central handling of data transport (by value, by reference, by scp, SRB, GridFTP, …) • “CCA++ Boxes” • Change behavior (e.g. algorithm) of a component • Change behavior (i.e., wiring) of a workflow in-flight • [Figure labels: SDF/PN/DE/…; Provenance Recorder; On Error; Static Analysis; SHA @; Component Mgr; Composition Mgr]
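
The idea of factoring cross-cutting concerns out of the actors can be illustrated with a toy director that fires actors in sequence and delegates recording and error handling to pluggable callbacks. This is a sketch under simplifying assumptions (a linear pipeline, one token at a time); the `Director` class here is hypothetical and unrelated to the Ptolemy II class hierarchy.

```python
class Director:
    """Fires actors in order and routes cross-cutting concerns to pluggable 'boxes'."""

    def __init__(self, recorder=None, on_error=None):
        self.recorder = recorder or (lambda event: None)  # "Black Box"
        self.on_error = on_error                          # "Red Box"

    def run(self, actors, token):
        for name, fire in actors:
            try:
                result = fire(token)
            except Exception as error:
                kind = type(error).__name__
                if self.on_error is None:
                    self.recorder(("failed", name, kind))
                    raise
                result = self.on_error(name, token, error)
                self.recorder(("recovered", name, kind, result))
            else:
                self.recorder(("fired", name, token, result))
            token = result
        return token

log = []
director = Director(
    recorder=log.append,
    on_error=lambda name, token, error: token,  # policy: pass the input through unchanged
)
actors = [
    ("double", lambda x: x * 2),
    ("invert", lambda x: 1 // (x - 8)),  # divides by zero when x == 8
    ("increment", lambda x: x + 1),
]
final = director.run(actors, 4)
print(final)
for event in log:
    print(event)
```

None of the three actors knows about logging or recovery, so neither concern has to be wired to every component.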

  31. Summary • The GOOD: • lots to build upon • The BAD: • no common / integrated architecture => use Kepler/SPA as a glue • this might be harder than it sounds • needs a mix of end-to-end application-drive and serious design effort for the integration architecture • The UGLY: • HPC challenges: close coupling, fault tolerance, … • The good news: there’s work to be done!

  32. Use of Semantics in SWF… • “Smart” Search • Concept-based, e.g., “find all datasets containing biomass measurements” • Improved Linking, Merging, Integration • Establishing links between data through semantic annotations & ontologies • Combining heterogeneous sources based on annotations • Concatenate, Union (merge), Join, etc. • Transforming • Construct mappings from schema S1 to S2 based on annotations • Semantic Propagation • “Pushing” semantic annotations through transformations/queries

  33. Helping with “shims” / adapters • Services can be semantically compatible, but structurally incompatible • [Figure: Source Actor with port Ps, Target Actor with port Pt, Desired Connection between them; Ontologies (OWL): SemanticType Ps (⊑) SemanticType Pt, Compatible; StructuralType Ps (⋠) StructuralType Pt, Incompatible; adapter applied to (Ps), marked (≺)] • Source: [Bowers-Ludaescher, DILS’04]
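
The distinction between semantic and structural compatibility can be made concrete with a small check: the semantic test walks a subclass hierarchy, the structural test compares data types, and a connection that passes the first but fails the second is a candidate for an adapter. The ontology, port descriptions, and the use of plain equality for structural types are all simplifying assumptions of this sketch.

```python
# Toy subclass hierarchy standing in for an OWL ontology (hypothetical concepts).
SUBCLASS_OF = {
    "BiomassMeasurement": "Measurement",
    "Measurement": "Observation",
}

def is_subconcept(concept, ancestor):
    """True if `concept` equals `ancestor` or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False

def check_connection(source_port, target_port):
    """Classify a desired connection from a source port to a target port."""
    semantic_ok = is_subconcept(source_port["semantic"], target_port["semantic"])
    # Structural subtyping is reduced to equality here for brevity.
    structural_ok = source_port["structural"] == target_port["structural"]
    if not semantic_ok:
        return "incompatible"
    return "connect directly" if structural_ok else "needs adapter (shim)"

ps = {"semantic": "BiomassMeasurement", "structural": "list[float]"}
pt = {"semantic": "Measurement", "structural": "array[float]"}
print(check_connection(ps, pt))
```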
