Interactive composition of scientific workflows
Interactive Composition of Scientific Workflows. Yolanda Gil USC Information Sciences Institute. Jihie Kim Varun Ratnakar Marc Spraragen . Southern California Earthquake Center (SCEC): Community Modeling Environment. User-Guided Creation of Workflows in SCEC.


Interactive Composition of Scientific Workflows

Yolanda Gil

USC Information Sciences Institute

Jihie Kim

Varun Ratnakar

Marc Spraragen


Southern California Earthquake Center (SCEC): Community Modeling Environment


User-Guided Creation of Workflows in SCEC

  • Problem: To bring sophisticated models to a wide range of users (civil engineers, city planners, disaster response teams), we need to provide assistance and automation while allowing users to guide the process

    • Choosing appropriate models (e.g., given site, earthquake forecast)

    • Setting parameters through valid approximations (e.g., shear-wave velocity)

    • Complying with parameter value constraints (e.g., magnitude)

    • Detecting and resolving interacting constraints

    • Composing valid end-to-end pathways from individual models

    • Executing the pathway on grid resources

  • Same problems arise for scientists sharing models across SCEC institutions and across disciplines


The Process of Creating an Executable Workflow

  • Creating a valid workflow template

    • Selecting simulation models and connecting inputs and outputs

    • Adding other steps for data conversions/transformations

  • Creating instantiated workflow

    • Providing input data to pathway inputs (logical assignments)

  • Creating executable workflow

    • Given requirements of each model, find and assign adequate resources for each model

    • Select physical locations for logical names

    • Include data movement steps, including data deposition steps
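The three stages above can be pictured as successive refinements of one data structure. The following is a minimal sketch under stated assumptions: the class names, field names, logical data name, hosts, and path are all hypothetical and are not taken from the SCEC system, and data deposition steps are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    steps: list    # step names
    links: list    # (producer_step, consumer_step) pairs
    inputs: dict   # pathway input name -> step that consumes it


@dataclass
class Instantiated:
    template: Template
    logical_inputs: dict   # pathway input name -> logical data name


@dataclass
class Executable:
    instantiated: Instantiated
    resources: dict        # step -> host chosen to run it
    physical: dict         # logical data name -> (host, path)
    transfers: list = field(default_factory=list)  # (logical name, from_host, to_host)


def instantiate(template, logical_inputs):
    """Stage 2: bind a logical data name to every pathway input."""
    missing = sorted(set(template.inputs) - set(logical_inputs))
    if missing:
        raise ValueError(f"unbound pathway inputs: {missing}")
    return Instantiated(template, dict(logical_inputs))


def make_executable(inst, resources, replica_catalog):
    """Stage 3: assign resources, resolve physical locations, add data movement."""
    unassigned = [s for s in inst.template.steps if s not in resources]
    if unassigned:
        raise ValueError(f"steps without resources: {unassigned}")
    physical = {n: replica_catalog[n] for n in inst.logical_inputs.values()}
    exe = Executable(inst, dict(resources), physical)
    for input_name, step in inst.template.inputs.items():
        logical = inst.logical_inputs[input_name]
        src_host = physical[logical][0]
        # A transfer is needed only when the data is not already on the step's host.
        if src_host != resources[step]:
            exe.transfers.append((logical, src_host, resources[step]))
    return exe


template = Template(
    steps=["utm-converter", "hazard-curve-calculator"],
    links=[("utm-converter", "hazard-curve-calculator")],
    inputs={"UTM": "utm-converter"},
)
inst = instantiate(template, {"UTM": "site-utm-001"})
exe = make_executable(
    inst,
    resources={"utm-converter": "hostA", "hazard-curve-calculator": "hostB"},
    replica_catalog={"site-utm-001": ("hostC", "/data/site-utm-001.txt")},
)
```

Keeping each stage as a wrapper around the previous one means the same template can be instantiated with different data, and the same instantiated workflow can be mapped to different resources.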


[Tangmurarunkit, Decker & Kesselman 03]

This talk: creating a valid workflow template


A Valid Workflow Template

[Figure: workflow template for the task result "Hazard curve: SA vs. prob. exc." Components shown: UTM Converter (get-Lat-Long-given-UTM), PEER-Fault, CVM-get-Velocity-at-point, Basin-Depth Calculator, Field (2000) IMR: SA exc. prob., and Hazard Curve Calculator: SA vs. prob. exc. Parameters and data shown: Duration-Year, Fault-Grid-Spacing, UTM, Lat/Long, Rupture Offset, Gaussian Dist No Truncation, Total Moment Rate, Mag-Length-sigma, Dip, Rake, Magnitude (min, max, mean), Ruptures (rfml), Velocity, Site VS30, Site Basin-Depth-2.5, SA Period, Gaussian Truncation, Std. Dev. Type, SA exc. probs.]


Challenges for Interactive Composition of Valid Workflow Templates

  • Automatic tracking of workflow constraints

    • User is notified if there are problems but does not have to keep track of details

  • Provide flexible interaction

    • User can start from initial data, from data products, or steps

    • User can specify abstract descriptions of steps and later specialize them

    • User can reuse, merge, or build from scratch

  • Proactive assistance

    • System should not just point out problems but help user by suggesting fixes (always)

  • And… how do we define what “valid” means?


Interactive Composition of Valid Workflow Templates: Approach

Mixed-initiative system that helps users create, reuse, and combine pathways by exploiting:

  • Knowledge-based descriptions of components

    • Ontology of components and component types based on common features and parameter constraints

  • Analysis of (partially constructed) pathways based on AI planning techniques

    • Relate steps to goals and initial states, and interpret user actions in terms of incremental plan generation

    • Provide formal definitions of desirable properties of pathways

  • Develop an algorithm that integrates both techniques to check constraints and properties, guaranteeing correctness


Ontology of Components

[Figure: a domain ontology of parameter types alongside a hierarchy of component types. Parameter types shown: Parameter, IMR-Input-Parameter, Field-2000-Input-Parameter, Fault-Type, Distance, Basin-Depth, IMT, IMR, probability-function, Hazard-Level, Hazard-Level-with-Median, Hazard-Level-with-Std-Dev, Hazard-Level-with-SA, Hazard-Level-with-PGA, Hazard-Level-with-PGV, Hazard-Level-with-SA-Median, Hazard-Level-with-SA-Std-Dev, Hazard-Level-with-SA-Prob-Exc, F2-Hazard-Level, F2-SA-Median-wrt-VS30. Component types shown: Compute-Hazard-Level-given-IMR-input-parameters; Compute-Hazard-Level-with-SA-, -with-PGV-, and -with-PGA-given-IMR-input-parameters; Compute-Hazard-Level-with-SA-Median-, -with-SA-Std-Dev-, and -with-SA-Prob-Exc-given-IMR-input-parameters; Compute-F2-Hazard-Level-given-Field-2000-input-parameters; Compute-F2-SA-Median-given-Field-2000-input-parameters; Compute-F2-SA-Median-wrt-Distance-JB-given-Fault-Type-&-Basin-Depth-&-…; Compute-F2-SA-MEDIAN-wrt-VS30-given-Fault-Type-&-Basin-Depth-&-…; F2-operation-SA-Median-Distance-JB; F2-operation-SA-Median-VS30.]
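A component ontology of this kind supports two basic queries: whether one component type subsumes another, and which types directly specialize a given type. The sketch below is illustrative only. The type names come from the slide, but the parent links are an assumption inferred from the naming pattern, and the dictionary representation stands in for whatever the actual knowledge base uses.

```python
# Child type -> parent type. These links are ASSUMED from the naming pattern
# of the component types; the slide does not spell them out in text.
ROOT = "Compute-Hazard-Level-given-IMR-input-parameters"
SA = "Compute-Hazard-Level-with-SA-given-IMR-input-parameters"
PGA = "Compute-Hazard-Level-with-PGA-given-IMR-input-parameters"
PGV = "Compute-Hazard-Level-with-PGV-given-IMR-input-parameters"
SA_MEDIAN = "Compute-Hazard-Level-with-SA-Median-given-IMR-input-parameters"
SA_STD_DEV = "Compute-Hazard-Level-with-SA-Std-Dev-given-IMR-input-parameters"

PARENT = {
    SA: ROOT,
    PGA: ROOT,
    PGV: ROOT,
    SA_MEDIAN: SA,
    SA_STD_DEV: SA,
}


def ancestors(component_type):
    """Yield every type above component_type, nearest first."""
    while component_type in PARENT:
        component_type = PARENT[component_type]
        yield component_type


def subsumes(general, specific):
    """True if general is the same type as specific or one of its ancestors."""
    return general == specific or general in ancestors(specific)


def direct_subtypes(component_type):
    """Types exactly one level below component_type, in sorted order."""
    return sorted(c for c, p in PARENT.items() if p == component_type)
```

For example, `subsumes(ROOT, SA_MEDIAN)` holds while `subsumes(SA_MEDIAN, ROOT)` does not, and `direct_subtypes(SA)` lists the two SA specializations a user could pick when refining an abstract step.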


Year 1: Modeling and Using Simulation Code for Seismic Hazard Analysis with DOCKER [Gil & Ratnakar 02]

Model developers can easily add simple constraints to model description and document their sources and criticality

Declarative descriptions of models are linked to ontologies and reasoners

System generates formal representations of model constraints in PowerLoom as well as XSD and WSDL

User is allowed to override model constraints to accommodate analysis

System reasons about model constraints and suggests alternative models


Desirable Properties of Workflow Templates

  • Satisfied iff the sources of input parameters for all components are specified

    • A parameter p ∈ input-parameters(c) is satisfied iff ∃ a link <co, po, ci, pi> ∈ L s.t. pi = p

  • Purposeful iff the workflow template specifies at least one end result

    • A workflow template <C, L, I, G> is purposeful iff G ≠ Ø

  • Grounded iff each component has a unique assignment to an executable component

    • A workflow template <C, L, I, G> is grounded iff ∀ c ∈ C, grounded(c)

  • Complete iff satisfied, purposeful, and grounded

  • Acyclic iff no loops

    • A workflow template <C, L, I, G> is acyclic iff ∀ c ∈ C, c is not Linked to c

  • Justified iff all components contribute to the end results

    • A component c ∈ C is justified iff c ∈ G or ∃ c2 ∈ G where c is Linked to c2

  • Parsimonious iff there are no redundant links or components

    • A link l <co, po, ci, pi> ∈ L is redundant iff ∃ link l2 <co’, po’, ci’, pi’> ∈ L s.t. l ≠ l2 and co = co’ and po = po’ and ci = ci’ and pi = pi’

  • Well-Formed iff acyclic, justified, and parsimonious

  • Consistent iff all links satisfy defined component requirements and constraints

    • A link <c1, p1, c2, p2> is type-consistent iff subtype-of(range(c1,p1), range(c2,p2))

    • A link <c1, p1, c2, p2> is semantically-consistent iff subsumes(range(c1,p1), range(c2,p2))

  • Correct iff complete, well-formed, and consistent
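The structural properties above translate fairly directly into set and graph checks. The sketch below is one possible reading, not the authors' implementation. It assumes a hypothetical encoding in which C maps each component to its input parameters and a grounded flag, L is a list of <co, po, ci, pi> tuples, and G is a set of end-result components; "Linked to" is read as reachability along links. It matches links to inputs on both component and parameter name, which is slightly stricter than the slide's "pi = p". Consistency is left out because it needs the ontology.

```python
from collections import defaultdict


def reachable(links):
    """Build reach(c): every component reachable from c by following links."""
    succ = defaultdict(set)
    for co, _po, ci, _pi in links:
        succ[co].add(ci)

    def reach(c):
        seen, stack = set(), list(succ[c])
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(succ[n])
        return seen

    return reach


def check_template(C, L, G):
    """Evaluate the structural properties of a template <C, L, I, G>."""
    reach = reachable(L)
    fed = {(ci, pi) for _co, _po, ci, pi in L}  # inputs that have a source
    props = {
        "satisfied": all((c, p) in fed for c, spec in C.items() for p in spec["inputs"]),
        "purposeful": len(G) > 0,
        "grounded": all(spec["grounded"] for spec in C.values()),
        "acyclic": all(c not in reach(c) for c in C),
        "justified": all(c in G or bool(reach(c) & G) for c in C),
        "parsimonious": len(L) == len(set(L)),  # no duplicated link
    }
    props["complete"] = props["satisfied"] and props["purposeful"] and props["grounded"]
    props["well-formed"] = props["acyclic"] and props["justified"] and props["parsimonious"]
    return props


# Hypothetical two-step template: the IMR step is still abstract (not grounded).
C = {
    "UTM-Converter": {"inputs": [], "grounded": True},
    "IMR": {"inputs": ["Lat"], "grounded": False},
}
L = [("UTM-Converter", "Lat", "IMR", "Lat")]
G = {"IMR"}
result = check_template(C, L, G)
```

Here `result` reports the template as well-formed but not complete, since the IMR step has not yet been specialized to an executable component.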


Assisting Users in Creating Workflow Templates

  • User interaction results in modifications to workflows

    • Specify desired result, external/user provided input

    • Add/remove step, add/remove link

    • Specialize step (e.g., IMR -> IMR-SA)

  • As user creates a workflow, intermediate stages result in possibly incorrect workflows

  • ErrorScan algorithm detects errors and generates possible fixes

  • Fixes are multi-step and “click-through”

  • Errors and fixes are ranked using heuristics

  • If no errors detected, workflow is guaranteed to be correct


Assisting Users in Creating Workflow Templates: ErrorScan Algorithm

ErrorScan

Input: workflow W = <C, L, I, G>

Output: list of errors and corresponding fix suggestions

I. If W is not purposeful, return Error.

    Suggestions: define end result e using types from the KB, AddEndResult(e).

II. For each component c in W:

    a. If c is not justified, return Error.

        Suggestions: ∀ p that is an output-parameter(c), find components cj in the workflow or the KB that have pj as input-parameter(cj) and subsumes(pj, p); AddLink(c, p, cj, pj).

    b. If c is not grounded, return Error.

        Suggestions: ∀ cj ∈ FindDirectSubtypes(c), SpecializeComponent(c, cj).

    c. For each i in input-parameter(c):

        1. If i is not satisfied, return Error.

            Suggestions: ∀ cj ∈ C with output parameter pj such that subsumes(range(c,i), range(cj,pj)), AddLink(cj, pj, c, i).

            Suggestions: ∀ cj ∈ FindMatchingOutput(i), AddLink(cj, pj, c, i).

            Suggestion: AddAndLinkComponent(W, AddInitialInput(i), range(i), c, i).

III. For each link l in W:

    a. If l is not consistent, return Error.

        Suggestions: ∀ ci ∈ FindInterPosingComponent(l), InterposeComponent(ci, l).

        Suggestion: RemoveLink(l).

    b. If l is redundant, return Error.

        Suggestion: RemoveLink(l).
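To make the shape of ErrorScan concrete, here is a simplified Python sketch. It is an approximation under stated assumptions rather than the published algorithm: the dictionary encoding of components, the `subtypes` table, and the fix strings are hypothetical; subsumption between parameters is reduced to name equality; only the current workflow (not the KB) is searched for matching components; and the link consistency check is omitted.

```python
def error_scan(C, L, G, subtypes):
    """Return a list of (error kind, subject, suggested fixes).

    C: component -> {"inputs": [...], "outputs": [...], "grounded": bool}
    L: list of links (co, po, ci, pi); G: set of end-result components
    subtypes: component -> direct subtypes known to the KB (hypothetical table)
    """
    errors = []
    if not G:  # step I: purposeful
        errors.append(("not purposeful", None, ["AddEndResult(e)"]))

    succ = {}
    for co, _po, ci, _pi in L:
        succ.setdefault(co, set()).add(ci)

    def reaches_goal(c):
        seen, stack = set(), [c]
        while stack:
            n = stack.pop()
            if n in G:
                return True
            if n not in seen:
                seen.add(n)
                stack.extend(succ.get(n, ()))
        return False

    fed = {(ci, pi) for _co, _po, ci, pi in L}
    for c, spec in C.items():  # step II
        if not reaches_goal(c):  # II.a: justified
            fixes = [f"AddLink({c},{p},{cj},{pj})"
                     for p in spec["outputs"]
                     for cj, sj in C.items() if cj != c
                     for pj in sj["inputs"] if pj == p]
            errors.append(("not justified", c, fixes))
        if not spec["grounded"]:  # II.b: grounded
            fixes = [f"SpecializeComponent({c},{cj})" for cj in subtypes.get(c, [])]
            errors.append(("not grounded", c, fixes))
        for i in spec["inputs"]:  # II.c: satisfied
            if (c, i) not in fed:
                fixes = [f"AddLink({cj},{pj},{c},{i})"
                         for cj, sj in C.items() if cj != c
                         for pj in sj["outputs"] if pj == i]
                fixes.append(f"AddInitialInput({i})")
                errors.append(("not satisfied", (c, i), fixes))

    seen_links = set()
    for link in L:  # step III.b: redundant
        if link in seen_links:
            errors.append(("redundant", link, ["RemoveLink(" + ",".join(link) + ")"]))
        seen_links.add(link)
    return errors


# Hypothetical partial workflow: IMR is abstract and its Long input has no source.
C = {
    "UTM-Converter": {"inputs": ["UTM"], "outputs": ["Lat", "Long"], "grounded": True},
    "IMR": {"inputs": ["Lat", "Long"], "outputs": ["SA-exc-prob"], "grounded": False},
}
L = [("UTM-Converter", "Lat", "IMR", "Lat")]
G = {"IMR"}
errors = error_scan(C, L, G, subtypes={"IMR": ["IMR-SA"]})
```

Unlike the pseudocode, this version collects every error in one pass instead of returning at the first one, which makes it easier to rank errors and fixes afterwards.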


CAT: Composition Analysis Tool to Create Pathway Templates

Declarative descriptions of models are linked to ontologies and reasoners

System reasons about model constraints and points out errors and fixes

User builds a pathway specification from library of models

System guarantees correctness of pathway templates


Results

Scientific Workflow Template Executed as Workflow

End result: Hazard Map (around USC area)


Conclusions and Future Work

  • Mixed-initiative approach to create workflows incorporates:

    • Knowledge representation and reasoning

    • Planning principles

  • Ongoing work to deploy ErrorScan as a grid service

  • Plan to integrate with automatic planning algorithm (Pegasus):

    • To complete workflow template upon user’s request

    • To provide more sophisticated suggestions

    • To handle resource assignment automatically

  • Instantiate workflow template with input data

    • Using mixed-initiative query planning system [Tuchinda et al IAAI 04]

  • Longer term: iterative closed-loop workflow creation and execution


http://www.isi.edu/ikcap/cat

  • http://www.isi.edu/ikcap/cat: On-line demonstration, access to portal, publications

    • Jihie Kim, Marc Spraragen, and Yolanda Gil, An Intelligent Assistant for Interactive Workflow Composition, Proceedings of the International Conference on Intelligent User Interfaces (IUI), 2004.

    • Jihie Kim and Yolanda Gil, Towards Interactive Composition of Semantic Web Services, AAAI Spring Symposium on Semantic Web Services, 2004.

    • Marc Spraragen. Mixed-Initiative Workflow Composition, AAAI student paper, 2004.

    • Jihie Kim and Yolanda Gil, Towards Interactive Composition of Semantic Web Services (Poster), 2nd International Semantic Web Conference (ISWC), 2003.

  • pegasus.isi.edu, www.isi.edu/ikcap/cognitive-grids

  • www.isi.edu/~gil, [email protected]



Pegasus: Fully Automated Workflow Generation for Computational Grids (joint work with E. Deelman, J. Blythe, C. Kesselman, and GriPhyN participants)

[Deelman et al JGC’03; Blythe et al IAAI’03; Blythe et al ICAPS’03; Gil et al IEEE IS’04]

  • Given: desired result and constraints

    • A desired result (high-level, metadata description)

    • A set of application components described in the Grid

    • A set of resources in the Grid (dynamic, distributed)

    • A set of constraints and preferences on solution quality

  • Find: an executable job workflow

    • A configuration of components that generates the desired result

    • A specification of resources where components can be executed and data can be stored

  • Approach: Use AI planning techniques to search the solution space and evaluate tradeoffs

    • Specified as initial state, goal state, components/steps

    • Exploit heuristics to direct the search for solutions and represent optimality and policy criteria
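The formulation as initial state, goal state, and components can be illustrated with a toy backward-chaining search. This is a hedged sketch and not Pegasus itself: it ignores resources, heuristics, and solution quality, the component names are borrowed from the earlier template figure, and their input and output sets are invented for illustration.

```python
def plan(goal, initial, components):
    """Return an ordered list of component names that produces every goal item.

    components: name -> (set of required inputs, set of produced outputs)
    Returns None when some goal item cannot be derived from the initial state.
    """
    steps = []

    def achieve(item, pending):
        if item in initial:
            return True
        if item in pending:  # already being derived higher up: avoid cycles
            return False
        for name, (needs, makes) in components.items():
            if item not in makes:
                continue
            mark = len(steps)
            if all(achieve(n, pending | {item}) for n in sorted(needs)):
                if name not in steps:
                    steps.append(name)  # appended after its prerequisites
                return True
            del steps[mark:]  # undo steps added by the failed attempt
        return False

    ok = all(achieve(g, frozenset()) for g in sorted(goal))
    return steps if ok else None


# Hypothetical component descriptions (inputs and outputs are assumptions).
components = {
    "utm-converter": ({"UTM"}, {"Lat", "Long"}),
    "cvm-velocity": ({"Lat", "Long"}, {"VS30"}),
    "peer-fault": ({"Fault-Params"}, {"Ruptures"}),
    "imr": ({"VS30", "Ruptures"}, {"SA-exc-prob"}),
    "hazard-curve": ({"SA-exc-prob", "Ruptures"}, {"Hazard-Curve"}),
}
workflow = plan({"Hazard-Curve"}, {"UTM", "Fault-Params"}, components)
```

Working backward from the desired result means only components that contribute to it are selected; a fuller planner would also compare alternative components and resource assignments when several can produce the same item.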

