workflow management in grids l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Workflow Management in Grids PowerPoint Presentation
Download Presentation
Workflow Management in Grids

Loading in 2 Seconds...

play fullscreen
1 / 37

Workflow Management in Grids - PowerPoint PPT Presentation


  • 349 Views
  • Uploaded on

Workflow Management in Grids Ewa Deelman Center for Grid Technologies USC Information Sciences Institute People Involved Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Jaskaran Singh, Karan Vahi (ISI) Jim Blythe, Yolanda Gil (ISI) Anirban Mandal (Rice) Outline

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Workflow Management in Grids' - KeelyKia


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
workflow management in grids

Workflow Management in Grids

Ewa Deelman

Center for Grid Technologies

USC Information Sciences Institute

people involved
People Involved
  • Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Jaskaran Singh, Karan Vahi (ISI)
  • Jim Blythe, Yolanda Gil (ISI)
  • Anirban Mandal (Rice)
outline
Outline
  • The GriPhyN project and Grid Applications
  • Workflow Management in Grids
  • Pegasus, Planning for Execution in Grids
    • Framework Description
    • Generation of Executable Workflows
      • Based on abstract workflow descriptions
      • Based on application-level metadata
  • Application of Pegasus to the LIGO pulsar search
  • Future Research Directions
griphyn data grid challenge
GriPhyN Data Grid Challenge
  • Provide a framework which enables Virtual Organizations around the world to perform computationally demanding analysis of large, geographically distributed datasets.
  • The Virtual Organizations are large and highly distributed
  • The datasets are large, currently on the order of Terabytes and expected to grow to the level of 100s of Petabytes in the next decade
  • Provide a seamless access to data: experimental raw data or processed data products
  • Enable a user/application to ask for any domain-specific data, whether computed or not

Concept of Virtual Data

griphyn project
GriPhyN Project
  • 5-year, $12.5M NSF ITR proposal to realize the concept of virtual data
  • Four Applications:
    • ATLAS and CMS experiments at Large Hadron Collider at CERN
    • SDSS (Sloan Digital Sky Survey)
    • LIGO (Laser Interferometer Gravitational-wave Observatory)
  • Key research areas:
    • Virtual data technologies (information models, management of virtual data software, etc.)
    • Request planning and scheduling (including policy representation and enforcement)
    • Task execution (fault tolerance)
  • CS Participants: U.Chicago, USC/ISI, UW-Madison, UCSD, UCB, Indiana, Northwestern, Florida
  • Close collaboration and participation from the Physicists community
applications
Applications
  • Increasing in the level of complexity
  • Use of individual application components
  • Reuse of individual intermediate data products
  • Description of Data Products using Metadata Attributes
  • Execution environment is complex and very dynamic
    • Resources come and go
    • Data is replicated
    • Components can be found at various locations or staged in on demand
  • Separation between
    • the application description
    • the actual execution description
outline7
Outline
  • The GriPhyN project and Grid Applications
  • Workflow Management in Grids
  • Pegasus, Planning for Execution in Grids
    • Framework Description
    • Generating Executable Workflows
      • Based on abstract workflow descriptions
      • Based on application-level metadata
  • Application of Pegasus to the LIGO pulsar search
  • Future Research Directions
slide8

Abstract Workflow Generation

Concrete

Workflow Generation

generating an abstract workflow
Generating an Abstract Workflow
  • Available Information
    • Specification of component capabilities
    • Ability to generate the desired data products

Select and configure application components to form an abstract workflow

    • assign input files that exist or that can be generated by other application components.
    • specify the order in which the components must be executed
    • components and files are referred to by their logical names
      • Logical transformation name
      • Logical file name
      • Both transformations and data can be replicated
generating a concrete workflow
Generating a Concrete Workflow
  • Information
    • location of files and component Instances
    • State of the Grid resources
  • Select specific
    • Resources
    • Files
    • Adding jobs required to form a concrete workflow that can be executed in the Grid environment
      • Data movement
    • Data registration
    • Each component in the abstract workflow is turned into an executable job
why automate workflow generation
Why Automate Workflow Generation?
  • Usability: Limit User’s necessary Grid knowledge
      • Monitoring and Directory Service
      • Replica Location Service
      • Transformation Catalog
  • Complexity:
    • User needs to make choices
      • Alternative application components
      • Alternative files
      • Alternative locations
    • The user may reach a dead end
    • Many different interdependencies may occur among components
  • Solution cost:
    • Evaluate the alternative solution costs
      • Performance
      • Reliability
      • Resource Usage
  • Global cost:
    • minimizing cost within a community or a virtual organization
    • requires reasoning about individual user’s choices in light of other user’s choices
outline13
Outline
  • The GriPhyN project and Grid Applications
  • Workflow Management in Grids
  • Pegasus, Planning for Execution in Grids
    • Framework Description
    • Generating Executable Workflows
      • Based on abstract workflow descriptions
      • Based on application-level metadata
  • Application of Pegasus to the LIGO pulsar search
  • Future Research Directions
pegasus planning for execution in grids
Pegasus, Planning for Execution in Grids
  • performs the mapping from an abstract workflow to a concrete workflow, which can be executed on the Grid
  • isolates the user from many Grid details
  • automatically locates physical locations for both components (transformations) and data, via Globus RLS and the Transformation Catalog
  • finds appropriate resources to execute the components (via Globus MDS)
  • whenever several alternatives are possible (e.g., alternative physical files, alternative resources) it makes a random choice
  • publishes newly derived data products
  • reuses existing data products where applicable
slide15

Job c

Job a

Job b

Job f

Job d

Job e

Job g

Job h

Job i

KEY

The original node

Input transfer node

Registration node

Output transfer node

Node deleted by Reduction algorithm

  • The output jobs for the Dag are all the leaf nodes i.e. f, h, i.
  • Each job requires 2 input files and generates 2 output files.
  • The user specifies the output location

Abstract Workflow Reduction

slide16

KEY

The original node

Input transfer node

Registration node

Output transfer node

Node deleted by Reduction algorithm

Optimizing from the point of view of Virtual Data

Job c

Job a

Job b

Job f

Job d

Job e

Job g

Job h

Job i

  • Jobs d, e, f have output files that have been found in the Replica Location Service.
  • Additional jobs are deleted.
  • All jobs (a, b, c, d, e, f) are removed from the DAG.
slide17

KEY

The original node

Input transfer node

Registration node

Output transfer node

Node deleted by Reduction algorithm

  • Planner picks execution
  • and replica locations
    • Plans for staging data in

Job c

adding transfer nodes for the input files for the root nodes

Job a

Job b

Job f

Job d

Job e

Job g

Job h

Job i

slide18

transferring the output files of the leaf job (f) to the output location

Staging data out and registering

new derived products in the RLS

Job c

Job a

Job b

Job f

Job d

Job e

Job g

Job h

Staging and registering for each job that materializes data (g, h, i ).

Job i

KEY

The original node

Input transfer node

Registration node

Output transfer node

Node deleted by Reduction algorithm

slide19

The final, executable DAG

Input DAG

Job g

Job h

Job c

Job a

Job b

Job i

Job f

KEY

The original node

Input transfer node

Registration node

Output transfer node

Job d

Job e

Job g

Job h

Job i

pegasus initial solution
Pegasus’ Initial Solution
  • Reused existing data products
  • Used in a variety of complex applications
    • Astronomy
    • Bioinformatics
    • High-energy physics
      • Cluster of 25 dual-processor Pentium machines.
      • Computation: 7 days, 678 jobs with 250 events each
      • Produced ~ 200GB of simulated data.
  • Provided a feasible solution, but not necessarily a low-cost one
  • To improving the quality of the solution:
    • Need to efficiently search large problem space and apply both local and global optimizations
    • Need to compose various optimization strategies
    • Need to reuse the resource and component models.
ai planning technologies
AI Planning technologies
  • Provide broad-base, generic implementation foundation
  • Use techniques, such as backtracking, and domain-specific and domain-independent control rules
  • Allow the easy addition of new system constraints and rules
  • Incorporate optimality and policy into the search for solutions
  • Can integrate the generation of workflows across users and policies within VOs.
  • Collaboration with Yolanda Gil and Jim Blythe (ISI)
ai based abstract and concrete planner
AI-based Abstract and Concrete Planner

Goal (Provided by the User)

  • A metadata specification of the information the user requires and the desired location for the output file

Initial State (Automatically Extracted)

  • Information about the state of the Grid
  • Information about data location

Operators (encoded for the application domain)

  • Represent the execution of a component at a particular location and the generation a particular file(s)
  • A file movement across the network

Search control rules (Grid or application specific)

  • Specify options that should be exclusively considered at any choice point in the search algorithm (execute “close” to the data
techniques used by the planner
Techniques Used by the Planner
  • Uses search algorithms to explore the solution space
  • Uses local heuristics to choose between alternatives and prune the solution space
    • Execute “close” to the data
  • Global evaluation of alternative plans (user, community specific)
    • Performance
    • Reliability
    • Resource usage
outline25
Outline
  • The GriPhyN project and Grid Applications
  • Workflow Management in Grids
  • Pegasus, Planning for Execution in Grids
    • Framework Description
    • Generating Executable Workflows
      • Based on abstract workflow descriptions
      • Based on application-level metadata
  • Application of Pegasus to the LIGO pulsar search
  • Future Research Directions
ligo experiment laser interferometer gravitational wave observatory

LIGO Livingston

LIGO Experiment(Laser Interferometer Gravitational-Wave Observatory)
  • Aims to detect gravitational waves predicted by Einstein’s theory of relativity.
  • Can be used to detect
    • binary pulsars
    • mergers of black holes
    • “starquakes” in neutron stars
  • Two installations: in Louisiana (Livingston) and Washington State
    • Other projects: Virgo (Italy), GEO (Germany), Tama (Japan)
  • Instruments are designed to measure the effect of gravitational waves on test masses suspended in vacuum.
  • Data collected during experiments is a collection of time series (multi-channel)
  • Analysis is performed in time and Fourier domains
ligo s pulsar search laser interferometer gravitational wave observatory

archive

Interferometer

Hz

Time

raw channels

LIGO’s Pulsar Search(Laser Interferometer Gravitational-wave Observatory)

Extract

channel

Short

Fourier

Transform

transpose

Long time frames

30 minutes

Short time frames

Single Frame

Time-frequency Image

Extract frequency range

event DB

Construct image

Find Candidate

Store

ligo s problem decomposition
LIGO’s Problem Decomposition
  • Originally all the pulsar search conducted inside the LIGO Data Analysis System
    • Resources not sufficient to conduct all the desired analysis
  • The various components of the search, SFT, sSFT, pulsar search were identified and described in terms of metadata attributes
  • Subsets of the code were ported to the linux environment
  • Interface was created to LDAS to schedule jobs, stage data in and out and send the results of the analysis to the LIGO event database
  • Formed the application-level components
  • High level of interest and participation from the GEO experiment in Germany

LIGO collaborators: Kent Blackburn, Albert Lazzarini (Caltech) Scott Koranda (UWM) and others

GEO collaborators: Maria Alessadra Papa (Albert Einstein Institute, Germany),

Alicia Sintes (Balearic Islands University, Spain)

slide29

LIGO’s pulsar search at SC 2002

  • The pulsar search conducted at SC 2002
    • Used LIGO’s data collected during the first scientific run of the instrument
    • Targeted a set of 1000 locations of known pulsar as well as random locations in the sky
    • Results of the analysis were be published via LDAS (LIGO Data Analysis System) to the LIGO Scientific Collaboration
    • performed using LDAS and compute and storage resources at Caltech, University of Southern California, University of Wisconsin Milwaukee.

Visualization by Marcus Thiebaux

results
SC 2002 demo

Over 58 pulsar searches

Total of

330 tasks

469 data transfers

330 output files produced.

The total runtime was 11:24:35.

To date

185 pulsar searches

Total of

975 tasks

1365 data transfers

975 output files

Total runtime

96:49:47

Results
summary so far
Summary so far
  • To date
    • Applied to gravitational wave physics applications
    • Solution based on application component composition
    • Reasoning at the metadata level
    • Simple optimizations
  • Challenges Tomorrow
    • Build and integrate models and ontologies
      • Application components
      • Grid components
    • Apply and extend AI planning technologies to the problem domain
      • More sophisticated solutions
    • Generic solution, but also application relevant
    • Investigate Reliability in Grids
outline32
Outline
  • The GriPhyN project and Grid Applications
  • Workflow Management in Grids
  • Pegasus, Planning for Execution in Grids
    • Framework Description
    • Generating Executable Workflows
      • Based on abstract workflow descriptions
      • Based on application-level metadata
  • Application of Pegasus to the LIGO pulsar search
  • Future Research Directions
challenges
Challenges
  • Dealing with faults in the system
    • Fault tolerance
    • Fault avoidance
  • Supporting a wide spectrum of applications
  • Optimizing application performance
  • Incorporating policies in the planning decisions
  • Providing data provenance information for derived data
future research directions smart middleware for the grid
Future Research Directions: Smart middleware for the Grid
  • Knowledge everywhere: expressive knowledge sources to support all levels of decision-making in the Grid
  • Intelligent reasoners: use knowledge and operate incrementally to reason about tradeoffs in task allocation
  • Smart workflows: augment workflows with annotations, such as constraints and alternatives, to support incremental updates by heterogeneous agents

Continued Collaboration with Jim Blythe, Yolanda Gil and Hongsuda Tangmunarunkit

slide35

Workflow History

Community

Users

Workflow

history

Workflow

history

High-level specification of

desired results, constraints,

requirements, user policies

Policy

Information

Services

Policy

KB

Other

KB

Application

KB

Resource

KB

Workflow

Refinement

Policy

Management

Other

Grid

services

Smart Workflow

Pool

Resource

Matching

Workflow

Repair

Workflow Manager

Resource

Indexes

Simulation

codes

Replica

Locators

Community Distributed Resources

(e.g., computers, storage, network,

simulation codes, data)

Pervasive Knowledge Sources

Intelligent Reasoners

slide36

User’s

Workflow

refinement

Allow incremental workflow generation

Request

Levels of

abstraction

Policy

reasoner

Application

Workflow

repair

-level

knowledge

Relevant

components

Logical

tasks

Full

abstract

workflow

Tasks

bound to

Onto-based

Matchmaker

resources

Partial

and sent for

execution

execution

Not yet

time

executed

executed

publications
Publications

AI Forum

  • “The Role of Planning in Grid Computing”Jim Blythe, Ewa Deelman, Yolanda Gil, Carl Kesselman, Amit Agarwal, Gaurang Mehta, Karan Vahi, accepted to the International Conference on Automated Planning and Scheduling 2003
  • “Transparent Grid Computing: a Knowledge-Based Approach”Jim Blythe, Ewa Deelman, Yolanda Gil, Carl Kesselman, accepted to Innovative Applications of Artificial Intelligence Conference 2003

Grid Forum

  • “Workflow Management in GriPhyN”, Chapter in “The Grid Resource Management” book, E. Deelman, J. Blythe, Y. Gil, Carl Kesselman 2003
  • "Mapping Abstract Complex Workflows onto Grid Environments," E. Deelman, J. Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Karan Vahi, Adam Arbree, Richard Cavanaugh, Kent Blackburn, Albert Lazzarini, Scott Koranda, to appear in the Journal of Grid Computing, 2003.
  • “Pegasus: Planning for Execution in Grids”, Ewa Deelman, Jim Blythe, Yolanda Gil, Carl Kesselman, Technical report GriPhyN 2002-20.

www.griphyn.org, www.isi.edu/~deelman/pegasus.htm