
Virtual Data Management for Grid Computing

Michael Wilde

Argonne National Laboratory

[email protected]

Gaurang Mehta, Karan Vahi

Center for Grid Technologies

USC Information Sciences Institute

{gmehta,vahi}@isi.edu



Outline

  • The concept of Virtual Data

  • The Virtual Data System and Language

  • Tools and Approach for Running on the Grid

  • Pegasus: Grid Workflow Planning

  • Virtual Data Applications on the Grid

  • Research issues



The Virtual Data Concept

Enhance scientific productivity through:

  • Discovery and application of datasets and programs at petabyte scale

  • Enabling use of a worldwide data grid as a scientific workstation

    Virtual Data enables this approach by creating datasets from workflow “recipes” and recording their provenance.



Virtual Data Concept

  • Motivated by next generation data-intensive applications

    • Enormous quantities of data, petabyte-scale

    • The need to discover, access, explore, and analyze diverse distributed data sources

    • The need to organize, archive, schedule, and explain scientific workflows

    • The need to share data products and the resources needed to produce and store them



GriPhyN – The Grid Physics Network

  • NSF-funded IT research project

  • Supports the concept of Virtual Data, where data is materialized on demand

  • Data can exist on some data resource and be directly accessible

  • Data can exist only in the form of a recipe

  • The GriPhyN Virtual Data System can seamlessly deliver the data to the user or application regardless of the form in which the data exists

  • GriPhyN targets applications in high-energy physics, gravitational-wave physics and astronomy



Virtual Data System Capabilities

Producing data from transformations with uniform, precise data interface descriptions enables…

  • Discovery: finding and understanding datasets and transformations

  • Workflow: structured paradigm for organizing, locating, specifying, & producing scientific datasets

    • Forming new workflow

    • Building new workflow from existing patterns

    • Managing change

  • Planning: automated to make the Grid transparent

  • Audit: explanation and validation via provenance


Virtual Data Example: Galaxy Cluster Search

(Figure: workflow DAG that transforms Sloan data into a galaxy cluster size distribution)

Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, University of Chicago


Virtual Data Application: High Energy Physics Data Analysis

(Figure: a tree of virtual data products, each described by metadata attributes such as mass = 200, decay = bb / ZZ / WW, stability = 1 or 3, event = 8, plot = 1, LowPt = 20, HighPt = 10000)

Work and slide by Rick Cavanaugh and Dimitri Bourilkov, University of Florida



Lifecycle of Virtual Data

I have some subject images, what analyses are available? which can be applied to this format?

I need to document the devices and methods used to measure the data, and keep track of the analysis steps applied to the data.

I’ve come across some interesting data, but I need to understand the nature of the analyses applied when it was constructed before I can trust it for my purposes.

I want to apply an image registration program to thousands of objects. If the results already exist, I’ll save weeks of computation.



A Virtual Data Grid


Virtual Data Scenario

(Figure: a derivation graph relating the programs simulate, reformat, conv, summarize, and psearch to files file1 through file8)

Manage workflow;

Update workflow following changes

On-demand data generation

Explain provenance, e.g. for file8:

  • psearch -t 10 -i file3 file4 file5 -o file8
    summarize -t 10 -i file6 -o file7
    reformat -f fz -i file2 -o file3 file4 file5
    conv -l esd -o aod -i file2 -o file6
    simulate -t 10 -o file1 file2
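The trace above doubles as a recipe database: each line names a program, its inputs, and its output. A minimal Python sketch of on-demand materialization driven by such recipes (the recipe table and the run() stub are illustrative, not the actual Virtual Data System implementation):

# Hypothetical recipe table transcribed from the provenance trace above:
# each derivable file maps to (program invocation, input files).
RECIPES = {
    "file1": ("simulate -t 10", []),
    "file2": ("simulate -t 10", []),
    "file3": ("reformat -f fz", ["file2"]),
    "file4": ("reformat -f fz", ["file2"]),
    "file5": ("reformat -f fz", ["file2"]),
    "file6": ("conv -l esd -o aod", ["file2"]),
    "file7": ("summarize -t 10", ["file6"]),
    "file8": ("psearch -t 10", ["file3", "file4", "file5"]),
}

existing = set()        # logical files already registered in a replica catalog

def run(command, inputs, output):
    """Stand-in for planning and running the real job on the Grid."""
    print(f"run: {command} -i {' '.join(inputs)} -o {output}")

def materialize(lfn):
    """Recursively (re)create a file from its recipe unless a replica exists."""
    if lfn in existing:
        return
    command, inputs = RECIPES[lfn]
    for dep in inputs:                 # prerequisites first
        materialize(dep)
    run(command, inputs, lfn)
    existing.add(lfn)

materialize("file8")                   # prints the derivation chain ending in psearch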


VDL and Abstract Workflow

(Figure: VDL descriptions are used to build an abstract workflow in response to a user request for data file "c")



Example Workflow Reduction

  • Original abstract workflow

  • If “b” already exists (as determined by a query to the RLS), the workflow can be reduced (see the sketch below)
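A minimal sketch of that reduction, assuming the workflow is given as a map from each derivable file to the job that produces it, and that the RLS query has already returned the set of existing files (all names here are illustrative):

def reduce_workflow(producers, requested, existing):
    """Return the jobs that still have to run.

    producers: {lfn: (job_name, input_lfns)} for every derivable file
    requested: the files the user asked for
    existing:  files already registered, e.g. found by querying the RLS
    """
    needed_jobs = set()

    def need(lfn):
        if lfn in existing:            # a replica exists: prune this whole branch
            return
        job, inputs = producers[lfn]
        if job in needed_jobs:
            return
        needed_jobs.add(job)
        for dep in inputs:             # walk upward only through missing files
            need(dep)

    for lfn in requested:
        need(lfn)
    return needed_jobs

# Chain a -> b -> c: if "b" already exists, only the job deriving "c" remains.
producers = {"b": ("job_ab", ["a"]), "c": ("job_bc", ["b"])}
print(reduce_workflow(producers, requested={"c"}, existing={"a", "b"}))   # {'job_bc'}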


Mapping from Abstract to Concrete

  • Query the RLS, MDS, and TC; schedule computation and data movement (sketched below)
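A hedged sketch of that mapping step in Python. The catalog clients are stand-ins (plain dictionaries and a callback), not the real RLS, MDS, or TC interfaces; the point is only the control flow: pick a site where the executable is installed and healthy, stage in any non-local inputs, run, then register the outputs.

def concretize(abstract_jobs, tc, rls, up_sites, choose_site):
    """tc:  {transformation: {site: physical path}}  (Transformation Catalog stand-in)
    rls: {lfn: (site, physical file)}              (Replica Location Service stand-in)
    up_sites: sites currently reported healthy      (MDS stand-in)
    choose_site: site-selection callback (random, round-robin, ...)"""
    plan = []
    for job in abstract_jobs:
        candidates = [s for s in tc[job["tr"]] if s in up_sites]
        site = choose_site(job, candidates)
        for lfn in job["inputs"]:
            src_site, pfn = rls[lfn]
            if src_site != site:                    # data not local: add a stage-in job
                plan.append({"type": "transfer", "lfn": lfn, "from": src_site, "to": site})
        plan.append({"type": "compute", "tr": job["tr"],
                     "site": site, "exe": tc[job["tr"]][site]})
        for lfn in job["outputs"]:                  # publish the new replicas
            plan.append({"type": "register", "lfn": lfn, "site": site})
    return plan

# One-job example with made-up names:
tc = {"findrange": {"siteA": "/opt/bin/findrange"}}
rls = {"f.b1": ("siteB", "gsiftp://siteB/data/f.b1")}
job = {"tr": "findrange", "inputs": ["f.b1"], "outputs": ["f.c1"]}
print(concretize([job], tc, rls, up_sites={"siteA"}, choose_site=lambda j, c: c[0]))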



Workflow management in GriPhyN

  • Workflow Generation: how do you describe the workflow (at various levels of abstraction)? (Virtual Data Language and catalog, "VDC")

  • Workflow Mapping/Refinement: how do you map an abstract workflow representation to an executable form? (“Planners”: Pegasus)

  • Workflow Execution: how do you reliably execute the workflow? (Condor’s DAGMan)



Executable Workflow Construction

  • VDL tools used to build an abstract workflow based on VDL descriptions

  • Planners (e.g. Pegasus) take the abstract workflow and produce an executable workflow for the Grid or other environments

  • Workflow executors (“enactment engines”) like Condor DAGMan execute the workflow



Terms

  • Abstract Workflow (DAX)

    • Expressed in terms of logical entities

    • Specifies all logical files required to generate the desired data product from scratch

    • Dependencies between the jobs

    • Analogous to a "build"-style DAG

  • Concrete Workflow

    • Expressed in terms of physical entities

    • Specifies the location of the data and executables

    • Analogous to a "make"-style DAG
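The distinction can be made concrete with two small record types; the field names are purely illustrative, not the actual DAX schema.

from dataclasses import dataclass, field
from typing import List

@dataclass
class AbstractJob:
    """DAX-style node: everything is a logical name (a "build"-style DAG)."""
    transformation: str                                  # e.g. "findrange"
    inputs: List[str] = field(default_factory=list)      # logical file names
    outputs: List[str] = field(default_factory=list)
    parents: List[str] = field(default_factory=list)     # job dependencies

@dataclass
class ConcreteJob:
    """Executable node: bound to physical locations (a "make"-style DAG)."""
    executable: str                                      # e.g. a path on the chosen site
    site: str
    input_pfns: List[str] = field(default_factory=list)  # physical file names / URLs
    output_pfns: List[str] = field(default_factory=list)
    parents: List[str] = field(default_factory=list)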



Outline

  • The concept of Virtual Data

  • The Virtual Data System and Language

  • Tools and Approach for Running on the Grid

  • Pegasus: Grid Workflow Planning

  • Virtual Data Applications on the Grid

  • Research issues


VDL: Virtual Data Language Describes Data Transformations

  • Transformation - “TR”

    • Abstract template of program invocation

    • Similar to "function definition"

  • Derivation – “DV”

    • “Function call” to a Transformation

    • Store past and future:

      • A record of how data products were generated

      • A recipe of how data products can be generated

  • Invocation

    • Record of a Derivation execution



Example Transformation

TR t1( out a2, in a1, none pa = "500", none env = "100000" ) {

argument = "-p "${pa};

argument = "-f "${a1};

argument = "-x -y";

argument stdout = ${a2};

profile env.MAXMEM = ${env};

}

(Figure: transformation t1 reads $a1 and writes $a2)



Example Derivations

DV d1->t1( env="20000", pa="600",
  a2=@{out:run1.exp15.T1932.summary},
  a1=@{in:run1.exp15.T1932.raw} );

DV d2->t1( a1=@{in:run1.exp16.T1918.raw},
  a2=@{out:run1.exp16.T1918.summary} );



Workflow from File Dependencies


TR tr1(in a1, out a2) {

argument stdin = ${a1}; 

argument stdout = ${a2}; }

TR tr2(in a1, out a2) {

argument stdin = ${a1};

argument stdout = ${a2}; }

DV x1->tr1(a1=@{in:file1}, a2=@{out:file2});

DV x2->tr2(a1=@{in:file2}, a2=@{out:file3});

(Figure: file1 → x1 → file2 → x2 → file3)



Example Workflow

  • Complex structure

    • Fan-in

    • Fan-out

    • "left" and "right" can run in parallel

  • Uses input file

    • Register with RLS

  • Complex file dependencies

    • Glues workflow

(Figure: diamond-shaped workflow: preprocess fans out to two findrange jobs, which fan in to analyze)



Workflow step "preprocess"

  • TR preprocess turns f.a into f.b1 and f.b2

    TR preprocess( output b[], input a ) {
      argument = "-a top";
      argument = " -i "${input:a};
      argument = " -o " ${output:b};
    }

  • Makes use of the "list" feature of VDL

    • Generates 0..N output files.

    • The number of output files depends on the caller.



Workflow step "findrange"

  • Turns two inputs into one output

    TR findrange( output b, input a1, input a2,
                  none name="findrange", none p="0.0" ) {
      argument = "-a "${name};
      argument = " -i " ${a1} " " ${a2};
      argument = " -o " ${b};
      argument = " -p " ${p};
    }

  • Uses the default argument feature



Can also use list[] parameters

TR findrange( output b, input a[],
              none name="findrange", none p="0.0" ) {
  argument = "-a "${name};
  argument = " -i " ${" "|a};
  argument = " -o " ${b};
  argument = " -p " ${p};
}



Workflow step "analyze"

  • Combines intermediary results

    TR analyze( output b, input a[] ) {
      argument = "-a bottom";
      argument = " -i " ${a};
      argument = " -o " ${b};
    }



Complete VDL workflow

  • Generate appropriate derivations

    DV top->preprocess( b=[ @{out:"f.b1"}, @{ out:"f.b2"} ], a=@{in:"f.a"} );

    DV left->findrange( b=@{out:"f.c1"}, a2=@{in:"f.b2"}, a1=@{in:"f.b1"}, name="left", p="0.5" );

    DV right->findrange( b=@{out:"f.c2"}, a2=@{in:"f.b2"}, a1=@{in:"f.b1"}, name="right" );

    DV bottom->analyze( b=@{out:"f.d"}, a=[ @{in:"f.c1"}, @{in:"f.c2"} ] );



Compound Transformations

  • Using compound TR

    • Permits composition of complex TRs from basic ones

    • Calls are independent

      • unless linked through LFN

    • A Call is effectively an anonymous derivation

      • Late instantiation at workflow generation time

    • Permits bundling of repetitive workflows

    • Model: Function calls nested within a function definition



Compound Transformations (cont)

  • TR diamond bundles black-diamonds

    TR diamond( out fd, io fc1, io fc2, io fb1, io fb2, in fa, p1, p2 ) {

    call preprocess( a=${fa}, b=[ ${out:fb1}, ${out:fb2} ] );

    call findrange( a1=${in:fb1}, a2=${in:fb2}, name="LEFT", p=${p1}, b=${out:fc1} );

    call findrange( a1=${in:fb1}, a2=${in:fb2}, name="RIGHT", p=${p2}, b=${out:fc2} );

    call analyze( a=[ ${in:fc1}, ${in:fc2} ], b=${fd} );

    }



Compound Transformations (cont)

  • Multiple DVs allow easy generator scripts:

    DV d1->diamond( fd=@{out:"f.00005"}, fc1=@{io:"f.00004"}, fc2=@{io:"f.00003"}, fb1=@{io:"f.00002"}, fb2=@{io:"f.00001"}, fa=@{io:"f.00000"}, p2="100", p1="0" );

    DV d2->diamond( fd=@{out:"f.0000B"}, fc1=@{io:"f.0000A"}, fc2=@{io:"f.00009"}, fb1=@{io:"f.00008"}, fb2=@{io:"f.00007"}, fa=@{io:"f.00006"}, p2="141.42135623731", p1="0" );

    ...

    DV d70->diamond( fd=@{out:"f.001A3"}, fc1=@{io:"f.001A2"}, fc2=@{io:"f.001A1"}, fb1=@{io:"f.001A0"}, fb2=@{io:"f.0019F"}, fa=@{io:"f.0019E"}, p2="800", p1="18" );



Functional MRI Analysis



fMRI Example: AIR Tools

TR air::align_warp( in reg_img, in reg_hdr, in sub_img,

in sub_hdr, m, out warp )

{

argument = ${reg_img};

argument = ${sub_img};

argument = ${warp};

argument = "-m " ${m};

argument = "-q";

}

TR air::reslice( in warp, sliced, out sliced_img, out sliced_hdr )

{

argument = ${warp};

argument = ${sliced};

}



fMRI Example: AIR Tools

TR air::warp_n_slice(

in reg_img, in reg_hdr, in sub_img, in sub_hdr, m = "12",

io warp, sliced, out sliced_img, out sliced_hdr )

{

call air::align_warp( reg_img=${reg_img}, reg_hdr=${reg_hdr},

sub_img=${sub_img}, sub_hdr=${sub_hdr},

m=${m},

warp = ${out:warp} );

call air::reslice( warp=${in:warp}, sliced=${sliced},

sliced_img=${sliced_img}, sliced_hdr=${sliced_hdr} );

}

TR air::softmean( in sliced_img[], in sliced_hdr[], arg1 = "y",

arg2 = "null", atlas, out atlas_img, out atlas_hdr )

{

argument = ${atlas};

argument = ${arg1} " " ${arg2};

argument = ${sliced_img};

}



fMRI Example: AIR Tools

DV air::i3472_3->air::warp_n_slice(

reg_hdr = @{in:"3472-3_anonymized.hdr"},

reg_img = @{in:"3472-3_anonymized.img"},

sub_hdr = @{in:"3472-3_anonymized.hdr"},

sub_img = @{in:"3472-3_anonymized.img"},

warp = @{io:"3472-3_anonymized.warp"},

sliced = "3472-3_anonymized.sliced",

sliced_hdr = @{out:"3472-3_anonymized.sliced.hdr"},

sliced_img = @{out:"3472-3_anonymized.sliced.img"} );

DV air::i3472_6->air::warp_n_slice(

reg_hdr = @{in:"3472-3_anonymized.hdr"},

reg_img = @{in:"3472-3_anonymized.img"},

sub_hdr = @{in:"3472-6_anonymized.hdr"},

sub_img = @{in:"3472-6_anonymized.img"},

warp = @{io:"3472-6_anonymized.warp"},

sliced = "3472-6_anonymized.sliced",

sliced_hdr = @{out:"3472-6_anonymized.sliced.hdr"},

sliced_img = @{out:"3472-6_anonymized.sliced.img"} );



fMRI Example: AIR Tools

DV air::a3472_3->air::softmean(

sliced_img = [

@{in:"3472-3_anonymized.sliced.img"},

@{in:"3472-4_anonymized.sliced.img"},

@{in:"3472-5_anonymized.sliced.img"},

@{in:"3472-6_anonymized.sliced.img"} ],

sliced_hdr = [

@{in:"3472-3_anonymized.sliced.hdr"},

@{in:"3472-4_anonymized.sliced.hdr"},

@{in:"3472-5_anonymized.sliced.hdr"},

@{in:"3472-6_anonymized.sliced.hdr"} ],

atlas = "atlas",

atlas_img = @{out:"atlas.img"},

atlas_hdr = @{out:"atlas.hdr"}

);



Query Examples

Which TRs can process a "subject" image?

  • Q: xsearchvdc -q tr_meta dataType subject_image input

  • A: fMRIDC.AIR::align_warp

    Which TRs can create an "ATLAS"?

  • Q: xsearchvdc -q tr_meta dataType atlas_image output

  • A: fMRIDC.AIR::softmean

    Which TRs have an output parameter "warp" and a parameter "options"?

  • Q: xsearchvdc -q tr_para warp output options

  • A: fMRIDC.AIR::align_warp

    Which DVs call TR "slicer"?

  • Q: xsearchvdc -q tr_dv slicer

  • A: fMRIDC.FSL::s3472_3_x->fMRIDC.FSL::slicer
    fMRIDC.FSL::s3472_3_y->fMRIDC.FSL::slicer
    fMRIDC.FSL::s3472_3_z->fMRIDC.FSL::slicer



Query Examples

List anonymized subject-images for young subjects. This query searches for files based on their metadata.

  • Q: xsearchvdc -q lfn_meta

    dataType subject_image

    privacy anonymized

    subjectType young

  • A: 3472-4_anonymized.img

    For a specific patient image, 3472-3, show all DVs and files that were derived from this image, directly or indirectly.

  • Q: xsearchvdc -q lfn_tree

    3472-3_anonymized.img

  • A: 3472-3_anonymized.img

    3472-3_anonymized.sliced.hdr

    atlas.hdr

    atlas.img

    atlas_z.jpg

    3472-3_anonymized.sliced.img


Virtual Provenance: list of derivations and files

<job id="ID000001" namespace="Quarknet.HEPSRCH" name="ECalEnergySum" level="5"

dv-namespace="Quarknet.HEPSRCH" dv-name="run1aesum">

<argument><filename file="run1a.event"/> <filename file="run1a.esm"/></argument>

<uses file="run1a.esm" link="output" dontRegister="false" dontTransfer="false"/>

<uses file="run1a.event" link="input" dontRegister="false" dontTransfer="false"/>

</job>

...

<job id="ID000014" namespace="Quarknet.HEPSRCH" name="ReconTotalEnergy" level="3"…

<argument><filename file="run1a.mis"/> <filename file="run1a.ecal"/> …

<uses file="run1a.muon" link="input" dontRegister="false" dontTransfer="false"/>

<uses file="run1a.total" link="output" dontRegister="false" dontTransfer="false"/>

<uses file="run1a.ecal" link="input" dontRegister="false" dontTransfer="false"/>

<uses file="run1a.hcal" link="input" dontRegister="false" dontTransfer="false"/>

<uses file="run1a.mis" link="input" dontRegister="false" dontTransfer="false"/>

</job>

<!--list of all files used -->

<filename file="ecal.pct" link="inout"/>

<filename file="electron10GeV.avg" link="inout"/>

<filename file="electron10GeV.sum" link="inout"/> <filename file="hcal.pct" link="inout"/>

...

(excerpted for display)


Virtual Provenance in XML: control flow graph

<child ref="ID000003"> <parent ref="ID000002"/> </child>

<child ref="ID000004"> <parent ref="ID000003"/> </child>

<child ref="ID000005"> <parent ref="ID000004"/>

<parent ref="ID000001"/>...

<child ref="ID000009"> <parent ref="ID000008"/> </child>

<child ref="ID000010"> <parent ref="ID000009"/>

<parent ref="ID000006"/>...

<child ref="ID000012"> <parent ref="ID000011"/> </child>

<child ref="ID000013"> <parent ref="ID000011"/> </child>

<child ref="ID000014"> <parent ref="ID000010"/>

<parent ref="ID000012"/>...

<parent ref="ID000013"/>...</child>…

(excerpted for display…)



Invocation Provenance

Completion status and resource usage

Attributes of executable transformation

Attributes of input and output files


Future: Provenance & Workflow for Web Services



Outline

  • The concept of Virtual Data

  • The Virtual Data System and Language

  • Tools and Issues for Running on the Grid

  • Pegasus: Grid Workflow Planning

  • Virtual Data Applications on the Grid

  • Research issues



Outline

  • Introduction and the GriPhyN project

  • Chimera

  • Overview of Grid concepts and tools

  • Pegasus

  • Applications using Chimera and Pegasus

  • Research issues

  • Exercises



Motivation for Grids: How do we solve problems?

  • Communities committed to common goals

    • Virtual organizations

  • Teams with heterogeneous members & capabilities

  • Distributed geographically and politically

    • No location/organization possesses all required skills and resources

  • Adapt as a function of the situation

    • Adjust membership, reallocate responsibilities, renegotiate resources



The Grid Vision

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

  • On-demand, ubiquitous access to computing, data, and services

  • New capabilities constructed dynamically and transparently from distributed services



The Grid

  • Emerging computational, networking, and storage infrastructure

    • Pervasive, uniform, and reliable access to remote data, computational, sensor, and human resources

  • Enable new approaches to applications and problem solving

    • Remote resources the rule, not the exception

  • Challenges

    • Heterogeneous components

    • Component failures common

    • Different administrative domains

    • Local policies for security and resource usage


Data Grids for High Energy Physics

(Figure: the tiered LHC computing model. The online system sends ~100 MBytes/sec (out of ~PBytes/sec of raw detector data) to the Tier 0 CERN Computer Centre, with a ~20 TIPS offline processor farm and HPSS mass storage; ~622 Mbits/sec links, or air freight (deprecated), connect Tier 1 regional centres in France, Germany, Italy, and FermiLab (~4 TIPS); further ~622 Mbits/sec links connect Tier 2 centres of ~1 TIPS each, such as Caltech; institute servers (~0.25 TIPS) hold physics data caches feeding Tier 4 physicist workstations at ~1 MBytes/sec.)

  • There is a "bunch crossing" every 25 nsecs; there are 100 "triggers" per second; each triggered event is ~1 MByte in size.

  • 1 TIPS is approximately 25,000 SpecInt95 equivalents.

  • Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.

Slide courtesy Harvey Newman, CalTech

www.griphyn.org www.ppdg.net www.eu-datagrid.org



Grid Applications

  • Increasing in the level of complexity

  • Use of individual application components

  • Reuse of individual intermediate data products

  • Description of Data Products using Metadata Attributes

  • Execution environment is complex and very dynamic

    • Resources come and go

    • Data is replicated

    • Components can be found at various locations or staged in on demand

  • Separation between

    • the application description

    • the actual execution description



Grid3 – The Laboratory

Supported by the National Science Foundation and the Department of Energy.



Globus Monitoring and Discovery Service (MDS)

The MDS architecture is a flexible hierarchy. There can be several levels of GIISes; any GRIS can register with any GIIS; and any GIIS can register with another, making this approach modular and extensible.



GridFTP

  • Data-intensive grid applications transfer and replicate large data sets (terabytes, petabytes)

  • GridFTP Features:

    • Third party (client mediated) transfer

    • Parallel transfers

    • Striped transfers

    • TCP buffer optimizations

    • Grid security

  • Important feature is separation of control and data channel



A Replica Location Service

  • A Replica Location Service (RLS) is a distributed registry service that records the locations of data copies and allows discovery of replicas

  • Maintains mappings between logical identifiers and target names

    • Physical targets: Map to exact locations of replicated data

    • Logical targets: Map to another layer of logical names, allowing storage systems to move data without informing the RLS

  • RLS was designed and implemented in a collaboration between the Globus project and the DataGrid project


(Figure: Replica Location Indexes (RLIs) aggregate mappings from several Local Replica Catalogs (LRCs))

  • LRCs contain consistent information about logical-to-target mappings on a site

  • RLIs nodes aggregate information about LRCs

  • Soft state updates from LRCs to RLIs: relaxed consistency of index information, used to rebuild index after failures

  • Arbitrary levels of RLI hierarchy
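A minimal sketch of the two-level lookup, using plain dictionaries in place of the real catalogs: each Local Replica Catalog maps logical names to physical targets at one site, and the Replica Location Index only records which LRCs know a given logical name.

class LocalReplicaCatalog:
    """LFN -> physical targets held at one site."""
    def __init__(self, site):
        self.site = site
        self.mappings = {}                      # lfn -> set of PFNs

    def add(self, lfn, pfn):
        self.mappings.setdefault(lfn, set()).add(pfn)

    def lookup(self, lfn):
        return self.mappings.get(lfn, set())

class ReplicaLocationIndex:
    """LFN -> the LRCs that hold a mapping, built from soft-state updates."""
    def __init__(self):
        self.index = {}                         # lfn -> set of LRCs

    def soft_state_update(self, lrc):
        # An LRC periodically resends the LFNs it knows; in the real service
        # stale entries simply time out, so the index can be rebuilt after failures.
        for lfn in lrc.mappings:
            self.index.setdefault(lfn, set()).add(lrc)

    def find_replicas(self, lfn):
        return {pfn for lrc in self.index.get(lfn, set()) for pfn in lrc.lookup(lfn)}

lrc_a = LocalReplicaCatalog("siteA")
lrc_a.add("f.b1", "gsiftp://siteA/data/f.b1")
rli = ReplicaLocationIndex()
rli.soft_state_update(lrc_a)
print(rli.find_replicas("f.b1"))                # {'gsiftp://siteA/data/f.b1'}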



Condor’s DAGMan

  • Developed at UW Madison (Livny)

  • Executes a concrete workflow

  • Makes sure the dependencies are followed

  • Executes the jobs specified in the workflow

    • Execution

    • Data movement

    • Catalog updates

  • Provides a “rescue DAG” in case of failure
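DAGMan's input is a plain-text DAG file: a JOB line per node pointing at a Condor submit file, plus PARENT ... CHILD lines for the dependencies. A small Python sketch that emits such a file for the diamond workflow used earlier (the submit files themselves are assumed to exist already):

def write_dag(path, jobs, edges):
    """jobs: {job name: submit file}; edges: (parent, child) pairs."""
    with open(path, "w") as dag:
        for name, submit in jobs.items():
            dag.write(f"JOB {name} {submit}\n")
        for parent, child in edges:
            dag.write(f"PARENT {parent} CHILD {child}\n")

write_dag("diamond.dag",
          jobs={"preprocess": "preprocess.sub", "left": "left.sub",
                "right": "right.sub", "analyze": "analyze.sub"},
          edges=[("preprocess", "left"), ("preprocess", "right"),
                 ("left", "analyze"), ("right", "analyze")])

# The resulting workflow would be run with: condor_submit_dag diamond.dag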



Outline

  • The concept of Virtual Data

  • The Virtual Data System and Language

  • Tools and Approach for Running on the Grid

  • Pegasus: Grid Workflow Planning

  • Virtual Data Applications on the Grid

  • Research issues



Pegasus Outline

  • Introduction

  • Pegasus and How it works

  • Pegasus Components and Internals

  • Deferred Planning

  • Pegasus Portal



Introduction

  • Scientific applications in various domains are compute and data intensive.

  • Not feasible to run on a single resource (limits scalability).

  • Scientific application can be represented as a workflow.

  • Use of individual application components.

  • Reuse data if possible (Virtual Data).


(Figure: Abstract Workflow Generation followed by Concrete Workflow Generation)



Generating an Abstract Workflow

  • Available Information

    • Specification of component capabilities

    • Ability to generate the desired data products

      Select and configure application components to form an abstract workflow

    • assign input files that exist or that can be generated by other application components.

    • specify the order in which the components must be executed

    • components and files are referred to by their logical names

      • Logical transformation name

      • Logical file name

      • Both transformations and data can be replicated



Generating a Concrete Workflow

  • Information

    • location of files and component Instances

    • State of the Grid resources

  • Select specific

    • Resources

    • Files

    • Add jobs required to form a concrete workflow that can be executed in the Grid environment

    • Each component in the abstract workflow is turned into an executable job


(Figure: the Concrete Workflow Generator takes an abstract workflow plus domain knowledge, resource information, and location information, and produces a concrete workflow: the plan submitted to the grid)



Pegasus

  • Pegasus - Planning for Execution on Grids

  • Planning framework

  • Take as input an abstract workflow

    • Abstract Dag in XML (DAX)

  • Generates concrete workflow.

  • Submits the workflow to a workflow executor (e.g. Condor Dagman).



Pegasus Outline

  • Introduction

  • Pegasus and How it works

  • Pegasus Components and Internals

  • Deferred Planning

  • Pegasus Portal


Abstract Workflow Reduction

(Figure: example DAG with jobs a through i. Key: the original node, input transfer node, registration node, output transfer node, and node deleted by the reduction algorithm.)

  • The output jobs for the DAG are all the leaf nodes, i.e. f, h, i.

  • Each job requires 2 input files and generates 2 output files.

  • The user specifies the output location.


Optimizing from the Point of View of Virtual Data

(Figure: the same DAG and key as before.)

  • Jobs d, e, and f have output files that have been found in the Replica Location Service.

  • Additional jobs are deleted.

  • All jobs (a, b, c, d, e, f) are removed from the DAG.


Planner Picks Execution and Replica Locations

(Figure: the planner plans for staging data in by adding transfer nodes for the input files of the root nodes. Same DAG and key as before.)


Staging Data Out and Registering New Derived Products

(Figure: the output files of the leaf job (f) are transferred to the output location, and staging-out and registration nodes are added for each job that materializes data (g, h, i). Same DAG and key as before.)


The Final, Executable DAG

(Figure: the input DAG alongside the final executable DAG, with input transfer, output transfer, and registration nodes added around jobs a through i. Key: original node, input transfer node, registration node, output transfer node.)



Pegasus Outline

  • Introduction

  • Pegasus and How it works

  • Pegasus Components and Internals

  • Deferred Planning

  • Pegasus Portal



Pegasus Components



Replica Location Service

  • Pegasus uses the RLS to find input data

  • Pegasus uses the RLS to register new data products

(Figure: computation sites, an RLI, and the LRCs it indexes)



Pegasus Components



Transformation Catalog

  • The Transformation Catalog keeps information about domain transformations (executables).

  • It records the logical name of the transformation, the resource on which the transformation is available, and the URL of the physical transformation.

  • The new transformation catalog also lets you record different types of transformation, as well as the architecture, OS, OS version, and glibc version of the transformation.

  • Profiles allow a user to attach environment variables, hints, scheduler-related information, and job-manager configuration.



Transformation Catalog

  • A set of standard APIs is available to perform various operations on the Transformation Catalog.

  • Various implementations of the transformation catalog are available, such as:

    • Database :

      • Useful for large Catalogs and supports fast adds, deletes and queries.

      • Standard database security used.

    • File Format :

      • Useful for small number of transformations and easy editing.

      • isi vds::fft:1.0 /opt/bin/fft INSTALLED INTEL32::LINUX ENV::VDS_HOME=/opt/vds

  • Users can provide their own implementation.
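The text format shown above appears to carry one mapping per line: site, logical transformation (namespace::name:version), physical path, type, system information, and profiles. A hedged parser sketch; the column interpretation is inferred from the single example line, so treat it as an assumption rather than the documented format.

def parse_tc_line(line):
    """Split one text-format Transformation Catalog entry into its columns,
    e.g. 'isi vds::fft:1.0 /opt/bin/fft INSTALLED INTEL32::LINUX ENV::VDS_HOME=/opt/vds'."""
    site, logical, pfn, tc_type, sysinfo, *profiles = line.split()
    namespace_name, _, version = logical.rpartition(":")
    namespace, _, name = namespace_name.partition("::")
    return {
        "site": site,
        "namespace": namespace, "name": name, "version": version,
        "pfn": pfn, "type": tc_type, "sysinfo": sysinfo,
        "profiles": profiles,               # e.g. ['ENV::VDS_HOME=/opt/vds']
    }

entry = parse_tc_line("isi vds::fft:1.0 /opt/bin/fft INSTALLED INTEL32::LINUX ENV::VDS_HOME=/opt/vds")
print(entry["name"], entry["pfn"], entry["site"])    # fft /opt/bin/fft isi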



Pegasus Components


Resource Information Catalog (Pool Config)

  • Pool Config is an XML file which contains information about various pools/sites on which DAGs may execute.

  • Some of the information contained in the Pool Config file is

    • Specifies the various job-managers that are available on the pool for the different types of schedulers.

    • Specifies the GridFtp storage servers associated with each pool.

    • Specifies the Local Replica Catalogs where data residing in the pool has to be cataloged.

    • Contains profiles like environment hints which are common site-wide.

    • Contains the working and storage directories to be used on the pool.


Resource Information Catalog (Pool Config)

  • Two Ways to construct the Pool Config File.

    • Monitoring and Discovery Service

    • Local Pool Config File (Text Based)

  • Client tool to generate Pool Config File

    • The tool gen-poolconfig is used to query the MDS and/or the local pool config file/s to generate the XML Pool Config file.



Monitoring and Discovery Service

  • MDS provides up-to-date Grid state information

    • Total and idle job queue lengths on a pool of resources (Condor)

    • Total and available memory on the pool

    • Disk space on the pools

    • Number of jobs running on a job manager

  • Can be used for resource discovery and selection

    • Developing various task-to-resource mapping heuristics

  • Can be used to publish information necessary for replica selection

    • Publish gridftp transfer statistics

    • Developing replica selection components



Pegasus Components



Site Selector

  • Determines job to site mappings.

  • The selector API provides easily pluggable implementations.

  • Implementations provided

    • Random

    • RoundRobin

    • Group

    • NonJavaCallout

  • Research on advanced algorithms for site selection is in progress (a minimal selector sketch follows)
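The selector boils down to one decision: given a job and the candidate sites where its transformation is available, return one site. A sketch of the Random and RoundRobin strategies behind a common interface; the class and method names are illustrative, not the actual Pegasus plug-in API.

import itertools
import random

class SiteSelector:
    def choose(self, job, candidate_sites):
        raise NotImplementedError

class RandomSelector(SiteSelector):
    def choose(self, job, candidate_sites):
        return random.choice(candidate_sites)

class RoundRobinSelector(SiteSelector):
    def __init__(self):
        self._sites, self._cycle = None, None

    def choose(self, job, candidate_sites):
        if self._sites != list(candidate_sites):     # (re)build the rotation
            self._sites = list(candidate_sites)
            self._cycle = itertools.cycle(self._sites)
        return next(self._cycle)

selector = RoundRobinSelector()
for job in ["a", "b", "c"]:
    print(job, selector.choose(job, ["siteA", "siteB"]))   # siteA, siteB, siteA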



Pegasus Components



Transfer Tools / Workflow Executors

  • Various transfer tools are used for staging data between grid sites.

  • Define strategies to reliably and efficiently transfer data

  • Current implementations

    • globus-url-copy

    • Transfer

    • Transfer2 (T2)

    • Stork

  • Workflow executors take the concrete workflow and act as a metascheduler for the Grid.

  • Current supported workflow executors

    • Condor Dagman

    • Gridlab GRMS



System Configuration - Properties

  • A properties file defines and modifies the behavior of Pegasus.

  • Properties set in the $VDS_HOME/etc/properties can be overridden by defining them either in $HOME/.chimerarc or by giving them on the command line of any executable.

    • e.g. gendax -Dvds.home=<path to VDS home> …

  • Some examples follow; for more details please read the sample.properties file in the $VDS_HOME/etc directory.

  • Basic Required Properties

    • vds.home : This is auto set by the clients from the environment variable $VDS_HOME

    • vds.properties : Path to the default properties file

      • Default : ${vds.home}/etc/properties
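The override order described above (system-wide $VDS_HOME/etc/properties, then the per-user $HOME/.chimerarc, then -D options on the command line) can be sketched as a layered lookup; the simple key=value parser below is a stand-in for the real Java properties handling.

import os

def load_properties(path):
    """Read simple 'key=value' lines, ignoring blanks and comments."""
    props = {}
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    props[key.strip()] = value.strip()
    return props

def effective_properties(cmdline_overrides):
    """Later layers win: system file < per-user file < -Dkey=value options."""
    vds_home = os.environ.get("VDS_HOME", "/opt/vds")
    props = {"vds.home": vds_home}
    props.update(load_properties(os.path.join(vds_home, "etc", "properties")))
    props.update(load_properties(os.path.expanduser("~/.chimerarc")))
    props.update(cmdline_overrides)                 # e.g. {"vds.home": "/scratch/vds"}
    return props

print(effective_properties({"vds.home": "/scratch/vds"})["vds.home"])   # /scratch/vds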



Pegasus Clients in the Virtual Data System

  • gencdag

    • Converts DAX to concrete Dag

  • genpoolconfig

    • Generates a pool config file from MDS

  • tc-client

    • Queries, adds, deletes transformations from the Transformation Catalog

  • rls-client, rls-query-client

    • Queries, adds, deletes entry from the RLS

  • partiondax

    • Partitions the abstract DAX into smaller DAXes.



Concrete Planner Gencdag

  • The Concrete planner takes the DAX produced by Chimera and converts into a set of condor dag and submit files.

  • You can specify more than one execution pool. Execution will take place on the pools on which the executable exists; if the executable exists on more than one pool, the pool on which it will run is chosen by the site selector in use.

  • The output pool is the pool to which you want all the output products transferred. If it is not specified, the materialized data stays on the execution pool.

  • Authentication removes sites that are not accessible.



Pegasus Outline

  • Introduction

  • Pegasus and How it works

  • Pegasus Components and Internals

  • Deferred Planning

  • Pegasus Portal



Abstract To Concrete Steps



Original Pegasus configuration

Simple scheduling: random or round robin, using well-defined scheduling interfaces.



Deferred Planning through Partitioning

A variety of partitioning algorithms can be implemented



Mega DAG is created by Pegasus and then submitted to DAGMan



Mega DAG Pegasus



Re-planning capabilities



Complex Replanning for Free (almost)



Planning & Scheduling Granularity

  • Partitioning

    • Allows the granularity of planning to be set ahead of time

  • Node aggregation

    • Allows nodes in the workflow to be combined and scheduled as one unit (minimizes the scheduling overheads)

    • May reduce the overheads of making scheduling and planning decisions

  • Related but separate concepts

    • Small jobs

      • High-level of node aggregation

      • Large partitions

    • Very dynamic system

      • Small partitions



Partitioned Workflow Processing

  • Create workflow partitions

    • partition the abstract workflow into smaller workflows.

    • create the xml partition graph that lists out the dependencies between partitions.

  • Create the MegaDAG (creates the dagman submit files)

    • transform the XML partition graph into its corresponding Condor representation.

  • Submit the MegaDAG

    • Each job invokes Pegasus on a partition and then submits the generated plan back to Condor (see the sketch below).
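One simple partitioning strategy is level-based: group nodes by their depth in the abstract DAG, so each partition can be planned just before it runs. A sketch, assuming the workflow is given as a parent map; the real partitioner offers other algorithms as well.

from collections import defaultdict

def level_partition(parents):
    """parents: {job: [parent jobs]}. Returns (partitions by level, edges between levels)."""
    level = {}

    def depth(job):
        if job not in level:
            level[job] = 1 + max((depth(p) for p in parents[job]), default=0)
        return level[job]

    for job in parents:
        depth(job)

    partitions = defaultdict(list)
    for job, lvl in level.items():
        partitions[lvl].append(job)

    # Each partition depends on the previous level: this is the partition graph
    # that the MegaDAG walks, invoking Pegasus once per partition.
    edges = [(lvl - 1, lvl) for lvl in sorted(partitions) if lvl - 1 in partitions]
    return dict(partitions), edges

diamond = {"preprocess": [], "left": ["preprocess"],
           "right": ["preprocess"], "analyze": ["left", "right"]}
print(level_partition(diamond))
# ({1: ['preprocess'], 2: ['left', 'right'], 3: ['analyze']}, [(1, 2), (2, 3)])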



Future work for Pegasus

  • Staging in executables on demand

  • Expanding the scheduling plug-ins

  • Investigating various partitioning approaches

  • Investigating reliability across partitions



Pegasus Outline

  • Introduction

  • Pegasus and How it works

  • Pegasus Components and Internals

  • Deferred Planning

  • Pegasus Portal



Portal

  • Provides proxy based login

  • Authentication to the sites

  • Configuration for files

  • Metadata based workflow creation

  • Workflow Submission and Monitoring

  • User Notification



Portal Demonstration



Outline

  • The concept of Virtual Data

  • The Virtual Data System and Language

  • Tools and Issues for Running on the Grid

  • Pegasus: Grid Workflow Planning

  • Virtual Data Applications on the Grid

  • Research issues



Types of Applications

  • Gravitational Wave Physics

  • High Energy Physics

  • Astronomy

  • Earthquake Science

  • Computational Biology



LIGO Scientific Collaboration

  • Continuous gravitational waves are expected to be produced by a variety of celestial objects

  • Only a small fraction of potential sources are known

  • Need to perform blind searches, scanning the regions of the sky where we have no a priori information of the presence of a source

    • Wide area, wide frequency searches

  • Search is performed for potential sources of continuous periodic waves near the Galactic Center and the galactic core

  • Search for binary inspirals collapsing into black holes.

  • The search is very compute and data intensive


(Figure: map of LSC resources, TeraGrid resources (Caltech, ANL, NCSA, PSC, SDSC), and other resources used, including WISC, UWM, Caltech, Cardiff, ISI, SC04, USC, AEI, LSU, and UTB.)

Additional resources used: Grid3 iVDGL resources



LIGO Acknowledgements

  • Patrick Brady, Scott Koranda, Duncan Brown, Stephen Fairhurst University of Wisconsin Milwaukee, USA

  • Stuart Anderson, Kent Blackburn, Albert Lazzarini,Hari Pulapaka, Teviet Creighton Caltech, USA

  • Gabriela Gonzalez, Louisiana State University

  • Many Others involved in the Testbed

  • www.ligo.caltech.edu

  • www.lsc-group.phys.uwm.edu/lscdatagrid/

  • LIGO Laboratory operates under NSF cooperative agreement PHY-0107417



Montage

  • Montage (NASA and NVO)

    • Deliver science-grade custom mosaics on demand

    • Produce mosaics from a wide range of data sources (possibly in different spectra)

    • User-specified parameters of projection, coordinates, size, rotation and spatial sampling.

Mosaic created by Pegasus-based Montage from a run of the M101 galaxy images on the TeraGrid.



Small Montage Workflow

~1200 nodes



Montage Acknowledgments

  • Bruce Berriman, John Good, Anastasia Laity, Caltech/IPAC

  • Joseph C. Jacob, Daniel S. Katz, JPL

  • http://montage.ipac.caltech.edu/

  • Testbed for Montage: Condor pools at USC/ISI, UW Madison, and Teragrid resources at NCSA, Caltech, and SDSC.

    Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.


Other Applications: Southern California Earthquake Center

  • The Southern California Earthquake Center (SCEC), in collaboration with the USC Information Sciences Institute, San Diego Supercomputer Center, the Incorporated Research Institutions for Seismology, and the U.S. Geological Survey, is developing the Southern California Earthquake Center Community Modeling Environment (SCEC/CME).

  • Create fully three-dimensional (3D) simulations of fault-system dynamics.

  • Physics-based simulations can potentially provide enormous practical benefits for assessing and mitigating earthquake risks through Seismic Hazard Analysis (SHA).

  • The SCEC/CME system is an integrated geophysical simulation modeling framework that automates the process of selecting, configuring, and executing models of earthquake systems.

Acknowledgments :

Philip Maechling and Vipin Gupta

University Of Southern California



Astronomy

  • Galaxy Morphology (National Virtual Observatory)

    • Investigates the dynamical state of galaxy clusters

    • Explores galaxy evolution inside the context of large-scale structure.

    • Uses galaxy morphologies as a probe of the star formation and stellar distribution history of the galaxies inside the clusters.

    • Data intensive computations involving hundreds of galaxies in a cluster

The x-ray emission is shown in blue, and the optical emission in red. The colored dots are located at the positions of the galaxies within the cluster; the dot color represents the value of the asymmetry index. Blue dots represent the most asymmetric galaxies and are scattered throughout the image, while orange dots, the most symmetric and indicative of elliptical galaxies, are concentrated more toward the center.

People involved: Gurmeet Singh, Mei-Hui Su, many others



Biology Applications

Tomography (NIH-funded project)

  • Derivation of 3D structure from a series of 2D electron microscopic projection images,

  • Reconstruction and detailed structural analysis

    • complex structures like synapses

    • large structures like dendritic spines.

  • Acquisition and generation of huge amounts of data

  • Large amount of state-of-the-art image processing required to segment structures from extraneous background.

Dendrite structure to be rendered by tomography

Work performed by Mei-Hui Su with Mark Ellisman, Steve Peltier, Abel Lin, Thomas Molina (SDSC)



BLAST: set of sequence comparison algorithms that are used to search sequence databases for optimal local alignments to a query

  • Two major runs were performed using Chimera and Pegasus:

  • 1) 60 genomes (4,000 sequences each), selected from DOE-sponsored sequencing projects, processed in 24 hours

    • 67 CPU-days of processing time delivered

    • ~10,000 Grid jobs

    • >200,000 BLAST executions

    • 50 GB of data generated

  • 2) 450 genomes processed

  • Speedups of 5-20 times were achieved because the compute nodes were used efficiently, by keeping the submission of jobs to the compute cluster constant.

Led by Veronika Nefedova (ANL) as part of the PACI Data Quest Expedition program


Virtual Data Example: Galaxy Cluster Search

(Figure: workflow DAG that transforms Sloan data into a galaxy cluster size distribution)

Jim Annis, Steve Kent, Vijay Sehkri, Fermilab; Michael Milligan, Yong Zhao, University of Chicago


Cluster Search Workflow Graph and Execution Trace

(Figure: workflow graph and execution trace; workflow jobs vs. time)


Capone Grid Interactions

(Figure: Capone interacting with Windmill, ProdDB, DonQuijote, monitoring services (MDS, GridCat, MonALISA), the RLS, Chimera and the VDC, Pegasus, Condor-G (schedd, GridMgr), and grid-site services (gatekeeper, gsiftp, CE, SE, WN).)

ATLAS Slides courtesy of Marco Mambelli, U Chicago ATLAS Team


ATLAS "Capone" Production Executor

(Figure: the same Capone interaction diagram as on the previous slide.)

  • Reception

    • Job received from work distributor

  • Translation

    • Un-marshalling, ATLAS transformation

  • DAX generation

    • VDL tools generate abstract DAG

  • Input file retrieval from RLS catalog

    • Check RLS for input LFNs (retrieval of GUID, PFN)

  • Scheduling: CE and SE are chosen

  • Concrete DAG generation and submission

    • Pegasus creates Condor submit files

    • DAGMan invoked to manage remote steps


A job in Capone (2, execution)

(Figure: the same Capone interaction diagram as on the previous slides.)

  • Remote job running / status checking

    • Stage-in of input files, create POOL FileCatalog

    • Athena (ATLAS code) execution

  • Remote Execution Check

    • Verification of output files and exit codes

    • Recovery of metadata (GUID, MD5sum, exe attributes)

  • Stage Out: transfer from CE site to destination SE

  • Output registration

    • Registration of the output LFN/PFN and metadata in RLS

  • Finish

    • Job completed successfully; Capone communicates to Windmill that the job is ready for validation

  • Job status is sent to Windmill during all the execution

  • Windmill/DQ validate & register output in ProdDB


Ramp up ATLAS DC2

(Figure: CPU-days delivered, ramping up from mid July to Sep 10.)



Outline

  • The concept of Virtual Data

  • The Virtual Data System and Language

  • Tools and Approach for Running on the Grid

  • Pegasus: Grid Workflow Planning

  • Virtual Data Applications on the Grid

  • Research issues


Research Directions in Virtual Data Grid Architecture



Research issues

  • Focus on data intensive science

  • Planning is necessary

  • Reaction to the environment is a must (things go wrong, resources come up)

  • Iterative Workflow Execution:

    • Workflow Planner

    • Workload Manager

  • Planning decision points

    • Workflow Delegation Time (eager)

    • Activity Scheduling Time (deferred)

    • Resource Availability Time (just in time)

  • Decision specification level

  • Reacting to the changing environment and recovering from failures

  • How does the communication take place? Callbacks, workflow annotations, etc.

(Figure: iterative workflow refinement: an abstract workflow goes to the Planner, which produces a concrete workflow for the Manager; the Manager sends tasks to the Resource Manager / Grid, and information flows back to both the Manager and the Planner.)



Provenance Hyperlinks


Goals for a Dataset Model

(Figure: candidate dataset types: a file, a set of files, an object closure, an XML element (e.g. a <FORM> with a <Title>), a relational query or spreadsheet range, and a new user-defined dataset type such as a set of files with a relational index.)

Speculative model described in CIDR 2003 paper by Foster, Voeckler, Wilde and Zhao



Acknowledgements

GriPhyN, iVDGL, and QuarkNet (in part) are supported by the National Science Foundation

The Globus Alliance, PPDG, and QuarkNet are supported in part by the US Department of Energy, Office of Science; by the NASA Information Power Grid program; and by IBM



Acknowledgements

  • Argonne and The University of Chicago: Ian Foster, Jens Voeckler, Yong Zhao, Jin Soon Chang, Luiz Meyer (UFRJ)

    • www.griphyn.org/vds

  • The USC Information Sciences Institute: Ewa Deelman, Carl Kesselman, Gurmeet Singh, Mei-Hui Su

    • http://pegasus.isi.edu

  • Special thanks to Jed Dobson, fMRI Data Center, Dartmouth College



For further information

  • Chimera and Pegasus:

    • www.griphyn.org/chimera

    • http://pegasus.isi.edu

  • Workflow Management research group in GGF:

    • www.isi.edu/~deelman/wfm-rg

  • Demos on the SC floor

    • Argonne National Laboratory and University of Southern California booths.

