
THE US NATIONAL VIRTUAL OBSERVATORY

Managing VO data and process flows

Matthew J. Graham

CACR/Caltech

NVO Summer School 2005

Overview
  • Astronomical data
  • VOStore/VOSpace
  • Workflows
  • Astrogrid workflow
  • CEA


VO Wheel™

The importance of data
  • Data is the raison d’être of the VO
  • LSST is the data source nonpareil
    • data rates of 540 MB/s, i.e. ~16 TB in 8 hrs
    • final archive > 3PB of data
  • Well-established ways of handling distributed data:
    • SRB
    • PVFS
    • OGSA-DAI
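
The nightly volume quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; it assumes decimal units (1 TB = 10^6 MB) and a sustained rate over the full 8 hours, neither of which the slide states.

```python
# Back-of-the-envelope check of the LSST figures quoted on the slide.
RATE_MB_PER_S = 540      # data rate from the slide
HOURS = 8                # observing period from the slide

seconds = HOURS * 3600
total_mb = RATE_MB_PER_S * seconds
total_tb = total_mb / 1_000_000   # assumption: decimal units, 1 TB = 10^6 MB

print(f"{total_tb:.2f} TB in {HOURS} h")
```

The result is a little under 16 TB, consistent with the rounded figure on the slide.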

Data use cases
  • Client has data:
    • stored locally: transfers it to service
    • stored locally: service retrieves it
    • stored elsewhere: service retrieves it
  • Service generates data:
    • stores it locally: notifies client of location
    • transfers it to the client’s local store
    • transfers it to a client-designated store

VOStore
  • Provides a uniform interface to existing or new data storage locations (Facade pattern)
  • Structured and unstructured data are both first-class
  • Methods:
    • get
    • put
    • list / listAll
    • importInit
    • importData (sync/async)
    • exportInit
    • exportData (sync/async)
    • delete
    • rename
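
The Facade idea above can be sketched in a few lines. This is an illustrative sketch only, not the VOStore specification: the class names `VOStore` and `InMemoryStore` and all signatures are assumptions, only a subset of the listed methods is shown, and the in-memory backend stands in for a real storage system.

```python
from abc import ABC, abstractmethod


class VOStore(ABC):
    """Hypothetical uniform interface; method names follow the slide."""

    @abstractmethod
    def put(self, name, data): ...

    @abstractmethod
    def get(self, name): ...

    @abstractmethod
    def list(self): ...

    @abstractmethod
    def delete(self, name): ...

    @abstractmethod
    def rename(self, old, new): ...


class InMemoryStore(VOStore):
    """Stand-in backend; a real one might wrap SRB, FTP or a local disk."""

    def __init__(self):
        self._items = {}

    def put(self, name, data):
        self._items[name] = data

    def get(self, name):
        return self._items[name]

    def list(self):
        return sorted(self._items)

    def delete(self, name):
        del self._items[name]

    def rename(self, old, new):
        self._items[new] = self._items.pop(old)


# Client code sees only the VOStore interface, never the backend.
store = InMemoryStore()
store.put("catalog.vot", b"<VOTABLE/>")
store.rename("catalog.vot", "sources.vot")
print(store.list())
```

Swapping `InMemoryStore` for another backend would leave the client code unchanged, which is the point of the facade.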

VOSpace
  • Orchestrates VOStores:
    • data collections: directories, user-defined
    • authorisation: user groups
    • processing efficiency: where is the nearest copy?
  • move
  • copy
  • identifiers
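
A toy illustration of the orchestration role described above: track which store holds each identifier, copy or move between stores, and answer "where is the nearest copy?". Everything here is hypothetical, including the `Store` and `VOSpace` classes, the `distance` cost metric and the example identifier; it is not the VOSpace interface.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical stand-in for one VOStore endpoint."""
    name: str
    distance: int                       # assumed cost metric, e.g. network hops
    items: dict = field(default_factory=dict)


class VOSpace:
    """Toy orchestrator over several stores."""

    def __init__(self, stores):
        self.stores = {s.name: s for s in stores}

    def copy(self, ident, src, dst):
        self.stores[dst].items[ident] = self.stores[src].items[ident]

    def move(self, ident, src, dst):
        self.copy(ident, src, dst)
        del self.stores[src].items[ident]

    def nearest(self, ident):
        """Return the name of the closest store holding `ident`."""
        holders = [s for s in self.stores.values() if ident in s.items]
        if not holders:
            raise KeyError(ident)
        return min(holders, key=lambda s: s.distance).name


space = VOSpace([Store("caltech", 1), Store("cambridge", 12)])
space.stores["cambridge"].items["ivo://example/img1"] = b"FITS..."
first = space.nearest("ivo://example/img1")    # only copy is remote
space.copy("ivo://example/img1", "cambridge", "caltech")
second = space.nearest("ivo://example/img1")   # a closer copy now exists
print(first, second)
```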

How to manage the flows?
  • Way of describing a flow:
    • processes/steps, inputs/outputs, serial/parallel execution, control logic, variables, inline scripting
    • preferably XML (verbose but rigorous)
  • Way of controlling a flow: engine
  • e-Science vs. e-Business:
    • open-ended vs. closed
    • verification and publication
    • static vs. dynamic workflows
    • volume and type of data
    • meta-transactions
    • customer, manager and user vs. scientist

Workflow patterns

  • Sequence
  • Parallel split
  • Synchronisation
  • Exclusive choice
  • Simple Merge
  • Multi choice
  • Multi Merge
  • Synchronizing Merge
  • Discriminator
  • Deferred choice
  • Multiple Instances with/out Synch
  • Implicit termination
  • Interleaved Parallel Routing
  • Milestone
  • Diagram labels: AND, XOR, Multi; combinations shown: Multi + Synchronizing Merge, Multi + Multi, Multi + Discriminator
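
Three of the patterns named above (sequence, parallel split, synchronisation) can be sketched with Python's standard thread pool. The step functions `fetch` and `combine` are placeholders invented for the illustration; no particular workflow engine is implied.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(band):
    """Placeholder step: pretend to fetch an image in one band."""
    return f"image-{band}"


def combine(images):
    """Placeholder step that needs every branch to have finished."""
    return "+".join(images)


bands = ["g", "r", "i"]

with ThreadPoolExecutor() as pool:
    # Parallel split: start one branch per band.
    branches = [pool.submit(fetch, b) for b in bands]
    # Synchronisation: block until every branch has produced a result.
    images = [f.result() for f in branches]

# Sequence: the next step runs only after the join above.
result = combine(images)
print(result)
```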

Workflow kerfuffle
  • Workflow languages: BPEL (BPEL4WS, WSBPEL, WSFL, XLANG), BPML, WS-CDL (WSCL, WSCI), XPDL, BPSS, PSL, AGWL, DGL, DPML, GJobDL, GSFL, GFDL, GWorkflowDL, MoML, SWFL, YAWL, SCUFL/Xscufl, WPDL, PIF, OWL-S, xWFL, XPL, INCA
  • Workflow engines: Taverna, Kepler, Pegasus, DiscoveryNet, Triana, SPA, Geodise, ICENI, Askalon, GridNexus, BioPipe, BizTalk, BPWS4J, DAGMan, GridAnt, GJH, GRMS, GWFE, GWES, ITIEE, JIGSA, Karajan, ScyFLOW, SDSC Matrix, SHOP2, wftk, YAWL Engine, WFEE

Astrogrid workflow components
  • JES (Job Execution System)
    • Astrogrid workflow engine
    • Manages control flow
    • Runs steps in a controlled asynchronous fashion
  • CEC (Common Execution Controller)
    • Manages step execution
    • Manages data flow
  • CEA (Common Execution Architecture) apps
    • datacenters: support complex queries against archives
    • processing: consume data files and reduce them


Astrogrid workflow schematic
  • Components shown in the diagram: Registry, Command Line, Portal, Client library, JES, CEC, CEA, Datacenter CEA, MySpace
  • Interactions shown in the diagram: Application list, Resolve application, Submit workflow, Save/load workflow, Save/load data

Astrogrid workflow language

<workflow name="a workflow">
  <description>description of the workflow</description>
  <sequence/flow>
    <set var="dec" value="15"/>
    <step name="a" result-var="a-results">
      <tool name="toolA" interface="simpleInterface">
        <input>
          <parameter name="RA"><value>21</value></parameter>
          <parameter name="Dec"><value>${dec}</value></parameter>
        </input>
        <output>
          <parameter name="results" indirect="true">
            <value>ftp://aServer/myResults</value>
          </parameter>
        </output>
      </tool>
    </step>
    <step name="b">…
  </sequence/flow>
  <script>…
  <if test=…> <while test=…> <for var=… items=…> <parfor var=… items=…> <try> <catch>
</workflow>
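
To make the `<set>` / `${dec}` mechanism concrete, here is a minimal sketch of how an engine might resolve step inputs. It is not the JES implementation. It assumes `<sequence/flow>` is shorthand for either a `<sequence>` or a `<flow>` element and uses `<sequence>`; the helper names `expand` and `resolve_inputs` are invented, and the document is trimmed to a well-formed subset of the slide's example.

```python
import re
import xml.etree.ElementTree as ET

# Well-formed subset of the slide's example (assumption: <sequence> variant).
WORKFLOW = """
<workflow name="a workflow">
  <sequence>
    <set var="dec" value="15"/>
    <step name="a" result-var="a-results">
      <tool name="toolA" interface="simpleInterface">
        <input>
          <parameter name="RA"><value>21</value></parameter>
          <parameter name="Dec"><value>${dec}</value></parameter>
        </input>
      </tool>
    </step>
  </sequence>
</workflow>
"""


def expand(text, variables):
    """Replace each ${name} with its value from `variables`."""
    return re.sub(r"\$\{(\w+)\}", lambda m: variables[m.group(1)], text)


def resolve_inputs(xml_text):
    """Walk a <sequence> in order; return {step name: {parameter: value}}."""
    variables, steps = {}, {}
    root = ET.fromstring(xml_text)
    for node in root.find("sequence"):
        if node.tag == "set":
            variables[node.get("var")] = node.get("value")
        elif node.tag == "step":
            params = node.findall("./tool/input/parameter")
            steps[node.get("name")] = {
                p.get("name"): expand(p.findtext("value"), variables)
                for p in params
            }
    return steps


print(resolve_inputs(WORKFLOW))
```

Step "a" ends up with `Dec` set to the value assigned by the earlier `<set>` element.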

CEA
  • Create a uniform interface and model for an application and its parameters
  • Provides a higher-level description than WSDL:
    • Restricts how interfaces can be expressed
    • Provides specific semantics for astronomical quantities
    • Adds extra information, such as default values and GUI labels
  • VOResource extensions for a general application
  • Provide asynchronous operation:
    • callback, polling and job identification
  • Allow separate data and control flows
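
The asynchronous style listed above (job identification, polling, callback) can be illustrated with a small thread-based sketch. It is not the CEA interface: the `JobRunner` class, its method names and the status strings "RUNNING" and "COMPLETED" are all assumptions made for the example.

```python
import threading
import time
import uuid


class JobRunner:
    """Toy asynchronous executor: submit returns at once with a job id."""

    def __init__(self):
        self._status = {}
        self._threads = {}

    def submit(self, func, on_done=None):
        job_id = uuid.uuid4().hex           # job identification
        self._status[job_id] = "RUNNING"

        def run():
            result = func()
            self._status[job_id] = "COMPLETED"
            if on_done is not None:
                on_done(job_id, result)     # callback style

        thread = threading.Thread(target=run)
        self._threads[job_id] = thread
        thread.start()
        return job_id

    def status(self, job_id):               # polling style
        return self._status[job_id]

    def wait(self, job_id):
        self._threads[job_id].join()


received = []
runner = JobRunner()
job = runner.submit(lambda: sum(range(1000)),
                    on_done=lambda jid, res: received.append((jid, res)))

while runner.status(job) != "COMPLETED":    # poll until the job finishes
    time.sleep(0.01)
runner.wait(job)                            # ensure the callback has run
print(received[0][1])
```

The client never blocks on `submit`; it either polls with the job id or waits to be called back.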

Minimum CEA compliance
  • Must implement CommonExecutionConnector interface
  • Must send a message to services implementing ResultsListener interface
  • Should send messages to services implementing JobMonitor interface
  • Should perform basic type checking on all parameter types during init phase
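
The last requirement, basic type checking during the init phase, might look roughly like the sketch below. The parameter declarations in `DECLARED` and the `check_parameters` helper are hypothetical; a real CEA application would take its parameter types from its registered description.

```python
# Hypothetical declarations: parameter name -> converter for its declared type.
DECLARED = {"RA": float, "Dec": float, "radius": float, "catalog": str}


def check_parameters(supplied, declared=DECLARED):
    """Return a list of problems; an empty list means the job may start."""
    problems = []
    for name, raw in supplied.items():
        if name not in declared:
            problems.append(f"unknown parameter: {name}")
            continue
        try:
            declared[name](raw)             # does the value parse as the type?
        except (TypeError, ValueError):
            problems.append(
                f"{name}: {raw!r} is not a valid {declared[name].__name__}"
            )
    return problems


ok = check_parameters({"RA": "21", "Dec": "15"})
bad = check_parameters({"RA": "21h", "Dec": "15", "mag": "3"})
print(ok)
print(bad)
```

Rejecting a job at init time avoids starting a long-running step that is bound to fail on a malformed input.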
