
Programming Scientific and Distributed Workflow with Triana Services

Programming Scientific and Distributed Workflow with Triana Services. Matthew Shields, GGF10 Workflow Workshop, 9th March. Presentation outline: Triana overview; Triana services and their distribution; distribution policies; the GAP interface and its relation to the GridLab GAT.


Presentation Transcript


  1. Programming Scientific and Distributed Workflow with Triana Services Matthew Shields, GGF10 Workflow Workshop, 9th March

  2. Presentation Outline • Triana • Overview • Triana services and their distribution • Distribution policies • The GAP interface and its relation to the GridLab GAT • Scientific Workflow • Binary Inspiral Algorithm Example • Dynamic Distributed Workflow • Service Composition on the Grid • Service Usage, dynamically distributing a Triana workflow • Conclusion

  3. What is Triana?

  4. Triana Distributed Work-flow • Distributed Triana Work-flow • Flexible distribution: based around Triana Groups • HPC and Pipelined distribution • [Figure labels: GAP; Any GAP service, e.g. Web service; Triana Controlling Service (TCS); Triana Gateway; Triana Service & Engine (x2); Triana Engine; Other Engine; Workflow, e.g. BPEL4WS; Action Commands; Network]

  5. GAP Overview • Based around a series of Java interface classes • Concrete implementations form the GAP bindings • The core interfaces are: • Service: creation and discovery • Pipe: creation and discovery • Message: communication • Information • Job Submission • Data Management: transfers, logical lookup • Will become an adapter for the GridLab Java GAT, providing: • Advertisement, discovery, deployment and communication of services • GRMS job submission adapter • Data Management Services
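The interface-plus-bindings pattern described on this slide can be illustrated with a minimal sketch. The GAP itself is a set of Java interfaces; the Python below only mirrors the idea, and every name in it (`ServiceBinding`, `InMemoryBinding`, `create_binding`) is hypothetical rather than part of the GAP API.

```python
from abc import ABC, abstractmethod


class ServiceBinding(ABC):
    """Hypothetical stand-in for a GAP-style binding interface."""

    @abstractmethod
    def advertise(self, name):
        """Make a service visible to others."""

    @abstractmethod
    def discover(self, name):
        """Return True if a service with this name has been advertised."""


class InMemoryBinding(ServiceBinding):
    """Toy concrete binding that keeps advertisements in a local set."""

    def __init__(self):
        self._services = set()

    def advertise(self, name):
        self._services.add(name)

    def discover(self, name):
        return name in self._services


# Factory registry: a binding is chosen by name at run time,
# so new bindings can be plugged in without changing callers.
BINDINGS = {"memory": InMemoryBinding}


def create_binding(kind):
    return BINDINGS[kind]()
```

Calling code depends only on `ServiceBinding`, so swapping the toy binding for a real transport would not change it.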

  6. GAP (Java Prototype) and Java GAT Prototype • GAP (Java Prototype) • Advertising • Discovery • Communication • Job Submission (GRMS) • Generic Job Submission • Data Management • Virtual filename data access • GridLab GAT (www.gridlab.org) • Set of generic Java interfaces • High level abstractions to Grid services • Factory design: dynamic pluggable services • [Figure labels: Web Services; Jxta; OGSA (planned); P2PS; GSI Enabled; Jxtaserve; NS-2; And more..]

  7. Triana Prototype • Distributed Triana Prototype • Based around Triana Groups, i.e. aggregate tools • Each group can be distributed • Distribution policies: • HTC - high throughput/task farming • Pipeline - allow node to node communication • Each service can be a gateway to finer granularities of distribution • [Figure labels: Task-Farming Distribution; Pipeline Distribution; Triana Service (x6)]
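A rough sketch of the two distribution policies, assuming a group is simply an ordered list of tools. Threads and chained iterators stand in for remote Triana services here; the function names are illustrative and not Triana code.

```python
from concurrent.futures import ThreadPoolExecutor


def stage_a(x):
    return x + 1


def stage_b(x):
    return x * 2


# A "group" is modelled as an ordered list of tools (assumption).
GROUP = [stage_a, stage_b]


def run_group(x):
    """Run one data item through every tool in the group."""
    for tool in GROUP:
        x = tool(x)
    return x


def task_farm(items, workers=2):
    """Task farming: the whole group is copied to each worker and
    data items are spread across the copies."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_group, items))


def pipeline(items):
    """Pipeline: each tool is its own stage and passes results
    directly to the next stage."""
    stream = iter(items)
    for tool in GROUP:
        stream = map(tool, stream)
    return list(stream)
```

Both policies compute the same results; they differ in where the tools run and how data moves between them.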

  8. Triana Workflow • Triana is inherently flow based • Data flow - data arriving at component triggers execution • Control flow - control commands trigger execution • Decentralised execution • Data or Control messages sent along communication “pipes” from sender to receiver cause the receiver to execute • Synchronous or Asynchronous messaging (implementation dependent) • Multiple inputs can block or trigger immediately (defined by the component designer)
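The data-flow triggering rule above can be sketched as follows. This is a simplified, synchronous model under stated assumptions: the `Component` class and its `block_until_all` flag are hypothetical names chosen for illustration, not Triana classes.

```python
class Component:
    """Toy data-flow component: arriving data triggers execution."""

    def __init__(self, name, func, n_inputs, block_until_all=True):
        self.name = name
        self.func = func
        self.n_inputs = n_inputs
        self.block_until_all = block_until_all
        self.inbox = {}        # port number -> pending value
        self.receivers = []    # (component, port) pairs fed by our output
        self.outputs = []      # record of produced results

    def connect(self, receiver, port):
        """Create a 'pipe' from this component to a receiver's port."""
        self.receivers.append((receiver, port))

    def receive(self, port, value):
        """Called by the sender; may trigger execution of this component."""
        self.inbox[port] = value
        if self.block_until_all and len(self.inbox) < self.n_inputs:
            return  # blocking mode: wait for the remaining inputs
        args = [self.inbox[p] for p in sorted(self.inbox)]
        self.inbox.clear()
        result = self.func(*args)
        self.outputs.append(result)
        for receiver, rport in self.receivers:
            receiver.receive(rport, result)


adder = Component("add", lambda a, b: a + b, n_inputs=2)
sink = Component("sink", lambda x: x, n_inputs=1)
adder.connect(sink, 0)
adder.receive(0, 2)   # blocks: second input not yet present
adder.receive(1, 3)   # triggers the adder, which then triggers the sink
```

There is no central scheduler in this sketch: each component runs when its sender delivers data, which is the decentralised behaviour the slide describes.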

  9. Components and Definitions • Component is unit of execution • Components are defined in XML files: • Naming information • Input and output ports • Parameter information • Why Components? • To simplify the application design process and to speed up application development • The component model provides an infrastructure for the interaction of components
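As an illustration of what an XML component definition with naming, port and parameter information might look like, here is a small sketch. The element and attribute names below are invented for the example and are not the actual Triana tool schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical component definition; the real Triana XML format may differ.
TOOL_XML = """
<tool name="FFT" package="signalproc">
  <input type="SampleSet"/>
  <output type="Spectrum"/>
  <param name="window" value="hanning"/>
</tool>
"""


def load_component(xml_text):
    """Extract naming, port and parameter information from a definition."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.get("name"),
        "package": root.get("package"),
        "inputs": [e.get("type") for e in root.findall("input")],
        "outputs": [e.get("type") for e in root.findall("output")],
        "params": {p.get("name"): p.get("value") for p in root.findall("param")},
    }


component = load_component(TOOL_XML)
```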

  10. Taskgraph • Internal object based workflow graph representation • Taskgraph - DAG • Tasks • Connections • External XML representation • Simple XML syntax • List of participating Task definitions • Parent/Child connection • Hierarchical (Compound components) • Alternative Languages & Syntax • e.g. BPEL4WS • Available through pluggable readers & writers.
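A taskgraph of tasks and parent/child connections can be sketched as a small DAG structure. The `TaskGraph` class below is a hypothetical illustration, not Triana's internal representation; the task names are borrowed from the example workflow shown later in the talk.

```python
from graphlib import TopologicalSorter


class TaskGraph:
    """Toy DAG of tasks joined by parent/child connections."""

    def __init__(self):
        self.tasks = []
        self.connections = []   # (parent, child) pairs

    def add_task(self, name):
        self.tasks.append(name)

    def connect(self, parent, child):
        self.connections.append((parent, child))

    def execution_order(self):
        """Return tasks in an order where every parent precedes its children.
        Raises graphlib.CycleError if the graph is not a DAG."""
        predecessors = {task: set() for task in self.tasks}
        for parent, child in self.connections:
            predecessors[child].add(parent)
        return list(TopologicalSorter(predecessors).static_order())


graph = TaskGraph()
for task in ["Grapher", "FFT", "Gaussian", "Wave"]:
    graph.add_task(task)
graph.connect("Wave", "Gaussian")
graph.connect("Gaussian", "FFT")
graph.connect("FFT", "Grapher")
order = graph.execution_order()
```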

  11. Workflow • No explicit language support for control constructs • Loops and execution branching handled by components • Loop component - controls loop over sub-workflow • Logical component - control workflow branching • Unlike BPEL4WS or similar • Flexibility of control - constraint based loops etc…
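To make the idea of control handled by components concrete, the sketch below models a loop component and a logical (branching) component as ordinary functions that wrap a sub-workflow. The function names and signatures are assumptions for illustration only.

```python
def loop_component(sub_workflow, condition, value, max_iterations=1000):
    """Re-run the sub-workflow while a constraint on its data holds.
    max_iterations is a safety cap added for this sketch."""
    iterations = 0
    while condition(value) and iterations < max_iterations:
        value = sub_workflow(value)
        iterations += 1
    return value, iterations


def logical_component(predicate, if_true, if_false):
    """Route data to one of two sub-workflows depending on a test."""
    def route(value):
        return if_true(value) if predicate(value) else if_false(value)
    return route


# Constraint-based loop: keep halving until the value drops below 1.
final_value, n_iterations = loop_component(
    sub_workflow=lambda x: x / 2,
    condition=lambda x: x >= 1,
    value=8.0,
)

# Branching: negate negative values, pass others through unchanged.
absolute = logical_component(lambda x: x < 0, lambda x: -x, lambda x: x)
```

Because the loop condition is arbitrary code operating on the data, it can express constraints that a fixed language construct might not.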

  12. Distributing Triana Workflow • Deploying Remote Services on Resources • Service application installation • Service execution • Service discovery • Mapping tasks or groups of tasks to Services • Workflow rewiring, XML definition for connections modified for remote location - sub-workflows duplicated • Data distribution, annotated sub-sections of taskgraph passed to resources

  13. GEO 600 Inspiral Search • Background • Compact binary stars orbiting each other in a close orbit • among the most powerful sources of gravitational waves • As the orbital radius decreases a characteristic chirp waveform is produced - amplitude and frequency increase with time until eventually the two bodies merge together • Computing • Need 10 Gigaflops to keep up with real time data (modest search..) • Data 8kHz in 24-bit resolution (stored in 4 bytes) -> Signal contained within 1 kHz = 2000 samples/second • divided into chunks of 15 minutes in duration (i.e. 900 seconds) = 8MB • Algorithm • Data is transmitted to a node • Node initialises i.e. generates its templates (around 10000) • fast correlates its templates with data
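The chunk-size arithmetic and the template-matching step can be worked through in a short sketch. With the figures on the slide, a 900-second chunk holds 1.8 million samples, or about 7.2 million bytes, which is the same order as the 8MB quoted. The correlation below is a direct time-domain version swapped in for readability; the slide's "fast correlation" presumably refers to an FFT-based method, and the toy data and templates are invented.

```python
SAMPLE_RATE_HZ = 2000     # samples/second after restricting to 1 kHz
BYTES_PER_SAMPLE = 4      # 24-bit samples stored in 4 bytes
CHUNK_SECONDS = 900       # 15 minutes

samples_per_chunk = SAMPLE_RATE_HZ * CHUNK_SECONDS     # 1,800,000 samples
chunk_bytes = samples_per_chunk * BYTES_PER_SAMPLE     # 7,200,000 bytes


def correlate(data, template):
    """Direct correlation of one template against the data at every lag."""
    n_lags = len(data) - len(template) + 1
    return [
        sum(d * t for d, t in zip(data[lag:], template))
        for lag in range(n_lags)
    ]


def best_match(data, templates):
    """Return (template index, lag, score) of the strongest correlation."""
    best = None
    for index, template in enumerate(templates):
        scores = correlate(data, template)
        lag = max(range(len(scores)), key=scores.__getitem__)
        if best is None or scores[lag] > best[2]:
            best = (index, lag, scores[lag])
    return best


# Toy data with a small bump; the second template has the matching shape.
toy_data = [0, 0, 1, 2, 1, 0, 0]
toy_templates = [[1, -1, 1], [1, 2, 1]]
match = best_match(toy_data, toy_templates)
```

A node in the real search would repeat this for its full set of templates (around 10000 per the slide) on each chunk it receives.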

  14. Coalescing Binary Search GEO 600 Coalescing Binary Search Algorithm implemented as a Triana workflow

  15. Coalescing Binary Scenario • [Figure labels: Controller; Email, SMS notification; Logical File Name; GW Data; Distributed Storage; GAT (GRMS, Adaptive): Submit Job, Optimised Mapping; GAT (Data Management); CB Search; GridLab Test-bed]

  16. Triana Service Job Submission • [Figure labels: GAP; GRMS Web Service; rage1.man.poznan.pl; GridLab Testbed]

  17. Triana GRMS Component • Front end to GridLab GRMS Web Service • Job Submission Service - interfaces with GRAM • GAP Web Service binding + GSI Authentication • Java CoG Kit • X509 Certificate handling • Axis authentication & communication • GRMS executes applications on GridLab Testbed • Heterogeneous hardware platforms • Default software - Globus 2.4, GSISSH, cc, cvs, c++, F90, make, perl, mpicc

  18. Service Composition Workflow • Multiple GRMS Components • Install Applications (ftp, tar, ant) • Start installed Triana Services

  19. Dynamic Distributed Workflow • The workflow is cloned/split/rewired to achieve the required distribution topology • Custom distribution units allow sub-workflows to be distributed in parallel or pipelined • Distribution units are standard Triana tools, enabling users to create their own custom distributions • [Figure labels: Local Triana; Remote Services; Wave; Distribution Unit; Gaussian; FFT; Gaussian; FFT; Grapher]
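One way to picture the clone-and-rewire step for data parallelism is the sketch below. It assumes connections are (parent, child) pairs and tags each cloned task with a `task@service` name; that naming scheme and the `rewire` function are hypothetical, not how Triana rewrites its XML.

```python
def rewire(connections, group, services):
    """Clone the tasks in `group` once per service and redirect every
    connection that touches the group to the per-service copies."""
    rewired = []
    for parent, child in connections:
        if parent in group or child in group:
            for service in services:
                new_parent = f"{parent}@{service}" if parent in group else parent
                new_child = f"{child}@{service}" if child in group else child
                rewired.append((new_parent, new_child))
        else:
            rewired.append((parent, child))
    return rewired


local_workflow = [("Wave", "Gaussian"), ("Gaussian", "FFT"), ("FFT", "Grapher")]
distributed = rewire(
    local_workflow,
    group={"Gaussian", "FFT"},
    services=["svc1", "svc2"],
)
```

In the result, the local source feeds both remote copies of the group and both copies feed the local sink, which is the parallel topology; a pipelined topology would instead place each task of the group on a different service.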

  20. Conclusion • [Figure labels: Controller; Email, SMS notification; Logical File Name; GW Data; Distributed Storage; GAT (GRMS, Adaptive): Submit Job, Optimised Mapping; GAT (Data Management); CB Search; GridLab Test-bed]

  21. Conclusion • Shown three distinct workflows • Service composition workflow to submit grid jobs that deploys multiple Triana Services on remote resources • Local scientific workflow representing the algorithm • Dynamic distributed workflow - rewire local workflow for data parallelism across multiple Triana Services • GAP API • Web Service binding + GSI - Grid Job Submission • P2PS binding - service discovery + service communication • Combined to perform parallel scientific computation

  22. Thanks! • The Astronomers: Prof. B Sathyaprakash, David Churches, Roger Philp and Craig Robinson • The Triana team: Ian Wang, Andrew Harrison, Omer Rana, Diem Lam and Shalil Majithia • All the partners in the GridLab project

  23. Thanks! • Information & Software • http://www.trianacode.org/ • http://www.gridlab.org/
