
On Large Data-Flow Scientific Workflows: An Astrophysics Case Study
Integration of Heterogeneous Datasets using Scientific Workflow Engineering

Presenter: Mladen A. Vouk


Presentation Transcript


  1. On Large Data-Flow Scientific Workflows: An Astrophysics Case Study. Integration of Heterogeneous Datasets using Scientific Workflow Engineering. Presenter: Mladen A. Vouk

  2. Team (Scientific Process Automation - SPA): Sangeeta Bhagwanani (MS student, GUI interfaces), John Blondin (NCSU faculty, TSI PI), Zhengang Cheng (PhD student, services, V&V), Dan Colonnese (MS student, graduated, workflow grid and reliability issues), Ruben Lobo (PhD student, packaging), Pierre Moualem (MS student, fault-tolerance), Jason Kekas (PhD student, technical support), Phoemphun Oothongsap (NCSU postdoc, high-throughput flows), Elliot Peele (NCSU, technical support), Mladen A. Vouk (NCSU faculty, SPA PI), Brent Marinello (NCSU, workflow extensions), and others …

  3. NC State researchers are simulating the death of a massive star leading to a supernova explosion. Of particular interest is the dynamics of the shock wave generated by the initial implosion of the star which ultimately destroys the star as a highly energetic supernova.

  4. Key Current Task: Emulating “live” workflows

  5. Key Issue • It is very important to distinguish between a custom-made workflow solution and a more canonical set of operations, methods, and solutions that can be composed into a scientific workflow. • Key criteria: complexity, skill level needed to implement, usability, maintainability, “standardization”. • E.g., sort, uniq, grep, ftp, ssh on Unix boxes vs. SAS (which can do sorting), a home-made sort, SABUL, bbcp (free, but not standard), etc. (A sketch of composing canonical operations follows.)
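
A minimal sketch of what "composing canonical operations" can mean in practice: chaining standard Unix tools from a small Python driver instead of writing a one-off custom program. The file names and remote host below are hypothetical placeholders, not part of the original workflow.

    # compose_canonical.py - illustrative only
    # Chain canonical Unix operations (grep | sort | uniq) over a data listing,
    # then stage the result with a standard ssh-based copy (scp).
    import subprocess

    def unique_sorted_matches(pattern, infile, outfile):
        """Equivalent of: grep PATTERN infile | sort | uniq > outfile"""
        with open(outfile, "w") as out:
            grep = subprocess.Popen(["grep", pattern, infile], stdout=subprocess.PIPE)
            sort = subprocess.Popen(["sort"], stdin=grep.stdout, stdout=subprocess.PIPE)
            grep.stdout.close()  # let grep receive SIGPIPE if sort exits early
            subprocess.run(["uniq"], stdin=sort.stdout, stdout=out, check=True)
            grep.wait()

    def stage_to_remote(local_path, remote_spec):
        """Move the result with a standard tool (scp) rather than a custom mover."""
        subprocess.run(["scp", local_path, remote_spec], check=True)

    if __name__ == "__main__":
        unique_sorted_matches("ERROR", "run.log", "errors.txt")          # hypothetical files
        stage_to_remote("errors.txt", "user@viz-host:/data/errors.txt")  # hypothetical host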

  6. Topic – Computational Astrophysics • Dr. Blondin is carrying out research in the field of circumstellar gas dynamics. The numerical hydrodynamical code VH-1 is used on supercomputers to study a vast array of objects observed by astronomers, both from ground-based observatories and from orbiting satellites. • The two primary subjects under investigation are interacting binary stars (including normal stars like the Algol binary, and compact-object systems like the high-mass X-ray binary SMC X-1) and supernova remnants (from very young, like SNR 1987A, to older remnants like the Cygnus Loop). • Other astrophysical processes of current interest include radiatively driven winds from hot stars, the interaction of stellar winds with the interstellar medium, the stability of radiative shock waves, the propagation of jets from young stellar objects, and the formation of globular clusters.

  7. [Diagram: end-to-end data flow] Labels include: Logistic Network (L-Bone), Data Depot, HPSS archive, Input Data, Highly Parallel Compute, Output (~500 x 500 files), Aggregate to ~500 files (10+ GB each), Aggregate to one file (~1 TB each), Local Mass Storage (14+ TB), Local 44-processor data cluster (data sits on local nodes for weeks), Viz Software, Viz Wall, Viz Client.
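
To make the magnitudes concrete, a back-of-the-envelope check of the volumes named on the diagram (the grouping into a single "run" is an assumption for illustration only):

    # data_volumes.py - rough arithmetic using the figures on the slide
    files_per_run      = 500    # "~500 files"
    gb_per_file        = 10     # "10+ GB each"
    aggregated_file_tb = 1      # "aggregate to one file (~1 TB each)"
    local_storage_tb   = 14     # "local mass storage, 14+ TB"

    raw_output_tb = files_per_run * gb_per_file / 1000.0       # ~5 TB of aggregated output
    print(f"aggregated output: ~{raw_output_tb:.0f} TB")
    print(f"1 TB files that fit on local storage: ~{local_storage_tb // aggregated_file_tb}")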

  8. [Diagram: Workflow - Abstraction Model] Steps: Model (parallel computation), Merge & Backup (to mass storage), Move (data mover channel, e.g. LORS, BCC, SABUL, FC over SONET; Fibre Channel or local NFS), Split & Viz (parallel visualization), output to VizWall. Head node services (SendData, RecvData) sit at each end; web services and a web or client GUI provide control: Construct, Orchestrate, Monitor/Steer, Change, Stop/Start.
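
The control verbs on the diagram (Construct, Orchestrate, Monitor/Steer, Stop/Start) can be read as a small interface around an ordered list of steps. The skeleton below is an illustrative sketch of that reading, not the SPA implementation.

    # workflow_control.py - illustrative skeleton of the control verbs on the slide
    from dataclasses import dataclass, field

    @dataclass
    class Workflow:
        steps: list = field(default_factory=lambda: ["Model", "Merge&Backup", "Move", "Split&Viz"])
        running: bool = False
        current: int = 0

        def construct(self, steps):      # bind the abstract model to concrete steps
            self.steps = list(steps)

        def start(self):
            self.running = True

        def stop(self):
            self.running = False

        def orchestrate(self):           # drive one step at a time while running
            while self.running and self.current < len(self.steps):
                print("running step:", self.steps[self.current])
                self.current += 1
            self.running = False

        def monitor(self):               # report state for steering decisions
            return {"running": self.running, "step": self.current, "of": len(self.steps)}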

  9. Current and Future Bottlenecks • Computing resources and computational speed (1000+ Cray X1 processors, compute times of 30+ hrs, wait time) • Storage and disks (14+ TB, reliable and sustainable transfer speeds of 300+ MB/s), automation • Reliable and sustainable network transfer rates (300+ MB/s)

  10. Bottlenecks (B-specific) • Supercomputer, storage, HPSS, EnSight memory. • Average per-job wait time is 24-48 hrs (could be longer if more processors are requested or more time slices are calculated). • One run (6 hrs of run time on the Cray X1) currently uses 140 processors and produces 10 time steps. Each time step has 140 Fortran-binary files (28 GB total); hence, currently, this is about 280 GB per 6-hr run. A full visualization takes about 300 to 500 time slices (30 to 50 runs), i.e., about 28 GB x (300 to 500 slices), or roughly 10 to 14 TB of space. • The 140 files of a time step are merged into one (1) NetCDF file (takes about 10 min). • BBCP the file to NCSU at about 30 MB/s, or about 15 min per time slice (this can be done in parallel with the next time-slice computation). In the future, network transfer speeds and disk access speeds may become an issue. (A quick arithmetic check follows.)
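
A quick arithmetic check of the figures on this slide (all sizes and rates are taken from the bullets above; 1 GB = 1000 MB for simplicity):

    # bottleneck_check.py - sanity-check the sizes and transfer times quoted above
    gb_per_time_step = 28          # 140 Fortran-binary files per time step
    steps_per_run    = 10          # one 6-hr run on the Cray X1
    slices_for_viz   = (300, 500)  # slices needed for a full visualization
    bbcp_rate_mb_s   = 30          # observed BBCP rate to NCSU

    gb_per_run = gb_per_time_step * steps_per_run                            # 280 GB per run
    tb_total   = tuple(gb_per_time_step * s / 1000 for s in slices_for_viz)  # ~8.4 to 14 TB
    minutes_per_slice = gb_per_time_step * 1000 / bbcp_rate_mb_s / 60        # ~15.6 min

    print(f"per run: {gb_per_run} GB")
    print(f"full visualization: {tb_total[0]:.1f} to {tb_total[1]:.1f} TB")
    print(f"transfer per time slice at {bbcp_rate_mb_s} MB/s: ~{minutes_per_slice:.0f} min")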

  11. B-specific Top-Level W/F Operations • Operators: Create W/F (reserve resources), Run Model, Backup Output, PostProcess Output (e.g., Merge, Split), MoveData, AnalyzeData (Viz, other?), Monitor Progress (state, audit, backtrack, errors, provenance), Modify Parameters • States: Modeling, Backup, Postprocessing (A, .. Z), MovingData, Analyzing Remotely • Creators: CreateWF, Model?, Expand • Modifiers: Merge, Split, Move, Backup, Start, Stop, ModifyParameters • Behaviors: Monitor, Audit, Visualize, Error/Exception Handling, Data Provenance, …
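
A minimal sketch of how the states and operators listed above could be expressed, assuming a simple enum-plus-stub decomposition (the names follow the slide; the structure itself is illustrative, not the SPA design):

    # wf_states.py - states and top-level operators from the slide, as an enum plus stubs
    from enum import Enum, auto

    class State(Enum):
        MODELING = auto()
        BACKUP = auto()
        POSTPROCESSING = auto()
        MOVING_DATA = auto()
        ANALYZING_REMOTELY = auto()

    def create_wf(resources):        # creator: reserve resources, build the workflow
        ...

    def run_model(wf): ...           # operator: drives the MODELING state
    def merge(files): ...            # modifier
    def split(file, parts): ...      # modifier
    def move_data(src, dst): ...     # modifier
    def monitor(wf):                 # behavior: state, audit, errors, provenance
        ...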

  12. Goal: Ubiquitous Canonical Operations for Scientific W/F Support • Fast data transfer from A to B (e.g., LORS, SABUL, GridFTP, BBCP?, other …) • Database access • Stream merging and splitting • Flow monitoring • Tracking, Auditing, provenance • Verification and Validation • Communication service (web services, grid services, xmlrpc, etc.) • Other …
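
One way to make such operations "canonical" is to hide the concrete tool behind a common interface. A minimal sketch, assuming the bbcp, GridFTP (globus-url-copy), and scp command-line clients are installed; the wrapper itself is hypothetical:

    # transfer.py - one "fast data transfer from A to B" call, multiple interchangeable movers
    import subprocess

    def transfer(src, dst, tool="bbcp"):
        """Move data from src to dst; the underlying tool is a plug-in choice."""
        if tool == "bbcp":
            cmd = ["bbcp", src, dst]
        elif tool == "gridftp":
            cmd = ["globus-url-copy", src, dst]
        elif tool == "scp":
            cmd = ["scp", src, dst]
        else:
            raise ValueError(f"unknown transfer tool: {tool}")
        subprocess.run(cmd, check=True)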

  13. Issues (1) • Communication coupling (loose, tight, very tight, code-level) and granularity (fine, medium?, coarse) • Communication methods (e.g., ssh tunnels, xmlrpc, snmp, web/grid services, etc.) – e.g., apparently poor support for Cray • Storage issues (e.g., p-netcdf support, bandwidth) • Direct and indirect data flows (functionality, throughput, delays, other QoS parameters) • End-to-end performance • Level of abstraction • Workflow description language(s) and exchange issues – interoperability • “Standard” scientific computing “W/F functions”

  14. Issues (2) • The problem is currently similar to old-time punched-card job submission (long turn-around time, can be expensive due to the front-end computational resource I/O bottleneck) – need up-front verification and validation – things will change • Back-end bottleneck due to hierarchical storage issues (e.g., retrieval from HPSS) • Long-term workflow state preservation – needed • Recovery (transfers, other failures) – more needed • Tracking data and files – needed • Who maintains equipment, storage, data, scripts, workflow elements? Elegant solutions may not be good solutions from the perspective of autonomy. • EXTREMELY IMPORTANT!!! – We are trying to get out of the business of totally custom-made solutions.

  15. [Diagram: Workflow - Abstraction Model, with targets] Same pipeline as slide 8: Model (parallel computation), Merge & Backup (mass storage), Move (data mover channel, e.g. LORS, SABUL, FC over SONET; Fibre Channel or local NFS), Split & Viz (parallel visualization), output to VizWall; head node services (SendData, RecvData), web services, and a web or client GUI for Construct, Orchestrate, Monitor/Steer, Change, Stop/Start control. Added targets: end-to-end transfer rates of 2-3 Gbps and 1 TB moved per night. (A throughput check follows.)
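
A quick check that the two targets on this slide are compatible (numbers from the slide; 1 TB = 10^6 MB for simplicity):

    # throughput_check.py - relate the 2-3 Gbps goal to the 1 TB/night goal
    gbps_goal  = (2, 3)
    mb_per_s   = tuple(g * 1000 / 8 for g in gbps_goal)                   # 250 to 375 MB/s
    tb_to_move = 1
    hours = tuple(tb_to_move * 1_000_000 / r / 3600 for r in mb_per_s)    # ~1.1 to ~0.7 hr

    print(f"2-3 Gbps is {mb_per_s[0]:.0f} to {mb_per_s[1]:.0f} MB/s")
    print(f"1 TB moves in about {hours[1]:.1f} to {hours[0]:.1f} hours at those rates")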

  16. Communications • Web/Java-based GUI • Web Services for Orchestration - overall and less than tightly coupled sub-workflows • LSF and MPI for parallel computation • Scripts – (in this example csh/sh based, could be Perl, Python, etc.) on local machines – interpreted language • High-level programming language for simulations, complex data movement algorithms, and similar – compiled language
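
As an illustration of the loosely coupled, service-based orchestration described above, a minimal XML-RPC service in Python's standard library; the service name, port, and method are hypothetical, not part of the actual system:

    # orchestrator_service.py - toy XML-RPC service a workflow engine could call
    from xmlrpc.server import SimpleXMLRPCServer

    def start_merge(run_id):
        # placeholder: would launch the merge step for the given run
        return f"merge started for run {run_id}"

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(start_merge)
    server.serve_forever()

    # Client side (e.g., called from the orchestration layer):
    #   import xmlrpc.client
    #   proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
    #   print(proxy.start_merge("run-042"))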
