Experiences Running Seismic Hazard Workflows
Scott Callaghan, Southern California Earthquake Center, University of Southern California
SC13 Workflow BoF
Overview of SCEC Computational Problem • Domain: Probabilistic Seismic Hazard Analysis (PSHA) • PSHA computation defined as a workflow with two parts • “Strain Green Tensor Workflow” • ~10 jobs, largest are 4,000 core-hours • Output: two 20 GB files (40 GB total) • “Post-processing Workflow” • High throughput • ~410,000 serial jobs, runtimes under 1 min • Output: 14,000 files totaling 12 GB • Uses Pegasus-WMS, HTCondor, Globus • Calculates hazard for one location
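The two-part structure above can be sketched as a small dependency graph: a few parallel Strain Green Tensor (SGT) jobs feed a large fan-out of serial post-processing tasks. This is a minimal illustrative model, not SCEC's actual Pegasus DAX; the job names (`mesh_gen`, `sgt_x`, `seismogram_*`) are hypothetical stand-ins.

```python
# Minimal sketch of the two-stage PSHA workflow structure described above.
# Job names are illustrative, not SCEC's actual Pegasus DAX.
from collections import defaultdict

class Workflow:
    """A tiny DAG of named jobs with dependency edges."""
    def __init__(self):
        self.jobs = set()
        self.parents = defaultdict(set)

    def add_job(self, name, depends_on=()):
        self.jobs.add(name)
        for parent in depends_on:
            self.parents[name].add(parent)

    def topological_order(self):
        """Return jobs in an order that respects dependencies."""
        order, done = [], set()
        def visit(job):
            if job in done:
                return
            for parent in self.parents[job]:
                visit(parent)
            done.add(job)
            order.append(job)
        for job in sorted(self.jobs):
            visit(job)
        return order

wf = Workflow()
# "Strain Green Tensor" stage: a handful of large parallel jobs.
wf.add_job("mesh_gen")
wf.add_job("sgt_x", depends_on=["mesh_gen"])
wf.add_job("sgt_y", depends_on=["mesh_gen"])
# Post-processing stage: many serial tasks, all downstream of the SGTs.
for i in range(3):  # stand-in for ~410,000 serial jobs
    wf.add_job(f"seismogram_{i}", depends_on=["sgt_x", "sgt_y"])

print(wf.topological_order())
```

A real workflow system adds file tracking, retries, and site selection on top of exactly this kind of DAG.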
Recent PSHA Research Calculation Using Workflows – CyberShake Study 13.4 • PSHA calculation exploring the impact of alternative velocity models and alternative codes, defined as 1144 workflows • Parallel Strain Green Tensor (SGT) workflow on Blue Waters • No remote job submission support in Spring 2013 • Created a basic workflow system using qsub dependencies • Post-processing workflow on Stampede • Pegasus-MPI-Cluster manages the serial jobs • A single job is submitted to Stampede • Master-worker paradigm; work is distributed to workers • Relatively little output, so writes were aggregated in the master
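The "basic workflow system using qsub dependencies" mentioned above can be sketched with the Torque/PBS `-W depend=afterok:<jobid>` flag, which holds a job until its parent exits successfully. In this sketch `qsub` is only simulated so the dependency wiring can be shown without a live scheduler; the script names are hypothetical.

```python
# Sketch of chaining PBS jobs with qsub dependencies. qsub is simulated
# here; a real version would run the command and parse the job id from
# qsub's stdout.
import itertools

_ids = itertools.count(1)

def submit(script, after=None):
    """Build (and here just print) the qsub command, returning a job id."""
    cmd = ["qsub"]
    if after:
        # Torque/PBS syntax: run only after the parent job exits cleanly.
        cmd += ["-W", f"depend=afterok:{after}"]
    cmd.append(script)
    job_id = f"{next(_ids)}.bw"  # fake job id for illustration
    print(" ".join(cmd))
    return job_id

# Chain the SGT stages: each step waits for the previous one to succeed.
mesh = submit("mesh_gen.pbs")
sgt = submit("sgt_calc.pbs", after=mesh)
check = submit("sgt_check.pbs", after=sgt)
```

This gives linear dependency chains only; expressing a full DAG, retries, and failure handling is where a real workflow tool earns its keep.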
CyberShake Study 13.4 Workflow Performance Metrics • April 17, 2013 – June 17, 2013 (62 days) • Jobs running 60% of the time • Blue Waters usage – parallel SGT calculations: • Average of 19,300 cores, 8 jobs • Average queue time: 401 sec (34% of average job runtime) • Stampede usage – many-task post-processing: • Average of 1,860 cores, 4 jobs • 470 million tasks executed (177 tasks/sec) • 21,912 jobs total • Average queue time: 422 sec (127% of average job runtime)
Constructing Workflow Capabilities using HPC System Services • Our workflows rely on multiple services • Remote gatekeeper • Remote GridFTP server • Local GridFTP server • Currently, issues are detected only via workflow failure • Automated checks that everything is operational would be helpful • Additional factors to check • Valid X509 proxy • Available disk space
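The automated pre-flight checks wished for above could be as simple as probing each service endpoint and local resource before launching workflows. A hedged sketch, assuming the standard GRAM gatekeeper (2119) and GridFTP (2811) ports; the hostnames and the disk-space threshold are made up for illustration:

```python
# Sketch of automated service checks run before launching workflows.
# Hostnames and thresholds are hypothetical.
import shutil
import socket

def port_open(host, port, timeout=5.0):
    """True if a TCP connection to host:port succeeds (service likely up)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def disk_space_ok(path, min_free_gb):
    """True if the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= min_free_gb

checks = {
    # Well-known ports: GRAM gatekeeper 2119, GridFTP 2811.
    "remote gatekeeper": lambda: port_open("gatekeeper.example.edu", 2119),
    "remote gridftp":    lambda: port_open("gridftp.example.edu", 2811),
    "local disk space":  lambda: disk_space_ok("/tmp", min_free_gb=1),
}

for name, check in checks.items():
    print(f"{name}: {'OK' if check() else 'FAILED'}")
```

A proxy check could be added the same way by shelling out to `grid-proxy-info -exists`, which exits nonzero when no valid X509 proxy is present.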
Why Define PSHA Calculations as Multiple Workflows? • CyberShake Study 13.4 used 1144 workflows • Too many jobs to merge into one workflow • Currently, we create 1144 workflows and monitor submission via a cron job
Improving Workflow Capabilities • Workflow tools could provide higher-level organization • Specify a set of workflows as a group • Workflows are submitted and monitored automatically • Errors are detected and the group paused • Easy to obtain status and progress • Statistics are aggregated over the group
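The proposed "workflow group" capability above can be sketched as a small state machine: submit up to a concurrency limit, track per-workflow state, and pause the whole group when any workflow fails. The `WorkflowGroup` class and its states are hypothetical; a real version would wrap an existing tool's status commands rather than track state in memory.

```python
# Sketch of higher-level group management over many workflows, as
# proposed above. All names here are illustrative, not an existing API.
import enum

class State(enum.Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"

class WorkflowGroup:
    def __init__(self, names, max_concurrent=4):
        self.states = {n: State.PENDING for n in names}
        self.max_concurrent = max_concurrent
        self.paused = False

    def submit_ready(self):
        """Start pending workflows up to the concurrency limit,
        unless the group is paused after an error."""
        if self.paused:
            return []
        running = [n for n, s in self.states.items() if s is State.RUNNING]
        started = []
        for name, state in self.states.items():
            if len(running) + len(started) >= self.max_concurrent:
                break
            if state is State.PENDING:
                self.states[name] = State.RUNNING
                started.append(name)
        return started

    def report(self, name, state):
        """Record a state change; pause the whole group on failure."""
        self.states[name] = state
        if state is State.FAILED:
            self.paused = True

    def progress(self):
        done = sum(1 for s in self.states.values() if s is State.DONE)
        return f"{done}/{len(self.states)} workflows complete"

group = WorkflowGroup([f"site_{i}" for i in range(6)], max_concurrent=2)
print(group.submit_ready())   # first two workflows start
group.report("site_0", State.DONE)
group.report("site_1", State.FAILED)
print(group.submit_ready())   # [] -- group paused after the failure
print(group.progress())
```

This replaces the per-study cron job with one object that owns submission, error handling, and aggregate progress for the whole group.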
Summary of SCEC Workflow Needs • Workflow tools for large calculations on leadership-class systems that do not support remote job submission • Need capabilities to define job dependencies, manage failure and retry, and track files using a DBMS • Lightweight workflow tools that can be included in scientific software releases • For unsophisticated users, the complexity of workflow tools should not outweigh their advantages • Methods to define workflows with interactive (people-in-the-loop) stages • Needed to automate computational pathways that have manual stages • Methods to gather both scientific and computational metadata into databases • Needed to ensure reproducibility and support user interfaces