The ADAMANT Project: Linking Scientific Workflows and Networks
"Adaptive Data-Aware Multi-Domain Application Network Topologies"

Ilia Baldine, Charles Schmitt, University of North Carolina at Chapel Hill/RENCI

Jeff Chase, Duke University

Ewa Deelman, University of Southern California

Funded by NSF under the Campus Cyberinfrastructure – Network Infrastructure and Engineering (CC-NIE) Program

The Problem
  • Scientific data is being collected at an ever-increasing rate
    • The "old days": big, focused experiments (LHC, LIGO, etc.) and big data archives (SDSS, 2MASS, etc.)
    • Today: "cheap" DNA sequencers, and an increasing number of them in individual laboratories
  • The complexity of the computational problems is ever increasing
  • Local compute resources are often not enough (too small, limited availability)
  • The computing infrastructure keeps changing
    • Hardware, software, but also computational models
Computational Workflows: Managing Application Complexity
  • Helps express multi-step computations in a declarative way (see the sketch after this list)
  • Can support automation, minimize human involvement
    • Makes analyses easier to run
  • Can be high-level and portable across execution platforms
  • Keeps track of provenance to support reproducibility
  • Fosters collaboration: code and data sharing
  • Gives the opportunity to manage resources underneath
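
A minimal sketch of this declarative style, using only the Python standard library; the four task names are hypothetical, and this illustrates the idea rather than Pegasus's actual interface:

```python
# A multi-step analysis expressed declaratively as a DAG: each task lists
# the tasks it depends on, and the engine derives a valid execution order.
from graphlib import TopologicalSorter

workflow = {
    "fetch_data": set(),
    "clean_data": {"fetch_data"},
    "analyze":    {"clean_data"},
    "make_plots": {"analyze"},
}

def run(task: str) -> None:
    # Stand-in for dispatching the task to an execution platform.
    print(f"running {task}")

# Automation: the system, not the scientist, sequences the steps.
for task in TopologicalSorter(workflow).static_order():
    run(task)
```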
Large-Scale, Data-Intensive Workflows
  • Montage Galactic Plane Workflow
    • 18 million input images (~2.5 TB)
    • 900 output images (2.5 GB each, 2.4 TB total)
    • 10.5 million tasks (34,000 CPU hours)
  • An analysis is composed of a number of related workflows: an ensemble
  • Smart data/network provisioning is important

[Image: Montage mosaics (× 17), credit: John Good, Caltech]


Southern California Earthquake Center

CyberShake PSHA Workflow

  • Description
    • Builders ask seismologists: “What will the peak ground motion be at my new building in the next 50 years?”
    • Seismologists answer this question using Probabilistic Seismic Hazard Analysis (PSHA)

  • 239 workflows: each site in the input map corresponds to one workflow
  • Each workflow has:
    • 820,000 tasks
    • MPI codes: ~12,000 CPU hours
    • Post-processing: ~2,000 CPU hours
    • Data footprint: ~800 GB
  • Coordination between resources is needed
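
Taken together, and assuming for illustration that the per-workflow figures above hold for all 239 site workflows, the ensemble totals can be tallied with a few lines of arithmetic:

```python
# Back-of-the-envelope totals for the CyberShake ensemble.
workflows = 239
tasks_per_workflow = 820_000
cpu_hours_per_workflow = 12_000 + 2_000   # MPI codes + post-processing
data_gb_per_workflow = 800

print(f"total tasks:     {workflows * tasks_per_workflow:,}")          # 195,980,000
print(f"total CPU hours: {workflows * cpu_hours_per_workflow:,}")      # 3,346,000
print(f"total data:      {workflows * data_gb_per_workflow / 1000:.0f} TB")  # 191 TB
```

At this scale no single resource suffices, which is why coordination between resources matters.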

Environment: How to manage complex workloads?

[Diagram: a work definition on the local resource, alongside data storage and candidate execution targets: campus cluster, XSEDE, Open Science Grid, Amazon Cloud]

Use Given Resources

[Diagram: a Workflow Management System on the local resource takes the work definition as a workflow and dispatches work and data across data storage and execution targets: campus cluster, FutureGrid, XSEDE, Open Science Grid, Amazon Cloud]

Workflow Management
  • You may want to use different resources within a workflow or over time
    • Need a high-level workflow specification
    • Need a planning capability to map from high-level to executable workflow (sketched below)
    • Need to manage the task dependencies
    • Need to manage the execution of tasks on the remote resources
  • Need to provide scalability, performance, reliability
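
A minimal sketch of that planning step, mapping a resource-independent job list onto concrete sites; the job names, site names, and selection rule are all hypothetical, and real planners weigh much more (data location, queue depth, cost):

```python
# Map an "abstract" workflow (no resource bound) to an executable one.
ABSTRACT_WORKFLOW = [
    {"job": "align_reads",   "needs_cores": 16},
    {"job": "call_variants", "needs_cores": 64},
]

SITES = {
    "campus_cluster": {"cores": 32,  "staging": "scp"},
    "amazon_cloud":   {"cores": 128, "staging": "s3"},
}

def plan(abstract_jobs, sites):
    """Bind each job to the first site that can satisfy its requirements."""
    executable = []
    for job in abstract_jobs:
        site = next(name for name, info in sites.items()
                    if info["cores"] >= job["needs_cores"])
        executable.append({**job, "site": site,
                           "stage_in": sites[site]["staging"]})
    return executable

for task in plan(ABSTRACT_WORKFLOW, SITES):
    print(task)   # each task now carries a concrete site and staging method
```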
Pegasus Workflow Management System (est. 2001)
  • A collaboration between USC and the Condor Team at UW Madison (includes DAGMan)
  • Maps a resource-independent “abstract” workflow onto resources and executes the “concrete” workflow
  • Used by a number of applications in a variety of domains
  • Provides reliability: can retry computations from the point of failure (see the sketch below)
  • Provides scalability: can handle large data and many computations (kilobytes to terabytes of data, 1 to 10^6 tasks)
  • Infers data transfers, restructures workflows for performance
  • Automatically captures provenance information
  • Can run on resources distributed among institutions, laptop, campus cluster, Grid, Cloud

Pegasus makes use of available resources, but cannot control them
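
A minimal sketch of retry-from-the-point-of-failure, in the spirit of DAGMan's rescue DAGs but not their actual implementation: completed tasks are checkpointed, and a rerun skips them and resumes with the first unfinished task. The checkpoint file name is hypothetical.

```python
import json
import pathlib

DONE_FILE = pathlib.Path("done_tasks.json")   # hypothetical checkpoint file

def load_done() -> set:
    return set(json.loads(DONE_FILE.read_text())) if DONE_FILE.exists() else set()

def mark_done(done: set, task: str) -> None:
    done.add(task)
    DONE_FILE.write_text(json.dumps(sorted(done)))

def run_workflow(ordered_tasks, execute) -> None:
    done = load_done()
    for task in ordered_tasks:
        if task in done:
            continue          # succeeded in an earlier attempt: skip it
        execute(task)         # raises on failure; the next run resumes here
        mark_done(done, task)
```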

A way to make it work better

[Diagram: Pegasus WMS on the local resource takes the work definition, sends resource requests to a Resource Provisioner, which assembles a virtual resource pool of compute, data, and network resources from grids and clouds; work and data then flow onto the provisioned pool and data storage]
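
A hypothetical sketch of the split shown above: instead of merely using whatever it finds, the workflow system sends a resource request to a provisioner, which answers with a virtual resource pool. The class and field names are illustrative, not ORCA's or Pegasus's API:

```python
from dataclasses import dataclass, field

@dataclass
class ResourceRequest:
    compute_nodes: int
    storage_gb: int
    bandwidth_mbps: int       # networks are provisioned, not just compute

@dataclass
class VirtualResourcePool:
    nodes: list = field(default_factory=list)

class Provisioner:
    def provision(self, req: ResourceRequest) -> VirtualResourcePool:
        # A real provisioner would broker these from grids and clouds.
        return VirtualResourcePool(nodes=[f"vm-{i}" for i in range(req.compute_nodes)])

pool = Provisioner().provision(
    ResourceRequest(compute_nodes=4, storage_gb=500, bandwidth_mbps=1000))
print(pool.nodes)             # the WMS now schedules work onto this pool
```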


Open Resource Control Architecture

  • ORCA is a "wrapper" for off-the-shelf clouds, circuit networks, etc., enabling federated orchestration:
    • Resource brokering
    • VM image distribution
    • Topology embedding
    • Stitching
    • Authorization
  • Deploys a dynamic collection of controllers
  • Each controller receives user requests and provisions resources (see the sketch below)

Jeff Chase, Duke University
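
A hypothetical sketch of such a controller loop, receiving user requests and turning them into provisioning actions; this is illustrative only and does not use ORCA's real interfaces:

```python
import queue

requests = queue.Queue()

def provision(request: dict) -> None:
    # Stand-in for brokering, topology embedding, and stitching.
    print(f"provisioning {request['vms']} VMs linked at {request['link_mbps']} Mb/s")

def controller_loop() -> None:
    while True:
        req = requests.get()  # receive a user request
        if req is None:       # shutdown sentinel
            break
        provision(req)

requests.put({"vms": 3, "link_mbps": 500})
requests.put(None)
controller_loop()
```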

What we would like to do:

Expand to workflow ensembles

What is missing
  • Tools and systems that can integrate the operation of workflow-driven science applications on top of dynamic infrastructures that link campus, institutional and national resources
  • Tools to manage workflow ensembles
  • Need to
    • orchestrate the infrastructure in response to the application
    • monitor various workflow steps and ensemble elements
    • expand and shrink resource pools in response to application performance demands (see the sketch after this list)
    • integrate data movement/storage decisions with workflows/resource provisioning to optimize performance
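
A minimal sketch of the expand-and-shrink idea from the list above: a feedback rule that resizes the pool based on the backlog of runnable tasks. The thresholds and the one-task-per-node assumption are illustrative:

```python
def resize_pool(pending_tasks: int, current_nodes: int,
                high_water: float = 2.0, low_water: float = 0.5) -> int:
    """Return a new pool size given the backlog of runnable tasks."""
    if current_nodes == 0:
        return 1 if pending_tasks else 0
    backlog = pending_tasks / current_nodes
    if backlog > high_water:            # too few nodes: grow the pool
        return current_nodes * 2
    if backlog < low_water:             # idle capacity: shrink the pool
        return max(current_nodes // 2, 1)
    return current_nodes                # within bounds: leave it alone

print(resize_pool(pending_tasks=400, current_nodes=50))   # 100 -> expand
print(resize_pool(pending_tasks=10,  current_nodes=50))   # 25  -> shrink
```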
Summary: ADAMANT will
  • Focus on data-intensive applications: astronomy, bioinformatics, earth science
  • Interleave workload management with resource provisioning
    • Emphasis on storage and network provisioning
  • Monitor the execution and adapt resource provisioning and workload scheduling
  • Experiment on ExoGENI
    • http://networkedclouds.org
    • http://geni-orca.renci.org
    • http://pegasus.isi.edu