Simulation and Modeling at the Exascale for Energy, Ecological Sustainability and Global Security: Scientific Data Management and Software Integration Infrastructure (part of Goal 8) Position Statements

Town Hall Meeting Berkeley Lab April 17-18


Transforming Scientific Computing Through Software

Rob Ross

Mathematics and Computer Science Division

Argonne National Laboratory

An Integrative Approach to HEC Software

Vision: Create a software system that manages all aspects of achieving science goals through HEC, from application development, to managing the scientific workflow, to long-term storage of scientific data.

  • Fundamental change in how applications are built and maintained
  • Enables optimizations to be applied to best benefit the workflow as a whole
  • Software composition plays an important role in this
    • Using rich interfaces, from the compiler level up, to allow effective coupling in both performance and functionality (a minimal sketch follows this list)
  • Should bring new application domains into the fold as well
    • Not just increasing productivity of existing users
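
A minimal sketch of the composition idea; the Component dataclass, the couple() coordinator, and the component names below are all hypothetical, purely to illustrate how rich interfaces (declared reads/writes) could let a coordinator optimize coupling across the workflow as a whole:

    # Hypothetical sketch: components expose a rich interface (what they read
    # and write) so a coordinator can couple and schedule them as a whole.
    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class Component:
        name: str
        reads: List[str]               # data products this step consumes
        writes: List[str]              # data products this step produces
        run: Callable[[Dict], Dict]

    def couple(components: List[Component], initial: Dict) -> Dict:
        """Run components in dependency order, passing data in memory
        rather than through intermediate files."""
        data = dict(initial)
        pending = list(components)
        while pending:
            ready = [c for c in pending if all(r in data for r in c.reads)]
            if not ready:
                raise RuntimeError("unsatisfiable dependencies")
            for c in ready:
                data.update(c.run({r: data[r] for r in c.reads}))
                pending.remove(c)
        return data

    solver = Component("solver", ["mesh"], ["field"],
                       lambda d: {"field": [x * 2 for x in d["mesh"]]})
    writer = Component("writer", ["field"], ["file"],
                       lambda d: {"file": "field.h5 (not actually written)"})
    print(couple([writer, solver], {"mesh": [1, 2, 3]}))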
Thrusts
  • Application Development
    • Programming models and languages
    • Development environments, debuggers
    • Compilers
    • Libraries (numerical, message passing, I/O, etc.)
    • Code composition, transformations
    • Fault tolerance
  • Application Maintenance
    • Validation and verification
    • Error quantification
    • Performance modeling and understanding
  • Application Execution
    • Workflow management
    • Co-scheduling
    • Computational steering
  • Data Management
    • Provenance management
    • Data access, portability, storage formats
    • Wide-area data movement
    • Long-term storage
  • Data Understanding
    • Visualization
    • Data analytics
Challenges
  • Training - How do we educate scientists on how to use this solution and why it is to their advantage?
  • Adoption - How do we ensure that our solutions make it into production?
  • Evolution - How do we make use of pieces as they are completed, rather than waiting until everything is “done”? How can these be integrated into existing codes?
  • Maintenance/Support - How do we ensure that solutions remain relevant and correct, and extract the highest possible performance, as systems evolve?
  • Architecture Interaction - How do we best adapt to emerging trends in HW?
  • Availability - How do we provide this solution to the largest possible collection of appropriate users (<cough> open source <cough>)?

Storage, I/O and Data Management

Alok Choudhary

Northwestern University

a-choudhary@northwestern.edu

Storage, I/O and Data Management
  • Autonomic Storage
    • Self-healing
    • Self-reorganizing
    • Dynamic policies
  • Knowledge Discovery
    • In-place analytics (a small subsetting sketch follows this list)
    • Customized acceleration
    • On-line query
    • Intelligent, DB-like interfaces
    • Scientific databases
  • High-Performance I/O
  • Analytical Appliance
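As one illustration of in-place analytics with a DB-like interface, here is a hedged sketch using the h5py library; the file and dataset names are made up, and a real analytical appliance would push the predicate much closer to the storage:

    # Illustrative only: a DB-like selection pushed down to the storage layer,
    # so only the qualifying hyperslab is read (file/dataset names are made up).
    import h5py

    with h5py.File("simulation_output.h5", "r") as f:
        temperature = f["temperature"]        # e.g. a (time, x, y) dataset
        window = temperature[100:200, :, 50]  # partial read of one region
        hot = window[window > 1.0e6]          # "WHERE temperature > 1e6"
        print(hot.size, "cells selected out of", window.size)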


Remote Access

Miron Livny,

University of Wisconsin

Remote Access (Miron Livny, University of Wisconsin)
  • Unified authentication and authorization mechanisms based on well-defined trust and threat models
  • A symmetric and recursive framework for resource allocation and work delegation
  • Dependable and maintainable software stacks for heterogeneous compute/storage/communication resources
  • Uniform treatment of all resources

The Storage Challenge

Arie Shoshani

Lawrence Berkeley National Laboratory

Storage Growth is Exponential
  • Unlike compute and network resources, storage resources are not reusable
    • Unless data is explicitly removed
    • Need to use storage wisely: checkpointing, etc.
    • Time-consuming, tedious tasks
  • Data growth will scale with compute scaling
    • Storage will grow even with good practices (such as eliminating unnecessary replicas)
    • Not necessarily on supercomputers, but on user/group machines and archival storage
  • Storage cost is a consideration
    • Has to be part of the cost of science growth
    • But storage costs are going down at a rate similar to data growth
    • Need continued investment in new storage technologies

Charts: storage growth 1998-2006 at ORNL (rate: 2X/year) and at NERSC (rate: 1.7X/year). The challenges are in managing the data and the storage. (A quick projection sketch follows.)
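A quick projection of what the quoted growth rates imply; the 1 PB starting size is an assumption for illustration only:

    # Back-of-the-envelope projection of archive size from the quoted growth rates.
    def project(start_pb, annual_factor, years):
        return start_pb * annual_factor ** years

    # If an archive held 1 PB in 2006, the quoted rates imply roughly:
    for site, factor in [("ORNL-like, 2x/year", 2.0), ("NERSC-like, 1.7x/year", 1.7)]:
        size_pb = project(1.0, factor, 10)   # 10 years out, in petabytes
        print(f"{site}: ~{size_pb:.0f} PB after 10 years")
    # 2x/year -> ~1024 PB (an exabyte); 1.7x/year -> ~202 PB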

The Storage Challenges
  • Dynamic petascale storage management (a toy sketch follows this list)
    • Reservation: support dynamic guarantees of required storage
    • Manage allocation: storage lifetime management
    • Automatic data replication
    • Cleanup of storage for unused replicated data
  • Assume that tape technology will disappear
    • Disk-based archives (MAID)
    • Manage energy usage: maximize powering down of disks
    • Data reliability and recovery (error recovery when disks fail)
    • Manage two-level disk technology
      • slow, inexpensive archival disks
      • fast, more expensive active “cache” disks
  • Very high-bandwidth file system
    • 10-100 GB/s to active disks and archive
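A toy sketch (illustrative only, not any existing storage manager) of dynamic reservation with lifetime management and cleanup:

    # Toy sketch of dynamic storage reservation with lifetimes (illustrative only).
    import time

    class StorageManager:
        def __init__(self, capacity_gb):
            self.capacity_gb = capacity_gb
            self.reservations = {}          # name -> (size_gb, expires_at)

        def reserve(self, name, size_gb, lifetime_s):
            self._expire()
            used = sum(size for size, _ in self.reservations.values())
            if used + size_gb > self.capacity_gb:
                return False                # guarantee cannot be given
            self.reservations[name] = (size_gb, time.time() + lifetime_s)
            return True

        def _expire(self):
            """Cleanup: reclaim space whose lifetime has passed (e.g. old replicas)."""
            now = time.time()
            self.reservations = {n: (s, t) for n, (s, t) in self.reservations.items()
                                 if t > now}

    mgr = StorageManager(capacity_gb=250_000)                 # ~250 TB of scratch
    print(mgr.reserve("checkpoint_run42", 100_000, 86400))    # True: fits for a day
    print(mgr.reserve("replica_old", 200_000, 3600))          # False: over capacity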

Indexing and Subsetting Technology

John Wu

Lawrence Berkeley National Laboratory

The Need for Subsetting
  • Data sets too large
  • Analysis too complex
  • “Really, I am only interested in a small subset”: e.g., rare events in high-energy collisions, hot spots in network traffic, shock waves in supernova explosions, flame fronts, etc.
Current State of the Art
  • Terminology: in the database community, indexing technology refers to anything that can accelerate searching (subsetting is a form of searching)
  • Commonly used indexing technology (a toy bitmap-index sketch follows this list)
    • Data reorganization: vertical partition (projection index), materialized view
    • Trees: B-tree and variants, kd-tree, R-tree
    • Bitmap index: compression, binary encoding
  • Application examples
    • Sloan Digital Sky Survey, GenBank, …
    • GridCollector, Dexterous Data Explorer, …
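A toy bitmap index over binned values, showing how subsetting queries reduce to bitwise operations; real bitmap indexes add compression and binary encoding, which are omitted here:

    # Toy bitmap index: one bitmap per bin; a range query ORs the qualifying bins.
    def build_bitmaps(values, bin_edges):
        bitmaps = [0] * (len(bin_edges) - 1)
        for i, v in enumerate(values):
            for b in range(len(bin_edges) - 1):
                if bin_edges[b] <= v < bin_edges[b + 1]:
                    bitmaps[b] |= 1 << i          # set bit i in bin b
                    break
        return bitmaps

    energy = [0.2, 5.1, 7.8, 0.9, 6.3, 2.4]
    bitmaps = build_bitmaps(energy, [0, 2, 4, 6, 8])

    # "SELECT events WHERE 4 <= energy < 8": OR the two qualifying bins.
    mask = bitmaps[2] | bitmaps[3]
    hits = [i for i in range(len(energy)) if mask >> i & 1]
    print(hits)   # [1, 2, 4] -> rows with energy 5.1, 7.8, 6.3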
Challenges and Opportunities
  • Better integration with applications
  • Index sizes: for example, a 10000^3 mesh with one variable is 4 TB; projection index 4 TB, B-tree 12-16 TB, bitmap 8 TB (arithmetic sketch below)
  • Query response time dominated by reading part of indexes
  • Parallelism: some simple searches are embarrassingly parallel, need to work on others
  • Distributed data: database federation
  • Complex data, e.g., graphs
  • Interactivity / progressive solution
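The arithmetic behind the quoted sizes, assuming one 4-byte value per mesh point and the rough multipliers from the slide:

    # Where the quoted sizes come from, assuming one 4-byte value per mesh point.
    mesh_points = 10_000 ** 3                 # 10000^3 mesh
    base_tb = mesh_points * 4 / 1e12          # one variable, 4 bytes/value
    print(f"raw variable: {base_tb:.0f} TB")              # ~4 TB
    print(f"projection index: ~{base_tb:.0f} TB")         # roughly a sorted copy
    print(f"B-tree: ~{3 * base_tb:.0f}-{4 * base_tb:.0f} TB")   # 3-4x overhead
    print(f"bitmap index: ~{2 * base_tb:.0f} TB")               # ~2x here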

Scientific Data Mining (includes techniques from statistics, machine learning, pattern recognition, image and video processing, data mining, …)

Chandrika Kamath

Lawrence Livermore National Laboratory

What is possible in 5-10 years?
  • Difficult to address
    • Analysis is not generic: it depends on the domain, the problem, the data, …
  • State of the art
    • Lots of algorithms/software exist for various steps in scientific data mining
    • Lacking: application to real datasets with results validated by domain scientists; trained experts
Challenges
  • Items of interest may not be “defined”
  • Quality of data
  • Items of interest split across processors and evolving over time on an unstructured mesh
  • Results are an artifact of algorithms, parameters, subjectivity, and biases in the data => trusting the results of analysis
  • Robustness of algorithms
  • Where does analysis fit? Math, CS, or the application? It is multi-disciplinary and usually falls through the cracks

Data Integration Challenges facing Science

Dean Williams

Lawrence Livermore National Laboratory

Data Integration Challenges facing Science (e.g., Climate)
  • Many experimental fields will generate more data in the next 2 years than exists today
  • Large part of research consists of writing programs to analyze data
  • How best to collect, distribute, and find data on a much larger scale?
    • At each stage tools could be developed to improve efficiency
    • Substantially more ambitious community modeling projects (EBs) will require a distributed database
  • Metadata describing extended modeling simulations (e.g., atmospheric aerosols and chemistry, carbon cycle, dynamic vegetation, etc.) (But wait there’s more: economy, public health, energy, etc. )
  • How to make information understandable to end-users so that they can interpret the data correctly
  • More users beyond WG I (science): WG II (impacts on life and societies) and WG III (mitigation, i.e., how to cool down the earth); policy makers, economists, health officials, etc.
  • Client and Server-side analysis and visualization tools in a distributed environment (i.e., transformations, aggregation, subsetting, concatenating, regridding, filtering, …)
  • Coping with different analysis tools, formats, schemas, data from unknown sources
  • Trust and Security on a global scale (not just an agency or country, but worldwide )

Evolving Climate Data Archives for the Future (ESG Data System Evolution)

2006 (CCSM AR4; data archive: terabytes)
  • Central database
  • Centralized curated data archive
  • Time aggregation
  • Distribution by file transport
  • No ESG responsibility for analysis
  • Shopping-cart-oriented web portal
  • ESG connection to desktop analysis tools

Early 2009
  • Testbed data sharing
  • Federated metadata
  • Federated portals
  • Unified user interface
  • Quick-look server-side analysis
  • Location independence
  • Distributed aggregation
  • Manual data sharing
  • Manual publishing

2011 (CCSM, AR5, satellite, in situ biogeochemistry, ecosystems; data archive: petabytes)
  • Full data sharing (add to testbed …)
  • Synchronized federation (metadata, data)
  • Full suite of server-side analysis
  • Model/observation integration
  • ESG embedded into desktop productivity tools
  • GIS integration
  • Model intercomparison metrics
  • User support, life cycle maintenance

The growing importance of climate simulation data standards
  • Global Organization for Earth System Science Portal (GO-ESSP)
    • International collaboration to develop a new generation of software infrastructure
    • Access to observed and simulated data from the climate and weather communities
    • Working closely together using agreed-upon standards
  • NetCDF Climate and Forecast (CF) Metadata Convention
    • Specifies syntax and vocabulary for climate and forecast metadata (a minimal example follows this list)
    • Promotes the processing and sharing of data
    • The use of CF was essential for the success of the IPCC data dissemination
  • The Semantic Web could create a “web of data”
    • Ontologies
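A minimal example of CF-style metadata written with the netCDF4 Python library; the file name, variable, and values are illustrative:

    # Minimal, illustrative CF-style metadata on a NetCDF variable (netCDF4 library).
    from netCDF4 import Dataset

    ds = Dataset("tas_example.nc", "w")
    ds.Conventions = "CF-1.0"                      # declare the convention in use
    ds.createDimension("time", None)
    time = ds.createVariable("time", "f8", ("time",))
    time.units = "days since 2000-01-01 00:00:00"  # CF-controlled syntax
    time.calendar = "standard"
    tas = ds.createVariable("tas", "f4", ("time",))
    tas.standard_name = "air_temperature"          # from the CF standard-name table
    tas.units = "K"
    time[:] = [0.0, 1.0]
    tas[:] = [287.3, 288.1]
    ds.close()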

Managing streaming data

Scott Klasky

Oak Ridge National Laboratory

GTC data production
  • Output increases because of:
    • Asynchronous, metadata-rich I/O
    • Workflow automation
    • More analysis/visualization services in the workflow
  • Only 250 TB of disk on Jaguar, shared by 20 projects; can’t use 100 TB
  • Must stream data from the simulation to HPSS, and part of the data over the WAN to PPPL for further analysis
  • Need fault-tolerant techniques for coupling components together
Managing streaming data and scientific workflow management
  • The complexity of petascale/exascale computation may require separating the main simulation algorithms from secondary computed information
    • Tradeoff between computation, bandwidth, and storage
  • How can we allow researchers to combine their analysis codes with their simulations?
    • With 100 cores/socket, researchers will have plenty of computing power available on their desktops and local clusters
    • We need a model for moving datasets to researchers that can be integrated into their workflows and provides autonomic methods (QoS); a toy staging sketch follows this list
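A toy producer/consumer sketch of such a model: the simulation hands timesteps to an analysis routine through a bounded queue instead of writing everything to disk first. All names below are hypothetical:

    # Toy sketch: the simulation hands each timestep to an analysis consumer
    # through a bounded queue, so I/O and analytics overlap with computation.
    import queue
    import threading

    staging = queue.Queue(maxsize=4)     # bounded: backpressure if analysis lags

    def simulation(steps):
        for t in range(steps):
            field = [t * 0.1] * 8        # stand-in for a computed field
            staging.put((t, field))      # asynchronous hand-off, not a file write
        staging.put(None)                # sentinel: no more timesteps

    def analysis():
        while True:
            item = staging.get()
            if item is None:
                break
            t, field = item
            print(f"step {t}: mean = {sum(field) / len(field):.2f}")

    consumer = threading.Thread(target=analysis)
    consumer.start()
    simulation(5)
    consumer.join()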
Managing streaming data and scientific workflow management
  • Raw volumes of data inside codes are increasing
    • GTC example for petascale: 1T particles; a 10K × 10K × 100 mesh per hour × 100 hours = 730 TB, plus O(100 TB) of analytics data
    • Can’t store all of this for each run; must reduce some of it
  • The complexity of workflows used during simulations is increasing
    • Code coupling, complex code monitoring
    • Provenance tracking becomes essential to scientific understanding of the data
  • Data can be streamed from the main computational engines to secondary compute farms for analysis and code-coupling scenarios (2.8 GB/sec)
  • Is there a programming model to support this that can go from the main computation to the analytics/analysis/coupling?
Data Movement
  • Data movement over the WAN is critical for teams of researchers on the Fusion Simulation Project to work together
    • Need to keep up with speeds from HPSS (100 MB/sec)
      • 12 days to move 100 TB of data (see the arithmetic sketch below)
      • Organizing the movement allows researchers to analyze during the data movement
    • Workflows automate the movement of data to HPSS and to researchers’ analysis clusters
    • The hope is that we can extract relevant features (more relevant particles, …)
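The arithmetic behind the 12-day figure:

    # Sanity check of the quoted transfer time: 100 TB at HPSS-like 100 MB/s.
    data_tb = 100
    rate_mb_per_s = 100
    seconds = data_tb * 1e6 / rate_mb_per_s          # 1 TB = 1e6 MB
    print(f"{seconds / 86400:.1f} days")             # ~11.6 days, i.e. about 12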
Challenges in Scientific Workflow

Bertram Ludäscher

Dept. of Computer Science & Genome Center, UC Davis

Intro: Scientific Workflow

Capture how a scientist works with data and analytical tools:
  • data access, transformation, analysis, visualization
  • possible worldview: dataflow/pipeline-oriented (cf. signal processing); a minimal sketch follows this list

Scientific workflow (wf) benefits (compared with script-based approaches):
  • wf automation
  • wf & component reuse
  • wf modeling, design, documentation
  • wf archival, sharing
  • built-in concurrency (task- and pipeline-parallelism)
  • built-in provenance support
  • distributed execution (Grid) support
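A minimal sketch of the dataflow/pipeline worldview using plain Python generators; a workflow system adds modeling, provenance, and parallel execution on top of this idea:

    # Minimal dataflow worldview: each stage is a stream transformer, chained
    # like a signal-processing pipeline (no parallelism or provenance here).
    def read(values):                 # data access
        for v in values:
            yield v

    def transform(stream):            # transformation
        for v in stream:
            yield v * v

    def analyze(stream):              # analysis
        total = count = 0
        for v in stream:
            total += v
            count += 1
        return total / count

    result = analyze(transform(read([1, 2, 3, 4])))
    print(result)                     # 7.5 = mean of squares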

Different types of Scientific Workflow
  • Data Analysis Pipelines
    • current automation-integration "glue-ware": Perl, Python, …
    • need support for Modeling & Design
      • from conceptual napkin drawings
      • … to optimized, task- & pipeline-parallel process networks
    • need support for integrated data & workflow management
      • data provenance (lineage)
        • efficiency (“smart rerun”); a toy sketch follows this list
        • result interpretation & debugging
  • Simulation Support
    • now the workflow does not represent the science/analytics
    • but instead the workflow engineer's view (plumbing):
      • monitoring (& steering!?) simulations
      • file management (staging, transfer, migration, archival)
      • code-coupling
  • An integrated data & workflow management effort
    • fits right into Rob's HEC Software vision!
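A toy sketch of provenance-backed “smart rerun”: each step’s output is cached under a hash of its inputs, and the lineage record keeps what produced what. The step names and helpers are made up:

    # Toy provenance-backed "smart rerun": a step is re-executed only when the
    # hash of its inputs changes; the lineage record says what produced what.
    import hashlib
    import json

    cache, lineage = {}, []

    def run_step(name, func, inputs):
        key = hashlib.sha256(json.dumps([name, inputs], sort_keys=True).encode()).hexdigest()
        if key in cache:
            print(f"{name}: reused cached result")
            return cache[key]
        result = func(inputs)
        cache[key] = result
        lineage.append({"step": name, "inputs": inputs, "output": result})
        return result

    clean = run_step("clean", lambda xs: [x for x in xs if x >= 0], [3, -1, 4])
    stats = run_step("mean", lambda xs: sum(xs) / len(xs), clean)
    stats = run_step("mean", lambda xs: sum(xs) / len(xs), clean)   # smart rerun: cached
    print(stats, lineage)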
Scientific Workflows = Cyberinfrastructure UPPER-WARE
  • Layered picture: scientific workflows as “upperware”, above upper middleware, middleware, and “underware”
  • Example: Science Environment for Ecological Knowledge (“SEEK”)
  • Scientific Workflows = extreme-scale UPPER-WARE
Support for Modeling & Design (PN is not the end)
(Author: Tim McPhillips, UC Davis)
  • Well-designed workflows rule! Even (especially!) over Perl …
  • But making things easy is hard: the workflow may still suck … Perl/Python rules!?
Example: ChIP-chip workflow

Source: Ludäscher, Bowers, McPhillips (CS), Farnham, Bieda (Bio), UC Davis Genome Center

Scientific Workflow (ChIP-chip) in Context: One size doesn’t fit all … (scientist, wf engineer, …)

Support for Provenance: e.g., inferring a phylogenetic tree from disparate data

Figure: a workflow of datasets and actors, recorded in a provenance store. Aligned DNA sequences yield a maximum likelihood tree (DNA); discrete morphological data yield a maximum parsimony tree; continuous characters yield a maximum likelihood tree (continuous characters); the three trees are then integrated into consensus tree(s).

Plumbing with Style … (Norbert Podhorszki, UC Davis; Scott Klasky, ORNL)
  • Plasma physics simulation on 2048 processors on Seaborg@NERSC (LBL)
  • Gyrokinetic Toroidal Code (GTC) to study energy transport in fusion devices (plasma microturbulence)
  • Generating 800 GB of data (3000 files, 6000 timesteps, 267 MB/timestep); 30+ hour simulation run
  • Under workflow control (a skeleton of this loop follows):
    • Monitor (watch) simulation progress (via remote scripts)
    • Transfer from NERSC to ORNL concurrently with the simulation run
    • Convert each file to HDF5
    • Archive files in 4 GB chunks into HPSS
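A skeleton of that monitor-transfer-convert-archive loop; every helper below is a placeholder (not a real NERSC, ORNL, or HPSS command), just to show the shape of the control flow:

    # Skeleton only: each helper stands in for a remote monitoring script, a
    # wide-area transfer, an HDF5 conversion, or an HPSS archiving step.
    def new_output_files():          # Monitor: watch the simulation's output dir
        return ["gtc_0001.bp", "gtc_0002.bp"]

    def transfer(path):              # Transfer: NERSC -> ORNL, concurrent with the run
        return "/ornl/scratch/" + path

    def convert_to_hdf5(path):       # Convert: produce an HDF5 file per input
        return path.replace(".bp", ".h5")

    def archive(paths):              # Archive: bundle ~4 GB chunks into HPSS
        print("archived:", paths)

    converted = []
    for f in new_output_files():     # in the real workflow: loop until the run ends
        converted.append(convert_to_hdf5(transfer(f)))
    archive(converted)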


CPES Monitoring-Archiving Workflow

Figure: workflow components include a director/scheduler, parameters, documentation, and generic remote execution; stages cover initialization, a NetCDF processing pipeline, a BP-to-HDF5 processing pipeline, HDF5 image generation, and image archival.
Summary
  • Scientific Workflow is not Business Workflow
    • (yes, both are about automation, but there are vast differences)
  • Many challenges:
    • Support for high-level wf modeling & design
    • Smooth integration of statistical and data-mining tools
    • Support for data & wf provenance
      → combined data and wf management (e.g., COMAD [refs])
    • Easy-to-use concurrent execution models
      • e.g., PN has built-in task and pipeline parallelism
      → beyond DAG-style, BPEL-style, web-service-style wfs!
    • Optimization
      • e.g., linear workflow design by the end-user; the system generates a task- and pipeline-parallel wf from it