Grid middleware for high performance computing

Grid Middleware for High Performance Computing. Sathish Vadhiyar, Grid Applications Research Lab (GARL), Supercomputer Education and Research Centre (SERC), Indian Institute of Science (IISc), Bangalore - 560012. ATIP 1st Workshop on HPC in India @ SC-09.


Grid Middleware for High Performance Computing

Sathish Vadhiyar

Grid Applications Research Lab (GARL)

Supercomputer Education and Research Centre (SERC)

Indian Institute of Science (IISc)

Bangalore - 560012

ATIP 1st Workshop on HPC in India @ SC-09



Grid Applications Research Lab

  • Grid and Parallel Computing with primary focus on

    • developing grid applications,

    • building strategies for checkpointing, migration, rescheduling, and fault-tolerance for parallel applications on grid systems, and

    • performance modeling of parallel applications on grids


Motivation

  • Developing solutions for deployment and use of large-scale scientific applications on grids

  • Will result in exploration of large-sized problems and long-running applications


Grid Applications: Climate Modeling

CCSM

  • Enable efficient executions of long-running climate modeling simulations on grid systems with the objective of solving climate science problems

  • Community Climate System Model (CCSM) – a multi-component global general circulation model

  • Analyzed the benefits of executing different components with checkpointing and rescheduling in different batch systems of a grid with a novel execution model


Grid Applications: Climate Modeling – General Idea (IJHPCA, FGCS)

Novel Execution Model

  • Job submission to a batch system incurs queue waiting time

  • Waiting time depends on processor requirements

  • How about decomposing a job into small subjobs with small processor requirements and submitting the subjobs to multiple batch systems of a grid?

  • Efficiency depends on effective system utilization using checkpointing, migration and rescheduling

  • Leads to 55% average increase in throughput


Grid Applications: DNA Sequence Evolutions (JPDC, e-Science 2009)

Master-Worker Architecture for Analyzing Mutations

  • Predictions of future sequences in an evolutionary tree are important for drug discovery, pharmaceutical research and disease control

  • An ancestor sequence can transform into a progeny sequence in many different ways

  • Formulated as a search-space exploration problem and used computational grids to explore the huge space of possible mutations

  • Used popular mutations to predict future evolutionary paths

  • Performed predictions for HIV sequences and other protein sequences

  • Predictions are 40% better than those of random methods

40% Better Predictions


Rescheduling

  • It is necessary to adapt application execution to grid resource and application dynamics

  • SRS – a checkpointing library for malleable applications

  • Can allow processor reconfiguration between migrations

  • Supports different data distributions, storage infrastructure, active migration and fault tolerance


Rescheduling Strategies

  • Given a parallel application consisting of multiple phases and given a set of resources, the problem is to derive a rescheduling plan

    • Where to execute the different phases and when to migrate/reschedule

  • To find {I1, I2, …, ILopt} such that the objective [formula image not reproduced in the transcript] is minimized, where Lopt is the number of intervals, ti is the predicted execution time of interval i, and rcost is the rescheduling cost

  • Developed 3 novel algorithms for deriving a rescheduling plan

    • Incremental algorithm, division heuristic and genetic algorithm

[Figure: Application Phases mapped to Cluster-1, 2, 3 over Interval 1 (t1), Interval 2 (t2), Interval 3 (t3), …, Interval i (ti); panel title: Division heuristic]
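
The planning problem on this slide can be illustrated with a small sketch. It is not one of the three algorithms named above, and the transcript does not carry the slide's objective formula. The sketch therefore assumes a simple objective: the sum of predicted phase times plus `rcost` for every migration between consecutive intervals. The function name `rescheduling_plan` and the table `t[phase][cluster]` are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
from typing import List, Tuple

def rescheduling_plan(
    t: List[List[float]], rcost: float
) -> Tuple[float, List[Tuple[int, int, int]]]:
    """Plain dynamic program over an ASSUMED objective (not the slide's formula).

    t[p][c] is the predicted time of phase p on cluster c (hypothetical input).
    Returns (total_cost, intervals); each interval is
    (first_phase, last_phase, cluster), phases are 0-indexed.
    """
    n_phases, n_clusters = len(t), len(t[0])
    best = [list(map(float, t[0]))]   # best[p][c]: min cost with phase p on c
    prev = [[-1] * n_clusters]        # back-pointers for plan recovery
    for p in range(1, n_phases):
        row, back = [], []
        for c in range(n_clusters):
            # Either stay on cluster c, or migrate from cluster k and pay rcost.
            cost, k = min(
                (best[p - 1][k] + (0.0 if k == c else rcost), k)
                for k in range(n_clusters)
            )
            row.append(cost + t[p][c])
            back.append(k)
        best.append(row)
        prev.append(back)

    # Recover the per-phase cluster assignment by walking the back-pointers.
    c = min(range(n_clusters), key=lambda k: best[-1][k])
    total = best[-1][c]
    assignment = [c]
    for p in range(n_phases - 1, 0, -1):
        c = prev[p][c]
        assignment.append(c)
    assignment.reverse()

    # Merge consecutive phases on the same cluster into intervals.
    intervals, start = [], 0
    for p in range(1, n_phases + 1):
        if p == n_phases or assignment[p] != assignment[start]:
            intervals.append((start, p - 1, assignment[start]))
            start = p
    return total, intervals

# Illustrative, made-up predictions: 4 phases, 2 clusters.
times = [[10, 20], [10, 20], [30, 5], [30, 5]]
total, plan = rescheduling_plan(times, rcost=8)
print(total, plan)
```

With a cheap `rcost` the plan splits into two intervals on different clusters; with a very expensive `rcost` it collapses into a single interval on the cluster with the smallest overall time. The heuristics named on the slide presumably exist because real predictions and rescheduling costs are less tidy than this per-phase table.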


Rescheduling Strategies

  • Performed experiments with five large-scale multi-phase parallel applications

    • Molecular dynamics, n-body simulations, astrophysical gas dynamics, crack propagation, electromagnetics.

Huge Benefits due to Rescheduling


Performance Modeling (JPDC, CPE)

Performance Model Accuracy for Parallel QR

  • It is imperative to automatically derive “knowledge” (performance characteristics) of applications

  • Can be used for effective mapping of applications to resources

  • Built techniques for automatically deriving performance model functions for predicting execution costs of parallel applications on grids

  • First effort to deal with load changes during application executions

  • Less than 30% modeling errors – best reported for non-dedicated systems

  • Have also developed novel scheduling algorithms that use the model functions

  • These algorithms generate 80% better schedules than existing approaches
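
As a rough illustration of what deriving a performance model function can mean, the sketch below fits a two-term model to timing samples by least squares. The model form `T(n, p) = a*n^3/p + b*n^2`, the function names, and the sample values are all assumptions made for this example; the slides do not state the form of the authors' model functions, and this sketch does not handle load changes during execution.

```python
from typing import Iterable, Tuple

def fit_two_term_model(
    samples: Iterable[Tuple[int, int, float]]
) -> Tuple[float, float]:
    """Least-squares fit of the ASSUMED model T = a*n**3/p + b*n**2.

    samples: (problem_size n, processors p, measured_seconds) triples.
    Solves the 2x2 normal equations directly with Cramer's rule.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for n, p, secs in samples:
        x1, x2 = n**3 / p, float(n**2)   # the two basis terms
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * secs
        r2 += x2 * secs
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("samples do not determine both coefficients")
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b

def predict(a: float, b: float, n: int, p: int) -> float:
    """Predicted execution time under the fitted (assumed) model."""
    return a * n**3 / p + b * n**2

# Synthetic samples generated from known coefficients, for illustration only.
a_true, b_true = 2e-9, 1e-6
configs = [(1000, 4), (2000, 4), (2000, 8), (4000, 8), (4000, 16)]
data = [(n, p, predict(a_true, b_true, n, p)) for n, p in configs]
a_fit, b_fit = fit_two_term_model(data)
print(a_fit, b_fit)
```

A scheduler could then call `predict` for each candidate resource set and pick the one with the smallest predicted time, which is the kind of use the bullets above describe.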

[Figure: Scheduling Results by scheduling method; Box Elimination (BE) shown as red bars, 50-80% more efficient]


Grid Middleware

  • Created a grid middleware for parallel multi-phase applications with rescheduling capabilities

  • Have successfully run multi-phase applications on a grid consisting of multiple batch and interactive clusters at two geographically distributed sites

  • Also created a grid middleware for multi-component applications for coordinating the executions of the components on the different systems

Grid Middleware for Multi-Component Applications

Grid Middleware for Multi-Phase Applications


Other Research

  • Checkpointing Interval Selection

    • For efficient execution in the presence of failures

    • A Markov Model consisting of 3 kinds of states for performance prediction

    • Extensive simulations with 9-year real supercomputer failure traces on 8 parallel systems, 3 rescheduling policies, and 3 parallel applications

    • The checkpointing intervals chosen by our model lead to a high amount of useful work by the applications in the presence of failures

  • Compiler-aided checkpointing instrumentation

    • A source-to-source precompiler for automatic insertion of checkpointing calls

    • Performs live-variable analysis to determine the data to checkpoint, and uses wrappers to find data sizes

    • Can handle parallel applications with block-distribution (molecular dynamics)
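
For readers who want a feel for the checkpointing interval selection problem, the sketch below computes Young's classical first-order approximation, `sqrt(2 * C * MTBF)`. This is a well-known baseline and is not the Markov model described on this slide; the function name and the example values are hypothetical.

```python
import math

def young_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the checkpoint interval.

    checkpoint_cost_s: time to write one checkpoint, in seconds.
    mtbf_s: mean time between failures of the system, in seconds.
    NOTE: a textbook baseline, not the Markov model from the slides.
    """
    if checkpoint_cost_s <= 0 or mtbf_s <= 0:
        raise ValueError("checkpoint cost and MTBF must be positive")
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# Made-up example: 60 s checkpoints on a system failing about once a day.
print(young_interval(60.0, 24 * 3600.0))
```

The approximation assumes the checkpoint cost is small relative to the mean time between failures. A model with multiple kinds of states, as on the slide, is presumably aimed at the cases where such simple assumptions break down.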


Summary

  • Primary endeavor: to aid scientific advancement in different domain areas using grid systems

  • Grid research in two different application areas resulted in significant application benefits

  • Contributed novel scheduling and rescheduling algorithms, performance modeling strategies and robust grid middleware for use by the scientific community


Areas of Collaborations

  • Scalability of large-scale and petascale applications

  • Fault tolerance in high performance systems

  • Setting up Indo-US grids

  • Grid middleware collaborations

Thank You
