NSF Site Visit

2-23-2006

HYDRA: Using Windows Desktop Systems in Distributed Parallel Computing


Introduction…

  • Windows desktop systems at IUB student labs

    • 2300 systems, 3 year replacement cycle

    • Pentium IV (>=1.6 GHz), 256/512/1024 MB memory, 10/100 Mbps/GigE, Windows XP

    • More than 1.5 TF


Possibly Utilize Idle Cycles?

Chart legend: red = total owner, blue = total idle, green = total Condor


Problem Description

  • Once again... Windows desktop systems at IUB student labs:

    • As a scientific resource

    • Harvest idle cycles


Constraints

  • Systems are dedicated to students using desktop office applications, not parallel scientific computing, which makes their availability unpredictable and sporadic

  • Microsoft Windows environment

  • Daily software rebuild (updates)


What could these systems be used for?

  • Many small computations and a few small messages

    • Foreman-worker

    • Parameter studies

    • Monte Carlo

  • Goal: High Throughput Computing (not HPC)

    • Parallel runs of the aforementioned small computations make better use of the resource

    • Parallel libraries (MPI, PVM, etc.) have constraints when resource availability is ephemeral, i.e. not predictable
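The foreman-worker pattern on unreliable machines can be sketched as follows. This is a minimal single-process simulation, not Hydra code: `WorkerLost`, `flaky_worker`, and `foreman` are hypothetical names, and the "lost desktop" is simulated by failing the first attempt of every third task. The point is that each Monte Carlo task is small and independent, so a task whose worker vanishes is simply put back on the queue.

```python
import random
from collections import deque


class WorkerLost(Exception):
    """Raised when a (simulated) desktop is reclaimed before finishing a task."""


def monte_carlo_task(seed, samples):
    """One small, independent computation: count random points inside the quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def flaky_worker(seed, attempt, samples):
    """Hypothetical worker: the first attempt fails for every seed divisible by 3."""
    if attempt == 0 and seed % 3 == 0:
        raise WorkerLost(seed)
    return monte_carlo_task(seed, samples)


def foreman(num_tasks, samples=10_000, worker=flaky_worker):
    """Hand out tasks; requeue a task whenever its worker disappears."""
    pending = deque((seed, 0) for seed in range(num_tasks))
    results = {}
    retries = 0
    while pending:
        seed, attempt = pending.popleft()
        try:
            results[seed] = worker(seed, attempt, samples)
        except WorkerLost:
            retries += 1
            pending.append((seed, attempt + 1))
    pi_estimate = 4.0 * sum(results.values()) / (num_tasks * samples)
    return pi_estimate, retries
```

Calling `foreman(12, samples=2000)` returns an estimate of pi together with the number of tasks that had to be rerun. A tightly coupled MPI job would typically abort in the same situation, which is why this style of workload suits ephemeral desktops.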


Solution

  • Simple Message Brokering Library (SMBL)

    • Limited replacement for MPI

      • Both server and client library based on TCP socket abstraction

    • Porting from MPI is fairly straightforward

  • Process and Port Manager (PPM)

  • Plus …

    • Condor for job management and file transfer (no checkpointing or parallelism)

    • Web portal for job submission


The Big Picture

We’ll discuss each part in more detail next…

The shaded box indicates components hosted on multiple desktop computers


SMBL (Server)

SMBL server process table for a 4-CPU parallel session

  • SMBL server maintains a dynamic pool of client process connections

  • Worker job manager hides details of ephemeral workers at the application level
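One way to picture a process table over a dynamic pool of connections is sketched below. This is an assumption about the general idea, not SMBL's actual data structure: `ProcessTable`, `standby`, and the connection ids are hypothetical. Logical ranks stay fixed while the client filling each rank can change, so the application never sees a worker leave.

```python
class ProcessTable:
    """Maps stable logical ranks to whichever client connection currently fills them."""

    def __init__(self, size):
        self.slots = [None] * size  # index = logical rank, value = connection id
        self.standby = []           # connected clients waiting for a free rank

    def connect(self, conn_id):
        """Register a new client; return its rank, or None if it has to wait."""
        for rank, occupant in enumerate(self.slots):
            if occupant is None:
                self.slots[rank] = conn_id
                return rank
        self.standby.append(conn_id)
        return None

    def disconnect(self, conn_id):
        """Drop a client; a standby client, if any, inherits the vacated rank."""
        if conn_id in self.standby:
            self.standby.remove(conn_id)
            return
        rank = self.slots.index(conn_id)  # raises ValueError for unknown clients
        self.slots[rank] = self.standby.pop(0) if self.standby else None

    def lookup(self, rank):
        """Return the connection currently serving this rank, or None."""
        return self.slots[rank]
```

With a table of size 4, a fifth client waits in standby and takes over rank 1 as soon as the client holding rank 1 disconnects.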


SMBL (Client)

  • Client library implements selected MPI-like calls

    • MPI_Send() → SMBL_Send()

    • MPI_Recv() → SMBL_Recv()

  • In charge of message delivery for each parallel process
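Send and receive calls layered on a TCP socket generally need some message framing, because TCP delivers a byte stream rather than discrete messages. The sketch below shows one common approach, a fixed header carrying a tag and a payload length. The function names and the header layout are assumptions for illustration only; they are not SMBL's API or wire format.

```python
import socket
import struct

# Hypothetical header: (tag, payload length), both unsigned 32-bit, network byte order.
HEADER = struct.Struct("!II")


def send_msg(sock, tag, payload):
    """Send one tagged message as header + payload."""
    sock.sendall(HEADER.pack(tag, len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes; recv() alone may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one tagged message; returns (tag, payload)."""
    tag, length = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return tag, _recv_exact(sock, length)
```

A quick local check uses `socket.socketpair()` in place of a real client and server: after `send_msg(a, 7, b"hello")` on one end, `recv_msg(b)` on the other end returns the tag and payload.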


Process and Port Manager (PPM)

  • Starts the SMBL server and application processes on demand

  • Assigns port/host to each parallel session

  • Directs workers to their servers
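The bookkeeping side of these responsibilities can be sketched as a small port broker. This is a hypothetical illustration, not the PPM implementation: the class name, method names, host name, and port range are all assumptions, and starting the server and application processes is left out.

```python
class PortManager:
    """Hands out one (host, port) endpoint per parallel session."""

    def __init__(self, host, first_port, last_port):
        self.host = host
        self.free_ports = list(range(first_port, last_port + 1))
        self.sessions = {}  # session id -> (host, port)

    def open_session(self, session_id):
        """Assign an endpoint to a new session (a real broker would also start a server there)."""
        if session_id in self.sessions:
            return self.sessions[session_id]
        if not self.free_ports:
            raise RuntimeError("no free ports for a new parallel session")
        endpoint = (self.host, self.free_ports.pop(0))
        self.sessions[session_id] = endpoint
        return endpoint

    def direct_worker(self, session_id):
        """Tell a newly started worker which server it should connect to."""
        return self.sessions[session_id]

    def close_session(self, session_id):
        """Release the port so a later session can reuse it."""
        _, port = self.sessions.pop(session_id)
        self.free_ports.append(port)
```

Two sessions opened on `PortManager("server.example", 5000, 5001)` receive ports 5000 and 5001, and every worker asking about a session is sent to that session's endpoint.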


PPM (cont’d ...)

PPM with two SMBL servers, one per parallel session (Parallel Session 1 and Parallel Session 2)


Once again … the big picture

The shaded box indicates components hosted on multiple desktop computers


Recent Development

  • Hydra cluster is Teragrid-enabled (Nov 2005)

    • Allows Teragrid (TG) users to use the resource

    • Virtual-host-based solution: two different URLs for IU and Teragrid users

    • Teragrid users authenticate against PSC’s Kerberos server


System Layout

  • PPM, SMBL server, Condor and web portal run on a Linux server

    • Dual Intel Xeon 3.0 GHz, 4 GB memory, GigE

  • A second Linux server runs Samba to serve the BLAST database


Portal

  • Creates and submits Condor files, handles data files

  • Apache/PHP based

  • Kerberos authentication

  • URLs:

    • http://hydra.indiana.edu (IU users)

    • http://hydra.iu.teragrid.org (Teragrid users)
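As an illustration of what "creates Condor files" might involve, the sketch below builds a vanilla-universe submit description as text. The portal itself is Apache/PHP based; Python is used here only for the example, and the function name, file names, and arguments are assumptions rather than anything taken from the Hydra portal.

```python
def condor_submit_description(executable, arguments, input_files, workers):
    """Build the text of a Condor submit description for `workers` identical jobs."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
    ]
    if input_files:
        lines.append("transfer_input_files = " + ", ".join(input_files))
    lines += [
        # $(Process) is expanded by Condor to 0, 1, 2, ... for each queued job.
        "output = worker.$(Process).out",
        "error = worker.$(Process).err",
        "log = session.log",
        f"queue {workers}",
    ]
    return "\n".join(lines) + "\n"
```

The returned text would be written to a file and handed to `condor_submit`. The file-transfer settings matter in this setting because the lab desktops do not share a file system with the submit host.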


Utilization of Idle Cycles

Chart legend: red = total owner, blue = total idle, green = total Condor


Summary

  • Large parallel computing facility created at a low cost

    • SMBL: a parallel message-passing library that can deal with ephemeral resources

    • PPM: a port broker that can handle multiple parallel sessions

  • SMBL Homepage

    • http://smbl.sourceforge.net (Open Source)


Links and References

  • Hydra Portal

    • http://hydra.indiana.edu (IU users)

    • http://hydra.iu.teragrid.org (Teragrid users)

  • SMBL home page: http://smbl.sourceforge.net

  • Condor home page: http://www.cs.wisc.edu/condor/

  • IU Teragrid home page: http://iu.teragrid.org


Links and References (cont’d)

  • Parallel FastDNAml: http://www.indiana.edu/~rac/hpc/fastDNAml

  • Blast: http://www.ncbi.nlm.nih.gov/BLAST

  • Meme: http://meme.sdsc.edu/meme/intro.html