HYDRA Using Windows Desktop Systems in Distributed Parallel Computing

Presentation Transcript

NSF Site Visit

2-23-2006

Introduction…
  • Windows desktop systems at IUB student labs
    • 2,300 systems, 3-year replacement cycle
    • Pentium IV (>=1.6 GHz), 256/512/1024 MB memory, 10/100 Mbps/GigE, Windows XP
    • More than 1.5 TFLOPS in aggregate

Possibly Utilize Idle Cycles?

[Chart legend: red = total owner, blue = total idle, green = total Condor]


Problem Description
  • Once again... Windows desktop systems at IUB student labs:
    • As a scientific resource
    • Harvest idle cycles

Constraints
  • Systems are dedicated to students using desktop office applications, not parallel scientific computing, which makes their availability unpredictable and sporadic
  • Microsoft Windows environment
  • Daily software rebuild (updates)

What could these systems be used for?
  • Many small computations and a few small messages
    • Foreman-worker
    • Parameter studies
    • Monte Carlo
  • Goal: High Throughput Computing (not HPC)
    • Parallel runs of the aforementioned small computations to make better use of the resource
    • Parallel libraries (MPI, PVM, etc.) have constraints when resource availability is ephemeral, i.e., not predictable
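
The foreman-worker pattern with ephemeral workers can be illustrated with a minimal Python sketch. It is not taken from the presentation: the function names and the `worker_survives` callback are hypothetical, and the "workers" are simulated in a single process. The point is only that small, independent tasks (here, Monte Carlo chunks) can simply be requeued when a desktop is reclaimed by its owner.

```python
import random
from collections import deque

def estimate_pi_chunk(seed, samples=10_000):
    """One small, independent Monte Carlo task: count points inside the unit quarter circle."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return hits, samples

def run_foreman(num_tasks, worker_survives):
    """Foreman loop: hand out tasks and requeue any task whose worker vanished.

    worker_survives(task_id, attempt) -> bool is a hypothetical stand-in for
    "the desktop stayed idle long enough to return a result".
    """
    pending = deque((task_id, 0) for task_id in range(num_tasks))
    results = {}
    while pending:
        task_id, attempt = pending.popleft()
        if not worker_survives(task_id, attempt):
            # The worker disappeared; put the task back for another attempt.
            pending.append((task_id, attempt + 1))
            continue
        results[task_id] = estimate_pi_chunk(seed=task_id)
    hits = sum(h for h, _ in results.values())
    total = sum(n for _, n in results.values())
    return 4.0 * hits / total, len(results)

if __name__ == "__main__":
    # Every task loses its first worker, as if a student sat down at the machine.
    pi_estimate, completed = run_foreman(8, lambda task_id, attempt: attempt > 0)
    print(completed, pi_estimate)
```

Because each task carries everything it needs, losing a worker costs only that one task's work, which is why this style of computation suits sporadically available machines.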

Solution
  • Simple Message Brokering Library (SMBL)
    • Limited replacement for MPI
      • Both server and client library based on TCP socket abstraction
    • Porting from MPI is fairly straightforward
  • Process and Port Manager (PPM)
  • Plus …
    • Condor for job management and file transfer (no checkpointing or parallelism)
    • Web portal for job submission

The Big Picture

We’ll discuss each part in more detail next…

The shaded box indicates components hosted on multiple desktop computers


SMBL (Server)

SMBL Server Process Table for a 4-CPU parallel session

  • SMBL server maintains a dynamic pool of client process connections
  • Worker job manager hides details of ephemeral workers at the application level
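
One way to picture a process table that hides ephemeral workers is sketched below in Python. This is an illustrative assumption, not SMBL's actual data structure: the class name, its methods, and the choice to hold messages for a vacant rank are all hypothetical. The idea shown is that logical ranks stay fixed while the client connections behind them come and go.

```python
class ProcessTable:
    """Maps fixed logical ranks to whichever client connection currently fills them."""

    def __init__(self, size):
        self.slots = [None] * size                         # rank -> connection id, None if vacant
        self.queued = {rank: [] for rank in range(size)}   # messages awaiting a worker

    def attach(self, conn_id):
        """Give a newly connected client the lowest vacant rank; return it (None if full)."""
        for rank, current in enumerate(self.slots):
            if current is None:
                self.slots[rank] = conn_id
                return rank
        return None

    def detach(self, conn_id):
        """Mark the rank vacant when its client disappears; the rank itself persists."""
        for rank, current in enumerate(self.slots):
            if current == conn_id:
                self.slots[rank] = None
                return rank
        return None

    def send(self, rank, message):
        """Route to the current connection, or hold the message if the rank is vacant."""
        if self.slots[rank] is None:
            self.queued[rank].append(message)
            return None
        return (self.slots[rank], message)

    def drain(self, rank):
        """Return and clear messages held for a rank, e.g. after a replacement attaches."""
        held, self.queued[rank] = self.queued[rank], []
        return held
```

In this sketch the application only ever addresses ranks 0 to 3 of a 4-process session; which desktop currently backs each rank is a detail kept inside the table.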

SMBL (Client)
  • Client library implements selected MPI-like calls
    • MPI_Send() → SMBL_Send()
    • MPI_Recv() → SMBL_Recv()
  • In charge of message delivery for each parallel process
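
To make the idea of MPI-like send and receive calls on top of TCP sockets concrete, here is a minimal Python sketch. It does not reproduce SMBL's wire format or API: the header layout (source rank, tag, payload length) and the function names `send_msg` and `recv_msg` are assumptions chosen for illustration.

```python
import socket
import struct

# Hypothetical header: source rank, tag, payload length, in network byte order.
HEADER = struct.Struct("!iiI")

def send_msg(sock, source, tag, payload):
    """Send one framed message: fixed-size header followed by the raw payload bytes."""
    sock.sendall(HEADER.pack(source, tag, len(payload)) + payload)

def _recv_exact(sock, count):
    """Read exactly count bytes, since a single TCP recv may return fewer."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Receive one framed message and return (source, tag, payload)."""
    source, tag, length = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return source, tag, _recv_exact(sock, length)

if __name__ == "__main__":
    # A connected socket pair stands in for a client and server on separate hosts.
    left, right = socket.socketpair()
    send_msg(left, source=1, tag=7, payload=b"partial result")
    print(recv_msg(right))
    left.close()
    right.close()
```

Framing each message with an explicit length is what lets a stream-oriented TCP connection carry discrete messages, which is the abstraction an MPI-style `Send`/`Recv` pair needs.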

Process and Port Manager (PPM)
  • Starts the SMBL server and application processes on demand
  • Assigns port/host to each parallel session
  • Directs workers to their servers
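
The port-assignment role described above might look roughly like the following Python sketch. It is a simplified assumption rather than the PPM implementation: the `PortManager` class, its method names, and the host and port values are hypothetical, and starting the actual server process is left out.

```python
class PortManager:
    """Hands out one (host, port) endpoint per parallel session from a fixed port range."""

    def __init__(self, host, first_port, last_port):
        self.host = host
        self.free_ports = list(range(first_port, last_port + 1))
        self.sessions = {}  # session id -> (host, port) of that session's server

    def open_session(self, session_id):
        """Reserve a port for a session; a real manager would also start its server here."""
        if session_id in self.sessions:
            return self.sessions[session_id]
        if not self.free_ports:
            raise RuntimeError("no free ports left for a new parallel session")
        endpoint = (self.host, self.free_ports.pop(0))
        self.sessions[session_id] = endpoint
        return endpoint

    def lookup(self, session_id):
        """Tell a worker which server to connect to, opening the session on first request."""
        return self.open_session(session_id)

    def close_session(self, session_id):
        """Release the session's port so a later session can reuse it."""
        _host, port = self.sessions.pop(session_id)
        self.free_ports.append(port)
```

Every worker of a given session receives the same endpoint from `lookup`, so two parallel sessions can share one server machine without their messages mixing.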

PPM (cont’d ...)

[Figure: PPM with two SMBL servers (two parallel sessions), labeled Parallel Session 1 and Parallel Session 2]


Once again … the big picture

The shaded box indicates components hosted on multiple desktop computers


Recent Development
  • Hydra cluster is now Teragrid-enabled (Nov 2005)
    • Allows Teragrid (TG) users to use the resource
    • Virtual-host-based solution: two different URLs for IU and Teragrid users
    • Teragrid users authenticate against PSC’s Kerberos server

System Layout
  • PPM, SMBL server, Condor, and web portal run on a Linux server
    • Dual Intel Xeon 3.0 GHz, 4 GB memory, GigE
  • A second Linux server runs Samba to serve the BLAST database

Portal
  • Creates and submits Condor files, handles data files
  • Apache/PHP based
  • Kerberos authentication
  • URLs:
    • http://hydra.indiana.edu (IU users)
    • http://hydra.iu.teragrid.org (Teragrid users)
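
As a rough illustration of what "creates and submits Condor files" can involve, the Python sketch below builds the text of a Condor submit description. The portal itself is Apache/PHP based, so this is not its code: the function name, file names, and argument values are hypothetical. The submit commands used (`universe`, `executable`, `arguments`, `should_transfer_files`, `when_to_transfer_output`, `transfer_input_files`, `output`, `error`, `log`, `queue`) are standard Condor ones.

```python
def make_submit_description(executable, arguments, input_files, num_workers):
    """Build the text of a Condor submit description for a batch of identical workers."""
    lines = [
        "universe = vanilla",                  # plain jobs, no checkpointing
        f"executable = {executable}",
        f"arguments = {arguments}",
        "should_transfer_files = YES",         # let Condor's file transfer move the inputs
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {', '.join(input_files)}",
        "output = worker.$(Process).out",      # one output file per queued worker
        "error = worker.$(Process).err",
        "log = session.log",
        f"queue {num_workers}",
    ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Hypothetical worker binary and inputs, purely for illustration.
    print(make_submit_description("worker.exe", "--session 1",
                                  ["params.txt", "data.bin"], 4))
```

The resulting text would be written to a file and handed to `condor_submit`; how the portal actually does this is not described in the slides.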

Utilization of Idle Cycles

[Chart legend: red = total owner, blue = total idle, green = total Condor]


Summary
  • Large parallel computing facility created at a low cost
    • SMBL: a parallel message-passing library that can deal with ephemeral resources
    • PPM: a port broker that can handle multiple parallel sessions
  • SMBL Homepage
    • http://smbl.sourceforge.net (Open Source)

Links and References
  • Hydra Portal
    • http://hydra.indiana.edu (IU users)
    • http://hydra.iu.teragrid.org (Teragrid users)
  • SMBL home page: http://smbl.sourceforge.net
  • Condor home page: http://www.cs.wisc.edu/condor/
  • IU Teragrid home page: http://iu.teragrid.org

Links and References (cont’d..)
  • Parallel FastDNAml: http://www.indiana.edu/~rac/hpc/fastDNAml
  • Blast: http://www.ncbi.nlm.nih.gov/BLAST
  • Meme: http://meme.sdsc.edu/meme/intro.html