
Desktop Grid with XtremWeb: User experiences and feedback

SC2002 Panel: Desktop Grids: 10,000-Fold Parallelism for the masses. F. Cappello, A. Bouteiller, S. Djilali, G. Fedak, C. Germain, O. Lodygensky, F. Magniette, V. Neri, A. Selikhov. Cluster and GRID group, LRI, Université Paris Sud. fci@lri.fr, www.lri.fr/~fci/Group


Presentation Transcript


  1. Desktop Grid with XtremWeb: User experiences and feedback
SC2002 Panel: Desktop Grids: 10,000-Fold Parallelism for the masses
F. Cappello, A. Bouteiller, S. Djilali, G. Fedak, C. Germain, O. Lodygensky, F. Magniette, V. Neri, A. Selikhov
Cluster and GRID group, LRI, Université Paris Sud. fci@lri.fr, www.lri.fr/~fci/Group
Desktop Grids Panel at SC2002, Baltimore

  2. XtremWeb: General Architecture
• XtremWeb: free, open-source PC-Grid framework
• For research studies and production
• 3 entities: client / coordinator / worker (in different protection domains)
[Architecture diagram: PC clients, PC workers, and PC clients/workers connect over the Internet or a LAN to a coordinator, which may itself be hierarchical or organized peer-to-peer; a Global Computing client also submits work through the coordinator]
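To make the three-entity split concrete, here is a minimal sketch of the worker side of such a system in C. Every identifier (xw_fetch_task, xw_run_and_report, the task_t layout) is a hypothetical placeholder, stubbed out locally so the example compiles and runs; it is not XtremWeb's actual API. What it illustrates is the typical pull style of a desktop grid: the worker initiates contact, asks the coordinator for a task, executes it, and reports the result.

    /* Minimal sketch of the worker side of a client/coordinator/worker desktop
     * grid. Every identifier here is a hypothetical placeholder, NOT XtremWeb's
     * real API; the stubs only simulate the protocol so the sketch runs. */
    #include <stdio.h>
    #include <string.h>

    typedef struct { long id; char app[32]; char input[128]; } task_t;

    /* Stub "coordinator": hands out one fake task, then reports no more work. */
    static int xw_fetch_task(task_t *t) {
        static int served = 0;
        if (served) return 0;                 /* 0 = coordinator queue is empty */
        served = 1;
        t->id = 1;
        strcpy(t->app, "aires");
        strcpy(t->input, "shower-0001.par");
        return 1;
    }

    /* Stub execution: a real worker would run the binary in a sandbox. */
    static void xw_run_and_report(const task_t *t) {
        printf("worker: running %s on %s, reporting result for task %ld\n",
               t->app, t->input, t->id);
    }

    int main(void) {
        task_t t;
        /* The worker (like the client) always initiates the connection to the
         * coordinator, which helps it operate from behind firewalls and NAT. */
        while (xw_fetch_task(&t))
            xw_run_and_report(&t);
        return 0;
    }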

  3. XtremWeb: User projects
• CGP2P ACI GRID (academic research on Desktop Grid systems), France
• C@sper: industry research project (Airbus + Alcatel Space), France
• Augernome XtremWeb (campus-wide Desktop Grid), France
• IFP (French Petroleum Institute), France
• EADS (airplane + Ariane rocket manufacturer), France
• University of Geneva (research on Desktop Grid systems), Switzerland
• University of Wisconsin-Madison (Condor + XW), USA
• University of Guadeloupe + Pasteur Institute: tuberculosis, France
• Mathematics laboratory, University of Paris South (PDE solver research), France
• University of Lille (control language for Desktop Grid systems), France
• ENS Lyon: research on large-scale storage, France

  4. Casper (Airbus, Alcatel Space)
• Building an open-source framework for sharing software and hardware resources
[Architecture diagram: a client PC or workstation talks HTML/SSL to an ASP portal (web pages) that handles user management and data management; a Java-based scheduler takes job submissions and data transfers and dispatches numerical components for execution on a Desktop Grid (XtremWeb), on clusters, and on parallel computers]

  5. Augernome XtremWeb
• Particle Physics Laboratory (LAL-IN2P3)
  • Understanding the origin of very high energy cosmic rays (10^20 eV)
  • Air Shower Extended Simulator
  • Typical execution time: 10 hours
  • Number of simulations: millions
• Proteins Modeling and Engineering Laboratory
  • Structural genomics: numerical simulation of protein conformation changes
  • Charm molecular dynamics
  • Various execution times
  • Movie generation (large number of simulations)

  6. Pierre Auger Observatory
Understanding the origin of very high energy cosmic rays:
• Aires: Air Showers Extended Simulation
• Sequential, Monte Carlo; time for a run: 5 to 10 hours
[Deployment diagram: an air-shower parameter database (Lyon, France) is shared between traditional supercomputing centers (CINES, France; Fermi Lab, USA) and an XtremWeb server that dispatches Aires runs to PC clients and PC workers over the Internet and LANs; estimated PC number ~5000]

  7. French Petroleum Institute (IFP)
Gibbs application: molecular modeling
• Monte Carlo simulation
• Task duration: ~48 hours on a typical PC
Senkin application: computation of the auto-ignition time of a gas
• Applied to car gasoline engines
• Multi-parameter execution: 12,000 parameter sets for one study
• 10 minutes of computation per task
• Input file size: 200 KB, output file size: 5 MB
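Since the Senkin study is a pure multi-parameter sweep (12,000 independent parameter sets, each described by a small input file), the tasks can be generated mechanically. The sketch below shows one way to emit such a sweep in C; the parameter names, ranges, and file format are invented for illustration and are not IFP's actual Senkin setup.

    /* Sketch: generate a 12,000-task multi-parameter study as independent
     * desktop-grid tasks, one small input file per task. Parameter names,
     * ranges, and the file format are hypothetical. */
    #include <stdio.h>

    int main(void) {
        int id = 0;
        /* 30 x 20 x 20 = 12,000 combinations */
        for (int p = 0; p < 30; p++)             /* e.g. pressure index      */
            for (int t = 0; t < 20; t++)         /* e.g. temperature index   */
                for (int r = 0; r < 20; r++) {   /* e.g. mixture-ratio index */
                    char name[64];
                    snprintf(name, sizeof name, "senkin-%05d.in", id++);
                    FILE *f = fopen(name, "w");
                    if (!f) { perror(name); return 1; }
                    fprintf(f, "pressure_index %d\ntemperature_index %d\nratio_index %d\n",
                            p, t, r);
                    fclose(f);
                }
        printf("generated %d task input files (each ~200 KB in the real study)\n", id);
        return 0;
    }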

  8. EADS-CCR (Airbus, Ariane)
Cassiope application: ray tracing
[Chart: execution time (h:m:s, from 00:00:00 up to about 00:23:02) of XtremWeb vs. MPI for Cassiope on 4, 8, and 16 processors]

  9. Feedback
What applications are you running on desktop Grids? What scale?
Scientific applications
Multi-parameter:
• Particle physics: AIRES (Air Shower Extended Simulation) → 5000 nodes (expected)
• Molecular dynamics: Charm → 1000 nodes (expected)
• Computational fluid dynamics (at IFP) → 2000 nodes (expected)
• etc.
Master-Worker:
• Ray tracing (EADS) → ??? nodes expected

  10. Application range & deployment
What range of applications and scale of deployment do you expect?
What reduces the application range?
• Computational resource capacities:
  • Limited memory (128 MB, 256 MB)
  • Limited network performance (100BaseT)
• Available programming models:
  • Master-Worker (see the sketch after this slide)
  • RPCs
  • Need for MPI
What makes the deployment a complex issue?
• Human factor (system administrator, PC owner)
• Use of network resources (backup during the night)
• Dispatcher scalability (hierarchical, distributed?)
• Complex topology (NAT, firewall, proxy)
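A minimal master-worker skeleton, the model these deployments currently support, is sketched below in C. The xw_submit/xw_wait_any calls are hypothetical stand-ins for a desktop-grid client API (stubbed locally so the example runs); they are not XtremWeb's real interface. The essential property is that tasks are independent, so the master can collect results in whatever order workers finish.

    /* Master-worker sketch: submit N independent tasks, collect results in
     * completion order. xw_submit/xw_wait_any are hypothetical stand-ins for a
     * desktop-grid client API, stubbed here so the example runs standalone. */
    #include <stdio.h>

    #define NTASKS 8

    static int pending = 0;

    /* Stub: a real client would ship the task to the coordinator. */
    static void xw_submit(int task_id) {
        printf("master: submitted task %d\n", task_id);
        pending++;
    }

    /* Stub: a real client would block until some worker returns a result;
     * here we simply pretend tasks finish in reverse order of submission. */
    static int xw_wait_any(void) {
        return --pending;   /* id of a "completed" task */
    }

    int main(void) {
        for (int i = 0; i < NTASKS; i++)
            xw_submit(i);                /* tasks are independent: no ordering needed */
        for (int i = 0; i < NTASKS; i++) {
            int done = xw_wait_any();
            printf("master: collected result of task %d\n", done);
        }
        return 0;
    }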

  11. MPICH-V (Volatile)
Goal: execute existing or new MPI applications
Programmer's view unchanged: one PC client calls MPI_send(), another calls MPI_recv() (see the sketch after this slide)
Problems:
• 1) Volatile nodes (any number may fail or leave at any time)
• 2) Firewalls (PC Grids)
• 3) Non-named receptions (they must be replayed in the same order as in the previous, failed execution)
Objective summary:
• 1) Automatic fault tolerance
• 2) Transparency for the programmer and the user
• 3) Tolerate n faults (n being the number of MPI processes)
• 4) Firewall bypass (tunnel) for cross-domain execution
• 5) Scalable infrastructure/protocols
• 6) Avoid global synchronizations (checkpoint/restart)
• 7) Theoretical verification of protocols
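For reference, here is the kind of ordinary MPI code the slide means by an unchanged programmer's view, written in C. The wildcard MPI_ANY_SOURCE receive on rank 0 is exactly a non-named reception: its completion order is non-deterministic, so a fault-tolerant runtime such as MPICH-V must log it and replay it in the same order after a failure. Nothing in the program is specific to MPICH-V; per the slide's transparency objective, it should run unmodified.

    /* Ordinary MPI program: rank 0 receives one value from every other rank
     * using a wildcard (MPI_ANY_SOURCE) receive. The arrival order depends on
     * timing, which is the "non-named reception" MPICH-V must replay in the
     * same order after a failure. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            for (int i = 1; i < size; i++) {
                double x;
                MPI_Status st;
                /* wildcard receive: completion order is non-deterministic */
                MPI_Recv(&x, 1, MPI_DOUBLE, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
                printf("rank 0 got %.1f from rank %d\n", x, st.MPI_SOURCE);
            }
        } else {
            double x = (double)rank;
            MPI_Send(&x, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }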

  12. Concluding remarks
What we have learned:
• Deployment is critical and may take a long time for non-specialists
• Users do not immediately grasp the computational power a Desktop Grid can provide
• Once they do, they propose new uses of their applications (similar to the transition from sequential to parallel)
• They also rapidly ask for more resources!!!
• Users ask for more programming-model paradigms → MPI
• Strong need for tools that help users browse their mountain of results
We need more feedback on:
• Deployment
• Applications
• Programming models
• MPICH-V
www.xtremweb.org
