
Faucets: the Charm++ Clusters Solution Tutorial

Laxmikant Kale, Jayant DeSouza, Sameer Kumar, Sindhura Bandhakavi, Mani Potnuru. Parallel Programming Laboratory, Department of Computer Science, University of Illinois at Urbana-Champaign. http://charm.cs.uiuc.edu/


Presentation Transcript


  1. Faucets: the Charm++ Clusters Solution Tutorial
     Laxmikant Kale, Jayant DeSouza, Sameer Kumar, Sindhura Bandhakavi, Mani Potnuru
     Parallel Programming Laboratory, Department of Computer Science
     University of Illinois at Urbana-Champaign
     http://charm.cs.uiuc.edu/
     Charm++ Workshop 2002

  2. Motivation
     • Demand for high-end computational power, but
       • Dispersed
         • which machine would give me back my results quickest?
       • Hard to use
         • use ssh to log in, ftp files, decide queue, create script, submit
         • because of the hassle, users just submit the same script to the same machine even if a better alternative exists
         • monitor a running job
     • Low operational efficiency of existing computing systems

  3. Outline
     • High-level
       • Faucets, adaptive jobs and the Adaptive Queuing System (AQS)
       • Demo
     • Usage and Installation
       • How to write an adaptive program
       • Installing and using the AQS
       • Adding your cluster to an existing Faucets server
       • Installing a Faucets server

  4. Solution 1: Faucets
     • Motivation #1: dispersed, hard to use
     • Central source of compute power
       • Users
       • Providers of compute resources
       • User account not needed on every resource
     • Match users and providers
       • Market economy?
       • QoS requirements, contracts and bidding systems
     • GUI or web-based interface
       • Submission
       • Monitoring

  5. Faucets
     [Diagram labels: Job Submission; Job Monitor; Job Specs; Bids; File Upload; Job Id; Cluster]
     Parallel systems need to maximize their efficiency!
     http://charm.cs.uiuc.edu/research/faucets

  6. Motivation #2: Inefficient Utilization
     [Diagram: 16-processor system. Labels: Allocate A; Job A; Job B; 8 processors; 10 processors; Conflict; B queued]
     Current job schedulers can have low system utilization!

  7. Motivation #2, contd.
     • Chun & Culler paper
       • Compares FirstPrice (market-based scheduling) with PrioFIFO
       • Up to 2.5x improvement as degree of job parallelism increases
     • Both have “head-of-line” blocking
       • Adaptive jobs fix this
     Brent Chun and David Culler, User-centric Performance Analysis of Market-based Cluster Batch Schedulers, CCGrid 2002.

  8. Solution 2: Adaptive Jobs
     • Jobs that can shrink or expand the number of processors they are running on at runtime
     • Improve system utilization and response time
     • Properties
       • Min_pe, related to the memory requirements of the job
       • Max_pe, related to speedup
     • Scheduler can take advantage of this adaptivity

  9. Adaptive Job Scheduler
     • Maximize system utilization and minimize response time
     • Scheduling decisions
       • Shrink existing jobs when a new job arrives
       • Expand jobs to use all processors when a job finishes
     • Processor map sent to the job
       • Bit vector specifying which processors a job is allowed to use
       • 00011100 (use 3, 4 and 5!)
     • Handles regular (non-adaptive) jobs
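The processor-map example on this slide can be decoded with a few lines of Python. This is only an illustrative sketch: the function name `allowed_processors` is hypothetical, and it assumes the bit vector is read left to right with zero-based processor numbers, which is the reading that makes `00011100` mean processors 3, 4 and 5.

```python
def allowed_processors(processor_map: str) -> list:
    """Return the zero-based indices of processors whose bit is set.

    Assumption: the leftmost character of the map is processor 0.
    """
    if set(processor_map) - {"0", "1"}:
        raise ValueError("processor map must contain only '0' and '1'")
    return [index for index, bit in enumerate(processor_map) if bit == "1"]


# The example from the slide: the job may use processors 3, 4 and 5.
print(allowed_processors("00011100"))  # [3, 4, 5]
```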

  10. Two Adaptive Jobs
      [Diagram: 16-processor system. Event labels: Allocate A; Allocate B; Shrink A; B finishes; A expands. Job labels: Job A; Job B; Min_pe = 8, Max_pe = 16; Min_pe = 1, Max_pe = 10]

  11. Demonstration

  12. Outline
      • High-level
        • Faucets, adaptive jobs and the Adaptive Queuing System (AQS)
        • Demo
      • Usage and Installation
        • How to write an adaptive program
        • Installing and using the AQS
        • Adding your cluster to an existing Faucets server
        • Installing a Faucets server

  13. Adaptive Jobs

  14. Adaptive Job Framework
      [Diagram labels: Adaptive Application; Scheduler; Proc. Map; AMPI; CHARM++; Loadbalancer; Converse]
      • Applications written in MPI or Charm++
      • Scheduler controls the processor map for each job
      • Processor map is used by the job’s load balancer

  15. Charm++
      • Charm++: object-based virtualization
        • Program written as a large number of objects which can migrate
        • Number of objects typically much larger than the number of processors
      • Load balancer can remap objects
        • Measurement-based load balancing

  16. Adaptive Charm++ Programs
      • A Charm++ program is automatically adaptive if a shrink-expand-enabled centralized load-balancing strategy is used
      • Currently CommLB and RandcentLB are shrink-expand enabled
      • Compile with +balancer CommLB

  17. MPI Jobs
      • How do we make MPI jobs adaptive?
        • AMPI
      • AMPI maps the MPI processes to user-level threads which can migrate
      • Each thread is embedded in a Charm++ object, thus allowing load balancing and shrink-expand

  18. Writing Adaptive AMPI Programs
      • Build AMPI with an adaptive load-balancing strategy
      • Call MPI_MIGRATE() at regular intervals in each MPI process; otherwise the job will not respond to changes in the processor map

  19. Performance Results for Adaptive Jobs

  20. Shrink Expand Overhead

      Processors   Shrink Time (s)   Expand Time (s)
      128 / 64     0.61              0.50
      64 / 32      0.66              0.54
      32 / 16      0.59              0.46
      16 / 8       0.56              0.49

      Performance for MD program with 10 MB migrated data per processor on NCSA Platinum

  21. Residual Processes
      • Shrink
        • Objects are moved from the unallocated processors to the allocated processors
        • Leaves behind a residual process
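The object movement during a shrink can be pictured with a toy remapping routine. The sketch below is not the CommLB or RandcentLB strategy: the function `shrink`, the object names, and the "fewest objects wins" rule are all assumptions made for illustration, whereas the real load balancers are measurement based.

```python
def shrink(placement: dict, allowed: set) -> dict:
    """Move objects off processors that are no longer allocated.

    placement maps an object name to the processor it currently lives on.
    Objects already on an allowed processor stay put; every other object is
    sent to the allowed processor that currently holds the fewest objects.
    """
    if not allowed:
        raise ValueError("at least one processor must remain allocated")

    load = {proc: 0 for proc in allowed}
    new_placement = {}

    # First pass: keep objects that already sit on an allowed processor.
    for obj, proc in placement.items():
        if proc in allowed:
            new_placement[obj] = proc
            load[proc] += 1

    # Second pass: evacuate objects from the unallocated processors.
    for obj, proc in placement.items():
        if proc not in allowed:
            target = min(sorted(allowed), key=lambda p: load[p])
            new_placement[obj] = target
            load[target] += 1

    return new_placement


# Four objects on four processors; the job is shrunk to processors 0 and 1.
before = {"a": 0, "b": 1, "c": 2, "d": 3}
print(shrink(before, {0, 1}))  # {'a': 0, 'b': 1, 'c': 0, 'd': 1}
```

After the shrink no object lives on processors 2 and 3, although, as the slide notes, a residual process is still left behind on them.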

  22. Effect of Residual Process
      [Plots: Utilization (%) vs. Time (s). Captions: Performance on a 16-processor system; Performance of Job1 and Job2]

  23. Adaptive Queuing System


  25. AQS Features
      • Multithreaded
      • Reliable and robust
        • Tested on the cool.cs Linux cluster at PPL
      • Supports most features of standard queuing systems
      • Has the ability to manage adaptive jobs, currently implemented in Charm++ and MPI
      • For more details, see http://charm.cs.uiuc.edu/research/faucets/faucets.html

  26. Components
      • Database
      • Job Scheduler
      • Compute Cluster

  27. Installing Database
      • Download the latest version of MySQL
        • http://www.mysql.com/
      • Install, then:
          mysql> create database <dbname>;
          mysql> use <dbname>;
          mysql> create table jobInfo (id mediumint primary key NOT NULL DEFAULT '0' auto_increment, …..)
          mysql> grant all on *.* to <user> identified by <passwd>;

  28. Installing Scheduler
      • cd charm/net-linux/pgms/scheduler;
      • make scheduler; make client;
        • Edit Makefile, put correct path to MySQL
      • Running scheduler as root
        • su
        • chown root scheduler;
        • chmod +s scheduler
        • ./startScheduler

  29. Installing Scheduler, contd.
      • Edit the startScheduler file:
        • Set Database to match the <dbname> used earlier
        • Set PORT to the port of the scheduler
        • Set DATABASE_HOST, DATABASE_USER and DATABASE_PASSWD to the database host, user and password
        • NODELIST points to the nodelist for the scheduler

  30. Configuring the Cluster
      • Users must have access to the cluster only through the queuing system
      • Each node runs an rsh daemon
        • Access to rsh through a restrictive group
        • Job switches to the rsh group before running the job
        • Only the head node can rsh to the other nodes
        • rsh disabled on the compute nodes
      • All connections through Unix sockets

  31. Using the AQS locally
      • frun runs a job interactively
      • fsub submits a batch job
      • fkill kills the job
      • fjobs lists the running and queued jobs

  32. Scheduling Events
      • When:
        • Job arrival
        • Job completion
        • Job requests change of number of processors
        • Job suspension
      • Scheduling Strategy
        • A pluggable component that makes decisions on which jobs to schedule

  33. Scheduling Strategy Studied
      • Similar to equipartitioning [N Islam et al]
      • On job arrival and job completion
        • All running jobs and the new one are allocated their minimum number of processors
        • Leftover processors are shared equally subject to each job's maximum processor usage
        • If it is not possible to allocate the new job its minimum number of processors, it is queued
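The three rules of this strategy can be sketched in Python as follows. This is a minimal illustration written from the slide text, not the AQS source: the `Job` class, the `equipartition` function, the job names and their Min_pe/Max_pe values are hypothetical, and "shared equally" is interpreted here as handing out leftover processors one at a time in round-robin order.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Job:
    name: str
    min_pe: int  # fewest processors the job can run on
    max_pe: int  # most processors the job can use


def equipartition(running: List[Job], new_job: Optional[Job],
                  total_pe: int) -> Tuple[Dict[str, int], bool]:
    """Return (processors per job, whether the new job was queued)."""
    jobs = list(running)
    queued = False
    if new_job is not None:
        # Admit the new job only if every job can still get its minimum.
        if sum(j.min_pe for j in running) + new_job.min_pe <= total_pe:
            jobs.append(new_job)
        else:
            queued = True

    # Rule 1: every admitted job starts at its minimum.
    alloc = {j.name: j.min_pe for j in jobs}
    spare = total_pe - sum(alloc.values())

    # Rule 2: share leftovers equally, never exceeding a job's maximum.
    while spare > 0:
        growable = [j for j in jobs if alloc[j.name] < j.max_pe]
        if not growable:
            break
        for j in growable:
            if spare == 0:
                break
            alloc[j.name] += 1
            spare -= 1
    return alloc, queued


# Hypothetical jobs on a 16-processor system.
job1 = Job("job1", min_pe=4, max_pe=16)
job2 = Job("job2", min_pe=2, max_pe=6)
print(equipartition([job1], job2, 16))  # ({'job1': 10, 'job2': 6}, False)
print(equipartition([job1], None, 16))  # ({'job1': 16}, False)
```

Calling the same function with `new_job=None` models a job completion: the remaining jobs are simply re-expanded over the freed processors.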

  34. Scheduler Performance

                         Adaptive Jobs               Traditional Jobs
      1/λ (s)   lf     MRT (s)   Utilization (%)   MRT (s)   Utilization (%)
      500       0.13   68        13                165       9
      200       0.32   76        31                185       23
      100       0.65   96        60                233       46
      64.5      1.0    143       88                396       71
      60        1.08   164       92                488       76

      Simulation results on 64 processors with mean job execution time of 64.5 sec
      λ = arrival rate, MRT = mean response time, Utilization = processor utilization, load factor (lf) = execution time × λ
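The load-factor column follows directly from the definition lf = execution time × λ. The short check below recomputes it from the 1/λ values and the 64.5 s mean execution time quoted on the slide; the function name `load_factor` is just an illustrative choice.

```python
MEAN_EXECUTION_TIME_S = 64.5  # mean job execution time from the slide


def load_factor(mean_interarrival_s: float,
                mean_execution_s: float = MEAN_EXECUTION_TIME_S) -> float:
    """lf = execution time * lambda, where lambda = 1 / mean inter-arrival time."""
    arrival_rate = 1.0 / mean_interarrival_s
    return mean_execution_s * arrival_rate


# Recompute the lf column for each 1/lambda value in the table.
for interarrival in (500, 200, 100, 64.5, 60):
    print(f"1/lambda = {interarrival:>5} s  ->  lf = {load_factor(interarrival):.3f}")
```

A load factor of 1.0 means that jobs arrive, on average, exactly as fast as a single job completes.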

  35. Experimental Results

                         Adaptive Jobs               Traditional Jobs
      1/λ (s)   lf     MRT (s)   Utilization (%)   MRT (s)   Utilization (%)
      500       0.12   89        17                109       9
      200       0.3    70        29                108       23
      100       0.6    76        68                116       49
      60        1.0    211       99                303       74

      Experiments on a Linux cluster with 64 processors and mean job execution time of 60 sec

  36. Adding a Cluster to Faucets


  38. Adding new cluster
      • Prerequisites
        • Install Charm++
        • Install Adaptive Queuing System
      • Then
        • Download the Faucets software
          • http://charm.cs.uiuc.edu/
        • Compile the cluster daemon (CD)
          • cd faucets/cd; make
        • Run the cluster daemon (CD)
          • cd ..
          • java cd.ClusterDaemon <central server> <central server port> -p <ClusterDaemon port> <working dir>

  39. Installing a Faucets Server


  41. Installing a Faucets Server
      • Install MySQL
        • create tables
        • grant permissions
      • Download JDBC driver
        • http://mmmysql.sourceforge.net/
      • Install CS
        • download faucets code and unpack
        • cd faucets/cs; make
        • Edit faucets/cs/db.properties
        • cd faucets
        • java -cp .:/path/to/mm.mysql-2.0.8-bin.jar TheServer

  42. Installing Appspector
      • Installation is a little involved
        • Each application needs a display module written in Java
      • Contact us if you want to install

  43. Summary and Future Work
      • Demonstrated the system
      • Showed you how to use and install the Charm++/AMPI adaptive job system
        • Go to http://charm.cs.uiuc.edu/research/faucets to download
      • Future
        • Extend the system to other parallel machines
        • Eliminate residual processes
        • Integrate the scheduler with Globus
        • More comprehensive QoS contracts being developed
        • Sophisticated bidding schemes for the Faucets framework
