
Achieving Application Performance on the Computational Grid


Presentation Transcript


  1. Achieving Application Performance on the Computational Grid • Francine Berman, U. C. San Diego and NPACI

  2. The Computing Landscape • wireless • PCs • workstations • clusters • MPPs • data archives • networks • visualization • instruments

  3. Computing Platforms • Combining resources “in the box” • focus is on new hardware • Combining resources as a “virtual box” • focus is on software infrastructure

  4. The Computational Grid Computational Grid = ensemble of distributed and heterogeneous resources Metaphor: Electric Power Grid • for users, power is ubiquitous • you can plug in anywhere • you don’t need to know where the power is coming from

  5. Better Toast • On the electric power grid, power is either adequate or it’s not • On the computational grid, application performance depends on the underlying system state • Major Grid research and development thrusts: • Building the Grid • Programming the Grid

  6. Programming for Performance • Performance Paradigm: To achieve performance, applications must be designed and implemented to leverage the performance characteristics of the underlying resources. • Performance Characteristics of the Grid • Resources are distributed, heterogeneous • Resources are shared by multiple users • Resource performance may be hard to predict

  7. How Can Applications Achieve Performance on the Grid? • Build programs to be grid-aware • Leverage deliverable resource performance during execution • Scheduling is fundamental • Key Grid scheduling components • dynamic information • quantitative and qualitative predictions • adaptivity

  8. Achieving Application Performance • Many entities will schedule the application • [Diagram: Grid Application Development System: source application, PSE, whole-program compiler, configurable object program, scheduler, service negotiator, dynamic optimizer, libraries, Grid runtime system, real-time performance monitor, with negotiation and performance feedback between components]

  9. Application Scheduling • Application schedulers must • perceive the performance impact of system resources on the application • adapt the application execution schedule to dynamic conditions • optimize the application schedule for the Grid according to the user’s performance criteria • The application scheduler is tasked with promoting application performance over the performance of other applications and system components

  10. Paradigm for Application Scheduling • Self-Centered Scheduling: Everything in the system is evaluated in terms of its impact on the application. • performance of each system component can be considered as a measurable quantity • forecasts of quantities relevant to the application can be manipulated to determine the schedule • This simple paradigm forms the basis for AppLeS.

  11. AppLeS • AppLeS = Application-Level Scheduler • joint project with Rich Wolski at U. Tenn. • agent-based approach • each application integrated with its own AppLeS • each AppLeS develops and implements a custom application schedule • [Diagram: AppLeS architecture: user preferences, application performance model, resource selector, and planner, informed by the NWS (Wolski), acting on actual resources/infrastructure in the Grid/cluster]

  12. AppLeS Approach • Select resources • For each feasible resource set, plan a schedule • For each schedule, predict application performance at execution time, considering both the prediction and its qualitative attributes • Implement the “best” of the schedules with respect to the user’s performance criteria (execution time, convergence, turnaround time); a code sketch of this loop follows • [Diagram: the AppLeS architecture from slide 11: NWS (Wolski), user preferences, application performance model, resource selector, planner, actual resources/infrastructure, Grid/cluster]
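
A minimal sketch of that loop, assuming simplified inputs; the function and field names here (make_plan, cpu_forecast, and so on) are illustrative stand-ins, not the real AppLeS interfaces:

```python
# Illustrative AppLeS-style loop: plan a schedule per feasible resource
# set, predict each schedule's performance at execution time, and
# implement the best one with respect to the user's criterion.

def predicted_time(plan):
    # Stand-in performance model: compute time plus transfer time,
    # both derived from (assumed) NWS-style forecasts.
    return (plan["work"] / plan["cpu_forecast"]
            + plan["bytes"] / plan["bw_forecast"])

def best_schedule(feasible_resource_sets, make_plan, criterion=predicted_time):
    plans = [make_plan(rs) for rs in feasible_resource_sets]  # plan each set
    return min(plans, key=criterion)                          # predict + pick
```

A user whose criterion is turnaround time or convergence rather than execution time would simply pass a different criterion function.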

  13. Network Weather Service (Wolski, U. Tenn.) • The NWS provides dynamic resource information for AppLeS • The NWS is a stand-alone system • The NWS monitors current system state and provides the best forecast of resource load from multiple models (sketched below) • [Diagram: sensor interface feeding multiple models, a forecaster, and a reporting interface]
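
The multiple-model idea can be sketched as follows; the three models below (last value, running mean, sliding median) and the error bookkeeping are illustrative assumptions, not the NWS's actual model suite:

```python
# Keep several simple predictors; report the forecast of whichever
# predictor has been most accurate on the measurements seen so far.
from statistics import mean, median

MODELS = {
    "last":   lambda h: h[-1],           # last measured value
    "mean":   lambda h: mean(h),         # running mean
    "median": lambda h: median(h[-10:]), # median of a sliding window
}

def best_forecast(history):
    """Score each model by its absolute error on past measurements,
    then return the forecast of the best-scoring model."""
    errors = {name: 0.0 for name in MODELS}
    for t in range(1, len(history)):
        for name, model in MODELS.items():
            errors[name] += abs(model(history[:t]) - history[t])
    winner = min(errors, key=errors.get)
    return MODELS[winner](history)

print(best_forecast([10.0, 11.0, 10.5, 10.7]))  # e.g., bandwidth readings
```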

  14. Using Forecasting in Scheduling • How much work should each of the processors P1, P2, P3 be given? • The Jacobi2D AppLeS solves equations for each processor’s area (reconstructed in the sketch below)
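
The equations themselves appear only in the slide image, but the idea can be reconstructed: choose each processor's strip area A_i so that the predicted times A_i / speed_i are equal and the areas cover the whole grid. A hedged sketch, treating forecast speeds as given inputs (the real AppLeS model also accounts for communication):

```python
# Balance strip areas so all processors are predicted to finish together:
# A_i / speed_i is constant, and the A_i sum to the full grid area.

def strip_areas(total_area, speed_forecasts):
    total_speed = sum(speed_forecasts)
    return [total_area * s / total_speed for s in speed_forecasts]

# e.g., a 600x600 grid over three processors with forecast speeds 2:1:1
print(strip_areas(600 * 600, [2.0, 1.0, 1.0]))  # [180000.0, 90000.0, 90000.0]
```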

  15. Good Predictions Promote Good Schedules • Jacobi2D experiments

  16. SARA: An AppLeS-in-Progress • SARA = Synthetic Aperture Radar Atlas • application developed at JPL and SDSC • Goal: Assemble/process files for the user’s desired image • thumbnail image shown to user • user selects desired bounding box for more detailed viewing • SARA provides the detailed image in a variety of formats

  17. Simple SARA • AppLeS focuses on the resource selection problem: Which site can deliver the data fastest? (sketched below) • Goal is to optimize performance by minimizing transfer time • Code developed by Alan Su • [Diagram: one compute server and several data servers on a network shared by a variable number of users; computation is assumed to be done at compute servers; compute and data servers are logical entities, not necessarily different nodes]
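
A sketch of that selection decision, assuming NWS-style bandwidth and latency forecasts per server; the numbers below are made up for illustration:

```python
# Pick the data server with the smallest predicted transfer time
# for the requested file, given per-server forecasts.

def pick_server(file_bytes, forecasts):
    """forecasts: {server: (bandwidth_bytes_per_sec, latency_sec)}"""
    def transfer_time(server):
        bw, lat = forecasts[server]
        return lat + file_bytes / bw
    return min(forecasts, key=transfer_time)

# e.g., a 3 MB image over two candidate servers (illustrative forecasts)
print(pick_server(3_000_000, {
    "spin.cacr.caltech.edu": (500_000.0, 0.030),
    "lolland.cc.gatech.edu": (250_000.0, 0.060),
}))
```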

  18. Experimental Setup • Data for image accessed over shared networks • Data sets 1.4 - 3 megabytes, representative of SARA file sizes • Servers used for experiments: lolland.cc.gatech.edu, sitar.cs.uiuc, perigee.chpc.utah.edu, mead2.uwashington.edu, spin.cacr.caltech.edu (some reached via the vBNS, others via the general Internet, as labeled on the slide)

  19. Which is “Closer”? • Sites on the east coast or sites on the west coast? • Sites on the vBNS or sites on the general Internet? • Consistently the same site or different sites at different times?

  20. Which is “Closer”? • Sites on the east coast or sites on the west coast? • Sites on the vBNS or sites on the general Internet? • Consistently the same site or different sites at different times? Depends a lot on traffic ...

  21. Preliminary Results • Experiment with the larger data set (3 Mbytes) • During this time frame, the general Internet mostly delivered data faster than the vBNS

  22. More Preliminary Results • Experiment with the smaller data set (1.4 Mbytes) • During this time frame, east coast sites mostly delivered data faster than west coast sites

  23. 9/21/98 Experiments • The Clinton Grand Jury webcast commenced at trial 62

  24. What if File Sizes are Larger? • Storage Resource Broker (SRB) • SRB provides access to distributed, heterogeneous storage systems • UNIX, HPSS, DB2, Oracle, ... • files can be 16MB or larger • resources accessed via a common SRB interface

  25. An SRB AppLeS • Being developed by Marcio Faerman • Like Simple SARA, the SRB AppLeS focuses on resource selection • The NWS probe is 64K, while the SRB file size is 16MB • How to predict SRB file transfer time? • [Diagram: SRB client with AppLeS and the Network Weather Service, connected over the network to an SRB server, MCAT, and distributed physical storage]

  26. Predicting Large File Transfer Times • NWS and SRB measurements present distinct behaviors • Current approach: use linear regression on NWS bandwidth measurements to track SRB behavior (sketched below) • [Plots: NWS vs. SRB behavior, Dec 3 and Dec 11]
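
A sketch of that regression idea with made-up numbers; the actual fit is over NWS 64K-probe bandwidth measurements and observed 16MB SRB transfers, and the slide does not show the implementation details:

```python
# Fit observed SRB transfer times against concurrent NWS bandwidth
# readings with ordinary least squares, then predict the next transfer
# from the latest NWS measurement.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx  # slope, intercept

nws_bw   = [1.2, 0.9, 1.5, 1.1]         # NWS bandwidth readings (illustrative)
srb_time = [110.0, 150.0, 90.0, 120.0]  # observed SRB transfer times (illustrative)
a, b = fit_line(nws_bw, srb_time)
print(a * 1.3 + b)  # predicted SRB transfer time at the latest NWS reading
```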

  27. Distributed Data Applications • Simple SARA and SRB are representative of a larger class of distributed data applications • Goal is to develop an AppLeS scheduler for “end-to-end” applications • Key questions: Move the computation or move the data? Which compute servers to use? Which servers to use for multiple files? (a cost-comparison sketch follows) • [Diagram: a client connected to multiple compute servers and data servers]
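
One way to frame the move-the-computation-or-move-the-data question is a simple cost comparison; everything below (forecast bandwidth, compute-time estimates) is an assumed input from NWS-style forecasts and an application performance model, not an interface the slides define:

```python
# Compare shipping the data to a fast compute server against running
# the computation where the data already lives.

def best_placement(data_bytes, bw_forecast, t_fast_compute, t_local_compute):
    move_data = data_bytes / bw_forecast + t_fast_compute  # ship data, compute fast
    move_comp = t_local_compute                            # compute at the data site
    return "move data" if move_data < move_comp else "move computation"

# e.g., a 16 MB file, 1 MB/s forecast, 20 s remote vs. 45 s local compute
print(best_placement(16_000_000, 1_000_000.0, 20.0, 45.0))  # -> "move data"
```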

  28. A Bushel of AppLeS … almost • During the first “phase” of the project, we’ve focused on developing AppLeS applications • Jacobi2D • DOT • SRB • Simple SARA • magnetohydrodynamics • CompLib • INS2D • Tomography, ... • What have we learned?

  29. Lessons Learned from AppLeS • Dynamic information is critical • [Jacobi2D: compile-time blocked partitioning vs. run-time AppLeS non-uniform strip partitioning]

  30. Lessons Learned from AppLeS • Program execution and parameters may exhibit a range of performance

  31. Lessons Learned from AppLeS • Knowing something about the “goodness” of performance predictions can improve scheduling • [SOR and CompLib experiments]

  32. Lessons Learned from AppLeS • Performance of application sensitive to scheduling policy, data, and system characteristics

  33. Achieving Application Performance on the Grid • AppLeS uses adaptivity to leverage deliverable resource performance • Performance impact of all components is considered • AppLeS agents target dynamic, multi-user distributed environments • AppLeS is a leading project in application scheduling

  34. Related Work • Application Schedulers • Mars, Prophet/Gallop, VDCE, ... • Scheduling Services • Globus GRAM • Resource Allocators • I-Soft, PBS, LSF, Maui Scheduler, Nile, Legion • PSEs • Nimrod, NEOS, NetSolve, Ninf • High-Throughput Schedulers • Condor • Performance Steering • Autopilot, SCIRun

  35. New Directions • AppLeS Templates • distributed data applications • parameter sweeps • master/slave applications • data parallel stencil applications • [Diagram: AppLeS Template Retargeting Engineering Environment, with APIs connecting application, performance, scheduling, and deployment modules; Network Weather Service; dynamic benchmarking; suite selection]

  36. New Directions • Expanding AppLeS target execution sites • interactive clusters • Linux, NT • Globus, Legion • batch systems • high-throughput clusters (Condor) • all of the above • [Diagram: AppLeS / SCHED]

  37. New Directions • Real World Scheduling • scheduling with • partial information • poor information • dynamically changing information • Multischeduling • resource economies • scheduling “social structure”

  38. The Brave New World • Design, development, and execution of grid-aware applications • [Diagram: the Grid Application Development System from slide 8: source application, PSE, whole-program compiler, configurable object program, scheduler, service negotiator, dynamic optimizer, libraries, Grid runtime system, real-time performance monitor, performance feedback]

  39. The AppLeS Project • AppLeS Corps: Fran Berman (UCSD), Rich Wolski (U. Tenn), Henri Casanova, Walfredo Cirne, Marcio Faerman, Jaime Frey, Jim Hayes, Graziano Obertelli, Gary Shao, Shava Smallen, Alan Su, Dmitrii Zagorodnov • Thanks to NSF, NASA, NPACI, DARPA, DoD • AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html
