MyGrid: A User-Centric Approach for Grid Computing

Walfredo Cirne, Universidade Federal da Paraíba

Presentation Transcript


  1. MyGrid: A User-Centric Approach for Grid Computing Walfredo Cirne Universidade Federal da Paraíba

  2. High-Performance Computing • High-Performance Computing means running faster than the typical machine du jour • The unbeatable price/performance of microprocessors has killed specialized high-performance machines • Therefore, parallelism is currently the way to do High-Performance Computing • Parallel supercomputers

  3. Solving a Real Problem • I had hundreds of thousands of independent simulations to run • Parallel supercomputers are typically: • hard to get access to • slow (too much time in the queue) • Since my simulations were independent, I had the perfect application for the Computational Grid

  4. Grid Computing • Grid Computing aims to enable the execution of parallel applications over processors that are: • Geographically distributed • Under multiple administrative domains • Not dedicated • The potential for resource gathering is enormous • “Let's run over the Internet”

  5. Grid Applications • Not all applications can benefit from the Grid • Loosely coupled applications match the Grid characteristics much better than tightly coupled applications

  6. State of the Art in Grid Computing • Most services are provided by the Grid Infrastructure • Naming, remote execution/task control, security, etc. • Scheduling is done at the application level • Globus • “Virtual Organizations”

  7. Back to the Real Problem • I had hundreds of thousands of independent simulations to run • I was working in a top research lab in Grid Computing • I could not manage to use the Grid • It is hard to get the Grid Infrastructure Software installed everywhere

  8. The Motivation for MyGrid • Users of loosely coupled applications could benefit from the Grid now • However, they don't run on the Grid today because the Grid Infrastructure is not widely deployed • What if we build a solution at the user level? That is, a solution that does not depend upon installed infrastructure?

  9. MyGrid • MyGrid is a framework to build infrastructure-independent grid applications • The user provides: • A description of her Grid • A way to do remote execution and file transfer • “The application” • MyGrid provides: • Grid abstractions • Scheduling

  10. MyGrid Goals • open = does not require a particular infrastructure • self-installable = does not require manual installation on a given machine • extensible = simple to add refinements • complete = covers the whole production cycle

  11. MyGrid Concepts • Job = set of independent tasks • Tasks have three pieces: init, remote and final • Home machine vs. grid machine • Grid abstractions: • remote execution • file transfer • playpen • mirroring

  12. Defining My Personal Grid
      bagre.dsc.ufpb.br dsc, linux
        ssh %machine %command
        scp %localdir/%file %machine:%remotedir
        scp %machine:%remotedir/%file %localdir
      traira.dsc.ufpb.br dsc, linux
        ssh %machine %command
        scp %localdir/%file %machine:%remotedir
        scp %machine:%remotedir/%file %localdir
      quidam.ucsd.edu cse, linux
        ssh %machine %command
        scp %localdir/%file %machine:%remotedir
        scp %machine:%remotedir/%file %localdir
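The command templates on slide 12 suggest plain placeholder substitution. The following is a minimal Python sketch of how one such entry might be expanded into concrete commands. The dictionary layout, the `expand` helper, and the `/tmp/playpen` directory are illustrative assumptions and are not taken from MyGrid itself.

```python
import re

# Hypothetical in-memory form of one grid-description entry.
# The field names are illustrative; only the values come from slide 12.
MACHINE = {
    "name": "bagre.dsc.ufpb.br",
    "attributes": ["dsc", "linux"],
    "remote_exec": "ssh %machine %command",
    "copy_to": "scp %localdir/%file %machine:%remotedir",
    "copy_from": "scp %machine:%remotedir/%file %localdir",
}


def expand(template, **values):
    """Replace every %name placeholder in a single pass.

    A missing placeholder value raises KeyError, which makes an
    incomplete description fail early instead of producing a bad command.
    """
    return re.sub(r"%([a-z]+)", lambda m: values[m.group(1)], template)


# Build the command that would run `hostname` on the grid machine.
exec_cmd = expand(MACHINE["remote_exec"],
                  machine=MACHINE["name"], command="hostname")

# Build the command that would copy Fat.class into an assumed remote directory.
copy_cmd = expand(MACHINE["copy_to"], localdir=".", file="Fat.class",
                  machine=MACHINE["name"], remotedir="/tmp/playpen")

print(exec_cmd)
print(copy_cmd)
```

Because the user supplies the templates, the same substitution would work for any remote execution or file transfer tool, which is in the spirit of the "open" goal on slide 10.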

  13. Factoring with MyGrid • Fatora n generates tasks, init, remote_i, and collect • User runs mygrid.ui.AddTask < tasks • Contents of tasks:
      task: init= init remote= remote1 final= collect processor= linux playpensize= 0 cost = 1
      task: init= init remote= remote2 …

  14. Factoring with MyGrid
      init: java mygrid.ui.MyGridUI p $PROC ./Fat.class $PLAYPEN
      remote1: java Fat 3 18655 34789789798 output-$TASK
      remote2: java Fat 18655 37307 34789789798 output-$TASK
      collect: java mygrid.ui.MyGridUI g $PROC "" $PLAYPEN saida-$TASK .

  15. Running a MyGrid Task • [Diagram. Components: Home Machine, Grid Machine, Task Manager, User Agent Server, User Agent Daemon. Labels: add-task (1); task-done (4); home stasks; grid stask; remote exec; playpen, file xfer, and remote exec. Step numbers shown: (1), (2), (3), (3a), (3b), (3c), (4)]

  16. User Agent • User Agent provides the grid abstractions • User Agent Daemon runs on grid machines • User Agent Server runs on home machines • The Daemon and the Server rely upon public-key cryptography to authenticate each other

  17. Self-Installation • We are working on having MyGrid install and start up User Agents everywhere • The user provides a way to do remote execution and file transfer to make that possible

  18. Scheduling in MyGrid • Grid scheduling is application dependent and effort intensive • Most people don't want to spend months writing good schedulers for their applications • MyGrid provides a sensible default scheduler • The user can of course replace the default scheduler

  19. Default Scheduler • How to provide good performance with no knowledge about the application or the current state of the Grid? • The key is to avoid having the job wait for a task that runs on a slow/loaded machine • Task replication is our answer to this problem • Task replication is only done when the job has no other tasks to start
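The replication idea on slide 19 can be illustrated with a small simulation. This is a sketch under stated assumptions rather than MyGrid's actual scheduler: the `run_job` function, the fixed machine speeds, and the rule of replicating the task with the fewest running copies are all hypothetical choices made for the example.

```python
import heapq


def run_job(task_work, machine_speeds, replicate=True):
    """Simulate a bag of independent tasks and return the job's finish time.

    task_work[i] is the amount of work in task i; machine_speeds[m] is
    work per time unit on machine m. Both are assumed inputs: the real
    scheduler knows neither, which is why it falls back on replication.
    """
    pending = list(range(len(task_work)))  # tasks not yet started
    running = {}                           # task -> machines running a copy
    done = set()
    events = []                            # heap of (finish_time, machine, task)
    makespan = 0.0

    def assign(machine, now):
        if pending:
            task = pending.pop(0)
        elif replicate and running:
            # Replicate only when no unstarted task is left (slide 19).
            task = min(running, key=lambda t: (len(running[t]), t))
        else:
            return  # nothing to do; the machine stays idle
        running.setdefault(task, set()).add(machine)
        finish = now + task_work[task] / machine_speeds[machine]
        heapq.heappush(events, (finish, machine, task))

    for machine in range(len(machine_speeds)):
        assign(machine, 0.0)

    while events:
        now, machine, task = heapq.heappop(events)
        if task in done:
            continue  # stale event from a replica that was cancelled
        done.add(task)
        makespan = now
        # The first copy to finish wins; free every machine running this task.
        for freed in sorted(running.pop(task)):
            assign(freed, now)
    return makespan


# Three equal tasks, two fast machines and one machine eight times slower.
without_replication = run_job([10, 10, 10], [1.0, 1.0, 0.125], replicate=False)
with_replication = run_job([10, 10, 10], [1.0, 1.0, 0.125], replicate=True)
print(without_replication, with_replication)
```

In this toy setting the job finishes at time 80.0 without replication, because it waits for the slow machine, and at time 20.0 with replication. The price is duplicated work on the fast machines, which is the resource-consumption question raised under Future Work.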

  20. Preliminary Results • During a 40-day period, we ran 600,000 simulations using 178 processors located in 6 different administrative domains widely spread in the USA • MyGrid took 16.7 days to run the simulations • My desktop machine would have taken 5.3 years to do so • Speed-up is 115.8 for 178 processors
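The speed-up figure on slide 20 can be checked from the other numbers on the slide. The short calculation below assumes a 365-day year, which the slide does not state, and adds parallel efficiency (speed-up divided by processor count) as a derived quantity that the slide itself does not report.

```python
DAYS_PER_YEAR = 365  # assumption: the slide does not say how years were converted

desktop_days = 5.3 * DAYS_PER_YEAR  # estimated time on a single desktop machine
grid_days = 16.7                    # time MyGrid took
processors = 178

speedup = desktop_days / grid_days
efficiency = speedup / processors   # derived here, not reported on the slide

print(f"speed-up ~ {speedup:.1f}, efficiency ~ {efficiency:.2f}")
```

Under that assumption the ratio reproduces the reported speed-up of about 115.8, and the efficiency comes out at roughly 0.65.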

  21. Conclusions • Running Grid Applications at the user-level is a viable strategy • Bag-of-tasks parallel applications can currently benefit from the Grid • Is “upperware” the way to go for new middleware development?

  22. Future Work • Turn MyGrid into production-quality software • Investigate the impact of task replication on resource consumption • Develop a default scheduler for data-intensive applications • Such a scheduler should try to minimize data movement
