
Resource management system for distributed environment

Resource management system for distributed environment. B4. Nguyen Tuan Duc. Background. Emerging need for resource management system of clusters / grids Several systems exist, but have problems… Portable Batch System Sun Grid Engine …. Goal. Flexible resource management system


Presentation Transcript


  1. Resource management system for distributed environment B4. Nguyen Tuan Duc

  2. Background • Emerging need for resource management systems for clusters / grids • Several systems exist, but have problems… • Portable Batch System • Sun Grid Engine • …

  3. Goal • Flexible resource management system • Support clusters, grids • Fair-share scheduling • Maximize utilization of resources • Support parallel applications • Reduce load aggregation

  4. Agenda • Background • Goal • Related works • Proposed method • Problems

  5. Related works • Portable Batch System (MRJ 1990s) • Batch queuing system • Automatic load-balancing • Parallel jobs support • Job accounting

  6. Portable Batch System (PBS)

  7. Sun Grid Engine • Batch queuing system by Sun Microsystems • Same features as PBS, plus • Job checkpointing • Several add-ons

  8. Problems of batch queuing systems • Resource utilization • Load aggregation • Server accepts too many requests from clients • Limits of the execution model • Cannot fork, since a process created with fork() does not go into the queue • …

  9. Saito Dai’s system (STDS) • Flexible Resource Management System for Widely Distributed Environment (2006) • No load aggregation • Job scheduling on each node • Independent of the execution model (fork, … OK) • Supports parallel jobs

  10. STDS structure • Two main components • Node searching system (graph searching) • Scheduler (on each node) • Scheduler • Daemon on each node • CPU fair-sharing by ‘nice’ • Node searching system • Create graph from links • Node search → graph search
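A minimal sketch of how node search can be treated as graph search, as the slide above describes. It is not STDS code: the function name `find_idle_nodes`, the `links` and `load` dictionaries, the choice of breadth-first traversal, and the `max_load` threshold are all assumptions made for illustration.

```python
from collections import deque

def find_idle_nodes(links, load, start, wanted, max_load=0.5):
    """Breadth-first search from `start` for up to `wanted` lightly loaded nodes.

    links: dict mapping a node name to the nodes it links to (hypothetical layout)
    load:  dict mapping a node name to its CPU load in [0, 1] (hypothetical)
    """
    found = []
    seen = {start}
    queue = deque([start])
    while queue and len(found) < wanted:
        node = queue.popleft()
        if load[node] <= max_load:
            found.append(node)
        for neighbour in links.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return found

# Toy graph: "a" is busy, so the search walks its links to find idle nodes.
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
load = {"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.1}
idle = find_idle_nodes(links, load, "a", wanted=2)
```

Because every node only needs to know its own links, no central server has to hold the state of the whole cluster.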

  11. STD node searching system

  12. Our approach • Similar to STD system • Node searching system • Scheduler on each node • But different in … • Node search: no graph searching • Scheduler: kernel scheduler with user accounting (budget scheduler)

  13. Scheduler: Budget scheduling • Budget scheduling • Normal queue & budget queue • Normal queue for interactive processes • Linux 2.6 default scheduler • Budget queue for CPU-hogging processes • Automatic detection of CPU-intensive processes • http://www.logos.ic.i.u-tokyo.ac.jp/~duc/pre/1107.ppt
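A rough user-space sketch of the two-queue idea above. The slides do not say how CPU-intensive processes are detected, so the rule used here (CPU time divided by elapsed time, compared against a `threshold` of 0.8) is purely an assumption, as are the names `classify` and `split_queues`. The real scheduler is described as running in the kernel; this only illustrates the classification step.

```python
def classify(cpu_seconds, wall_seconds, threshold=0.8):
    """Return 'budget' for a CPU-hogging process, 'normal' otherwise.

    The CPU-share heuristic and the threshold are illustrative assumptions.
    """
    if wall_seconds <= 0:
        return "normal"
    share = cpu_seconds / wall_seconds
    return "budget" if share >= threshold else "normal"

def split_queues(samples, threshold=0.8):
    """Sort processes into a normal queue and a budget queue.

    samples: dict mapping pid -> (cpu_seconds, wall_seconds) over a sampling window
    """
    queues = {"normal": [], "budget": []}
    for pid, (cpu, wall) in samples.items():
        queues[classify(cpu, wall, threshold)].append(pid)
    return queues

# pid 102 used 9.5 s of CPU in a 10 s window, so it lands in the budget queue.
samples = {101: (0.1, 10.0), 102: (9.5, 10.0), 103: (4.0, 10.0)}
queues = split_queues(samples)
```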

  14. Node searching system • Client-server model • Daemon on each node • Daemon reports CPU state (number of processes, CPU utilization, …) directly to the user • Reports maximum price • Where can users submit jobs from? • From anywhere on the cluster or grid • From their desktop, via the Internet → Need for a job submission system

  15. Node searching system (NSS) User

  16. Who will determine nodes? • User! • Users choose nodes appropriate for their jobs • Parallel jobs: idle CPUs or CPUs with low-price jobs • Long-running jobs: idle CPU, set a low price

  17. Node searching system (NSS) • NSS should report to users: • CPU utilization • Maximum price • Load (number of processes, …) • … • Daemon on each node sends information about the node to the client. • Client is on the user’s machine → No heavy load aggregation
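A minimal sketch of client-side node selection using the fields listed above, following the rule from slide 16 that parallel jobs go to idle CPUs or to CPUs running low-price jobs. The `NodeReport` layout, the `pick_nodes` function, the `idle_below` cutoff, and the reading of "maximum price" as the highest price among jobs currently on a node are assumptions; the slides do not define them.

```python
from dataclasses import dataclass

@dataclass
class NodeReport:
    """One report sent by a node daemon to the client (hypothetical layout)."""
    host: str
    cpu_utilization: float  # fraction of CPU in use, 0.0 to 1.0
    max_price: float        # assumed: highest price among jobs on the node
    process_count: int

def pick_nodes(reports, count, my_price, idle_below=0.1):
    """Pick up to `count` nodes that are idle or run only jobs priced below my_price."""
    candidates = [
        r for r in reports
        if r.cpu_utilization < idle_below or r.max_price < my_price
    ]
    # Prefer the least utilized nodes, then the cheapest ones.
    candidates.sort(key=lambda r: (r.cpu_utilization, r.max_price))
    return [r.host for r in candidates[:count]]

reports = [
    NodeReport("n1", 0.95, 10.0, 40),  # busy with high-price jobs: skipped
    NodeReport("n2", 0.05, 0.0, 3),    # idle
    NodeReport("n3", 0.60, 2.0, 12),   # busy, but only low-price jobs
]
chosen = pick_nodes(reports, count=2, my_price=5.0)
```

Since the filtering and sorting run on the user's machine, the nodes only have to send their own small reports.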

  18. Problems!!! • Possibly heavy load on the user’s client • NAT, firewall • How can the client connect to the server? • What information is needed? • Only CPU utilization, maximum price, load, average price?
