Abdul Aziz Habib Ammari Pearl Thomas Vamsi Krishna

The Prospero Resource Manager: A Scalable Framework for Processor Allocation in Distributed Systems Abdul Aziz Habib Ammari Pearl Thomas Vamsi Krishna

Introduction • Poor performance of conventional techniques (Parallel vs. Distributed) • Prospero Resource Manager (PRM) • Resource management techniques should scale: • numerically • geographically • administratively

Introduction - cont’d • Prospero Perspective: Multiple Resource Managers • System Manager • Job Manager • Node Manager

Contemporary Approaches • Phases of execution (figure): 1. Configuration of Environment, 2. Processor Selection/Allocation, 3. Task to Processor Mapping, 4. Program Loading (Common Libraries), 5. Program Execution • Distributed Environment: List of available nodes • Locus, NEST, Sprite, and V support processor allocation and remote program loading

Contemporary Approaches - cont’d • Locus: environment of the initiating process • NEST: nodes advertise availability • Sprite: shared file as a centralized database • V: server selects the least loaded node • UCLA Benevolent Bandit Laboratory (BBL) • DQS and Lsbatch • Parallel Virtual Machine (PVM) and Net-Express

Scalable Resource Management • Virtual System Model: new model for organizing large distributed systems • Access to a subset of resources • Hiding the mapping of resources to physical locations • Partition of the resource management functions • System manager • Job manager • Node manager

Scalable Resource Management - cont’d • System managers • Managing subsets of resources (processors) • Hierarchical concept (layers of system managers) • Maintaining all information about their resources • Reacting to status updates (from node managers) and resource requests (from job managers) • Assigning suitable resources upon request and notifying the job manager and the node managers responsible for each resource (possibly only a subset of the requested resources can be assigned)
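The system manager's behavior described above (tracking availability, granting possibly fewer nodes than requested) can be illustrated with a minimal sketch. This is not PRM's implementation; the class `SystemManager` and its methods `status_update`, `request`, and `release` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SystemManager:
    """Hypothetical system manager tracking a subset of processors."""
    free_nodes: set = field(default_factory=set)
    assigned: dict = field(default_factory=dict)  # node -> job id

    def status_update(self, node, available):
        # A node manager reports whether its node can accept work.
        # Nodes already assigned to a job stay assigned until released.
        if available and node not in self.assigned:
            self.free_nodes.add(node)
        else:
            self.free_nodes.discard(node)

    def request(self, job_id, count):
        # Grant up to `count` nodes; the grant may be smaller than requested.
        granted = sorted(self.free_nodes)[:count]
        for node in granted:
            self.free_nodes.remove(node)
            self.assigned[node] = job_id
        return granted

    def release(self, node):
        # The job is done with this node; make it assignable again.
        if self.assigned.pop(node, None) is not None:
            self.free_nodes.add(node)


sm = SystemManager()
for name in ("n1", "n2", "n3"):
    sm.status_update(name, available=True)
grant = sm.request("job-A", 5)  # asks for 5 nodes, only 3 are known
```

Here the request for five nodes is answered with the three that exist, which mirrors the "only a subset can be assigned" case on the slide.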

Scalable Resource Management - cont’d • Job manager • Agent for the tasks in a job • One job manager per job • Part of a job and aware of the requirements and communication patterns of the managed tasks • Supports fault-tolerant and real-time applications • Supports debugging and performance tuning

Scalable Resource Management - cont’d • Job manager (continued) • Identification of the job’s resource requirements (when the job is initiated) • Locating system managers and sending allocation requests • Monitoring the execution of the program
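The job manager's allocation step (contacting system managers until the job's requirement is met) might look like the following sketch. The function names `acquire_nodes` and `make_manager`, and the manager names used in the example, are assumptions for illustration; a real job manager would send requests over the network rather than call local functions.

```python
def acquire_nodes(needed, system_managers):
    """Ask each system manager in turn until `needed` nodes are granted.

    `system_managers` maps a manager name to a callable that takes a node
    count and returns the list of nodes it grants (possibly fewer).
    """
    granted = []
    for name, request in system_managers.items():
        remaining = needed - len(granted)
        if remaining <= 0:
            break
        for node in request(remaining):
            granted.append((name, node))
    return granted


def make_manager(nodes):
    # Stand-in for a remote system manager holding a fixed pool of nodes.
    pool = list(nodes)

    def request(count):
        taken = pool[:count]
        del pool[:count]
        return taken

    return request


managers = {
    "sm-1": make_manager(["a", "b"]),
    "sm-2": make_manager(["c", "d", "e"]),
}
nodes = acquire_nodes(4, managers)
```

The first manager can supply only two nodes, so the remaining two come from the second one.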

Scalable Resource Management - cont’d • Node manager • Receiving messages from the system manager (identifying the job managers for which it will load and execute programs) • Notifying the job manager about events (termination and failure of tasks) • Informing the system manager about the availability of the node for assignment • Caching information needed to direct messages for other tasks to the node on which each task runs

Implementation: Introduction • Prospero Resource Manager (PRM) Implementation • - Runs on a collection of workstations (Sun-3, HP 9000/700, etc.) • - Workstations connected by LAN/WAN • - Supports a heterogeneous execution environment • - The system manager can manage nodes of more than one processor type • - Enables the user to place constraints (type, location, etc.) through job configuration options • - Supports both parallel and remote sequential applications
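As an illustration of user-supplied constraints on processor type and location, the sketch below filters a node list against a constraint table. The field names (`host`, `type`, `site`) and the function `select_nodes` are hypothetical and do not reflect PRM's actual job configuration syntax.

```python
def matches(node, constraints):
    # A node qualifies when every constrained attribute has an allowed value.
    return all(node.get(key) in allowed for key, allowed in constraints.items())


def select_nodes(nodes, constraints):
    # Keep only the hosts that satisfy all of the job's constraints.
    return [node["host"] for node in nodes if matches(node, constraints)]


nodes = [
    {"host": "w1", "type": "sun3", "site": "campus"},
    {"host": "w2", "type": "hp700", "site": "campus"},
    {"host": "w3", "type": "hp700", "site": "remote"},
]

# Hypothetical job configuration: HP machines only, on the local campus.
chosen = select_nodes(nodes, {"type": {"hp700"}, "site": {"campus"}})
```

An empty constraint table accepts every node, which corresponds to a job that places no restrictions.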

Program Loading and I/O • PRM supports explicit loading of files when the nodes assigned to a job do not share a common file system • - Performed by transferring the executables to the node’s local file system • - File I/O task handles access to files on the user’s local system • - A task has exclusive read/write access to a shared file • Terminal I/O task supports interactive execution • - Users can customize the task for job initialization functions such as reading interactive input and assigning inputs to the appropriate task

Communication Libraries • Communication Library Functions • - Provides routines for sending, receiving, and broadcasting tagged messages • - Commonly used routines are made available through a set of macros and functions • - Provides routines for message passing, buffer manipulation, process control, and data packing and unpacking • Approach • - The ARDP protocol is used to transmit and receive sequenced packets
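The idea of tagged messages can be sketched with an in-memory mailbox: a receive names a tag and gets the oldest message carrying it, while other messages stay queued. The `Mailbox` class and `broadcast` function are hypothetical stand-ins and are not the PRM or PVM interface; no network transport is modeled.

```python
from collections import deque


class Mailbox:
    """Hypothetical per-task inbox that matches messages by tag."""

    def __init__(self):
        self._queue = deque()

    def send(self, tag, payload):
        # Messages are queued in arrival order together with their tag.
        self._queue.append((tag, payload))

    def recv(self, tag=None):
        # Find the oldest message with a matching tag (any tag if None).
        for index, (msg_tag, payload) in enumerate(self._queue):
            if tag is None or msg_tag == tag:
                break
        else:
            return None  # nothing matches; a real library would block here
        del self._queue[index]
        return msg_tag, payload


def broadcast(mailboxes, tag, payload):
    # Deliver the same tagged message to every mailbox in the group.
    for box in mailboxes:
        box.send(tag, payload)


a, b = Mailbox(), Mailbox()
a.send(1, "data")
broadcast([a, b], 2, "stop")
```

Receiving by tag lets a task pick up a control message such as the broadcast above even when data messages arrived first.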

Job Manager Supporting Program Development • Supports debugging of parallel applications • - Checkpoint and replay approaches are used • - Programs can be restored to their past states • - Each task maintains a log of its communication activities • - A task monitor exists for each task • - An individual task can be replayed in isolation
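Replaying one task in isolation from a checkpoint and a communication log can be sketched as follows. The sketch assumes the task is deterministic given its incoming messages; the class `LoggedTask` and its methods are hypothetical and are not PRM's task monitor.

```python
class LoggedTask:
    """Hypothetical task that records every message it receives."""

    def __init__(self, handler, state):
        self.handler = handler  # pure function: (state, message) -> new state
        self.initial = state    # checkpointed starting state
        self.state = state
        self.log = []           # received messages, in arrival order

    def deliver(self, message):
        # Log the message before applying it, so the log is always complete.
        self.log.append(message)
        self.state = self.handler(self.state, message)

    def replay(self, upto=None):
        # Rebuild a past state from the checkpoint and the log alone,
        # without contacting any other task.
        state = self.initial
        for message in self.log[:upto]:
            state = self.handler(state, message)
        return state


task = LoggedTask(lambda total, value: total + value, 0)
for value in (3, 4, 5):
    task.deliver(value)
state_after_two = task.replay(upto=2)
```

Because the handler depends only on the logged messages, the replayed state matches what the task held after its second message.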

Performance • Communication Latencies • PVM library over ARDP vs. PVM ver. 3.2.6 • Resource Allocation performance of PRM • Test Bed • SPARC-10s connected to Ethernet • Exclusive machines • SunOS 4.1.3 with improved time facility • pvm_send() & pvm_recv()

Wide Area Network Simulation • Latency of 0 msec, 10 msec, and 100 msec • USC, USC-ISI, ISI-MIT • Table 1: Average time (in msecs) to execute a pvm_send() – pvm_recv() pair • Table 2: Average time (in msecs) to execute a pvm_mcast() and matching pvm_recv() pair

Resource Allocation Results • Table 3: Allocation time as a function of the number of nodes allocated • Table 4: Allocation time as a function of the number of system managers from which resources are requested. A total of 8 nodes were allocated in each case.

Future Directions • Alternative job managers • fault-tolerant and real-time applications • Node manager • part of kernel • compiler-generated resource list • preemptive scheduling of tasks • Integrated set of tools for developing and executing parallel and distributed applications • Security

Conclusion • Prospero: a different approach to resource management • Scalable • Provides a framework for development and execution of parallel and distributed applications

Questions ? & Comments!
