
Strategies for Implementing Dynamic Load Sharing




  1. Strategies for Implementing Dynamic Load Sharing

  2. Problems with Static Load Sharing • Work must be divided among processors at compile time. • For many algorithms, the processing time of the program is unknown in advance. • The load on each processor is unknown and can change at any time. • If a processor finishes its work early, it remains idle.

  3. Advantages to Dynamic Load Sharing • Partitions of work are modified at run time. • The goal is for load to shift during the run so that no processor sits idle until all processors run out of work. • Sender- and receiver-based variants: depending on the algorithm, either the overloaded or the underloaded processor initiates the transfer of work.

  4. Hybrid Load Sharing • Load is initially distributed statically. • During run time, the work distribution is modified dynamically. • This spares the dynamic scheme the complexity of breaking up the entire workload at the start.

  5. Conditions for Dynamic Load Sharing to be Worthwhile • The work at each processor must be partitionable into independent pieces whenever it holds more than a "minimal" amount. • The cost of splitting work and sending it to another processor must be less than the cost of leaving it in place. • A method of splitting the data must exist.

  6. Receiver Initiated Dynamic Load Sharing Algorithms

  7. Asynchronous Round Robin • Each processor keeps its own "target" variable. • When a processor runs out of work, it sends a request to the processor named in "target", then advances "target" modulo the number of processors (see the sketch below). • Not desirable, because multiple processors can hold the same target value and send requests to the same processor at nearly the same time. • Depending on the network topology, reaching all processors from one node can carry high overhead.
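
A minimal sketch of this per-processor bookkeeping, assuming a simulated machine of `P` processors; the `Processor` class and all names here are illustrative, and request delivery and work transfer are abstracted away:

```python
P = 8  # number of processors (illustrative)

class Processor:
    def __init__(self, rank):
        self.rank = rank
        # Each processor keeps an independent "target", initialized
        # so that it does not begin by asking itself.
        self.target = (rank + 1) % P

    def next_request(self):
        """Return the rank to ask for work, then advance the target."""
        victim = self.target
        self.target = (self.target + 1) % P   # advance modulo P
        if self.target == self.rank:          # never request from yourself
            self.target = (self.target + 1) % P
        return victim

p3 = Processor(3)
print([p3.next_request() for _ in range(4)])  # [4, 5, 6, 7]
```

Because each target advances independently, no global state is touched, but two idle processors can easily hold the same target value at once, which is exactly the clash the slide warns about.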

  8. Nearest Neighbor • An idle processor sends requests to its nearest neighbors in a round-robin scheme (sketched below). • In a network where all processors are equidistant, this behaves the same as the Asynchronous Round Robin method. • The major problem with this method is that a localized concentration of work takes a long time to spread across the machine.
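
The same rotating-request idea restricted to a neighbor list; the 1-D ring topology and the `Node` class are assumptions made for this sketch:

```python
P = 8

def ring_neighbors(rank):
    """Nearest neighbors on a 1-D ring (illustrative topology)."""
    return [(rank - 1) % P, (rank + 1) % P]

class Node:
    def __init__(self, rank):
        self.nbrs = ring_neighbors(rank)
        self.next_idx = 0

    def next_request(self):
        """Rotate through the neighbor list only."""
        victim = self.nbrs[self.next_idx]
        self.next_idx = (self.next_idx + 1) % len(self.nbrs)
        return victim

n = Node(0)
print([n.next_request() for _ in range(4)])  # [7, 1, 7, 1]
```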

  9. Global Round Robin • Solves both the problem of distributing work evenly and the problem of one processor being targeted by multiple requesters at once. • A single global "target" variable ensures the workload is distributed relatively evenly. • The problem is that processors contend for access to the target variable (see the sketch below).
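
A sketch of the single shared counter, with a lock standing in for whatever remote atomic fetch-and-increment a real machine would use; all names are illustrative:

```python
import threading

P = 8
_target = 0
_lock = threading.Lock()

def next_victim():
    """Atomically read and advance the one global target."""
    global _target
    with _lock:                      # every processor serializes here:
        victim = _target             # this lock is the contention point
        _target = (_target + 1) % P  # the slide describes
    return victim

print([next_victim() for _ in range(4)])  # [0, 1, 2, 3]
```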

  10. Global Round Robin with Message Combining • Avoids the contention problem when accessing the target variable. • All requests to read the target value are combined at intermediate nodes. • Has only been used in research.

  11. Random Polling • The simplest method of load balancing. • An idle processor picks another processor at random and requests work from it (sketched below). • Each processor is polled with equal probability, so the distribution of work requests is roughly even.
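
A sketch of the victim choice, assuming `P` ranks and uniform selection that excludes the requester itself:

```python
import random

P = 8

def pick_victim(my_rank):
    """Choose a donor uniformly at random, never yourself."""
    victim = random.randrange(P - 1)       # uniform over the other P-1 ranks
    return victim if victim < my_rank else victim + 1

print(pick_victim(3))  # some rank in 0..7 other than 3
```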

  12. Scheduler Based • One processor is designated as the scheduler. • It keeps a FIFO queue of processors that can donate work and hands incoming work requests to them (sketched below). • A work request is never sent to a node that has no work to give. • The disadvantage is routing time, because every request must pass through the scheduler node.
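
A sketch of the scheduler's matchmaking, with message passing reduced to method calls; the `Scheduler` class and its method names are assumptions of the sketch:

```python
from collections import deque

class Scheduler:
    def __init__(self):
        self.donors = deque()          # FIFO queue of ranks with spare work

    def advertise(self, rank):
        """A processor reports that it can donate work."""
        self.donors.append(rank)

    def request_work(self):
        """Match a request with a donor; None means nobody can give,
        so no request is ever forwarded to an empty node."""
        return self.donors.popleft() if self.donors else None

s = Scheduler()
s.advertise(2)
s.advertise(5)
print(s.request_work(), s.request_work(), s.request_work())  # 2 5 None
```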

  13. Gradient Model • Based on two threshold parameters, the High-Water-Mark (HWM) and the Low-Water-Mark (LWM). • These determine whether a node's load is light, moderate, or heavy. • Proximity is defined as the shortest distance to a lightly loaded node.

  14. Gradient Model (cont…) • Tasks are routed through the system toward the nearest underloaded processor (see the sketch below). • The gradient map may change while work is in transit through the network. • The time to propagate updates to the gradient map can vary greatly.
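
A sketch of how a proximity (gradient) map could be computed, assuming an 8-node ring and using LWM as the lightness test; the breadth-first sweep and all values are illustrative:

```python
from collections import deque

P = 8
LWM = 2                                  # Low-Water-Mark (illustrative)
load = [5, 5, 1, 5, 5, 5, 0, 5]          # nodes 2 and 6 are lightly loaded

def gradient_map(load):
    """Each node's value: shortest distance to a lightly loaded node."""
    dist = [None] * P
    frontier = deque(i for i in range(P) if load[i] <= LWM)
    for i in frontier:
        dist[i] = 0
    while frontier:                      # breadth-first sweep outward
        i = frontier.popleft()
        for j in ((i - 1) % P, (i + 1) % P):   # ring neighbors
            if dist[j] is None:
                dist[j] = dist[i] + 1
                frontier.append(j)
    return dist

print(gradient_map(load))  # [2, 1, 0, 1, 2, 1, 0, 1]
```

A loaded node then forwards tasks to whichever neighbor holds the smaller distance value, so work flows downhill toward the nearest underloaded processor.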

  15. Receiver Initiated Diffusion • Each processor looks only at the load information in its own domain. • A refinement of the Nearest Neighbor idea: a threshold value makes a processor request work before it runs out, so it never becomes idle (sketched below). • Eventually every processor ends up holding a roughly even share of the work.
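
A sketch of the request trigger, with the threshold and the loads as illustrative values:

```python
THRESHOLD = 4   # request work before going idle (illustrative value)

def maybe_request(my_load, neighbor_loads):
    """Return the index of the most loaded neighbor to ask,
    or None while local work is still above the threshold."""
    if my_load >= THRESHOLD:
        return None
    return max(range(len(neighbor_loads)), key=neighbor_loads.__getitem__)

print(maybe_request(2, [7, 3, 9]))  # 2: ask the neighbor holding 9 units
print(maybe_request(6, [7, 3, 9]))  # None: still enough local work
```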

  16. Sender Initiated Dynamic Load Balancing Algorithms

  17. Single Level Load Balancing • Tasks are broken down into smaller subtasks. • Each processor is responsible for more than one subtask, which ensures that each processor does roughly the same amount of work (see the sketch below). • A manager processor controls the creation and distribution of subtasks. • Not scalable, because the manager must distribute all of the work.
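
A sketch of the manager's side, assuming the work is a simple list of items; the chunking policy and all names are illustrative choices:

```python
from collections import deque

def make_subtasks(work_items, subtasks_per_proc, procs):
    """Split work into ~subtasks_per_proc * procs small pieces; the
    oversupply of subtasks is what evens out per-processor run time."""
    n = subtasks_per_proc * procs
    chunk = max(1, len(work_items) // n)
    return deque(work_items[i:i + chunk]
                 for i in range(0, len(work_items), chunk))

pool = make_subtasks(list(range(96)), subtasks_per_proc=4, procs=8)
print(len(pool))               # 32 subtasks for 8 processors
next_subtask = pool.popleft()  # a finished worker asks the manager
```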

  18. Multilevel Load Balancing • Processors are arranged in trees. • The root processor of each tree is responsible for distributing super-subtasks to its successor processors. • With only one tree, this behaves the same as single level load balancing.

  19. Sender Initiated Diffusion • Each processor sends a load update to its neighbors; if an update shows that a neighbor has little work, one of its neighbors sends it some (see the sketch below). • Similar to Receiver Initiated Diffusion. • If one area of the network holds most of the work, it takes a long time for that work to be distributed across the machine.
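
A sketch of the push decision a processor could make after receiving load updates; the minimum gap and the split-the-difference rule are illustrative assumptions:

```python
def push_share(my_load, neighbor_loads, min_gap=2):
    """Pick the most underloaded neighbor and send half the difference;
    send nothing if no neighbor is behind by at least min_gap."""
    j = min(range(len(neighbor_loads)), key=neighbor_loads.__getitem__)
    gap = my_load - neighbor_loads[j]
    return (j, gap // 2) if gap >= min_gap else None

print(push_share(10, [9, 2, 4]))  # (1, 4): send 4 units to neighbor 1
print(push_share(5, [5, 4, 5]))   # None: everyone is close enough
```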

  20. Hierarchical Balancing Method • Organizes the processors into balancing domains. • Specific processors are given responsibility for balancing different levels of the hierarchy. • A tree structure works well: if one branch is overloaded, it sends work to another branch of the tree, and each node has a corresponding node in the opposite subtree.

  21. Dimensional Exchange Method • Balances small domains first, then successively larger domains. • A domain is defined as a dimension of a hypercube (see the sketch below). • Can be extended to a mesh by folding the mesh into sections.
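
A sketch on a 3-dimensional hypercube of 8 nodes: in step d, every node pairs with the node whose rank differs in bit d, and the pair evens out their two loads; the initial loads are illustrative:

```python
DIM = 3
load = {r: r * 2 for r in range(2 ** DIM)}   # illustrative initial loads

for d in range(DIM):                         # one pass per dimension
    for rank in range(2 ** DIM):
        partner = rank ^ (1 << d)            # flip bit d to find partner
        if rank < partner:                   # handle each pair once
            total = load[rank] + load[partner]
            load[rank] = total // 2
            load[partner] = total - total // 2

print(load)  # every node ends with exactly 7 (56 units over 8 nodes)
```

Each pass balances along one dimension, so after DIM passes the small pairwise domains have been merged into a globally balanced load, up to integer rounding.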
