
Load Balancing and Stability Issues in Algorithms for Service Composition

Load Balancing and Stability Issues in Algorithms for Service Composition. Bhaskaran Raman & Randy H. Katz, U.C. Berkeley, INFOCOM 2003. Outline: one line comment; motivation; assumed environment & challenges; proposed mechanisms & experiments; critique.


Presentation Transcript


  1. Load Balancing and Stability Issues in Algorithms for Service Composition. Bhaskaran Raman & Randy H. Katz, U.C. Berkeley. INFOCOM 2003.

  2. Outline
     • One line comment
     • Motivation
     • Assumed environment & challenges
     • Proposed mechanisms & experiments
     • Critique

  3. One line comment
     • The paper proposes mechanisms for scalable and stable service composition in a wide-area network

  4. Motivation (1/2)
     • Service composition
       • Enables quick & flexible development of new applications
       • Reuses existing services
     • Service Scenario 1
       [Figure: Service Scenario 1. Labels: France, Korea, RRS, RRS, "By SKT", "Many, many users!", Translation]

  5. Motivation (2/2)
     • Service Scenario 2
       [Figure: Service Scenario 2. Labels: long-lived multimedia session, VoD server, 9 o'clock news, transcoder]
     • Scenario 1
       • Many users → issues of scale
     • Scenario 2
       • Multimedia session → availability (failure detection & recovery)

  6. Assumed Environment
     • Services are deployed at service clusters
       • Cluster mechanisms for handling failures and sharing load are leveraged
     • Service clusters have a cluster manager
       • Performs the monitoring & computation required for management
     • Service clusters form an overlay network
       • Stretches across the wide-area Internet
     • Services are deployed by multiple service providers
       • Service clusters may be spread across many different ASes
     [Figure label: exit node]

  7. Assumed environment & Challenges
     • System characteristics
       • Wide-area service overlay network
       • Many client sessions
       • Sessions last for a long time
     • Requirements
       • Scalability
         • Balance load among replicas
       • Stability
         • Rapid failure detection & recovery

  8. Proposed Mechanism – Scalability
     • Load balancing
       • Load definition & load balancing mechanism
       • Metric for load estimation: LIAC (Least Inverse Available Capacity)
     • Side effect of LIAC
       • No cost for intermediary nodes
       • Path length comparison
         • 8000 paths
     [Figure: labels: Service 0, Service 1, exit node, cost]

  9. Proposed Mechanism – Scalability
     • Enhanced metric for load estimation
       • Assign a cost to all links
       • Cost: proportional to the inverse of the available capacity (AC) of the downstream node
     • Effect of the new metric: shorter path length & good load balancing
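
The link-cost idea on slides 8 and 9 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `least_cost_path`, the overlay topology, and the capacity numbers are all hypothetical. The sketch also assumes that the cost of a link is the inverse of the available capacity (AC) of its downstream node, as the name of the LIAC metric suggests, and that every AC is greater than zero.

```python
import heapq

def least_cost_path(links, available_capacity, source, target):
    """Dijkstra search where each link costs 1 / AC of its downstream node.

    Assumption for illustration: AC values are strictly positive.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor in links.get(node, []):
            new_cost = cost + 1.0 / available_capacity[neighbor]
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]

# Hypothetical overlay: node "a" is heavily loaded (low available capacity).
links = {
    "exit": ["a", "b"],
    "a": ["server"],
    "b": ["c"],
    "c": ["server"],
}
available_capacity = {"a": 2.0, "b": 10.0, "c": 10.0, "server": 5.0}

path, cost = least_cost_path(links, available_capacity, "exit", "server")
print(path, round(cost, 3))
```

In this toy overlay the search avoids the loaded node "a" even though the route through it has fewer hops. Because every hop adds a positive cost, longer paths are still penalized, which matches the slide's claim of shorter paths together with good load balancing.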

  10. Proposed Mechanism – Scalability
      • Load balancing
        • Load information dissemination
      • Propagating load information
        • Simple periodic flooding: incurs load oscillation
        • Reduce the link-state update period?
          • No: it increases the overhead
        • On-demand link-state updates?
          • No: they add load during an overloaded period

  11. Proposed Mechanism – Scalability
      • Piggybacking
        • Feed load information back along the established service path
        • Low control overhead
      [Figure: labels: Service 0, Service 1, exit node]
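
As a rough illustration of piggybacking, the Python sketch below collects load values from the nodes on an established service path and merges them into the exit node's view. The function name `piggyback_load`, the node names, and the load values are hypothetical, and the real system may carry this feedback differently. The sketch only shows why the approach needs no extra control messages: the information rides on traffic that already follows the path.

```python
def piggyback_load(service_path, current_load):
    """Collect load reports while walking the path back toward the exit node.

    service_path[0] is assumed to be the exit node, so it is skipped.
    """
    feedback = {}
    for node in reversed(service_path[1:]):
        feedback[node] = current_load[node]
    return feedback

# Hypothetical, stale view held by the exit node.
exit_node_view = {"s0": 0.1, "s1": 0.1, "other": 0.5}

# Actual loads right now; "other" is not on the path, so it is not refreshed.
current_load = {"s0": 0.7, "s1": 0.4, "other": 0.9}

exit_node_view.update(piggyback_load(["exit", "s0", "s1"], current_load))
print(exit_node_view)
```

Only nodes on the active path are refreshed. That keeps overhead low, but the exit node's view of nodes it is not currently using can remain stale.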

  12. Proposed Mechanism – Stability
      • Failure detection & recovery
        • End-to-end recovery
          • Deliver a failure notification to an exit node → reconstruct the service path
        • Local recovery
          • Failure notice → find an alternate path
      [Figure label: exit node]

  13. Proposed Mechanism – Stability
      • Failure detection & recovery
        • Heartbeat mechanism for failure monitoring
          • 300 ms period
          • Packet losses are correlated within 1 sec
        • Timeout value
          • Distinguishes temporary failures from long-term failures
          • Trade-off between early detection and false detection
          • The appropriate value was found empirically through experiment

  14. Proposed Mechanism – Stability
      • Failure detection & recovery
        • Measured the failure gap period of wide-area Internet paths
          • Exchanged heartbeats for a week
          • US-Berlin-Australia
        • 1.8 sec for the timeout value
          • Early detection & acceptable false detection rate
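
The heartbeat timeout described on slides 13 and 14 can be sketched in Python as follows. The class name `HeartbeatMonitor` and its methods are hypothetical and not taken from the paper; only the 300 ms heartbeat period and the 1.8 sec timeout come from the slides. Timestamps are passed in explicitly so the logic can be followed without real timers.

```python
class HeartbeatMonitor:
    """Declare a peer failed once no heartbeat has arrived within the timeout."""

    def __init__(self, timeout_s=1.8):
        # 1.8 sec is the timeout value reported on the slides.
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record_heartbeat(self, peer, now_s):
        """Remember when the most recent heartbeat from this peer arrived."""
        self.last_seen[peer] = now_s

    def failed_peers(self, now_s):
        """Return the peers whose silence exceeds the timeout."""
        return sorted(
            peer
            for peer, seen_s in self.last_seen.items()
            if now_s - seen_s > self.timeout_s
        )

monitor = HeartbeatMonitor()

# Heartbeats arrive every 300 ms from a hypothetical peer, then stop.
for arrival_s in (0.0, 0.3, 0.6):
    monitor.record_heartbeat("berlin", arrival_s)

short_gap = monitor.failed_peers(now_s=1.5)  # 0.9 sec of silence
long_gap = monitor.failed_peers(now_s=2.5)   # 1.9 sec of silence
print(short_gap, long_gap)
```

A gap of 0.9 sec, which could be a burst of correlated packet loss, is tolerated, while a gap of 1.9 sec is treated as a failure. A smaller timeout would detect failures sooner at the price of more false detections, which is the trade-off named on slide 13.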

  15. Proposed Mechanism – Stability
      • Recovery time
        • End-to-end recovery vs. local recovery
          • Slightly longer recovery time
            • Failure notification is the only additional cost
          • Better path after reconstruction
            • Globally optimized path

  16. Critique
      • Strong points
        • Simplified the problem well
          • Scalability → load balancing → estimation & propagation of load info
          • Stability → failure detection & recovery → timeout value selection
        • Emulation strategy
          • Lower cost than a real experiment
          • More realistic than simulation
      • Weak points
        • Did not consider bandwidth in the metric
          • Target applications are bandwidth sensitive
        • Only applicable to service paths
          • There can be requests in the form of graphs
        • Limitation of piggybacking
        • Length of composition is limited to 2
          • If the length gets longer, path length will be more important
          • The α value should be selected carefully
