Load Balancing and Stability Issues in Algorithms for Service Composition

Bhaskaran Raman & Randy H. Katz, U.C. Berkeley, INFOCOM 2003

Outline
• One-line comment
• Motivation
• Assumed environment & challenges
• Proposed mechanisms & experiments
• Critique

One-line comment
• The paper proposes mechanisms for scalable and stable service composition in a wide-area network

Motivation (1/2)
• Service composition
  • Enables quick & flexible development of new applications
  • Reuses existing services
• Service Scenario 1
[Figure, Service Scenario 1: France, Korea, RRS (by SKT), translation service, many users]

Motivation (2/2)
• Service Scenario 2
[Figure, Service Scenario 2: long-lived multimedia session; VoD server, 9 o'clock News, transcoder]
• Scenario 1
  • Many users → issues of scale
• Scenario 2
  • Multimedia session → availability (failure detection & recovery)

Assumed Environment
• Services are deployed at service clusters
  • Existing cluster mechanisms for handling failures and sharing load are leveraged
• Each service cluster has a cluster manager
  • Performs the monitoring & computation required for management
• Service clusters form an overlay network
  • Stretches across the wide-area Internet
• Services are deployed by multiple service providers
  • Service clusters may be spread across many different ASes

Assumed environment & Challenges
• System characteristics
  • Wide-area service overlay network
  • Many client sessions
  • Sessions last for a long time
• Requirements
  • Scalability
    • Balance load among replicas
  • Stability
    • Rapid failure detection & recovery

Proposed Mechanism – Scalability
• Load balancing
  • Load definition & load balancing mechanism
  • Metric for load estimation: LIAC (Least Inverse Available Capacity)
• Side effect of LIAC
  • No cost for intermediary nodes, so nothing discourages long paths
  • Path length comparison
    • 8000 paths
[Figure: overlay path through Service 0 and Service 1 between exit nodes, with costs on the service nodes]

Proposed Mechanism – Scalability
• Enhanced metric for load estimation
  • Assign a cost to all links
  • Cost: inversely proportional to the available capacity (AC) of the downstream node
• Effect of the new metric: shorter path length & good load balancing
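The enhanced metric can be illustrated with a small sketch. It assumes that every overlay link is charged the inverse of the available capacity of its downstream node, as the LIAC name suggests, and it ignores the ordering of services along the path. The function name `least_cost_path`, the graph layout, and the capacity numbers are all hypothetical and not taken from the paper.

```python
import heapq

def least_cost_path(graph, capacity, src, dst):
    """Dijkstra over an overlay graph.

    Assumption: each link costs 1 / AC, where AC is the available
    capacity of the link's downstream node.
    """
    best = {src: 0.0}          # cheapest known cost to each node
    prev = {}                  # back-pointers for path reconstruction
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue           # stale heap entry
        for nxt in graph.get(node, ()):
            ac = capacity[nxt]
            if ac <= 0:
                continue       # a saturated node cannot take more sessions
            new_cost = cost + 1.0 / ac
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if dst not in best:
        return None, float("inf")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], best[dst]

# Hypothetical overlay: two replicas "a" and "b" in front of one exit node.
graph = {"client": ["a", "b"], "a": ["exit"], "b": ["exit"], "exit": []}
capacity = {"a": 2, "b": 10, "exit": 5}
path, cost = least_cost_path(graph, capacity, "client", "exit")
```

Because every hop adds a positive cost, longer paths are penalized, while a lightly loaded replica (here `b`) still wins over a heavily loaded one.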

Proposed Mechanism – Scalability
• Load balancing
  • Load information dissemination
• Propagating load information
  • Simple periodic flooding incurs load oscillation
  • Reduce the link-state update period?
    • No: it increases the overhead
  • On-demand link-state updates?
    • No: they add load during an already overloaded period

Proposed Mechanism – Scalability
• Piggybacking
  • Feed load information back along the established service path
  • Low control overhead
[Figure: service path through Service 0 and Service 1 between exit nodes]
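A rough sketch of the piggybacking idea follows. It assumes that load reports travel backward along an established path, that each node appends its own available capacity, and that each node merges what it sees into a local table. The function `piggyback_feedback`, the report format, and the node names are hypothetical; the slides do not specify them.

```python
def piggyback_feedback(path, available_capacity, tables):
    """Carry load reports from the last node of a path back to the first.

    Assumption: each node adds its own available capacity to the reports
    and records everything carried so far in its local table.
    """
    reports = {}
    for node in reversed(path):
        reports[node] = available_capacity[node]
        tables.setdefault(node, {}).update(reports)
    return tables

# Hypothetical service path and capacities.
path = ["exit_a", "service0", "service1", "exit_b"]
available_capacity = {"exit_a": 8, "service0": 3, "service1": 6, "exit_b": 9}
tables = piggyback_feedback(path, available_capacity, {})
```

In this sketch the node that set up the path (`exit_a`) ends up with fresh load values for every node on it, without any separate control message being sent.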

Proposed Mechanism – Stability
• Failure detection & recovery
  • End-to-end recovery
    • Deliver a failure notification to an exit node → reconstruct the service path
  • Local recovery
    • Failure notice → find an alternate path

Proposed Mechanism – Stability
• Failure detection & recovery
  • Heartbeat mechanism for failure monitoring
    • 300 ms period
    • Packet losses are correlated within 1 sec
  • Timeout value
    • Distinguishes temporary failures from long-term failures
    • Trade-off between early detection and false detection
    • The appropriate value was found empirically through experiment

Proposed Mechanism – Stability
• Failure detection & recovery
  • Measured the failure gap period of wide-area Internet paths
    • Exchanged heartbeats for a week
    • US-Berlin-Australia
  • 1.8 sec chosen as the timeout value
    • Early detection & acceptable false detection rate
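The heartbeat and timeout rule can be sketched as below, using the 300 ms period and 1.8 sec timeout from the slides. The class `HeartbeatMonitor`, its methods, and the peer names are hypothetical; timestamps are passed in explicitly so the example does not depend on a real clock.

```python
class HeartbeatMonitor:
    """Declare a peer failed when no heartbeat arrives within the timeout."""

    def __init__(self, timeout=1.8):
        self.timeout = timeout      # seconds; 1.8 is the value from the slides
        self.last_seen = {}         # peer -> time of the last heartbeat

    def heartbeat(self, peer, now):
        """Record a heartbeat received from `peer` at time `now` (seconds)."""
        self.last_seen[peer] = now

    def failed_peers(self, now):
        """Return peers whose silence exceeds the timeout, sorted by name."""
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout
        )

# Peer "a" keeps sending every 300 ms; peer "b" goes silent after t = 0.
monitor = HeartbeatMonitor()
monitor.heartbeat("b", 0.0)
for tick in range(7):               # t = 0.0, 0.3, ..., 1.8
    monitor.heartbeat("a", tick * 0.3)
failed = monitor.failed_peers(now=2.0)
```

A shorter timeout would flag `b` earlier but would also misreport short loss bursts as failures, which is the trade-off the slides describe.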

Proposed Mechanism – Stability
• Recovery time
  • End-to-end recovery vs. local recovery
    • End-to-end recovery takes slightly longer
    • Failure notification is the only additional cost
  • Better path after reconstruction
    • Globally optimized path

Critique
• Strong points
  • Simplified the problem well
    • Scalability → load balancing → estimation & propagation of load info
    • Stability → failure detection & recovery → timeout value selection
  • Emulation strategy
    • Lower cost than a real experiment
    • More realistic than simulation
• Weak points
  • Bandwidth is not considered in the metric
    • Target applications are bandwidth sensitive
  • Only applicable to service paths
    • Requests can also take the form of graphs
  • Limitation of piggybacking
  • Length of composition is limited to 2
    • If the composition gets longer, path length will matter more
  • The α value should be selected carefully
