Cloud Control with Distributed Rate Limiting

Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, Alex C. Snoeren • Defense: Rejaie Johnson, Xian Yi Teng

Outline • Introduction • Classes of Clouds • Limiter Design • Evaluation Methodology • Evaluation • Conclusion

Introduction • Distributed computing today: software as a service • Google Documents • Groove Office • Windows Live • Benefit for users: Easier management • Benefit for service provider: Leverage widely distributed computing infrastructures

Introduction • Barrier: Loss of cost control (how to bill?) • Amazon’s EC2: metered pricing (but customers prefer flat fee) • Flat fee => provider must be able to limit consumption to control costs (but difficult to do in a distributed environment) • Focus: Control aggregate network bandwidth, distributed rate limiting (DRL)

Introduction • Goal: Allow set of distributed traffic rate limiters to collaborate to subject a class of network traffic (e.g. one service) to single, aggregate global limit • Resource provider, 10 hosting centers, limit 100 Mbps, current options: • 100 Mbps each hosting center (might all use this limit simultaneously => 1 Gbps) • 10 Mbps each center (efficient use unlikely unless traffic perfectly balanced)

Introduction • Key challenge: Flows arriving at different limiters should achieve same rates as if they were all traversing a single shared rate limiter • We present illusion of passing all traffic through single token-bucket rate limiter • Key challenge: Measuring demand of aggregate at each limiter, apportioning capacity in proportion to that demand

Classes of Clouds • Limiting cloud-based services • Cloud-based services: Clients see a unified service and are unaware of the independent physical sites behind it • DRL lets providers control network bandwidth as if it were sourced from a single site => no migration necessary; bandwidth gravitates towards the sites with the most demand

Classes of Clouds • Content distribution networks • Content replication of third-party web sites at numerous geographically diverse locations, improve performance, scalability, reliability • With DRL, CDNs can set per-customer limits based on service-level agreements • Protective mechanism to rate limit nefarious users

Classes of Clouds • Internet testbeds • PlanetLab currently has bandwidth limits at each individual site, but cannot enforce a limit across multiple machines • DRL provides effective limits for a PlanetLab service distributed across North America

Classes of Clouds • Assumptions and scope: • No QoS guarantees • Can identify traffic belonging to particular service • Discussion in single service without loss of generality

Limiter Design • Peer-to-peer limiter architecture • Tasks: • Estimation • Communication • Allocation • Each limiter periodically measures its traffic arrival rate, communicates it to the other limiters, receives their rates, computes an estimate of the global rate, and determines how to service local demand to enforce the global limit

Limiter Design • Estimation: compute average arrival rate over fixed time intervals, use exponentially-weighted moving average (EWMA) filter to smooth out short-term fluctuations (settings determined later) • At the end of each estimate interval, local changes merged with global estimate, and each limiter disseminates local changes to other limiters – gossip protocol used with UDP
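The estimation step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the smoothing weight `alpha`, and the choice to seed the estimate with the first sample are all assumptions (the slides only say the settings are determined later).

```python
def ewma_update(previous, sample, alpha):
    """Blend a new rate sample into the running estimate.

    alpha is a hypothetical smoothing weight in (0, 1]; larger values
    react faster to change, smaller values smooth more aggressively.
    """
    return alpha * sample + (1.0 - alpha) * previous


def smoothed_rates(byte_counts, interval_s, alpha=0.2):
    """Return the EWMA-smoothed arrival rate (bytes/s) after each interval.

    byte_counts holds the bytes observed in each fixed-length interval.
    """
    estimate = None
    history = []
    for count in byte_counts:
        sample = count / interval_s  # average arrival rate over this interval
        # Assumption: the first interval seeds the estimate directly.
        estimate = sample if estimate is None else ewma_update(estimate, sample, alpha)
        history.append(estimate)
    return history
```

With `alpha=0.5`, a jump from 100 to 200 bytes/s is reported as 150 bytes/s after one interval, which shows how the filter damps short-term fluctuations.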

Limiter Design • Allocation • Global token bucket (GTB) • Global random drop (GRD) • Flow proportional share (FPS)

Limiter Design • Global token bucket

Token Bucket • Common trick used to control amount of data injected into network, allowing bursts • There is a bucket that can hold limited number of tokens • Tokens are added to bucket at some rate • If token comes when bucket is full, it is discarded • When packet arrives, some number of tokens removed, packet is sent to network • Packet arrives when bucket is empty => dropped
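The token bucket rules above might look like the sketch below. It is a simplified, single-node illustration: the class and method names are hypothetical, time is passed in explicitly so the behavior is deterministic, the bucket is assumed to start full, and a packet is dropped whenever there are too few tokens to cover it (a slight generalization of "bucket is empty").

```python
class TokenBucket:
    """Minimal token bucket: tokens accrue at `rate` up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens the bucket can hold
        self.tokens = capacity    # assumption: the bucket starts full
        self.last = 0.0           # time of the last refill, in seconds

    def _refill(self, now):
        # Add tokens for the elapsed time; tokens beyond capacity are discarded.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def allow(self, size, now):
        """Return True if a packet costing `size` tokens may be sent at time `now`."""
        self._refill(now)
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False  # not enough tokens: the packet is dropped
```

A bucket with `capacity=20` allows a burst of up to 20 tokens' worth of traffic at once, after which sending is paced by `rate`.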

Limiter Design • Global token bucket • Emulate centralized token bucket • Each limiter’s token bucket refreshes at global rate • At every interval, each limiter computes and sends its local rate, obtains the local rates of the other limiters, sums them, and removes tokens at this global rate • Highly sensitive to stale observations, impractical at large scale or in lossy networks
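One way to picture the global token bucket on a single limiter is sketched below. This is an illustrative reading of the slide, not the authors' code: the class name, the method names, and the choice to drain remote demand once per interval (while local packets drain the bucket as they are admitted) are assumptions.

```python
class GlobalTokenBucketLimiter:
    """One limiter's local copy of the emulated global bucket (sketch)."""

    def __init__(self, global_rate, capacity):
        self.global_rate = global_rate  # the aggregate limit, in tokens per second
        self.capacity = capacity        # maximum burst size, in tokens
        self.tokens = capacity          # assumption: the bucket starts full

    def end_of_interval(self, remote_rates, interval_s):
        """Refill at the global rate, then drain tokens for remote arrivals.

        remote_rates holds the rates most recently reported by the other
        limiters; if those reports are stale, the bucket level is wrong.
        """
        self.tokens += self.global_rate * interval_s
        self.tokens -= sum(remote_rates) * interval_s
        self.tokens = max(0.0, min(self.capacity, self.tokens))

    def admit(self, size):
        """Local arrivals consume tokens directly; drop if too few remain."""
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False
```

Because every limiter subtracts the demand reported by its peers, a delayed or lost report leaves tokens in the bucket that a centralized limiter would already have spent, which is the sensitivity to stale observations noted above.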

Limiter Design • Global random drop • Instead of emulating central limiter, emulate drop rate of centralized case • As before, collect demand from other limiters, then compute drop probability, proportional to (demand - limit) • Works better over longer periods of time, but does not capture short-term effects
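The drop decision above can be sketched as follows. The slide states only that the probability is proportional to (demand - limit); dividing by the global demand so that the result is a valid probability is an assumption of this sketch, as are the function names and the injectable `rng` parameter.

```python
import random


def drop_probability(global_demand, global_limit):
    """Fraction of traffic to drop so the aggregate falls to the limit.

    Assumption: the excess (demand - limit) is normalized by the demand,
    giving a value in [0, 1). No drops occur while demand is within the limit.
    """
    if global_demand <= 0 or global_demand <= global_limit:
        return 0.0
    return (global_demand - global_limit) / global_demand


def should_drop(global_demand, global_limit, rng=random.random):
    """Decide independently for each packet whether to drop it."""
    return rng() < drop_probability(global_demand, global_limit)
```

For example, with an estimated global demand of 200 Mbps against a 100 Mbps limit, every limiter drops each packet with probability 0.5, regardless of where the traffic arrives.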

Limiter Design • Flow proportional share

Evaluation Methodology • 3 metrics: Utilization, flow fairness, responsiveness • Basic goal: hold aggregate throughput across all limiters below global limit • Achieve fairness equal to or better than that of centralized token bucket limiter

Evaluation Methodology • Evaluation on emulation testbed with ModelNet • Simple mesh topology to connect limiters • Each source and sink pair routed through single limiter • 100 Mbps links

Evaluation • Flow Dynamics • FPS only requires updates as flows arrive, depart, or change their behavior • Baseline • Loaded limiters with 10 unbottlenecked TCP flows • Chose a 3-7 skew • Aggregate apportioned between limiters in approximately a 3-7 split

Evaluation • Mixed TCP flow round-trip times • FPS provides a higher degree of fairness between flows with different RTTs • Traffic Distributions • Evaluated the effects of varying traffic demands • Bottlenecked TCP flows • Evaluated the ability of FPS to correctly allocate rate across aggregates of bottlenecked and unbottlenecked flows

Conclusion • Demands on traditional web-hosting providers and ISPs are likely to shift • Our experiments show that naïve implementations are unable to deliver adequate levels of fairness • Our results demonstrate that it’s possible to recreate the flow behavior that end users expect from a centralized rate limiter
