Managing Cloud Resources: Distributed Rate Limiting

Presentation Transcript


1. Managing Cloud Resources: Distributed Rate Limiting. Alex C. Snoeren, Kevin Webb, Bhanu Chandra Vattikonda, Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, and Kenneth Yocum. Building and Programming the Cloud Workshop, 13 January 2010.

2. Centralized Internet services: hosting with a single physical presence, yet clients are spread across the Internet.

3. Cloud-based services: resources and clients are distributed across the world, and services often incorporate resources from multiple providers (e.g., Windows Live).

4. Resources in the Cloud
• Distributed resource consumption: clients consume resources at multiple sites
• Metered billing is state-of-the-art: the service is “punished” for popularity, and those unable to pay are disconnected
• No control over the resources used to serve increased demand: overprovision and pray
• Application designers typically cannot describe their needs
• Individual service bottlenecks are varied but severe: IOPS, network bandwidth, CPU, RAM, etc.
• Need a way to balance resource demand

5. Two lynchpins for success
• Need a way to control and manage distributed resources as if they were centralized; all current models from the OS scheduling and provisioning literature assume full knowledge and absolute control (this talk focuses specifically on network bandwidth)
• Must be able to efficiently support rapidly evolving application demand: balance resource needs against the hardware realization automatically, without application-designer input (another talk, if you’re interested)

6. Ideal: emulate a single limiter. Make distributed feel centralized; packets should experience the same limiter behavior. (Diagram: source-to-destination pairs traversing the set of limiters.)

7. Engineering tradeoffs: accuracy (how close to K Mbps is delivered; flow-rate fairness) plus responsiveness (how quickly demand shifts are accommodated) versus communication efficiency (how much and how often the rate limiters must communicate).

8. An initial architecture. On packet arrival, each limiter estimates local demand; on an estimate-interval timer, limiters gossip their demand to one another, compute the global demand, set their local allocations, and enforce the limit. (Diagram: four limiters exchanging gossip messages.)
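A minimal sketch of this loop at a single limiter is below. The slides do not specify the allocation policy at this stage, so the demand-proportional rule, the constants, and all names are illustrative assumptions; the gossip transport and the enforcing token bucket are stubbed out.

```python
# Illustrative sketch of the estimate/gossip/allocate loop at one limiter.
# The allocation rule (demand-proportional share) and constants are assumptions.
GLOBAL_LIMIT_BPS = 10_000_000      # assumed 10 Mbps aggregate limit
ESTIMATE_INTERVAL_S = 0.5          # assumed estimate-interval length

class Limiter:
    def __init__(self, n_limiters):
        self.n = n_limiters
        self.local_bytes = 0                      # bytes seen since the last estimate
        self.peer_demand = {}                     # peer id -> last gossiped demand (bits/s)
        self.token_rate = GLOBAL_LIMIT_BPS / n_limiters   # start from an even split

    def on_packet(self, size_bytes):
        # "Estimate local demand": count arriving bytes.
        self.local_bytes += size_bytes

    def on_gossip(self, peer_id, demand_bps):
        # "Gossip": record the demand reported by another limiter.
        self.peer_demand[peer_id] = demand_bps

    def on_estimate_timer(self):
        # "Set allocation": divide the global limit in proportion to demand,
        # then enforce it locally (token bucket not shown) until the next interval.
        local_demand = 8 * self.local_bytes / ESTIMATE_INTERVAL_S   # bits/s
        self.local_bytes = 0
        global_demand = local_demand + sum(self.peer_demand.values())
        if global_demand > 0:
            self.token_rate = GLOBAL_LIMIT_BPS * local_demand / global_demand
        return local_demand        # the value this limiter would gossip to its peers
```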

9. Token bucket limiters: each limiter enforces its allocation with a token bucket whose fill rate is K Mbps; a packet is forwarded only if enough tokens are available.
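A minimal token-bucket sketch (parameter names are illustrative; a real limiter would run in the datapath, not in Python):

```python
# Minimal token-bucket sketch: tokens accumulate at the fill rate (K Mbps)
# up to the bucket depth; a packet is forwarded only if it can pay in tokens.
import time

class TokenBucket:
    def __init__(self, fill_rate_bps, depth_bytes):
        self.rate = fill_rate_bps / 8.0            # fill rate in bytes/second
        self.depth = depth_bytes                   # maximum burst size in bytes
        self.tokens = depth_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True                            # forward the packet
        return False                               # drop (or queue) the packet
```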

10. A global token bucket (GTB)? Limiters exchange demand information (bytes/sec) so that each limiter's bucket reflects the aggregate arrivals. (Diagram: Limiter 1 and Limiter 2 exchanging demand info.)
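The slide does not spell out the mechanism; one plausible reading, assumed here, is that each limiter drains its view of the shared bucket both for its own packets and for the arrivals its peers report. The sketch below builds on the TokenBucket sketch above.

```python
# Sketch of a global token bucket under an assumed interpretation: the bucket
# fills at the global rate, and reported remote arrivals drain it just like
# local packets do.
class GlobalTokenBucket(TokenBucket):              # TokenBucket from the sketch above
    def on_remote_report(self, remote_bytes):
        # Demand info gossiped by a peer (bytes since its last report) removes
        # tokens from this limiter's view of the shared bucket.
        self.tokens -= remote_bytes
```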

11. A baseline experiment: 3 TCP flows arrive at Limiter 1 and 7 TCP flows at Limiter 2, compared against all 10 TCP flows sharing a single token bucket.

12. GTB performance (plots: global token bucket vs. single token bucket for aggregates of 10, 7, and 3 TCP flows). Problem: GTB requires near-instantaneous arrival information.

13. Take 2: Global Random Drop. Limiters send their rate information to, and collect global rate information from, the others. Case 1: the global arrival rate (4 Mbps) is below the global limit (5 Mbps), so forward the packet.

14. Global Random Drop (GRD), case 2: the global arrival rate (6 Mbps) is above the global limit (5 Mbps), so drop with probability excess / global arrival rate = 1/6, the same at all limiters.
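The per-packet GRD decision is small enough to sketch directly, assuming each limiter keeps a gossiped estimate of the global arrival rate:

```python
# GRD forwarding decision: below the limit, always forward; above it, drop with
# probability excess / global arrival rate (e.g., (6 - 5) / 6 = 1/6 in the slide).
import random

def grd_forward(global_rate_bps, global_limit_bps):
    if global_rate_bps <= global_limit_bps:
        return True                                     # Case 1: below the limit
    drop_prob = (global_rate_bps - global_limit_bps) / global_rate_bps
    return random.random() >= drop_prob                 # Case 2: drop probabilistically
```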

15. GRD baseline performance (plots: comparison with a single token bucket for aggregates of 10, 7, and 3 TCP flows). Delivers flow behavior similar to a central limiter.

16. GRD under dynamic arrivals (50-ms estimate interval).

17. Returning to our baseline: 3 TCP flows at Limiter 1 and 7 TCP flows at Limiter 2.

18. Basic idea: flow counting. Goal: provide inter-flow fairness for TCP flows with local token-bucket enforcement; Limiter 1 reports “3 flows” and Limiter 2 reports “7 flows”.

19. Estimating TCP demand: local token rate (limit) = 10 Mbps; Flow A = 5 Mbps and Flow B = 5 Mbps, so the flow count = 2 flows.

20. FPS (Flow Proportional Share) under dynamic arrivals (500-ms estimate interval).

21. Comparing FPS to GRD: FPS (500-ms estimate interval) vs. GRD (50-ms estimate interval). Both are responsive and provide similar utilization, but GRD requires accurate estimates of the global rate at all limiters.

22. Estimating skewed demand: two TCP flows arrive at Limiter 1 and three TCP flows at Limiter 2.

23. Estimating skewed demand: local token rate (limit) = 10 Mbps; Flow A = 8 Mbps, Flow B = 2 Mbps (bottlenecked elsewhere), so flow count ≠ demand. Key insight: use a TCP flow's rate to infer demand.

24. Estimating skewed demand: with the local token rate (limit) = 10 Mbps, Flow A = 8 Mbps, and Flow B = 2 Mbps (bottlenecked elsewhere), the effective flow count = local limit / largest flow's rate = 10 / 8 = 1.25 flows.
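As a tiny worked example of that estimate (the function name is illustrative):

```python
# FPS demand estimate when the largest flow is bottlenecked below the limit:
# effective flows = local limit / largest flow's rate.
def effective_flow_count(local_limit_mbps, largest_flow_mbps):
    return local_limit_mbps / largest_flow_mbps

print(effective_flow_count(10, 8))   # 1.25, the slide's "1.25 flows"
```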

25. FPS example: global limit = 10 Mbps; Limiter 1 has 1.25 flows and Limiter 2 has 3 flows. Set local token rate = global limit × local flow count / total flow count = 10 Mbps × 1.25 / (1.25 + 3) = 2.94 Mbps.
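And the matching allocation step, reproducing the slide's arithmetic (again with illustrative names):

```python
# FPS allocation: local token rate = global limit * local flow count / total flow count.
def fps_local_rate(global_limit_mbps, local_flows, total_flows):
    return global_limit_mbps * local_flows / total_flows

print(fps_local_rate(10, 1.25, 1.25 + 3))   # ~2.94 Mbps for Limiter 1
```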

26. FPS bottleneck example: initially a 3:7 split between 10 un-bottlenecked flows; at 25 s, the 7-flow aggregate is bottlenecked to 2 Mbps; at 45 s, an un-bottlenecked flow arrives, giving a 3:1 split of the remaining 8 Mbps.

27. Real-world constraints
• Resources spent tracking usage are pure overhead: the implementation must be efficient (<3% CPU, using sample and hold; a sketch follows below) with a modest communication budget (<1% of bandwidth)
• The control channel is slow and lossy, so gossip protocols must be extended to tolerate loss (an interesting research problem on its own)
• The nodes themselves may fail or partition; in an asynchronous system you cannot tell the difference, so the mechanism must deal gracefully with both
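For reference, a minimal sample-and-hold sketch; the sampling probability and flow key are illustrative assumptions, not the deployed parameters:

```python
# Sample-and-hold sketch: unseen flows are sampled with a small probability,
# but once a flow is in the table every subsequent packet is counted ("held"),
# keeping per-packet overhead and table size low.
import random

SAMPLE_PROB = 0.01                  # illustrative sampling probability
flow_bytes = {}                     # flow 5-tuple -> byte count

def account(flow_key, size_bytes):
    if flow_key in flow_bytes:
        flow_bytes[flow_key] += size_bytes       # hold: count every later packet
    elif random.random() < SAMPLE_PROB:
        flow_bytes[flow_key] = size_bytes        # sample: start tracking this flow
```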

28. Robust control communication: 7 limiters enforce a 10-Mbps limit while demand fluctuates every 5 seconds between 1 and 100 flows, with varying loss on the control channel.

29. Handling partitions. Failsafe operation: each disconnected group of k limiters enforces k/n of the limit. Ideally: the “Bank-o-mat” problem (a credit/debit scheme). Challenge: group membership with asymmetric partitions.
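The failsafe itself reduces to a one-line calculation; a sketch with assumed parameter names:

```python
# Failsafe during a partition: a reachable group of k out of n limiters
# enforces k/n of the global limit, so the aggregate never exceeds the limit.
def failsafe_limit(global_limit_bps, k_reachable, n_total):
    return global_limit_bps * k_reachable / n_total

print(failsafe_limit(10_000_000, 3, 7))   # a 3-of-7 group enforces ~4.29 Mbps
```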

30. Following PlanetLab demand: Apache web servers on 10 PlanetLab nodes under a 5-Mbps aggregate limit; load shifts over time from 10 nodes to 4.

31. Current limiting options: demands at 10 Apache servers on PlanetLab; when demand shifts to just 4 nodes, capacity is wasted.

32. Applying FPS on PlanetLab.

33. Hierarchical limiting.

34. A sample use case: T = 0, A: 5 flows at L1; T = 55, A: 5 flows at L2; T = 110, B: 5 flows at L1; T = 165, B: 5 flows at L2.

35. Worldwide flow join: 8 nodes split between UCSD and Polish Telecom under a 5-Mbps aggregate limit; a new flow arrives at each limiter every 10 seconds.

36. Worldwide demand shift: the same demand-shift experiment as before; at 50 seconds the Polish Telecom demand disappears, and it reappears at 90 seconds.

37. Where to go from here
• Need to “let go” of full control and make decisions with only a “cloudy” view of actual resource consumption: distinguish between what you know and what you don’t know, operate efficiently when you know you know, and have failsafe options when you know you don’t.
• Moreover, we cannot rely upon application/service designers to understand their resource demands; the system needs to adjust dynamically to shifts.
• We’ve started to manage the demand equation; we’re now focusing on the supply side: custom-tailored resource provisioning.
