
Bridging the Tenant-Provider Gap in Cloud Services

Virajith Jalaparti, Hitesh Ballani, Paolo Costa, Thomas Karagiannis, Ant Rowstron


Presentation Transcript


  1. Bridging the Tenant-Provider Gap in Cloud Services • Virajith Jalaparti, Hitesh Ballani, Paolo Costa, Thomas Karagiannis, Ant Rowstron

  2. Today’s Interface to the Cloud • Resource-centric Interface • “I want 100 small VMs” • Per-VM Per-Hour pricing • E.g.: $0.08 per hour in Amazon EC2 • Simple but problematic!

  3. Using the Resource-centric Interface • Unpredictable/Variable Performance and Costs • [Figure: the User's Job is done in T hrs on a Private Cluster of 40 machines; on 40 VMs from the Cloud Provider the labels read "T hrs, $40T" and "2T hrs, $80T!!"]

  4. Proposal: Job-centric Interface • Provider determines resources to use to meet tenants’ goal • VMs, Network Bandwidth etc. • Tenant specifies high-level goals they care about • Completion Time, Cost to run a job etc. • [Figure: User asks the Cloud Provider to "Finish in T hrs"; with Dedicated Resources the job is "Done in T hrs!!"]

  5. Proposal: Job-centric Interface • Guaranteed performance for tenants

  6. Proposal: Job-centric Interface • Incentive for provider? • Exploit multi-resource tradeoff • Increases Goodput/Revenue • [Figure: Resource Trade-off Curve for LinkGraph in 300 sec, with marked points <N,B> = <10, 150> and <N,B> = <20, 100>]

  7. Outline • Motivation • Job-centric Interface • Multi-resource tradeoff • Bazaar: A Job-centric Cloud Framework • Performance Prediction • Resource Selection • Evaluation • Bazaar: Extensions and Opportunities • Conclusion

  8. Bazaar: A Job-centric Cloud Framework • Focus on MapReduce Applications • Focus on two resources: Compute (N) and Bandwidth (B) • Notation: <N,B> denotes resources allocated • [Figure: the User gives Bazaar a Job Specification and a Completion Time; Performance Prediction yields resource tuples <N1,B1>, <N2,B2>, …, <Nk,Bk>; Resource Selection uses the Datacenter State to choose one <N,B>]

  9. Performance Prediction • Well-studied area • Run-time profiling, Static analysis, Simulations • Bazaar requirements • Fast prediction (trades off with accuracy) • Account for Network along with Compute • Not addressed by Jockey, MRPerf, Aria • Dedicated N and B make the problem tractable

  10. MRCuTE: Performance Prediction in Bazaar • Analytical Modeling + Profiling based approach • Inputs: Resource Parameters <N, B> and Job Specifications: Program (P), Input data (I), Sample Data (Is) • Program (P) profiled using sample data (Is) on one machine • Completion Time determined by • Input data size • Rate of progress • [Figure: the Profiler feeds job-specific parameters into the MRCuTE Analytical Model, which outputs Completion Time; job timeline with Map Phase, Shuffle Phase, Reduce Phase]

  11. Resource Prediction • MRCuTE(P, I, Is, N, B) → Completion Time • N = MRCuTE⁻¹(P, I, Is, Completion Time, B), with user-specified inputs • Provider can determine multiple <N, B> resource tuples: <N1,B1>, <N2,B2>, <N3,B3>, … • Which <N,B> to use?
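The inversion on this slide can be illustrated with a minimal sketch. It does not reproduce MRCuTE: `toy_completion_time` is a hypothetical stand-in model with made-up constants, and B is assumed to be the bandwidth available to each VM. The sketch only shows how a predictor that gets faster with more VMs can be inverted to produce one candidate <N, B> tuple per bandwidth level.

```python
def toy_completion_time(n_vms, bw_mbps, input_gb=200.0):
    """Hypothetical stand-in for MRCuTE: predicted seconds for <N, B>."""
    per_vm_gb = input_gb / n_vms                # data handled by each VM
    compute_s = per_vm_gb * 20.0                # assumed 20 s of map+reduce work per GB
    shuffle_s = per_vm_gb * 8000.0 / bw_mbps    # 1 GB = 8000 Mb sent at B Mbps
    return compute_s + shuffle_s


def min_vms_for(deadline_s, bw_mbps, max_vms=1000):
    """Invert the model: smallest N that meets the deadline at bandwidth B."""
    for n_vms in range(1, max_vms + 1):
        if toy_completion_time(n_vms, bw_mbps) <= deadline_s:
            return n_vms
    return None  # deadline not reachable within max_vms


def candidate_tuples(deadline_s, bandwidths):
    """One <N, B> candidate per bandwidth level that can meet the deadline."""
    candidates = []
    for bw_mbps in bandwidths:
        n_vms = min_vms_for(deadline_s, bw_mbps)
        if n_vms is not None:
            candidates.append((n_vms, bw_mbps))
    return candidates


if __name__ == "__main__":
    for n_vms, bw_mbps in candidate_tuples(300.0, [100, 200, 400]):
        print(f"<N,B> = <{n_vms}, {bw_mbps}>")
```

Under these assumed constants, more bandwidth lets the same deadline be met with fewer VMs, which is the trade-off the provider chooses among.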

  12. Resource Selection • Which <N,B> tuple maximizes the provider’s ability to accept future requests? • Increases Goodput/Revenue

  13. Resource Selection: Example • R1 = <3 VMs, 500Mbps> or R2 = <6 VMs, 200Mbps> • Select the resource tuple leading to better goodput • Greedy packing allocation algorithm: Oktopus [Sigcomm’11] • Replica 1 will accept more requests than Replica 2 • [Figure: Replica 1 and Replica 2, each a TOR switch over 4 physical machines with 2 VM slots each; a further label reads <4 VMs, 400Mbps>]

  14. Resource Selection • Similar to Multi-dimensional Bin Packing • Heuristic: Minimize Resource Imbalance Metric • Select <N,B> which balances the remaining capacity across resources
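The balancing heuristic can be sketched as follows. The slides do not define the Resource Imbalance Metric, so the formula here is an assumption: the absolute gap between the leftover fraction of VM slots and the leftover fraction of bandwidth. The sketch also ignores topology, treats bandwidth demand as N × B in aggregate, and assumes a hypothetical 1000 Mbps per machine; only the 8 VM slots and the two tuples come from the example slide.

```python
def imbalance_after(n_vms, bw_per_vm, free_slots, free_bw, total_slots, total_bw):
    """Assumed metric: gap between leftover fractions of slots and bandwidth.

    Returns None when the tuple does not fit in the remaining capacity.
    """
    slots_left = free_slots - n_vms
    bw_left = free_bw - n_vms * bw_per_vm
    if slots_left < 0 or bw_left < 0:
        return None
    return abs(slots_left / total_slots - bw_left / total_bw)


def select_tuple(candidates, free_slots, free_bw, total_slots, total_bw):
    """Pick the feasible <N, B> that leaves the two resources most balanced."""
    best, best_score = None, None
    for n_vms, bw_per_vm in candidates:
        score = imbalance_after(n_vms, bw_per_vm, free_slots, free_bw,
                                total_slots, total_bw)
        if score is not None and (best_score is None or score < best_score):
            best, best_score = (n_vms, bw_per_vm), score
    return best


if __name__ == "__main__":
    # 4 machines x 2 VM slots (from the example slide); bandwidth is assumed.
    total_slots, total_bw = 8, 4 * 1000
    candidates = [(3, 500), (6, 200)]  # R1 and R2
    print(select_tuple(candidates, total_slots, total_bw, total_slots, total_bw))
```

With these assumed capacities, R1 leaves 5 of 8 slots and 2500 of 4000 Mbps free (both 62.5%), while R2 leaves 25% of slots and 70% of bandwidth, so the sketch picks R1.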

  15. Outline • Motivation • Job-centric Interface • Resource Malleability • Bazaar: A Job-centric Cloud Framework • Performance Prediction • Resource Selection • Evaluation • Bazaar: Extensions and Opportunities • Conclusion

  16. Evaluation • MRCuTE: Prediction accuracy • Benefits of Bazaar • Testbed Deployment • Simulations

  17. MRCuTE: Prediction Accuracy • Setup: Hadoop on 35-node Emulab cluster • Sort with 200GB of random data • Average prediction error = 8.9%

  18. MRCuTE: Prediction Accuracy • 5 MapReduce Jobs • Average Error < 12% • Overcome prediction inaccuracies using slack

  19. Evaluation: Benefits of Bazaar • Metrics • Fraction of rejected/accepted requests • Datacenter Goodput • Strategies • Bazaar: Select <N,B> using resource imbalance metric • Baseline: Select <N,B> randomly • Workload • Poisson job arrival process with a target arrival rate
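A Poisson job arrival process like the one named in the workload can be generated by drawing exponentially distributed gaps between consecutive arrivals. The sketch below is a generic illustration using Python's standard library, not the evaluation's actual workload generator; the rate, horizon, and seed are arbitrary choices.

```python
import random


def poisson_arrivals(rate_per_s, horizon_s, seed=0):
    """Arrival times (seconds) of a Poisson process with the given mean rate."""
    rng = random.Random(seed)       # seeded so a run can be repeated
    now = 0.0
    arrivals = []
    while True:
        # Gaps between Poisson arrivals are exponential with mean 1 / rate.
        now += rng.expovariate(rate_per_s)
        if now > horizon_s:
            return arrivals
        arrivals.append(now)


if __name__ == "__main__":
    times = poisson_arrivals(rate_per_s=0.5, horizon_s=1000.0, seed=1)
    print(f"{len(times)} jobs arrived; first at {times[0]:.2f} s")
```

Raising `rate_per_s` makes requests arrive faster, which is how such a generator would be pointed at a higher target load.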

  20. Bazaar: Testbed Deployment • Working prototype on 26-node Emulab cluster • Workload: 100 Sort Jobs • [Figure annotations: "15.5% more", "11.4% more"]

  21. Bazaar: Simulations • Datacenter scale: 16,000 machines • Cross-validated using testbed • Bazaar is ~50% better than baseline • Operational occupancy range for services like Amazon EC2 is 70-80% • [Figure annotation: "Requests arrive faster"]

  22. Outline • Motivation • Job-centric Interface • Resource Malleability • Bazaar: A Job-centric Cloud Framework • Performance Prediction • Resource Selection • Evaluation • Bazaar: Extensions and Opportunities • Conclusion

  23. Bazaar-T: An extension of Bazaar • Bazaar trades off N and B • Finish jobs “on time” • Bazaar-T: Exploits flexibility with time • Finish jobs “before time” • More resources available in future • Extend resource imbalance metric to time domain • SOCC 2012 - Bazaar

  24. Bazaar-T: More Flexibility, More Gains • Bazaar vs. Bazaar-T • Bazaar-T has 10-20% more goodput than Bazaar

  25. Bazaar: Pricing implications • Today: Resource-based pricing • E.g.: Using 20 VMs for 4 hrs costs $80 • Extendable to multiple resources • No incentive for provider to finish in time • Bazaar enables job-based pricing • E.g.: Finish Sort over 200GB in 4 hrs costs $100 • Tenants pay based on job characteristics • Aligns tenant and provider interests
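The difference between the two pricing schemes can be worked through with the slide's own figures. The $1 per VM-hour rate is derived from the slide's example ($80 for 20 VMs over 4 hrs); the 8-hour overrun is a hypothetical scenario added only to illustrate who bears the cost of a slow run.

```python
def resource_based_cost(n_vms, hours, rate_per_vm_hour):
    """Tenant pays for every VM-hour consumed, however long the job takes."""
    return n_vms * hours * rate_per_vm_hour


# Rate implied by the slide's example: $80 for 20 VMs over 4 hours.
RATE_PER_VM_HOUR = 80 / (20 * 4)

# Job-based price from the slide: Sort over 200GB finished in 4 hrs.
JOB_PRICE = 100.0

on_time_cost = resource_based_cost(20, 4, RATE_PER_VM_HOUR)
# Hypothetical overrun: the same job takes twice as long on the same VMs.
overrun_cost = resource_based_cost(20, 8, RATE_PER_VM_HOUR)

if __name__ == "__main__":
    print(f"resource-based, on time: ${on_time_cost:.0f}")
    print(f"resource-based, overrun: ${overrun_cost:.0f}")
    print(f"job-based, either way:   ${JOB_PRICE:.0f}")
```

Under resource-based pricing the overrun doubles the tenant's bill, whereas a fixed job price leaves the tenant's cost unchanged and gives the provider a reason to finish on time.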

  26. Conclusion • Bazaar: Job-centric Framework for MapReduce • Win-win situation for provider and tenant • Tenants get predictable performance • Providers get increased revenue • Provides new avenues for pricing

  27. Thank You!

  28. Back-up Slides

  29. Related Work • Performance Prediction • MRPerf [Mascots 2009], Mumak • Detailed Simulations • Elastisizer [SOCC 2011] • Detailed Modeling of MapReduce • SLOs • Jockey [Eurosys 2012] • Simulations; Runtime monitoring to meet deadline • Conductor [NSDI 2012] • Solves optimization problem to meet goals • Proteus [Sigcomm 2012] • Time-varying network reservations • Aria [ICAC 2011] • Profiling and modeling

  30. Hadoop Jobs Details

  31. MRCuTE: Profiling Time

  32. MRCuTE: Accounting for heterogeneity

  33. Addressing Skew: Slack • % of Late Jobs vs. Slack • % of Rejected Requests vs. Slack

  34. Goodput vs. Oversubscription

  35. Rejected Requests vs. Mean BW

  36. Rejected requests vs. Occupancy

  37. Bazaar vs. Fair Sharing
