Presentation Transcript


  1. Autonomic Mix-Aware Provisioning on Amazon EC2
     Tony Schneider, Derek Bender, Andrew Hahn
     Based on R. Singh et al. [1]

  2. Introduction
     • Our Goals
     • Background
     • Implementation
     • Results
     • Future Work

  3. Main Idea
     • Implement Singh et al.
     • Gain some experience with EC2
     • Implement a functional provisioning system
     • ...and gain some additional insight into SMCS
     • Full disclosure: not quite done yet (but close!)

  4. Why?
     • Implement Singh et al.
       • Deals primarily with workload provisioning
       • A recurring idea
       • A good entry point to the field
       • Promising theory

  5. Why?
     • EC2
       • An unknown quantity
       • Get a sense of its efficacy
       • How it works
       • What works and what doesn’t
       • Cost

  6. Mix-aware provisioning
     • Many systems consider only the absolute number of requests
     • This ignores the type of each request
     • A higher request volume does not always mean higher demand
     • Mix-aware provisioning addresses this oversight

  7. Mix-aware provisioning
     • So what is it?
     • Find the frequencies of the different request types
     • Long versus short requests
     • An increase in long requests requires more resources
     • Use these categories to determine the “true” workload

  8. Mix-aware provisioning
     • (Brief) example:
     • Short request: 1 ms; long request: 90 ms
     • 100 requests/sec
     • 90 short × 1 ms + 10 long × 90 ms = 990 ms
     • vs.
     • 10 short × 1 ms + 90 long × 90 ms = 8110 ms
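The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical, and the service times and request counts are the ones quoted on the slide.

```python
def total_service_demand_ms(mix, service_time_ms):
    """Sum count * per-request service time over every request type."""
    return sum(count * service_time_ms[kind] for kind, count in mix.items())


SERVICE_TIME_MS = {"short": 1, "long": 90}   # per-request times from the slide

mostly_short = {"short": 90, "long": 10}     # 100 requests in one second
mostly_long = {"short": 10, "long": 90}      # same volume, different mix

print(total_service_demand_ms(mostly_short, SERVICE_TIME_MS))  # 990
print(total_service_demand_ms(mostly_long, SERVICE_TIME_MS))   # 8110
```

Both mixes carry 100 requests per second, yet the second one needs roughly eight times the service time, which is the gap a volume-only provisioner cannot see.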

  9. Amazon EC2
     • Cloud computing platform
     • Scales with demand and application size
     • Applicable to a wide array of uses
     • Some jargon:
       • AMI (Amazon Machine Image)
       • Micro, Small, Large, etc. instances

  10. Amazon EC2
      • How it works
        • Select an AMI (or make one)
        • VMs in EC2 automatically boot with the image
        • VMs are hosted across several physical machines
        • Variety of options for provisioning, load balancing, etc.

  11. K-Means Algorithm
      • Used to characterize mixes
      • Groups similarly sized workloads
      • Clusters are used to determine capacity provisioning
      • Quick and dirty way to estimate service classes
      • Problem: the number of clusters must be chosen before running

  12. K-Means Algorithm
      • Step 1: Partition the unique request types into clusters
      • Step 2: Adjust for tail-heavy workloads
      • Step 3: Compute actual cluster means
      • Step 4: Update cluster means

  13. Step 1: Find N Clusters
      • Problem: the number of clusters must be chosen before running
      • Solution: run K-means for every possible K and pick an optimal value
      • The optimal value is based on variance
      • Can be somewhat slow
      • May not be necessary to run for every K
      • K < N

  14. Step 1: Find N Clusters
      • Variance between clusters:
        • Maximized
        • Provides “distinctness” between service classes
      • Variance within clusters:
        • Minimized
        • Ensures like elements are grouped in the correct service class
      • Assumes that variance is a good model for the service class
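A minimal sketch of Step 1, assuming one-dimensional service times and a simple stopping rule. The slides only say that the choice of K is "based on variance", so the rule used here (take the smallest K for which the within-cluster share of the total variance falls below a threshold) is an assumption, as are the function names, the 5% default threshold, and the deterministic seeding.

```python
from statistics import mean


def kmeans_1d(values, k, max_iterations=100):
    """Lloyd's k-means on scalar service times; returns (centroids, clusters)."""
    ordered = sorted(values)
    # Deterministic start (assumption): evenly spaced points of the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


def within_fraction(values, centroids, clusters):
    """Share of the total variance that remains inside the clusters."""
    grand_mean = mean(values)
    within = sum((v - c) ** 2 for c, m in zip(centroids, clusters) for v in m)
    between = sum(len(m) * (c - grand_mean) ** 2 for c, m in zip(centroids, clusters))
    total = within + between
    return within / total if total else 0.0


def choose_k(values, max_within_fraction=0.05):
    """Smallest K < N (N = number of unique values) meeting the threshold."""
    n_unique = len(set(values))
    best_k = 1
    for k in range(1, max(n_unique, 2)):
        centroids, clusters = kmeans_1d(values, k)
        best_k = k
        if within_fraction(values, centroids, clusters) <= max_within_fraction:
            break  # larger K values are not tried
    return best_k


service_times_ms = [1, 1, 2, 2, 3, 80, 85, 90, 95, 100]  # made-up sample
print(choose_k(service_times_ms))  # 2
```

Stopping at the first acceptable K reflects the note that running every K may not be necessary. Simply maximizing the between/within ratio would always pick the largest K, which is why some threshold or penalty is needed.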

  15. Step 2: Redistribute Loads
      • Tail ends of clusters are often quite big
      • Caused by many infrequent requests with large service times
      • A change in their frequency shifts the cluster mean too much
      • Alleviated naïvely
      • Force the kth cluster to have at most n service classes

  16. Step 3: Compute Means
      • Use all service times in a cluster
      • Not just the unique ones!
      • Gives a more accurate model of the cluster mean
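The difference between the two means is easy to see on a small made-up sample. In this sketch `cluster_of` is a hypothetical mapping from each unique service time to its cluster id, standing in for whatever the clustering step produced.

```python
from collections import defaultdict
from statistics import mean


def cluster_means(observed_times, cluster_of):
    """Mean service time per cluster, counting every observation."""
    members = defaultdict(list)
    for t in observed_times:
        members[cluster_of[t]].append(t)  # duplicates are kept on purpose
    return {cid: mean(ts) for cid, ts in members.items()}


observed = [1, 1, 1, 1, 2, 90, 100, 100]      # made-up service times (ms)
cluster_of = {1: 0, 2: 0, 90: 1, 100: 1}      # hypothetical clustering result
means = cluster_means(observed, cluster_of)
```

In cluster 0 the unique values 1 and 2 average 1.5 ms, while the mean over all five observations is 1.2 ms, because the 1 ms request is four times as frequent.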

  17. Step 4: Recompute Means
      • The system may begin to differ from its starting state
      • Need to recompute the means (cluster centroids) on occasion
      • Changes are not drastic (5%)
      • Generally not necessary

  18. K-Means Implementation
      • Python
      • Runs once at the beginning of the server's life
      • Successive runs simply generate the new cluster means
      • The number of clusters always stays constant
      • Works fairly well (but not perfectly)

  19. Load Balancer
      • AWS Elastic Load Balancing
        • $0.025 per hour
        • $0.008 per GB
        • Easy setup
        • But not capable of advanced logging

  20. Load Balancer
      • EC2 node running HAProxy
        • $0.02 per hour (or free!)
        • $0.150 per GB (first GB free!)
        • Non-trivial configuration
        • Advanced logging features
          • syslog

  21. HAProxy
      • HAProxy chosen for more control
      • Configuration
        • Log all connections to syslog
        • Round-robin DNS balancing
        • Leave the stats page on for monitoring
        • haproxy.cfg can be hot-reloaded

  22. Web Server
      • Apache web server running on all nodes
      • phusion_passenger (modrails.com)
        • Serves Ruby applications in Apache
      • Sinatra (sinatrarb.com)
        • Sinatra is a Ruby DSL for quickly creating web applications:

            require 'sinatra'

            get '/' do
              'Hello world!'
            end

  23. Web Server
      • Short request:

          get '/short' do
            'Hello world!'
          end

      • Long request:

          require 'bcrypt'

          get '/long' do
            # Ten bcrypt hashes make the request deliberately slow.
            10.times do
              BCrypt::Password.create("foo")
            end
            'done!'
          end

  24. Web Server
      • BCrypt::Password.create("foo")
      • BCrypt
        • “Basically, it's slow as hell.”
        • Password hash built on the Blowfish cipher
        • Hashing ten times takes ~800 ms

  25. Integration
      • Putting it all together:
      • EC2 launches our AMI
      • Pre-loaded with HAProxy and the K-means algorithm
      • Designated the “master” node
      • Waits for 200 requests, then runs K-means to find the typical mix

  26. Integration
      • After the cluster centroids are returned:
      • Ruby script (front end) takes over
      • Tracks...
        • Inter-arrival time
        • Request rate
        • Number of requests per cluster
        • Service times
      • Information is passed to the provisioning algorithm to calculate the number of servers
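The tracked quantities can be derived from one list of observed requests. The sketch below is in Python for illustration only (the slides say the actual front end is a Ruby script); the record layout `(arrival_time_s, cluster_id, service_time_s)` and all names are assumptions.

```python
from collections import Counter
from statistics import mean, pvariance


def summarize(requests):
    """requests: (arrival_time_s, cluster_id, service_time_s), sorted by arrival."""
    arrivals = [t for t, _, _ in requests]
    gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
    services = [s for _, _, s in requests]
    window = arrivals[-1] - arrivals[0]
    return {
        "request_rate": (len(requests) - 1) / window,   # requests per second
        "interarrival_variance": pvariance(gaps),
        "mean_service_time": mean(services),
        "service_variance": pvariance(services),
        "requests_per_cluster": Counter(c for _, c, _ in requests),
    }


sample = [                      # made-up observations
    (0.0, 0, 0.001),
    (0.25, 1, 0.080),
    (1.0, 0, 0.001),
    (1.5, 0, 0.001),
    (2.0, 1, 0.080),
]
stats = summarize(sample)
```

The request rate is computed as the reciprocal of the mean inter-arrival gap, so it stays consistent with the inter-arrival variance taken from the same gaps.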

  27. Integration
      • Data parsing
        • Metrics are difficult to extract
        • /haproxy?stats is helpful but doesn’t have everything
          • It exports to a nice CSV file!
        • syslog logs all HAProxy requests, but extensive data parsing is needed
        • Acquiring all the metrics isn’t instantaneous
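As a rough illustration of the parsing involved, the sketch below pulls the request path and timers out of syslog lines. It assumes HAProxy is logging in its HTTP log format (`option httplog`), which the slides do not state; the sample lines are made up to follow that layout, and the host, frontend, backend, and server names in them are hypothetical. The server response timer (`Tr`) is used as a rough stand-in for service time.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed layout: "[accept date] frontend backend/server Tq/Tw/Tc/Tr/Tt status ..."
LOG_RE = re.compile(
    r'\[(?P<accepted>[^\]]+)\] (?P<frontend>\S+) (?P<backend>[^/\s]+)/(?P<server>\S+) '
    r'(?P<tq>-?\d+)/(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tr>-?\d+)/\+?(?P<tt>-?\d+) '
    r'(?P<status>\d{3}) .*"(?P<method>\S+) (?P<path>\S+)'
)


def parse_line(line):
    """Return the fields of interest, or None if the line does not match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {
        "path": m["path"],
        "server": m["server"],
        "status": int(m["status"]),
        "response_ms": int(m["tr"]),
        "total_ms": int(m["tt"]),
    }


def mean_response_by_path(lines):
    """Average server response time (ms) per request path."""
    times = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            times[record["path"]].append(record["response_ms"])
    return {path: mean(values) for path, values in times.items()}


SAMPLE_LINES = [  # made-up lines in the assumed format
    'Dec  1 10:00:00 master haproxy[1234]: 10.0.0.5:40001 [01/Dec/2011:10:00:00.120] '
    'web app/node1 0/0/1/2/3 200 150 - - ---- 1/1/0/0/0 0/0 "GET /short HTTP/1.1"',
    'Dec  1 10:00:01 master haproxy[1234]: 10.0.0.5:40002 [01/Dec/2011:10:00:00.900] '
    'web app/node2 0/0/1/812/813 200 140 - - ---- 2/2/1/1/0 0/0 "GET /long HTTP/1.1"',
    'Dec  1 10:00:02 master haproxy[1234]: 10.0.0.6:40003 [01/Dec/2011:10:00:01.500] '
    'web app/node1 0/0/0/790/790 200 140 - - ---- 1/1/0/0/0 0/0 "GET /long HTTP/1.1"',
]
averages = mean_response_by_path(SAMPLE_LINES)
```

Lines that do not match (for example non-HTTP or truncated entries) are skipped rather than raising, since robustness of the parsing is one of the difficulties noted above.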

  28. Integration
      • Ruby integration with AWS
      • Fog (Ruby gem, library)
        • http://fog.io
        • Allows full access to AWS through Ruby:

            server = AWS.servers.create(:image_id => 'ami-5ee70037')

  29. Provisioning
      1. Estimate the arrival rate per cluster
      • Use the previous sampling period's request rate and percentage of requests
      • $\lambda_t$ = request rate of the entire system during period $t$
      • $P_i[t]$ = fraction of requests that fall in cluster $i$ during period $t$
      • $\lambda_i[t-1] = P_i[t-1] \cdot \lambda_{t-1}$
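A minimal sketch of this step, assuming the per-cluster rate is the cluster's share of requests multiplied by the overall request rate of the previous sampling period. The function name and the sample numbers are hypothetical.

```python
from collections import Counter


def estimate_cluster_rates(prev_cluster_ids, prev_total_rate):
    """Per-cluster arrival rate: share of requests times the overall rate."""
    counts = Counter(prev_cluster_ids)
    total = sum(counts.values())
    return {cid: n * prev_total_rate / total for cid, n in counts.items()}


# Previous period (made up): 90 requests in cluster 0, 10 in cluster 1,
# at an overall rate of 100 requests/sec.
previous_ids = [0] * 90 + [1] * 10
rates = estimate_cluster_rates(previous_ids, 100.0)
```

The per-cluster rates always sum back to the overall rate, which is a quick sanity check on the parsed data.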

  30. Provisioning
      2. Predict capacity
      • Probabilistically determine the waiting time in the queue
      • $\sigma_a^2$ = inter-arrival time variance
      • $\sigma_b^2$ = service time variance
      • $\lambda$ = request arrival rate
      • $X$ = average service time
      • $y$ = SLA guarantee (seconds)
      • Max rate of requests per server:

  31. Provisioning
      2.5. How many servers do we need?
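The formulas for steps 2 and 2.5 are not shown above. One way to connect the symbols defined in step 2 is Kingman's upper bound on G/G/1 waiting time, which gives a maximum per-server rate of $\lambda_{max} = \left[ X + \frac{\sigma_a^2 + \sigma_b^2}{2\,(y - X)} \right]^{-1}$. Whether this is the exact expression the authors used is an assumption; the sketch below, with hypothetical function names and made-up inputs, simply applies that bound and then divides the arrival rate by it.

```python
import math


def max_rate_per_server(mean_service, var_interarrival, var_service, sla):
    """Largest arrival rate (req/s) one server can take within the SLA.

    Assumes Kingman's G/G/1 bound; all times are in seconds.
    """
    if sla <= mean_service:
        raise ValueError("SLA must exceed the mean service time")
    waiting_term = (var_interarrival + var_service) / (2.0 * (sla - mean_service))
    return 1.0 / (mean_service + waiting_term)


def servers_needed(arrival_rate, capacity_per_server):
    """Round up: a fraction of a server still means one more instance."""
    return math.ceil(arrival_rate / capacity_per_server)


# Made-up inputs: X = 80 ms, sigma_a^2 = 0.0004, sigma_b^2 = 0.0016, y = 0.5 s.
capacity = max_rate_per_server(0.08, 0.0004, 0.0016, 0.5)
count = servers_needed(100.0, capacity)   # 100 requests/sec overall
```

With these inputs each server can take a little over 12 requests per second, so 100 requests per second needs 9 servers. The same two functions could be applied per cluster, using each cluster's own arrival rate and service statistics.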

  32. Provisioning
      3. Applying this configuration
      • Currently adding only a single type of Amazon machine
        • Larger machines are more expensive
        • Would be more difficult to configure
      • Use a basic round-robin assignment system
      • EC2 offers its own load balancer

  33. Results
      • Unfortunately, nothing concrete yet...
      • However:
        • Revised K-means works very well
        • Converges in fewer than 100 iterations (generally)
        • Variance may not be the best predictor

  34. Results
      • What’s the hold-up?
      • Log parsing is significantly harder than we anticipated
        • Getting the raw data is tedious
        • Hard to acquire sensibly
      • Robustness is proving hard to guarantee
      • Integration is difficult because the architecture was hard to plan in advance
      • EC2 free tier is for new customers only

  35. Results
      • What’s the hold-up?
      • The amount of sysadmin footwork is enormous and very tedious
      • Constrained by always trying to find the free route
        • The Amazon AMI is the only free one
        • Unfamiliarity with the Amazon Linux distro

  36. Results
      • EC2
        • Thus far, feelings are mixed
        • Works well enough
        • Easily customizable
        • AMIs are readily available for use
        • Seems reliable (!)
        • However, everything comes with a price

  37. Future Work
      • Finish?
      • Current task: finish the provisioning script and integrate all the components
      • Test with demands and rates of various sizes
      • Different machine sizes?

  38. Future Work
      • Questions?
