Overprovisioning for Performance Consistency in Grids

Nezih Yigitbasi and Dick Epema

Parallel and Distributed Systems Group

Delft University of Technology

http://guardg.st.ewi.tudelft.nl/

The Problem: Performance inconsistency in grids
  • Inconsistent performance is common in grids
    • bursty workloads
    • variable background loads
    • high rate of failures
    • highly dynamic & heterogeneous environment

How can we provide consistent performance in grids?

[Chart: makespans of a Bag-of-Tasks with 128 tasks submitted every 15 minutes, varying by up to ~70X]

Our goals

GOAL-1

Realistic performance evaluation of static and dynamic overprovisioning strategies

(system’s perspective)

GOAL-2

Dynamically determine the overprovisioning factor (K) for user-specified performance requirements

(user’s perspective)

Outline

  • Overprovisioning Strategies
  • Experimental Setup
  • Results
  • Dynamically Determining K
  • Conclusions

Overprovisioning (I)
  • Increasing the system capacity to provide better, and in particular, consistent performance even under variable workloads and unexpected demands

Pros

    • simple
    • obviates the need for complex algorithms
    • easy to deploy & maintain

Cons

    • cost-ineffective
    • workloads may evolve (e.g., increasing user base)
    • results in lowly utilized systems
Overprovisioning (II)
  • Large overprovisioning factors (K) are common in modern systems (estimated server counts):
    • Google: 450,000 (2005)
    • Microsoft: 218,000 (mid-2008)
    • Facebook: 10,000+ (2009)
  • A preferred way of providing performance guarantees
    • typical data center utilization is no more than 15-50%
    • telecommunication systems average ~30% utilization

L. A. Barroso and U. Hölzle, The Case for Energy-Proportional Computing, IEEE Computer, December 2007.


Overprovisioning strategies

[Diagram: capacity vs. demand over time; static overprovisioning leaves wasted capacity when demand is low, while dynamic overprovisioning tracks the demand]

1. Static

    • Largest: add the extra capacity to the largest cluster
    • All: add the extra capacity to all clusters
    • Number: increase the number of clusters
    • Where should we deploy the resources?
    • Does it make any difference?

2. Dynamic

    • Dynamic overprovisioning
      • a.k.a. auto-scaling
      • low/high utilization thresholds for releasing/acquiring resources
  • Given K, it is straightforward to determine the number of processors for a strategy (a sizing and auto-scaling sketch follows below)

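To make the two kinds of strategies concrete, below is a minimal Python sketch (not from the talk). The 60%/70% thresholds are the ones used in the experiments; the cluster sizes, the new-cluster size, and the scaling step are illustrative assumptions.

```python
# Minimal sketch (not from the talk): sizing the static strategies for a given
# overprovisioning factor K, and one threshold-based acquire/release decision
# for the dynamic strategy (auto-scaling). Cluster sizes, the new-cluster size,
# and the step size are assumptions; the 60%/70% thresholds follow the slides.

def extra_processors(cluster_sizes, K):
    """Extra processors needed so that total capacity becomes K times the original."""
    return int(round((K - 1.0) * sum(cluster_sizes)))

def static_largest(cluster_sizes, K):
    """Largest: add all extra capacity to the largest cluster."""
    sizes = list(cluster_sizes)
    sizes[sizes.index(max(sizes))] += extra_processors(cluster_sizes, K)
    return sizes

def static_all(cluster_sizes, K):
    """All: scale every cluster by K."""
    return [int(round(K * s)) for s in cluster_sizes]

def static_number(cluster_sizes, K, new_cluster_size=64):
    """Number: deploy the extra capacity as additional clusters."""
    n_new, remainder = divmod(extra_processors(cluster_sizes, K), new_cluster_size)
    return list(cluster_sizes) + [new_cluster_size] * n_new + ([remainder] if remainder else [])

def dynamic_step(capacity, base_capacity, demand, low=0.60, high=0.70, step=32):
    """One auto-scaling decision: acquire above the high utilization threshold,
    release below the low threshold, never dropping below the base capacity."""
    utilization = demand / capacity
    if utilization > high:
        return capacity + step      # acquire resources
    if utilization < low and capacity - step >= base_capacity:
        return capacity - step      # release resources
    return capacity
```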

Outline

  • Overprovisioning Strategies
  • Experimental Setup
  • Results
  • Dynamically Determining K
  • Conclusions

System model

  • DAS-3 multi-cluster grid
    • Global Resource Managers (GRM) interacting with Local Resource Managers (LRM)

[Diagram: the GRM holds a global queue of global jobs and dispatches them to the LRMs of the clusters, each of which has a local queue that also receives local jobs]
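Purely to make this model concrete (this is not the simulator used for the experiments), here is a minimal sketch of the GRM/LRM structure; the class and method names are assumptions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed sketch of the system model: a Global Resource Manager (GRM) keeps a
# global queue of global jobs and dispatches them to Local Resource Managers
# (LRMs), whose local queues also receive locally submitted jobs.

@dataclass
class LRM:
    name: str
    processors: int
    local_queue: deque = field(default_factory=deque)   # local + dispatched global jobs

    def submit_local(self, job):
        self.local_queue.append(job)

@dataclass
class GRM:
    clusters: List[LRM]
    global_queue: deque = field(default_factory=deque)  # globally submitted jobs

    def submit_global(self, job):
        self.global_queue.append(job)

    def dispatch(self, choose_cluster: Callable[[List[LRM], object], LRM]):
        """Send each queued global job to the LRM picked by a scheduling policy."""
        while self.global_queue:
            job = self.global_queue.popleft()
            choose_cluster(self.clusters, job).local_queue.append(job)
```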

Workload
  • Realistic workloads consisting of Bag-of-Tasks (BoT)
  • Simulations using 10 workloads with 80% load
    • each workload has ~1650 BoTs and ~10K tasks
    • the duration of each workload is between 1 day and 1 week
  • Real background load trace
    • DAS-3 trace of June’08 (http://gwa.ewi.tudelft.nl/)

(Distribution parameters are determined after base-two log transformation)
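As an illustration of what fitting parameters "after base-two log transformation" amounts to, here is a small sketch; the normal distribution and the mu/sigma values are placeholders, not the workload model actually used.

```python
import random

# Illustration only: a quantity whose base-two logarithm follows a fitted
# distribution. The distribution family (normal) and the parameters below are
# placeholder assumptions, not the ones used in the experiments.

def sample_log2_normal(mu, sigma, rng=random):
    """Draw a value x such that log2(x) ~ Normal(mu, sigma)."""
    return 2.0 ** rng.gauss(mu, sigma)

# e.g., ten task runtimes in seconds with assumed parameters
runtimes = [sample_log2_normal(mu=7.0, sigma=1.5) for _ in range(10)]
```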

Scheduling model
  • We consider the following BoT scheduling policies
    • Static Scheduling
      • statically partitions tasks across clusters
    • Dynamic Scheduling
      • takes cluster load into account
      • Dynamic Per Task Scheduling
      • Dynamic Per BoT Scheduling
    • Prediction-based Scheduling
      • predicts a task's runtime as the average of its last two runtimes
      • sends the task to the cluster that is predicted to lead to the earliest completion time (ECT); see the sketch below
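A minimal sketch of the prediction-based (ECT) policy follows. Using the average of the last two runtimes is taken from the slide; the way a cluster's earliest start time is estimated here (queued work spread over its processors) is an assumption.

```python
# Sketch of prediction-based (ECT) scheduling: predict the task runtime as the
# average of the last two observed runtimes and pick the cluster with the
# earliest predicted completion time. The availability estimate is assumed.

def predict_runtime(history, default=600.0):
    """Average of the last two runtimes; fall back to a default if none are known."""
    return sum(history[-2:]) / len(history[-2:]) if history else default

def ect_cluster(clusters, history, now=0.0):
    """Return the cluster with the earliest predicted completion time."""
    runtime = predict_runtime(history)
    best, best_ect = None, float("inf")
    for c in clusters:
        start = now + c["queued_work"] / c["processors"]   # assumed start-time estimate
        if start + runtime < best_ect:
            best, best_ect = c, start + runtime
    return best

clusters = [
    {"name": "c1", "processors": 64, "queued_work": 3600.0},
    {"name": "c2", "processors": 32, "queued_work": 600.0},
]
print(ect_cluster(clusters, history=[500.0, 700.0])["name"])   # -> c2
```
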
Methodology
  • Compare the overprovisioned system with the initial system (NO, no overprovisioning)
  • For the Dynamic strategy
    • minimum/maximum resource acquisition times of 69 s/129 s and release times of 18 s/23 s
    • 60%/70% for the low/high utilization thresholds
    • K varies over time, so for a fair comparison we keep it within a ±10% range
Traditional performance metrics

Makespan of a BoT

Difference between the earliest time of submission of any of its tasks, and the latest time of completion of any of its tasks

Normalized Schedule Length (NSL) of a BoT

Ratio of its makespan to the sum of the runtimes of its tasks on a reference processor (slowdown)

[Timeline: the makespan of a BoT runs from the submission of its first task to the completion of its last task]
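Restating the two definitions above in symbols, for a BoT $B$ whose tasks $t$ have submission times $S_t$, completion times $C_t$, and runtimes $r_t$ on the reference processor:

```latex
\mathrm{makespan}(B) = \max_{t \in B} C_t \;-\; \min_{t \in B} S_t,
\qquad
\mathrm{NSL}(B) = \frac{\mathrm{makespan}(B)}{\sum_{t \in B} r_t}
```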

Consistency metrics
  • We define two metrics, Cd and Cs, to capture the notion of consistency along two dimensions
  • The system gets more consistent as Cd gets closer to 1 and Cs gets closer to 0
  • A tighter range of the NSL is a sign of better consistency
Outline

  • Overprovisioning Strategies
  • Experimental Setup
  • Results
  • Dynamically Determining K
  • Conclusions

Performance of scheduling policies

  • Dynamic Per Task is the best
  • ECT is the worst


Performance of different strategies

[Charts: performance consistency for the different strategies and for different overprovisioning factors (K)]

  • Consistency obtained with overprovisioning is much better than in the initial system (NO)
  • Static strategies provide similar performance (only K matters)
    • All and Largest are viable alternatives to Number, as Number increases the administration, installation, and maintenance costs
  • The Dynamic strategy has better performance than the static strategies
  • K = 2.5 is the critical value

Cost of different strategies

  • We use CPU-Hours as the cost metric
    • the time a processor is used, in hours
    • partial instance-hours are rounded up to one hour, similar to the Amazon EC2 on-demand instance pricing model (a small accounting sketch follows below)
  • Significant reduction in cost, as high as ~40%
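A small accounting sketch of this cost metric (illustrative only; the usage intervals are assumed inputs): each interval during which a processor is held is charged in whole hours.

```python
import math

# Illustrative sketch of the CPU-Hours metric described above: every interval
# during which a processor is held is charged in whole hours, with partial
# hours rounded up, as in an on-demand instance pricing model.

def cpu_hours(usage_intervals):
    """usage_intervals: (start, end) times in seconds, one per processor acquisition."""
    return sum(math.ceil((end - start) / 3600.0) for start, end in usage_intervals)

# e.g., holding a processor for 90 minutes and another for 30 minutes costs 2 + 1 = 3 CPU-hours
print(cpu_hours([(0, 5400), (10000, 11800)]))   # -> 3
```
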
Outline

  • Overprovisioning Strategies
  • Experimental Setup
  • Results
  • Dynamically Determining K
  • Conclusions

Determining K dynamically
  • So far the system's perspective; now the user's perspective
  • How can we dynamically determine K given the user performance requirements?
  • We use a simple feedback-control approach to dynamically deploy additional resources to meet the user performance requirements
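The slides do not spell out the controller itself, so the following is only an assumed illustration of such a feedback loop: after each measurement interval, the observed average makespan is compared with the user-specified target range and K is nudged up or down within bounds.

```python
# Assumed illustration of a feedback-control loop for K (the actual controller
# is not specified in this transcript): compare the observed average BoT
# makespan with the user-specified target range and adjust K by a fixed step.

def adjust_K(K, observed_makespan, target_low, target_high,
             step=0.25, K_min=1.0, K_max=2.5):
    if observed_makespan > target_high:     # too slow: add capacity
        return min(K + step, K_max)
    if observed_makespan < target_low:      # faster than required: release capacity
        return max(K - step, K_min)
    return K                                # within the requested range: keep K

# e.g., target range [250, 300] minutes, observed average 410 minutes -> increase K
K = adjust_K(K=1.0, observed_makespan=410, target_low=250, target_high=300)
```
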
Evaluation
  • Simulated DAS-3 without background load
  • A ~1.5-month workload consisting of ~33K BoTs
    • we empirically show that the controller stabilizes
  • The average makespan for the workload in the initial system (without the controller) is ~3120 minutes
  • Three scenarios, from tight to loose performance requirements (target makespan ranges in minutes):
    • [250m-300m]
    • [700m-750m]
    • [1000m-1250m]
Results (I)
  • Significant improvement, as high as ~65%, in the number of BoTs meeting the performance requirements when the requirements are tight
  • ~40%-50% improvement for loose performance requirements
Results (II)

[Charts: controller behavior for the three scenarios: [250m-300m], [700m-750m], and [1000m-1250m]]

Conclusions

GOAL-1: Realistic Performance Evaluation of Different Strategies

  • Overprovisioning improves performance consistency significantly
  • Static strategies provide similar performance (only K matters)
  • Dynamic strategy performs better than the static strategies
  • Need to determine the critical value to maximize the benefit of overprovisioning

GOAL-2: Dynamically Determining K for Given User Performance Requirements

  • A feedback-controlled system tunes K dynamically using historical performance data and the specified performance requirements
  • The number of BoTs meeting the performance requirements increases significantly, as high as 65%, compared to the initial system
Thank you! Questions? Comments?

[email protected]

http://www.st.ewi.tudelft.nl/~nezih/

  • More Information:
    • Guard-g Project: http://guardg.st.ewi.tudelft.nl/
    • PDS publication database: http://www.pds.twi.tudelft.nl