Power-aware Consolidation of Scientific Workflows in Virtualized Environments

Qian Zhu, Jiedan Zhu, Gagan Agrawal

Presented by Bin Ren


Outline

  1. Background
  2. Motivating Applications
  3. Key Points Analysis
  4. Design of pSciMapper
  5. Experimental Evaluation
  6. Conclusion




Background

  • Scientific Workflow
    • A model of a computing process;
    • Input: various data;
    • Output: data products for presentation and visualization;
    • An important feature: different workflow modules may have different resource requirements.


Background

  • Cloud Computing
    • Some scientific workflows have already been run experimentally in cloud environments;
    • Two important characteristics:
      • Virtualization technologies;
      • Pay-as-you-go pricing model.
    • An important issue is the tradeoff between:
      • Power consumption;
      • Performance.


Background

  • Topic of this work
    • Focus: effective management of energy and resource costs for scientific workflows;
    • Goal: minimize total power consumption and resource costs without a substantial degradation in performance;
    • Strategy: consolidation of workflow tasks.
      • Why can tasks be placed together without a large performance degradation?
      • Why does this strategy save power?
      • How do we decide which tasks can be placed together?




Motivating Applications

  • Great Lake Forecasting System (GLFS)
    • Used to forecast the meteorological conditions of Lake Erie;
    • Structured as a directed acyclic graph;
    • A compute-intensive application;
    • An important feature: different workflow modules may have different resource requirements and usage.


Motivating Applications

  • Resource usage of GLFS tasks: Task 1 (resource-usage time series shown as a figure on the slide).


Motivating Applications

  • Resource usage of GLFS tasks: Task 2 (resource-usage time series shown as a figure on the slide).


Motivating Applications

  • Analysis of the resource usage of GLFS tasks
    • The workflow tasks exhibit periodic behavior in their CPU, memory, disk, and network usage;
    • A task's resource usage stays significantly below its peak value for more than 80% of the time (illustrated below);
    • Resource consumption can depend on the application parameters and on the characteristics of the host server.
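As a quick illustration of the second observation, the fraction of time a trace spends well below its peak can be computed directly from the usage series. This is a minimal Python sketch; the synthetic trace and the "below 50% of peak" threshold for "significantly smaller" are assumptions made for illustration, not values from the slides.

import numpy as np

def fraction_well_below_peak(usage, fraction_of_peak=0.5):
    """Fraction of samples that stay below a chosen fraction of the peak."""
    usage = np.asarray(usage, dtype=float)
    return float(np.mean(usage < fraction_of_peak * usage.max()))

# A synthetic, periodic CPU trace that is bursty for roughly 10% of each period.
t = np.arange(600)
cpu = np.where(np.sin(2 * np.pi * t / 120) > 0.95, 0.95, 0.15)

print(fraction_well_below_peak(cpu))  # ~0.9 for this synthetic trace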




Key Points Analysis

  • Resource usage and power consumption;
  • Virtualization and power consumption;
  • Consolidation and power consumption.

  Two important clarifications:
    1. Unit power denotes the power drawn by the system per unit time (sampled every second);
    2. Even when the system is idle, it still draws about 32 W.


Key Points Analysis

  • Resource Usage and Power Consumption
    • CPU usage: the higher the CPU workload, the higher the unit power consumption; however, the relationship is not proportional;
    • Memory usage: similar to CPU (with one important difference: cache misses);
    • Disk and network I/O: roughly speaking, they add a nearly constant amount to the overall unit power consumption.
  (These trends are illustrated by the toy model below.)
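To make these trends concrete, here is a minimal Python sketch of an additive unit-power model: an idle baseline, a sub-linear CPU term, a roughly linear memory term, and near-constant disk/network contributions. All coefficients and functional forms are illustrative assumptions, not the measurements behind the slides; only the 32 W idle baseline comes from the previous slide.

IDLE_POWER_W = 32.0  # idle baseline reported on the previous slide

def estimated_unit_power(cpu_util, mem_util, disk_active, net_active):
    """Estimated instantaneous power draw in watts (illustrative only)."""
    cpu_term = 60.0 * cpu_util ** 0.75          # rises with load, but not proportionally
    mem_term = 10.0 * mem_util                  # roughly linear; cache misses would add more
    io_term = (4.0 if disk_active else 0.0) + (3.0 if net_active else 0.0)
    return IDLE_POWER_W + cpu_term + mem_term + io_term

print(estimated_unit_power(0.8, 0.4, True, False))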


Key Points Analysis

  • Virtualization and Power Consumption
    • For the workloads considered in this work, using virtualization technology incurs only a small power-consumption overhead.


Key Points Analysis

  • Consolidation and Power Consumption
    • Total power (energy) consumption is the product of unit power and execution time:
      • Without consolidation: Nor_Power_Con = U_app1 * T_app1 + U_app2 * T_app2;
      • With consolidation: Con_Power_Con = U_con * T_con.
    • (A toy comparison follows below.)
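The toy calculation below plugs made-up numbers into the two formulas above: energy is unit power times execution time, so consolidation pays off when sharing a single idle baseline outweighs the higher unit power and the modest slowdown of the co-located run. The numbers are purely illustrative.

def energy(unit_power_w, runtime_s):
    """Energy in joules: unit power (W) multiplied by execution time (s)."""
    return unit_power_w * runtime_s

# Two applications run separately on two servers (Nor_Power_Con).
separate = energy(70.0, 100.0) + energy(60.0, 100.0)   # 13000 J

# The same applications consolidated on one server (Con_Power_Con):
# one idle baseline, higher unit power, slightly longer runtime.
consolidated = energy(95.0, 110.0)                      # 10450 J

print(f"separate: {separate:.0f} J, consolidated: {consolidated:.0f} J")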


Key Points Analysis

  • Consolidation and Power Consumption
    • Important observations:
      1. Consolidating workloads with dissimilar resource requirements incurs only a small slowdown in execution time and saves a large amount of total power;
      2. Consolidating workloads with similar resource requirements incurs a large slowdown, and total power consumption may not decrease because of the longer execution time.
    • Note: the increase in MM's unit power consumption comes partially from cache misses.




Design of pSciMapper

  • Overview of the whole system (architecture diagram shown as a figure on the slide).


Design of pSciMapper

  • Explanation of pSciMapper
    • The main algorithm is hierarchical clustering, built from three components:
      • Data modeling;
      • Distance metric;
      • Clustering-result evaluation.


Design of pSciMapper

  • Data Modeling
    • Predict the resource-usage time series from the application parameters and the hardware specification, using a Hidden Markov Model (HMM);
    • Temporal feature extraction – the temporal signature:
      • Peak value: the maximum value of the time series;
      • Relative variance: the normalized sample variance;
      • Pattern: a sequence of samples representing the shape of the series;
    • Notation: a task's temporal signature collects these features for each resource type (a sketch of the extraction follows below).
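The following Python sketch shows one way to extract the three temporal-signature features described above from a single resource-usage time series. The fixed pattern length and the normalizations are assumptions made for illustration, not the paper's exact definitions.

import numpy as np

def temporal_signature(series, pattern_len=16):
    """Return (peak, relative_variance, pattern) for one usage time series."""
    series = np.asarray(series, dtype=float)
    peak = series.max()                                    # peak value
    rel_var = series.var() / (series.mean() ** 2 + 1e-12)  # normalized sample variance
    # Down-sample to a fixed-length sequence that captures the shape/pattern.
    idx = np.linspace(0, len(series) - 1, pattern_len).astype(int)
    pattern = series[idx] / (peak + 1e-12)
    return peak, rel_var, pattern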


Design of pSciMapper

  • Distance Metric (terms used in the formula; a sketch follows below)
    • Dist_{i,j}: the distance between tasks i and j;
    • R_i: a resource type. There are four types, so (R1, R2) yields 10 pairs;
    • aff_score: a pre-defined factor describing how strongly consolidating workloads bound to R1 and R2 affects performance; its value lies in (0, 1);
    • Corr(peak_i, peak_j): the Pearson correlation between the two workloads with respect to a given resource's usage.
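Since the formula itself appears only as an image on the slide, the sketch below combines the terms defined above in one plausible way: for each of the 10 resource pairs it weights the correlation of the two workloads' usage by that pair's aff_score and sums the results, so highly correlated, strongly interfering workloads end up far apart and are not consolidated. Both the placeholder aff_score values and this exact combination are assumptions, not the paper's formula.

import itertools
import numpy as np

RESOURCES = ["cpu", "mem", "disk", "net"]

# Placeholder affinity scores in (0, 1) for the 10 resource pairs
# (the real values are pre-defined in the paper; these are stand-ins).
AFF_SCORE = {pair: 0.5
             for pair in itertools.combinations_with_replacement(RESOURCES, 2)}

def distance(task_i, task_j):
    """Distance between two tasks, each given as {resource: usage series}."""
    total = 0.0
    for r1, r2 in itertools.combinations_with_replacement(RESOURCES, 2):
        corr = float(np.corrcoef(task_i[r1], task_j[r2])[0, 1])
        total += AFF_SCORE[(r1, r2)] * corr
    # Larger value = more expected interference, so keep such tasks apart.
    return total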


Design of pSciMapper

  • Clustering-result evaluation
    • The objective:
      • Map each clustering configuration to the set of servers;
      • Evaluate the execution time and power consumption of each clustering configuration.
    • The method:
      • At the bottom level: Kernel Canonical Correlation Analysis (KCCA);
      • At the other levels: Nelder-Mead, an optimization algorithm (sketched below).
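The sketch below shows where Nelder-Mead fits in: it searches over a candidate resource allocation to minimize a weighted combination of predicted power and time. The predictor here is a toy stand-in for the KCCA-based models, and the allocation encoding and weights are assumptions for illustration only.

from scipy.optimize import minimize

def predict_power_time(allocation):
    """Toy stand-in for the KCCA-based <power, time> predictor."""
    cpu_share, mem_share = (min(max(a, 0.1), 1.0) for a in allocation)
    power = 32.0 + 60.0 * cpu_share + 10.0 * mem_share   # illustrative estimate
    time = 100.0 / cpu_share                              # illustrative estimate
    return power, time

def cost(allocation, alpha=0.5):
    """Weighted power/time objective minimized by the search."""
    power, time = predict_power_time(allocation)
    return alpha * power + (1.0 - alpha) * time

# Start from an even CPU/memory split and let Nelder-Mead refine it.
result = minimize(cost, x0=[0.5, 0.5], method="Nelder-Mead")
print(result.x, result.fun)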


Design of pSciMapper

  • Workflow Consolidation Algorithm (a sketch of this loop follows below)
    1. Start with an initial one-to-one assignment of tasks to servers;
    2. Generate the resource-usage time series (HMM) and estimate execution time and power with KCCA;
    3. Merge the closest clusters according to the distance metric;
    4. Find the best assignment for the merged clusters and re-estimate execution time and power with Nelder-Mead;
    5. Repeat steps 3 and 4 until a merge threshold is reached: the time degradation becomes too large or the power saving too small.
  • Optimization: dynamic CPU provisioning.
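Below is a minimal Python sketch of the greedy loop described in steps 1–5, under two assumed helpers: cluster_distance(a, b), built on the distance metric above, and estimate(clusters), standing in for the HMM/KCCA/Nelder-Mead predictions of <power, time>. The 15% slowdown threshold matches the evaluation later; the minimum power-gain threshold is an assumption.

import itertools

def consolidate(tasks, cluster_distance, estimate,
                max_slowdown=0.15, min_power_gain=0.02):
    """Greedy hierarchical consolidation of workflow tasks (sketch)."""
    clusters = [[t] for t in tasks]                 # step 1: one task per server
    base_power, base_time = estimate(clusters)      # step 2: baseline estimates
    cur_power = base_power

    while len(clusters) > 1:
        # Step 3: find the pair of clusters with the smallest distance.
        i, j = min(itertools.combinations(range(len(clusters)), 2),
                   key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
        candidate = [c for k, c in enumerate(clusters) if k not in (i, j)]
        candidate.append(clusters[i] + clusters[j])

        # Step 4: re-estimate power and execution time for the merged layout.
        power, time = estimate(candidate)

        # Step 5: stop once the slowdown is too large or the saving too small.
        if (time - base_time) / base_time > max_slowdown:
            break
        if (cur_power - power) / cur_power < min_power_gain:
            break

        clusters, cur_power = candidate, power

    return clusters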


Design of pSciMapper

  • Example
    • Resource profiles of the five candidate clusters:
      • C1: CPU moderate, Mem low,      Disk low,  Net low
      • C2: CPU moderate, Mem low,      Disk low,  Net moderate
      • C3: CPU moderate, Mem high,     Disk high, Net low
      • C4: CPU high,     Mem moderate, Disk low,  Net low
      • C5: CPU low,      Mem low,      Disk high, Net moderate
    • Candidate assignments, with <power, time>:
      • {(C1, C2, C3), S2}, {(C4, C5), S1}                   <93.62, 92.87>
      • {(C1, C2), S2}, {C3, S5}, {(C4, C5), S1}             <135.11, 88.03>
      • {C1, S2}, {C2, S3}, {C3, S5}, {C4, S1}, {C5, S4}     <180.56, 83.93>
    • A small modification of the figure from the paper: according to the paper's description, the two times 83.93 and 92.87 should swap positions.




Experimental Evaluation

  • Experiment setup
    • Algorithms compared:
      • Without consolidation;
      • pSciMapper + static allocation;
      • pSciMapper + dynamic provisioning;
      • Optimal + work conserving.
    • Metrics:
      • Normalized total power consumption;
      • Execution time.
    • Virtualization environment: Xen 3.0.


Experimental Evaluation

  • Experiment applications
    • Two real applications: GLFS and VR;
    • Three synthetic applications: SynApp1, SynApp2, and SynApp3.


Experimental Evaluation

  • Normalized Power Consumption: GLFS
    • Four different configurations;
    • In all cases, pSciMapper saves power (by as much as 20%);
    • Without dynamic provisioning, pSciMapper is slightly worse than the optimal method; with dynamic provisioning, it is much better.


Experimental Evaluation

  • Normalized Power Consumption: VR and the synthetic applications
    • Results are similar to GLFS;
    • Dynamic provisioning brings little additional improvement for VR, SynApp1, and SynApp2, since it targets CPU and memory (especially CPU).


Experimental Evaluation

  • Execution Time: GLFS
    • The clustering stop threshold is set to a 15% degradation of execution time;
    • The results show that the degradation with pSciMapper + dynamic provisioning stays within 12%.


Experimental Evaluation

  • Execution Time: VR and the synthetic applications
    • Results are similar to GLFS.


Experimental Evaluation

  • Consolidation Overhead and Scalability
    • The overhead of pSciMapper is much smaller than that of the heuristic optimal method;
    • pSciMapper scales well.




Conclusion

  • Designed and implemented a power-aware consolidation framework, pSciMapper, based on a hierarchical clustering method;
  • pSciMapper reduces total power consumption by up to 56% with at most a 15% slowdown for the workflow applications;
  • pSciMapper incurs low consolidation overhead, so it scales well to large scientific workflow applications.


Thank you for listening!

Any questions?