Balancing Performance and Power Consumption in Data-Intensive Computing Clusters


Presentation Transcript


  1. Balancing Performance and Power Consumption in Data-Intensive Computing Clusters Project Proposal by: Shan Li Josh Sorchik Juzi Zhao

  2. Problem Statement: • How can we balance performance and power consumption in a data-intensive computing cluster by optimizing the available nodes based upon the type of job being executed?

  3. Background:
     • Data-intensive computing is limited by power
     • Low-power hardware exists
     • APIs allow us to exploit the low-power features of modern processors
     • Prior research has focused on dynamically adapting the number of available nodes using “bidding”, machine learning, and other methods

  4. Methodology:
     • Place unneeded nodes in a low-power state
     • Use benchmarks to determine:
       • Which types of operations draw the most power (CPU, I/O, network)?
       • How does power consumption vary when adding/removing nodes?
       • How does execution time vary when adding/removing nodes?
       • How do power consumption and execution time vary when changing the input data size?
     • Based upon test results, create scheduling policies that optimize energy consumption for each job type, and apply each policy to the jobs it fits
     • Determine the trade-offs between power usage and acceptable performance
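
One way the last two methodology steps could fit together is sketched below: given benchmark measurements for each node count, choose the configuration that uses the least energy while still finishing within an acceptable time. This is only an illustration, not the proposal's actual policy. The function name `pick_node_count`, the `max_seconds` limit, and all numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical measurements for one job type:
# active node count -> (execution time in seconds, energy in joules).
# These numbers are made up for illustration, not benchmark results.
MEASUREMENTS: Dict[int, Tuple[float, float]] = {
    2: (400.0, 9000.0),
    4: (220.0, 8000.0),
    8: (130.0, 9500.0),
}


def pick_node_count(
    measurements: Dict[int, Tuple[float, float]],
    max_seconds: float,
) -> Optional[int]:
    """Return the node count with the lowest energy whose execution time
    does not exceed max_seconds, or None if no configuration qualifies.
    Ties on energy are broken in favour of fewer active nodes."""
    feasible = {
        nodes: (seconds, joules)
        for nodes, (seconds, joules) in measurements.items()
        if seconds <= max_seconds
    }
    if not feasible:
        return None
    return min(feasible, key=lambda nodes: (feasible[nodes][1], nodes))


if __name__ == "__main__":
    for limit in (300.0, 150.0, 100.0):
        print(limit, pick_node_count(MEASUREMENTS, limit))
```

Treating "acceptable performance" as a hard time limit is an assumption; a weighted combination of energy and time would be an equally plausible reading of the trade-off.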

  5. Benchmarks
     • Vary the number of active nodes and data set sizes while running the following benchmarks:
       • RandomWriter
       • Grep
       • Sort
     • Analyze workload characteristics (e.g., data size, write/read ratio, execution time) and the relationship between workload characteristics and power consumption.
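
Relating workload characteristics to power consumption requires turning raw power readings into a per-run energy figure. The sketch below assumes a hypothetical power meter that yields `(timestamp_seconds, power_watts)` samples during a benchmark run and integrates them with the trapezoidal rule. The sample format and the function name `energy_joules` are assumptions, not part of the proposal.

```python
from typing import List, Tuple


def energy_joules(samples: List[Tuple[float, float]]) -> float:
    """Integrate (timestamp_seconds, power_watts) samples with the
    trapezoidal rule. Samples must be sorted by timestamp."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 < t0:
            raise ValueError("samples must be sorted by timestamp")
        # Area of one trapezoid: average power times elapsed time.
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


if __name__ == "__main__":
    # Made-up readings for a 20-second run, for illustration only.
    run = [(0.0, 20.0), (10.0, 30.0), (20.0, 30.0)]
    joules = energy_joules(run)
    duration = run[-1][0] - run[0][0]
    print(joules, joules / duration)  # total energy and average power
```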

  6. Model
     • N = Total nodes
     • Na = Active nodes
     • Ni = Inactive nodes
     • E(N) = Total energy consumption
     • Ea = Power consumption of one active node
     • Ei = Power consumption of one inactive node
     • D(N) = Execution time as a function of N
     • T = Time between jobs
     • E(N) = Ea * D(N) + Ei * (T - D(N))
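
The model's final equation can be evaluated directly. The sketch below implements it exactly as written on the slide, reading Ea and Ei as power in watts and D(N) and T as times in seconds, so the result is in joules. The units, the function name `model_energy`, and the example numbers are assumptions made for illustration.

```python
def model_energy(ea_watts: float, ei_watts: float,
                 d_seconds: float, t_seconds: float) -> float:
    """Evaluate E(N) = Ea * D(N) + Ei * (T - D(N)).

    d_seconds is the execution time D(N) for the chosen node count;
    t_seconds is T, the time between jobs."""
    if d_seconds < 0 or d_seconds > t_seconds:
        raise ValueError("expected 0 <= D(N) <= T")
    active = ea_watts * d_seconds                 # energy while the job runs
    idle = ei_watts * (t_seconds - d_seconds)     # energy in the low-power state
    return active + idle


if __name__ == "__main__":
    # Hypothetical values: 30 W active, 5 W inactive,
    # a 100 s job with 400 s between job arrivals.
    print(model_energy(30.0, 5.0, 100.0, 400.0))
```

A shorter D(N) moves time from the active term to the cheaper inactive term, which is why the model depends on how execution time changes with N.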

  7. Evaluating Results: • Apply derived scheduling policies to new jobs (CPU, I/O, etc.) to determine if the policies have an effect on the performance/power ratio

  8. Cluster specifications
     • Intel Atom 330 dual-core processors
     • Zotac motherboards
     • Intel X25-M solid-state drives
     • Dell PowerConnect 2848 switch

  9. Milestones
     • Configure a cluster to utilize low-power hardware (CPU, motherboard, solid-state drive) and the Hadoop framework
     • Run Hadoop benchmarks, collecting execution time and power consumption, to analyze the impact of the number of active compute nodes and of Hadoop workload characteristics on performance
     • Develop a scheduling policy from the experiment outcomes that depends on the type of job being executed
     • Evaluate the scheduling policy on the various Hadoop benchmarks
