Energy consumption of small database systems
Sponsored Links
This presentation is the property of its rightful owner.
1 / 40

Energy Consumption of small Database Systems PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

Energy Consumption of small Database Systems. Faculty : Computer Science and Engineering, York University (CA) Course : CSE 6421 - Advanced Database Systems Instructor: Prof. Jarek Gryz Speaker: Benedikt Iltisberger Date : 2010- 12-01. Presentation Time: 35-45 m inutes

Download Presentation

Energy Consumption of small Database Systems

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Energy consumption of small database systems

Energy Consumption of small Database Systems

Faculty:Computer Science and Engineering, York University (CA)Course: CSE 6421- Advanced Database Systems

Instructor:Prof. JarekGryzSpeaker: Benedikt IltisbergerDate:2010-12-01

Energy consumption of small database systems

  • Presentation Time:

    • 35-45 minutes

  • Questions:

    • I am lookingforwardtoansweryourquestionsduringthepresentation.

Agenda of the presentation

Agenda ofthePresentation

What is the topic?

Why is it important?

How to solve the problem?

What do you have to keep in mind?

Does it work in the real world?

What is the conclusion?

What can be done in the future?

What is the topic

What is the Topic?



Show a way how to measure the power consumption of a single query!

What are the s cientific challenges

What are the Scientific Challenges?

Never done before[12]

Could/should be implemented in any DBMS

Multiple factors need to be considered

Upcoming and important research area

Energy consumption

Energy Consumption

Source: [7]

Why is it worth to focus on energy saving

Why is it worth to focus on energy saving?

  • Saving the environment

  • Growing market  Job opportunity

  • Saving costs for companies

    • Energy bills are rising quickly

  • Very important research area in Europe

  • Interesting from the technical point of view

Benchmarking organizations

Benchmarking Organizations

  • SPEC

    • “New SPECweb2009 benchmark adds ability to measure power consumption of web servers” (June 5, 2009)[4]

  • TPC

    • “Transaction Processing Performance Council Announces First Results for its TPC-Energy Specification” (July 7, 2010)[5]

Energy consumption in the it

Energy consumption in the IT

Source: [6]

What have dbs to do with that

What have DBs to do with that?

  • Core Service

    • Delivering data for any business process

    • Store the foundation of a company

    • Need to be accessible 24/7

  • Today integrated even in small projects

  • Data warehouses and Data marts

    • Long running energy consuming queries

Reasons for decision makers

Reasons for Decision-Makers

  • Reducing costs

  • Keep up a good image (Green IT)

    • Innovation

    • Keep up with the times

    • Being faster than competitors

  • Depending on the area of work

  • Additional company policies

Idea of this proposal

Idea of this Proposal

  • Estimate the energy consumption of a single query to

    • Compare queries in terms of efficiency

    • Optimize queries for energy saving

    • Get energy information in real time

    • Teach the optimizer to do this task

    • Green IT initiative

How to achieve this goal

How to achieve this goal?

  • DBMS (e.g. DB2) can generate reports about executed queries containing:

    • Original statement -> optimized statement

    • Access Plan with many details like:

      • CPU Cost CPU

      • I/OCost HDD

      • Bufferpool Buffers Main Memory

What information is given

What information is given?

  • IBM DATABASE 2 Explain Table Format Tool

  • DB2 version: 09.07.3

  • Detailed information about queries including

    • Access Plan (also graphical)

    • Original query  Optimized Query

    • Detailed information about the different costs

    • Detailed information about tables, indexes, …

Information from the statistic file

Information from the Statistic File

Database Context:


Parallelism: None

CPU Speed: 1.889377*107

Comm Speed: 100

Buffer Pool size: 6000

Sort Heap size: 1000

Database Heap size: 900

Lock List size: 4096

Maximum Lock List: 22

Average Applications: 1

Locks Available: 28835

4) SORT : (Sort)


Cumulative Total Cost: 27541.2

Cumulative CPU Cost: 5.39421*109

Cumulative I/O Cost: 23667.2

Cumulative Re-Total Cost: 27541.1

Cumulative Re-CPU Cost: 5.39388*109

Cumulative Re-I/O Cost: 0

Cumulative First Row Cost: 27541.2

Estimated Bufferpool Buffers: 585

Detailed information: [9, 10, 11]

Key metrics in detail

Key Metrics in Detail

  • CPU Cost

    • Instructions needed to execute a query

  • I/O Cost

    • Number of seeks and page transfers

  • Bufferpool Buffer

    • Number of used pages

  • Total Cost

    • Measured in Timerons (proprietary IBM algo.)

  • The optimizer s access plan

    The Optimizer’s Access Plan

    Energy consumption of small database systems

    | /------------------+------------------\

    18000 211.307 299.933


    ITEM ( 8) ( 18)

    Q2 26913.4 30.4522

    23078.2 4

    /----------+-----------\ /---+----\

    13889.9 29221.1 300 300


    ( 9) ( 17) ( 19) PROMOTION

    17710.9 9200.94 0.105098 Q1

    14832.2 8246 0

    /-------+--------\ | |

    352.257 39.4311 1.9208e+006 300


    ( 10) ( 13) CUSTOMER_DEMOGRAPHICS SQL100701103316930

    723.246 39.5703 Q4 Q1

    656 32.9996

    | /----+-----\

    352.257 39.4311 2.8804e+006


    ( 11) ( 14) STORE_SALES

    723.246 7.59231 Q5

    656 1

    | |

    352.257 39.4311


    ( 12) ( 15)

    Parse and compute

    Parse and Compute

    • Perl script to gather important information

      • Grab all CPU, HDD and Memory operations

      • Sum them up

    • Provide script with following parameters:

      • Energy consumption of 1 CPU cylce

      • Energy consumption of 1 HDD operation

      • Energy consumption of 1 Memory operation

    • Calculate the query's power consumption

      • Using the information from no. 1 and no. 2

    Perl script

    Perl Script

    Assumed db system

    Assumed DB System



    Source: [6], Page 1233, Section 3.2#) DIMM  Dual In-Line Memory Module

    • CPU[1]

      • Intel Xeon X7560 @2.266 GHz, 130W

      • 8 Cores, 16 Threads, 24MB L3 Cache

    • HDD[2]

      • Seagate Cheetah 15K, 600GB, 16MB Cache

      • I/O Data transfer rate 600MB/s, 16.35W (avg operating power consumption)

    • RAM[6]

      • 4GB DDR3 RAM (2 * 2GB DIMM#) with 9W per DIMM  18W

    • Additional Hardware[6]:

      • 30% of average power consumption of other components which leads to (130W + 16.35W + (2 * 9W)) * 0.3 = 49,305W

    How to measure the parameters i

    How to measure the parameters? I

    • CPU

      • Peak power consumption (p) of the manufacturer (Watts per hour)

      • Total MIPS per hour (c)

      • Calculate avg power consumption (a)

        • p / c = a

    • Problems

      • Slide calculation flaws may end in high differences

      • Many instructions take more the one cycle

        • This also depends on the CPU type

      • Current multicore architecture need special customization

    How to measure the parameters ii

    How to measure the parameters? II

    • HDD

      • Peak power consumption (p) of the manufacturer (Watts per hour)

      • Calculate the max. avg I/O operations per hour (c)

      • Calculate avg power consumption per I/O and hour (a)

        • p / c = a

    • Problems

      • Different energy consumption for read & write

      • RAID-n or other storage systems

    How to measure the parameters iii

    How to measure the parameters? III

    • Main Memory

      • Peak power consumption (p) of the manufacturer (Watts per hour)

      • Calculate the max. page change rate per hour (c)

      • Calculate avg power consumption per page change (a)

        • p / c = a

    • Problems:

      • Very complicated and inprecise

      • Strongly dependent on memory architecture and manufacturing process

    Exemplary technical issue samsung ddr3

    Exemplary Technical Issue:“Samsung DDR3”

    Source: [8]

    Automation of data collection about cpu hdd ram

    Automation of Data Collection about CPU, HDD, RAM

    • Proposal: “energyDB”

      • Name, Modell, Speed, Energy consumption, …

      • Online service to provide needed information

      • Comparable with CDDB[13] or freedb[14]

      • Useful for other projects as well

    Optimizing queries

    Optimizing Queries

    • Areas of optimization:

      • Execution speed (typical parameter)

        • When time is relevant

      • Resource usage

        • For many concurrent users/queries

      • New: Energy Consumption

    Optimizing queries no 1

    Optimizing Queries No. 1

    • Original Query:


      CAST(avg(CAST(ss_quantity AS DECFLOAT)) AS DECIMAL(10,6))

             agg1, CAST(avg(ss_list_price) AS DECIMAL(10,6)) agg2,

             CAST(avg(ss_coupon_amt) AS DECIMAL(10,6)) agg3,

             CAST(avg(ss_sales_price) AS DECIMAL(10,6)) agg4

      FROMtpcds.store_sales, tpcds.customer_demographics, tpcds.date_dim,

      tpcds.item, tpcds.promotion

      WHEREss_sold_date_sk= d_date_sk and ss_item_sk = i_item_sk and ss_cdemo_sk

             = cd_demo_sk and ss_promo_sk = p_promo_sk and cd_gender = 'M' and

      cd_marital_status = 'S' and cd_education_status = 'College' and

             (p_channel_email = 'N' or p_channel_event = 'N') and d_year = 2000

      GROUP BY i_item_id

      ORDER BY i_item_id


    Optimizing queries results

    Optimizing Queries (Results)

    *) Sum of all buffer pool operations for all involved queries.

    Optimizing queries no 2

    Optimizing Queries No. 2

    Original Query:

    SELECTi_item_id, CAST(avg(CAST(ss_quantity AS DECFLOAT)) AS DECIMAL(10,6)) agg1, CAST(avg(ss_list_price) AS DECIMAL(10,6)) agg2, CAST(avg(ss_coupon_amt) AS DECIMAL(10,6)) agg3, CAST(avg(ss_sales_price) AS DECIMAL(10,6)) agg4

    FROMtpcdsnp.store_sales, tpcdsnp.customer_demographics, tpcdsnp.date_dim,

    tpcdsnp.item, tpcdsnp.promotion

    WHEREss_sold_date_sk= d_date_sk and ss_item_sk = i_item_skand

    ss_cdemo_sk = cd_demo_sk and ss_promo_sk = p_promo_sk and cd_gender= 'M’ and cd_marital_status= 'S' and cd_education_status = 'College' and (p_channel_email = 'N' orp_channel_event = 'N') and d_year= 2000

    GROUP BY i_item_id

    ORDER BY i_item_id


    Optimizing queries results1

    Optimizing Queries (Results)

    *) Sum of all buffer pool operations for all involved queries.

    Optimizing queries problems

    Optimizing Queries (Problems)

    • Better performance but higher energy usage:

      • Precomputation for faster response times

      • Materialized Views

      • CPU

        • Cheap in terms of time consumption but

        • Expensive in terms of energy consumption

    Conclusions i

    Conclusions I

    • Reducing costs is always welcome in companies

    • High need for this research in the long term

      • Due to corporate identity and rising energy costs

      • Environmental saving

    • Query optimization does not necessarily lead to power saving

    • Proposed technique can be added to the optimizer

    Conclusions ii

    Conclusions II

    • Cons

      • You need to measure the single components in the first place

      • Not easily applicable to complex systems

      • Accuracy must be determined

        • Checking results with hardware based measurement data

          How to measure VMs?

    Future work i

    Future Work I

    • Add power consumption optimization techniques directly to the optimizer

      • IBM seems to be working on that[12]

    • Optional output of estimate power consumption

    • Improve calculation techniques

      • Even slide inaccuracycan have a big impact

      • RAM usage is pretty inaccurate at the moment

    • Automization of the process

    Future work ii

    Future Work II

    • Implementation ofcomplexDataware House systemsorfordistributeddatabasesystems

    • Inaccuracywithmeasuringmulti CPU systems

    • „energyDB“ web serviceimplementation

    • Impact ofhardware power saving

      • GPU fordatabases, SSD, Dynamic voltage scaling, ...


    Concluding …

    This topic is worth for further research!





    M. Poess, R. O. Nambiar, Energy Cost, “The Key Challenge of Today’s Data Centers: A Power Consumption Analysis of TPC-C results”,OracleCorporation and Hewlett-Packard Company, N.A..

    Reference cont

    Reference (cont.):

    R. Ramahrishnan, J. Gehrke, „Database Management Systems”, Third Edition, McGraw-Hill, Boston, 2003.

  • Login