
A. Chatzigeorgiou, G. Stephanides Department of Applied Informatics

Evaluating Performance and Power of Object-oriented vs. Procedural Programming in Embedded Processors. A. Chatzigeorgiou, G. Stephanides, Department of Applied Informatics, University of Macedonia, Greece.


Presentation Transcript


  1. Evaluating Performance and Power of Object-oriented vs. Procedural Programming in Embedded Processors
     A. Chatzigeorgiou, G. Stephanides
     Department of Applied Informatics, University of Macedonia, Greece

  2. Motivation
     • Low Power Requirements for Portable Systems
       - Battery Lifetime
       - Integration Scale
       - Cooling/Reliability Issues
     • Challenge: Increased performance → increased power
     • Widespread application of embedded systems
     • Existing Low-Level Tools for Energy Estimation
     [Figure labels: Processor Power, Memory Power]
     University of Macedonia

  3. Does Software Affect Power Consumption?
     • Until recently, power reduction was the goal of hardware optimizations (transistor sizing, supply voltage reduction, etc.)
     • Tiwari (1994, 1996) proved that software has a significant impact on the energy consumption of the underlying hardware, which can be measured
     • Software addresses higher levels of the design hierarchy → therefore, energy savings are larger
     • Moreover, for software there is no tradeoff between performance and power: fewer instructions lead to reduced power

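  4. Sources of Power Consumption
     • Power dissipation in digital systems is due to charging/discharging of node capacitances: [equation not preserved in transcript]
     • However: [equation not preserved in transcript]
     • Dynamic Power: [equation not preserved in transcript; one term was annotated "switching activity"]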
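The equations on this slide are missing from the transcript. As a hedged reminder only, and not a reconstruction of what the authors showed, the standard textbook expressions for CMOS switching energy and dynamic power are given below. The symbols are assumptions: $C_L$ is the node (load) capacitance, $V_{dd}$ the supply voltage, $f$ the clock frequency, and $\alpha$ the switching activity, i.e. the probability that a node makes a power-consuming transition in a clock cycle.

```latex
% Energy dissipated per charging or discharging transition of a node
E_{\text{transition}} = \tfrac{1}{2}\, C_L\, V_{dd}^{2}

% Average dynamic power of a node that switches with activity \alpha at clock frequency f
P_{\text{dynamic}} = \alpha \, C_L \, V_{dd}^{2} \, f
```

Because $\alpha$ depends on which instructions run and on the data they process, software influences dynamic power even though $C_L$, $V_{dd}$ and $f$ are fixed by the hardware.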

  5. Sources of Power Consumption
     • Sources of power consumption in an embedded system
       - Instruction level power consumption (power consumed during the processor operation)
       - Instruction and Data Memories (power consumed when accessing memories)
       - Interconnect switching (power consumed when bus lines change state)

  6. Instruction Level Power Models
     • Energy consumption of a program (Tiwari et al.)
     • Base Cost (energy of a single instruction): ADD R2, R0, #1
     • Overhead Cost (energy due to a pair of consecutive instructions): ADD R2, R0, #1 followed by CMP R2, #0
     [Figure label: Instruction Energy]
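The base-cost/overhead-cost idea can be sketched in code. The following is a minimal illustration, assuming the commonly cited form of the Tiwari et al. model in which program energy is the sum of per-instruction base costs plus an inter-instruction overhead for each pair of consecutive instructions. The names `EnergyModel`, `base_nJ`, `overhead_nJ` and `program_energy_nJ` are hypothetical, and no cost values from the presentation are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Opcode = std::string;

// Hypothetical container for measured costs (values must come from measurement).
struct EnergyModel {
    std::map<Opcode, double> base_nJ;                         // base cost per opcode
    std::map<std::pair<Opcode, Opcode>, double> overhead_nJ;  // cost of (previous, current) pair
};

// Sums the base cost of every executed instruction and the overhead cost of
// every consecutive pair. Pairs without an entry contribute zero overhead.
// Throws std::out_of_range if an opcode has no base cost.
double program_energy_nJ(const EnergyModel& model, const std::vector<Opcode>& trace) {
    double total = 0.0;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        total += model.base_nJ.at(trace[i]);
        if (i > 0) {
            auto it = model.overhead_nJ.find(std::make_pair(trace[i - 1], trace[i]));
            if (it != model.overhead_nJ.end()) {
                total += it->second;
            }
        }
    }
    return total;
}
```

For the slide's example, a trace of `ADD` followed by `CMP` would contribute the base cost of each instruction plus one overhead term for the `(ADD, CMP)` pair.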

  7. Processor Energy Consumption
     [Figure not preserved in transcript; surviving annotation: 6-8 %]

  8. Instruction Level Power Models

  9. Memory Power Consumption
     • Energy cost of a memory access >> instruction energy
     • Depends on:
       - number of accesses (directly proportional)
       - size of memory (between linear and logarithmic)
       - number of ports, power supply, technology
     • Instruction Memory Power depends on:
       - code size → required memory size
       - #executed instructions → #accesses
     • Data Memory Power depends on:
       - amount of data being processed → memory size
       - whether the application is data-intensive → #accesses

  10. OOPACK Benchmarks
     • Small suite of kernels that compares the relative performance of object-oriented programming in C++ versus plain C-style code
     Max: computes the maximum over a vector
     Aim: to measure how well a compiler inlines a function within a conditional
     C-style: performs the comparison between two elements explicitly
     OOP: performs the comparison by calling an inline function
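The two styles of the Max kernel can be illustrated as follows. This is a hedged sketch of the idea, not the actual OOPACK source; the function names `max_c_style`, `max_oop_style` and `is_greater` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// C-style: the comparison is written out explicitly inside the conditional.
double max_c_style(const double* v, std::size_t n) {
    assert(n > 0);
    double best = v[0];
    for (std::size_t i = 1; i < n; ++i) {
        if (v[i] > best) {
            best = v[i];
        }
    }
    return best;
}

// OOP-style: the comparison goes through a small inline function.
inline bool is_greater(double a, double b) { return a > b; }

double max_oop_style(const double* v, std::size_t n) {
    assert(n > 0);
    double best = v[0];
    for (std::size_t i = 1; i < n; ++i) {
        if (is_greater(v[i], best)) {  // costs a call unless the compiler inlines it
            best = v[i];
        }
    }
    return best;
}
```

If the compiler inlines `is_greater`, both functions should compile to essentially the same loop; if it does not, the second version executes extra call and return instructions on every iteration.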

  11. OOPACK Benchmarks
     Matrix: multiplies two matrices containing real numbers
     Aim: to measure how well a compiler hoists simple invariants
     C-style: uses explicit index arithmetic where, for example, the term L*i is constant for each iteration of k and should be computed as an invariant outside the k loop

  12. OOPACK Benchmarks
     OOP: performs the multiplication employing member functions and overloading to access an element, given the row and the column.
     Modern C compilers are good enough at this sort of optimization for scalars. However, in OOP style, invariants often concern members of objects. Optimizers that do not peer into objects miss the opportunities.
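A hedged sketch of the Matrix kernel in both styles follows. It is not the OOPACK source: the class `Matrix` and the functions `matmul_c_style` and `matmul_oop_style` are hypothetical, and square row-major matrices of dimension `L` are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C-style: flat row-major arrays with explicit index arithmetic.
// L * i does not change inside the k loop, so an optimizer can hoist it.
void matmul_c_style(const double* a, const double* b, double* c, std::size_t L) {
    for (std::size_t i = 0; i < L; ++i) {
        for (std::size_t j = 0; j < L; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < L; ++k) {
                sum += a[L * i + k] * b[L * k + j];
            }
            c[L * i + j] = sum;
        }
    }
}

// OOP-style: elements are reached through an overloaded operator() that takes
// the row and the column. The invariant n_ * r now involves a member of an
// object, which an optimizer has to look into before it can hoist anything.
class Matrix {
public:
    explicit Matrix(std::size_t n) : n_(n), data_(n * n, 0.0) {}
    double& operator()(std::size_t r, std::size_t c) { return data_[n_ * r + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data_[n_ * r + c]; }
    std::size_t size() const { return n_; }

private:
    std::size_t n_;
    std::vector<double> data_;
};

Matrix matmul_oop_style(const Matrix& a, const Matrix& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    Matrix c(n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a(i, k) * b(k, j);
            }
            c(i, j) = sum;
        }
    }
    return c;
}
```

Both functions perform the same arithmetic; any difference in executed instructions comes from how well the compiler inlines `operator()` and hoists the row offset.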

  13. OOPACK Benchmarks
     Iterator: computes a dot-product
     Aim: to measure how well a compiler inlines short-lived small objects (a short-lived object should never reside in main memory; its entire lifetime should be spent inside registers)
     C-style: uses a common single index
       for( int i=0; i<N; i++ ) sum += A[i]*B[i];
     OOP: employs iterators
     Iterators are a common abstraction in OOP. Although iterators are usually called "light-weight" objects, they may incur a high cost if compiled inefficiently. All methods of the iterator are inline and in principle correspond exactly to the C-style code.
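The iterator version of the dot product might look like the sketch below. This is an illustration under assumptions, not the OOPACK source; the class `ArrayIterator` and its methods `done`, `value` and `next` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// C-style: one shared index walks both arrays.
double dot_c_style(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// OOP-style: a small, short-lived iterator object per array.
// All methods are defined in the class body and are therefore implicitly inline.
class ArrayIterator {
public:
    ArrayIterator(const double* data, std::size_t n) : data_(data), n_(n), pos_(0) {}
    bool done() const { return pos_ >= n_; }
    double value() const { return data_[pos_]; }
    void next() { ++pos_; }

private:
    const double* data_;
    std::size_t n_;
    std::size_t pos_;
};

double dot_oop_style(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    ArrayIterator ia(a, n);
    ArrayIterator ib(b, n);
    for (; !ia.done(); ia.next(), ib.next()) {
        sum += ia.value() * ib.value();
    }
    return sum;
}
```

When the iterators are fully inlined and kept in registers, the second loop should reduce to the first; if the objects are spilled to memory, every `next()` and `value()` adds loads and stores, which matters for both time and data-memory energy.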

  14. OOPACK Benchmarks
     • Complex: multiplies the elements of two arrays containing complex numbers
     • Aim: to measure how well a compiler eliminates temporaries
     • C-style: the calculation is performed by explicitly writing out the real and imaginary parts
     • OOP: complex addition and multiplication are done using overloaded operations
     • Complex numbers are a common abstraction in scientific programming. The complex arithmetic is all inlined in the OOP-style, so in principle the code should run as fast as the version using explicit real and imaginary parts.

  15. OOPACK Benchmarks
     SAXPY operation: Y = Y + c*X (c is scalar, X and Y are vectors)
     Calculation employing temporaries:
       tmp1.re = c.re * X[k].re - c.im * X[k].im;
       tmp1.im = c.re * X[k].im + c.im * X[k].re;
       tmp2.re = Y[k].re + tmp1.re;
       tmp2.im = Y[k].im + tmp1.im;
       Y[k] = tmp2;
     Dynamically allocating and deleting temporaries causes severe performance loss for small vectors
     Temporaries are eliminated:
       Y[k].re = Y[k].re + c.re*X[k].re - c.im*X[k].im;
       Y[k].im = Y[k].im + c.re*X[k].im + c.im*X[k].re;
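The relation between the overloaded-operator form and the explicit form of the complex SAXPY can be sketched as below. This is an illustrative example only; the struct `Complex` and the functions `saxpy_oop_style` and `saxpy_c_style` are hypothetical and are not taken from OOPACK.

```cpp
#include <cassert>
#include <cstddef>

struct Complex {
    double re;
    double im;
};

// Overloaded operators: each call returns a temporary Complex by value.
inline Complex operator+(Complex a, Complex b) {
    return {a.re + b.re, a.im + b.im};
}

inline Complex operator*(Complex a, Complex b) {
    return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// OOP-style: Y = Y + c*X written with the overloaded operators.
void saxpy_oop_style(Complex* y, const Complex* x, Complex c, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k) {
        y[k] = y[k] + c * x[k];
    }
}

// C-style: real and imaginary parts written out, no temporaries.
void saxpy_c_style(Complex* y, const Complex* x, Complex c, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k) {
        y[k].re = y[k].re + c.re * x[k].re - c.im * x[k].im;
        y[k].im = y[k].im + c.re * x[k].im + c.im * x[k].re;
    }
}
```

The temporaries here live on the stack rather than the heap, so this sketch shows the milder case; the point of the benchmark is whether the compiler removes them entirely after inlining.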

  16. Target Architecture
     • Processing unit: ARM7 TDMI
     • Dedicated instruction memory (on-chip ROM)
     • On-chip data memory
     [Block diagram not preserved; surviving labels: Chip boundary, ARM7 integer processor core (3-stage pipeline), Bus Interface, Memory interface signals, A[31:0], D[31:0], ROM controller, Instruction memory, RAM controller, Data memory]

  17. [Flow diagram of the evaluation framework, not preserved; surviving labels: OOPACK Benchmark, ARM STD 2.50, ARM Debugger, Code size, RAM requirements, Trace File, Profiler, #instructions, #memory accesses, Memory Model, Processor Energy, Instruction Memory Energy, Data Memory Energy, Total Power]
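The labels on this slide suggest that the total is assembled from three components: processor energy, instruction memory energy and data memory energy. A minimal sketch of that bookkeeping follows, assuming a simplified model with one average energy per executed instruction and one fixed energy per memory access. All type and field names are hypothetical, and the per-unit costs are inputs that would have to come from measurement or from a memory model.

```cpp
#include <cassert>

// Counts a profiler could extract from an execution trace (hypothetical fields).
struct ProfileCounts {
    unsigned long long instructions;        // executed instructions
    unsigned long long instruction_fetches; // instruction memory accesses
    unsigned long long data_accesses;       // data memory reads and writes
};

// Per-unit energy costs in nanojoules (assumed inputs, not values from the slides).
struct EnergyCosts {
    double per_instruction_nJ;
    double per_instruction_fetch_nJ;
    double per_data_access_nJ;
};

struct EnergyBreakdown {
    double processor_nJ;
    double instruction_memory_nJ;
    double data_memory_nJ;

    double total_nJ() const {
        return processor_nJ + instruction_memory_nJ + data_memory_nJ;
    }
};

// Each component is a count multiplied by its per-unit cost.
EnergyBreakdown estimate_energy(const ProfileCounts& counts, const EnergyCosts& costs) {
    assert(costs.per_instruction_nJ >= 0.0);
    assert(costs.per_instruction_fetch_nJ >= 0.0);
    assert(costs.per_data_access_nJ >= 0.0);
    EnergyBreakdown result;
    result.processor_nJ = static_cast<double>(counts.instructions) * costs.per_instruction_nJ;
    result.instruction_memory_nJ =
        static_cast<double>(counts.instruction_fetches) * costs.per_instruction_fetch_nJ;
    result.data_memory_nJ = static_cast<double>(counts.data_accesses) * costs.per_data_access_nJ;
    return result;
}
```

Keeping the three components separate, rather than returning only the sum, makes it possible to see whether a programming style costs more in the processor or in the memories.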

  18. Results – Performance Comparison

  19. Results – Memory Comparison

  20. Results – Energy Comparison (mJ)

  21. OOPACK1 – Energy distribution (mJ)

  22. Conclusions
     • Power consumption should be taken into account in the design of an embedded system.
     • OOP can result in a significant increase of both execution time and power consumption.
     • If a compiler cannot optimize code to reach the level of procedural programming performance, the number of executed instructions increases, and the instruction-level power consumption increases proportionally.
     • Especially in large programs, data abstraction can lead to a large code size increase, resulting in higher power consumption of the instruction memory.

  23. Future Work
     • Currently building an accurate energy profiler (considering cache layers, pipeline stalls)
     • Compare large programs implemented following the object-oriented and the procedural programming paradigm
     • Perform the comparisons for other compilers
     • Identify energy-consuming programming structures and automatically convert them to energy-efficient ones
