Power Reduction Techniques for Microprocessor Systems by Timothy Goldberg

Presentation Transcript


  1. Power Reduction Techniques for Microprocessor Systems by Timothy Goldberg Paper by: Vasanth Venkatachalam and Michael Franz Published 2005

  2. Power Consumption and its Importance • Saving Power • Save money, save electricity, save the planet • Heat Dissipation • Heat density and cooling • Battery Life • Use less energy, extend battery running time

  3. Outline • Definition of Power and Energy • Power Reduction Techniques • From the Circuit level through Hardware to Compiler and Application level techniques • Commercial Systems • Emerging Technologies

  4. Power and Energy • Need to reduce both • Power = Work / Time • Affects heat • Energy = Power * Time • Affects battery life • Dynamic Power Consumption: circuit switching activity • Switched capacitance: P ≈ a·C·V²·f (activity factor a, capacitance C, supply voltage V, clock frequency f; sketched below) • Clock gating • Short-circuit current: both NMOS and PMOS transistors conduct briefly during a transition (10-15% of total power)
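
A tiny numerical illustration of the switched-capacitance term, in C (all parameter values below are hypothetical, not taken from the paper): dynamic power is estimated as P = a·C·V²·f, and the quadratic dependence on V is what makes voltage scaling so effective.

```c
#include <stdio.h>

/* Illustrative only: estimate dynamic (switching) power as
 * P = a * C * V^2 * f, where a is the activity factor,
 * C the switched capacitance, V the supply voltage and
 * f the clock frequency.  The numbers below are made up. */
static double dynamic_power(double a, double c_farads, double v_volts, double f_hz)
{
    return a * c_farads * v_volts * v_volts * f_hz;
}

int main(void)
{
    double p = dynamic_power(0.2, 1e-9, 1.2, 2e9); /* hypothetical values */
    printf("estimated dynamic power: %.3f W\n", p);
    /* Halving V cuts dynamic power to a quarter, which is why
     * dynamic voltage scaling targets the supply voltage. */
    printf("at half the voltage:     %.3f W\n", dynamic_power(0.2, 1e-9, 0.6, 2e9));
    return 0;
}
```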

  5. Power and Energy • Leakage Power Consumption: static/idle power • Depends on supply voltage and leakage current • Sub-threshold leakage depends on supply voltage, threshold voltage, and temperature (it grows rapidly as the threshold voltage drops or the chip heats up) • Mitigations: reduce the supply voltage, use fewer transistors, raise the threshold voltage

  6. Power Reduction • From low level circuit changes • Low-Power Interconnect • Memories and Memory Hierarchies • Hardware/Architecture • Dynamic Voltage Scaling • Resource Hibernation • Compiler • Application • Cross-layer

  7. Circuit and Logic Level Techniques • Transistor Sizing: reduce transistor width • Less dynamic power consumption, but increased delay • Transistor Reordering: minimize switching activity • Place transistors that switch frequently closer to the gate's output • Logic Gate Restructuring: reduce spurious switching (glitches) • Restructure so that a gate's inputs arrive at roughly the same time

  8. Circuit and Logic Level Techniques • Technology Mapping: software tools • Find the best configuration, subject to constraints • Build the circuit out of library gates so as to minimize total power consumption • Covering the circuit's DAG is NP-hard • Low Power Flip-Flops: • Self-gating flip-flop: reduce switching activity • Dual-edge triggered: latch data on both clock edges, so the clock can run at half the frequency and dissipate less power

  9. Circuit and Logic Level Techniques • Low Power Control: treat the processor as an FSM • Activate only the circuitry needed for the currently executing sub-FSM • Delay-Based Dynamic Supply Voltage • A fixed look-up table of voltage/clock pairs must assume worst-case conditions • Instead, adjust the voltage based on measured circuit delay and monitor for timing errors (see the sketch below) • Requires extra hardware (shadow latches)
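
A minimal control-loop sketch of the delay/error-monitoring idea, assuming a hypothetical error-rate counter fed by the shadow latches and a fixed table of voltage settings; this is an illustrative simplification, not the circuit described in the paper.

```c
#include <stdio.h>

/* Hypothetical closed-loop voltage controller: instead of a
 * worst-case lookup table, lower the voltage until the timing
 * error rate reported by shadow latches approaches a target. */

#define VDD_STEPS 8
static const double vdd_mv[VDD_STEPS] = {1200, 1150, 1100, 1050, 1000, 950, 900, 850};

/* Stand-in for reading an error counter from hardware. */
static double measured_error_rate(int vdd_index)
{
    /* Pretend errors appear only at the lowest two settings. */
    return vdd_index >= 6 ? 0.02 : 0.0;
}

int main(void)
{
    const double target_error_rate = 0.01;
    int idx = 0;                       /* start at the highest voltage */

    for (int interval = 0; interval < 20; interval++) {
        double err = measured_error_rate(idx);
        printf("interval %2d: Vdd = %.0f mV, error rate %.3f\n",
               interval, vdd_mv[idx], err);
        if (err > target_error_rate && idx > 0)
            idx--;                     /* too many errors: raise voltage */
        else if (err <= target_error_rate && idx < VDD_STEPS - 1)
            idx++;                     /* safe margin: lower voltage */
    }
    return 0;
}
```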

  10. Low-Power Interconnect • Bus Encoding: invert the transmitted word when that reduces switching (sketched after this slide) • Crosstalk: switching in neighboring wires induces unwanted activity; shield wires mitigate it • Low Swing Buses: signal with +300mV and -300mV instead of +5V and -5V • Relatively immune to crosstalk, but require extra hardware at the encoder and decoder • Bus Segmentation: a transfer only drives the segments between the communicating devices, so most of the bus stays inactive
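
Bus-invert coding, one common form of the inversion scheme the slide refers to, can be sketched in C: if transmitting the word as-is would toggle more than half of the 32 bus lines relative to the previous value, send its complement and assert an extra invert line.

```c
#include <stdint.h>
#include <stdio.h>

/* Bus-invert coding sketch for a 32-bit bus: choose between the raw
 * word and its complement so that at most half the lines toggle. */
static uint32_t prev_bus = 0;

static int count_bits(uint32_t x)
{
    int n = 0;
    for (; x; x &= x - 1)   /* clear lowest set bit each iteration */
        n++;
    return n;
}

static uint32_t encode(uint32_t data, int *invert_line)
{
    uint32_t toggles = data ^ prev_bus;     /* lines that would switch */
    int n = count_bits(toggles);            /* how many of them */

    *invert_line = (n > 16);                /* more than half? invert */
    uint32_t sent = *invert_line ? ~data : data;
    prev_bus = sent;                        /* bus now holds this value */
    return sent;
}

int main(void)
{
    int inv;
    uint32_t words[] = {0x00000000, 0xFFFFFFF0, 0x0000000F};
    for (unsigned i = 0; i < 3; i++) {
        uint32_t sent = encode(words[i], &inv);
        printf("data %08X -> bus %08X invert=%d\n", words[i], sent, inv);
    }
    return 0;
}
```

The receiver simply undoes the complement whenever the invert line is asserted.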

  11. Low-Power Interconnect • Adiabatic Buses: reuse charge already on the bus instead of drawing it all from the supply • Effectively reduces the capacitance that must be charged from the supply • Trade-off: delay in transferring charge between lines • Network-On-Chip: • Shared buses cannot keep up with the speed and volume of transfers between functional units • Generic interconnection networks replace buses • Allow concurrent connections

  12. Low-Power Memories and Memory Hierarchies • Reduce power regardless of memory type (ROM/RAM) • Split memories into smaller sub-banks: activate only the sub-bank involved in an access • Specialized caches to reduce accesses • A small cache in front of the first cache level holds the application's working set • Block Buffering – store the most recently accessed cache set in a buffer (sketched after this slide) • Scratch Pad Memories – contents managed by the compiler • Trace cache: store instructions in the order they executed • Dynamic direction prediction-based trace cache • Selective Trace Cache: the compiler helps decide what to place in it
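
Block buffering is easy to picture in software terms. The sketch below is a simplification (a single buffer holding one most-recently-used line): a hit in the buffer avoids re-reading the larger, power-hungry cache array.

```c
#include <stdint.h>
#include <stdio.h>

/* Block-buffering sketch: keep the most recently accessed line in a
 * small buffer; a hit in the buffer avoids activating the larger
 * cache data array (the expensive, power-hungry access). */

#define LINE_BYTES 64

static uint64_t buffered_tag = UINT64_MAX;   /* no line buffered yet */
static unsigned buffer_hits, array_reads;

static void access_cache(uint64_t addr)
{
    uint64_t tag = addr / LINE_BYTES;        /* which cache line */
    if (tag == buffered_tag) {
        buffer_hits++;                       /* served from block buffer */
    } else {
        array_reads++;                       /* full (costly) array access */
        buffered_tag = tag;                  /* remember this line */
    }
}

int main(void)
{
    /* Sequential scan: most accesses hit the buffered line. */
    for (uint64_t a = 0; a < 1024; a += 4)
        access_cache(a);
    printf("buffer hits: %u, array reads: %u\n", buffer_hits, array_reads);
    return 0;
}
```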

  13. Low-Power Processor Architecture Adaptations • Adaptive Caches: lines, blocks, or sets are selectively activated based on a miss threshold • Cutting a line's supply voltage entirely loses its data and adds delay when it is re-enabled • Cache Decay turns off cache lines that go unused for a set interval (sketched after this slide) • Hot Spot Detection: count taken branches to find hotspots and keep only the cache lines inside them active • Dead Block elimination: powers down cache lines holding basic blocks that have reached their final use (compiler-directed)
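
A minimal sketch of the cache-decay policy, under the assumption of a per-line counter that is cleared on access and incremented on a periodic tick; a line whose counter reaches the decay threshold is gated off, discarding its contents to save leakage.

```c
#include <stdio.h>
#include <stdbool.h>

/* Cache-decay sketch: each line has a small counter that is reset on
 * every access and incremented on a periodic tick; lines that go
 * unused for DECAY_TICKS intervals are gated off to cut leakage. */

#define NUM_LINES   8
#define DECAY_TICKS 4

struct line { unsigned idle_ticks; bool powered; };
static struct line cache[NUM_LINES];

static void on_access(int i)
{
    cache[i].powered = true;      /* wake the line (data refetched on a miss) */
    cache[i].idle_ticks = 0;      /* recently used: restart its counter */
}

static void decay_tick(void)
{
    for (int i = 0; i < NUM_LINES; i++) {
        if (!cache[i].powered)
            continue;
        if (++cache[i].idle_ticks >= DECAY_TICKS)
            cache[i].powered = false;   /* unused long enough: turn it off */
    }
}

int main(void)
{
    for (int i = 0; i < NUM_LINES; i++)
        on_access(i);
    for (int t = 0; t < 6; t++) {
        on_access(0);                   /* only line 0 stays hot */
        decay_tick();
    }
    for (int i = 0; i < NUM_LINES; i++)
        printf("line %d: %s\n", i, cache[i].powered ? "on" : "decayed");
    return 0;
}
```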

  14. Architecture Adaptations • Adaptive Instruction Queues: unused partitions of the queue are powered down • Heuristics: measure IPC against thresholds (sketched after this slide) • Algorithms for reconfiguring multiple structures: • Adjust the pipeline width and register update unit per hotspot • Test configurations within each hotspot • Offline profiling • Occupancy-based policies • Selective Way Caches: measure cache hits in each way to decide which ways to keep active
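
The IPC-threshold heuristic might look like the following sketch; the queue sizes and thresholds are invented for illustration, not taken from the paper.

```c
#include <stdio.h>

/* Hypothetical heuristic for an adaptive instruction queue: compare
 * per-interval IPC against thresholds and power queue partitions up
 * or down accordingly.  Sizes and thresholds are illustrative only. */

#define MIN_ENTRIES 16
#define MAX_ENTRIES 64
#define STEP        16

static int resize_queue(int current, double ipc)
{
    if (ipc < 0.8 && current > MIN_ENTRIES)
        return current - STEP;       /* low ILP: power down a partition */
    if (ipc > 1.5 && current < MAX_ENTRIES)
        return current + STEP;       /* high ILP: re-enable a partition */
    return current;                  /* within band: leave it alone */
}

int main(void)
{
    double ipc_trace[] = {2.0, 1.9, 0.6, 0.5, 0.7, 1.8, 2.1};
    int entries = MAX_ENTRIES;
    for (unsigned i = 0; i < sizeof ipc_trace / sizeof ipc_trace[0]; i++) {
        entries = resize_queue(entries, ipc_trace[i]);
        printf("interval %u: IPC %.1f -> %d active entries\n",
               i, ipc_trace[i], entries);
    }
    return 0;
}
```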

  15. Dynamic Voltage Scaling • Modulate the clock frequency and supply voltage at run time, depending on workload • Difficulties: • Unpredictable workloads (tasks and I/O requests; run times are hard to predict) • Indeterminism – how fast should the processor run? • Running an application at the slowest speed may not be best: the rest of the system stays powered longer, and energy does not scale linearly with frequency

  16. Dynamic Voltage Scaling • Interval-Based approaches: measure how busy the processor was in recent intervals and estimate the next one; struggle when workloads are irregular • Simple idle-threshold policies can thrash between speeds • Aged averages: weight recent intervals more heavily (sketched after this slide) • Intertask Approaches: assign a speed to each task • Monitor hardware events • Task frequencies derived offline cannot be known perfectly beforehand • Unaware of program structure, such as memory access behavior
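
An interval-based policy with an aged (exponentially weighted) average can be sketched as follows; the weighting factor, utilization trace, and frequency table are all assumptions for illustration.

```c
#include <stdio.h>

/* Interval-based DVS sketch: predict the next interval's utilization
 * with an exponentially weighted (aged) average of past utilizations
 * and map the prediction onto a small set of frequency levels.
 * Weights and the frequency table are illustrative assumptions. */

static const int freq_mhz[] = {400, 800, 1200, 1600};

static int pick_frequency(double predicted_util)
{
    if (predicted_util < 0.25) return freq_mhz[0];
    if (predicted_util < 0.50) return freq_mhz[1];
    if (predicted_util < 0.75) return freq_mhz[2];
    return freq_mhz[3];
}

int main(void)
{
    const double alpha = 0.5;        /* aging factor: newer intervals count more */
    double util_trace[] = {0.9, 0.8, 0.2, 0.1, 0.15, 0.7, 0.95};
    double aged = util_trace[0];

    for (unsigned i = 0; i < sizeof util_trace / sizeof util_trace[0]; i++) {
        aged = alpha * util_trace[i] + (1.0 - alpha) * aged;  /* aged average */
        printf("interval %u: util %.2f, prediction %.2f -> %d MHz\n",
               i, util_trace[i], aged, pick_frequency(aged));
    }
    return 0;
}
```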

  17. Dynamic Voltage Scaling • Intratask Approaches: adjust processor speed and voltage within a task • Split a task into fixed-length time slots • Slow down execution that is off the critical path, with help from the compiler • Memory-Bound Code: memory accesses, not the clock, limit how fast the program can execute • Heuristics derived through experimentation: • Cache miss counters (sketched after this slide) • Stall cycle counters, with program counters marked as hot • Measure the instruction rate to detect compute-intensive phases
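
One way to picture the cache-miss-counter heuristic, with invented counter values and an invented threshold: a time slot with a high miss-per-instruction ratio is memory bound, so it can run at a lower clock with little slowdown.

```c
#include <stdio.h>

/* Intratask DVS sketch driven by memory-boundedness: a slot with a
 * high miss-per-instruction ratio is limited by memory latency, so it
 * can run at a lower clock with little slowdown.  Counter values and
 * the threshold below are invented for illustration. */

struct slot_counters { long instructions; long cache_misses; };

static int choose_mhz(struct slot_counters c)
{
    double mpi = (double)c.cache_misses / (double)c.instructions;
    return (mpi > 0.02) ? 600 : 1600;   /* memory-bound -> slow clock */
}

int main(void)
{
    struct slot_counters trace[] = {
        {1000000,  2000},   /* compute-intensive slot */
        {1000000, 60000},   /* memory-bound slot      */
        {1000000, 45000},   /* memory-bound slot      */
        {1000000,  1000},   /* compute-intensive slot */
    };
    for (unsigned i = 0; i < sizeof trace / sizeof trace[0]; i++)
        printf("slot %u -> %d MHz\n", i, choose_mhz(trace[i]));
    return 0;
}
```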

  18. Dynamic Voltage Scaling • Multiple Clock Domain Architectures: • Globally Asynchronous, Locally Synchronous (GALS) chips: • The chip is split into multiple domains with independent clock rates • Sections of the CPU can scale down when not needed • Domains must be chosen so that cross-domain communication doesn't waste more energy than is saved • Each domain's voltage can be scaled based on the occupancy of its instruction issue queue

  19. Resource Hibernation • Disk Drives: stop rotating the platter after an acceptable idle threshold (sketched after this slide) • Delay non-urgent requests in a queue to lengthen idle periods • Dynamic-RPM drives for servers • Network Interfaces: can the card be turned off? • Track device idleness and enter a listening or sleep mode • Lets the network card stay idle for a while before shutting down • Displays: dim the display when there is no input • Face-off: detect whether a face is in front of the display • Zoned Backlighting: adjust the brightness of individual display regions
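
A minimal sketch of the idle-threshold spin-down policy; the threshold and request times are assumptions chosen only to show the mechanism.

```c
#include <stdio.h>
#include <stdbool.h>

/* Disk hibernation sketch: spin the platter down once the device has
 * been idle longer than an assumed break-even threshold.  Times are
 * in seconds; the threshold value is illustrative only. */

#define SPINDOWN_THRESHOLD_S 10.0

struct disk { bool spinning; double last_request_time; };

static void on_request(struct disk *d, double now)
{
    if (!d->spinning) {
        printf("t=%.0fs: spin up (latency + energy cost paid here)\n", now);
        d->spinning = true;
    }
    d->last_request_time = now;
}

static void on_timer(struct disk *d, double now)
{
    if (d->spinning && now - d->last_request_time > SPINDOWN_THRESHOLD_S) {
        printf("t=%.0fs: idle %.0fs, spinning down\n",
               now, now - d->last_request_time);
        d->spinning = false;
    }
}

int main(void)
{
    struct disk d = { true, 0.0 };
    double request_times[] = {1, 2, 3, 40, 41};  /* long gap after t=3s */
    unsigned next = 0;

    for (double now = 0; now <= 50; now += 1.0) {
        if (next < 5 && request_times[next] == now)
            on_request(&d, request_times[next++]);
        on_timer(&d, now);
    }
    return 0;
}
```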

  20. Compiler-Level Power Management • Generating code that reduces execution time usually saves energy, but there is no fixed relationship between performance and power • Reduce memory accesses • Remote Compilation and Remote Execution • A server compiles (or executes) and the mobile device downloads the result • The energy cost of the download must be less than that of compiling locally • Limits of statically optimizing compilers: • The program's runtime behavior may differ from what was expected • The process will run on an unpredictable system

  21. Compiler-Level Power Management • Dynamic Compilation: the program is recompiled as the runtime environment changes • Reacts to resource levels such as battery capacity and energy budgets • Trade-off: recompilation itself costs time and energy

  22. Application-Level Power Management • Enable the application to adapt to its runtime environment • Trade off the fidelity or quality of data delivered to users • Lower QoS when resources are low • Interfaces that let applications provide hints • The application communicates with the OS, and the OS with the hardware • Hints include expected task execution times and deadlines • Enables better DVS and lets the disk power down for longer periods

  23. Cross-Layer Adaptations • Forge: integrated power management framework • Streams video at the most energy-efficient QoS level • Combines frequency and voltage scaling with network card management • Grace: adaptation framework • Global and local adaptations • Compiler and operating system interaction • The compiler supplies a worst-case estimate and deadline • The OS adjusts the processor speed to just meet the deadline (sketched after this slide)
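
The compiler/OS interaction in a Grace-style scheme reduces to a small calculation, sketched below as an illustrative simplification (the frequency table and numbers are assumptions): given the compiler's worst-case cycle estimate and a deadline, the OS picks the slowest frequency that still finishes on time.

```c
#include <stdio.h>

/* Sketch of deadline-driven speed setting: the compiler supplies a
 * worst-case cycle estimate for a task, the OS picks the slowest
 * frequency that still meets the deadline.  The frequency table and
 * numbers are assumptions for illustration. */

static const double freq_hz[] = {400e6, 800e6, 1200e6, 1600e6};
#define NUM_FREQS (sizeof freq_hz / sizeof freq_hz[0])

static double slowest_meeting_deadline(double worst_case_cycles, double deadline_s)
{
    for (unsigned i = 0; i < NUM_FREQS; i++) {
        double runtime = worst_case_cycles / freq_hz[i];
        if (runtime <= deadline_s)
            return freq_hz[i];          /* first (slowest) feasible speed */
    }
    return freq_hz[NUM_FREQS - 1];      /* deadline too tight: run flat out */
}

int main(void)
{
    double wc_cycles = 30e6;            /* compiler's worst-case estimate */
    double deadline  = 0.033;           /* e.g. one video frame period, 33 ms */
    printf("chosen frequency: %.0f MHz\n",
           slowest_meeting_deadline(wc_cycles, deadline) / 1e6);
    return 0;
}
```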

  24. Conclusion of Techniques • Multifaceted effort from various disciplines • From transistors to applications, and across all layers • Still ongoing research, new algorithms and heuristics • Impossible to tell what new technologies will prove most successful

  25. Commercial Systems • Pentium 4: designed for high performance • Internal temperature cap • Intel SpeedStep – 2 frequency and voltage settings • Pentium M: mobile performance and low power • Reduces switching activity in the circuit, idles unused units and buses • Low-leakage transistors in the cache • Enhanced SpeedStep with 6 frequency/voltage settings • Intel PXA27x: wireless handheld devices • Uses memory boundedness to manage power modes

  26. Emerging Radical Technologies • Fuel cells to replace batteries • Driven by a chemical reaction, and can supply energy indefinitely as long as fuel is provided • Fuel enters the anode and splits into protons and electrons, generating charge • Fuels such as hydrogen are abundantly available • Micro-Electro-Mechanical Systems (MEMS) • Convert mechanical to electrical energy • Millimeter-scale turbine engines ignite a fuel-air mixture • Concerns: hot exhaust gases and flammability
