Explore the latest power-saving mechanisms and dynamic power management strategies to optimize energy consumption. Learn about dynamic voltage scaling, sleep states, and lower power usage modes in devices to enhance efficiency and performance. Discover the power-latency tradeoff and how to balance energy consumption and task completion time for maximum benefits.
 
                
Online Power Saving Strategies Sandy Irani Joint work with Rajesh Gupta, Sandeep Shukla, Dinesh Ramanathan
Motivation • System-level and application-controlled power management is gaining importance. • Power is becoming a first-class design parameter for software and applications. • Greater power savings are possible if knowledge of the application's demands is taken into account.
Power Savings Mechanisms • Dynamic Power Management • When a device is idle, it can transition to low-power 'sleep' states. • Dynamic Voltage Scaling • A device can be run at different speeds with different power usage rates. • Execution of jobs can be slowed down to save power as long as all jobs are completed by their deadlines.
This Talk • Extend work on dynamic power management to handle devices with multiple sleep states. • Design and analyze algorithms for systems that allow both dynamic power management and dynamic voltage scaling.
Dynamic Power Management • Current Trend • Design Devices with sleep states • Provide driver hooks to change the power states under operating system control • For “power-hungry” peripheral devices it is common • Disk-Drives • Network Interface cards (Wireless card) • Display devices • DRAM • O/S designers design Dynamic Power Management Strategies to take advantage of that.
Dynamic Power Management • When a device becomes idle, it can transition to lower power usage state. • A fixed amount of additional time and energy are required to transition back to active state when a new request for service arrives. • What is the best time threshold to transition to the sleep state? • Too soon: pay start-up cost too frequently. • Too late: spend too much time in the high-power state
2-state vs. Multi-state • 2-state case • One idle state • One power saving state • Multi-State • Idle state, and multiple power saving States. • Each power saving state has different power characteristics, and transition penalty. • Example: IBM Disk Drive • Idle, standby, sleep
Previous Work • Deterministic algorithm (ski rental) • Transition to the sleep state when the energy spent so far in the active state is at least the cost of 'waking up'. • Normalize the cost of transitioning from sleep to active state to 1. • Power consumption rate of the active state is $\alpha$, so the threshold is $1/\alpha$. • This algorithm is 2-competitive. • 2 is the best possible competitive ratio for any deterministic algorithm.
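A minimal sketch of the ski-rental threshold rule, assuming the wake-up cost is normalized to 1 and the active-state power rate is a single number `alpha`. The function names are hypothetical and chosen only for illustration. It compares the online cost with the offline cost over a range of idle lengths and shows the ratio never exceeding 2.

```python
def threshold_policy_cost(idle_length, alpha, wake_cost=1.0):
    """Energy spent on one idle period by the deterministic threshold rule."""
    threshold = wake_cost / alpha            # time at which alpha * t equals wake_cost
    if idle_length < threshold:
        return alpha * idle_length           # request arrived before we slept
    return alpha * threshold + wake_cost     # stayed active until threshold, then paid wake-up


def offline_cost(idle_length, alpha, wake_cost=1.0):
    """Cost when the idle length is known in advance: stay active or sleep at once."""
    return min(alpha * idle_length, wake_cost)


# Worst ratio over idle lengths 0.01, 0.02, ..., 9.99 with alpha = 0.5
worst_ratio = max(
    threshold_policy_cost(t / 100, 0.5) / offline_cost(t / 100, 0.5)
    for t in range(1, 1000)
)
```

The ratio is 1 for idle periods shorter than the threshold and 2 for all longer ones, which is where the competitive ratio of 2 comes from.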
Previous Work, cont. • Idle period length generated by a known distribution with density function p(t). • Choose threshold T to minimize the expected cost, where $\alpha$ is the active-state power rate and the wake-up cost is 1: $\int_0^T \alpha t\, p(t)\, dt + \int_T^\infty (\alpha T + 1)\, p(t)\, dt$ • Theorem [Karlin, Manasse, McGeoch and Owicki] • For any distribution p(t), the expected cost of the above algorithm is within e/(e-1) of the optimal cost. Furthermore, there is a distribution for which no algorithm can be better than e/(e-1) times optimal.
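A sketch of the threshold choice for a known distribution, under a simplifying assumption that is not in the slides: the distribution is given as a list of equally likely idle lengths instead of a density p(t). The cost model assumed here is that an idle period of length t costs `alpha * t` if t is at most T, and `alpha * T + wake_cost` otherwise. Function names are hypothetical.

```python
def expected_cost(T, samples, alpha, wake_cost=1.0):
    """Average cost of threshold T over equally likely idle lengths."""
    total = 0.0
    for t in samples:
        if t <= T:
            total += alpha * t                  # request arrives before we sleep
        else:
            total += alpha * T + wake_cost      # sleep at T, pay wake-up later
    return total / len(samples)


def best_threshold(samples, alpha, wake_cost=1.0):
    """Return the threshold with minimum expected cost.

    Between two consecutive sample values the cost only grows with T,
    so it is enough to try T = 0 and T equal to each sample value.
    """
    candidates = [0.0] + sorted(set(samples))
    return min(candidates, key=lambda T: expected_cost(T, samples, alpha, wake_cost))


# Three short idle periods and one very long one (made-up values)
chosen = best_threshold([0.1, 0.2, 0.3, 10.0], alpha=1.0)
```

With these made-up samples the best choice is to wait out the three short periods and sleep right after 0.3, which is well before the deterministic threshold of 1.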
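Multi-state Case • Let there be k+1 states. • Let State k be the shut-down state and State 0 be the active state. • Let $\alpha_i$ be the power dissipation rate in state i. • Let $\beta_i$ be the total energy dissipated to move from state i back to the active state (State 0). • States are ordered such that $\alpha_{i+1} \le \alpha_i$. • $\alpha_k = 0$ and $\beta_0 = 0$ (without loss of generality). • Power-down energy cost can be incorporated in the power-up cost for analysis (if additive).
Lower Envelope Idea • For each state i, plot the line $\alpha_i t + \beta_i$, the energy spent if the device waits in state i for an idle period of length t and then returns to the active state. • [Figure: energy versus time, one line per state (State 1 to State 4); the lower envelope changes from one line to the next at times t1, t2, t3.]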
Deterministic Lower Envelope Algorithm • The Lower Envelope Defines an ordering of the states. • Throw out states that do not appear on lower envelope • Given this ordering, only need to determine thresholds: • When to transition from state i to state i+1. • Lower Envelope Algorithm Transitions from one state to the next at the discontinuities of the lower envelope curve. • THEOREM: Lower Envelope Algorithm is 2-competitive.
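The lower envelope construction can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each state is a pair `(alpha, beta)` giving its power rate and its energy cost to return to the active state, states are listed with strictly decreasing `alpha`, and state 0 has `beta = 0`. The numbers in the example are made up.

```python
def lower_envelope(states):
    """Walk the lower envelope of the lines alpha_i * t + beta_i.

    states: list of (alpha, beta), strictly decreasing alpha, states[0][1] == 0.
    Returns [(time, state_index), ...]: the device enters state_index at time.
    States that never appear in the result are not on the envelope.
    """
    schedule = [(0.0, 0)]
    current = 0
    while True:
        a_cur, b_cur = states[current]
        best = None
        for j in range(current + 1, len(states)):
            a_j, b_j = states[j]
            # Time at which line j drops below the current line
            t_cross = (b_j - b_cur) / (a_cur - a_j)
            # On ties, prefer the deeper (lower-power) state
            if best is None or t_cross <= best[0]:
                best = (t_cross, j)
        if best is None:
            return schedule
        schedule.append(best)
        current = best[1]


# Four made-up states: active, two intermediate states, and shut-down
thresholds = lower_envelope([(2.0, 0.0), (1.0, 1.0), (0.9, 3.0), (0.0, 4.0)])
```

In this example the third state is skipped: its line never dips below the others, so the device goes from state 1 straight to the shut-down state at time 3.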
Probabilistic Lower Envelope Algorithm • Use same order of states as determined by lower envelope function. • Our approach: • Determine threshold for transitioning from state i to state i+1 by solving the optimization problem where i and i+1 are the only states in the system.
Probabilistic Lower Envelope Algorithm • Can show that: • THEOREM: The Probabilistic Lower Envelope Algorithm is e/(e-1)-competitive.
Power-Latency Tradeoff • Tasks arrive through time and take time to run. • If the device is busy when a task arrives, the task waits in a queue. • An idle period begins when the device finishes the current job and the queue is empty. • If the device transitions to a sleep state in an idle period, some latency is incurred as the device transitions back to the active state. • This in turn affects (shortens) the length of future idle periods. • Power-latency tradeoff extremes: • Minimize latency: always stay in the active state. • Minimize energy usage: delay completing any tasks until they have all arrived.
Experimental Study: IBM Mobile Hard Drive
State         Power Consumption   Start-up Energy (Joules)   Transition Time to Active
Sleep         0                   4.75                       5 s
Stand-by      0.2                 1.575                      1.5 s
Idle          0.9                 0.56                       40 ms
Active/Busy   2.4                 0                          0 s
• Trace data with arrival times of disk accesses from Auspex file server archive.
Histogram for the Probabilistic Lower Envelope Algorithm • Create histogram • Partition the possible idle period range (0, ∞) into intervals • Let r_i denote the left end of the ith interval • Keep a counter for the number of idle periods among the last w idle periods that fall in the range [r_{i-1}, r_i) • Update thresholds every r idle periods: • Use the Probabilistic Lower Envelope Algorithm to calculate thresholds, using the histogram as an estimate of the probability distribution generating the upcoming idle period. • Takes time O(#bins × #states) • Similar to [Keshav, Lund, Phillips, Reingold, Saran]
Histogram for the Probabilistic Lower Envelope Algorithm • How many partitions do we need for the histogram? • More partitions, more accurate estimation • More partitions, more expensive computation • Where should we partition? • Our method: • Pick a constant c (we chose c = 5). • Let T_1, …, T_k be the discontinuities of the Lower Envelope (i.e. thresholds for the Lower Envelope Algorithm). • Partition each range [T_i, T_{i+1}] into c equal-size bins.
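A sketch of the histogram bookkeeping and of one two-state threshold computation. Several things here are assumptions made for illustration: the class and function names, the use of bin midpoints as representative idle lengths, restricting candidate thresholds to bin edges, and clamping idle periods beyond the last edge into the last bin. It is a straightforward quadratic scan, not the O(#bins × #states) method the slide refers to.

```python
from bisect import bisect_right
from collections import deque


class IdleHistogram:
    """Counts, per bin, how many of the last `window` idle periods fell in it."""

    def __init__(self, edges, window):
        self.edges = edges                    # bin i covers [edges[i], edges[i+1])
        self.recent = deque(maxlen=window)    # bin index of each recent idle period

    def record(self, idle_length):
        i = bisect_right(self.edges, idle_length) - 1
        # Clamp out-of-range values into the first or last bin
        self.recent.append(min(max(i, 0), len(self.edges) - 2))

    def counts(self):
        c = [0] * (len(self.edges) - 1)
        for i in self.recent:
            c[i] += 1
        return c


def pair_threshold(edges, counts, a_hi, a_lo, extra_wake):
    """Best bin edge at which to move from a state with power a_hi to one with
    power a_lo, where the deeper state costs extra_wake more to wake from."""
    mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

    def cost(T):
        total = 0.0
        for t, n in zip(mids, counts):
            if t <= T:
                total += n * a_hi * t
            else:
                total += n * (a_hi * T + a_lo * (t - T) + extra_wake)
        return total

    return min(edges, key=cost)


hist = IdleHistogram(edges=[0, 1, 2, 3, 4], window=4)
for idle in (0.5, 0.6, 0.4, 3.5):
    hist.record(idle)
threshold = pair_threshold(hist.edges, hist.counts(), a_hi=1.0, a_lo=0.0, extra_wake=1.0)
```

Because the window is a bounded deque, old idle periods fall out automatically as new ones are recorded, so the thresholds track recent behavior.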
Dynamic Voltage Scaling • Device which can run at any speed s. • Power consumed when running at speed s is given by a convex function P(s). • Jobs arrive through time. Job j has: • Arrival time: $a_j$ • Deadline: $b_j$ • Work required: $R_j$ • Schedule S = (s, job) • s(t) is the speed of the device at time t. • job(t) is which job is executed at time t.
Dynamic Voltage Scaling, No Sleep (DVS-NS) • Schedule S is feasible for a set of jobs J if for every j in J: $\int_{a_j}^{b_j} s(t)\, \delta(\mathrm{job}(t), j)\, dt = R_j$, where $\delta(x, y)$ is 1 if x = y and 0 otherwise. • Cost of schedule S is: $\mathrm{cost}(S) = \int P(s(t))\, dt$
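A small sketch of checking feasibility and computing cost for a piecewise-constant schedule. The representation is an assumption made for illustration: a schedule is a list of `(start, end, speed, job)` segments, jobs are a dict mapping a name to `(arrival, deadline, work)`, and feasibility is taken to mean that each job runs only inside its window and receives at least its required work.

```python
def is_feasible(segments, jobs, tol=1e-9):
    """True if every job gets its work done inside [arrival, deadline]."""
    done = {name: 0.0 for name in jobs}
    for start, end, speed, name in segments:
        if name is None:                       # idle segment, no job running
            continue
        arrival, deadline, _ = jobs[name]
        if start < arrival - tol or end > deadline + tol:
            return False
        done[name] += speed * (end - start)
    return all(done[name] >= jobs[name][2] - tol for name in jobs)


def energy(segments, power):
    """Integral of power(speed) over a piecewise-constant schedule."""
    return sum(power(speed) * (end - start) for start, end, speed, _ in segments)


# Made-up instance: job B is nested inside job A's window
jobs = {"A": (0, 4, 2), "B": (1, 2, 1)}
segments = [(0, 1, 0.5, "A"), (1, 2, 1.0, "B"), (2, 4, 0.75, "A")]
total = energy(segments, lambda s: s ** 2)
```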
DVS with Sleep State (DVS-S) • Schedule S = (s, job, h): • h(t) = sleep or on • If h(t) = sleep, then s(t) = 0. • Power is a function of speed and state: • P(s, state) = P(s) if state = on. • P(s, state) = 0 if state = sleep. • P(0) is the power required to keep the device on with no tasks running.
DVS with Sleep State (DVS-S) • Requirements for a feasible schedule are the same. • Let k be the number of times the device transitions from the sleep state to the on state. • Cost of a schedule S, with the energy of one sleep-to-on transition normalized to 1, is: $\mathrm{cost}(S) = \int P(s(t), h(t))\, dt + k$
Rockwell WINS Node
Capabilities: vibration, acoustic, accelerometer, magnetometer, temperature sensing
[Figure: node block diagram with blocks labeled Communication Subsystem, Radio Modem, Micro Controller, Rest of the Node, CPU, Sensor, GPS.]
Processor   Seismic Sensor   Radio             Power (mW)
Active      On               Rx                751.6
Active      On               Idle              727.5
Active      On               Sleep             416.3
Active      On               Removed           383.3
Active      Removed          Removed           360.0
Active      On               Tx (36.3 mW)      1080.5
Active      On               Tx (27.5 mW)      1033.3
Active      On               Tx (19.1 mW)      986.0
Active      On               Tx (13.8 mW)      942.6
Active      On               Tx (10.0 mW)      910.9
Active      On               Tx (3.47 mW)      815.5
Active      On               Tx (2.51 mW)      807.5
Active      On               Tx (1.78 mW)      799.5
Active      On               Tx (1.32 mW)      791.5
Active      On               Tx (0.955 mW)     787.5
Active      On               Tx (0.437 mW)     775.5
Active      On               Tx (0.302 mW)     773.9
Active      On               Tx (0.229 mW)     772.7
Active      On               Tx (0.158 mW)     771.5
Active      On               Tx (0.117 mW)     771.1
Summary • Processor = 360 mW, doing repeated transmit/receive • Sensor = 23 mW • Processor : Tx = 1 : 2 • Processor : Rx = 1 : 1 • Total Tx : Rx = 4 : 3 at maximum range
SmartBadge • Battery-powered embedded system. • Sharp’s display, wireless local area network (WLAN) card, StrongARM-1100 processor, Micron’s SDRAM memory, FLASH memory, sensors, modem/audio analog front-end on printed circuit board. • Goal: allow computer or human user to provide location and environmental information to a location server through a heterogeneous network. • Operates as part of a client-server system: initiates and terminates communication sessions. • [Simunic, 2001, PhD Thesis, Stanford University]
Previous Work on DVS-NS • Yao, Demers and Shenker: • Polynomial-time offline algorithm to find the optimal schedule for a set of jobs. • Online algorithm Average Rate: • $s_j(t) = R_j / (b_j - a_j)$ for $a_j < t < b_j$, and 0 otherwise; the device speed is $s(t) = \sum_j s_j(t)$. • job(t): Earliest Deadline First. • Competitive ratio c of Average Rate, where the power function P is a degree-d polynomial: $d^d \le c \le 2^{d-1} d^d$
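A sketch of the Average Rate rule, assuming jobs are given as `(arrival, deadline, work)` tuples. The device speed at time t is the sum of the densities of the jobs whose window contains t, and the job to run is chosen by Earliest Deadline First. The helper names and the `remaining` list are hypothetical.

```python
def average_rate_speed(jobs, t):
    """Device speed at time t: sum of work / window length over live jobs."""
    return sum(
        work / (deadline - arrival)
        for arrival, deadline, work in jobs
        if arrival < t < deadline
    )


def edf_choice(jobs, remaining, t):
    """Index of the released, unfinished job with the earliest deadline, or None."""
    ready = [
        i for i, (arrival, _, _) in enumerate(jobs)
        if arrival <= t and remaining[i] > 0
    ]
    return min(ready, key=lambda i: jobs[i][1]) if ready else None


jobs = [(0, 4, 2), (1, 2, 1)]           # made-up instance
speed_mid = average_rate_speed(jobs, 1.5)
speed_late = average_rate_speed(jobs, 3.0)
```

Each job contributes its density for its whole window regardless of what else arrives, which is what makes the rule usable online.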
Our Results on DVS-S • Offline algorithm whose cost is within a factor of 3 of optimal • Online algorithm • Let A be an online algorithm for DVS-NS that achieves a competitive ratio of c. • Let d be the smallest constant such that for all x,y greater than 0, • Theorem: the Competitive ratio of the online algorithm is at most
Optimal Offline Algorithm for DVS-NS [Yao, Demers, and Shenker] • The algorithm schedules jobs as it goes and blacks out intervals of time for which the device has already been scheduled. • A job j is contained in an interval [z,z’] if $z \le a_j$ and $b_j \le z'$. • For interval [z,z’], define l(z,z’) to be the length of the interval minus the blackout times. • Define the intensity of interval [z,z’] to be $g(z,z') = \frac{\sum_j R_j}{l(z,z')}$, where the sum is taken over all unscheduled jobs j that are contained in [z,z’].
Optimal Offline Algorithm for DVS-NS • Repeat until all jobs are scheduled: • Find the interval [z,z’] with the maximum intensity. • Set s(t) = g(z,z’) for all t in [z,z’]. • Blackout the interval [z,z’]. • Remove all jobs that are contained in [z,z’].
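The loop above can be sketched in a few lines. One deliberate substitution, named plainly: instead of keeping a list of blackout intervals, this sketch removes the scheduled interval from the timeline and shifts the remaining arrival times and deadlines, which should give the same available lengths l(z,z’). It returns only the speed assigned to each job, not the full timeline, and it scans all candidate intervals naively. Jobs are assumed to be `(arrival, deadline, work)` tuples with arrival < deadline.

```python
def yds_speeds(jobs):
    """Speed assigned to each job by repeatedly scheduling the densest interval."""
    live = {i: [a, b, w] for i, (a, b, w) in enumerate(jobs)}
    speeds = [0.0] * len(jobs)
    while live:
        best = None  # (intensity, z, z2)
        starts = {a for a, _, _ in live.values()}
        ends = {b for _, b, _ in live.values()}
        for z in starts:
            for z2 in ends:
                if z2 <= z:
                    continue
                # Work of all unscheduled jobs contained in [z, z2]
                work = sum(w for a, b, w in live.values() if z <= a and b <= z2)
                intensity = work / (z2 - z)
                if best is None or intensity > best[0]:
                    best = (intensity, z, z2)
        intensity, z, z2 = best
        # Schedule and remove every job contained in the chosen interval
        for i in [i for i, (a, b, _) in live.items() if z <= a and b <= z2]:
            speeds[i] = intensity
            del live[i]
        # Cut [z, z2] out of the timeline for the remaining jobs
        for job in live.values():
            for k in (0, 1):
                if job[k] >= z2:
                    job[k] -= z2 - z
                elif job[k] > z:
                    job[k] = z
    return speeds


# Made-up instance: a tight job nested inside a loose one
result = yds_speeds([(0, 4, 2), (1, 2, 2)])
```

In the example the nested job forces speed 2 on [1, 2]; once that interval is cut out, the outer job has 3 time units left for 2 units of work and runs at speed 2/3.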
Optimal Offline Algorithm for DVS-NS: Example • [Figure: speed versus time, with regions labeled 1, 2, 3.]
Critical Speed • If the cost to transition from the sleep state to the on state were 0, the optimal speed for all jobs would be the s that minimizes (Rj/s) P(s). This is the s that satisfies P(s) = s P’(s). Call this Scrit, the critical speed. • If we compress the execution of a task by x time units, • we expend additional energy because we execute the job faster • we save P(0) x, the energy of keeping the device on for time x. • Scrit is the point at which it is no longer beneficial to compress the execution of a task.
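A numerical sketch of finding the critical speed, assuming P is convex with P(0) > 0 so that the energy per unit of work, P(s)/s, has a single minimum. The search bounds and the example power function P(s) = 2 + s^3 are assumptions made for illustration. For that function the condition P(s) = s P’(s) reads 2 + s^3 = 3 s^3, so the expected answer is s = 1.

```python
def critical_speed(power, lo=1e-6, hi=100.0, iters=200):
    """Ternary search for the speed that minimizes power(s) / s on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if power(m1) / m1 < power(m2) / m2:
            hi = m2          # minimum lies to the left of m2
        else:
            lo = m1          # minimum lies to the right of m1
    return (lo + hi) / 2


s_crit = critical_speed(lambda s: 2.0 + s ** 3)
```

Running a job slower than this speed costs more energy per unit of work because the device stays on longer; running faster costs more because of the convex speed term.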
Offline Algorithm for DVS-S • Run the optimal offline algorithm for DVS-NS until the maximum intensity interval has intensity less than Scrit. • Now we must decide how to schedule the remaining tasks. • There is a feasible schedule in which all jobs are run at a speed of Scrit or less. • First decide on intervals of time in which the device will sleep. Then run the optimal DVS-NS algorithm with these intervals blacked out to determine device speed. • How to decide on the sleep intervals?
Idea • Run the device at speed 0 or Scrit. • Interval in which s(t) = 0 is an idle interval. • Interval in which s(t) = Scrit is an active interval. • The active time is the same over all schedules. • The cost of an idle interval of length i is the minimum of P(0) i and 1. • Try to minimize the cost of all idle intervals. • Want fewer, longer intervals. • Ignoring the fact that compressing to a speed of Scrit is more costly for some jobs than others.
Offline Algorithm for DVS-S: Example • [Figure: speed versus time, with the level Scrit marked on the speed axis.]
Left-To-Right Algorithm • Decide on Active/Idle Intervals: • Sweep from left to right. • While active, run as many jobs as possible until there are no pending jobs in the system. Then the device must become idle. • While idle, remain idle until it is necessary to start running jobs again in order to run all jobs by their deadline at a speed of Scrit. • Decide on Sleep/On Intervals: • Active interval becomes an on interval. • Idle interval of length < 1/P(0) becomes an on interval. • Idle interval of length > 1/P(0) becomes a sleep interval.
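A sketch of the second step only, the sleep/on decision for idle intervals that have already been chosen. It assumes the wake-up cost is normalized to 1 and that `p0` stands for P(0), the power needed to keep the device on while idle; the function names are hypothetical. The left-to-right sweep that produces the idle intervals is not shown.

```python
def classify_idle(idle_intervals, p0, wake_cost=1.0):
    """Label each (start, end) idle interval as 'on' or 'sleep'."""
    break_even = wake_cost / p0      # length at which staying on costs one wake-up
    return [
        (start, end, "sleep" if end - start > break_even else "on")
        for start, end in idle_intervals
    ]


def idle_cost(idle_intervals, p0, wake_cost=1.0):
    """Total idle energy: each interval costs the cheaper of staying on or sleeping."""
    return sum(min(p0 * (end - start), wake_cost) for start, end in idle_intervals)


# Made-up idle intervals; with p0 = 0.5 the break-even length is 2
labels = classify_idle([(0, 1), (3, 8)], p0=0.5)
total_idle = idle_cost([(0, 1), (3, 8)], p0=0.5)
```

Because the offline algorithm knows each idle interval's length in advance, it pays exactly the minimum of the two options, unlike the online threshold rule.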
Results • Theorem: the cost of Left-To-Right on any set of jobs is within a factor of three of optimal. • Lemma: no idle interval for the optimal algorithm can contain two idle intervals of Left-To-Right.
[Figure: schedules of OPT and LTR compared.]
Proof for LTR • Divide LTR cost into three components: • ACT_LTR: P(0) times the length of all the active intervals. • RUN_LTR: The cost to run the jobs beyond the energy spent keeping the device on: $\int (P(s(t)) - P(0))\, dt$, where the integral is taken over all active intervals. • IDLE_LTR: The cost of each idle period. • Either 1 or the length times P(0).
[Figure: power versus time for LTR, with levels P(Scrit) and P(0) marked and regions labeled ACT_LTR, IDLE_LTR, RUN_LTR.]
• ACT_LTR is at most ACT_OPT: Optimal will not run any job faster than Scrit. • RUN_LTR is at most ACT_OPT + RUN_OPT. • IDLE_LTR is at most ON_OPT + 3 SLEEP_OPT. • [Figure: speed versus time for OPT and LTR.]
IDLE_LTR is at most ON_OPT + 3 SLEEP_OPT • Consider an interval I in which LTR is idle. • If OPT is on during all of I, then the cost of I is covered by the cost incurred by OPT in keeping the device on during I. • Consider all intervals I such that OPT is asleep during a portion of I. The number of such intervals is at most 3 times the number of times OPT is in the sleep state. • [Figure: on/sleep intervals of OPT aligned against active/idle intervals of LTR.]
Online Algorithm for DVS-S • Decide on Active/Idle Intervals: • Sweep from left to right. • While active, run as many jobs as possible until there are no pending jobs in the system. Then device must become idle. • While idle, remain idle until it is necessary to start running jobs again in order to run all jobs (that we know about) by their deadline at a speed of Scrit • Algorithm name: PROCRASTINATOR • Decide on Sleep/On Intervals: • If idle, stay on until cost of staying equals cost of waking up.
Online Algorithm for DVS-S • Decide on device speed. • Let A be an online algorithm for DVS-NS. • Whenever feasible, run device at speed Scrit • If a job arrives which makes it impossible to complete all jobs at a speed of Scrit by their deadline, schedule new job according to A. Add the speed of this job to the speed already allocated to other jobs.
Procrastinator Example • [Figure: speed versus time, with the level Scrit marked.]