SoC Low-Power Design Techniques. Jun-Dong Cho, SungKyunKwan University VADA Lab. · Content. Introduction, SOC Design Trends, System Level Low Power Design, Architecture Level Low Power Design, Conclusion. · SOC Design Trends. Expected to integrate more and more complex
maximizing battery life and minimizing size
Digital still camera
“Top-down” - based on design resource constraints
(Reducing power requirements of high throughput portable applications)
(Reducing packaging costs and achieving memory savings)
(Excessive heat prevents the realization of high density chips and limits their functionalities)
(reduced power decreases packaging cost)
(each period has a 0→1 or a 1→0 transition)
where n_g(T) is the number of transitions of g(t) in the time interval between -T/2 and T/2.
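The formula that this clause qualifies did not survive on the slide text. Assuming it is the usual transition-density definition (an assumption; the symbol D(g) is not in the original), it can be written as the average number of transitions per unit time:

```latex
D(g) = \lim_{T \to \infty} \frac{n_g(T)}{T}
```

For a clock-like signal with one rising and one falling edge per period T_c, this gives D = 2/T_c.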
Factors Influencing Ceff: Circuit Function
The switching activity at the output of a gate is determined by the probability of 1s (or 0s) in the logic
description of the gate.
Factors Influencing Ceff: Circuit Function (Static CMOS)
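As a rough illustration of how the circuit function sets the switching activity of a static CMOS gate, the sketch below assumes independent, equally probable inputs (an assumption not stated on the slide). Under that assumption the probability of a 0→1 output transition is P(out = 0) · P(out = 1), read off the truth table. The helper name `transition_probability` is hypothetical.

```python
from itertools import product

def transition_probability(gate, n_inputs):
    """P(0->1) at the gate output, assuming independent, equiprobable inputs."""
    rows = list(product([0, 1], repeat=n_inputs))
    ones = sum(1 for r in rows if gate(*r))   # truth-table rows with output 1
    p1 = ones / len(rows)                     # probability that the output is 1
    return (1 - p1) * p1                      # output was 0, then becomes 1

def nor2(a, b):
    return int(not (a or b))

def xor2(a, b):
    return a ^ b

print("NOR2:", transition_probability(nor2, 2))
print("XOR2:", transition_probability(xor2, 2))
```

A gate whose output is 1 for half of the input combinations (XOR) switches more often than one whose output is strongly biased (NOR).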
1. Modify the architecture of the system so as to make it faster.
2. Reduce VDD so as to restore the original speed. Power consumption has decreased.
The additional circuitry required to compensate the speed degradation may dominate, and the power consumption may increase.
Parallelism and pipelining do not always pay off.
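The trade-off above can be sketched with the usual switching-power model P = Ceff · VDD² · f. All numbers below are illustrative assumptions, not values from these slides: a two-way parallel datapath is assumed to run each unit at half the clock rate, which allows a lower supply voltage, at the cost of extra capacitance.

```python
def dynamic_power(c_eff, vdd, freq):
    """Switching power of CMOS logic: P = Ceff * VDD^2 * f."""
    return c_eff * vdd ** 2 * freq

# Reference datapath (normalized, hypothetical numbers).
p_ref = dynamic_power(c_eff=1.0, vdd=5.0, freq=1.0)

# Two-way parallel version: 2.15x capacitance (duplicated units plus
# multiplexing overhead), half the clock rate, supply lowered to 2.9 V.
p_par = dynamic_power(c_eff=2.15, vdd=2.9, freq=0.5)

# Same idea, but with a larger overhead and less room to lower the
# supply: the extra circuitry dominates and power goes up.
p_bad = dynamic_power(c_eff=3.0, vdd=4.2, freq=0.5)

print(f"parallel / reference   = {p_par / p_ref:.2f}")
print(f"overhead-dominated case = {p_bad / p_ref:.2f}")
```

The second case is the situation the caveat describes: throughput is preserved, but the overhead outweighs the quadratic gain from the lower voltage.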
Loop overhead is cut in half because two iterations are performed in each iteration.
If array elements are assigned to registers, register locality is improved because A(i) and A(i +1) are used twice in the loop body.
Instruction parallelism is increased because the second assignment can be performed while the results of the first are being stored and the loop variables are being updated.
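The three effects listed above can be seen in a minimal sketch. The loop body used here, A(i) = A(i) + A(i-1) * A(i+1), is an assumed example (the slide text does not show its loop), and Python is used only to show the structure; the benefits themselves apply to compiled code or hardware.

```python
def rolled(a):
    """Original loop: one assignment and one loop test per iteration."""
    a = list(a)
    for i in range(1, len(a) - 1):
        a[i] = a[i] + a[i - 1] * a[i + 1]
    return a

def unrolled_by_two(a):
    """Unrolled loop: two assignments per loop test."""
    a = list(a)
    n = len(a)
    i = 1
    while i + 1 < n - 1:
        # a[i] and a[i + 1] each appear in both statements.
        a[i] = a[i] + a[i - 1] * a[i + 1]
        a[i + 1] = a[i + 1] + a[i] * a[i + 2]
        i += 2
    if i < n - 1:  # leftover iteration when the trip count is odd
        a[i] = a[i] + a[i - 1] * a[i + 1]
    return a

data = [1, 2, 3, 4, 5, 6, 7]
print(rolled(data) == unrolled_by_two(data))
```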
Two output samples are computed in parallel based on two input samples.
Neither the capacitance switched nor the voltage is altered. However, loop unrolling enables several other transformations (distributivity, constant propagation, and pipelining). After distributivity and constant propagation,
The transformation yields a critical path of 3, so the supply voltage can be dropped.
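A minimal sketch of the transformation, assuming the first-order recursion y(n) = x(n) + A·y(n-1) (the slide's own equations did not survive, so this filter is an assumption). Unrolling once and applying distributivity and constant propagation gives y(n) = x(n) + A·x(n-1) + A²·y(n-2), so both outputs of an iteration depend only on y(n-2).

```python
def iir_direct(x, a, y_init=0.0):
    """One output per iteration: y(n) = x(n) + A*y(n-1)."""
    y, prev = [], y_init
    for sample in x:
        prev = sample + a * prev
        y.append(prev)
    return y

def iir_unrolled(x, a, y_init=0.0):
    """Two outputs per iteration; len(x) is assumed to be even."""
    a2 = a * a                    # constant propagation: A*A computed once
    y, prev2 = [], y_init         # prev2 plays the role of y(n-2)
    for n in range(0, len(x), 2):
        y_first = x[n] + a * prev2                       # y(n-1)
        y_second = x[n + 1] + a * x[n] + a2 * prev2      # y(n), after distributivity
        y.extend([y_first, y_second])
        prev2 = y_second
    return y

samples = [1.0, 2.0, 3.0, 4.0]
print(iir_direct(samples, 0.5))
print(iir_unrolled(samples, 0.5))
```

Because `y_second` no longer waits for `y_first`, two samples are produced within one shorter dependency chain, which is the slack that can be traded for a lower voltage.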
Encoding example (left: unencoded 4-bit data; right: encoded 5-bit bus value)
Data       Encoded
0 0 0 0    0 0 0 0 0
1 0 1 0    1 0 1 0 0
0 1 0 0    1 0 1 1 1
1 1 1 1    1 1 1 1 0
1 0 1 0    1 0 1 0 0
0 1 0 0    1 0 1 1 1
1 1 0 1    0 0 1 0 1
0 0 1 1    0 0 1 1 0
R. J. Fletcher, “Integrated circuit having outputs configured for reduced state changes,” May 1987, U.S. Patent 4667337.
M. R. Stan and W. P. Burleson, “Bus-invert coding for low-power I/O,” IEEE Tr. on VLSI Systems, Mar. 1995, pp. 49-58.
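A minimal sketch of bus-invert coding as commonly described: if sending the next word unchanged would toggle more than half of the bus lines (counting the invert line), send its complement and set the invert line instead. The function names are hypothetical, and the input sequence is the 4-bit data from the example table.

```python
def bus_invert_encode(words, width):
    """Return (bus_value, invert_bit) pairs for a sequence of data words."""
    mask = (1 << width) - 1
    bus, inv = 0, 0                       # bus lines assumed to start at all zeros
    encoded = []
    for w in words:
        # Toggles needed to send w as-is: changed data lines, plus the
        # invert line if it currently is 1 and would have to return to 0.
        dist = bin((bus ^ w) & mask).count("1") + inv
        if dist > width / 2:
            bus, inv = (~w) & mask, 1     # send the complement, raise invert
        else:
            bus, inv = w, 0
        encoded.append((bus, inv))
    return encoded

def count_transitions(values):
    """Total bit toggles between consecutive values, starting from 0."""
    prev, total = 0, 0
    for v in values:
        total += bin(prev ^ v).count("1")
        prev = v
    return total

data = [0b0000, 0b1010, 0b0100, 0b1111, 0b1010, 0b0100, 0b1101, 0b0011]
coded = bus_invert_encode(data, width=4)
for word, (bus, inv) in zip(data, coded):
    print(f"{word:04b} -> {bus:04b} {inv}")

plain_toggles = count_transitions(data)
coded_toggles = count_transitions([(bus << 1) | inv for bus, inv in coded])
print("toggles without coding:", plain_toggles)
print("toggles with coding:   ", coded_toggles)
```

The extra invert line costs one wire, and the receiver recovers the data by complementing the bus value whenever the invert bit is 1.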
VADA Lab's Low-Power IPs
Low-Power Equalizer for xDSL
21% power reduction, SNR = 40 dB
Next-generation low-power security processor chip design for smart cards
ECC, Rijndael, DES, SHA
Maximizing Memory Data Reuse for Lower Power Motion Estimation
33% power reduction, 52 MHz, 2.1x area increase
OFDM-based high-speed wireless LAN platform
20.7 MHz, 237,000 gates
Low-power co-design of a Double Dwell Searcher for IS-95-based CDMA
67% power reduction, 41% area reduction
Fast and Low Power Viterbi Search
Engine using Inverse Hidden Markov Model
68% power reduction, 71% speed improvement,
Samsung Humantech Excellent Paper Award, '02
High-Flexible Design of OFDM
Transceiver for DVB-T (in development)
video processing: edge detection
voice-processing (data transmission like xDSL)
Telephony: 50% (70%/30%) idle,
since the two parties do not talk at the same time.
With every clock cycle, data are loaded into the working register banks, even if the data have not changed.
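The wasted activity can be sketched by counting register loads with and without a load-enable condition. This is a behavioral illustration only (the function name and sample values are hypothetical), not a model of any particular clock-gating cell.

```python
def count_register_loads(samples, gated):
    """Count how often the register is clocked for a stream of input values.

    gated=False: the register is loaded on every cycle.
    gated=True:  the register is loaded only when the input differs
                 from the stored value.
    """
    stored = None
    loads = 0
    for value in samples:
        if not gated or value != stored:
            stored = value
            loads += 1
    return loads

stream = [3, 3, 3, 5, 5, 7]
print("ungated loads:", count_register_loads(stream, gated=False))
print("gated loads:  ", count_register_loads(stream, gated=True))
```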
Wireless Interface Power-Saving. Ronny Krashinsky and Hari Balakrishnan, MIT Laboratory for Computer Science
Measurements of Enterasys Networks RoamAbout 802.11 NIC
If PSM-static is too coarse-grained, it harms performance by delaying network data
If PSM-static is too fine-grained, it wastes energy by waking unnecessarily
Solution: dynamically adapt to network activity to maintain performance while minimizing energy
Compromise between performance and energy
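The trade-off can be made concrete with a toy model. Everything below is an assumption for illustration: a fixed sleep interval is compared with a schedule whose interval doubles while the link stays idle. The doubling schedule is a generic backoff sketch, not the authors' protocol, and the function names are hypothetical.

```python
def static_psm(idle_time_ms, sleep_interval_ms):
    """Fixed sleep interval: (number of wake-ups, worst-case added delay in ms)."""
    wakeups = idle_time_ms // sleep_interval_ms
    worst_delay = sleep_interval_ms       # data may arrive right after dozing off
    return wakeups, worst_delay

def adaptive_wake_times(idle_time_ms, first_ms, max_ms):
    """Wake-up instants when the sleep interval doubles during idle time."""
    t, interval, times = 0, first_ms, []
    while t + interval <= idle_time_ms:
        t += interval
        times.append(t)
        interval = min(interval * 2, max_ms)
    return times

print("coarse static :", static_psm(1000, 100))   # few wake-ups, long delay
print("fine static   :", static_psm(1000, 10))    # short delay, many wake-ups
print("adaptive      :", adaptive_wake_times(1000, 10, 320))
```

In this toy model the adaptive schedule reacts quickly right after activity, when a reply is most likely, and wakes rarely once the link has been idle for a while.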
Motion Vector Distributions
From P. Pirsch et al., "VLSI Architectures for Video Compression," Proc. of the IEEE, 1995
                      1st Iter   2nd Iter   3rd Iter
Worst-case error      -25%       -6%        -1.6%
Prob. of Error<1%     10%        70%        99.8%
With an 8 by 8 multiplier, the exact result is obtained after at most seven iteration steps (worst case).
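The slide does not name the algorithm. The error figures in the table are consistent with an iterative leading-one approximation, which is assumed in the sketch below: write each operand as 2^k + r, add every term of the product except r1·r2, then repeat on the residues (r1, r2). The function name and the operand handling are hypothetical.

```python
def approx_multiply(a, b, iterations):
    """Iterative approximation of a*b for non-negative integers.

    Each step adds 2^(k1+k2) + r1*2^k2 + r2*2^k1 for a = 2^k1 + r1 and
    b = 2^k2 + r2, using shifts only, and defers the r1*r2 term to the
    next step.
    """
    total = 0
    for _ in range(iterations):
        if a == 0 or b == 0:              # nothing left to correct: result is exact
            break
        k1 = a.bit_length() - 1           # position of the leading one of a
        k2 = b.bit_length() - 1
        r1 = a - (1 << k1)                # residue below the leading one
        r2 = b - (1 << k2)
        total += (1 << (k1 + k2)) + (r1 << k2) + (r2 << k1)
        a, b = r1, r2                     # remaining error term is r1 * r2
    return total

exact = 255 * 255
for steps in (1, 2, 3):
    approx = approx_multiply(255, 255, steps)
    print(steps, f"{(approx - exact) / exact:+.2%}")
```

The approximation never overshoots, and stopping early trades accuracy for fewer shift-and-add steps, which is where the power saving comes from.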
RTL-level low-power design and implementation of the Searcher Engine in an MSM
(Mobile Station Modem) chip for CDMA handsets. Operating frequency: 12.5 MHz
Low-power design using data-flow-graph-based rescheduling, pre-computation, strength reduction, and a Synchronous Accumulator, reducing area and power by up to 67.68% and 41.35%, respectively. San Kim and Jun-Dong Cho, "Low Power CDMA Searcher," CAD and VLSI Workshop, May 1999.
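Of the techniques listed, strength reduction is the easiest to show in a few lines: an expensive operation is replaced by cheaper ones that give the same result. The constant 10 below is a hypothetical example, not taken from the searcher design.

```python
def times_ten_multiplier(x):
    """Reference: uses a general multiplier."""
    return x * 10

def times_ten_shift_add(x):
    """Strength-reduced: 10*x = 8*x + 2*x, two shifts and one adder."""
    return (x << 3) + (x << 1)

print(all(times_ten_multiplier(v) == times_ten_shift_add(v) for v in range(256)))
```

In hardware, constant shifts are only wiring, so the multiplier is replaced by a single adder.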
Figure 1. Detailed block diagram
The three input ALU consumes much less power than an ALU and an ASU
A drawback of using a 3I-ALU is the added complexity in calculating the carry and overflow.
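One way to see where the extra complexity comes from: the sum of three n-bit operands can exceed the n-bit range by up to two bits, so a single carry flag is no longer enough. The sketch below is a behavioral illustration with a hypothetical function name, not a description of the 3I-ALU's actual logic.

```python
def add3(a, b, c, width):
    """Add three unsigned operands; return (result, carry_out).

    carry_out can be 0, 1 or 2, so it needs two bits, whereas a
    two-input adder needs only one.
    """
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + (c & mask)
    return total & mask, total >> width

print(add3(255, 255, 255, width=8))
print(add3(100, 100, 100, width=8))
```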
Select max value
Hull of AMC