Linda Brackenbury APT GROUP, Computer Science University of Manchester l.brackenbury@manchester.ac.uk Asynchronous Signal Processing Systems
Agenda • Why asynchronous? • Applications suited to asynchronous • Design examples • DSP design • Viterbi decoder • Future work • What have we learnt?
System Timing • Synchronous • uses global clock • any state changes occur on clock edge • system states predictable so good tools • Asynchronous • uses events to control timing • timing is more unpredictable • tool support not as good
Why Asynchronous? • No clock generation or distribution • timing uses local handshake signals • Power only consumed when doing useful work • No overhead between idle and active • Low EMI – switching is spread
Applications • Async - no help to some applications • power/performance at full activity similar for synchronous and asynchronous! • Async good for portable systems • battery size and lifetime is important • workload is highly variable • lots of idle time • low EMI requirement
Low Power DSP • GSM chipsets are typically based on microprocessor + DSP • DSP performs intensive calculations • Challenge is to meet required throughput without excessive power • throughput met with parallelism • area traded for increased speed
Asynchronous Contribution • Design Philosophy • optimize design for typical operation • support design for rarer conditions • usually at expense of increased operation time • Simpler logic within processing units • energy and area reduction
DSP Design Examples 1 • Data dependent adder on critical path • detect completion of carry path • average carry path only half word length • Address wrap around in circular buffer • synchronous calculates new and possible corrected value in parallel - two adders • asynchronous calculates new value only - one adder • if correction is required (rare) it is done afterwards
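The two ideas on this slide can be sketched in software. This is a minimal illustration, not the DSP's actual logic: `longest_carry_chain` and `next_address` are hypothetical names, the 16-bit default width is an assumption, and the wrap-around helper assumes a positive step smaller than the buffer length.

```python
def longest_carry_chain(a: int, b: int, width: int = 16) -> int:
    """Longest run of bit positions that a single carry travels through.

    A completion-detecting adder can finish as soon as the longest
    carry chain has settled, instead of always waiting for the worst case.
    """
    carry = 0
    run = 0
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:
            # Both bits set: a new carry is generated here.
            run = 1
            carry = 1
        elif (x ^ y) and carry:
            # Exactly one bit set: an incoming carry propagates onward.
            run += 1
        else:
            # Carry is absorbed, or there was none to propagate.
            run = 0
            carry = 0
        longest = max(longest, run)
    return longest


def next_address(addr: int, step: int, base: int, length: int) -> int:
    """Circular-buffer address update with one addition in the common case."""
    new = addr + step            # typical case: no wrap, one add
    if new >= base + length:     # rare case: correct afterwards
        new -= length
    return new
```

For example, `longest_carry_chain(1, 7, 4)` finds a chain of three positions, while `next_address(7, 1, 0, 8)` takes the rare correction path and wraps to 0.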
DSP Design Examples 2 • Register File has eight single-read single-write ported 32-word banks • efficient parallel access to sequential registers from 4 Functional Units (typical) • Request conflicts to same bank rare • broadcast mechanism available • genuine conflicts take 1 read cycle per request rather than 1 clock cycle each
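A rough model of the bank-conflict behaviour described above. The bank mapping (register number modulo 8) is an assumption chosen so that sequential registers fall in different banks; the slide does not state the mapping. `read_cycles` is a hypothetical helper.

```python
from collections import defaultdict

NUM_BANKS = 8  # eight banks of 32 words, as on the slide


def read_cycles(requests: list[int]) -> int:
    """Read cycles needed to serve one batch of register read requests.

    Requests for the same register are served together (broadcast).
    Requests for different registers in the same bank are genuine
    conflicts and are served one read cycle at a time.
    """
    per_bank: dict[int, set[int]] = defaultdict(set)
    for reg in requests:
        per_bank[reg % NUM_BANKS].add(reg)   # assumed mapping
    return max((len(regs) for regs in per_bank.values()), default=0)
```

Under this model four sequential registers such as `[0, 1, 2, 3]` need one read cycle, four requests for the same register also need one (broadcast), and `[0, 8, 1, 2]` needs two because registers 0 and 8 share a bank.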
Viterbi Decoder • Two data streams are transmitted; each depends on current and previous data • State transitions of the encoder over time can be drawn as a trellis • Decoder reconstructs the trellis
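To make the encoder side concrete, here is a small sketch of a rate-1/2 convolutional encoder, where each input bit produces two output bits that depend on the current and previous inputs. The constraint length of 3 and the generator polynomials (7, 5 in octal) are a common textbook choice and are assumptions, not the code used in the talk.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Encode a bit sequence; returns one (out0, out1) pair per input bit."""
    state_mask = (1 << (constraint_length - 1)) - 1
    state = 0                      # previous inputs, newest in the low bit
    out = []
    for bit in bits:
        reg = (state << 1) | bit   # current input plus encoder history
        # Each output stream is the parity of the taps one generator selects.
        out.append(tuple(bin(reg & g).count("1") & 1 for g in generators))
        state = reg & state_mask   # next trellis state: 2*state + bit, truncated
    return out
```

With these assumed generators, `conv_encode([1, 0, 0])` gives `[(1, 1), (1, 0), (1, 1)]`. The sequence of `state` values over time is what the trellis depicts.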
Asynchronous Decoder • Clock only used to input and output data • all internal operation is asynchronous • FIFOs buffer data to meet clock demand
Branch Metric Unit • Calculates gap between input symbol and four ideal symbols
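A minimal sketch of a branch metric calculation. The slide does not say how the gap is measured, so this assumes 3-bit soft-decision inputs (levels 0 to 7) and a sum of absolute differences; `branch_metrics` and `IDEAL_SYMBOLS` are hypothetical names.

```python
# The four ideal two-bit symbols a rate-1/2 encoder can emit.
IDEAL_SYMBOLS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def branch_metrics(received, max_level=7):
    """Distance from a received soft symbol pair to each ideal symbol.

    `received` is a pair of soft values in 0..max_level (assumed format).
    A smaller metric means the received pair is closer to that ideal symbol.
    """
    r0, r1 = received
    return [
        abs(r0 - s0 * max_level) + abs(r1 - s1 * max_level)
        for s0, s1 in IDEAL_SYMBOLS
    ]
```

For a clean reception of symbol (0, 1), `branch_metrics((0, 7))` returns `[7, 0, 14, 7]`: zero distance to the matching ideal symbol and the largest distance to its complement.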
Path Metric Unit • add-compare-select operation • [Figure: trellis butterfly. Previous node metrics at nodes j and j+32 feed the next node metrics at nodes 2j and 2j+1, with branches labelled BMa and BMb]
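The add-compare-select step for one butterfly can be sketched as follows. The node numbering (j and j+32 feeding 2j and 2j+1) comes from the slide's figure; the assignment of BMa and BMb to individual branches and the rule that the smaller metric wins are assumptions, and `acs_butterfly` is a hypothetical name.

```python
def acs_butterfly(prev, j, bm_a, bm_b, half=32):
    """Add-compare-select for next nodes 2j and 2j+1.

    prev  : list of previous node metrics (2 * half entries)
    bm_a, bm_b : the two branch metrics used by this butterfly
    Returns ((metric, from_upper), (metric, from_upper)) for nodes
    2j and 2j+1, where from_upper is True if the survivor came from
    node j + half rather than node j.
    """
    low, high = prev[j], prev[j + half]

    # Add: extend both candidate paths into each next node.
    even_candidates = (low + bm_a, high + bm_b)   # into node 2j
    odd_candidates = (low + bm_b, high + bm_a)    # into node 2j + 1

    # Compare and select: keep the smaller metric and note the winner.
    even = (min(even_candidates), even_candidates[1] < even_candidates[0])
    odd = (min(odd_candidates), odd_candidates[1] < odd_candidates[0])
    return even, odd
```

The returned decision flags are the kind of per-node winner information a history unit would need to store for later traceback.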
Node Arithmetic • Serial arithmetic • Counts events • Unary numbers • Change of state equals count • one=1111 two=0001 three=1101 etc. When smaller count empties merge stops
History Unit • Records PMU node winners and global winner over many timeslots • No error – global winner is child of last winner • Error – need to reconstruct good path • compute parent of global winner and repeat ONLY until it agrees with good path • can have many backtraces in parallel • backtraces decoupled from placing data into HU
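The early-terminating backtrace can be illustrated with a sequential sketch. This is only a software analogy of the idea, not the asynchronous hardware: the data layout (`path`, `decisions`), the function names, and the parent rule (node n comes from n >> 1 or (n >> 1) + half, matching the j, j+32 to 2j, 2j+1 numbering) are assumptions.

```python
def parent(node, from_upper, half=32):
    """Previous-timeslot node that a survivor path into `node` came from."""
    return (node >> 1) + (half if from_upper else 0)


def backtrace(path, decisions, t, winner, half=32):
    """Make `path` end at `winner` in timeslot t; return slots rewritten.

    path[k]         : node on the current good path at timeslot k
    decisions[k][n] : True if node n at timeslot k came from the upper half
    The walk stops as soon as the recomputed node agrees with the stored
    path, so the error-free case costs a single step.
    """
    node = winner
    steps = 0
    while path[t] != node:
        path[t] = node
        steps += 1
        if t == 0:
            break
        node = parent(node, decisions[t][node], half)
        t -= 1
    return steps
```

In a four-state example (`half=2`) with stored path `[0, 1, 2, None]`, a new winner whose parent is node 2 costs one step, whereas a winner on a different branch forces earlier timeslots to be rewritten until the paths merge.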
Low Power Contribution • History Unit • much smaller • highly concurrent independent operation • computation performed is minimised • Path Metric Unit – consumes most of the power • smaller, simple, fast +/- units replace add-compare-select • idea is simple but needs a lot of control complexity, so it dissipated a lot of power!
What Have We Learnt? • Asynchronous is advantageous to some applications • Asynchronous design looks very different from synchronous design • Very poor results if you just translate from a synchronous design • Asynchronous design is harder • timing and control are more complex to design