
Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders

ECE 645: Lecture 5. Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders. Required Reading. Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design. Chapter 6.4, Carry Determination as Prefix Computation

apaiva

Presentation Transcript


  1. ECE 645: Lecture 5 Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders

  2. Required Reading
     Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
     - Chapter 6.4, Carry Determination as Prefix Computation
     - Chapter 6.5, Alternative Parallel Prefix Networks
     - Chapter 7.4, Conditional-Sum Adder
     - Chapter 7.5, Hybrid Adder Designs
     - Chapter 7.1, Simple Carry-Skip Adders
     Note errata at: http://www.ece.ucsb.edu/~parhami/text_comp_arit_1ed.htm#errors

  3. Recommended Reading
     J-P. Deschamps, G. Bioul, G. Sutter, Synthesis of Arithmetic Circuits: FPGA, ASIC and Embedded Systems
     - Chapter 11.1.9, Prefix Adders
     - Chapter 11.1.3, Carry-Skip Adder
     - Chapter 11.1.4, Optimization of Carry-Skip Adders
     - Chapter 11.1.10, FPGA Implementations of Adders
     - Chapter 11.1.11, Long-Operand Adders

  4. Parallel Prefix Network Adders

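  5. Parallel Prefix Network Adders. Basic component: carry operator (1). Block B' with signals $(g', p')$ and block B'' with signals $(g'', p'')$ merge into block B with signals $(g, p)$, where $g = g'' + g'p''$ and $p = p'p''$. In operator form: $(g, p) = (g', p') \,¢\, (g'', p'') = (g'' + g'p'',\; p'p'')$

  6. Parallel Prefix Network Adders. Basic component: carry operator (2). The blocks B' and B'' may overlap (overlap okay!). As before, $g = g'' + g'p''$ and $p = p'p''$, that is, $(g, p) = (g', p') \,¢\, (g'', p'') = (g'' + g'p'',\; p'p'')$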
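The carry operator can be written as a two-line function. The sketch below is illustrative only: the name `carry_op` and the tuple layout `(g, p)` are assumptions, with the first argument standing for the less significant block B' and the second for B''.

```python
def carry_op(lo, hi):
    """Carry operator: combine (g', p') of block B' with (g'', p'') of block B''."""
    g1, p1 = lo   # less significant block B'
    g2, p2 = hi   # more significant block B''
    # g = g'' + g'p'',  p = p'p''
    return (g2 | (g1 & p2), p1 & p2)


# Bit 0 has x=1, y=1 (generates: g=1, p=0); bit 1 has x=1, y=0 (propagates: g=0, p=1).
block = carry_op((1, 0), (0, 1))
print(block)  # (1, 0): the 2-bit block generates a carry and does not propagate
```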

  7. Properties of the carry operator ¢. Associative: $[(g_1, p_1) \,¢\, (g_2, p_2)] \,¢\, (g_3, p_3) = (g_1, p_1) \,¢\, [(g_2, p_2) \,¢\, (g_3, p_3)]$. Not commutative: in general, $(g_1, p_1) \,¢\, (g_2, p_2) \neq (g_2, p_2) \,¢\, (g_1, p_1)$
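Because each signal is a single bit, both properties can be checked exhaustively. This is a small sketch, not part of the lecture; `carry_op`, `associative`, and `counterexamples` are names chosen here for illustration.

```python
from itertools import product


def carry_op(lo, hi):
    """(g, p) = (g', p') ¢ (g'', p'') = (g'' + g'p'', p'p'')."""
    g1, p1 = lo
    g2, p2 = hi
    return (g2 | (g1 & p2), p1 & p2)


pairs = list(product((0, 1), repeat=2))  # every possible (g, p) pair

# Associativity must hold for every triple of operands.
associative = all(
    carry_op(carry_op(a, b), c) == carry_op(a, carry_op(b, c))
    for a, b, c in product(pairs, repeat=3)
)

# Commutativity fails for at least one pair of operands.
counterexamples = [
    (a, b) for a, b in product(pairs, repeat=2) if carry_op(a, b) != carry_op(b, a)
]

print(associative)            # True
print(len(counterexamples) > 0)  # True
```

Associativity is what allows a prefix network to regroup the operators freely; the lack of commutativity is why the operand order (less significant block first) must be preserved.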

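  8. Parallel Prefix Network Adders: major concept. Given: $(g_0, p_0), (g_1, p_1), (g_2, p_2), \ldots, (g_{k-1}, p_{k-1})$. Find: $(g_{[0,0]}, p_{[0,0]}), (g_{[0,1]}, p_{[0,1]}), (g_{[0,2]}, p_{[0,2]}), \ldots, (g_{[0,k-1]}, p_{[0,k-1]})$, the block generate and propagate signals from index 0 up to each index through $k-1$. The carries follow from $c_i = g_{[0,i-1]} + c_0\, p_{[0,i-1]}$

  9. Similar to Parallel Prefix Sum Problem. Parallel prefix sum problem: given $x_0, x_1, x_2, \ldots, x_{k-1}$, find $x_0,\; x_0 + x_1,\; x_0 + x_1 + x_2,\; \ldots,\; x_0 + x_1 + x_2 + \cdots + x_{k-1}$. Parallel prefix adder problem: given $x_0, x_1, x_2, \ldots, x_{k-1}$, find $x_0,\; x_0 \,¢\, x_1,\; x_0 \,¢\, x_1 \,¢\, x_2,\; \ldots,\; x_0 \,¢\, x_1 \,¢\, x_2 \,¢\, \cdots \,¢\, x_{k-1}$, where $x_i = (g_i, p_i)$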
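To make the carry formula concrete, the following sketch computes the block signals serially (no parallel network yet) and derives every carry from them. It assumes $g_i = x_i y_i$ and $p_i = x_i \oplus y_i$; the function name `add_with_prefix` is hypothetical.

```python
def carry_op(lo, hi):
    """(g, p) = (g', p') ¢ (g'', p'') = (g'' + g'p'', p'p'')."""
    g1, p1 = lo
    g2, p2 = hi
    return (g2 | (g1 & p2), p1 & p2)


def add_with_prefix(x, y, k, c0=0):
    """Add two k-bit integers using the block signals (g[0,i], p[0,i])."""
    g = [(x >> i) & (y >> i) & 1 for i in range(k)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(k)]
    # Serial prefix computation: prefix[i] = (g[0,i], p[0,i]).
    prefix = [(g[0], p[0])]
    for i in range(1, k):
        prefix.append(carry_op(prefix[-1], (g[i], p[i])))
    # c_i = g[0,i-1] + c0 * p[0,i-1], with c[0] = c0.
    c = [c0] + [G | (c0 & P) for G, P in prefix]
    s = sum((p[i] ^ c[i]) << i for i in range(k))
    return s, c[k]


print(add_with_prefix(11, 6, 4))  # (1, 1): 11 + 6 = 17 = 16 + 1
```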
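The analogy can be seen by running the same scan routine with two different operators. The sketch below uses the standard-library `itertools.accumulate`, which evaluates the prefixes serially; it illustrates the problem statement only, not a parallel network. The sample inputs are made up for illustration.

```python
from itertools import accumulate
from operator import add


def carry_op(lo, hi):
    """(g, p) = (g', p') ¢ (g'', p'') = (g'' + g'p'', p'p'')."""
    g1, p1 = lo
    g2, p2 = hi
    return (g2 | (g1 & p2), p1 & p2)


# Prefix sums: x0, x0+x1, x0+x1+x2, ...
sums = list(accumulate([3, 1, 4, 1, 5], add))
print(sums)  # [3, 4, 8, 9, 14]

# Prefix carry signals: x0, x0 ¢ x1, x0 ¢ x1 ¢ x2, ... with x_i = (g_i, p_i)
signals = [(0, 1), (1, 0), (0, 1), (0, 0)]
blocks = list(accumulate(signals, carry_op))
print(blocks)  # [(0, 1), (1, 0), (1, 0), (0, 0)]
```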

  10. Parallel Prefix Sums Network I

  11. Parallel Prefix Sums Network I: Cost (Area) Analysis. With $C(2) = 1$: $C(k) = 2C(k/2) + k/2 = 2[2C(k/4) + k/4] + k/2 = 4C(k/4) + k/2 + k/2 = \cdots = 2^{\log_2 k - 1} C(2) + \frac{k}{2}(\log_2 k - 1) = \frac{k}{2}\log_2 k$. Example: $C(16) = 2C(8) + 8 = 2[2C(4) + 4] + 8 = 4C(4) + 16 = 4[2C(2) + 2] + 16 = 8C(2) + 24 = 8 + 24 = 32 = (16/2)\log_2 16$

  12. Parallel Prefix Sums Network I: Delay Analysis. With $D(2) = 1$: $D(k) = D(k/2) + 1 = [D(k/4) + 1] + 1 = D(k/4) + 1 + 1 = \cdots = \log_2 k$. Example: $D(16) = D(8) + 1 = [D(4) + 1] + 1 = D(4) + 2 = [D(2) + 1] + 2 = 4 = \log_2 16$
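The two recurrences correspond to a divide-and-conquer construction: solve each half recursively, then spend one extra level folding the lower half's total into every output of the upper half. The sketch below assumes that structure (the network figure itself is not in this transcript) and counts operators and levels while it runs; `network1` is an illustrative name.

```python
from operator import add


def network1(xs, op):
    """Return (prefixes, cost, delay) for a power-of-two number of inputs."""
    k = len(xs)
    if k == 1:
        return list(xs), 0, 0
    lo, cost_lo, delay_lo = network1(xs[:k // 2], op)
    hi, cost_hi, delay_hi = network1(xs[k // 2:], op)
    # One extra level of k/2 operators: combine the lower half's last
    # prefix with every prefix of the upper half.
    hi = [op(lo[-1], h) for h in hi]
    return lo + hi, cost_lo + cost_hi + k // 2, max(delay_lo, delay_hi) + 1


prefixes, cost, delay = network1(list(range(1, 17)), add)
print(cost, delay)  # 32 4
```

The counts match $C(16) = 32$ and $D(16) = 4$ from the analysis above.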

  13. Parallel Prefix Sums Network II (Brent-Kung)

  14. Parallel Prefix Sums Network II: Cost (Area) Analysis. With $C(2) = 1$: $C(k) = C(k/2) + k - 1 = [C(k/4) + k/2 - 1] + k - 1 = C(k/4) + 3k/2 - 2 = \cdots = C(2) + \left(2k - 2k/2^{\log_2 k - 1}\right) - (\log_2 k - 1) = 2k - 2 - \log_2 k$. Example: $C(16) = C(8) + 16 - 1 = [C(4) + 8 - 1] + 16 - 1 = C(2) + 4 - 1 + 24 - 2 = 1 + 28 - 3 = 26 = 2 \cdot 16 - 2 - \log_2 16$

  15. Parallel Prefix Sums Network II: Delay Analysis. With $D(2) = 1$: $D(k) = D(k/2) + 2 = [D(k/4) + 2] + 2 = D(k/4) + 2 + 2 = \cdots = 2\log_2 k - 1$. Example: $D(16) = D(8) + 2 = [D(4) + 2] + 2 = D(4) + 4 = [D(2) + 2] + 4 = 7 = 2\log_2 16 - 1$
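The cost recurrence $C(k) = C(k/2) + k - 1$ reflects three steps: $k/2$ operators combine adjacent pairs, a $k/2$-input network handles the pair results, and $k/2 - 1$ operators fill in the remaining prefixes. A minimal sketch of that recursion follows, assuming a power-of-two input count; `brent_kung` is an illustrative name, and string concatenation is used as a stand-in operator because it is associative but not commutative, like ¢.

```python
def brent_kung(xs, op):
    """Return (prefixes, cost) for a power-of-two number of inputs."""
    k = len(xs)
    if k == 1:
        return list(xs), 0
    # First level: combine adjacent pairs (k/2 operators).
    pairs = [op(xs[2 * i], xs[2 * i + 1]) for i in range(k // 2)]
    # A k/2-input network yields the prefixes ending at odd indices.
    odd, cost = brent_kung(pairs, op)
    # Last level: k/2 - 1 operators produce the prefixes ending at even indices.
    out = [xs[0]]
    for i in range(1, k):
        if i % 2 == 1:
            out.append(odd[i // 2])
        else:
            out.append(op(odd[i // 2 - 1], xs[i]))
    return out, cost + k // 2 + (k // 2 - 1)


letters = [chr(ord("a") + i) for i in range(16)]
out, cost = brent_kung(letters, lambda a, b: a + b)
print(out[:4], cost)  # ['a', 'ab', 'abc', 'abcd'] 26
```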

  16. 8-bit Brent-Kung Parallel Prefix Network

  17. 4-bit Brent-Kung Parallel Prefix Network (figure labels: inputs x7', x5', x3', x1'; inner 2-bit B-K PPN; outputs s7', s5', s3', s1')

  18. 8-bit Brent-Kung Parallel Prefix Network Critical Path

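  19. Critical Path through the blocks GP, ¢, C, S. GP: $g_i = x_i y_i$, $p_i = x_i \oplus y_i$ (1 gate delay). Carry operator: $g = g'' + g'p''$, $p = p'p''$ (2 gate delays). C: $c_{i+1} = g_{[0,i]} + c_0\, p_{[0,i]}$ (2 gate delays). S: $s_i = p_i \oplus c_i$ (1 gate delay)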

  20. Brent-Kung Parallel Prefix Graph for 16 Inputs

  21. Kogge-Stone Parallel Prefix Graph for 16 Inputs
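The graph itself is not reproduced in this transcript. As a rough sketch of the usual Kogge-Stone structure, each level combines every position with the value a fixed distance below it, and the distance doubles from level to level. The function name `kogge_stone` and the cost and level counters are illustrative additions.

```python
def kogge_stone(xs, op):
    """Return (prefixes, cost, levels) for a power-of-two number of inputs."""
    out = list(xs)
    k = len(out)
    cost = levels = 0
    d = 1
    while d < k:
        # Every position i >= d combines with the value d places below it;
        # the comprehension reads only the previous level's values.
        out = [op(out[i - d], out[i]) if i >= d else out[i] for i in range(k)]
        cost += k - d
        levels += 1
        d *= 2
    return out, cost, levels


letters = [chr(ord("a") + i) for i in range(16)]
prefixes, cost, levels = kogge_stone(letters, lambda a, b: a + b)
print(cost, levels)  # 49 4
```

For 16 inputs this gives 15 + 14 + 12 + 8 = 49 operators in 4 levels, which agrees with the closed forms $k\log_2 k - k + 1$ and $\log_2 k$.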

  22. Parallel Prefix Network Adders: comparison of architectures

                  Network 2 (Brent-Kung)    Kogge-Stone           Hybrid
      Delay(k)    2 log2 k - 2              log2 k                log2 k + 1
      Cost(k)     2k - 2 - log2 k           k log2 k - k + 1      (k/2) log2 k
      Delay(16)   6                         4                     5
      Cost(16)    26                        49                    32
      Delay(32)   8                         5                     6
      Cost(32)    57                        129                   80
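The numeric entries of the comparison follow directly from the closed-form expressions. The sketch below evaluates them for k = 16 and k = 32; the dictionary `FORMULAS` and the helpers `log2_int` and `metrics` are names introduced here, and the pairing of formulas with architectures is the one implied by the numbers on the slide.

```python
def log2_int(k):
    """Exact log2 of a power of two."""
    return k.bit_length() - 1


# name -> (delay formula, cost formula), both as functions of k
FORMULAS = {
    "Brent-Kung": (lambda k: 2 * log2_int(k) - 2, lambda k: 2 * k - 2 - log2_int(k)),
    "Kogge-Stone": (lambda k: log2_int(k), lambda k: k * log2_int(k) - k + 1),
    "Hybrid": (lambda k: log2_int(k) + 1, lambda k: (k // 2) * log2_int(k)),
}


def metrics(name, k):
    """Return (delay, cost) of the named network for k inputs."""
    delay, cost = FORMULAS[name]
    return delay(k), cost(k)


for k in (16, 32):
    for name in FORMULAS:
        print(k, name, metrics(name, k))
```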

  23. Latency vs. Area Tradeoff

  24. Hybrid Brent-Kung/Kogge-Stone Parallel Prefix Graph for 16 Inputs

  25. Conditional-Sum Adders

  26. One-level k-bit Carry-Select Adder

  27. Two-level k-bit Carry Select Adder

  28. Conditional Sum Adder
      • Extension of carry-select adder
      • Carry-select adder
          • One-level using k/2-bit adders
          • Two-level using k/4-bit adders
          • Three-level using k/8-bit adders
          • Etc.
      • Assuming k is a power of two, eventually have an extreme where there are log2 k levels using 1-bit adders
      • This is a conditional sum adder
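The recursion described above can be modelled functionally: every block produces its sum and carry-out for both possible carry-ins, and the carry-out of the lower half selects which result of the upper half to keep. This is a behavioural sketch assuming k is a power of two, not a gate-level design; `cond_sum` is a hypothetical name.

```python
def cond_sum(x, y, k):
    """Return ((s0, c0), (s1, c1)): sum and carry-out for carry-in 0 and carry-in 1."""
    if k == 1:
        # 1-bit conditional sum block.
        return ((x ^ y, x & y), ((x ^ y) ^ 1, x | y))
    h = k // 2
    mask = (1 << h) - 1
    lo = cond_sum(x & mask, y & mask, h)
    hi = cond_sum(x >> h, y >> h, h)
    result = []
    for s_lo, c_lo in lo:
        s_hi, c_hi = hi[c_lo]  # the lower carry-out selects the upper result
        result.append((s_lo | (s_hi << h), c_hi))
    return tuple(result)


(s0, c0), (s1, c1) = cond_sum(9, 7, 4)
print(s0, c0, s1, c1)  # 0 1 1 1: 9 + 7 = 16 and 9 + 7 + 1 = 17
```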

  29. Conditional Sum Adder:Top-Level Block for One Bit Position

  30. Three Levels of a Conditional Sum Adder (figure: inputs xi..xi+3 and yi..yi+3 enter 1-bit conditional sum blocks; after each branch point the results for c=0 and c=1 are concatenated into wider blocks level by level; the block carry-in determines the final selection)

  31. 16-Bit Conditional Sum Adder Example

  32. Conditional Sum Adder Metrics

  33. Hybrid Adders

  34. A Hybrid Ripple-Carry/Carry-Lookahead Adder

  35. A Hybrid Carry-Lookahead/Carry-Select Adder

  36. Carry-Skip Adders Fixed-Block-Size

  37. 7.1 Simple Carry-Skip Adders Fig. 7.1 Converting a 16-bit ripple-carry adder into a simple carry-skip adder with 4-bit skip blocks. Computer Arithmetic, Addition/Subtraction

  38. Another View of Carry-Skip Addition Street/freeway analogy for carry-skip adder.

  39. Mux-Based Skip Carry Logic (Fig. 10.7 of arch book; figure labels: multiplexer inputs 0 and 1, block propagate signal $p_{[4j,\,4j+3]}$, output $c_{4j+4}$). The carry-skip adder with “OR combining” works fine if we begin with a clean slate, where all signals are 0s at the outset; otherwise, it will run into problems, which do not exist in the mux-based version.

  40. Carry-Skip Adder with Fixed Block Size. Block width $b$; $k/b$ blocks form a $k$-bit adder (assume $b$ divides $k$). $T_{\text{fixed-skip-add}} = (b - 1) + 0.5 + (k/b - 2) + (b - 1) = 2b + k/b - 3.5$ stages, where the four terms are the ripple in block 0, the OR gate, the skips, and the ripple in the last block. Setting $dT/db = 2 - k/b^2 = 0$ gives $b^{\text{opt}} = \sqrt{k/2}$ and $T^{\text{opt}} = 2\sqrt{2k} - 3.5$. Example: $k = 32$, $b^{\text{opt}} = 4$, $T^{\text{opt}} = 12.5$ stages (contrast with 32 stages for a ripple-carry adder)

  41. Fixed-Block-Size Carry-Skip Adder (1). Notation & assumptions: adder size: $k$ bits; fixed block size: $b$ bits; number of stages: $t$; delay of skip logic = delay of one stage of ripple-carry adder = 1 delay unit. Latency of the carry-skip adder with fixed block width: $\text{Latency}_{\text{fixed-carry-skip}} = (b - 1) + 0.5 + \left(\frac{k}{b} - 2\right) + (b - 1) = 2b + \frac{k}{b} - 3.5$, where the four terms are the ripple in block 0, the OR gate, the skips, and the ripple in the last block

  42. Fixed-Block-Size Carry-Skip Adder (2). Optimal fixed block size: $\frac{d\,\text{Latency}_{\text{fixed-carry-skip}}}{db} = 2 - \frac{k}{b^2} = 0 \;\Rightarrow\; b^{\text{opt}} = \sqrt{\frac{k}{2}}$, $\; t^{\text{opt}} = \frac{k}{b^{\text{opt}}} = \sqrt{2k}$, and $\text{Latency}^{\text{opt}}_{\text{fixed-carry-skip}} = 2\sqrt{\frac{k}{2}} + \sqrt{2k} - 3.5 = \sqrt{2k} + \sqrt{2k} - 3.5 = 2\sqrt{2k} - 3.5$
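The optimum can be checked numerically against the latency expression. The sketch below evaluates the formula for every block size that divides k and compares the best one with the closed-form result; the function names `latency` and `optimal` are illustrative.

```python
from math import sqrt


def latency(k, b):
    """Stages: ripple in block 0 + OR gate + skips + ripple in the last block."""
    return (b - 1) + 0.5 + (k / b - 2) + (b - 1)


def optimal(k):
    """Closed-form optimum: b_opt = sqrt(k/2), latency = 2*sqrt(2k) - 3.5."""
    return sqrt(k / 2), 2 * sqrt(2 * k) - 3.5


k = 32
divisors = [b for b in range(1, k + 1) if k % b == 0]
best_b = min(divisors, key=lambda b: latency(k, b))
print(best_b, latency(k, best_b))  # 4 12.5
print(optimal(k))                  # (4.0, 12.5)
```

When $\sqrt{k/2}$ is not an integer, the closed form gives only a lower bound on the latency and a nearby integer block size has to be chosen.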

  43. Fixed-Block-Size Carry-Skip Adder (3)

      k     b_opt   t_opt   Latency fixed-carry-skip   Latency ripple-carry   Latency look-ahead
      16    2       8       8.5                        16                     4.5
            3       5       7.5
      32    4       8       12.5                       32                     6.5
      64    5       13      18.5                       64                     6.5
            6       11      18.5
      128   8       16      28.5                       128                    8.5

  44. Carry-Chain & Carry-Skip Adders in Xilinx FPGAs

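  45. Basic Cell of a Carry-Chain Adder in Xilinx FPGAs. Figure: a LUT computes $p_i$ from $x_i$ and $y_i$; $p_i$ drives the select input of a carry multiplexer (inputs 0 and 1) that produces $c_{i+1}$ from $y_i$ and $c_i$; $s_i$ is formed from $p_i$ and $c_i$. Carry truth table:

      x_i   y_i   c_{i+1}
      0     0     y_i
      0     1     c_i
      1     0     c_i
      1     1     y_i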
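A behavioural model of the cell makes the multiplexer's role explicit. The sketch assumes the LUT computes $p_i = x_i \oplus y_i$, that the multiplexer passes $c_i$ when $p_i = 1$ and $y_i$ otherwise, and that the sum bit is $p_i \oplus c_i$; these details are inferred from the slide's labels rather than stated on it. The names `cell` and `ripple_add` are hypothetical.

```python
def cell(x, y, c):
    """One carry-chain cell: returns (sum bit, carry-out)."""
    p = x ^ y              # LUT output (propagate)
    c_out = c if p else y  # carry multiplexer: pass c when p = 1, else y
    s = p ^ c              # sum bit
    return s, c_out


def ripple_add(x, y, k, c0=0):
    """Chain k cells into a k-bit adder."""
    s, c = 0, c0
    for i in range(k):
        bit, c = cell((x >> i) & 1, (y >> i) & 1, c)
        s |= bit << i
    return s, c


print(ripple_add(5, 3, 4))  # (8, 0)
```

When $x_i = y_i$ the carry-out equals that common value regardless of $c_i$, which is why $y_i$ can serve as the second multiplexer input.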

  46. cc(i+1)b xib+(b-1) pib+(b-1) LUT 0 1 yib+(b-1) sib+(b-1) Carry-Skip Adder b-bit block . . . . . . . xib+1 pib+1 LUT 0 1 yib+1 sib+1 cib+1 xib pib LUT 0 1 yib sib cib

  47. ck p[k-b, k-1] p[k-b, k-1] LUT 0 1 cck-b ck-b Carry-Skip Adder: Carry Skip Multiplexers . . . . . . . p[b, 2b-1] p[b, 2b-1] LUT 0 1 cc2b cb cb p[0, b-1] p[0, b-1] LUT 0 1 ccb c0 c0

  48. p[ib, ib+(b-1)] xib+(b-1) yib+(b-1) xib+(b-2) yib+(b-2) LUT 0 1 Carry-Skip Adder: Computation of the Block Propagate Signals p[ib, ib+(b-3)] 0 . . . . . . . p[ib, ib+3] xib+3 yib+3 xib+2 yib+2 LUT 0 1 p[ib, ib+1] 0 cb xib+1 yib+1 xib yib LUT 0 1 0 1

  49. Complete Carry-Skip Adder ss-1..0 s2s-1..s s3b-1..2b sk-1..k-b cck cc2b cc3b ccb xb-1..0 x2b-1..b x3b-1..2b xk-1..k-b . . . . . y3b-1..2b yk-1..k-b yb-1..0 y2b-1..b cb c0 c2b ck-b ck 0 0 0 0 . . . c0 1 1 1 1 p[0,b-1] p[b,2b-1] p[2b,3b-1] p[k-b,k-1] xb-1..0 x2b-1..b x3b-1..2b xk-1..k-b . . . . . y3b-1..2b yk-1..k-b yb-1..0 y2b-1..b
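The pieces above can be combined into a behavioural model of the complete adder: each block ripples its carry internally, computes its block propagate signal, and a multiplexer forwards the block carry-in when every bit of the block propagates. This is a functional sketch under those assumptions, with b dividing k; it models values, not timing, and `carry_skip_add` is an illustrative name.

```python
def carry_skip_add(x, y, k, b, c0=0):
    """Add two k-bit integers with b-bit skip blocks; returns (sum, carry-out)."""
    assert k % b == 0, "block size must divide the adder width"
    s, c = 0, c0
    for j in range(0, k, b):
        block_in = c   # c_jb, carry entering this block
        block_p = 1    # p[jb, jb+b-1], AND of the bit propagate signals
        for i in range(j, j + b):
            xi, yi = (x >> i) & 1, (y >> i) & 1
            p = xi ^ yi
            s |= (p ^ c) << i
            c = c if p else yi  # ripple inside the block
            block_p &= p
        # Skip multiplexer: choose the block carry-in if the whole block
        # propagates, otherwise the rippled carry cc.
        c = block_in if block_p else c
    return s, c


print(carry_skip_add(0b1111, 0b0001, 8, 4))  # (16, 0)
```

In this value-level model the multiplexer never changes the result, since a fully propagating block ripples its carry-in through unchanged; in hardware its purpose is to shorten the path the carry has to travel.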

  50. Carry-Skip Adders in Xilinx Spartan II FPGAs, by J-P. Deschamps, G. Bioul, G. Sutter. Delay in ns:

      k       b=k    b=8    b=16   b=32   Max. frequency increase
      64      14     13     12            13%
      96      16     14     13            21%
      128     23     14     14            63%
      256     38     -      16     17     141%
      512     77     -      20     20     296%
      1024    159    -      28     25     531%
