
Carry-Lookahead Addition

Ripple-Carry Adder


• Current design uses a “ripple-carry” adder technique
• Cout propagates into the Cin of next adder
• What is the associated electrical delay for this scheme?
• Assume each gate (AND/OR only) has a delay of T units
• Two-level logic implementation of a single full adder (FA):
• delay of 2T to compute Cout
• A 16-bit ripple-carry adder has 15 * 2T + T = 31T total delay to compute the sum!
• The delay grows linearly with the size of the adder
• Is there a faster way to add? Yes: carry-lookahead addition.
• Real ALUs use this style
• The idea is to compute the carry-in needed at each bit position with only a very small delay (smaller than in the ripple-carry case)
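The linear growth of the ripple-carry delay can be sketched in a few lines of Python. This is only an illustration of the slide's delay model (2T per carry stage, plus T for the final sum bit); the function name ripple_carry_delay is made up for this example.

```python
def ripple_carry_delay(n_bits: int, gate_delay: int = 1) -> int:
    """Delay of an n-bit ripple-carry adder in units of T, per the slide's model."""
    # The carry ripples through the first n-1 full adders at 2T each,
    # then the last stage needs 1T more to produce the final sum bit.
    return (n_bits - 1) * 2 * gate_delay + gate_delay


for n in (4, 16, 32, 64):
    print(f"{n}-bit ripple-carry adder: {ripple_carry_delay(n)}T")
# the 16-bit case gives 31T, matching the figure above
```

Doubling the width roughly doubles the delay, which is the motivation for lookahead.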
Generating a Carry
• An adder will “generate” a carry-out at position i when summing the bits ai and bi if ai · bi = 1 (i.e. ai and bi are both 1), whatever the carry-in

Define gi = ai · bi (generate)

• Hence: carry-out of position i = carry-in to position i+1 = 1 if gi = 1
• Let ci = “carry-in to position i”
• Note ci+1 = carry-in to position i+1 = carry-out from position i
• Delay to compute each g = 1T
Propagating a Carry
• An adder will “propagate” a carry-in (ci) through position i to its carry-out if ci = 1 and ai + bi = 1 (i.e. the carry-in is 1 and at least one of ai or bi is 1)

Define pi = ai + bi (propagate)

• Hence: carry-out of position i = ci+1 = gi + pi · ci
• A carry-out occurs from position i if either
• a carry is generated by position i, or
• a carry-in is propagated by position i
• Delay to compute each p = 1T
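The generate and propagate definitions, and the carry recurrence ci+1 = gi + pi · ci, can be tried out with a small Python sketch. The helper name carries and its argument layout are assumptions made for this illustration; the logic follows the definitions above (gi = ai AND bi, pi = ai OR bi).

```python
def carries(a: int, b: int, c0: int, n_bits: int) -> list[int]:
    """Return [c0, c1, ..., cn] using the recurrence c(i+1) = gi + pi * ci."""
    c = [c0]
    for i in range(n_bits):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        gi = ai & bi          # generate: both bits are 1
        pi = ai | bi          # propagate: at least one bit is 1
        c.append(gi | (pi & c[i]))
    return c


# 0101 + 0011: position 0 generates, positions 1 and 2 propagate
print(carries(0b0101, 0b0011, 0, 4))  # [0, 1, 1, 1, 0]
```

Note that this loop still ripples: each carry waits for the previous one. Lookahead removes that dependency by expanding the recurrence.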
Propagate / Generate
• Ex: using 4 bits

c0 = initial carry-in

c1 = g0 + p0 c0

c2 = g1 + p1 c1 = g1 + p1 (g0 + p0 c0) = g1 + p1 g0 + p1 p0 c0

c3 = g2 + p2 c2 = g2 + p2 (g1 + p1 g0 + p1 p0 c0) = g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0

• Delay to compute each c = 2T → fixed delay!
• I am assuming that g and p are pre-computed
• They take a total of 1T to pre-compute
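As a sanity check, the expanded equations for c1, c2 and c3 can be compared against the step-by-step recurrence for every combination of inputs. The sketch below is an illustration only; the function names lookahead_carries and ripple_carries are made up here, and g and p are passed as tuples (g0, g1, g2) and (p0, p1, p2).

```python
from itertools import product


def lookahead_carries(g, p, c0):
    """c1..c3 from the flattened two-level (AND then OR) equations."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    return c1, c2, c3


def ripple_carries(g, p, c0):
    """c1..c3 from the recurrence c(i+1) = gi + pi * ci, one stage at a time."""
    out, c = [], c0
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        out.append(c)
    return tuple(out)


# Exhaustively compare both forms over all values of g0..g2, p0..p2, c0.
mismatches = 0
for bits in product((0, 1), repeat=7):
    g, p, c0 = bits[0:3], bits[3:6], bits[6]
    if lookahead_carries(g, p, c0) != ripple_carries(g, p, c0):
        mismatches += 1
print("mismatches:", mismatches)
```

Each flattened equation is one layer of ANDs followed by one OR, which is where the fixed 2T comes from.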
An Abstraction of Propagate / Generate
• These equations require large gate “fan-in” to implement in 2T delay → therefore stop expansion at 4 bits as above
• Delay to compute each c = 2T → fixed delay!
• pre-computed: 1T for each p and g (in parallel!)
• 1T for the AND to create the subgroups (minterms)
• 1T for the OR of all subgroups
• Each sum bit Si = ai ⊕ bi ⊕ ci can now be computed in 3T delay: 2T for the carry ci, then 1T for the sum
• Combine these ideas to design a 4-bit adder with 3T delay for the entire 4-bit Sum (assuming p & g are pre-computed)
• What are P0 and G0?
• P0 = p3 p2 p1 p0 (super-propagate)
• G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 (super-generate)
• Note: the device has no “carry-out”, only P0 and G0
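The group signals for one 4-bit unit can be sketched as follows. This is a minimal illustration, assuming the unit's two 4-bit operands are passed as small integers; the name group_pg is hypothetical. It returns the pair (P, G) built from the equations above.

```python
def group_pg(a: int, b: int) -> tuple[int, int]:
    """Super-propagate P and super-generate G for one 4-bit unit."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # gi = ai AND bi
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]  # pi = ai OR bi
    P = p[3] & p[2] & p[1] & p[0]
    G = (g[3]
         | (p[3] & g[2])
         | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    return P, G


print(group_pg(0b0101, 0b0011))  # (0, 0)
print(group_pg(0b0011, 0b1101))  # (1, 1)
```

Although the unit has no explicit carry-out, its carry-out for a given carry-in c can be recovered as G + P · c, which is what the next level of lookahead uses.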
Super-Generate / Super-Propagate

P0 = p3 p2 p1 p0 (super-propagate)

G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0 (super-generate)

• P0 represents the propagate for the entire 4-bit unit
• P0 takes 1T delay units to compute
• G0 represents the generate for the entire 4-bit unit
• G0 takes 2T delay units to compute
• P0 and G0 represent a higher level of hardware abstraction of propagation and generation
• Cin(0) = c0 (initial carry-in)
• Cin(1) = G0 + P0 c0
• Cin(2) = G1 + P1 G0 + P1 P0 c0
• Cin(3) = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 c0
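• Delay for Cin is (2T + 2T) = 4T: 2T to compute the G values, then 2T for the AND/OR of the group carry-in equations
• Delay for Sum = 4T + 3T (per unit) + 1T (for pre-computation of p & g) = 8T
• Compare to 31T for the 16-bit ripple-carry adder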
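Putting the two levels together, a 16-bit adder built from four 4-bit units can be modelled roughly as below. This is a behavioural sketch, not a gate-level design: the name cla16_add is made up, and inside each unit the bit carries are stepped with the recurrence for brevity, whereas the hardware described above would use the flattened lookahead equations.

```python
def cla16_add(a: int, b: int, c0: int = 0) -> int:
    """16-bit sum (mod 2**16) using 4-bit units and group-level lookahead."""
    P, G = [], []
    for k in range(4):                       # unit k covers bits 4k .. 4k+3
        an, bn = (a >> 4 * k) & 0xF, (b >> 4 * k) & 0xF
        g = [(an >> i) & (bn >> i) & 1 for i in range(4)]
        p = [((an >> i) | (bn >> i)) & 1 for i in range(4)]
        P.append(p[3] & p[2] & p[1] & p[0])
        G.append(g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                 | (p[3] & p[2] & p[1] & g[0]))

    # Group carry-ins, written exactly as the Cin(k) equations above.
    cin = [
        c0,
        G[0] | (P[0] & c0),
        G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0),
        G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0),
    ]

    total = 0
    for k in range(4):
        c = cin[k]
        for i in range(4):
            ai = (a >> (4 * k + i)) & 1
            bi = (b >> (4 * k + i)) & 1
            total |= (ai ^ bi ^ c) << (4 * k + i)   # Si = ai XOR bi XOR ci
            c = (ai & bi) | ((ai | bi) & c)         # simplification: stepped, not flattened
    return total


print(hex(cla16_add(0x63D5, 0xED83)))  # 0x5158
```

The point of the structure is that no unit waits on its neighbour's carry chain: every Cin(k) depends only on the P, G values and c0.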

A: 0110 0011 1101 0101
B: 1110 1101 1000 0011
g: 0110 0001 1000 0001 → 1T
p: 1110 1111 1101 0111

P0: 0·1·1·1 = 0 → 1T (2T total)
P1: 1·1·0·1 = 0
P2: 1·1·1·1 = 1
P3: 1·1·1·0 = 0

G0: 0 + 0·0 + 0·1·0 + 0·1·1·1 = 0 → 2T (3T total)
G1: 1 + 1·0 + 1·1·0 + 1·1·0·0 = 1
G2: 0 + 1·0 + 1·1·0 + 1·1·1·1 = 1
G3: 0 + 1·1 + 1·1·1 + 1·1·1·0 = 1

• Computing the actual sum (sum bit S6 only, shown in red on the original slide):

A: 0110 0011 1101 0101
B: 1110 1101 1000 0011
g: 0110 0001 1000 0001
p: 1110 1111 1101 0111
P: 0100
G: 1110

a6 ⊕ b6 ⊕ c6 = 1 ⊕ 0 ⊕ (g5 + p5 g4 + p5 p4 Cin(1))
= 1 ⊕ 0 ⊕ (0 + 0·0 + 0·1·(G0 + P0 c0))
= 1 ⊕ 0 ⊕ (0 + 0·0 + 0·1·(0 + 0·0))
= 1

Delay to compute S6 = 1T + (2T + 2T) + (2T + 1T) = 8T
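The worked example can be re-derived mechanically. The Python sketch below recomputes g, p, the group signals of unit 0, and sum bit S6 for the same A and B, taking c0 = 0 as in the example; the helper names bit and nibbles are made up for this illustration.

```python
A = 0b0110_0011_1101_0101
B = 0b1110_1101_1000_0011
c0 = 0

g = A & B   # bitwise generate
p = A | B   # bitwise propagate


def bit(x: int, i: int) -> int:
    """Bit i of x (bit 0 is the rightmost bit)."""
    return (x >> i) & 1


def nibbles(x: int) -> str:
    """Format a 16-bit value as four space-separated 4-bit groups."""
    return " ".join(f"{(x >> s) & 0xF:04b}" for s in (12, 8, 4, 0))


print("g:", nibbles(g))  # g: 0110 0001 1000 0001
print("p:", nibbles(p))  # p: 1110 1111 1101 0111

# Group signals of unit 0 (bits 3..0), then the carry-in to unit 1.
P0 = bit(p, 3) & bit(p, 2) & bit(p, 1) & bit(p, 0)
G0 = (bit(g, 3) | (bit(p, 3) & bit(g, 2)) | (bit(p, 3) & bit(p, 2) & bit(g, 1))
      | (bit(p, 3) & bit(p, 2) & bit(p, 1) & bit(g, 0)))
cin1 = G0 | (P0 & c0)

# Lookahead inside unit 1: c6 = g5 + p5 g4 + p5 p4 Cin(1)
c6 = bit(g, 5) | (bit(p, 5) & bit(g, 4)) | (bit(p, 5) & bit(p, 4) & cin1)
s6 = bit(A, 6) ^ bit(B, 6) ^ c6
print("S6 =", s6)  # S6 = 1
```

The result agrees with bit 6 of the ordinary integer sum A + B.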

Test Yourself
• Compute sum bit S10 (shown in red on the original slide):

A: 0110 1001 1001 0101
B: 1011 0101 1000 1011
g:
p:
P:
G:

a10 ⊕ b10 ⊕ c10 =