
Advanced Dividers

Lecture 10. Advanced Dividers. Required Reading: Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design. Chapter 13, Basic Division Schemes: 13.4, Non-Restoring and Signed Division. Chapter 15, Variations in Dividers: 15.6, Combined Multiply/Divide Units.


Presentation Transcript


  1. Lecture 10 Advanced Dividers

  2. Required Reading
     Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
     Chapter 13, Basic Division Schemes
       13.4, Non-Restoring and Signed Division
     Chapter 15, Variations in Dividers
       15.6, Combined Multiply/Divide Units
       15.3, Combinational and Array Dividers
     Chapter 16, Division by Convergence

  3. Division versus Multiplication
     Division is more complex than multiplication:
     - Quotient digits must be selected or estimated.
     - Overflow is possible: the high-order k bits of z must be strictly less than d. This overflow check also detects the divide-by-zero condition.
     Pentium III latencies
     Instruction                   Latency   Cycles/Issue
     Load / Store                        3              1
     Integer Multiply                    4              1
     Integer Divide                     36             36
     Double/Single FP Multiply           5              2
     Double/Single FP Add                3              1
     Double/Single FP Divide            38             38
     The ratios haven't changed much in later Pentiums, Atom, or AMD products.*
     *Source: T. Granlund, "Instruction Latencies and Throughput for AMD and Intel x86 Processors," Feb. 2012
     Computer Arithmetic, Division
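The overflow rule on this slide can be sketched in a few lines of Python. The function name `division_overflows` is hypothetical, and the sketch assumes unsigned operands with a 2k-bit dividend z and a k-bit divisor d. Because the high-order half of z is never negative, the same comparison fires whenever d is 0.

```python
def division_overflows(z: int, d: int, k: int) -> bool:
    """Return True if the quotient of z / d cannot fit in k bits.

    Assumes unsigned operands: z has 2k bits, d has k bits.
    The test compares the high-order k bits of z against d.
    """
    high_half = z >> k          # high-order k bits of the dividend
    return high_half >= d       # also True for d == 0 (divide by zero)
```

For example, with k = 4 the pair z = 117, d = 10 passes the check, while z = 160, d = 10 does not, since 160 / 10 = 16 needs five bits.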

  4. Classification of Dividers
     Sequential
       Radix-2
         • Restoring
         • Non-restoring
           • regular
           • SRT
           • using carry save adders
           • SRT using carry save adders
       High-radix
     Array Dividers
     Dividers by Convergence

  5. Notation and Basic Equations

  6. Notation
     z   Dividend    $z_{2k-1} z_{2k-2} \ldots z_2 z_1 z_0$
     d   Divisor     $d_{k-1} d_{k-2} \ldots d_1 d_0$
     q   Quotient    $q_{k-1} q_{k-2} \ldots q_1 q_0$
     s   Remainder   $s_{k-1} s_{k-2} \ldots s_1 s_0$   ($s = z - dq$)

  7. Basic Equations of Division
     $z = d\,q + s$
     $|s| < |d|$
     $\mathrm{sign}(s) = \mathrm{sign}(z)$
     For $z > 0$: $0 \le s < |d|$
     For $z < 0$: $-|d| < s \le 0$

  8. Sequential Integer Division: Basic Equations
     $s^{(0)} = z$
     $s^{(j)} = 2 s^{(j-1)} - q_{k-j}\,(2^k d)$ for $j = 1 \ldots k$
     $s^{(k)} = 2^k s$

  9. Restoring Unsigned Integer Division

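  10. Restoring Unsigned Integer Division
      $s^{(0)} = z$
      for j = 1 to k
        if $2 s^{(j-1)} - 2^k d \ge 0$
          $q_{k-j} = 1$
          $s^{(j)} = 2 s^{(j-1)} - 2^k d$
        else
          $q_{k-j} = 0$
          $s^{(j)} = 2 s^{(j-1)}$
      The trial difference is accepted when it is zero as well as when it is positive; otherwise the final remainder could equal d.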
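The restoring recurrence above can be traced with a short Python sketch. The function name `restoring_divide` and the use of Python integers in place of fixed-width registers are assumptions made for illustration; the overflow check follows the rule from slide 3.

```python
def restoring_divide(z: int, d: int, k: int):
    """Restoring division of a 2k-bit unsigned z by a k-bit unsigned d.

    Returns (q, s) with z == q * d + s and 0 <= s < d.
    """
    if (z >> k) >= d:                    # overflow check; also catches d == 0
        raise OverflowError("quotient does not fit in k bits")
    s = z                                # s(0) = z
    q = 0
    for _ in range(k):                   # j = 1 .. k
        trial = 2 * s - (d << k)         # 2 s(j-1) - 2^k d
        if trial >= 0:
            q = (q << 1) | 1             # quotient bit 1, keep the difference
            s = trial
        else:
            q = q << 1                   # quotient bit 0, keep the shifted value
            s = 2 * s
    return q, s >> k                     # s(k) = 2^k s, so unscale the remainder
```

Calling `restoring_divide(117, 10, 4)` walks through the partial remainders 74, 148, 136, 112 and returns quotient 11 with remainder 7, the same operands used in the partial-quotient example on slide 19.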

  11. Example of restoring unsigned division

  12. Shift/subtract sequential restoring divider

  13. Non-Restoring Unsigned Integer Division

  14. Non-Restoring Unsigned Integer Division
      $s^{(1)} = 2z - 2^k d$
      for j = 2 to k
        if $s^{(j-1)} \ge 0$
          $q_{k-(j-1)} = 1$
          $s^{(j)} = 2 s^{(j-1)} - 2^k d$
        else
          $q_{k-(j-1)} = 0$
          $s^{(j)} = 2 s^{(j-1)} + 2^k d$
      end for
      if $s^{(k)} \ge 0$
        $q_0 = 1$
      else
        $q_0 = 0$
        Correction step

  15. Non-Restoring Unsigned Integer Division: Correction step
      $s = 2^{-k} \cdot s^{(k)}$
      $z = q\,d + s$, with $z, q, d \ge 0$
      If $s < 0$: $z = (q-1)\,d + (s+d) = q'\,d + s'$
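A minimal Python sketch of the non-restoring loop from slide 14, with the final correction, is given below. The name `nonrestoring_divide` is hypothetical and Python integers stand in for registers. One point is an observation of this sketch rather than something the slides state: with the 0/1 digit rule of slide 14, a negative final partial remainder is already recorded as q0 = 0, so the code only adds d back to the remainder. The test that accompanies the sketch compares the result with Python's `divmod`.

```python
def nonrestoring_divide(z: int, d: int, k: int):
    """Non-restoring division of a 2k-bit unsigned z by a k-bit unsigned d.

    Returns (q, s) with z == q * d + s and 0 <= s < d.
    """
    if (z >> k) >= d:                    # same overflow rule as restoring division
        raise OverflowError("quotient does not fit in k bits")
    dk = d << k                          # 2^k d
    s = 2 * z - dk                       # s(1) = 2z - 2^k d
    q = 0
    for _ in range(2, k + 1):            # j = 2 .. k
        if s >= 0:
            q = (q << 1) | 1             # previous subtraction succeeded
            s = 2 * s - dk
        else:
            q = q << 1                   # previous subtraction overshot: add next
            s = 2 * s + dk
    if s >= 0:
        q = (q << 1) | 1                 # q0 = 1
    else:
        q = q << 1                       # q0 = 0
        s += dk                          # correction step: restore the remainder
    return q, s >> k                     # s(k) = 2^k s
```

For z = 117, d = 10, k = 4 the partial remainders are 74, -12, 136, 112, which shows the sign oscillation that restoring division avoids; the result is again quotient 11, remainder 7.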

  16. Example of nonrestoring unsigned division

  17. Partial remainder variations for restoring and nonrestoring division

  18. Non-Restoring Unsigned Integer Division: Justification
      Assume $s^{(j-1)} \ge 0$, $2 s^{(j-1)} - 2^k d < 0$, and $2\,(2 s^{(j-1)}) - 2^k d \ge 0$.
      Restoring division:
        $s^{(j)} = 2 s^{(j-1)}$
        $s^{(j+1)} = 2 s^{(j)} - 2^k d = 4 s^{(j-1)} - 2^k d$
      Non-restoring division:
        $s^{(j)} = 2 s^{(j-1)} - 2^k d$
        $s^{(j+1)} = 2 s^{(j)} + 2^k d = 2\,(2 s^{(j-1)} - 2^k d) + 2^k d = 4 s^{(j-1)} - 2^k d$

  19. Convergence of the Partial Quotient to q
      [Figure: partial quotient $q^{(j)}$ (vertical axis, 0000 to 1111) versus iteration (0 to 4) for restoring and nonrestoring division.]
      Example: (0 1 1 1 0 1 0 1)two / (1 0 1 0)two, that is, (117)ten / (10)ten = (11)ten = (1011)two
      In restoring division, the partial quotient converges to q from below.
      In nonrestoring division, the partial quotient may overshoot q, but converges to it after some oscillations.

  20. Non-Restoring Signed Integer Division

  21. Non-Restoring Signed Integer Division
      $s^{(0)} = z$
      for j = 1 to k
        if sign($s^{(j-1)}$) == sign(d)
          $q_{k-j} = 1$
          $s^{(j)} = 2 s^{(j-1)} - 2^k d = 2 s^{(j-1)} - q_{k-j}\,(2^k d)$
        else
          $q_{k-j} = -1$
          $s^{(j)} = 2 s^{(j-1)} + 2^k d = 2 s^{(j-1)} - q_{k-j}\,(2^k d)$
      q = BSD_2's_comp_conversion(q)
      Correction_step

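  22. Non-Restoring Signed Integer Division: Correction step
      $s = 2^{-k} \cdot s^{(k)}$
      $z = q\,d + s$
      The final remainder must satisfy sign(s) = sign(z). If it does not, use whichever of the following makes the signs agree:
        $z = (q-1)\,d + (s+d) = q'\,d + s'$
        $z = (q+1)\,d + (s-d) = q''\,d + s''$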

  23. Example of nonrestoring signed division
      ================================
      z              0 0 1 0 0 0 0 1
      2^4 d        1 1 0 0 1
      -2^4 d       0 0 1 1 1
      ================================
      s(0)         0 0 0 1 0 0 0 0 1
      2s(0)        0 0 1 0 0 0 0 1       sign(s(0)) ≠ sign(d),
      +2^4 d       1 1 0 0 1             so set q3 = -1 and add
      --------------------------------
      s(1)         1 1 1 0 1 0 0 1
      2s(1)        1 1 0 1 0 0 1         sign(s(1)) = sign(d),
      +(-2^4 d)    0 0 1 1 1             so set q2 = 1 and subtract
      --------------------------------
      s(2)         0 0 0 0 1 0 1
      2s(2)        0 0 0 1 0 1           sign(s(2)) ≠ sign(d),
      +2^4 d       1 1 0 0 1             so set q1 = -1 and add
      --------------------------------
      s(3)         1 1 0 1 1 1
      2s(3)        1 0 1 1 1             sign(s(3)) = sign(d),
      +(-2^4 d)    0 0 1 1 1             so set q0 = 1 and subtract
      --------------------------------
      s(4)         1 1 1 1 0             sign(s(4)) ≠ sign(z),
      +(-2^4 d)    0 0 1 1 1             so perform corrective subtraction
      --------------------------------
      s(4)         0 0 1 0 1
      s              0 1 0 1
      q            -1 1 -1 1
      ================================
      p = 0 1 0 1
      Shift, complement MSB:  1 1 0 1 1
      Add 1 to correct:       1 1 0 0
      Check: 33/(-7) = -4
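The signed recurrence and its correction step can be replayed in Python. This is a sketch under stated assumptions: the function name `nonrestoring_signed_divide` is hypothetical, a zero partial remainder is treated as positive, and the range check is only the loose bound |z| < 2^k |d| that keeps the recurrence bounded (a k-bit two's-complement quotient needs a tighter bound). The branch that handles a final remainder equal to ±d is an addition of this sketch for exact divisions; the slides do not discuss that case.

```python
def nonrestoring_signed_divide(z: int, d: int, k: int):
    """Signed non-restoring division with quotient digits in {-1, +1}.

    Returns (q, s, digits) with z == q * d + s, |s| < |d|, and
    sign(s) == sign(z) whenever s != 0. digits lists q_{k-1} .. q_0.
    """
    if d == 0 or abs(z) >= (abs(d) << k):
        raise OverflowError("operands outside the range assumed by this sketch")
    dk = d << k                              # 2^k d
    s = z                                    # s(0) = z
    digits = []
    for _ in range(k):
        if (s < 0) == (d < 0):               # sign(s) == sign(d): subtract
            digits.append(1)
            s = 2 * s - dk
        else:                                # signs differ: add
            digits.append(-1)
            s = 2 * s + dk
    q = 0
    for digit in digits:                     # value of the BSD quotient
        q = 2 * q + digit
    s >>= k                                  # s(k) is an exact multiple of 2^k
    if abs(s) == abs(d):                     # assumption: fold an exact division
        q += s // d
        s = 0
    elif s != 0 and (s < 0) != (z < 0):      # correction step from the slides
        if (s < 0) == (d < 0):
            q, s = q + 1, s - d              # corrective subtraction
        else:
            q, s = q - 1, s + d              # corrective addition
    return q, s, digits
```

Running `nonrestoring_signed_divide(33, -7, 4)` reproduces the slide: the partial remainders are -46, 20, -72, -32, the digits are -1, 1, -1, 1 (value -5), and the corrective subtraction yields q = -4 with s = 5.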

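  24. BSD → 2's Complement Conversion
      $q = (q_{k-1} q_{k-2} \ldots q_1 q_0)_{BSD} = (\bar{p}_{k-1} p_{k-2} \ldots p_1 p_0 1)_{2's\ complement}$
      where $p_i = 0$ if $q_i = -1$ and $p_i = 1$ if $q_i = 1$, and $\bar{p}_{k-1}$ is the complement of $p_{k-1}$.
      Example:
        q (BSD)               1 -1  1  1
        p                     1  0  1  1
        q (2's complement)    0  0  1  1  1  =  0 1 1 1
      No overflow if $\bar{p}_{k-1} = p_{k-2}$ (that is, $q_{k-1} \ne q_{k-2}$).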
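The conversion rule can be checked with a small Python sketch. The function name `bsd_to_twos_complement` and the list-of-digits representation (most significant digit first) are assumptions made for illustration.

```python
def bsd_to_twos_complement(digits):
    """Convert BSD digits in {-1, +1} (MSB first) to two's complement.

    Returns (bits, value): bits is the (k+1)-bit two's-complement pattern,
    MSB first, and value is the signed integer it encodes.
    """
    k = len(digits)
    p = [1 if q == 1 else 0 for q in digits]     # map -1 -> 0 and 1 -> 1
    bits = [1 - p[0]] + p[1:] + [1]              # complement MSB, shift left, append 1
    value = -bits[0] * (1 << k)                  # sign bit has weight -2^k
    for i, b in enumerate(bits[1:]):
        value += b << (k - 1 - i)                # remaining bits: 2^(k-1) .. 2^0
    return bits, value
```

The slide's example, digits 1 -1 1 1, gives the pattern 0 0 1 1 1 (value 7), and the quotient digits -1 1 -1 1 from the signed division example give 1 1 0 1 1 (value -5).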

  25. Nonrestoring Hardware Divider

  26. Multiply/Divide Unit

  27. Multiply-Divide Unit
      The control unit proceeds through the necessary steps for multiplication or division (including using the appropriate shift direction).
      The slight speed penalty owing to a more complex control unit is insignificant.
      Fig. 15.9 Sequential radix-2 multiply/divide unit.

  28. Fractional Division

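  29. Unsigned Fractional Division
      $z_{frac}$   Dividend    $.z_{-1} z_{-2} \ldots z_{-(2k-1)} z_{-2k}$
      $d_{frac}$   Divisor     $.d_{-1} d_{-2} \ldots d_{-(k-1)} d_{-k}$
      $q_{frac}$   Quotient    $.q_{-1} q_{-2} \ldots q_{-(k-1)} q_{-k}$
      $s_{frac}$   Remainder   $.\underbrace{0 0 0 \ldots 0}_{k\ \text{bits}}\, s_{-(k+1)} \ldots s_{-(2k-1)} s_{-2k}$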

  30. Integer vs. Fractional Division
      For integers: $z = q\,d + s$
      Multiplying both sides by $2^{-2k}$: $z\,2^{-2k} = (q\,2^{-k})\,(d\,2^{-k}) + s\,2^{-2k}$
      For fractions: $z_{frac} = q_{frac}\,d_{frac} + s_{frac}$
      where
        $z_{frac} = z\,2^{-2k}$
        $d_{frac} = d\,2^{-k}$
        $q_{frac} = q\,2^{-k}$
        $s_{frac} = s\,2^{-2k}$

  31. Unsigned Fractional Division: Overflow
      Condition for no overflow: $z_{frac} < d_{frac}$

  32. Sequential Fractional Division: Basic Equations
      $s^{(0)} = z_{frac}$
      $s^{(j)} = 2 s^{(j-1)} - q_{-j}\,d_{frac}$ for $j = 1 \ldots k$
      $2^k \cdot s_{frac} = s^{(k)}$, so $s_{frac} = 2^{-k} \cdot s^{(k)}$

  33. Fig. 13.2 Examples of sequential division with integer and fractional operands.

  34. Array Dividers

  35. Sequential Fractional Division: Basic Equations
      $s^{(0)} = z_{frac}$
      $s^{(j)} = 2 s^{(j-1)} - q_{-j}\,d_{frac}$
      $s^{(k)} = 2^k s_{frac}$

  36. Restoring Unsigned Fractional Division
      $s^{(0)} = z$
      for j = 1 to k
        if $2 s^{(j-1)} - d \ge 0$
          $q_{-j} = 1$
          $s^{(j)} = 2 s^{(j-1)} - d$
        else
          $q_{-j} = 0$
          $s^{(j)} = 2 s^{(j-1)}$
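A sketch of the fractional restoring loop, using Python's `fractions.Fraction` so that every partial remainder is exact. The function name `restoring_frac_divide` is hypothetical; the no-overflow condition z < d comes from slide 31, and the final scaling by 2^-k follows the basic equations on slide 32.

```python
from fractions import Fraction

def restoring_frac_divide(z, d, k: int):
    """Restoring division of fractions with 0 <= z < d < 1.

    Returns (q, s) where q has k fractional bits and z == q * d + s.
    """
    z, d = Fraction(z), Fraction(d)
    if not (0 <= z < d < 1):
        raise OverflowError("need 0 <= z < d < 1")
    s = z                                    # s(0) = z_frac
    q = Fraction(0)
    for j in range(1, k + 1):
        trial = 2 * s - d                    # 2 s(j-1) - d
        if trial >= 0:
            q += Fraction(1, 2 ** j)         # q_{-j} = 1
            s = trial
        else:
            s = 2 * s                        # q_{-j} = 0
    return q, s / 2 ** k                     # s_frac = 2^-k s(k)
```

Scaling the integer example 117 / 10 with k = 4 gives z = 117/256 and d = 10/16; the sketch returns q = 11/16 and s = 7/256, matching the integer result (11, 7) under the scaling on slide 30.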

  37. Restoring Array Divider

  38. Non-Restoring Unsigned Fractional Division
      $s^{(-1)} = z - d$
      for j = 0 to k-1
        if $s^{(j-1)} \ge 0$
          $q_{-j} = 1$
          $s^{(j)} = 2 s^{(j-1)} - d$
        else
          $q_{-j} = 0$
          $s^{(j)} = 2 s^{(j-1)} + d$
      end for
      if $s^{(k-1)} \ge 0$
        $q_{-k} = 1$
      else
        $q_{-k} = 0$

  39. Nonrestoring Array Divider
      [Figure: nonrestoring array divider with the critical path marked.]
      Similarity to array multiplier is deceiving.

  40. Division by Convergence

  41. Division by Convergence
      Chapter Goals
        Show how, by using multiplication as the basic operation in each division step, the number of iterations can be reduced.
      Chapter Highlights
        Digit-recurrence as convergence method
        Convergence by Newton-Raphson iteration
        Computing the reciprocal of a number
        Hardware implementation and fine tuning

  42. 16.1 General Convergence Methods
      [Figure: quotient q = 0.101101 refined digit by digit.]
      Sequential digit-at-a-time (binary or high-radix) division can be viewed as a convergence scheme.
      As each new digit of $q = z / d$ is determined, the quotient value is refined, until it reaches the final correct value.
      Convergence is from below in restoring division and oscillating in nonrestoring division.
      Meanwhile, the remainder $s = z - qd$ approaches 0; the scaled remainder is kept in a certain range, such as $[-d, d)$.

  43. Elaboration on Scaled Remainder in Division
      The partial remainder $s^{(j)}$ in the division recurrence isn't the true remainder but a version scaled by $2^j$.
      Division with left shifts: $s^{(j)} = 2 s^{(j-1)} - q_{k-j}\,(2^k d)$, with $s^{(0)} = z$ and $s^{(k)} = 2^k s$. The factor 2 in $2 s^{(j-1)}$ is the shift; the whole right-hand side is the subtract.
      Quotient digit selection keeps the scaled remainder bounded (say, in the range $-d$ to $d$) to ensure the convergence of the true remainder to 0.

  44. Recurrence Formulas for Convergence Methods
      Two recurrences:
        $u^{(i+1)} = f(u^{(i)}, v^{(i)})$
        $v^{(i+1)} = g(u^{(i)}, v^{(i)})$
      Three recurrences:
        $u^{(i+1)} = f(u^{(i)}, v^{(i)}, w^{(i)})$
        $v^{(i+1)} = g(u^{(i)}, v^{(i)}, w^{(i)})$
        $w^{(i+1)} = h(u^{(i)}, v^{(i)}, w^{(i)})$
      Guide the iteration such that one of the values converges to a constant (usually 0 or 1). The other value then converges to the desired function.
      The complexity of this method depends on two factors:
        a. Ease of evaluating f and g (and h)
        b. Rate of convergence (number of iterations needed)

  45. 16.2 Division by Repeated Multiplications
      Motivation: Suppose add takes 1 clock and multiply takes 3 clocks. A 64-bit divide takes 64 clocks in radix 2 and 32 in radix 4, so division via multiplications is faster if 10 or fewer multiplications are needed.
      Idea: $q = \frac{z}{d} = \frac{z\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}{d\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}$, where the multipliers force the denominator to 1, so the numerator converges to q.
      The remainder is often not needed, but can be obtained by another multiplication if desired: $s = z - qd$
      To turn the identity into a division algorithm, we face three questions:
        1. How to select the multipliers $x^{(i)}$?
        2. How many iterations (pairs of multiplications)?
        3. How to implement in hardware?

  46. Formulation as a Convergence Computation
      $d^{(i+1)} = d^{(i)} x^{(i)}$   Set $d^{(0)} = d$; make $d^{(m)}$ converge to 1
      $z^{(i+1)} = z^{(i)} x^{(i)}$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
      This fits the general form $u^{(i+1)} = f(u^{(i)}, v^{(i)})$, $v^{(i+1)} = g(u^{(i)}, v^{(i)})$.
      Question 1: How to select the multipliers $x^{(i)}$?
        $x^{(i)} = 2 - d^{(i)}$
      This choice transforms the recurrence equations into:
      $d^{(i+1)} = d^{(i)} (2 - d^{(i)})$   Set $d^{(0)} = d$; iterate until $d^{(m)} \cong 1$
      $z^{(i+1)} = z^{(i)} (2 - d^{(i)})$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$

  47. Determining the Rate of Convergence
      $d^{(i+1)} = d^{(i)} x^{(i)}$   Set $d^{(0)} = d$; make $d^{(m)}$ converge to 1
      $z^{(i+1)} = z^{(i)} x^{(i)}$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
      Question 2: How quickly does $d^{(i)}$ converge to 1?
      We can relate the error in step i + 1 to the error in step i:
        $d^{(i+1)} = d^{(i)} (2 - d^{(i)}) = 1 - (1 - d^{(i)})^2$
        $1 - d^{(i+1)} = (1 - d^{(i)})^2$
      For $1 - d^{(i)} \le \varepsilon$, we get $1 - d^{(i+1)} \le \varepsilon^2$: quadratic convergence.
      In general, for k-bit operands, we need $2m - 1$ multiplications and $m$ 2's complementations, where $m = \lceil \log_2 k \rceil$.
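The recurrences and the error relation above can be checked numerically. The sketch below uses exact rational arithmetic (`fractions.Fraction`) rather than k-bit hardware words, so it illustrates the convergence rate only; the function name `divide_by_convergence` is hypothetical, and the requirement 1/2 ≤ d < 1 matches the normalization used in Table 16.1.

```python
from fractions import Fraction

def divide_by_convergence(z, d, m: int):
    """Run m iterations of d <- d(2 - d), z <- z(2 - d).

    Requires 1/2 <= d < 1. Returns (z_m, errors), where z_m approximates
    z / d and errors[i] == 1 - d(i) for i = 0 .. m.
    """
    z, d = Fraction(z), Fraction(d)
    if not (Fraction(1, 2) <= d < 1):
        raise ValueError("need 1/2 <= d < 1")
    errors = [1 - d]
    for _ in range(m):
        x = 2 - d                # multiplier x(i): one 2's complementation
        d = d * x                # d(i+1) = d(i) x(i), forced toward 1
        z = z * x                # z(i+1) = z(i) x(i), converges to q
        errors.append(1 - d)
    return z, errors
```

With z = 3/8 and d = 3/4 (so q = 1/2), three iterations give the errors 1/4, 1/16, 1/256, 1/65536, each the square of the one before, and z(3) = (1/2)(1 - 1/65536).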

  48. Quadratic Convergence
      Table 16.1 Quadratic convergence in computing z/d by repeated multiplications, where $1/2 \le d = 1 - y < 1$
      -------------------------------------------------------------------------------
      i   $d^{(i)} = d^{(i-1)} x^{(i-1)}$, with $d^{(0)} = d$                      $x^{(i)} = 2 - d^{(i)}$
      -------------------------------------------------------------------------------
      0   $1 - y$      = (.1xxx xxxx xxxx xxxx)two  $\ge 1/2$          $1 + y$
      1   $1 - y^2$    = (.11xx xxxx xxxx xxxx)two  $\ge 3/4$          $1 + y^2$
      2   $1 - y^4$    = (.1111 xxxx xxxx xxxx)two  $\ge 15/16$        $1 + y^4$
      3   $1 - y^8$    = (.1111 1111 xxxx xxxx)two  $\ge 255/256$      $1 + y^8$
      4   $1 - y^{16}$ = (.1111 1111 1111 1111)two  $= 1 - ulp$
      -------------------------------------------------------------------------------
      Each iteration doubles the number of guaranteed leading 1s (convergence to 1 is from below).
      Beginning with a single 1 ($d \ge 1/2$), after $\log_2 k$ iterations we get as close to 1 as is possible in a fractional representation.

  49. Graphical Depiction of Convergence to q
      Fig. 16.1 Graphical representation of convergence in division by repeated multiplications.

  50. 16.5 Hardware Implementation
      Repeated multiplications: each pair of operations involves the same multiplier.
      $d^{(i+1)} = d^{(i)} (2 - d^{(i)})$   Set $d^{(0)} = d$; iterate until $d^{(m)} \cong 1$
      $z^{(i+1)} = z^{(i)} (2 - d^{(i)})$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
      Fig. 16.6 Two multiplications fully overlapped in a 2-stage pipelined multiplier.
