
VLSI Arithmetic

marie

Presentation Transcript


  1. VLSI Arithmetic

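  2. Multiplication

     $A = a_{n-1} a_{n-2} \dots a_1 a_0$
     $B = b_{n-1} b_{n-2} \dots b_1 b_0$

     e.g. $A = 110011$, $B = 101001$. One partial product per bit of $B$, from $b_0$ upward, each shifted one further position to the left:

                   1 1 0 0 1 1
                 × 1 0 1 0 0 1
                 -------------
                   1 1 0 0 1 1
                 0 0 0 0 0 0
               0 0 0 0 0 0
             1 1 0 0 1 1
           0 0 0 0 0 0
         1 1 0 0 1 1

     Shift and add: Area $O(N)$, Time $O(N \log N)$. Too slow.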
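The shift-and-add scheme on this slide can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware design: the function names `partial_products` and `shift_and_add` are made up here, and Python integers stand in for the $n$-bit registers.

```python
def partial_products(a: int, b: int, n: int) -> list[int]:
    """Return the n partial products: A shifted left by i when bit b_i is 1, else 0."""
    return [(a if (b >> i) & 1 else 0) << i for i in range(n)]


def shift_and_add(a: int, b: int, n: int) -> int:
    """Multiply two n-bit unsigned integers by adding the partial products one at a time."""
    total = 0
    for product in partial_products(a, b, n):
        total += product  # one addition per multiplier bit, done sequentially
    return total


if __name__ == "__main__":
    a, b = 0b110011, 0b101001  # the slide's operands (51 and 41)
    for row in partial_products(a, b, 6):
        print(format(row, "012b"))
    print(shift_and_add(a, b, 6))  # 2091
```

The loop performs the $N$ additions one after another, which is where the sequential cost quoted on the slide comes from.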

  3. Each bit of $B$ selects one partial product of $A = a_{n-1} a_{n-2} \dots a_1 a_0$:

     $A \times b_0 = B_0$
     $A \times b_1 = B_1$
     $A \times b_2 = B_2$
     ...
     $A \times b_{n-1} = B_{n-1}$

     The product is the sum of the $B_i$, with $B_i$ aligned at bit position $i$.

  4. Algorithm, using $n$ processors of $2n$ bits each, where processor $i$ holds $b_i$:

     1. Broadcast $A$ to the $n$ processors ($\log n$ time).
     2. Compute $B_i$ ($i = 0, \dots, n-1$) simultaneously.
     3. Compute the sum (using redundant binary numbers).

     Time $O(\log n)$, Space $O(N^2)$.
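As a rough illustration of step 3, the partial products can be combined pairwise in a tree, so that the number of addition levels grows like $\log n$. The sketch below is an assumption-laden software model: `tree_sum` and `parallel_multiply` are hypothetical names, and ordinary Python integer addition stands in for the constant-time redundant binary adders the slide relies on.

```python
def tree_sum(values: list[int]) -> tuple[int, int]:
    """Add a non-empty list pairwise; return (sum, number of tree levels used)."""
    levels = 0
    while len(values) > 1:
        # All additions within one level are independent and could run in parallel.
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            paired.append(values[-1])  # odd element passes through to the next level
        values = paired
        levels += 1
    return values[0], levels


def parallel_multiply(a: int, b: int, n: int) -> tuple[int, int]:
    """Form all n partial products at once, then reduce them with a tree of adders."""
    products = [(a if (b >> i) & 1 else 0) << i for i in range(n)]
    return tree_sum(products)


if __name__ == "__main__":
    print(parallel_multiply(0b110011, 0b101001, 6))  # (2091, 3)
```

With ordinary carry-propagating adders each level would itself cost more than constant time; the $O(\log n)$ total on the slide depends on the redundant binary representation.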

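  5. 3M-multiplication (Why not 4M-multiplication?)

     Split each $N$-bit operand into halves: $X = X_1 \cdot 2^{N/2} + X_0$ and $Y = Y_1 \cdot 2^{N/2} + Y_0$.

     $P = X \cdot Y$
     $U = (X_1 + X_0)(Y_1 + Y_0)$
     $V = X_1 \cdot Y_1$
     $W = X_0 \cdot Y_0$
     $P = X \cdot Y = V \cdot 2^N + (U - V - W) \cdot 2^{N/2} + W$

     $A = O(N^2)$, $T = O(\log N)$

     [Figure: layout. $X$ and $Y$ enter an input distribution network; routing feeds three recursive multipliers for $X_1 \cdot Y_1$, $(X_1 + X_0)(Y_1 + Y_0)$ and $X_0 \cdot Y_0$; further routing leads to an adder ($n$) and the output network.]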
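The three-multiplication identity can be checked with a short recursive sketch. The function name `multiply_3m`, the cutoff of 16 for the base case, and the choice to split on the bit length of the larger operand are all assumptions made for this illustration; the slide describes a hardware layout, not this code.

```python
def multiply_3m(x: int, y: int) -> int:
    """Multiply non-negative integers using three half-size products per level."""
    if x < 16 or y < 16:
        return x * y  # base case: small operand, multiply directly

    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x1, x0 = x >> half, x & mask  # X = X1 * 2^half + X0
    y1, y0 = y >> half, y & mask  # Y = Y1 * 2^half + Y0

    v = multiply_3m(x1, y1)            # V = X1 * Y1
    w = multiply_3m(x0, y0)            # W = X0 * Y0
    u = multiply_3m(x1 + x0, y1 + y0)  # U = (X1 + X0)(Y1 + Y0)

    # P = V * 2^(2*half) + (U - V - W) * 2^half + W
    return (v << (2 * half)) + ((u - v - w) << half) + w


if __name__ == "__main__":
    print(multiply_3m(0b110011, 0b101001))  # 2091
```

A 4M version would compute $X_1 Y_0$ and $X_0 Y_1$ separately; the term $U - V - W$ replaces those two products with one product plus additions and subtractions.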

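  6. Area, Time, Period Complexity, and Optimality

     | Version | Lower bound | 4M | 3M | 2M, LABC |
     |---|---|---|---|---|
     | Area | $N^2$ | $N^2 \log^2 N$ | $N^2$ | $MN \log N$ |
     | Time | $\log N$ | $\log N$ | $\log N$ | $\log N$ |
     | Period | 1 | 1 | 1 | 1 |
     | $AP^2$ | $N^2$ | $N^2 \log^2 N$ | $N^2$ | $MN \log N$ |
     | $AP^2T^2$ | $N^2 \log^2 N$ | $N^2 \log^4 N$ | $N^2 \log^2 N$ | $MN \log^3 N$ |
     | Remark | -- | Time-optimal | Time, $AP^2$, and $AP^2T^2$ optimal | Time-optimal and regular layout |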

  7. Redundant Binary Number (Signed Digit)

     $A = \sum_{i=0}^{n-1} a_i 2^i$, where $a_i \in \{0, 1, \bar{1}\}$ and $\bar{1}$ denotes $-1$.

     Example: $1\,\bar{1}\,0\,1\,1\,\bar{1} = 2^5 - 2^4 + 2^2 + 2^1 - 2^0$

     1. A binary number is a redundant binary number.
     2. Note that $1 = 1\,\bar{1}$.
     3. Redundant binary number → binary number by subtraction (in $\log n$ time).
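A small sketch of the signed-digit representation may help. It assumes digits are stored most significant first as Python integers, with `-1` standing for $\bar{1}$; the helper names `sd_value` and `sd_to_binary` are invented for this example. The second function follows point 3 of the slide: collect the positive digits and the negative digits into two ordinary binary numbers and subtract.

```python
def sd_value(digits: list[int]) -> int:
    """Value of a signed-digit number, digits in {-1, 0, 1}, most significant first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def sd_to_binary(digits: list[int]) -> int:
    """Convert by subtraction: (binary number of the 1s) minus (binary number of the -1s)."""
    positive = 0
    negative = 0
    for d in digits:
        positive = 2 * positive + (1 if d == 1 else 0)
        negative = 2 * negative + (1 if d == -1 else 0)
    return positive - negative


if __name__ == "__main__":
    example = [1, -1, 0, 1, 1, -1]  # 2^5 - 2^4 + 2^2 + 2^1 - 2^0
    print(sd_value(example), sd_to_binary(example))  # 21 21
    print(sd_value([1, -1]))  # 1, i.e. the digit pair 1 followed by -1 equals 1
```

In hardware the single subtraction is what costs $\log n$ time, since it needs full carry propagation.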

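  8. Example: $1\,\bar{1}\,1\,0\,\bar{1} = 10100 - 01001 = (11)_{10}$

     Example addition:

     $A = 1\,\bar{1}\,\bar{1}\,\bar{1}\,0\,1 = (5)_{10}$
     $B = 1\,0\,0\,1\,1\,0 = (38)_{10}$
     $S = 0\,1\,\bar{1}\,0\,\bar{1}\,1$ (interim sum)
     $C = 1\,\bar{1}\,0\,0\,1\,0$ (carries, each weighted one position to the left)
     $A + B = 1\,\bar{1}\,1\,\bar{1}\,1\,\bar{1}\,1 = (43)_{10}$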

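  9. Addition (Subtraction): carry propagation is limited to one bit to the left.

     | Type | Augend $a_i$ | Addend $b_i$ | Carry |
     |---|---|---|---|
     | 1 | $1$ | $1$ | $1$ |
     | 2 | $1$ or $0$ | $0$ or $1$ | $1$ if there is a carry $1$ from the lower end, $0$ otherwise |
     | 3 | $0$ | $0$ | $0$ |
     | 4 | $1$ or $\bar{1}$ | $\bar{1}$ or $1$ | no carry |
     | 5 | $0$ or $\bar{1}$ | $\bar{1}$ or $0$ | $\bar{1}$ if there is a carry $\bar{1}$ from the lower end, $0$ otherwise |
     | 6 | $\bar{1}$ | $\bar{1}$ | $\bar{1}$ |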

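  10. SD addition rule table

     | $(a_i, b_i)$ | Next lower position $(a_{i-1}, b_{i-1})$ | $c_i$ | $s_i$ |
     |---|---|---|---|
     | $(1,1)$ | any | $1$ | $0$ |
     | $(1,0)$, $(0,1)$ | $(1,0)$, $(0,1)$, $(1,1)$ | $1$ | $\bar{1}$ |
     | $(1,0)$, $(0,1)$ | else | $0$ | $1$ |
     | $(0,0)$, $(1,\bar{1})$, $(\bar{1},1)$ | any | $0$ | $0$ |
     | $(0,\bar{1})$, $(\bar{1},0)$ | $(\bar{1},0)$, $(0,\bar{1})$, $(\bar{1},\bar{1})$ | $\bar{1}$ | $1$ |
     | $(0,\bar{1})$, $(\bar{1},0)$ | else | $0$ | $\bar{1}$ |
     | $(\bar{1},\bar{1})$ | any | $\bar{1}$ | $0$ |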
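The rule table can be exercised with a short sketch. The overbars are not visible in the transcript, so the reading used here is an assumption: for a digit pair summing to $+1$, emit carry $1$ and interim digit $\bar{1}$ when the next lower pair is $(1,0)$, $(0,1)$ or $(1,1)$, and for a pair summing to $-1$, emit carry $\bar{1}$ and interim digit $1$ when the next lower pair is $(\bar{1},0)$, $(0,\bar{1})$ or $(\bar{1},\bar{1})$. The names `sd_add` and `sd_value` are hypothetical, and `-1` stands for $\bar{1}$.

```python
def sd_value(digits) -> int:
    """Value of a signed-digit number, digits in {-1, 0, 1}, most significant first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def sd_add(a: list[int], b: list[int]) -> list[int]:
    """Carry-limited signed-digit addition of equal-length digit lists (MSB first)."""
    assert len(a) == len(b)
    a_lsb, b_lsb = a[::-1], b[::-1]  # work from the least significant digit
    n = len(a_lsb)
    interim = [0] * n
    carry = [0] * (n + 1)  # carry[i + 1] is the carry out of position i

    for i in range(n):
        total = a_lsb[i] + b_lsb[i]
        # Sum of the next lower digit pair; >= 1 means (1,0), (0,1) or (1,1).
        lower = a_lsb[i - 1] + b_lsb[i - 1] if i > 0 else 0
        if total == 2:
            c, s = 1, 0
        elif total == 1:
            c, s = (1, -1) if lower >= 1 else (0, 1)
        elif total == 0:
            c, s = 0, 0
        elif total == -1:
            c, s = (-1, 1) if lower <= -1 else (0, -1)
        else:  # total == -2
            c, s = -1, 0
        interim[i] = s
        carry[i + 1] = c

    # Each final digit needs only its interim digit and the carry from one position below.
    result = [interim[i] + carry[i] for i in range(n)] + [carry[n]]
    return result[::-1]


if __name__ == "__main__":
    a = [1, -1, -1, -1, 0, 1]  # 5
    b = [1, 0, 0, 1, 1, 0]     # 38
    total = sd_add(a, b)
    print(total, sd_value(total))  # [1, -1, 1, -1, 1, -1, 1] 43
```

Because each output digit depends only on positions $i$, $i-1$ and $i-2$ of the inputs, all digits can be produced in parallel in constant time, which is the point of the representation.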

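  11. Hardware for multiplication: Mesh of Trees

     [Figure: mesh of trees with row trees $R_0$ to $R_3$, column trees $C_0$ to $C_3$, and inputs $A$, $A_1$, $A_2$, $A_3$.]

     • Number of PEs = $O(n \cdot n)$
     • Area $\Theta(n^2 \log^2 n)$
     • Multiplication $A \cdot B$:
       • Row $R_i$: $A$ shifted $i$ bits if $b_i \neq 0$
       • Column $C_i$: add the column; the sum has $\log n$ bits
       • Use redundant binary to add these numbers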

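  12. Example of multiplication on mesh of trees with augmented mesh edges

     $A = 0111$, $B = 1011$. Consider only the last 4 bits.

     [Figure: mesh of trees with rows $R_0$ to $R_3$, columns $C_0$ to $C_3$, and inputs $A$, $A_1$, $A_2$, $A_3$, annotated with the bit values of this example.]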

  13. Example of multiplication on mesh of trees

     • $C_i$ contains the column sum, at most $\log n$ bits long.
     • Note that $C_i$ starts from the $i$-th bit, so the $k$-th bit of $C_i$ is pipelined to row $i+k$.
     • Each bit $c_i$ is computed at $(i, i)$.
     • The pipelined values are added one by one using the redundant binary system, each in a constant step. Then the number is converted to a binary number.

     Total time:
     • $2 \log n$: $R_i$
     • $2 \log n$: add to the $(i, i)$ location
     • $2 \log n$: convert to binary

     [Figure: the example mesh for $A = 0111$, $B = 1011$ (rows $R_0$ to $R_3$, inputs $A$, $A_1$, $A_2$, $A_3$), with column sums $C_0 = 1$, $C_1 = 10$, $C_2 = 10$, $C_3 = 10$.]
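The column sums in the example can be reproduced with a small software model. This only mimics the arithmetic, not the mesh-of-trees hardware or its pipelining; `column_sums` and `combine_columns` are hypothetical names, and only the lowest $n$ columns are kept, as on the slide.

```python
def column_sums(a: int, b: int, n: int) -> list[int]:
    """C_i = number of 1s in column i, where row i holds A shifted by i bits if b_i = 1."""
    sums = [0] * n
    for i in range(n):
        if (b >> i) & 1:
            for j in range(n - i):  # bit j of A lands in column i + j
                sums[i + j] += (a >> j) & 1
    return sums


def combine_columns(sums: list[int], n: int) -> int:
    """Add the column sums, C_i starting at bit i, and keep the lowest n bits."""
    total = sum(c << i for i, c in enumerate(sums))
    return total & ((1 << n) - 1)


if __name__ == "__main__":
    sums = column_sums(0b0111, 0b1011, 4)
    print([format(c, "b") for c in sums])            # ['1', '10', '10', '10']
    print(format(combine_columns(sums, 4), "04b"))   # 1101, the low 4 bits of 7 * 11 = 77
```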

  14. Integer Division
     • Not as easy
     • An $O(\log n)$ algorithm exists with table lookup
     • Does a hardware circuit exist? => open question

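  15. Newton-Raphson Method

     To find $1/D$, let $f(x) = 1/x - D$.

     To solve $f(x) = 0$, iterate $x_{i+1} = x_i - f(x_i)/f'(x_i)$, which for this $f$ gives $x_{i+1} = x_i (2 - D x_i)$.

     The Newton-Raphson method converges quadratically. That is, $\epsilon_{i+1} = c\,\epsilon_i^2$ for some constant $c$, where $\epsilon_i = |x_i - 1/D|$ is the error after $i$ iterations.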

  16. E.g., when $D = 4$, set $x_0 = 0.4$:

     | $i$ | $x_i$ | $\epsilon_i$ |
     |---|---|---|
     | 0 | 0.4 | 0.15 |
     | 1 | 0.16 | 0.09 |
     | 2 | 0.2176 | 0.0324 |
     | 3 | 0.245801 | 0.004199 |

     To get an $n$-digit-precision reciprocal of $D$, we need $\log n$ iterations.
     • 1st iteration: 1 digit correct
     • 2nd iteration: 2 digits correct
     • 3rd iteration: 4 digits correct
     • $\log n$ iterations: $n$ digits correct
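The numbers in this example can be regenerated with a few lines of Python. The update $x_{i+1} = x_i (2 - D x_i)$ used below is an assumption in the sense that the transcript does not show the formula; it is what Newton's method gives for $f(x) = 1/x - D$, and it reproduces the slide's values. The function name `reciprocal_iterates` is made up for the sketch, and floating point stands in for fixed-point hardware.

```python
def reciprocal_iterates(d: float, x0: float, steps: int) -> list[float]:
    """Newton iterates for 1/d, starting from x0, using x <- x * (2 - d * x)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x * (2 - d * x))  # only multiplications and a subtraction, no division
    return xs


if __name__ == "__main__":
    d = 4
    for i, x in enumerate(reciprocal_iterates(d, 0.4, 3)):
        error = abs(x - 1 / d)
        print(f"x{i} = {x:.6f}   error = {error:.6f}")
```

For this particular $f$ the error satisfies $\epsilon_{i+1} = D\,\epsilon_i^2$ exactly, so the starting guess must satisfy $D\,\epsilon_0 < 1$ for the iteration to converge.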

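  17. Proof that the Newton-Raphson method converges quadratically.

     Let $X$ be the solution of $f(x) = 0$. By Taylor's theorem,

     $f(X) = f(x_i) + f'(x_i)(X - x_i) + \tfrac{1}{2} f''(\xi)(X - x_i)^2$, where $\xi$ lies between $x_i$ and $X$.

     But $x_{i+1} = x_i - f(x_i)/f'(x_i)$. Since $f(X) = 0$, we have

     $0 = f(x_i) + f'(x_i)(X - x_i) + \tfrac{1}{2} f''(\xi)(X - x_i)^2$.

     Dividing by $f'(x_i)$ and substituting the Newton step gives

     $X - x_{i+1} = -\dfrac{f''(\xi)}{2 f'(x_i)} (X - x_i)^2$.

     Thus, with $\epsilon_i = |X - x_i|$, since $f''(\xi)$ is bounded and $f'(x_i)$ is bounded away from zero, $|\epsilon_{i+1}| \le c\,|\epsilon_i|^2$ for some $c$.

     For $f(x) = 1/x - D$, $\dfrac{f''(\xi)}{2 f'(x_i)} = -\dfrac{x_i^2}{\xi^3}$ is bounded if $D \neq 0$.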

  18. Complexity
     • Each multiplication (*): $O(\log n)$ time
     • To obtain $n$-digit precision: $O(\log n)$ iterations
     • => $O(\log^2 n)$ complexity
     • $A/D$ => $A \cdot (1/D)$
     • Question: is there a $\log n$ algorithm for division?
