
Ernest Jamro Kat. Elektroniki AGH, Kraków Dep. Of Electronics, AGH

Hardware Implementation of Algorithms: Multipliers, Convolvers

Presentation Transcript


  1. Hardware Implementation of Algorithms: Multipliers, Convolvers. Ernest Jamro, Dep. of Electronics, AGH, Kraków

  2. Multiplication: 9 x 11 = 99

            1 0 0 1     (9)
          x 1 0 1 1     (11)
          ---------
            1 0 0 1
          1 0 0 1
        0 0 0 0
    + 1 0 0 1
    ---------------
      1 1 0 0 0 1 1     (99)
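
The partial-product scheme on this slide can be sketched in Python as follows. This is only an illustrative software model of unsigned shift-and-add multiplication, not a hardware description; the names `partial_products` and `multiply` are made up for this example.

```python
def partial_products(a, b):
    """Return the shifted partial products of a * b (unsigned integers).

    Bit i of the multiplier b selects either (a << i) or 0.
    """
    products = []
    shift = 0
    while b >> shift:
        bit = (b >> shift) & 1
        products.append((a << shift) if bit else 0)
        shift += 1
    return products


def multiply(a, b):
    """Sum the partial products, as an array multiplier does with adders."""
    return sum(partial_products(a, b))


# The slide's example: 1001 x 1011
for row in partial_products(0b1001, 0b1011):
    print(format(row, "07b"))
print(format(multiply(0b1001, 0b1011), "07b"))
```

A parallel array multiplier would compute all rows at once, while a sequential multiplier would produce one row per clock cycle and accumulate it.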

  3. Parallel Array Multipliers

  4. FPGA, Built-in multiplier DSP48

  5. Sequential Multiplier

  6. Wallace Tree Multiplier (with Carry Save Adders). In FPGAs, the use of CSAs is not recommended.

  7. Multiplication of signed numbers
     • Sign-magnitude: standard unsigned multiplication of the magnitudes; Sign = Sign1 xor Sign2
     • Two's complement: (a1+a2)*(b1+b2) = a1b1 + a1b2 + a2b1 + a2b2
     • C. R. Baugh and B. A. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. Comput., vol. C-22, pp. 1045–1047, Dec. 1973.
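
Both signed schemes on this slide can be sketched in Python. The slide does not define a1, a2, b1, b2; the sketch assumes that each n-bit two's complement operand is split into its negatively weighted sign bit (a1 = -msb * 2^(n-1)) and its remaining n-1 bits (a2), which is one common reading of the four-term expansion. The function names are hypothetical.

```python
def sign_magnitude_multiply(sign1, mag1, sign2, mag2):
    """Sign bits are 0 (positive) or 1 (negative); magnitudes are unsigned."""
    return sign1 ^ sign2, mag1 * mag2


def twos_complement_multiply(a_bits, b_bits, n):
    """Multiply two n-bit two's complement bit patterns.

    Assumption: x = x1 + x2, where x1 = -msb * 2**(n-1) and x2 is the
    unsigned value of the lower n-1 bits.
    """
    def split(x):
        msb = (x >> (n - 1)) & 1
        low = x & ((1 << (n - 1)) - 1)
        return -(msb << (n - 1)), low

    a1, a2 = split(a_bits)
    b1, b2 = split(b_bits)
    # (a1 + a2) * (b1 + b2) expanded into four partial results
    return a1 * b1 + a1 * b2 + a2 * b1 + a2 * b2


print(sign_magnitude_multiply(1, 3, 0, 5))         # sign bit and magnitude for -3 * 5
print(twos_complement_multiply(0b1101, 0b0011, 4)) # -3 * 3 with 4-bit operands
```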

  8. Two's complement multiplication

  9. Reduced-width multiplier

  10. Truncation error compensation

  11. Constant Coefficient Multiplier: Look-Up Table (LUT). Example: Y = 5*X

      Address  Data
      0        0
      1        5
      2        10
      3        15
      ...
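
The look-up table for Y = 5*X can be generated with a few lines of Python. This is a sketch only; the 4-bit input width and the name `build_constant_lut` are assumptions made for the example, since the slide does not state the table depth.

```python
def build_constant_lut(coefficient, input_width):
    """Precompute coefficient * address for every possible input value."""
    return [coefficient * address for address in range(1 << input_width)]


lut = build_constant_lut(5, 4)  # assumed 4-bit input, so 16 entries
for address in range(4):
    print(address, lut[address])
```

Multiplication then becomes a single memory read: the input X is the address and the stored word is the product.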

  12. LUT-based Multiplier, Constant Coefficient C: Y = C·A = C·A(0:3) + 2^4·C·A(4:7)
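
Splitting an 8-bit input into two 4-bit halves lets one 16-entry table serve both halves. The sketch below assumes that A(0:3) denotes the four least significant bits and A(4:7) the four most significant bits; the coefficient value 93 and the names `COEFFICIENT`, `LUT`, and `lut_multiply` are hypothetical.

```python
COEFFICIENT = 93                                  # hypothetical constant C
LUT = [COEFFICIENT * i for i in range(16)]        # C * (4-bit value)


def lut_multiply(a):
    """Compute C * a for an 8-bit a using two 4-bit table look-ups."""
    low = a & 0xF            # A(0:3), assumed to be the low nibble
    high = (a >> 4) & 0xF    # A(4:7), assumed to be the high nibble
    return LUT[low] + (LUT[high] << 4)   # C*A(0:3) + 2^4 * C*A(4:7)


print(lut_multiply(200), COEFFICIENT * 200)
```

Two 16-word tables and one adder replace a single 256-word table, which is the trade-off between ROM size and adder count that the following slides explore.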

  13. Different ROM sizes. Input data width = 6 bits

  14. Heterogeneous memory usage. Virtex: 16×1, 32×1, 4k×1, 2k×2, 1k×4, 512×8, 256×16. Input data and coefficient width = 14

  15. Exchanging distributed RAM for BRAM (figure labels: CLB, BRAM)

  16. Equivalent cost of 1 BRAM (figure: area in CLBs for a CLB-only design, scale 1:10, and number of BRAMs, for different input and coefficient widths K)

  17. MM (Multiplierless Multiplication)
      • Binary representation, example: B = 14 = 1110_2
        M = A·B = (A<<1) + (A<<2) + (A<<3)
      • Sub-structure Sharing (SS), example: B = 27 = 11011_2
        tmp = A + (A<<1)
        M = A·B = tmp + (tmp<<3)
      • Canonic Signed Digit (CSD)
        digit set {0, 1, -1} (0: no operation, 1: addition, -1, written as 1 with an overbar: subtraction)
        example: B = 7 = 111_2 = 100(-1) in CSD
        binary: M = B·A = (A<<2) + (A<<1) + A
        CSD:    M = B·A = (A<<3) - A
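
The three multiplierless techniques on this slide can be checked with a short Python sketch. The helper names (`multiply_by_14`, `multiply_by_27`, `to_csd`, `csd_multiply`) are invented for this example, and the CSD conversion shown is one standard recoding procedure, not necessarily the one used by the author.

```python
def multiply_by_14(a):
    """Binary representation: 14 = 1110_2, three shifted copies added."""
    return (a << 1) + (a << 2) + (a << 3)


def multiply_by_27(a):
    """Sub-structure sharing: 27 = 11011_2 = 3 + 3 * 8."""
    tmp = a + (a << 1)          # tmp = 3 * a, computed once
    return tmp + (tmp << 3)     # 3a + 24a


def to_csd(b):
    """Return CSD digits of a non-negative integer, least significant first."""
    digits = []
    while b:
        if b & 1:
            d = 2 - (b & 3)     # +1 if b mod 4 == 1, -1 if b mod 4 == 3
            b -= d
        else:
            d = 0
        digits.append(d)
        b >>= 1
    return digits


def csd_multiply(a, b):
    """Multiply using only shifts, additions and subtractions."""
    result = 0
    for shift, d in enumerate(to_csd(b)):
        if d == 1:
            result += a << shift
        elif d == -1:
            result -= a << shift
    return result


print(to_csd(7))            # digits for 7, least significant first
print(csd_multiply(5, 7))   # (5 << 3) - 5
```

For B = 7 the binary form needs two additions, while the CSD form needs a single subtraction.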

  18. Binary vs. CSD: insert the symbol '-1' only if the total number of operations is reduced (figure columns: Standard, Modified)

  19. Application of different MM techniques

  20. The MM cost for different coefficients

  21. FIR Filters

  22. FIR Filter (indirect / transposed form)

  23. FIR 2D

  24. Examples of 2D FIR Filters

      Low-Pass       Sobel          Laplace
       1  2  1        1  0 -1        1  1  1
       2  4  2        2  0 -2        1 -8  1
       1  2  1        1  0 -1        1  1  1
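
A small Python sketch shows how such 3×3 kernels are applied. It assumes the conventional coefficient layouts for the three filters named on the slide, computes only the region where the window fits entirely inside the image, applies the kernel without flipping it, and omits normalization; the name `fir_2d` is hypothetical.

```python
LOW_PASS = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
SOBEL = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
LAPLACE = [[1, 1, 1], [1, -8, 1], [1, 1, 1]]


def fir_2d(image, kernel):
    """Slide a 3x3 kernel over the image and sum the weighted pixels."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 2):
        out_row = []
        for c in range(cols - 2):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


flat = [[7] * 4 for _ in range(4)]   # constant test image
print(fir_2d(flat, LOW_PASS))        # each output is 16 * 7 (coefficients sum to 16)
print(fir_2d(flat, SOBEL))           # coefficients sum to 0, so a flat image gives 0
```

All coefficients here are small integers or powers of two, which is why these kernels suit shift-and-add hardware.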

  25. FIR Filter, N=2, LUT-based multipliers (block diagram: 8-bit input In and delay z^-1, each split into 4-bit halves; LUT M0, LUT L0, LUT M1, LUT L1; Adder0 and Adder1 forming Multiplier 1 and Multiplier 2; Adder2 in the Adders Block)

  26. FIR, arithmetic in a different order: (Parallel) Distributed Arithmetic (figure labels: different bits of the input; input; coefficient)

  27. Distributed Arithmetic: the same input bit weight (smaller LUT widths)
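
The idea of grouping input bits of the same weight can be illustrated with a minimal Python sketch of distributed arithmetic. It assumes unsigned input samples and a single window of the filter; the names `build_da_lut` and `da_dot_product` and the sample values are made up for the example.

```python
def build_da_lut(h):
    """Table entry addr holds the sum of h[k] for every set bit k of addr."""
    n = len(h)
    return [sum(h[k] for k in range(n) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot_product(da_lut, x_window, width):
    """Compute sum(h[k] * x[k]) one bit plane at a time (unsigned x)."""
    acc = 0
    for b in range(width):
        addr = 0
        for k, xk in enumerate(x_window):
            addr |= ((xk >> b) & 1) << k   # bit b of every sample forms the address
        acc += da_lut[addr] << b           # weight the table output by 2^b
    return acc


h = [5, 13, 5]
lut = build_da_lut(h)
print(da_dot_product(lut, [1, 2, 3], 8))
print(sum(hk * xk for hk, xk in zip(h, [1, 2, 3])))
```

The table is addressed by one bit from each tap, so its depth depends on the number of taps rather than on the input word width.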

  28. Linear Phase FIR Filters (symmetric: h(0)=h(N-1), h(1)=h(N-2), ...)
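
Coefficient symmetry lets samples that share a coefficient be added before multiplication, roughly halving the number of multipliers. The sketch below computes one output sample both ways; the function names and the test values are assumptions for illustration only.

```python
def fir_direct(h, x_window):
    """One output sample: N multiplications."""
    return sum(hk * xk for hk, xk in zip(h, x_window))


def fir_symmetric(h, x_window):
    """One output sample using h[k] == h[N-1-k]: about N/2 multiplications."""
    n = len(h)
    acc = 0
    for k in range(n // 2):
        acc += h[k] * (x_window[k] + x_window[n - 1 - k])   # pre-add, then multiply
    if n % 2:
        acc += h[n // 2] * x_window[n // 2]                 # unpaired middle tap
    return acc


h = [5, 13, 5]
x = [1, 2, 3]
print(fir_direct(h, x), fir_symmetric(h, x))
```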

  29. FPGA, Built-in multiplier DSP48

  30. Example of sub-structure sharing for FIR filters
      H(z) = 5 + 13z^-1 + 5z^-2 = 101_2 + 1101_2·z^-1 + 101_2·z^-2
      Example 1: A = 5 = 101_2 (temporary expression)
        H(z) = A + (1000_2 + A)z^-1 + Az^-2
      Example 2: A = 1 + z^-1
        H(z) = 5A + 8z^-1 + 5z^-2
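
Both sharing examples can be verified against the plain filter y[n] = 5x[n] + 13x[n-1] + 5x[n-2] with a short Python sketch. Zero initial state is assumed, and the function names are hypothetical.

```python
def fir_reference(x):
    """y[n] = 5 x[n] + 13 x[n-1] + 5 x[n-2], zero initial state."""
    p = [0, 0] + list(x)
    return [5 * p[n + 2] + 13 * p[n + 1] + 5 * p[n] for n in range(len(x))]


def fir_shared_coefficient(x):
    """Example 1: A = 5x is computed once per sample and reused by every tap."""
    p = [0, 0] + list(x)
    a = [v + (v << 2) for v in p]                 # A = x + 4x = 5x
    return [a[n + 2] + ((p[n + 1] << 3) + a[n + 1]) + a[n]
            for n in range(len(x))]               # 13x = 8x + A


def fir_shared_taps(x):
    """Example 2: A = 1 + z^-1, so a[n] = x[n] + x[n-1]."""
    p = [0, 0] + list(x)
    out = []
    for n in range(len(x)):
        a = p[n + 2] + p[n + 1]
        out.append(5 * a + 8 * p[n + 1] + 5 * p[n])
    return out


samples = [1, 2, 3, 4]
print(fir_reference(samples))
print(fir_shared_coefficient(samples))
print(fir_shared_taps(samples))
```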

  31. The END. Additional materials

  32. Fast multiplication in FPGAs: 2^6·(2·a7·b + a6·b)

  33. Multipliers in FPGAs: (a7 and b_i) xor (a6 and b_{i+1}). Example: G4 = a7, G3 = b_i, G2 = a6, G1 = b_{i+1}; F4 = a7, F3 = b_{i-1}, F2 = a6, F1 = b_i. Fragment of a Virtex Configurable Logic Block (CLB)
