Enhancing FPGA Performance for Arithmetic Circuits

This paper discusses the state of the art in FPGA technology, proposes a solution using a Field Programmable Counter Array (FPCA), and presents experimental results. It demonstrates significant improvement in area utilization and highlights the importance of using counters as building blocks for multi-input additions.




Presentation Transcript


  1. Enhancing FPGA Performance for Arithmetic Circuits • Philip Brisk (1), Ajay K. Verma (1), Paolo Ienne (1), Hadi Parandeh-Afshar (1,2) • (1) Department of Electrical and Computer Engineering • (2) University of Tehran

  2. Outline • State of the Art: FPGAs • Proposed Solution • Field Programmable Counter Array (FPCA) • New Lattice for Accelerating Arithmetic Computations • Integrate on Same Die as FPGA • Experimental Results • Conclusion

  3. FPGA vs. ASIC • Criteria Compared: Performance • Area Utilization • Power Consumption • Flexibility • Time-to-Market • [Table: ASIC and FPGA columns with one check mark (√) per criterion]

  4. FPGA Commentary • Poor Performance for Arithmetic Operations Compared to ASIC • IP Cores • Limited Flexibility; 18-bit Adder/Multiplier • Full Adder Implemented in CLB Structure • Fast Carry-Chain (Xilinx and Altera) • Reduces Routing Delay • Cannot Use Compressor Trees to Add k>2 Values • Wallace/Dadda/3-Greedy
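
To make the adder-tree versus compressor-tree contrast on this slide concrete, here is a minimal Python sketch of carry-save (3:2) reduction on whole integers. It is an illustration only, not code from the presentation; the function names `carry_save_add` and `add_many` are hypothetical. Each 3:2 step turns three operands into two without propagating a carry, so only the last step needs a carry-propagating adder.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Compress three non-negative integers into two with the same total.

    Every bit position acts as an independent full adder (3:2 counter),
    so no carry ripples across bit positions in this step.
    """
    partial_sum = a ^ b ^ c                        # per-bit sum, carries ignored
    carries = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, moved up one weight
    return partial_sum, carries


def add_many(values: list[int]) -> int:
    """Add k > 2 values using 3:2 steps plus one carry-propagating addition."""
    operands = list(values)
    while len(operands) > 2:
        a, b, c = operands.pop(), operands.pop(), operands.pop()
        operands.extend(carry_save_add(a, b, c))
    return sum(operands)  # the single carry-propagating addition
```

The loop reduces one triple at a time for brevity; a hardware compressor tree would arrange the same 3:2 steps in parallel layers.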

  5. Proposed Solution • Transform a DFG to Expose Multi-Input Addition Ops • [Verma and Ienne, ICCAD ’04] • Map Addition Ops onto New Lattice (FPCA) • Proposed Here • Map Everything Else onto Traditional FPGA • Standard Approach • Integrate FPGA+FPCA Onto Same Die • Ongoing Research at EPFL
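
The first step on this slide, exposing multi-input additions in a data-flow graph, can be illustrated in its simplest case: a chain of binary additions merged into one multi-input sum. The sketch below is a toy under stated assumptions and is not the Verma-Ienne transformation itself, which the slides do not spell out. Expressions are assumed to be nested tuples such as `("+", x, y)`, and the `"sum"` node label is hypothetical.

```python
def addends(expr):
    """Collect the operands of a chain of nested binary additions.

    Any node that is not an addition (a leaf name, a shift, ...) is
    treated as opaque and returned as a single operand.
    """
    if isinstance(expr, tuple) and expr[0] == "+":
        result = []
        for operand in expr[1:]:
            result.extend(addends(operand))
        return result
    return [expr]


# (a + b) + ((c >> 1) + d): three binary additions in the original graph
expr = ("+", ("+", "a", "b"), ("+", (">>", "c", 1), "d"))

# One four-input addition; under the proposed flow this node would be the
# candidate for the FPCA, while the shift would stay on the FPGA.
merged = ("sum", *addends(expr))
```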

  6. Verma-Ienne Transformation [ICCAD ’04] • Example: ADPCM • [Figure: ADPCM data-flow graph computing vpdiff, before and after the transformation; the transformed graph feeds a Compressor Tree (∑) and a final adder (+)]

  7. Proposed Hybrid Lattice • FPCA: Field Programmable Counter Array • Novel Lattice for Accelerating Large Sums • [Figure: FPGA beside FPCA (∑), followed by a Final Adder (+) in Programmable IP or FPGA]

  8. Counters • Counters You Know • 2:2 – Half Adder • 3:2 – Full Adder • Count the Number of Input Bits Set to 1 • Output the Count as a Binary Value (Carry-Save Adder) • m:n Counter: m Inputs, n Outputs • n = ⌈log2(m+1)⌉ • The Correct Building Block for Computing Sums of k>2 Numbers • Better than LUTs!
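
The m:n counter described on this slide can be sketched behaviorally in a few lines of Python. This is an illustration, not the presenters' circuit, and the names `counter_output_width` and `counter` are hypothetical. The output width is log2(m+1) rounded up to a whole number of bits, so a 2:2 half adder, a 3:2 full adder, and a 7:3 counter all come out as expected.

```python
def counter_output_width(m: int) -> int:
    """Number of output bits n needed to represent any count from 0 to m.

    For m >= 1, m.bit_length() equals ceil(log2(m + 1)).
    """
    return m.bit_length()


def counter(bits: list[int]) -> list[int]:
    """Behavioral m:n counter: count the 1s and return the count in binary.

    Output bits are listed least-significant first, so for a 3:2 counter
    the result is [sum, carry], exactly a full adder.
    """
    n = counter_output_width(len(bits))
    count = sum(bits)
    return [(count >> i) & 1 for i in range(n)]
```

For example, `counter([1, 1, 1])` gives `[1, 1]` (three ones, binary 11), and `counter([1, 1])` gives `[0, 1]`, which matches a half adder producing sum 0 and carry 1.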

  9. Field Programmable Counter Array (FPCA) • Same Structure as an FPGA • Replace CLBs with Counters • Integrate onto Same Die as FPGA • [Figure: FPGA lattice of CLBs beside FPCA lattice of Counters]

  10. Experimental Methodology • Xilinx Virtex-4, Altera Stratix-II, With/Without FPCA • 90nm CMOS Technology • For Multi-Input Addition Ops • FPGA – Adder Tree • Binary Adders in Virtex-4 • Ternary Adders in Stratix-II • FPCA – Build Compressor Trees From Counters • Use Modified Wallace Algorithm • Place-and-Route Using VPR • Use FPGA for Final Addition
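
The FPCA flow on this slide (build a compressor tree from counters, then use a final adder) can be sketched as a column-wise reduction. The code below is a plain Wallace-style layering written for illustration; it is not the "Modified Wallace Algorithm" used in the experiments, whose details the slides do not give. All function names and the `max_inputs` parameter are assumptions.

```python
def to_columns(values: list[int], width: int) -> list[list[int]]:
    """Arrange the operands as bit columns; columns[w] holds bits of weight 2**w."""
    return [[(v >> w) & 1 for v in values] for w in range(width)]


def compress_columns(columns, max_inputs=3):
    """Reduce every column to at most two bits using m:n counters.

    Each layer groups a column's bits into chunks of up to max_inputs.
    A chunk of three or more bits goes through a counter whose output
    bits land in the next layer at increasing weights; smaller chunks
    pass through unchanged. Returns (columns, number_of_layers).
    """
    if max_inputs < 3:
        raise ValueError("a counter needs at least 3 inputs to compress")
    cols = [list(c) for c in columns]
    layers = 0
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + max_inputs.bit_length())]
        for weight, col in enumerate(cols):
            for start in range(0, len(col), max_inputs):
                group = col[start:start + max_inputs]
                if len(group) <= 2:
                    nxt[weight].extend(group)          # nothing to gain, pass through
                else:
                    count = sum(group)                  # the counter's function
                    for k in range(len(group).bit_length()):
                        nxt[weight + k].append((count >> k) & 1)
        while nxt and not nxt[-1]:
            nxt.pop()                                   # drop unused high columns
        cols = nxt
        layers += 1
    return cols, layers


def final_add(cols) -> int:
    """Carry-propagating addition of the two remaining rows."""
    row_a = sum(col[0] << w for w, col in enumerate(cols) if len(col) > 0)
    row_b = sum(col[1] << w for w, col in enumerate(cols) if len(col) > 1)
    return row_a + row_b


def multi_input_add(values, width, max_inputs=3):
    cols, _ = compress_columns(to_columns(values, width), max_inputs)
    return final_add(cols)
```

Each counter emits fewer bits than it consumes, so the total bit count shrinks every layer and the loop terminates. Raising `max_inputs` (for example to 6 or 7) models larger counters and typically needs fewer layers.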

  11. Experimental Results • [Chart: Delay (ns)]

  12. Experimental Results

  13. Experimental Results • [Charts: Delay (ns) for Virtex-4 and Stratix-II]

  14. Experimental Results • FPCA – Register Placed on Every Counter Output • [Charts: Virtex-4 and Stratix-II]

  15. Experimental Results

  16. Experimental Results

  17. Experimental Results

  18. Conclusion • FPGA Performance for Arithmetic Circuits is Lacking • Hybrid FPGA/FPCA Accelerates Arithmetic Circuits • Significant Improvement in Area Utilization • Counters are the Correct Building Blocks for Multi-Input Additions • Marginal Improvements in Delay • FPGA – Fast Carry-Chain (No Routing Delay) • FPCA – All Wires Have Routing Delays • Naïve/No Retiming in These Experiments