
Mapping into LUT Structures

Sayak Ray, Alan Mishchenko, Niklas Een, Robert Brayton (Department of EECS, UC Berkeley); Stephen Jang, Chao Chen (Agate Logic Inc.). Contributions (in a nutshell): a new mapping algorithm for FPGAs, which maps into LUT structures instead of LUTs.

Presentation Transcript


  1. Mapping into LUT Structures
     Sayak Ray, Alan Mishchenko, Niklas Een, Robert Brayton (Department of EECS, UC Berkeley)
     Stephen Jang, Chao Chen (Agate Logic Inc.)

  2. Contributions (in a nutshell)
     • New mapping algorithm for FPGAs, which maps into LUT structures instead of LUTs
     • It has two applications:
       (1) Improving the quality of mapping into LUTs
           • Area improves by 7.4% on average
           • Delay improves by 11.3% on average
       (2) Improving delay for specialized hardware, which supports non-routable connections
           • Delay improves by 40% on average, with some area penalty

  3. LUT Structure
     • LUT structure: a group of LUTs connected by direct, non-routable wires
     • [Figure: two example LUT structures with their non-routable wires labeled: the 7-input LUT structure “44” and the 10-input LUT structure “444”]
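
The input counts quoted on this slide (7 for “44”, 10 for “444”) are consistent with a chain in which every LUT after the first spends one input on the non-routable wire from its predecessor. The sketch below assumes that chain topology; the function name `structure_inputs` is hypothetical and not taken from the presentation.

```python
def structure_inputs(name):
    """Total external inputs of a chained LUT structure such as "44" or "444".

    Assumption (not stated in the slides): the LUTs form a chain, and each
    LUT after the first uses exactly one input for the non-routable wire.
    """
    sizes = [int(ch) for ch in name]
    return sum(sizes) - (len(sizes) - 1)


inputs_44 = structure_inputs("44")
inputs_444 = structure_inputs("444")
```

Under this assumption the two structures on the slide come out at 7 and 10 inputs, matching the figure labels.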

  4. Some Terminology
     • Let f(X) be a Boolean function
     • Let X1 ⊆ X be a subset of its support
     • Suppose {q1(X), q2(X), …, qμ(X)} is the set of distinct cofactors of f w.r.t. X1
     • μ is called the column multiplicity of f w.r.t. X1
     • Given a partition of X into two disjoint subsets X1 and X2, we say that an Ashenhurst-Curtis decomposition of f(X) exists if f(X) can be expressed as
         f(X) = h(g1(X1), g2(X1), …, gk(X1), X2)
     • X1: bound set
     • X2: free set
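
To make the definition concrete, the sketch below counts the distinct cofactors of a small function with respect to a chosen bound set by enumerating every bound-set assignment. It is a brute-force illustration only, not the truth-table routine used by the authors; `column_multiplicity` and its parameters are hypothetical names.

```python
from itertools import product


def column_multiplicity(f, n, bound):
    """Count distinct cofactors of f with respect to the bound-set variables.

    f     : callable taking a tuple of n bits and returning 0 or 1
    n     : number of variables in the support
    bound : indices of the bound-set variables (X1); the rest form X2
    """
    free = [i for i in range(n) if i not in bound]
    cofactors = set()
    for b_bits in product((0, 1), repeat=len(bound)):
        column = []
        for f_bits in product((0, 1), repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b_bits):
                x[i] = v
            for i, v in zip(free, f_bits):
                x[i] = v
            column.append(f(tuple(x)))
        # One cofactor = the values of f over all free-set assignments.
        cofactors.add(tuple(column))
    return len(cofactors)


# Example function (illustrative): f(a, b, c) = (a AND b) XOR c
f = lambda x: (x[0] & x[1]) ^ x[2]
mu_ab = column_multiplicity(f, 3, bound=[0, 1])  # bound set {a, b}
mu_ac = column_multiplicity(f, 3, bound=[0, 2])  # bound set {a, c}
```

With the bound set {a, b} only two distinct cofactors appear (c and its complement), so a single function g(a, b) suffices. With {a, c} all four cofactors differ, which makes that bound set a poor choice.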

  5. Flow of performLutMatchingXY
     1 SupportMinimize: removes vacuous variables
     2 findOutputDecomposition: checks for f = x  G
       • Variable reordering in truth table
       • Allows cases μ = 2, 3, 4
       • For μ = 3, 4, consider special decomposition with one shared variable only
     3 findGoodBoundSet
     4 checkSpecialNonDisjoint
     5 reverseVariableOrder: a heuristic to find a suitable decomposition
     6 findGoodBoundSet
     7 checkSpecialNonDisjoint
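
The first step of the flow removes vacuous variables, that is, variables the function does not actually depend on. A minimal sketch of that test on an integer truth table follows. It assumes bit m of `tt` holds the function value at minterm m, with variable i stored in bit i of m. The helper `vacuous_variables` is hypothetical and is not the authors' SupportMinimize code.

```python
def vacuous_variables(tt, n):
    """Return the indices of variables an n-input function does not depend on.

    tt : integer truth table; bit m is the function value at minterm m,
         and variable i is bit i of m (an assumed encoding).
    """
    vacuous = []
    for i in range(n):
        depends = False
        for m in range(1 << n):
            if (m >> i) & 1:
                continue  # visit each pair (x_i = 0, x_i = 1) once
            if ((tt >> m) & 1) != ((tt >> (m | (1 << i))) & 1):
                depends = True
                break
        if not depends:
            vacuous.append(i)
    return vacuous


# Example (illustrative): f(a, b, c) = a XOR c, so b is vacuous.
tt = sum(((m & 1) ^ ((m >> 2) & 1)) << m for m in range(8))
unused = vacuous_variables(tt, 3)
```

A variable is vacuous exactly when its two cofactors are equal, so the loop compares the function value at each minterm with the value at the minterm that differs only in that variable.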

  6. Checking for XYZ decomposition
     • X, Y, and Z are the sizes of the main/fanin LUTs
     • Two-step process:
       1 Check for XW, where W = Y + Z - 2
       2 If it exists, check the remainder function G for YZ
     • The priority-cut-based technology mapper is modified to accommodate the algorithm for XY and XYZ
     • The results of decomposition checking are cached
       • This substantially reduces runtime on large designs
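
The caching mentioned on this slide can be pictured as memoizing the decomposition check on the truth table, so that a function seen on many cuts is analyzed only once. The sketch below uses Python's `functools.lru_cache` for that purpose. The check itself (`has_simple_decomposition`, which looks for any bound set with column multiplicity at most 2) is a hypothetical stand-in, not the authors' XY or XYZ test, and the truth-table encoding is an assumption.

```python
from functools import lru_cache
from itertools import combinations

calls = 0  # how many times the expensive check actually ran


def multiplicity(tt, n, bound):
    """Column multiplicity of an integer truth table w.r.t. a bound set."""
    free = [i for i in range(n) if i not in bound]
    columns = set()
    for b in range(1 << len(bound)):
        col = 0
        for fr in range(1 << len(free)):
            m = 0
            for j, i in enumerate(bound):
                m |= ((b >> j) & 1) << i
            for j, i in enumerate(free):
                m |= ((fr >> j) & 1) << i
            col |= ((tt >> m) & 1) << fr
        columns.add(col)
    return len(columns)


@lru_cache(maxsize=None)
def has_simple_decomposition(tt, n, bound_size):
    """Stand-in check: does any bound set of this size have multiplicity <= 2?"""
    global calls
    calls += 1
    return any(multiplicity(tt, n, b) <= 2
               for b in combinations(range(n), bound_size))


# Truth table of (a AND b) XOR c, with variable i in bit i of the minterm.
tt = sum((((m & 1) & ((m >> 1) & 1)) ^ ((m >> 2) & 1)) << m for m in range(8))
first = has_simple_decomposition(tt, 3, 2)
second = has_simple_decomposition(tt, 3, 2)  # served from the cache
```

Keying the cache on the raw truth table only produces a hit for identical functions. Mapping equivalent functions to a common representative before the lookup would raise the hit rate.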

  7. Experiment 1

  8. Experiment 2

  9. Experiment 3

  10. Experiment 4 – Delay Optimization

  11. Experiment 5 – Delay Optimization

  12. Experiment 6 – Delay Optimization

  13. Experiment 7 – Industrial Design

  14. Experiment 8 – Industrial Design

  15. Future Work
     • Improving implementation
     • Handling delay-driven decomposition
       • Currently we ignore arrival time and only care about detecting any decomposition
     • Using semi-canonical form to increase the number of hits in the hash table of computed results
     • Making truth-table-based decomposition even faster
     • Combining Boolean decomposition into LUT structures with structural mapping of LUTs into clusters
     • Evaluating results after place and route
       • This will be especially interesting when specialized hardware is available

  16. Questions
