
EDA (CS286.5b)



Presentation Transcript


  1. EDA (CS286.5b) Day 19 Covering and Retiming

  2. “Final” • Like Assignment #1 • longer • more breadth • focus on material since Assignment #2 • …but ideas are cumulative • open notes, books, etc. • Made available before 12:01am Mon. 3/13 • Due 11:59pm Wed. 3/15

  3. Previously • Cover (map) LUTs for minimum delay • day 3 • solve optimally • Retiming for minimum clock period • day 18 • solve optimally • Simultaneous Cover and 1D placement • day 9 • optimal area cover for trees

  4. Today • Solving cover/retime separately not optimal • Cover+retime • Review

  5. Example

  6. Example

  7. Example: Retimed

  8. Example: Retimed Note: only 4 signals here (2 w/ 2 delays each)

  9. Example 2

  10. Example 2 Cycle Bound: 2

  11. Example 2: retimed

  12. Example 2: retimed Cycle Bound: 1

  13. Basic Observation • Registers break up circuit, limiting coverage • fragmentation • prevent grouping

  14. Phase Ordering Problem • General problem we’ve seen before • e.g. placement • don’t know where connected neighbors will be if unplaced… • don’t know effect/results of other mapping step • Here • if retime first: don’t know delay (what can be packed into LUT) • if don’t retime first: fragmentation, forced breaks at bad places

  15. Observation #1 • Retiming flops to input of (fanout free) subgraph is trivial (and always doable)

  16. Observation #1: Consequence • Can cover ignoring flop placement • Then retime flops to LUT inputs
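A minimal sketch of Observation #1, assuming the fanout-free subgraph is given as a tree in a hypothetical `fanin` dictionary (the function name and data layout are illustrative, not from the slides). Pushing every flop back to the inputs leaves each input edge carrying the total number of flops that were on its path to the root, and that count is never negative, which is why the move is always doable.

```python
def push_registers_to_inputs(fanin, root):
    """Retime all flops in a fanout-free cone to the cone's inputs.

    fanin[v] = list of (u, w): edge u -> v carrying w flops.
    Nodes that do not appear as keys of fanin are treated as cone inputs
    (an assumption of this sketch). Returns {input: flops on its edge}.
    """
    result = {}
    stack = [(root, 0)]          # (node, flops seen between node and root)
    while stack:
        v, regs = stack.pop()
        for u, w in fanin.get(v, []):
            if u in fanin:       # internal node: keep walking toward the inputs
                stack.append((u, regs + w))
            else:                # cone input: all flops on the path land here
                result[u] = regs + w
    return result


# Hypothetical cone: f = op(g, x), g = op(y, z), with flops on g->f and z->g.
cone = {"f": [("g", 1), ("x", 0)], "g": [("y", 0), ("z", 1)]}
print(push_registers_to_inputs(cone, "f"))
```

After the move the interior of the cone has no flops, so it can be covered as pure combinational logic.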

  17. Fanout Problem? Can I use the same trick here?

  18. Fanout Problem? Cannot retime without replicating. Replicating increases I/O (and so cut size).

  19. Different Replication Problem

  20. Different Replication Problem

  21. Different Replication Problem Can now retime and cover with single LUT.

  22. Replication • Once add registers • can’t just grab max flow and get replication • (compare flowmap) • Or, can’t just ignore flop placement when have reconvergent fanout through flop

  23. Replication • Key idea: • represent timing paths in graph • differentiating based on number of registers in path • new graph: all paths from node to output have same number of flip-flops • label nodes u^d, where d is the number of flip-flops on the path to the output

  24. Deal with Replication • Expanded Graph: • start with target output node • for each input u^d to current expanded graph • grab its input edge (x→u) with weight (w(e)) • add node x^(d+w(e)) to graph (if necessary) • add edge x^(d+w(e)) → u^d with weight (w(e)) • continue breadth first until have enough • enough for flow cut • at most k*n node depth required
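The expansion steps above can be sketched as a breadth-first walk. This is an illustration under assumptions: the `fanin` layout, the function name, and the `max_depth` cutoff (standing in for "until have enough") are hypothetical, and the edge weights in the usage example are inferred from the node labels on the example slides rather than stated in the transcript.

```python
from collections import deque


def expand(fanin, root, max_depth):
    """Build the expanded graph rooted at (root, 0).

    fanin[u] = list of (x, w): edge x -> u carrying w flip-flops.
    A copy (x, d) stands for node x at d flip-flops from the output.
    max_depth limits the number of breadth-first levels explored.
    """
    start = (root, 0)
    nodes = {start}
    edges = []                           # (source copy, sink copy, weight)
    queue = deque([(start, 0)])
    while queue:
        (u, d), depth = queue.popleft()
        if depth == max_depth:
            continue
        for x, w in fanin.get(u, []):
            copy = (x, d + w)            # flip-flops on the edge add to the depth label
            edges.append((copy, (u, d), w))
            if copy not in nodes:        # add the node only if necessary
                nodes.add(copy)
                queue.append((copy, depth + 1))
    return nodes, edges


# Assumed reading of the example: i,j -> a -> c, b -> c through one flop, c -> b.
example = {"c": [("a", 0), ("b", 1)], "a": [("i", 0), ("j", 0)], "b": [("c", 0)]}
nodes, edges = expand(example, "c", 3)
print(sorted(nodes))
```

With three levels this yields the copies c0, a0, b1, i0, j0, c1, a1, b2, matching the node labels shown across the example slides.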

  25. Example (figure: original graph with nodes a, b, c, i, j; expanded graph so far: c0)

  26. Example (figure: expanded graph adds a0, b1)

  27. Example (figure: expanded graph adds i0, j0, c1)

  28. Example (figure: expanded graph adds a1, b2)

  29. Example 2 (figure: original graph with nodes a, b, c, d, e; expanded graph nodes e0, c0, d0, a0, a1, b0, b1, i0, i1, j0, j1)

  30. Expanded Graph • Expanded graph does not have fanout of different flip-flop depths from the same node. • Can now cover ignoring flip-flops and trivially retime.

  31. Labeling • Key idea #1: • compute distances/delay like flowmap • dynamic programming • Key idea #2: • count distance from register (like G-1/c graph)

  32. Labeling: Edge Weights • To target clock period c • use graph G-1/c • paper: • assign weight -c*w(e)+1 • (same thing scaled by c and negated)

  33. Labeling: Edge Weight Idea • same idea: • will need a register every c LUT delays • credit with registers as encounter • charge a fraction (1/c) every LUT delay • know net distance at each point • if negative (delays > c*registers) • cannot distribute to achieve c • otherwise • labeling tells where to distribute
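A small worked sketch of the credit/charge bookkeeping, assuming one LUT delay is charged per edge traversed (the function names are hypothetical). It also shows that the paper's weight -c*w(e)+1 is the same quantity scaled by c and negated.

```python
from fractions import Fraction


def net_credit(registers_per_edge, c):
    """Sum of w(e) - 1/c along a path or cycle, one LUT delay charged per edge."""
    return sum((w - Fraction(1, c) for w in registers_per_edge), Fraction(0))


def paper_weight_sum(registers_per_edge, c):
    """Same path measured with the paper's weights -c*w(e) + 1."""
    return sum(-c * w + 1 for w in registers_per_edge)


# Hypothetical cycle: three LUTs and one register.
cycle = [1, 0, 0]
print(net_credit(cycle, 2))   # negative: 3 delays > 2 * 1 register
print(net_credit(cycle, 3))   # zero: c = 3 is just achievable on this cycle
```

Exact fractions are used so that the sign test at the boundary (net credit exactly zero) is not disturbed by floating-point rounding.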

  34. Labeling: Flow cut • Label node as before (flowmap) • L(v)=min{l(u)+w(e) | e=(u→v)} • trivially can be L(v)-1/c == new LUT • note min vs. max and -1/c vs. +1 due to rescaling to match retiming formulation and G-1/c graph • in this formulation, a combinational circuit of depth 4 would have L(v)=-4/c • if can put this and all L(v)’s in one LUT • this can be L(v) • construct and compute flow cut to test
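A simplified sketch of the labeling recurrence, with a loud caveat: it always takes the "trivial" option (every node becomes its own new LUT, l(v) = L(v) - 1/c) and omits the flow-cut test that lets a node share a LUT with its predecessors. The function name, the `fanin` layout, and the iteration limit are assumptions of this sketch. Labels are relaxed repeatedly, and if they are still dropping after as many passes as there are nodes, some reachable cycle has more LUT delays than c times its registers.

```python
from fractions import Fraction


def label(fanin, primary_inputs, c):
    """Labels l(v) with one LUT per node, or None if the labels never settle.

    fanin[v] = list of (u, w): edge u -> v carrying w registers.
    Primary inputs start at label 0; this sketch skips the flow-cut packing.
    """
    inf = float("inf")
    nodes = set(primary_inputs) | set(fanin)
    for ins in fanin.values():
        nodes.update(u for u, _ in ins)
    l = {v: inf for v in nodes}
    for p in primary_inputs:
        l[p] = Fraction(0)
    charge = Fraction(1, c)              # 1/c charged per LUT delay
    for _ in range(len(nodes)):
        changed = False
        for v, ins in fanin.items():
            if v in primary_inputs or not ins:
                continue
            best = min(l[u] + w for u, w in ins) - charge
            if best < l[v]:
                l[v] = best
                changed = True
        if not changed:
            return l
    return None                          # still changing: delays outrun registers


# Combinational chain of depth 4: the last label is -4/c.
chain = {"n1": [("pi", 0)], "n2": [("n1", 0)], "n3": [("n2", 0)], "n4": [("n3", 0)]}
print(label(chain, {"pi"}, 5)["n4"])
```

The sketch only detects labels that never settle; it does not check any bound on the labels at the primary outputs.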

  35. LUT Map and Retime • Start with outputs • Cover with LUT based on cut • move flip-flops to inputs of LUT • Recursively cover inputs • Use label to retime • r(v)=l(v)+1/c

  36. Target Clock Period c • As before (retiming) • binary search to find optimal c
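A minimal sketch of the binary search, assuming integer clock periods measured in LUT delays and a monotone test (if c is achievable, so is any larger c). The `feasible` predicate is a stand-in for the label computation; the cycle list used to build it here is a hypothetical example, not data from the slides.

```python
def min_clock_period(feasible, lo, hi):
    """Smallest integer c in [lo, hi] with feasible(c) true, or None if none is.

    Assumes feasible is monotone: false up to some threshold, true afterwards.
    """
    if not feasible(hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid             # mid works, so the optimum is at or below mid
        else:
            lo = mid + 1         # mid fails, so the optimum is above mid
    return lo


# Stand-in test: each cycle is (LUT delays, registers) and needs c * registers >= delays.
cycles = [(3, 1), (5, 2)]


def cycles_ok(c):
    return all(c * regs >= delays for delays, regs in cycles)


print(min_clock_period(cycles_ok, 1, 8))
```

Each probe costs one full labeling pass, so the search adds only a logarithmic factor in the range of candidate periods.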

  37. Variations • Relaxation/Iteration • original computed labels iteratively • Flow cover • Cong+Wu/ICCAD96 showed can use flowmap-style min-cut • Find all k-cuts first • Pan+Liu/FPGA’98

  38. Summary • Can optimally solve • LUT map for delay • retiming for minimum clock period • But, solving separately does not give optimal solution to problem • Account for registers on paths • Label based on register placement and (flow) cover ignoring registers • Labeling gives delay, covering, retiming

  39. Today’s Big Ideas • Exploit freedom • Cost of decomposition • benefit of composite solution • Technique: • dynamic programming • network flow

  40. This Class: Decomposition • Scheduling • Logic Optimization • sequential, two-level, multi-level • Covering/gate-mapping • Retiming • Partitioning • Placement • Routing

  41. This Class: Techniques • Dynamic Programming • Linear Programming (LP, ILP) • Graph Algorithms • Greedy Algorithms • Randomization • Search • Heuristics • Approximation Algorithms • Iterative/Relaxation

  42. Spring Quarter • Primarily Project • select problem • formulate • survey attacks, similar problems, brainstorm • implement • benchmark/analyze • Few lectures on additional topics • most geared to projects • time to discuss, present project
