ACME LAB

PipeRoute: A Pipelining-Aware Router for FPGAs. Akshay Sharma, Carl Ebeling* and Scott Hauck. Electrical Engineering / *Computer Science & Engineering, University of Washington, Seattle, WA – 98195.



  1. ACME LAB • PipeRoute: A Pipelining-Aware Router for FPGAs • Akshay Sharma, Carl Ebeling* and Scott Hauck • Electrical Engineering / *Computer Science & Engineering • University of Washington • Seattle, WA – 98195

  2. Pipelined FPGA Architectures • FPGAs and flexible computing • But, max clock frequency? • Examples of pipelined FPGAs • RaPiD (Ebeling et al., 1996) • HSRA (Tsu et al., 1999) • UCSB (Singh et al., 2001) • A few prominent features • A fraction of (or all) switch-points are registered • Registered LUT inputs • Netlists heavily pipelined and retimed

  3. Pipelined Routing • PipeRoute – route netlists on pipelined FPGAs • Pipelined netlist provides information about register separation • FPGA routing graph consists of R-nodes and D-nodes • Cost of using an R-node or D-node in a route is the same as in Pathfinder • Pipelined routing problem differs from normal FPGA routing [Figure: source S with sinks T1 and T2]

  4. Normal Routing – Two Terminal • Dijkstra’s shortest-path for two-terminal routing [Figure: source S and sink T]
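As a rough illustration of the two-terminal case, the sketch below runs Dijkstra's algorithm with costs attached to nodes rather than edges, in the spirit of the Pathfinder-style node costs mentioned earlier. The graph format (a dict of neighbor lists), the `cost` dict, and the function name `shortest_route` are assumptions made for this example and do not come from the slides.

```python
import heapq

def shortest_route(graph, cost, source, sink):
    """Return (total_cost, path) for the cheapest source-to-sink route, or None.

    graph: dict node -> iterable of neighbor nodes (hypothetical format)
    cost:  dict node -> cost of using that node in a route
    """
    best = {source: cost[source]}
    prev = {}
    heap = [(cost[source], source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry
        if u == sink:
            # Walk the predecessor links back to the source.
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v in graph.get(u, ()):
            nd = d + cost[v]
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None

# Tiny hypothetical routing graph: two candidate routes from S to T.
graph = {"S": ["a", "b"], "a": ["T"], "b": ["T"], "T": []}
cost = {"S": 0, "a": 1, "b": 5, "T": 1}
result = shortest_route(graph, cost, "S", "T")
```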

  9. Pipeline Routing – Two Terminal • Find the shortest route that goes through N registers (hereafter, “registers” are called “delays”) • Traveling Salesman • Find the shortest route that goes through all nodes in a graph • NP-complete [Figure: source S and sink T]

  10. Two Terminal 1-Delay Router • Can do optimal routing for 1-delay routes via Dijkstra [Figure: source S and sink T]
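One way to picture a 1-delay search is Dijkstra over (node, delays picked up so far) states, so that the sink only counts as reached once exactly one delay has been collected. This is a minimal sketch under stated assumptions, not the PipeRoute implementation: it assumes a D-node may be passed through with or without registering, it uses a hypothetical dict-based graph, and it does not check whether the same node is reused before and after the register.

```python
import heapq

def one_delay_route(graph, cost, delay_nodes, source, sink):
    """Return (total_cost, path, register_node) for a 1-delay route, or None.

    States are (node, k) where k is the number of delays picked up (0 or 1).
    delay_nodes: set of D-nodes (assumed optional registers).
    """
    start = (source, 0)
    best = {start: cost[source]}
    prev = {}
    heap = [(cost[source], source, 0)]
    while heap:
        d, u, k = heapq.heappop(heap)
        if d > best[(u, k)]:
            continue  # stale heap entry
        if u == sink and k == 1:
            states = [(u, k)]
            while states[-1] != start:
                states.append(prev[states[-1]])
            states.reverse()
            # The first state with k == 1 is where the delay was picked up.
            register = next(n for n, kk in states if kk == 1)
            return d, [n for n, _ in states], register
        for v in graph.get(u, ()):
            # Stay at the same delay count, or register at v if it is a D-node.
            next_counts = [k] + ([1] if k == 0 and v in delay_nodes else [])
            for nk in next_counts:
                nd = d + cost[v]
                if nd < best.get((v, nk), float("inf")):
                    best[(v, nk)] = nd
                    prev[(v, nk)] = (u, k)
                    heapq.heappush(heap, (nd, v, nk))
    return None

# Hypothetical graph: the cheap route via "a" has no register, so the
# 1-delay route has to detour through the D-node "d".
graph = {"S": ["a", "d"], "a": ["T"], "d": ["T"], "T": []}
cost = {"S": 0, "a": 1, "d": 3, "T": 1}
result = one_delay_route(graph, cost, {"d"}, "S", "T")
```

The only change from the plain shortest-path search is the extra state component, which keeps the search at Dijkstra-like cost on a graph twice the size.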

  19. Two Terminal N-Delay Router • Greedy approximation via the 1-delay router • Find a 1-delay route • While there is not enough delay on the route • Replace any 0-delay segment with its cheapest 1-delay replacement [Figure: source S and sink T]
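The greedy loop above can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: it assumes N is at least 1, that D-nodes are optional registers, that "cheapest replacement" means the smallest increase in total route cost, and that the graph is a hypothetical dict of neighbor lists. Like the 1-delay sketch, the helper `one_delay` does not guard against a node being reused on both sides of the new register.

```python
import heapq

def one_delay(graph, cost, delay_nodes, src, dst, blocked=frozenset()):
    """Cheapest src-to-dst path registering at one D-node other than src/dst."""
    start = (src, 0)
    best, prev = {start: cost[src]}, {}
    heap = [(cost[src], src, 0)]
    while heap:
        d, u, k = heapq.heappop(heap)
        if d > best[(u, k)]:
            continue
        if (u, k) == (dst, 1):
            states = [(u, k)]
            while states[-1] != start:
                states.append(prev[states[-1]])
            states.reverse()
            reg = next(n for n, kk in states if kk == 1)
            return d, [n for n, _ in states], reg
        for v in graph.get(u, ()):
            if v in blocked or v == src:
                continue
            can_register = k == 0 and v in delay_nodes and v != dst
            for nk in [k] + ([1] if can_register else []):
                nd = d + cost[v]
                if nd < best.get((v, nk), float("inf")):
                    best[(v, nk)], prev[(v, nk)] = nd, (u, k)
                    heapq.heappush(heap, (nd, v, nk))
    return None

def n_delay_route(graph, cost, delay_nodes, src, dst, n):
    """Greedy N-delay route: returns (total_cost, path, registers) or None."""
    first = one_delay(graph, cost, delay_nodes, src, dst)
    if first is None:
        return None
    _, path, reg = first
    regs = [reg]
    while len(regs) < n:
        anchors = [src] + regs + [dst]  # 0-delay segments lie between anchors
        choice = None
        for a, b in zip(anchors, anchors[1:]):
            i, j = path.index(a), path.index(b)
            blocked = frozenset(path[:i] + path[j + 1:])  # rest of the route
            found = one_delay(graph, cost, delay_nodes, a, b, blocked)
            if found is None:
                continue
            new_cost, sub, new_reg = found
            increase = new_cost - sum(cost[x] for x in path[i:j + 1])
            if choice is None or increase < choice[0]:
                choice = (increase, i, j, sub, new_reg)
        if choice is None:
            return None  # no segment can pick up another delay
        _, i, j, sub, new_reg = choice
        path = path[:i] + sub + path[j + 1:]
        regs = sorted(regs + [new_reg], key=path.index)
    return sum(cost[x] for x in path), path, regs

# Hypothetical graph with two D-nodes, d1 and d2.
graph = {"S": ["d1", "x"], "d1": ["T", "d2"], "d2": ["T"], "x": ["T"], "T": []}
cost = {"S": 0, "d1": 2, "d2": 2, "x": 1, "T": 1}
two_delay = n_delay_route(graph, cost, {"d1", "d2"}, "S", "T", 2)
```

Because each step only looks for the locally cheapest replacement, the result is an approximation and may miss a cheaper route that a global search would find.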

  23. Normal Routing – Multi-Terminal • Do two-terminal routing • Use all of previous route(s) as source for next route [Figure: source S with sinks T1 and T2]
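A minimal sketch of this multi-terminal scheme, assuming node costs and a hypothetical dict-based graph: each sink is routed with a shortest-path search that is seeded from every node already in the partial route, so reusing existing routing costs nothing extra. The function name `route_net` and the choice to return the set of used nodes are illustration choices, not details from the slides.

```python
import heapq

def route_net(graph, cost, source, sinks):
    """Route a multi-terminal net; return the set of nodes used, or None."""
    tree = {source}
    for sink in sinks:
        # Every node already in the route tree is a zero-cost starting point.
        best = {n: 0 for n in tree}
        prev = {}
        heap = [(0, n) for n in tree]
        heapq.heapify(heap)
        found = False
        while heap:
            d, u = heapq.heappop(heap)
            if d > best[u]:
                continue  # stale heap entry
            if u == sink:
                found = True
                break
            for v in graph.get(u, ()):
                nd = d + cost[v]
                if nd < best.get(v, float("inf")):
                    best[v] = nd
                    prev[v] = u
                    heapq.heappush(heap, (nd, v))
        if not found:
            return None
        # Add the new branch, walking back until it joins the existing tree.
        branch = []
        node = sink
        while node not in tree:
            branch.append(node)
            node = prev[node]
        tree.update(branch)
    return tree

# Hypothetical net: T2 is cheaper to reach by branching off the T1 route at "a".
graph = {"S": ["a", "c"], "a": ["T1", "b"], "b": ["T2"], "c": ["T2"]}
cost = {"S": 0, "a": 1, "b": 1, "c": 5, "T1": 1, "T2": 1}
used = route_net(graph, cost, "S", ["T1", "T2"])
```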

  28. Multi-Terminal Router • Sinks are considered in increasing order of delay separation • Example: T1 is 2 delays away from S, and T2 is 3 delays away from S • Accumulate 1 delay at a time • When routing for delay I, start from all existing routing at delays I and I-1 [Figure: source S with sinks T1 and T2]
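The two bookkeeping steps on this slide, ordering the sinks and choosing where a search may start, can be sketched as below. This is one possible reading, with assumptions: `tree_delay` is a hypothetical dict that labels each node of the existing routing with the number of delays accumulated from S, and a seed at delay I-1 is taken to still need one delay while a seed at delay I needs none.

```python
def order_sinks(sink_delay):
    """Sinks sorted by increasing delay separation from the source."""
    return sorted(sink_delay, key=sink_delay.get)

def search_seeds(tree_delay, i):
    """Starting points for routing at delay i.

    Returns sorted (node, delays_still_needed) pairs taken from existing
    routing at delay i - 1 (one more delay needed) and delay i (none needed).
    """
    return sorted(
        (node, i - d) for node, d in tree_delay.items() if d in (i - 1, i)
    )

# Hypothetical state: T1 is already routed at delay 2; T2 needs 3 delays.
sink_delay = {"T2": 3, "T1": 2}
tree_delay = {"S": 0, "a": 1, "b": 2, "T1": 2}
order = order_sinks(sink_delay)
seeds_for_2 = search_seeds(tree_delay, 2)
seeds_for_3 = search_seeds(tree_delay, 3)
```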

  35. Benchmark Architecture • Modified RaPiD architecture • 1-D datapath of 16-bit ALUs, Multipliers, registers and memories • Pipelined interconnect structure • Long and short tracks • Bus Connectors used to pick up delay

  36. Testing • Benchmark RaPiD netlists • Pipelining-aware placement tool • For each netlist • Treat the netlist as unpipelined and determine the smallest RaPiD architecture that routes it (Zl) • Determine the smallest RaPiD architecture needed to route the pipelined netlist (Zp) • Pipelining cost = Zp/Zl
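The cost metric is a plain ratio, which a short example makes concrete. The architecture sizes below are made-up numbers for illustration only and are not results from the presentation; the unit of Zl and Zp is assumed to be whatever size measure the architecture search uses.

```python
def pipelining_cost(z_pipelined, z_unpipelined):
    """Pipelining cost = Zp / Zl, as defined on the Testing slide."""
    return z_pipelined / z_unpipelined

# Hypothetical netlist: smallest architecture of size 16 when treated as
# unpipelined (Zl), and size 28 when the pipelined netlist is routed (Zp).
example_cost = pipelining_cost(28, 16)
```

A cost of 1.0 would mean pipelined routing needed no extra architecture area under this metric; larger values mean a larger architecture was required.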

  37. Results • Avg pipelining cost incurred = 1.74

  38. Results • Effect of netlist-size on pipelining cost • Normalized to unpipelined netlist area

  39. Results • Effect of % pipelined signals on pipelining cost • Normalized to unpipelined circuit area

  40. The Future • Delay-driven PipeRoute • Currently under development • Sophisticated pipelining-aware placement algorithms • Fast pipelined routing algorithms • Use PipeRoute to explore pipelined FPGA architectures • Number and location of registered switch-points
