Architecture and Routing for NoC-based FPGA


  1. Architecture and Routing for NoC-based FPGA • Israel Cidon (joint work with Roman Gindin and Idit Keidar)

  2. One NoC does not fit all! [Figure labels: Traffic uncertainty; Flexibility; SoC, FPGA, CMP; Chip design, Configuration, Run time; Single application, General-purpose computer] Source: I. Cidon and K. Goossens, in "Networks on Chips", G. De Micheli and L. Benini, Morgan Kaufmann, 2006

  3. Field Programmable Gate Array - 101 • Flexible soft logic • Configurable logic blocks (CLBs) and routing channels • Programmed look-up tables (LUTs) • Configurable switching boxes • Area-, power- and speed-efficient hard logic • Wire and clock infrastructure • Special-purpose modules, e.g., CPU, SerDes

  4. Challenges for Future FPGA • Scalability of design methodology • Dominance of wire delays • Already more than 50% of delay • Power • Complex communication patterns • Prototyping for NoC-based SoCs

  5. NoC-Based FPGA Architecture • Functional unit • NoC for inter-routing • Routers • Configurable region – User logic • Configurable network interface

  6. Hard or soft NoC? • Why hard • Interconnect is a performance bottleneck • Interconnect power • Part of FPGA infrastructure • Why soft • Application is not known when the network is built • Provides maximum flexibility • Prevents resource lockup

  7. Suggested FPGA NoC Architecture

  8. FPGA Routing – Optimization Problem • Common efficient NoC • Set of applications • Different architectures • Different traffic patterns • Implemented on the same chip

  9. The NoC design problem • Design Envelope • Collection of designs supported by a given programmable chip • The cost • Hard grid links • For uniform grids - the capacity of the most congested link • NoC Logic • Hard logic for router • Soft logic for routing tables, headers, CNIs • The variables • Number of "hard-coded" wires per link • Possible configurable routing schemes

  10. Routing Schemes • XY • Very simple logic • Deadlock free • Unbalanced - high cost in uniform capacity grids
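
A minimal sketch of XY routing and of the cost it incurs, assuming nodes are (x, y) coordinates on a grid and that the cost is the load of the most congested directed link. The names xy_route and max_link_load and the single-hotspot traffic pattern are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

def xy_route(src, dst):
    """Directed links of the XY route: travel along X first, then along Y."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def max_link_load(flows, route):
    """flows maps (src, dst) to a demand; returns the most congested link's load."""
    load = Counter()
    for (src, dst), demand in flows.items():
        for link in route(src, dst):
            load[link] += demand
    return max(load.values(), default=0)

# Assumed example: every node of a 3x3 grid sends one unit to a hotspot at (1, 1).
nodes = [(x, y) for x in range(3) for y in range(3)]
flows = {(s, (1, 1)): 1.0 for s in nodes if s != (1, 1)}
print(max_link_load(flows, xy_route))  # prints 3.0
```

In this assumed pattern the two vertical links into the hotspot carry three units each while the horizontal ones carry one, which illustrates the imbalance the slide refers to.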

  11. Toggle XY (TXY) • Split packets evenly between XY, YX routes • Deadlock avoided with 2 VCs • Near-optimal for symmetric traffic (permutations) [Seo et al. 05; Towles & Dally 02] • Simple • Better Balanced • Split routes • Does not take into account the traffic pattern
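
A sketch of the TXY load computation, assuming each flow's demand is split half and half between its XY and YX routes. The helper names and the 3x3 single-hotspot traffic are assumptions chosen for illustration.

```python
from collections import Counter

def route(src, dst, x_first=True):
    """Dimension-ordered route as directed links (XY if x_first, else YX)."""
    pos = list(src)
    links = []
    for axis in ((0, 1) if x_first else (1, 0)):
        while pos[axis] != dst[axis]:
            nxt = list(pos)
            nxt[axis] += 1 if dst[axis] > pos[axis] else -1
            links.append((tuple(pos), tuple(nxt)))
            pos = nxt
    return links

def txy_max_load(flows):
    """Max link load when every flow sends half its demand on XY and half on YX."""
    load = Counter()
    for (src, dst), demand in flows.items():
        for x_first in (True, False):
            for link in route(src, dst, x_first):
                load[link] += demand / 2
    return max(load.values(), default=0)

# Assumed example: every node of a 3x3 grid sends one unit to a hotspot at (1, 1).
nodes = [(x, y) for x in range(3) for y in range(3)]
flows = {(s, (1, 1)): 1.0 for s in nodes if s != (1, 1)}
print(txy_max_load(flows))  # prints 2.0
```

For this pattern the split spreads the eight units evenly over the four links entering the hotspot, compared with a peak of three units under pure XY.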

  12. Weighted Schemes • TXY does not always produce the best results [Figure: maximum capacity for a traffic graph with two hotspots at (1,1) and (1,2) on a 5x5 grid, TXY vs. optimum]

  13. WTXY • Given a traffic pattern, choose the XY/YX ratio that yields the lowest maximum capacity • Compute the ratio at programming time • Load it into the Cxy field in the router • Router chooses the XY route with probability Cxy, otherwise YX
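
One way to sketch the WTXY steps, assuming a single chip-wide ratio Cxy, a simple search over 21 evenly spaced candidate values, and a traffic pattern in which every node sends one unit to each of two hotspots at (1,1) and (1,2) on a 5x5 grid. The slides do not specify how the ratio is computed, so best_cxy is a hypothetical stand-in rather than the authors' procedure.

```python
import random
from collections import Counter

def route(src, dst, x_first=True):
    """Dimension-ordered route as directed links (XY if x_first, else YX)."""
    pos = list(src)
    links = []
    for axis in ((0, 1) if x_first else (1, 0)):
        while pos[axis] != dst[axis]:
            nxt = list(pos)
            nxt[axis] += 1 if dst[axis] > pos[axis] else -1
            links.append((tuple(pos), tuple(nxt)))
            pos = nxt
    return links

def wtxy_max_load(flows, cxy):
    """Expected max link load when a fraction cxy of each flow takes XY."""
    load = Counter()
    for (src, dst), demand in flows.items():
        for x_first, share in ((True, cxy), (False, 1.0 - cxy)):
            for link in route(src, dst, x_first):
                load[link] += demand * share
    return max(load.values(), default=0.0)

def best_cxy(flows, steps=20):
    """Programming-time search (assumed): try ratios 0, 1/steps, ..., 1."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda c: wtxy_max_load(flows, c))

def pick_route(src, dst, cxy, rng=random):
    """Run-time choice: XY with probability cxy, otherwise YX."""
    return route(src, dst, rng.random() < cxy)

# Assumed traffic: every node sends one unit to each hotspot.
hotspots = [(1, 1), (1, 2)]
nodes = [(x, y) for x in range(5) for y in range(5)]
flows = {(s, h): 1.0 for s in nodes for h in hotspots if s != h}
cxy = best_cxy(flows)
print(cxy, wtxy_max_load(flows, cxy), wtxy_max_load(flows, 0.5))
```

Because the candidate set contains 0.5 and 1.0, the chosen ratio is never worse than TXY or XY under this load model.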

  14. TXY, WTXY Limitation • Traffic split • packets of the same flow take different paths • Delays may cause out-of-order arrivals • Re-ordering buffers are costly

  15. Ordered Routing Algorithms • One route per source-destination (S-D) pair • No traffic splitting [Figure panels: Unordered Routing, Ordered Routing]

  16. Source Toggle XY • The route is a function of source and destination ID • bitwise XOR • Very simple algorithm • Maximum capacity is similar to TXY
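
The slide only says that the route is a function of the source and destination IDs via a bitwise XOR. A minimal sketch of one such function, assuming row-major node IDs and assuming that the parity of the XOR selects XY or YX (both are illustrative assumptions, not details given in the slides):

```python
def node_id(node, n):
    """Row-major ID of node (x, y) on an n x n grid (assumed numbering)."""
    x, y = node
    return y * n + x

def stxy_x_first(src, dst, n):
    """True selects XY, False selects YX, from the parity of the XORed IDs."""
    xor = node_id(src, n) ^ node_id(dst, n)
    return bin(xor).count("1") % 2 == 0

# The choice is fixed per S-D pair, so all packets of a flow share one path.
print(stxy_x_first((0, 0), (1, 1), 5))  # prints True  (IDs 0 and 6, XOR 0b110)
print(stxy_x_first((0, 0), (1, 0), 5))  # prints False (IDs 0 and 1, XOR 0b1)
```

Since the result depends only on the pair, no reordering buffers are needed, and over many pairs roughly half are expected to take each order.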

  17. Weighted Ordered Toggle (WOT) • Route per S-D pair is chosen at programming time • Each source stores a routing bit for each destination • Objective: minimize max link capacity • Optimal route assignment is difficult

  18. WOT Min-max Route Assignment • initial assignment - STXY • Make changes that reduce the capacity: • Find most loaded link • Among S-D pairs sharing this link change one that minimizes the max capacity (if possible) • Sub-optimal
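
The iteration above can be sketched as a greedy local search. This version makes two assumptions that the slides do not state: ties are broken by the number of links at the peak load, so that a move which relieves one of several equally loaded links still counts as progress, and the demo starts from an all-XY assignment for brevity where the slides start from STXY. All names are hypothetical, and flows is assumed non-empty.

```python
from collections import Counter

def route(src, dst, x_first=True):
    """Dimension-ordered route as directed links (XY if x_first, else YX)."""
    pos = list(src)
    links = []
    for axis in ((0, 1) if x_first else (1, 0)):
        while pos[axis] != dst[axis]:
            nxt = list(pos)
            nxt[axis] += 1 if dst[axis] > pos[axis] else -1
            links.append((tuple(pos), tuple(nxt)))
            pos = nxt
    return links

def congestion(flows, bits):
    """Return ((peak load, number of links at peak), one most loaded link)."""
    load = Counter()
    for pair, demand in flows.items():
        for link in route(*pair, bits[pair]):
            load[link] += demand
    peak = max(load.values())
    hot_links = [l for l, v in load.items() if v == peak]
    return (peak, len(hot_links)), hot_links[0]

def wot_assign(flows, initial_bits):
    """Flip one routing bit at a time while the congestion key improves."""
    bits = dict(initial_bits)
    key, hot = congestion(flows, bits)
    while True:
        best = None
        for pair in flows:
            if hot not in route(*pair, bits[pair]):
                continue  # only S-D pairs that share the most loaded link
            trial = {**bits, pair: not bits[pair]}
            trial_key, trial_hot = congestion(flows, trial)
            if trial_key < (best[0] if best else key):
                best = (trial_key, trial_hot, trial)
        if best is None:
            return bits, key[0]  # local optimum, possibly sub-optimal
        key, hot, bits = best

# Assumed example: single hotspot at (1, 1) on a 3x3 grid, all-XY start.
nodes = [(x, y) for x in range(3) for y in range(3)]
flows = {(s, (1, 1)): 1.0 for s in nodes if s != (1, 1)}
bits, peak = wot_assign(flows, {pair: True for pair in flows})
print(peak)  # prints 2.0
```

The search stops at the first assignment from which no single flip helps, which is why the result depends on the starting point and can be sub-optimal, as the slide notes.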

  19. Iteration Demonstration [Figure: sources S1, S2, S3 and destinations D1, D2, D3]

  20. Benchmarks • Previous work considers uniform permutations • Chips have one or more hotspots • CPU, on-chip memory, off-chip memory interface • We use several hot-spot traffic models • We also use a real-world example

  21. Single Hotspot

  22. Two Hotspots • Design envelope for WOT for various distances between the hotspots [Figure axis: Maximum Capacity]

  23. Three Hotspots • Maximum capacity vs. Minimum distance between the hotspots

  24. Mixed Traffic Model • Three parameters per node • A probability to be a hotspot • A probability to send data to a hotspot • A probability to send data to a non-hotspot • Average improvement for WOT vs. TXY is 12% and vs. XY is 25%
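
A sketch of how such a traffic pattern could be generated, assuming the same three probabilities apply to every node, unit demands, and a seeded generator for repeatability. The function name and these choices are assumptions; the slides do not describe the generator.

```python
import random

def mixed_traffic(n, p_hotspot, p_to_hot, p_to_other, seed=0):
    """Draw hotspots and unit-demand flows on an n x n grid (assumed model)."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    # Each node becomes a hotspot independently with probability p_hotspot.
    hotspots = {v for v in nodes if rng.random() < p_hotspot}
    flows = {}
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            # The send probability depends on whether the destination is a hotspot.
            p = p_to_hot if dst in hotspots else p_to_other
            if rng.random() < p:
                flows[(src, dst)] = 1.0
    return hotspots, flows

hotspots, flows = mixed_traffic(5, p_hotspot=0.1, p_to_hot=0.8, p_to_other=0.05)
print(len(hotspots), len(flows))
```

The resulting flows dictionary has the same shape as a per-pair demand table, so it can be fed to any max-link-load computation for comparing routing schemes.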

  25. Real-World Example • Based on Bertozzi - video encoder • Mapping and placement are done manually

  26. Real-World Example • Maximum Capacity • WOT - 1053 • STXY - 1377 • XY - 1539
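
As a quick check on what these figures imply, the relative reduction in maximum capacity can be computed directly from the three numbers on the slide. The helper name reduction is an assumption for illustration.

```python
# Maximum capacity per routing scheme, as reported on the slide.
capacity = {"WOT": 1053, "STXY": 1377, "XY": 1539}

def reduction(base, new):
    """Percentage by which `new` is lower than `base`."""
    return 100 * (base - new) / base

for other in ("STXY", "XY"):
    pct = reduction(capacity[other], capacity["WOT"])
    print(f"WOT vs {other}: {pct:.1f}% lower")
# prints:
# WOT vs STXY: 23.5% lower
# WOT vs XY: 31.6% lower
```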

  27. Summary • A new NoC-based architecture for FPGA • A design methodology for this architecture • WOT routing algorithm • Balanced • In-order • Low cost
