1 / 23

Introduction

Presentation of Designing Efficient Irregular Networks for Heterogeneous Systems-on-Chip by Christian Neeb and Norbert Wehn and Workload Driven Synthesis of Irregular Application Specific NoC's by Ben Meakin. Introduction. N1. N2. N3. N1. N2. N3. Irregular vs Regular Networks. N4. N5.

konane
Download Presentation

Introduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Presentation ofDesigning Efficient Irregular Networks forHeterogeneous Systems-on-ChipbyChristian Neeb and Norbert WehnandWorkload Driven Synthesis of Irregular Application Specific NoC'sbyBen Meakin

  2. Introduction N1 N2 N3 N1 N2 N3 • Irregular vs Regular Networks N4 N5 N6 N4 N5 N6 N7 N8 N9 N7 N8 N9 • Advantages: • Efficient for general purpose applications • Easy routing • Easy floor planning/P&R • Disadvantages: • Expensive • Under utilized channels • Avg. latency not as scalable • Advantages: • High performance/low power for specific application • Lower hardware cost of network • Disadvantages: • Difficult routing • Difficult floor planning/P&R

  3. Background and Motivation • NoC power a function of network hops: • Pperflit = (buff + xbar + arb)*Hops + wire*D • NoC latency a function of network hops: • L = Rpipeline * Hops (clock cycles)‏ • NoC hardware cost a function of router radix: • C = sum (k1*radix_n + k2) n = 1....nodes • Minimize hops and router complexity!

  4. Design of Efficient Irregular Networks for Heterogeneous Systems-on-ChipChristian Neeb and Norbert WehnUniversity of Kaiserslautern9th EUROMICRO Conference on Digital System Design 2006

  5. Application Communication Model • Resource Communication Graph (RCG)‏ • Vertex: hardware component • Edge: abstract communication channel • Edge Weight: average communication rate (bandwidth)‏ • Normalized to physical bandwidth of one channel link • Efficient mapping of task communication to RCG is assumed and is not discussed in this paper

  6. INOC Architecture • Nodes are tiles consisting of a chip resource and a router • Routers: Wormhole with virtual channels • Round-Robin switch arbitration • Routing: • Deterministic: one output channel per destination address • Adaptive: > one output channel per destination address

  7. Routing in Irregular Topologies • Channel Dependency Graph (CDG)‏ • Vertex: channel • Edge: pair of channels with a dependency due to routing function • Deadlock free if there are no cycles • Node Numbering • Arbitrary in this paper • Channels Increasing / Channels Decreasing • Acyclic CDGs • Determined by source/destination node numbers • Deadlock Free Routing • Route packets on increasing channels, then decreasing • General form of dimension order routing

  8. Routing in Irregular Topologies • Unsafe paths allowed provided there is a safe path (acyclic) that a packet can escape to • Routing table filled via Dijkstra's algorithm • Virtual channels are used to expand routing function using similar rules as dimension order schemes

  9. Network Generation • Start with a bidirectional connection of all nodes in ascending order • Ensures a correct routing path exists • Add shortcuts based on network traffic estimation and RCG • Provides an optimal routing path that will be used in the common case

  10. Results • Network traffic patterns randomly generated with constraints to model heterogeneous SoC traffic • Simulated with 100 different patterns • Variable flit injections rates

  11. Limitations • Naive implementation of “safe path” increases cost/complexity of network • RCG is the most efficient topology • More efficient deadlock free solutions may exist • Efficient mapping of TCG to RCG is assumed • Can be a difficult problem • Network generation could be linked to a specific programming model or API for simple TCG to RCG mapping

  12. Workload Driven Synthesis of Irregular Application-Specific NoC's Ben Meakinmeakin@cs.utah.edu

  13. Project Objectives • Synthesis tool for INoC's • Generation of optimal network topology • MCAPI workload specification • Synthesis of deadlock free routing • Comparison of irregular to regular NoC performance, power, and cost

  14. Motivation • CAD for ASICS always needs improvement • Reduce design cost, time-to-market, etc. • Synthesis of INoCs could be useful in implementing parallel algorithms in programmable logic • Average router complexity needs to be minimized • Asynch NoCs can realize improvement in average case performance • GALS INoCs could be future direction in heterogeneous SoC • Minimize hops to reduce power and latency

  15. Specification of Workload • Communication channels: ScalarChan N1 N2 256e9 10 Comm Type Sender Receiver BW (bits/sec) Priority • Optimization Effort: E = 100 * ((BW / Norm)/2 + (PR / Norm)/2)‏

  16. Network Synthesis Algorithm • Sort channels by priority/effort • Add all nodes to network • For each channel: • if sender has links: • while link has not been added • if receiver has links, add link; else increase search depth • else recursive call from next sender neighbor node • while not done • if network is not correct, add link; else done

  17. Network Synthesis Example • Max Radix is 3 • N1 – N2 added • Highest priority link N1 N2

  18. Network Synthesis Example • N1 – N4 added • N1 – N3 added • N3 – N4 added N4 N1 N2 N3

  19. Network Synthesis Example • N1 – N5 added • Max radix of N1 reached, so N2 used as via • N6 – N7 added • N5 – N8 added • Are we done? N4 N1 N2 N6 N8 N3 N5 N7

  20. Network Synthesis Example • N2 – N6 added for correctness • Every node must have a path to every other node N4 N1 N2 N6 N8 N3 N5 N7

  21. Synthesis of Routing Function • Rename nodes with naming algorithm • Depth first, breadth first, root node selection • Most efficient synthesis for application • For each node pair in workload • check if shortest path is correct • if not, add a virtual channel to make it correct • Most correct synthesis for general purpose • For ALL node pairs in network • check if shortest path is correct • if not, add a virtual channel to make it correct • Synthesize ROM for each node

  22. Uses of this Tool • Incorporate with previous MCAPI hardware implementation • Allow user to synthesize a working HDL model based on MCAPI workloads and a set of IP blocks • Modern FPGA's could make this a feasible and cost effective alternative to fabricating ASIC's • Aid for studying optimal topology for known network traffic

  23. Questions?

More Related