1 / 66

Routing

Routing. Z. Lu / A.Jantsch / I. Sander Dally and Towles, Chapter 8, 9, 10, 11. Overview. Interconnect Network Introduction. Topology (Regular, irregular). Deadlock, Livelock. Router Architecture (pipelined, classic) . Routing (algorithms, mechanics). Network Interface

vilina
Download Presentation

Routing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Routing Z. Lu / A.Jantsch / I. Sander Dally and Towles, Chapter 8, 9, 10, 11

  2. Overview Interconnect Network Introduction Topology (Regular, irregular) Deadlock, Livelock Router Architecture (pipelined, classic) Routing (algorithms, mechanics) Network Interface (message passing, shared memory) Flow Control (Circuit, packet switching (SAF, wormhole, virtual channel) Performance Analysis and QoS Implementation Evaluation Summary Concepts SoC Architecture

  3. Outline • A motivating example: different routing algorithms make difference • Routing algorithms • Deterministic routing • Oblivious routing • Adaptive routing • Routing mechanics • Table-based: source-table and node-table • Algorithmic routing SoC Architecture

  4. Routing • Objectives of a routing algorithm: • Find a path from a source node to a destination node on a given topology • Reduce the number of hops and overall latency, saving energy. • Balance the load of network channels, maximizing throughput. • Going for minimal routes is not always good. SoC Architecture

  5. 0 1 2 3 4 5 6 7 Shanghai Stockholm An Introductory Example • There are only two directions a packet can take: clockwise and counterclockwise • There are plenty of possible routing algorithms SoC Architecture

  6. 0 1 2 3 4 5 6 7 What about performance? • Performance of algorithm depends on Toplogy and traffic pattern • For further investigation, we assume the Tornado traffic pattern • Every node i sends a packet to node (i+3) mod 8 • 0 -> 3, 1->4, ... 5->8, 6->1, 7->2 • Lower bound: (8 packets, 3 hops, 16 channels) Is the bound tight? SoC Architecture

  7. 0 1 2 3 4 5 6 7 Shanghai Stockholm An Introductory Example • Greedy: Always send a packet in the shortest direction. If the distance is the same in both directions, pick a direction randomly. SoC Architecture

  8. 0 1 2 3 4 5 6 7 Performance of Greedy Algorithm • No traffic is routed counterclockwise => Bad utilization of channels • Thus and  = 3 SoC Architecture

  9. 0 1 2 3 4 5 6 7 Shanghai Stockholm An Introductory Example • Uniform Random: Send a packet randomly (p=0.5) either clockwise or counterclockwise SoC Architecture

  10. 0 1 2 3 4 5 6 7 Performance of Uniform Random Algorithm • Half of the traffic is going clockwise and half of the traffic counterclockwise • Bottleneck counterclockwise traffic, where half of the traffic traverses five links • and thus p=0.5 p=0.5 SoC Architecture

  11. 0 1 2 3 4 5 6 7 Shanghai Stockholm An Introductory Example • Weighted Random: Randomly pick a direction for each packet, but weight the short direction with probability 1-p/8, and the long direction with p/8 where p is the minimum distance between source and destination SoC Architecture

  12. 0 1 2 3 4 5 6 7 Performance ofWeighted Random Algorithm • 5/8 of the traffic is going clockwise (3 links) and • 3/8 of the traffic goes counterclockwise (5 links) • In both directions thus p=0.625 p=0.375 SoC Architecture

  13. 0 1 2 3 4 5 6 7 Shanghai Stockholm An Introductory Example • Adaptive: Send the packet in the direction for which the local channel has the lowest load. • This can be done by either • measuring the lentgh of the queue in this direction • recording how many packets a channel has transmitted over the last T slots SoC Architecture

  14. 0 1 2 3 4 5 6 7 Performance of Adaptive Routing • Adaptive routing will in a steady state match the perfect load. • Thus it will in theory allow in both directions SoC Architecture

  15. What is the best Algorithm? • Lower bound on channel load: • Upper bound on throughput: SoC Architecture

  16. Let’s analyze the algorithms • 4 Algorithms • Greedy • Random • Weighted random • Adaptive • Properties • In terms of adaptivity: deterministic, oblivious, adaptive • In terms of shortest path: minimal, non-minimal How would you classify the algrothms according to the properties? SoC Architecture

  17. Taxonomy of Routing Algorithms • Deterministic algorithms always choose the same path between two nodes • Easy to implement and to make deadlock free • Do not use path diversity and thus bad on load balancing • The tornado example: The Greedy algorithm SoC Architecture

  18. Taxonomy of Routing Algorithms • Oblivious (Oblivious: unconscious) algorithms always choose a route without knowing about the state of the network • Random algorithms that do not consider the network state, are oblivious algorithms • Include deterministic routing algorithms as a subset • The tornado example: Random, Weighted random SoC Architecture

  19. Taxonomy of Routing Algorithms • Adaptive algorithms use information about the state of the network to make routing decisions • Length of queues • Historical channel load SoC Architecture

  20. Taxonomy of Routing Algorithms • Minimal algorithms only consider minimal routes (shortest path) • The tornado example: the greedy algorithm • Non-Minimal algorithms allow even non-minimal routes • The tornado example: Random and weighted random, adaptive SoC Architecture

  21. Deterministic Routing • A packet from a source node s to a destination node dis always sent over exactly the same route. • Advantages • Simple and inexpensive to implement • Usual deterministic routing is minimal, which leads to short path length • Packets arrive in order • Disadvantage • Does not exploit path diversity and can create large load imbalances SoC Architecture

  22. Deterministic Routing Example:Destination-Tag Routing in Butterfly Networks Routed to destination 5, 101 Routed to destination 11, 23 Binary tag=101 (down,up,down) (3-digit radix-2) Quaternary tag=23 (2-digit radix-4) Destination-Tag Routing depends on the destination address only (not on source) SoC Architecture

  23. Deterministic Routing Example:Dimension-Order Routing in Tori and Meshes • Also called e-cube routing • Digits of destination address are used to one at a time to direct the routing. • Since a torus can be traversed in clockwise or counterclockwise direction, the preferred direction in each dimension have to be calculated first • The packet is routed along these directions { +x / -x, +y / -y } until it reaches its destination SoC Architecture

  24. Dimension-Order RoutingExample • Compute the relative address ∆ifor each digit i • Here: ∆ = (2, -1) • Thus preferred direction is: D = (+1, -1) • First the packet is routed along the first dimension (-x) and then along the second dimension (+y) SoC Architecture

  25. Dimension-Order RoutingExample • What happens, if the packet shall be routed from 03 to 32? • The packet now needs three hops in both +yand -ydirection • In order to balance the load a random decision should be made SoC Architecture

  26. Oblivious Routing • Packets are routed without regard of the state of the network • Main tradeoff is between • Locality (short path lengths) • Load balance • Oblivious routing is easy to analyze due to linear channel load functions SoC Architecture

  27. Valiant’s Randomized Routing Algorithm • Valiant’s Algorithm can balance the load for any traffic pattern to any connected topology (a topology with at least one path between each terminal pair.) • The delivery of packet is divided into two phases: • Phase 1: A packet is first sent from s to a random node x • Phase 2: Then it is sent from x to its destination d SoC Architecture

  28. Valiant’s Algorithm on Torus • A packet shall be routed from 00 to 12 • The packet is first routed to a random destination (31) • Dimension-Order Routing is used in this example • The packet is then routed to its final destination 12 • Again Dimension-Order Routing is used in this example 6-ary 2-cube (6-by-6 torus) SoC Architecture

  29. Valiant’s Algorithm on Torus • Torus • Channel load under uniform traffic (50% of traffic crosses bisection) T,U = k / 8 • Channel load under worst traffic (100% of traffic crosses bisection) T,W = k / 4 • Average minimum hop count (k even) Hmin, T = nk / 4 6-ary 2-cube SoC Architecture

  30. Valiant’s Algorithm on Torus • Valiant’s Algorithm gives a good worst case performance on k-ary n-cubes (any traffic pattern) • Each packet travels on average Hmin = 2k/4 for each phase, resulting in HTotal = k • Bisection load is γ= k/4 (twice the load of random traffic T,U) • Throughput is Θ = 4b/k (half throughput of random traffic) 6-ary 2-cube SoC Architecture

  31. Valiant’s Algorithm on Torus • Good performance on worst-case traffic patterns comes at expense of • Locality • Overhead (2 routing phases) • Valiant’s Algorithm performs very bad for nearest neighbor traffic • Hmin is increased from 1 to nk/2 6-ary 2-cube SoC Architecture

  32. Minimal oblivious routing • Minimal oblivious routing attempts to achieve the load balance of randomized routing without giving up the locality • A minimal version of Valiant’s algorithm • This is done by restricting routes to minimal paths • Again routing is done in two steps • Route to random node • Route to destination SoC Architecture

  33. Idea: For each packet randomly determine a node xinside the minimal quadrant, such that the packet is routed from source node s to x and then to destination node d Assumption: At each node routing in x or y direction is allowed. 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  34. For each node in quadrant (00, 10, 20, 01, 11, 21) Determine a minimal route via x Start with x = 00 Three possible routes: (00, 01, 11, 21) (p=0.33) (00, 10, 20, 21) (p=0.33) (00,10,11,21) (p=0.33) 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  35. x = 01 One possible route: (00, 01, 11, 21) (p=1) 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  36. x = 10 Two possible routes: (00, 10, 20, 21) (p=0.5) (00, 10, 11, 21) (p=0.5) 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  37. x = 11 Two possible routes: (00, 10, 11, 21) (p=0.5) (00, 01, 11, 21) (p=0.5) 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  38. x = 20 One possible route: (00, 10, 20, 21) (p=1) 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  39. x = 21 Three possible routes: (00, 01, 11, 21) (p=0.33) (00, 10, 20, 21) (p=0.33) (00, 10, 11, 21) (p=0.33) 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  40. Adding the probabilities on each channel Example, link (00,01) P=1/3, x = 00 P=1, x = 01 P=0, x = 10 P=1/2, x = 11 P=0, x = 20 P=1/3, x = 21 P(00,01)=(2*1/3+1/2+1)/6 =2.17/6 20 03 13 23 33 02 12 30 01 11 21 31 00 10 22 32 Minimal oblivious routing SoC Architecture

  41. Results: Load is not very balanced Path between node 10 and 11 is less used Load imbalance can be fixed by weighting the choice of different intermediate nodes, resulting in Load-Balanced Oblivious Routing 30 12 03 13 33 02 22 23 32 01 21 20 31 00 10 11 Minimal oblivious routing p = 2.17/6 p = 3.83/6 p = 2.17/6 p = 1.67/6 p = 2.17/6 p = 3.83/6 p = 2.17/6 SoC Architecture

  42. Adaptive Routing • Adaptive routing algorithms use information about the network state, typically queue occupancies, to route packets over the network • In theory, adaptive routing should be better than an oblivious algorithm • However in practice, they do not, since only local information is used for adaptive decisions • Locally load is balanced • Globally load maybe imbalanced SoC Architecture

  43. Adaptive RoutingLocal Information not enough • In each cycle node 5 sends a packet to node 6 • Node 3 sends packet to node 7 SoC Architecture

  44. Adaptive RoutingLocal Information not enough • Node 3 does not know about the traffic between 5 and 6 before the input buffers between node 3 and 5 are completely filled with packets! SoC Architecture

  45. Adaptive RoutingLocal Information is not enough • Adaptive flow works better with smaller buffers, since small buffers fill faster and thus congestion is propagated earlier to the sensing node (stiff backpressure) SoC Architecture

  46. Adaptive Routing • How does the adaptive routing algorithm sense the state of the network? • It can only sense current,local information • Global information is based on historic local information • Global information is also difficult to collect • Changes in the traffic flow in the network are observed much later SoC Architecture

  47. Minimal adaptive routing chooses among the minimal routes from source s to destination d At each hop a routing function generates a productive output vector that identifies which output channels of the current node will move the packet closer to its destination Network state is then used to select one of these channels for the next hop 12 03 13 33 02 20 30 01 11 21 31 10 22 32 Minimal Adaptive Routing 23 00 SoC Architecture

  48. Good at locally balancing load Poor at globally balancing load Minimal adaptive routing algorithms are unable to avoid congestion of source-destination pairs with no minimal path diversity. Minimal Adaptive Routing Local congestion can be avoided Local congestion cannot be avoided SoC Architecture

  49. Local optimum results in global non-optimum. 10 20 03 13 33 12 02 32 01 11 21 31 22 30 Minimal Adaptive Routing 23 How to improve the situation? 00 SoC Architecture

  50. Fully-Adaptive Routing does not restrict packets to take the shortest path Misrouting is allowed This can help to avoid congested areas and improves load balance 20 10 03 13 33 12 02 32 01 11 21 31 22 30 Fully Adaptive Routing 23 00 What problem may fully adaptive routing lead to? SoC Architecture

More Related