Indirect adaptive routing on large scale interconnection networks
This presentation is the property of its rightful owner.
Sponsored Links
1 / 27

Indirect Adaptive Routing on Large Scale Interconnection Networks PowerPoint PPT Presentation


  • 107 Views
  • Uploaded on
  • Presentation posted in: General

Indirect Adaptive Routing on Large Scale Interconnection Networks. Nan Jiang, William J. Dally Computer System Laboratory Stanford University. John Kim Korean Advanced Institute of Science and Technology. Overview. Indirect adaptive routing (IAR)

Download Presentation

Indirect Adaptive Routing on Large Scale Interconnection Networks

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Indirect adaptive routing on large scale interconnection networks

Indirect Adaptive Routing on Large Scale Interconnection Networks

Nan Jiang, William J. Dally

Computer System Laboratory

Stanford University

John Kim

Korean Advanced Institute of Science and Technology


Overview

Overview

Indirect adaptive routing (IAR)

Allow adaptive routing decision to be based on local and remote congestion information

Main contributions

Three new IAR algorithms for large scale networks

Steady state and transient performance evaluations

Impact of network configurations

Cost of implementation


Presentation outline

Presentation Outline

Background

The dragonfly network

Adaptive routing

Indirect adaptive routing algorithms

Performance results

Implementation considerations


The dragonfly network

Global Network

Group

0

Group

1

Group

2

p

0

Router

0

Router

1

Router

2

p

1

Local Network

The Dragonfly Network

  • High Radix Network

    • High radix routers

    • Small network diameter

  • Each router

    • Three types of channels

    • Directly connected to a few other groups

  • Each group

    • Organized by a local network

    • Large number of global channels (GC)

  • Large network with a global diameter of one


Routing on the dragonfly

Congestion

Routing on the Dragonfly

  • Minimal Routing (MIN)

    • Source local network

    • Global network

    • Destination local network

  • Some Adversarial traffic congests the global channels

    • Each group i sends all packets to group i+1

  • Oblivious solution: Valiant’s Algorithm (VAL)

    • Poor performance on benign traffic

Group

0

Group

1

Group

2

p

0

Router

0

Router

1

Router

2

p

1


Adaptive routing

q

0

q

1

Congestion

Adaptive Routing

  • Choose between the MIN path and a VAL path at the packet source [Singh'05]

    • Decision metric: path delay

    • Delay: product of path distance and path queue depth

  • Measuring path queue length is unrealistic

  • Use local queues length to approximate path

    • Require stiff backpressure

MIN

VAL

GC

GC

q

2

q

3

Source

Router


Adaptive routing worst case traffic

Adaptive Routing: Worst Case Traffic

450

400

350

300

Packet Latency (Simulation cycles)

250

200

Valiant’s

150

Minimal

Adaptive

100

0

0.1

0.2

0.3

0.4

0.5

Throughput (Flit Injection Rate)


Indirect adaptive routing

Indirect Adaptive Routing

Improve routing decision through remote congestion information

Previous method:

Credit round trip [Kim et. al ISCA’08]

Three new methods:

Reservation

Piggyback

Progressive


Credit round trip crt

Congestion

Credits

Delayed

Credits

Credit Round Trip (CRT)

  • Delay the return of local credits to the congested router

  • Creates the illusion of stiffer backpressure

  • Drawbacks

    • Remote congestion is still inferred through local queues

    • Information not up to date

MIN

VAL

GC

GC

Source

Router

  • [Kim et. al ISCA’08]


Reservation res

Reservation (RES)

Each global channel track the number of incoming MIN packets

Injected packets creates a reservation flit

Routing decision based on the reservation outcome

Drawbacks

Reservation flit flooding

Reservation delay

RES

Failed

RES

Flit

MIN

VAL

GC

GC

Congestion

Source

Router


Piggyback pb

Piggyback (PB)

Local congestion broadcast

Piggybacking on each packet

Send on idle channels

Congestion data compression

Drawbacks

Consumes extra bandwidth

Congestion information not up to date

(broadcast delay)

GC

GC

Free

Busy

MIN

VAL

GC

GC

Congestion

Source

Router


Progressive par

Progressive (PAR)

  • MIN routing decisions at the source are not final

  • VAL decisions are final

  • Switch to VAL when encountering congestion

  • Draw backs

    • Need an additional virtual channel to avoid deadlock

    • Add extra hops

MIN

VAL

GC

GC

Congestion

Source

Router


Experimental setup

Experimental Setup

Fully connected local and global networks

33 groups

1,056 nodes

10 cycle local channel latency

100 cycle global channel latency

10-flit packets


Steady state traffic uniform random

Steady State Traffic: Uniform Random

300

Piggyback

280

Credit Round Trip

Progressive

260

Reservation

Minimal

240

220

Packet Latency (Simulation cycles)

200

180

160

140

120

100

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

Throughput (Flit Injection Rate)


Steady state traffic worst case

Steady State Traffic: Worst Case

450

Piggyback

Credit Round Trip

400

Progressive

Reservation

Valiant’s

350

300

Packet Latency (Simulation cycles)

250

200

150

100

0

0.1

0.2

0.3

0.4

0.5

Throughput (Flit Injection Rate)


Transient traffic uniform random to worst case

Transient Traffic: Uniform Random to Worst Case

Progressive

Progressive

Piggyback

Piggyback

Average Packet Latency per Cycle - UR to WC

500

400

Packet Latency

300

200

100

0

20

40

60

80

100

Cycles After Transition

% Packets Routing Non-minimally per Cycle - UR to WC

100

% of Packets Routing Nonminimally

50

0

0

20

40

60

80

100

Cycles After Transition


Network configuration considerations

Network Configuration Considerations

Packet size

RES requires long packets to amortize reservation flit cost

Routing decision is done on per packet basis

Channel latency

Affects information delay (CRT, PB)

Affects packet delay (PAR, RES)

Network size

Affects information bandwidth overhead (RES, PB)

Global diameter greater than one

Need to exchange congestion information on the global network


Cost considerations

Cost Considerations

Credit round trip

Credit delay tracker for every local channel

Reservation

Reservation counter for every global channel

Additional buffering at the injection port to store packets waiting for reservation

Piggyback

Global channel lookup table for every router

Increase in packet size

Progressive

Extra virtual channel for deadlock avoidance


Conclusion

Conclusion

  • Three new indirect adaptive routing algorithms for large scale networks

  • Performance and design evaluation of the algorithms

  • Best Algorithm?

    • Piggyback performed the best under steady state traffic

    • Progressive responded fastest to transient changes

    • Network configurations will affect some algorithm performance

    • Cost of implementation


Thank you

Thank You!

Questions?


Adaptive routing uniform traffic

300

VAL

280

MIN

Adaptive

260

240

220

200

Packet Latency - Simulation cycles

180

160

140

120

100

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

Throughput - Flit Injection Rate

Adaptive Routing: Uniform Traffic


Transient traffic worst case to uniform random

Transient Traffic: Worst Case to Uniform Random


Transient traffic worst case 1 to worst case 10

Transient Traffic: Worst Case 1 to Worst Case 10


1000 random permutation traffic

1000 Random Permutation Traffic

PB

CRT

PAR

30

30

30

25

25

25

20

20

20

% of 1K Permutations

% of 1K Permutations

% of 1K Permutations

15

15

15

10

10

10

5

5

5

0

0

0

200

300

200

300

200

300

Packet Latency

Packet Latency

Packet Latency

RES

VAL

30

30

25

25

20

20

% of 1K Permutations

% of 1K Permutations

15

15

10

10

5

5

0

0

200

300

200

300

Packet Latency

Packet Latency


Effect of packet size on res worst case traffic

Effect of Packet size on RES: Worst Case Traffic

550

500

450

400

350

300

Latency - Simulation cycles

250

200

150

1 Flit

100

2 Flits

4 Flits

50

8 Flits

0

0

0.1

0.2

0.3

0.4

0.5

Throughput - Flit Injection Rate


Large local network uniform random

Large local network: Uniform Random

400

350

300

250

200

Packet Latency - Simulation cycles

150

PB

100

CRT

MIN

50

PAR

RES

0

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

Throughput - Flit Injection Rate


Large local network worst case

Large local network: Worst Case

600

500

400

Packet Latency - Simulation cycles

300

200

PB

CRT

100

PAR

RES

VAL

0

0

0.1

0.2

0.3

0.4

0.5

Throughput - Flit Injection Rate


  • Login