Scaling Internet Routers Using Optics

Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, Nick McKeown

Department of Electrical Engineering

Stanford University


Backbone router capacity

[Figure: log-scale plot from 1Gb/s to 1Tb/s. Router capacity per rack: 2x every 18 months. Traffic: 2x every year.]


Extrapolating

[Figure: the two trends extrapolated on a log scale from 1Tb/s to 100Tb/s. Traffic: 2x every year. Router capacity: 2x every 18 months. 2015: 16x disparity.]
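The 16x figure follows directly from the two doubling periods. The sketch below is only an illustration: it assumes both curves start level at some base year, which the slide does not state, and the 12-year horizon is inferred from the 16x label rather than taken from the slide.

```python
def growth(years: float, doubling_period_years: float) -> float:
    """Growth factor after `years` when a quantity doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


def disparity(years: float) -> float:
    """Ratio of traffic growth (2x per year) to router growth (2x per 18 months)."""
    return growth(years, 1.0) / growth(years, 1.5)


# The ratio simplifies to 2 ** (years / 3), so 16x = 2 ** 4 takes 12 years.
print(round(disparity(12)))
```

In other words, the gap doubles every three years under these assumptions.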


Consequence

  • Unless something changes, operators will need:

    • 16 times as many routers, consuming

    • 16 times as much space,

    • 256 times the power,

    • costing 100 times as much.

  • Actually need more than that…


Stanford 100Tb/s Internet Router

[Figure: an optical switch connecting electronic linecards #1 to #625. Each linecard does line termination, IP packet processing, and packet buffering. Link labels: 40Gb/s lines, 160Gb/s per linecard, 160-320Gb/s between linecard and optical switch. 100Tb/s = 640 * 160Gb/s.]

Goal: Study scalability

  • Challenging, but not impossible

  • Two orders of magnitude faster than deployed routers

  • We will build components to show feasibility


Throughput Guarantees

  • Operators increasingly demand throughput guarantees:

    • To maximize use of expensive long-haul links

    • For predictability and planning

  • Despite lots of effort and theory, no commercial router today has a throughput guarantee.


Requirements of our router

  • 100Tb/s capacity

  • 100% throughput for all traffic

  • Must work with any set of linecards present

  • Use technology available within 3 years

  • Conform to RFC 1812


What limits router capacity?

[Figure: approximate power consumption per rack.]

Power density is the limiting factor today


Trend: Multi-rack routers

Reduces power density

[Figure labels: Crossbar, Linecards; Switch, Linecards.]


[Figure: example multi-rack routers: Juniper TX8/T640, Alcatel 7670 RSP, Avici TSR, Chiaro.]


Limits to scaling

  • Overall power is dominated by linecards

    • Sheer number

    • Optical WAN components

    • Per packet processing and buffering.

  • But power density is dominated by switch fabric


Trend: Multi-rack routers

Reduces power density

  • Limit today ~2.5Tb/s

    • Electronics

    • Scheduler scales <2x every 18 months

    • Opto-electronic conversion

[Figure labels: Switch, Linecards.]


Multi-rack routers

[Figure: linecards, each with WAN in and out ports, connected to a switch fabric.]


Question

  • Instead, can we use an optical fabric at 100Tb/s with 100% throughput?

  • Conventional answer: No.

    • Need to reconfigure switch too often

    • 100% throughput requires complex electronic scheduler.


Outline

  • How to guarantee 100% throughput?

  • How to eliminate the scheduler?

  • How to use an optical switch fabric?

  • How to make it scalable and practical?


100% Throughput

[Figure: a mesh of linecards, each with an input and an output at rate R. The rate needed on each link between linecards is marked "?".]

Switch capacity = N²R

Router capacity = NR


If traffic is uniform

[Figure: the same mesh, with inputs and outputs at rate R and every link between linecards at rate R/N.]
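The two capacity labels can be checked with a few lines of arithmetic. This is a sketch that assumes a full mesh with N * N links (each linecard's link to itself included, as the N²R label suggests) and borrows N = 640 and R = 160Gb/s from the 100Tb/s design.

```python
def mesh_capacity(n: int, link_rate: float) -> float:
    """Aggregate capacity of a full mesh with n * n links (self-links included)."""
    return n * n * link_rate


N = 640      # number of linecards
R = 160e9    # line rate per linecard, bits/s

worst_case = mesh_capacity(N, R)     # every link sized for rate R: N^2 * R
uniform = mesh_capacity(N, R / N)    # every link sized for R/N: N * R

print(f"worst-case mesh: {worst_case / 1e12:.1f} Tb/s")
print(f"uniform mesh:    {uniform / 1e12:.1f} Tb/s")
print(f"ratio:           {worst_case / uniform:.0f}x")
```

Sizing every link for the worst case costs N times more fabric capacity than the router can ever carry, which is why the uniform case is attractive.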


Real traffic is not uniform

[Figure: the same mesh with links at rate R/N. Traffic arriving at rate R for a single output does not fit on one R/N link (marked "?").]


Two-stage load-balancing switch

[Figure: two meshes in series, a load-balancing stage followed by a switching stage. Inputs and outputs run at rate R; every link inside each stage runs at rate R/N.]

100% throughput for weakly mixing, stochastic traffic.

[C.-S. Chang, Valiant]
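A fluid-model sketch of why spreading helps: if the first stage spreads every input evenly over all N intermediate ports regardless of destination, no link in either stage carries more than R/N as long as no input or output is oversubscribed. The function name and the example matrix are illustrative, and this check is not a proof of the stochastic result cited above.

```python
def stage_link_loads(traffic, n):
    """Return (max first-stage link load, max second-stage link load).

    traffic[i][k] is the rate from input i to output k. The first stage
    spreads every input evenly over the n intermediate ports, ignoring
    the destination.
    """
    # Link input i -> intermediate j carries 1/n of input i's total traffic.
    first = max(sum(row) / n for row in traffic)
    # Link intermediate j -> output k carries 1/n of all traffic bound for k.
    second = max(sum(traffic[i][k] for i in range(n)) / n for k in range(n))
    return first, second


N, R = 4, 1.0
# Non-uniform but admissible: input i sends everything to output (i + 1) % N.
skewed = [[R if k == (i + 1) % N else 0.0 for k in range(N)] for i in range(N)]
first, second = stage_link_loads(skewed, N)
print(first, second, R / N)
```

The same skewed matrix would need a full-rate link between each input and its single destination in a one-stage mesh.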


[Figure, two animation frames: packets labeled 1, 2 and 3 enter the two-stage switch and are spread over the R/N links to the intermediate stage.]


Chang’s load-balanced switch

Good properties

  • 100% throughput for broad class of traffic

  • No scheduler needed → Scalable


Chang’s load-balanced switch

Bad properties

  • Packet mis-sequencing

  • Pathological traffic patterns → Throughput 1/N-th of capacity

  • Uses two switch fabrics → Hard to package

  • Doesn’t work with some linecards missing → Impractical

  • FOFF: Load-balancing algorithm

    • Packet sequence maintained

    • No pathological patterns

    • 100% throughput - always

    • Delay within bound of ideal

    • (See paper for details)


Single Mesh Switch

[Figure: a single mesh of linecards (each box marked "One linecard") with inputs and outputs at rate R; every link runs at rate 2R/N.]


Packaging

[Figure: linecards with inputs and outputs at rate R, interconnected through a backplane by links at rate 2R/N.]


Many fabric options

[Figure: N inputs and N outputs joined by a network that can realize any permutation, carrying channels C1, C2, …, CN.]

N channels each at rate 2R/N

Options

Space: Full uniform mesh

Time: Round-robin crossbar

Wavelength: Static WDM
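The "Time" option can be pictured as a fixed, traffic-independent schedule. The sketch below assumes a cyclic-shift pattern, which is one natural reading of "round-robin"; the slide does not specify the exact sequence.

```python
def round_robin_schedule(n):
    """Yield n permutations; in slot t, input i is connected to output (i + t) % n."""
    for t in range(n):
        yield [(i + t) % n for i in range(n)]


N = 4
slots = list(round_robin_schedule(N))
for t, perm in enumerate(slots):
    print(t, perm)

# Every (input, output) pair is connected in exactly one of the n slots,
# so each pair gets 1/n of the port rate, independent of the traffic.
pairs = {(i, perm[i]) for perm in slots for i in range(N)}
```

With ports running at 2R, each pair receives 2R/N, matching the channel rate quoted above, and no scheduler decision is ever needed.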


Static WDM switching

[Figure: four inputs and four outputs connected by an Arrayed Waveguide Grating Router (AWGR). The inputs carry A, A, A, A; B, B, B, B; C, C, C, C; and D, D, D, D on their four wavelengths. Every output receives A, B, C, D.]

4 WDM channels, each at rate 2R/N

Arrayed Waveguide Grating Router (AWGR): passive and almost zero power
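A small model of the wavelength routing shown in the figure. It assumes the commonly described cyclic rule, where a signal entering input port i on wavelength index w leaves on output port (i + w) mod N; the port and wavelength numbering here is illustrative and real devices may number them differently.

```python
def awgr_output(input_port: int, wavelength: int, n: int) -> int:
    """Output port reached from `input_port` on wavelength index `wavelength`.

    Assumes the cyclic rule (input + wavelength) mod n; the port and
    wavelength numbering is illustrative.
    """
    return (input_port + wavelength) % n


N = 4
names = "ABCD"  # input i transmits data labelled names[i] on every wavelength
received = {out: [] for out in range(N)}
for i in range(N):
    for w in range(N):
        received[awgr_output(i, w, N)].append(names[i])

for out in range(N):
    print(out, sorted(received[out]))
```

Each output ends up with exactly one channel from every input, which is the uniform mesh realized without any active switching.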


Linecard dataflow

[Figure: one linecard with input and output at rate R. WDM multiplexers and demultiplexers carry wavelengths λ1, λ2, …, λN to and from the fabric; numbered packets (1 to 4) illustrate the path through the linecard.]


Problems of scale

  • For N < 64, WDM is a good solution.

  • We want N = 640.

  • Need to decompose.


Decomposing the mesh

[Figure: a mesh of eight linecards (1 to 8) with links at rate 2R/8. In the decomposed version, labeled WDM and TDM, pairs of 2R/8 links are carried together as 2R/4.]


When N is too large

Decompose into groups (or racks)

[Figure: Group/Rack 1 to Group/Rack G, each holding linecards 1, 2, …, L at rate 2R. Each group sends wavelengths λ1, λ2, …, λG into an Arrayed Waveguide Grating Router (AWGR) that interconnects the groups.]


When a linecard is missing

  • Each linecard spreads its data equally over every other linecard.

  • Problem: If one is missing, or failed, then the spreading no longer works.
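A minimal fluid-model sketch of the problem, assuming the spreading pattern is fixed in hardware (as with a static mesh) so that the share aimed at an empty slot has nowhere to go. The function name is hypothetical.

```python
def delivered_fraction(present):
    """Fraction of an input's traffic that reaches a working intermediate linecard.

    `present[j]` says whether slot j holds a working linecard. The spreading
    is fixed: 1/len(present) of the traffic goes to every slot, present or not.
    """
    n = len(present)
    return sum(1 for ok in present if ok) / n


print(delivered_fraction([True, True, True, True]))   # all four present
print(delivered_fraction([True, True, False, True]))  # one missing
```

Recovering the lost share means re-spreading over the linecards that remain, which a fixed mesh cannot do on its own.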


When a linecard fails

R

R

2R/3 + 2R/3 = 1.5R

2R/3 + 2R/6 + 2R/3 + 2R/6 = 2R

2R/3 + 2R/6

Out

2R/3 + 2R/6

R

R

Out

R

R

2R/3 + 2R/6

Out

2R/3 + 2R/6

When a linecard fails

2R/3

In

2R/3

2R/3

  • Solution:

  • Move light beams

    • Replace AWGR with MEMS switch.

    • Reconfigure when linecard added, removed or fails.

  • Finer channel granularity

    • Multiple paths.

2R/3

In

2R/3

2R/3

2R/3

2R/3

In

2R/3


Solution

Use transparent MEMS switches

[Figure: Group/Rack 1 to Group/Rack G = 40, each holding linecards 1, 2, …, L at rate 2R, interconnected by a bank of MEMS switches, each with ports 1 to G.]

MEMS switches reconfigured only when linecard added, removed or fails.

Theorems:

1. Require L+G-1 MEMS switches

2. Polynomial time reconfiguration algorithm


Challenges

Low-cost, low-power optoelectronic conversion?

l1

Pkt

Switch

How to build a 250ms

160Gb/s buffer?

WDM

lG

l1

R

R

WDM

lG

Challenges

In

l1

Address

Lookup

l1, l2,.., lG

R

R

WDM

lG

R

l1, l2,.., lG

l1, l2,.., lG

1

1

1

2

2

R=160Gb/s

3

4

Out

l1

R

l1, l2,.., lG

R

WDM

lG


What we are building

Chip #2: 16 x 55

Opto-electronic crossbar

55 x 10Gb/s

55 x 10Gb/s

Optical source

16 x 10Gb/s

CMOS ASIC

To Linecards

To Optical Fabric

What we are building

250ms DRAM

320Gb/s

Chip #1: 160Gb/s Packet Buffer

Buffer Manager

90nm ASIC

160Gb/s

160Gb/s

Optical Detector

Optical Modulator
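The packet buffer figures can be sanity-checked with simple arithmetic. The sketch assumes the buffer must hold 250ms of traffic at the full line rate, and that every packet is written to DRAM once and read once, which would account for the 320Gb/s figure; neither assumption is spelled out on the slide.

```python
line_rate_bps = 160e9     # R = 160Gb/s
buffer_time_s = 250e-3    # 250ms of buffering

buffer_bits = line_rate_bps * buffer_time_s
buffer_gigabytes = buffer_bits / 8 / 1e9

# One write plus one read per packet doubles the memory bandwidth.
memory_bandwidth_bps = 2 * line_rate_bps

print(f"{buffer_bits / 1e9:.0f} Gb = {buffer_gigabytes:.0f} GB per linecard")
print(f"{memory_bandwidth_bps / 1e9:.0f} Gb/s memory bandwidth")
```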


100Tb/s Load-Balanced Router

[Figure: linecard racks 1 to G = 40, each holding L = 16 linecards at 160Gb/s, connected to a switch rack (< 100W) of 40 x 40 MEMS switches numbered 1, 2, …, 55, 56.]
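The headline numbers follow from the rack and linecard counts. The sketch below only multiplies out the figures quoted in the slides and applies the L+G-1 bound from the MEMS theorem.

```python
G = 40          # linecard racks (groups)
L = 16          # linecards per rack
R_GBPS = 160    # line rate per linecard, Gb/s

linecards = G * L
capacity_tbps = linecards * R_GBPS / 1000
mems_switches = L + G - 1   # bound from the theorem on the MEMS slide

print(linecards, capacity_tbps, mems_switches)
```

This gives 640 linecards and a little over 100Tb/s, and the bound of 55 switches lines up with the 55 fabric-side ports on the opto-electronic crossbar chip.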

