Forwarding Metamorphosis:
This presentation is the property of its rightful owner.
Sponsored Links
1 / 27

Outline PowerPoint PPT Presentation


  • 126 Views
  • Uploaded on
  • Presentation posted in: General

Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN Pat Bosshart, Glen Gibb, Hun- Seok Kim, George Varghese, Nick McKeown , Martin Izzard, Fernando Mujica , Mark Horowitz Texas Instruments, Stanford University, Microsoft. Outline.

Download Presentation

Outline

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Outline

Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDNPat Bosshart, Glen Gibb, Hun-Seok Kim, George Varghese, Nick McKeown, Martin Izzard,Fernando Mujica, Mark HorowitzTexas Instruments, Stanford University, Microsoft


Outline

Outline

  • Conventional switch chips are inflexible

  • SDN demands flexibility…sounds expensive…

  • How do we do it: The RMT switch model

  • Flexibility costs less than 15%


Fixed function switch

Fixed function switch

L2: 128k x 48

Exact match

L3: 16k x 32

Longest prefix

match

ACL: 4k

Ternary match

X

X

X

X

X

L2 Stage

ACL Stage

L3 Stage

?????????

PBB Stage

Queues

L3 Table

ACL Table

L2 Table

Action: set L2D, dec TTL

Action: set L2D

Action: permit/deny

Out

In

Deparser

Parser

Stage 1

Stage 3

Stage 2

Data


What if you need flexibility

What if you need flexibility?

  • Flexibility to:

    • Trade one memory size for another

    • Add a new table

    • Add a new header field

    • Add a different action

  • SDN accentuates the need for flexibility

    • Gives programmatic control to control plane, expects to be able to use flexibility


What does sdn want

What does SDN want?

  • Multiple stages of match-action

    • Flexible allocation

  • Flexible actions

  • Flexible header fields

  • No coincidence OpenFlowbuilt this way…


What about alternatives are n t there other ways to get flexibility

What about Alternatives?Aren’t there other ways to get flexibility?

  • Software? 100x too slow, expensive

  • NPUs? 10x too slow, expensive

  • FPGAs? 10x too slow, expensive


W hat w e s et out t o learn

What We Set Out To Learn

  • How do I design a flexible switch chip?

  • What does the flexibility cost?


What s hard about a flexible switch chip

What’s Hard about a Flexible Switch Chip?

  • Big chip

  • High frequency

  • Wiringintensive

  • Many crossbars

  • Lots of TCAM

  • Interaction between physical design and architecture

  • Good news? No need to read 7000 IETF RFC’s!


Outline1

Outline

  • Conventional switch chip are inflexible

  • SDN demands flexibility…sounds expensive…

  • How do we do it: The RMT switch model

  • Flexibility costs less than 15%


The rmt abstract model

The RMT Abstract Model

  • Parse graph

  • Table graph


Arbitrary fields the parse graph

Arbitrary Fields: The Parse Graph

Ethernet IPV4 TCP

Packet:

Ethernet

IPV4

IPV6

TCP

UDP


Arbitrary fields the parse graph1

Arbitrary Fields: The Parse Graph

Packet:

Ethernet IPV4 TCP

Ethernet

IPV4

TCP

UDP


Arbitrary fields the parse graph2

Arbitrary Fields: The Parse Graph

Packet:

Ethernet IPV4 RCP TCP

Ethernet

IPV4

RCP

TCP

UDP


Reconfigurable match tables the table graph

Reconfigurable Match Tables:The Table Graph

VLAN

ETHERTYPE

MAC

FORWARD

IPV4-DA

IPV6-DA

ACL

RCP


Changes to parse graph and table graph

Changes to Parse Graph and Table Graph

ETHERTYPE

Ethernet

VLAN

VLAN

IPV6-DA

IPV4-DA

L2S

IPV6

IPV4

RCP

L2D

RCP

UDP

TCP

ACL

Done

MY-TABLE

Parse Graph

Table Graph


But the parse graph and table graph don t show you how to build a switch

But the Parse Graph and Table Graphdon’t show you how to build a switch


Match action forwarding model

Match/Action Forwarding Model

Match

Action Stage

Match

Action Stage

Match

Action Stage

Queues

Out

In

Programmable Parser

Deparser

Match Table

Match Table

Match Table

Action

Action

Action

Data

Stage 1

Stage N

Stage 2


Performance vs flexibility

Performance vs Flexibility

  • Multiprocessor: memory bottleneck

  • Change to pipeline

  • Fixed function chips specialize processors

  • Flexible switch needs general purpose CPUs

L2

L3

ACL

Memory

CPU

Memory

CPU

Memory

CPU


How we did it

How We Did It

  • Memory to CPU bottleneck

  • Replicate CPUs

  • More stages for finer granularity

  • Higher CPU cost ok

CPU

CPU

CPU

C

P

U

CPU

CPU

Memory

C

P

U

CPU

CPU

C

P

U

CPU

CPU


Rmt logical to physical table mapping

RMT Logical to Physical Table Mapping

Physical

Stage 1

Physical

Stage n

Physical

Stage 2

9 ACL

3

IPV4

ETH

VLAN

TCAM

IPV6

2

VLAN

5

IPV6

IPV4

L2S

640b

L2D

TCP

Match Table

Match Table

Match Table

UDP

Action

Action

Action

4

L2S

7 TCP

SRAM

HASH

8 UDP

ACL

Logical Table 6

L2D

Logical Table 1

Ethertype

Table Graph

640b


Action processing model

Action Processing Model

Field

Header Out

Header In

ALU

Field

Data

Match result

Instruction


Outline

Modeled as Multiple VLIW CPUs per Stage

ALU

ALU

ALU

ALU

ALU

ALU

ALU

ALU

ALU

Match result

VLIW Instructions


Our switch design

Our Switch Design

  • 64 x 10Gb ports

    • 960M packets/second

    • 1GHz pipeline

  • Programmable parser

  • 32 Match/action stages

  • HugeTCAM: 10x current chips

    • 64K TCAM words x 640b

  • SRAM hashtables for exact matches

    • 128K words x 640b

  • 224 action processors per stage

  • All OpenFlow statistics counters


Outline2

Outline

  • Conventional switch chip are inflexible

  • SDN demands flexibility…sounds expensive…

  • How do I do it: The RMT switch model

  • Flexibility costs less than 15%


Cost of configurability comparison with conventional switch

Cost of Configurability:Comparison with Conventional Switch

  • Many functions identical: I/O, data buffer, queueing…

  • Make extra functions optional: statistics

  • Memory dominates area

    • Compare memory area/bit and bit count

  • RMT must use memory bits efficiently to compete on cost

  • Techniques for flexibility

    • Match stage unit RAM configurability

    • Ingress/egress resource sharing

    • Table predication allows multiple tables per stage

    • Match memory overhead reduction

    • Match memory multi-word packing


Chip comparison with fixed function switches

Chip Comparison with Fixed Function Switches

Area

Power


Conclusion

Conclusion

  • How do we design a flexible chip?

    • The RMT switch model

    • Bring processing close to the memories:

      • pipeline of many stages

    • Bring the processing to the wires:

      • 224 action CPUs per stage

  • How much does it cost?

    • 15%

  • Lots of the details how we designed this in 28nm CMOS are in the paper


  • Login