Netfpga summer camp day 1
Download
1 / 115

NetFPGA Summer Camp - PowerPoint PPT Presentation


  • 565 Views
  • Updated On :

Presented by: Adam Covington, Glen Gibb (Stanford University) Stanford, CA August 9 - 13, 2010 http://NetFPGA.org NetFPGA Summer Camp Day 1 Tutorial Outline Background Introduction The NetFPGA Platform The Stanford Base Reference Router Motivation: Basic IP review

Related searches for NetFPGA Summer Camp

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' NetFPGA Summer Camp' - Leo


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Netfpga summer camp day 1 l.jpg

Presented by:

Adam Covington, Glen Gibb

(Stanford University)

Stanford, CA

August 9 - 13, 2010

http://NetFPGA.org

NetFPGA Summer CampDay 1


Tutorial outline l.jpg
Tutorial Outline

Background

Introduction

The NetFPGA Platform

The Stanford Base Reference Router

Motivation: Basic IP review

Demo1: Reference Router running on the NetFPGA

The Enhanced Reference Router

Motivation: Understanding buffer size requirements in a router

Demo 2: Observing and controlling the queue size

How does the NetFPGA work

Utilities

Reference Designs

Inside the NetFPGA Hardware

The Life of a Packet Through the NetFPGA

Hardware Datapath

Interface to software: Exceptions and Host I/O

Exercise: Drop Nth Packet

Concluding Remarks

Using NetFPGA for research and teaching



What is the netfpga l.jpg
What is the NetFPGA?

A line-rate, flexible, open networking platform for teaching and research


Netfpga consists of l.jpg
NetFPGA consists of…

NetFPGA Board

Four elements:

  • NetFPGA board

  • Tools + reference designs

  • Contributed projects

  • Community


Netfpga board l.jpg
NetFPGA board

PC with NetFPGA

1GE

FPGA

1GE

1GE

Memory

1GE

NetFPGA Board

NetworkingSoftware

running on a standard PC

CPU

Memory

PCI

A hardware accelerator

built with Field Programmable Gate Arraydriving Gigabit network links


Tools reference designs l.jpg
Tools + reference designs

Tools:

  • Compile designs

  • Verify designs

  • Interact with hardware

    Reference designs:

  • Router (HW)

  • Switch (HW)

  • Network Interface Card (HW)

  • Router Kit (SW)

  • SCONE (SW)


Contributed projects l.jpg
Contributed Projects

More projects:

http://netfpga.org/foswiki/NetFPGA/OneGig/ProjectTable


Community l.jpg
Community

Wiki

  • Documentation (slowly growing)

  • Encourage users to contribute

    Forums

  • Support by users for users

  • Active community – 10s to 100s of posts per week


Netfpga s defining characteristics l.jpg
NetFPGA’s Defining Characteristics

  • Line-Rate

    • Processes back-to-back packets

      • Without dropping packets

      • At full rate of Gigabit Ethernet Links

    • Operating on packet headers

      • For switching, routing, and firewall rules

    • And packet payloads

      • For content processing and intrusion prevention

  • Open-sourceHardware

    • Similar to open-source software

      • Full source code available

      • BSD-Style License

    • But harder, because

      • Hardware modules must meeting timing

      • Verilog & VHDL Components have more complex interfaces

      • Hardware designers need high confidence in specification of modules


Test driven design l.jpg
Test-Driven Design

  • Regression tests

    • Have repeatable results

    • Define the supported features

    • Provide clear expectation on functionality

  • Example: Internet Router

    • Drops packets with bad IP checksum

    • Performs Longest Prefix Matching on destination address

    • Forwards IPv4 packets of length 64-1500 bytes

    • Generates ICMP message for packets with TTL <= 1

    • Defines how packets with IP options or non IPv4

      … and dozens more … Every feature is defined by a regression test


Who how why l.jpg
Who, How, Why

Who uses the NetFPGA?

  • Teachers

  • Students

  • Researchers

    How do they use the NetFPGA?

  • To run the Router Kit

  • To build modular reference designs

    • IPv4 router

    • 4-port NIC

    • Ethernet switch, …

      Why do they use the NetFPGA?

  • To measure performance of Internet systems

  • To prototype new networking systems


What you will learn l.jpg
What you will learn

  • Overall picture of NetFPGA

  • How reference designs work

  • How you can work on a project

    • NetFPGA Design Flow

    • Directory Structure, library modules and projects

    • How to utilize contributed projects

      • Interface/Registers

    • How to verify a design (Simulation and Regression Tests)

    • Things to do when you get stuck

      AND… You can start your own projects!



Basic uses of netfpga l.jpg
Basic Uses of NetFPGA

  • Recap Internet Protocol and Routing

  • Demonstrate

    • How you can use the NetFPGA as a router

    • See routing in action


What is ip l.jpg
What is IP?

  • IP (Internet Protocol)

    • Protocol used for communicating data across packet-switched networks

    • Divides data into a number of packets (IP packet)

  • IP Packet

    • Header (IP Header) including:

      • Source IP address

      • Destination IP address


Ip header l.jpg
IP Header

Data

Data

Hdr

Data

Hdr

Data

Hdr

1

4

16

32

Ver

HLen

T.Service

Total Packet Length

Fragment ID

Flags

Fragment Offset

TTL

Protocol

Header Checksum

Source Address

Destination Address

Options (if any)

20 bytes

Data


Ip address l.jpg
IP Address

  • Used to uniquely identify a device (such as a computer) from all other devices on a network

    • Two parts

      • Identifier of a particular network on the Internet

      • Identifier of a particular device within a network

        All packets, except the ones for the same network, first go to their gateway (router) and are transferred to the destination via routers.


Basic operation of an ip router l.jpg
Basic Operation of an IP Router

D

R3

R1

R4

D

A

B

E

R2

C

R5

Destination

Next Hop

F

D

R3

E

R3

F

R5


What does a router do l.jpg
What does a router do?

R3

R1

R4

D

A

1

4

16

32

D

Ver

HLen

T.Service

Total Packet Length

Fragment ID

Flags

Fragment Offset

B

E

TTL

Protocol

Header Checksum

20 bytes

Source Address

R2

C

R5

Destination Address

Destination

Next Hop

F

D

R3

Options (if any)

E

R3

Data

F

R5


What does a router do21 l.jpg
What does a router do?

R3

R1

R4

D

A

B

E

R2

C

R5

F


Basic components of an ip router l.jpg
Basic Components of an IP Router

Software

Hardware

Management

& CLI

Routing

Protocols

Control Plane

Routing

Table

Datapath

per-packet

processing

Forwarding

Table

Switching


Per packet processing in an ip router l.jpg
Per-packet processing in an IP Router

1. Accept packet arriving on an incoming link.

2. Lookup packet destination address in the forwarding table to identify outgoing port(s).

3. Manipulate IP header: e.g., decrement TTL, update header checksum.

5. Buffer packet in the output queue.

6. Transmit packet onto outgoing link.


Generic datapath architecture l.jpg
Generic Datapath Architecture

Data

Hdr

Data

Hdr

IP Address

Next Hop

Forwarding

Table

Buffer

Memory

Header Processing

Lookup

IP Address

Update

Header

Queue

Packet


Cidr and longest prefix matches l.jpg
CIDR and Longest Prefix Matches

142.12/19

  • The IP address space is broken into line segments.

  • Each line segment is described by a prefix.

  • A prefix is of the form x/y where x indicates the prefix of all addresses in the line segment, and y indicates the length of the segment.

  • e.g. The prefix 128.9/16 represents the line segment containing addresses in the range: 128.9.0.0 … 128.9.255.255.

128.9.0.0

65/8

128.9/16

0

232-1

216

128.9.16.14


Classless interdomain routing cidr l.jpg
Classless Interdomain Routing (CIDR)

128.9.19/24

128.9.25/24

128.9.16/20

128.9.176/20

Most specific route = “longest matching prefix”

128.9/16

0

232-1

128.9.16.14


Techniques for lpm in hardware l.jpg
Techniques for LPM in hardware

  • Linear search

    • Slow

  • Direct lookup

    • Currently requires too much memory

    • Updating a prefix leads to many changes

  • Tries

    • Deterministic lookup time

    • Easily pipelined but require multiple memories/references

  • TCAM (Ternary CAM)

    • Simple and widely used but havelower density than RAM and need more power

    • Gradually being replaced by algorithmic methods


An ip router on netfpga l.jpg
An IP Router on NetFPGA

Software

Hardware

Management

& CLI

Linux user-level

processes

Routing

Protocols

Exception

Processing

Routing

Table

Verilog on

NetFPGA PCI board

Forwarding

Table

Switching


Netfpga router l.jpg
NetFPGA Router

Function

  • 4 Gigabit Ethernet ports

    Fully programmable

  • FPGA hardware

    Low cost

Open-source FPGA hardware

  • Verilog base design

    Open-souce Software

  • Drivers in C and C++


Demo 1 l.jpg

Demo 1

Reference Router running on the NetFPGA


Hardware setup for demo 1 l.jpg
Hardware Setup for Demo #1

Demo 1

CPU x2

NIC

Video

Server

PCI-e

PCI-e

GE

GE

Net-FPGA

GE

PCI

Internet

Router

Hardware

GE

GE

Server deliversstreaming HD video

through a chain of NetFPGA Routers

GE

Net-FPGA

GE

Internet

Router

Hardware

GE

GE

GE

CPU x2

NIC

PCI-e

GE

GE

Net-FPGA

GE

Video

Display

PCI

Internet

Router

Hardware

GE

GE

CAD Tools

GE


Slide32 l.jpg

Topology

Demo 1

.1.1

.4.1

.7.1

.10.1

.13.1

.16.1

.1.2

.4.2

.7.2

.10.2

.13.2

.16.2

.3.1

.6.2

.9.2

.12.2

.15.2

.17.1

.3.2

.2.1

.6.1

.9.1

.12.1

.15.1

.30.2

.5.1

.8.1

.11.1

.14.1

.18.1

.30.1

.26.1

.23.1

.18.2

.27.2

.24.2

.21.2

.20.1

.29.1

.24.1

.21.1

.27.1

.28.2

.25.2

.22.2

.19.2

.28.1

.25.1

.22.1

.19.1

Video Server

Video Client

Shortest Path


Working ip router l.jpg
Working IP Router

Objectives

Become familiar with Stanford Reference Router

Observe PW-OSPF re-routing traffic around a failure

Demo 1


Step 1 observe the routing tables l.jpg
Step 1 – Observe the Routing Tables

The router is already configured and running on your machines

The routing table has converged to the routing decisions with minimum number of hops

Next, break a link …

Demo 1


Step 2 dynamic re routing l.jpg
Step 2 - Dynamic Re-routing

Break the link between video server and video client

Routers re-route traffic around the broken link and video continues playing

Demo 1

.1.1

.4.1

.7.1

.10.1

.13.1

.16.1

.1.2

.4.2

.7.2

.10.2

.13.2

.16.2

.3.1

.6.2

.9.2

.12.2

.15.2

.17.1

.2.1

.3.2

.15.1

.6.1

.9.1

.12.1

.30.2

.5.1

.8.1

.11.1

.14.1

.18.1

.30.1

.26.1

.23.1

.18.2

.21.2

.27.2

.24.2

.20.1

.29.1

.24.1

.21.1

.27.1

.28.2

.25.2

.22.2

.19.2

.28.1

.25.1

.22.1

.19.1



Advanced uses of netfpga l.jpg
Advanced Uses of NetFPGA

  • Introduction on TCP and Buffer Sizes

  • Demonstrate

    • NetFPGA used for real time measurement

    • See TCP Saw tooth in real time


Buffer requirements in a router l.jpg
Buffer Requirements in a Router

Buffer size matters:

Small queues reduce delay

Large buffers are expensive

Theoretical tools predict requirements

Queuing theory

Large deviation theory

Mean field theory

Yet, there is no direct answer

Flows have a closed-loop nature

Question arises on whether focus should be on equilibrium state or transient state


Rule of thumb l.jpg

Universally applied rule-of-thumb:

A router needs a buffer size:

2T is the two-way propagation delay (or just 250ms)

C is capacity of bottleneck link

Context

Mandated in backbone and edge routers

Appears in RFPs and IETF architectural guidelines

Already known by inventors of TCP

[Van Jacobson, 1988]

Has major consequences for router design

Rule-of-thumb

Source

Destination

Router

C

2T


The story so far l.jpg
The Story So Far

# packets

at 10Gb/s

1,000,000

20

10,000

  • Assume: Large number of desynchronized flows; 100% utilization

  • Assume: Large number of desynchronized flows; <100% utilization


Exploring buffer sizes l.jpg
Exploring Buffer Sizes

Need to reduce buffer size and measure occupancy

Not possible in commercial routers

So, we will use the NetFPGA instead

Objective:

Use the NetFPGA to understand how large a buffer we need for a single TCP flow.


Why 2txc for a single tcp flow l.jpg
Why 2TxC for a single TCP Flow?

  • Rule for adjusting W

    • If an ACK is received: W ← W+1/W

    • If a packet is lost: W ← W/2

Only Wpackets may be outstanding

http://guido.appenzeller.net/anims/


Time evolution of a single tcp flow l.jpg
Time Evolution of a Single TCP Flow

Time evolution of a single TCP flow through a router. Buffer is 2T*C

Time evolution of a single TCP flow through a router. Buffer is < 2T*C


Demo 2 buffer sizing experiments using the netfpga router l.jpg

Demo 2Buffer Sizing Experimentsusing the NetFPGA Router


Hardware setup for demo 2 l.jpg
Hardware Setup for Demo #2

Demo 2

CPU x2

NIC

PCI-e

GE

GE

Net-FPGA

GE

Video

Client

PCI

Server deliversstreaming HD video

to adjacent client

Internet

Router

Hardware

GE

GE

GE

CPU x2

NIC

Video

Server

PCI-e

PCI-e

GE

GE


Topology l.jpg
Topology

Demo 2

  • eth1 connects your host to your NetFPGA Router

  • nf2c2 routes to nf2c1 (your adjacent server)

  • eth2 serves web and video traffic to your neighbor

  • nf2c0 & nf2c3 (the network ring) are unused

.2.1

.5.1

.8.1

.11.1

.4.1

.7.1

.10.1

.13.1

.1.2

.10.2

.1.1

.4.2

.7.2

.13.2

.2.2

.5.2

.8.2

.11.2

.14.2

.29.1

.29.2

.14.1

.26.2

.20.2

.23.2

.17.2

.19.2

.16.2

.28.2

.25.2

.22.2

.16.1

.28.1

.25.1

.22.1

.19.1

.26.1

.23.1

.20.1

.17.1

This configuration allows you to modify and test your router without affecting others


Enhanced router l.jpg
Enhanced Router

Demo 2

Objectives

  • Observe router with new modules

  • New modules: rate limiting, event capture

    Execution

  • Run event capture router

  • Look at routing tables

  • Explore details pane

  • Start tcp transfer, look at queue occupancy

  • Change rate, look at queue occupancy


Step 1 run pre made enhanced router l.jpg
Step 1 - Run Pre-made Enhanced Router

Start terminal and cd to “netfpga/projects/

tutorial_router/sw/”

Type “./tut_adv_router_gui.pl”

A familiar GUI should start

Demo 2


Step 2 explore enhanced router l.jpg
Step 2 - Explore Enhanced Router

Click on the Details tab

A similar pipeline to the one seen previously shown with some additions

Demo 2


Enhanced router pipeline l.jpg
Enhanced Router Pipeline

Two modules added

Event Capture to capture output queue events (writes, reads, drops)

Rate Limiter

to create a

bottleneck

Demo 2

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

Input Arbiter

Output Port Lookup

Event Capture

Output Queues

MAC

TxQ

CPU

TxQ

Rate

Limiter

CPU

TxQ

MAC

TxQ

CPU

TxQ

MAC

TxQ

CPU

TxQ

MAC

TxQ


Step 3 decrease the link rate l.jpg
Step 3 - Decrease the Link Rate

To create bottleneck and show the TCP “sawtooth,” link-rate is decreased.

In the Details tab, click the “Rate Limit” module

Check Enabled

Set link rate to 1.953Mbps

Demo 2


Step 4 decrease queue size l.jpg
Step 4 – Decrease Queue Size

Go back to the Details panel and click on “Output Queues”

Select the “Output Queue 2” tab

Change the output queue size in packets slider to 16

Demo 2


Step 5 start event capture l.jpg
Step 5 - Start Event Capture

Click on the Event Capture module under the Details tab

This should start the configuration page

Demo 2


Step 6 configure event capture l.jpg
Step 6 - Configure Event Capture

Check Send to local host to receive events on the local host

Check Monitor Queue 2 to monitor output queue of MAC port1

Check Enable Captureto start event capture

Demo 2


Step 7 start tcp transfer l.jpg
Step 7 - Start TCP Transfer

We will use iperf to run a

large TCP transfer and

look at queue evolution

Start a terminal and cd to

“netfpga/projects/tutorial_router/sw”

Type “./iperf.sh”

Demo 2


Step 8 look at event capture results l.jpg
Step 8 - Look at Event Capture Results

Click on the Event Capture module under the Details tab.

The sawtooth pattern should now be visible.

Demo 2


Queue occupancy charts l.jpg
Queue Occupancy Charts

Demo 2

Observe the TCP/IP sawtooth

Leave the control windows open



Integrated circuit technology l.jpg
Integrated Circuit Technology

Full-custom Design

Complementary Metal Oxide Semiconductor (CMOS)

Semi-custom ASIC Design

Gate array

Standard cell

Programmable Logic Device

Programmable Array Logic

Field Programmable Gate Arrays

Processors


Look up tables l.jpg
Look-Up Tables

Combinatorial logic is stored in Look-Up Tables (LUTs)

Also called Function Generators (FGs)

Capacity is limited only bynumber of inputs, not complexity

Delay through the LUT is constant

Combinatorial Logic

A

B

Z

C

D

Diagram From: Xilinx, Inc


Xilinx clb structure l.jpg

Each slice has four outputs

Two registered outputs, two non-registered outputs

Two BUFTs associated with each CLB, accessible by all 16 CLB outputs

Carry logic run vertically

Signals run upward

Two independent carry chains per CLB

Xilinx CLB Structure

LUT

LUT

PRE

D

Q

CE

CLR

PRE

D

Carry

Carry

Q

CE

CLR

Slice 0

Diagram From: Xilinx, Inc.


Field programmable gate arrays l.jpg
Field Programmable Gate Arrays

CLB

Primitive element of FPGA

Routing Module

Global routing

Local interconnect

Macro Blocks

Block Memories

Microprocessor

I/O Block


Netfpga package l.jpg
NetFPGA Package

  • Utilities

    • Simulation

    • Synthesis

    • Registers

  • Verilog Libraries (shared modules)

  • Projects (reference and contributed)


Simulation and synthesis l.jpg
Simulation and Synthesis

  • Simulation (nf_run_test.pl)

    • Allows simulation from command line or GUI

    • Uses backend libraries (Perl and Python) to create packets for simulation

  • Synthesis (make)

    • In the projects synth directory

    • Automatically includes Xilinx Coregen components from shared libraries

    • Includes all Xilinx Coregen components form a projects synth directory (.xco)


Shared verilog libraries modules l.jpg
Shared Verilog Libraries (modules)

  • Located at netfpga/lib/verilog

  • Specify shared libraries in project.xml

    • Any project can use any module

  • Local modules in a project’s src dir over rides a shared library

    • If arp_reply is found both in shared library and project’s src directory only the project’s src directory version is used


Register system l.jpg
Register System

  • Project XML (project.xml)

    • Found in project/include directory

    • Specifies shared libraries and location of registers in pipeline

  • Each module with registers has an XML file

    • Specifies the register names and widths

  • Register files automatically created using nf_register_gen.pl

    • Perl header files

    • C header files

    • Verilog file defining registers


Reference projects l.jpg
Reference Projects

  • Easily extend and add modules

  • Currently

    • Reference NIC

    • Reference Router

    • Reference Switch

    • Router KIT

    • Router Buffer Sizing


Full system components l.jpg
Full System Components

MAC

TxQ

MAC

TxQ

MAC

TxQ

CPU

RxQ

CPU

RxQ

CPU

RxQ

CPU

RxQ

MAC

TxQ

Java GUI

SCONE

PW-OSPF

Software

Driver

nf2c0

nf2c1

nf2c2

nf2c3

ioctl

PCI Bus

CPU

TxQ

CPU

TxQ

CPU

TxQ

nf2_reg_grp

CPU

TxQ

DMA

Registers

NetFPGA

user data path

MAC

RxQ

MAC

RxQ

MAC

RxQ

MAC

RxQ

Ethernet


Reference router pipeline l.jpg
Reference Router Pipeline

Five stages

Input

Input arbitration

Routing decision and packet modification

Output queuing

Output

Packet-based module interface

Pluggable design

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

Input Arbiter

Output Port Lookup

Output Queues

MAC

TxQ

CPU

TxQ

MAC

TxQ

CPU

TxQ

MAC

TxQ

CPU

TxQ

MAC

TxQ

CPU

TxQ



Life of a packet through the hardware l.jpg
Life of a Packet through the Hardware

192.168.2.y

192.168.1.x

port0

port2

IP packet


Inter module communication l.jpg
Inter-Module Communication

Using “Module Headers”:

Ctrl Word

(8 bits)

Data Word

(64 bits)

x

Module Hdr

Contain information such as packet length, input port, output port, …

y

Last Module Hdr

0

Eth Hdr

0

IP Hdr

0

0x10

Last word of packet


Inter module communication73 l.jpg
Inter-Module Communication

Module i

Module i+1

data

ctrl

wr

rdy


Mac rx queue l.jpg
MAC Rx Queue

MAC Rx Queue

Eth Hdr:

Dst MAC = port 0, Ethertype = IP

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

Data


Rx queue l.jpg
Rx Queue

Rx Queue

0xff

Pkt length,

input port = 0

0

Eth Hdr:

Dst MAC = port 0, Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

0

Data


Input arbiter l.jpg
Input Arbiter

Rx

Q 7

Input Arbiter

Pkt

Rx

Q 1

Pkt

Rx

Q 0

Pkt


Output port lookup l.jpg
Output Port Lookup

Output Port Lookup

0xff

Pkt length,

input port = 0

0

EthHdr: Dst MAC = 0

Src MAC = x,

Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

0

Data


Output port lookup78 l.jpg
Output Port Lookup

5- Add output port header

1- Check input port matches Dst MAC

Output Port Lookup

0xff

Pkt length,

input port = 0

Pkt length,

input port = 0

output port = 4

6- Modify MAC Dst and Src addresses

2- Check TTL, checksum

0

EthHdr: Dst MAC = 0

Src MAC = x,

Ethertype = IP

EthHdr: Dst MAC = nextHopSrc MAC = port 4,

Ethertype = IP

3- Lookup next hop IP & output port (LPM)

0

7-Decrement TTL and update checksum

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

IP Hdr:

IP Dst: 192.168.2.3, TTL: 63, Csum:0x3ac2

4- Lookup next hop MAC address (ARP)

0

Data


Output queues l.jpg
Output Queues

Output Queues

OQ0

OQ4

Pkt

OQ7


Mac tx queue l.jpg
MAC Tx Queue

0xff

Pkt length,

input port = 0

output port = 4

0

EthHdr: Dst MAC = nextHopSrc MAC = port 4,

Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

IP Hdr:

IP Dst: 192.168.2.3, TTL: 63, Csum:0x3ac2

0

Data

MAC Tx Queue


Mac tx queue81 l.jpg
MAC Tx Queue

MAC Tx Queue

0xff

Pkt length,

input port = 0

output port = 4

0

EthHdr: Dst MAC = nextHopSrc MAC = port 4,

Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

IP Hdr:

IP Dst: 192.168.2.3, TTL: 63, Csum:0x3ac2

0

Data


Exception packet l.jpg
Exception Packet

Example: TTL = 0 or TTL = 1

Packet has to be sent to the CPU which will generate an ICMP packet as a response

Difference starts at the Output Port lookup stage


Exception packet path l.jpg
Exception Packet Path

CPU

RxQ

CPU

RxQ

CPU

RxQ

CPU

RxQ

MAC

TxQ

MAC

TxQ

MAC

TxQ

MAC

TxQ

PW-OSPF

SCONE

Java GUI

Software

Driver

nf2c0

nf2c1

nf2c2

nf2c3

ioctl

PCI Bus

DMA

Registers

CPU

TxQ

CPU

TxQ

CPU

TxQ

CPU

TxQ

nf2_reg_grp

NetFPGA

user data path

MAC

RxQ

MAC

RxQ

MAC

RxQ

MAC

RxQ

Ethernet


Output port lookup84 l.jpg
Output Port Lookup

1- Check input port matches Dst MAC

Output Port Lookup

Pkt length,

input port = 0

output port = 1

0xff

Pkt length,

input port = 0

2- Check TTL, checksum – EXCEPTION!

0

EthHdr: Dst MAC = 0,

Src MAC = x,

Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 1, Csum:0x3ab4

3- Add output port module

0

Data


Output queues85 l.jpg
Output Queues

Output Queues

OQ0

OQ1

OQ2

Pkt

OQ7


Cpu tx queue l.jpg
CPU Tx Queue

0xff

Pkt length,

input port = 0

output port = 1

0

EthHdr: Dst MAC = 0,

Src MAC = x,

Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 64, Csum:0x3ab4

IP Hdr:

IP Dst: 192.168.2.3, TTL: 1, Csum:0x3ab4

0

Data

CPU Tx Queue


Cpu tx queue87 l.jpg
CPU Tx Queue

CPU Tx Queue

0xff

Pkt length,

input port = 0

output port = 1

0

EthHdr: Dst MAC = 0,

Src MAC = x,

Ethertype = IP

0

IP Hdr:

IP Dst: 192.168.2.3, TTL: 1, Csum:0x3ab4

0

Data


Icmp packet l.jpg
ICMP Packet

For the ICMP packet, the packet arrives at the CPU Rx Queue from the PCI Bus

It follows the same path as a packet from the MAC until it reaches the Output Port Lookup

The OPL module sees the packet is from the CPU Rx Queue 1 and sets the output port directly to 0

The packet then continues on the same path as the non-exception packet to the Output Queues and then MAC Tx queue 0


Icmp packet path l.jpg
ICMP Packet Path

CPU

RxQ

CPU

RxQ

CPU

RxQ

CPU

RxQ

MAC

TxQ

MAC

TxQ

MAC

TxQ

MAC

TxQ

SCONE

Java GUI

PW-OSPF

Software

Driver

nf2c0

nf2c1

nf2c2

nf2c3

ioctl

PCI Bus

DMA

Registers

CPU

TxQ

CPU

TxQ

CPU

TxQ

CPU

TxQ

nf2_reg_grp

NetFPGA

user data path

MAC

RxQ

MAC

RxQ

MAC

RxQ

MAC

RxQ

Ethernet


Netfpga host interaction l.jpg
NetFPGA-Host Interaction

Linux driver interfaces with hardware

Packet interface via standard Linux network stack

Register reads/writes via ioctl system call with wrapper functions:

readReg(nf2device *dev, int address, unsigned *rd_data);

writeReg(nf2device *dev, int address, unsigned *wr_data);

eg:

readReg(&nf2, OQ_NUM_PKTS_STORED_0, &val);


Netfpga host interaction91 l.jpg
NetFPGA-Host Interaction

1. Packet arrives – forwarding table sends to CPU queue

2. Interrupt notifies driver of packet arrival

3. Driver sets up and initiates DMA transfer

NetFPGA to host packet transfer

PCI Bus


Netfpga host interaction92 l.jpg
NetFPGA-Host Interaction

5. Interrupt signals completion of DMA

4. NetFPGA transfers packet via DMA

NetFPGA to host packet transfer (cont.)

PCI Bus

6. Driver passes packet to network stack


Netfpga host interaction93 l.jpg
NetFPGA-Host Interaction

3. Interrupt signals completion of DMA

2. Driver sets up and initiates DMA transfer

Host to NetFPGA packet transfers

PCI Bus

1. Software sends packet via network sockets Packet delivered to driver


Netfpga host interaction94 l.jpg
NetFPGA-Host Interaction

2. Driver performs PCI memory read/write

Register access

PCI Bus

1. Software makes ioctl call on network socket ioctl passed to driver


Netfpga host interaction95 l.jpg
NetFPGA-Host Interaction

Packet transfers shown using DMA interface

Alternative: use programmed IO to transfer packets via register reads/writes

slower but eliminates the need to deal with network sockets



Drop 1 in n packets l.jpg
Drop 1 in N Packets

Objectives

Add counter and FSM to the code

Synthesize and test router

Execution

Open drop_nth_packet.v

Insert counter code

Synthesize

After synthesis, test the new system.

Exercise 1


New reference router pipeline l.jpg
New Reference Router Pipeline

One module added

Drop Nth Packet to drop every Nth packet from the reference router pipeline

Exercise 1

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

MAC

RxQ

CPU

RxQ

Input Arbiter

Output Port Lookup

Event Capture

Drop Nth Packet

Output Queues

MAC

TxQ

CPU

TxQ

Rate

Limiter

CPU

TxQ

MAC

TxQ

CPU

TxQ

MAC

TxQ

CPU

TxQ

MAC

TxQ


Step 1 open the source l.jpg
Step 1 - Open the Source

We will modify the Verilog

source code to add a

counter to the

drop_nth_packet

module

Open terminal

Type “xemacs netfpga/projects/tutorial_router/src/drop_nth_packet.v

Exercise 1


Step 2 add counter to module l.jpg
Step 2 - Add Counter to Module

Add counter using the following signals:

counter –16 bit output signal that you should increment on each packet pulse

rst_counter – reset signal (a pulse input)

inc_counter – increment (a pulse input)

Search for insert counter (ctrl+s insert counter, Enter)

Insert counter and save

(ctrl+x+s)

Exercise 1


Step 3 build the hardware l.jpg
Step 3 - Build the Hardware

Start terminal, cd to “netfpga/projects/tutorial_router/synth”

Run “make clean”

Start synthesis with “make”

Exercise 1


Step 5 test your router l.jpg
Step 5 – Test your Router

You can watch the number of received and sent packets to watch the module drop every Nth packet. Ping a local machine (i.e. 192.168.7.1) and watch for missing pings

To run your router:

1- Enter the directory by typing:cd netfpga/projects/tutorial_router/sw

2- Run the router by typing:./tut_adv_router_gui.pl --use_bin ../../../bitfiles/tutorial_router.bit

To set the value of N (which packet to drop)

type regwrite 0x2000704 N

– replace N with a number (such as 100)

To enable packet dropping, type: To disable packet dropping, type:

regwrite 0x2000700 0x1regwrite 0x2000700 0x0

Exercise 1


Step 5 measurements l.jpg
Step 5 – Measurements

Determine iperf TCP throughput to neighbor’s server for each of several values of N

Similar to Demo 2, Step 8

cd netfpga/projects/tutorial_router/sw

./iperf.sh

Ping 192.168.x.2 (where x is your neighbor’s server)

TCP throughput with:

Drop circuit disabled

TCP Throughput = ________ Mbps

Drop one in N = 1,000 packets

TCP Throughput = ________ Mbps

Drop one in N = 100 packets

TCP Throughput = ________ Mbps

Drop one in N = 10 packets

TCP Throughput = ________ Mbps

Explain why TCPs throughput is so low given that only a tiny fraction of packets are lost

Exercise 1



Netfpgas are used l.jpg
NetFPGAs are used:

To run laboratory courses on network routing

Professors teach courses (CS344, Workshops, ..)

To teach students how to build real Internet routers

Train students to build routers (Cisco, Juniper, Huawei, .. )

To research how new features in the network

Build network services for data centers (Google, UCSD.. )

To prototype systems with live traffic

That Buffer measurement (while maintaining throughput, ..)

To help hardware vendors understand device requirements

Use of hardware (Xilinx, Micron, Cypress, Broadcom, ..)


Running the router kit user space development 4x1ge line rate forwarding l.jpg
Running the Router KitUser-space development, 4x1GE line-rate forwarding

OSPF

BGP

My Protocol

user

kernel

Routing

Table

“Mirror”

1GE

FPGA

Fwding

Table

Packet

Buffer

1GE

1GE

1GE

1GE

IPv4

Router

1GE

Memory

1GE

1GE

Usage #1

CPU

Memory

PCI


Enhancing modular reference designs l.jpg
Enhancing Modular Reference Designs

Verilog

EDA Tools

(Xilinx,

Mentor, etc.)

NetFPGA Driver

  • Design

  • Simulate

  • Synthesize

  • Download

1GE

L3

Parse

L2

Parse

In Q

Mgmt

1GE

1GE

IP

Lookup

Out Q

Mgmt

1GE

Verilog modules interconnected by FIFO interfaces

Usage #2

PW-OSPF

CPU

Memory

Java GUI

Front Panel

(Extensible)

PCI

1GE

FPGA

1GE

1GE

My

Block

Memory

1GE


Creating new systems l.jpg
Creating new systems

Verilog

EDA Tools

(Xilinx,

Mentor, etc.)

  • Design

  • Simulate

  • Synthesize

  • Download

NetFPGA Driver

1GE

My Design

(1GE MAC is soft/replaceable)

1GE

1GE

1GE

Usage #3

CPU

Memory

PCI

1GE

FPGA

1GE

1GE

Memory

1GE


Netfpga platform l.jpg
NetFPGA Platform

Major Components

Interfaces

4 Gigabit Ethernet Ports

PCI Host Interface

Memories

36Mbits Static RAM

512Mbits DDR2 Dynamic RAM

FPGA Resources

Block RAMs

Configurable Logic Block (CLBs)

Memory Mapped Registers


Netfpga cube systems l.jpg
NetFPGA Cube Systems

PCs assembled from parts

Stanford University

Cambridge University

Pre-built systems available

Accent Technology Inc.

Details are in the Guide

http://netfpga.org/static/guide.html


Rackmount netfpga servers l.jpg
Rackmount NetFPGA Servers

NetFPGA inserts in PCI or PCI-X slot

2U Server (Dell 2950)

1U Server

(Accent Technology Inc.)

Thanks: Brian Cashman for providing machine


Stanford netfpga cluster l.jpg
Stanford NetFPGA Cluster

  • Statistics

  • Rack of 40

    • 1U PCs with NetFPGAs

  • Manged

    • Power

    • Console

    • LANs

  • Provides 4*40=160 Gbps of full line-rate processing bandwidth


Acknowledgments l.jpg
Acknowledgments

NetFPGA Team at Stanford University (Past and Present):

Nick McKeown, Glen Gibb,Jad Naous, David Erickson,

G. Adam Covington, John W. Lockwood, Jianying Luo,Brandon Heller, Paul Hartke, Neda Beheshti, Sara Bolouki, James Zeng,

Jonathan Ellithorpe, Sachidanandan Sambandan


Special thanks to our partners l.jpg
Special thanks to our Partners:

Patrick Lysaght, Veena Kumar, Paul Hartke, Anna Acevedo

Xilinx University Program (XUP)

Other NetFPGA Tutorial Presented At:

SIGMETRICS

See: http://NetFPGA.org/tutorials/


Thanks to our sponsors l.jpg
Thanks to our Sponsors:

Support for the NetFPGA project has been provided by the following companies and institutions

Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these materials do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project.


ad