EE384: Packet Switch Architectures

Winter 2012, Lecture 7: Packet Buffers

Sundar Iyer

The Problem
  • All packet switches (e.g. Internet routers, Ethernet switches) require packet buffers for periods of congestion.
  • Size: a commonly used “rule of thumb” says that buffers need to hold one RTT (about 0.25 s) of data. Even if this could be reduced to 10 ms, a 4×10Gb/s linecard would require 400 Mbits of buffering.
  • Speed: clearly, the buffer needs to store (retrieve) packets as fast as they arrive (depart). At 4×10Gb/s, a minimum-sized packet can arrive, and another depart, every 8 ns.
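The size and speed figures above can be checked with simple arithmetic; all the numbers come from the slide itself:

```python
# Back-of-the-envelope check of the sizing and timing claims above.

line_rate_gbps = 4 * 10              # 4 x 10 Gb/s linecard

# Size: Gb/s x ms = Mbit, so 10 ms of buffering at line rate is:
buffer_ms = 10
buffer_mbits = line_rate_gbps * buffer_ms
print(buffer_mbits)                  # -> 400

# Speed: a minimum-sized 40-byte packet at 40 Gb/s.
# Gb/s is the same as bits per nanosecond.
packet_bits = 40 * 8
packet_ns = packet_bits / line_rate_gbps
print(packet_ns)                     # -> 8.0
```

With one arrival and one departure every 8 ns, the memory sees one access every 4 ns, which is the constraint developed on the next slide.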
An Example: Packet buffers for a 40Gb/s linecard

(Figure: a buffer manager sits between the line and the buffer memory. One 40B packet is written every 8 ns at rate R, and one 40B packet is read every 8 ns at rate R in response to unpredictable scheduler requests. The memory therefore needs to be accessed, for a write or a read, every 4 ns.)

Memory Operations Per Second (MOPS)

What is MOPS?

  • The number of unique memory operations per second
    • Refers to the speed of the address (not the data) bus
  • The inverse of the random access time

Examples

  • SRAM with 4ns access time = 250M MOPS
  • DRAM with 50 ns access time = 20M MOPS
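The two examples follow directly from MOPS being the inverse of the random access time:

```python
# MOPS as the inverse of random access time, reproducing the two
# examples on the slide.

def mops(access_time_ns: float) -> float:
    """Unique memory operations per second, in millions (M MOPS)."""
    return 1e9 / access_time_ns / 1e6

print(mops(4))    # SRAM, 4 ns access time  -> 250.0 M MOPS
print(mops(50))   # DRAM, 50 ns access time -> 20.0 M MOPS
```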
Memory Technology

Use SRAM?

+ Fast enough random access time, but

  • Low density, high cost, high power.

Use DRAM?

+ High density means we can store a lot of data cheaply, but

  • Can’t meet random access time.
The Problem: No single memory technology is a good match

  Technology          MOPS    Cost        Per-pin bandwidth
  SRAM (S)            800M    $1 per Mb   800 Mb/s
  FCRAM/RLDRAM (F)    50M     4c per Mb   1000 Mb/s
  XDRAM (X)           25M     2c per Mb   3200 Mb/s
  DDR3 (D)            25M     1c per Mb   1600 Mb/s

Ideal would be to have the accesses/s of SRAM with the cost and density of DRAM.

Sol 1: Can’t we just use lots of DRAMs as separate memories in parallel?

(Figure: the buffer manager stripes packets over eight separate buffer memories, reading or writing one 40B packet every 4 ns from a different ‘32ns access time’ memory.)

Solution

  • Write 40B packets to available banks
  • Read 40B packets from specified banks

Problem

  • What if back-to-back reads occur from a small number of banks?
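The failure mode in the last bullet can be sketched numerically. This is a constructed illustration (the request count is arbitrary); the 32 ns access time and 8 ns request interval come from the slides above:

```python
# Back-to-back reads all hitting one 32 ns bank, while the scheduler
# issues a read every 8 ns: the bank serves one request per 32 ns but
# four arrive in that time, so it falls further and further behind.

ACCESS_NS, REQ_NS = 32, 8
n_requests = 100

bank_free_at = 0        # time at which the bank can start its next read
worst_lateness = 0
for i in range(n_requests):
    arrival = i * REQ_NS
    start = max(arrival, bank_free_at)   # wait if the bank is busy
    bank_free_at = start + ACCESS_NS
    # "On time" would be completing one access time after arrival.
    worst_lateness = max(worst_lateness, bank_free_at - (arrival + ACCESS_NS))

print(worst_lateness)   # -> 2376 ns behind schedule by the 100th read
```

The backlog grows without bound (24 ns per request here), which is why availability of banks cannot be left to chance.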
Sol 2: Can’t we just use lots of DRAMs as one monolithic memory in parallel?

(Figure: the buffer manager still writes one 40B packet every 8 ns and reads one 40B packet every 8 ns at rate R, but treats the eight memories as a single wide memory, reading or writing 320B every 32 ns: bytes 0–39 go to the first memory, 40–79 to the second, and so on up to 280–319.)

Sol 2: Works fine if there is only one FIFO

(Figure: with a single FIFO, the buffer manager collects arriving 40B packets into 320B blocks before writing them to the slow buffer memory, and reads 320B blocks back, streaming them out as one 40B packet every 8 ns. Writes and reads both run at rate R.)

Sol 2: Works fine if there is only one FIFO, and supports variable-length packets

(Figure: the same arrangement with variable-length arrivals and departures (‘?B’): packets of any length are packed into 320B blocks before being written and unpacked after being read, so a single FIFO still works.)

Sol 2: In practice, the buffer holds many FIFOs

(Figure: the buffer memory now holds Q separate FIFOs of 320B blocks, numbered 1, 2, …, Q, where Q might be 1k – 64k. How can we write multiple variable-length packets into different queues?)

Problem

A block contains packets for different queues, which must be written to, or read from, different memory locations.
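To make the difficulty concrete, here is a small back-of-the-envelope sketch; the round-robin arrival pattern and Q = 1000 are constructed for this note, while the 40B/320B/8 ns figures come from the slides:

```python
# If 40B packets arrive every 8 ns cycling round-robin over Q queues,
# a 320B block for any single queue fills only after 8 of *its*
# packets arrive, and meanwhile every queue holds a partial block.

PKT_B, BLOCK_B, PKT_NS = 40, 320, 8
Q = 1000

pkts_per_block = BLOCK_B // PKT_B                # -> 8 packets per block
ns_to_fill_one_queue = pkts_per_block * Q * PKT_NS
partial_block_bytes = Q * (BLOCK_B - PKT_B)      # up to 7 packets held per queue

print(ns_to_fill_one_queue / 1e3)   # -> 64.0 microseconds per block, per queue
print(partial_block_bytes / 1e3)    # -> 280.0 kB of partial blocks to hold
```

So with many queues, the wide-memory trick alone forces the buffer manager to stage a large amount of partially filled block state somewhere fast, which motivates the hybrid hierarchy that follows.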

Sol 3: Hybrid Memory Hierarchy

(Figure: arriving packets and departing packets, each at rate R, pass through a packet processor backed by a small fast cache (SRAM) in front of a big slow memory (DRAM), with a small probability of a cache miss.)

A CPU cache is probabilistic.

Q: Why is randomness a problem in this context?

Sol 4: Hybrid Memory Hierarchy with 100% Cache Hit Rate

(Figure: a large DRAM holds the body of each of the Q FIFOs. A small SRAM cache holds the FIFO tails: arriving packets enter at rate R and are written to DRAM b bytes at a time. A second small SRAM cache holds the FIFO heads: data is read from DRAM b bytes at a time, and departing packets leave at rate R in response to unpredictable scheduler requests.)
Design questions
  • What is the minimum SRAM needed to guarantee that a byte is always available in SRAM when requested?
  • What algorithm minimizes the SRAM size?
An Example: Q = 5, w = 9+, b = 6

(Figure: the SRAM head cache for the five queues at t = 0 through t = 7, with bytes being read out and ‘Replenish’ events marking the arrival of b = 6 bytes into one queue.)
An Example: Q = 5, w = 9+, b = 6 (continued)

(Figure: the five queues at t = 8 through t = 12, with further ‘Replenish’ events and reads continuing at t = 13, 19 and 23.)
The size of the SRAM cache

Necessity

  • How large does the SRAM cache need to be, under any management algorithm?
  • Claim: wQ > Q(b − 1)(2 + ln Q)

Sufficiency

  • For any pattern of arrivals, what is the smallest SRAM cache needed so that a byte is always available when requested?
  • For one particular algorithm: wQ = Qb(2 + ln Q)

(Figure: Q FIFOs, each of width w bytes, in the SRAM cache.)
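Plugging the parameters of the linecard examples used later (b = 640, Q = 128 and b = 2560, Q = 512) into the sufficiency bound reproduces the quoted SRAM sizes:

```python
import math

def sram_bytes(Q: int, b: int) -> float:
    """Sufficient SRAM cache size Qw, with w = b(2 + ln Q), in bytes."""
    return Q * b * (2 + math.log(Q))

print(sram_bytes(128, 640) / 1e3)    # ~561 kB  (slide says 560 kBytes)
print(sram_bytes(512, 2560) / 1e6)   # ~10.8 MB (slide says 10 MBytes)
```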

Definitions

Occupancy, X(q, t): the number of bytes in FIFO q (in SRAM) at time t.

Deficit: D(q, t) = w − X(q, t)

(Figure: Q FIFOs of width w; the filled part of each FIFO is its occupancy, the unfilled part its deficit.)
Smallest SRAM cache

In addition, each queue needs to hold (b − 1) bytes, in case it is replenished with b bytes when only one byte has been removed.

Therefore, the SRAM size must satisfy Qw > Q(b − 1)(2 + ln Q).

Most Deficit Queue First
  • Algorithm: every b timeslots, replenish the queue with the largest deficit.
  • Claim: an SRAM cache of size Qw > Qb(2 + ln Q) is sufficient.

Examples:

  • 40Gb/s linecard, b = 640, Q = 128: SRAM = 560 kBytes
  • 160Gb/s linecard, b = 2560, Q = 512: SRAM = 10 MBytes
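A toy simulation of MDQF, constructed for this note with a random request pattern (not a worst-case adversary, and with arbitrary Q, b and run length), stays comfortably under the claimed b(2 + ln Q) per-queue deficit bound:

```python
# Toy MDQF simulation: one byte is requested per timeslot from a
# randomly chosen queue; every b timeslots the most deficited queue
# is replenished with b bytes.
import math
import random

random.seed(0)
Q, b, T = 64, 8, 100_000      # queues, block size, timeslots (arbitrary)

deficit = [0] * Q             # bytes requested but not yet replenished
max_deficit = 0

for t in range(1, T + 1):
    q = random.randrange(Q)   # scheduler reads one byte from queue q
    deficit[q] += 1
    max_deficit = max(max_deficit, deficit[q])
    if t % b == 0:            # MDQF: replenish the largest deficit
        worst = max(range(Q), key=deficit.__getitem__)
        deficit[worst] = max(0, deficit[worst] - b)

print("max deficit seen:", max_deficit)
print("bound b(2 + ln Q):", round(b * (2 + math.log(Q)), 1))
```

The theorem's point is stronger than this experiment: the bound holds for *every* request pattern, not just random ones.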
Intuition for Theorem
  • The maximum number of un-replenished requests for any i queues, w_i, is the solution of a difference equation with boundary conditions; solving it yields the b(2 + ln Q) bound on w claimed above.