High Performance Networking for ALL
Presentation Transcript



MB - NG

High Performance Networking for ALL

Members of GridPP are in many Network collaborations including:

Close links with: SLAC, UKERNA, SURFNET and other NRNs, Dante, Internet2, Starlight, Netherlight, GGF, RIPE, industry, …

UKLIGHT

GridPP Meeting Edinburgh 4-5 Feb 04

R. Hughes-Jones Manchester


Network Monitoring [1]

  • DataGrid WP7 code extended by Gareth (Manchester)

  • Technology transfer to UK e-Science

  • Developed by Mark Lees (DL)

  • Fed back into DataGrid by Gareth

  • Links to GGF NM-WG, Dante, Internet2: characteristics, schema & web services

  • Success

[Diagram: monitoring architecture]




Network Monitoring [2]

  • 24 Jan to 4 Feb 04: TCP iperf, RAL to HEP sites – only 2 sites >80 Mbit/s

  • 24 Jan to 4 Feb 04: TCP iperf, DL to HEP sites

HELP!




High bandwidth, long distance… Where is my throughput?

Robin Tasker

CCLRC, Daresbury Laboratory, UK

[[email protected]]

DataTAG is a project sponsored by the European Commission - EU Grant IST-2001-32459

RIPE-47, Amsterdam, 29 January 2004



Throughput… What’s the problem?

One terabyte of data transferred in less than an hour.

On February 27-28 2003, the transatlantic DataTAG network was extended, i.e. CERN – Chicago – Sunnyvale (>10,000 km). For the first time, a terabyte of data was transferred across the Atlantic in less than one hour using a single TCP (Reno) stream. The transfer ran from Sunnyvale to Geneva at a rate of 2.38 Gbit/s.
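
A quick sanity check on those headline numbers: 1 Tbyte = 8 × 10¹² bits, and 8 × 10¹² bits / 2.38 × 10⁹ bit/s ≈ 3,360 s, i.e. about 56 minutes – consistent with "less than one hour".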




Internet2 Land Speed Record

On October 1 2003, DataTAG set a new Internet2 Land Speed Record by transferring 1.1 Terabytes of data in less than 30 minutes from Geneva to Chicago across the DataTAG provision, corresponding to an average rate of 5.44 Gbits/s using a single TCP (Reno) stream




So how did we do that?

Management of the end-to-end connection:

  • Memory-to-memory transfer; no disk system involved

  • Processor speed and system bus characteristics

  • TCP configuration – window size and frame size (MTU); see the socket-buffer sketch after this list

  • Network Interface Card, its driver, and their configuration

  • End-to-end “no loss” environment from CERN to Sunnyvale!

  • At least a 2.5 Gbit/s capacity pipe on the end-to-end path

  • A single TCP connection on the end-to-end path

  • No real user application

That's to say - not the usual user experience!
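
As a concrete illustration of the window-size knob above, a minimal sketch of setting a socket's buffer sizes; the 1 Mbyte figure is an assumed value, not the one used in the record run, and real tuning must match the path's bandwidth-delay product and the kernel's configured limits:

    /* Sketch: request larger TCP socket buffers (the "window" knob). */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        int bufsize = 1 << 20;                  /* 1 Mbyte, assumed value */

        /* request larger send/receive buffers; the kernel may clamp them */
        setsockopt(s, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize));
        setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

        socklen_t len = sizeof(bufsize);        /* read back the result */
        getsockopt(s, SOL_SOCKET, SO_SNDBUF, &bufsize, &len);
        printf("effective send buffer: %d bytes\n", bufsize);
        /* (Linux reports roughly double the requested value here) */
        return 0;
    }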




Realistically – what’s the problem & why do network research?

End system issues:

  • Network Interface Card and driver, and their configuration

  • TCP and its configuration

  • Operating system and its configuration

  • Disk system

  • Processor speed

  • Bus speed and capability

Network infrastructure issues:

  • Obsolete network equipment

  • Configured bandwidth restrictions

  • Topology

  • Security restrictions (e.g., firewalls)

  • Sub-optimal routing

  • Transport protocols

Network capacity and the influence of others!

  • Many, many TCP connections

  • Mice and elephants on the path

  • Congestion




End Hosts: Buses, NICs and Drivers

Measurements: throughput, latency, bus activity.

  • Use UDP packets to characterise the Intel PRO/10GbE Server Adapter (a minimal probe sketch follows):

    • SuperMicro P4DP8-G2 motherboard

    • Dual Xeon 2.2 GHz CPUs

    • 400 MHz system bus

    • 133 MHz PCI-X bus
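
For flavour, a userspace sketch of the kind of UDP throughput probe these measurements rely on (UDPmon-style); the destination address, port and packet count are placeholders, not the actual test setup:

    /* Sketch: blast fixed-size UDP datagrams and report the send rate.
       Compile with -lrt on older systems for clock_gettime. */
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    int main(void)
    {
        const int npkts = 100000, psize = 1472;  /* 1472 B fits a 1500 B MTU */
        char buf[1472];
        memset(buf, 0xAA, sizeof(buf));

        int s = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in dst = { .sin_family = AF_INET,
                                   .sin_port = htons(5001) };   /* assumed port */
        inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);         /* test address */

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < npkts; i++)          /* send back-to-back packets */
            sendto(s, buf, psize, 0, (struct sockaddr *)&dst, sizeof(dst));
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("sent %.1f Mbit/s\n", npkts * psize * 8.0 / secs / 1e6);
        return 0;
    }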




End Hosts: Understanding NIC Drivers

  • Linux driver basics – TX:

    • Application makes a system call

    • Data is encapsulated in UDP/TCP and IP headers

    • Packet is enqueued on the device send queue

    • Driver places information in a DMA descriptor ring

    • NIC reads the data from main memory via DMA and sends it on the wire

    • NIC signals to the processor that the TX descriptor has been sent

  • Linux driver basics – RX:

    • NIC places data in main memory via DMA to a free RX descriptor

    • NIC signals that the RX descriptor has data

    • Driver passes the frame to the IP layer and cleans the RX descriptor

    • IP layer passes the data to the application

  • Linux NAPI driver model (mocked up in the sketch after this list):

    • On receiving a packet, the NIC raises an interrupt

    • Driver switches off RX interrupts and schedules an RX DMA ring poll

    • Frames are pulled off the DMA ring and processed up to the application

    • When all frames are processed, RX interrupts are re-enabled

    • Dramatic reduction in RX interrupts under load

    • Improving the performance of a Gigabit Ethernet driver under Linux:

      http://datatag.web.cern.ch/datatag/papers/drafts/linux_kernel_map/
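
To make the NAPI flow concrete, here is a small userspace mock of the pattern – this is not kernel code: the interrupt handler masks RX interrupts and schedules a poll, and the poll drains the ring under a budget before re-enabling interrupts. Ring size, budget and frame numbering are invented for illustration:

    #include <stdbool.h>
    #include <stdio.h>

    #define RING_SIZE   16
    #define POLL_BUDGET 4

    static int  ring[RING_SIZE];           /* stand-in for the RX DMA ring */
    static int  head = 0, count = 0;
    static bool rx_irq_enabled = true;
    static bool poll_scheduled = false;

    static void rx_interrupt(void)         /* NIC raised an RX interrupt */
    {
        rx_irq_enabled = false;            /* switch off RX interrupts   */
        poll_scheduled = true;             /* schedule the RX ring poll  */
    }

    static void napi_poll(void)            /* process up to POLL_BUDGET frames */
    {
        int budget = POLL_BUDGET;
        while (count > 0 && budget-- > 0) {/* pull frames off the ring */
            printf("deliver frame %d to the IP layer\n", ring[head]);
            head = (head + 1) % RING_SIZE;
            count--;
        }
        if (count == 0) {                  /* ring drained               */
            rx_irq_enabled = true;         /* re-enable RX interrupts    */
            poll_scheduled = false;
        }                                  /* else: stay scheduled, no   */
    }                                      /* further interrupts needed  */

    int main(void)
    {
        for (int i = 0; i < 10; i++)       /* NIC DMAs ten frames into ring */
            ring[(head + count++) % RING_SIZE] = i;

        if (rx_irq_enabled)
            rx_interrupt();                /* only the first frame interrupts */
        while (poll_scheduled)
            napi_poll();                   /* kernel keeps polling under load */
        return 0;
    }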




Protocols: TCP (Reno) – Performance

  • AIMD and high bandwidth – long distance networks

    Poor performance of TCP in high-bandwidth wide area networks is due in part to the TCP congestion control algorithm:

    • For each ACK in an RTT without loss:

      cwnd → cwnd + a/cwnd (Additive Increase, a = 1)

    • For each window experiencing loss:

      cwnd → cwnd − b·cwnd (Multiplicative Decrease, b = 1/2)
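
To see the scale of the problem, a standard back-of-envelope example (the path parameters are illustrative, not from the slide): filling a 10 Gbit/s, 100 ms RTT path with 1500-byte packets needs cwnd ≈ (10¹⁰ bit/s × 0.1 s) / (1500 × 8 bit) ≈ 83,000 packets. A single loss halves that, and additive increase then restores roughly one packet per RTT, so recovery takes ~41,500 RTTs – over an hour of loss-free transmission at 100 ms per RTT.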




Protocols: HighSpeed TCP & Scalable TCP

  • Adjusting the AIMD algorithm – TCP Reno:

    • For each ACK in an RTT without loss:

      cwnd → cwnd + a/cwnd (Additive Increase, a = 1)

    • For each window experiencing loss:

      cwnd → cwnd − b·cwnd (Multiplicative Decrease, b = 1/2)

  • High Speed TCP

    a and b vary depending on the current cwnd, where

    • a increases more rapidly with larger cwnd, so the connection returns to the ‘optimal’ cwnd size for the network path sooner; and

    • b decreases less aggressively, so cwnd, and hence throughput, does not fall as far on loss.

  • Scalable TCP

    a and b are fixed adjustments for the increase and decrease of cwnd, such that the increase is greater than TCP Reno's and the decrease on loss is smaller (a toy simulation follows).
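
A toy simulation of the two update rules; Scalable TCP's constants (a = 0.01 per ACK, b = 0.125) are those from Kelly's paper, while the starting window and the loss every 50 RTTs are arbitrary illustrative choices:

    /* Sketch: cwnd evolution of TCP Reno versus Scalable TCP. */
    #include <stdio.h>

    int main(void)
    {
        double reno = 100.0, scal = 100.0;       /* cwnd in packets */
        for (int rtt = 1; rtt <= 200; rtt++) {
            reno += 1.0;                         /* Reno: +1 packet per RTT    */
            scal *= 1.01;                        /* Scalable: +0.01 per ACK,   */
                                                 /* ~cwnd ACKs per RTT => x1.01 */
            if (rtt % 50 == 0) {                 /* assumed periodic loss      */
                reno *= 0.5;                     /* Reno: b = 1/2              */
                scal *= 1.0 - 0.125;             /* Scalable: b = 0.125        */
            }
            if (rtt % 20 == 0)
                printf("rtt %3d  reno %8.1f  scalable %10.1f\n",
                       rtt, reno, scal);
        }
        return 0;
    }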




Protocols: HighSpeed TCP & Scalable TCP

  • HighSpeed TCP

  • Scalable TCP

Success:

  • HighSpeed TCP implemented by Gareth (Manchester)

  • Scalable TCP implemented by Tom Kelly (Cambridge)

  • Integration of the stacks into the DataTAG kernel by Yee (UCL) + Gareth




Some Measurements of Throughput: CERN – SARA

  • Using the GÉANT Backup Link

    • 1 GByte file transfers

    • Blue: data; Red: TCP ACKs

  • Standard TCP

    • Average Throughput 167 Mbit/s

    • Users see 5 - 50 Mbit/s!

  • High-Speed TCP

    • Average Throughput 345 Mbit/s

  • Scalable TCP

    • Average Throughput 340 Mbit/s



Users, The Campus & the MAN [1]

Pete White
Pat Meyrs

  • NNW-to-SJ4 access: 2.5 Gbit PoS; hits 1 Gbit 50%

  • MAN-to-NNW access: 2 × 1 Gbit Ethernet




Users, The Campus & the MAN [2]

  • Message:

    • Continue to work with your network group

    • Understand the traffic levels

    • Understand the Network Topology

  • LMN to site 1 access: 1 Gbit Ethernet

  • LMN to site 2 access: 1 Gbit Ethernet




10 GigEthernet: Tuning PCI-X




10 GigEthernet at SC2003 BW Challenge (Phoenix)

  • Three Server systems with 10 GigEthernet NICs

  • Used the DataTAG altAIMD stack, 9000-byte MTU

  • Streams from the SLAC/FNAL booth in Phoenix to:

    • Palo Alto PAIX – 17 ms RTT

    • Chicago Starlight – 65 ms RTT

    • Amsterdam SARA – 175 ms RTT




Helping Real Users [1]: Radio Astronomy VLBI – PoC with NRNs & GÉANT, 1024 Mbit/s, 24x7, NOW




VLBI Project: Throughput, Jitter & 1-way Delay

  • 1472-byte packets, Manchester -> Dwingeloo (JIVE)

  • FWHM 22 µs (back-to-back: 3 µs)

  • 1-way delay – note the packet loss (points with zero 1-way delay)




VLBI Project: Packet Loss Distribution

  • Measure the time between lost packets in the time series of packets sent

  • Lost 1410 packets in 0.6 s

  • Is it a Poisson process?

  • Assume the process is stationary: λ(t) = λ

  • Use the probability density function: P(t) = λ·e^(−λt)

  • Mean λ = 2360 /s (mean inter-loss time 426 µs)

  • Log plot: slope −0.0028, expect −0.0024

  • Could be an additional process involved
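
A small check of the numbers on this slide, assuming only the stationary Poisson model above: ln P(t) should fall linearly with slope −λ, i.e. −0.0024 per µs for λ = 2360 /s, against the measured −0.0028:

    /* Sketch: expected log-slope of the inter-loss-time distribution
       for a Poisson loss process. Compile with -lm. */
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        const double lambda = 2360.0;                /* losses per second */

        printf("mean inter-loss time: %.0f us\n",    /* ~424 us, cf. 426 */
               1e6 / lambda);
        printf("expected ln-slope per us: %.4f\n",   /* -0.0024 */
               -lambda / 1e6);
        printf("rate implied by measured slope: %.0f /s\n",
               0.0028 * 1e6);                        /* 2800 /s */

        /* sample the PDF P(t) = lambda * exp(-lambda * t) at a few points */
        for (int t_us = 0; t_us <= 1000; t_us += 250)
            printf("t = %4d us  P(t) = %6.1f\n",
                   t_us, lambda * exp(-lambda * t_us * 1e-6));
        return 0;
    }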




VLBI Traffic Flows – Only testing!

  • Manchester – NetNorthWest – SuperJANET access links: two 1 Gbit/s

  • Access links: SJ4 to GÉANT, GÉANT to SurfNet



Throughput & PCI transactions on the Mark5 PC:

[Diagram: PCI read/write transactions of n bytes against time, showing the wait time between transfers]

  • Mark5 uses a Supermicro P3TDLE:

    • 1.2 GHz PIII

    • Memory bus 133/100 MHz

    • 2 × 64-bit 66 MHz PCI

    • 4 × 32-bit 33 MHz PCI

[Diagram: Mark5 data path – Ethernet NIC and SuperStor IDE disc-pack input card on the PCI buses, monitored via a logic analyser display]




PCI Activity: Read Multiple Data Blocks, 0 Wait

  • Read 999424 bytes

  • Each Data block:

    • Setup CSRs

    • Data movement

    • Update CSRs

  • For 0 wait between reads:

    • Data blocks are ~600 µs long; the full read takes ~6 ms

    • Then a 744 µs gap

  • PCI transfer rate 1188 Mbit/s (148.5 Mbytes/s)

  • Read_sstor rate 778 Mbit/s (97 Mbyte/s)

  • PCI bus occupancy: 68.44%

  • Concern about Ethernet traffic: 64-bit 33 MHz PCI needs ~82% occupancy for 930 Mbit/s; expect ~360 Mbit/s

[Plot annotations: CSR access and data transfer phases; data blocks of 131,072 bytes, PCI bursts of 4096 bytes]



PCI Activity: Read Throughput

  • Flat, then 1/t dependence

  • ~ 860 Mbit/s for Read blocks >= 262144 bytes

  • CPU load ~20%

  • Concern about CPU load needed to drive Gigabit link




Helping Real Users [2]: HEP – BaBar & CMS Application Throughput




BaBar Case Study: Disk Performance

  • BaBar Disk Server

    • Tyan Tiger S2466N motherboard

    • 1 × 64-bit 66 MHz PCI bus

    • Athlon MP2000+ CPU

    • AMD-760 MPX chipset

    • 3Ware 7500-8 RAID5

    • 8 × 200 GB Maxtor IDE 7200 rpm disks

  • Note the VM parameter readahead max (see the sketch after this list)

  • Disk to memory (read): max throughput 1.2 Gbit/s (150 MBytes/s)

  • Memory to disk (write): max throughput 400 Mbit/s (50 MBytes/s) [not as fast as RAID0]
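
As a sketch of how that parameter was typically adjusted: on the 2.4-series kernels of the day the readahead limit was exposed under /proc/sys/vm. The exact file name (max-readahead) and the value written are assumptions; check the documentation of the running kernel:

    /* Sketch: raise the VM readahead limit on a 2.4-era kernel.
       The proc path and value below are assumptions, for illustration. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/max-readahead", "w");
        if (!f) { perror("open max-readahead"); return 1; }
        fprintf(f, "%d\n", 256);       /* pages of readahead, illustrative */
        fclose(f);
        return 0;
    }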




BaBar: Serial ATA RAID Controllers

  • ICP 66 MHz PCI

  • 3Ware 66 MHz PCI




BaBar Case Study: RAID Throughput & PCI Activity

  • 3Ware 7500-8 RAID5 parallel EIDE

  • 3Ware forces PCI bus to 33 MHz

  • BaBar Tyan to MB-NG SuperMicro: network mem-mem 619 Mbit/s

  • Disk-to-disk throughput with bbcp: 40-45 Mbytes/s (320-360 Mbit/s)

  • PCI bus effectively full!

[Plots: read from RAID5 disks; write to RAID5 disks]



MB-NG SuperJANET4 Development Network: BaBar Case Study

[Diagram: the MB-NG test network – Manchester (MAN, MCC) and RAL domains, each with test PCs, 3ware RAID5 disk servers and a BaBar PC, attached via Gigabit Ethernet to OSM-1OC48-POS-SS interfaces; 2.5 Gbit PoS access and core links through SJ4 Dev routers and MPLS administrative domains]

Status / Tests:

  • Manc host has DataTAG TCP stack

  • RAL Host now available

  • BaBar-BaBar mem-mem

  • BaBar-BaBar real data MB-NG

  • BaBar-BaBar real data SJ4

  • Mbng-mbng real data MB-NG

  • Mbng-mbng real data SJ4

  • Different TCP stacks already installed



Study of Applications: MB-NG SuperJANET4 Development Network

[Diagram: as above, between Manchester (MAN, MCC) and UCL – test PCs with 3ware RAID0 disk servers, Gigabit Ethernet to OSM-1OC48-POS-SS interfaces, 2.5 Gbit PoS access and core links through SJ4 Dev routers and MPLS administrative domains]




MB - NG

24 Hours HighSpeed TCP mem-mem

  • TCP mem-mem lon2-man1

  • Interrupt coalescence: Tx 64, Tx-abs 64; Rx 64, Rx-abs 128

  • 941.5 Mbit/s ± 0.5 Mbit/s
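
A sanity check on that figure: on Gigabit Ethernet with a 1500-byte MTU, each frame occupies 1538 bytes on the wire (preamble 8, header 14, FCS 4, inter-frame gap 12) and carries 1448 bytes of TCP payload once 20 bytes of IP, 20 of TCP and 12 of timestamp options are deducted. 1448/1538 × 1000 Mbit/s ≈ 941.5 Mbit/s, which suggests the link is running at the theoretical TCP payload ceiling (assuming TCP timestamps are in use).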




MB - NG

Gridftp Throughput HighSpeedTCP

  • Interrupt coalescence 64 / 128

  • txqueuelen 2000

  • TCP buffer 1 Mbyte (rtt × BW = 750 kbytes; see the worked note after this list)

  • Interface throughput

  • Acks received

  • Data moved

  • 520 Mbit/s

  • Same for B2B tests

  • So it's not that simple!
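
For reference, the buffer sizing above follows the bandwidth-delay product. The quoted rtt × BW = 750 kbytes is consistent with, for example, a 1 Gbit/s path and a 6 ms RTT (the exact path figures are an assumption here): 10⁹ bit/s × 0.006 s / 8 = 750 kbytes. The 1 Mbyte TCP buffer therefore slightly exceeds the bandwidth-delay product, which is what full throughput requires.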




MB - NG

Gridftp Throughput + Web100

  • Throughput Mbit/s:

  • See alternating 600/800 Mbit/s and zero

  • Cwnd smooth

  • No duplicate ACKs / send stalls / timeouts




MB - NG

HTTP data transfers with HighSpeed TCP

  • Apache web server, out of the box!

  • Prototype client using the curl HTTP library (a minimal sketch follows this list)

  • 1 Mbyte TCP buffers

  • 2 Gbyte file

  • Throughput 72 MBytes/s

  • Cwnd - some variation

  • No duplicate ACKs / send stalls / timeouts
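
For reference, a minimal client along these lines using libcurl's easy interface; the URL is a placeholder, and this is a sketch of the approach rather than the project's actual prototype client. Compile with -lcurl:

    #include <stdio.h>
    #include <curl/curl.h>

    /* Count the bytes as they arrive instead of writing them to disk. */
    static size_t sink(void *ptr, size_t size, size_t nmemb, void *userdata)
    {
        (void)ptr;
        *(long long *)userdata += (long long)(size * nmemb);
        return size * nmemb;
    }

    int main(void)
    {
        long long bytes = 0;
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.org/2gbyte.dat");
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, sink);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &bytes);
        CURLcode rc = curl_easy_perform(curl);   /* run the transfer */
        if (rc == CURLE_OK)
            printf("transferred %lld bytes\n", bytes);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }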




More Information – Some URLs

  • MB-NG project web site: http://www.mb-ng.net/

  • DataTAG project web site: http://www.datatag.org/

  • UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net

  • Motherboard and NIC tests: www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt & http://datatag.web.cern.ch/datatag/pfldnet2003/

  • TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html & http://www.psc.edu/networking/perf_tune.html
