1 / 40

Richard Hughes-Jones DANTE Jonathan Hargreaves JBO

4 Gigabit Onsala - Jodrell Lightpath for e-VLBI The iNetTest Unit Development of Real Time eVLBI at Jodrell Bank Observatory 7 th International eVLBI Workshop, Shanghai, 16-17 th June 2008. Richard Hughes-Jones DANTE Jonathan Hargreaves JBO. The 4 Gig Path for EXPReS Onsala – Jodrell. Multi:

odele
Download Presentation

Richard Hughes-Jones DANTE Jonathan Hargreaves JBO

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 4 Gigabit Onsala - Jodrell Lightpath for e-VLBIThe iNetTest UnitDevelopment of Real Time eVLBI at Jodrell Bank Observatory7th International eVLBI Workshop, Shanghai, 16-17th June 2008 Richard Hughes-Jones DANTE Jonathan Hargreaves JBO

  2. The 4 Gig Path for EXPReSOnsala – Jodrell • Multi: • Domain • Vendor • Technology

  3. First Test Path 10 Gigabit Ethernet Lambda over NORDUnet 4 Gig TDM/SDH Lighpath over GÉANT2+

  4. Connect. Communicate. Collaborate UDP Throughput Stoc-Lon • Alcatel Metro Core Connect MCC Flow control OFF • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • Max throughput4.05 Gbit/s • Packet loss as expected • Sending host,3 CPUs idle • For <8 µs packets,1 CPU is >90% in kernel mode • Receiving host3 CPUs idle • For <20 µs packets, 1 CPU is ~30% in kernel mode

  5. Connect. Communicate. Collaborate UDP Jitter Stoc-Lon • Alcatel Metro Core Connect MCC Flow control OFF • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=0 Coalescence OFF • 8972 byte packets • 16.5 μs spacing • Peak in packet spacing at 16 μsas expected • Other peaks at 19 and 51 µs

  6. Connect. Communicate. Collaborate UDP Throughput Lon-Stoc • Alcatel Metro Core Connect MCC Flow control ON • Kernel 2.6.20-web100_pktd • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • Max throughput4.05 Gbit/s • No Packet loss – flow control • Sending host,3 CPUs idle • For <18 µs packets,1 CPU is >60% in kernel mode • Receiving host3 CPUs idle • For <20 µs packets, 1 CPU is ~30% in kernel mode

  7. Connect. Communicate. Collaborate TCP reno Throughput Lon-Stoc • Alcatel Metro Core Connect MCC Flow control ON • 27 VC-4 • RTT 31 ms • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • TCP buffer 32 MByte 2*BDP • iperf Ave throughput3.85 Gbit/s • iperf Max throughput4.02 Gbit/s • 3 re-transmitsDue to “other reductions”

  8. Connect. Communicate. Collaborate TCP reno Throughput Stoc-Lon • MCC NO Flow control • 27 VC-4 • RTT 31 ms • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • TCP buffer 32 MByte 2*BDP • iperf Ave throughput192 Mbit/s • iperf Max throughput465 Mbit/s • TCP Cwnd to 0 • Many re-transmits

  9. Connect. Communicate. Collaborate Alcatel MCC Buffer size • Classic Bottleneck • 10 Gbit/s input 4 Gbit/s output • Use udpmon to send a stream of spaced UDP packets • Measure packet number of first lost frame as function of w packet spacing Slope gives buffer size ~57 kBytes

  10. Use UDP to emulate TCP slowstart • udpmon sends bursts of spaced packets: • 32 packets • Jumbo 8000 bytes • back2back • 4 ms between bursts • PathLon-Ams_FF-Prague-Paris-Lon • Rtt 55.5 ms • See 13 packets then loose 1 in 3 • Confirm the TCP problem!

  11. iNetTest: iBoB FPGA with 10 GE

  12. iNetTest: Control & Operation Ten Gb Network 10Gb Port 0 10Gb Port 1 10Gb Port 0 10Gb Port 1 iNetTest iBOB 1 Send and receive packets of selected length and spacing Count packets transmitted and received, calculate histograms of arrival times iNetTest iBOB 2 Send and receive packets of selected length and spacing Count packets transmitted and received, calculate histograms of arrival times iNetTest iBOB n Expansion to more than two iBOBs 10/100 Ethernet Control PC IP control of multiple iBOBs

  13. iNetTest: Simulink design 10GE MAC core Network test parameter CSRs Inter-packet time histogram Transmit & receive event time log Jonathan Hargreaves JBO

  14. iNetTest: Details • iNetTest FPGA runs with a 200 MHz (5 ns) clocks • Two iNetTest units can be controlled over ethernet IP from a PC • iNetTest Ethernet IP address and MAC preset in code – last digit selected by jumper • Ten GE IP address, Port and Gateway can be user configured • Automatic or manual ARPing. ARP tables for each port can be examined • iNetTest responds to and generates PING • UDP packets can be sent by firmware between two iNetTests or between iNetTest and PCs • User selects the number of packets to send, their length, and the time between them • iBOBs count received and transmitted packets, and store up to 2048 time stamped events per port • iBOBs generate histograms of the arrival time distribution for received packets

  15. UDP Throughput vs. Packet Spacing • PC • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R CX4 • rx-usecs=25 Coalescence ON • MTU 9000 bytes • UDP Packets • Max throughput 9.4 Gbit/s • iBoB • Packet 8234Data: 8192+ Header: 42 • 100 MHz clock • Max rate 6.6 Gbit/s • See6.44Gbit/s • 200 MHz clock - Linespeed

  16. PC to iNetTest Packet Jitter • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • Packet separation 100 µs • iBoB bins 2320 ns • Width ± 34 µs • Similar to PC-PC but no extra peaks • Structure caused by PC The PC – PC plots look like:

  17. iNetTest – iBOB to iBOB Results • 10 GE 3m CX4 cable link • 8192 byte packets • 1 million transmitted at line rate between two iNetTest units. • 5 ns bins • B2B 10 ns FWHM • With Switch: • Fujutsu XG2000 20 ns • HP 6400cl 35 ns long tail Number of Packets Time between packet arrivals (ns)

  18. Back to the 4 Gig Path

  19. Stockholm – Manc. with TSS 4 Gigabit Ethernet Path over TSS NORDUnet 4 Gig TDM/SDH Lighpath over GÉANT2+ 4 Gig on Ethernet Optical Transmission JANET & NNW

  20. Connect. Communicate. Collaborate UDP Throughput Stoc-Man with TSS • Alcatel TSS – 10GE – MCC • Kernel 2.6.20-web100_pktd • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=75 Coalescence ON • MTU 9000 bytes • Max throughput3.78 Gbit/s(Previously was 4.05 Gbit/s) • Packet loss 10 % at 4.096 Gbit/s for 8192 Bytes • Man-Stoc:No packet loss @ 4.096 Gbit/s

  21. Connect. Communicate. Collaborate Packet loss Stoc-Man with TSS • Alcatel TSS – 10GE – MCC • Send 1M packets from PC in Stockholm • 8192 bytes • All cross the TSS cloud • All enter Alcatel MCC as 10 GE frames • Packet loss in MCC Card • ~10% at 4096 Mbit/s • Classic “bottleneck” performance • But WHY???

  22. Connect. Communicate. Collaborate UDP Jitter Man-Stoc with TSS • Alcatel MCC – 10GE – TSS – PC • Kernel 2.6.20-web100_pktd • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=0 Coalescence OFF • 8192 byte packets • Alcatel MCC is TDM 28 VC-4Can only send packets at 16 µsNot at 10 Gbit/s • Peak packet spacing ~ 6 µs ie packets arriving at 10 Gbit/s • Suggests packet bunching in the network Sto-Man no TSS 16 µs

  23. Connect. Communicate. Collaborate UDP 1-way delay Man-Stoc with TSS • Alcatel MCC – 10GE – TSS – PC • Kernel 2.6.20-web100_pktd • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=0 Coalescence OFF • 8192 byte packets • Structure similar to interrupt coalescence but this is OFF • As function of recv timesee gaps of ~ 130 uspackets at ~ 10Gbit/s • Again suggests packet bunching in the network

  24. Last week: Onsala – JodrelliNetTest - iNetTest 4 Gigabit Ethernet Path over TSS NORDUnet 4 Gig TDM/SDH Lighpath over GÉANT2+ 4 Gig on Ethernet Optical Transmission JANET & NNW

  25. Connect. Communicate. Collaborate iNetTest: Onsala  JBOUDP Packet Jitter • 1M packets • 8192 byte packets • 16 µs spacing 4.096 Gbit/s • 10 % packet loss • Peaks 14 & 15.7 µs • Tail to 130 µs • Cyclic variations spacing3.25 µs • 45 packets at 6.5 µs (line rate)

  26. Development of Real Time eVLBI at JBOeMERLIN Import – Onsala to Jodrell eMERLIN Export – Jodrell to JIVE

  27. Overview of eVLBI at JBO Station Board VSI VSI to ZDOK iBOB 0 Switch Switch VLBI Mk V b receivers CX4 1Gbps VSI VSI to ZDOK Station Board VSI VSI to ZDOK iBOB 1 4Gbps light path CX4 1Gbps VSI VSI to ZDOK Station Board VSI VSI to ZDOK iBOB 2 CX4 1Gbps VSI VSI to ZDOK Station Board VSI VSI to ZDOK iBOB 3 CX4 1Gbps VSI VSI to ZDOK 4Gbps light path CX4 4Gbps Onsala JBO CX4 4Gbps Or fibre if > 15m Station Board VSI iBOB 4 Switch Switch iBOB 5 ADC VSI JBO JIVE eMERLIN CORRELATOR

  28. Onsala to JBO: TransmiteMERLIN Import iBOB DATA Left Pol iADC 2 channels 1024MSPS per channel 8 bit sampling ZDOK0 Convert data to 2 bit Measure signal power on both channels Generate real time clock based on ADC clock Pass data & sync from ADC clock to iBOB 200MHz clock domain ASSEMBLE PACKETS Divide data into 8192 byte packets plus 16 byte MkVC compatible header Alternatively generate test packets as per iNetTest Count packets transmitted 10GB PORT . 4GB/s out DATA Right Pol ONEPPS ADCCLK Measured RMS Signal Power Set Network Source and Destination Set up test mode parameters Read and Set Real Time Clock Control Data Flow ON/OFF 10/100 ethernet Power PC iBOB control and monitoring via registers on the OBP bus

  29. Onsala to JBO: Receive iBOB Station Board 10GB PORT . 4GB/s in FIFO 16k deep Smooth out bunching due to network delay variations RE-ORDER PACKETS Store data in SRAM location according to its time stamp and sequence number Regenerate Sync and Data Valid signals STREAM OUT Clock data out of SRAM on 128MHz correlator clock Stream the left and right polarisations to the Station Board VSI chips ZDOK0 VSI Chip 0 Route left polarisation data to input stage of Station Board Send correlator clock to iBOB DATA 32 bit 128MHz clock ZDOK1 SRAM0 512 packets L pol. data SRAM1 512 packets R pol. data VSI Chip 1 Route right polarisation data to input stage of Station Board DATA 32 bit Power PC iBOB control and monitoring via registers on the OBP bus 10/100 ethernet

  30. Onsala to JBO - Status • iADC and transmit iBOB tested with ADC clocked at 1GS/s in diagnostic mode – ADC is set to output a fixed bit pattern • iBOB receiver and VSI chip coded in Simulink and VHDL Next Steps • Test iADC and transmit iBOB with sinusoidal test signal • Generated deliberately out-of-order packets and bursts of greater than 4Gbps to test receiver buffering

  31. JBO to JIVE: Transmit – OvervieweMERLIN Export iBOB Station Board DATA 32 bit REMOVE DELAY MODEL (COURSE) Use 72 bit wide bt 512k deep SRAM to remove eMERLIN delay model with 16 microsecond resolution VSI Chip 0 Process left polarisation data Detail on next slide ZDOK0 FORMAT Generate 10000 byte MkVb frames PACKETISE Divide each MkVb frame into two 5008 byte packets (incl header) Add Mk5C header to each packet 10GB PORT . 1 Gb/s out CLOCK VALID PDATA DATA 32 bit VSI Chip 1 Process right polarisation data Detail on next slide ZDOK1 CLOCK Power PC iBOB control and monitoring via registers on the OBP bus VALID 10/100 ethernet

  32. JBO to JIVE: Transmit -VSI Chip Detail Station Board VSI Chip 0 - Process left polarisation data 0 Select a 128MHz band from the filter bank Subtract Merlin Delay (Fine) Variable coefficient FIR filter changes delay in steps of 1/16th of original sample period Reclock for delays up to 16ns Mixer 0 Stream data out to iBOB over VSI cable Mixer 1 1 DATA 32 bit FIR Low pass filter Decimate by 8/16 128 taps Mixer Bank Select 8 or 16 VLBI bands from 128MHz input Remove eMERLIN offset (a multiple of 10kHz) Convert 4/8 bit data to 2 bit VLBI CLOCK VALID PDATA 15 MCB Interface Load band selection, delay model and filter parameters Mixer 8/16 Monitor & Control Bus (MCB)

  33. JBO to JIVE: Status • VSI chip DSP functions currently under development and simulation • Transmit iBOB re-uses packetising code from iNetTest • Next Steps • VSI chip MCB interface – based on Verilog code provided by Dave Fort at Penticton • Transmit iBOB needs coarse delay and MkVb formatting

  34. Summary • UDPon a 28 VC-4, 4.2 Gigabit lightpath provides the performance required for EXPReS • TCP performance poor due to small Ethernet buffer size on the equipment connecting 10GE to the 4 Gig Lightpath • Ethernet flow-control with short tails helps TCP • The network behaves correctly. • Care needed on choice of PC for 10 Gigabit. • FPGA iNetTest solutions work very well. • Onsala – Jodrell 4 Gigabit path now in place. • Packet loss and packet spacing need more understanding. • VLBI FPGA application progressing well.

  35. Any Questions ?

  36. For More Information • www.geant2.net • www.dante.net • For latest news and factsheets http://www.geant2.net/media • For research activities http://www.geant2.net/research • http://expres-eu.org/ [note: only one “s”] • http://www.jive.nl/ • Contact information:Richard Hughes-Jones, DANTE, Richard.Hughes-Jones@dante.org.ukJonathan Hargreaves, JBO, jh@jb.man.ac.uk • EXPReS is made possible through the support of the European Commission (DG-INFSO), Sixth Framework Programme, Contract #02664

  37. GÉANT2 Topology November 2006

  38. Connect. Communicate. Collaborate TCP BIC Throughput Lon-Stoc • Alcatel Metro Core Connect MCC Flow control ON • 27 VC-4 • RTT 31 ms • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • TCP buffer 32 MByte 2*BDP • iperf Ave throughput3.98 Gbit/s • iperf Max throughput4.02 Gbit/s • Several re-transmits

  39. Connect. Communicate. Collaborate TCP BIC Throughput Stoc-Lon • MCC NO Flow control • 27 VC-4 • RTT 31 ms • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • TCP buffer 32 MByte 2*BDP • iperf Ave throughput71.6 Mbit/s • iperf Max throughput192 Mbit/s • TCP Cwnd rapidly opens • Many more re-transmits

  40. Connect. Communicate. Collaborate TCP BIC Throughput Lon-StocBut Faster and Longer • MCC Flow control • 48 VC-4 • RTT 60 ms • Kernel 2.6.20-web100_pktd-plus • Myricom 10G-PCIE-8A-R Fibre • rx-usecs=25 Coalescence ON • MTU 9000 bytes • TCP buffer 59 MByte 1 * BDP • txquelen 1300 • … • iperf Ave throughput6.5 Gbit/s • iperf Max throughput ~6.9 Gbit/s • Some glitches & re-transmits

More Related