
SC13 Data Movement WAN and ShowFloor



Presentation Transcript


  1. SC13 Data Movement WAN and ShowFloor. Azher Mughal, Caltech

  2. WAN Network Layout

  3. WAN Transfers

  4. From Denver Show Floor

  5. TeraBit Demo: 7 x 100G links, 8 x 40G links
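
  For reference, if all of those links ran at full line rate, the aggregate show-floor capacity would be roughly one terabit per second, hence the demo's name:

      7 x 100 Gbps + 8 x 40 Gbps = 700 Gbps + 320 Gbps = 1020 Gbps ≈ 1 Tbps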

  6. Results
  SC13 – DE-KIT: 75 Gbps disk to disk (a couple of servers at KIT, two servers at SC13)
  SC13 – BNL over ESnet:
  - 80 Gbps over two pairs of hosts, memory to memory
  NERSC to SC13 over ESnet:
  - Lots of packet loss at first; after removing the Mellanox switch from the path, the path was clean
  - Consistent 90 Gbps, reading from two SSD hosts and sending to a single host in the booth
  SC13 to FNAL over ESnet:
  - Lots of packet loss; TCP maxed out around 5 Gbps, but UDP could do 15 Gbps per flow
  - Used 'tc' to pace TCP, after which single-stream TCP behaved well up to 15 Gbps, but multiple streams were still a problem. This seems to indicate something in the path with too-small buffers, but we never figured out what (a pacing sketch follows below)
  SC13 – Pasadena over Internet2:
  - 80 Gbps read from the disks and written on the servers (disk-to-memory transfer); the link was lossy the other way
  SC13 – CERN over ESnet:
  - About 75 Gbps memory to memory; disks about 40 Gbps
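
  The slide only says that 'tc' was used to pace TCP on the FNAL path; the exact qdisc is not given. Below is a minimal sketch assuming the fq qdisc's per-flow maxrate (available since Linux 3.12); the interface name (eth2) and the 15gbit rate are illustrative assumptions, not taken from the slides.

      # Hypothetical sketch: pace outgoing TCP flows with 'tc' on a data-transfer node.
      # Assumes the fq qdisc with a per-flow maxrate; interface and rate are placeholders.
      import subprocess

      IFACE = "eth2"        # assumed name of the WAN-facing interface
      MAXRATE = "15gbit"    # pace each flow near the rate UDP could sustain (~15G)

      def pace_interface(iface: str, maxrate: str) -> None:
          """Replace the root qdisc with fq and set a per-flow pacing rate (needs root)."""
          subprocess.run(
              ["tc", "qdisc", "replace", "dev", iface, "root", "fq", "maxrate", maxrate],
              check=True,
          )

      def clear_pacing(iface: str) -> None:
          """Remove the pacing qdisc, falling back to the system default."""
          subprocess.run(["tc", "qdisc", "del", "dev", iface, "root"], check=True)

      if __name__ == "__main__":
          pace_interface(IFACE, MAXRATE)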

  7. Post SC13 – Caltech to Geneva
  • About 68 Gbps
  • Two pairs of servers used
  • 4 streams per server (a multi-stream sender sketch follows below)
  • Each server around 32 Gbps
  • Single stream stuck at around 8 Gbps – why?
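
  Since a single TCP stream stalled around 8 Gbps, four parallel streams were run per server. The actual transfers used dedicated data movers, not the toy below; this is only a minimal multi-stream memory-to-memory sender sketch, with the receiver address, port, stream count, and buffer sizes as assumptions (an iperf-style receiver is assumed on the far end).

      # Minimal multi-stream memory-to-memory sender (illustrative only).
      import socket
      import threading

      HOST, PORT = "198.51.100.10", 5001    # placeholder receiver address/port
      STREAMS = 4                           # 4 streams per server, as on the slide
      BLOCK = b"\x00" * (4 * 1024 * 1024)   # 4 MiB in-memory buffer, reused for every send

      def sender(stream_id: int) -> None:
          """Open one TCP stream and push the in-memory buffer in a loop."""
          with socket.create_connection((HOST, PORT)) as s:
              # Large socket buffers matter on high bandwidth-delay-product WAN paths.
              s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 64 * 1024 * 1024)
              while True:
                  s.sendall(BLOCK)

      if __name__ == "__main__":
          threads = [threading.Thread(target=sender, args=(i,), daemon=True)
                     for i in range(STREAMS)]
          for t in threads:
              t.start()
          for t in threads:
              t.join()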

  8. Challenges
  • Servers with 48 SSD disks, Adaptec controllers: 1 GB/s per controller (driver limitation, single IRQ)
  • Servers with 48 SSD disks, LSI controllers: 1.9 GB/s per controller
  • Aggregate = 6 GB/s out of 6 controllers (still a work in progress; a parallel-read measurement sketch follows below)
  • Sheer number of resources (servers + switches + NICs + manpower) needed to reach a Tbit/sec
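
  One way to check how close a server gets to the per-controller limits above is to read all controllers in parallel and measure the combined rate. A rough sketch follows; the device paths, chunk size, and duration are assumptions, and reading raw block devices requires root.

      # Rough sketch: measure aggregate sequential read throughput across several
      # RAID controllers by reading one block device per controller in parallel.
      import threading
      import time

      DEVICES = ["/dev/sdb", "/dev/sdc", "/dev/sdd"]   # one device per controller (assumed)
      CHUNK = 8 * 1024 * 1024                          # 8 MiB reads
      DURATION = 10                                    # seconds per measurement

      totals = {}

      def read_device(path: str) -> None:
          """Sequentially read one device for DURATION seconds and record bytes read."""
          read_bytes = 0
          deadline = time.time() + DURATION
          with open(path, "rb", buffering=0) as f:     # unbuffered raw reads
              while time.time() < deadline:
                  data = f.read(CHUNK)
                  if not data:                         # wrapped past end of device
                      f.seek(0)
                      continue
                  read_bytes += len(data)
          totals[path] = read_bytes

      if __name__ == "__main__":
          # File reads release the GIL, so threads give real parallel I/O here.
          threads = [threading.Thread(target=read_device, args=(d,)) for d in DEVICES]
          for t in threads:
              t.start()
          for t in threads:
              t.join()
          aggregate = sum(totals.values()) / DURATION / 1e9
          print(f"aggregate read throughput: {aggregate:.2f} GB/s")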
