
Network Performance for ATLAS Real-Time Remote Computing Farm Study


Presentation Transcript


CERN-Manchester TCP Activity: observation of the status of standard TCP with web100
• 64 byte request in green; 1 Mbyte response in blue
• TCP in slow start takes 19 round trips, or ~380 ms
• TCP congestion window in red
• The congestion window is reset by TCP on each request, because the application sends no data over the network between requests
• TCP obeys RFC 2581 & RFC 2861

Remote Computing Concepts
TCP/IP behaviour of the ATLAS request-response application protocol, observed with Web100. Remote event processing farms at Copenhagen, Edmonton, Krakow and Manchester.
[Diagram: in the experimental area, the ATLAS detectors feed the Level 1 Trigger and the Read-Out Buffers (ROBs); a Data Collection Network connects the ROBs to the Level 2 Trigger (L2PUs) and the Event Builders (SFIs); a Back End Network links the SFIs, over lightpaths and GÉANT, to the remote processing farms (PFs) and to the local event processing farms, SFOs and mass storage in CERN B513.]

Observation of TCP with no congestion window reduction
• TCP congestion window in red grows nicely
• Request-response takes 2 rtt after 1.5 s
• Rate ~10 events/s with 50 ms processing time
• Achievable transfer throughput grows to 800 Mbit/s
• Data are transferred when the application requires them

The ATLAS Application Protocol
• Event request:
• The EFD requests an event from the SFI (64 byte request)
• The SFI replies with the event data (1 Mbyte response)
• Processing of the event occurs
• Return of computation:
• The EF asks the SFO for buffer space
• The SFO sends OK
• The EF transfers the results of the computation
In the timing diagram the two phases are labelled as taking 3 round trips and 2 round trips.

CERN-Krakow TCP Activity: Event Filter (EFD), SFI and SFO
• Steady state request-response latency ~140 ms
• Rate ~7.2 events/s
• First event takes 600 ms due to TCP slow start
[Timing diagram: Request event, Send event data, Process event, Request buffer, Send OK, Send processed event, repeating over time; the request-response time is shown as a histogram.]

Web100 parameters on the server located at CERN (the data source)
• Green: small requests; blue: big responses
• TCP ACK packets are also counted (in each direction)
• One response = 1 MB, ~380 packets

Principal partners
Network Performance for ATLAS Real-Time Remote
Computing Farm Study: Alberta, CERN, Cracow, Manchester, NBI

MOTIVATION
• Several experiments, including ATLAS at the Large Hadron Collider (LHC) and D0 at Fermilab, have expressed interest in using remote computing farms for processing and analysing, in real time, the information from particle collision events. Different architectures have been suggested, from pseudo-real-time file transfer with subsequent remote processing to the real-time requesting of individual events described here.
• To test the feasibility of using remote farms for real-time processing, a collaboration was set up between members of the ATLAS Trigger/DAQ community, with support from several national research and education network operators (DARENET, Canarie, Netera, PSNC, UKERNA and Dante), to demonstrate a proof of concept and to measure end-to-end network performance. The testbed was centred at CERN and used three different types of wide-area high-speed network infrastructure to link the remote sites:
• an end-to-end lightpath (SONET circuit) to the University of Alberta in Canada
• standard Internet connectivity to the University of Manchester in the UK and the Niels Bohr Institute in Denmark
• a Virtual Private Network (VPN), composed of an MPLS tunnel over the GÉANT network and an Ethernet VPN over the PIONIER network, to IFJ PAN Krakow in Poland

CERN-Alberta TCP Activity: observation of TCP with no congestion window reduction, with web100
• 64 byte request in green; 1 Mbyte response in blue
• TCP in slow start takes 12 round trips, or ~1.67 s
• TCP congestion window in red grows gradually after slow start
• Request-response takes 2 rtt after ~2.5 s
• Rate ~2.2 events/s with 50 ms processing time
• Achievable transfer throughput grows from 250 to 800 Mbit/s
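The per-site event rates quoted above follow from a simple cost model: each event pays a fixed number of round trips plus the event-processing time. A minimal sketch of that arithmetic, where the per-path RTT values are assumptions inferred from the quoted rates, not measurements taken from the poster:

```python
# Back-of-envelope model for the steady-state event rate of the
# request-response protocol: each event costs a fixed number of
# round trips plus the event-processing time.

def event_rate(n_round_trips, rtt_s, processing_s):
    """Events per second when one event costs n_round_trips * rtt + processing."""
    return 1.0 / (n_round_trips * rtt_s + processing_s)

# CERN-Krakow: steady-state request-response latency ~140 ms per event
print(round(1.0 / 0.140, 1))                  # 7.1 (poster quotes ~7.2 events/s)

# CERN-Manchester: 2 rtt per event, 50 ms processing, assumed rtt ~25 ms
print(round(event_rate(2, 0.025, 0.050), 1))  # 10.0 events/s

# CERN-Alberta: 2 rtt per event, 50 ms processing, assumed rtt ~200 ms
print(round(event_rate(2, 0.200, 0.050), 1))  # 2.2 events/s
```

The model also shows why the long-RTT lightpath to Alberta sustains a lower event rate than Manchester even though both reach 800 Mbit/s during the transfer itself: the fixed round-trip cost, not the bandwidth, dominates.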
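The congestion window resets seen in the CERN-Manchester trace are the behaviour described in RFC 2861 (congestion window validation): after the connection has been idle, TCP decays cwnd back toward the initial window, so each 1 Mbyte response begins in slow start again. A rough sketch of that decay rule; the constants are illustrative assumptions, not a real kernel's values:

```python
# Illustrative sketch of RFC 2861 congestion window validation:
# for each retransmission-timeout (RTO) interval the connection sits
# idle, TCP halves the congestion window, never going below the
# initial window. Constants below are illustrative assumptions.

INITIAL_CWND_SEGMENTS = 2   # assumed initial window, in segments
RTO_S = 0.25                # assumed retransmission timeout, in seconds

def cwnd_after_idle(cwnd_segments, idle_s):
    """Decay cwnd: one halving per full RTO of idle time."""
    for _ in range(int(idle_s // RTO_S)):
        cwnd_segments = max(INITIAL_CWND_SEGMENTS, cwnd_segments // 2)
    return cwnd_segments

# A sender that built cwnd up to ~380 segments (one 1 MB response),
# then sat idle for 1 s waiting for the next request:
print(cwnd_after_idle(380, 1.0))  # 23 -> the next response restarts near slow start
```

This is why the "no congestion window reduction" runs behave so differently: with the idle-time decay disabled, the window built up by one response is still available for the next, and the request-response cycle settles at 2 rtt.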
