This document summarizes a talk on the UltraLight project given by Dimitri Bourilkov (University of Florida) at the 2006 CISCO - UF collaborative team meeting. UltraLight, a partnership of institutions including Caltech and CERN, aims to enhance global networking capabilities for high-energy physics data analysis, particularly for the Large Hadron Collider. Key topics include network management, data transport challenges, application integration, and future goals for bandwidth and performance in scientific research.
UltraLight: Network & Applications Research at UF
Dimitri Bourilkov, University of Florida
CISCO - UF Collaborative Team Meeting
Gainesville, FL, September 12, 2006
Overview: UltraLight, an NSF Project
The UltraLight Team
• Steering Group: H. Newman (Caltech, PI), P. Avery (U. Florida), J. Ibarra (FIU), S. McKee (U. Michigan)
• Project Management: Richard Cavanaugh (Project Coordinator), together with the PI and the Working Group Coordinators:
• Network Engineering: Shawn McKee (Michigan); + S. Ravot (LHCNet), R. Summerhill (Abilene/HOPI), D. Pokorney (FLR), J. Ibarra (WHREN, AW), C. Guok (ESnet), L. Cottrell (SLAC), D. Petravick, M. Crawford (FNAL), S. Bradley, J. Bigrow (BNL), et al.
• Applications Integration: Frank Van Lingen (Caltech); + I. Legrand (MonALISA), J. Bunn (GAE + TG), C. Steenberg, M. Thomas (GAE), Sanjay Ranka (Sphinx), et al.
• Physics Analysis User Group: Dimitri Bourilkov (UF; CAVES, CODESH)
• Network Research, WAN-in-Lab Liaison: Steven Low (Caltech)
• Education and Outreach: Laird Kramer (FIU), + H. Alvarez, J. Ibarra, H. Newman
Large Hadron Collider (CERN, Geneva): 2007 Start
• pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
• 27 km tunnel in Switzerland & France
• Experiments: CMS (+ TOTEM) and ATLAS (pp, general purpose; HI), ALICE (HI), LHCb (B-physics)
• 5000+ physicists, 250+ institutes, 60+ countries
• Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, … the Unexpected
• Challenges: analyze petabytes of complex data cooperatively; harness global computing, data & NETWORK resources
DISUN: LHC Data Grid Hierarchy
• CERN/Outside resource ratio ~1:4; Tier0 / (Σ Tier1) / (Σ Tier2) ~1:2:2
• Link speeds across the hierarchy: 10-40+ Gbps down to 2.5-30 Gbps
• 4 of 7 US CMS Tier2s shown, with ~8 MSi2k and 1.5 PB disk by 2007; >100 Tier2s at LHC
• CERN/Outside ratio smaller; expanded role of Tier1s & Tier2s: greater reliance on networks
Tier-2s: ~100 identified; number still growing
HENP Bandwidth Roadmap for Major Links (in Gbps)
• Continuing trend: ~1000 times bandwidth growth per decade
• HEP: co-developer as well as application driver of global networks
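As a quick sanity check (not from the slide), a factor of ~1000 per decade corresponds to roughly a doubling of bandwidth every year:

    # Annual growth factor implied by ~1000x bandwidth growth per decade
    annual_factor = 1000 ** (1 / 10)
    print(f"~{annual_factor:.1f}x per year")   # prints ~2.0x per year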
Data Samples and Transport Scenarios
• 10^7 events is a typical data sample for analysis or reconstruction development [Ref.: MONARC]; equivalent to just ~1 day's running
• Transporting datasets with quantifiable, high performance is needed for efficient workflow, and thus efficient use of CPU and storage resources
• One can only transmit ~2 RAW + RECO or MC samples per day on a 10G path
• Movement of 10^8-event samples (e.g. after re-reconstruction) will take ~1 day (RECO) to ~1 week (RAW, MC) with a 10G link at high occupancy (a rough check is sketched below)
• Transport of significant data samples will require one, or multiple, 10G links
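The transfer-time claims above can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: the per-event sizes (RAW ~2 MB, RECO ~0.25 MB, MC ~2 MB) and the 50% link occupancy are assumptions of this note, not figures from the talk.

    # Rough transfer-time estimate for event samples over a 10 Gb/s path.
    # Event sizes and occupancy are illustrative assumptions, not slide figures.
    def transfer_time_hours(n_events, event_size_mb, link_gbps=10.0, occupancy=0.5):
        sample_bytes = n_events * event_size_mb * 1e6
        rate_bytes_per_s = link_gbps * 1e9 / 8 * occupancy
        return sample_bytes / rate_bytes_per_s / 3600.0

    for label, size_mb in [("RAW", 2.0), ("RECO", 0.25), ("MC", 2.0)]:
        t7 = transfer_time_hours(1e7, size_mb)   # typical analysis sample
        t8 = transfer_time_hours(1e8, size_mb)   # e.g. after re-reconstruction
        print(f"{label:4s} 10^7 events: ~{t7:.1f} h   10^8 events: ~{t8 / 24:.1f} days")

With these assumed sizes the result lands in the same ballpark as the slide: a few samples per day at 10^7 events, and of order a day (RECO) to several days (RAW, MC) at 10^8 events.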
UltraLight Goals
• Goal: enable the network as an integrated, managed resource
• Meta-goal: enable physics analysis & discoveries which otherwise could not be achieved
• Next-generation information system, with the network as an integrated, actively managed subsystem in a global Grid
• Hybrid network infrastructure: packet-switched + dynamic optical paths
• End-to-end monitoring; real-time tracking and optimization
• Dynamic bandwidth provisioning; agent-based services spanning all layers
Partner institutions: Caltech, Florida, Michigan, FNAL, SLAC, CERN, BNL, Internet2/HOPI; UERJ (Rio), USP (Sao Paulo), FIU, KNU (Korea), KEK (Japan), TIFR (India), PERN (Pakistan)
Networks: NLR, ESnet, CENIC, FLR, MiLR, US Net, Abilene, JGN2, GLORIAD, RNP, CA*net4; UKLight, Netherlight, Taiwan
Industry partners: Cisco, Neterion, Sun, …
Large Scale Data Transfers
• Network aspect: the Bandwidth*Delay Product (BDP); we have to use TCP windows matching it in the kernel AND in the application
• On a local connection with 1 GbE and RTT 0.19 ms, to fill the pipe we need around 2*BDP:
  2*BDP = 2 * 1 Gb/s * 0.00019 s = ~48 KBytes
  or, for a 10 Gb/s LAN: 2*BDP = ~480 KBytes
• Now on the WAN: from Florida to Caltech the RTT is 115 ms, so for 1 Gb/s to fill the pipe we need
  2*BDP = 2 * 1 Gb/s * 0.115 s = ~28.8 MBytes, etc.
• User aspect: are the servers on both ends capable of matching these rates for useful disk-to-disk transfers? Tune kernels, get the highest possible disk read/write speed, etc. Tables turned: the WAN now outperforms disk speeds!
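A minimal sketch reproducing the 2*BDP numbers above (nothing here beyond the arithmetic already on the slide):

    # 2 * bandwidth-delay product, converted from bits to bytes
    def two_bdp_bytes(rate_gbps, rtt_s):
        return 2 * rate_gbps * 1e9 * rtt_s / 8

    print(two_bdp_bytes(1, 0.00019))   # ~4.75e4 bytes  (~48 KBytes, 1 GbE LAN)
    print(two_bdp_bytes(10, 0.00019))  # ~4.75e5 bytes  (~480 KBytes, 10 Gb/s LAN)
    print(two_bdp_bytes(1, 0.115))     # ~2.88e7 bytes  (~28.8 MBytes, Florida-Caltech WAN)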
bbcp Tests
bbcp was selected as the starting tool for data transfers on the WAN:
• Supports multiple streams, highly tunable (window size etc.), peer-to-peer design
• Well supported by Andy Hanushevsky from SLAC
• Used successfully in BaBar
• I used it in 2002 for CMS production: massive data transfers from Florida to CERN; the only limits observed at the time were disk writing speed (LAN) and the network (WAN)
• Starting point, Florida → Caltech: < 0.5 MB/s on the WAN, very poor performance
Evolution of Tests Leading to SC|05
• End points in Florida (uflight1) and Caltech (nw1): AMD Opterons over the UltraLight network
• Tuning of Linux kernels (2.6.x) and bbcp window sizes: a coordinated, iterative procedure (a rule-of-thumb window sizing is sketched below)
• Current status (for file sizes ~2 GB):
  - 6-6.5 Gb/s with iperf
  - up to 6 Gb/s memory to memory
  - 2.2 Gb/s ramdisk → remote disk write
  - the speed was the same whether writing to a SCSI disk (nominally below 80 MB/s) or to a RAID array, so de facto the data always goes first to the memory cache (the Caltech node has 16 GB of RAM)
• Used successfully with up to 8 bbcp processes in parallel from Florida to the show floor in Seattle; CPU load still OK
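One way to arrive at settings like "-s 8 -w 10m" in the examples that follow is to split the path's bandwidth-delay product evenly across the parallel streams. This is only a rule of thumb assumed here, not necessarily the exact procedure the team followed; the 115 ms RTT and the ~6 Gb/s target are taken from the slides.

    # Per-stream bbcp window (-w) from an even split of the aggregate BDP
    def per_stream_window_mb(target_gbps, rtt_s, n_streams):
        aggregate_bdp_bytes = target_gbps * 1e9 * rtt_s / 8
        return aggregate_bdp_bytes / n_streams / 1e6

    # ~6 Gb/s over the 115 ms Florida-Caltech path, split across 8 streams:
    print(per_stream_window_mb(6, 0.115, 8))   # ~10.8 MB, cf. "bbcp -s 8 -w 10m"
    # (the bbcp messages below show the Linux kernel roughly doubling the request)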
bbcp Examples: Florida → Caltech

[bourilkov@uflight1 data]$ iperf -i 5 -c 192.84.86.66 -t 60
------------------------------------------------------------
Client connecting to 192.84.86.66, TCP port 5001
TCP window size: 256 MByte (default)
------------------------------------------------------------
[ 3] local 192.84.86.179 port 33221 connected with 192.84.86.66 port 5001
[ 3]  0.0- 5.0 sec  2.73 GBytes  4.68 Gbits/sec
[ 3]  5.0-10.0 sec  3.73 GBytes  6.41 Gbits/sec
[ 3] 10.0-15.0 sec  3.73 GBytes  6.40 Gbits/sec
[ 3] 15.0-20.0 sec  3.73 GBytes  6.40 Gbits/sec

bbcp -s 8 -f -V -P 10 -w 10m big2.root uldemo@192.84.86.66:/dev/null
bbcp: uflight1.ultralight.org kernel using a send window size of 20971584 not 10485792
bbcp: Sink I/O buffers (245760K) > 25% of available free memory (231836K); copy may be slow
bbcp: Creating /dev/null/big2.root
Source cpu=5.654 mem=0K pflt=0 swap=0
File /dev/null/big2.root created; 1826311140 bytes at 432995.1 KB/s
24 buffers used with 0 reorders; peaking at 0.
Target cpu=3.768 mem=0K pflt=0 swap=0
1 file copied at effectively 260594.2 KB/s

bbcp -s 8 -f -V -P 10 -w 10m big2.root uldemo@192.84.86.66:dimitri
bbcp: uflight1.ultralight.org kernel using a send window size of 20971584 not 10485792
bbcp: Creating ./dimitri/big2.root
Source cpu=5.455 mem=0K pflt=0 swap=0
File ./dimitri/big2.root created; 1826311140 bytes at 279678.1 KB/s
24 buffers used with 0 reorders; peaking at 0.
Target cpu=10.065 mem=0K pflt=0 swap=0
1 file copied at effectively 150063.7 KB/s
bbcp Examples: Caltech → Florida

[uldemo@nw1 dimitri]$ iperf -s -w 256m -i 5 -p 5001 -l 8960
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 512 MByte (WARNING: requested 256 MByte)
------------------------------------------------------------
[ 4] local 192.84.86.66 port 5001 connected with 192.84.86.179 port 33221
[ 4]  0.0- 5.0 sec  2.72 GBytes  4.68 Gbits/sec
[ 4]  5.0-10.0 sec  3.73 GBytes  6.41 Gbits/sec
[ 4] 10.0-15.0 sec  3.73 GBytes  6.40 Gbits/sec
[ 4] 15.0-20.0 sec  3.73 GBytes  6.40 Gbits/sec
[ 4] 20.0-25.0 sec  3.73 GBytes  6.40 Gbits/sec

bbcp -s 8 -f -V -P 10 -w 10m big2.root uldemo@192.84.86.179:/dev/null
bbcp: Sink I/O buffers (245760K) > 25% of available free memory (853312K); copy may be slow
bbcp: Source I/O buffers (245760K) > 25% of available free memory (839628K); copy may be slow
bbcp: nw1.caltech.edu kernel using a send window size of 20971584 not 10485792
bbcp: Creating /dev/null/big2.root
Source cpu=5.962 mem=0K pflt=0 swap=0
File /dev/null/big2.root created; 1826311140 bytes at 470086.2 KB/s
24 buffers used with 0 reorders; peaking at 0.
Target cpu=4.053 mem=0K pflt=0 swap=0
1 file copied at effectively 263793.4 KB/s
SuperComputing 05 Bandwidth Challenge
• Above 100 Gbps for hours
• 475 TBytes transported in < 24 h
Outlook
• The UltraLight network is already very performant; SC|05 was a big success
• The hard problem from the user perspective now is to match the network with servers capable of sustained rates for large files > 20 GB (when the memory caches are exhausted); fast disk writes are key (RAID arrays)
• To fill 10 Gb/s pipes we need several (3-4) pairs of servers (see the estimate below)
• Next step: disk-to-disk transfers between Florida, Caltech, Michigan, FNAL, BNL and CERN; preparations for SC|06 (next talk)
• More info: http://ultralight.caltech.edu
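For the "3-4 pairs of servers" estimate, a simple count assuming (illustratively) ~2.5-3.5 Gb/s of sustained disk-to-disk throughput per server pair:

    import math

    # Server pairs needed to fill a 10 Gb/s pipe; per-pair rates are assumed values
    for per_pair_gbps in (2.5, 3.0, 3.5):
        pairs = math.ceil(10 / per_pair_gbps)
        print(f"{per_pair_gbps} Gb/s per pair -> {pairs} pairs")   # 4, 4, 3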