  1. High Energy & Nuclear Physics Experiments and Advanced Cyberinfrastructure www.opensciencegrid.org Internet2 Meeting, San Diego, CA, October 11, 2007. Paul Avery, University of Florida, avery@phys.ufl.edu

  2. Context: Open Science Grid • Consortium of many organizations (multiple disciplines) • Production grid cyberinfrastructure • 75+ sites, 30,000+ CPUs: US, UK, Brazil, Taiwan

  3. OSG Science Drivers (community and data growth, 2001–2009) • Experiments at the Large Hadron Collider: new fundamental particles and forces; 100s of petabytes; 2008–? • High Energy & Nuclear Physics experiments: top quark, nuclear matter at extreme density; ~10 petabytes; 1997–present • LIGO: search for gravitational waves; ~few petabytes; 2002–present • Future grid resources: massive CPU (PetaOps), large distributed datasets (>100 PB), global communities (1000s), international optical networks

  4. OSG History in Context. Primary drivers: LHC and LIGO. Timeline (1999–2009): PPDG (DOE), GriPhyN (NSF), and iVDGL (NSF), federated as Trillium, led to Grid3 and then to OSG (DOE+NSF); in parallel, LIGO moved from preparation to operation, the LHC from construction, preparation, and commissioning to operations, alongside the European Grid + Worldwide LHC Computing Grid and campus/regional grids.

  5. LHC Experiments at CERN • 27 km tunnel in Switzerland & France • Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM • Search for: origin of mass, new fundamental forces, supersymmetry, other new particles • 2008–?

  6. Collisions at LHC (2008?) • Proton–proton collisions: 2835 bunches/beam, 10^11 protons/bunch, beam energy 7 TeV x 7 TeV, luminosity 10^34 cm^-2 s^-1, bunch crossings every 25 nsec (~20 collisions/crossing) • Collision rate ~10^9 Hz • New physics rate ~10^-5 Hz • Selection: 1 in 10^14 • (Diagram: protons resolved into partons (quarks, gluons), with example final states such as Higgs → ZZ → e+e- l+l-, jets, SUSY, …)
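
A quick back-of-the-envelope check of the rates quoted on this slide (a minimal sketch: the luminosity and "1 in 10^14" selectivity come from the slide; the ~100 mb inelastic cross section is an assumption):

```python
# Event-rate arithmetic behind the slide: collision rate = luminosity x cross section,
# new-physics rate = collision rate x selectivity.
luminosity = 1e34            # cm^-2 s^-1, design luminosity from the slide
sigma_inelastic = 1e-25      # cm^2 (~100 mb total pp cross section; assumed here)

collision_rate = luminosity * sigma_inelastic    # ~1e9 Hz, as quoted
new_physics_rate = collision_rate * 1e-14        # "Selection: 1 in 10^14"

print(f"collision rate   ~ {collision_rate:.0e} Hz")
print(f"new physics rate ~ {new_physics_rate:.0e} Hz"
      f" (roughly one selected event per {1 / new_physics_rate / 86400:.0f} day)")
```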

  7. LHC Data and CPU Requirements (CMS, ATLAS, LHCb) • Storage: raw recording rate 0.2–1.5 GB/s; large Monte Carlo data samples; 100 PB by ~2012; 1000 PB later in decade? • Processing: PetaOps (> 600,000 3 GHz cores) • Users: 100s of institutes, 1000s of researchers
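
To see how the quoted recording rates reach the 100 PB scale, here is a minimal sketch; the live-time fraction is an illustrative assumption, not a figure from the slide:

```python
# Annual raw-data volume implied by the 0.2-1.5 GB/s recording rates above.
# Simulated data and multiple replicas (not modeled here) multiply this further.
seconds_per_year = 3600 * 24 * 365
live_fraction = 0.3          # assumed accelerator/detector live time (~1e7 s/year)

for rate_gb_s in (0.2, 1.5):
    volume_pb = rate_gb_s * seconds_per_year * live_fraction / 1e6   # GB -> PB
    print(f"{rate_gb_s:.1f} GB/s  ->  ~{volume_pb:.0f} PB/year of raw data per experiment")
```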

  8. LHC Global Collaborations (CMS, ATLAS) • 2000–3000 physicists per experiment • USA is 20–31% of total

  9. LHC Global Grid (CMS experiment) • 5000 physicists, 60 countries • 10s of petabytes/yr by 2009 • CERN / outside = 10–20% • Tiered data flow: Online System → Tier 0 (CERN Computer Center) at 200–1500 MB/s → Tier 1 centers (e.g., FermiLab, Korea, Russia, UK) at 10–40 Gb/s → Tier 2 centers on OSG (e.g., U Florida, Caltech, UCSD, Maryland, Iowa, FIU) at >10 Gb/s → Tier 3 at 2.5–10 Gb/s → Tier 4 (physics caches, PCs)
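
As a rough feel for what these link speeds mean in practice, the sketch below times moving one dataset across the tier boundaries; the link speeds are those quoted above, while the 100 TB dataset size and the assumption of a dedicated link are illustrative:

```python
# Time to replicate one dataset across the tiered links quoted on the slide.
links_gbps = {
    "Tier 0 -> Tier 1": 10,    # low end of the 10-40 Gb/s range
    "Tier 1 -> Tier 2": 10,    # ">10 Gb/s"
    "Tier 2 -> Tier 3": 2.5,   # low end of the 2.5-10 Gb/s range
}
dataset_tb = 100               # hypothetical dataset size

for link, gbps in links_gbps.items():
    hours = dataset_tb * 8e12 / (gbps * 1e9) / 3600    # TB -> bits, Gb/s -> bits/s
    print(f"{link}: {dataset_tb} TB at {gbps} Gb/s  ->  ~{hours:.0f} h on a dedicated link")
```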

  10. LHC Global Grid • 11 Tier-1 sites • 112 Tier-2 sites (growing) • 100s of universities (figure: J. Knobloch)

  11. LHC Cyberinfrastructure Growth: CPU • Multi-core boxes • AC & power challenges • (Chart: projected CPU growth at CERN, Tier-1, and Tier-2 sites toward ~100,000 cores)

  12. LHC Cyberinfrastructure Growth: Disk • (Chart: projected disk growth at CERN, Tier-1, and Tier-2 sites toward ~100 petabytes)

  13. LHC Cyberinfrastructure Growth: Tape • (Chart: projected tape growth at CERN and Tier-1 sites toward ~100 petabytes)

  14. HENP Bandwidth Roadmap for Major Links (in Gbps) • Paralleled by ESnet roadmap

  15. HENP Collaboration with Internet2 (www.internet2.edu) • HENP SIG

  16. HENP Collaboration with NLR (www.nlr.net) • UltraLight and other networking initiatives • Spawning state-wide and regional networks (FLR, SURA, LONI, …)

  17. US LHCNet, ESnet Plan 2007–2010: 30→80 Gbps US–CERN • US-LHCNet (NY–CHI–GVA–AMS): 30, 40, 60, 80 Gbps over 2007–10 (3 to 8 x 10 Gbps US–CERN) • ESnet4 SDN core: 30–50 Gbps; production IP ESnet core ≥10 Gbps for enterprise IP traffic; Science Data Network core: 40–60 Gbps circuit transport; metropolitan area rings • ESnet MANs to FNAL & BNL; dark fiber to FNAL; peering with GEANT • High-speed cross connects with Internet2/Abilene • International connections: GEANT2, SURFnet, IN2P3, AsiaPac, Japan, Australia; NSF/IRNC circuit; GVA–AMS connection via SURFnet or GEANT2 • (Map: ESnet hubs and major DOE Office of Science sites)

  18. Tier-1–Tier-2 Data Transfers: 2006–07 • (Chart: transfer rates from Sep. 2006 (CSA06) through Mar. 2007 to Sep. 2007, reaching ~1 GB/sec)

  19. US: FNAL Transfer Rates to Tier-2 Universities (Computing, Offline and CSA07) • One well-configured site (Nebraska, June 2007) reached ~1 GB/s • But ~10 such sites in the near future → network challenge
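
The arithmetic behind the "network challenge" in round numbers (figures from this slide; the comparison capacity is the US LHCNet/ESnet plan shown earlier):

```python
# Aggregate Tier-1 egress if ~10 Tier-2 sites each sustain the ~1 GB/s
# demonstrated by the best-configured site.
per_site_gb_s = 1.0     # GB/s, from the Nebraska example
n_sites = 10            # "~10 such sites in near future"

aggregate_gbps = per_site_gb_s * n_sites * 8     # GB/s -> Gb/s
print(f"~{aggregate_gbps:.0f} Gb/s sustained from a single Tier-1,")
print("on the same scale as the 30-80 Gb/s US-CERN capacity planned for 2007-2010.")
```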

  20. Current Data Transfer Experience • Transfers are generally much slower than expected • Or stop altogether • Potential causes difficult to diagnose • Configuration problem? Loading? Queuing? • Database errors, experiment S/W error, grid S/W error? • End-host problem? Network problem? Application failure? • Complicated recovery • Insufficient information • Too slow to diagnose and correlate at the time the error occurs • Result • Lower transfer rates, longer troubleshooting times • Need intelligent services, smart end-host systems
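
One way to read "smart end-host systems" is a watchdog that compares achieved throughput against an expected floor and flags transfers for immediate troubleshooting. The sketch below is purely illustrative; the names, thresholds, and interface are hypothetical and not taken from any OSG or experiment tool:

```python
# Hypothetical end-host transfer watchdog: flag transfers whose achieved
# throughput falls well below an expected floor, so slow or stalled transfers
# surface while the error conditions can still be correlated.
from dataclasses import dataclass

@dataclass
class Transfer:
    name: str
    bytes_moved: float    # bytes transferred so far
    elapsed_s: float      # seconds since the transfer started

def flag_slow(transfers, expected_mb_s=100.0, warn_fraction=0.2):
    """Return (name, achieved MB/s) for transfers below warn_fraction of the floor."""
    slow = []
    for t in transfers:
        achieved = t.bytes_moved / t.elapsed_s / 1e6   # MB/s
        if achieved < warn_fraction * expected_mb_s:
            slow.append((t.name, achieved))
    return slow

# Illustrative use: the second transfer has effectively stalled.
transfers = [
    Transfer("fnal->t2_site_A", bytes_moved=9.0e11, elapsed_s=3600),
    Transfer("fnal->t2_site_B", bytes_moved=2.0e9, elapsed_s=3600),
]
for name, mb_s in flag_slow(transfers):
    print(f"SLOW: {name} at {mb_s:.1f} MB/s -- check config, end host, and network path")
```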

  21. UltraLight: Integrating Advanced Networking in Applications (http://www.ultralight.org) • Funded by NSF • 10 Gb/s+ network • Caltech, UF, FIU, UM, MIT • SLAC, FNAL • Int'l partners • Level(3), Cisco, NLR

  22. UltraLight Testbed www.ultralight.org • Funded by NSF

  23. Many Near-Term Challenges • Network • Bandwidth, bandwidth, bandwidth • Need for intelligent services, automation • More efficient utilization of network (protocols, NICs, S/W clients, pervasive monitoring) • Better collaborative tools • Distributed authentication? • Scalable services: automation • Scalable support

  24. END

  25. Extra Slides

  26. The Open Science Grid Consortium (diagram: Open Science Grid at the center, surrounded by) • Science projects & communities • U.S. grid projects • LHC experiments • University facilities • Regional and campus grids • Education communities • Multi-disciplinary facilities • Computer science • Laboratory centers • Technologists (network, HPC, …)

  27. CMS: "Compact" Muon Solenoid (photo of the detector, with "inconsequential humans" for scale)

  28. Collision Complexity: CPU + Storage • (+30 minimum-bias events) • All charged tracks with pT > 2 GeV • Reconstructed tracks with pT > 25 GeV • 10^9 collisions/sec, selectivity: 1 in 10^13

  29. LHC Data Rates: Detector to Storage • Detector output: 40 MHz, ~TBytes/sec (physics filtering) • Level 1 trigger (special hardware): 75 kHz, 75 GB/sec • Level 2 trigger (commodity CPUs): 5 kHz, 5 GB/sec • Level 3 trigger (commodity CPUs): 100 Hz, 0.15–1.5 GB/sec • Raw data to storage (+ simulated data)
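
The same trigger chain in numbers (rates from this slide; the ~1 MB event size is an assumption that roughly reproduces the quoted 75 GB/sec and 5 GB/sec figures):

```python
# Rate and bandwidth reduction through the trigger chain described above.
event_size_mb = 1.0    # assumed average raw event size

stages = [
    ("Detector output",             40e6),   # 40 MHz bunch crossings
    ("Level 1: special hardware",   75e3),   # 75 kHz
    ("Level 2: commodity CPUs",      5e3),   # 5 kHz
    ("Level 3: commodity CPUs",     100.0),  # 100 Hz written to storage
]

prev = None
for name, rate_hz in stages:
    kept = "" if prev is None else f"  (keeps ~1 in {prev / rate_hz:,.0f})"
    gb_s = rate_hz * event_size_mb / 1e3
    print(f"{name:28s} {rate_hz:12,.0f} Hz  ~{gb_s:10,.1f} GB/s{kept}")
    prev = rate_hz
```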

  30. LIGO: Search for Gravity Waves • LIGO Grid: 6 US sites, 3 EU sites (UK & Germany: Birmingham, Cardiff, AEI/Golm) • LHO, LLO: LIGO observatory sites • LSC: LIGO Scientific Collaboration

  31. Is HEP Approaching the Productivity Plateau? • The Gartner Group technology hype cycle applied to HEP grids, with CHEP conferences marking the expectations curve: Padova 2000, Beijing 2001, San Diego 2003, Interlaken 2004, Mumbai 2006, Victoria 2007 • From Les Robertson

  32. Challenges from Diversity and Growth • Management of an increasingly diverse enterprise • Sci/Eng projects, organizations, disciplines as distinct cultures • Accommodating new member communities (expectations?) • Interoperation with other grids • TeraGrid • International partners (EGEE, NorduGrid, etc.) • Multiple campus and regional grids • Education, outreach and training • Training for researchers, students • … but also project PIs, program officers • Operating a rapidly growing cyberinfrastructure • 25K → 100K CPUs, 4 → 10 PB disk • Management of and access to rapidly increasing data stores (slide) • Monitoring, accounting, achieving high utilization • Scalability of support model (slide)

  33. Collaborative Tools: EVO Videoconferencing • End-to-End Self-Managed Infrastructure

  34. REDDnet: National Networked Storage • NSF-funded project • Vanderbilt • 8 initial sites • Brazil? • Multiple disciplines: satellite imagery, HENP, Terascale Supernova Initiative, structural biology, bioinformatics • Storage: 500 TB disk, 200 TB tape

  35. OSG Operations Model • Distributed model • Scalability! • VOs, sites, providers • Rigorous problem tracking & routing • Security • Provisioning • Monitoring • Reporting • Partners with EGEE operations
