
The Lab’s Computing Support Strategy for CDF and D0. Victoria White, Associate Lab Director for Computing and CIO. October 3, 2011.


Presentation Transcript


  1. The Lab’s Computing Support Strategy for CDF and D0
     Victoria White, Associate Lab Director for Computing and CIO
     October 3, 2011

  2. Our job in the Computing Sector
     • Is to enable science and to optimize the support (human and technological) of the scientific programs of the lab (including the Experiment program)
       • Within funding and resource constraints
       • In the face of growing demands
       • To meet emerging needs
       • To deal with rapidly changing technology
     • We also have to provide computing to support the lab’s operations and provide all the standard services that an organization needs (and often expects 24x7)
     Computing Support Strategy for CDF and D0

  3. Computing Division -> Computing Sector

  4. Fermilab Computing Facilities (EPA Energy Star award 2010)
     • Feynman Computing Center (FCC)
       • High availability services, e.g. core network, email, etc.
       • Tape Robotic Storage (3 libraries of 10,000 slots each)
       • UPS & Standby Power Generation
       • ARRA project: upgrade cooling and add HA computing room (completed)
     • Grid Computing Center (GCC)
       • High Density Computational Computing
       • CMS, Run II, GridFarm batch worker nodes
       • Lattice HPC nodes
       • Tape Robotic Storage (4 libraries of 10,000 slots each)
       • UPS & taps for portable generators
     • Lattice Computing Center (LCC)
       • High Performance Computing (HPC)
       • Accelerator Simulation, Cosmology nodes
       • No UPS

  5. Facilities: more than just space, power and cooling; continuous planning
     • ARRA-funded new high availability computer room in Feynman Computing Center

  6. Cooling problems at GCC this summer
     • The air intake to the condensers can reach temperatures of 120 °F (20-25 °F above ambient on the pad), causing the cooling to shut down
     • $650–950k to move condensers to a platform for Computer Rooms B and C
       • Rough estimate from FESS
       • Does not include Computer Room A
       • Better estimate when the study is complete in November
     • Soaker hoses to cool concrete condenser pad
     • Increased computer room operating temperatures
     • Numerous air management improvements inside the computer room, including cold aisle containment test
     • Extended monitoring outside to the condenser pad
     • Executed load shed plan twice during hottest days
     • Rented portable air conditioning for use in CRB and outside under the condensers (the latter was effective, not efficient)

  7. Need to fix Grid Computing Center quickly, ready for next summer
     • Need to be able to use the computer rooms as designed and plan for that going forward.
     • Need to move forward with CRA renovations for greater power per rack.
     • We cannot do this and run everyone ragged and be unreliable every summer

  8. Run II Computing Strategy
     • Production processing and Monte Carlo production capability after the end of data taking
       • Ability to do some reprocessing if needed
       • Monte Carlo production at the current rate through mid-2013?
     • Analysis computing capability for at least 5 years, but diminishing after end of 2012
       • Push for 2012 conferences for many results: no large drop in computing requirements through this period
     • Continued support for up to 5 years for
       • Code management and science software infrastructure
       • Data handling for production (+MC) and Analysis Operations
     • Curation of the data: > 10 years with possibly some support for continuing analyses

  9. We have pushed/insisted on sharing strategies for computing for many years. Why?
     • Cost
     • Coherent technical approaches and architectures
     • Support over the entire lifecycle of an experiment/project

  10. Experiment/Project Lifecycle and funding
     [Diagram: lifecycle phases and their funding sources]
     • Early Period: R&D, Simulations, LOI, Proposals
     • Mature phase: Construction, Operations, Analysis
     • Final data-taking and beyond: Final analysis, Data preservation and access
     • Funding labels: experiment- or project-specific; shared services

  11. Sharing via the Grid: FermiGrid
     [Diagram labels]
     • User Login & Job Submission
     • External grids: TeraGrid, WLCG, NDGF, Open Science Grid
     • FermiGrid Site Gateway
     • FermiGrid Infrastructure Services
     • FermiGrid Monitoring/Accounting Services
     • FermiGrid Authentication/Authorization Services
     • Batch slots: CMS 7485, D0 6916, CDF 5600, GRIDFarm 3284
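
The slot counts on the FermiGrid slide can be put in proportion with a little arithmetic. The sketch below is only an illustration: the four numbers come from the slide, while the names `SLOTS` and `slot_shares` are hypothetical helpers introduced here, and the assumption that the four clusters listed make up the whole pool is ours, not the slide's.

```python
# Batch slot counts as listed on the FermiGrid slide.
# Assumption: these four clusters are the whole shared pool.
SLOTS = {"CMS": 7485, "D0": 6916, "CDF": 5600, "GRIDFarm": 3284}


def slot_shares(slots):
    """Return each cluster's share of the total slot count, in percent."""
    total = sum(slots.values())
    return {name: 100.0 * count / total for name, count in slots.items()}


total_slots = sum(SLOTS.values())
shares = slot_shares(SLOTS)
# Slots owned by the two Run II experiments, CDF and D0.
run2_slots = SLOTS["CDF"] + SLOTS["D0"]

if __name__ == "__main__":
    print(f"total slots: {total_slots}")
    # List clusters from largest to smallest share.
    for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:9s} {SLOTS[name]:6d} slots  {pct:5.1f}%")
    run2_pct = 100.0 * run2_slots / total_slots
    print(f"CDF + D0: {run2_slots} slots ({run2_pct:.1f}%)")
```

Under that assumption, CDF and D0 together account for a little over half of the listed slots, which gives a sense of the capacity involved in the Run II computing strategy.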

  12. Budget/resource allocation for 2012 +
     • There is always upward pressure for computing
       • more disk and more CPU leads to faster results and greater flexibility
       • more help with software & operations is always requested
     • Within a fixed budget each experiment can usually optimize between tape drives, tapes, disk, CPU, servers
       • assuming basic shared services are provided.
     • With so many experiments in so many different stages we intend to convene a “Scientific Computing Portfolio Management Team” to examine the needs/computing models of the different Fermilab-based experiments and help in allocating the finite dollars to optimize scientific output.

  13. “Data Preservation” for Tevatron data
     • Data will be stored and migrated to new tape technologies for ~10 years
       • Eventually 16 PB of data will seem modest
     • If we want to maintain the ability to reprocess and do analysis on the data, there is a lot of work to be done to keep the entire environment viable
       • Code, access to databases, libraries, I/O routines, operating systems, documentation...
     • If there is a goal to provide “open data” that scientists outside of CDF and D0 could use, there is even more work to do.
       • 4th Data Preservation Workshop at Fermilab in May
     • The collaboration has to decide soon whether we need to do more than maintain data for collaboration use.
