
E-Science and LCG-2 PPAP Summary


Presentation Transcript


  1. E-Science and LCG-2 PPAP Summary • Results from GridPP1/LCG1 • Value of the UK contribution to LCG? • Aims of GridPP2/LCG2 • UK special contribution to LCG2? • How much effort will be needed to continue activities during the LHC era?

  2. Outline • What has been achieved in GridPP1? [7’] (GridPP I, 09/01-08/04: prototype complete) • What is being attempted in GridPP2? [6’] (GridPP II, 09/04-08/07: production, short timescale) • What is the value of a UK LCG Phase-2 contribution? • Resources needed in the medium-long term? [10’] (09/07-08/10: exploitation, medium term; 09/10-08/14: exploitation, long term) • Focus on resources needed in 2008

  3. Executive Summary (Ref: http://www.gridpp.ac.uk/) • Introduction: the Grid is a reality • Project Management: a project was/is needed (under control via Project Map) • Resources: deployed according to planning • CERN: Phase 1.. Phase 2 • Middleware: prototype(s) made impact • Applications: fully engaged (value added) • Tier-1/A: Tier-1 production mode • Tier-2: resources now being utilised • Dissemination: UK flagship project • Exploitation: preliminary planning

  4. GridPP Deployment Status • GridPP deployment is part of LCG (currently the largest Grid in the world) • The future Grid in the UK is dependent upon LCG releases • Three Grids on a global scale in HEP (similar functionality): • LCG (GridPP): 82 (14) sites, 7,300 (1,500) CPUs • Grid3 [USA]: 29 sites, 2,800 CPUs • NorduGrid: 30 sites, 3,200 CPUs

  5. Deployment Status (26/10/04) • Incremental releases: significant improvements in reliability, performance and scalability • within the limits of the current architecture • scalability is much better than expected a year ago • Many more nodes and processors than anticipated • installation problems of last year overcome • many small sites have contributed to MC productions • Full-scale testing as part of this year’s data challenges • GridPP “The Grid becomes a reality” – widely reported, e.g. on the British Embassy (USA) and British Embassy (Russia) technology web sites

  6. Data Challenges • Ongoing.. • Grid and non-Grid production • Grid contribution now significant • ALICE: 35 CPU years; Phase 1 done, Phase 2 ongoing on LCG • CMS: 75 M events and 150 TB; the first of this year’s Grid data challenges • Entering Grid Production Phase..

  7. ATLAS Data Challenge • 7.7 M GEANT4 events and 22 TB • UK ~20% of LCG • Ongoing.. • Grid production (on 3 Grids) • ~150 CPU years so far • Largest total computing requirement • Small fraction of what ATLAS need.. • Entering Grid Production Phase..

  8. LHCb Data Challenge • 186 M produced events; Phase 1 completed • 424 CPU years (4,000 kSI2k months), 186M events in total • [Plot: daily production rate, with DIRAC alone at ~1.8 × 10^6 events/day rising to 3-5 × 10^6 events/day with LCG in action; annotations mark where LCG paused and restarted] • UK’s input significant (>1/4 of total) • LCG(UK) resource: Tier-1 7.7%; Tier-2 sites: London 3.9%, South 2.3%, North 1.4% • DIRAC: Imperial 2.0%, L'pool 3.1%, Oxford 0.1%, ScotGrid 5.1% • Entering Grid Production Phase..

  9. Paradigm Shift: Transition to Grid… • 424 CPU years in total • Monthly non-Grid : Grid split of DC’04 production: • May: 89% : 11% (11% of DC’04) • Jun: 80% : 20% (25% of DC’04) • Jul: 77% : 23% (22% of DC’04) • Aug: 27% : 73% (42% of DC’04)

  10. What was GridPP1? • A team that built a working prototype grid of significant scale: > 1,500 (7,300) CPUs, > 500 (6,500) TB of storage, > 1,000 (6,000) simultaneous jobs • A complex project where 82% of the 190 tasks for the first three years were completed • A Success: “The achievement of something desired, planned, or attempted”

  11. Aims for GridPP2? From Prototype to Production • [Diagram: evolution from separate experiment grids (2001: BaBarGrid, SAMGrid, EDG, GANGA, ARDA, … for BaBar, CDF, D0, ATLAS, LHCb, ALICE, CMS; CERN Computer Centre, RAL Computer Centre, 19 UK Institutes; separate experiments, resources, multiple accounts) through prototype Grids (2004: EGEE, LCG; CERN Prototype Tier-0 Centre, UK Prototype Tier-1/A Centre, 4 UK Prototype Tier-2 Centres) to ‘One’ Production Grid (2007: LCG; CERN Tier-0 Centre, UK Tier-1/A Centre, 4 UK Tier-2 Centres)]

  12. Planning: GridPP2 Project Map • Need to recognise future requirements in each area…

  13. Tier 0 and LCG: Foundation Programme • Aim: build upon Phase 1 • Ensure development programmes are linked • Project management: GridPP and LCG • Shared expertise • F. LHC Computing Grid Project (LCG Phase 2) [review] • LCG establishes the global computing infrastructure • Allows all participating physicists to exploit LHC data • Earmarked UK funding being reviewed • Required Foundation: LCG Deployment

  14. Tier 0 and LCG: RRB meeting today • Jos Engelen proposal to RRB members (Richard Wade [UK]) on how a 20 MCHF shortfall for LCG Phase II can be funded • Funding from the UK (£1m), France, Germany and Italy for 5 staff. Others? • Spain to fund ~2 staff. Others at this level? • Now vitally important that the LCG effort established predominantly via UK funding (40%) is sustained at this level (~10%) • URGENT • Value to the UK? • Required Foundation: LCG Deployment

  15. What lies ahead? Some mountain climbing.. • Annual data storage: 12-14 PetaBytes per year • CPU: 100 Million SPECint2000, i.e. ~100,000 PCs (3 GHz Pentium 4) • [Graphic: altitude analogy - a CD stack holding 1 year of LHC data would be ~20 km tall, versus Concorde (15 km) and “We are here” (1 km)] • In production terms, we’ve made base camp • Quantitatively, we’re ~7% of the way there in terms of CPU (7,000 of 100,000) and disk (4 of 12-14 PB/year × 3-4 years)… • Importance of step-by-step planning… Pre-plan your trip, carry an ice axe and crampons and arrange for a guide…
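A rough check of the “~7%” figure, using only the numbers quoted on this slide (100,000 PCs of CPU, 12-14 PB/year of storage over the first 3-4 years, and the ~7,000 CPUs and ~4 PB of disk currently deployed); the disk fraction depends on how many years of running are assumed:

\[
\frac{7{,}000\ \text{CPUs}}{100{,}000\ \text{CPUs}} = 7\%,
\qquad
\frac{4\ \text{PB}}{(12\text{-}14\ \text{PB/yr}) \times (3\text{-}4\ \text{yr})} \approx \frac{4}{36\text{-}56} \approx 7\text{-}11\%
\]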

  16. Grid and e-Science Support in 2008 • What areas require support? Four layers: I. Experiment Layer; II. Application Middleware; III. Grid Middleware; IV. Facilities and Fabrics • IV: Running the Tier-1 Data Centre • IV: Hardware annual upgrade • IV: Contribution to Tier-2 sysman effort for (non-PPARC) hardware • IV: Frontend Tier-2 hardware • IV: Contribution to Tier-0 support • III: One M/S/N expert in each of 6 areas • III: Production manager and four Tier-2 coordinators • II: Application/Grid experts (UK support) • I: ATLAS Computing MoU commitments and support • I: CMS Computing MoU commitments and support • I: LHCb Core Tasks and Computing Support • I: ALICE Computing support • I: Future experiments adopt e-Infrastructure methods • No GridPP management (assumes production mode established + devolved management to Institutes)

  17. PPARC Financial Input: GridPP1 Components • Grid Application Development: LHC and US Experiments + Lattice QCD • UK Tier-1/A Regional Centre: Hardware and Manpower • Management, Travel etc • LHC Computing Grid Project (LCG): Applications, Fabrics, Technology and Deployment • European DataGrid (EDG): Middleware Development

  18. PPARC Financial Input: GridPP2 Components • A. Management, Travel, Operations • B. Middleware, Security, Network Development • C. Grid Application Development: LHC and US Experiments + Lattice QCD + Phenomenology • D. Tier-2 Deployment: 4 Regional Centres - M/S/N support and System Management • E. Tier-1/A Deployment: Hardware, System Management, Experiment Support • F. LHC Computing Grid Project (LCG Phase 2) [review]

  19. IV. Hardware Support • Global shortfall of Tier-1 CPU (-13%) and Disk (-55%) • UK Tier-1 input corresponds to ~40% (~10%) of global disk (CPU) • UK Tier-2 CPU and disk resources significant • Rapid physics analysis turnaround is a necessity • Priority is to ensure that ALL required software (experiment, middleware, OS) is routinely deployed on this hardware well before 2008

  20. III. Middleware, Security and Network • Six areas (grouped under Security, Middleware and Networking): Network Monitoring, Configuration Management, Grid Data Management, Storage Interfaces, Information Services, Security • Require some support expertise in each of these areas in order to maintain the Grid • M/S/N builds upon UK strengths as part of international development

  21. II. Application Middleware • AliEn, BaBar, GANGA, Phenomenology, Lattice QCD, SAMGrid, CMS • Require some support expertise in each of these areas in order to maintain the Grid applications. • Need to develop e-Infrastructure portals for new experiments starting up in the exploitation era.

  22. ATLAS UK e-science forward look (Roger Jones) • Current core and infrastructure activities: • Run Time Testing and Validation Framework, tracking and trigger instantiations • Provision of ATLAS Distributed Analysis & production tools • Production management • GANGA development • Metadata development • ATLFast simulation • ATLANTIS Event Display • Physics Software Tools • ~11 FTEs, mainly ATLAS e-science with some GridPP & HEFCE • Current Tracking and Trigger e-science: • Alignment effort ~6 FTEs • Core software ~2.5 FTEs • Tracking tools ~6 FTEs • Trigger ~2 FTEs • The current e-science funding will only take us (at best) to first data • Expertise required for the real-world problems and maintenance • Note that for the HLT, installation and commissioning will continue into the running period because of staging • Both will move from development to optimisation & maintenance • Need ~15 FTE (beyond existing rolling grant) in 2007/9 - continued e-science/GridPP support

  23. CMS UK e-science forward look (Dave Newbold) • Computing system / support • Development / tuning of computing model + system; management • User support for T1 / T2 centres (globally); liaison with LCG ops • Monitoring / DQM • Online data gathering / ‘expert systems’ for CMS tracker, trigger • Tracker / ECAL software • Installation / calibration support; low-level reconstruction codes • Data management • PhEDEx system for bulk offline data movement and tracking • System-level metadata; movement of HLT farm data online (new area) • Analysis system • CMS-specific parts of distributed analysis system on LCG • NB: ‘first look’ estimates; will inevitably change as we approach running • Need ~9 FTE (beyond existing rolling grant) in 2007/9 - continued e-science/GridPP support

  24. LHCb UK e-science forward look (Nick Brook) • Current RICH & VELO e-science: • RICH: UK provides the bulk of the RICH s/w team, including the s/w coordinator; ~7 FTEs, about 50:50 e-science funding + rolling grant/HEFCE • VELO: UK provides the bulk of the VELO s/w team, including the s/w coordinator; ~4 FTEs, about 50:50 e-science funding + rolling grant/HEFCE • ALL essential alignment activities for both detectors are through e-science funding • Will move from development to maintenance and operational alignment: ~3 FTEs for alignment in 2007-9 • Current core activities: • GANGA development • Provision of DIRAC & production tools • Development of conditions DB • The production bookkeeping DB • Data management & metadata • Tracking • Data Challenge Production Manager • ~10 FTEs, mainly GridPP, e-science and studentships with some HEFCE support • Will move from development to maintenance phase - UK pro rata share of LHCb core computing activities ~5 FTEs • Need ~9 FTE (core + alignment + UK support) in 2007/9 - continued e-science support

  25. Grid and e-Science funding requirements • Priorities in the context of a financial snapshot in 2008 • Grid (£5.6m p.a.) and e-Science (£2.7m p.a.) • Assumes no GridPP project management • Savings? EGEE Phase 2 (2006-08) may contribute • UK e-Science context is: NGS (National Grid Service), OMII (Open Middleware Infrastructure Institute), DCC (Digital Curation Centre) • Timeline? To be compared with Road Map • Not a bid - preliminary input
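For reference, these two components account for the 2008 total quoted in the Summary slide:

\[
£5.6\text{m} + £2.7\text{m} = £8.3\text{m p.a.}
\]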

  26. Grid and e-Science Exploitation Timeline? • PPAP initial input: Oct 2004 • Science Committee initial input • PPARC call assessment (2007-2010): 2005 • Science Committee outcome: Oct 2005 • PPARC call: Jan 2006 • PPARC close of call: May 2006 • Assessment: Jun-Dec 2006 • PPARC outcome: Dec 2006 • Institute Recruitment/Retention: Jan-Aug 2007 • Grid and e-Science Exploitation: Sep 2007 - …. • Note: if the assessment from PPARC internal planning differs significantly from this preliminary advice from PPAP and SC, then earlier planning is required.

  27. Summary • What has been achieved in GridPP1? • Widely recognised as successful at many levels • What is being attempted in GridPP2? • Prototype to Production – typically the most difficult phase • UK should invest further in LCG Phase-2 • Resources needed for Grid and e-Science in the medium-long term? • Current Road Map ~£6m p.a. • Resources needed in 2008 estimated at £8.3m • Timeline for decision-making outlined..
