
CMS LHC-Computing



  1. CMS LHC-Computing Paolo Capiluppi Dept. of Physics and INFN Bologna

  2. Outline • Milestones and CMS-Italy Responsibilities • CCS (Core Computing and Software) milestones • Responsibilities (CMS Italy) • Productions (Spring 2002) • Goals and main issues • Available resources • Work done • Data Challenge 04 • Goals and plans • CMS Italy participation and plans (preliminary) • LCG role • Tier1 and Tier2s (and Tier3s) • LCG and Grid • What’s LCG • Grid Real Results and Strategies • Conclusion

  3. Milestones (CCS and … externals)

  4. CMS-Italy official Responsibilities • CCS SC (Core Computing and Software Steering Committee) • Grid Integration Level 2 manager (Claudio Grandi) • INFN contact (Paolo Capiluppi) • CCS FB (CCS Financial Board) • INFN contact (Paolo Capiluppi) • PRS (Physics Reconstruction and Software) • Being recruited/refocused for the Physics TDR • Muons (Ugo Gasparini) • Tracker/b-tau (Lucia Silvestris) • LCG (LHC Computing Grid Project) • SC2 (Software and Computing Steering Committee) (Paolo Capiluppi, alternate of David Stickland) • Detector Geometry & Material Description RTAG (Requirements Technical Assessment Group) chairperson (Lucia Silvestris) • HEPCAL (HEP Common Application Layer) RTAG (Claudio Grandi) • CCS Production Team • INFN contact (Giovanni Organtini)

  5. “Spring 2002 Production” (and Summer extension) • Goal of Spring 2002 Production: DAQ TDR simulations and studies • ~6 million events simulated, then digitized at different luminosities • NoPU (2.9M), 2x10^33 (4.4M), 10^34 (3.8M) • CMSIM started in February with CMS125 • Digitization with ORCA-6 started in April • First analysis completed (just!) in time for the June CMS week • Extension of activities: “Summer 2002 Production” • Ongoing ‘ntuple-only’ productions • High-pt jets for the e- group (10 M) • Non-recycled pileup for the JetMet group (300 K) • Over 20 TB of data produced CMS-wide • Most available at CERN, lots at FNAL and INFN • FNAL, INFN and the UK also hosting analysis • Some samples analyzed at various T2s (Padova/Legnaro, Bologna, …) • Production tools were obligatory: IMPALA, BOSS, DAR, RefDB • BOSS is an official CMS production tool: INFN-developed (A. Renzi and C. Grandi) and maintained (C. Grandi)! A minimal sketch of BOSS-style job bookkeeping follows.
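BOSS's role in this chain was to wrap a batch job, filter its output, and record extracted values in a database for monitoring. The following is a minimal sketch of that bookkeeping idea only, assuming nothing about BOSS's actual interface: the filter patterns, table layout and function names are all invented for illustration, and sqlite stands in for the real database.

    import re
    import sqlite3
    import subprocess

    # Hypothetical filter patterns: the real BOSS used user-supplied
    # filter scripts per job type; these regexes are invented.
    FILTERS = {
        "events_done": re.compile(r"processed\s+(\d+)\s+events"),
        "exit_status": re.compile(r"exit status:\s+(\d+)"),
    }

    def submit_and_track(job_id, command, db_path="boss_like.db"):
        """Run a job, scan its output with the filters, and record the
        extracted values in a local DB (sqlite stands in for MySQL)."""
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs (job_id TEXT, key TEXT, value TEXT)"
        )
        result = subprocess.run(command, capture_output=True, text=True, shell=True)
        for key, pattern in FILTERS.items():
            match = pattern.search(result.stdout)
            if match:
                conn.execute(
                    "INSERT INTO jobs VALUES (?, ?, ?)",
                    (job_id, key, match.group(1)),
                )
        conn.commit()
        conn.close()
        return result.returncode

    # Example: track a stand-in for a CMSIM/ORCA production job.
    submit_and_track("job-001", "echo 'processed 500 events'; echo 'exit status: 0'")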

  6. Spring02: CPU Resources • 11 RCs (~20 sites) • About 1000 CPUs and 30 people CMS-wide • Some new sites & people, but lots of experience too [Pie chart: share of CPU resources by site — INFN 18%, CERN 15%, IN2P3 10%, IC 6%, RAL 6%, Caltech 4%, Bristol 3%, UCSD 3%, HIP 1%, plus Wisconsin, UFL, Moscow and FNAL (5%, 18%, 8% and 10%, pairing unclear in the extracted layout)]

  7. Production in the RCs Thanks to: Giovanni Organtini (Rm), Luciano Barone (Rm), Alessandra Fanfani (Bo), Daniele Bonacorsi (Bo), Stefano Lacaprara (Pd), Massimo Biasotto (LNL), Simone Gennai (Pi), Nicola Amapane (To), et al.

  8. CMSIM: 6 million events at 1.2 seconds per event, sustained for 4 months (Feb. 8th to June 6th) [production-rate plot]

  9. DC04, 5% Data Challenge • Definition • It is 5% of 10^34 running, or 25% of 2x10^33 (startup): one month of “data taking” at CERN, 50 M events • It represents a factor 4 over Spring 2002, consistent with the goal of doubling complexity each year to reach a full-scale (for LHC startup) test by Spring 2006 • Called DC04 (and the others DC05, DC06) to get over the % confusion • More importantly: • Previous challenges have mostly been about doing the digitization • This one will concentrate on reconstruction, data distribution and the early analysis phase • Move the issue of the “Analysis Model” out of the classroom and into the spotlight

  10. Setting the Goals of DC04 • As defined to the LHCC, the milestone consists of: • CS-1041, 1 April 2004: 5% Data Challenge complete (now called DC04) • The purpose of this milestone is to demonstrate the validity of the software baseline to be used for the Physics TDR and in the preparation of the Computing TDR. The challenge comprises the completion of a “5% data challenge”, which successfully copes with a sustained data-taking rate equivalent to 25 Hz at a luminosity of 0.2 x 10^34 cm^-2 s^-1 for a period of 1 month (approximately 5 x 10^7 events). The emphasis of the challenge is on the validation of the deployed grid model on a sufficient number of Tier-0, Tier-1, and Tier-2 sites. We assume that 2-3 of the Tier-1 centers and 5-10 of the Tier-2 centers intending to supply computing to CMS in the 2007 first LHC run would participate in this challenge.

  11. DC04: Two Phases • Pre-Challenge (Must be successful) • Large-scale simulation and digitization • Will prepare the samples for the challenge • Will prepare the samples for the Physics TDR work to get fully underway • Progressive shakedown of tools and centers • All centers taking part in the challenge should participate in the pre-challenge • The Physics TDR and the Challenge depend on successful completion • Ensure a solid baseline is available, worry less about being on the cutting edge • Challenge (Must be able to fail) • Reconstruction at the “T0” (CERN) • Distribution to “T1s” • Subsequent distribution to “T2s” • Assign “streams” and “analyses” to people at T1 and T2 centers • Some will be able to work entirely within one center • Others will require analysis of data at multiple centers • Grid tools tested for data movement and job migration

  12. DC04: Setting the Scale • Aim is 1 month of “running” at 25 Hz, 20 hours per day • 50 million reconstructed events • (passing L1 Trigger and mostly passing HLT, but some background samples also required) • Pre-Challenge: • Simulation (GEANT4!) • 100 TB • 300 kSI95·months • A 1 GHz P3 is 50 SI95; working assumption is that most farms will be at 50 SI95/CPU in late 2003 • Six months of running for 1000 CPUs (worldwide) • (Actually aim for more CPUs to get production time down) • Digitization • 75 TB • 15 kSI95·months • 175 MB/s pileup bandwidth (if two months are allowed for digitization) • Challenge: • Reconstruction at the T0 (CERN) • 25 TB • 23 kSI95 for 1 month (460 CPUs @ 50 SI95/CPU) • Analysis at T1s-T2s • Design a set of tasks such that the offsite requirement during the challenge is about twice that of the “T0” (a quick arithmetic check of these numbers follows)
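To make the scale concrete, here is a quick arithmetic check of the numbers quoted on the slide; the 50 SI95/CPU conversion and the 20 h/day duty cycle are taken directly from it:

    # Check of the DC04 scale numbers quoted on the slide.

    rate_hz = 25          # sustained "data taking" rate
    hours_per_day = 20    # assumed duty cycle
    days = 30             # one month of running

    events = rate_hz * 3600 * hours_per_day * days
    print(f"events: {events / 1e6:.0f} M")   # ~54 M, i.e. the quoted ~50 M

    # Simulation: 300 kSI95.months at 50 SI95 per CPU.
    sim_ksi95_months = 300
    si95_per_cpu = 50
    cpu_months = sim_ksi95_months * 1000 / si95_per_cpu
    print(f"simulation: {cpu_months:.0f} CPU-months "
          f"= {cpu_months / 1000:.0f} months on 1000 CPUs")   # 6 months

    # Reconstruction at the T0: 23 kSI95 sustained for one month.
    reco_ksi95 = 23
    print(f"reconstruction: {reco_ksi95 * 1000 / si95_per_cpu:.0f} CPUs")   # 460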

  13. CMS-Italy and DC04 • Participation in the Challenge: ~20% contribution • Use of 1 Tier1 (common) and 3-4 Tier2s • All Italian sites will possibly participate in the pre-challenge phase • Use all available and validated (CMS-certified) Grid tools for the pre-challenge phase • Coordinate resources within LCG for both pre-challenge and challenge phases, where possible (Tier1/INFN must be fully functional: ~70 CPU boxes, ~20 TB) • Use the CMS Grid Integrated environment for the Challenge (February 2004) • Participate in the preparation: • Build the necessary resources and define the Italian commitments • Define the Data Flow Model • Validation of Grid tools • Integration of Grid and Production tools (review and re-design)

  14. CMS-Italy DC04 Preparation • Use the tail of “Summer Production” to test and validate resources and tools (grid and non-grid) • November/December 2002 • Participate in the Production-Tools Review • Now (Claudio Grandi, Massimo Biasotto) • Hopefully contribute to the new tools’ development (early 2003) • Make the “new” software available at all the sites (T1, T2s, T3s) • Use some of the resources to test and validate Grid Integration • Already in progress at the Tier1 (CMS resources) and at Padova • Commit and validate (for CMS) the resources for DC04 • See following slide • Define the participation in the LCG-1 system • See following slide

  15. CMS Italy DC04 preliminary plans • All the current and coming resources of CMS Italy will be available for DC04, possibly within the LCG Project • Small amount of resources requested for 2003 • Smoothly integrate the resources into LCG-1 • Continue to use dedicated resources for tests of Grid and Production tools’ integration • Funding needed for the other 3-4 Tier2s • Request for common CMS Italy funding, sub judice, in 2003: • Present a detailed plan and a clear Italian commitment to CMS • 60 CPUs and 6 TB of disk + switches • Will complete already existing farms • We are particularly “low” in disk storage availability • Essential for physics analysis

  16. CMS Italy DC04 LCG preliminary plans • Tier1 plans common to all experiments • See F. Ruggieri’s presentation • LNL was partially funded in 2002 (24 CPUs, 3 TB) for LCG participation; the remaining resources are funded directly by CMS.

  17. DC04 Summary • With the DAQ TDR about to be completed, the focus moves to the next round of preparations • The Data Challenge series, to reach full-scale tests in 2006 • The baseline for the Physics TDR • The prototypes required for CMS to write a CCS TDR in 2004 • Start to address the analysis model • Start to test the data and task distribution models • Perform realistic tests of the LCG Grid implementations • Build the distributed expertise required for LHC Computing • DC04 will occupy us for most of the next 18 months

  18. LCG • LCG = LHC Computing Grid project (PM: Les Robertson) • CERN-based coordination effort (hardware, personnel, software, middleware) for LHC Computing; worldwide! (Tier0, Tier1s and Tier2s) • Funded by the participating Agencies (INFN too) • Two phases: • 2002-2005: preparation and setting-up (including tests, R&D and support for the Experiments’ activities) • 2006-2008: commissioning of the LHC Computing System • Five (indeed four!) areas of activity for Phase 1: • Applications (common software and tools) (Torre Wenaus) • Fabrics (hardware, farms’ tools and architecture) (Bernd Panzer) • Grid Technologies (middleware development) (Fabrizio Gagliardi) • Grid Deployment (resources management and run) (Ian Bird) • Grid Deployment Board (agreements and plans) (Mirco Mazzucato) • Many Boards: POB (Funding), PEB (Executive), SC2 (Advisory), …

  19. Grid projects (CMS-Italy leading roles) • Integration of Grid tools and Production tools almost done (Italy, UK, France main contributions) (Thanks to CNAF people and DataTAG personnel) • We can submit (production) jobs to the DataGrid testbed via the CMS Production tools (modified IMPALA/BOSS/RefDB); a sketch of such a job description follows the list • Prototypes working correctly on the DataTAG test layout • Will test at large scale on the DataGrid/LCG Production Testbed • Will measure performance to compare with “summer production” classic jobs (November 2002) • Integration of EU/US Grid/production tools • Already in progress in the GLUE activity • Most of the design (not only for CMS) is ready; implementation in progress • Target for (first) delivery by end of 2002
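The slides do not spell out what such a grid submission looked like. As a minimal illustrative sketch, assuming a job description in the EDG JDL style consumed by the DataGrid workload management system, the snippet below assembles one; the attribute names (Executable, InputSandbox, …) are standard EDG JDL, while the script name, sandbox contents and Requirements expression are hypothetical placeholders.

    # Minimal sketch: assemble an EDG-style JDL for one production job.
    # The JDL attribute names are standard EDG JDL; the executable,
    # sandbox file names and Requirements expression are hypothetical.

    def make_jdl(job_name: str, run_number: int) -> str:
        return "\n".join([
            "[",
            f'Executable = "{job_name}.sh";',
            f'Arguments = "{run_number}";',
            'StdOutput = "job.out";',
            'StdError = "job.err";',
            f'InputSandbox = {{"{job_name}.sh", "orcarc"}};',
            'OutputSandbox = {"job.out", "job.err"};',
            "Requirements = other.GlueCEPolicyMaxWallClockTime > 1440;",
            "]",
        ])

    print(make_jdl("cmsim_prod", 17))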

  20. Proposal for a DC04 diagram [Flow diagram: the CMS production tools (IMPALA/MOP for job creation, BOSS with BOSS-DB and R-GMA for job monitoring and output filtering, Dataset Catalogues with REPTOR/Giggle, Chimera and PACMAN as candidate technologies) coupled to grid services on the EU side (EDG Workload Management System, UI, CE, SE, L&B, MDS, LDAP) and the US side (VDT Client, VDT Server, VDT Planner). The flow runs from a new dataset request, through dataset input and algorithm specification, job creation and job assignment to resources (DAG/JDL + scripts), to job submission, production monitoring, data management operations, software release download and installation, and publishing/retrieving resource status. A toy walk-through of this chain follows.]
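Purely to make the proposed chain easier to follow, here is a toy end-to-end model of the diagram's flow; every class and function name below is invented for illustration, and none of the real EDG/VDT interfaces are used.

    # Toy model of the proposed DC04 chain: dataset request -> job
    # creation -> submission -> monitoring. All names are invented;
    # this mirrors the diagram's flow, not any real API.

    from dataclasses import dataclass, field

    @dataclass
    class DatasetCatalogue:
        """Stands in for the RefDB/REPTOR-style dataset catalogues."""
        entries: dict = field(default_factory=dict)

        def new_request(self, name: str, algorithm: str) -> None:
            # "New dataset request" + "Dataset Algorithm Specification"
            self.entries[name] = {"algorithm": algorithm, "jobs": []}

    @dataclass
    class ProductionSystem:
        """Stands in for IMPALA/MOP (creation) plus BOSS (monitoring)."""
        catalogue: DatasetCatalogue
        boss_db: list = field(default_factory=list)

        def create_and_submit(self, dataset: str, n_jobs: int) -> None:
            # "Job creation" then "Job submission"; each job leaves a
            # monitoring record, as BOSS-DB does in the diagram.
            for i in range(n_jobs):
                job = {"dataset": dataset, "id": i, "status": "submitted"}
                self.catalogue.entries[dataset]["jobs"].append(job)
                self.boss_db.append(job)

    catalogue = DatasetCatalogue()
    catalogue.new_request("hi_pt_jets", algorithm="ORCA digitization")
    prod = ProductionSystem(catalogue)
    prod.create_and_submit("hi_pt_jets", n_jobs=3)
    print(len(prod.boss_db), "jobs tracked")   # 3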

  21. Conclusion CMS Italy is a leader in CMS Computing. We believe we have demonstrated this, and we want to continue. We ask CSN1 for its support to carry out Data Challenge 04, and those that will follow.
