
The ATLAS Computing Model



Presentation Transcript


  1. The ATLAS Computing Model. Eric Lançon (Saclay). LCG-France workshop, Lyon, 14-15 Dec. 2005

  2. Overview • The ATLAS Facilities and their roles • Growth of resources: CPU, Disk, Mass Storage • Network requirements: CERN ↔ Tier 1 ↔ Tier 2 • Data & Service Challenges

  3. Computing Resources • Computing Model fairly well evolved, but still being revised • Documented in: http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc-2005-022.pdf • There are (and will remain for some time) many unknowns • Calibration and alignment strategy is still evolving • Physics data access patterns MAY start to be exercised this Spring • Unlikely to know the real patterns until 2007/2008! • Still uncertainties on the event sizes and reconstruction time • Lesson from the previous round of experiments at CERN (LEP): reviews in 1988 underestimated the computing requirements by an order of magnitude!

  4. ATLAS Facilities • Event Filter Farm at CERN (pit) • Located near the Experiment • Assembles data into a stream to the Tier 0 Center • Tier 0 Center at CERN (computer center) • Raw data → mass storage at CERN and to Tier 1 centers • Prompt reconstruction producing Event Summary Data (ESD) and Analysis Object Data (AOD) • Ship ESD, AOD to Tier 1 centers • Tier 1 Centers distributed worldwide (approximately 10 centers) • Re-reconstruction of raw data, producing new ESD, AOD • Tier 2 Centers distributed worldwide (approximately 30 centers) • Monte Carlo Simulation, producing ESD, AOD • ESD, AOD → Tier 1 centers • Physics analysis • CERN Analysis Facility • Tier 3 Centers distributed worldwide • Physics analysis
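To keep the division of labour in one place, here is a small illustrative summary of the facility roles listed above, written as a plain Python mapping. The structure and names are ours, not part of any ATLAS software; the roles themselves are taken from the slide.

```python
# Illustrative summary of the facility roles in the ATLAS Computing Model.
# The mapping is a reading aid only; the roles are those listed on the slide.

FACILITY_ROLES = {
    "Event Filter farm (CERN pit)": [
        "assemble events into a stream to the Tier 0 centre",
    ],
    "Tier 0 (CERN computer centre)": [
        "archive RAW to mass storage and ship it to Tier 1s",
        "prompt reconstruction producing ESD and AOD",
        "ship ESD and AOD to Tier 1s",
    ],
    "Tier 1 (~10 centres)": [
        "re-reconstruction of RAW producing new ESD and AOD",
    ],
    "Tier 2 (~30 centres)": [
        "Monte Carlo simulation producing ESD and AOD, sent back to Tier 1s",
        "physics analysis",
    ],
    "CERN Analysis Facility": ["physics analysis"],
    "Tier 3 (worldwide)": ["physics analysis"],
}

if __name__ == "__main__":
    for facility, roles in FACILITY_ROLES.items():
        print(facility)
        for role in roles:
            print("  -", role)
```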

  5. Processing • Tier-0: • First pass processing on express/calibration physics stream • 24-48 hours later, process full physics data stream with reasonable calibrations • These imply large data movement from T0 to T1s • Tier-1: • Reprocess 1-2 months after arrival with better calibrations • Reprocess all resident RAW at year end with improved calibration and software • These imply large data movement from T1 to T1 and T1 to T2
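To make the calibration-driven latencies concrete, the sketch below reports which processing passes a run should have received by a given date. The 48-hour and roughly two-month delays come from the slide; the function itself and the example dates are only an illustration.

```python
from datetime import date, timedelta

# Latencies quoted on the slide (upper ends of the quoted ranges);
# the example dates below are arbitrary.
FIRST_PASS_DELAY = timedelta(hours=48)   # full stream processed 24-48 h later
REPROCESS_DELAY = timedelta(days=60)     # Tier-1 reprocessing 1-2 months after arrival

def passes_done(taken: date, today: date, year_end: date) -> list:
    """List the processing passes a run taken on `taken` should have by `today`."""
    done = ["Tier-0 first pass on the express/calibration stream"]
    if today >= taken + FIRST_PASS_DELAY:
        done.append("Tier-0 processing of the full physics stream")
    if today >= taken + REPROCESS_DELAY:
        done.append("Tier-1 reprocessing with better calibrations")
    if today >= year_end:
        done.append("Tier-1 year-end reprocessing of all resident RAW")
    return done

print(passes_done(date(2008, 5, 1), date(2008, 9, 1), date(2008, 12, 31)))
```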

  6. Processing cont’d • Tier-1: • 1/10 of RAW data and derived samples • Shadow the ESD for another Tier-1 (e.g. 2/10 of whole sample) • Full AOD sample • Reprocess 1-2 months after arrival with better calibrations (to produce a coherent dataset) • Reprocess all resident RAW at year end with improved calibration and software • Provide scheduled access to ESD samples • Tier-2s: • Provide access to AOD and group Derived Physics Datasets • Carry the full simulation load

  7. Analysis Model • Analysis model broken into two components: • Scheduled central production of augmented AOD, tuples & TAG collections from ESD • Derived files moved to other T1s and to T2s • Chaotic user analysis of augmented AOD streams, tuples, new selections, etc., plus individual user simulation and CPU-bound tasks matching the official MC production • Modest job traffic between T2s

  8. Inputs to the ATLAS Computing Model (1)

  9. Inputs to the ATLAS Computing Model (2)

  10. Data Flow from experiment to T2s • T0 Raw data → Mass Storage at CERN • T0 Raw data → Tier 1 centers • 1/10 in each T1 • T0 ESD → Tier 1 centers • 2 copies of ESD distributed worldwide • 2/10 in each T1 • T0 AOD → Each Tier 1 center • T1 → T2 • Some ESD, ALL AOD? • T0 → T2 Calibration processing? • Dedicated T2s for some sub-detectors
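Those fractions translate directly into a rough per-Tier-1 yearly volume. In the sketch below, only the replication fractions (1/10 of RAW, 2/10 of ESD, a full AOD copy) come from the slide; the event sizes and the number of events per year are placeholder assumptions, since the actual inputs were on the "Inputs to the ATLAS Computing Model" slides, which are not reproduced in this transcript.

```python
# Hedged estimate of the yearly volume shipped from the Tier 0 to one Tier 1.
# Only the fractions are from the slide; sizes and event count are assumptions.

EVENTS_PER_YEAR = 2e9                              # assumed: ~200 Hz over ~10^7 s
SIZES_MB = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1}    # assumed per-event sizes
SHARE = {"RAW": 0.1, "ESD": 0.2, "AOD": 1.0}       # per-Tier-1 fractions (slide)

def tier1_volume_tb():
    """Yearly volume per data type at an average Tier 1, in TB."""
    return {k: EVENTS_PER_YEAR * SIZES_MB[k] * SHARE[k] / 1e6 for k in SIZES_MB}

volumes = tier1_volume_tb()
for kind, tb in volumes.items():
    print(f"{kind}: ~{tb:.0f} TB/year")
print(f"total: ~{sum(volumes.values()):.0f} TB/year")
```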

  11. Total ATLAS Requirements for 2008

  12. Important points • Storage of Simulation data from Tier 2s • Assumed to be at T1s • Need partnerships to plan networking • Must have fail-over to other sites? • Commissioning • These numbers are calculated for the steady state but with the requirement of flexibility in the early stages • Simulation fraction is an important tunable parameter in T2 numbers! • Simulation / Real Data = 20% in TDR • Heavy Ion running still under discussion.

  13. ATLAS T0 Resources

  14. ATLAS T1 Resources

  15. ATLAS T2 Resources

  16. Resource pledges per Tier 1 for 2008

  17. Tier 1 ↔ Tier 2 Bandwidth. The projected time profile of the nominal aggregate bandwidth expected for an average ATLAS Tier-1 and its three associated Tier-2s.

  18. Tier 1 ↔ CERN Bandwidth. The projected time profile of the nominal bandwidth required between CERN and the Tier-1 cloud.
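Since the plot itself is not reproduced here, the order of magnitude of the nominal CERN to Tier-1 rate during data taking can be sketched from the same placeholder inputs used above; only the replication fractions come from the Computing Model slides, the trigger rate and event sizes are assumptions.

```python
# Rough nominal CERN -> Tier 1 rate during data taking.
# Trigger rate and event sizes are assumptions; fractions are from the slides.

TRIGGER_HZ = 200.0
SIZES_MB = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1}
SHARE = {"RAW": 0.1, "ESD": 0.2, "AOD": 1.0}       # per-Tier-1 shares

per_t1_mb_s = sum(TRIGGER_HZ * SIZES_MB[k] * SHARE[k] for k in SIZES_MB)
print(f"one Tier 1: ~{per_t1_mb_s:.0f} MB/s (~{per_t1_mb_s * 8 / 1000:.1f} Gb/s)")
print(f"ten Tier 1s: ~{per_t1_mb_s * 10 / 1000:.1f} GB/s out of CERN")
```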

  19. Tier 1 ↔ Tier 1 Bandwidth. The projected time profile of the nominal bandwidth required between Tier-1s.

  20. Conclusions • Computing Model data flow understood for placing RAW, ESD and AOD at the tiered centers • Still need to understand data flow implications of physics analysis • How often do you need to “back navigate” from AOD to ESD? • How distributed is Distributed Analysis? • Some of these issues will be addressed in the upcoming (early 2006) Computing System Commissioning exercise • Some will only be resolved with real data in 2007-08

  21. Key dates for service preparation • Timeline 2005-2008 (from the chart): SC3, SC4, cosmics, first beams, first physics, full physics run, LHC service operation • Sep 05: SC3 service phase • May 06: SC4 service phase • Sep 06: initial LHC service in stable operation • Apr 07: LHC service commissioned • SC3: reliable base service; most Tier-1s, some Tier-2s; basic experiment software chain; grid data throughput 1 GB/s, including mass storage at 500 MB/s (150 MB/s and 60 MB/s at Tier-1s) • SC4: all Tier-1s, major Tier-2s; capable of supporting the full experiment software chain including analysis; sustain nominal final grid data throughput (~1.5 GB/s mass storage throughput) • LHC service in operation: September 2006; ramp up to full operational capacity by April 2007; capable of handling twice the nominal data throughput
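As a sanity check on those throughput targets, the short sketch below converts a sustained rate into a daily volume; this is plain arithmetic, not a number taken from the slides.

```python
# Daily volume implied by a sustained throughput target.

def daily_volume_tb(rate_gb_s):
    """TB moved per day at a sustained rate given in GB/s."""
    return rate_gb_s * 86400 / 1000          # 86400 s per day, 1000 GB per TB

for label, rate in [("SC3 grid data throughput (1 GB/s)", 1.0),
                    ("SC4 nominal throughput (~1.5 GB/s)", 1.5)]:
    print(f"{label}: ~{daily_volume_tb(rate):.0f} TB/day")
```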

  22. Data Challenges

  23. Statistics on LCG usage • 305,410 jobs • IN2P3 = Clermont + Lyon • Quite far from the ambitions • Many technical problems at start-up • Towards the end, 10% of LCG at the CC

  24. SC3: T0 → T1 exercise • ATLAS T0 - T1 dataflow during SC3

  25. Distributed Data Management (DDM) • Central global catalogues (LFC): • contents of the datasets (1 dataset = several files) • location of the datasets across the sites (T0-T1-T2) • list of dataset transfer requests, etc. • Local catalogues (LFC): • location within the site of the files of each dataset • The site takes charge, through agents (VOBox), of: • retrieving from the central catalogue the list of datasets and associated files to be transferred • managing the transfer • registering the information in the local and central catalogues
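A minimal sketch of the site-agent loop described above, assuming a central catalogue that knows dataset contents, locations and transfer requests, and a local catalogue that records where each file sits at the site. All class, method and dataset names are hypothetical; they are not the actual ATLAS DDM interfaces, and the transfer step stands in for whatever tool (e.g. FTS) the site uses.

```python
# Hypothetical sketch of one cycle of the DDM site agent running on the VOBox:
# fetch the datasets subscribed to this site, transfer their files, and
# register the results in the local and central catalogues.

class CentralCatalogue:
    """Central catalogues: dataset contents, locations, transfer requests."""
    def __init__(self):
        self.datasets = {"some.example.AOD": ["f1.pool.root", "f2.pool.root"]}
        self.subscriptions = {"CCIN2P3": ["some.example.AOD"]}
        self.replicas = {}                     # dataset -> sites holding it

    def subscriptions_for(self, site):
        return self.subscriptions.get(site, [])

    def files_in(self, dataset):
        return self.datasets[dataset]

    def register_replica(self, dataset, site):
        self.replicas.setdefault(dataset, []).append(site)


class LocalCatalogue:
    """Local catalogue (LFC): location of each file of a dataset at the site."""
    def __init__(self):
        self.files = {}                        # logical file name -> local path

    def register_file(self, lfn, path):
        self.files[lfn] = path


def transfer(lfn, site):
    """Stand-in for the real transfer (e.g. an FTS job); returns the destination."""
    return f"/storage/{site}/{lfn}"


def agent_cycle(site, central, local):
    """One VOBox agent cycle: fetch subscriptions, transfer, register."""
    for dataset in central.subscriptions_for(site):
        for lfn in central.files_in(dataset):
            local.register_file(lfn, transfer(lfn, site))   # local catalogue
        central.register_replica(dataset, site)             # central catalogue


central, local = CentralCatalogue(), LocalCatalogue()
agent_cycle("CCIN2P3", central, local)
print(local.files)
print(central.replicas)
```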

  26. DDM at the CC • 1 dCache server • 2 pools of 1-2 TB each + tape drive: 40 MB/s • Transfers via FTS (Géant, 1 Gb/s) • When will we have 10 Gb/s? • LCG VOBox, CC-IN2P3 mode • Linux SL3 machine • LFC catalogue • FTS server (future T1-T1 or T1-T2 transfers) • Grid certificates • gsissh access • DDM software installed • No root access needed • Can be installed from CERN • Problem: cron is not managed per grid user
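To put the quoted figures in perspective, the sketch below estimates how long moving 1 TB takes through each bottleneck; the 1 TB example size is arbitrary, while the 40 MB/s tape drive and the 1 Gb/s link are from the slide and 10 Gb/s is the upgrade the slide asks about.

```python
# Time to move 1 TB through the bottlenecks quoted on the slide.
# 1 TB is an arbitrary example size; link rates are theoretical maxima.

DATASET_TB = 1.0
RATES_MB_S = {
    "tape drive (40 MB/s)": 40.0,
    "1 Gb/s link (~125 MB/s)": 1000 / 8,
    "10 Gb/s link (~1250 MB/s)": 10000 / 8,
}

for name, rate in RATES_MB_S.items():
    hours = DATASET_TB * 1e6 / rate / 3600
    print(f"{name}: ~{hours:.1f} h for {DATASET_TB:.0f} TB")
```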

  27. Service Challenges

  28. Integrated TB transferred

  29. Backup Slides

  30. From LATB…

  31. Heavy Ion Running

  32. Preliminary Tier-1 Resource Planning: capacity at all Tier-1s in 2008

  33. Preliminary Tier-2 Resource Planning: capacity at all Tier-2s in 2008 • Includes resource planning from 27 centres/federations • 11 known Tier-2 federations have not yet provided data • These include potentially significant resources: USA CMS, Canada East+West, MPI Munich …

  34. ‘Rome’ production: French sites

  35. ‘Rome’ production: Italian sites

  36. Distributed Data Management (DDM) • Avoid using a completely centralised flat catalogue • Hierarchical organisation of the data catalogues • Dataset = collection of files • Datablock = collection of datasets • Datasets are tagged to maintain full consistency • Information about each physical file is stored locally • Data movement is done per dataset • Movement to a site is triggered by a subscription placed through a client program
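A minimal sketch of the hierarchical data model just described: files grouped into datasets, datasets into datablocks, datasets tagged with a version, and movement triggered by subscribing a dataset to a site. The class and field names are illustrative, not the real DDM schema.

```python
# Illustrative model of the DDM hierarchy and of subscription-driven movement.
# Names and fields are not the actual ATLAS DDM schema.

from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    version: int = 1                              # tag to keep content consistent
    files: list = field(default_factory=list)     # logical file names

@dataclass
class DataBlock:
    name: str
    datasets: list = field(default_factory=list)

@dataclass
class Subscription:
    dataset: Dataset
    site: str                                     # destination site (T0/T1/T2)

def subscribe(dataset, site, queue):
    """Client-side request: replicate a whole dataset to a site."""
    queue.append(Subscription(dataset, site))

queue = []
aod = Dataset("some.example.AOD", files=["f1.pool.root", "f2.pool.root"])
block = DataBlock("some.example.block", datasets=[aod])
subscribe(aod, "CCIN2P3", queue)
print([(s.dataset.name, s.site) for s in queue])
```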
