
ATLAS Data Challenges



  1. ATLAS Data Challenges
  ATLAS Software Workshop, CERN, September 20th 2001
  Gilbert Poulard, CERN EP-ATC

  2. From CERN Computing Review
  • CERN Computing Review (December 1999 - February 2001)
  • Recommendations:
    • organize the computing for the LHC era
    • LHC Grid project
      • Phase 1: development & prototyping (2001-2004)
      • Phase 2: installation of the 1st production system (2005-2007)
    • Software & Computing Committee (SC2)
    • Proposal being submitted to the CERN Council
    • Ask the experiments to validate their computing model by iterating on a set of Data Challenges of increasing complexity

  3. LHC Computing GRID project
  • Phase 1:
    • prototype construction
      • develop Grid middleware
      • acquire experience with high-speed wide-area networks
      • develop a model for distributed analysis
      • adapt LHC applications
      • deploy a prototype (CERN + Tier1 + Tier2)
    • Software
      • complete the development of the 1st version of the physics applications and enable them for the distributed grid model
      • develop & support common libraries, tools & frameworks
        • including simulation, analysis, data management, ...
      • in parallel, the LHC collaborations must develop and deploy the first version of their core software

  4. ATLAS Data Challenges
  • Goal
    • understand and validate our computing model and our software
  • How?
    • Iterate on a set of DCs of increasing complexity
      • start with data which looks like real data
      • run the filtering and reconstruction chain
      • store the output data in our database
      • run the analysis
      • produce physics results
    • study
      • performance issues, database technologies, analysis scenarios, ...
    • identify
      • weaknesses, bottlenecks, etc.

  5. ATLAS Data Challenges
  • But: today we don't have 'real data'
    • we need to produce 'simulated data' first
  • So:
    • physics event generation
    • simulation
    • pile-up
    • detector response
  • Plus reconstruction and analysis will be part of the first Data Challenges
  • We also need to "satisfy" the ATLAS communities
    • HLT, physics groups, ...

  6. ATLAS Data Challenges
  • DC0: November-December 2001
    • 'continuity' test through the software chain
    • aims primarily to check the state of readiness for DC1
  • DC1: February-July 2002
    • reconstruction & analysis on a large scale
      • learn about the data model; I/O performance; identify bottlenecks ...
    • data management
    • should involve CERN & outside-CERN sites
    • scale: 10^7 events in 10-20 days, O(1000) PCs (a rough per-node throughput estimate is sketched below)
    • data needed by HLT (others?)
      • simulation & pile-up will play an important role
    • checking of Geant4 versus Geant3
  • DC2: January-September 2003
    • use the 'prototype', Grid middleware
    • increased complexity
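As a rough cross-check of the DC1 scale quoted above (10^7 events in 10-20 days on O(1000) PCs), the sketch below works out the implied per-node throughput. The event count, farm size and duration are taken from the slide; everything else is plain arithmetic.

```python
# Rough DC1 throughput estimate from the numbers on this slide:
# 10^7 events, 10-20 days, O(1000) PCs.
total_events = 1e7
n_pcs = 1000
seconds_per_day = 86_400

for days in (10, 20):
    events_per_pc = total_events / n_pcs            # ~10^4 events per PC overall
    events_per_pc_per_day = events_per_pc / days    # 500-1000 events/PC/day
    seconds_per_event = seconds_per_day / events_per_pc_per_day
    print(f"{days} days: {events_per_pc_per_day:.0f} events/PC/day, "
          f"~{seconds_per_event:.0f} s/event per PC")
# -> roughly 1.5-3 minutes of wall time per event on a single 2001-era PC.
```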

  7. DC scenario
  • Production chain:
    • event generation
    • simulation
    • pile-up
    • detector response
    • reconstruction
    • analysis
  (a minimal sketch of how such a chain could be strung together follows below)
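The following is a minimal sketch of the production chain as a sequence of stages, just to make the data flow concrete. The stage names come from the slide; the functions and file naming are hypothetical placeholders, not the actual ATLAS production tools.

```python
# Minimal sketch of the DC production chain as a sequence of stages.
# The stage names come from the slide; the functions and file names are
# illustrative placeholders.

def run_stage(name, infile, outfile):
    """Placeholder for invoking one production step (generator, Geant, ...)."""
    print(f"[{name}] {infile} -> {outfile}")
    return outfile

def production_chain(run_number):
    f = f"run{run_number:06d}.gen"        # event generation output
    f = run_stage("event generation", "generator.cards", f)
    f = run_stage("simulation",        f, f.replace(".gen", ".hits"))
    f = run_stage("pile-up",           f, f.replace(".hits", ".pileup"))
    f = run_stage("digitization",      f, f.replace(".pileup", ".digits"))
    f = run_stage("reconstruction",    f, f.replace(".digits", ".esd"))
    f = run_stage("analysis",          f, f.replace(".esd", ".ntuple"))
    return f

if __name__ == "__main__":
    production_chain(1)
```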

  8. Production stream

  9. Event generation
  • The type of events has to be defined
  • Several event generators will probably be used
    • for each of them we have to define the version
      • in particular Pythia
      • should it be a special ATLAS one? (size of common block)
    • we also have to ensure that it runs for large statistics
  • Both the event types & the event generators have to be defined by
    • the HLT group (for HLT events)
    • the physics community
  • Depending on the output we can use the following frameworks
    • ATGEN/GENZ
      • for ZEBRA output format
    • Athena
      • for output in the OO-db (HepMC)
    • we could also consider using only one framework and 'converting' the output from one format to the other (OO-db to ZEBRA or ZEBRA to OO-db), depending on the choice. I don't think this is realistic.
  (a sketch of recording the generator configuration per job follows below)
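One way to keep large-statistics generation reproducible is to record, for every job, exactly which generator, version and output path were used. The sketch below is hypothetical: the class, the field names and the example values (e.g. the Pythia version string) are illustrative, not the ATLAS generation tools.

```python
# Hypothetical sketch: capture the generator choice, version and output
# format for each generation job, so that large-statistics samples stay
# reproducible.  Names and fields are illustrative only.
from dataclasses import dataclass, asdict
import json

@dataclass
class GenJobConfig:
    generator: str        # e.g. "Pythia"
    version: str          # exact generator version used
    framework: str        # "ATGEN/GENZ" (ZEBRA) or "Athena" (HepMC in OO-db)
    output_format: str    # "ZEBRA" or "OO-db"
    n_events: int
    random_seed: int      # must differ between jobs of the same sample

cfg = GenJobConfig("Pythia", "6.1", "Athena", "OO-db", 5000, 12345)
# Write the configuration alongside the output so the bookkeeping can pick it up.
with open("genjob_000001.json", "w") as f:
    json.dump(asdict(cfg), f, indent=2)
```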

  10. Simulation
  • The goal here is to track the particles generated by the event generator through the detector.
  • We can use either Geant3 or Geant4
    • for HLT & physics studies we still rely on Geant3
    • I think that Geant4 should also be used
      • to get experience with 'large production' as part of its validation
    • it would be good to use the same geometry
      • 'same geometry' has to be defined
      • this is a question for the 'simulation' group
      • in the early stage we could decide to use only part of the detector
    • it would also be good to use the same sample of generated events
      • this also has to be defined by the 'simulation' group
  • For Geant3 simulation we will use either the "Slug/Dice" framework or the "Atlsim" framework
    • in both cases the output will be ZEBRA ("hits" and "deposited energy" for the calorimeters)
  • For Geant4 simulation I think that we will use the FADS/Goofy framework
    • output will be 'hits collections' in the OO-db

  11. Pile-up & digitization
  • We have a few possible scenarios
  • Work in the "Slug/Dice" or "Atlsim" framework
    • input is ZEBRA
    • output is ZEBRA
    • advantage: we have the full machinery in place
  • Work in the "Athena" framework; 2 possibilities
    • 1) 'mixed'
      • input is hits from ZEBRA
      • 'digits' and 'digits collections' are produced
      • output is 'digits collections' in the OO-db
    • 2) 'pure' Athena
      • input is 'hits collections' from the OO-db
      • 'digits' and 'digits collections' are produced
      • output is 'digits collections' in the OO-db
  • We have to evaluate the consequences of the choice
  (a minimal sketch of the pile-up mixing step follows below)
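To make the pile-up step concrete, here is a minimal sketch of the mixing logic: overlaying a Poisson-distributed number of pre-simulated minimum-bias events on top of a signal event before digitization. This is not the ATLAS machinery; the hit structure and the mean pile-up multiplicity `mu` are assumptions used only to illustrate the idea.

```python
# Minimal sketch of pile-up mixing: merge a signal event with a
# Poisson(mu) number of minimum-bias events drawn from a pre-simulated pool.
import math
import random

def poisson(mu, rng):
    """Draw from a Poisson distribution (Knuth's method, adequate here)."""
    L, k, p = math.exp(-mu), 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= L:
            return k - 1

def mix_pileup(signal_hits, minbias_pool, mu, rng=random):
    """Return the signal hits merged with the hits of n ~ Poisson(mu)
    minimum-bias events taken from the pool."""
    merged = list(signal_hits)
    for _ in range(poisson(mu, rng)):
        merged.extend(rng.choice(minbias_pool))
    return merged

# Example: one signal event, a tiny pool of pre-simulated minimum-bias
# events, and an assumed mean of ~23 interactions per bunch crossing.
signal = [("pixel", 1, 0.2)]                      # (detector, channel, energy)
pool = [[("pixel", 2, 0.1)], [("sct", 7, 0.3)], [("lar", 42, 1.1)]]
print(len(mix_pileup(signal, pool, mu=23)))
```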

  12. Reconstruction
  • Reconstruction
    • we want to use the 'new reconstruction' code run in the Athena framework
    • input should be from the OO-db
    • output in the OO-db:
      • ESD (event summary data)
      • AOD (analysis object data)
      • TAG (event tag)
    • Atrecon could be a back-up possibility
      • to be decided
  (a toy illustration of the ESD/AOD/TAG tiers follows below)
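To illustrate the three output tiers named on the slide, here is a toy sketch of how ESD, AOD and TAG relate to one another. The fields are invented for illustration; the real ATLAS event data model is far richer and lives in the OO database, not in Python.

```python
# Toy illustration of the three reconstruction output tiers.
from dataclasses import dataclass, field

@dataclass
class ESD:                      # event summary data: detailed reco output
    event_id: int
    tracks: list = field(default_factory=list)
    calo_clusters: list = field(default_factory=list)

@dataclass
class AOD:                      # analysis object data: reduced physics objects
    event_id: int
    electrons: list = field(default_factory=list)
    jets: list = field(default_factory=list)     # e.g. [("jet", pt), ...]
    missing_et: float = 0.0

@dataclass
class TAG:                      # event tag: a few selection variables per event
    event_id: int
    n_jets: int
    max_jet_pt: float

def summarize(aod: AOD) -> TAG:
    """Derive the event tag from the AOD (illustrative only)."""
    pts = [pt for (_, pt) in aod.jets]
    return TAG(aod.event_id, len(pts), max(pts, default=0.0))

print(summarize(AOD(1, jets=[("jet", 42.0), ("jet", 17.5)])))
```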

  13. Analysis
  • We are just starting to work on this, but
    • evaluation of analysis tools should be part of the DC
    • it will be a good test of the Event Data Model
    • performance issues should be evaluated
  • The analysis scenario
    • number of analysis groups, number of physicists per group, number of people who want to access the data at the same time
    • is of prime importance for 'designing' the analysis environment
      • to measure the response time
      • to identify the bottlenecks
    • for that we need input from you
  (a back-of-the-envelope concurrency estimate is sketched below)
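The following sketch shows why the analysis scenario matters for the design: the aggregate load on the data servers scales directly with the number of concurrent users. Every number below is an assumed placeholder, not an ATLAS figure; only the shape of the estimate is the point.

```python
# Back-of-the-envelope estimate of the analysis load on the data servers.
n_groups = 10                 # assumed number of analysis groups
physicists_per_group = 20     # assumed
fraction_active = 0.25        # assumed fraction reading data at the same time
mb_per_s_per_user = 2.0       # assumed sustained read rate per active user

concurrent_users = n_groups * physicists_per_group * fraction_active
aggregate_mb_per_s = concurrent_users * mb_per_s_per_user
print(f"{concurrent_users:.0f} concurrent users -> "
      f"~{aggregate_mb_per_s:.0f} MB/s aggregate from the data servers")
# -> 50 concurrent users, ~100 MB/s with these assumptions; changing the
#    scenario numbers changes the required server and network capacity.
```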

  14. Data management
  • Several 'pieces' of what I call 'infrastructure' will have to be decided, prepared and put in place: not only the software but also the hardware and the tools to manage the data. Among them:
    • everything related to the OO-db (Objectivity and/or ORACLE)
      • tools for creation, replication, distribution, ...
    • what do we do with ROOT I/O?
      • which fraction of the events will be done with ROOT I/O?
      • we said that the evaluation of more than one technology is part of the DC
    • a few thousand files will be produced, and we will need a "bookkeeping" to keep track of what happened during the processing of the data and a "catalog" to be able to locate all pieces of information
      • where is the "HepMC" data?
      • where is the corresponding "simulated" or AOD data?
      • which selection criteria have been applied, with which selection parameters, etc.?
      • correlation between different pieces of information?
  (a minimal bookkeeping/catalog sketch follows below)
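As a concrete illustration of the "bookkeeping" and "catalog" idea, the sketch below keeps one record per produced file, recording what was done to it and where copies live. The schema, the logical file names and the replica scheme are assumptions for illustration, not the tools eventually chosen for the Data Challenges.

```python
# Minimal sketch of a file bookkeeping record and catalog lookup.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FileRecord:
    logical_name: str                 # hypothetical example naming scheme
    stage: str                        # "generated", "simulated", "pileup", "AOD", ...
    parent: Optional[str]             # logical name of the input file (provenance)
    selection: dict = field(default_factory=dict)   # cuts / parameters applied
    replicas: list = field(default_factory=list)    # sites holding a copy

catalog = {}

def register(rec):
    catalog[rec.logical_name] = rec

register(FileRecord("dc1.002000.evgen.00001", "generated", None,
                    {"generator": "Pythia"}, ["CERN"]))
register(FileRecord("dc1.002000.simul.00001", "simulated",
                    "dc1.002000.evgen.00001", {}, ["CERN", "Lyon"]))

# "Where is the corresponding simulated data?" then becomes a lookup:
sim = catalog["dc1.002000.simul.00001"]
print(sim.replicas, "<- produced from", sim.parent)
```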

  15. DC scenario
  • For DC0 (end of September?) we will have to see what is in place and decide on the strategy to be adopted in terms of:
    • software to be used
      • Dice geometry (which version?)
      • reconstruction adapted to this geometry
      • database
    • infrastructure
  • I hope that we will have 'tools' in place for:
    • automatic job submission
    • catalog and bookkeeping
    • allocation of "run numbers" and of "random numbers" (bookkeeping); a minimal allocation sketch follows below
    • we have to check with people involved in 'grid' projects or other projects (the projects are not in phase)
  • I believe that the 'validation' of the various components should start now
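Here is a minimal sketch of one way to allocate run numbers and non-overlapping random-number seeds to production jobs, so that no two jobs ever reuse a seed and the allocation itself is recorded for the bookkeeping. The scheme (a fixed block of seeds per run number) and the starting values are assumptions for illustration only.

```python
# Sketch: deterministic allocation of run numbers and seed blocks per job.
import json

FIRST_RUN = 2000          # assumed first run number reserved for the DC
SEEDS_PER_RUN = 10_000    # assumed size of the seed block given to each job

def allocate(n_jobs, first_run=FIRST_RUN):
    """Yield (run_number, first_seed, last_seed) for each job."""
    for i in range(n_jobs):
        run = first_run + i
        first_seed = run * SEEDS_PER_RUN
        yield run, first_seed, first_seed + SEEDS_PER_RUN - 1

# Record the allocation so the bookkeeping knows which job owned which seeds.
allocation = [{"run": r, "seed_min": lo, "seed_max": hi}
              for r, lo, hi in allocate(3)]
with open("seed_allocation.json", "w") as f:
    json.dump(allocation, f, indent=2)
print(allocation)
```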

  16. DC scenario
  • For DC1,
    • on the basis of what we learn from DC0 we will have to adapt our strategy
    • simulation & pile-up will be of great importance
      • strategy to be defined (I/O rate, number of "event" servers?)
    • since we say that we would like to do it 'world-wide', we will have to see what can be used from the GRID developments
      • we will have to 'port' our software to the GRID environment (we already have a kit based on the 1.3.0 release)
    • don't forget that we have to provide data to our HLT colleagues; the schedule should take their needs into account

  17. DC1-HLT - CPU

  18. DC1-HLT - data

  19. DC1-HLT data with pile-up
  • In addition to the 'simulated' data, assuming 'filtering' after simulation (~14% of the events kept):
    • (1) keeping only 'digits'
    • (2) keeping 'digits' and 'hits'
  (a rough estimate of the filtered sample size is sketched below)
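As a rough illustration of what the ~14% filtering factor implies, the sketch below starts from the 10^7 simulated events quoted for DC1 and converts the kept fraction into a data volume. The per-event sizes are placeholder assumptions (the real numbers were in the table on this slide, which is not reproduced in the transcript).

```python
# Rough size estimate for the filtered pile-up sample.
total_simulated = 1e7
filter_efficiency = 0.14                    # ~14% of events kept
mb_per_event = {"digits only": 2.0,         # assumed size, scenario (1)
                "digits + hits": 5.0}       # assumed size, scenario (2)

kept = total_simulated * filter_efficiency  # 1.4 million events after filtering
for scenario, size in mb_per_event.items():
    print(f"{scenario}: {kept:.2e} events, ~{kept * size / 1e6:.1f} TB")
```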

  20. DC scenario
  • For DC1,
    • on the hardware side we will have to ensure that we have enough resources in terms of CPU, disk space, tapes, data servers, ...
    • we have started to evaluate our needs, but this should be checked
    • what will we do with the data generated during the DC?
      • keep it on CASTOR (the CERN mass storage system)? on tapes?
      • outside institutes will use other systems (HPSS, ...)
      • how will we exchange the data?
      • do we want to have all the information at CERN? everywhere?
      • what are the networking requirements?

  21. Ramp-up scenario

  22. What next
  • Prepare a first list of goals & requirements
    • with
      • HLT, the physics community
      • the simulation, reconstruction and database communities
      • people working on 'infrastructure' activities
        • bookkeeping, cataloguing, ...
    • in order to
      • prepare a list of tasks
        • some physics oriented
        • but also tasks like testing code, running production, ...
      • set up a list of work packages
      • define the priorities

  23. What next
  • In parallel
    • start to build a task force
      • volunteers?
      • should come from the various activities
    • start discussions with:
      • people involved in GRID projects and
      • those responsible for the Tier centers
    • evaluate the necessary resources
      • at CERN (COCOTIME exercise)
      • outside CERN

  24. Then
  • Start the validation of the various components in the chain (setting deadlines for readiness)
    • software
      • simulation, pile-up, ...
    • infrastructure
      • database, bookkeeping, ...
  • Estimate what it will be realistic (!) to do
    • for DC0, DC1
    • where (sharing of the work)
  • Ensure that we have the resources
    • including manpower
  • "And turn the key"

  25. Expressions of interest
  • So far, after the NCB meeting of July 10th:
    • Canada, France, Germany, Italy, Japan, the Netherlands, Nordic Grid, Poland, Russia, UK, US, ...
    • offers to help in DC0
    • offers to participate in DC1
  • Contact with the HLT community
    • needs input from the other (physics) communities
  • Contact with Grid projects
    • EU-Data-GRID
      • kit of ATLAS software
    • other projects
    • contact with Tier centers
  • The question of the entry level to DC1 has been raised (O(100)?)

  26. Work packages
  • First (non-exhaustive) list of work packages:
    • Event generation
    • Simulation
      • Geant3
      • Geant4
    • Pile-up
    • Detector response (digitization)
    • "ZEBRA" - "OO-db" conversion
    • Event filtering
    • Reconstruction
    • Analysis
    • Data management
    • Tools
      • job submission & monitoring
      • bookkeeping & cataloguing
      • web interface
