
The LHCb Software and Computing GridPP10 meeting, June 2nd 2004 Ph. Charpentier, CERN



  1. The LHCb Software and Computing
  GridPP10 meeting, June 2nd 2004
  Ph. Charpentier, CERN

  2. Outline
  • Core software
  • LHCb Applications
  • Production and Analysis tools
  • Data Challenges plans
  • Summary
  GridPP10 meeting, 02/6/2004

  3. Software Strategy
  • Develop an Architecture ('blueprint') and a Framework (real code) to be used at all stages of LHCb data processing
    • high-level triggers, simulation, reconstruction, analysis
    • a single framework used by all members of the collaboration
  • Avoid fragmentation and duplication of computing effort
    • common vocabulary, better understanding of the system
    • better specification of what needs to be done
    • identify and build common components
    • guidelines and coordination for software-development groups
  • Transparent use of third-party components wherever possible
  • Leverage LCG Applications Area software
    • GUI, persistency, simulation, …
  • Applications are developed by customising the Framework

  4. Gaudi Architecture and Framework
  [Architecture diagram: an Application Manager drives a sequence of Algorithms. Data flow through transient stores (Transient Event Store, Transient Detector Store, Transient Histogram Store), each backed by a Persistency Service and Converters to/from data files. Common services include the Event Selector, Event Data Service, Detector Data Service, Histogram Service, JobOptions Service, Message Service, Particle Property Service and other services.]
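The separation sketched on this slide — an application manager driving an event loop, with algorithms that only ever touch data through a transient store — can be illustrated with a minimal Python sketch. All class and path names here are hypothetical simplifications; the real Gaudi framework is C++ and far richer.

```python
# Toy sketch of the Gaudi pattern: algorithms obtain event data through a
# transient store service, never directly from files; the Application
# Manager owns the event loop and the initialize/execute/finalize hooks.
class TransientEventStore:
    """Holds data objects for the current event, keyed by a store path."""
    def __init__(self):
        self._store = {}

    def register(self, path, obj):
        self._store[path] = obj

    def retrieve(self, path):
        return self._store[path]


class Algorithm:
    """Base class: the Application Manager calls these three hooks."""
    def initialize(self): pass
    def execute(self, evt_store): pass
    def finalize(self): pass


class CountTracks(Algorithm):
    """Example algorithm: counts tracks over all processed events."""
    def __init__(self):
        self.n_tracks = 0

    def execute(self, evt_store):
        self.n_tracks += len(evt_store.retrieve("/Event/Tracks"))


class ApplicationManager:
    """Drives the event loop over a configured sequence of algorithms."""
    def __init__(self, algorithms):
        self.algorithms = algorithms

    def run(self, events):
        for alg in self.algorithms:
            alg.initialize()
        for tracks in events:
            store = TransientEventStore()
            store.register("/Event/Tracks", tracks)  # stand-in for the Event Selector + converters
            for alg in self.algorithms:
                alg.execute(store)
        for alg in self.algorithms:
            alg.finalize()


counter = CountTracks()
ApplicationManager([counter]).run([[1, 2], [3], [4, 5, 6]])
print(counter.n_tracks)  # 6 tracks over 3 events
```

The point of the pattern is that `CountTracks` knows nothing about files or persistency: swapping the data source only changes what gets registered in the store.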

  5. Impact of LCG projects
  [Diagram: LHCb/Gaudi components mapped onto LCG Applications Area software — SEAL, POOL, AIDA, HepPDT and other LCG services.]

  6. Core Software – Status and Outlook
  • Sept '98 – project started, GAUDI team assembled
  • Feb '99 – GAUDI first release (v1)
  • Nov '99 – GAUDI developed Open-Source style
    • experiment-independent web and release area
    • ATLAS started contributing to its development
    • used by ATLAS, HARP, GLAST
  • Jun '02 – all the basic functionality available
    • object persistency, detector description, data visualization, etc.
  • Dec '03 – POOL used for object persistency
  • More work still needed on:
    • detector conditions, distributed computing (Grid), interactive environment, etc.
    • integrating more LCG SEAL services (plugin manager, …)

  7. Applications and datasets
  [Dataflow diagram: detector groups provide the Detector Description and Conditions Database; the event model / physics event model links the applications — Gauss (simulation) produces GenParts, MCParts and MCHits; Boole (digitisation) produces Digits and RawData; Brunel (reconstruction & HLT) produces the DST; DaVinci (analysis) produces AOD and MiniDST — all built on Gaudi.]

  8. Gauss – Geant4-based simulation application
  • Gaudi application
  • Uses:
    • Pythia 6.205 and EvtGen (BaBar) for event generation
    • HepMC as exchange model
    • Geant4 as simulation engine
  • Simulation framework: GiGa
    • converts HepMC to Geant4 input
    • interfaces with all Gaudi services (geometry, magnetic field, …)
    • converts Geant4 trajectories to the LHCb event model (MCHits, MCParticles and MCVertices)
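The hand-off GiGa performs — generator-level records in, MC-truth records out — can be caricatured in a few lines of Python. Everything here (class names, the energy threshold) is a hypothetical stand-in, not the real Gauss/GiGa interfaces.

```python
# Toy sketch of a generator-to-simulation hand-off in the spirit of GiGa:
# generator particles (HepMC-like) go in, MC-truth particles come out.
from dataclasses import dataclass


@dataclass
class GenParticle:          # stand-in for a HepMC generator particle
    pdg_id: int
    energy: float           # GeV


@dataclass
class MCParticle:           # stand-in for an LHCb MC-truth particle
    pdg_id: int
    energy: float
    origin: str             # provenance tag


def simulate(gen_event, min_energy=0.001):
    """Convert generator particles to MC truth, dropping particles below a
    (hypothetical) simulation threshold, as a tracking engine would."""
    return [MCParticle(p.pdg_id, p.energy, origin="Gauss")
            for p in gen_event if p.energy >= min_energy]


gen_event = [GenParticle(511, 20.0),      # B0
             GenParticle(22, 0.0001)]     # very soft photon
mc_event = simulate(gen_event)
print(len(mc_event))  # the soft photon is dropped
```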

  9. Boole & Brunel – Digitisation and reconstruction
  • Boole
    • from MCHits to digits (raw-buffer format)
    • runs the trigger algorithms
    • output: raw buffer & MC truth (+ relations)
  • Brunel
    • complete pattern recognition
    • charged tracks: long, upstream, downstream
    • calorimeter clusters & electromagnetic particle identification
    • RICH particle identification
    • muon particle identification
    • output: DST (ESD) format based on the LHCb event model
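The digitisation step (MCHits → digits, with the MC-truth relation kept alongside) can be sketched as follows. The ADC scale, threshold and data layout are invented for illustration; the real Boole raw-buffer format is different.

```python
# Toy digitisation in the spirit of Boole: turn continuous MC energy
# deposits into integer ADC counts with zero-suppression, keeping the
# digit -> MC-hit relation alongside the "raw" data.
def digitise(mc_hits, adc_per_mev=10, threshold_adc=5):
    """mc_hits: list of (channel_id, deposited_energy_MeV) tuples.

    Returns (digits, relations): digits as (channel_id, adc_count),
    relations mapping each kept channel back to its MC-hit index.
    """
    digits, relations = [], {}
    for i, (channel, energy) in enumerate(mc_hits):
        adc = int(energy * adc_per_mev)
        if adc >= threshold_adc:           # zero-suppression threshold
            digits.append((channel, adc))
            relations[channel] = i         # MC-truth relation
    return digits, relations


digits, rel = digitise([(101, 1.2), (102, 0.3), (103, 2.0)])
print(digits)  # [(101, 12), (103, 20)] -- channel 102 is below threshold
```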

  10. DaVinci – The LHCb Analysis Framework
  • Gaudi application
    • facilitates migration of algorithms from analysis to reconstruction
    • interactive analysis through Python scripting
  • Physicists manipulate only abstract objects (particles and vertices)
    • concentrate on functionality rather than on technicalities
  • Manipulation and analysis tools for general use
  • Physics event model for describing all physics-related objects produced by the analysis algorithms
    • keeps a loose connection to reconstruction entities (tracks, clusters)

  11. DaVinci – toolset
  • To access and filter data:
    • Physics Desktop, Particle Filters (PID, kinematics, etc.), Particle Stuffer
  • Vertexing and constrained fitters:
    • Geometrical Vertex Fitter, Mass-Constrained Vertex Fitter, Primary Vertex, Kinematic Fitter, …
  • MC analysis tools:
    • MC Decay Finder, Associators, …
  • Utilities:
    • geometrical tools, particle transporter, debug tool
  • Long list of decay selections at different levels of development
  • Tutorial attended by a total of 60 physicists
  • All results shown in the '03 TDRs were obtained using this software
    • ROOT (or PAW) used to produce the final plots
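A particle-filter step of the kind listed above (select by identity and kinematics) might look like this in Python. The API here is a hypothetical simplification, not the actual DaVinci tool interfaces.

```python
# Sketch of a DaVinci-style particle filter: select candidates by
# particle identity (PID) and a kinematic cut, as an analysis
# algorithm's first step before vertexing.
from dataclasses import dataclass


@dataclass
class Particle:
    pid: str     # particle identity, e.g. "K+"
    pt: float    # transverse momentum, GeV


def filter_particles(particles, pid, min_pt):
    """Keep only particles of the requested identity above the pT cut."""
    return [p for p in particles if p.pid == pid and p.pt >= min_pt]


event = [Particle("K+", 1.4), Particle("pi-", 0.3), Particle("K+", 0.6)]
kaons = filter_particles(event, pid="K+", min_pt=1.0)
print(len(kaons))  # only the 1.4 GeV kaon survives
```

The abstraction matches the slide's point: the physicist expresses the selection in terms of particles and cuts, not tracks and clusters.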

  12. Panoramix – Event & Geometry Display
  • Package based on OpenInventor
  • Can display:
    • geometry from XML files
    • MC data objects
    • reconstruction objects
  • Scripting based on Python
  • Gaudi application, hence can be integrated with e.g. DaVinci algorithms

  13. Event Viewing

  14. DIRAC – Workload management software
  [Architecture diagram: user interfaces (job monitor, production manager, GANGA UI, user CLI, BK query web page, FileCatalog browser) talk to the DIRAC services — the DIRAC Job Management Service, BookkeepingSvc, FileCatalogSvc, JobMonitorSvc, InformationSvc, MonitoringSvc and JobAccountingSvc (AccountingDB). Agents pull work and run it on the DIRAC resources: DIRAC CEs at DIRAC sites, the LCG Resource Broker, and DIRAC storage accessed via gridftp, bbftp and rfio.]
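The agent layer in the diagram reflects DIRAC's pull model: site agents poll the central Job Management Service for work rather than having jobs pushed at them. A minimal sketch of that scheduling pattern, with all names and interfaces invented for illustration:

```python
# Toy sketch of pull-style scheduling as in DIRAC: agents at each site
# repeatedly request work from a central queue and run it locally.
import queue


class JobManagementService:
    """Central service holding the queue of production jobs."""
    def __init__(self):
        self._jobs = queue.Queue()

    def submit(self, job):
        self._jobs.put(job)

    def request_job(self):
        """Called by agents; returns None when no work is queued."""
        try:
            return self._jobs.get_nowait()
        except queue.Empty:
            return None


class Agent:
    """Runs at a site; pulls jobs until the central queue drains."""
    def __init__(self, name, jms):
        self.name, self.jms, self.done = name, jms, []

    def cycle(self):
        job = self.jms.request_job()
        if job is not None:
            self.done.append(job)   # stand-in for running on the local CE
        return job is not None


jms = JobManagementService()
for j in ["sim-001", "sim-002", "sim-003"]:
    jms.submit(j)

agents = [Agent("CERN", jms), Agent("RAL", jms)]
# The list comprehension makes every agent poll once per round.
while any([a.cycle() for a in agents]):
    pass

print(sum(len(a.done) for a in agents))  # all 3 jobs executed
```

The design point is that sites with free capacity naturally draw more work, with no central broker needing an up-to-date picture of every site.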

  15. Catalogs
  • File metadata (bookkeeping)
    • LHCb bookkeeping database (BKDB)
    • successfully tested during DC03
    • supports full cross-referencing of datasets
    • full traceability of data history
    • XML-RPC remote access + web browser
    • supported by a GridPP-funded position (Carmine Cioffi)
  • File and replica catalog
    • replica table in the LHCb BKDB
    • all files entered into an AliEn catalog
  • Data management tools
    • being developed, based on the catalogs above
    • essential for distributed analysis
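The XML-RPC remote access mentioned above can be demonstrated end-to-end with Python's standard library. The method name, catalogue contents and schema below are invented; the real BKDB interface is much richer.

```python
# Sketch of XML-RPC access to a bookkeeping-style catalogue: a tiny
# in-process server exposes one lookup method, and a client queries it
# exactly as a remote user of the bookkeeping service would.
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical catalogue: logical file name -> metadata.
CATALOGUE = {
    "/lhcb/dc04/sim/00001.dst": {"app": "Brunel", "events": 500},
    "/lhcb/dc04/sim/00002.dst": {"app": "Brunel", "events": 480},
}


def get_metadata(lfn):
    """Return the stored metadata for a logical file name."""
    return CATALOGUE[lfn]


# Serve on an ephemeral local port in a background thread.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(get_metadata)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: query the catalogue over XML-RPC.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
meta = proxy.get_metadata("/lhcb/dc04/sim/00001.dst")
print(meta["events"])  # 500

server.shutdown()
```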

  16. Ganga – Interfacing Gaudi to the Grid
  • Goal
    • simplify the management of analysis and production jobs for end-user physicists by developing a tool for accessing Grid services with built-in knowledge of how Gaudi works
  • Required functionality
    • job preparation and configuration
    • job submission, monitoring and control
    • resource browsing, booking, etc.
  • Developed in collaboration with ATLAS
  • Aim: to be used for DC'04 analysis
  • Uses Grid middleware services
  [Diagram: the GANGA GUI sits between Gaudi programs (job options, algorithms, histograms, monitoring, results) and the collective & resource Grid services.]

  17. Ganga – Functionality for DC'04
  • Prepare job
    • prepare options defining the processing
    • set algorithm tuning parameters (criteria, cuts, …)
    • select datasets
  • Submit job
  • All settings can be saved / retrieved for later use
  • Collaborates with Grid services
    • file/metadata catalogs
    • workload management
  • Jobs can be submitted to
    • an interactive run
    • a local batch system
    • the LCG Grid (possibly via DIRAC)
  [Workflow diagram: select workflow (DaVinci) → edit algorithm flow (DLLs) → prepare AlgFlow options and DLLs (AlgOptions catalog) → edit algorithm parameter options (metadata catalog) → select datasets (file catalog) → prepare sandbox (DLLs, JobOptions, FileCatalog slice) → submit job.]
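The prepare/sandbox/submit flow above can be sketched as a small job object. This is a hypothetical simplification for illustration, not the actual Ganga API.

```python
# Sketch of Ganga-style job preparation and submission: a job bundles
# the application, its option settings and the selected datasets; the
# sandbox collects what the remote site needs; submission targets one
# of several backends.
class Job:
    def __init__(self, application, options, datasets):
        self.application = application   # e.g. "DaVinci"
        self.options = options           # algorithm flow / tuning options
        self.datasets = datasets         # selected input datasets
        self.status = "prepared"         # settings can be saved at this point

    def sandbox(self):
        """Collect everything the remote site needs to run the job."""
        return {
            "application": self.application,
            "options": self.options,
            "datasets": self.datasets,
        }

    def submit(self, backend):
        """backend: 'interactive', 'batch', or 'grid' (e.g. via DIRAC)."""
        self.status = f"submitted:{backend}"
        return self.sandbox()


job = Job("DaVinci",
          options={"cuts": {"min_pt": 1.0}},
          datasets=["/lhcb/dc04/sim/00001.dst"])
sb = job.submit("grid")
print(job.status)  # submitted:grid
```

The same prepared job could be re-submitted to a different backend, which is the point of separating preparation from submission.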

  18. Ganga: the project
  • Ganga is an ATLAS–LHCb joint venture
    • mainly funded by GridPP
  • Shared ATLAS–LHCb positions (GridPP)
    • Alexander Soroko, Karl Harrison, Alvin Tan
  • Project management
    • David Adams (ATLAS, BNL), Ulrik Egede (LHCb, Imperial College)
  • Strong interaction with dataset selection
    • LHCb bookkeeping & Ganga integration (GridPP): Carmine Cioffi
  • Committed to ARDA
    • EGEE NA4 manpower integrated in the Ganga team (middleware integration and testing)

  19. Data Challenges
  • Planned series of Data Challenges to:
    • measure quality (# crashes / # events) and performance of the software
    • run scalability tests for simulation, reconstruction and analysis
    • run production tests over the Grid using all LHCb regional centres

  20. Data Challenge '03
  • Still using Geant3 for simulation
  • Used for the 2003 LHCb TDR studies
  • 59 days needed (90 foreseen)
  • ~5×10⁷ events produced
  • 17 sites used
  • 36,600 jobs launched (success rate 92%)
    • 80% outside CERN
    • 2/3 in the UK

  21. Data Challenge '04
  • 53% of the CPU power is pledged by the UK

  22. Summary
  • A software framework (Gaudi) with a full set of services has been developed for use in all event-processing applications
  • A common set of high-level components has been developed
    • e.g. detector geometry, event model, interactive/visualization tools
    • provides guidelines and minimizes work for physicists developing detector and physics algorithms
  • Migration of the LHCb software to Geant4 and LCG-AA software is complete
    • POOL used for persistency; full SEAL integration still to come
  • Production of a large dataset of ~200 M events is in progress (DC'04)
    • production architecture (DIRAC) and toolset
    • DIRAC using LCG2 in place and running (~50% of the processing power)
  • A series of Data Challenges is planned for future deployment and validation of the computing infrastructure and software
  • Main challenge for 2004: distributed analysis (post-DC'04)
