
Distributed Simulation with Geant4

Distributed Simulation with Geant4. Preliminary results of the LowE / DIANE joint project. Jakub T. Mościcki, CERN/IT; credits also to: Alfonso Mantero, INFN Genova. History. Parallelization of Geant4 simulation is a joint project between Geant4 – DIANE – Anaphe


Presentation Transcript


  1. Distributed Simulation with Geant4: Preliminary results of the LowE / DIANE joint project. Jakub T. Mościcki, CERN/IT; credits also to: Alfonso Mantero, INFN Genova

  2. History • Parallelization of Geant4 simulation is a joint project between Geant4 – DIANE – Anaphe • DIANE is an R&D project in IT/API to study distributed analysis and simulation and create a prototype • initiated early 2001 with very limited resources • Anaphe is an analysis project supported by IT • provides the analysis framework for HEP • The pilot programme includes G4 simulation which produces AIDA/Anaphe histograms • Collaboration started late spring 2002

  3. Sequential Geant4 Simulation • the goal of simulation: • optimize the detectors used for X-ray fluorescence emission from Mercury's crust in the context of Hermes, Bepi Colombo ESA mission. • requires high statistics → many events • 20 Mio events ~ 3 hours • up to 100 Mio events might be useful • estimated time ~16 hours

  4. Parallel Geant4 Simulation • increase performance • shift from batch to semi-interactive simulation • speed up the analysis cycle • generate more events – debug simulation faster • from sequential to parallel simulation • preserve reproducibility of the results • minimize deployment overhead • when moving from sequential to parallel simulation • both in terms of time and amount of code/expertise one must invest

  5. Performance Increase

  6. Benchmarking environment • parallel cluster configuration • lxplus: 70 Red Hat 6.1 nodes • 7 Intel STL2 (2 x PIII 1GHz, 512MB) • 31 ASUS P2B-D (2 x PIII 600MHz, 512MB) • 15 Celsius 620 (2 x PIII 550MHz, 512MB) • the rest – Kayak (2 x PIII 450MHz, 128MB) • reference sequential machine • pcgeant2 (2 x Xeon 1700MHz, 1GB)

  7. Benchmarking Caveat • non-exclusive access to interactive machines • 'load-noise' background, unpredictable load peaks • different CPU and RAM on nodes • AFS used to fetch physics config data • try to remove the noise: • repeat simulations many times to get the correct mean • work at night and off-peak hours (what about US people using CERN computing facilities?) • etc. • conclusion: • results are approximate and should be taken with caution

  8. Structure of the simulation • initialization phase (constant) • load ~10-15 MB of physics tables, config data etc. • reference sequential machine: ~4 minutes (user time) • cluster nodes: ~5-6 minutes • beamOn time ~ f(number of events) • small job: 1-5 Mio events • medium job: 20-40 Mio events • big job: > 50 Mio events

  9. Scalability test (job time)

  10. Normalized efficiency

  11. Benchmarking (comments) • results are approximate • scaling factors for different CPU speeds • but seem to be in agreement with expectations • move from batch to semi-interactive simulation is feasible • small jobs do not gain so much – large constant initialization time
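
The remark about small jobs can be illustrated with a simple timing model. This is a minimal sketch, not the benchmark analysis behind the plots: it assumes identical workers, an even split of events, a per-node initialization of 5 minutes (slide 8 quotes about 5-6 minutes on cluster nodes) and a hypothetical per-node rate of 0.1 Mio events per minute. The `efficiency` function is plain speed-up divided by worker count, which may differ from the normalization used on slide 10.

```python
INIT_MIN = 5.0          # per-node initialization in minutes (slide 8: ~5-6 on cluster nodes)
RATE_MIO_PER_MIN = 0.1  # hypothetical per-node event rate (assumption, not a measurement)

def job_time(events_mio, workers=1):
    """Wall-clock minutes: constant initialization plus an even share of the events."""
    return INIT_MIN + (events_mio / workers) / RATE_MIO_PER_MIN

def speedup(events_mio, workers):
    """Sequential time divided by parallel time under the same model."""
    return job_time(events_mio, 1) / job_time(events_mio, workers)

def efficiency(events_mio, workers):
    """Speed-up per worker; 1.0 would be ideal scaling."""
    return speedup(events_mio, workers) / workers

# small, medium and big jobs (in the sense of slide 8) on a 70-node cluster
for events in (1, 20, 100):
    print(events, round(speedup(events, 70), 1), round(efficiency(events, 70), 2))
```

Under these assumptions the speed-up grows with job size but always stays below the number of workers, because every worker pays the full initialization time regardless of how few events it processes.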

  12. Problems & solutions • time of job execution = slowest machine... • ...or most loaded one at the moment • often had to wait a long time for last worker to finish • possible solution: • use larger number of smaller workers • fast machines run workers sequentially many times, but... • constant initialization time rather important • initialize once, beamOn many times... to be checked • if this problem is solved we may move towards more interactive simulation
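
The trade-off described on this slide can be sketched with a toy scheduler. The code below is an illustration under stated assumptions, not DIANE's actual scheduling: machine speeds and task sizes are hypothetical, each free machine simply pulls the next task, and `init_cost` models the case where initialization is repeated for every task.

```python
import heapq

def makespan(task_sizes, speeds, init_cost=0.0):
    """Greedy pull scheduling: whichever machine is free first takes the next task.

    task_sizes: work units per task; speeds: work units per minute per machine.
    init_cost: minutes paid at the start of every task (repeated initialization).
    Returns the time at which the last task finishes.
    """
    # heap of (time at which the machine becomes free, machine index)
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for size in task_sizes:
        t, i = heapq.heappop(free_at)
        t += init_cost + size / speeds[i]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, i))
    return finish

speeds = [1.0, 1.0, 0.45]  # hypothetical relative speeds: two fast nodes, one slow
total = 120.0              # hypothetical total amount of work

coarse = makespan([total / 3] * 3, speeds)                        # one big task per machine
fine = makespan([total / 24] * 24, speeds)                        # many small tasks
fine_reinit = makespan([total / 24] * 24, speeds, init_cost=5.0)  # re-initialize per task
print(coarse, fine, fine_reinit)
```

With one task per machine the job lasts as long as the slow node needs for its full share. Many small tasks let the fast nodes absorb most of the work, but if every task repeats the initialization the gain shrinks again, which is why initializing once and calling beamOn many times matters.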

  13. From sequential to parallel simulation

  14. Reproducibility • initial seed of the random engine • make sure that every parallel simulation starts with a seed uniquely determined by the job's initial seed • number of times engine is used depends on the initial seed • make sure that correlations between the workers' seeds are avoided • our solution: • use two uncorrelated random engines • one to generate a table of initial seeds (one seed for each worker) • another for the simulation inside the worker
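
The two-engine scheme can be sketched as follows. This is only an illustration of the idea using Python's standard `random` module, not the project's code: the function names are hypothetical, and both engines here are instances of the same generator type, so the sketch shows the seed bookkeeping but says nothing about how the original engines were chosen to be uncorrelated.

```python
import random

def worker_seeds(job_seed, n_workers):
    """Engine 1: derive one distinct initial seed per worker from the job seed."""
    seed_engine = random.Random(job_seed)
    # sample() draws without replacement, so no two workers share a seed
    return seed_engine.sample(range(1, 2**31), n_workers)

def run_worker(seed, n_events):
    """Engine 2: a separate engine used only inside the worker's simulation."""
    sim_engine = random.Random(seed)
    # stand-in for simulated events: one random number per event
    return [sim_engine.random() for _ in range(n_events)]

seeds = worker_seeds(job_seed=12345, n_workers=4)
results = [run_worker(s, n_events=3) for s in seeds]
```

Because the seed table depends only on the job's initial seed and the number of workers, rerunning the job with the same values reproduces every worker's event stream.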

  15. Reproducibility • parameters which need to be fixed to reproduce the simulation: • total number of events • initial seed • ... but also: • number of workers • number of events per worker
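
One way to keep these parameters together is a small job description. The sketch below is hypothetical: `JobSpec` and its even-split rule are assumptions made for illustration, and in this form the number of events per worker is derived from the other fields, so the split rule itself becomes part of what must stay fixed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JobSpec:
    """Hypothetical record of everything needed to rerun a parallel job."""
    total_events: int
    initial_seed: int
    n_workers: int

    def events_per_worker(self):
        """Even split; the first (total % n_workers) workers get one extra event."""
        base, extra = divmod(self.total_events, self.n_workers)
        return [base + (1 if i < extra else 0) for i in range(self.n_workers)]

spec = JobSpec(total_events=10, initial_seed=12345, n_workers=4)
print(spec.events_per_worker())  # [3, 3, 2, 2]
```

Changing `n_workers` changes both the seed table and the split, so two runs with the same total number of events and the same initial seed are not guaranteed to produce identical results.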

  16. Minimizing deployment overhead

  17. Ease of use • user-friendliness • G4 simulation developer should not need to fight with irrelevant technical problems when moving from sequential to parallel G4 simulation • as non-intrusive as possible • minimize necessary code changes in original simulation • good separation of the subsystems • G4 simulation does not need to know that it runs in parallel... • the distributed framework (DIANE) does not need to care about what actually is being simulated (see slide 20)

  18. What is DIANE? • R&D project in IT/API • semi-interactive parallel analysis for LHC • middleware technology evaluation & choice • CORBA, MPI, Condor, LSF... • also see how to integrate API products with GRID • prototyping (focus on ntuple analysis) • time scale and resources: • Jan 2001: start (< 1 FTE) • June 2002: running prototype exists • sample Ntuple analysis with Anaphe • event-level parallel Geant4 simulation

  19. What is DIANE? • framework for parallel cluster computation • application-oriented • master-worker model common in HEP applications • application-independent • apps dynamically loaded in a plugin style • callbacks to applications via abstract interfaces • component-based • subsystems and services packaged into component libraries • core architecture uses CORBA and CCM (CORBA Component Model) • integration layer between applications and the GRID • environment and deployment tools

  20. Master/Worker model • applications share the same computation model • so also share a big part of the framework code • but have different non-functional requirements • CPU vs IO intensive • semi-interactive vs batch • etc.
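
The separation between framework and application in the master/worker model can be sketched with an abstract callback interface. Everything below is hypothetical: the class and method names are invented for illustration and are not DIANE's API, and the workers run in-process here rather than on cluster nodes over CORBA.

```python
from abc import ABC, abstractmethod

class Application(ABC):
    """Hypothetical callback interface through which the framework drives an application."""

    @abstractmethod
    def split(self, n_tasks):
        """Return a list of task descriptions."""

    @abstractmethod
    def execute(self, task):
        """Run one task on a worker and return its output."""

    @abstractmethod
    def merge(self, outputs):
        """Combine the workers' outputs into the final result."""

class CountingSimulation(Application):
    """Toy application standing in for a Geant4 simulation."""

    def __init__(self, total_events):
        self.total_events = total_events

    def split(self, n_tasks):
        base, extra = divmod(self.total_events, n_tasks)
        return [base + (1 if i < extra else 0) for i in range(n_tasks)]

    def execute(self, task):
        return {"events_done": task}

    def merge(self, outputs):
        return sum(o["events_done"] for o in outputs)

def run_master(app, n_workers):
    """The master sees only the abstract interface, never the concrete application."""
    tasks = app.split(n_workers)
    outputs = [app.execute(t) for t in tasks]  # a real framework would dispatch these remotely
    return app.merge(outputs)

print(run_master(CountingSimulation(1000), n_workers=7))  # 1000
```

An ntuple analysis and a simulation could both implement the same three callbacks, which is what lets them share most of the framework code while differing in CPU, IO and interactivity requirements.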

  21. What DIANE is not • DIANE is not: • a replacement for a GRID and its services • a hardwired analysis toolkit

  22. DIANE and GRID • DIANE as a GRID computing element • ...via a gateway that understands Grid/JDL • ...Grid/JDL must be able to describe parallel jobs/tasks • DIANE as a user of (low level) Grid services • ...authentication, security, load balancing... • and profit from existing 3rd party implementations • python environment is a rapid prototyping platform • and may provide a convenient connection between DIANE and Globus Toolkit via pyGlobus API

  23. Architecture Overview • layering: abstract middleware interfaces and components • plugin-style application loading

  24. Conclusions • prototype deployment of G4-DIANE • significant performance improvement possible • scalability tests: • 140 Mio Events • 70 nodes in the cluster • 1 hour total parallel execution • putting together DIANE and G4 is fairly easy • done in several days... • DIANE may bridge G4 to the GRID world • without necessarily waiting for fully-fledged GRID infrastructure to become available
