1 / 51

Detector Simulation Project Prototype

Detector Simulation Project Prototype. SFT meeting, July 11 2011 René Brun. My talk. Project context My December talk in a thumbnail Findings in January Progress with our prototype in the past 6 monthes Next steps. Project context.

Download Presentation

Detector Simulation Project Prototype

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Detector Simulation ProjectPrototype SFT meeting, July 11 2011 René Brun Detector Simulation Prototype Rene Brun

  2. My talk • Project context • MyDecember talk in a thumbnail • Findings in January • Progress withour prototype in the past 6 monthes • Nextsteps Detector Simulation Prototype Rene Brun

  3. Project context • La Mainaz meeting in Jan 2010->Bettersynergybetween G4&ROOT teams. • Many discussions between April and October 2010. • In November 2010, new Project approuvedwith more focus on medium and long term. • Myunderstandingthat the core G4 team istoobusywith the currentstuff and unwilling to make a substantial contribution to the project (at least in the short term). • Main workso far between Andrei and me + Federico and Philippe/FNAL interest to join. • Discussions with Atlas (Andi Salzburger) and OpenLab (Alfio). Detector Simulation Prototype Rene Brun

  4. MyDecember talk in a thumbnail • Increasesynergybetween G4&ROOT teams. • Particlestackoutside G4. • Virtual transporterswithconcrete instances for fast or/and full simulation, reconstruction,visualization. • Investigation of parallel architectures. Detector Simulation Prototype Rene Brun

  5. StartingAssumptions (I) • The LHC experiments use extensively G4 as main simulation engine. They have invested in validation procedures. Any new project must be coherent with their framework. • One of the reasons why the experiments develop their own fast MC solution is the fact that a full simulation is too slow for several physics analysis. These fast MCs are not in the G4 framework (different control, different geometries, etc), but becoming coherent with the experiments frameworks. • Giving the amount of good work with the G4 physics, it is unthinkable to not capitalize on this work. Rene Brun: Next Steps in Detector Simulation

  6. New GEANT in one picture G4 G4 physics G4 transporter Event Generators Abstract Phys&X-sec Abstract transporter Stack manager GEANT Std transporter ROOT Interpreters Fast MC transporter IO EVE transporter GUI Math TGeo build Graphics EtaPhi Geometry transporter GeantE transporter Rene Brun: Next Steps in Detector Simulation

  7. Event loop and stacking Loop over particles Stack Field Stack manager Virtual transporter Push primaries Geometrynavigator Current transporter Current transporter Push secondaries Physicsprocesses Step manager Step actions for selected process User application User step actions Rene Brun: Next Steps in Detector Simulation

  8. Fast and Full MonteCarlo • Wewouldlike an architecture (via the abstract transporters) wherefast and full MC canberuntogether. • To makeit possible one must have a separateparticlestack. • However, itwasclearfrom the verybeginning in Januarythat the particlestackdependsstrongly on the constraints of parrallelism. Multiple threads cannot update efficiently a tree data structure. Detector Simulation Prototype Rene Brun

  9. Findings in January • Decide to concentrate on a verysmall prototype to test our main ideas. • No need to import G4 (at least for some time) • Understanding the geometry of our detectors. We have the real detector geometry of 35 experiments (LHC, LEP, Tevatron, Hera, Babar, etc). • Werapidlyconcludedthat MASSIVE changes are required in the current simulation strategy to takeadvantage of the new parallel architectures. • In this talk, I willdiscussmainly the impact of parrallelism. Detector Simulation Prototype Rene Brun

  10. Steps/volume in Atlas Hugedynamic range 7100 volume types 29 million instances Detector Simulation Prototype Rene Brun

  11. Atlas: volumes sorted/entries 50 per cent of the time spent in 50/7100 volumes Detector Simulation Prototype Rene Brun

  12. Neighbors/volume in Atlas Volumeswithtoomanyneighbors Detector Simulation Prototype Rene Brun

  13. Neighbors/volume in CMS Sameproblemwith CMS Detector Simulation Prototype Rene Brun

  14. LHCB geometrystatistics 90 per cent of steps in 50/700 volumes Better situation withneighborsbecause of a non cylindricalgeometry Detector Simulation Prototype Rene Brun

  15. Conventional Transport T2 o Eachparticletrackedstep by stepthroughhundreds of volumes o o o o o o o o o o o o T4 o o o o o o o T1 o o o o o o o o o o o o o o o o o o o when all hits for all tracks are in memory summable digits are computed o o o T3 Detector Simulation Prototype Rene Brun

  16. Conventional Transport • Ateachstep, the navigator *nav has the state of the particlex,y,z,px,py,pz, the volume instance volume*, etc. • Wecompute the distance to the nextboundarywithsomethinglike • Dist = nav->DistoOut(volume,x,y,z,px,py,pz) • Or the distance to one physicsprocesswith, eg • Distp = nav->DistPhotoEffect(volume,x,y,z,px,py,pz) Detector Simulation Prototype Rene Brun

  17. Voxelisation (naiveview) To speedup the « wheream I » and « where I go » a 2d/3d grid of voxelsiscomputed. Eachvoxel has a list of the included volumes (or a bitmap) Detector Simulation Prototype Rene Brun

  18. Voxelisation 2 Veryoftenthereis a mismatchbetween the voxelarray and the detector geometry (toomany volumes/voxel) Detector Simulation Prototype Rene Brun

  19. Voxelisation 3 One solution is to decrease the voxel size, but thisrequires more memory and contribute to the level 2 cache misses. Detector Simulation Prototype Rene Brun

  20. Detector Simulation Prototype Rene Brun

  21. Detector Simulation Prototype Rene Brun

  22. Detector Simulation Prototype Rene Brun

  23. parallelism From a recent talk by Intel Detector Simulation Prototype Rene Brun

  24. If you trust Intel Detector Simulation Prototype Rene Brun

  25. If you trust Intel 2 Detector Simulation Prototype Rene Brun

  26. Current Situation • Werun jobs in parallel, one per core. • Nothingwrongwiththatexceptthatitdoes not scale in case of manycoresbecauseitrequirestoomuchmemory. • A multithreaded version mayreduce (say by a factor 2 or 3) the amount of requiredmemory, but alsoat the expense of performance. • A multithreaded version does not fit wellwith a hierarchy of processors. • So, we have a problem, in particularwith the waywe have designedsome data structures, egHepMC. Detector Simulation Prototype Rene Brun

  27. Back to Vectorization !! • Wespent a few yearsbetween 1985 and 1990 trying to vectorize GEANT3 on CRAY, Cyber205, Eta10 and IBM3090. • Wefailed!! The gain in usingvectorswaslost in the memory management on smallmemorysystems (machines hadlessthan 8 Mbytes of memory!). And whenlooking a posteriori, ourstrategywas not the right one. • However, one maylearnsomething positive fromthisfailure. Detector Simulation Prototype Rene Brun

  28. Detector Simulation Prototype Rene Brun

  29. x Data Structures & parallelism C++ pointers specific to a process event event vertices Copying the structure implies a relocation of all pointers I/O is a nightmare tracks Update of the structure from a different thread implies a lock/mutex Detector Simulation Prototype Rene Brun

  30. Can wemakeprogress? • Weneed data structures withinternal relations only. This canbeimplemented by using pools and indices. • When looping on collections, one must avoid the navigation in large memory areas killing the cache. • We must generatevectors of reasonable size wellmatched to the degree of parallelism of the hardware and the amount of memory. • We must find a system to avoid the taileffects Detector Simulation Prototype Rene Brun

  31. tails, tails, tails Detector Simulation Prototype Rene Brun

  32. Tailsagain A killer if one has to wait the end of col(i) beforeprocessing col(i+1) Averagenumber of objects in memory Detector Simulation Prototype Rene Brun

  33. New Transport Scheme T2 o All particles in the same volume type are transported in parallel. Particles entering new volumes or generated are accumulated in the volume basket. o o o o o o o o o o o o T4 o o o o o o o T1 o o o o o o o o o o o o o o o o o Events for which all hits are available are digitized in parallel o o o o o T3 Detector Simulation Prototype Rene Brun

  34. New Transport • Ateachstep, the navigator *nav has the state of the particles *x,*y,*z,*px,*py,*pz, the volume instances volume**, etc. • Wecompute the distances (array *Dist) to the nextboundarieswithsomethinglike • nav->DistoOut(volume,x,y,z,px,py,pz,Dist) • Or the distances to one physicsprocesswith, eg • nav->DistPhotoEffect(volume,x,y,z,px,py,pz,DispP) Detector Simulation Prototype Rene Brun

  35. New Transport • The new transport system impliesmany changes in the geometry and physics classes. These classes must bevectorized (a lot of work!). • Meanwhilewecan survive and test the principle by implementing a bridge functionlike MyNavigator::DisttoOut(int n, TGeoVolume **vol, double *x,..) { for int i=0;i<n;i++) { Dist[i] = DisttoOutOld(vol[i],x[i],…); } } Detector Simulation Prototype Rene Brun

  36. A better solution Checkpoint Synchronization. Only 1 « gap » every N events Pipeline of objects This type of solution requiredanyhow for pile-up studies Detector Simulation Prototype Rene Brun

  37. Abetterbetter solution checkpoints Ateachcheckpointwe have to keep the non finishedobjects/events. Wecannowdigitizewithparallelism on events, clear and reuse the slots. Detector Simulation Prototype Rene Brun

  38. Generations of baskets • When a particleenters a volume or isgenerated, itisadded to the basket of particles for the volume type. • The navigator selects the basket with the highest score (with a high and low water mark algorithm). • The user has the control on the water marks, but the ideathatthisshouldbeautomatic in function of the number of processors and the total amount of memoryavailable. (see interactive demo) Detector Simulation Prototype Rene Brun

  39. Detector Simulation Prototype Rene Brun

  40. Detector Simulation Prototype Rene Brun

  41. Vectorizing : work to do Detector Simulation Prototype Rene Brun

  42. Vectorizing the geometry (ex1) Double_tTGeoPara::Safety(Double_t *point, Bool_t in) const { // computes the closest distance from given point to this shape. Double_tsaf[3]; // distance from point to higher Z face saf[0] = fZ-TMath::Abs(point[2]); // Z Double_tyt = point[1]-fTyz*point[2]; saf[1] = fY-TMath::Abs(yt); // Y // cos of angle YZ Double_tcty = 1.0/TMath::Sqrt(1.0+fTyz*fTyz); Double_txt = point[0]-fTxz*point[2]-fTxy*yt; saf[2] = fX-TMath::Abs(xt); // X // cos of angle XZ Double_tctx = 1.0/TMath::Sqrt(1.0+fTxy*fTxy+fTxz*fTxz); saf[2] *= ctx; saf[1] *= cty; if (in) return saf[TMath::LocMin(3,saf)]; for (Int_t i=0; i<3; i++) saf[i]=-saf[i]; return saf[TMath::LocMax(3,saf)]; } Huge performance gain expected in this type of code whereshape constants canbecomputedoutside the loop Detector Simulation Prototype Rene Brun

  43. Vectorizing the geometry (ex2) G4double G4Cons::DistanceToIn( const G4ThreeVector& p, const G4ThreeVector& v ) const { G4double snxt = kInfinity ; // snxt = default return value const G4double dRmax = 100*std::min(fRmax1,fRmax2); static const G4double halfCarTolerance=kCarTolerance*0.5; static const G4double halfRadTolerance=kRadTolerance*0.5; G4double tanRMax,secRMax,rMaxAv,rMaxOAv ; // Data for cones G4double tanRMin,secRMin,rMinAv,rMinOAv ; G4double rout,rin ; G4double tolORMin,tolORMin2,tolIRMin,tolIRMin2 ; // `generous' radii squared G4double tolORMax2,tolIRMax,tolIRMax2 ; G4double tolODz,tolIDz ; G4double Dist,s,xi,yi,zi,ri=0.,risec,rhoi2,cosPsi ; // Intersection point vars G4double t1,t2,t3,b,c,d ; // Quadratic solver variables G4double nt1,nt2,nt3 ; G4double Comp ; G4ThreeVector Normal; // Cone Precalcs tanRMin = (fRmin2 - fRmin1)*0.5/fDz ; secRMin = std::sqrt(1.0 + tanRMin*tanRMin) ; rMinAv = (fRmin1 + fRmin2)*0.5 ; if (rMinAv > halfRadTolerance) { rMinOAv = rMinAv - halfRadTolerance ; } else { rMinOAv = 0.0 ; } tanRMax = (fRmax2 - fRmax1)*0.5/fDz ; secRMax = std::sqrt(1.0 + tanRMax*tanRMax) ; rMaxAv = (fRmax1 + fRmax2)*0.5 ; rMaxOAv = rMaxAv + halfRadTolerance ; // Intersection with z-surfaces tolIDz = fDz - halfCarTolerance ; tolODz = fDz + halfCarTolerance ; …… //here starts the real algorithm Allthesestatements are independentof the particle !!! Huge performance gain expected in this type of code whereshape constants canbecomputedoutside the loop Detector Simulation Prototype Rene Brun

  44. Vectorizing the Physics • This isgoing to be more difficultwhenextracting the physics classes from G4. However important gains are expected in the functionscomputing the distance to the next interaction point for eachprocess. • There is a diversity of interfaces and we have nowsub-branches per particle type. Detector Simulation Prototype Rene Brun

  45. NextSteps • Consolidation of the prototype. • Implementation of the slidingobjects. • Web site construction with a description of the currentstatus and goals. • Thread safety of TGeo • Vectorization of TGeo (at least a criticalsubpart) • Discussion with the G4 team about the consequences for the G4 physics classes. • Discussion of the leadership and membership. Detector Simulation Prototype Rene Brun

  46. Acknowledgements • Several people have contributed to discussions. • Alfio Lazzaro (OpenLab) expert on parallelism • SverreJarp (Open Lab) expert on manytopics • Philippe Canal • Andi Salsburger for the discussions and interest about the fast MC implementation. • Axel, Bertrand, Fons for valuablecomments Detector Simulation Prototype Rene Brun

  47. Geometry of solids • In parallel to thisactivity, but complementary. • The projectbetween the core G4 + Andrei Gheata and Jean-Marie Guyader isprogressing. Among the goals: • Common interface for the basic solidswith bridges for TGeo and the G4 geometry. • Development of the best possible algorithms to find distances, etc. Detector Simulation Prototype Rene Brun

  48. ComparingSequential & Parallel versions • A stress suite usingour detector data bases has been setup to compare severalparameters of the sequential and parallel versions. • We do not want to wait the last minute to test the relative performances. • So far the goal has been to design data structures that do not penalize the parallel version (even on one single thread and no vectorisation). • See an example in the followingslide. Detector Simulation Prototype Rene Brun

  49. Detector Simulation Prototype Rene Brun

  50. Originalidea New Transport code G4 Detector Simulation Prototype Rene Brun

More Related