

  1. Development of the distributed computing system for the MPD at the NICA collider, analytical estimations
Gertsenberger K. V., Joint Institute for Nuclear Research, Dubna
Mathematical Modeling and Computational Physics 2013

  2. NICA scheme

  3. Multipurpose Detector (MPD)
The software MPDRoot is developed for event simulation, reconstruction and physical analysis of the heavy-ion collisions registered by the MPD at the NICA collider.

  4. Prerequisites of the NICA cluster
• high interaction rate (up to 6 kHz)
• high particle multiplicity: about 1000 charged particles per central collision at NICA energies
• reconstruction of one event currently takes tens of seconds in MPDRoot, so 1M events take months
• large data stream from the MPD: 100k events ~ 5 TB, 100 000k events ~ 5 PB/year
• unified interface for parallel processing and storing of the event data
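These figures imply roughly 50 MB per event and about a year of purely sequential reconstruction for 1M events; spelled out (my arithmetic, taking "tens of seconds" as ~30 s/event; not from the slide):

    \[
    \frac{5\ \text{TB}}{10^{5}\ \text{events}} \approx 50\ \text{MB/event},
    \qquad
    10^{8}\ \text{events} \times 50\ \text{MB} \approx 5\ \text{PB/year},
    \]
    \[
    10^{6}\ \text{events} \times 30\ \text{s/event} \approx 3\times 10^{7}\ \text{s} \approx 1\ \text{year (sequential)}.
    \]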

  5. Development of the NICA cluster
2 main lines of development:
• data storage development for the experiment
• organization of parallel processing of the MPD events
Both lines amount to the development and expansion of a distributed cluster for the MPD experiment based on the LHEP farm.

  6. Current NICA cluster in LHEP for MPD

  7. Distributed file system GlusterFS
• aggregates the existing file systems into a common distributed file system
• automatic replication works as a background process
• a background self-checking service restores corrupted files in case of hardware or software failure
• implemented at the application layer and working in user space

  8. Data storage on the NICA cluster

  9. Development of the distributed computing system
NICA cluster: concurrent data processing on cluster nodes
• PROOF server: parallel data processing in a ROOT macro on parallel architectures
• MPD-scheduler: a scheduling system for task distribution to parallelize data processing across cluster nodes

  10. Parallel data processing with PROOF
• PROOF (Parallel ROOT Facility) is part of the ROOT software; no additional installations are needed
• PROOF uses data-independent parallelism based on the lack of correlation between MPD events, which gives good scalability
• Parallelization for three parallel architectures:
  – PROOF-Lite parallelizes the data processing on one multiprocessor/multicore machine
  – PROOF parallelizes processing on a heterogeneous computing cluster
  – parallel data processing in GRID
• Transparency: the same program code can execute both sequentially and concurrently (see the sketch below)
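The transparency point can be made concrete with a minimal ROOT macro. This sketch is not from the slides; the selector name is hypothetical, and the tree name assumes the FairRoot convention ("cbmsim") on which MPDRoot is based:

    // runProof.C - minimal PROOF-Lite sketch (hypothetical file/selector names)
    #include "TChain.h"
    #include "TProof.h"

    void runProof(Bool_t useProof = kTRUE)
    {
       // Chain the event files; "cbmsim" is the conventional FairRoot tree name
       TChain chain("cbmsim");
       chain.Add("evetest.root");

       if (useProof) {
          // "lite://" starts a PROOF-Lite session with one worker
          // per logical processor of the local machine
          TProof::Open("lite://");
          chain.SetProof();   // route Process() through the PROOF session
       }

       // The same selector code runs sequentially or in parallel, unchanged
       chain.Process("MpdEventSelector.C+");
    }

The only difference between the serial and the parallel run is the session setup; the event loop itself is untouched, which is exactly the transparency property claimed above.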

  11. Using PROOF in MPDRoot
The last parameter of the reconstruction macro is run_type (default: "local").
Speedup on the user's multicore machine:
$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof")'
parallel processing of 1000 events with the thread count equal to the logical processor count
$ root 'reco.C("evetest.root", "mpddst.root", 0, 500, "proof:workers=3")'
parallel processing of 500 events with 3 concurrent threads
Speedup on the NICA cluster:
$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof:mpd@nc8.jinr.ru:21001")'
parallel processing of 1000 events on all cluster nodes of the PoD farm
$ root 'reco.C("eve", "mpddst", 0, 500, "proof:mpd@nc8.jinr.ru:21001:workers=10")'
parallel processing of 500 events on the PoD cluster with 10 workers
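The slides show only the call syntax; how the macro interprets run_type is not given. A hypothetical sketch of such dispatch logic (the function name and the parsing are my assumptions, not the actual MPDRoot code):

    // Hypothetical sketch of how a reco.C-like macro might dispatch on run_type;
    // an illustration, not the actual MPDRoot implementation.
    #include "TProof.h"
    #include "TString.h"

    void OpenRunTypeSession(const TString& runType)
    {
       // "local" (the default) -> plain sequential processing, no session
       if (!runType.BeginsWith("proof"))
          return;

       // "proof"                       -> PROOF-Lite on this machine
       // "proof:mpd@nc8.jinr.ru:21001" -> the PoD cluster master
       // (parsing of the ":workers=N" suffix is omitted in this sketch)
       TString master = runType.Contains("@")
                           ? TString(runType(6, runType.Length()))  // drop "proof:"
                           : TString("lite://");
       TProof::Open(master.Data());
    }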

  12. Speedup of the reconstruction on a 4-core machine

  13. PROOF on the NICA cluster
$ root 'reco.C("evetest.root", "mpddst.root", 0, 3, "proof:mpd@nc8.jinr.ru:21001")'
[Diagram: the PROOF master of the Proof On Demand cluster reads evetest.root from GlusterFS, distributes events №0–№2 to the PROOF slave nodes (8–32 logical processors each), and the results are merged into mpddst.root.]

  14. Speedup of the reconstruction on the NICA cluster

  15. MPD-scheduler
• Developed in C++ with ROOT classes support.
• Uses the Sun Grid Engine scheduling system (the qsub command) for execution in cluster mode.
• SGE combines the cluster machines of the LHEP farm into a pool of worker nodes with 78 logical processors.
• A job for distributed execution on the NICA cluster is described and passed to MPD-scheduler as an XML file:
$ mpd-scheduler my_job.xml

  16. Job description
<job>
  <macro name="$VMCWORKDIR/macro/mpd/reco.C" start_event="0" count_event="1000" add_args="local"/>
  <file input="$VMCWORKDIR/macro/mpd/evetest1.root" output="$VMCWORKDIR/macro/mpd/mpddst1.root"/>
  <file input="$VMCWORKDIR/macro/mpd/evetest2.root" output="$VMCWORKDIR/macro/mpd/mpddst2.root"/>
  <file db_input="mpd.jinr.ru*,energy=3,gen=urqmd" output="~/mpdroot/macro/mpd/evetest_${counter}.root"/>
  <run mode="local" count="5" config="~/build/config.sh" logs="processing.log"/>
</job>
The description starts and ends with the tag <job>.
The tag <macro> sets information about the macro being executed by MPDRoot.
The tag <file> defines the files to be processed by the macro above.
The tag <run> describes run parameters and allocated resources.
* mpd.jinr.ru – the server name with the production database

  17. Job execution on the NICA cluster
job_reco.xml:
<job>
  <macro name="~/mpdroot/macro/mpd/reco.C"/>
  <file input="$VMCWORKDIR/evetest1.root" output="$VMCWORKDIR/mpddst1.root"/>
  <file input="$VMCWORKDIR/evetest2.root" output="$VMCWORKDIR/mpddst2.root"/>
  <file input="$VMCWORKDIR/evetest3.root" output="$VMCWORKDIR/mpddst3.root"/>
  <run mode="global" count="3" config="~/mpdroot/build/config.sh"/>
</job>
job_command.xml:
<job>
  <command line="get_mpd_production energy=5-9"/>
  <run mode="global" config="~/mpdroot/build/config.sh"/>
</job>
[Diagram: MPD-scheduler submits the jobs via qsub to the SGE batch system; the SGE server dispatches them to free SGE worker nodes (8–32 logical processors each), which read evetest[1-3].root from GlusterFS and write mpddst[1-3].root back.]

  18. Speedup of a single reconstruction on the NICA cluster

  19. NICA cluster section on mpd.jinr.ru

  20. Conclusions
• The distributed NICA cluster (128 cores) was deployed based on the LHEP farm for the NICA/MPD experiment (FairSoft, ROOT/PROOF, MPDRoot, GlusterFS, Torque, Maui).
• The data storage (10 TB) was organized with the distributed file system GlusterFS: /nica/mpd[1-8].
• A PROOF On Demand cluster was implemented to parallelize event data processing for the MPD experiment; PROOF support was added to the reconstruction macro.
• MPD-scheduler, a system for distributed job execution, was developed to run MPDRoot macros concurrently on the cluster.
• The web site mpd.jinr.ru, section Computing – NICA cluster, presents the manuals for the systems described above.

  21. Thank you for your attention!

  22. Analytical model for parallel processing on a cluster
Speedup for a point (data-independent) algorithm of image processing:
P_node – count of logical processors, n – data to process (bytes), B_D – data access speed (MB/s), T_1 – "pure" time of the sequential processing (s)
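The formula itself was an image on the original slide and did not survive extraction. A plausible reconstruction consistent with the listed symbols, assuming the n bytes are read serially at speed B_D while the computation scales over P_node processors:

    \[
    S(P_{node}) \;=\; \frac{T_1 + n/B_D}{T_1/P_{node} + n/B_D},
    \]

which saturates at S_max = 1 + T_1 B_D / n as P_node grows, so the data access speed bounds the achievable speedup.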

  23. Prediction of the NICA computing power
How many logical processors are required to process N_TASK physical analysis tasks and one reconstruction in parallel within T_day days?
If n_1 = 2 MB, N_EVENT = 10 000 000 events, T_PA = 5 s/event, T_REC = 10 s/event, B_D = 100 MB/s, T_day = 30 days
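The slide's answer was shown as an image; a back-of-the-envelope evaluation with these numbers (my arithmetic, assuming ideal scaling and neglecting the data access time, which at B_D = 100 MB/s contributes only n_1 N_EVENT / B_D = 2×10^5 s ≈ 2.3 days):

    \[
    P \;\approx\; \frac{N_{EVENT}\,(N_{TASK}\,T_{PA} + T_{REC})}{86400\,T_{day}}
    \;=\; \frac{10^{7}\,(5\,N_{TASK} + 10)\ \text{s}}{2.59\times 10^{6}\ \text{s}}
    \;\approx\; 39 + 19\,N_{TASK},
    \]

e.g. about 135 logical processors for N_TASK = 5 analysis tasks running alongside one reconstruction pass.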
