Multi-threaded Event Processing with JANA

Multi-threaded Event Processing with JANA. David Lawrence – Jefferson Lab Nov. 3, 2008. Thomas Jefferson National Accelerator Facility (JLab). Located in Newport News on the east coast of Virginia, USA. 6 GeV electron accelerator user facility funded by the US Dept. of Energy.

Presentation Transcript


  1. Multi-threaded Event Processing with JANA David Lawrence – Jefferson Lab Nov. 3, 2008 Multi-threaded Event Processing with JANA - D. Lawrence JLab

  2. Thomas Jefferson National Accelerator Facility (JLab)
     Located in Newport News on the east coast of Virginia, USA
     • 6 GeV electron accelerator user facility funded by the US Dept. of Energy for basic research into the quark structure of nuclear matter
     • One of the two major nuclear physics research labs in the U.S.
     • 12 GeV upgrade: CD-3 approval came in Sept. 2008, with data planned in 2014
     [Figure labels: 12 GeV, CHL2, 11 GeV]

  3. The GlueX Experiment
     • A conventional meson has quantum numbers determined only by its constituent quarks
     • A hybrid meson has some quantum properties due to contributions from the “glue”
     • The “continuous wave” 12 GeV electron beam at JLab has a beam bunch every 2 ns
     [Detector figure labels: real γ beam; 30 cm LH2 target; 2 Tesla solenoid magnet; cylindrical and planar drift chambers inside magnet; barrel EM calorimeter inside magnet; forward EM calorimeter and forward TOF wall downstream]

  4. Data Rates in 12GeV era
     [Table of data rates; values not recoverable from the transcript. Table labels: JLab, LHC *, BNL **]
     Source notes: private comm.; CHEP2007 talk, Sylvain Chapelin; * NIM A499, Mar. 2003, pp. 762-765; ** CHEP2006 talk, Martin L. Purschke

  5. CPU development in the coming years
     • CPU development has shifted from increased clock speed to multiple cores
     • Dual and quad core CPUs are common today
     • Some type of parallelization must be done to use all of the power in a next generation CPU
     • Expect more than 100 cores in a box by 2014!
     From “Platform 2015: Intel Platform Evolution for the Next Decade”

  6. Multi-threading vs. Multiple Processes for a Single Input File
     Bookkeeping overhead is reduced with multiple threads
     [Diagram labels: Multiple Processes (option 1, option 2); Multiple Threads; FILE; dispatcher; single threaded program (repeated); multi-threaded program; Merger; file output; Accumulator]

  7. Threading benefits small scale processing (individual developer cycle)
     Single Workstation
     • The relevant measure of CPU “power” now includes the number of cores used
     • Total CPU power is proportional to area
     • Multi-threading leads to a more rapid turn around time when developing
     [Figure: cores vs. processing time; legend: multi-threaded, single-threaded, edit/compile]

  8. The JANA Factory Model
     • Traditional factory models pass ownership of created objects to the caller
     • In JANA, only const pointers are passed out and ownership stays with the factory
     • Passing out only const pointers guarantees that only the factory may modify the objects
     • Subsequent requests get the same const pointers
         vector<const DTrack*> tracks;
         loop->Get(tracks);
     • Templated Get() method helps ensure type safety
     • The framework itself is responsible for telling factories to delete objects at end of event
     • Persistent flag marks factories that should not auto-delete objects
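     The ownership rules on this slide can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not JANA's actual classes: `Track`, `TrackFactory`, `get()`, `end_of_event()` and `build_count()` are hypothetical names, and the three dummy tracks stand in for real reconstruction output.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical data object standing in for something like DTrack.
struct Track {
    double momentum;
};

// Hypothetical factory: it owns its objects and hands out const pointers only.
class TrackFactory {
public:
    explicit TrackFactory(bool persistent = false) : persistent_(persistent) {}

    // Objects are built on the first request for an event;
    // later requests return pointers to the same objects.
    std::vector<const Track*> get() {
        if (!filled_) {
            build();
        }
        std::vector<const Track*> out;
        out.reserve(owned_.size());
        for (const auto& track : owned_) {
            out.push_back(track.get());
        }
        return out;
    }

    // Called by the framework at the end of each event.
    // A persistent factory keeps its objects across events.
    void end_of_event() {
        if (persistent_) return;
        owned_.clear();
        filled_ = false;
    }

    int build_count() const { return build_count_; }

private:
    void build() {
        for (int i = 0; i < 3; ++i) {
            owned_.push_back(std::make_unique<Track>(Track{1.0 + i}));
        }
        filled_ = true;
        ++build_count_;
        assert(!owned_.empty());
    }

    bool persistent_;
    bool filled_ = false;
    int build_count_ = 0;
    std::vector<std::unique_ptr<Track>> owned_;
};
```

     Because callers only ever hold `const Track*`, they cannot modify or delete the objects; the factory decides when they are released.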

  9. Threads in JANA
     • Each thread in JANA is composed of its own event processing loop and a complete set of factories
     • Reconstruction of a given event is done entirely inside of a single thread
     • No mutex locking is required by authors of reconstruction code
     • Threads work asynchronously to maximize rates at the expense of not maintaining the event order on output
     [Diagram labels: raw data read in; reconstructed values written out (e.g. ROOT tree)]
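     A minimal sketch of the threading model described on this slide, using plain `std::thread` rather than JANA itself. `WorkerState`, `reconstruct()` and `process_events()` are hypothetical names. The assumption illustrated is that each worker touches only its own state, so the "reconstruction" function needs no locking; the only shared item is the atomic counter that hands out events.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-thread state, standing in for a thread's own set of factories.
struct WorkerState {
    std::vector<double> scratch;   // touched by exactly one thread
    std::size_t events_done = 0;
    double checksum = 0.0;
};

// Stand-in for reconstruction code: it uses only its own thread's state,
// so it contains no mutex.
inline void reconstruct(int event_id, WorkerState& state) {
    state.scratch.assign(4, static_cast<double>(event_id));
    for (double value : state.scratch) {
        state.checksum += value;
    }
    ++state.events_done;
}

// Process events [0, n_events) with n_threads workers.
// Events are handed out in order but may finish in any order.
inline std::vector<WorkerState> process_events(int n_events, unsigned n_threads) {
    std::vector<WorkerState> states(n_threads);
    std::atomic<int> next_event{0};   // the only shared piece of data

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&states, &next_event, n_events, t] {
            for (;;) {
                const int id = next_event.fetch_add(1);
                if (id >= n_events) break;
                reconstruct(id, states[t]);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return states;
}
```

     Calling `process_events(1000, 4)` returns four independent states whose `events_done` counts add up to 1000, however the events happened to be distributed.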

  10. Multi-threading when CPU limited
      Reconstruction of MC data, CPU bound jobs only
      • CPU intensive jobs are the ideal application for multi-threading
      • Blue circles are reconstruction of data from a Monte Carlo simulation
      • Red triangles are from a CPU-hungry speed testing plugin
      • Both show very good scaling of the event processing rate with the number of threads
      Overall event processing rate scales linearly with the number of threads

  11. Multi-threading when I/O limited
      No processing of event data, I/O bound jobs only
      • Blue circles: one multi-threaded process reading from a single file
      • Red triangles: multiple single-thread processes reading different files from the same disk
      • Multiple processes trying to access different locations on the same disk compete with each other, causing the read head to physically move back and forth from one location on the disk to another
      • A multi-threaded application will access a single file in sequence, reducing the number of moves the read head must make
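      The single-file access pattern can be illustrated with a shared source that is read strictly front to back, whichever thread asks for the next record. This is a sketch under assumptions, not JANA's event source interface: `SequentialSource`, `next()` and `drain()` are hypothetical names, and the lock lives inside the source rather than in any reconstruction code.

```cpp
#include <cassert>
#include <istream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical event source: one stream, always read in sequence.
class SequentialSource {
public:
    explicit SequentialSource(std::istream& in) : in_(in) {}

    // Hands the next record to whichever thread asks; returns false at end of input.
    bool next(std::string& record) {
        std::lock_guard<std::mutex> lock(mutex_);
        return static_cast<bool>(std::getline(in_, record));
    }

private:
    std::istream& in_;
    std::mutex mutex_;
};

// Drain the source with n_threads workers; returns the records each thread received.
inline std::vector<std::vector<std::string>> drain(SequentialSource& source,
                                                   unsigned n_threads) {
    std::vector<std::vector<std::string>> seen(n_threads);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&source, &seen, t] {
            std::string record;
            while (source.next(record)) {
                seen[t].push_back(record);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return seen;
}
```

      The underlying stream position only ever moves forward, which is the property the slide credits for fewer read head moves.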

  12. Features of JANA
      The Event Processing Framework JANA includes the following features:
      • C++, object-oriented, STL
      • Multi-threaded: reconstruction program can launch any number of processing threads with each event being seen by only one thread
      • Plug-ins: an existing, compiled program can dynamically load other modules that extend or modify its behavior at run-time
          • Reconstruction Algorithms
          • Event (Data) sources
          • Event Processors (i.e. the top-level “conductor”)
      • Data on demand: modules are not “activated” unless the data they produce is requested for that particular event

  13. Summary
      • In the 12GeV era, JLab expects to produce more than 5 PB/yr
      • Performance improvements have been shown for both CPU and I/O limited jobs using a multi-threaded event processing framework
      • Taking advantage of multi-core architectures requires very little effort from reconstruction code authors in a multi-threaded framework
      • Other JANA features not covered:
          • Automatic TTree creation
          • Internal profiling and call graphing
          • Calib./Cond. DB API
          • …

  14. Backup Slides

  15. The janaroot plugin (for automatic creation of ROOT TTrees)
      • Each data object implements a toStrings() method which provides an expression of the data object that may not be a full representation of the object
      • The toStrings() mechanism was developed for allowing a simple, low-level dump of objects from single events to the screen
      • This mechanism is leveraged by janaroot to provide a similar expression as TTrees
      • An empty event tree is also created with all other trees listed as friends so that leaves from multiple objects can be used together in expressions
      • Limitations make this unsuitable for all applications, but it does provide a quick, easy way to make plots of some reconstructed values for less experienced users
      [Figure annotations: A leaf named “N” is automatically added to each tree. Each leaf is an array of size “N” to represent the N objects of this type in the event.]
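      The idea behind toStrings() can be sketched as follows. The exact signature used in JANA is not given in the slides, so the one below (filling a list of name/value string pairs) is an assumption, and `Hit`, `Items` and `dump()` are hypothetical names. The sketch shows only the screen-dump use; writing the same pairs into a TTree is not attempted here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Assumed representation: each data member becomes a (name, value) string pair.
using Items = std::vector<std::pair<std::string, std::string>>;

// Hypothetical data object.
struct Hit {
    int channel;
    double energy;

    // Provides a simple, possibly partial, text expression of the object.
    void toStrings(Items& items) const {
        items.emplace_back("channel", std::to_string(channel));
        std::ostringstream energy_text;
        energy_text << energy;
        items.emplace_back("energy", energy_text.str());
    }
};

// Generic dump of all objects of one type in an event.
// The leading "N=" line mirrors the idea of an automatic count of objects.
template <typename T>
std::string dump(const std::vector<const T*>& objects) {
    std::ostringstream out;
    out << "N=" << objects.size() << "\n";
    for (const T* object : objects) {
        Items items;
        object->toStrings(items);
        const char* separator = "";
        for (const auto& item : items) {
            out << separator << item.first << "=" << item.second;
            separator = " ";
        }
        out << "\n";
    }
    return out.str();
}
```

      Since `dump()` only relies on toStrings(), the same generic code works for any data class that provides it, which is what lets a plugin treat all object types uniformly.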

  16. The janadot plugin (for creating a factory call graph)
      • Arrows indicate calling sequence; data flow is in the opposite direction
      • Number of calls and amount of time spent satisfying each is reported
      • Objects at bottom of graph are (mostly) supplied by event source

  17. Important Roles of the Event Processing Framework
      The framework should provide:
      • A clear structure for modular building of reconstruction code
      • An easy means for swapping out modules (e.g. replace one calorimeter clustering algorithm with another one)
      • A mechanism for moving data between modules
      • Standard interface to event sources (i.e. reconstruction agnostic as to whether event came from file, socket, web service, etc.)
      • Standard interface to Calibrations and Conditions DB
      • Centralized area for run-time settings with simple access mechanism (i.e. allow user to modify a setting at runtime and all modules can see it)
      JANA has been designed to provide all of these!
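      The last item, a centralized area for run-time settings, might look roughly like the sketch below. It is not JANA's parameter interface: `ParameterStore`, `set()` and `get_or_default()` are hypothetical names, as are the parameter names in the usage note. The assumed behavior is that a value set by the user wins over any default a module registers later.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical central store for run-time settings.
// Not thread-safe as written; this sketch assumes settings are registered
// before processing threads start.
class ParameterStore {
public:
    // User override (e.g. parsed from the command line).
    void set(const std::string& name, const std::string& value) {
        values_[name] = value;
    }

    // A module registers its default and reads back the effective value.
    // If the name already exists, the stored value is kept.
    std::string get_or_default(const std::string& name, const std::string& fallback) {
        const auto it = values_.emplace(name, fallback).first;
        return it->second;
    }

private:
    std::map<std::string, std::string> values_;
};
```

      For example, after `store.set("TRK:MIN_HITS", "6")`, a module calling `store.get_or_default("TRK:MIN_HITS", "4")` receives "6", and every other module asking for the same name sees the same value.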

  18. Threading benefits large scale processing
      Single Farm node
      [Figure: cores vs. processing time; legend: multi-threaded, single-threaded]

  19. Threading benefits large scale processing
      1 year of GlueX data = 10k to 20k files if 1 file every 10 min.
