Distributed parallel processing analysis framework for Belle II and Hyper Suprime-Cam

Presentation Transcript

  1. Distributed parallel processing analysis framework for Belle II and Hyper Suprime-Cam MINEO Sogo (Univ. Tokyo), ITOH Ryosuke, KATAYAMA Nobu (KEK), LEE Soohyung (Korea Univ.)

  2. Distributed parallel framework • Analysis framework: ROOBASF • Extended from BASF (Belle’s framework) • Controls analysis workflow • For MPI distributed-memory systems* • With a Python interface* • ROOT embedded* • For use in: • Belle II (high-energy physics) • Hyper Suprime-Cam (astrophysics) * Newly added features

  3. Table of contents • Motivation • Hyper Suprime-Cam & Belle II • Distributed parallel framework • MPI & Python • Test pipeline • Summary

  4. MOTIVATION

  5. Hyper Suprime-Cam (HSC) & Belle II • Hyper Suprime-Cam (HSC) • Next-generation camera aimed at dark-energy studies • At the prime focus of the Subaru Telescope • Data rate: 2 GB/shot • 10 times larger than the current camera’s • Belle II • Next-generation B factory • With SuperKEKB: a new high-luminosity e⁺e⁻ collider at KEK • Data rate: 600 MB/sec • > 40 times larger than the current Belle detector’s → An efficient, distributed parallel analysis system is necessary

  6. Analyses on HSC images • 116 CCD sensors cover the focal plane • Chip-by-chip correction • Pedestal correction • Gain correction • Easily data-parallelized: chips are assigned to processes one by one • “Mosaicking” • Superposes chips • Determines positions by matching celestial objects • Parallelization is not trivial: processes need communication • Processes must exchange: – object position information – pixel information – etc.
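
The chip-by-chip step above can be sketched in a few lines of Python. This is only an illustration, not ROOBASF code: the function names `assign_chips` and `correct_chip` are hypothetical, the round-robin rule is one possible way of "assigning chips to processes one by one", and the correction formula (subtract a pedestal, then multiply by a gain factor) is an assumption, since the slides do not give it.

```python
def assign_chips(n_chips, n_procs):
    """Round-robin assignment (assumed scheme): chip i goes to process i % n_procs."""
    assignment = {rank: [] for rank in range(n_procs)}
    for chip in range(n_chips):
        assignment[chip % n_procs].append(chip)
    return assignment


def correct_chip(pixels, pedestal, gain):
    """Assumed per-chip correction: subtract the pedestal, then scale by the gain."""
    return [(p - pedestal) * gain for p in pixels]


if __name__ == "__main__":
    # 116 CCDs (from the slide) spread over 9 processes (largest test configuration).
    chips_per_rank = assign_chips(116, 9)
    for rank, chips in chips_per_rank.items():
        print(rank, len(chips))
```

Because each chip is corrected independently, no process needs data from another in this step, which is why it parallelizes easily while mosaicking does not.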

  7. Use case in Belle II • ROOT-based data format • DAQ cluster needs cooperation

  8. Existing framework • BASF: the framework for the Belle experiment • Successfully used for 10 years • Involved in nearly all of the experiment: data acquisition, simulation, users’ analysis • Software pipeline architecture • Enables a modular structure of analysis paths • Flexible and dynamic module linking • Event-by-event parallel analysis • Issues to be improved: • Large data rate: distributed parallelization with inter-process communication • ROOT support / object-oriented data flow → Upgrade BASF for Belle II and also for HSC [Figure: a path made of analysis modules]

  9. DISTRIBUTED PARALLEL FRAMEWORK

  10. Parallel framework (ROOBASF) • Controls analysis paths, like BASF in Belle • Data-parallel • Inter-process communication • Program-parallel • Python user interface • ROOT utilization [Figure: a path of analysis modules distributed over Processes 1 to 4]

  11. Parallelization • ROOBASF uses the Message Passing Interface (MPI) • De facto standard for distributed parallel computing • Expected to run in various environments • Analysis modules use MPI to perform data-parallel algorithms • Each pipeline stage is given an MPI group (communicator) • Within the given group, modules perform parallel processing just like stand-alone MPI programs [Figure: Process group 1, Process group 2]
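
The idea of one MPI group per pipeline stage can be sketched without MPI itself. The snippet below is a minimal pure-Python illustration, assuming that consecutive world ranks are handed to consecutive stages; the helper names `stage_colors`, `group_ranks`, and `local_rank` are hypothetical and do not come from ROOBASF. The "color" computed here is the kind of value one would pass to a communicator split.

```python
def stage_colors(stage_sizes):
    """Map each world rank to a stage index ("color").

    stage_sizes[i] is the number of processes given to stage i (assumed layout:
    consecutive ranks go to consecutive stages).
    """
    colors = []
    for stage, size in enumerate(stage_sizes):
        colors.extend([stage] * size)
    return colors


def group_ranks(colors):
    """Collect the world ranks that share a color, i.e. one group per stage."""
    groups = {}
    for world_rank, color in enumerate(colors):
        groups.setdefault(color, []).append(world_rank)
    return groups


def local_rank(world_rank, colors):
    """Rank of a process inside its own stage group."""
    members = group_ranks(colors)[colors[world_rank]]
    return members.index(world_rank)


if __name__ == "__main__":
    colors = stage_colors([2, 4])   # hypothetical: 2 processes in stage 0, 4 in stage 1
    print(group_ranks(colors))
    print(local_rank(3, colors))
```

In a real MPI program the same color could be given to `MPI_Comm_split` (in mpi4py, `MPI.COMM_WORLD.Split(color, key)`), after which a module only sees the ranks of its own stage. Whether ROOBASF builds its groups this way is not stated in the slides.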

  12. Two layers of analysis paths • Sequential paths • Sequence of analysis modules • Conditional branches → All executed in one process • Parallel paths • Sequence of processes and conditional branches • Each process executes a “sequential path” • Program-parallelization • Multiple copies run simultaneously • Data-parallelization [Figure: analysis modules, conditional branch, processes]
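
The lower layer, a sequential path with conditional branches, can be illustrated as follows. This is a sketch under assumptions rather than the ROOBASF design: the class `SequentialPath`, the convention that a module is a callable returning an integer status, and the rule that a matching status hands the event over to another path are all hypothetical.

```python
class SequentialPath:
    """A chain of modules run in order inside one process (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.steps = []  # list of (module, {status: branch_path})

    def add(self, module, branches=None):
        self.steps.append((module, branches or {}))

    def run(self, event):
        for module, branches in self.steps:
            status = module(event)
            if status in branches:
                # Conditional branch: the rest of this path is skipped.
                return branches[status].run(event)
        return event


# Hypothetical modules: each takes the event (a dict) and returns a status.
def calibrate(ev):
    ev["calibrated"] = True
    return 0


def quality(ev):
    return 1 if ev["value"] < 0 else 0


def reject(ev):
    ev["rejected"] = True
    return 0


def analyze(ev):
    ev["analyzed"] = True
    return 0


bad = SequentialPath("bad")
bad.add(reject)

main = SequentialPath("main")
main.add(calibrate)
main.add(quality, {1: bad})   # status 1 diverts the event to the "bad" path
main.add(analyze)
```

In the upper layer, each process of a parallel path would run one such sequential path, with several copies running at once for data-parallelization.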

  13. Data flow • Events • Event or image data to be analyzed • Broadcast messages • Experiment parameters, observation parameters, etc. • Have to be sent to all modules • Must not switch order with events: where branches rejoin, a broadcast is suspended until it has arrived from all branches [Figure: events and a broadcast passing through a conditional branch; can the broadcast overtake an event?]
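
The ordering rule for broadcasts can be made concrete with a small simulation. The sketch below assumes a point where several branches rejoin, messages represented as `(kind, id)` tuples, and every branch delivering the same broadcasts in the same order; the class `MergePoint` is hypothetical and is not taken from ROOBASF.

```python
from collections import deque


class MergePoint:
    """Joins several branches; a broadcast is held until every branch delivers it."""

    def __init__(self, n_branches):
        self.queues = [deque() for _ in range(n_branches)]
        self.output = []

    def receive(self, branch, message):
        self.queues[branch].append(message)
        self._drain()

    def _drain(self):
        progressed = True
        while progressed:
            progressed = False
            # Events at the head of a branch queue can pass immediately.
            for q in self.queues:
                while q and q[0][0] == "event":
                    self.output.append(q.popleft())
                    progressed = True
            # A broadcast passes only when it heads every branch queue;
            # it is then forwarded once. Events queued behind it wait.
            if all(q and q[0][0] == "bcast" for q in self.queues):
                bcast = self.queues[0][0]
                for q in self.queues:
                    q.popleft()
                self.output.append(bcast)
                progressed = True


if __name__ == "__main__":
    merge = MergePoint(2)
    # Upstream order: e1, broadcast B1, e2, e3. Branch 1 happens to be faster.
    merge.receive(1, ("bcast", "B1"))
    merge.receive(1, ("event", "e2"))
    merge.receive(0, ("event", "e1"))
    merge.receive(0, ("bcast", "B1"))
    merge.receive(0, ("event", "e3"))
    print(merge.output)
```

Holding the broadcast on the fast branch keeps `e2` from overtaking it, and waiting for the slow branch keeps the broadcast from overtaking `e1`.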

  14. Utilization of Python • Analysis paths are described in the Python language • Modules can also be written inline in the script • Modules can be quickly developed in Python • If they turn out to be CPU-costly, they can then be rewritten in C++ → Efficient development of analysis modules • Implemented with the Boost.Python library • Python scripts can call native code • Native code can call Python scripts • Unique feature of Boost.Python, absent from SWIG [Figure: native analysis code (C++ etc.) and Python analysis code calling each other through ROOBASF; the path description is given in the Python script]

  15. Python script
      import boostpbasf as basf
      f = basf.CFrame()    # create an instance of the ROOBASF framework

      # dlopen() "Astr1Chip.so", link the plugin code, and set its parameter
      f.Plug_Module("Astr1Chip").SetParam("config", "matching.scamp")

      # Define a Python module
      class Load(basf.CModule):
          def __init__(self, namefmt):
              basf.CModule.__init__(self)
              self.namefmt = namefmt
              self.count = 0

          def event(self, status, ev, comm):
              if status == 0:
                  ev.SetFile(self.namefmt % self.count)
              # (……)

      # Create a sequential path "main"
      load = Load("/data/img%03d.fits")
      f.Seq_Add("main", load)
      f.Seq_Add("main", "Astr1Chip")

      [Figure: the “main” path in ROOBASF (native) contains Load (Python) and Astr1Chip.so (native)]

  16. TEST PIPELINE

  17. Pipeline for the test • Data-parallel analysis path (for on-line monitoring): • Performs pedestal/gain correction • Checks data quality • Performs 1-chip astrometry • Tiny modules in Python: error detector, time watch, etc. [Figure: CCD images enter ROOBASF (multi-threaded), where parallel copies of the module chain OSS → FLAT → AGP → STAT → SEXT → ASTR run, covering correction, data-quality check, and 1-chip astrometry]

  18. Test environment • 3 PCs only • x64, 4 cores each • Linked by Gigabit Ethernet • Number of processes: 1, 3x1, 3x2, 3x3 • Speedup is not expected to be linear (although each CPU has 4 cores) because the modules are multi-threaded [Figure: three 4-core PCs, each with an HDD holding input and output images, with disks shared over NFS; one disk also holds the programs. Configurations shown: 1 process, 3x1, 3x2, and 3x3 processes, each process with threads]

  19. Parallelization efficiency [Figure: speedup versus number of processes with threads (1, 3, 6, 9), compared with the ideal speedup; second axis: analysis time per image in seconds (inverse scale)]
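
For reference, the quantities plotted on this slide follow the usual definitions: speedup is the single-process time divided by the parallel time, and efficiency is the speedup divided by the number of processes. The sketch below uses hypothetical function names and placeholder timings that are not measurements from the talk.

```python
def speedup(t_single, t_parallel):
    """Speedup relative to the single-process run."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, n_procs):
    """Fraction of the ideal (linear) speedup that is actually reached."""
    return speedup(t_single, t_parallel) / n_procs


if __name__ == "__main__":
    # Placeholder numbers only: 12 s per image with 1 process, 4 s with 6 processes.
    print(speedup(12.0, 4.0))
    print(efficiency(12.0, 4.0, 6))
```

An efficiency below 1 is expected in this test, since each process is already multi-threaded and the three PCs have only four cores each.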

  20. SUMMARY

  21. Summary • Analysis framework: ROOBASF • Distributed memory (MPI) • Python script • ROOT I/O • We built a parallel analysis path for astronomical images. • Yet to confirm feasibility in Belle II.