Distributed parallel processing analysis framework for Belle II and Hyper Suprime-Cam MINEO Sogo (Univ. Tokyo), ITOH Ryosuke, KATAYAMA Nobu (KEK), LEE.

1 Distributed parallel processing analysis framework for Belle II and Hyper Suprime-Cam MINEO Sogo (Univ. Tokyo), ITOH Ryosuke, KATAYAMA Nobu (KEK), LEE Soohyung (Korea Univ.)

2 Distributed parallel framework
Analysis framework: ROOBASF
– Extended from BASF (Belle's framework)
– Controls analysis workflow
– For MPI distributed-memory systems *
– With a Python interface *
– ROOT embedded *
For the use of:
– Belle II (high-energy physics)
– Hyper Suprime-Cam (astrophysics)
* Newly added features

3 Table of contents
Motivation
– Hyper Suprime-Cam & Belle II
Distributed parallel framework
– MPI & Python
Test pipeline
Summary

4 MOTIVATION

5 Hyper Suprime-Cam (HSC) & Belle II
Hyper Suprime-Cam (HSC)
– Next-generation camera aimed at dark energy studies
  On the prime focus of the Subaru Telescope.
  Data rate: 2 GB/shot.
  – 10 times larger than the current camera's.
Belle II
– Next-generation B factory
  With SuperKEKB: new high-luminosity e⁻e⁺ collider at KEK.
  Data rate: 600 MB/sec.
  – > 40 times larger than the current Belle detector's.
An efficient, distributed parallel analysis system is necessary.

6 Analyses on HSC images
Chip-by-chip correction
– 116 CCD sensors cover the focal plane.
– Easily data-parallelized: chips are assigned to processes one by one.
– Pedestal correction
– Gain correction
"Mosaicking"
– Determine positions by matching celestial objects, then superpose the chips.
– Parallelization is not trivial. Processes must exchange:
  – object position information
  – pixel information
  – etc.
Processes need communication.
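The chip-by-chip stage can be sketched in a few lines. This is only an illustration, not ROOBASF code: `assign_chips` and `correct_chip` are hypothetical names, the round-robin assignment is one possible policy (the slide only says chips are assigned to processes one by one), and the correction formula (subtract the pedestal, then scale by the gain) is an assumed convention.

```python
N_CHIPS = 116  # number of CCD sensors on the focal plane (from the slide)


def assign_chips(n_chips, n_procs):
    """Round-robin assignment of chip indices to process ranks (assumed policy)."""
    return {rank: list(range(rank, n_chips, n_procs)) for rank in range(n_procs)}


def correct_chip(pixels, pedestal, gain):
    """Subtract the pedestal, then scale by the gain (assumed convention)."""
    return [(p - pedestal) * gain for p in pixels]


# Each rank gets its own chips, so no communication is needed at this stage.
assignment = assign_chips(N_CHIPS, 4)
corrected = correct_chip([110.0, 120.0, 100.0], pedestal=100.0, gain=2.0)
```

Because every chip is corrected independently, this stage needs no inter-process communication, in contrast to mosaicking.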

7 Use case in Belle II
ROOT-based data format.
DAQ cluster needs cooperation.

8 Existing framework
BASF: the framework for the Belle experiment
– Successfully used for 10 years.
– Involved in nearly all of the experiment: data acquisition, simulation, users' analysis.
– Software pipeline architecture
  Enables modular structure of analysis paths.
  Flexible and dynamic module linking.
– Event-by-event parallel analysis
Issues to be improved:
– Large data rate: distributed parallelization with inter-process communication.
– ROOT support / object-oriented data flow.
[Diagram: analysis modules on a path]
Upgrade BASF for Belle II & also for HSC

9 DISTRIBUTED PARALLEL FRAMEWORK

10 Parallel framework (ROOBASF)
Control analysis paths.
– Like BASF in Belle.
Data parallel.
– Inter-process communication.
Program parallel.
Python user interface.
ROOT utilization.
[Diagram: a path of analysis modules distributed over Processes 1 to 4]

11 Parallelization
ROOBASF uses the Message Passing Interface (MPI)
– De facto standard of distributed parallel computing.
– Expected to run in various environments.
Analysis modules use MPI to perform data-parallel algorithms.
– Each pipeline stage is given an MPI group (communicator).
– Modules perform parallel processing in the given group, just like stand-alone MPI programs.
[Diagram: Process group 1, Process group 2]
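One way to picture "one communicator per pipeline stage" is the color argument of the standard MPI call `MPI_Comm_split` (exposed as `Comm.Split(color, key)` in mpi4py): ranks passing the same color end up in the same sub-communicator. The slides do not say which MPI call ROOBASF actually uses, so the sketch below only computes the grouping in plain Python; `stage_colors` and `group_members` are hypothetical helper names.

```python
def stage_colors(stage_sizes):
    """Map each global rank to the index of the pipeline stage it belongs to."""
    colors = []
    for stage, size in enumerate(stage_sizes):
        colors.extend([stage] * size)
    return colors


def group_members(colors):
    """Collect the global ranks that share a color (one group per stage)."""
    groups = {}
    for rank, color in enumerate(colors):
        groups.setdefault(color, []).append(rank)
    return groups


# Two pipeline stages with two processes each, as in the slide's diagram.
colors = stage_colors([2, 2])
groups = group_members(colors)
```

In a real MPI program each rank would pass `colors[rank]` as its color, and a module would then use only the resulting sub-communicator for its collective operations.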

12 Two layers of analysis paths
Sequential paths
– Sequence of analysis modules.
– Conditional branches.
→ All executed in one process.
Parallel paths
– Sequence of processes and conditional branches.
  Each of the processes executes a "sequential path."
  Program parallelization.
– Multiple copies run simultaneously.
  Data parallelization.
[Diagram: analysis modules and a conditional branch within processes]
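A sequential path with a conditional branch can be sketched as follows. This is not the ROOBASF API: `run_path` and the toy modules are hypothetical, and the branching rule (a module returns the name of another path, which then replaces the rest of the current one) is an assumption made for illustration.

```python
def run_path(paths, name, event):
    """Run a sequential path; a module may return a path name to branch to."""
    for module in paths[name]:
        branch = module(event)
        if branch is not None:
            # Conditional branch: continue on the named path instead.
            return run_path(paths, branch, event)
    return event


def calibrate(event):
    event["calibrated"] = True


def check_quality(event):
    # Branch to the "reject" path when the event is flagged as bad.
    return "reject" if event.get("bad") else None


def store(event):
    event["stored"] = True


def log_reject(event):
    event["rejected"] = True


paths = {"main": [calibrate, check_quality, store], "reject": [log_reject]}
good = run_path(paths, "main", {"bad": False})
bad = run_path(paths, "main", {"bad": True})
```

Everything here runs in one process, which matches the sequential layer; the parallel layer would place such paths in separate processes.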

13 Data flow
Events
– Event or image data to be analyzed.
Broadcast messages
– Experiment parameters, observation parameters, etc.
– Have to be sent to all modules.
– Must not switch order with events.
[Diagram: at a conditional branch, can a broadcast overtake an event?]
Suspend the broadcast until it arrives from all branches.
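The ordering rule can be sketched as a merge point that holds a broadcast until every branch has delivered it. This is a minimal model, not ROOBASF code: the `Merger` class is hypothetical, and it assumes the broadcast was copied into every branch upstream and that each branch preserves its own order.

```python
from collections import deque


class Merger:
    """Merge point after a conditional branch with n_branches inputs."""

    def __init__(self, n_branches):
        self.queues = [deque() for _ in range(n_branches)]
        self.output = []

    def push(self, branch, kind, payload):
        self.queues[branch].append((kind, payload))
        self._drain()

    def _drain(self):
        progressed = True
        while progressed:
            progressed = False
            # Events at the head of any branch can be forwarded immediately.
            for q in self.queues:
                while q and q[0][0] == "event":
                    self.output.append(q.popleft())
                    progressed = True
            # A broadcast is forwarded once it heads every branch queue.
            if all(q and q[0][0] == "bcast" for q in self.queues):
                message = self.queues[0][0]
                for q in self.queues:
                    q.popleft()
                self.output.append(message)
                progressed = True


m = Merger(2)
m.push(0, "event", 1)
m.push(0, "bcast", "run-params")  # held: branch 1 has not delivered it yet
m.push(0, "event", 3)             # waits behind the suspended broadcast
m.push(1, "event", 2)
m.push(1, "bcast", "run-params")  # released now, then event 3 follows
```

Event 2 was sent before the broadcast on its branch, so it leaves the merge point before the broadcast does, and event 3 leaves after it.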

14 Utilization of Python
Analysis paths are described in the Python language.
– Modules can also be described inline in the script.
Modules can be quickly developed in Python.
CPU-costly modules can then be rewritten in C++.
→ Efficient development of analysis modules.
Implemented with the boost.python library.
– Python scripts can call native code.
– Native code can call Python scripts.
  Unique feature of boost.python, absent from SWIG.
[Diagram: a Python script holding the path description and analysis code, and native (C++ etc.) analysis code inside ROOBASF, calling each other]

15 Python script

Create an instance of the ROOBASF framework:

    import boostpbasf as basf
    f = basf.CFrame()

dlopen() "Astr1Chip.so", link the plugin code, and set its parameter:

    f.Plug_Module("Astr1Chip").SetParam("config", "matching.scamp")

Define a Python module:

    class Load(basf.CModule):
        def __init__(self, namefmt):
            basf.CModule.__init__(self)
            self.namefmt = namefmt  # file name format, e.g. "img%03d.fits"
            self.count = 0
        def event(self, status, ev, comm):
            if status == 0:
                ev.SetFile(self.namefmt % self.count)
                (……)

Create a sequential path "main":

    load = Load("/data/img%03d.fits")
    f.Seq_Add("main", load)
    f.Seq_Add("main", "Astr1Chip")

[Diagram: the Python module Load and the native plugin Astr1Chip.so on the "main" path of ROOBASF (native)]

16 TEST PIPELINE

17 Pipeline for the test
Data-parallel analysis path (for on-line monitoring):
– Performs pedestal/gain correction
– Checks data quality
– Performs 1-chip astrometry
– Tiny modules in Python: error detector, time watch, etc.
[Diagram: ROOBASF running three parallel copies of the module chain OSSFLATAGPSTATSEXTASTR, covering CCD image correction, data quality check, and 1-chip astrometry (multi-threaded)]

18 Test environment
3 PCs only
– x64 4-core
– Gigabit-Ethernet-linked
Number of processes
– 1, 3x1, 3x2, 3x3
Parallelization will not scale linearly (though each CPU has 4 cores) because of multi-threaded modules.
[Diagram: three PCs, each with a 4-core CPU and an HDD for input and output images; one HDD also holds the programs (NFS). Process layouts are shown for 1, 3x1, 3x2, and 3x3 processes. Legend: process with threads]

19 Parallelization efficiency
[Plot: speedup and inverse analysis time per image (1/sec) for 1, 3, 6, and 9 processes, compared with the ideal speedup. Legend: process with threads]

20 SUMMARY

21 Summary
Analysis framework: ROOBASF
– Distributed memory (MPI)
– Python script
– ROOT I/O
We built a parallel analysis path for astronomical images.
Feasibility in Belle II is yet to be confirmed.

