PROOF system for parallel NICA event processing MPD/BM@N Collaboration Joint Institute for Nuclear Research, Dubna
NICA accelerator complex Gertsenberger K.V.
MPD and MpdRoot software
The MpdRoot software is developed for MPD event simulation, reconstruction of experimental or simulated data, and subsequent physics analysis of heavy-ion collisions registered by the MultiPurpose Detector at the NICA collider (based on ROOT and FairRoot).
The MpdRoot software is available on GitLab: https://git.jinr.ru/nica/mpdroot
BM@N and BmnRoot software
The BmnRoot software is developed for BM@N event simulation, reconstruction of experimental or simulated data, and subsequent physics analysis of collisions of elementary particles and ions with a fixed target at the NICA accelerator complex (based on ROOT and FairRoot).
The BmnRoot software is available on GitLab: https://git.jinr.ru/nica/bmnroot
Prerequisites for parallel processing and storage
- high interaction rate (up to 7 kHz)
- high particle multiplicity: up to 1000 charged particles for a central collision at NICA energies
- reconstruction of a single event in MpdRoot currently takes a long time
- large data stream from the MPD: estimated at 5-10 PB of raw data per year; 100 million simulated events ~ 5 PB
- MPD event data can be processed concurrently!
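The storage estimate above implies an average size per simulated event. A minimal sketch of that arithmetic, assuming decimal petabytes (1 PB = 10^15 bytes); the names kBytesPerPB and averageEventBytes are illustrative and not part of MpdRoot:

```cpp
#include <cstdint>

// Assumption: decimal petabyte, 10^15 bytes.
constexpr std::uint64_t kBytesPerPB = 1000ULL * 1000 * 1000 * 1000 * 1000;

// Average number of bytes per event for a given total volume and event count.
// Returns 0 for an empty sample to avoid division by zero.
constexpr std::uint64_t averageEventBytes(std::uint64_t totalBytes,
                                          std::uint64_t events) {
    return events == 0 ? 0 : totalBytes / events;
}
```

With the figures from the slide, averageEventBytes(5 * kBytesPerPB, 100000000ULL) gives 50,000,000 bytes, i.e. roughly 50 MB per simulated event.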
Current NICA cluster (prototype)
Current data storage on the NICA cluster
GlusterFS distributed file system
- free and open source
- aggregates existing file systems into a common distributed file system
- has no metadata server
- automatic replication runs as a background process
- a background self-checking service restores corrupted files after hardware or software failures
Parallel event processing
- NICA cluster: concurrent data processing on cluster nodes
- PROOF server: parallel event data processing in ROOT macros on parallel architectures
- MPD-Scheduler: scheduling system that distributes tasks to parallelize MPD data processing on the cluster nodes
Parallel data processing with PROOF
- PROOF (Parallel ROOT Facility) is part of the ROOT software; no additional installation is required
- PROOF uses data-independent parallelism, relying on the lack of correlation between MPD events
- good scalability
Parallelization for three parallel architectures:
- PROOF-Lite parallelizes data processing on a single multiprocessor/multicore machine
- PROOF parallelizes processing on a heterogeneous computing cluster
- parallel data processing in a GRID system
Using PROOF in reconstruction
The last parameter of the reconstruction macro is run_type (default: "local").
Speedup on the user's multicore machine, e.g. for MpdRoot:
$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof")'
  parallel processing of 1000 events, with the worker count equal to the number of logical processors
$ root 'reco.C("evetest.root", "mpddst.root", 0, 500, "proof:workers=3")'
  parallel processing of 500 events with three concurrent workers
Speedup on the NICA cluster:
$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof:mpd@nc10.jinr.ru:21001")'
  parallel processing of 1000 events on all cores of the PoD (Proof On Demand) farm
$ root 'reco.C("evetest.root", …, 0, 500, "proof:mpd@nc10.jinr.ru:21001:workers=15")'
  parallel processing of 500 events on the PoD cluster with 15 workers
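The run_type values used in these commands follow a visible pattern: "local", "proof", an optional master address, and an optional "workers=N" suffix. The sketch below shows one way such a string could be decomposed. It is an assumption made for illustration, not the parsing code of reco.C; the RunType struct and the parseRunType function are hypothetical names.

```cpp
#include <string>

// Hypothetical holder for a parsed run_type value (not an MpdRoot type).
struct RunType {
    bool useProof = false;  // false: sequential "local" processing
    std::string master;     // empty: PROOF-Lite on the local machine
    int workers = 0;        // 0: no explicit limit, PROOF picks the count
};

// Splits strings such as "proof:mpd@nc10.jinr.ru:21001:workers=15".
// std::stoi throws if the text after "workers=" is not a number.
RunType parseRunType(const std::string& runType) {
    RunType result;
    const std::string prefix = "proof";
    if (runType.compare(0, prefix.size(), prefix) != 0) {
        return result;  // "local" or any other value: no PROOF
    }
    result.useProof = true;

    // Drop the "proof" prefix and the ':' that may follow it.
    std::string rest = runType.substr(prefix.size());
    if (!rest.empty() && rest[0] == ':') rest.erase(0, 1);

    // An optional trailing "workers=N" field sets the worker count.
    const std::string key = "workers=";
    const std::size_t pos = rest.rfind(key);
    if (pos != std::string::npos && (pos == 0 || rest[pos - 1] == ':')) {
        result.workers = std::stoi(rest.substr(pos + key.size()));
        rest.erase(pos == 0 ? 0 : pos - 1);  // also remove the separating ':'
    }

    // Whatever remains is the master address, e.g. "mpd@nc10.jinr.ru:21001".
    result.master = rest;
    return result;
}
```

For "proof:workers=3" this yields an empty master (PROOF-Lite) and three workers; for "proof:mpd@nc10.jinr.ru:21001:workers=15" it yields the master "mpd@nc10.jinr.ru:21001" and 15 workers. In a ROOT macro the master string would presumably be handed on to TProof::Open, but how reco.C does this is not shown in the slides.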
Speedup of the reconstruction on a 4-core machine
PROOF on the NICA cluster (file splitting mode)
$ root 'reco.C("evetest.root", "mpddst.root", 0, 3, "proof:mpd@nc10.jinr.ru:21001")'
  the fourth argument (3) is the event count
[Diagram: evetest.root, intermediate *.root files and mpddst.root on GlusterFS; events №0, №1 and №2 handled by separate PROOF processes; legend: master server and slave nodes; Proof On Demand cluster (64 cores)]
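Because the events are independent, a requested event range can be divided among workers without any coordination between them. The sketch below shows a simple static split into contiguous ranges. It is a simplification for illustration only: PROOF assigns work through its own packetizer, and the EventRange struct and splitEvents function are hypothetical names.

```cpp
#include <vector>

// Hypothetical description of the events assigned to one worker.
struct EventRange {
    int first;  // index of the first event in the range
    int count;  // number of events in the range
};

// Divides eventCount events, starting at firstEvent, into contiguous
// ranges whose sizes differ by at most one event.
std::vector<EventRange> splitEvents(int firstEvent, int eventCount, int workers) {
    std::vector<EventRange> ranges;
    if (eventCount <= 0 || workers <= 0) return ranges;

    // Never create more ranges than there are events.
    if (workers > eventCount) workers = eventCount;

    const int base = eventCount / workers;   // minimum events per worker
    const int extra = eventCount % workers;  // first 'extra' workers get one more

    int next = firstEvent;
    for (int w = 0; w < workers; ++w) {
        const int count = base + (w < extra ? 1 : 0);
        ranges.push_back({next, count});
        next += count;
    }
    return ranges;
}
```

For the three-event example on the slide, splitEvents(0, 3, 64) produces three single-event ranges, so only three of the 64 cores would receive work under this scheme.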
Speedup on the NICA cluster
«PROOF Parallelization» section on mpd.jinr.ru
Conclusions
- The distributed NICA cluster provides 364 processor cores for the NICA experiments (FairSoft, ROOT/PROOF, BmnRoot, MpdRoot, GlusterFS, Sun Grid Engine).
- The data storage, based on the GlusterFS distributed file system, provides 83+4 TB of shared space (mirrored data).
- A PROOF On Demand cluster with 64 processor cores was implemented to parallelize event data processing for the NICA experiments.
- PROOF support was added to the reconstruction macro.
- Parallel reconstruction with PROOF can be used on a user's multicore machine or on the NICA cluster (only one parameter has to be changed).
- The website mpd.jinr.ru presents the manual for the PROOF system in the section Computing – NICA cluster – PROOF parallelization.