Mock Data Challenge for the MPD experiment on the HybriLIT cluster

1 Mock Data Challenge for the MPD experiment on the HybriLIT cluster
The International Conference "Mathematical Modeling and Computational Physics" (MMCP'2017)
K. Gertsenberger, VBLHEP, Joint Institute for Nuclear Research, on behalf of the MPD collaboration
6 July 2017

2 NICA accelerator complex
Beams: from p, d(h) to $\mathrm{Au}^{79+}$
Fixed target experiment: (2017)
2 interaction points: MPD (2020) & SPD
Luminosity: (Au), (p) $\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$
Collision energy: $\sqrt{s_{NN}}$ (Au) = 4–11 GeV

3 MpdRoot framework
The MpdRoot software is developed for MPD event simulation, reconstruction of experimental or simulated data, and the subsequent physics analysis of heavy-ion collisions registered by the MultiPurpose Detector at the NICA collider.
C++ classes, Linux OS support, based on ROOT and FairRoot
The MpdRoot software is available in GitLab

4 Software of the experiment: MpdRoot
MpdRoot steps:
  Event generator (UrQMD, LAQGSM, Pythia…): simulate physics process (quantum mechanics and probabilities)
  Simulation (Geant3, Geant4, Fluka…): simulate interaction with media and detector materials
  Digitization: translate interactions with detectors into clusters of signals
  Reconstruction: as for experimental data
  Analysis: as for experimental data
Phases of QCD matter at high baryon density
Hydrodynamics and hadronic observables
Femtoscopy, correlations and fluctuations
Local P and CP violation in hot QCD matter
Cumulative processes
Polarization effects and spin physics
Hypernuclei production in heavy ion collisions
and many others…
Interaction of interest
Geometry of the system
Materials used
Particles of interest
Generation of test events of particles
Interactions of particles with matter and EM fields
Response to detectors
Records of energies and tracks
Analysis of the full simulation at whatever detail you like
Visualization of the detector system and tracks
Clustering
Hits reconstruction in subdetectors
Tracks reconstruction
Searching for track candidates in main tracker
Track propagation using Kalman filter
Matching with other detectors
Vertex finding
Particles identification

5 Prerequisites of the distributed computing
high interaction rate (up to 7 kHz)
high particle multiplicity: up to 1000 charged particles per central collision at NICA energies
sequential event reconstruction in MpdRoot can take a long time
large data stream from MPD: estimated at 10 PB of raw data per year
200 million simulated events ~ 1 PB
MPD event data can be processed concurrently!
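To put the 10 PB per year figure in perspective, the sketch below converts it to an average data rate. It assumes decimal units (1 PB = 10^15 bytes) and data taking spread evenly over a full calendar year; neither assumption comes from the slides, and the rate during actual data taking would be higher.

```python
# Average data rate implied by a yearly raw-data volume.
# Assumptions (not from the slides): decimal units and a uniform
# data flow over a full 365-day year.
PB = 10**15                       # bytes in one petabyte (decimal)
MB = 10**6                        # bytes in one megabyte (decimal)
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Return the mean data rate in MB/s for a yearly volume given in PB."""
    return petabytes_per_year * PB / SECONDS_PER_YEAR / MB


rate = average_rate_mb_per_s(10)
print(f"{rate:.0f} MB/s")  # prints "317 MB/s"
```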

6 MPD DAQ Data Flow (proposal)
On-line processing (DAQ):
  FLP: First Level Processor (Data Check, Flow Control, Formatting)
  EvB: Event Builder (Buffering, Sorting, Distribution)
  HLT: High Level L3 Trigger
  RQ: Raw Data Quality Check
  FT REC: Fast Event Reconstruction
  HIST: Online Histograms
  EvM: Event Monitor
  DRE, LDC
  TDS: Transient Data Storage
Off-line processing (distributed cluster):
  Rootifier: ROOT Format, Mapping, Alignment
  REC: Event Reconstruction
  PA: Physics Analysis
  PDS: Permanent Data Storage
  Database
Adapted from "MPD Data Acquisition System TDR", MPD DAQ Collaboration

7 MpdRoot offline data processing
DAQ → Data Storage: raw data in MPD format, mpd_run.data (~0.8 MB/event)
Event generators (UrQMD, QGSM, Pythia…): generator.data (~0.5 MB/event)
Simulation (Geant3/4, Fluka…), runMC.C: evetest.root (~5 MB/event)
Digitizer: mpd_digits.root (~0.5 MB/event)
Reconstruction, reco.C: mpddst.root (~1 MB/event), DST format
Physical analysis
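The per-event sizes on this slide can be cross-checked against the "200 million simulated events ~ 1 PB" estimate from slide 5. The sketch below is only a consistency check under the assumption of decimal units; pairing the 1 PB figure with the evetest.root size is an interpretation, not something the slides state.

```python
# Storage needed for N events at a given per-event file size.
# Per-event sizes are the approximate values quoted on the slide;
# decimal units (1 MB = 10**6 B, 1 PB = 10**15 B) are an assumption.
MB = 10**6
PB = 10**15

SIZE_MB_PER_EVENT = {
    "evetest.root": 5.0,  # simulation output
    "mpddst.root": 1.0,   # reconstruction output (DST)
}


def total_petabytes(n_events: int, mb_per_event: float) -> float:
    """Return the total volume in PB for n_events of the given size."""
    return n_events * mb_per_event * MB / PB


n_events = 200_000_000
for name, size in SIZE_MB_PER_EVENT.items():
    print(f"{name}: {total_petabytes(n_events, size):.1f} PB")
```

With these inputs the simulation output alone comes to about 1 PB, and the DST files add about 0.2 PB.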

8 The Unified Database as simulation data storage
It contains all simulated data files for different event generators
Easy searching by many parameters
Event generators: UrQMD, QGSM, Hybrid UrQMD, vHLLE_UrQMD, 3FD (Theseus)
1.2 TB, ~ files, ~ events/file

9 Parallel MPD event processing
Concurrent data processing on cluster nodes:
  PROOF server: parallel event data processing in ROOT macros on the parallel architectures
  MPD-Scheduler: scheduling system for task distribution to parallelize MPD data processing on the cluster nodes

10 Parallel data processing with PROOF
PROOF (Parallel ROOT Facility) is part of the ROOT software; no additional installation is needed
PROOF uses data-independent parallelism based on the lack of correlation between MPD events, which gives good scalability
Parallelization for three parallel architectures:
  PROOF-Lite parallelizes data processing on one multiprocessor/multicore machine
  PROOF parallelizes processing on a heterogeneous computing cluster
  Parallel data processing in a GRID system
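Data-independent parallelism can be illustrated without PROOF itself: because events do not depend on each other, an event range can be cut into chunks, each chunk processed separately, and the results concatenated in order. The sketch below uses Python threads only to show the pattern; `split_range`, `reconstruct_chunk` and `process_in_parallel` are hypothetical names, and the "reconstruction" is a placeholder, not MpdRoot code.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, count, n_workers):
    """Split [start, start + count) into at most n_workers contiguous chunks."""
    base, extra = divmod(count, n_workers)
    chunks, first = [], start
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        if size:  # skip empty chunks when there are more workers than events
            chunks.append((first, size))
            first += size
    return chunks


def reconstruct_chunk(chunk):
    """Hypothetical stand-in for reconstruction: one record per event."""
    first, size = chunk
    return [f"event-{i}" for i in range(first, first + size)]


def process_in_parallel(start, count, n_workers):
    """Process independent chunks concurrently and merge them in event order."""
    chunks = split_range(start, count, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(reconstruct_chunk, chunks))  # map keeps chunk order
    return [record for part in parts for record in part]


print(split_range(0, 10, 4))
print(process_in_parallel(0, 3, 4))
```

No chunk needs data from another chunk, so the merge step is a plain concatenation; that property is what makes this kind of workload scale well.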

11 The speedup of the reconstruction on 4-cores

12 PROOF server on cluster (file splitting mode)
$ root 'reco.C("evetest.root", "mpddst.root", 0, 3, "proof:server_name")'
event count: 3 (event №0, event №1, event №2)
Diagram: evetest.root on the distributed FS is split by event among the PROOF slave nodes, coordinated by the PROOF master server, on a Proof On Demand cluster; the output is mpddst.root.

13 Parallel MPD event processing
Concurrent data processing on cluster nodes:
  PROOF server: parallel data processing in ROOT macros on the parallel architectures
  MPD-Scheduler: scheduling system for task distribution to parallelize MPD data processing on the cluster nodes

14 $ mpd-scheduler my_job.xml
MPD-Scheduler is written in C++ with support for ROOT classes. GIT: nica_modules/mpd_scheduler
MPD-Scheduler simplifies and parallelizes job execution without requiring knowledge of batch systems (e.g. the sbatch or qsub commands), and it can use the Unified Database
It supports the SLURM, SGE and Torque scheduling systems for execution in cluster mode
Jobs for multithreaded execution on a single multicore user machine and for distributed execution on clusters are described in an XML file that is passed to MPD-Scheduler:
$ mpd-scheduler my_job.xml

15 Job description for MPD-Scheduler
<job>
  <macro name="$VMCWORKDIR/macro/mpd/reco.C" start_event="0" count_event="1000" add_args="local"/>
  <file input="$VMCWORKDIR/EVE/*.root" output="$VMCWORKDIR/DST/mpddst${counter}.root"/>
  <run mode="local" count="5" config="$VMCWORKDIR/build/config.sh" logs="processing.log"/>
</job>
The description starts and ends with the tag <job>.
The tag <macro> sets information about the ROOT macro executed by MpdRoot: name (path), start_event, count_event, add_args…
The tag <file> defines the files to be processed by the macro above: input (path with regular expressions), file_input, db_input, job_input, output, start_event, count_event, parallel_mode, merge…
The tag <run> describes run parameters and allocated resources for the job: mode ('global': on the NICA cluster, 'local': on a multicore machine), count, config, logs…
The tag <command> with an argument line is used to run a non-ROOT command.
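When many similar job descriptions are needed, the XML can be generated rather than written by hand. The sketch below builds a description with the tags and attributes shown on this slide using Python's standard library; `build_job` is a hypothetical helper, and whether MPD-Scheduler accepts exactly this serialization is an assumption.

```python
import xml.etree.ElementTree as ET


def build_job(macro, input_glob, output_pattern, n_procs, config, mode="local"):
    """Build a <job> element with <macro>, <file> and <run> children.

    Tag and attribute names follow the slide; this helper itself is
    hypothetical and not part of MPD-Scheduler.
    """
    job = ET.Element("job")
    ET.SubElement(job, "macro",
                  {"name": macro, "start_event": "0", "count_event": "1000"})
    ET.SubElement(job, "file", {"input": input_glob, "output": output_pattern})
    ET.SubElement(job, "run",
                  {"mode": mode, "count": str(n_procs), "config": config})
    return job


job = build_job(
    macro="$VMCWORKDIR/macro/mpd/reco.C",
    input_glob="$VMCWORKDIR/EVE/*.root",
    output_pattern="$VMCWORKDIR/DST/mpddst${counter}.root",
    n_procs=5,
    config="$VMCWORKDIR/build/config.sh",
)
xml_text = ET.tostring(job, encoding="unicode")
print(xml_text)
```

Generating the file this way guarantees straight quotes and balanced tags, which hand-edited or copy-pasted descriptions do not.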

16 MPD-Scheduler on distributed clusters
job_reco.xml
<job>
  <macro name="~/mpdroot/macro/mpd/reco.C"/>
  <file input="$VMCWORKDIR/evetest1.root" output="$VMCWORKDIR/mpddst1.root"/>
  <file input="$VMCWORKDIR/evetest2.root" output="$VMCWORKDIR/mpddst2.root"/>
  <file input="$VMCWORKDIR/evetest3.root" output="$VMCWORKDIR/mpddst3.root"/>
  <run mode="global" count="3" config="~/mpdroot/build/config.sh"/>
</job>
job_command.xml
<job>
  <command line="get_mpd_production energy=5-9 "/>
  <run mode="global" config="~/mpdroot/build/config.sh"/>
</job>
Diagram: MPD-Scheduler passes the jobs to the batch system (SLURM, SGE or Torque), where SRV = batch server and WRK = batch worker node (free or busy); the input files evetest1.root to evetest3.root and the output files mpddst1.root to mpddst3.root are on the distributed FS.
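A job description such as job_reco.xml can be sanity-checked before submission. The sketch below parses it with Python's standard library and lists the per-file tasks; `load_job` is a hypothetical helper, and it makes no claim about how MPD-Scheduler itself reads the file. The parser rejects malformed XML, including attributes written with typographic quotes, which presentation software often substitutes for straight ones.

```python
import xml.etree.ElementTree as ET

# Copy of the job_reco.xml description from the slide.
JOB_RECO = """<job>
  <macro name="~/mpdroot/macro/mpd/reco.C"/>
  <file input="$VMCWORKDIR/evetest1.root" output="$VMCWORKDIR/mpddst1.root"/>
  <file input="$VMCWORKDIR/evetest2.root" output="$VMCWORKDIR/mpddst2.root"/>
  <file input="$VMCWORKDIR/evetest3.root" output="$VMCWORKDIR/mpddst3.root"/>
  <run mode="global" count="3" config="~/mpdroot/build/config.sh"/>
</job>"""


def load_job(xml_text):
    """Return ([(macro, input, output), ...], requested process count).

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    job = ET.fromstring(xml_text)
    macro = job.find("macro").get("name")
    tasks = [(macro, f.get("input"), f.get("output")) for f in job.findall("file")]
    count = int(job.find("run").get("count"))
    return tasks, count


tasks, count = load_job(JOB_RECO)
for macro, src, dst in tasks:
    print(f"{macro}: {src} -> {dst}")
print("requested processes:", count)
```

Here three input files and `count="3"` give one task per requested process.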

17 Parallel event processing in MpdRoot
MPD event reconstruction with PROOF server
  PROOF (Parallel ROOT Facility) is a part of the ROOT software
  Parallel NICA event data processing in ROOT macros on the parallel architectures: user multicore machines, heterogeneous distributed clusters and GRID system
Event reconstruction with MPD-Scheduler
  Scheduling system (MPD-Scheduler) for task distribution to parallelize NICA data processing on multicore machines and cluster nodes
  Supports the SLURM, SGE and Torque systems
  Jobs are described and passed as an XML file

18 The purposes of Mock Data Challenge
provides a large-scale simulation production
exercises the full spectrum of experiment software and hardware, from simulation through to physics analysis
stress-tests the distributed computing infrastructure of the experiment
identifies potential issues before first data
helps to estimate offline computing requirements
data quality checking
prepares a quick turnaround from data taking to publication

19 HybriLIT cluster
web-site:
OS: Scientific Linux
batch system: SLURM
distributed FS (MPD data): NFS
All external packages for MpdRoot are installed and configured.
MpdRoot is taken from the GIT repository.
An MPD user is available.

20 MDC on HybriLIT cluster (jobs)
<jobs>
  <job name="mpd_simulation">
    <macro name="~/mpdroot/macro/mpd/runMC.C"/>
    <file input="/nfs/main2.jinr.ru/projects/nica/LAQGSM/*" output="/nfs/main2.jinr.ru/projects/nica/temp/evetest_${file_name}.root" start_event="0" count_event="1000"/>
    <run mode="global" count="0" config="~/mpdroot/build/config.sh" queue="cpu"/>
  </job>
  <job name="mpd_reconstruction" dependency="mpd_simulation">
    <macro name="~/mpdroot/macro/mpd/reco.C"/>
    <file job_input="mpd_simulation" output="/nfs/main2.jinr.ru/projects/nica/temp/bmndst_${file_name:~8}.root" start_event="0" count_event="1000"/>
  </job>
  <job name="mpd_analysis" dependency="mpd_reconstruction">
    <macro name="~/mpdroot/macro/physical_analysis/femto/femtoAna.C"/>
    <file job_input="mpd_reconstruction" output="/nfs/main2.jinr.ru/projects/nica/temp/femtoana_${file_name:~7}.root" start_event="0" count_event="1000"/>
    <run mode="global" count="0" config="~/mpdroot/build/config.sh" queue="cpu" priority="1"/>
  </job>
</jobs>
Processing chain (MPD-Scheduler XML description): event generators (320 LAQGSM input files, AuAu_5-11gev_*.root) → runMC.C (Geant3) → evetest*.root → reco.C → mpddst*.root → femtoAna.C
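The `dependency` attributes above define a chain: analysis waits for reconstruction, which waits for simulation. A valid execution order for such a chain can be derived with a topological sort, as sketched below with the job names from this slide. `run_order` is a hypothetical helper, and how MPD-Scheduler actually resolves dependencies is not described on the slides.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_order(dependencies):
    """Return job names ordered so that each job follows its dependency.

    dependencies maps a job name to the job it depends on, or None.
    """
    graph = {name: ({dep} if dep else set()) for name, dep in dependencies.items()}
    return list(TopologicalSorter(graph).static_order())


# Job names and dependencies as given in the XML description.
jobs = {
    "mpd_analysis": "mpd_reconstruction",
    "mpd_reconstruction": "mpd_simulation",
    "mpd_simulation": None,
}
print(run_order(jobs))
# ['mpd_simulation', 'mpd_reconstruction', 'mpd_analysis']
```

A cyclic dependency would make `static_order()` raise `graphlib.CycleError`, which is a convenient early check for a mistyped job name.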

21 MDC on HybriLIT cluster (run)
Event Generators → runMC.C → reco.C → femtoAna.C: Successful!
The data quality system is not ready yet, so the resulting data were checked manually.

22 MDC on HybriLIT cluster (time)

23 «Computing» section on mpd.jinr.ru

24 Conclusions
The MpdRoot environment was deployed on the HybriLIT cluster for MPD data processing: Fairsoft, ROOT/PROOF, MpdRoot, MPD-Scheduler...
Two methods were implemented to process event data of the MPD experiment in parallel: the PROOF system and MPD-Scheduler.
A batch system based on SLURM is used on the HybriLIT cluster to accelerate the processing of user tasks.
The MPD-Scheduler tool was developed to automate running MpdRoot macros concurrently.
The Mock Data Challenge used the simulation-analysis chain to test the software and computing infrastructure. All of the steps were successfully completed.
The site mpd.jinr.ru presents detailed information in the 'Computing' section.

25 Thank you for attention!
Thanks to the HybriLIT team for their support!
MPD site: mpd.jinr.ru
forum:

26 Additional slides

27 User multicore machine
Multithreading parallelization of time-consuming tasks in MpdRoot
TOF matching: Intel Core i GHz
TPC tracking: Intel Core i GHz

28 MPD data storage levels
Event Generator: determine particle properties at target vertex
Transport (Simulation): transport particles through the detector material → MC tracks & points (storage level: SIM)
Digitizer: determine detector response → digits (storage level: RAW)
Hit Finder: determine physical space point parameters from detector hits → hits
Reconstruction (track finding & fitting): determine momentum vector and PID for all tracks → tracks & vertex (storage level: DST)
Analysis: physics analysis

