Published by Blaise Richard. Modified over 9 years ago.
1
Scheduling system for distributed MPD data processing
Gertsenberger K. V.
Joint Institute for Nuclear Research, Dubna
2
NICA scheme
3
Multipurpose Detector (MPD)
The MPDRoot software is developed for MPD event simulation, reconstruction of experimental or simulated data, and the subsequent physics analysis of heavy-ion collisions registered by the MultiPurpose Detector at the NICA collider.
4
Development of the NICA cluster
Two main directions of development:
data storage for the experiment
organization of parallel processing of the MPD events
Goal: development and expansion of a distributed cluster for the MPD experiment based on the LHEP farm.
5
Current NICA cluster in LHEP
6
Data storage on the NICA cluster
Distributed file system GlusterFS:
it aggregates existing file systems into a common distributed file system
automatic replication works as a background process
a background self-checking service restores corrupted files in case of hardware or software failure
7
Parallel MPD data processing
PROOF server: parallel data processing in ROOT macros on parallel architectures (concurrent data processing)
MPD-scheduler: scheduling system that distributes tasks to parallelize data processing across the cluster nodes
8
MPD-scheduler
Developed in C++ with support for ROOT classes. SVN: mpdroot/macro/mpd_scheduler
Uses the Sun Grid Engine (SGE) scheduling system (qsub command) for execution in cluster mode.
SGE combines cluster machines at the LHEP farm (nc10, nc11 and nc13) into a pool of worker nodes with 34 logical processors.
Jobs for distributed execution on the NICA cluster are described in an XML file and passed to MPD-scheduler:
$ mpd-scheduler my_job.xml
9
Job description. Tag.
The description starts and ends with a tag.
A tag sets information about the macro executed by MPDRoot:
name – file path of the ROOT macro to execute, required
start_event – number of the first event to process for all input files, optional
count_event – number of events to process for all input files, optional
add_args – additional arguments of the ROOT macro, if required
10
Job description. Tag.
The tag defines the files to be processed by the macro above:
input – input file path
output – result file path
start_event – number of the first event in the input file, optional
count_event – number of events to process in the input file, optional
parallel_mode – number of processors used to process the events of the input file in parallel, optional
merge – whether to merge the partial result files in parallel_mode, default: "true"
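The effect of parallel_mode on one input file can be pictured as cutting the requested event range into contiguous chunks, one per processor. The sketch below is only an illustration in Python: the function name `split_events` and the even-split strategy are assumptions, not the algorithm MPD-scheduler actually uses, and it assumes the number of events is known explicitly.

```python
def split_events(start_event, count_event, parts):
    """Split `count_event` events, beginning at `start_event`, into at most
    `parts` contiguous (first_event, event_count) chunks of near-equal size.

    Hypothetical helper: illustrates one possible split, not MPD-scheduler's own.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if count_event < 0:
        raise ValueError("count_event must be non-negative")

    base, extra = divmod(count_event, parts)
    chunks = []
    first = start_event
    for i in range(parts):
        # The first `extra` chunks take one additional event each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # fewer events than processors: skip empty chunks
        chunks.append((first, size))
        first += size
    return chunks


# 1000 events on 5 processors -> five chunks of 200 events each
print(split_events(0, 1000, 5))
```

With merge set to "true", the partial outputs produced for such chunks would then be combined into the single result file named by output.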
11
Processing event files from MPD simulation database.
… …
db_input – string defining a list of files from the MPD simulation database: mpd.jinr.ru is the network address of the server hosting the simulation database, followed by selection parameters: range of the collision energy, type of the particle generator, colliding particles, description and others.
Special variables available in the "output" argument:
${counter} = file counter, starting at 1 and incremented by 1
${input} = input file path
${file_name} = name of the input file without extension
${file_name_with_ext} = name of the input file with extension
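How the special variables of "output" might expand for a list of input files can be sketched in Python with the standard `string.Template` class, which uses the same `${name}` syntax. The helper `expand_output`, the pattern and the `/data` and `/out` paths are hypothetical and serve only as an illustration; this is not MPD-scheduler code.

```python
import posixpath
from string import Template


def expand_output(pattern, input_path, counter):
    """Expand the special variables in an output pattern for one input file.

    Hypothetical helper; variable names follow the list above.
    """
    name_with_ext = posixpath.basename(input_path)
    name = posixpath.splitext(name_with_ext)[0]
    return Template(pattern).substitute(
        counter=counter,                   # ${counter}
        input=input_path,                  # ${input}
        file_name=name,                    # ${file_name}
        file_name_with_ext=name_with_ext,  # ${file_name_with_ext}
    )


# Hypothetical input list; the counter starts at 1 and grows by 1 per file.
inputs = ["/data/evetest1.root", "/data/evetest2.root"]
outputs = [
    expand_output("/out/${file_name}_dst_${counter}.root", path, i)
    for i, path in enumerate(inputs, start=1)
]
print(outputs)
```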
12
Job description. Tag.
The tag describes run parameters and the resources allocated for the job:
mode – execution mode: 'global' – distributed processing on the NICA cluster, 'local' – multithreaded execution on a multicore computer
count – maximum number of processors allocated for this job
config – path of a bash file with environment variables (including ROOT environment variables) that is executed before the macro
logs – log file path for multithreaded mode
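The role of count as an upper bound on concurrently running tasks in 'local' mode can be sketched with a fixed-size worker pool. This is a Python illustration under stated assumptions: `run_local` is a hypothetical name, the tasks are plain callables standing in for macro runs, and MPD-scheduler itself is written in C++ and may work differently.

```python
from concurrent.futures import ThreadPoolExecutor


def run_local(tasks, count):
    """Run `tasks` (zero-argument callables) with at most `count` of them
    executing at once, and return their results in submission order.

    Hypothetical sketch of a processor cap, not MPD-scheduler's implementation.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() blocks until each task finishes, preserving task order.
        return [future.result() for future in futures]


# Six stand-in tasks processed by at most two workers at a time.
tasks = [lambda i=i: i * i for i in range(6)]
print(run_local(tasks, 2))
```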
13
Job description. Non-ROOT command.
Running a non-ROOT command on the NICA cluster
A tag with the argument line is used to run a non-ROOT command.
14
Local use
MPD-scheduler can be used to parallelize event processing on a user's multicore machine in local mode.
<file input="~/mpdroot/macro/mpd/evetest1.root" output="~/mpdroot/macro/mpd/mpddst1.root" start_event="0" count_event="0"/>
<file input="~/mpdroot/macro/mpd/evetest2.root" output="~/mpdroot/macro/mpd/mpddst2.root" start_event="0" count_event="1000" parallel_mode="5" merge="true"/>
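The two file entries above can be read with any XML parser. The following Python sketch uses the standard `xml.etree.ElementTree` module. The wrapping `<files>` element exists only so the fragment parses on its own and is an assumption, as are the function name `read_files` and the fallback values for start_event, count_event and parallel_mode; only the "true" default of merge comes from the slides.

```python
import xml.etree.ElementTree as ET

# The two entries from the slide, wrapped in a hypothetical <files> root.
SNIPPET = """
<files>
  <file input="~/mpdroot/macro/mpd/evetest1.root"
        output="~/mpdroot/macro/mpd/mpddst1.root"
        start_event="0" count_event="0"/>
  <file input="~/mpdroot/macro/mpd/evetest2.root"
        output="~/mpdroot/macro/mpd/mpddst2.root"
        start_event="0" count_event="1000" parallel_mode="5" merge="true"/>
</files>
"""


def read_files(xml_text):
    """Return one dict per <file> element with typed attribute values."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for node in root.iter("file"):
        entries.append({
            "input": node.get("input"),
            "output": node.get("output"),
            # Fallbacks below are assumptions for attributes left out.
            "start_event": int(node.get("start_event", "0")),
            "count_event": int(node.get("count_event", "0")),
            "parallel_mode": int(node.get("parallel_mode", "1")),
            # merge defaults to "true" according to the slides.
            "merge": node.get("merge", "true") == "true",
        })
    return entries


for entry in read_files(SNIPPET):
    print(entry["input"], "->", entry["output"], "processors:", entry["parallel_mode"])
```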
15
MPD-scheduler on the NICA cluster
[Diagram. Labels: MPD-scheduler; qsub; SGE batch system; SGE = Sun Grid Engine server; SGE = Sun Grid Engine worker; (10); (14); free; GlusterFS (*.root); job files job_reco.xml and job_command.xml; input files evetest1.root, evetest2.root, evetest3.root; output files mpddst1.root, mpddst2.root, mpddst3.root.]
16
Speedup of one reconstruction on the NICA cluster
17
The description of the scheduling system on mpd.jinr.ru
18
Conclusions
The distributed NICA cluster (128 cores) was deployed on the basis of the LHEP farm for the NICA/MPD experiment (Fairsoft, ROOT/PROOF, MPDRoot, Gluster, Sun Grid Engine).
The data storage (10 TB) was organized with the GlusterFS distributed file system: /nica/mpd[1-8].
MPD-scheduler, the system for distributed job execution, was developed to run MPDRoot macros concurrently on the cluster. It is based on the Sun Grid Engine scheduling system.
The manual for the developed MPD scheduling system is available on the web site mpd.jinr.ru in the section Computing – NICA cluster – Batch processing.