Development of the distributed monitoring system for the NICA cluster
Ivan Slepov (LHEP, JINR)
Mathematical Modeling and Computational Physics, Dubna, Russia, July 8, 2013
The MultiPurpose Detector (MPD) for studying heavy-ion collisions at NICA
Software for the MultiPurpose Detector: MpdRoot
MpdRoot is built on ROOT + FairRoot (FairBase + FairSoft software packages).
Framework components:
- Detector simulation
- Data reconstruction
- Event analysis
Computing resources for MPD data processing
- CPU: 128 XEON cores => in future ~ XEON cores
- GPU: ~1500 TESLA cores
Motivation for developing the monitoring system
- Computing resources information (free space, memory, CPU, etc.)
- System load (load average, processes)
- MPD software information (FairSoft version)
- Cluster software information (SGE, xrootd, PROOF)
- User task monitoring (batch processing and interactive jobs)
MPD users need more information about all of their own cluster nodes and public computers!
Monitoring system schemes
Scheme 1 – for collecting general information. Components: cron job, DSH software, Bash scripts, MySQL DB, PHP scripts, web interface.
Scheme 2 – for collecting information about user tasks and providing data management. Components: web interface, PHP scripts, DSH software, Bash scripts, MySQL DB.
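The collect-and-store idea behind Scheme 1 can be illustrated with a minimal sketch. It is not the system's actual code: the slides name Bash scripts, DSH and MySQL, whereas this example uses Python and an SQLite database as a stand-in so that it runs without a server. The table name `node_metrics`, its columns, and the function names are hypothetical. A periodic job (cron in the slides) would call the collector on each node, and the web layer would read the latest row per host.

```python
import os
import shutil
import socket
import sqlite3
import time


def collect_node_metrics(path="/"):
    """Gather a few of the quantities listed on the motivation slide for this node."""
    usage = shutil.disk_usage(path)  # total, used, free in bytes
    try:
        load1 = os.getloadavg()[0]   # 1-minute load average (Unix only)
    except (AttributeError, OSError):
        load1 = None                 # not available on this platform
    return {
        "host": socket.gethostname(),
        "ts": int(time.time()),
        "disk_free_bytes": usage.free,
        "cpu_count": os.cpu_count(),
        "load1": load1,
    }


def store_metrics(conn, row):
    """Append one measurement; SQLite stands in for the MySQL DB here (assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node_metrics ("
        "host TEXT, ts INTEGER, disk_free_bytes INTEGER, "
        "cpu_count INTEGER, load1 REAL)"
    )
    conn.execute(
        "INSERT INTO node_metrics "
        "VALUES (:host, :ts, :disk_free_bytes, :cpu_count, :load1)",
        row,
    )
    conn.commit()


def latest_per_host(conn):
    """Return the most recent measurement for every host, sorted by host name."""
    return conn.execute(
        "SELECT m.host, m.ts, m.disk_free_bytes, m.cpu_count, m.load1 "
        "FROM node_metrics m "
        "JOIN (SELECT host, MAX(ts) AS ts FROM node_metrics GROUP BY host) last "
        "ON m.host = last.host AND m.ts = last.ts "
        "ORDER BY m.host"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_metrics(conn, collect_node_metrics())
    for record in latest_per_host(conn):
        print(record)
```

Keeping collection and presentation separate, with the database in between, means the web page never has to contact the nodes directly. That appears to be the point of Scheme 1.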
Web interface for the monitoring system
1. MPD software information
2. Computing resources information
3. System load
4. User task monitoring
Monitoring system web interface: user tasks
Monitoring system web interface: interactive nodes
The monitoring system is accessible on the website mpd.jinr.ru
Thank you for your attention!
Motivation for developing the monitoring system
MPD users need more information about all of their own cluster nodes and public computers! Why, if, for example, the grid concept uses a layer of abstraction over the resources? Because the MPD software is still under development and needs testing and debugging.