Development of the distributed monitoring system for the NICA cluster
Ivan Slepov (LHEP, JINR)
Mathematical Modeling and Computational Physics, Dubna, Russia, July 8, 2013

The MultiPurpose Detector (MPD) to study heavy-ion collisions at NICA

Software for the MultiPurpose Detector: MpdRoot = ROOT + FairRoot (FairBase + FairSoft software packages). Framework components: detector simulation, data reconstruction, event analysis.

Computing resources for MPD data processing: CPU: 128 XEON cores (planned to grow to ~ XEON cores in the future); GPU: ~1500 TESLA cores.

Motivation to develop the monitoring system
- Computing resources information (free space, memory, CPU, etc.)
- System load (load average, processes)
- MPD software information (FairSoft version)
- Cluster software information (SGE, xrootd, PROOF)
- User task monitoring (batch processing and interactive jobs)
MPD users need more information about all of their own cluster nodes and public computers! A sketch of collecting such per-node metrics is shown below.
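As an illustration, not part of the original slides: a minimal Bash sketch of how such per-node metrics could be gathered. The data mount point, the service process names, and the FAIRSOFT_VERSION variable are assumptions.

```bash
#!/bin/bash
# collect_node_metrics.sh - print basic node metrics as key=value pairs
# (hypothetical sketch; adjust mount points and service names to the cluster)

HOST=$(hostname -s)

# Free space (GB) on the data partition, assumed to be mounted at /data
FREE_GB=$(df -P --block-size=G /data | awk 'NR==2 {gsub("G","",$4); print $4}')

# Memory usage in MB, reported as used/total
MEM=$(free -m | awk '/^Mem:/ {print $3"/"$2}')

# Load averages (1, 5, 15 min) and running/total processes
read -r LOAD1 LOAD5 LOAD15 PROCS _ < /proc/loadavg

# Cluster services: number of matching processes (0 means the service is down)
SGE_UP=$(pgrep -c sge_execd)
XROOTD_UP=$(pgrep -c xrootd)
PROOF_UP=$(pgrep -c xproofd)

# MPD software version, assumed to be exported by the site setup script
FAIRSOFT=${FAIRSOFT_VERSION:-unknown}

echo "host=$HOST free_gb=$FREE_GB mem_mb=$MEM" \
     "load=$LOAD1,$LOAD5,$LOAD15 procs=$PROCS" \
     "sge=$SGE_UP xrootd=$XROOTD_UP proof=$PROOF_UP fairsoft=$FAIRSOFT"
```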

Monitoring system schemes
Scheme 1 (collecting general information): cron-launched jobs run BASH scripts through the DSH software on the cluster nodes; the collected data are stored in a MySQL database and presented by PHP scripts through the web interface.
Scheme 2 (collecting information about user tasks and providing data management): the web interface and its PHP scripts invoke BASH scripts through the DSH software, and the results are stored in a MySQL database.
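A minimal sketch of how Scheme 1 could be wired together, not taken from the talk: the hostnames, database and table names, and script paths are hypothetical, the MySQL credentials are assumed to come from ~/.my.cnf, and the dsh options assume the Dancer's shell package.

```bash
#!/bin/bash
# collect_cluster_info.sh - Scheme 1 sketch: run the per-node collector
# via DSH on every node and store one row per node in MySQL.

DB_HOST=mon-db.example.org   # hypothetical database host
DB_NAME=mpd_monitoring       # hypothetical database name
DB_TABLE=node_status         # hypothetical table name

# Run the collector on all machines from the DSH machines list (-a),
# concurrently (-c), prefixing every output line with the machine name (-M).
dsh -a -c -M -- /opt/monitoring/collect_node_metrics.sh |
while IFS= read -r line; do
    node=${line%%:*}       # machine name added by dsh -M
    metrics=${line#*: }    # everything after "machine: "
    mysql -h "$DB_HOST" "$DB_NAME" \
      -e "INSERT INTO $DB_TABLE (node, metrics, collected_at)
          VALUES ('$node', '$metrics', NOW());"
done

# Intended to be launched by cron, e.g. every 5 minutes:
# */5 * * * * /opt/monitoring/collect_cluster_info.sh
```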

Web interface of the monitoring system
1. MPD software information
2. Computing resources information
3. System load
4. User task monitoring

Monitoring system web interface: user tasks
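As a further illustration, not from the slides: one way the batch part of the user-task view could be filled, assuming the cluster runs SGE and reusing the hypothetical database conventions from the sketch above.

```bash
#!/bin/bash
# collect_user_tasks.sh - sketch: dump the SGE job list into MySQL
# (hypothetical database/table names; credentials from ~/.my.cnf).

DB_HOST=mon-db.example.org
DB_NAME=mpd_monitoring

# List jobs of all users and skip the two header lines of qstat output.
# Column positions correspond to running jobs; pending jobs may differ.
qstat -u '*' | awk 'NR > 2 {print $1, $4, $5, $8}' |
while read -r job_id user state queue; do
    mysql -h "$DB_HOST" "$DB_NAME" \
      -e "REPLACE INTO user_tasks (job_id, user, state, queue, seen_at)
          VALUES ('$job_id', '$user', '$state', '$queue', NOW());"
done
```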

Monitoring system web interface: interactive nodes
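For the interactive nodes, a similarly hedged sketch: active interactive sessions could be listed per node with standard tools, again through DSH (the node names are hypothetical).

```bash
#!/bin/bash
# Sketch: show who is logged in on the interactive nodes (hypothetical names).
# 'w -h' prints one line per session without the header; -M prefixes the node.
dsh -M -c -m mpd-ui01 -m mpd-ui02 -- w -h
```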

Access to the monitoring system is available on the website mpd.jinr.ru

Thank you for your attention!

Motivation to develop the monitoring system: MPD users need more information about all of their own cluster nodes and public computers. Why, if the grid concept, for example, provides a layer of abstraction from the resources? Because the MPD software is still under development and needs testing and debugging.