Experience of PROOF cluster installation and operation

15th International Workshop on Advanced Computing and Analysis Techniques in Physics

Experience of PROOF cluster installation and operation at the Joint Institute for Nuclear Research for ALICE data analysis
G. Shabratova, R. Semenov, G. Stiforov, M. Vala, L. Valova (Joint Institute for Nuclear Research, Dubna, Russia)

What is PROOF?
PROOF stands for Parallel ROOT Facility.
- It allows parallel processing of large amounts of data, and the output can be visualized directly (e.g. an output histogram can be drawn at the end of the PROOF session).
- PROOF is NOT a batch system.
- The data you process with PROOF can reside on your computer, on the PROOF cluster disks or on the Grid.
- The usage of PROOF is transparent: you do not have to rewrite the code you run locally on your computer (a minimal usage sketch is given below, after the list of problems and solutions).
- No special installation of PROOF software is needed to execute your code: PROOF is included in the ROOT distribution.

JRAF: the PROOF cluster at JINR (Joint Institute for Nuclear Research)
The cluster consists of an 8-core master PC with 32 GB of RAM and four slave PCs providing 48 cores in total. The disk space for data is 14.13 TB.

Problems found during installation and testing of the PROOF cluster, and the solutions adopted:
- Problem: the PROOF session crashed on SLC versions higher than 5.5.
  Solution: the PCs were migrated back to SLC 5.5.
- Problem: kPROOF_FATAL error with ROOT v5-34-02, crash in TStatus::Add(char const*).
  Solution: the bug has been fixed in VO_ALICE@ROOT::v5-34-02-1.
- Problem: "*** Break ***: segmentation violation" when closing the PROOF session.
  Solution: corrected in VO_ALICE@ROOT::v5-34-05.
- Problem: data staging failed because the AliEn token initialization failed (an obsolete AliEn server name was used).
  Solution: the AliEn server name was corrected to the current one on the master and slave nodes.
- Problem: wrong JRAF disk capacity reported on the ALICE MonALISA page (only 7.049 TB instead of 14.13 TB).
  Solution: the AAF_AFDSMGR_MAX_TRANSFERS parameter was changed from 10 to 40 in the afdsmgrd configuration.
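To make the transparent-usage point above concrete, here is a minimal sketch of a PROOF session on JRAF. It is not taken from the slides: MySelector.cxx is a hypothetical user selector, while the master URL and dataset name are the ones appearing in the session log further below.

// runSimple.C - minimal usage sketch (assumptions noted above).
void runSimple()
{
   // Open a session on the JRAF master; the workers are started automatically.
   TProof *p = TProof::Open("jraf.jinr.ru");

   // Compile the selector, ship it to the workers and run it in parallel
   // over a dataset registered on the cluster.
   p->Process("/default/gshabrat/LHC11h_2_000170267_AllAODs", "MySelector.cxx+");

   // The output objects (e.g. histograms) are merged on the master and can
   // be inspected or drawn directly at the end of the session.
   p->GetOutputList()->Print();
}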
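The session reproduced below drives the analysis through a steering macro, runJRAF.C. The actual macro is not part of the transcript; the following is a hypothetical sketch with the same argument list as the call in the log. The AliAnalysisManager setup, the task creation and the handling of the useMC flag are assumptions; only the package name, the loaded task macro and the dataset come from the log.

// runJRAF.C - hypothetical sketch of the steering macro (see assumptions above).
void runJRAF(const char *master  = "jraf.jinr.ru",
             const char *dataset = "/default/gshabrat/LHC11h_2_000170268_AllAODs",
             Bool_t useMC        = kFALSE,   // kept only to mirror the call signature
             const char *aliroot = "VO_ALICE@AliRoot::v5-04-22-AN",
             Long64_t nEvents    = 10000,
             Long64_t firstEvent = 0)
{
   // Connect to the PROOF master and enable the requested AliRoot version
   // on all workers (AAF-style package name).
   TProof::Open(master);
   gProof->EnablePackage(aliroot);

   // Minimal analysis train reading AODs.
   AliAnalysisManager *mgr = new AliAnalysisManager("CAF test train");
   mgr->SetInputEventHandler(new AliAODInputHandler());

   // Compile and load the user task on the master and the workers
   // (this is what produces the "loading macro" messages in the log).
   gProof->Load("AliAnalysisTaskCustom.cxx+g");
   // ... create AliAnalysisTaskCustom here and connect its input/output
   //     containers (omitted in this sketch) ...

   // Run on the PROOF dataset.
   if (mgr->InitAnalysis()) {
      mgr->PrintStatus();
      mgr->StartAnalysis("proof", dataset, nEvents, firstEvent);
   }
}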
PROOF session at JRAF:

[gshabrat@alicepc19 aaf-tutorial-task]$ root.exe
  *******************************************
  *                                         *
  *        W E L C O M E  to  R O O T       *
  *                                         *
  *   Version   5.34/02  21 September 2012  *
  *                                         *
  *  You are welcome to visit our Web site  *
  *          http://root.cern.ch            *
  *                                         *
  *******************************************

ROOT 5.34/02 (tags/v5-34-02@46115, Oct 29 2012, 11:30:06 on linuxx8664gcc)
CINT/ROOT C/C++ Interpreter version 5.18.00, July 2, 2010
Type ? for help. Commands must be C++ statements.
Enclose multiple statements between { }.

root [0] TProof::Open("jraf.jinr.ru");
Starting master: opening connection ...
Starting master: OK
Opening connections to workers: OK (48 workers)
Setting up worker servers: OK (48 workers)
PROOF set to parallel mode (48 workers)

root [1] gProof->ShowDataSets()
Dataset repository: /pool/PROOF-AAF/proof//dataset
Dataset URI                                  | # Files | Default tree | # Events  | Disk  | Staged
/default/gshabrat/LHC11h_2_000170267_AllAODs |     161 | /aodTree     | 1.575e+05 | 66 GB | 84 %
/default/gshabrat/LHC11h_2_000170268_AllAODs |    2399 | /aodTree     | 2.432e+06 |  2 TB | 97 %

root [2] .x runJRAF.C("jraf.jinr.ru","/default/gshabrat/LHC11h_2_000170268_AllAODs",kFALSE,"VO_ALICE@AliRoot::v5-04-22-AN",10000,0)
07:50:44 13603 Wrk-0.39 | Info in <TXProofServ::HandleCache>: loading macro AliAnalysisTaskCustom.cxx+g ...
07:50:44  5069 Wrk-0.44 | Info in <TXProofServ::HandleCache>: loading macro AliAnalysisTaskCustom.cxx+g ...

===== RUNNING PROOF ANALYSIS CAF test train ON DATASET /default/gshabrat/LHC11h_2_000170268_AllAODs
Looking up for exact location of files: OK (2341 files)
Validating files: OK (2341 files)
+++
+++ About 2.42 % of the requested files (58 out of 2399) are missing or unusable; details in the 'MissingFiles' list
Mst-0: Number of mergers set dynamically to 7 (for 48 workers)
worker 0.34 on host lxprf04.jinr.ru will be merger for 6 additional workers
worker 0.35 on host lxprf05.jinr.ru will be merger for 6 additional workers
worker 0.15 on host lxprf05.jinr.ru will be merger for 6 additional workers
worker 0.7 on host lxprf05.jinr.ru will be merger for 5 additional workers
worker 0.44 on host lxprf02.jinr.ru will be merger for 6 additional workers
worker 0.17 on host lxprf03.jinr.ru will be merger for 6 additional workers
worker 0.5 on host lxprf03.jinr.ru will be merger for 6 additional workers
Mst-0: merging output objects ... done
Mst-0: grand total: sent 12 objects, size: 40686 bytes

Future development and application of PROOF technology