Statistics of CAF Usage, Interaction with the GRID
Marco MEONI, CERN - Offline Week
Outline
- CAF usage and users' grouping
- Disk monitoring
- Datasets
- CPU fairshare monitoring
- User queries
- Conclusions & outlook
CERN Analysis Facility
- Cluster of 40 machines, in operation for two years
- 80 CPUs, 8 TB of disk pool
- 35 machines in the PRO partition, 5 in DEV
- The head node is the xrootd redirector and the PROOF master
- The other nodes are xrootd data servers and PROOF slaves
CAF Usage
- The resources available on CAF must be used fairly, with the highest attention to how disks and CPUs are used
- Users are grouped; at present the groups are sub-detectors and physics working groups
- Users can belong to several groups (PWG has precedence over sub-detector)
- Each group has a disk space (quota) which is used to stage datasets from AliEn, and a CPU fairshare target (priority) to regulate concurrent queries
CAF Groups
[Table: per-group number of users, disk quota (GB) and CPU quota (%) for the PWGs, the sub-detector groups (EMCAL, HMPID, ITS, T0, MUON, PHOS, TPC, TOF, ZDC) and the service groups (proofteam, testusers, marco, COMMON)]
- Quotas are not absolute
- 18 registered groups, ~60 users
- 165 users have used CAF: please register to groups!
Resource Monitoring
- MonALISA (ML) ApMon runs on each node and sends monitoring information every minute
- Default monitoring: load, CPU, memory, swap, disk I/O, network
- Additional information:
  - PROOF and disk server status (xrootd/olbd)
  - number of PROOF sessions (proofd master)
  - number of queued staging requests and hosted files (DS manager)
A minimal reporting sketch follows.
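For illustration only, a minimal C++ sketch of what one reporting cycle on a node could look like. SendMetric() is a hypothetical stand-in for the actual ApMon send call, and the cluster name, metric names and values are invented for the example, not the ones used in production.

    #include <cstdio>

    // Hypothetical helper: in the real setup this would hand the value to
    // ApMon/MonALISA; here it just prints it.
    void SendMetric(const char *cluster, const char *node,
                    const char *name, double value)
    {
       printf("[%s/%s] %s = %g\n", cluster, node, name, value);
    }

    // One once-per-minute reporting cycle for a node (values are placeholders;
    // in reality they would be read from the system, xrootd/olbd and proofd).
    void ReportOnce(const char *node)
    {
       double proofSessions = 4;   // number of active PROOF sessions
       double stagingQueue  = 12;  // queued staging requests (DS manager)
       double xrootdAlive   = 1;   // 1 if xrootd/olbd respond, 0 otherwise
       SendMetric("PROOF::CAF", node, "proof_sessions", proofSessions);
       SendMetric("PROOF::CAF", node, "staging_queue",  stagingQueue);
       SendMetric("PROOF::CAF", node, "xrootd_status",  xrootdAlive);
    }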
Status Table
Hosted files and Disk Usage
Files hosted per node:
lxb6047: 310, lxb6048: 309, lxb6049: 308, lxb6050: 308, lxb6051: 308, lxb6052: 309, lxb6053: 309, lxb6054: 0, lxb6055: 309, lxb6056: 311, lxb6057: 307, lxb6058: 308, lxb6059: 309, lxb6060: 310, lxb6061: 311, lxb6062: 309, lxb6063: 309, lxb6064: 307, lxb6065: 308, lxb6066: 1089, lxb6067: 309, lxb6068: 311, lxb6069: 309, lxb6070: 313, lxb6071: 311, lxb6072: 309, lxb6073: 312, lxb6074: 312, lxb6075: 310, lxb6076: 311, lxb6077: 309, lxb6078: 307, lxb6079: 312, lxb6080: -
#Raw files: 11k, #Sim files: 54k; Raw on disk: 154 GB, Sim on disk: 4.5 TB
[Plots: number of files and disk pool usage (KB), raw data vs. simulated data]
ESDs from the RAW data production are ready to be staged
Interaction with the GRID
- Datasets (DS) are used to stage files from AliEn
- A DS is a list of files (usually ESDs or archives) registered by users for processing with PROOF
- Different DSs may share the same physical files
- A staging script issues new staging requests and touches the files every 5 minutes
- Files are uniformly distributed across the nodes by the xrootd data manager
A minimal registration/processing sketch follows.
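The sketch below is illustrative only: it registers a dataset and processes one by name with the standard ROOT/PROOF calls. The master URL, the file URLs and the selector name are placeholders; the actual staging is triggered by the CAF staging script once the dataset is registered.

    #include "TProof.h"
    #include "TFileCollection.h"
    #include "TFileInfo.h"

    // Sketch: register a dataset on CAF and run a selector on a registered one.
    // Master URL, file URLs and selector name are placeholders.
    void RegisterAndProcess()
    {
       TProof *p = TProof::Open("alicecaf.cern.ch");   // connect to the PROOF master

       // Build the list of files (usually ESDs or archives taken from AliEn).
       TFileCollection *fc = new TFileCollection("myESDs", "Example ESD dataset");
       fc->Add(new TFileInfo("root://server//data/run82XX/001/AliESDs.root"));
       fc->Add(new TFileInfo("root://server//data/run82XX/002/AliESDs.root"));

       // Register it; the staging machinery then fetches the files from AliEn.
       p->RegisterDataSet("myESDs", fc);
       p->ShowDataSets();                              // list the registered datasets

       // Process an already staged dataset by name with a selector.
       p->Process("/PWG0/COMMON/run33000X_10TeV_0T", "MySelector.C+");
    }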
Dataset Manager
- The DS manager enforces the quotas at file level; the physical location of the files is handled by xrootd
- The DS manager daemon sends:
  - the overall number of files
  - the number of new, touched, disappeared and corrupted files
  - staging requests
  - disk utilization for each user and for each group
  - the number of files on each node and the total size
A small client-side bookkeeping sketch follows.
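As an illustration of the bookkeeping only (not the DS manager daemon itself), a small client-side sketch that tallies the number of files and the total size of the datasets visible under one group, using the standard TProof dataset calls; the group path /PWG2 is just an example.

    #include <cstdio>
    #include "TProof.h"
    #include "TFileCollection.h"
    #include "TMap.h"
    #include "TObjString.h"

    // Client-side sketch: sum files and disk usage of one group's datasets.
    void GroupDiskUsage(TProof *p, const char *group = "/PWG2")
    {
       TMap *dsmap = p->GetDataSets(group);   // datasets registered under the group
       if (!dsmap) return;
       Long64_t nfiles = 0, bytes = 0;
       TIter next(dsmap);
       while (TObjString *name = (TObjString *) next()) {
          TFileCollection *fc = (TFileCollection *) dsmap->GetValue(name);
          if (!fc) continue;
          nfiles += fc->GetNFiles();
          bytes  += fc->GetTotalSize();
       }
       printf("%s: %lld files, %.1f GB\n", group, nfiles,
              bytes / 1024. / 1024. / 1024.);
    }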
Dataset Monitoring
- PWG1 is using 0% of 1 TB
- PWG3 is using 5% of 1 TB
Datasets List
Dataset | # Files | Default tree | # Events | Size | Staged
/COMMON/COMMON/ESD5000_part | 1000 | /esdTree | | 50 GB | 100 %
/COMMON/COMMON/ESD5000_small | 100 | /esdTree | | 4 GB | 100 %
/COMMON/COMMON/run15034_PbPb | 967 | /esdTree | 939 | 500 GB | 97 %
/COMMON/COMMON/run15035_PbPb | 962 | /esdTree | 952 | 505 GB | 98 %
/COMMON/COMMON/run15036_PbPb | 961 | /esdTree | 957 | 505 GB | 99 %
/COMMON/COMMON/run82XX_part1 | | /esdTree | | 289 GB | 99 %
/COMMON/COMMON/run82XX_part2 | | /esdTree | | 289 GB | 92 %
/COMMON/COMMON/run82XX_part3 | | /esdTree | | 288 GB | 94 %
/COMMON/COMMON/sim_160000_esd | 95 | /esdTree | 9400 | 267 MB | 98 %
/PWG0/COMMON/run30000X_10TeV_0.5T | 2167 | /esdTree | | 90 GB | 100 %
/PWG0/COMMON/run31000X_0.9TeV_0.5T | 2162 | /esdTree | | 57 GB | 100 %
/PWG0/COMMON/run32000X_10TeV_0.5T_Phojet | 2191 | /esdTree | | 83 GB | 100 %
/PWG0/COMMON/run33000X_10TeV_0T | 2191 | /esdTree | | 108 GB | 100 %
/PWG0/COMMON/run34000X_0.9TeV_0T | 2175 | /esdTree | | 65 GB | 100 %
/PWG0/COMMON/run35000X_10TeV_0T_Phojet | 2190 | /esdTree | | 98 GB | 100 %
/PWG0/phristov/kPhojet_k5kG_10000 | 100 | /esdTree | 1100 | 4 GB | 11 %
/PWG0/phristov/kPhojet_k5kG_900 | 97 | /esdTree | 2000 | 4 GB | 20 %
/PWG0/phristov/kPythia6_k5kG_10000 | 99 | /esdTree | 1600 | 4 GB | 16 %
/PWG0/phristov/kPythia6_k5kG_900 | 99 | /esdTree | 1100 | 4 GB | 11 %
/PWG2/COMMON/run82XX_test4 | 10 | /esdTree | 1000 | 297 MB | 100 %
/PWG2/COMMON/run82XX_test5 | 10 | /esdTree | 1000 | 297 MB | 100 %
/PWG2/akisiel/LHC500C0005 | 100 | /esdTree | 97 | 663 MB | 100 %
/PWG2/akisiel/LHC500C2030 | 996 | /esdTree | 995 | 4 GB | 99 %
/PWG2/belikov/40825 | 1355 | /HLTesdTree | | 143 GB | 99 %
/PWG2/hricaud/LHC07f_160033DataSet | 915 | /esdTree | | 2 GB | 99 %
/PWG2/hricaud/LHC07f_160038_root_archiveDataSet | 862 | /esdTree | | 449 GB | 100 %
/PWG2/jgrosseo/sim_1600XX_esd | | /esdTree | | 103 GB | 98 %
/PWG2/mvala/PDC07_pp_0_9_82xx_1 | 99 | /rsnMVTree | | 1 GB | 100 %
/PWG2/mvala/RSNMV_PDC06_14TeV | 677 | /rsnMVTree | | 24 GB | 100 %
/PWG2/mvala/RSNMV_PDC07_09_part1 | 326 | /rsnMVTree | | 5 GB | 100 %
/PWG2/mvala/RSNMV_PDC07_09_part1_new | 326 | /rsnMVTree | | 5 GB | 100 %
/PWG2/pganoti/FirstPhys900Field_ | 1088 | /esdTree | | 28 GB | 100 %
/PWG3/arnaldi/PDC07_LHC07g_ | 615 | /HLTesdTree | | 787 MB | 94 %
/PWG3/arnaldi/PDC07_LHC07g_ | 594 | /HLTesdTree | | 744 MB | 95 %
/PWG3/arnaldi/PDC07_LHC07g_ | 366 | /HLTesdTree | | 513 MB | 99 %
/PWG3/arnaldi/PDC07_LHC07g_ | 251 | /HLTesdTree | | 333 MB | 100 %
/PWG3/arnaldi/PDC08_170167_001 | 1 | N/A | 33 MB | 0 %
/PWG3/arnaldi/PDC08_LHC08t_ | 976 | /HLTesdTree | | 4 GB | 99 %
/PWG3/arnaldi/PDC08_LHC08t_ | 990 | /HLTesdTree | | 4 GB | 100 %
/PWG3/arnaldi/PDC08_LHC08t_ | 975 | /HLTesdTree | | 8 GB | 87 %
/PWG3/arnaldi/myDataSet | 975 | /HLTesdTree | | 8 GB | 87 %
/PWG4/anju/myDataSet | 946 | /esdTree | | 27 GB | 99 %
/PWG4/arian/jetjet15-50 | 9817 | /esdTree | | 630 GB | 99 %
/PWG4/arian/jetjetAbove_50 | 94 | /esdTree | 8000 | 7 GB | 85 %
/PWG4/arian/jetjetAbove_50_real | 958 | /esdTree | | 73 GB | 94 %
/PWG4/elopez/jetjet15-50_28000x | 7732 | /esdTree | | 60 GB | 95 %
/PWG4/elopez/jetjet50_r27000x | 8411 | /esdTree | | 92 GB | 94 %
Jury produced pT spectrum plots staging his own DS (run #40825, TPC+ITS, field on)
Start staging common DSs of reconstructed runs?
~4.7 GB used out of 6 GB (34 * 200 MB - 10%)
CPU Fairshare
- Usages are retrieved every 5 minutes and averaged every 6 hours
- New priorities are computed by applying a correction formula in the range [α·quota .. β·quota], with α = 0.5 and β = 2
- Correction function: f(x) = q + q·exp(k·x), with k = (1/q)·ln(1/4), where x is the measured usage and q the group quota
[Plot: correction function vs. usage for q = 20%, priorityMin = 10%, priorityMax = 40%]
A sketch of the correction function follows.
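The sketch below implements the correction function as written on this slide; treating usage and quota as percentages and clamping the result to [α·quota, β·quota] are assumptions about how the formula is applied, not a description of the actual scheduler code.

    #include <cmath>
    #include <algorithm>

    // f(x) = q + q*exp(k*x) with k = (1/q)*ln(1/4); the result is clamped to
    // the stated range [alpha*q, beta*q] (alpha = 0.5, beta = 2 by default).
    double CorrectedPriority(double usage, double quota,
                             double alpha = 0.5, double beta = 2.0)
    {
       const double k = std::log(0.25) / quota;
       const double f = quota + quota * std::exp(k * usage);
       return std::min(beta * quota, std::max(alpha * quota, f));
    }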
Priority Monitoring
- Priorities are used for CPU fairshare and converge to the quotas
- Usages are averaged so that priorities converge gracefully to the quotas
- If there is no competition, users get the maximum number of CPUs
- Only relative priorities are modified!
CPU quotas in practice
- only PWGs + default groups
- default usually has the highest usage
Query Monitoring
- When a user query completes, the PROOF master sends its statistics:
  - bytes read
  - consumed CPU time (the basis for CPU fairshare)
  - number of processed events
  - user waiting time
- Values are aggregated per user and per group
A client-side sketch of these quantities follows.
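The same quantities can be inspected from the client side after a query finishes; the sketch below uses the standard TProof/TQueryResult accessors and assumes an already opened PROOF session. It only illustrates where the numbers come from; the actual reporting is done by the PROOF master.

    #include <cstdio>
    #include "TProof.h"
    #include "TQueryResult.h"
    #include "TList.h"

    // Print per-query statistics of the current session: processed events,
    // bytes read and CPU time consumed (the input to the fairshare accounting).
    void PrintQueryStats(TProof *p)
    {
       TList *queries = p->GetListOfQueries();
       if (!queries) return;
       TIter next(queries);
       while (TQueryResult *qr = (TQueryResult *) next()) {
          printf("%s: %lld events, %lld bytes read, %.1f s CPU\n",
                 qr->GetName(), qr->GetEntries(), qr->GetBytes(), qr->GetUsedCPU());
       }
    }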
Query Monitoring
[Plot: query statistics, accumulated per interval]
Outlook
- User session monitoring:
  - on average 4-7 sessions in parallel (daytime, EU time), with a peak of users during the tutorial sessions; running history is still missing
  - need to monitor the number of workers per user once load-based scheduling is introduced
- Additional monitoring:
  - per single query (disk used and files/sec not implemented yet)
  - network traffic correlation among nodes
  - xrootd activity with the new bulk staging requests
- Debugging:
  - a tool to monitor and kill a hanging session when Reset does not work (currently the cluster has to be restarted)
- Hardware:
  - the new ALICE Mac cluster is "ready" (16 workers)
  - new IT 8-core machines are coming
- Training:
  - PROOF/CAF is the key setup for interactive user analysis (and more)
  - the number of people attending the monthly tutorial is increasing (20 persons last week!)