Feedback to sites from the VO auger, Jiří Chudoba (Institute of Physics and CESNET)

Presentation transcript:

EGI-InSPIRE RI
Feedback to sites from the VO auger
Jiří Chudoba (Institute of Physics and CESNET)
with input from the Auger production team (J. Lozano Bahilo, G. Rubio, M. D. Serrano - UGR) and Jean-Noel Albert (LAL)

The Observatory
- The Pierre Auger Observatory (PAO) is an astroparticle project to measure ultra-high-energy cosmic rays
- See my talk on Friday for more details about the project

VO auger
- Mostly used for organized production of simulations of cosmic ray showers and detector response
- CORSIKA with different models (FORTRAN)
- Offline code: C++, with many packages included (GEANT4, ROOT)

Sites supporting VO auger
- [...] sites in 10 countries
- How shall we acknowledge the sites' contribution?

Some issues
- feedback from sites
- change of VOMS server certificate
- too many jobs in queue
- hanging lcg-cp
- SE occupancy, data movement
- slow LFC response
- efficiency evaluation

Feedback from sites
- Production, VO management, data management, and bulk transfers to SRB are done by a geographically distributed team
- Sites should preferably handle all issues via GGUS
- We may not know about some problems; sometimes we learn about them only from the sites we manage

Change of the VOMS certificate
- Change of the DN
- Sites must download the new certificate from the CIC portal and reconfigure services
- Broadcast message; shall we create a GGUS ticket for each site?
- We did not succeed with the right configuration on our own site at the first attempt
- Can production continue?
  - running jobs with a proxy signed by the “old” VOMS server
  - could the solution be to use two VOMS servers?

Too many waiting jobs
- Some sites reported too many (thousands of) waiting jobs in the auger queue
- The distribution is done by WMS servers; we do not send jobs directly to sites
- Wrong values in the BDII? Slow update? Bug in WMS?
- We decreased the submitted/running parameter
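The slide does not say how the submitted/running parameter is applied. As a rough sketch only, assuming it caps the ratio of all submitted jobs (running plus waiting) to running jobs at a site, a throttle could look like the following. The function name, its arguments, and the start-up rule for a site with no running jobs are all hypothetical.

```python
def jobs_to_submit(running: int, waiting: int, max_ratio: float, batch: int) -> int:
    """Return how many new jobs may be sent without exceeding max_ratio.

    Assumption: the ratio is (running + waiting) / running.
    """
    submitted = running + waiting
    if running == 0:
        # Hypothetical start-up rule: allow one batch only if the queue is empty.
        return batch if waiting == 0 else 0
    allowed_total = int(max_ratio * running)
    # Never submit a negative number of jobs, and never more than one batch.
    return max(0, min(batch, allowed_total - submitted))
```

Lowering `max_ratio` directly reduces how many jobs can pile up in the waiting state at a site, which is the effect described on the slide.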

Hanging jobs
- CORSIKA in an infinite loop
- Only a small fraction of jobs
- Difficult to debug: CPU is used, but there is no update of output files
- Fixed by CORSIKA developers
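The symptom described above (CPU in use, output files not updated) can be detected from file modification times. The following is a minimal sketch of such a check, not the fix applied by the CORSIKA developers; the function name and the idle threshold are hypothetical.

```python
import os
import time


def is_stalled(output_paths, max_idle_seconds, now=None):
    """Return True if no existing output file was modified recently."""
    now = time.time() if now is None else now
    mtimes = [os.path.getmtime(p) for p in output_paths if os.path.exists(p)]
    if not mtimes:
        # Nothing written yet: leave that case to a separate start-up timeout.
        return False
    # Stalled when even the most recently touched file is older than the limit.
    return now - max(mtimes) > max_idle_seconds
```

A job wrapper could call this periodically and kill the simulation when it returns True, instead of waiting for the walltime limit.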

Hanging jobs II
- lcg-cp is used to download software if it is not locally available
- It hung in some cases
- Very “expensive” error: the job slot is blocked until the job is killed at the walltime limit
- GGUS #90936
- Jiri Horky debugged it, Michail Salichos provided a patch
- A lot of work; it took more than 2 months
- Should be fixed in the next release
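Independently of the patch mentioned above, a job script can protect itself against a hanging download by enforcing its own time limit on the external command. The sketch below uses only the Python standard library; the helper name is hypothetical, and the command is passed in by the caller, so no particular lcg-cp options are assumed.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run command; return True on success, False on failure or timeout."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        result = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

With such a wrapper, a stuck copy costs at most `timeout_seconds` of the job slot rather than the full walltime limit, and the job can retry or exit with a clear error.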

SE Occupancy
- Production stores results on available SEs (some sites excluded)
- Can fill all available space
- Space tokens should be used to set quotas: AUGERPROD, with write access limited to the production role
- We are unable to respond quickly to requests to move TBs of data from a site
- There is not enough space on other sites

Data transfers to SRB
- Decommissioning of an SE with many auger files
- FTS transfers from Lille to Lyon
  - 2 months, 1.9 M files, 38.7 TB
  - less than 1% of lost files
  - [...] operations/day, 1300 ops/hour
  - 650 GB/day, 27 GB/hour, 8 MB/s
- FTS transfers from Bordeaux to Lyon
  - 1 month, 700 K files, 7.1 TB
  - [...].6% of lost files
  - [...] operations/day, 500 ops/hour
  - 160 GB/day, 7 GB/hour, 2 MB/s
- Many more small files in Bordeaux
- Large files stored to tapes in Lyon
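The Lille to Lyon rates can be roughly reproduced from the totals. The sketch below assumes that "2 months" means 60 days of continuous transfer and that 1 TB = 1000 GB; both are assumptions, and the function name is hypothetical. Only the Lille figures are checked here.

```python
def transfer_rates(total_tb, total_files, days):
    """Derive average rates from transfer totals (decimal units assumed)."""
    gb_per_day = total_tb * 1000 / days
    return {
        "gb_per_day": gb_per_day,
        "gb_per_hour": gb_per_day / 24,
        "mb_per_s": gb_per_day * 1000 / 86400,
        "files_per_hour": total_files / days / 24,
    }


# Lille to Lyon: 38.7 TB and 1.9 M files, assuming 60 days.
lille = transfer_rates(38.7, 1.9e6, 60)
```

Under these assumptions the averages come out close to the rounded values quoted on the slide (about 650 GB/day and about 1300 operations per hour).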

Effectiveness evaluation
- Efficiency: cputime/walltime
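As an illustration of the definition above, efficiency over a set of jobs can be computed from summed CPU time and summed walltime. This is a sketch with made-up numbers; whether the accounting portal aggregates per job or per site is not stated on the slide, and the function name is hypothetical.

```python
def cpu_efficiency(jobs):
    """jobs: iterable of (cputime, walltime) pairs in the same time unit."""
    jobs = list(jobs)
    total_cpu = sum(cpu for cpu, _ in jobs)
    total_wall = sum(wall for _, wall in jobs)
    if total_wall == 0:
        raise ValueError("total walltime is zero")
    return total_cpu / total_wall


# Illustrative values only: two jobs, times in seconds.
example = cpu_efficiency([(3400, 3600), (1700, 3600)])
```

Summing before dividing weights long jobs more heavily than averaging per-job ratios would, which matters when a few long jobs hang.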

Top ten VOs efficiency
- Efficiency of the biggest VOs for [...] to [...]

VO auger efficiency
- From [...] to [...]: efficiency improves

Effectiveness evaluation
- Effectiveness = (cputime of jobs with good output) / (total walltime)
- Difficult to estimate
  - No information about cancelled or lost jobs
  - Some jobs without a job log file stored correct results
- Production maximizes throughput
  - Each job processes 1 shower 5 times
  - Jobs are resent if there are not enough (<3) output files
- A more detailed view from the accounting portal could help
- Just one of many possible definitions
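The effectiveness definition and the resubmission rule above can be combined in a small sketch. It assumes that "good output" means at least 3 output files, mirroring the resend threshold; the slide does not state this equivalence, and the record type and field names are hypothetical.

```python
from dataclasses import dataclass

MIN_OUTPUT_FILES = 3  # from the slide: jobs are resent if fewer than 3 output files


@dataclass
class JobRecord:
    cputime: float
    walltime: float
    output_files: int  # out of 5 processings of the same shower


def needs_resubmission(job: JobRecord) -> bool:
    return job.output_files < MIN_OUTPUT_FILES


def effectiveness(jobs) -> float:
    """CPU time of jobs with good output divided by total walltime."""
    jobs = list(jobs)
    total_wall = sum(j.walltime for j in jobs)
    if total_wall == 0:
        raise ValueError("total walltime is zero")
    # Assumption: a job has good output when it does not need resubmission.
    good_cpu = sum(j.cputime for j in jobs if not needs_resubmission(j))
    return good_cpu / total_wall
```

Cancelled or lost jobs never appear in such a record list, so this estimate is optimistic, which is the difficulty the slide points out.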

Instead of conclusions
- We thank all sites supporting the VO auger for their hardware resources and manpower support