Step09 at IFIC
Javier Sánchez, Alejandro Lamas, Santiago González de la Hoz
IFIC – Institut de Física Corpuscular (centro mixto CSIC – Universitat de València)
VII Reunión Presencial del Tier-2 ATLAS España (Spanish ATLAS Tier2 meeting)

Step09 activities

Dates: setup during May, running during June (N.B. 1 June is a holiday at CERN).

Tier-2 activities: Tier-2s will participate in the MC simulation and user analysis components of the challenge. User analysis in the EGEE clouds will happen both through WMS job submission and through pilot jobs; sites supporting PanDA analysis need to have analysis queues (named ANALY_MYSITE). To facilitate this, Tier-2s should configure their batch systems to accept the /atlas/Role=pilot VOMS role and set up the following shares:

Activity                    VOMS Role               ATLAS batch system share
Production                  /atlas/Role=production  50%
Analysis (pilot based)      /atlas/Role=pilot       25%
Analysis (WMS submission)   /atlas                  25%

Data distribution: the total data volume to distribute to Tier-2s is 112 TB during the two-week test. It will be distributed through the "functional test" framework according to the cloud shares (50% IFIC, 25% IFAE, 25% UAM); a small sketch of the implied per-site volumes follows below.

Tier-1: Tier-1s should ensure they define file families for this project path in their tape system so that tapes can be effectively recycled at the end of the exercise. (N.B. this project will produce data in both the ATLASDATATAPE and ATLASMCTAPE areas.)
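For illustration only (a minimal sketch, not part of the original slides; the only inputs are the total volume and cloud shares quoted above):

```python
# Sketch: expected STEP09 data volume per Spanish Tier-2 site,
# given the 112 TB total and the 50/25/25 cloud shares quoted above.

TOTAL_TB = 112  # total volume distributed to Tier-2s over the two-week test

cloud_shares = {"IFIC": 0.50, "IFAE": 0.25, "UAM": 0.25}

for site, share in cloud_shares.items():
    # IFIC: 56 TB, IFAE: 28 TB, UAM: 28 TB
    print(f"{site}: {share * TOTAL_TB:.0f} TB expected")
```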

Step09 feedback

Main information requested from sites providing feedback, to help with the evaluation of STEP09. Sites should of course feel free to be creative and report anything they think is interesting or useful.

Tier-2 data distribution (see _and_thanks_for_all_the_d for sites that had problems; next slide):
Did your site have problems getting the large volume of data distributed during STEP09? Was your infrastructure adequate? Did you have enough headroom to cope with backlogs quickly? Did other activities, specifically user analysis, interfere?

Step09 feedback

Step09 feedback

Workload. N.B. PanDA pilot-based analysis used your site's ANALY queue local mover to copy data to the worker node; the WMS-based analysis used a mixture of the file stager (also a copy-to-worker-node method) and the local LAN protocol (known as DQ2_LOCAL, resolved to dcap, rfio or xroot). Because of problems with the analyses producing too much output, the file-stager analyses were switched off in week 2. The requested information:
- Number of jobs run each day for production, WMS analysis and pilot analysis.
- Efficiencies in the batch system for each type of job. The throughput (running jobs x job efficiency) of analysis jobs is a useful metric of the capacity of your site to perform analysis (see the sketch after this list).
- Do you know what the limitations of your site infrastructure are in delivering successful analysis results (network, I/O on servers or WNs, etc.)?
- Information about fair shares: the request was for the shares listed earlier (50% production, 25% pilot analysis, 25% WMS analysis). However, we think there were significant problems with delivering enough analysis pilots in some clouds (slide 8), and many sites introduced caps to manage the analysis during the STEP challenge, which was more useful than letting things run wild. This probably overwhelmed any fair-share effects, but if you believe there were problems you might wish to comment.
- What specific tests would help you to improve analysis performance over the summer?
- A description of your network layout and disk servers might be useful for other sites.
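A minimal sketch (not from the slides) of the throughput metric mentioned in the list above:

```python
# Sketch: "analysis throughput" = running jobs x job efficiency,
# a rough measure of a site's effective analysis capacity.

def analysis_throughput(running_jobs: int, job_efficiency: float) -> float:
    """Effective number of usefully running analysis jobs."""
    return running_jobs * job_efficiency

# Hypothetical numbers, for illustration only:
print(analysis_throughput(running_jobs=200, job_efficiency=0.82))  # -> 164.0
```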

Data Transfer

Data distribution in the week before the Step09 exercise showed a bottleneck in the GridFTP server. A new machine was deployed on Sunday, May 31st; transfer bandwidth increased from 200 Mbps to 600 Mbps.
During the exercise (weeks 23 and 24) the data transfer rate was steady over long periods. No problems were observed in the network infrastructure, either local or remote.

Data Transfer

From June 1st to 14th, the data volume transferred to our SE was 57 TB incoming (out of 112 TB total, of which 50% goes to IFIC, i.e. ~56 TB expected) and 1 TB outgoing, distributed as:
44 TB in /lustre/ific.uv.es/grid/atlas/atlasdatadisk/step09
a GB-scale volume in /lustre/ific.uv.es/grid/atlas/atlasproddisk/step09
During the exercise the disk filled up twice. More space was added on the fly by deploying two new disk servers (17 TB + 34 TB), reaching 102 TB of total installed capacity.
Space tokens were updated to reflect the used and available space, but the ATLAS tools seem to ignore these values and continue transfers even when no available space is published for a specific token.
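One way to relate the numbers above (a sketch, assuming the two new servers were the only capacity added during the exercise):

```python
# Sketch: cross-check of the STEP09 storage numbers quoted above.

expected_share_tb = 112 * 0.50        # 50% of the 112 TB STEP09 volume -> 56 TB
incoming_tb = 57                      # volume actually received at IFIC

added_servers_tb = 17 + 34            # two disk servers deployed on the fly
final_capacity_tb = 102               # total installed capacity afterwards
initial_capacity_tb = final_capacity_tb - added_servers_tb  # implied ~51 TB before

print(expected_share_tb, incoming_tb, initial_capacity_tb)
```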

Data Transfer

Disk occupancy increased rapidly during this exercise. A quick reaction, and the flexibility to easily add more servers to the Lustre file system, allowed us to absorb the data without major problems.

Data Transfer

Worker nodes accessed the data using local POSIX I/O (read only) or the GridFTP protocol, depending on the job's policy. WNs and disk servers are connected to a Cisco switch at 1 Gbps. In the present configuration, WNs share a 1 Gbps uplink in groups of 8 and disk servers share a 1 Gbps uplink in groups of 2. This will change in the near future to a dedicated 1 Gbps link per WN and 2 Gbps per disk server.
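The effective per-node bandwidth implied by this sharing can be estimated as follows (a sketch, ignoring protocol overhead and contention patterns):

```python
# Sketch: effective per-node bandwidth under shared uplinks.

def per_node_bandwidth_mbps(uplink_gbps: float, nodes_per_uplink: int) -> float:
    return uplink_gbps * 1000 / nodes_per_uplink

print(per_node_bandwidth_mbps(1, 8))  # WNs now:                    ~125 Mbps each
print(per_node_bandwidth_mbps(1, 2))  # disk servers now:           ~500 Mbps each
print(per_node_bandwidth_mbps(1, 1))  # WNs after the upgrade:      1000 Mbps each
print(per_node_bandwidth_mbps(2, 1))  # disk servers after upgrade: 2000 Mbps each
```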

Data Transfer

High load was observed at times, but performance was reasonable.
[Bandwidth graphs: LAN POSIX I/O (Lustre) + GridFTP; LAN POSIX I/O (Lustre) + WAN GridFTP]

Job processing

The pilot role was added to the VOMS mapping, but no proper scheduling policy could be set up in time, so user jobs coming from PanDA and from the WMS were treated equally. Separating them would have required setting up a new account pool, which was not easy with our present configuration tool (Quattor), so we decided not to split the mappings.
Fair share (group / % / target, for ATLAS users and ATLAS production) at June 8th and at June 16th: roughly 50/50 in the first week, shifting to about 70% production and 30% analysis in the second week. This imbalance is under investigation.

Job processing

Occupancy was high and we added 24 more cores in the last week. We had some instabilities in the CE during the last Sunday; we think the added cores stressed the CPU of the CE (a Pentium D at 3.20 GHz). An upgrade is foreseen in the coming weeks (the hardware has already been bought).

Job processing

Job statistics for the ANALY and IFIC queues from 17th May to 17th June, showing problems with Athena release versions.

Job processing

In the first week we had a lot of problems with PanDA jobs trying to use an Athena version that did not exist at our site. The release was installed during the last week; the difficulty in installing it seems not to be related to our setup but to the installation procedure itself (an installation bug). For that reason, our efficiency for pilot jobs was around 21% (ANALY_IFIC) while for WMS jobs (IFIC-LCG2_MCDISK) it was around 82%, as shown in the table (17th June).

Job processing

After installing the Athena versions used by the PanDA jobs, our pilot job efficiency rose by about 17% within a few minutes and reached around 100% in the last days.

Job processing

The figures on the next slide show, for IFIC (see the sketch below for how these metrics are defined):
- efficiency (CPU/wall time)
- CPU utilization (percent)
- events per second (Hz): this is the event rate for the Athena execution only, i.e. the denominator is the time from Athena start to finish
Overall efficiency and detailed statistics for the distributed analysis test at IFIC:
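A minimal sketch (not from the slides, with hypothetical timing numbers) of the per-job metrics listed above:

```python
# Sketch: per-job metrics used in the distributed analysis test.

def cpu_wall_efficiency(cpu_time_s: float, wall_time_s: float) -> float:
    """CPU/wall-time efficiency of a job."""
    return cpu_time_s / wall_time_s

def event_rate_hz(n_events: int, athena_wall_time_s: float) -> float:
    """Event rate for the Athena execution only (start to finish)."""
    return n_events / athena_wall_time_s

# Illustration only, hypothetical job:
print(cpu_wall_efficiency(cpu_time_s=2500, wall_time_s=3000))  # 0.83 -> ~83%
print(event_rate_hz(n_events=10000, athena_wall_time_s=3000))  # ~3.3 Hz
```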

Job processing
[Figures: CPU/wall-time efficiency, CPU utilization and events per second at IFIC]

Job processing

The influence of the input-data access protocol on the CPU/wall-time efficiency, when using the storage system's natural protocol, was tested in previous distributed analysis tests: sites with POSIX-like access to the file system (Lustre) perform better without triggering background copies with the lcg tools (file stager). The same result was found in this test for WMS jobs, but not for PanDA pilot jobs, where the CPU/wall-time efficiency dropped from 83.4% to 49.5%. Similar behaviour is observed at other sites using Lustre (such as LIP-Lisbon), so it would be worth investigating why pilot jobs reduce the CPU/wall-time efficiency by a factor of 2 at a site like ours. The reason: PanDA pilot-based analysis used the site's ANALY queue local mover to copy data to the worker node, whereas the WMS-based analysis used a mixture of the file stager (also a copy-to-worker-node method) and the local LAN protocol (known as DQ2_LOCAL, resolved to dcap, rfio or xroot).
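To make the difference concrete, here is a simplified, purely illustrative sketch (not ATLAS code; the file path is hypothetical) of the two access patterns compared above: direct POSIX reading over Lustre versus copying the input to the worker node first.

```python
# Sketch: two input-access patterns, greatly simplified.
import shutil

LUSTRE_PATH = "/lustre/ific.uv.es/grid/atlas/atlasdatadisk/step09/input.root"  # hypothetical file

def read_direct(path: str) -> bytes:
    """Direct-access style (DQ2_LOCAL over a POSIX file system): read the file in place."""
    with open(path, "rb") as f:
        return f.read()

def copy_then_read(path: str, scratch: str = "/tmp/input.root") -> bytes:
    """Stager / local-mover style: copy to the worker node first, then read the local copy."""
    shutil.copy(path, scratch)
    with open(scratch, "rb") as f:
        return f.read()
```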

Conclusions

Data transfer: the rate was steady over long periods; no problems were observed in the network infrastructure, either local or remote.
Storage: the disk filled up twice; more space was added to reach 102 TB, thanks to a quick reaction and the flexibility to easily add servers to the Lustre file system. Space tokens were updated, but the ATLAS tools seem to ignore these values.
Data access: our worker nodes accessed the data using local POSIX I/O (read only) or GridFTP, depending on the job's policy. High load was observed at times, but performance was reasonable: in the present configuration WNs share a 1 Gbps uplink in groups of 8 while disk servers share a 1 Gbps uplink in groups of 2. THIS WILL CHANGE SOON.
Scheduling: the pilot role was added to the VOMS mapping, but without a proper scheduling policy; a new account pool would have been needed, which is not easy with Quattor. The fair share between analysis and production was 50% in the first week but 70% production and 30% analysis in the second week. Under investigation!

Conclusions

WN occupancy was high and we added cores in the last week. We saw instabilities in our CE, probably because the higher number of cores stressed the CE's CPU. An upgrade is coming in the next weeks!
We had problems with pilot jobs in the first week, caused by an Athena release that was not installed at our site. The installation problem seems not to be related to our setup but to the procedure itself (an installation bug); the release was installed as soon as possible. The efficiency for pilot jobs was 21% while for WMS jobs it was 82%; after installing the release, the pilot job efficiency in the last days was 100%.
The CPU/wall-time efficiency for WMS jobs is about a factor of 2 higher than for pilot jobs (83.4% vs 49.5%). This is because the PanDA pilot-based analysis used the site's ANALY queue local mover to copy data to the worker node, whereas the WMS-based analysis used a mixture of the file stager (also a copy-to-worker-node method) and the local LAN protocol (known as DQ2_LOCAL, resolved to dcap, rfio or xroot).
ATLAS post-mortem: there will be an ATLAS internal post-mortem on the afternoon of 1st July (room 40-R-C10 is booked; preliminary agenda soon). N.B. the WLCG post-mortem is 9-10 July.