Performance Tests of DPM Sites for CMS AAA
Federica Fanzago on behalf of the AAA team

Introduction to AAA
AAA = “Any data, Anytime, Anywhere”, an effort to create a storage federation of the CMS sites
AAA makes CMS data access transparent at any CMS site
Sites’ content is federated on the fly using the native clustering of the xrootd framework

Why AAA
The CMS data discovery system lets us know where data are stored around the world
CMS sends “jobs to where the data is” for analysis and reprocessing
– a design decision from 10 years ago, still valid
Some pitfalls:
– jobs at a remote site may fail for data-access reasons, e.g. a file has “disappeared”
– if only a small number of sites host the right data, queue waiting times can be long
– a preparatory data-transfer step may be necessary before submitting the jobs

AAA goals
Make all data available to any CMS physicist, anywhere, interactively
– reliability: no access failures › improved processing efficiency and CPU utilization
– transparency: never notice where data actually reside › run jobs independently of data location
– universality: fulfill the promise of opportunistic grid computing › much more flexible use of resources globally › allow jobs to run on sites that host no data and only provide CPU

How it works
The underlying technology of AAA is xrootd
– it interfaces with different storage backends
Sites in the data federation subscribe to a hierarchical system of redirectors (local, regional and global)
Sites provide a common namespace for data (the LFN)
Applications access data by querying their regional xrootd redirector using the LFN
If the file is not found, the search falls back to the global redirector
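As a minimal illustration of this access pattern (not the AAA test code itself), the sketch below builds a root:// URL from an LFN and copies the file through a redirector with the standard xrdcp client; the redirector hostname and the LFN are placeholders, not the actual AAA endpoints.

```python
import subprocess

# Hypothetical regional redirector and LFN, for illustration only.
REDIRECTOR = "xrootd-redirector.example.org:1094"
LFN = "/store/mc/SomeDataset/file.root"

def fetch_via_redirector(redirector, lfn, dest="/dev/null"):
    """Copy a file identified by its LFN through an xrootd redirector.

    The redirector resolves the LFN to the xrootd server of a site that
    actually hosts the file and redirects the client there.
    """
    url = "root://{0}//{1}".format(redirector, lfn.lstrip("/"))
    # xrdcp is the standard xrootd copy client; a grid proxy is assumed
    # to be available in the environment for GSI authentication.
    return subprocess.call(["xrdcp", "-f", url, dest])

if __name__ == "__main__":
    rc = fetch_via_redirector(REDIRECTOR, LFN)
    print("xrdcp exit code: {0}".format(rc))
```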

Redirectors schema
(diagram: user application, EU regional redirector, US regional redirector, global redirector, and sites A, B, C with their xrootd interfaces)
1) The application queries its regional redirector using the LFN
2) The redirector searches for the file among the subscribed sites
3) If the file is not found, the search continues in the other regions through the global redirector
4) If the file is found, the application is redirected to the xrootd server of the site
5) If the file is found at a site of another region, the application is redirected there
Sites have to:
– subscribe to the regional redirector
– provide xrootd access to data
– verify the server isn’t blocked by a firewall
– verify the xrootd server requires GSI authentication

AAA deployment
First deployment at US sites, demonstrating AAA functionality and improving CMS analysis and reprocessing
– fallback, error rate, queue time, etc.
Additional sites later joined the federation – currently 60 sites (8 Tier-1s):
– 18 DPM
– 22 dCache
– 5 Castor
– 7 Hadoop/BeStMan
– 2 Lustre/BeStMan
– 6 StoRM
The xrootd protocol is the common access interface

Performance assessment
To evaluate the potential of data federation, CMS needs to understand the current performance of each site
– how are the sites performing? Are their performance and quality of service sufficient?
Through “file opening and reading scale tests” CMS checks whether sites are able to sustain the expected load for LHC Run 2

File opening and reading tests
Tests emulate CMS jobs running at CMS sites, choosing the site through the regional redirectors
CMS targets for the initial tests:
– file-opening test: total file-open rate of 100 Hz at a site
– file-reading test: 600 jobs each reading an average of 2.5 MB every 10 s at a site, for a total read rate of 150 MB/s
These numbers come from internal CMS analysis, based on historical figures
The tests reveal sites that need further optimization and possible improvements
They are not meant to be a stress test for the site

How tests are run
Sites have to provide a “special” path that allows the redirector to match only the site we want to test: /store/xrootd/test/ /LFN
Tests are submitted from a condor pool in Wisconsin
– necessary to correctly manage the ramp-up of running jobs
Submission to US sites goes through the Fermilab redirector; the others go through the Bari one
The list of input files is obtained via PhEDEx (the dataset is required to be complete and on disk)
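A minimal sketch of how a test job might build the site-restricted URL follows; the test prefix on the slide is partly truncated, so the placement of the CMS site name inside the path and the endpoint names are assumptions made purely for illustration.

```python
# Minimal sketch of building the site-restricted test URL from an LFN.
# The exact test prefix on the slide is partly truncated; the placement of
# the CMS site name inside the path is an assumption, not the real layout.

TEST_PREFIX = "/store/xrootd/test"

def test_url(redirector, cms_site_name, lfn):
    """Build a root:// URL that only the site under test can serve.

    redirector    -- regional redirector host:port (Fermilab for US, Bari for EU)
    cms_site_name -- e.g. "T2_HU_Budapest" (hypothetical placement in the path)
    lfn           -- the usual CMS logical file name, e.g. "/store/mc/.../file.root"
    """
    path = "{0}/{1}{2}".format(TEST_PREFIX, cms_site_name, lfn)
    return "root://{0}/{1}".format(redirector, path)

# Example with placeholder endpoint and LFN:
print(test_url("xrootd-eu-redirector.example.org:1094",
               "T2_HU_Budapest",
               "/store/mc/SomeDataset/file.root"))
```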

DPM sites in AAA
CMS_SITE_NAME: T2_AT_Vienna, T2_FR_GRIF_IRFU, T2_FR_GRIF_LLR, T2_FR_IPHC, T2_GR_Ioannina, T2_HU_Budapest, T2_IN_TIFR, T2_PK_NCP, T2_PL_Warsaw, T2_RU_PNPI, T2_RU_RRC_KI, T2_RU_SINP, T2_TH_CUNSTDA, T2_TR_METU, T2_TW_Taiwan, T2_UA_KIPT, T2_UK_London_Brunel, T1_TW_ASGC
Sites shown in red on the slide are not ready for testing because the “special path” isn’t available

Opening plots for some DPM sites
Tests ramp up to 100 simultaneous jobs, each opening files at a rate of 2 Hz
Test target: 100 Hz
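A rough sketch of what one such file-opening job could look like, assuming the XRootD Python bindings (the XRootD.client module) are available; the URL list is a placeholder and this is not the actual Wisconsin test code.

```python
import time
from XRootD import client                      # pyxrootd bindings, assumed available
from XRootD.client.flags import OpenFlags

# Hypothetical list of site-restricted test URLs to probe.
URLS = ["root://xrootd-eu-redirector.example.org:1094//store/xrootd/test/..."]

def open_at_rate(urls, rate_hz=2.0):
    """Open each URL in turn, aiming at roughly `rate_hz` opens per second,
    and record the latency of every attempt."""
    period = 1.0 / rate_hz
    latencies, failures = [], 0
    for url in urls:
        start = time.time()
        f = client.File()
        status, _ = f.open(url, OpenFlags.READ)
        if status.ok:
            f.close()
        else:
            failures += 1
        latencies.append(time.time() - start)
        # Sleep away whatever is left of this time slot (0.5 s at 2 Hz).
        time.sleep(max(0.0, period - (time.time() - start)))
    return latencies, failures
```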

Opening plots

Debugging Budapest
(plots: before tuning / after tuning)
Strong collaboration between the site manager and the DPM developers
Optimization of DPM configuration parameters
Performance still not optimal, due to the headnode’s hardware

Debugging Brunel
(plots: before tuning / after tuning)
Performance not optimal, but not due to hardware limitations
Multi-VO site, so a slow rate can be due to contention with other VOs
Wait for the next DPM release?

Reading plots for some DPM sites
Test target: 600 jobs, reaching 150 MB/s
Tests ramp up to 800 simultaneous jobs, each reading a block of 2.5 MB every 10 s
The sites didn’t reach the target; it could be a network, filesystem or disk bottleneck, and only the sites’ own monitoring of node load can tell
(plots: avg. read time per 2.5 MB block; a low and flat trend is better)
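For the reading side, a similar sketch of the per-job access pattern, again assuming the XRootD Python bindings: it reads one 2.5 MB block every 10 s and records the time spent on each read, which is what the plots above summarize. The URL is a placeholder.

```python
import time
from XRootD import client                      # pyxrootd bindings, assumed available
from XRootD.client.flags import OpenFlags

BLOCK_SIZE = int(2.5 * 1024 * 1024)   # 2.5 MB per read
READ_PERIOD = 10.0                     # one block every 10 s

def read_pattern(url, n_blocks=30):
    """Emulate the reading pattern of one test job: read one 2.5 MB block
    every 10 s and return the time spent on each read."""
    f = client.File()
    status, _ = f.open(url, OpenFlags.READ)
    if not status.ok:
        raise RuntimeError("open failed: " + status.message)
    block_times = []
    offset = 0
    for _ in range(n_blocks):
        start = time.time()
        status, data = f.read(offset, BLOCK_SIZE)
        elapsed = time.time() - start
        if not status.ok or not data:
            break
        block_times.append(elapsed)
        offset += len(data)
        time.sleep(max(0.0, READ_PERIOD - elapsed))
    f.close()
    return block_times
```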

Reading plots 2

Reading plots 3

Issues found
Some files are missing (“file not found”), giving a performance penalty of up to 40 s
– information about the files really stored at a site is not synchronized with the CMS catalogues
Some false negatives (“file not found” even though the file is in the storage)
Sometimes files need more than 200 s to open
The first file generally takes longer to open (the redirector caches the information)
“No connection available”: communication error with the server
Failures during file reads (the xrootd server goes down)

An evaluation attempt
An attempt to classify sites as “good” or “to debug”, using as thresholds:
– opening test: the access rate (6 %)
– reading test: the number of jobs (5 s)
The complete table with all the sites is on the twiki: #Summary_table_of_EU_tests
CMS_SITE_NAME (ok = good, td = to debug):
T2_AT_Vienna: td
T2_FR_IPHC: td
T2_TH_CUNSTDA: td
T2_FR_GRIF_IRFU: ok
T2_HU_Budapest: ok
T2_IN_TIFR: td
T2_FR_GRIF_LLR: ok
T2_UK_London_Brunel: ok
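The threshold wording on this slide is only partly readable in the transcript; assuming it means at most 6 % failed file opens and an average read time of at most 5 s per 2.5 MB block, a toy classification could look like the sketch below (both interpretations are assumptions, not confirmed by the slide).

```python
# The threshold semantics are only partly readable in the transcript; this
# sketch ASSUMES they mean "at most 6 % failed file opens" and "average read
# time per 2.5 MB block at most 5 s". Both interpretations are assumptions.

OPEN_FAILURE_LIMIT = 0.06   # fraction of failed opens (assumed meaning of "6 %")
READ_TIME_LIMIT_S = 5.0     # avg seconds per 2.5 MB block (assumed meaning of "5 s")

def classify_site(open_failure_fraction, avg_block_read_time_s):
    """Return 'ok' or 'td' (to debug) for one site, under the assumed thresholds."""
    if (open_failure_fraction <= OPEN_FAILURE_LIMIT
            and avg_block_read_time_s <= READ_TIME_LIMIT_S):
        return "ok"
    return "td"

# Example with made-up numbers:
print(classify_site(0.02, 3.1))   # -> "ok"
print(classify_site(0.10, 7.5))   # -> "td"
```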

Open questions
Can sites reach the needed figures with the resources they have?
– access rate of 100 Hz, read rate of 150 MB/s
Can we improve the overall performance that we see?
Suggestions:
– Make sure that the site is well tuned
– Make sure that the site runs up-to-date software
What can DPM sysadmins do to help us?
– Answer the survey that has been sent to site managers
– Suggestions are welcome

Other ideas
These tests are useful to “debug” remote sites
It may be practical to run the tests from a condor pool in the EU
An automatic system to run these tests every night (currently done in the US, not in the EU)

Conclusion
CMS is exploring the current performance of the remote sites that have joined the AAA federation
This exercise spots site configuration problems and infrastructure weaknesses
With the collaboration of site managers, storage backend developers and the AAA team a lot can be done
This common effort is an important component of readiness for LHC Run 2
Thanks a lot to all the collaborators
A special thanks to Fabrizio Furano for helping us with site debugging and providing useful suggestions for this talk

Documentation and Feedback
AAA tests twiki page nFileTests
Survey for site managers
DPM tuning hints ngHints
Test code
AAA support mailing list

Backup

Test benchmarks
Minimum benchmarks were determined from a historical study of CMS jobs
An average CMS job opens a new file once per 1000 s and reads from a file at an average rate of 0.25 MB/s
Assuming the worst case of 100,000 jobs opening files at a site at once gives a benchmark of 100 Hz for the file-opening rate
For file reading, an assumption of 600 simultaneous jobs actively reading files from a site gives a total rate of 150 MB/s
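A quick check of the arithmetic behind these benchmarks, using only the numbers quoted above:

```python
# Quick check of the benchmark arithmetic from the slide.
opens_per_job_hz = 1.0 / 1000.0      # one file open per 1000 s per job
read_rate_per_job = 0.25             # MB/s per job (2.5 MB every 10 s)

# File opening: how many simultaneous jobs correspond to a 100 Hz open rate?
target_open_rate_hz = 100.0
jobs_for_open_target = target_open_rate_hz / opens_per_job_hz
print(jobs_for_open_target)          # 100000.0 simultaneous jobs

# File reading: 600 simultaneous jobs at 0.25 MB/s each.
reading_jobs = 600
total_read_rate = reading_jobs * read_rate_per_job
print(total_read_rate)               # 150.0 MB/s
```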

Budapest details
MySQL daemon: /etc/my.cnf  1500 (was not set)
DMLite MySQL plugin: /etc/dmlite.conf.d/mysql.conf NoPoolSize  256 (was 32)
Memcached  2 GB (was not installed)
The dpmmgr account has enough file descriptors – ulimit -n  (was 1k)
MySQL user – ulimit -n
Restarting the daemons
How fast are the CPU and the MySQL disk?
– 8-year-old Pentium (headnode)
– During the test: user CPU 30 %, system CPU 25 % and iowait 10 % (what is producing iowait during a metadata exercise?)
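The parameter names and some values on this slide are partly lost in the transcript. As a loose illustration of the kind of checks involved, the sketch below reads the open-file-descriptor limit of the current process and asks the MySQL server for its connection limit; which MySQL variable the slide actually refers to is not recoverable, so max_connections is used purely as an example, and the mysql invocation is generic rather than the DPM-specific recipe.

```python
import resource
import subprocess

# Check the open-file-descriptor limit of the current account
# (the slide raises ulimit -n for the dpmmgr and MySQL accounts).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open file descriptors: soft={0} hard={1}".format(soft, hard))

# Query the MySQL server for a connection-related variable; max_connections
# is an illustrative choice, and credentials/access are placeholders.
out = subprocess.check_output(
    ["mysql", "-N", "-e", "SHOW VARIABLES LIKE 'max_connections';"])
print(out.decode().strip())
```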