Claudio Grandi INFN Bologna CMS Operations Update Ian Fisk, Claudio Grandi 1.



Claudio Grandi INFN Bologna 2 Year in Progress
Machine performance has averaged about what was planned for
–Months with technical stops are lower, others are higher

Claudio Grandi INFN Bologna 3 Events
CMS trigger performance is about what was expected as well
–This counts all events, with an expected 25% overlap
–Fluctuating around the nominal value of 375 Hz
–Total events expected by this point: 1.3B; 1.1B have been collected

Month  | Average Trigger Rate (with overlap)
March  | 356 Hz
April  | 334 Hz
May    | 393 Hz
June   | 431 Hz
July   | 361 Hz
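As a rough sanity check, the monthly rates and event counts quoted above are mutually consistent (a sketch using only the numbers on this slide; this is not an official CMS calculation):

```python
# Sanity check of the trigger numbers quoted on the slide.
monthly_rates_hz = {"March": 356, "April": 334, "May": 393, "June": 431, "July": 361}

nominal_rate_hz = 375
average_rate = sum(monthly_rates_hz.values()) / len(monthly_rates_hz)
print(f"Average trigger rate: {average_rate:.0f} Hz (nominal {nominal_rate_hz} Hz)")

# Events collected vs. expected so far (overlap counted in, as on the slide).
expected_events = 1.3e9
collected_events = 1.1e9
print(f"Collected {collected_events / expected_events:.0%} of the expected events")
```

The five monthly averages happen to average exactly to the 375 Hz nominal rate, and the collected sample is about 85% of the expectation.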

Claudio Grandi INFN Bologna 4 Events
Event size is very close to estimates
–MC RECO is larger due to out-of-time pile-up, which wasn't in the original planning
In general the RECO time is about 20% higher than anticipated before the technical stop
–After the technical stop the RECO time is essentially doubled

Tier      | Observed Size | Expectation
Data RAW  | 230 kB        | 390 kB
Data RECO | 590 kB        | 530 kB
Data AOD  | 165 kB        | 200 kB
MC RECO   | 970 kB        | 600 kB
MC AOD    | 250 kB        | 265 kB
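A back-of-the-envelope storage volume follows from the observed sizes in the table (a sketch; the 1.1B collected-events figure is taken from the previous slide, and decimal units are assumed):

```python
# Back-of-the-envelope data volume from the observed per-event sizes above.
sizes_kb = {"Data RAW": 230, "Data RECO": 590, "Data AOD": 165}  # kB/event, observed
n_events = 1.1e9  # events collected so far (previous slide)

for tier, kb in sizes_kb.items():
    tb = n_events * kb * 1e3 / 1e12  # kB -> bytes -> TB (decimal)
    print(f"{tier}: ~{tb:.0f} TB")
```

This gives roughly 253 TB of RAW, 649 TB of RECO, and 182 TB of AOD for the data collected so far.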

Claudio Grandi INFN Bologna 5 Tier-0
We have observed periods of 100% utilization at the Tier-0
–The memory footprint of the application is reducing the number of cores we can run on
–A newer version of the code is available

Claudio Grandi INFN Bologna 6 Tier-1 CPU Usage vs. Pledge
Utilization of the Tier-1s has increased as we have ramped up more simulation production at Tier-1s
–High utilization and good CPU efficiency

Claudio Grandi INFN Bologna 7 Tier-1 Storage Usage
All the disk is used by the LHC experiments, and CMS is on a reasonable trajectory to use the tape pledge for 2011

Claudio Grandi INFN Bologna 8 Tier-2 Usage
Generally, Tier-2 CPU efficiency noticeably improves with the change to CMSSW_4_2
–Improvements in the I/O layer
–Good usage overall

Claudio Grandi INFN Bologna 9 Simulation
We've produced about 2B new events in 2011, and re-reconstructed an additional 1B
–This is a little ahead of where we expected to be
A reprocessing of the data sample is expected in the Fall
Increase in the production of FastSim

Claudio Grandi INFN Bologna 10 Analysis Usage
Over the summer we were at 2M terminated jobs a week (1.5M analysis jobs)
–About 200k analysis jobs a day
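The weekly and daily figures can be cross-checked with trivial arithmetic (a sketch using only the numbers quoted above):

```python
# Consistency check of the analysis job rates quoted on the slide.
jobs_per_week = 2.0e6        # terminated jobs per week (all types)
analysis_per_week = 1.5e6    # of which analysis jobs
per_day = analysis_per_week / 7
print(f"~{per_day / 1e3:.0f}k analysis jobs per day")  # ~214k, consistent with the quoted ~200k
```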

Claudio Grandi INFN Bologna 11 Transfers
Looking at transfers into the destination Tier-2s, the rate is surprisingly stable over the year

Claudio Grandi INFN Bologna 12 Tier-3
Trying to continue making Tier-3s work well for analysis
–Basically this is extra computing capacity for analysis

Claudio Grandi INFN Bologna 13 Improvements from Last Year
We have much better accounting of the data being accessed
–CRAB reports every file access, CPU hour, and individual to the Data Popularity service, a project delivered to us by CERN IT Experiment Support
We are using the analysis data tiers as intended, and the speed at which the collaboration switches between reprocessing passes seems to be much faster
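The kind of accounting described here boils down to aggregating access reports per dataset (an illustrative sketch only; the record fields, dataset names, and user names below are hypothetical stand-ins, not the actual Popularity service schema):

```python
from collections import defaultdict

# Illustrative aggregation of per-access reports, in the spirit of what
# CRAB sends to the Popularity service. All records here are invented.
records = [
    {"dataset": "/Mu/Run2011A-May10ReReco-v1/AOD", "cpu_hours": 12.0, "user": "alice"},
    {"dataset": "/Mu/Run2011A-May10ReReco-v1/AOD", "cpu_hours": 3.5, "user": "bob"},
    {"dataset": "/Mu/Run2011A-PromptReco-v4/AOD", "cpu_hours": 7.0, "user": "alice"},
]

accesses = defaultdict(int)       # number of file accesses per dataset
cpu = defaultdict(float)          # CPU hours per dataset
users = defaultdict(set)          # distinct individuals per dataset

for r in records:
    accesses[r["dataset"]] += 1
    cpu[r["dataset"]] += r["cpu_hours"]
    users[r["dataset"]].add(r["user"])

for ds in accesses:
    print(f"{ds}: {accesses[ds]} accesses, {cpu[ds]} CPU h, {len(users[ds])} users")
```

Exactly this kind of per-dataset summary is what makes popularity-based comparisons (like the ones on the following slides) possible.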

Claudio Grandi INFN Bologna 14 Data Access
The access to the various data tiers in CMS, integrated over a week
–Good adoption of AOD and AODSIM
–Low USER-tier access may indicate a lot of that access happens outside CRAB

Claudio Grandi INFN Bologna 15 Example: May10 Re-reco
In May we transitioned to CMSSW_4_2 and reprocessed all the data collected up to that point in 2011 with 4_2
–The May10 Re-reco replaced PromptReco V1, V2, and V3
–New data was then promptly reconstructed with 4_2
–If a growing, consistent dataset was needed, the analyzer had to switch

Claudio Grandi INFN Bologna 16 Most Popular Data Types May/June

Claudio Grandi INFN Bologna 17 Most Popular Data Types May/June
Within about 2 weeks of the processing being finished, more than half of the analysis CPU is on the new sample
–16% on the May Re-reco
–12% on the original PromptReco
The number of individuals accessing the data is about the same

Claudio Grandi INFN Bologna 18 Recent problems (last 2 weeks)
myproxy upgrade on Aug 30th (GGUS:73926)
–Initial problems with the firewall; fixed in < 1 h
–A myproxy bug related to the use of proxies created with and without the -r option blocked some analysis (the CRAB server uses -r; standalone CRAB, via the gLite WMS, doesn't)
  8 hours of debugging; a patch was requested from the developers and installed the next day
–A myproxy bug related to the use of wildcards blocked PhEDEx
  1.5 more days to debug the problem and find a workaround; a patched version of myproxy was put in production the week after
Alias for cmsdoc broken on Sep 6th (GGUS:74099)
–CRAB analysis blocked outside CERN
–Fixed in 2 hours

Claudio Grandi INFN Bologna 19 Data taking problems after TS
LSF was replaced by TransferManager in Castor during the technical stop
–First stress as soon as data taking restarted: many rfcps hanging, impacting transfers from P5 and to the Tier-1s, and Tier-0 processing (GGUS:74085)
–A patch installed on 8 Sep solved the problem; the final patch is expected this week
An internal CMS problem, caused by an inconsistency in the DB after the Castor problem, blocked the reprocessing of new runs for 24 h
Now dealing with the backlog and with the increased processing times due to high PU
–x2 processing time in express and a 300 MB memory increase
–x2.5 processing time in reco and an 800 MB memory increase
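The impact of the pile-up increase can be illustrated with a toy estimate (a sketch; the time factors and memory increments are the ones quoted above, but the baseline per-event figures are invented placeholders, not CMS numbers):

```python
# Toy illustration of the quoted PU-driven increases.
# Factors are from the slide; baselines are assumed placeholders.
baseline = {
    "express": {"time_s": 5.0, "mem_mb": 1500},   # assumed baseline per event
    "reco":    {"time_s": 10.0, "mem_mb": 2000},  # assumed baseline per event
}
increase = {
    "express": {"time_factor": 2.0, "mem_extra_mb": 300},
    "reco":    {"time_factor": 2.5, "mem_extra_mb": 800},
}

for step, b in baseline.items():
    inc = increase[step]
    t = b["time_s"] * inc["time_factor"]
    m = b["mem_mb"] + inc["mem_extra_mb"]
    print(f"{step}: {t:.1f} s/event (was {b['time_s']} s), {m} MB (was {b['mem_mb']} MB)")
```

With any plausible baseline, a x2 to x2.5 jump in per-event time means the farm drains the backlog at less than half the usual rate, which is why the T0 queue on the next slide takes so long to recover.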

Claudio Grandi INFN Bologna 20 The problem seen by the T0 queue
–Castor problem
–Internal CMS problem (no new runs acquired)
–Reabsorbing the backlog and coping with increased PU