Multicore Accounting John Gordon, STFC-RAL WLCG Operations Coordination, 1st October 2015.


Outline
History
Recent Progress
Current Status
Issues
Plan
John Gordon, MB July 2015

History
I first raised the issue of sites publishing details of the cores used per job at the EGI OMB last December, with an update in January. There was some initial improvement but then progress flattened off. WLCG are now running many more multicore jobs and wish to see this reflected in accounting. Knowing the number of cores used is important in calculating the effective wallclock time and thus the overall occupancy of a cluster.
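As a rough illustration of why the core count matters, the sketch below treats effective wallclock as wallclock time multiplied by the cores a job used, and occupancy as the share of a cluster's available core-seconds that jobs took up. The function names, the job tuples and the cluster size are all hypothetical; this is not how APEL itself computes these figures.

```python
def effective_wallclock(wall_seconds, cores):
    """Core-seconds occupied by one job: wallclock multiplied by cores used."""
    return wall_seconds * cores


def occupancy(jobs, total_cores, period_seconds):
    """Fraction of the cluster's core-seconds occupied during the period.

    jobs is an iterable of (wall_seconds, cores) pairs (a made-up layout).
    """
    used = sum(effective_wallclock(wall, cores) for wall, cores in jobs)
    return used / (total_cores * period_seconds)


# Hypothetical one-hour window on a 16-core cluster.
jobs = [(3600, 8), (7200, 1), (1800, 4)]
print(occupancy(jobs, total_cores=16, period_seconds=3600))

# The same jobs published without cores (every job counted as single core)
# would look far less busy.
single_core = [(wall, 1) for wall, _ in jobs]
print(occupancy(single_core, total_cores=16, period_seconds=3600))
```

Comparing the two printed values shows how an 8-core job recorded as single core makes a busy cluster appear mostly idle.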

Recent Progress
At the June meeting of the WLCG Grid Deployment Board I reported that 87% of LHC CPU use came from Sites/CEs which reported the number of cores per job. Since there were some obvious omissions from important sites and countries I was asked to address this. I raised tickets against all NGIs and gave them a link to the publishing of cores for June by their sites which run LHC work. This has mainly been successful. By the end of June we had 95% publishing. By mid-August 99.5% of CPU time was accounted by sites reporting cores.

Status
99.5% of CPU time has cores reported. There are still about 60 sites which have published jobs without cores in the last few days, but this is a long tail of failed jobs and rogue CEs that don’t amount to significant CPU use. There is a smaller number of sites with some or all CEs not reporting cores. A few have problems not under their control. Many have never responded to tickets.
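The coverage figure quoted above is weighted by CPU time rather than by the number of sites or jobs, which is why a long tail of sites without cores can coexist with a very high percentage. A minimal sketch of that weighting, using invented records of (CPU seconds, cores or None); the record layout is an assumption, not the APEL schema:

```python
def cores_coverage(records):
    """Percentage of total CPU time that comes from records with a core count.

    records is an iterable of (cpu_seconds, cores) pairs, where cores is
    None when the site did not publish it (hypothetical layout).
    """
    records = list(records)
    total = sum(cpu for cpu, _ in records)
    if total == 0:
        return 0.0
    with_cores = sum(cpu for cpu, cores in records if cores is not None)
    return 100.0 * with_cores / total


# Invented data: two large records with cores, several tiny ones without.
records = [(900.0, 8), (90.0, 1), (4.0, None), (3.0, None), (3.0, None)]
print(cores_coverage(records))
```

In this made-up sample three of the five records lack cores, yet they carry only a small share of the CPU time.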

Sept 2015 – 99.5% of CPU time with cores reported

Germany
WLCG View

Within a Site

Issues
DESY-HH
– Outstanding ticket with ARC team
– CREAM PBS publishes Processors=1 for multicore jobs. This may also be an issue at other sites.
– MPPMU have the same ARC issue
– DESY have patched the PBS parser and have successfully published cores. Currently cleaning up their old data.

Veracity of PBS cores
CEs with eff>100%, where eff = cpu/(wall*cores)
>1200 CEs reporting
Top 3 probably GPUs
Then DESY
Then NL sites, probably not HEP
Several biomed CEs
PT and NL sites also reporting cores>1
Worth investigation

SubmitHostCores=1
ce01.tier2.hep.manchester.ac.uk:8443/cream-pbs-gpu
atlas-ce-02.roma1.infn.it:8443/cream-lsf-atlasgshort
atlas-creamce-01.roma1.infn.it:8443/cream-lsf-atlasgshort
ce03.clumeq.mcgill.ca:8443/cream-pbs-atlas_mcore
ce02.clumeq.mcgill.ca:8443/cream-pbs-atlas_mcore
grid-cr3.desy.de:8443/cream-pbs-mcore
grid-cr0.desy.de:8443/cream-pbs-mcore
grid-cr2.desy.de:8443/cream-pbs-mcore
grid-cr1.desy.de:8443/cream-pbs-mcore
gb-ce-amc.amc.nl:8443/cream-pbs-express
ce.lsg.psy.vu.nl:8443/cream-pbs-express
gb-ce-emc.erasmusmc.nl:8443/cream-pbs-long
gb-ce-tud.ewi.tudelft.nl:8443/cream-pbs-long
gb-ce-lumc.lumc.nl:8443/cream-pbs-medium
gb-ce-amc.amc.nl:8443/cream-pbs-long
ce.lsg.bcbr.uu.nl:8443/cream-pbs-long
ce.lsg.psy.vu.nl:8443/cream-pbs-infra
gridce03.ifca.es:8443/cream-sge-biomed
dc2-grid-66.brunel.ac.uk:8443/cream-pbs-biomed
gridce02.ifca.es:8443/cream-sge-biomed
gb-ce-rug.sara.usor.nl:8443/cream-pbs-infra
gb-ce-tud.ewi.tudelft.nl:8443/cream-pbs-infra
ce.irb.egi.cro-ngi.hr:8443/cream-pbs-sunx
cert-37.pd.infn.it:8443/cream-lsf-grid
ce04.ncg.ingrid.pt:8443/cream-sge-opsgrid
ce05.ncg.ingrid.pt:8443/cream-sge-opsgrid
grid002.jet.efda.org:8443/cream-pbs-biomed
ce06.ncg.ingrid.pt:8443/cream-sge-opsgrid
ce02.lip.pt:8443/cream-sge-opsgrid
cirigridce01.univ-bpclermont.fr:8443/cream-pbs-lhcb
ce05.ncg.ingrid.pt:8443/cream-sge-dteamgrid
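The check behind this slide can be sketched as follows: compute cpu/(wall*cores) per CE and flag any value above 1, since a job cannot use more CPU time than its cores allow unless the published core count is too low. The CE names and numbers below are invented for illustration, and the tuple layout is an assumption rather than the actual accounting record format.

```python
def efficiency(cpu_seconds, wall_seconds, cores):
    """CPU efficiency as cpu / (wall * cores); None if the denominator is zero."""
    denominator = wall_seconds * cores
    if denominator == 0:
        return None
    return cpu_seconds / denominator


def suspicious_ces(ces):
    """Names of CEs with efficiency above 100%, highest efficiency first.

    ces is an iterable of (name, cpu_seconds, wall_seconds, cores) tuples
    (hypothetical layout).
    """
    scored = []
    for name, cpu, wall, cores in ces:
        eff = efficiency(cpu, wall, cores)
        if eff is not None and eff > 1.0:
            scored.append((eff, name))
    return [name for eff, name in sorted(scored, reverse=True)]


# Invented summaries: ce-a and ce-d report one core but burn far more CPU
# than one core could deliver in the recorded wallclock time.
ces = [
    ("ce-a.example.org", 7200.0, 1000.0, 1),
    ("ce-b.example.org", 900.0, 1000.0, 1),
    ("ce-c.example.org", 7600.0, 1000.0, 8),
    ("ce-d.example.org", 1500.0, 1000.0, 1),
]
print(suspicious_ces(ces))
```

A CE flagged this way is only a candidate for investigation; as the slide notes, some cases may be explained by GPUs or other factors rather than a wrong core count.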

Those Were My Issues
What are yours?

Timescales
Soon the dev portal will hold all historical APEL data.
– This will only contain cores from when sites started publishing them.
When this is done the Portal will use this data, including cores, for the production T1 and T2 reports and show the current dev tree (EMI3) in parallel with the current production EGI views.
The portal is currently undergoing a major rewrite. When this is released next April it will use the data with cores only.
WLCG is encouraged to track the portal developments. There will be a prototype demo at the EGI meeting in Bari with a webcast.

Summary