
Multicore Accounting. John Gordon, STFC-RAL. WLCG Operations Coordination, 1st October 2015.


1 Multicore Accounting. John Gordon, STFC-RAL. WLCG Operations Coordination, 1st October 2015

2 Outline: History; Recent Progress; Current Status; Issues; Plan. John Gordon, MB July 2015

3 History
I first raised the issue of sites publishing details of the cores used per job at the EGI OMB last December, with an update in January. There was some initial improvement, but then progress flattened off. WLCG are now running many more multicore jobs and wish to see this reflected in accounting. Knowing the number of cores used is important in calculating the effective wallclock time and thus the overall occupancy of a cluster.
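The role of the core count can be illustrated with a minimal sketch. It assumes that effective wallclock time is wallclock time multiplied by the number of cores, and that occupancy is effective wallclock time divided by the core-seconds available in the period. The `JobRecord` class, its field names and the example numbers are hypothetical and are not taken from APEL.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical per-job accounting record (not the APEL schema)."""
    wall_seconds: int  # elapsed wallclock time of the job
    cores: int         # cores allocated to the job


def effective_wall_seconds(jobs):
    """Sum of wallclock time weighted by cores (assumed definition)."""
    return sum(job.wall_seconds * job.cores for job in jobs)


def occupancy(jobs, total_cores, period_seconds):
    """Fraction of the cluster's available core-seconds that were used."""
    return effective_wall_seconds(jobs) / (total_cores * period_seconds)


# One 8-core job and one single-core job, each running for an hour
# on a hypothetical 16-core cluster observed for one hour.
jobs = [JobRecord(wall_seconds=3600, cores=8), JobRecord(wall_seconds=3600, cores=1)]
print(effective_wall_seconds(jobs))
print(occupancy(jobs, total_cores=16, period_seconds=3600))
```

If the 8-core job were published with one core, the same calculation would count only 7200 core-seconds, which is why missing core counts understate occupancy.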

4 Recent Progress
At the June meeting of the WLCG Grid Deployment Board I reported that 87% of LHC CPU use came from sites/CEs that reported the number of cores per job. Since there were some obvious omissions from important sites and countries, I was asked to address this. I raised tickets against all NGIs and gave each of them a link to the June figures on the publishing of cores by their sites that run LHC work. This has mainly been successful. By the end of June we had 95% publishing. By mid-August 99.5% of CPU time was accounted by sites reporting cores.

5 Status
99.5% of CPU time has cores reported.
There are still about 60 sites that have published jobs without cores in the last few days, but this is a long tail of failed jobs and rogue CEs that do not amount to significant CPU use.
A smaller number of sites have some or all CEs not reporting cores.
A few have problems not under their control.
Many have never responded to tickets.
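The headline figure is a share of CPU time, not of sites or jobs, which is why about 60 sites can still be publishing without cores while coverage stays at 99.5%. A minimal sketch of that weighting follows; the record layout, a list of (CPU seconds, cores or None) pairs, and the numbers are assumptions made for illustration.

```python
def cores_coverage(records):
    """Share of CPU time that comes from records carrying a core count.

    `records` is a list of (cpu_seconds, cores) pairs, where cores is
    None when the site did not publish it (hypothetical layout).
    """
    total = sum(cpu for cpu, _ in records)
    with_cores = sum(cpu for cpu, cores in records if cores is not None)
    return with_cores / total if total else 0.0


# Two large records with cores and three tiny ones without:
# most records lack cores, yet almost all CPU time is covered.
records = [(600_000, 8), (395_000, 1), (2_000, None), (2_000, None), (1_000, None)]
print(f"{100 * cores_coverage(records):.1f}% of CPU time has cores reported")
```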

6 Sept 2015 – 99.5% of CPU time with cores

7 Germany: WLCG View

8 Within a Site

9 Issues
DESY-HH
– Outstanding ticket with ARC team.
– CREAM PBS publishes Processors=1 for multicore jobs. This may also be an issue at other sites.
– MPPMU have the same ARC issue.
– DESY have patched the PBS parser and have successfully published cores. They are currently cleaning up their old data.

10 Veracity of PBS cores
CEs with eff>100%, where eff = cpu/(wall*cores)
>1200 CEs reporting
Top 3 probably GPUs
Then DESY
Then NL sites, probably not HEP
Several biomed CEs
PT and NL sites also reporting cores>1
Worth investigation
SubmitHost                                                 Cores=1
ce01.tier2.hep.manchester.ac.uk:8443/cream-pbs-gpu         2018.22
atlas-ce-02.roma1.infn.it:8443/cream-lsf-atlasgshort       1103.00
atlas-creamce-01.roma1.infn.it:8443/cream-lsf-atlasgshort   932.60
ce03.clumeq.mcgill.ca:8443/cream-pbs-atlas_mcore            700.47
ce02.clumeq.mcgill.ca:8443/cream-pbs-atlas_mcore            699.19
grid-cr3.desy.de:8443/cream-pbs-mcore                       561.22
grid-cr0.desy.de:8443/cream-pbs-mcore                       561.10
grid-cr2.desy.de:8443/cream-pbs-mcore                       559.81
grid-cr1.desy.de:8443/cream-pbs-mcore                       535.88
gb-ce-amc.amc.nl:8443/cream-pbs-express                     330.34
ce.lsg.psy.vu.nl:8443/cream-pbs-express                     259.72
gb-ce-emc.erasmusmc.nl:8443/cream-pbs-long                  259.57
gb-ce-tud.ewi.tudelft.nl:8443/cream-pbs-long                252.13
gb-ce-lumc.lumc.nl:8443/cream-pbs-medium                    200.88
gb-ce-amc.amc.nl:8443/cream-pbs-long                        150.30
ce.lsg.bcbr.uu.nl:8443/cream-pbs-long                       147.87
ce.lsg.psy.vu.nl:8443/cream-pbs-infra                       141.10
gridce03.ifca.es:8443/cream-sge-biomed                      138.22
dc2-grid-66.brunel.ac.uk:8443/cream-pbs-biomed              117.50
gridce02.ifca.es:8443/cream-sge-biomed                      116.17
gb-ce-rug.sara.usor.nl:8443/cream-pbs-infra                 113.69
gb-ce-tud.ewi.tudelft.nl:8443/cream-pbs-infra               112.74
ce.irb.egi.cro-ngi.hr:8443/cream-pbs-sunx2200               111.43
cert-37.pd.infn.it:8443/cream-lsf-grid                      110.45
ce04.ncg.ingrid.pt:8443/cream-sge-opsgrid                   107.67
ce05.ncg.ingrid.pt:8443/cream-sge-opsgrid                   106.56
grid002.jet.efda.org:8443/cream-pbs-biomed                  105.80
ce06.ncg.ingrid.pt:8443/cream-sge-opsgrid                   105.78
ce02.lip.pt:8443/cream-sge-opsgrid                          102.73
cirigridce01.univ-bpclermont.fr:8443/cream-pbs-lhcb         101.17
ce05.ncg.ingrid.pt:8443/cream-sge-dteamgrid                 100.55
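The check on this slide, efficiency = cpu/(wall*cores) with values above 100% treated as a sign that the published core count is too low, can be sketched as follows. This is an illustration under stated assumptions, not the query used to produce the list: the dictionary keys, the `example.org` host names and the numbers are hypothetical.

```python
def efficiency_percent(cpu_seconds, wall_seconds, cores):
    """CPU efficiency as a percentage: cpu / (wall * cores)."""
    return 100.0 * cpu_seconds / (wall_seconds * cores)


def suspect_ces(summaries, threshold=100.0):
    """Return (submit_host, efficiency) pairs above the threshold, highest first.

    `summaries` is a list of dicts with hypothetical keys
    submit_host, cpu_seconds, wall_seconds and cores.
    """
    flagged = []
    for s in summaries:
        eff = efficiency_percent(s["cpu_seconds"], s["wall_seconds"], s["cores"])
        if eff > threshold:
            flagged.append((s["submit_host"], round(eff, 2)))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# The same 8-core workload published once with cores=1 and once with cores=8.
summaries = [
    {"submit_host": "ce-a.example.org:8443/cream-pbs-mcore",
     "cpu_seconds": 56000, "wall_seconds": 10000, "cores": 1},
    {"submit_host": "ce-b.example.org:8443/cream-pbs-mcore",
     "cpu_seconds": 56000, "wall_seconds": 10000, "cores": 8},
]
for host, eff in suspect_ces(summaries):
    print(host, eff)
```

Only the first entry is flagged: with cores=1 the workload appears 560% efficient, whereas with the true core count it is 70%.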

11 Those Were My Issues. What are yours?

12 Timescales
Soon the dev portal will hold all historical APEL data.
– This will only contain cores from when sites started publishing them.
When this is done, the portal will use this data, including cores, for the production T1 and T2 reports and will show the current dev tree (EMI3) in parallel with the current production EGI views.
The portal is currently undergoing a major rewrite. When this is released next April it will use the data with cores only.
WLCG is encouraged to track the portal developments. There will be a prototype demo at the EGI meeting in Bari, with a webcast.

13 Summary

