LCG Issues from GDB. John Gordon, STFC. WLCG MB meeting, September 28th 2010.

Topics at September GDB
– OPN Monitoring
– APEL
– CERNVMFS
– Experiments' Operational Issues (Quarterly)
– Others

Monitoring
– Missing a central view of LHCOPN.
– HADES data exists (at DFN?).
– Prototype dashboard:
  a) Site status is up when OWD (one-way delay) is within +/-15% of baseline and packet loss is less than 0.1% per five minutes.
  b) Site status is down when packet loss = 100% per five minutes.
  c) Site status is degraded when measurement values are between a) and b).
(J. Shade, GDB LHCOPN Update, 08-SEP-2010)
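The three status rules above can be sketched as a small classifier. This is only an illustration of the thresholds quoted on the slide, not the dashboard's actual code: the function name, the argument names, the inclusive treatment of the 15% boundary, and the handling of a missing delay reading are all assumptions.

```python
from enum import Enum
from typing import Optional


class SiteStatus(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


def classify_site(owd_ms: Optional[float],
                  baseline_owd_ms: float,
                  packet_loss_pct: float) -> SiteStatus:
    """Classify one five-minute measurement window (hypothetical helper).

    owd_ms may be None when no packets arrived, so no delay was measured.
    baseline_owd_ms is assumed to be positive.
    """
    # Rule b): total packet loss over the window means the site is down.
    if packet_loss_pct >= 100.0:
        return SiteStatus.DOWN
    # Assumption: a window without a delay reading cannot qualify as up.
    if owd_ms is None:
        return SiteStatus.DEGRADED
    # Rule a): delay within +/-15% of baseline AND loss below 0.1%.
    deviation = abs(owd_ms - baseline_owd_ms) / baseline_owd_ms
    if deviation <= 0.15 and packet_loss_pct < 0.1:
        return SiteStatus.UP
    # Rule c): everything between a) and b).
    return SiteStatus.DEGRADED
```

For example, `classify_site(11.0, 10.0, 0.0)` falls under rule a), while the same delay with 5% packet loss falls under rule c).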

Prototype Dashboard (figure slides 4 and 5)

Monitoring
– DANTE baulked at the idea of developing their prototype further and supporting it, so SARA and CERN have picked up the gauntlet.
– A historical view was requested and is foreseen.
– Questions were raised about problem-solving procedures.

APEL
– Update on latest status.
– Version using ActiveMQ message passing has been in production since June.
  – New node type glite-apel replaces glite-MON.
  – Performant and reliable.
– Sites encouraged to migrate.
– Anticipate switching off central R-GMA registry at end of
– Requested WLCG input for EGI/EMI development plans.

CERNVMFS for Software Servers
– The stress on shared software servers has been an issue for experiment and site operations over the summer.
– PIC and RAL have tested CERNVMFS as a mechanism for distributing experiment software from CERN to worker nodes.
– CERNVMFS was developed in OpenLab and has been used to build virtual machine images on demand with experiment software.
– It uses squid caches to bring software to a site on demand and also caches on the WN, relieving pressure on site servers.
– Removes the need to run jobs to install software at the site.
– Only caches versions used at that site.
– Removes duplicate files between and within releases.
– Initial feedback is encouraging.
– Tests will be scaled up to a full site in cooperation with the experiments: ATLAS for now, but others are interested.
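One way to see how duplicate files between and within releases can disappear is content-addressed storage, where each file is stored once under the hash of its contents and every release only records which hash each path points to. The sketch below assumes that approach for illustration; `ContentStore`, its methods, and the release and file names are hypothetical and are not taken from the CERNVMFS code.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (illustrative only)."""

    def __init__(self):
        self.objects = {}   # content digest -> file bytes, stored once
        self.catalogs = {}  # release name -> {path: content digest}

    def publish(self, release, files):
        """Record a release given a mapping of path -> file bytes."""
        catalog = {}
        for path, data in files.items():
            digest = hashlib.sha1(data).hexdigest()
            # Identical content maps to the same digest, so it is kept once.
            self.objects.setdefault(digest, data)
            catalog[path] = digest
        self.catalogs[release] = catalog

    def read(self, release, path):
        """Resolve a path within a release to its stored bytes."""
        return self.objects[self.catalogs[release][path]]


store = ContentStore()
store.publish("release-1", {
    "lib/a.so": b"AAA",
    "lib/a_copy.so": b"AAA",   # duplicate within a release
    "lib/b.so": b"BBB",
})
store.publish("release-2", {
    "lib/a.so": b"AAA",        # unchanged between releases
    "lib/b.so": b"BBB-v2",
})
# Five paths across two releases, but only three distinct objects are stored.
```

A cache keyed on the same digests would likewise fetch each distinct file only once, whichever release asked for it.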

Experiment Operations Feedback
– ALICE were happy.
– ATLAS raised the issue of disk server reliability. What they measured was the number of incidents where a server was out for more than 24 hours. This is a combination of hardware/software reliability and the promptness of the site in restoring the service. There is scope for standardising responses across Tier1s.
  – Concerns about ASGC performance.
– CMS are interested in the CernVMFS work for their Tier3s.
  – Discussion around information publishing (related to L. Field's proposal on a WLCG Information Officer).
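The ATLAS disk server metric described above, the number of incidents where a server was out for more than 24 hours, can be sketched as a simple count over outage intervals. The function name and the sample timestamps are hypothetical; how ATLAS actually collected the data is not stated in the slides.

```python
from datetime import datetime, timedelta


def count_long_outages(outages, threshold=timedelta(hours=24)):
    """Count outages whose duration strictly exceeds the threshold.

    outages is an iterable of (start, end) datetime pairs.
    """
    return sum(1 for start, end in outages if end - start > threshold)


# Made-up example intervals for one disk server.
sample = [
    (datetime(2010, 8, 1, 0, 0), datetime(2010, 8, 1, 6, 0)),    # 6 hours
    (datetime(2010, 8, 3, 12, 0), datetime(2010, 8, 5, 9, 0)),   # 45 hours
    (datetime(2010, 8, 10, 0, 0), datetime(2010, 8, 11, 0, 0)),  # exactly 24 hours
]
long_incidents = count_long_outages(sample)
```

Because the interval runs from failure to restoration, the count reflects both how often hardware or software fails and how quickly the site restores service.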

Experiment Operations Feedback
– LHCb have problems with differing configurations at sites. They believe they can adapt their use if only they have enough information.
– One suggestion was a Site Card (cf. the VO Card) specifying enough information about the site to enable LHCb to automate optimisation of their use.
– Discussion in the meeting doubted whether this could be automated and suggested one-to-one discussion with the site as a better route.

gLite 3.1 Support
– Further work on retiring some gLite 3.1 services. gLite developers have proposed the end of life of some services. WLCG asked for comment.
  – EGI Operations will plan with NGIs and their sites, taking WLCG views on board.
– Potential gap in EMI support filled. Specific sites have agreed to continue middleware support of the batch systems required by WLCG. This covers support of CE Information Providers, blahd, and the APEL parser.

Misc.
– Gstat
  – Announced new WLCG gstat, to be checked by sites.
  – Gave Ian's timeline.
– glexec
  – New Condor release over the summer should address the concerns of ATLAS. ATLAS and CMS asked to run tests again with the latest Condor.

October GDB
– Feedback from the DAaMonstrators
  – What can they show now?
  – What will they deliver for the end of the year?
  – Review by panel early in new year.
– Security Incident response
– gLite 3.1 retiral
– Installed capacity
– glexec testing