1 GGUS summary (2 weeks)
VO User Team Alarm Total
ALICE 1 2
ATLAS 14 116 6 136
CMS 4
LHCb 20 22
Totals 137 9 166

2 Support-related events since last MB
We need WLCG shifters, alarmers, and management to give us meaningful values for the GGUS ‘Problem Type’ field, so that periodic reporting can better show weak areas in support.
There were 9 ALARM tickets since the last MB (2 weeks), 5 of which were real, all submitted by ATLAS. Details follow…
6/27/2018 WLCG MB Report WLCG Service Report

3 ATLAS ALARM->CERN-CNAF transfers
What time UTC | What happened
2010/10/05 9:13 | GGUS ALARM ticket opened, automatic notification to AND automatic assignment to ROC_Italy.
2010/10/05 10:23 | Site acknowledges ticket and finds a StoRM backend problem.
2010/10/05 12:03 | Service restored. Site puts the ticket to ‘solved’ and refers to GGUS:62745 for details.
2010/10/11 | Submitter ‘verifies’ ticket GGUS: Not sure how ‘symptomatic’ the solution was…

4 ATLAS ALARM->transfers to .fr cloud
What time UTC | What happened
2010/10/08 5:56 | GGUS ALARM ticket opened, automatic notification to AND automatic assignment to NGI_France.
2010/10/08 6:31 | Site acknowledges ticket and finds a network problem preventing all DB server access.
2010/10/08 7:29 | Service restored.
2010/10/08 10:41 | Site puts ticket to status ‘solved’.

5 ATLAS ALARM-> CERN slow LSF
What time UTC | What happened
2010/09/27 15:34 | GGUS ALARM ticket opened, automatic notification to AND automatic assignment to ROC_CERN.
2010/09/27 16:01 | Operator acknowledges ticket and contacts the expert.
2010/09/27 16:37 | Expert’s 1st diagnosis: too many queries.
2010/09/27 20:10 | Service manager kills a home-made robot, run by another experiment, that was launching >> bjob queries, and puts ticket to status ‘solved’.

6 ATLAS ALARM-> CERN slow AFS
What time UTC | What happened
2010/10/01 7:13 | GGUS ALARM ticket opened, automatic notification to AND automatic assignment to ROC_CERN.
2010/10/01 7:33 | Operator acknowledges ticket and contacts the expert.
2010/10/01 9:37 | IT Service manager re-classifies in CERN Remedy PRMS.
2010/10/11 15:33 | Still ‘in progress’. Reminder sent during this drill.

7 ATLAS ALARM-> CERN CASTOR
What time UTC | What happened
2010/10/01 16:24 | GGUS ALARM ticket opened, automatic notification to AND automatic assignment to ROC_CERN.
2010/10/01 16:41 | Operator acknowledges ticket and contacts the expert.
2010/10/01 16:42 | Expert starts investigation.
2010/10/01 17:23 | Solved. PutDONE in SRM not propagated to CASTOR. Done by hand.

