EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks LHCOPN Operations update Guillaume Cessieux.

Presentation transcript:

LHCOPN Operations update
Guillaume Cessieux (CNRS/IN2P3-CC, EGEE SA2)
LHCOPN meeting, , London
EGEE-III INFSO-RI, Enabling Grids for E-sciencE. EGEE and gLite are registered trademarks.

Background
- LHCOPN meeting, , Bologna
  - Outcomes
    - Tools and processes are OK
    - Only 25% of “significant” events seem to be reported within GGUS
    - Need to take lessons from the first production period to improve Ops
- Ops phoneconf,
  - Detailed reports about GGUS tickets requested (NL-T1)
- Ops WG6, , CERN

Agenda
- Ops model deployment status
- Tools
  - GGUS
  - Twiki
- Processes
  - KPIs
  - Improving LHCOPN Operations

Operation: Implementation status
Steps tracked per site: Trained; R/W access to the twiki verified; Access to the TTS verified; Started Ops production mode; Review of twiki
- CA-TRIUMF: Partial
- CH-CERN
- DE-KIT
- ES-PIC (twiki access issue)
- FR-CCIN2P
- IT-INFN-CNAF
- NDGF
- NL-T
- TW-ASGC
- UK-T1-RAL: Postponed
- US-FNAL-CMS: Started
- US-T1-BNL (twiki access issue)
Deployment completed: ~10 months

Twiki (1/2)
- Twiki review is really taking too long (started )

Twiki (2/2)
- Twiki review
  - Serialised
  - Thought to be important
  - Why is this not going ahead? Is the process not OK?
- Benefit of ongoing review
  - Clarified whether some links are in the LHCOPN or not
  - A lot of wrong content updated
    - IP prefixes, technical contacts

Around GGUS (1/2)
- Should we integrate/merge the LHCOPN helpdesk within the standard GGUS?
  (+) Consider networks like other resources (computing, storage...)
  (+) Maybe a better fit for reporting
  (+) Now the standard way to send enquiries to sites?
  (+) Maybe some central manpower could be gained
  (+) Regularly chasing pending tickets...
  (+) Less specific software and support from GGUS
  (-) We have something stable and working
  (-) Previously completely tailored for us
  (-) Stay clear of interference from the Grid world
- No strong preference from the GGUS team

Around GGUS (2/2)
- Items remaining on the todo list
  - Rejected: interface
  - Done: improvement in notification templates
  - Detailed reports: a CSV output of GGUS tickets will be available
- Support might be “slightly” reduced after EGEE
  - Better to have all major requests in the status list before

First KPIs
- We ended up with only two KPIs for the moment
  - KPIs will be computed before each LHCOPN Ops phoneconf and reviewed during it
    - Next Ops phoneconf is
- KPI-1: Number of events >= 1 hour per site, with the number of corresponding GGUS tickets
  - Main objective is to ensure we have at least tickets for “major” events
  - Only correlated KPIs were said to be really interesting
  - Currently based on CERN’s spectrum BGP monitoring data
    - Link level, not service level
    - How to account for the work of CH-CERN?
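As a rough illustration of how KPI-1 could be computed, the sketch below counts, per site, the events lasting at least one hour and how many of them have a matching GGUS ticket. The `Event` record, its field names and the way a ticket is matched to an event are assumptions made for this example; the slides do not describe the actual data format or tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Threshold taken from the KPI-1 definition: events of at least one hour.
MIN_DURATION = timedelta(hours=1)


@dataclass
class Event:
    """A monitored link event (hypothetical record layout)."""
    site: str
    duration: timedelta
    ticket_id: Optional[str] = None  # matched GGUS ticket, if any (assumed field)


def kpi1(events):
    """Return {site: (events >= 1h, events >= 1h that have a ticket)}."""
    counts = defaultdict(lambda: [0, 0])
    for ev in events:
        if ev.duration < MIN_DURATION:
            continue  # short events are outside the KPI
        counts[ev.site][0] += 1
        if ev.ticket_id is not None:
            counts[ev.site][1] += 1
    return {site: tuple(pair) for site, pair in counts.items()}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        Event("SITE-A", timedelta(hours=2), "ticket-1"),
        Event("SITE-A", timedelta(minutes=20)),
        Event("SITE-B", timedelta(hours=3)),
    ]
    for site, (n_events, n_tickets) in sorted(kpi1(sample).items()):
        print(f"{site}: {n_events} event(s) >= 1h, {n_tickets} with a ticket")
```

A site whose second number is lower than its first has “major” events without a ticket, which is what the KPI is meant to expose.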

KPI-1 over the last 6 months (figure)

KPI-2
- Ensuring backup tests are performed
  - KPI-2: Number of missing entries in the twiki table reporting backup test results (or demonstration of resiliency during a previous failure)
    - “Any expected resilience possibility should be demonstrated once a year”
    - Routing policies to be documented to know what is expected
- Results in 2009
  - No entry for CA-TRIUMF, ES-PIC, IT-INFN-CNAF, NDGF, TW-ASGC, (UK-T1-RAL) and US-T1-BNL
  - Missing entries for FR-CCIN2P3, NL-T1 and US-FNAL-CMS
  - Only CH-CERN and DE-KIT seem OK
  - Unclear which resiliency is expected for pure T1-T1 links
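KPI-2 amounts to a set difference between what is expected and what has been recorded. The sketch below assumes the twiki table can be read as a mapping from site to the backup paths already demonstrated, and that the documented routing policies give the expected paths per site; both mappings, the function name and the path labels are hypothetical.

```python
def kpi2(expected, recorded):
    """Count expected backup tests with no entry in the results table.

    expected: {site: set of backup paths that should be demonstrated yearly}
    recorded: {site: set of backup paths with a recorded test or real failover}
    Returns {site: number of missing entries}.
    """
    return {
        site: len(paths - recorded.get(site, set()))
        for site, paths in expected.items()
    }


if __name__ == "__main__":
    # Placeholder sites and paths, for illustration only.
    expected = {
        "SITE-A": {"primary-backup", "secondary-backup"},
        "SITE-B": {"primary-backup"},
    }
    recorded = {"SITE-A": {"primary-backup"}}
    print(kpi2(expected, recorded))
```

The computation only makes sense once the expected paths are written down, which is why the slide asks for routing policies to be documented first.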

Improving operations (1/3)
- KPIs clearly show there is room for improvement
  - Is it only a lack of effort around Ops?
- Feedback is that some actions appear not vital
  - “We should focus on events impacting the service delivered by the LHCOPN”
  - Issues which matter to users
- We miss a clear service definition
  - If a path remains up even though a link is down, is this an issue requiring strong attention?
  - LHCOPN SLD activity being carried out by WLCG / CERN
    - Formalise and shape operations around it
- We miss a service view (“Ping view”)
  - Ops requirements were pushed to the Monitoring WG
    - We need to see what we are missing or not

Improving operations (2/3)
- No one was willing to take responsibility for putting pressure on sites
  - Then the idea of a rotating representative was proposed
    - Existing people working around LHCOPN ops within sites
    - Acting on behalf of the community, not the site he/she is from
    - Light central coordination ensuring things are done
    - Rotating time: 1 week?
- Proposed duties
  - Chasing up pending GGUS tickets
    - Enforce closing of pending GGUS tickets
  - Attending the WLCG daily phoneconf once a week?
    - More only on request if really required
  - Ensuring tickets are opened for detected events >= 1h
  - Monthly review of the correlation with monitoring
- Can this realistically work?
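If the one-week rotation were adopted, the schedule itself is simple to generate. The sketch below is only an illustration of a round-robin rota; the function name, the placeholder names and the start date are invented for the example and nothing here reflects an agreed process.

```python
from datetime import date, timedelta
from itertools import cycle, islice


def weekly_rota(representatives, start, weeks):
    """Yield (week_start, representative) pairs, one per week, round-robin."""
    for i, rep in enumerate(islice(cycle(representatives), weeks)):
        yield start + timedelta(weeks=i), rep


if __name__ == "__main__":
    # Placeholder names and an arbitrary Monday as start date.
    people = ["rep-site-a", "rep-site-b", "rep-site-c"]
    for week_start, rep in weekly_rota(people, date(2010, 1, 4), 5):
        print(week_start.isoformat(), rep)
```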

Improving operations (3/3)
- Automatic opening of GGUS tickets from the monitoring system
  - Requirement: this needs to be reliable, accurate and wise
  - The interface with the monitoring needs to be clever enough
  - Nice to have, but to be implemented very carefully
  - Monitoring can only wisely open tickets after a certain amount of time (e.g. 1h, to know we have a significant event...)
  - Risk of some passivity from sites, only waiting for tickets to arrive
- The Ops WG is close to falling below critical mass
  - Volunteers welcome, particularly from sites
    - Meetings at CERN and EVO enabled
    - Even passive reviews of agenda/conclusions are welcome
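The “wait before opening” rule can be pictured as a small decision function. This is a sketch under assumptions: the parameter names and the idea that the monitoring already knows whether a ticket exists are hypothetical, and no real GGUS or monitoring interface is called.

```python
from datetime import datetime, timedelta

# Delay suggested on the slide before an event is considered significant.
OPEN_AFTER = timedelta(hours=1)


def should_open_ticket(event_start, now, has_ticket,
                       event_end=None, open_after=OPEN_AFTER):
    """Decide whether monitoring should open a ticket for an event.

    A ticket is opened only if none exists yet and the event has lasted
    (or lasted, if it is already over) at least `open_after`.
    """
    if has_ticket:
        return False  # the site already reported it; avoid duplicates
    elapsed = (event_end or now) - event_start
    return elapsed >= open_after


if __name__ == "__main__":
    start = datetime(2010, 1, 1, 12, 0)
    print(should_open_ticket(start, start + timedelta(minutes=30), False))
    print(should_open_ticket(start, start + timedelta(hours=1), False))
```

Keeping the check for an existing ticket first means sites that report promptly never receive an automatic duplicate, which limits the passivity concern only partly: a site could still choose to wait out the delay.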

Conclusion
- Deployment of the Ops model is completed
- LHCOPN operations are in a steady state
  - For the good... but also the bad
- Improvement process is heavily awaiting outcomes from two areas
  1. LHCOPN Service Level Definition - WLCG / CH-CERN
  2. Monitoring - LHCOPN Monitoring Working Group (CH-CERN)
- Current focus is on tightening operations around “major” network events

Pending questions
1. Review of twiki, what’s wrong?
2. Merging LHCOPN helpdesk within standard GGUS?
3. Rotating LHCOPN representative?
4. Automatic opening of GGUS tickets from monitoring system?