Using AI tools for IT-CS Spectrum-based monitoring
Véronique Lefébure, IT/CS-CE, February 2014
CERN IT Department, CH-1211 Genève 23, Switzerland, www.cern.ch/it

Presentation transcript:

Using AI tools for IT-CS Spectrum-based monitoring
Véronique Lefébure, IT/CS-CE, February 2014

Content
  SNOW tickets
  Monitoring data storage

Operator’s role today
  Checks Spectrum screen:

Operator
  Sees “Critical (red)” alarms
  Follows SNOW KB procedure
    – Possibly calls expert
    – And/or opens SNOW ticket: link to SNOW ticket form
    – for Firstline or for Wigner support:
      » 2 SNOW Record Producer forms
    – Copy and paste information
    – Types INC ID (and comments) into Spectrum alarm info:

Operator vs Netcom team

              During Working Hours                     Outside Working Hours
Netcom Team   Critical (red) alarms                    Major (orange) alarms
Operator      Critical (red) alarms after 10 minutes   Critical (red) alarms

Netcom follows alarm procedure link (Sharepoint)
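The Operator part of this split can be sketched as a small rule, assuming that "after 10 minutes" means the alarm has been active for at least 10 minutes. The `Alarm` class, the severity labels and the function name are hypothetical and not taken from Spectrum.

```python
from dataclasses import dataclass
from datetime import timedelta

# "after 10 minutes" from the slide; interpreted here as alarm age (assumption).
OPERATOR_DELAY = timedelta(minutes=10)


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm record, not a Spectrum data structure."""
    severity: str    # illustrative labels: "critical", "major", ...
    age: timedelta   # how long the alarm has been active


def operator_should_act(alarm: Alarm, working_hours: bool) -> bool:
    """Return True when the Operator is expected to handle the alarm."""
    if alarm.severity != "critical":
        # The Operator row only lists Critical (red) alarms.
        return False
    if working_hours:
        # During working hours the Operator steps in only after the delay.
        return alarm.age >= OPERATOR_DELAY
    # Outside working hours the Operator handles critical alarms directly.
    return True
```

For example, `operator_should_act(Alarm("critical", timedelta(minutes=3)), working_hours=True)` is `False`, while the same alarm outside working hours gives `True`.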

Creation of SNOW tickets with GNI
  Avoids “copy-and-paste”
  Useful both for Operator and Netcom team
  Easier follow-up on alarms when there are a lot of them
  Correlation between alarms and SSB interventions or incidents
  GNI dashboard: correlation between Network alarms and other alarms
  Need: INC ID back from GNI (and not EVT ID)
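A minimal sketch of what replacing the copy-and-paste step could look like: the ticket text is derived from the alarm, and only an INC ID is accepted back into the alarm info. Everything here is illustrative. The `SpectrumAlarm` fields, the dictionary keys and the "INC" prefix check are assumptions, and no real Spectrum, GNI or SNOW interface is called.

```python
from dataclasses import dataclass, field


@dataclass
class SpectrumAlarm:
    """Hypothetical view of an alarm; field names are assumptions."""
    alarm_id: str
    device: str
    severity: str
    cause: str
    notes: list[str] = field(default_factory=list)  # stands in for "alarm info"


def build_ticket_fields(alarm: SpectrumAlarm, support_group: str) -> dict[str, str]:
    """Derive the ticket text from the alarm instead of copying it by hand."""
    return {
        "summary": f"[{alarm.severity.upper()}] {alarm.device}: {alarm.cause}",
        "details": f"Alarm {alarm.alarm_id} on {alarm.device} ({alarm.severity})",
        "support_group": support_group,  # e.g. Firstline or Wigner support
    }


def record_incident_id(alarm: SpectrumAlarm, ticket_id: str, comment: str = "") -> None:
    """Store the ticket ID (and optional comment) in the alarm info.

    Assumes incident IDs start with "INC"; an EVT ID is rejected because
    the slide asks for the INC ID back.
    """
    if not ticket_id.startswith("INC"):
        raise ValueError(f"expected an INC ID, got {ticket_id!r}")
    alarm.notes.append(f"{ticket_id} {comment}".strip())
```

With this shape, the same alarm object carries the ticket reference afterwards, so follow-up does not depend on what the operator typed.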

Data storage
  CS uses Spectrum for storage of
    – SNMP Events and alarms for last days (limited by Spectrum MySQL DB size)
    – Service Outages
  CS has home-made storage system for
    – Alarm long-term history
    – Part of statistics (in RRD files)
    – SYSLOG data
  CS provides info to SLS
  CS lacks
    – Storage for the rest of statistics
    – Correlation engine between SNMP and SYSLOG data (for vendors with no syslog trap support)
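The missing correlation engine could, in its simplest form, pair each SNMP event with the syslog messages from the same device that fall within a short time window. The sketch below assumes exactly that. The `Event` record, the `correlate` function and the 60-second default window are hypothetical choices, not part of the existing CS tools.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical common record for an SNMP event or a syslog message."""
    device: str
    timestamp: datetime
    text: str


def correlate(
    snmp_events: list[Event],
    syslog_messages: list[Event],
    window: timedelta = timedelta(seconds=60),  # assumed window size
) -> list[tuple[Event, list[Event]]]:
    """Pair each SNMP event with same-device syslog messages within the window."""
    # Index syslog messages by device so each SNMP event scans only its own device.
    by_device: dict[str, list[Event]] = {}
    for message in syslog_messages:
        by_device.setdefault(message.device, []).append(message)

    pairs = []
    for event in snmp_events:
        matches = [
            message
            for message in by_device.get(event.device, [])
            if abs(message.timestamp - event.timestamp) <= window
        ]
        pairs.append((event, matches))
    return pairs
```

A real engine would also need to handle clock skew between devices and much larger data volumes, which this sketch ignores.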