IT Monitoring WG: IT/CS Monitoring System
Virginie Longo, September 14th 2011
CERN IT Department, CH-1211 Genève 23, Switzerland, www.cern.ch/it

Presentation transcript:

Summary
- CS Monitoring Systems: Spectrum CA, Performance Analysis, Other Tools
- Data storage
- Requirements: NMS Status, Requirements, Research

CS Monitoring Systems

Spectrum CA
Description:
- Commercial tool
- Fault-management-oriented system
- Root cause analysis / alarm correlation
- Topology view
- Service Manager => relation with the SLS view
- Basic performance manager
Volumes:
- ~3000 devices monitored
- Supports 3K Laser devices for simple alarms (UP/DOWN)
- Thousands of attributes polled and analyzed
- 6 GB of event data over 30 days
Monitoring protocols: SNMP and ICMP
- Information is fed only by SNMP (no remote agent)
- A few other protocols are supported: DNS / DHCP / TRACEROUTE / NTP / HTTP
- A few home-made scripts for DHCP and web monitoring
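The simple UP/DOWN alarms mentioned above can be illustrated with a small state tracker. This is only a sketch of the general idea, not Spectrum's actual logic: the `DeviceState` class, the `down_after` threshold of three failed polls, and the device name are all hypothetical, and the poll result is passed in as a boolean rather than obtained from a real ICMP or SNMP request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceState:
    """Tracks the reachability of one device across polling cycles."""
    name: str
    down_after: int = 3   # hypothetical: consecutive failed polls before DOWN
    failures: int = 0
    status: str = "UP"

    def record_poll(self, reachable: bool) -> Optional[str]:
        """Record one poll result; return a message only on a state change."""
        if reachable:
            self.failures = 0
            if self.status == "DOWN":
                self.status = "UP"
                return f"CLEAR {self.name} is UP"
            return None
        self.failures += 1
        if self.status == "UP" and self.failures >= self.down_after:
            self.status = "DOWN"
            return f"ALARM {self.name} is DOWN"
        return None


if __name__ == "__main__":
    switch = DeviceState("switch-01")  # hypothetical device name
    for ok in [True, False, False, False, False, True]:
        event = switch.record_poll(ok)
        if event:
            print(event)
```

Requiring several consecutive failures before raising the alarm is one common way to avoid alarms caused by a single lost packet; the right threshold depends on the polling interval.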

Alarm Monitoring: Spectrum Architecture (storage system)
[Diagram; labels recovered from the slide:]
- Spectrum DB: models, topology, current polling values, alarms
- Oracle: Stats (CSR)
- Oracle: Alarm History (LANDB)
- MySQL: Events
- Remote MySQL: Service Manager
- SLS
- Devices Info
- Legend: Spectrum system / non-Spectrum system

Performance Analysis
Statistics architecture:
- Mix of a home-made system and the Spectrum tool
- Data is extracted from Spectrum into an Oracle DB
- Data is consolidated into RRDs
- Displayed on the Netstat website (PHP)
Volumes:
- ~9000 models (ports + devices) for 24K RRDs
- 36 metric attributes
- ~160K entries loaded into the Oracle DB per 5-minute poll
- Data kept for 1 month in Oracle
- 2 years of consolidated data in RRDs
Note: a metric is a group of attributes, e.g. Bandwidth = in/out bits and in/out packets.
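The volume figures above imply a large number of raw rows in Oracle. A back-of-the-envelope sketch, assuming a constant ~160K entries per poll, one poll exactly every 5 minutes, and a 30-day month (the slide gives only the approximate figures; the function names are illustrative):

```python
# Rough estimate of the raw rows held in the Oracle statistics store.
# Assumptions (not stated on the slide): constant load, 30-day month.
ENTRIES_PER_POLL = 160_000   # ~160K entries per poll (slide figure)
POLL_INTERVAL_MIN = 5        # one poll every 5 minutes (slide figure)
RETENTION_DAYS = 30          # "1 month", taken here as 30 days


def polls_per_day(interval_min: int) -> int:
    """Number of polling cycles in 24 hours."""
    return 24 * 60 // interval_min


def retained_rows(entries: int, interval_min: int, days: int) -> int:
    """Rows kept if every poll writes `entries` rows for `days` days."""
    return entries * polls_per_day(interval_min) * days


if __name__ == "__main__":
    print(polls_per_day(POLL_INTERVAL_MIN))  # 288
    print(retained_rows(ENTRIES_PER_POLL, POLL_INTERVAL_MIN, RETENTION_DAYS))
    # 1382400000, i.e. roughly 1.4 billion rows under these assumptions
```

Under these assumptions the one-month window holds on the order of a billion rows, which helps explain why long-term history is kept only in consolidated form in the RRDs.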

Performance Analysis

Other Tools
Syslog event recording:
- Gathers all logs from network devices
- Stored in an Oracle DB
- Accessible from CSDB
- Filtering and propagation by notification
LHCOPN: perfSONAR tool
- Decentralized network tool
- Regular OWD (one-way delay), latency and throughput tests
- Other tools such as traceroute
- LHCOPN network analysis
Implementation is ongoing: testing phase with a 1GB link; security tests are not complete yet.

Data storage

Data Storage
Summary:
- Spectrum proprietary DBs for core and alarms
- MySQL database for events and Service Manager
- Oracle database for stats (CSR) and alarm history (LANDB)
- Oracle database for Syslog info
- Standalone MySQL database for the perfSONAR tools
Issues:
- Too many different types of storage
- Missing correlation between Syslog and SNMP
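To make the missing Syslog/SNMP correlation concrete, here is a minimal sketch of what such a correlation could look like once both event streams are available in one place. Everything in it is an assumption for illustration: the `Event` record, the `correlate` function, the 60-second window, and the sample device names and messages are hypothetical and do not describe any existing CERN component.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    source: str       # "syslog" or "snmp"
    device: str       # device the event refers to
    timestamp: float  # seconds since the epoch
    message: str


def correlate(syslog: List[Event], snmp: List[Event],
              window_s: float = 60.0) -> List[Tuple[Event, Event]]:
    """Pair each SNMP event with syslog events from the same device
    whose timestamps lie within `window_s` seconds of it."""
    by_device: Dict[str, List[Event]] = {}
    for ev in syslog:
        by_device.setdefault(ev.device, []).append(ev)
    pairs = []
    for alarm in snmp:
        for log in by_device.get(alarm.device, []):
            if abs(log.timestamp - alarm.timestamp) <= window_s:
                pairs.append((alarm, log))
    return pairs


if __name__ == "__main__":
    logs = [
        Event("syslog", "router-a", 1000.0, "link flap on port 3"),
        Event("syslog", "router-b", 1010.0, "config saved"),
    ]
    alarms = [Event("snmp", "router-a", 1030.0, "ifOperStatus down")]
    for alarm, log in correlate(logs, alarms):
        print(f"{alarm.device}: '{alarm.message}' <-> '{log.message}'")
```

Matching on device name plus a time window is the simplest possible rule; it only works if both stores use the same device identifiers and comparable clocks, which is part of the argument for fewer storage back ends.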

Requirements

NMS Status
Advantages:
- Efficient root cause analysis
- Correct event/alarm management
- High availability
- Really good topology views (useful for the intervention group)
- Supports NICE users
- Very good level of filtering (topology, alarms)
- Notification support
Negative points / weaknesses:
- Expensive
- The polling limit is almost reached (a new version with a complete redesign of the polling system will arrive in 2 years)
- Not a performance system: cannot handle 50K statistics
- Integrating devices from non-certified manufacturers is complex
- Data collection is mostly limited to SNMP (changes ongoing)

Requirements
Mandatory:
- Root cause analysis
- High-frequency polling: 1-2 min for critical nodes, 3-5 min for others
- Network topology representation
- Notifications (SMS / mail / XMPP) and a general console
- Distributed environment
- High-availability system
- Complete performance management
- IPv6 support
Nice to have:
- Autodiscovery system
- Mobile version
- Centralized Oracle database
Numbers and storage time:
- Polling capacity for at least 5K nodes
- Performance statistics for 56K ports
- Data lifetime: 1 month without aggregation, maximum with aggregation
- Device alarms: around 2 years
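The polling requirements above translate into a sustained poll rate. The sketch below estimates it for 5K nodes using the shortest intervals in each range (1 min and 3 min). The share of critical nodes is not given on the slide, so the 10% used here is purely an assumption, as is the function name.

```python
def polls_per_second(nodes: int, critical_fraction: float,
                     critical_interval_s: float,
                     other_interval_s: float) -> float:
    """Average node polls per second, counting one poll per node
    per interval (per-attribute requests are not counted)."""
    critical = round(nodes * critical_fraction)
    others = nodes - critical
    return critical / critical_interval_s + others / other_interval_s


if __name__ == "__main__":
    # 5K nodes (slide figure); 10% critical is a hypothetical split.
    rate = polls_per_second(5000, 0.10, 60, 180)
    print(f"{rate:.1f} node polls per second")  # 33.3
```

This counts a single poll per node. Since each node exposes many attributes (the slides mention thousands polled today and statistics for 56K ports), the real request rate would be considerably higher.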

Research
Tools that fit best:
- Icinga: Nagios-like (fork); not yet tested (NYT)
- Zabbix: large polling scale, open source, notifications, Oracle database, distributed (NYT)
- SolarWinds: commercial, but includes performance management and is less expensive (NYT)
- OpenNMS:
  - Open source, completely customizable
  - High-frequency polling in a distributed environment
  - Event correlation, alarm management, notifications
  - Supports many data collection methods (SNMP, HTML, JMX, JDBC, NAGIOS-NSCLIENT)

Thanks. Questions?