Campus Monitoring Service


1 Campus Monitoring Service
KNOW What IS Working and What is NOT!

2 Eric Frahm, Technology Services
From internal tool to campus-wide service
Who can use the service?
● Any campus IT professional or department
The Campus Monitoring Service provides:
● Friendly UI, low learning curve
● Many canned reports, graphs, dashboards, and views
● True ‘out of band’ notification
● Flexible data input and output
● Extreme monitoring flexibility
Originally conceived as an internal tool at ‘CITES’, this service was implemented as a campus-wide service to meet the status-sharing needs of Technology Services and to provide a turnkey solution for the monitoring needs of small departments.
● Wizards for simple and well-known monitoring needs
● Canned reports, easy dashboards, and customizable views
● True out-of-band notifications with SMS modems and independent battery backup
● Multiple client, agent, and API options for feeding data in or pulling it out
● If you can script it, we can turn it into a Nagios probe.
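To make "if you can script it, we can turn it into a Nagios probe" concrete, here is a minimal sketch of a probe in Python. It relies only on the standard Nagios plugin convention: one status line on stdout and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The disk-usage check, the function names, and the 80/90 percent thresholds are illustrative assumptions, not part of the campus service.

```python
#!/usr/bin/env python3
"""Minimal Nagios-style probe: report disk usage for a path (illustrative sketch)."""
import shutil
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to a Nagios state (thresholds are assumptions)."""
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    percent = 100.0 * usage.used / usage.total
    state = classify(percent, warn, crit)
    # Text after '|' is performance data: label=value;warn;crit
    line = (f"DISK {LABELS[state]} - {percent:.1f}% used on {path} "
            f"| used_pct={percent:.1f}%;{warn};{crit}")
    return state, line


def main(argv):
    """Print the status line and return the exit code for the given arguments."""
    code, line = check_disk(argv[0] if argv else "/")
    print(line)
    return code

# As a plugin entry point, finish the script with: sys.exit(main(sys.argv[1:]))
```

Nagios reads the exit code to set the service state and shows the printed line in the UI, so any script that follows this pattern can be scheduled as a check.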

3 Service Infrastructure
Nagios XI on CentOS 6.9
● Commercially supported
● Proprietary UI for ease of use
● Built on Nagios Core 4.2.4
Redundant physical servers
● Reduced dependencies for datacenter
● Datacenter redundancy
● Out of band SMS
Nagios XI: Based on the open-source Nagios Core platform, Nagios XI provides a commercially supported installation and interface. Its wizards and reporting options are easier to use, which removes the learning curve, accelerates the implementation of basic monitoring, and provides vendor support for the underlying infrastructure. This service retains the basic ability of Nagios Core to create a probe for anything you can script, and decades of community contributions offer many ready-made solutions.
Redundant physical servers: Initially deployed as two separate physical servers at two separate datacenters, with physically redundant out-of-band SMS notification, this service is intended for highly available monitoring and notification. If the internet burns down, we can still page you at 1am!

4 The Current Service
● 1889 hosts
● 7956 services
● 176 users
● 37 Technology Services Teams
● 3 Distributed IT Units
● 7 Campus Departments

5 Jason Colwell, Library IT
● 116 servers: CPU, Memory, Swap, Disk Usage, Puppet, Ping
● 36 websites: DNS IP Match, DNS Resolution, HTTP, Ping, SSL cert status, Web page content
● Over 1400 service checks
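As an illustration of what an "SSL cert status" check can look like, the sketch below computes the days remaining on a certificate and maps that to a Nagios state. It uses only Python's standard `ssl` and `socket` modules; the function names and the 30-day and 7-day thresholds are assumptions for the example, not the Library's actual configuration.

```python
import socket
import ssl
import time


def days_remaining(not_after, now=None):
    """Days until a certificate's notAfter time, e.g. 'Jan  5 09:34:43 2030 GMT'."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def cert_state(days, warn_days=30, crit_days=7):
    """Map days remaining to a Nagios state code (thresholds are assumptions)."""
    if days <= crit_days:
        return 2  # CRITICAL
    if days <= warn_days:
        return 1  # WARNING
    return 0  # OK


def fetch_not_after(hostname, port=443, timeout=10):
    """Open a TLS connection and return the peer certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A probe would call `fetch_not_after("www.example.edu")` (a placeholder hostname), pass the result through `days_remaining` and `cert_state`, then print a status line and exit with the returned code.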

6 Nagios Cross-Platform Agent
● Used for Linux and Windows servers
● Standardize as much as possible
● Web interface is useful
● Installation with SCCM and Puppet
● Apply community secret or token
● Firewall rules
● Enable as service

7 Lessons Learned
● Be careful when cloning hosts and services
● Verify that all of your hosts and service checks are in place
● Adjust the defaults for the NCPA wizard if needed, and keep tuning alert thresholds over time

8 Rob Murdock, Technology Services
Nagios is a lot like Tinker Toys: you can do just about anything with it

9 Nagios XI (data destination)
[Diagram: h1, h2, h3, XI]
Campus Monitoring runs on Nagios XI: hosts (Linux, Windows, etc.) and services are monitored...
● Active (e.g., NRPE)
● Passive (e.g., NRDP)
● SNMP, NSCA, etc.

10 Nagios or ? (data sources)
[Diagram: h1, h2, h3, Core]
Colleges, units, and departments may already be using Core: essentially the same design pattern as XI...

11 Federated Monitoring
[Diagram: Core, XI, h2, h3]
● NRDP: Nagios Remote Data Processor
● NRPE: Nagios Remote Plugin Executor
● NSCA: Nagios Service Check Acceptor
Why not hook them together? Have Core feed XI in certain situations... Notifications? Reporting?
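One way a Core instance (or any script) could feed XI is by posting passive check results to NRDP. The sketch below builds such a request body. The form fields (`token`, `cmd=submitcheck`, `XMLDATA`) and the `<checkresults>` layout follow the NRDP convention as commonly documented, but treat them as assumptions and verify them against the NRDP version you run; the helper names, host, and service values are hypothetical.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape


def build_nrdp_payload(token, hostname, servicename, state, output):
    """Build the form-encoded body for an NRDP 'submitcheck' request (assumed format)."""
    xml = (
        "<?xml version='1.0'?>"
        "<checkresults>"
        "<checkresult type='service'>"
        f"<hostname>{escape(hostname)}</hostname>"
        f"<servicename>{escape(servicename)}</servicename>"
        f"<state>{int(state)}</state>"  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
        f"<output>{escape(output)}</output>"
        "</checkresult>"
        "</checkresults>"
    )
    fields = {"token": token, "cmd": "submitcheck", "XMLDATA": xml}
    return urllib.parse.urlencode(fields).encode("utf-8")


def submit(nrdp_url, payload, timeout=10):
    """POST the payload to an NRDP endpoint and return the raw response text."""
    with urllib.request.urlopen(nrdp_url, data=payload, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With this shape, the sending side keeps running its own checks and only forwards results, so XI can handle notifications and reporting without polling those hosts directly.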

12 Why ‘Growing Strong’ Eric Frahm, Technology Services
● Nagios may seem like a ‘Rich Soil’ topic
● Nagios is best known for infrastructure
BUT … Monitoring can be about much more than pinging routers and hosts.

13 Monitoring as a Development Consideration
● Monitoring should be a development consideration
● Developers already use automated test cases
● Those SAME test cases can be monitoring probes
● Production metrics feed back into development
● You never want your NEW website to be slower than your OLD website!!
As we design and deploy infrastructure, software, and services, we should consider how we will monitor them. Developers are used to designing test cases and implementing those tests in automated ways. Those SAME test cases can be repurposed for testing services and applications in production. Monitoring ongoing production services provides baseline response and performance metrics, which inform development priorities and criteria.
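The idea that the same test cases can double as monitoring probes can be sketched as a small wrapper: run an existing test function, time it, and translate the outcome into a Nagios state. The wrapper name `run_as_probe` and the 2-second warning threshold are hypothetical; only the 0/1/2/3 exit-code convention comes from Nagios itself.

```python
import time


def run_as_probe(test_case, warn_seconds=2.0):
    """Run a zero-argument test callable and return (exit_code, status_line)."""
    start = time.perf_counter()
    try:
        test_case()
    except AssertionError as exc:
        # The test ran and its assertion failed: the service is misbehaving.
        elapsed = time.perf_counter() - start
        return 2, f"CRITICAL - test failed: {exc} | time={elapsed:.3f}s"
    except Exception as exc:
        # The test itself could not run properly: state is unknown.
        return 3, f"UNKNOWN - test errored: {exc!r}"
    elapsed = time.perf_counter() - start
    perfdata = f"time={elapsed:.3f}s;{warn_seconds}"
    if elapsed > warn_seconds:
        return 1, f"WARNING - passed but took {elapsed:.3f}s | {perfdata}"
    return 0, f"OK - passed in {elapsed:.3f}s | {perfdata}"
```

The timing recorded in the performance data is what gives the baseline for comparing a new release against the old one.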

14 Tools already implemented
● Selenium web scripting
● Splunk-to-Nagios
● REST API
Selenium web scripting: We have installed a Selenium RC (standalone) server on a Windows workstation for running Selenium scripts against websites or web applications.
Splunk-to-Nagios: We are currently working with the Splunk team to implement an alert method in Splunk that allows parsed results and alerts (error events) to be sent from Splunk into Nagios.

15 In Summary …
● The service is running
● The service is being used
● We want to help!

19 Thank you! Questions? Campus Monitoring, Jason Colwell, Eric Frahm, Rob Murdock, Campus Monitoring Nagios

