Queensland University of Technology
Nagios – an Open Source monitoring solution and its deployment at QUT

Outline
About QUT
About Nagios
What we use Nagios for
Nitty-gritty
Questions?

Introduction
Queensland University of Technology
Spans four campuses, five geographical locations
Student body of approximately 30,000
Who am I, who is Teaching And Learning Support Services

Introduction
Student Support and Systems
We support the Central Labs and Teaching And Learning Support Services staff

About Nagios
What is Nagios?
–Host and service based network monitor
–Designed to run under Linux, works under most UNIX variants
–Open Source Software

Why did we choose Nagios?
Urgent need for a monitoring solution
Cost (Central Services use BMC Patrol)
–Microsoft MOM ($225/CPU monitored)
–Configuration, maintenance
Monitor many disparate computers rather than a few central servers

What we monitor at QUT
Multimedia Equipped Lecture Theatre computers (82 across four campuses)
Lab servers (eight across four campuses)
Staff servers
Central Services

Proactive application
Lecture theatre data share
–Able to see problems before Academic staff try to use it
–Audio Visual staff better able to meet the 10 minute response time for lecture theatre computers
Lecture theatre drive space available

Proactive application
Lab imaging process
–Imaging service available on Lab servers
Staff storage space
–Continued file storage, web proxy, authentication

Nitty-Gritty – Host definitions
Types of devices Nagios can monitor
–Servers
–Switches
–Routers
–Most kinds of network devices
Hosts can be grouped together

Host definitions
Holds configuration information
Can hold a ‘parent’ directive
Gives a relational view of hosts
Creates the logic that then distinguishes between DOWN and UNREACHABLE hosts

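The DOWN versus UNREACHABLE distinction can be illustrated with a small Python sketch. This is not Nagios code: the function name `classify_hosts`, the host names, and the data layout are assumptions made for illustration, and the parent map is assumed to contain no cycles. A failing host is treated as DOWN when the monitor can still reach at least one of its parents, and UNREACHABLE when every parent is itself failing.

```python
from typing import Dict, List, Set


def classify_hosts(parents: Dict[str, List[str]], failing: Set[str]) -> Dict[str, str]:
    """Label every host in `parents` as UP, DOWN or UNREACHABLE.

    `parents` maps each host to the hosts that sit between it and the
    monitoring server (an empty list means it is directly attached).
    `failing` is the set of hosts whose own check did not succeed.
    """
    status: Dict[str, str] = {}

    def resolve(host: str) -> str:
        if host in status:
            return status[host]
        if host not in failing:
            status[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            # With no parents, or with at least one parent still UP, the
            # path to the host is intact, so the host itself is DOWN.
            if not host_parents or any(resolve(p) == "UP" for p in host_parents):
                status[host] = "DOWN"
            else:
                # Every route to the host is blocked by a failed parent.
                status[host] = "UNREACHABLE"
        return status[host]

    for host in parents:
        resolve(host)
    return status


# Hypothetical topology: monitor -> core-switch -> lab-switch -> lab-server
example = classify_hosts(
    {"core-switch": [], "lab-switch": ["core-switch"], "lab-server": ["lab-switch"]},
    failing={"lab-switch", "lab-server"},
)
```

In this example only `lab-switch` is reported as DOWN; `lab-server` is UNREACHABLE because its only parent has failed, which points staff at the switch rather than the server.
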
Host definitions
Hosts are not checked until a service fails!

Contact definitions
Contains contact addresses
–Pager/SMS
–
–Winpopup/IM
Contains notification methods
Virtual contacts

Contact definitions
Can be grouped into contactgroups
Then applied to services or hosts/hostgroups

Service definitions
The core of the Nagios process
Does the work by running plugins via command definitions
Active vs Passive

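A plugin is simply a program that prints one line of status text and reports its result through its exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The Python sketch below shows what a minimal disk-usage plugin could look like. The function names, the default path, and the 80%/90% thresholds are illustrative assumptions, not the checks used at QUT.

```python
import shutil

# Exit codes Nagios expects from a plugin.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_usage(used_pct, warn_pct=80.0, crit_pct=90.0):
    """Map a percentage of disk used to (exit code, status line)."""
    if used_pct >= crit_pct:
        return CRITICAL, f"DISK CRITICAL - {used_pct:.1f}% used"
    if used_pct >= warn_pct:
        return WARNING, f"DISK WARNING - {used_pct:.1f}% used"
    return OK, f"DISK OK - {used_pct:.1f}% used"


def check_disk(path, warn_pct=80.0, crit_pct=90.0):
    """Check the filesystem holding `path`; UNKNOWN if it cannot be read."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    return classify_usage(100.0 * usage.used / usage.total, warn_pct, crit_pct)


def main(argv):
    """Print the status line and return the exit code for Nagios."""
    path = argv[1] if len(argv) > 1 else "/"
    code, message = check_disk(path)
    print(message)
    return code

# To use this as a plugin, finish the script with:
#     import sys; sys.exit(main(sys.argv))
```

A command definition would then point at this script, and a service definition would reference that command for each host to be checked.
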
Notifications
Sent on a state change occurring
–Host change: DOWN, UNREACHABLE, RECOVERY (UP)
–Service change: WARNING, UNKNOWN, CRITICAL, RECOVERY (OK)
Can be escalated to alternative and inclusive contacts

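The notification and escalation behaviour can be sketched in Python as follows. This is a simplified model, not Nagios source: it ignores soft/hard states and re-notification intervals, and the function names, dictionary keys and contact names are assumptions. It reflects the way a matching escalation replaces the default contacts, which is why an escalation usually lists the original contacts again when it is meant to be inclusive.

```python
from typing import Dict, Iterable, List, Optional, Set


def notification_for(previous: str, current: str) -> Optional[str]:
    """Return the notification type for a service state transition."""
    if current == previous:
        return None  # no state change, so nothing new to announce
    if current == "OK":
        return "RECOVERY"
    return "PROBLEM"


def recipients(number: int, default_contacts: Iterable[str],
               escalations: List[Dict]) -> Set[str]:
    """Pick who receives the Nth notification for an ongoing problem.

    Each escalation is a dict with 'first', 'last' and 'contacts' keys;
    a 'last' value of 0 means the escalation never expires.
    """
    matched = [
        e for e in escalations
        if e["first"] <= number and (e["last"] == 0 or number <= e["last"])
    ]
    if not matched:
        return set(default_contacts)
    # Matching escalations take over from the default contacts.
    chosen: Set[str] = set()
    for e in matched:
        chosen |= set(e["contacts"])
    return chosen


# Hypothetical setup: managers are added from the third notification on.
escalation_rules = [{"first": 3, "last": 0, "contacts": ["av-staff", "av-manager"]}]
early = recipients(1, ["av-staff"], escalation_rules)
late = recipients(4, ["av-staff"], escalation_rules)
```

Here `early` contains only `av-staff`, while `late` contains both `av-staff` and `av-manager`.
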
Advanced Features
Dependencies – host and service
Event Handlers
Redundant and Distributed setups
Flap detection

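Flap detection can be pictured as measuring how often a service changed state over its recent check history and comparing that figure with a high and a low threshold. The Python sketch below is a simplified, unweighted version of the idea (Nagios itself gives more weight to recent changes); the function names and the 20%/50% thresholds are illustrative assumptions.

```python
from typing import List


def percent_state_change(history: List[str]) -> float:
    """Percentage of consecutive check pairs whose state differs."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(history, history[1:]) if a != b)
    return 100.0 * changes / (len(history) - 1)


def update_flapping(is_flapping: bool, pct: float,
                    low: float = 20.0, high: float = 50.0) -> bool:
    """Apply hysteresis: start above `high`, stop only below `low`."""
    if not is_flapping and pct > high:
        return True
    if is_flapping and pct < low:
        return False
    return is_flapping


# A service bouncing between OK and CRITICAL on every check.
bouncing = ["OK", "CRITICAL", "OK", "CRITICAL", "OK"]
flapping = update_flapping(False, percent_state_change(bouncing))
```

Using two thresholds keeps a service from toggling in and out of the flapping state when its rate of change hovers around a single cut-off.
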
Reporting
Has some built-in reporting
Generates graphs on host/service trends, availability, alert histogram, history and summary

Capabilities
Network services
–SNMP, POP, IMAP, Exchange, SQL, Oracle
Host Resources/Metrics
–Temperature, CPU, memory, disk usage, MS Performance counters, SNMP

Limitations
Restricted by your network structure
Standard interface not very user friendly
–Can add links to hosts and services that a person or people are responsible for

References
http://nagios.square-box.com
user:password guest:guest

Questions?
Greg Vickers, QUT –