
10 Quick Steps To Disaster Mike Weber

1 10 Quick Steps To Disaster Mike Weber mweber@spidertools.com

2 2011 Nagios World Conference: Inheriting Aberrations with Objects

3 Where are those settings coming from?
* Object Inheritance
* Object Priorities
* Object Chaining
* Incomplete Objects
* Canceling Inheritance
* Additive Inheritance

4 Object Inheritance

5 Object Inheritance: Templates

6 Object Inheritance: No Hostgroups?

7 Object Inheritance: From Hostgroup

8 Object Inheritance: Info Option

9 Object Priorities: Local then Inheritance

10 Object Priorities: Order in List (Chaining)

11 Incomplete Object: Only Lists One Image

12 Canceling Inheritance: Object Contains Parents

13 Canceling Inheritance: Wrong Parents

14 Canceling Inheritance: Cancel Parents

15 Canceling Inheritance: Canceled Parents

16 Additive Inheritance: Append Object Contents

17 Additive Inheritance: Append Object Contents

18 Hoping BAD Things Won't Happen

19 Real BAD Things Will Happen
* Backups
* Updates
* Dependencies

20 XI: Automated Backup
/etc/cron.d/nagiosxi
0 7 * * * root /root/scripts/automysqlbackup
0 8 * * * root /root/scripts/autopostgresqlbackup
/store/backups/mysql: daily weekly monthly
/store/backups/postgresql: daily weekly monthly
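A cron entry only proves that a backup was scheduled, not that one was written. The following is a minimal sketch of a freshness check, not part of the original slides. The function name backup_is_fresh is hypothetical, the daily subdirectory is an assumption read from the directory listing on this slide, and find's -mmin option is a GNU/BSD extension rather than POSIX.

```shell
# Hypothetical helper: succeed if DIR holds at least one regular file
# modified within the last MAX_AGE_MIN minutes (default 1440 = one day).
# Returns 2 if DIR does not exist.
backup_is_fresh() {
    bif_dir=$1
    bif_max_age_min=${2:-1440}
    [ -d "$bif_dir" ] || return 2
    # -mmin -N matches files modified less than N minutes ago (GNU/BSD find)
    bif_recent=$(find "$bif_dir" -type f -mmin "-$bif_max_age_min" | head -n 1)
    [ -n "$bif_recent" ]
}
```

Possible use, assuming the daily dumps land in a "daily" subdirectory: backup_is_fresh /store/backups/mysql/daily || echo "no MySQL backup in the last day"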

21 XI: Upgrade Backup
#!/bin/bash
##### Backup of Nagios Before Upgrade #####
# Timestamp the backups
TIMESTAMP=$(date +%Y%m%d_%H%M)
echo "$TIMESTAMP"
# Stop the services so files and databases do not change during the backup
service nagiosxi stop
service npcd stop
service ndo2db stop
service nagios stop
# -p also creates /bk if it does not exist yet
mkdir -p "/bk/upgrade_$TIMESTAMP"
# Archive the program directories
tar cjf "/bk/upgrade_$TIMESTAMP/nagios_$TIMESTAMP.tar.bz2" /usr/local/nagios
tar cjf "/bk/upgrade_$TIMESTAMP/nagiosxi_$TIMESTAMP.tar.bz2" /usr/local/nagiosxi
# Dump the PostgreSQL and MySQL databases
pg_dump -U nagiosxi -c -F p nagiosxi | bzip2 -c > "/bk/upgrade_$TIMESTAMP/pg_nagiosxi_$TIMESTAMP.sql.bz2"
mysqldump -u root -pnagiosxi nagios | bzip2 -c > "/bk/upgrade_$TIMESTAMP/my_nagios_$TIMESTAMP.sql.bz2"
mysqldump -u root -pnagiosxi nagiosql | bzip2 -c > "/bk/upgrade_$TIMESTAMP/my_nagiosql_$TIMESTAMP.sql.bz2"
# Restart the services in reverse order
service nagios start
service ndo2db start
service npcd start
service nagiosxi start

22 Core: Backup
#!/bin/sh
# Timestamped backup
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
echo "$TIMESTAMP"
tar czvf "/bk/nagios_dir_$TIMESTAMP.tar.gz" /usr/local/nagios
tar czvf "/bk/pnp4nagios_dir_$TIMESTAMP.tar.gz" /usr/local/pnp4nagios
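A backup that cannot be read back is not a backup. The following is a minimal sketch of a verification step for archives like the ones created above. It is not from the original slides, and the function name verify_backup is hypothetical. It only checks that tar can list the whole archive, which catches empty, truncated, or corrupt files. It does not compare the contents against the live directories.

```shell
# Hypothetical helper: succeed only if ARCHIVE is non-empty and tar can
# list its full contents. Returns 1 for a missing or empty file and
# 2 for an unrecognized file extension.
verify_backup() {
    vb_archive=$1
    [ -s "$vb_archive" ] || return 1
    case $vb_archive in
        *.tar.gz)  tar tzf "$vb_archive" > /dev/null 2>&1 ;;
        *.tar.bz2) tar tjf "$vb_archive" > /dev/null 2>&1 ;;
        *)         return 2 ;;
    esac
}
```

Possible use at the end of the backup script: verify_backup "/bk/nagios_dir_$TIMESTAMP.tar.gz" || echo "backup verification FAILED"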

23 Ignoring/Encouraging System Warnings

24 Configuration Errors: Service Checks

25 Solution: Service Template Management

26 Service Template: Check Settings

27 Service Template: Alert Settings

28 Service Template: Add Hostgroup

29 Solution: Service Template Management

30 Max Concurrent Service Checks

31 Maximum Concurrent Checks
Edit nagios.cfg to avoid latency issues.
max_concurrent_checks=0
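Before editing, it helps to see what the setting currently is. The following is a minimal sketch, not from the original slides, of reading a key=value directive from nagios.cfg. The function name nagios_cfg_value is hypothetical. Printing the last uncommented occurrence is a choice made for this sketch, not a statement about how Nagios resolves duplicate directives.

```shell
# Hypothetical helper: print the value of KEY from a key=value config FILE.
# Commented-out lines are skipped; if KEY appears more than once, the last
# occurrence is printed (an assumption of this sketch).
nagios_cfg_value() {
    ncv_key=$1
    ncv_file=$2
    sed -n "s/^[[:space:]]*$ncv_key[[:space:]]*=[[:space:]]*//p" "$ncv_file" | tail -n 1
}
```

Possible use, assuming a source install under /usr/local/nagios: nagios_cfg_value max_concurrent_checks /usr/local/nagios/etc/nagios.cfg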

32 Mangling Users and Contacts

33 Managing Users and Contacts
* Users (access to the web interface)
* Contacts (notifications)

34 Creating Users: Web Interface

35 Creating Users: Web Interface

36 Creating Users: Restricted

37 Creating Users: Restricted

38 Managing Administrators: Full Access

39 Managing Administrators: Full Access

40 Core: cgi.cfg
authorized_for_system_information=nagiosadmin,john,sue,mark,tom,mary,ralph
authorized_for_configuration_information=nagiosadmin,john,sue,mark,tom,mary,ralph
authorized_for_system_commands=nagiosadmin,john,sue,mark,tom,mary,ralph
authorized_for_all_services=nagiosadmin,management,john,sue,mark,tom,mary,ralph
authorized_for_all_hosts=nagiosadmin,management,john,sue,mark,tom,mary,ralph
authorized_for_all_service_commands=nagiosadmin,john,sue,mark,tom,mary,ralph
authorized_for_all_host_commands=nagiosadmin,john,sue,mark,tom,mary,ralph
authorized_for_read_only=management
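With this many names repeated across directives, it is easy to lose track of who can do what. The following is a minimal sketch, not from the original slides, that lists the users on one cgi.cfg directive, one per line, so the lists can be reviewed or compared. The function name cgi_authorized_users is hypothetical.

```shell
# Hypothetical helper: print, one per line, the users listed for
# DIRECTIVE in a cgi.cfg-style FILE. Surrounding spaces are trimmed
# and empty entries are dropped.
cgi_authorized_users() {
    cau_directive=$1
    cau_file=$2
    sed -n "s/^[[:space:]]*$cau_directive[[:space:]]*=[[:space:]]*//p" "$cau_file" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        sed '/^$/d'
}
```

Possible use, assuming a source install under /usr/local/nagios: cgi_authorized_users authorized_for_system_commands /usr/local/nagios/etc/cgi.cfg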

41 Contacts

42 Monitoring Non-Existent Ports on Switches

43 Save Resources
Use AdminDown on Ports
* Administratively set unused ports as AdminDown
* Modify check_ifoperstatus
Turn Off Monitoring on Used Ports
Remove the Checks

44 Unused Switch Ports: Wasting Resources
* check port status
* check bandwidth
* send notifications
* ignore notifications

45 Modify check_ifoperstatus
Here is the code that affects the output. In the branch that begins
if ( not defined $adminWarn or $adminWarn eq "w" ) {
change
$state = 'WARNING';
to
$state = 'OK';
The changed line is marked in the excerpt.
##
if ( not ($response->{$snmpIfAdminStatus} == 1) ) {
    $answer = "Interface $name (index $snmpkey) is administratively down.";
    if ( not defined $adminWarn or $adminWarn eq "w" ) {
        $state = 'OK';    # changed from 'WARNING'
    } elsif ( $adminWarn eq "i" ) {
        $state = 'OK';
    } elsif ( $adminWarn eq "c" ) {
        $state = 'CRITICAL';
    } else { # If wrong value for -a, say warning
        $state = 'WARNING';
    }
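Before changing the plugin, it can be useful to know which ports a switch actually reports as administratively down. The following is a minimal sketch, not from the original slides, that filters snmpwalk output for interfaces whose ifAdminStatus is down(2). The function name admin_down_indexes is hypothetical, and the line format it expects ("IF-MIB::ifAdminStatus.N = INTEGER: down(2)") assumes net-snmp with the IF-MIB loaded.

```shell
# Hypothetical helper: read "snmpwalk ... IF-MIB::ifAdminStatus" output on
# stdin and print the ifIndex of every interface reported as down(2).
# Assumes net-snmp's symbolic output format; other lines are ignored.
admin_down_indexes() {
    sed -n 's/^IF-MIB::ifAdminStatus\.\([0-9][0-9]*\) = INTEGER: down(2).*$/\1/p'
}
```

Possible use, where the community string and host name are placeholders: snmpwalk -v 2c -c public switch01 IF-MIB::ifAdminStatus | admin_down_indexes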

46 Administratively Down Ports

47 Disable Port Checks
* 790 port checks disabled
* 1.5 GB of RAM saved
* 18% reduction in max service check execution time

48 Encouraging Non-Accountability for Changes

49 Who Makes Changes on Your Nagios?
* Limit Admin Access
* Require Training
* Create Policy for Changes
* Use a Test Server

50 Audit Log

51 Abusing Nagios XI Wizards

52 Wizard or Manual Creation: Assessment
Installation
Which method provides the most efficient installation?
Example: Using a wizard for a switch is most efficient.
Example: Manually creating a service check to be used on 100 servers is most efficient.
Visibility
Will it provide access to view the grouping of devices?
Example: Can effective reports be created from visible devices?
Management
Does it make management easier in the long run?
Example: The use of templates is an efficient method to manage multiple devices that are similar.

53 Template Management

54 Disregarding Network Relationships

55 Reachability

56 Host: Manage Parents

57 Host: Manage Parents

58 Network Relationships: Parents

59 Importing Infectious Diseases

60 GUI Infection: Lack of Command Line Skills
Backups
* cron jobs
* manual backups
* verification
Analysis
* disk space
* logs
Troubleshooting
* finding stuff
* processes
* permissions
Edit Files
* learning vi or nano

61 Short Cut Infection: Auto-Discovery

62 Overestimating Human Intelligence

63 Some of the Things We Do as Humans Defy Logic

