How to Monitor Ingres with Open Source Tools
Mark Whalley, Ingres Europe Ltd.
For information contact Product Management at products@ingres.com
This presentation contains forward-looking statements that are based on management’s expectations, estimates, projections and assumptions. Forward-looking statements are made pursuant to the safe harbor provisions of the Private Securities Litigation Reform Act of 1995, as amended. These statements are not guarantees of future performance or functionality and involve certain risks and uncertainties, which are difficult to predict. Therefore, actual future functionality, features, results and trends may differ materially from what is forecast in forward-looking statements due to a variety of factors. © 2010 Ingres Corporation
Agenda
  Monitoring Ingres
    Why Monitor
    What to Monitor
    When to Monitor
    Tools for Monitoring
  Ingres Support Appliance / Nagios
    Architecture Overview
    Plug-ins
  Guide to Building a Custom Plug-in
  Example Plug-in
  Questions and Answers
Monitoring Ingres
Why Monitor
  Pre-empt problems
  To keep installations up and running 24x7
  Planned downtime
  Trend analysis
  Root cause analysis
  The business expects it
  Basis for performance tuning
  Evidence for Service Level Agreements (SLA)
  Auditing
  Resource planning
    Hardware growth
    Archiving
What to Monitor
  Ingres
    Processes
    Logs
    Databases
    Connectivity
    Locking
    Space utilisation
  Applications
    User access
    Resource utilisation
    Process state
What to Monitor
  Hardware
    CPU load
    Swap space
    Disk fragmentation
  Network
    Inaccessible server
    Performance
When to Monitor
  24 hours a day
  7 days a week
  52 weeks of the year

Large-scale enterprises cannot be monitored manually, yet they need to be monitored all of the time.
Closed Source Tools for Monitoring
  CA Unicenter
  HP OpenView
  BMC Patrol
  Others
Open Source Tools for Monitoring
  Ingres utilities
    ipm, logstat, lockstat, ima, log files, snmp, ...
  Operating system utilities
    ps, top, sar, df, ...
  Ingres Support Appliance (ISA) / Nagios
  Others
  Many tools, but not from a single interface
  Difficult to manage, not user friendly
  Difficult to automate
  Not standard across platforms / operating systems

Although plugins will be made publicly available for users to integrate with Nagios, the ISA is a self-contained, preconfigured solution, delivered within a virtual machine, that will reduce the effort needed to implement an Ingres monitoring solution, one that will be supported by Ingres.
Many other open and closed source solutions are available, including OpenNMS, Big Brother and Xymon.
Ingres Support Appliance / Nagios
Nagios
  An open source monitoring application
    GNU General Public License V2
  Monitors
    Hosts
    Networks
    Others
  Monitor hub runs on Linux/Unix
  Download from http://www.nagios.org/
Nagios Users
  250,000 users worldwide
  1,300 reference sites
  Including: [logos of reference sites]
Monitors
  Host resources
    Processor load
    Disk usage
    System logs
    Others
  Network services
    SMTP, POP3, HTTP, NNTP, SNMP, FTP, SSH, ...
  Just about anything else, for example
    Door access alarms
    Light sensors

Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH)
Monitoring of host resources (processor load, disk usage, system logs) on many network operating systems
Monitoring of anything else, such as probes (temperature, alarms, ...) that can send collected data over a network to specifically written plugins
Monitoring through remotely run scripts via Nagios Remote Plugin Executor (NRPE)
Plugins available for graphing of data (Nagiosgraph, Nagiosgrapher, PNP4Nagios etc.)
Parallelized service checks available
Ability to define network host hierarchy using "parent" hosts, allowing detection of and distinction between hosts that are down and those that are unreachable
Automatic log file rotation
Nagios
  Monitors via plug-ins
  Using:
    Nagios Remote Plug-in Executor (NRPE)
    SSH
Interface
  Notifications via
    E-mail
    Pager
    SMS
  Optional Web interface
  Event handlers for proactive problem resolution
Nagios architecture
  [Diagram] Monitoring Host (Nagios Hub): Nagios, check_nrpe plug-in, config files, data files
  [Diagram] Remote (monitored) Host: NRPE (Nagios Remote Plugin Executor), config files, “check” scripts (plug-ins) for local services: check_disk, check_ingres, check_cpu
  Nagios requests execution of a plugin
  Exit status and messages are returned to Nagios and stored in Nagios data files
    0 = OK, 1 = Warning, 2 = Critical, 3 = Unknown
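The exit status contract in the architecture slide can be sketched as a small shell function. This is only an illustration: check_value and its warning/critical thresholds are hypothetical names, not part of Nagios or of the Ingres plug-ins.

```shell
#!/bin/sh
# Minimal sketch of the plug-in contract: one line of text on stdout
# plus an exit status of 0 (OK), 1 (Warning), 2 (Critical) or 3 (Unknown).
# check_value and its thresholds are hypothetical, for illustration only.

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# check_value VALUE WARN CRIT
# Prints a status line and returns the matching Nagios state.
check_value() {
    value=$1; warn=$2; crit=$3

    # Anything that is not a non-negative integer is reported as Unknown.
    case "$value" in
        ''|*[!0-9]*)
            echo "UNKNOWN - invalid value '$value'"
            return $STATE_UNKNOWN
            ;;
    esac

    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return $STATE_CRITICAL
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return $STATE_WARNING
    fi
    echo "OK - value is $value"
    return $STATE_OK
}
```

A real plug-in would typically end with something like `check_value "$measured" 80 90; exit $?`, so that the function's return value becomes the exit status Nagios sees.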
Off-the-shelf Plug-ins
  ~140 supported by Nagios development team
  ~1600 community contributions
  Downloadable from
    http://nagiosplugins.org/
    http://exchange.nagios.org/

Remote monitoring supported through SSH or SSL encrypted tunnels.
Simple plugin design that allows users to easily develop their own service checks depending on needs, by using the tools of choice (shell script, C++, Perl, Ruby, Python, PHP, C# etc.)
Custom Plug-ins
  Simple scripts written in
    Shell script
    C++
    C#
    Perl
    Python
    PHP
    Ruby
    Etc.
Custom Plug-ins for Ingres
  To monitor server processes
  For error log mining
  To measure DMF cache hit ratios
  To identify
    Tables in overflow
    Long running queries
Guide to Building a Custom Plug-in
Guide to Building a Custom Plug-in
  Step 1 - Design
  Step 2 - Develop
  Step 3 - Test
  Step 4 - Deploy
  Step 5 - Execute
  Step 6 - Contribute
Step 1 - Design
  Check there is not a plug-in that already does this
  Review development guidelines
    http://nagiosplug.sourceforge.net/developer-guidelines.html
  Define plug-in specification
    May include:
      Triggers for exit codes
      Messages to be returned to Nagios
      Performance data for graphing
      Persistent data for loading into Ingres tables
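As an illustration of the "messages" and "performance data" items in the specification, the sketch below builds the single output line a plug-in returns: human-readable text first, then optional performance data after a "|" in the label=value;warn;crit form described in the development guidelines. The helper name format_output and the label lrq_count are hypothetical.

```shell
#!/bin/sh
# Sketch: build the single output line a plug-in hands back to Nagios.
# Human-readable text comes first; optional performance data follows a "|"
# in the form label=value;warn;crit. The label used below is hypothetical.

# format_output STATUS MESSAGE LABEL VALUE WARN CRIT
format_output() {
    printf '%s - %s | %s=%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}
```

For example, `format_output WARNING "1 long running query identified" lrq_count 1 1 5` prints `WARNING - 1 long running query identified | lrq_count=1;1;5`. Graphing add-ons read only the part after the "|".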
Step 2 - Develop
  Write plugin using preferred development tools
  Use source code revision control
    subversion
    sccs
  Locate scripts (or executables) in /usr/local/nagios/libexec
Step 3 - Test
  For example:
    su - nagios
    cd /tmp
    /usr/local/nagios/libexec/check_isa_ingres_lrq_report -s /opt/IngresII -d demodb
    echo "Plugin Return Code: $?"
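The manual test above can be wrapped in a small helper that also checks the result is something Nagios can interpret. This is a sketch only; run_plugin_test is a hypothetical helper and not part of Nagios or the Ingres Support Appliance.

```shell
#!/bin/sh
# Sketch of a small test wrapper: run a plug-in, show its output, and
# confirm the return code is one Nagios understands (0-3).
# run_plugin_test is a hypothetical helper, not part of Nagios.

# run_plugin_test COMMAND [ARGS...]
run_plugin_test() {
    output=$("$@")
    rc=$?
    echo "Plugin Output: $output"
    echo "Plugin Return Code: $rc"

    # Codes outside 0-3 have no defined meaning for a plug-in.
    if [ "$rc" -gt 3 ]; then
        echo "FAIL: return code $rc is outside the range 0-3"
        return 1
    fi
    # A plug-in should always print at least one line of status text.
    if [ -z "$output" ]; then
        echo "FAIL: plug-in produced no output"
        return 1
    fi
    return 0
}
```

It would be called as the nagios user, for example `run_plugin_test /usr/local/nagios/libexec/check_isa_ingres_lrq_report -s /opt/IngresII -d demodb`.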
Step 4 - Deploy
  For host(s) to be monitored
    Copy plug-in into /usr/local/nagios/libexec
Step 4 - Deploy
  Define plugin as a command within Nagios configuration file:
    /usr/local/nagios/etc/objects/commands.cfg

    define command{
        command_name    check_remote_ingres_lrq_report
        command_line    $USER1$/check_by_ssh -p $ARG1$ -l nagios -i /usr/local/nagios/etc/keys/$HOSTNAME$ -H $HOSTADDRESS$ -C '/usr/local/nagios/libexec/check_isa_ingres_lrq_report -s $ARG2$ -d $ARG3$'
        }
Step 4 - Deploy
  Place service definitions within Nagios configuration file:
    /usr/local/nagios/etc/objects/services.cfg

    define service{
        use                     remote-service
        host_name               prod_svr
        service_description     demodb - Long Running Queries
        check_command           check_remote_ingres_lrq_report!22!/opt/IngresII!demodb
        }
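Nagios splits the check_command value on "!": the first field names the command, and the remaining fields become the $ARG1$, $ARG2$, ... macros used in its command_line. The sketch below only illustrates that mapping; split_check_command is a hypothetical helper, not a Nagios tool.

```shell
#!/bin/sh
# Sketch: how a check_command value is split on "!" into the command name
# and the $ARG1$, $ARG2$, ... macros. split_check_command is a hypothetical
# helper written only to illustrate the mapping.

# split_check_command 'name!arg1!arg2...'
split_check_command() {
    old_ifs=$IFS
    IFS='!'
    set -f            # no filename expansion while splitting
    set -- $1         # split on "!" into positional parameters
    set +f
    IFS=$old_ifs

    echo "command_name=$1"
    shift
    n=1
    for arg in "$@"; do
        echo "ARG$n=$arg"
        n=$((n + 1))
    done
}
```

For the service definition above, `split_check_command 'check_remote_ingres_lrq_report!22!/opt/IngresII!demodb'` prints the command name followed by ARG1=22 (the SSH port), ARG2=/opt/IngresII (passed to -s) and ARG3=demodb (passed to -d).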
Step 5 - Execute
  Automatically by Nagios depending on
    Time periods
    Service checks enabled / disabled
  Manually by user:
    From a command line
    From the Service Commands menu of the web interface
Step 6 - Contribute
  Contributing plug-in to the community
    http://exchange.nagios.org/
Example Plug-in
Long Running Queries Plug-in
  Monitor plug-in
    Generates a list of long running queries
    Using ima
  Report
    A Nagios plug-in
    Sends list of long running queries back to Nagios hub
  Optional Ingres loader
    Load list of long running queries into an Ingres table

Nagios plugins are expected to run in less than 10 seconds. By its nature, identifying long running queries (LRQ) requires that queries are monitored over a period of time (possibly many hours). The solution implemented here therefore splits the “monitoring” and “reporting” elements into two distinct functions.
The optional “Ingres loader” function takes the output generated by the monitor element and loads the details of LRQ into an Ingres database. This data may then be used outside Nagios to generate reports and enable further analysis, and it provides a persistent data store (under the control of the DBA rather than Nagios).
Monitor Plug-in
  Shell script (~2300 lines of code)
  Generates a list of long running queries
  For a defined time period (e.g. 8 hours)
  Do the following:
    Identify running queries using ima
    Record details of running queries
    Sleep

To reduce impact on the installation being monitored, only session temporary tables are used for storing data. All persistent data (for example that required between each cycle within the monitoring period) are held within flat-files (the location of which by default is /tmp, but can be defined via a con
The list of long running queries includes user name, query and approximate time taken to run. The time is approximate because queries may have started, or may continue to run, outside the monitoring period; for large n second values, queries may start / finish part way through a check cycle; and server performance may affect the time taken for the monitor script to run.
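The sample / record / sleep cycle can be sketched as follows. This is not the ~2300-line monitor script: list_running_queries is a hypothetical stand-in for the ima query and prints a fixed sample row, the loop counts cycles rather than clock time, and the flat-file layout is an assumption.

```shell
#!/bin/sh
# Sketch of the monitor cycle: sample, record, sleep, repeat.
# list_running_queries is a hypothetical stand-in for the ima query; here it
# prints a fixed sample row ("user<TAB>query") so the loop can run anywhere.

list_running_queries() {
    printf 'ingres\tselect * from big_table\n'
}

# monitor_queries CYCLES PAUSE_SECONDS OUTFILE
# Appends one line per running query per cycle to the flat file OUTFILE.
monitor_queries() {
    cycles=$1; pause=$2; outfile=$3
    i=0
    while [ "$i" -lt "$cycles" ]; do
        list_running_queries >> "$outfile"
        i=$((i + 1))
        # Skip the sleep after the final sample.
        [ "$i" -lt "$cycles" ] && sleep "$pause"
    done
    return 0
}

# summarise_queries OUTFILE
# Prints "count<TAB>user<TAB>query": how many cycles each query was seen in.
summarise_queries() {
    sort "$1" | uniq -c |
        awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0 }'
}
```

Under these assumptions, the number of cycles in which a query was seen, multiplied by the pause, gives a rough run time, which is one reason the reported time can only be approximate.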
ima Query
    /* Declare a GTT to store queries currently running against the database */
    declare global temporary table session.ima_server_sessions as
        select session_id, effective_user, session_query
        from ima_server_sessions
        where session_query != ''
        and db_name = '${h_clv_database_name}'
    on commit preserve rows
    with norecovery
    \p\g
  Session query currently restricted to 1000 characters
Report
  Shell script (~600 lines of code)
  Initiated by Nagios
  Analyse and report on long running queries
  Sends information back to Nagios Hub
    OK
      No long running queries identified
    Warning
      1 or more queries took more than n seconds to run
    Critical
      Unable to run monitor script
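The three outcomes listed above can be sketched as a short function. This is not the report script itself: the input layout ("seen_count<TAB>user<TAB>query", one line per query), the helper name report_lrq, and the rule that run time is roughly seen_count multiplied by the pause are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of the report step. Input is a hypothetical flat file with one
# line per query: "seen_count<TAB>user<TAB>query". A query counts as long
# running when seen_count * pause exceeds the threshold n (in seconds).

# report_lrq FILE PAUSE_SECONDS THRESHOLD_SECONDS
report_lrq() {
    file=$1; pause=$2; threshold=$3

    # Assumption: a missing file means the monitor script did not run.
    if [ ! -r "$file" ]; then
        echo "CRITICAL - Unable to run monitor script"
        return 2
    fi

    # Count the queries whose estimated run time exceeds the threshold.
    count=$(awk -F '\t' -v p="$pause" -v t="$threshold" \
        '$1 * p > t { c++ } END { print c + 0 }' "$file")

    if [ "$count" -gt 0 ]; then
        echo "WARNING - long running queries identified: $count (threshold ${threshold}s)"
        return 1
    fi
    echo "OK - No long running queries identified"
    return 0
}
```

Because it only reads a file the monitor has already written, a function like this finishes quickly, which fits the split between monitoring and reporting described for this plug-in.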
Optional Ingres loader
  Not a requirement for Nagios to work
  Provides persistence
    Takes the list of long running queries
    Loads it into an Ingres table
    May be used for auditing / reporting
  Shell script (~800 lines of code)

This module is not part of the Nagios implementation; it stores collected data in an Ingres database for subsequent reporting / analysis.
Service status details for all hosts
  [Screenshot] Dashboard: Summary of host and service status
  [Screenshot] Service status and messages returned from plug-in
Service status for long running queries
  Status Information: WARNING - Installation: A9, Database: ltp, Pause: 1(sec), Run Time: 5(min). Last check started 06/05/2010 19:03:21 and completed 06/05/2010 19:08:25. 1 query was identified:
    User: ingres
    Query: select * from bw_boat_listing, bw_boat_listing, bw_boat_listing (seen 123 times)
Summary
  Monitoring Ingres
    Why Monitor
    What to Monitor
    When to Monitor
    Tools for Monitoring
  Ingres Support Appliance / Nagios
    Architecture Overview
    Plug-ins
  Guide to Building a Custom Plug-in
  Example Plug-in