1 Distributed Monitoring for Web Apps, Fernando Hönig

2 About me
- From Córdoba, Argentina
- System Administrator
- Worked in IT companies for the last 7 years
- Working in Intel IT since April
- Married to Jesica and father of Benjamin (2)
* Other names and brands may be claimed as the property of others.

3 Third Party Vendors / Open Source
- This presentation covers the solution we built rather than third-party vendors.
- All products used are open source.
Best Practices
- This presentation aims to show our processes and best practices.

4 Topics
- Problem Overview
- Distributed Infrastructure
- Failover Solution
- Webinject Architecture
- API Feeds Integration
- Multi Checks + NRPE
- Notifications with SMS and VoIP
- Q/A

5 Why do we need a Distributed Infrastructure?
- More than 500 service checks per customer
- Our customers' apps need to be reachable from different geographic regions
- Checks run every 1 or 5 minutes
- Redundancy / fast recovery
Why do we need a Centralized Dashboard?
- ACLs for different customers and groups
- Fast and simple additions, updates, and removals of services, commands, and hosts
- Performance data stored in MySQL for external BI solutions

6 Centralized Dashboard / Nagios Distribution
[Diagram labels: Region #1, Region #2, Region #3, Region #N, External Monitoring Provider #1, External Monitoring Provider #2]

7 Simple Distributed Nagios with NDOUtils
Ndomod.cfg - Example
    instance_name=Central
    output_type=tcpsocket
    output=
    tcp_port=5668
    output_buffer_items=5000
    file_rotation_interval=14400
    file_rotation_timeout=60
    reconnect_interval=15
    reconnect_warning_interval=900
    data_processing_options=-1
    config_output_options=3
Ndo2db.cfg - Example
    ndo2db_user=nagios
    ndo2db_group=nagios
    socket_type=tcp
    socket_name=/var/run/ndo.sock
    tcp_port=5668
    db_servertype=mysql
    db_host=localhost
    db_name=centstatus
    db_port=3306
    db_prefix=nagios_
    db_user=username
    db_pass=password
    max_timedevents_age=1440
    max_systemcommands_age=1440
    max_servicechecks_age=1440
    max_hostchecks_age=1440
    max_eventhandlers_age=1440
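The two files have to agree on the TCP port: ndomod on each poller sends to the port where ndo2db listens (5668 in both examples). A minimal sketch of a pre-deployment check, assuming both files use plain key=value lines as shown; `cfg_get` and `ports_match` are hypothetical helper names, not part of NDOUtils.

```shell
#!/bin/sh
# cfg_get FILE KEY: print the value of the last "KEY=value" line in FILE.
cfg_get() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# ports_match NDOMOD_CFG NDO2DB_CFG
# Succeed only when both files set the same, non-empty tcp_port.
ports_match() {
    mod_port=$(cfg_get "$1" tcp_port)
    db_port=$(cfg_get "$2" tcp_port)
    [ -n "$mod_port" ] && [ "$mod_port" = "$db_port" ]
}
```

A call such as `ports_match ndomod.cfg ndo2db.cfg || echo "tcp_port mismatch"` could run before restarting either daemon; the file names here are placeholders.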

8 How to enable a new distributed node?
To enable a new Nagios node you just need to:
- Install a VM with the latest Nagios/NRPE/NDOMod code.
- Set up passwordless SSH authentication for the nagios user.
- Add some sudo rights.
- Enable that node in the centralized interface:
  - Create a new poller
  - Create a new nagios.cfg config for that poller
  - Create a new ndomod config for that poller
  - Enable the service checks that you need on that poller
All of this could be automated.
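As one illustration of the automation mentioned above, the per-poller ndomod config can be generated from a template. This is a sketch only: `render_ndomod_cfg` is a hypothetical helper, the instance and host names in the usage note are invented, and the fixed values simply mirror the Ndomod.cfg example slide.

```shell
#!/bin/sh
# render_ndomod_cfg INSTANCE CENTRAL_HOST [PORT]
# Print an ndomod config for one poller. PORT defaults to 5668,
# the port used in the example configs.
render_ndomod_cfg() {
    instance=$1
    central=$2
    port=${3:-5668}
    cat <<EOF
instance_name=$instance
output_type=tcpsocket
output=$central
tcp_port=$port
output_buffer_items=5000
reconnect_interval=15
reconnect_warning_interval=900
data_processing_options=-1
config_output_options=3
EOF
}
```

For example, `render_ndomod_cfg Region1 central.example.net > ndomod-Region1.cfg` would produce the file to copy to the new poller over the passwordless SSH access set up earlier.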

9 Remote Pollers Visualization Remote Services / Checks

10 Distributed with Failover

11 Failover Infrastructure Scripts / Master Side
    #!/bin/sh
    # PIDs of the running Apache and Nagios processes (empty when not running)
    apacherun=`ps ax | grep /usr/sbin/httpd | grep -v grep | cut -c1-5 | paste -s -`
    nagiosrun=`ps ax | grep /usr/local/nagios/bin/nagios | grep -v grep | cut -c1-5 | paste -s -`
    # Keep Apache and Nagios in the same state: if one is down, stop the other
    if [ -z "$nagiosrun" ]; then
        echo "Stopping Apache since Nagios is not running"
        /etc/init.d/apache2 stop
    else
        echo "Nagios running"
    fi
    if [ -z "$apacherun" ]; then
        echo "Stopping Nagios since Apache is not running"
        /etc/init.d/nagios stop
    else
        echo "Apache running"
    fi
    exit 0
NRPE Command
    command[check_nagios_failover]=/usr/local/nagios/libexec/check_nagios -F /usr/local/nagios/var/status.dat -e 1 -C '/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg'

12 Failover Infrastructure Scripts / Failover Side
    #!/bin/sh
    # Ask the master, over NRPE, whether its Nagios process is healthy.
    # MASTER_HOST is a placeholder: the master's address is missing from the slide.
    nagcmd=`/usr/local/nagios/libexec/check_nrpe -H MASTER_HOST -c check_nagios_failover`
    now=`date +%s`
    commandfile='/usr/local/nagios/var/rw/nagios.cmd'
    apacherun=`ps ax | grep /usr/sbin/httpd | grep -v grep | cut -c1-5 | paste -s -`
    nagiosrun=`ps ax | grep /usr/local/nagios/bin/nagios | grep -v grep | cut -c1-5 | paste -s -`
    CHECK=`echo "$nagcmd" | grep CRITICAL`
    if [ -z "$CHECK" ]; then
        # Master is healthy: make sure the failover instance is stopped
        echo "Nagios Master is OK"
        if [ -n "$apacherun" ]; then
            echo "Stopping Apache"
            /etc/init.d/apache2 stop
        fi
        if [ -n "$nagiosrun" ]; then
            echo "Stopping Nagios"
            /etc/init.d/nagios stop
        fi
    else
        # Master is down: start the failover instance and notify the on-call group
        if [ -z "$apacherun" ]; then
            echo "Starting Apache"
            /etc/init.d/apache2 start
        fi
        if [ -z "$nagiosrun" ]; then
            echo "Starting Nagios"
            /etc/init.d/nagios start
            sleep 15
        fi
        # printf, not echo, expands the %lu timestamp placeholder
        printf "[%lu] SEND_CUSTOM_HOST_NOTIFICATION;nagiosmaster;2;nagiosfailover;Master Nagios seems to be down. Oncall Group Engaged. Nagios Failover in place.\n" "$now" > "$commandfile"
    fi
    exit 0
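The notification at the end of the failover script is a Nagios external command: one line of the form `[timestamp] COMMAND;arguments` written to the command file. A minimal sketch of building that line, assuming the host and author names from the script; `host_notification_cmd` is a hypothetical helper name.

```shell
#!/bin/sh
# host_notification_cmd TIMESTAMP HOST AUTHOR COMMENT
# Print one SEND_CUSTOM_HOST_NOTIFICATION line. The options field is
# fixed at 2 (forced notification), as in the failover script.
host_notification_cmd() {
    printf '[%s] SEND_CUSTOM_HOST_NOTIFICATION;%s;2;%s;%s\n' "$1" "$2" "$3" "$4"
}
```

In the failover script this could be used as `host_notification_cmd "$(date +%s)" nagiosmaster nagiosfailover "Master Nagios seems to be down." > "$commandfile"`. Building the line with printf avoids relying on echo, which prints `%lu` literally.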

13 Web Apps Monitoring

14 How can we monitor Web Apps? All the white gaps are areas that these checks do not monitor.

15 How can we monitor Web Apps? We are probably not covering 100%, but the white gaps are not as big as before. This is a continuous improvement process.

16 What's Webinject?
WebInject is a free tool for automated testing of web applications and web services. It can be used to test individual system components that have HTTP interfaces. WebInject offers real-time results display and may also be used for monitoring system response times.*
* Source:
How does it work? What do you receive after a check?
[Diagram labels: Config, TestCase, Web Apps, OK, CRITICAL]

17 Webinject Installation
How to install it:
- From the CPAN Perl library, use: install Webinject
- From Consol Labs:

18 Webinject Architecture
Config File Model: test.xml no nagios 10 Mozilla/5.0

19 Webinject Architecture
Test Case File Model:
    <case
        id="1"
        description1="Case Description"
        url="..."
        method="post"
        posttype="application/x-www-form-urlencoded"
        postbody="string={BASEURL2}&id=1"
        parseresponse='"code":"|","string"|'
        verifypositive='"state":"{BASEURL2}"'
    />

20 Webinject Architecture
Test Case File Model:
    <case
        id="1"
        description1="Case Description"
        url="..."
        method="post"
        posttype="application/x-www-form-urlencoded"
        postbody="string={BASEURL2}&id=1"
        parseresponse='"method":"|"'
        verifypositive='"state":"{BASEURL2}"'
    />
    <case
        id="2"
        description1="Validation"
        url="..."
        method="post"
        posttype="application/json"
        postbody='{"usermethod":"{PARSEDRESULT}","taxes":false}'
        verifypositive='"transactionMessage":"Transaction Approved"'
    />
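In case 1, `parseresponse='"method":"|"'` names a left boundary (`"method":"`) and a right boundary (`"`); the text between them is captured and reused in case 2 as `{PARSEDRESULT}`. The boundary idea can be sketched in shell as follows. This is an illustration only, not WebInject's own code (which is Perl), and `parse_between` is a hypothetical helper name.

```shell
#!/bin/sh
# parse_between TEXT LEFT RIGHT
# Print the text between the first LEFT and the next RIGHT.
# Return non-zero when LEFT does not occur in TEXT.
parse_between() {
    case $1 in
        *"$2"*) ;;
        *) return 1 ;;
    esac
    rest=${1#*"$2"}                  # drop everything through the first LEFT
    printf '%s\n' "${rest%%"$3"*}"   # keep what precedes the next RIGHT
}
```

Given a response body such as `{"code":"200","method":"visa"}` (an invented example), `parse_between "$body" '"method":"' '"'` prints `visa`, which is the value case 2 would send as `usermethod`.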

21 Webinject and Nagios Integration
Service Definition:
    define service {
        use                     generic-service
        host_name               MyApplication-server
        service_description     WebInject Test
        is_volatile             0
        check_period            24x7
        max_check_attempts      3
        normal_check_interval   1
        retry_check_interval    1
        contact_groups          myapplication-admins
        notification_interval   120
        notification_period     24x7
        notification_options    w,u,c,r
        check_command           webinject!-s BASEURL1=url-domain.com!config_file.xml
    }
Command Definition:
    define command {
        command_name    webinject
        command_line    /usr/local/webinject/webinject.pl $ARG1$ -c $ARG2$
    }
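Nagios splits `check_command` on `!`: the first field selects the command definition, and the following fields fill `$ARG1$` and `$ARG2$` in its `command_line`. The sketch below imitates that substitution for the two-argument case used here, to show the command line that ends up being run. `replace_first` and `expand_check_command` are hypothetical helpers, not Nagios code.

```shell
#!/bin/sh
# replace_first TEXT MACRO VALUE: replace the first occurrence of MACRO in TEXT.
replace_first() {
    case $1 in
        *"$2"*) printf '%s%s%s\n' "${1%%"$2"*}" "$3" "${1#*"$2"}" ;;
        *)      printf '%s\n' "$1" ;;
    esac
}

# expand_check_command TEMPLATE CHECK_COMMAND
# Split CHECK_COMMAND as name!arg1!arg2 and fill $ARG1$ and $ARG2$ in TEMPLATE.
expand_check_command() {
    case $2 in
        *'!'*'!'*) args=${2#*'!'}; arg1=${args%%'!'*}; arg2=${args#*'!'} ;;
        *'!'*)     arg1=${2#*'!'}; arg2= ;;
        *)         arg1=; arg2= ;;
    esac
    line=$(replace_first "$1" '$ARG1$' "$arg1")
    replace_first "$line" '$ARG2$' "$arg2"
}
```

With the definitions above, the service's `check_command` expands to `/usr/local/webinject/webinject.pl -s BASEURL1=url-domain.com -c config_file.xml`.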

22 External API Integration
We use a Webinject SOAP call to get the availability of a test in an external monitoring provider.
    #!/bin/bash
    # Arguments
    # $1 = TestCase
    # WebInject API call
    webinject=`/path/to/webinject.pl -c path/to/config.xml "$1"`
    if [[ "$webinject" == *CRITICAL* ]]; then
        echo "CRITICAL - $webinject"
        exit 2
    else
        echo "$webinject"
        exit 0
    fi
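The wrapper above exits 0 for any output that does not contain CRITICAL, so a WARNING or an empty result would appear as OK in Nagios. A sketch of a fuller mapping to the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follows; `status_to_exit_code` is a hypothetical helper, and it assumes the state word appears somewhere in the output text.

```shell
#!/bin/sh
# status_to_exit_code OUTPUT
# Print the Nagios plugin exit code for the state word found in OUTPUT.
# Patterns are tried in this order, so CRITICAL takes precedence.
status_to_exit_code() {
    case $1 in
        *CRITICAL*) echo 2 ;;
        *WARNING*)  echo 1 ;;
        *UNKNOWN*)  echo 3 ;;
        *OK*)       echo 0 ;;
        *)          echo 3 ;;   # no recognizable state: treat as UNKNOWN
    esac
}
```

The wrapper could then end with `echo "$webinject"; exit "$(status_to_exit_code "$webinject")"` instead of the two-way branch.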

23 Multi Check + NRPE: Using check_multi and NRPE, we can centralize the distributed execution of several scripts and set warning or critical thresholds based on the number of tests in each state.

24 Multi Check + NRPE Architecture:
- Multi Distributed Script: Bash script that calls check_multi with options
- cmd file that executes the tests and validates the execution
- NRPE remote command that runs the tests and returns the results

25 Multi Check Bash Script
    #!/bin/bash
    nagiospluginpath="/usr/local/nagios/libexec"
    # -s sets the variables used in the cmd file; -t is the timeout per
    # child check and -T the timeout for the whole run (seconds)
    "$nagiospluginpath"/check_multi -f \
        "$nagiospluginpath"/check_multi_configs/path/webinject.cmd \
        -s TEST1="$1" \
        -s TEST2="$2" \
        -s TEST3="$3" \
        -s TEST4="$4" \
        -t 60 \
        -T 120

26 Multi Check Command Script
    # WebInject calls for multi tests
    command [ place1 ] = check_nrpe -H place1 -c external_webinject -a "$TEST1$"
    command [ place2 ] = check_nrpe -H place2 -c external_webinject -a "$TEST2$"
    command [ place3 ] = check_nrpe -H place3 -c external_webinject -a "$TEST3$"
    command [ place4 ] = check_nrpe -H place4 -c external_webinject -a "$TEST4$"
    state [ CRITICAL ] = COUNT(CRITICAL) > 3 || COUNT(WARNING)==COUNT(ALL) || COUNT(UNKNOWN)==COUNT(ALL)
    state [ WARNING ] = COUNT(WARNING) > 0 || COUNT(CRITICAL) > 0 || COUNT(UNKNOWN) > 0
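The two state rules can be read as a small decision function over the child-check counts. The sketch below restates them in shell to make the thresholds concrete. `multi_state` is a hypothetical helper, and it assumes the CRITICAL rule takes precedence over the WARNING rule when both match.

```shell
#!/bin/sh
# multi_state CRITICAL WARNING UNKNOWN ALL
# Print the overall state for the given child-check counts, following the
# two state rules of the cmd file. CRITICAL is tested first (assumed
# precedence: the more severe matching rule wins).
multi_state() {
    crit=$1 warn=$2 unk=$3 all=$4
    if [ "$crit" -gt 3 ] || [ "$warn" -eq "$all" ] || [ "$unk" -eq "$all" ]; then
        echo CRITICAL
    elif [ "$warn" -gt 0 ] || [ "$crit" -gt 0 ] || [ "$unk" -gt 0 ]; then
        echo WARNING
    else
        echo OK
    fi
}
```

With four locations, `COUNT(CRITICAL) > 3` means the service only goes CRITICAL when all four report CRITICAL (or all report WARNING, or all report UNKNOWN); a failure seen from one to three locations raises a WARNING.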

27 Multi Check Service Status:

28 Additional Notifications

29 Notifications with SMS

30 Notifications with VoIP Calls (Nagios calls you)

31 Complete Distributed Monitoring Solution

32 Q/A Fernando Hönig

33 Legal Notices
This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. * Other names and brands may be claimed as the property of others. Copyright © 2012, Intel Corporation. All rights reserved.