Using Check_MK to Monitor perfSONAR. Shawn McKee/University of Michigan. North American Throughput Meeting, March 9th, 2016.


1 Using Check_MK to Monitor perfSONAR
Shawn McKee/University of Michigan
North American Throughput Meeting, March 9th, 2016

2 Overview of Talk
- Introduction: the need to monitor perfSONAR itself
- Check_MK
  - Overview
  - Current check_mk services
- Monitoring perfSONAR
  - How to install check_mk agents on your perfSONAR
- Summary and questions
NA Throughput Meeting, March 9, 2016

3 Monitoring perfSONAR
- As most of this group should know, perfSONAR is being used to monitor our networks for OSG and WLCG.
- WLCG/OSG deployment status as of today (great progress):
  - 3.4.1: 6
  - 3.4.2: 8
  - 3.5: 2
  - 3.5.0: 42
  - 3.5.1: 165
  - Unknown: 23 (these nodes are either down or hung)
- One challenge we face is keeping perfSONAR operating correctly among our ~125 sites globally.
  - When data isn’t being measured, how do we know? (MaDDash!)
  - When data isn’t being measured, what is the reason? (check_mk!)
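To put the version counts above in proportion, here is a minimal Python sketch that totals them and prints each version's share. The numbers are copied from the slide; the `summarize` helper is a hypothetical name introduced only for this illustration, and "3.5" and "3.5.0" are kept as separate entries because the slide lists them that way.

```python
# Host counts per perfSONAR toolkit version, as listed on the slide.
deployment = {
    "3.4.1": 6,
    "3.4.2": 8,
    "3.5": 2,
    "3.5.0": 42,
    "3.5.1": 165,
    "Unknown": 23,  # nodes that are either down or hung
}


def summarize(counts):
    """Return (total, {version: percent of total}) for a version -> count mapping."""
    total = sum(counts.values())
    shares = {version: 100.0 * n / total for version, n in counts.items()}
    return total, shares


total, shares = summarize(deployment)
print(f"total hosts: {total}")
for version, pct in shares.items():
    print(f"{version:>8}: {deployment[version]:4d}  ({pct:.1f}%)")
```

By this tally the slide covers 246 hosts, of which roughly two thirds run 3.5.1 and a little under a tenth report no version at all.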

4 About OMD/Check_MK
- We need ways to track how our perfSONAR toolkit installations are performing and whether there are issues with their many services or the underlying OS.
- To do this we can use a Nagios-like capability to check that the services operating on a specific toolkit instance are functioning.
- ESnet perfSONAR developers have provided a set of Nagios checks to monitor and verify that the various perfSONAR toolkit services are functioning correctly.
- Rather than just using Nagios, we have selected the Open Monitoring Distribution (OMD) to do this task (http://omdistro.org).
- OMD combines Nagios, PNP4Nagios, NagVis and Check_MK.

5 Check_mk Features
- We have focused on Check_mk because it provides a number of very nice features:
  - We can easily discover, monitor and track services and their performance data
  - Integrates well with Linux OSes
  - Provides graphing, history and availability data automatically
  - See https://mathias-kettner.de/check_mk.html
- Within the WLCG Network and Transfer Metrics WG we have enabled access to OMD/Check_mk via x509 certificates; any valid certificate in a browser should work.

6 perfSONAR Monitoring Pages
- We have 3 versions of our perfSONAR monitoring pages:
  - Prototype at maddash.aglt2.org (intending to phase this out soon)
  - Testing at OSG’s ITB instance
  - Production at OSG’s production instance
- Main monitoring types are MaDDash and OMD/Check_MK:
  - Prototype: http://maddash.aglt2.org/maddash-webui and https://maddash.aglt2.org/WLCGperfSONAR/check_mk
  - Testing: http://perfsonar-itb.grid.iu.edu/maddash-webui/ and https://perfsonar-itb.grid.iu.edu/WLCGperfSONAR/check_mk/
  - Production: http://psmad.grid.iu.edu/maddash-webui/ and https://psomd.grid.iu.edu/WLCGperfSONAR/check_mk
- Notes:
  - OSG instances rely upon the OSG Datastore: http://psds.grid.iu.edu
  - X509 cert needed to view check_mk/OMD pages (any IGTF cert)

7 OSG Network Datastore Diagram
- OSG is gathering relevant metrics from the complete set of OSG and WLCG perfSONAR instances
- Operating now
- Running VMs on dedicated hardware
- Data also published to CERN ActiveMQ instance and available for user subscription
- Actively tuning and debugging
- 8 VMs
- Storage must host 7 distinct areas

8 OMD for LHCONE/LHCOPN perfSONARs
- https://maddash.aglt2.org/WLCGperfSONAR/check_mk/ (Prototype)
- https://psomd.grid.iu.edu/WLCGperfSONAR/check_mk/ (Production)
- We monitor:
  - “Expected” test coverage
  - NDT/NPAD running?
  - Memory on hosts (<4GB)
  - New “version” test
- Access requires x509 credential from IGTF CA
- Gives us a good view into where problems still exist
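As an illustration of how a memory check like the one listed above could be expressed, here is a minimal Python sketch of a Check_MK "local check". It is not the check used on these instances: the service name `perfSONAR_Memory`, the metric name `mem_total_kb`, and the choice of WARN (rather than CRIT) below 4 GB are all assumptions. What it relies on is the general local-check convention of printing one line per service in the form `status name perfdata text`, where status 0 means OK and 1 means WARN.

```python
MIN_KB = 4 * 1024 * 1024  # 4 GB expressed in kB, the unit used by /proc/meminfo


def mem_total_kb(meminfo_text):
    """Extract the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def local_check_line(total_kb, min_kb=MIN_KB):
    """Format one local-check line: status, service name, perfdata, text."""
    status = 0 if total_kb >= min_kb else 1  # 0 = OK, 1 = WARN (assumed policy)
    gb = total_kb / (1024 * 1024)
    return (f"{status} perfSONAR_Memory mem_total_kb={total_kb} "
            f"Host has {gb:.1f} GB of RAM")


# Sample input so the sketch runs anywhere; a real local check would
# read the contents of /proc/meminfo on the monitored host instead.
sample = "MemTotal:        3880540 kB\nMemFree:          123456 kB\n"
print(local_check_line(mem_total_kb(sample)))
```

A script of this shape, placed in the agent's local checks directory, would show up as one additional service on the host after the next inventory.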

9 OMD Hostgroup Summary LHCOPN/LHCONE

10 Jump in… Live Demonstration
- Let’s go to the ITB instance and I will try to demonstrate some features. I will be sharing my screen for those attached to Vidyo. Sorry for those on the phone only.
- Open the following URL from a browser with your x509 certificate installed:
  - https://perfsonar-itb.grid.iu.edu/WLCGperfSONAR/check_mk/
- Let’s start…

11 Installing Check_mk Agent
- See http://omd.aglt2.org/
- On your perfSONAR toolkit run (as ‘root’):
    yum -y install http://omd.aglt2.org/check-mk-agent-plugins-1.0-5.el6.noarch.rpm http://omd.aglt2.org/check-mk-agent-1.2.6p16-1.noarch.rpm
- Then notify Shawn so he can tag and re-inventory your host(s)
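Once the agent is installed, one way to sanity-check it is to look at the plain-text output it serves. The following Python sketch assumes the agent's default TCP port (6556) and the usual output layout, in which each section starts with a `<<<name>>>` header line. The function names `fetch_agent_output` and `parse_sections` are hypothetical helpers for this illustration, and the sample text is made up; depending on how the agent is configured, only the monitoring server may be allowed to connect.

```python
import socket

AGENT_PORT = 6556  # default TCP port of the Check_MK agent


def fetch_agent_output(host, port=AGENT_PORT, timeout=10.0):
    """Connect to the agent and read its full plain-text output."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # the agent closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_sections(text):
    """Split agent output into {section name: [lines]}."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("<<<") and line.endswith(">>>"):
            # Headers may carry options, e.g. <<<df:sep(9)>>>; keep the name only.
            current = line[3:-3].split(":")[0]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


# Made-up sample so the sketch runs without a live agent.
sample = (
    "<<<check_mk>>>\n"
    "Version: 1.2.6p16\n"
    "AgentOS: linux\n"
    "<<<local>>>\n"
    "0 Example_Service - all good\n"
)
parsed = parse_sections(sample)
print(sorted(parsed))
print(parsed["check_mk"][0])
```

Against a live host, `parse_sections(fetch_agent_output("your.perfsonar.host"))` would return the same kind of dictionary, where the host name is a placeholder for your own toolkit node.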

12 Discussion/Questions/Comments?

13 References
- Network documentation: https://www.opensciencegrid.org/bin/view/Documentation/NetworkingInOSG
- Deployment documentation for OSG and WLCG hosted in OSG: https://twiki.opensciencegrid.org/bin/view/Documentation/DeployperfSONAR
- New MA guide: http://software.es.net/esmond/perfsonar_client_rest.html
- Modular Dashboard and OMD prototypes:
  - http://maddash.aglt2.org/maddash-webui
  - https://maddash.aglt2.org/WLCGperfSONAR/check_mk
- OSG production instances for OMD, MaDDash and Datastore:
  - http://psmad.grid.iu.edu/maddash-webui/
  - https://psomd.grid.iu.edu/WLCGperfSONAR/check_mk/
  - http://psds.grid.iu.edu/esmond/perfsonar/archive/?format=json
- Mesh-config in OSG: https://oim.grid.iu.edu/oim/meshconfig
- Use-cases document for experiments and middleware: https://docs.google.com/document/d/1ceiNlTUJCwSuOuvbEHZnZp0XkWkwdkPQTQic0VbH1mc/edit

