1
NGI and Site Nagios Monitoring
Emir Imamagic
University Computing Centre (SRCE), Croatia
EGI-InSPIRE – ROD Teams Workshop
2
Overview
- Nagios Monitoring
- Nagios Web Interface
- Nagios Internals
  - Credential Management
  - MSG Bridge
  - MyEGEE Bridge
  - SAM CE Metrics
- Configuration Tuning
3
Nagios monitoring
4
Architecture
5
Nagios
- Open source monitoring framework
- Highly flexible with advanced features
  - host/service dependencies, escalation, soft/hard states, flapping detection
- Widely used & actively developed
6
Nagios Config Generator
- Automatic generation of Nagios configuration
  - configuring Nagios is hard
- Based on multiple information sources
- Simple bootstrap of Nagios instances
7
Nagios Config Generator – Information Sources
- Database components
  - Aggregated Topology Provider (ATP)
  - Metric Description Database (MDDB)
- Operations services
  - GOCDB, SAM, ENOC
- Grid information services
  - BDII
- Static files
8
Probe Types
- Local probes
  - probes executed by Nagios as active checks
  - SAM probes (CE, WMS, WN and SRM)
  - WLCG probes (SRCE, CERN)
  - BDII & Gstat probes
  - Nagios native probes
    - lightweight service checks (ENOC Downcollector)
  - grouped in profiles (e.g. ROC, SITE, …)
9
Probe Types
- Remote probes
  - results imported from external systems as passive checks
  - remote Nagios instances
  - classic SAM monitoring system
  - ENOC Downcollector
10
Deployment
- SL5 RPM packages & metapackages
  - egee-NAGIOS
  - egee-NRPE
- Yum repository
- Yaim configuration package
  - glite-NAGIOS
  - glite-NRPE
11
Nagios Web interface
12
Tactical Overview
13
Host Metrics
14
Host Details
15
Service Details
16
Force Metric Execution
- All services on a host
  - Host Details page
  - Schedule a check of all services on this host
- Single metric
  - Service Details page
  - Re-schedule the next check of this service
- Important!
  - don’t force check all services on host or remote metrics
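The web interface actions above correspond to Nagios external commands. As a minimal sketch, assuming the standard Nagios Core command SCHEDULE_FORCED_SVC_CHECK, the following formats the line that would be written to the Nagios command file for a single metric. The host name and service description are hypothetical placeholders, and the command file location depends on the installation, so the sketch only builds the string.

```python
import time


def forced_service_check(host, service, when=None):
    """Format a Nagios SCHEDULE_FORCED_SVC_CHECK external command line.

    Format: [timestamp] SCHEDULE_FORCED_SVC_CHECK;host;service;check_time
    """
    ts = int(time.time()) if when is None else int(when)
    return f"[{ts}] SCHEDULE_FORCED_SVC_CHECK;{host};{service};{ts}"


# Hypothetical host and service names, used only for illustration.
line = forced_service_check("ce.example.org", "org.example.SomeMetric",
                            when=1300000000)
print(line)
```

Forcing one service at a time in this way matches the "Single metric" case. It should only be used for active checks executed by the local Nagios, in line with the warning on the slide.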
17
Downtimes
- Downtimes are imported from GOCDB
  - org.egee.ImportGocdbDowntimes metric
- Disables notifications of all metrics
- Metrics are still executed!
18
External Links
- Extra Notes
  - red folder image
  - links to metric documentation
- Extra Actions
  - “bomb” image
  - local probes – links to performance data
  - remote probes – links to original web page
19
Nagios internals
20
Credential Management
21
Credential Management – Nagios Metrics
- hr.srce.GridProxy-Get-*
  - regenerates VOMS proxy from MyProxy credential
- hr.srce.GridProxy-Valid-*
  - checks validity of VOMS proxy on Nagios host
  - all metrics using proxy depend on this metric
- hr.srce.MyProxy-ProxyLifetime-*
  - checks validity of stored MyProxy credential
  - warns admin that MyProxy should be refreshed
22
MSG Bridge
23
MSG Bridge – Components
- ConfigCache
  - SQLite database
  - /var/cache/msg/config-cache/config.db
  - contains configuration of local and remote Nagios instances
- MsgCache
  - DirQueue
  - /var/spool/msg-nagios-bridge/
  - contains results from metrics executed by local and remote Nagios instances
24
MSG Bridge – Components
- msg-to-handler
  - daemon subscribed to a list of topics and queues
  - modular implementation (handler per topic/queue)
  - stores configuration to ConfigCache
  - stores remote metric results to MsgCache
25
MSG Bridge – Nagios Metrics
- org.egee.SendToMsg
  - publishes configuration & metric results
- org.egee.RecvFromQueue
  - imports results from local MsgCache to Nagios
  - results imported as passive checks
- org.egee.ConfigCheck
  - checks if new remote configuration is available
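Passive check results reach Nagios through its external command interface. The sketch below formats one result using the standard Nagios Core command PROCESS_SERVICE_CHECK_RESULT. The slides do not say how org.egee.RecvFromQueue submits results internally, so this illustrates the general mechanism only, with hypothetical host and metric names.

```python
# Nagios plugin return codes for the four service states.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result(host, service, status, output, timestamp):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line.

    Format: [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    code = STATES[status]
    # An external command is a single line, so newlines are flattened.
    flat = " ".join(output.splitlines())
    return (f"[{int(timestamp)}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{code};{flat}")


# Hypothetical result received from a remote system.
print(passive_result("se.example.org", "org.example.RemoteMetric",
                     "CRITICAL", "connection refused", 1300000000))
```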
26
MyEGEE Bridge
- MyEGEE uses databases
  - Metric Description Database (MDDB)
  - Aggregated Topology Provider (ATP)
  - Metric Result Store (MRS)
- Nagios executes probes for updating databases
27
MyEGEE Bridge – Nagios Metrics
- org.egee.ATPSync
  - synchronizes the local ATP with the central ATP
  - log in /var/log/atp
- org.egee.MDDBSync
  - synchronizes the local MDDB with the central MDDB
  - log in /var/log/mddb
- org.egee.SendToMetricStore
  - publishes Nagios results to MRS
  - if critical, no data appears in MyEGEE
28
SAM CE Metrics
- org.sam.CE-JobStatus
  - associated with each CE service
  - submits SAM WN job via WMS & holds status of submitted job
  - WN probes communicate back via MSG
- org.sam.CE-JobMonit
  - associated with Nagios server
  - updates status of all org.sam.CE-JobStatus probes on Nagios
29
SAM CE Metrics
- org.sam.CE-JobSubmit
  - associated with each CE service
  - holds the final state of SAM WN job
  - passive check updated by org.sam.CE-JobMonit
- org.sam.WN-*
  - individual WN metrics (equivalent to old SAM)
  - passive checks updated via MSG
30
Configuration tuning
31
Configuration Tuning
- NCG configuration
  - modifying ncg.conf
  - beware of yaim reruns
  - ncg.d directory will be provided in the next release
- Static file directives
  - adding files to /etc/ncg/ncg-localdb.d/
  - directives are documented in perldoc of modules
    - NCG::SiteSet::File
    - NCG::SiteInfo::File
    - NCG::LocalMetrics::File
    - NCG::LocalMetricsAttrs::File
    - NCG::LocalRules::File
32
NCG Custom Site Config on Multisite Instances
- Procedure
  - customized NCG block must be copied
  - at the beginning of block sitename is added, e.g. <NCG::SiteInfo egee.srce.hr>…
- Useful for
  - adding uncertified sites which require specific information sources
  - adding per site static file directives
33
Adding and Removing Site
- Handled by module NCG::SiteSet::File
- Adding site which is in GOCDB/SAM/ATP
  - ADD_SITE!sitename
- Adding site which is not in GOCDB/SAM/ATP
  - ADD_SITE_BDII!sitename!site_bdii_address
- Removing site
  - REMOVE_SITE!sitename
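The site directives above share one shape: a directive name followed by "!"-separated arguments. The following is an illustrative sketch of checking such lines before placing them in a static file. It is not the NCG::SiteSet::File implementation (which is a Perl module); skipping blank lines and "#" comments is an assumption made for this sketch, and the site names are hypothetical.

```python
# Expected argument counts, taken from the directive forms on the slide.
SITE_DIRECTIVES = {"ADD_SITE": 1, "ADD_SITE_BDII": 2, "REMOVE_SITE": 1}


def parse_site_directive(line):
    """Split one directive line into (name, [arguments]).

    Returns None for blank lines and '#' comments (an assumption here).
    Raises ValueError for unknown directives or wrong argument counts.
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    name, *args = text.split("!")
    expected = SITE_DIRECTIVES.get(name)
    if expected is None:
        raise ValueError(f"unknown directive: {name}")
    if len(args) != expected or any(not arg for arg in args):
        raise ValueError(f"{name} expects {expected} non-empty argument(s)")
    return name, args


# Hypothetical site names and BDII address.
for entry in ["ADD_SITE!EXAMPLE-SITE",
              "ADD_SITE_BDII!NEW-SITE!bdii.example.org",
              "REMOVE_SITE!OLD-SITE"]:
    print(parse_site_directive(entry))
```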
34
Adding and Removing Host
- Handled by module NCG::SiteInfo::File
- Host must be associated to service
- Adding host/service associated with VO
  - ADD_HOST_SERVICE_VO!hostname!service!VO
- Adding host/service
  - ADD_HOST_SERVICE!hostname!service
- Important!
  - on multisite instances adding hosts requires NCG::SiteInfo block to be associated to site
35
Adding and Removing Host
- Removing host
  - REMOVE_HOST!hostname
- Removing service from a host
  - REMOVE_HOST_SERVICE!hostname!service
- Removing service from all hosts
  - REMOVE_SERVICE!service
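The host directives can also be generated rather than typed by hand. The sketch below builds the directive lines listed on the two slides above. The helper names are hypothetical and not part of NCG, and the host, service, and VO values are placeholders.

```python
def _field(value):
    """Reject values that would break the '!'-separated format."""
    if not value or "!" in value or "\n" in value:
        raise ValueError(f"invalid directive field: {value!r}")
    return value


def add_host_service(hostname, service, vo=None):
    """Build ADD_HOST_SERVICE, or ADD_HOST_SERVICE_VO when a VO is given."""
    parts = ["ADD_HOST_SERVICE", _field(hostname), _field(service)]
    if vo is not None:
        parts[0] = "ADD_HOST_SERVICE_VO"
        parts.append(_field(vo))
    return "!".join(parts)


def remove(hostname=None, service=None):
    """Build REMOVE_HOST, REMOVE_HOST_SERVICE or REMOVE_SERVICE."""
    if hostname and service:
        return f"REMOVE_HOST_SERVICE!{_field(hostname)}!{_field(service)}"
    if hostname:
        return f"REMOVE_HOST!{_field(hostname)}"
    if service:
        return f"REMOVE_SERVICE!{_field(service)}"
    raise ValueError("give a hostname, a service, or both")


# Hypothetical host, service and VO names.
print(add_host_service("ce.example.org", "CE", vo="ops"))
print(remove(hostname="old.example.org"))
print(remove(service="SRM"))
```

On multisite instances the generated lines still need to go into a static file referenced by an NCG::SiteInfo block that is associated with the site, as the slide notes.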
36
Notifications
- Default grid services configuration
  - GOCDB CONTACT_ is configured
  - notifications are disabled
- Default Nagios internals configuration
  - is configured
  - notifications are enabled
37
Notifications
- Enabling grid service notifications
  - set ENABLE_NOTIFICATIONS = 1 in the block <NCG::ConfigGen><Nagios>
- Changing Nagios internals address
  - NAGIOS_ADMIN =
38
Notifications
- Possible to add contacts for grid services
- Handled by module NCG::LocalRules::File
- Adding contact for all hosts and metrics
- Adding contact for a single host
39
Notifications
- Adding contact for a given service on host
- Removing contact
  - useful if you don’t want to receive alerts on the default address
40
Links
- OAT page
  - many useful links to Nagios, NCG, MSG, packaging, repositories
- Installation manual
41
Links
- Nagios web interface
  - follow “Extra Notes” links where provided
- Nagios documentation is provided on every instance
42
Feedback & Support
- Regional admin mailing list
- OAT discuss mailing list
- Nagios GGUS Support Unit
- Recently migrated to JIRA tracker
43
Thank you! Questions?