Distributed monitoring system
Why Monitor?
– Identify problems, then solve them
– Ensure requirements are met
– Manage many computers
– Spot trends in the system
– Increase performance
– Identify problems in applications
Monitoring Grids
A grid consists of:
– Nodes (a single machine)
– Clusters (a collection of nodes)
– Grids (a collection of clusters)
General objective of a grid: to perform high-performance computing.
Solution: monitor at each level (node, cluster, grid).
Monitoring Nodes
Node: a terminal with one or more processors.
Factors to monitor:
– Temperatures
– CPU/memory usage
– Disk space
– Network activity
– Jobs
Provides vital statistics for each node.
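As a rough illustration of the per-node factors listed above, the following Python sketch gathers a few of them using only the standard library. The function name `collect_node_stats` and the dictionary keys are hypothetical, chosen for this example. Temperature, network activity and job statistics are left out because they need platform-specific sources.

```python
import os
import shutil


def collect_node_stats(path="/"):
    """Return a few basic statistics for the local node (illustrative only)."""
    disk = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    stats = {
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
        "disk_used_percent": round(100.0 * disk.used / disk.total, 1),
    }
    # os.getloadavg() exists only on Unix-like systems and may fail there too.
    if hasattr(os, "getloadavg"):
        try:
            stats["load_1min"] = os.getloadavg()[0]
        except OSError:
            pass
    return stats


if __name__ == "__main__":
    for name, value in collect_node_stats().items():
        print(f"{name}: {value}")
```

A real monitoring agent would sample these values periodically and send them on, rather than print them once.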
Monitoring Clusters & Grids
Clusters and grids: collections of nodes.
Factors to monitor:
– Load
– Processing power
– Uptime
– Availability
Provides performance statistics.
Ganglia
Ganglia is a distributed monitoring system.
– Based on a hierarchical structure
– Lightweight: low overhead and high concurrency
Prominent features:
– Visualization using graphs
– Selective statistics
Ganglia: Architecture [IBM (2008), 'Performance Monitoring using Ganglia', IBM Manual - Wiki.]
Ganglia: gmond
– Lightweight service
– Records and sends data via XDR:
  – CPU statistics
  – Memory statistics
  – Network statistics
  – Job statistics
– Uses XML over TCP
Diagram: gmond runs on each node; gmetad runs on the central node.
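To make the "XML over TCP" point concrete, here is a minimal Python sketch that reads the XML report from a gmond and groups metrics by host. It assumes gmond writes its full report to any client that connects on the default port 8649 and then closes the connection. The function names and `SAMPLE_XML` are hypothetical, and the sample is a hand-written, heavily simplified stand-in for a real report.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Connect to gmond and read its XML report until the peer closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)  # bytes, so the XML encoding declaration is honoured


def parse_gmond_xml(xml_data):
    """Return {host name: {metric name: value string}} from a gmond report."""
    root = ET.fromstring(xml_data)
    hosts = {}
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        hosts[host.get("NAME")] = metrics
    return hosts


# Hand-written, simplified sample (an assumption, not captured output).
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1" SOURCE="gmond">
  <CLUSTER NAME="demo">
    <HOST NAME="node1" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="4" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    print(parse_gmond_xml(SAMPLE_XML))
```

Against a live node, `parse_gmond_xml(fetch_gmond_xml("node1"))` would be the equivalent call, where `node1` is a placeholder host name.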
Ganglia: gmetad
– Lightweight service
– Receives and sends data obtained from gmond and gmetad
– Saves data on disk using RRD (round-robin database)
– Supports the creation of multiple monitoring domains
Reason: Ganglia is very scalable.
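The round-robin idea behind RRD can be sketched in a few lines of Python: storage has a fixed number of slots, and each new sample overwrites the oldest one once the slots are full. This is only a conceptual model with a hypothetical class name. It is not the RRD file format and it skips the consolidation that the real tool performs.

```python
from collections import deque


class RoundRobinSeries:
    """Fixed-size series: once full, each new value evicts the oldest."""

    def __init__(self, slots):
        self._values = deque(maxlen=slots)

    def add(self, value):
        self._values.append(value)

    def values(self):
        return list(self._values)

    def average(self):
        if not self._values:
            return None
        return sum(self._values) / len(self._values)


series = RoundRobinSeries(slots=3)
for load in [0.5, 0.7, 0.9, 1.1, 1.3]:
    series.add(load)

print(series.values())  # [0.9, 1.1, 1.3]
```

Because the storage never grows, disk usage stays constant however long the node is monitored.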
Ganglia: Web Server & GUI Tools
GUI tools:
– PHP scripts that extract data from Ganglia
– Generate visualizations using graphs
Web server:
– Apache with PHP support to host and execute the scripts
– SSL and XML support is required
Ganglia: gstat
gstat: command-line tool to query gmond for information.
Syntax:
  $ gstat --help
  Usage: gstat [OPTIONS]...
    -h        --help               Print help and exit
    -V        --version            Print version and exit
    -a        --all                List all hosts. Not just hosts running gexec (default=off)
    -d        --dead               Print only the hosts which are dead (default=off)
    -m        --mpifile            Print a load-balanced mpifile (default=off)
    -1        --single_line        Print host and information all on one line (default=off)
    -l        --list               Print ONLY the host list (default=off)
    -n        --numeric            Print numeric addresses instead of hostnames (default=off)
    -iSTRING  --gmond_ip=STRING    Specify the ip address of the gmond to query (default=' ')
    -pINT     --gmond_port=INT     Specify the gmond port to query (default=8649)
Ganglia: Using gstat
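One way to drive gstat from a script is to assemble its command line from the options listed in the help text above. The sketch below does only that. The helper names `build_gstat_command` and `run_gstat` are hypothetical, and `run_gstat` assumes gstat is installed and on the PATH.

```python
import subprocess


def build_gstat_command(all_hosts=False, dead_only=False, single_line=False,
                        list_only=False, numeric=False,
                        gmond_ip=None, gmond_port=None):
    """Build a gstat argument list from the long options in `gstat --help`."""
    cmd = ["gstat"]
    if all_hosts:
        cmd.append("--all")
    if dead_only:
        cmd.append("--dead")
    if single_line:
        cmd.append("--single_line")
    if list_only:
        cmd.append("--list")
    if numeric:
        cmd.append("--numeric")
    if gmond_ip is not None:
        cmd.append(f"--gmond_ip={gmond_ip}")
    if gmond_port is not None:
        cmd.append(f"--gmond_port={gmond_port}")
    return cmd


def run_gstat(**options):
    """Run gstat and return its standard output (requires gstat installed)."""
    result = subprocess.run(build_gstat_command(**options),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Show the command that would list all hosts, one per line.
    print(" ".join(build_gstat_command(all_hosts=True, single_line=True)))
```

Keeping command construction separate from execution makes the wrapper easy to check on a machine without Ganglia.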
Ganglia: gmetric
Ganglia: Using gmetric
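A custom metric can be published by calling gmetric with a name, a value and a type. The Python sketch below only builds such a command line. It assumes the `--name`, `--value`, `--type` and `--units` options of the stock gmetric tool, which should be verified with `gmetric --help` on the target installation. The helper name and the example metric are hypothetical.

```python
def build_gmetric_command(name, value, metric_type="float", units=""):
    """Build a gmetric argument list for one custom metric.

    The option names are assumed from the stock gmetric tool; confirm them
    with `gmetric --help` before relying on this.
    """
    cmd = [
        "gmetric",
        f"--name={name}",
        f"--value={value}",
        f"--type={metric_type}",
    ]
    if units:
        cmd.append(f"--units={units}")
    return cmd


if __name__ == "__main__":
    # Hypothetical example metric: a CPU temperature reading.
    print(" ".join(build_gmetric_command("cpu_temp", 47.5, units="Celsius")))
```

The resulting list can be passed to `subprocess.run` on a node where gmond is running.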
Building Monitoring Domains Using Ganglia
[IBM (2008), 'Performance Monitoring using Ganglia', IBM Manual - Wiki.]
Guidelines for Building Monitoring Domains
Service   Function                   Sends to          Receives from
gmond     Collects data from nodes   gmond & gmetad    gmond
gmetad    Saves data to disk         gmetad            gmond & gmetad
gstat     Extracts information       -                 gmond
gmetric   Creates custom metrics     -                 -
Prerequisites
– Finalize your IP address
– Finalize your domain name
– Finalize your time zone
– Synchronize the machine's clock using NTP
– Download the following packages:
  – Ganglia [ ]
  – PHP [ ]
  – Apache [ ]
  – RRDtool [ ]
Steps in Installing Ganglia
1. Map monitoring domains
2. Choose central nodes from the domains
3. Install gmond on nodes
4. Install gmetad on central nodes
5. Install the web server on central nodes
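The installation steps above can be sketched as a small planning function: given the monitoring domains and the chosen central node of each, it lists which services go on which machine. The function name, the domain names and the host names are hypothetical. The sketch also assumes a central node runs gmond like any other node.

```python
def plan_installation(domains, central_nodes):
    """Map each node to the services it should run.

    domains:       {domain name: [node names]}
    central_nodes: {domain name: node chosen as the central node}
    """
    plan = {}
    for domain, nodes in domains.items():
        central = central_nodes[domain]
        if central not in nodes:
            raise ValueError(f"{central} is not a node of domain {domain}")
        for node in nodes:
            services = ["gmond"]  # every node reports its own statistics
            if node == central:
                # Assumption: the central node also hosts gmetad and the web UI.
                services += ["gmetad", "web server"]
            plan[node] = services
    return plan


if __name__ == "__main__":
    # Hypothetical domains and hosts, for illustration only.
    domains = {"lab-a": ["a1", "a2", "a3"], "lab-b": ["b1", "b2"]}
    central = {"lab-a": "a1", "lab-b": "b1"}
    for node, services in plan_installation(domains, central).items():
        print(node, "->", ", ".join(services))
```

Writing the plan down before installing anything makes it easier to spot a domain that has no central node.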