Published by Andrew Schroeder. Modified over 11 years ago.
1
PARMON: A Comprehensive Cluster Monitoring System
PARMON Team, Centre for Development of Advanced Computing, Bangalore, India
Contact: Rajkumar Buyya (buyya@computer.org)
2
Topics of Discussion
* PARMON System Model & Architecture
    * PARMON Server
    * PARMON Client
* PARMON Features and Services
* PARMON Installation and its Usage
* Monitoring with PARMON
* PARMON Integration with other products
* Conclusions and Future Directions
3
Motivations
* Workstation clusters have of late become a cost-effective solution for HPC.
* C-DAC's PARAM OpenFrame is a large cluster of more than 40 Ultra-4 workstations interconnected through low-latency, high-bandwidth communication networks.
* Monitoring such large systems is tedious and challenging, because a typical workstation is designed to work as a standalone system rather than as part of a workstation cluster.
* System administrators need tools to monitor such systems effectively. PARMON addresses this problem.
4
C-DAC HPCC Software Architecture
(Layer diagram; labels as they appear on the slide)
* Cluster Hardware
* Solaris
* Light Weight Protocols
* Message Passing Interfaces: C-MPI, PVM
* System Management Tools
* Parallel File System: C-PFS
* Languages: C, F77, F90
* Development Tools: F90 IDE, DIVIA
* Applications
5
PARMON - Salient Features
* Online creation of the node and group database
* Monitoring of system activities at the component, node, group, or entire-cluster level
* Designed using state-of-the-art Java technology
* Monitoring of system components:
    * CPU, Memory, Disk, and Network
* Supports monitoring multiple instances of the same component
* Facility for defining events and automatic notification
* Miscellaneous facilities: message broadcast, invocation of system management commands (halt, reboot, etc.), system information and configuration
* Provides a GUI for initiating activities/requests and presents results graphically
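The slides do not show how events are defined, so the following is only one way to picture "definition of events and automatic notification": a threshold rule applied to each sampled metric value. Every name here (`ThresholdEvent`, `Listener`, the metric string `cpu.busy.percent`, the node names) is a hypothetical chosen for illustration and is not taken from PARMON.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from PARMON itself.
class ThresholdEvent {
    /** Callback invoked when a sample exceeds the threshold. */
    interface Listener {
        void onEvent(String node, String metric, double value);
    }

    private final String metric;
    private final double threshold;
    private final List<Listener> listeners = new ArrayList<>();

    ThresholdEvent(String metric, double threshold) {
        this.metric = metric;
        this.threshold = threshold;
    }

    void addListener(Listener l) {
        listeners.add(l);
    }

    /** Checks one sample; notifies all listeners and returns true if it exceeds the threshold. */
    boolean check(String node, double value) {
        if (value <= threshold) {
            return false;
        }
        for (Listener l : listeners) {
            l.onEvent(node, metric, value);
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed event definition: CPU busy above 90 percent.
        ThresholdEvent cpuHigh = new ThresholdEvent("cpu.busy.percent", 90.0);
        cpuHigh.addListener((node, metric, value) ->
            System.out.println("ALERT " + node + " " + metric + "=" + value));
        cpuHigh.check("node01", 42.0);  // below threshold, no notification
        cpuHigh.check("node02", 97.5);  // above threshold, listener fires
    }
}
```

Separating the rule (`ThresholdEvent`) from the reaction (`Listener`) would let the same event drive a dialog box, a log entry, or a mail message without changing the check itself.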
6
PARMON System Model
(Diagram labels)
* parmon: PARMON Client on JVM
* parmond: PARMON Server on Solaris Node
* High-Speed Switch
7
PARMON Implementation
* Server
    * Multithreaded using POSIX and Solaris threads
    * Developed in C, as it needs to access system internals
    * It is a stateless server
* Client
    * Developed in Java
    * Java features are used extensively
    * A new window is created for each client request, and that window interacts with the server
    * Threads are used extensively when creating online resource utilization meters
    * Dynamically reconfigures itself when the node database changes
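To make the pairing of a stateless server with threaded client meters concrete, here is a minimal sketch. It replaces the network with an in-process call, and the request strings (`GET cpu`, `GET mem`), the replies, and the class names are all assumptions, since the slides do not describe PARMON's wire protocol. The point it illustrates is that each poll is a complete request, so the server needs no per-client memory and each meter can run on its own thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: the request names and classes below are illustrative only.
class MeterSketch {
    /** Stands in for a stateless server: each reply depends only on the request. */
    interface StatelessServer {
        String handle(String request);
    }

    /** A client-side meter that polls the server on its own thread. */
    static class Meter extends Thread {
        private final StatelessServer server;
        private final String request;
        private final int samples;
        final List<String> readings = new CopyOnWriteArrayList<>();

        Meter(StatelessServer server, String request, int samples) {
            this.server = server;
            this.request = request;
            this.samples = samples;
        }

        @Override
        public void run() {
            for (int i = 0; i < samples; i++) {
                // Every poll is a complete, self-describing request.
                readings.add(server.handle(request));
                try {
                    Thread.sleep(10);  // polling interval, shortened for the demo
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake server with fixed replies; a real one would read kernel counters.
        StatelessServer fake = req -> req.equals("GET cpu") ? "cpu 37" : "mem 512";
        Meter cpu = new Meter(fake, "GET cpu", 3);
        Meter mem = new Meter(fake, "GET mem", 3);
        cpu.start();
        mem.start();
        cpu.join();
        mem.join();
        System.out.println(cpu.readings + " " + mem.readings);
    }
}
```

Because the fake server keeps no session, either meter could be stopped or restarted without the other noticing, which is the property that makes one window (and one thread) per request workable.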
8
Setting up of PARMON
* Server installation & invocation
    * Binding to port
    * Rights (requires root permission for full functionality)
    * parmond or parmond (either at boot time or online)
    * Needs to be loaded on all nodes to be monitored
* Client installation & invocation
    * Java-based client (the client machine can be any PC/workstation supporting a JVM)
    * CLASSPATH (pointing to classes.zip, parmon.jar)
    * jar file (parmon.jar)
    * java parmon or java parmon
10
Monitoring System Activities and Resource Utilization
11
PARMON Launcher
12
Creation of Node Database
13
Node Deletion
14
Group Creation
15
Group Modification/Deletion
16
Resource Utilization at a Glance
17
Selection of Nodes/Group
18
CPU Usage Monitoring
19
Memory Usage Monitoring
20
Disk/Network Usage Monitoring
21
Message Viewer (System logs)
22
Process activities
23
Kernel Data Catalog - CPU
24
Kernel Data Catalog - Memory
25
Kernel Data Catalog - Disk
26
Kernel Data Catalog - Network
27
Catalog of CPU Parameters
28
Component View - Physical
29
Component View - Logical
30
Message Broadcast
31
System Configuration
32
System Information
33
Issuing Commands : halt, shutdown, etc.
34
Node Diagnostics - Online (SunVTS)
35
Online Help
36
PARMON Integration with other Products
* PARMON can send resource utilization information to any other product if protocols are made available
(Diagram labels: parmond on Node 1 through Node N; PARAM online bulletin board)
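One way to read "if protocols are made available" is as an adapter per external product: the monitor hands each utilization sample to a small interface, and each product supplies its own implementation. The sketch below assumes that reading. `Sample`, `Sink`, `BulletinBoardSink`, and the text line format are hypothetical and do not describe PARMON's actual integration mechanism or the PARAM bulletin board's real protocol.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: PARMON's real integration interface is not shown in the slides.
class IntegrationSketch {
    /** One utilization sample for one node. */
    static class Sample {
        final String node;
        final String metric;
        final double value;

        Sample(String node, String metric, double value) {
            this.node = node;
            this.metric = metric;
            this.value = value;
        }
    }

    /** Adapter for an external product; each product supplies its own protocol. */
    interface Sink {
        void publish(Sample s);
    }

    /** Example adapter that renders samples as text lines (assumed format). */
    static class BulletinBoardSink implements Sink {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(Sample s) {
            lines.add(s.node + ": " + s.metric + " = " + s.value);
        }
    }

    /** Sends every sample to every registered sink. */
    static void forward(List<Sample> samples, List<Sink> sinks) {
        for (Sample s : samples) {
            for (Sink sink : sinks) {
                sink.publish(s);
            }
        }
    }

    public static void main(String[] args) {
        BulletinBoardSink board = new BulletinBoardSink();
        List<Sink> sinks = new ArrayList<>();
        sinks.add(board);
        List<Sample> samples = Arrays.asList(
            new Sample("node1", "cpu", 0.75),
            new Sample("nodeN", "memory", 0.5));
        forward(samples, sinks);
        for (String line : board.lines) {
            System.out.println(line);
        }
    }
}
```

Supporting another product would then mean writing one more `Sink` implementation, leaving the sampling code untouched.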
37
Conclusions and Future Directions
* PARMON has been used successfully to monitor the PARAM OpenFrame supercomputer, a cluster of 48 Ultra-4 workstations running the Sun Solaris operating system.
* The client is portable across platforms supporting Java.
* Comprehensive monitoring support and GUI.
* PARMON supports Solaris and Linux clusters; support for NT clusters is planned.
* It can easily be extended to support web-based monitoring of clusters by creating an interface server (running on a web server) between the client and the PARMON servers running on the cluster nodes.
38
Thank You