Cluster Consistency Monitor
Why use a cluster consistency monitoring tool?
A cluster is, by definition, a set of configurations that keeps an application running even if a hardware resource fails. The primary purpose of a cluster installation is to determine the hardware, network and process settings required to allow an application failover.
Typical view of a cluster configuration for an SAP R/3 application:
While the Central Instance and the Database of an SAP R/3 system, along with some important file systems shared by NFS, are typically found inside the cluster, there are often also a number of dialog instances outside the cluster.
Represented in another way, this installation would look like this:
The application resources (file systems, volume groups, IP name and process control) have been put in a package. The package controls these resources and, in case of a node failure, relocates them to another node. While important application components reside on the Shared Devices, the definition of these Shared Devices and, for example, the configuration of the relocatable IP address have to be maintained manually on each node capable of running this application.
Resources of an SAP R/3 Installation
The above picture shows an example of an SAP R/3 installation in the file systems of a node. The following description takes this R/3 application as an example. Each installation of an R/3 application has many dependencies on the configuration of the hosting operating system.
If we change the representation again, we see that besides the resources defined and partly managed in a cluster, many other resources of the application may lie outside the cluster configuration:
It should be emphasized that any difference in configuration between the nodes of a cluster may cause problems for an application upon a failover. These problems may appear immediately after a switch and prevent the application from starting, but they may also appear only in certain situations during operation.
Resources in an Application Landscape
SAP R/3 systems are typically not isolated units but are integrated into a landscape of IT solutions. This landscape includes services like printing, backup or archiving, security and monitoring, but may also include interfaces to other important services. These services may run on another node or even another operating system and are required to maintain the business-critical processes supported by the SAP system. Examples are a data exchange facility with a production control system, an electronic data interchange (EDI) interface or a batch control system.
The following picture gives an example of such a landscape:
Monitor Resources in an Application Landscape
All these services also depend on resources and configurations inside the operating system of a node. In our clustered SAP R/3 environment, it is absolutely necessary to monitor and maintain not only the resources and configurations of the SAP R/3 system itself, but also those of the subsystems and their interfaces. CCMon is capable of monitoring these infrastructures as well, thanks to its adaptive resource configurations.
Features of the Cluster Consistency Monitor
The basic principle of the Cluster Consistency Monitor is to compare the resource configurations of nodes. Each node in the monitor has a program which reads a configuration profile of named resources and creates a resource database (Resource-DB). These Resource-DBs are then compared by another program, and possible error conditions are reported. The output of this comparison is an ASCII or HTML report.
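The comparison step described above can be illustrated with a minimal Python sketch. It is not CCMon's implementation: the in-memory dictionary used as a Resource-DB, the resource naming scheme (`fs:...`, `ip:...`) and the function name `compare_resource_dbs` are assumptions made only for this example.

```python
from typing import Dict, List

# Assumed, simplified Resource-DB: resource name -> recorded configuration.
ResourceDB = Dict[str, str]


def compare_resource_dbs(node_a: str, db_a: ResourceDB,
                         node_b: str, db_b: ResourceDB) -> List[str]:
    """Return one finding per resource that differs between two nodes."""
    findings = []
    for name in sorted(set(db_a) | set(db_b)):
        if name not in db_a:
            findings.append(f"{name}: missing on {node_a}")
        elif name not in db_b:
            findings.append(f"{name}: missing on {node_b}")
        elif db_a[name] != db_b[name]:
            findings.append(
                f"{name}: {node_a}={db_a[name]!r} differs from "
                f"{node_b}={db_b[name]!r}")
    return findings


if __name__ == "__main__":
    # Hypothetical sample data for two cluster nodes.
    db_node1 = {"fs:/usr/sap/C11": "mode=0755 owner=c11adm",
                "ip:relocatable": "10.0.0.50"}
    db_node2 = {"fs:/usr/sap/C11": "mode=0750 owner=c11adm"}
    for line in compare_resource_dbs("node1", db_node1, "node2", db_node2):
        print(line)
```

An empty result list would correspond to a consistent pair of nodes; each entry would become one line of the ASCII or HTML report.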
The Cluster Consistency Monitor is intended to be an adaptive toolkit. As a starting point, every generic resource that may influence the proper operation of an application in the cluster (such as the Shared Volume Groups and the file systems of that application) is added to the cluster Resource-DB. Individual configurations for a certain application or a certain customer environment can be added whenever necessary. The only action required is adding the resource definition to the Resource-DB. This flexibility enables the monitor to work with all kinds of applications. Even with uniform applications like SAP R/3, many customers have individual setups and need this flexibility.
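To illustrate how a profile-driven Resource-DB could stay adaptive, here is a minimal Python sketch. The profile syntax (`KIND argument` per line, `#` for comments), the kind names `FILE_CONTENT` and `FILE_ATTR`, and all function names are assumptions for this example and do not reflect CCMon's actual profile format.

```python
import hashlib
import os
from typing import Callable, Dict, Optional


def file_content(path: str) -> str:
    """Fingerprint a file's content, or report 'absent' if it is not a file."""
    if not os.path.isfile(path):
        return "absent"
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def file_attr(path: str) -> str:
    """Record permission bits and numeric owner/group of a path."""
    if not os.path.exists(path):
        return "absent"
    st = os.stat(path)
    mode = st.st_mode & 0o7777
    return f"mode={mode:04o} uid={st.st_uid} gid={st.st_gid}"


# Hypothetical resource kinds; supporting a new kind means adding one entry.
COLLECTORS: Dict[str, Callable[[str], str]] = {
    "FILE_CONTENT": file_content,
    "FILE_ATTR": file_attr,
}


def build_resource_db(
        profile_text: str,
        collectors: Optional[Dict[str, Callable[[str], str]]] = None,
) -> Dict[str, str]:
    """Evaluate every resource named in the profile on the local node."""
    if collectors is None:
        collectors = COLLECTORS
    db: Dict[str, str] = {}
    for raw in profile_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        kind, _, arg = line.partition(" ")
        arg = arg.strip()
        if kind not in collectors:
            raise ValueError(f"unknown resource kind in profile: {kind}")
        db[f"{kind}:{arg}"] = collectors[kind](arg)
    return db


if __name__ == "__main__":
    profile = "# generic resources\nFILE_ATTR /etc/services\n"
    print(build_resource_db(profile))
```

In this sketch, a customer-specific resource is covered by adding one line to the profile, and a new kind of resource by registering one more collector function.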
It is clearly necessary to run a Cluster Consistency Monitor periodically in order to maintain the failover capability of the cluster. The current implementation of the monitor supports a “Learning Mode” which gives detailed information about a discovered failure as well as some hints on how to fix the problem. In this way, knowledge about configuring and maintaining a cluster has been built into the monitor. If a cluster is well tuned, this feature can be switched off so that the monitor checks only for changes. Potential problems can also be fed into a management system like HP ITO.
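The effect of switching the Learning Mode on or off could look roughly like the following Python sketch. The hint texts, the finding kinds (`missing`, `differs`) and the function name `render_finding` are invented for illustration; the real monitor's messages are not shown in this document.

```python
from typing import Dict

# Hypothetical explanation texts, keyed by the kind of finding.
HINTS: Dict[str, str] = {
    "missing": "The resource exists on only one node; define it on every "
               "node that can run the package.",
    "differs": "The resource is configured differently; align the nodes "
               "or acknowledge the difference.",
}


def render_finding(resource: str, kind: str, learning_mode: bool) -> str:
    """Format one finding, adding a hint only when Learning Mode is on."""
    line = f"ERROR {resource}: {kind}"
    if learning_mode and kind in HINTS:
        line += f"\n    Hint: {HINTS[kind]}"
    return line


if __name__ == "__main__":
    print(render_finding("fs:/usr/sap/C11", "differs", learning_mode=True))
    print(render_finding("fs:/usr/sap/C11", "differs", learning_mode=False))
```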
The implemented functionalities of the Cluster Consistency Monitor are, in short:
- Output reports may contain Error Explanation Text to help interpret the compare results.
- All resources configured in a package can easily be added to the Resource-DB by using the statement ANALYSE PACKAGE when the Resource-DB is created.
- Acknowledged error messages can be suppressed. Changes in a cluster setup that are acceptable can be acknowledged with an entry in an Error Exclusion Table, which suppresses them. No trigger to external systems is created for suppressed messages.
- The Cluster Consistency Monitor detects and reports all kinds of setup problems of the monitor itself.
- The operation of the Cluster Consistency Monitor does not impact running applications. The resource consumption of the monitor is very low. Typically, the Comparator takes less than 10 seconds to compare two nodes.
- The use of the Cluster Consistency Monitor avoids failover tests. A failover test examines only the startup of the package on a secondary node, so potential problems that arise when the application runs on that node for a longer period are not recognized. The Cluster Consistency Monitor detects even that class of problems, as it simulates the view an application would have of the operating system configurations.
- The storage format of the Resource-DB is very dense. It allows fast transfer even over wide-area networks. Due to that density, memory consumption during the Comparator run and storage consumption for the Resource-DB are very low.
- The current implementation supports HP-UX and 11.xx, 32 Bit and 64 Bit.
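The Error Exclusion Table listed among the functionalities above can be pictured with a short Python sketch. It assumes, purely for illustration, that the table is a set of resource names and that a finding is a `(resource, message)` pair; the function names are hypothetical as well.

```python
from typing import Iterable, List, Set, Tuple

Finding = Tuple[str, str]  # (resource name, message)


def split_findings(findings: Iterable[Finding],
                   exclusions: Set[str]) -> Tuple[List[Finding], List[Finding]]:
    """Separate findings into active ones and acknowledged (suppressed) ones."""
    active: List[Finding] = []
    suppressed: List[Finding] = []
    for resource, message in findings:
        if resource in exclusions:
            suppressed.append((resource, message))
        else:
            active.append((resource, message))
    return active, suppressed


def should_trigger(active: List[Finding]) -> bool:
    """Only unsuppressed findings may raise a trigger to an external system."""
    return bool(active)


if __name__ == "__main__":
    found = [("fs:/tmp", "mode differs"),
             ("ip:relocatable", "missing on node2")]
    active, suppressed = split_findings(found, {"fs:/tmp"})
    print("report:", active)
    print("suppressed:", suppressed)
    print("trigger external system:", should_trigger(active))
```

Keeping suppressed findings in a separate list, rather than discarding them, would let a report still show what has been acknowledged.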
Overview of the Monitor Architecture
The Cluster Consistency Monitor consists of a configurable Data Supplier, a configurable Monitor and, optionally, a Presentation Layer.
Reporting
Reporting can be set up as ASCII and/or HTML:
Cluster Consistency Comparator Report is an ASCII report which compares the 2 nodes and s the report once a week.
Script called from cron: /usr/local/bin/ccmonitor
The End (for now)