VCS Building Blocks
Topic 1: Cluster Terminology
After completing this topic, you will be able to define clustering terminology.

A Nonclustered Computing Environment
Definition of a Cluster
A cluster is a collection of multiple independent systems working together under a management framework for increased service availability.
[Figure labels: Application, Node, Storage, Cluster Interconnect]

Definition of VERITAS Cluster Server and Failover
VCS detects faults and performs automated failover.
[Figure labels: Application, Node, Failed Node, Storage, Cluster Interconnect]

Definition of an Application Service
An application service is a collection of all the hardware and software components required to provide a service. If the service must be migrated to another system, all components need to be moved in an orderly fashion. Examples include Web servers, databases, and applications.

Definition of a Service Group
A service group is a virtual container that enables VCS to manage an application service as a unit. All components required to provide the service, and the relationships between these components, are defined within the service group. A service group has attributes that define its behavior, such as where it can start and run.

Service Group Types
Failover:
– The service group can be online on only one cluster system at a time.
– VCS migrates the service group at the administrator’s request and in response to faults.
Parallel:
– The service group can be online on multiple cluster systems simultaneously.
– An example is Oracle Real Application Clusters (RAC).
Hybrid:
– This is a special-purpose type of service group used to manage service groups in replicated data clusters (RDCs). RDCs use replication between systems at different sites instead of shared storage.

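The difference between failover and parallel groups can be sketched as a small model. This is an illustration only, not a VCS API: the ServiceGroup class, its fields, and the RacSG group name are hypothetical, and hybrid groups are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceGroup:
    """Toy model of a service group (hypothetical, not a VCS interface)."""
    name: str
    parallel: bool = False                       # False models a failover group
    online_on: set = field(default_factory=set)  # systems where the group is online

    def can_go_online(self, system: str) -> bool:
        if system in self.online_on:
            return False  # already online on that system
        # A failover group may be online on only one system at a time;
        # a parallel group may be online on several systems at once.
        return self.parallel or not self.online_on

    def bring_online(self, system: str) -> None:
        if not self.can_go_online(system):
            raise RuntimeError(f"{self.name} cannot go online on {system}")
        self.online_on.add(system)


web = ServiceGroup("WebSG")                 # failover group
rac = ServiceGroup("RacSG", parallel=True)  # hypothetical parallel group

web.bring_online("S1")
rac.bring_online("S1")
rac.bring_online("S2")

print(web.can_go_online("S2"))   # False: WebSG is already online on S1
print(sorted(rac.online_on))     # ['S1', 'S2']
```
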
Definition of a Resource
Resources are VCS objects that correspond to the hardware or software components of an application service.
– Each resource must have a unique name throughout the cluster. Choosing names that reflect the service group name makes it easy to identify all resources in that group, for example, WebIP in the WebSG group.
– Resources are always contained within service groups.
– Resource categories include:
  – Persistent: None (NIC), On-only (NFS)
  – Nonpersistent: On-off (Mount)

Resource Dependencies
Resources in a service group have a defined dependency relationship, which determines the order in which the resources are brought online and taken offline.
– A parent resource depends on a child resource, so the child is brought online before the parent and the parent is taken offline before the child.
– There is no limit to the number of parent and child resources.
– Persistent resources, such as NIC, cannot be parent resources.
– Dependencies cannot be cyclical.
[Figure labels: Parent, Child, Parent/child]

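The ordering rules above amount to a topological sort of the dependency graph. The following is a minimal sketch using Python's standard graphlib module; the dependency tree and the WebNIC resource name are assumptions made up for illustration, not taken from a real configuration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency tree: each parent maps to the children it depends on.
depends_on = {
    "WebIP": {"WebNIC"},
    "WebMount": {"WebVol"},
    "WebVol": {"WebDG"},
}


def online_order(deps):
    # TopologicalSorter yields a node only after everything it depends on,
    # so children come before their parents.
    return list(TopologicalSorter(deps).static_order())


def offline_order(deps):
    # Offline is the reverse: parents go down before their children.
    return list(reversed(online_order(deps)))


def is_cyclical(deps):
    try:
        online_order(deps)
    except CycleError:
        return True
    return False


print(online_order(depends_on))
print(offline_order(depends_on))
print(is_cyclical({"A": {"B"}, "B": {"A"}}))  # True: such a tree would be invalid
```

The exact order among unrelated resources (for example WebNIC versus WebDG) is not fixed by the dependencies, so only the relative order of each parent and its children is meaningful.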
Resource Attributes
Resource attributes define an individual resource. The attribute values are used by VCS to manage the resource. Resources can have required and optional attributes, as specified by the resource type definition.
Solaris example for the WebMount resource: mount -F vxfs /dev/vx/dsk/WebDG/WebVol /Web

Resource Types
Resources are classified by type. The resource type specifies the attributes needed to define a resource of that type. For example, a Mount resource has different properties than an IP resource.
Solaris syntax: mount [-F FSType] [options] block_device mount_point

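To illustrate how a resource type constrains the attributes of its resources, the sketch below checks a Mount-style resource against a type definition and maps its attribute values onto the Solaris mount command line. The MOUNT_TYPE dictionary, the split into required and optional attributes, and the helper functions are simplifying assumptions for illustration; they are not the contents of types.cf.

```python
# Hypothetical, simplified type definition (the required/optional split
# is an assumption made for this example).
MOUNT_TYPE = {
    "required": ("MountPoint", "BlockDevice", "FSType"),
    "optional": ("MountOpt", "FsckOpt"),
}


def validate(type_def, attrs):
    """Reject resources that omit required attributes or use unknown ones."""
    known = type_def["required"] + type_def["optional"]
    missing = [a for a in type_def["required"] if a not in attrs]
    unknown = [a for a in attrs if a not in known]
    if missing or unknown:
        raise ValueError(f"missing={missing} unknown={unknown}")


def mount_command(attrs):
    """Build the argument list for: mount -F FSType [options] block_device mount_point."""
    validate(MOUNT_TYPE, attrs)
    cmd = ["mount", "-F", attrs["FSType"]]
    if "MountOpt" in attrs:
        cmd += ["-o", attrs["MountOpt"]]
    return cmd + [attrs["BlockDevice"], attrs["MountPoint"]]


web_mount = {
    "MountPoint": "/web",
    "BlockDevice": "/dev/vx/dsk/WebDG/WebVol",
    "FSType": "vxfs",
    "FsckOpt": "-y",
}

print(" ".join(mount_command(web_mount)))
# mount -F vxfs /dev/vx/dsk/WebDG/WebVol /web
```
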
Agents: How VCS Controls Resources
– Each resource type has a corresponding agent process that manages all resources of that type.
– Agents have one or more entry points that perform a set of actions on resources.
– Each system runs one agent for each active resource type.
[Figure: the entry points online, offline, monitor, and clean acting on example resources: NIC eri0, IP, Mount /web/log, Volume WebVol and logVol, Disk Group WebDG]

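The four entry points named on this slide can be pictured with a toy agent. This is only a sketch of the idea and does not use the VCS agent framework: the MarkerFileAgent class and the marker file it manages are hypothetical, and the resource counts as online whenever the file exists.

```python
import os
import tempfile


class MarkerFileAgent:
    """Toy agent for a hypothetical resource type (not the VCS agent framework)."""

    def online(self, path):
        # Bring the resource online: create the marker file.
        with open(path, "w"):
            pass

    def offline(self, path):
        # Take the resource offline in an orderly way.
        if os.path.exists(path):
            os.remove(path)

    def monitor(self, path):
        # Report the current state of the resource.
        return "ONLINE" if os.path.exists(path) else "OFFLINE"

    def clean(self, path):
        # Forcibly clean up after a failure; must not raise if nothing is left.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass


agent = MarkerFileAgent()
states = []
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "webapp.marker")
    states.append(agent.monitor(marker))   # before online
    agent.online(marker)
    states.append(agent.monitor(marker))   # after online
    agent.offline(marker)
    states.append(agent.monitor(marker))   # after offline
    agent.clean(marker)                    # safe even though the marker is gone

print(states)  # ['OFFLINE', 'ONLINE', 'OFFLINE']
```
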
Topic 2: Cluster Communication
After completing this topic, you will be able to describe cluster communication mechanisms.

Cluster Communication
A cluster interconnect provides a communication channel between cluster nodes.
The cluster interconnect serves to:
– Determine which systems are members of the cluster using a heartbeat mechanism.
– Maintain a single view of the status of the cluster configuration on all systems in the cluster membership.

Low-Latency Transport (LLT)
LLT:
– Is responsible for sending heartbeat messages
– Transports cluster communication traffic to every active system
– Balances traffic load across multiple network links
– Maintains the communication link state
– Is a nonroutable protocol
– Runs on an Ethernet network

Group Membership Services/Atomic Broadcast (GAB)
GAB:
– Performs two functions:
  – Manages cluster membership; referred to as GAB membership
  – Sends and receives atomic broadcasts of configuration information
– Is a proprietary broadcast protocol
– Uses LLT as its transport mechanism
[Figure: GAB layered on top of LLT on each system]

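The relationship between heartbeats and membership can be sketched as follows. This is a conceptual illustration, not how LLT or GAB are implemented: the function name, the timestamps, and the 16-second timeout are assumptions chosen for the example.

```python
def current_members(last_heartbeat, now, timeout):
    """Return the systems whose most recent heartbeat is recent enough.

    last_heartbeat maps a system name to the time (in seconds) at which
    its last heartbeat was received over the interconnect.
    """
    return {
        system
        for system, seen in last_heartbeat.items()
        if now - seen <= timeout
    }


# Hypothetical observations: S3 has been silent for 20 seconds.
last_heartbeat = {"S1": 100.0, "S2": 99.5, "S3": 80.0}

members = current_members(last_heartbeat, now=100.0, timeout=16.0)
print(sorted(members))  # ['S1', 'S2']
```

In this picture, the heartbeat bookkeeping corresponds to LLT's role and the resulting membership set corresponds to what GAB maintains.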
The Fencing Driver
Fencing:
– Monitors GAB to detect cluster membership changes
– Ensures a single view of cluster membership
– Prevents multiple nodes from accessing the same Volume Manager 4.x shared storage devices
[Figure labels: LLT, GAB, Fence, Reboot]

The High Availability Daemon (HAD)
The VCS engine, the high availability daemon:
– Runs on each system in the cluster
– Maintains configuration and state information for all cluster resources
– Manages all agents
The hashadow daemon monitors HAD.
[Figure labels: HAD, hashadow, LLT, GAB, Fence]

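Comparing VCS Communication Protocols and TCP/IP
[Figure: layered comparison of the two stacks. User processes: HAD and hashadow (VCS), iPlanet (TCP/IP). Kernel processes: GAB and LLT (VCS), TCP and IP. Hardware: NIC in both stacks.]
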
Topic 3: Maintaining the Cluster Configuration
After completing this topic, you will be able to describe how the cluster maintains the configuration.

Maintaining the Cluster Configuration
– HAD maintains a replica of the cluster configuration in memory on each system.
– Changes to the configuration are broadcast to HAD on all systems simultaneously by way of GAB using LLT.
– The configuration is preserved on disk in the main.cf file.
[Figure labels: HAD and hashadow on each system, main.cf]

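The replication idea can be sketched with in-memory replicas that all receive the same change. This is a conceptual toy, not HAD or GAB code: the Node class, the broadcast function, and the attribute key used below are hypothetical, and the simple loop stands in for the atomic broadcast, which in the real cluster delivers each change to every member.

```python
class Node:
    """Hypothetical cluster system holding an in-memory configuration replica."""

    def __init__(self, name):
        self.name = name
        self.config = {}

    def apply(self, change):
        key, value = change
        self.config[key] = value


def broadcast(nodes, change):
    # Stand-in for the atomic broadcast: every member applies the same change.
    for node in nodes:
        node.apply(change)


cluster = [Node("S1"), Node("S2")]
broadcast(cluster, ("WebMount.MountPoint", "/web"))
broadcast(cluster, ("WebMount.FSType", "vxfs"))

# Every replica now holds an identical view of the configuration.
print(all(node.config == cluster[0].config for node in cluster))  # True
```
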
VCS Configuration Files
A simple text file is used to store the cluster configuration on disk. The file contents are described in detail later in the course.

main.cf:

    include "types.cf"

    cluster vcs (
        UserNames = { admin = ElmElgLimHmmKumGlj }
        Administrators = { admin }
        CounterInterval = 5
        )

    system S1 (
        )

    system S2 (
        )

    group WebSG (
        SystemList = { S1 = 0, S2 = 1 }
        )

    Mount WebMount (
        MountPoint = "/web"
        BlockDevice = "/dev/vx/dsk/WebDG/WebVol"
        FSType = vxfs
        FsckOpt = "-y"
        )

Topic 4: VCS Architecture
After completing this topic, you will be able to describe the VCS architecture.

VCS Architecture
– Agents monitor resources on each system and provide status to HAD on the local system.
– HAD on each system sends status information to GAB.
– GAB broadcasts configuration information to all cluster members.
– LLT transports all cluster communications to all cluster nodes.
– HAD on each node takes corrective action, such as failover, when necessary.

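The last step, corrective action, can be sketched as choosing a failover target once a fault has been reported. This is a simplified illustration rather than HAD's actual decision logic: the function name is hypothetical, S3 is an extra system added for the example, and it assumes that a lower number in SystemList marks a more preferred system.

```python
def pick_failover_target(system_list, members, failed):
    """Choose where to restart a failover group after a fault.

    system_list maps system name to priority (assumption: lower is preferred).
    members is the set of systems currently in the cluster membership.
    failed is the system on which the fault was reported.
    """
    candidates = [s for s in system_list if s in members and s != failed]
    if not candidates:
        return None  # nowhere to fail over to
    return min(candidates, key=lambda s: system_list[s])


system_list = {"S1": 0, "S2": 1, "S3": 2}

# S1 faults; S2 and S3 are still cluster members.
print(pick_failover_target(system_list, {"S2", "S3"}, failed="S1"))  # S2

# Only the faulted system remains: no target is available.
print(pick_failover_target(system_list, {"S1"}, failed="S1"))        # None
```
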
Topic 5: Supported Failover Configurations
After completing this topic, you will be able to describe the failover configurations supported by VCS.

Active/Passive
[Figure panels: Before Failover, After Failover]

Active/Passive N-to-1
[Figure panels: Before Failover, After Failover]

Active/Passive N + 1
[Figure panels: Before Failover, After Failover, After Repair; label: Standby Server]

Active/Active
[Figure panels: Before Failover, After Failover]

N-to-N
[Figure panels: Before Failover, After Failover]