Best Practices for Implementing Unicenter NSM r11.1 in an HA MSCS Environment Part II -Last Revision April 24, 2006.


1 Best Practices for Implementing Unicenter NSM r11.1 in an HA MSCS Environment Part II -Last Revision April 24, 2006

2 Agenda
© 2005 Computer Associates International, Inc. (CA). All trademarks, trade names, services marks and logos referenced herein belong to their respective companies.
This presentation will cover the following topics:
- Agent Technology
- Management Command Center (MCC)
- 2dmap
- Job Management Option (JMO)
- Event Management
- Uninstallation
- FAQs

3 Disclaimer
- Microsoft Cluster supports more than 2 server nodes. Although this presentation focuses on 2-node clusters, the same concepts apply to clusters with more nodes.
- If you are planning to install an Ingres MDB, you must use Unicenter NSM r11. For a SQL MDB, you must use Unicenter NSM r11.1.
- This presentation only covers NSM r11.1.

4 References
- For additional information, consult "Appendix C: Making Components Cluster Aware and Highly Available" in the Unicenter NSM r11.1 Administrator Guide.

5 Agent Technology

6 DSM IP Address Scoping
- For cluster nodes, change the DSM IP Address Scope from LOCALHOST or the real cluster node name to the cluster name. For example:
  - If real node names = I14YClust1, I14YClust2
  - And SQL Virtual Node Name = I14YCLUSTSQL
  - And SQL Instance Name = SQLINSTA
  - Update DSM Server to I14YCLUSTSQL
- DSM will not manage those hosts if real nodes are specified
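The scoping rule above can be expressed as a small check. This is only an illustrative sketch: `check_dsm_scope` is a hypothetical helper that compares names, not part of Unicenter NSM, and it does not read or change any DSM configuration.

```python
def check_dsm_scope(scope_entries, real_nodes, virtual_node):
    """Return a list of problems found in DSM server entries for an HA cluster.

    Hypothetical helper: it only compares host names, following the rule
    that the scope must name the SQL virtual node, never LOCALHOST or a
    real cluster node.
    """
    real = {name.upper() for name in real_nodes}
    entries = {entry.upper() for entry in scope_entries}
    problems = []
    for entry in scope_entries:
        name = entry.upper()
        if name == "LOCALHOST":
            problems.append(f"{entry}: LOCALHOST will not work in HA")
        elif name in real:
            problems.append(f"{entry}: real node name, use {virtual_node}")
    if virtual_node.upper() not in entries:
        problems.append(f"no DSM server entry for {virtual_node}")
    return problems


if __name__ == "__main__":
    nodes = ["I14YClust1", "I14YClust2"]
    for problem in check_dsm_scope(["LOCALHOST"], nodes, "I14YCLUSTSQL"):
        print(problem)
```

With the names from the example, an entry list containing only I14YCLUSTSQL passes, while LOCALHOST or either real node name is flagged.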

7 DSM IP Address Scoping
The default of LOCALHOST will not work in HA. Change it to your SQL VNode name, or add another DSM Server entry for your SQL VNode (in this example, "I14YCLUSTSQL").

8 World View objects
- DSM runs under the SQL Virtual Node
- Agents run on both nodes
- dsmMonitor World View objects are displayed as "critical" on the inactive nodes because dsmMonitor only runs on the active node

9 Classic GUI Display - Cluster Nodes
[Screenshot labels: Active Node, Passive Node, SQL VNode]

10 Cluster Nodes
- In the previous classic 2D map:
  - I14YCLUSTSQL is the SQL VNode
    - This shows the objects and status of the active cluster node
    - You do not need to discover this node unless you wish to navigate into the active node objects
  - I14YCLUST1 is the active cluster node A
  - I14YCLUST2 is the passive cluster node B
    - dsmMonitor shows as absent because it only runs on the active node

11 Remote DSM
- A remote DSM may connect to the HA MDB. The remote DSM itself may be non-HA
- aws_dsm retries the connection until the MDB is available on the new active node
- When the MDB is available after failover, the DSM reconnects
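The retry behaviour described above follows a common reconnect pattern, sketched here in Python. This is not how aws_dsm is implemented: `connect_with_retry` and its parameters are hypothetical, and `connect` stands for any callable that raises `ConnectionError` while the MDB is still moving to the new active node.

```python
import time


def connect_with_retry(connect, delay_seconds=5.0, max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, waiting between attempts.

    Hypothetical sketch of a reconnect loop. With max_attempts=None it
    keeps trying indefinitely, which matches "retry until available".
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except ConnectionError:
            # Give up only if the caller set an explicit attempt limit.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A client pointed at the SQL VNode name needs no other change after failover, because the name stays the same while the node behind it changes.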

12 Remote DSM After Failover
- The following shows a remote DSM reconnecting to an HA MDB after failover

13 System Agent & Clustered Resources
- Active node
  - Clustered resources, such as shared disks, are monitored by the system agent running on the active node
  - Sends a trap indicating active status to DSM
- Passive nodes
  - Clustered resources are set to passive and are not monitored
  - Sends a trap indicating passive status to DSM

14 Shared Disks
- In the next example, the cluster has 3 shared disks defined in two different resource groups
  - DISKG:, DISKH: and DISKE:
- When resource groups move over, because of a failure or an explicit request to move groups, the shared disks are set to passive on the failed node and automatically monitored on the new active node

15 Passive Node
Only local disks monitored

16 Active Node
Local and shared disks monitored

17 WorldView 2D Map

18 2D Map
When the MDB moves over to another cluster node, you may see a connection failure message. When the new cluster node is active, the 2D Map reconnects automatically.

19 Classic GUI
- The classic 2D map GUI connects to the SQL VNode name, which eliminates the need to know the active node

20 Management Command Center (MCC)

21 MCC
- MCC cannot be selected for installation on an HA server
- MCC can be installed on remote servers and can connect to the SQL virtual node

22 Global Catalog
- In an HA setup, the AIS Catalog is created on the shared disk
- The catalog is shared by all cluster nodes
- Address spaces in the catalog are for the SQL VNode name and not the real nodes
- The MCC Client uses the SQL VNode name

23 MCC Client
The MCC Client connects to a virtual node name and not a real node name

24 MCC Client

25 Failover
- During failover, active remote MCC clients may be connected to the virtual cluster node
- As part of the HA concept, the cluster fails over to another cluster node
- The MCC client detects the failover and reconnects because the active node has changed
- If an "RMI Connection lost" message is issued, click OK to reconnect to the new active node

26 MCC Client – After Failover
Click OK and the session will be re-established

27 Job Management Option (JMO)

28 JMO
- If the JMO Agent is installed and active, update the JMO option to move the checkpoint file to the shared disk
- Identify the shared disk where the following directory is created by the install process:
  - \Program files\ca\SharedComponent\CCS\WVEM
  - Note: This must be on the shared disk and not a local disk
- Create a TMP subdirectory as shown on the next slides

29 CheckPoint File
- As shown, the checkpoint file is not shared out of the box and must be moved to the shared disk

30 Shared Disk
- This shows the Cluster Shared Directory that was chosen during the install process. This is where the checkpoint file should be moved

31 Shared Disk
- This shows the manually created TMP subdirectory
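The TMP subdirectory can also be created by a script instead of by hand. The following is a minimal sketch, assuming you pass in the full path of the WVEM directory on your shared disk; `ensure_checkpoint_dir` is a hypothetical helper and the drive letter in the usage comment is only an example.

```python
from pathlib import Path


def ensure_checkpoint_dir(wvem_dir):
    """Create the TMP subdirectory under the shared WVEM directory.

    Hypothetical helper. It refuses to create the parent directory,
    because WVEM must already exist on the shared disk after the install.
    """
    wvem = Path(wvem_dir)
    if not wvem.is_dir():
        raise FileNotFoundError(f"WVEM directory not found: {wvem}")
    tmp = wvem / "TMP"
    tmp.mkdir(exist_ok=True)  # harmless if the directory already exists
    return tmp


# Example call (the drive letter is an assumption; use your shared disk):
# ensure_checkpoint_dir(r"G:\Program files\ca\SharedComponent\CCS\WVEM")
```

Run it on the active node, since only that node owns the shared disk at any given time.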

32 Update JMO – Temp Directory Option
Default location for the checkpoint file

33 Update JMO – Temp Directory Option
- To update the option from a command line, enter:
  cautenv setlocal CAISCHD0008

34 Update JMO – Temp Directory Option
- Repeat the CAISCHD0008 change on all cluster nodes
- Stop and start the Unicenter service to pick up the changes. This should create a checkpoint file on the shared disk, which can then be shared by all cluster nodes

35 Station
- The station is automatically defined for the SQL VNode name. In non-HA mode, it is defined as the real node
- This enables job definitions to be shared across all cluster nodes

36 Test1 -JMO - HA Manager

37 Test1 - Failover Test Plan
- Define a jobset
  - Station: remote node (the job will be submitted to a JMO Agent running on a non-cluster node)
  - Define a long-running job with a Sleep of 15 minutes
  - Define a second job that is dependent on the Sleep job
- While the Sleep job is active, move the group over to simulate failover of the Workload Manager (JMO)
  - This should move JMO (Manager) to another cluster node
- Review the status of this job and the dependent job on the new active node

38 Test 1 – Active Node

39 Test 1 – Job Definition
The HATest1_02 job is dependent on the HATest1_01 job

40 Test1 – Demand Job
The HATEST1_02 job is waiting on the HATEST1_01 job to complete

41 Test1 – Simulate Failover
- Move Group
I14YClust2 is now the new active node

42 Workload Manager
The Workload Manager is now active on the new node. This shows Job2 submitted from the new active node

43 Test 1 – Job Completion
- The job completes after failover. The dependent job starts and the jobset status changes to completed

44 Event Management

45 HA Environment Variables
- CA_OPR_MONITOR_STATE
  - Specifies whether the Event Management Daemon keeps track of actions that it is in the middle of processing. The default is Yes.
- CA_OPR_MONITOR_INTERVAL
  - Specifies the interval, in seconds, for saving the Event Management state table to a flat file. The default is 30 seconds.

46 CA_OPR_MONITOR_STATE
Defaults to Yes in HA mode. For a non-HA install, the default is No

47 CA_OPR_MONITOR_INTERVAL

48 Log Files
- Unicenter Event Management log files reside on the shared disk
- They are shared by all cluster nodes. For example:
  - NodeA is active and the Event Management Daemon running on NodeA is writing to the log file
  - NodeA fails and NodeB is now active
  - NodeB continues to write to the same log file used by NodeA, so the file also contains the events from NodeA

49 Windows Events
- Unicenter runs on the active node only; thus, Event Management runs on the active node only
- In a cluster environment, Microsoft forwards all Windows Events from all cluster nodes to the active node
- Unicenter Event Management MRA can process events from other nodes because Microsoft forwards them to the active node. However, since Event Management is not running on the other cluster nodes, the MRA node cannot be specified as a non-active node

50 MRA - Node
- When defining an MRA, do not specify a real node name for the Node option
Must not use the real node name of the cluster

51 Windows Event from Non-Active Nodes
- Windows events generated on a non-active node are written to the Unicenter Event log
This shows that CA Event is not running on the non-active node

52 Windows Event from Non-Active Nodes
- Windows events generated on a non-active node are written to the Unicenter Event log

53 Windows Event from Non-Active Nodes

54 Test 1 – HA MRA
- Main objective: to demonstrate the MRA failover
- By default, an MRA that is active at the time of failover will continue after failover
- After failover, the last active action is re-executed on the new active node and subsequent actions continue on the new node
- If this feature is not required, set the Event State monitor option to "NO"
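The resume behaviour described above (re-execute the last active action, then continue with the rest) can be illustrated with a small checkpoint-and-resume sketch. It is a simplified model and not the Event Management implementation: the JSON state file, the function names and the choice to save state before every action (rather than on a timed interval) are all assumptions made for illustration.

```python
import json
from pathlib import Path


def save_state(state_file, sequence):
    """Record the sequence number of the action about to run."""
    Path(state_file).write_text(json.dumps({"active_sequence": sequence}))


def load_state(state_file):
    """Return the last recorded active sequence, or None if no state exists."""
    path = Path(state_file)
    if not path.exists():
        return None
    return json.loads(path.read_text())["active_sequence"]


def run_from(actions, state_file, execute, resume=False):
    """Run (sequence, action) pairs in order, checkpointing each one.

    With resume=True, actions before the recorded sequence are skipped and
    the recorded one is executed again, mirroring the re-execution of the
    last active action on the new active node.
    """
    last = load_state(state_file) if resume else None
    for sequence, action in actions:
        if last is not None and sequence < last:
            continue
        save_state(state_file, sequence)
        execute(sequence, action)
    # All actions finished, so there is nothing left to resume.
    Path(state_file).unlink(missing_ok=True)
```

If the state file sits on storage that both nodes can reach, the new active node can call `run_from(..., resume=True)` with the same action list. Because the interrupted action runs a second time, actions in such a chain should be safe to repeat.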

55 Test1 – MRA Failover Tasks
- Define 2 MRAs as follows:
  - One with "Sleep" of 2 minutes
  - One with "HIGHLITE"
- Generate an event to trigger the above MRAs
- Wait for 30 seconds for STATE_SAV to be updated
- While waiting for the "Sleep" action to complete, simulate failover
- Verify that the HIGHLITE message is displayed on the new active node

56 Test1 – Define MRA
- MRA sequence 20 will wait for 2 minutes
- After 30 seconds, verify that the STATE table is updated
- Simulate failover
- Verify that subsequent actions are executed on the new active node

57 Test1 – STATE Table
- State table at start

58 Simulate Failover
- Simulate failover by moving the group

59 Trigger MRA
This shows the last active sequence (sleepy 120) re-executed on the new active node

60 Review STATE Table
- This shows the STATE table has been updated to log HA_OPR – OPR MRA HA Test

61 After Failover – New Active Node
- This confirms that the remaining actions completed on the new active node

62 Stop Enterprise Management Subcomponents -unicntrl

63 Unicntrl stop
- In an HA setup, Unicntrl stop is not a valid command. It displays a message telling you to issue a stop for all subcomponents or to take the CA-Unicenter cluster resource offline
- If aws_dsm is running, it will be stopped when the cluster resource goes offline

64 Uninstall

65 Uninstall
- Uninstall needs to be performed on all nodes of the cluster
- The data on the shared disk should be removed during the uninstall of NSM from the last cluster node
- You should be prepared to reboot the cluster nodes multiple times

66 Uninstall – Node 1
- Do not remove the shared disk. Select the No option and click Finish

67 Uninstall – Node 2
- Remove the shared disk data with the last cluster node. Click Yes, then click Finish. This will NOT drop the MDB
- If the MDB is not shared by any other remote products, drop the database manually

68 Uninstall – Node 1
- After uninstalling Node 2, move the group to Node 1 and run the uninstall again
- This time, remove the common services components that were not removed by the first uninstall
- There should not be any shared data to delete, as it would have been deleted during the uninstall of Node 2. However, it is a good idea to click Yes

69 Cluster Resources
- Uninstalling NSM r11.1 from all cluster nodes should remove the cluster resources
- If the uninstall fails or some components are not removed, you will have to remove them manually
- Take extra care to ensure you do not delete other cluster resources. Microsoft Cluster will remove dependent resources when you delete a resource on which other resources are dependent

70 NSM Cluster Resources

71 FAQs

72 Unicenter Management Portal (UMP)
- Can Unicenter Management Portal be installed with an NSM HA setup?
  - UMP is not classified as HA. If it is installed prior to NSM, it will be installed in non-HA mode
  - If NSM r11.1 is installed first in HA mode, UMP will still be installed in non-HA mode
  - However, UMP can continue to use an HA MDB

73 Exchange Agent
- We are using the 3.x Exchange agent, which is cluster aware. How do we integrate it with an r11 NSM HA install?
  - Exchange is not part of r11. Review the Migration Guide for more details
  - Or wait for UME 11.1, which is currently in Beta status

74 Event Management Run ID
- If a Run ID is used when defining an MRA, ensure that the user ID has the "Log on as a batch job" privilege (not specific to HA mode)

75 Event Management - Runid
- To grant the "Log on as a batch job" privilege, simply add the user to the TNDUsers security group
- If the "Log on as a batch job" privilege is not granted, a Logon Type: 4 failure will be encountered

