1
HP ArcSight ESM 6.8c HA Fail Over Illustrated
Philippe JOUVELLIER - HP ESP | Global Channel Partner Management Office * Lab during this session
2
HA Architecture explained
Interface configuration:
- PRIMARY (host name esm): eth0 "esm", eth1
- SECONDARY (host name esm1): eth0 "esm1", eth1
- cluster: service IP/name

Diagram labels:
- Intranet, reached by PRIMARY and SECONDARY through eth-0
- PACEMAKER on each node
- Service IP (cluster), ESM, File System
- Distributed Replicated Block Device on each node, connected through eth-1 by the interlink cable
- Disk 1 (PRIMARY), Disk 2 (SECONDARY)

The service IP/name will be the shared ESM address/hostname!
3
HA and iPDU (optional)
- The HA Module uses the HP Intelligent Power Distribution Unit (iPDU) to disable one machine if both get into a mode where each thinks it is the primary. This ensures that failover from one ESM to the other goes smoothly.
- ESM HA supports only the HP iPDU product line.
- Pacemaker has a STONITH iPDU agent that sends commands to power outlets on or off and to get information.
- An iPDU is a server-room-class power strip whose outlets can be turned on and off remotely!

Diagram labels:
- Intranet, reached by PRIMARY and SECONDARY through eth-0
- HP Intelligent Power Distribution Unit (iPDU)
- PACEMAKER on each node
- Service IP (cluster), ESM, File System
- Distributed Replicated Block Device on each node, connected through eth-1 by the interlink cable
- Disk 1, Disk 2
4
STONITH (shoot the other node in the head)
HA architecture
- Enabling technology for failover
- Needed when the primary is crippled and will not release resources:
  - Communication problems: the primary cannot receive the stop request
  - Software problems (e.g. out of memory or other resources)
- Ideally, the STONITH mechanism should be independent of the primary's hardware and software:
  - Power control, such as an iPDU
  - In some clusters, cutting the server off from the network (I/O fencing) is used
- The default SSH-based fallback reboot control is far from ideal: it only works if an SSH connection to the server and a reboot are still possible
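The difference between power fencing and the SSH fallback can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Node` class and the `fence_*` functions are hypothetical names invented for illustration, not Pacemaker or iPDU APIs, and the model ignores timing and retries.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of one cluster node (not a Pacemaker object)."""
    name: str
    powered: bool = True          # outlet state as seen by the power strip
    ssh_reachable: bool = True    # whether an SSH reboot could succeed
    holds_resources: bool = True  # service IP, file system, ESM


def fence_via_power(node: Node) -> bool:
    """Cut power at the outlet; does not depend on the node's OS state."""
    node.powered = False
    node.holds_resources = False
    return True


def fence_via_ssh(node: Node) -> bool:
    """Fallback: ask the node to reboot itself; needs a working SSH path."""
    if not node.ssh_reachable:
        return False  # the node may still hold its resources
    node.holds_resources = False
    return True


def fence(node: Node, has_power_control: bool) -> bool:
    """Return True only if the node is known to have released resources."""
    if has_power_control:
        return fence_via_power(node)
    return fence_via_ssh(node)


# A crippled primary that no longer answers on the network.
crippled = Node("esm", ssh_reachable=False)
ssh_result = fence(crippled, has_power_control=False)

crippled_with_ipdu = Node("esm", ssh_reachable=False)
power_result = fence(crippled_with_ipdu, has_power_control=True)

print(ssh_result, power_result)
```

In this toy model the SSH fallback fails exactly when it is needed most, because it depends on the crippled node cooperating, while the power-based path does not.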
5
Fail Over Illustrated 1/2
The ESM IP cluster is up and running.

Primary has:
- Operating system running
- IP cluster Pacemaker activated
- ESM application started
- File system handling write operations onto Disk 1
- Disk 1 operating
- DRBD replicating data blocks from Disk 1 to Disk 2 (disk-level operation)

Secondary has:
- Operating system started
- IP cluster Pacemaker activated and monitoring the Primary
- ESM application stopped
- DRBD handling disk-level block replication

Diagram labels:
- Intranet, reached by PRIMARY and SECONDARY through eth-0
- PACEMAKER on each node
- Service IP (cluster), ESM, File System
- Distributed Replicated Block Device on each node, connected through eth-1 by the interlink cable
- Disk 1, Disk 2
- SERVER DOWN: Pacemaker on the Secondary detects the Primary failure!
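The disk-level replication described on this slide can be pictured with a toy model. It is a sketch only: `ReplicatedDevice` is a hypothetical class written for illustration, it is not the DRBD API, and it assumes every write is mirrored to the peer disk immediately, with no network failures or resynchronization.

```python
class ReplicatedDevice:
    """Toy stand-in for a replicated block device (not the DRBD API)."""

    def __init__(self, num_blocks: int) -> None:
        self.disk1 = [b""] * num_blocks  # local disk on the primary
        self.disk2 = [b""] * num_blocks  # peer disk on the secondary

    def write(self, block: int, data: bytes) -> None:
        """Write one block locally, then mirror it to the peer disk."""
        self.disk1[block] = data
        self.disk2[block] = data

    def in_sync(self) -> bool:
        """True when both disks hold identical blocks."""
        return self.disk1 == self.disk2


# The file system on the primary only ever calls write(); the mirroring
# happens below it, at block level.
device = ReplicatedDevice(num_blocks=4)
device.write(0, b"event-0001")
device.write(3, b"event-0002")

print(device.in_sync())
print(device.disk2[3])
```

The point of the sketch is that replication happens below the file system, so the secondary's disk already holds the data when it has to take over.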
6
Fail Over Illustrated 2/2
The ESM IP cluster is still up and running.

The Primary has gone down for one of the following reasons:
- Operating system crashed
- ESM application stopped or crashed
- Hardware failure
- Other

The Secondary did the following:
- Detected the Primary failure
- Took over the IP cluster alias address
- Started the ESM application
- Continued ESM operations
- DRBD keeps trying to replicate data blocks at disk level with the former Primary's disk, if it is still operating and available

Diagram labels:
- Intranet
- FAILED HOST (former primary, ESM DOWN)
- PRIMARY (former secondary): PACEMAKER, Service IP (cluster), ESM, File System, Distributed Replicated Block Device, Disk 2
- eth-0, eth-1, interlink cable
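The takeover sequence on this slide can be walked through in a few lines of code. This is a minimal sketch, assuming a much simpler model than Pacemaker actually uses: the `Node` class and `fail_over` function are hypothetical names, and fencing, timeouts, and DRBD role changes are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical model of one ESM host (not a Pacemaker object)."""
    name: str
    alive: bool = True
    has_service_ip: bool = False
    esm_running: bool = False


def fail_over(primary: Node, secondary: Node) -> List[str]:
    """Apply the takeover steps on the secondary if the primary is down."""
    steps: List[str] = []
    if primary.alive and primary.esm_running:
        return steps  # primary is healthy, nothing to do
    steps.append("detected primary failure")
    primary.has_service_ip = False
    primary.esm_running = False
    secondary.has_service_ip = True
    steps.append("took over service IP")
    secondary.esm_running = True
    steps.append("started ESM")
    return steps


primary = Node("esm", has_service_ip=True, esm_running=True)
secondary = Node("esm1")

healthy_steps = fail_over(primary, secondary)  # no failure yet

primary.alive = False  # e.g. operating system crash or hardware failure
failure_steps = fail_over(primary, secondary)

for step in failure_steps:
    print(step)
```

Because clients only ever use the service IP/name, they keep connecting to the same address after the secondary has taken it over.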
7
Thank you. Questions?