But what if there is a catastrophic event? Fire, flood, earthquake …
Apps fail over to a separate physical location
Servers in separate locations are members of the same cluster
Recovery Time Objective (RTO)
Recovery Point Objective (RPO)
WAN
Different datacenters (usually) equate to different subnets
Longer distance means greater network latency
Property              Default  Recommended  Description
SameSubnetDelay       1        1            Frequency (in seconds) at which heartbeats (HB) are sent
SameSubnetThreshold   5        10           Missed HB before interface is considered down
CrossSubnetDelay      1        1            Frequency (in seconds) at which HB are sent to nodes on dissimilar subnets
CrossSubnetThreshold  5        20           Missed HB before interface is considered down to nodes on dissimilar subnets

PowerShell:
(Get-Cluster).SameSubnetThreshold = 10
(Get-Cluster).CrossSubnetThreshold = 20
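As a rough sketch of what these settings mean in practice, the snippet below reads the current values and multiplies delay by threshold to estimate how long a node can stay silent before its interface is considered down. It assumes the FailoverClusters module is available, that it runs on a cluster node, and that the delay properties are stored in milliseconds (worth confirming on your build).

```powershell
Import-Module FailoverClusters

$cluster = Get-Cluster

# Show the four heartbeat properties currently in effect
$cluster | Format-List SameSubnetDelay, SameSubnetThreshold, CrossSubnetDelay, CrossSubnetThreshold

# Assumption: the *Delay values are stored in milliseconds
$sameSubnetSeconds  = ($cluster.SameSubnetDelay  * $cluster.SameSubnetThreshold)  / 1000
$crossSubnetSeconds = ($cluster.CrossSubnetDelay * $cluster.CrossSubnetThreshold) / 1000

"Same-subnet tolerance:  $sameSubnetSeconds s of missed heartbeats"
"Cross-subnet tolerance: $crossSubnetSeconds s of missed heartbeats"
```

With the recommended values in the table, this would come to roughly 10 seconds within a subnet and 20 seconds across subnets.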
Dependencies in Cluster Validation Report
Network Name Resource depends on: IP Address Resource A OR IP Address Resource B
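The OR dependency can also be inspected or set from PowerShell. The following is a minimal sketch; the resource names "ClusNN", "IP Address A" and "IP Address B" are placeholders for illustration, so list your own with Get-ClusterResource first.

```powershell
Import-Module FailoverClusters

# Hypothetical resource names; substitute the ones shown by Get-ClusterResource
Set-ClusterResourceDependency -Resource "ClusNN" -Dependency "[IP Address A] or [IP Address B]"

# Confirm the resulting dependency expression
Get-ClusterResourceDependency -Resource "ClusNN"
```

With an OR dependency, the network name can come online as long as either IP address resource is online, which is what allows it to move between subnets.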
Nodes in dissimilar subnets
Client access point fails over across subnets
Client needs new address
[Diagram labels: DNS Replication, Record Created, Record Updated, Record Obtained]
Network Virtualization
PowerShell syntax:
Get-ClusterResource ClusNN | Set-ClusterParameter RegisterAllProvidersIP 1
Get-ClusterResource ClusNN | Set-ClusterParameter HostRecordTTL 300
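To check the result, something like the following could be used. It is a sketch that reuses the resource name ClusNN from the syntax above and assumes, as is generally the case for network name parameters, that the change takes effect only after the resource is cycled offline and online.

```powershell
Import-Module FailoverClusters

# Read back the two parameters on the network name resource
Get-ClusterResource ClusNN | Get-ClusterParameter RegisterAllProvidersIP, HostRecordTTL

# Cycle the resource so the new values are applied.
# Note: this briefly takes the name (and anything depending on it) offline.
Stop-ClusterResource ClusNN
Start-ClusterResource ClusNN
```

Resources that depend on the network name may need to be brought back online afterwards, so this is best done in a maintenance window.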
[Diagram labels: VM, DNS]
[Diagram labels: DNS Server 1, DNS Server 2, FS, VLAN, DNS]
[Diagram labels: DNS Server 1, DNS Server 2, VM, DNS]
Value  Description
0      Clear Text
1      Signed (default)
2      Encrypted

PowerShell syntax:
(Get-Cluster).SecurityLevel = 2
Adjust inter-node heartbeat thresholds
Understand NetName Resource Configuration
Optimize Client Reconnection on CAP Failover
Encrypt inter-node communication over unsecured WANs
Each node can have 1 vote
Witness can only have 1 vote
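The current vote assignment can be inspected from any cluster node. This is a small sketch, assuming the FailoverClusters module and a version of Windows Server that exposes dynamic quorum properties (DynamicWeight and WitnessDynamicWeight).

```powershell
Import-Module FailoverClusters

# NodeWeight is the configured vote; DynamicWeight is the vote currently in effect
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight

# Which witness (if any) is configured
Get-ClusterQuorum | Format-List *

# 1 if the witness currently holds a vote, 0 if it does not
(Get-Cluster).WitnessDynamicWeight
```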
Site 2 down!
Site 1 can reach Cloud Witness!
Cluster survives!
Azure Witness
Cloud Witness and File Share Witness
Share the same arbitration logic
Do not keep a copy of the cluster database
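Either witness type can be configured with Set-ClusterQuorum. The sketch below uses placeholder values in angle brackets; the storage account name, access key, server and share are not from the original and must be replaced with your own.

```powershell
Import-Module FailoverClusters

# Option 1: Cloud Witness backed by an Azure storage account (placeholders)
Set-ClusterQuorum -CloudWitness -AccountName "<StorageAccountName>" -AccessKey "<StorageAccountAccessKey>"

# Option 2: File Share Witness, for example hosted in a third site (placeholder path).
# Run one option or the other, not both.
# Set-ClusterQuorum -FileShareWitness "\\<Server>\<Share>"

# Confirm which witness is now in use
Get-ClusterQuorum
```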
[Diagram: one cluster spanning Site 1 and Site 2]
Loss of Primary Site: Start-ClusterNode -ForceQuorum
Recovery of Primary Site: Start-ClusterNode -PreventQuorum
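Put together, a manual site failover might look roughly like this. The node names Node1 (primary site) and Node3 (secondary site) are hypothetical, and the exact sequence should be validated against your own runbook.

```powershell
Import-Module FailoverClusters

# Primary site is lost: on a surviving secondary-site node, force the
# cluster to start even though it does not have quorum.
Start-ClusterNode -Name Node3 -ForceQuorum

# Primary site comes back: start its nodes so that they join the running
# cluster instead of forming a cluster of their own.
Start-ClusterNode -Name Node1 -PreventQuorum

# Check that all nodes have rejoined
Get-ClusterNode | Format-Table Name, State
```

Starting the returning nodes with -PreventQuorum is meant to avoid a split-brain situation in which the recovered site forms its own cluster from an outdated copy of the configuration.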
PowerShell syntax:
Get-ClusterGroup MyVM | Set-ClusterOwnerNode -Owners Node1,Node2
Recommended to use Cloud Witness
When there is no access to Azure, use a File Share Witness in a 3rd site
Automatic failover: keep the number of nodes on the primary and secondary sites equal
Manual failover: remove votes of nodes on the secondary site
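For the manual failover case, votes are removed by setting NodeWeight to 0 on the secondary-site nodes. A minimal sketch, assuming Node3 and Node4 are the (hypothetical) secondary-site node names:

```powershell
Import-Module FailoverClusters

# Hypothetical secondary-site nodes; replace with your own node names
(Get-ClusterNode -Name "Node3").NodeWeight = 0
(Get-ClusterNode -Name "Node4").NodeWeight = 0

# Verify: primary-site nodes should show 1, secondary-site nodes 0
Get-ClusterNode | Format-Table Name, NodeWeight
```

Setting NodeWeight back to 1 restores a node's vote.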
Chicago (you are here) NYC “Can you hear me now?”
Replication
  Block-level, volume-based
  Synchronous & asynchronous
  SMB transport
Flexibility
  Any Windows volume
  Any fixed disk storage
  Any storage fabric
Management
  Failover Cluster Manager
  Windows PowerShell
  WMI
End to end MS Storage Stack
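As an illustration of the PowerShell management path, the sketch below creates a synchronous replication partnership. It assumes the replication feature described here is Storage Replica (the "SR" in the diagram labels) and that its module is installed; all computer names, replication group names and drive letters are placeholders, not values from the original.

```powershell
# Placeholder names throughout; adjust to your servers and volumes
$partnership = @{
    SourceComputerName       = "SR-SRV01"
    SourceRGName             = "RG01"
    SourceVolumeName         = "D:"
    SourceLogVolumeName      = "E:"
    DestinationComputerName  = "SR-SRV02"
    DestinationRGName        = "RG02"
    DestinationVolumeName    = "D:"
    DestinationLogVolumeName = "E:"
    ReplicationMode          = "Synchronous"   # or "Asynchronous"
}
New-SRPartnership @partnership

# Review the resulting replication groups and partnership
Get-SRGroup
Get-SRPartnership
```

Each side needs a data volume and a separate log volume, which matches the Data and Log boxes in the source and destination diagram.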
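[Diagram: one cluster spanning Site 1 and Site 2]
[Diagram: Applications (local or remote) write to the Source Server Node (SR), which has Data and Log volumes and replicates to the Destination Server Node (SR), which also has Data and Log volumes]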