Site Power Outage, Network Disconnect, Node Shutdown for Patching, Node Crash, Quorum Witness Failure, Add/Evict: How do I make sure my cluster stays up?

Presentation transcript:

Site Power Outage. Network Disconnect. Node Shutdown for Patching. Node Crash. Quorum Witness Failure. Add/Evict Node. How do I make sure my cluster stays up?

Enable more disaster recovery quorum scenarios

The cluster needs a majority of the participating votes to survive. More about this in later slides…
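The majority rule can be sketched in a few lines of Python. This is only an illustrative model of the arithmetic; the function names `majority` and `has_quorum` are assumptions made for this example and are not part of any cluster API.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is strictly more than half of the total."""
    return total_votes // 2 + 1


def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True when the votes still present form a majority of all participating votes."""
    return votes_present >= majority(total_votes)


if __name__ == "__main__":
    for total in (1, 2, 3, 4, 5):
        print(f"Total votes = {total}, majority = {majority(total)}")
```

Note that an even vote count does not raise fault tolerance: 4 votes need 3 present, so the cluster tolerates losing only one vote, the same as with 3 votes.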

Disk Witness: stores a copy of the cluster database.

File Share Witness: holds no copy of the cluster database. Minimal network traffic: updated on cluster membership change only.

The latest copy of the cluster database is kept on the Disk Witness. When a node updates the cluster database, the copy on the Disk Witness is updated as well, so the cluster can be started with the latest database.

Prevents a node with a stale database from forming the cluster. When the cluster database is updated, only the time-stamp on the witness is updated. A node that does not hold the latest database cannot form the cluster: Cluster Not Started!
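The difference between the two behaviours described above can be sketched as follows. This is a simplified model under stated assumptions: the `Witness` class, the `"disk"` / `"fileshare"` labels, the integer version numbers, and `try_form_cluster` are all hypothetical names for illustration, not how the cluster service is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Witness:
    kind: str            # "disk" or "fileshare" (labels assumed for this sketch)
    latest_version: int  # version of the most recent cluster database update


def try_form_cluster(node_db_version: int, witness: Witness):
    """Return (started, database_version_used) for a node trying to form the cluster."""
    if node_db_version >= witness.latest_version:
        # The node already holds the latest database.
        return True, node_db_version
    if witness.kind == "disk":
        # A disk witness stores a database copy, so the stale node can catch up.
        return True, witness.latest_version
    # A file share witness only records the time-stamp: it can detect that the
    # node is stale, but it has no database copy to hand over.
    return False, node_db_version


if __name__ == "__main__":
    print(try_form_cluster(3, Witness("disk", 5)))
    print(try_form_cluster(3, Witness("fileshare", 5)))
```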

Witness: Disk vs. File Share
                             Disk                          File Share
Prevents Split-Brain         Yes                           Yes
Prevents Partition-in-Time   Yes                           Yes
Solves Partition-in-Time     Yes                           No
Arbitration Type             SCSI Persistent Reservation   Witness.log file on SMB Share
Recommended: Use Disk Witness if you have shared storage.

[Diagram: nodes in Site A and Site B, each labeled Vote or No Vote]

Original: Total Votes = 4, Majority Votes = 3. Updated (one node set to No Vote): Total Votes = 3, Majority Votes = 2. Quorum Maintained! Cluster Survives!
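The effect of removing one vote can be checked with a small sketch. The slide gives only the vote totals, so the layout used here (two nodes per site, with the vote removed from one Site B node) and the names `sites`, `original`, `updated`, and `survives_site_loss` are assumptions for illustration. Dynamic quorum is not modeled.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is strictly more than half of the total."""
    return total_votes // 2 + 1


def survives_site_loss(node_weights: dict, node_sites: dict, failed_site: str) -> bool:
    """True if the votes outside the failed site still form a majority."""
    total = sum(node_weights.values())
    remaining = sum(
        weight for node, weight in node_weights.items()
        if node_sites[node] != failed_site
    )
    return remaining >= majority(total)


# Hypothetical layout: two nodes in each site.
sites = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
original = {"A1": 1, "A2": 1, "B1": 1, "B2": 1}   # Total Votes = 4, Majority = 3
updated = dict(original, B2=0)                     # Total Votes = 3, Majority = 2

if __name__ == "__main__":
    print("Original, Site B lost:", survives_site_loss(original, sites, "B"))
    print("Updated,  Site B lost:", survives_site_loss(updated, sites, "B"))
```

The trade-off in this model is that Site A becomes the site the cluster cannot lose: with the updated weights, losing Site A leaves 1 of 3 votes.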

NodeWeight: Default = 1; Remove Vote = 0; Cluster Assigned = 1. Use PowerShell or the Configure Quorum Wizard:
(Get-ClusterNode ).NodeWeight = 0

Last Man Standing: the cluster can now survive with only 1 node, from a 64-node cluster all the way down to 1 node. Seamless Integration: with existing cluster quorum features and configurations, and with multisite disaster recovery deployments.

New Recommendation: always configure a witness with Windows Server 2012 R2. Clustering will determine when it is best to use the witness. Configure a Disk Witness if you have shared storage, otherwise a File Share Witness (FSW).

PowerShell:
(Get-Cluster).DynamicQuorum = 1
(Get-ClusterNode "name").NodeWeight = 1

PowerShell:
(Get-ClusterNode "name").DynamicWeight (read only)
(Get-Cluster).WitnessDynamicWeight (read only)

Node Shutdown: the node removes its own vote. Node Join: on a successful join the node gets its vote back. Node Crash: the remaining active nodes remove the vote of the downed node.

Witness Offline: the witness vote is removed by the cluster. Witness Online: if necessary, the witness vote is added back by the cluster. Witness Failure: the witness vote is removed by the cluster.

[Diagram: cluster spanning Site1 and Site2]

4 Nodes + Witness Configured (N = Number of Votes). N = 5, Majority = 3; then N = 3, Majority = 2. Last Man Standing! Cluster Survives!

5 Nodes + No Witness Configured (N = Number of Votes). N = 5, Majority = 3; then N = 3, Majority = 2; then N = 2, Majority = 2; then N = 1, Majority = 1. Last Man Standing! Cluster Survives!

Cluster survives graceful shutdown of either node.
State: UP
                Node 1   Node 2
NodeWeight      1        1
DynamicWeight   1        0

Configure with the Cluster Quorum Wizard or PowerShell. Updated PowerShell:
Set-ClusterQuorum -NoWitness
Set-ClusterQuorum -DiskWitness "DiskResourceName"
Set-ClusterQuorum -FileShareWitness "FileShareName"
Set-ClusterQuorum -DiskOnly "DiskResourceName"

Manual Override: allows the cluster to start without a majority of votes. The cluster starts in a special “forced quorum” mode and remains in this mode until a majority of votes is achieved; it then automatically switches to normal functioning. Caution: always understand why quorum was lost. Split-brain between nodes is possible. You are now in control!
Force Quorum Flag
Command Line: net start clussvc /ForceQuorum
PowerShell: Start-ClusterNode -ForceQuorum

Helps prevent nodes with a vote from forming a cluster. Nodes started with ‘Prevent Quorum’ always join an existing cluster. Applicable to a cluster in “Force Quorum”: always start the remaining nodes with ‘Prevent Quorum’. Helps prevent overwriting of the latest cluster database, so forward progress made by the nodes in ‘Force Quorum’ is not lost. Most applicable in a multisite DR setup.
Prevent Quorum Flag
Command Line: net start clussvc /PQ
PowerShell: Start-ClusterNode -PreventQuorum

The cluster detects partitions after a manual Force Quorum. The cluster has built-in logic to track the partition that was started with Force Quorum, and that partition is deemed authoritative. Other partitions automatically restart upon detecting a Force Quorum (FQ) cluster: the cluster restarts their nodes with Prevent Quorum, and the restarted nodes join the FQ cluster.
[Diagram: cluster spanning Site1 and Site2; manual override with ForceQuorum; nodes restarted when the Site2 partition is detected]
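The rule that the Force Quorum partition is authoritative can be sketched as a small merge function. This is a conceptual model only: the dictionary layout, the integer database versions, and the name `resolve_partitions` are assumptions for illustration and do not reflect the cluster service's internal data structures.

```python
def resolve_partitions(partitions):
    """Merge partitions after a manual Force Quorum.

    Each partition is a dict with keys 'nodes', 'db_version', and 'forced'.
    The partition started with Force Quorum wins; nodes of every other
    partition are treated as restarted with Prevent Quorum and joined to it.
    """
    forced = [p for p in partitions if p["forced"]]
    if len(forced) != 1:
        raise ValueError("expected exactly one Force Quorum partition")
    authoritative = forced[0]
    joined = list(authoritative["nodes"])
    for partition in partitions:
        if partition is not authoritative:
            joined.extend(partition["nodes"])
    # The authoritative database is kept, so progress made in
    # Force Quorum mode is not overwritten by the joining nodes.
    return {"nodes": sorted(joined), "db_version": authoritative["db_version"]}


if __name__ == "__main__":
    site1 = {"nodes": ["N1", "N2"], "db_version": 7, "forced": False}
    site2 = {"nodes": ["N3", "N4"], "db_version": 9, "forced": True}
    print(resolve_partitions([site1, site2]))
```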

Multi-Site DR Quorum: considerations of quorum with DR solutions.

What are your Service Level Agreements (SLAs)? In the event of a disaster, how do you want to switch to your DR site?

All Sites Equal: allow the cluster to sustain failure of any one site, and allow automatic failover of the workload to the surviving site. Node Vote Weight Adjustments: all nodes are equally important, so there is no need to modify node vote weights. Number of Nodes per Site: keep an equal number of nodes in both sites. This helps the cluster sustain failure of either site; otherwise the site with more nodes would become the Primary site.

Always Configure File Share Witness (recommended): a File Server running at a separate site. The separate site must be accessible from the workload sites. This allows the cluster to sustain communication loss between sites. Witness Selection: a Highly Available File Server, for the witness, in a separate cluster. A Disk Witness can be used as directed by the storage vendor.

Failover Example: Site-2 Down!!! Site-1 can reach the FSW! Cluster Survives!

Witness Dynamic Vote & Tie Breaker: the cluster removes Node 3’s vote and the cluster removes the witness vote. Site-2 Down!!! Site-1 Wins!!! Cluster Survives!

All Sites Not Equal: the cluster cannot sustain failure of the Primary site, but can sustain failure of the Backup site. Node Vote Weight Adjustments: prevent nodes in the Backup site from affecting cluster quorum by removing the node vote weight of the nodes in the Backup site. Number of Nodes per Site: no requirement to keep an equal number of nodes in both sites.

Workload Management: use Preferred Owners to prioritize keeping the workload on the Primary site. Recovery Actions: a Primary site failure would require “Force Quorum” on the Backup site. Recover the Primary site nodes using “Prevent Quorum”.

Always Configure Witness: a File Server running at a separate site (recommended). A File Server running locally in the Primary site may be OK (consider recovery scenarios). Witness Selection: a Highly Available File Server, for the witness, in a separate cluster. An Asymmetric Disk Witness can be used as well (consider recovery scenarios).

Disk Witness accessibility: only a subset of nodes can access the disk, so the witness can come online only on that subset of nodes. Most applicable in multi-site clusters: the disk is seen only by the primary site, so the witness can come online only on the primary site. The cluster recognizes the asymmetric storage topology and uses it to place the cluster quorum group.

Backup Site Down: the Backup site nodes have No Vote, so there is no effect on quorum! Cluster Survives!
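The asymmetric configuration can be checked with the same majority arithmetic. The node counts, the site names, and the witness placed at a separate site are assumptions chosen for this sketch, as is the function name `quorum_after_site_loss`. Dynamic quorum is not modeled.

```python
def majority(total_votes: int) -> int:
    """Smallest number of votes that is strictly more than half of the total."""
    return total_votes // 2 + 1


def quorum_after_site_loss(votes, failed_site: str) -> bool:
    """votes: list of (name, site, weight) tuples, witness included."""
    total = sum(weight for _, _, weight in votes)
    remaining = sum(weight for _, site, weight in votes if site != failed_site)
    return remaining >= majority(total)


# Hypothetical layout: Backup site votes removed, witness at a separate site.
votes = [
    ("P1", "primary", 1),
    ("P2", "primary", 1),
    ("B1", "backup", 0),
    ("B2", "backup", 0),
    ("FSW", "witness-site", 1),
]

if __name__ == "__main__":
    print("Backup site down, quorum kept: ", quorum_after_site_loss(votes, "backup"))
    print("Primary site down, quorum kept:", quorum_after_site_loss(votes, "primary"))
```

In this model a Primary site failure leaves only the witness vote, which is why recovery on the Backup site needs a manual Force Quorum.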

Recommended Recovery: Primary Site Down!!! Not enough Votes!!! Cluster Down!! 1. Force Quorum: Cluster Start! 2. Start nodes with Prevent Quorum! 3. Successful join to the Force Quorum Backup nodes. 4. Cluster Starts! Not in Force Quorum.

Recommended Recovery: Primary Site Down!!! Not enough Votes!!! Cluster Down!! Force Quorum: Cluster Start! Assign votes to the nodes in the Backup site and remove votes from the old Primary site. Start the old Primary site nodes with “Prevent Quorum”. The cluster is then no longer in Force Quorum. New Primary Site! New Backup Site!