Published by Valerie Ramsey. Modified over 8 years ago.
1
Level 400
SQL Server 2012 AlwaysOn Deep Dive
Christian Bolton, Coeo Ltd
2
AGENDA
- Introduction
- AlwaysOn Failover Clustering
- AlwaysOn Availability Groups
3
INTRODUCTION
- High-Availability & Disaster Recovery
- AlwaysOn Failover Clustering
- AlwaysOn Availability Groups
4
ALWAYSON FAILOVER CLUSTERING
Features
- Provides a virtual network name
- Allows failover of an entire instance
- Requires shared storage
- There is only one copy of the data
What’s new?
- Multi-subnet support
- Flexible failover policy and diagnostics
- Support for local tempdb
5
ALWAYSON FAILOVER CLUSTERING
[Diagram. Labels: V-LAN; SAN Replication; IP: 10.10.10.10; subnet 1; subnet 2; Network Name: SqlClus; Local Site; Remote Site]
6
ALWAYSON FAILOVER CLUSTERING
[Diagram. Labels: subnet 1; subnet 2; SAN Replication; IP1: 10.168.0.10; IP2: 192.168.0.10; Corpnet; OR; Network Name: SqlClus; Local Site; Remote Site]
7
ALWAYSON FAILOVER CLUSTERING
8
ALWAYSON FAILOVER CLUSTERING
Multi-Subnet Clustering Requirements
- Enterprise Edition of SQL Server
- Windows Server 2008 R2 or later
- SAN replication for cross-site DR
- Single Active Directory domain for all nodes
9
ALWAYSON FAILOVER CLUSTERING
Multi-Subnet Clustering Best Practices
- Step 1: Select the Quorum Mode
- Step 2: Tune the WSFC heartbeat
- Step 3: Select SAN replication level and mode
- Step 4: Set DNS settings
10
BEST PRACTICE 1 – QUORUM MODE
- Node and File Share Majority
  - Even number of nodes
- Node Majority
  - Odd number of nodes
- Majority of nodes on primary site
- Force quorum needed when primary site is down
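The quorum arithmetic behind this slide can be sketched in a few lines. This is a simplified model, not cluster code: it assumes every node carries one vote and that the file share witness adds exactly one more, and the function names are hypothetical.

```python
def recommended_quorum_mode(node_count: int) -> str:
    """Return the quorum mode the slide pairs with an odd or even node count."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        return "Node Majority"
    return "Node and File Share Majority"


def total_votes(node_count: int) -> int:
    """Assumption: one vote per node, plus one witness vote for even node counts."""
    return node_count if node_count % 2 == 1 else node_count + 1


def has_quorum(surviving_votes: int, votes: int) -> bool:
    """Quorum holds only while strictly more than half of all votes survive."""
    return surviving_votes > votes // 2


# Three nodes, two of them on the primary site: losing the primary site
# leaves one vote out of three, so the remote node cannot form quorum.
print(recommended_quorum_mode(3))          # Node Majority
print(has_quorum(1, total_votes(3)))       # False
```

In that last case the surviving node has no majority, which is why quorum has to be forced manually when the primary site is down.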
11
BEST PRACTICE 2 – HEARTBEAT SETTINGS
- Default value
  - Frequency is once per 1,000 milliseconds
  - If 5 heartbeats are missed then initiate failover
- Tune the setting for cross-subnet heartbeat
  - CrossSubnetDelay can be up to 4,000 milliseconds
  - CrossSubnetThreshold can be up to 10
- http://technet.microsoft.com/en-us/library/dd197562(v=WS.10).aspx
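As a rough illustration of what these two settings mean together, the time before a silent node is declared down is approximately the heartbeat delay multiplied by the missed-heartbeat threshold. The sketch below uses only the figures on the slide; the function name and the range check are assumptions for illustration.

```python
MAX_CROSS_SUBNET_DELAY_MS = 4000   # upper limit quoted on the slide
MAX_CROSS_SUBNET_THRESHOLD = 10    # upper limit quoted on the slide


def detection_window_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time without heartbeats before failover is initiated."""
    if not 0 < delay_ms <= MAX_CROSS_SUBNET_DELAY_MS:
        raise ValueError("delay outside the range quoted on the slide")
    if not 0 < threshold <= MAX_CROSS_SUBNET_THRESHOLD:
        raise ValueError("threshold outside the range quoted on the slide")
    return delay_ms * threshold


print(detection_window_ms(1000, 5))    # 5000  (defaults: about 5 seconds)
print(detection_window_ms(4000, 10))   # 40000 (most tolerant: about 40 seconds)
```

A longer window tolerates a slow inter-site link at the cost of slower detection of a real failure.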
12
BEST PRACTICE 3 – SAN REPLICATION
- Choose your replication level
  - Block, File System, or Application Level
  - DFS-Replication not supported
  - Preserve block size and write ordering to prevent data corruption
- Choose your replication mode according to network latency
  - Synchronous if network latency < 10 ms
  - Asynchronous if network latency > 10 ms
13
BEST PRACTICE 4 – DNS SETTINGS
- Shorten the HostRecordTTL
  - Default is 1,200 seconds (20 minutes)
  - Cluster.exe res /priv HostRecordTTL=60
  - Shorter TTL puts more pressure on DNS servers
- Reduce DNS replication delay
  - DNS/AD inter-site replication schedule is 180 minutes by default
  - Set replication frequency to no more than 15 minutes
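To see why both settings matter, a back-of-the-envelope upper bound on how long a client might keep resolving the old address can be computed from the two numbers on the slide. This is a deliberately pessimistic model, not a measurement: it assumes the updated record waits one full inter-site replication interval and that the client cached the old record just before the new one arrived. The function name is hypothetical.

```python
def worst_case_stale_seconds(host_record_ttl_s: int, replication_interval_min: int) -> int:
    """Pessimistic bound: one full replication interval plus one full TTL."""
    return replication_interval_min * 60 + host_record_ttl_s


default = worst_case_stale_seconds(1200, 180)
tuned = worst_case_stale_seconds(60, 15)

print(default // 60)   # 200 (minutes, with the default settings)
print(tuned // 60)     # 16  (minutes, with TTL=60 and 15-minute replication)
```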
14
ALWAYSON FAILOVER CLUSTERING
Tempdb can be configured on a local disk
- Why?
  - Tempdb access occupies a large amount of SAN I/O
  - Fast solid-state storage is very accessible
  - Better overall performance and cost saving
- Needs same path available on all nodes
15
IMPROVED FAILURE DETECTION
- Need to eliminate false failures
- Make available the data needed for root cause analysis of SQL Server failures that cannot be reproduced
- Create a healthcheck mechanism which accurately identifies all detectable SQL Server failures
16
IMPROVED FAILURE DETECTION
Before SQL Server 2012
17
IMPROVED FAILURE DETECTION
Before SQL Server 2012
- Too many false failovers
  - Server too busy to take new connections
  - Query timeout couldn’t be configured
  - Single query failure would cause failover
  - Ping pong during heavy load
- No failover when SQL Server is hung
  - @@servername runs, everything else is broken
18
IMPROVED FAILURE DETECTION
How do you fix it?
- Part 1 – Create a mechanism to determine health state
  - sp_server_diagnostics
  - Architect the resource dll to use the new model
- Part 2 – Allow the user to configure what healthy means to them
  - New configuration options in the resource dll
19
IMPROVED FAILURE DETECTION
sp_server_diagnostics
- Analyzes system state
  - Reliable when nothing else is working
- Reports health state when nothing is working
  - Component name
  - Health state: Clean, Warning, Error, Not determined
- Extra logging for troubleshooting
  - Memory status
  - Wait stats, blocker report
  - XEvents ring buffer
20
IMPROVED FAILURE DETECTION
Cluster resource dll
- Uses the result from sp_server_diagnostics to determine when to fail over
- Configure sensitivity level
- Configurable healthcheck timeout
21
IMPROVED FAILURE DETECTION
Syntax
- sp_server_diagnostics @repeat_interval=[sec]
- Run in a loop, report health status every n seconds
- Run once and stop when interval = 0
22
IMPROVED FAILURE DETECTION
User-Configurable Failure Detection
- 0 – No automatic failover or restart
- 1 – Failover or restart on server down
  - Service is down
- 2 – Failover or restart on server unresponsive
  - No response from sp_server_diagnostics
- 3 – Failover or restart on critical SQL Server errors
  - System errors
- 4 – Failover or restart on moderate SQL Server errors
  - Resource errors
- 5 – Failover or restart on any qualified failure conditions
  - Query processing errors
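The levels are cumulative: each level reacts to its own condition and to every condition of the levels below it. A minimal sketch of that rule follows, assuming the pairing service down = 1, no response from sp_server_diagnostics = 2, system errors = 3, resource errors = 4, query processing errors = 5. The condition names and the function are hypothetical labels, not identifiers from SQL Server.

```python
# Lowest configured level at which each condition triggers failover or restart.
# Assumed pairing; the keys are illustrative names only.
CONDITION_LEVEL = {
    "service_down": 1,
    "no_response_from_sp_server_diagnostics": 2,
    "system_error": 3,
    "resource_error": 4,
    "query_processing_error": 5,
}


def triggers_failover(configured_level: int, condition: str) -> bool:
    """True if the configured level is sensitive enough to act on the condition."""
    if not 0 <= configured_level <= 5:
        raise ValueError("level must be between 0 and 5")
    return configured_level >= CONDITION_LEVEL[condition]


print(triggers_failover(3, "system_error"))     # True
print(triggers_failover(3, "resource_error"))   # False
print(triggers_failover(0, "service_down"))     # False
```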
23
AlwaysOn Availability Groups
- Combines the best of database mirroring and failover clustering
- Databases can be grouped together for failover
  - Availability Group is the new unit of failover
- Up to four secondary replicas can be created
  - Two can be synchronous
- Replicas can be “active”
  - Readable for real-time reporting
  - Offloaded backups
- “Managed” by Failover Clustering
- No requirement for shared storage
24
AlwaysOn Availability Groups
Backup capabilities
- Backups can be done on any replica
  - Secondary replica may be synchronous or asynchronous
  - Backups on primary still work
- Log backups on all replicas form a single log chain
25
AlwaysOn Availability Groups
Log backups form a single log chain
26
AlwaysOn Availability Groups
Backup restrictions, cautions, gotchas
- Differential backups are not supported on secondary
- Only copy_only full backups are supported on secondary
- Advisable for backups to be stored centrally
27
AlwaysOn Availability Groups
Automated backups
- How do you choose which replica to use for backup?
  - With database mirroring, only the primary would work
  - Now, backups succeed on all replicas
- Solution: Declarative policy
28
AlwaysOn Availability Groups
Declarative backup policy
- Preference for which role to use
  - Primary Only
  - Secondary Only
  - Prefer Secondary
  - Any
- Assign a relative priority to each replica
29
AlwaysOn Availability Groups
Declarative backup policy
Logic
1. Filter out replicas which are not up and online
2. Filter out replicas which don’t meet the policy role
3. Select the highest priority replica among the remaining set
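The three filtering steps can be modelled in a few lines. This is only an illustration of the selection logic as described on the slide, not the server's implementation: the `Replica` type, the preference strings, and the function name are hypothetical, and "Prefer Secondary" is assumed to fall back to the primary when no secondary is online.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    role: str        # "primary" or "secondary"
    online: bool
    priority: int    # relative backup priority; higher wins


def preferred_backup_replica(replicas: List[Replica], preference: str) -> Optional[str]:
    # Step 1: drop replicas that are not up and online.
    online = [r for r in replicas if r.online]
    primaries = [r for r in online if r.role == "primary"]
    secondaries = [r for r in online if r.role == "secondary"]

    # Step 2: keep only replicas whose role satisfies the policy.
    if preference == "primary_only":
        candidates = primaries
    elif preference == "secondary_only":
        candidates = secondaries
    elif preference == "prefer_secondary":
        candidates = secondaries or primaries   # assumed fallback to primary
    elif preference == "any":
        candidates = online
    else:
        raise ValueError(f"unknown preference: {preference}")

    # Step 3: pick the highest-priority replica that is left, if any.
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.priority).name


replicas = [
    Replica("A", "primary", True, 50),
    Replica("B", "secondary", True, 70),
    Replica("C", "secondary", False, 90),
]
print(preferred_backup_replica(replicas, "prefer_secondary"))   # B
print(preferred_backup_replica(replicas, "primary_only"))       # A
```

Because every replica evaluates the same inputs and reaches the same answer, the same job can be scheduled everywhere and only the selected replica actually takes the backup.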
30
AlwaysOn Availability Groups
Declarative backup policy
- Policy is advisory only and not enforced
- Automatically used by Maintenance Plans and Log Shipping
- Implemented as a system function which returns a boolean
31
AlwaysOn Availability Groups
Declarative backup policy
Schedule the same job on all replicas and only one will run each time.
    IF sys.fn_hadr_backup_is_preferred_replica(@dbname) = 1
    BEGIN
        BACKUP DATABASE…
    END
32
DEMO
AlwaysOn Availability Groups
33
AlwaysOn Availability Groups
Readable Secondary
[Diagram. Labels: DB1; DB2; SQLservr.exe; InstanceA; InstanceB; Primary; Secondary; Database Log Synchronization; Reports; CRASH]
34
AlwaysOn Availability Groups
Impact of Read Workload
- REDO thread could get blocked by reporting workload
- REDO thread and read workload can deadlock
Solution
- Internally maps to Snapshot Isolation
- Ignore all locking hints
- Never choose REDO as deadlock victim
Result
- Blocking and deadlocks are eliminated
35
AlwaysOn Availability Groups
Query Performance on Secondary
- GOAL: Comparable query plan on Readable Secondary
- Auto-Create Statistics enabled on Readable Secondary
- Temporary statistics are persisted in tempdb
36
AlwaysOn Availability Groups
Setting up a readable secondary
- None
- All
- Read-Intent Only
37
AlwaysOn Availability Groups
ApplicationIntent
- A new connection property
- Used to gate access to secondary
  - ALLOW_CONNECTIONS=READ_ONLY
- Connect directly to secondary instance
- Read-Only Routing
  - Connect to AG Listener and get automatically routed to a readable secondary
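How the secondary's setting and the client's ApplicationIntent combine can be summarised as a small decision function. This is a model of the gating rule only, with a hypothetical function name; it assumes the three slide options correspond to the ALLOW_CONNECTIONS values NO, ALL and READ_ONLY, and that a connection which does not specify ApplicationIntent counts as ReadWrite.

```python
def secondary_accepts_connection(allow_connections: str,
                                 application_intent: str = "ReadWrite") -> bool:
    """Model of whether a secondary replica's databases accept a connection."""
    if application_intent not in ("ReadWrite", "ReadOnly"):
        raise ValueError(f"unknown ApplicationIntent: {application_intent}")
    if allow_connections == "NO":
        return False                               # "None": not readable at all
    if allow_connections == "READ_ONLY":
        return application_intent == "ReadOnly"    # "Read-Intent Only"
    if allow_connections == "ALL":
        return True                                # "All": any connection, read access only
    raise ValueError(f"unknown ALLOW_CONNECTIONS value: {allow_connections}")


print(secondary_accepts_connection("READ_ONLY", "ReadOnly"))   # True
print(secondary_accepts_connection("READ_ONLY"))               # False
print(secondary_accepts_connection("NO", "ReadOnly"))          # False
```

With the Read-Intent Only setting, an application that has not been updated to declare its intent is kept off the secondary, which is the gating behaviour the slide refers to.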
38
DEMO
AlwaysOn Availability Groups - Readable Secondary
39
QUESTIONS?
After the session please fill out the questionnaire. Questionnaires will be sent to you by e-mail and will be available in the profile section of the NT Conference website www.ntk.si.
Thank you!