High Availability in SQL Server 2012

Techniques to reduce downtime
Eric Peterson

Agenda
  Overview of SQL Server methods of high availability
    From hardware through methods
    Not intended to teach how to implement
    Tips given, tips accepted
  Review of methods, terms, features
    Log Shipping
    Replication
    Mirroring
    Clustering
    Always On Availability Groups
  Other things to think about
  Comparison of methods

Speaker Background
  30+ years professional experience
  70s: Mainframe
  80s: Database (IDMS, IDSII, Oracle, Sybase, DB2 v1.6, DB2 2.0)
  90s: Sybase Pro Serve (SQL Server, ODBC, PowerDesigner; design and beta teams)
  00s: Independent consultant, MS SQL Server
  Current: BCD Travel

High Availability, It Is All About the 9s
  Downtime
    Outage     Seconds   Day   Week   Month   Year
    ½ min           30
    1 min           60
    5 min          300
    10 min         600
    15 min         900
    30 min        1800
    1 hour        3600
    2 hours       7200
    8 hours      28800
    1 day        86400
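One way to turn an outage length into a number of 9s is to divide it by the length of the measurement period. The sketch below does that. The function names, the 30-day month, and the 365-day year are assumptions made for illustration, not figures from the slide.

```python
# Seconds in each measurement period.
# The 30-day month and 365-day year are simplifying assumptions.
PERIOD_SECONDS = {
    "day": 86_400,
    "week": 7 * 86_400,
    "month": 30 * 86_400,
    "year": 365 * 86_400,
}


def availability_pct(outage_seconds, period):
    """Percent of the period the system was up, given total outage seconds."""
    total = PERIOD_SECONDS[period]
    if not 0 <= outage_seconds <= total:
        raise ValueError("outage must fit inside the period")
    return 100.0 * (1 - outage_seconds / total)


def allowed_downtime_seconds(target_pct, period):
    """Downtime budget in seconds for a target availability percentage."""
    return PERIOD_SECONDS[period] * (1 - target_pct / 100.0)


if __name__ == "__main__":
    periods = ("day", "week", "month", "year")
    print("outage(s)  " + "  ".join(f"{p:>9}" for p in periods))
    for outage in (30, 60, 300, 600, 900, 1800, 3600, 7200, 28800, 86400):
        cells = "  ".join(f"{availability_pct(outage, p):9.4f}" for p in periods)
        print(f"{outage:>9}  {cells}")
```

Read in the other direction, `allowed_downtime_seconds(99.9, "year")` gives the yearly downtime budget for "three 9s" under the same assumptions.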

High Availability (HA) Terms
  High Availability: keeping the system up
  Is not Disaster Recovery: recovering from when bad things happen
  Latency: the delay in synchronizing two systems
  Temperature
    Hot – always up, always in sync
    Warm – close, but has defined latency
    Cold – manual intervention, defined loss

Methods
  Technologies that have an impact on HA
    Maintenance & Backups
    Replication
    Log Shipping
    Mirroring (Sync)
    Mirroring (Async)
    Clustering
    Always On Availability
    Third-party software

In the Beginning
  PCs in the post-mainframe world
  Departmental apps
  Single points of failure
    Local disk drives
    System board, memory
    Disk controller
    Power supply
    Backups (maybe?)
    Loss of infrastructure: network, etc.

Resolving Single Points of Failure
  Redundant
    Power, power supplies
    Battery/generator backup
    Network cards
    Device controllers
    Disks
  SANs
  Fault-tolerant disks
    RAID: Redundant Array of Independent Disks

RAID Levels
  Redundant Array of Independent Disks
  RAID 0: block-level striping without parity
  RAID 1: mirroring without parity or striping

RAID Levels
  RAID 5: block-level striping with distributed parity

RAID Levels
  RAID 10, also known as RAID 1+0
    A combination of RAID 1 and RAID 0: mirroring + striping
    No parity write: faster for inserts
    Double the space!
    Hot swap
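To make the space and parity trade-offs of these RAID levels concrete, here is a minimal sketch. It assumes equal-sized disks and models RAID 5 parity as a bytewise XOR across the data blocks of a stripe, which is the usual scheme. The function names are illustrative only.

```python
from functools import reduce


def usable_disks(level, n):
    """How many disks' worth of capacity is usable from n equal-sized disks."""
    if level == "0":
        return n                      # striping only, no redundancy
    if level == "1":
        return 1                      # every disk holds the same copy
    if level == "5":
        if n < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return n - 1                  # one disk's worth goes to parity
    if level == "10":
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return n // 2                 # mirrored pairs: double the space
    raise ValueError(f"unsupported RAID level: {level}")


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(stripe)
    # Lose the middle block, then rebuild it from the survivors plus parity.
    rebuilt = xor_blocks([stripe[0], stripe[2], parity])
    print("rebuilt block matches:", rebuilt == stripe[1])
    for level in ("0", "5", "10"):
        print(f"RAID {level}: {usable_disks(level, 8)} of 8 disks usable")
```

The extra parity calculation on every write is what the "no parity write" point refers to: RAID 10 skips it at the cost of half the raw capacity.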

RAID – Which Is Best?
  BENCHMARK!
  If all things are equal – “MPF”: RAID 10
  SAN cache
    With enough cache, parity writes may not impact performance
    Device controller dependent
  Single point of IO
    If one disk corrupts, then the other corrupts

More Redundancy is better
  Hardware
    SAN: replication, mirroring, copy, etc.
  Software redundancy
    Old school “BAR”
    Replication
    Log Shipping
    Mirroring

Backup and Recover “BAR”
  Manual or home-grown process
    Back up
    Copy file
    Restore backup
  Ability to query/develop against non-production
    Scrub production data
  Latency defined
    24 hours
    1 week, month, quarter
  Does not work as well for VLDB
    Apply differentials

Replication
  Reads the log
  Distribution database
    Keeps track
  Publisher
  Distributor
    Can be on the same SQL Server
  Subscriber
    One to many subscribers
  Process can run in near real time
    Can schedule as well

Replication Types
  Snapshot
    Snapshot Agent
    Schema and data
  Transactional Replication
    One way: publish to subscribers
    Incremental changes
    Continuous or scheduled
  Merge
    Merge Agent for conflict resolution
    Every row is given a unique identifier
    Detection and resolution

Replication Features
  Can publish
    To non-SQL Server subscribers
    To multiple subscribers
    Table by table
    Stored procs
  Monitoring
  Alerts

Log Shipping
  Provides a backup of the current DB in a secondary DB
    Read-only copy
  Transaction logs “shipped”
    Automated “BAR”
  SQL Agent jobs
    Latency
    Schedule
    Script
    Disable/enable
  One to many DBs

Log Shipping Issues
  Failover is not automatic
    You have to reset everything
  Read-only DB can be used in manual failover
    Requires application changes
  When a log is being applied, connections to the secondary read-only DB are dropped
  Network dependent
    Log backup size
  Job schedule
    Default times
    Spread schedule
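The "spread schedule" tip can be sketched as follows: rather than letting every database's log backup job fire at the same default time, give each one a different offset within the interval. The helper name and the 15-minute interval are assumptions for illustration; the resulting offsets would still have to be applied to the SQL Agent job schedules by hand or by script.

```python
def stagger_offsets(databases, interval_seconds=900):
    """Return {database: start offset in seconds}, spread evenly over one interval.

    interval_seconds=900 (15 minutes) is an assumed schedule, not a value from the slides.
    """
    names = sorted(databases)
    if not names:
        raise ValueError("need at least one database")
    step = interval_seconds / len(names)
    return {name: int(i * step) for i, name in enumerate(names)}


if __name__ == "__main__":
    # Hypothetical database names, used only to show the output shape.
    for name, offset in stagger_offsets(["Sales", "HR", "Inventory"]).items():
        print(f"{name}: start {offset // 60:02d}:{offset % 60:02d} into each interval")
```

Spreading the start times keeps the log backups, copies, and restores from competing for the same network and disk bandwidth at the top of every interval.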

Mirroring
  Dual write for a single DB on different SQL Servers
    Asynchronous (High Performance mode)
    Synchronous (High Safety mode)
  Good redundancy
  Manual or automatic failover
  Rolling update
  2008 Enterprise
    Resolution of page errors

Mirroring Issues
  Managing multiple databases so that if one fails they all fail over is difficult, if not impossible
  Synchronous needs to be close
    Local or dark fiber
  There is only one mirror of the database
  The mirror is not directly usable; it just sits there unless you are prepared to work with snapshots
  There is no mirror after the failover: the mirroring state is DISCONNECTED and the principal is exposed
  A SQL Server native client is needed to use mirroring

Clustering Overview
  [Diagram: Cluster of Node 1 to Node N with Heartbeat; SQL Server Instance; Failover; SAN with Quorum Drive; Virtual Server Group (SQL Server, Windows Server)]

Clustering Terms
  Node
  Service / Group (old)
  Resources
  Heartbeat
  Quorum
  SAN

Cluster Setup
  A cluster is a logical grouping of resources
    Failover Cluster Manager (2008+)
    Cluster Manager
  Each machine (node)
    Windows Server
    IP address
    Local and SAN drives
  Each (service/group) instance
    VM, Windows Server
    SQL
    Dedicated resources

Cluster Failover
  Service/group (VM) moves from one node to another
    Best available or directed
    Usually takes under 30 seconds
  Automatic or manual
    Heartbeat monitor
    Application independent
  Failover tracking
    Failover notification proc
    SQL Server log
    Cluster

Clustering Issues
  Failover is usually fast
    But can take several minutes to recover DBs
  Failover issues
    Connections drop
    Transactions stop
    Failure to connect
  Cluster can fail
    SAN disk failure
    Memory / resources
    Keep a spreadsheet

AlwaysOn Group/Cluster

AlwaysOn Terms
  Windows Server Failover Clustering
  Type
    Always On Failover Cluster Instance
    Always On Availability Group (cluster change!)
    Can be either or both
  Listener: IP address and DNS name
    Logical instance that programs attach to
  Replica: SQL Server mirror copy of DBs

AlwaysOn Availability Groups
  One database or a group of databases
  Advanced mirroring
    Multiple secondary DBs
    Multiple synchronous DBs
    Automatic page repair
  Active secondary
    Offloading workloads
    Backup/log from secondary
  Multiple groups

AlwaysOn Availability
  Database group failover
    Automatic or manual
  Management Studio
    Management of groups in Management Studio
    Dashboard
  No shared SAN
    Local attached disks
    Ability to repair from mirror
    Change RAID level???

[Screenshot: Management Studio showing Primary, Secondary, and Listener]

AlwaysOn Availability Restrictions
  All servers must be in the same domain
    Can be different data centers/cloud
  Up to 3 replicas can be synchronous
    Local, or dark fiber
    Up to 2 of them can be used for automatic failover
  All servers must use the same service account
    If using Kerberos
  Both AlwaysOn Availability types (Group & Cluster) rely on Windows Server Failover Clustering infrastructure & Windows Cluster
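The restrictions above lend themselves to a quick sanity check before building a group. The sketch below is a hypothetical pre-flight check, not a SQL Server API: the `Replica` class and `check_layout` function are invented for illustration, and the limits are taken directly from this slide (same domain, up to 3 synchronous replicas, up to 2 of those for automatic failover).

```python
from dataclasses import dataclass

# Limits as stated on the slide for SQL Server 2012.
MAX_SYNCHRONOUS = 3
MAX_AUTOMATIC_FAILOVER = 2


@dataclass(frozen=True)
class Replica:
    """Illustrative description of one planned replica (hypothetical type)."""
    name: str
    domain: str
    synchronous: bool
    automatic_failover: bool


def check_layout(replicas):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if len({r.domain for r in replicas}) > 1:
        problems.append("replicas span more than one domain")
    if sum(r.synchronous for r in replicas) > MAX_SYNCHRONOUS:
        problems.append(f"more than {MAX_SYNCHRONOUS} synchronous replicas")
    if sum(r.automatic_failover for r in replicas) > MAX_AUTOMATIC_FAILOVER:
        problems.append(f"more than {MAX_AUTOMATIC_FAILOVER} automatic-failover replicas")
    for r in replicas:
        # Automatic failover is drawn from the synchronous replicas only.
        if r.automatic_failover and not r.synchronous:
            problems.append(f"{r.name}: automatic failover requires a synchronous replica")
    return problems


if __name__ == "__main__":
    plan = [
        Replica("SQL-A", "corp.example", True, True),
        Replica("SQL-B", "corp.example", True, True),
        Replica("SQL-DR", "corp.example", False, False),
    ]
    print(check_layout(plan) or "no problems found")
```

A check like this only covers the counting rules; service accounts, Kerberos, and the underlying Windows cluster still have to be verified separately.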

AlwaysOn Failover Cluster Instance
  Failover of the instance rather than at the DB level
  New features
    Multi-site clustering across subnets for improved site protection
    Flexible failover policy for better control over instance failover
    Improved diagnostics for faster troubleshooting
    TempDB on local drive allows better query performance

2012 Always On Downside
  Failover time
    Volume dependent
    30 seconds to 30 minutes
  Two machines, two deploys for:
    Security
      Must be same SID for SQL IDs
    SQL Agent jobs
      Need to be “primary” aware
  Secondary must be up (bug)
  Corruption

SIOS Software Solution Overview
  [Diagram: Cluster of Node 1 to Node N with Heartbeat; SQL Server Instance; Failover; VM; SAN with Quorum Drive; Virtual Server Group (SQL Server, Windows Server)]

Comparison of Methods
  Columns: Type, Latency, Temp, 9s, HA, DR
  Replication: asynchronous - scheduled; warm; manual, application must handle; very good
  Backup: daily; cold; OK
  Log Shipping: scheduled; cold - warm; good to excellent
  Mirroring (Sync): synchronous; hot; good; monitor server; very good
  Mirroring (Async): depends on volume; manual
  Clustering: N/A; excellent; excellent to poor for SAN failure
  Always On (Sync)
  Always On (Async): warm +
  * Maintenance - use most current versions

Maintenance
  Beyond Patch Tuesday
    Clustering
  Backups
    Tape drive speed
  Index/table fragmentation
    Online index
  Old data removal
    Size matters, smaller is better

What's Best for your environment?
  If you are afraid to fail over, you do not have a valid system
  Define the 9s through legal cheating
    Change the calculation
      Yes/No
      Remove scheduled maintenance time from the calculation
    Change the definition
      SQL > Cluster
  Benchmark
  Knowledge
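The effect of removing scheduled maintenance from the calculation is easy to see with numbers. This is a minimal sketch with an invented function name and made-up example figures (a 30-day month, 4 hours of planned maintenance, 10 minutes of unplanned outage); none of these values come from the slides.

```python
def reported_availability(period_s, unplanned_s, planned_s=0.0, exclude_planned=False):
    """Availability percentage, optionally leaving planned maintenance out of the math."""
    if exclude_planned:
        # Planned time is removed from both the downtime and the measured period.
        base = period_s - planned_s
        down = unplanned_s
    else:
        base = period_s
        down = unplanned_s + planned_s
    if base <= 0 or down > base:
        raise ValueError("downtime must fit inside the measured period")
    return 100.0 * (base - down) / base


if __name__ == "__main__":
    month = 30 * 86_400          # assumed 30-day month
    planned = 4 * 3_600          # example: 4 hours of scheduled maintenance
    unplanned = 10 * 60          # example: 10 minutes of real outage
    strict = reported_availability(month, unplanned, planned)
    relaxed = reported_availability(month, unplanned, planned, exclude_planned=True)
    print(f"counting maintenance:  {strict:.2f}%")
    print(f"excluding maintenance: {relaxed:.2f}%")
```

Same month, same outages: only the definition changed, which is why the definition of the 9s needs to be agreed before the number is reported.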
