Sofia, Bulgaria | 9-10 October
SQL Server 2005 High Availability for Developers
Vladimir Tchalkov, Crossroad Ltd.
Agenda
● What is high availability
● Common availability issues & solutions
● SQL Server 2005 high availability features
● Common development mistakes
● A simple example
High availability
Percentage   Downtime (per year)
100%         None
99.999%      < 5.26 minutes
99.99%       5.26 – 52 minutes
99.9%        52 m – 8 h, 45 m
99%          8 h, 45 m – 87 h, 36 m
90%          788 h, 24 m – 875 h, 54 m
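The downtime figures above follow directly from the availability percentage. The sketch below shows the arithmetic, assuming a 365-day year (525,600 minutes), which is consistent with the 5.26-minute and 87 h 36 m figures in the table. The function name is illustrative and not part of the talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum yearly downtime, in minutes, allowed by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


# Print the downtime budget for the common "nines" targets.
for target in (99.999, 99.99, 99.9, 99.0):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% -> {minutes:.2f} minutes per year")
```

Each additional nine divides the allowed downtime by ten, which is why the step from 99.9% to 99.999% usually needs automatic failover rather than manual recovery.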
Availability barriers
● People
● Process
● Technology
Availability barriers (2)
● Hardware failures
● Updates
  ● Operating system & Database server
  ● Application & DB Schema updates
● Data Access Concurrency Limitations
● User or Application Error
● … and more not related to development!
Sample Scenario
● Three tier application
  ● Database (SQL Server 2005)
  ● Business logic (web service)
  ● UI (ASP.NET & Windows app)
  ● Server application (Windows service)
● Uptime goal is %
● Focus is on DB and data access
● DB contains tables with billions of records
● Number of users > 100000
Hardware failures
● Database failure
  ● Cluster
  ● Database mirroring
  ● Other
    ● Replication
    ● Log shipping
● Business logic/web front end failure
  ● Network load balancing
  ● Main concern is state
Clustering
● Cluster
  ● Hot standby – automatic failover
  ● Transparent to client
  ● Single storage
● Failover time
  ● From 15 seconds
  ● Can take > 1 h if recovery takes more time
Recovery process
● SQL Server 2000
  ● Database is available after Undo completes
● SQL Server 2005 Enterprise Ed.
  ● Database is available when Undo begins
[Figure: recovery timelines over time, showing when the database becomes available relative to the Redo and Undo phases]
Mirroring
● No special hardware required
● Instant standby
  ● 2-3 seconds
● Separate databases
● Witness server can be used
● Transparent client redirect
● Protects the database, not the server
Mirroring options
● High performance
  ● Transactions are applied asynchronously
● High safety
  ● Transactions are applied synchronously
  ● Automatic failover possible with witness
Other choices
● Warm standby
  ● Replication
  ● Log shipping
● Cold standby
  ● Backup & restore
  ● Detach/Copy/Attach
DB mirroring and application access demo
Development issues
● Application should recover from DB connectivity issues
  ● Sounds obvious, but this is a common mistake
● Retry logic should be implemented for DB operations
  ● Not so obvious
  ● Transactions should be retried
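The retry advice above can be sketched roughly as follows. This is a minimal illustration and not the code from the demo: `TransientDbError` is a hypothetical stand-in for whatever connection-lost or deadlock error the data access library raises, and `run_with_retry` and its parameters are assumed names. The key point is that the retried unit is the whole transaction, never a single statement in the middle of one.

```python
import random
import time


class TransientDbError(Exception):
    """Hypothetical stand-in for a driver's connection-lost or deadlock error."""


def run_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(); on a transient error wait and run it again.

    operation must execute one whole transaction (begin, work, commit) so
    that a retry repeats the complete unit of work, never half of it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientDbError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Exponential backoff with jitter gives a failover time to finish.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

A caller would wrap each business operation, for example `run_with_retry(lambda: transfer_funds(source, target, amount))`, where `transfer_funds` is a hypothetical function that opens a connection and runs one transaction. Operations retried this way should be safe to repeat, because a commit may have succeeded just before the connection was lost.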
Agenda
● Hardware failures
● Updates
  ● Operating system & Database server
  ● Application & DB Schema updates
● Data Access Concurrency Limitations
● User or Application Error
Software updates
● OS updates
  ● Handled by both clustering and mirroring
  ● NLB can be used to mitigate the effect on application servers
● SQL Server updates
  ● Not handled by clustering (node is down)
  ● Rolling updates with DB mirroring
  ● Improved update time in SQL 2005
DB Schema changes
● Changing DB schema
  ● Holding locks for long periods of time
  ● Breaking application
  ● Breaking replication
● On large databases this can cause more downtime than hardware failures
Minimizing locks
● Changes to tables require a schema modification (Sch-M) lock
  ● Prevents any other access to the table, even metadata access
● Avoid
  ● Adding NOT NULL columns with a default
  ● Changing the width of a column
    ● Except increasing the length of a varchar column
  ● Adding check constraints
Minimizing locks (2)
● The following operations are fast even on large tables
  ● Adding a column with NULL values allowed
  ● Adding a column with NULL values allowed and a default value
  ● Changing NOT NULL to NULL for a column
  ● Adding a DEFAULT constraint
  ● Dropping a constraint or column
Effect on application
● Hide changes from app
  ● Use stored procs or views if possible
● If db schema changes break the application:
  ● Upgrade app first to support old and new schema
  ● Upgrade db schema
  ● In a future release of the app remove support for old schema
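The "support old and new schema" step can be illustrated with a small sketch. The schema change here is purely hypothetical (a `FullName` column replaced by `FirstName` and `LastName`), as is the function name; a row is modeled as a plain dictionary of column names to values.

```python
def customer_display_name(row: dict) -> str:
    """Return a display name from a row in either schema version.

    Hypothetical change: the old schema has FullName; the new schema
    replaces it with FirstName and LastName.
    """
    if "FirstName" in row and "LastName" in row:  # new schema
        return f"{row['FirstName']} {row['LastName']}".strip()
    if "FullName" in row:  # old schema, still supported during the rollout
        return row["FullName"]
    raise KeyError("row matches neither the old nor the new schema")
```

Once every database has been upgraded, the old-schema branch becomes dead code and can be removed in a later release, which matches the three-step order on the slide.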
Effect on replication
● SQL 2000
  ● Special scripts are used
● SQL 2005
  ● Uses DDL triggers
  ● The following changes are handled:
    ● Alter table, alter view, alter trigger
    ● Alter procedure, alter function
● Neither SQL 2000 nor SQL 2005 handles data conflicts (a data change on a replication partner that conflicts with the new schema)
Schema change demo
Agenda
● Hardware failures
● Updates
  ● Operating system & Database server
  ● Application & DB Schema updates
● Data Access Concurrency Limitations
● User or Application Error
Concurrency issues
● Index maintenance
  ● Creating or rebuilding an index can be a long operation
  ● On SQL 2000 it locks the whole table
  ● On SQL 2005 Enterprise edition this can be an online operation
● Application locks
  ● Snapshot isolation
    ● Based on storing multiple row versions
Snapshot isolation
● Allows non-blocking consistent reads
● Writers do not block readers; readers do not block writers
● Increases concurrency and data availability while reducing deadlocks
● Two options for snapshot isolation – statement or transaction level
Snapshot isolation (2)
● Transaction level snapshot isolation
  SET TRANSACTION ISOLATION LEVEL SNAPSHOT
● Read operations do not acquire locks
● Reads retrieve the values that existed at the start of the transaction
● … but writes are permitted, which can cause conflicts (that are resolved)
Snapshot isolation (3)
● Statement level snapshot isolation
  SET TRANSACTION ISOLATION LEVEL READ COMMITTED
  with the READ_COMMITTED_SNAPSHOT database option enabled
● Reads retrieve the values that existed at the start of the statement
● … but writers do block writers
Snapshot isolation demo
How does it work?
● Stores row versions in tempdb
● Depending on the isolation level the appropriate row version is retrieved
● Greatly reduces locks
● … but might cause write conflicts
  ● These are resolved by automatic rollback
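The idea behind row versioning can be shown with a toy in-memory model. This is only a sketch of the concept: all class and method names are invented for illustration, versions live in a Python dictionary and not in tempdb, and conflicts are checked at commit time for simplicity, so it should not be read as a description of how SQL Server implements the feature.

```python
class WriteConflict(Exception):
    """Raised when a snapshot transaction must be rolled back."""


class VersionedStore:
    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the latest commit

    def begin(self):
        return SnapshotTransaction(self, self._clock)


class SnapshotTransaction:
    def __init__(self, store, start_ts):
        self._store = store
        self._start_ts = start_ts
        self._writes = {}

    def read(self, key):
        if key in self._writes:  # a transaction sees its own writes
            return self._writes[key]
        # Newest version committed at or before this transaction started.
        for commit_ts, value in reversed(self._store._versions.get(key, [])):
            if commit_ts <= self._start_ts:
                return value
        return None

    def write(self, key, value):
        self._writes[key] = value  # buffered until commit

    def commit(self):
        store = self._store
        for key in self._writes:
            versions = store._versions.get(key, [])
            # Someone committed a newer version after our snapshot was taken.
            if versions and versions[-1][0] > self._start_ts:
                self._writes.clear()  # automatic rollback
                raise WriteConflict(key)
        store._clock += 1
        for key, value in self._writes.items():
            store._versions.setdefault(key, []).append((store._clock, value))
```

A reader that started before a concurrent update keeps seeing the old value without taking any lock, while a second writer to the same row gets `WriteConflict` and has to retry its transaction, which is one more reason applications need retry logic.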
Agenda
● Hardware failures
● Updates
  ● Operating system & Database server
  ● Application & DB Schema updates
● Data Access Concurrency Limitations
● User or Application Error
User or app errors
● Users, applications, and DBAs do make errors
  ● Good process and tools can mitigate this risk
  ● This is the most typical availability issue
● Restore from backup…
  ● Is there any good backup?
  ● How much time is needed to do the restore?
  ● Total downtime can be very high
DB snapshots
● Database can go back in time
  ● DB can be reverted to an existing snapshot
  ● Snapshots are created immediately
● Works with other availability options
  ● DB Mirroring
  ● Failover cluster
DB snapshots (2)
● Recover from User, Application, or DBA error
  ● Revert database to previously created Database Snapshot
● Static, time-consistent copy for reports
  ● Combined with DB Mirroring enables reporting on the standby
DB snapshots (3)
● Space efficient
  ● Does not require a complete copy of the data
  ● Shares unchanged pages of the database
  ● Requires extra storage only for changed pages
● Uses a “copy-on-write” mechanism
● Database Snapshot may affect performance on the base database
How do they work?
● NTFS sparse files are utilized
● Only changed data is actually populated
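The copy-on-write behaviour described above can be modeled in a few lines. This is a toy illustration with invented names (`Database`, `DatabaseSnapshot`, `write_page`); it assumes a fixed set of pages and keeps everything in memory, whereas the real feature works on data pages stored in NTFS sparse files.

```python
class DatabaseSnapshot:
    def __init__(self, source):
        self._source = source
        self.saved_pages = {}  # starts empty, like a sparse file

    def preserve(self, page_no):
        # Save the original page only once, before its first change.
        if page_no not in self.saved_pages:
            self.saved_pages[page_no] = self._source.pages[page_no]

    def read_page(self, page_no):
        # Changed pages come from the snapshot, unchanged ones from the source.
        if page_no in self.saved_pages:
            return self.saved_pages[page_no]
        return self._source.pages[page_no]


class Database:
    def __init__(self, pages):
        self.pages = dict(pages)  # page number -> page contents
        self.snapshots = []

    def create_snapshot(self):
        # Creation copies nothing, which is why it is nearly instant.
        snapshot = DatabaseSnapshot(self)
        self.snapshots.append(snapshot)
        return snapshot

    def write_page(self, page_no, data):
        if page_no not in self.pages:
            raise KeyError("toy model: only existing pages can be changed")
        # Copy-on-write: hand the old contents to every snapshot first.
        for snapshot in self.snapshots:
            snapshot.preserve(page_no)
        self.pages[page_no] = data
```

The extra copy on the first write to each page is also where the performance cost on the base database comes from: every existing snapshot adds work to that write.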
How to use them?
● To create a database snapshot
  CREATE DATABASE mySnapshot ON ( <file specification> ) AS SNAPSHOT OF myDB
● To drop a database snapshot
  DROP DATABASE mySnapshot
● To revert a database to a snapshot
  RESTORE DATABASE myDB FROM DATABASE_SNAPSHOT = 'mySnapshot'
Database snapshots demo
Summary
● Maintaining high availability requires excellent planning
● Application architecture is more important than the specific high availability technologies
● Never forget that most of the issues come from human mistakes
Q & A