Published by George Bartholomew Parker. Modified over 8 years ago.
1
Digging Out From Corruption: Data protection and loss recovery with SQL Server
Eddie Wuerch, MCM - Principal, Database Performance - Salesforce Marketing Cloud
2
I am a DBA.
I am a steward of my company’s data.
Data loss can close my company.
Data loss can ruin my career.
Data loss shall not occur.
3
Hi, I’m Eddie! And I’m a DBA.
Over 15 years with SQL Server
Microsoft Certified Master
Salesforce Marketing Cloud
◦Trillions of rows … 10+ billion tx/day … PBs of data & indexes …
◦… 24x7, no downtime
4
What is “Corruption”? Logical Corruption
DELETE dbo.BigTable
---------
A bazillion rows affected.
5
What is “Corruption”? Physical Corruption
SELECT id,… FROM dbo.BigTable
---------
Error 824
6
Corruption
LOGICAL – HUMAN ERROR
◦Incorrect data mods
◦Detection is up to you
◦Manually fix data/restore DB
PHYSICAL – DAMAGED MEDIA
◦File damage
◦Incomplete writes
◦SQL Errors: 823, 824, 825
◦DBCC CHECKDB
◦Discrete restore options
◦AG Auto-repair (!!)
7
Physical Corruption - Detection
CHECKSUM Page Verification
◦Always use this. Every database.
Agent alerts: 823, 824, 825
msdb.dbo.suspect_pages
Detection happens on page access. Corruption may lie dormant for a long time.
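The detection points on this slide can be sketched in T-SQL roughly as follows. This is a minimal sketch, not the presenter's script: `YourDatabase` and the alert name are placeholders introduced here for illustration.

```sql
-- List databases that are NOT using CHECKSUM page verification.
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE page_verify_option_desc <> N'CHECKSUM';

-- Turn CHECKSUM on (YourDatabase is a placeholder name).
-- Existing pages only receive a checksum the next time they are written.
ALTER DATABASE [YourDatabase] SET PAGE_VERIFY CHECKSUM;

-- Review pages that SQL Server has already flagged as suspect.
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages
ORDER BY last_update_date DESC;

-- Create a SQL Server Agent alert for error 823 (the alert name is hypothetical).
-- Repeat with @message_id = 824 and 825.
EXEC msdb.dbo.sp_add_alert
    @name = N'Error 823 - I/O error',
    @message_id = 823,
    @severity = 0,
    @include_event_description_in = 1;
```

An alert only notifies someone once an operator is attached to it, for example with `msdb.dbo.sp_add_notification`.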
8
823/824/825 - DON’T PANIC
DBCC CHECKDB
◦Get used to this BEFORE disaster
◦Run without repair options
◦Let it complete
◦Your problem may be fixed by dropping an index
◦Investigate performance techniques
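As a rough illustration of the CHECKDB advice above, the commands might look like this. `YourDatabase` and `IX_BigTable_Example` are hypothetical names, and the index rebuild is only an assumption about what "dropping an index" looks like in practice. It applies only when CHECKDB reports errors confined to a nonclustered index (index ID 2 or higher).

```sql
-- Full consistency check with no repair option; let it run to completion.
DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- A lighter-weight check that skips the logical checks. This is one of the
-- performance techniques worth investigating on very large databases.
DBCC CHECKDB (N'YourDatabase') WITH PHYSICAL_ONLY, NO_INFOMSGS;

-- If the damage is confined to a nonclustered index, disabling it discards
-- the damaged index pages, and the rebuild recreates it from the base table.
-- (Index and table names are placeholders.)
ALTER INDEX [IX_BigTable_Example] ON dbo.BigTable DISABLE;
ALTER INDEX [IX_BigTable_Example] ON dbo.BigTable REBUILD;
```

Disabling an index that backs a unique or foreign key constraint has side effects, so check what depends on the index first.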
9
Preparation A backup never saved anybody’s job. The restore did. Plan for the restore, not the backup
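One way to plan for the restore is to restore backups regularly under a different database name and check the result. The sketch below assumes a backup at a hypothetical path and hypothetical logical file names. `RESTORE FILELISTONLY` shows the real logical names to use.

```sql
-- Check that the backup is readable. WITH CHECKSUM only works if the
-- backup itself was taken WITH CHECKSUM. (The path is a placeholder.)
RESTORE VERIFYONLY
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH CHECKSUM;

-- List the logical file names stored in the backup.
RESTORE FILELISTONLY
FROM DISK = N'D:\Backups\YourDatabase_full.bak';

-- Restore under a different name so production is untouched.
-- Logical names and target paths below are hypothetical.
RESTORE DATABASE [YourDatabase_RestoreTest]
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH MOVE N'YourDatabase_data' TO N'D:\RestoreTest\YourDatabase_RestoreTest.mdf',
     MOVE N'YourDatabase_log'  TO N'D:\RestoreTest\YourDatabase_RestoreTest.ldf',
     CHECKSUM, RECOVERY, STATS = 10;

-- Confirm that the restored copy is consistent.
DBCC CHECKDB (N'YourDatabase_RestoreTest') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

`RESTORE VERIFYONLY` on its own does not prove that a backup can be restored. The test restore followed by CHECKDB is the stronger check.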
10
The Restore Strategy
RPO & RTO: What are your goals?
Layers of disaster / layers of recovery
◦Disk, Server, Network, Datacenter…
Time = money
◦Lower downtime = higher cost of equipment and labor
◦Higher RPO/RTO = higher potential cost of fines, loss of business, refunds, etc.
◦RPO/RTO determined by cost
11
Backup Options
Full Backup: Database, Filegroup, File. All recovery models.
Differential: Database, Filegroup, File. All recovery models.
Transaction log: Database only. Not available in the SIMPLE recovery model.
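The three backup types listed above map onto T-SQL along these lines. The database name, filegroup name, and paths are placeholders chosen for this sketch.

```sql
-- Full database backup.
BACKUP DATABASE [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_full.bak'
WITH CHECKSUM, INIT;

-- Full backup of a single filegroup (FG_Archive is a hypothetical name).
BACKUP DATABASE [YourDatabase]
FILEGROUP = N'FG_Archive'
TO DISK = N'D:\Backups\YourDatabase_FG_Archive.bak'
WITH CHECKSUM, INIT;

-- Differential backup: extents changed since the last full backup.
BACKUP DATABASE [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_diff.bak'
WITH DIFFERENTIAL, CHECKSUM, INIT;

-- Transaction log backup (FULL or BULK_LOGGED recovery model only).
BACKUP LOG [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_log.trn'
WITH CHECKSUM, INIT;
```

`WITH CHECKSUM` makes the backup verify page checksums as it reads and adds a checksum over the backup itself. A restore can then validate it.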
12
The Full Database Backup
Restore an entire database
Begin a point-in-time restore
Begin point of a FG/file/page restore (pull 8 KB from last week’s backup, place it in the running database)
Does NOT break the log chain
13
The Full Backup File(s)
Contains every allocated page
Plus enough tx log to bring the DB consistent
Tx log will not be cleared during a full backup (space planning)
14
Differential Backups
All changed extents since the last Full backup
Plus enough tx log for consistency
Can save lots of time on restores
15
Log Backups
Changes since the last log backup
Sequential record of all changes
Can be taken after loss of data file(s), if the log file is available (Full Recovery Model only)
N/A for Simple Recovery Model
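The log backup taken after losing data files is usually called a tail-log backup. A minimal sketch, with a placeholder database name and path:

```sql
-- Tail-log backup of a database whose data files are damaged or missing
-- but whose log file is still intact. NO_TRUNCATE lets the backup proceed
-- even though the database itself cannot be brought online.
BACKUP LOG [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_tail.trn'
WITH NO_TRUNCATE, CHECKSUM;
```

Restoring this backup last, after the full backup and the earlier log backups, brings the database up to the moment of failure.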
16
The Transaction Log
One file per DB is enough
Write-ahead logging
Both redo and undo tracked
ACID
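A few catalog queries help when checking the log configuration against these points. They are offered as a sketch rather than as part of the original deck.

```sql
-- Databases with more than one log file. The log is written sequentially,
-- so extra log files do not add throughput.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS log_file_count
FROM sys.master_files
WHERE type_desc = N'LOG'
GROUP BY database_id
HAVING COUNT(*) > 1;

-- Log size and percentage in use for every database.
DBCC SQLPERF (LOGSPACE);

-- Recovery model, and what is currently preventing log space reuse.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases;
```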
17
The Transaction Log
Recovery Model vs. Logging Model
Crash recovery
18
Bulk-Logged Recovery Model
Full recovery model, with exceptions:
◦Minimally-logged transactions (ML) only record allocations
◦Log can’t redo – CHECKPOINT on commit (ouch!)
Log backups of ML transactions include all changed data pages
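A common pattern, shown here as an assumption rather than the presenter's recommendation, is to switch to BULK_LOGGED only for the duration of a bulk operation and to take log backups on both sides of it. The database name and paths are placeholders.

```sql
-- Log backup before the switch, so point-in-time restore is possible
-- up to this moment.
BACKUP LOG [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_before_bulk.trn'
WITH CHECKSUM, INIT;

ALTER DATABASE [YourDatabase] SET RECOVERY BULK_LOGGED;

-- ... run the bulk load or index rebuild here ...

ALTER DATABASE [YourDatabase] SET RECOVERY FULL;

-- This log backup contains the data extents changed by minimally logged
-- operations, so it can be much larger than the log records alone.
BACKUP LOG [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_after_bulk.trn'
WITH CHECKSUM, INIT;
```

A log backup that contains minimally logged operations can only be restored in full. `STOPAT` to a point inside it is not supported.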
19
The Log Chain
Each log backup = changes since last log backup
The sequential collection of restorable transaction log backups = log chain
Starts with a full backup
Is not tied to the most recent full backup
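The chain can be inspected in the backup history tables and then replayed in order. In this sketch the database name, paths, and `STOPAT` timestamp are all hypothetical.

```sql
-- Backup history: type D = full, I = differential, L = log.
-- In an unbroken chain, each log backup's first_lsn equals the previous
-- log backup's last_lsn.
SELECT database_name, type, backup_start_date, backup_finish_date,
       first_lsn, last_lsn
FROM msdb.dbo.backupset
WHERE database_name = N'YourDatabase'
ORDER BY backup_finish_date;

-- Point-in-time restore: full, then differential, then logs in sequence.
RESTORE DATABASE [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH NORECOVERY;

RESTORE DATABASE [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_diff.bak'
WITH NORECOVERY;

RESTORE LOG [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_log1.trn'
WITH NORECOVERY;

-- Stop just before the bad change and bring the database online.
RESTORE LOG [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_log2.trn'
WITH STOPAT = N'2024-01-15T14:30:00', RECOVERY;
```

The chain is not tied to the most recent full backup. If that backup is unusable, an older full backup plus every log backup taken since it reaches the same point.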
20
BACKUP… MIRROR TO
Enterprise Edition only
Specify additional copies of backup file(s)
Up to 3 mirrors
Works with Full, Diff, and Log backups
21
Restore Options
Entire database in one operation
Partial (Ent. Ed.)
◦Restore PRIMARY FG, bring DB online
◦Restore additional FGs, bring online one-by-one (partitioning bonus)
Corruption Fixes (Online if Ent. Ed.):
◦Restore damaged files
◦Repair damaged pages
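Two of these options, the page restore and the partial (piecemeal) restore, can be sketched as separate scenarios. The page ID `1:57`, the filegroup name `FG_Archive`, and every path are hypothetical. The real page ID comes from the 824 error message or from `msdb.dbo.suspect_pages`.

```sql
-- Scenario 1: page restore. Pull one damaged page from an older backup.
RESTORE DATABASE [YourDatabase]
PAGE = '1:57'
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH NORECOVERY;

-- Apply the existing log backups so the page catches up.
RESTORE LOG [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_log1.trn'
WITH NORECOVERY;

-- Take a new log backup, then restore it to bring the page fully current.
BACKUP LOG [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_log_final.trn'
WITH INIT;

RESTORE LOG [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_log_final.trn'
WITH RECOVERY;

-- Scenario 2: piecemeal restore. Bring PRIMARY online first.
RESTORE DATABASE [YourDatabase]
FILEGROUP = N'PRIMARY'
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH PARTIAL, NORECOVERY;

RESTORE LOG [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_log1.trn'
WITH RECOVERY;

-- Later, restore another filegroup and bring it online.
RESTORE DATABASE [YourDatabase]
FILEGROUP = N'FG_Archive'
FROM DISK = N'D:\Backups\YourDatabase_full.bak'
WITH NORECOVERY;

RESTORE LOG [YourDatabase]
FROM DISK = N'D:\Backups\YourDatabase_log1.trn'
WITH RECOVERY;
```

In the piecemeal scenario, data in a filegroup that has not been restored yet is unavailable, while the rest of the database can already be queried.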
22
Restore Options
BACKUP LOG … TO DISK = 'V:\Logs\...' MIRROR TO DISK = 'W:\Logs\...'
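A complete mirrored log backup might look like the sketch below. The database name and both paths are placeholders and are not the paths elided on the slide.

```sql
-- Mirrored backup (Enterprise Edition). Both destinations receive an
-- identical copy. FORMAT is required when the mirrored media set is
-- first created.
BACKUP LOG [YourDatabase]
TO DISK = N'D:\Backups\YourDatabase_log.trn'
MIRROR TO DISK = N'E:\BackupsMirror\YourDatabase_log.trn'
WITH FORMAT, CHECKSUM;
```

A restore can read from either copy, so writing the mirror to separate storage guards against losing one backup destination.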
23
Restore Options
26
Demo
Disk corruption – non-clustered index
Disk corruption – clustered index
Lost data file
Physical Corruption: detection and different restore/repair types (Let’s break stuff!)
27
Snapshots …are not backups.
28
Snapshots
SQL, SAN, etc. – generally the same
Point-in-time image
Copy-on-Write
Only changed pages in snapshot (pre-change)
Are not backups
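For the SQL Server flavor, a database snapshot is created and reverted roughly as follows. The database name, snapshot name, and path are placeholders, and `YourDatabase_data` stands in for the source database's actual logical data file name.

```sql
-- Create a database snapshot. NAME must match the logical name of the
-- source database's data file.
CREATE DATABASE [YourDatabase_snap]
ON (NAME = N'YourDatabase_data',
    FILENAME = N'D:\Snapshots\YourDatabase_data.ss')
AS SNAPSHOT OF [YourDatabase];

-- Read the pre-change data, for example to recover rows deleted by mistake.
SELECT COUNT(*) FROM [YourDatabase_snap].dbo.BigTable;

-- Revert the whole source database to the snapshot's point in time.
RESTORE DATABASE [YourDatabase]
FROM DATABASE_SNAPSHOT = N'YourDatabase_snap';
```

The snapshot stores only the pre-change copies of modified pages and reads everything else from the source database's files. If those files are damaged or lost, the snapshot is lost with them, which is why it is not a backup.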
29
Snapshots
30
Document, Practice, Drill, Repeat
At restore time: panic, anguish, and unhappy executives
Crises don’t honor vacation schedules or work hours
Script, automate, document
31
Stick around for the raffle! Then join us at the afterparty at Champps Americana Thanks for attending! Please fill out the survey. Download these slides and scripts at SQLSaturday.com