Data Guard Basics
Julian Dyke
Independent Consultant
Web Version - February 2008
juliandyke.com © 2008 Julian Dyke

Agenda
- Data Guard
  - The Theory
  - The Reality

Data Guard - The Theory

Data Guard - Reasons for Deployment
- Site Failures
  - Power failure
  - Air conditioning failure
  - Flooding
  - Fire
  - Storm damage
  - Hurricane
  - Earthquake
  - Terrorism
  - Sabotage
  - Plane crash
- Planned Maintenance
- HUMAN ERROR

Data Guard - Standby Database
[Diagram: the primary database (primary instance and database, Site 1) sends redo to the standby database (standby instance and database, Site 2).]

Data Guard - Physical Standby
- Physical Standby
  - Technology introduced in Oracle 7.2
  - Marketed as Data Guard in Oracle and above
  - Standby is identical copy of primary database
  - Redo changes transported from primary to standby and applied on standby (Redo Apply)
- Can switch operations to standby
  - Planned (switchover / switchback)
  - Unplanned (failover)
- Failover time dependent on various factors
  - Rate of redo generation / size of redo logs
  - Redo transport / apply configuration
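
The link between redo generation rate and redo log size can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and the figures are hypothetical, and it assumes a steady redo rate. It estimates how long one online redo log takes to fill, which bounds how much redo can be outstanding on the primary when only completed logs are shipped.

```python
def log_fill_seconds(log_size_mb: float, redo_rate_mb_per_s: float) -> float:
    """Time to fill one online redo log at a steady redo generation rate."""
    if log_size_mb <= 0 or redo_rate_mb_per_s <= 0:
        raise ValueError("log size and redo rate must be positive")
    return log_size_mb / redo_rate_mb_per_s


# Hypothetical figures: 512 MB redo logs, 2 MB/s of redo generated.
interval = log_fill_seconds(512, 2.0)
print(f"one log fills roughly every {interval:.0f} s")
```

Larger logs or a lower redo rate lengthen this interval, so more redo may still be sitting in the current log when a failure occurs.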

Data Guard - Logical Standby
- Introduced in Oracle 9.2
- Subset of database objects
- Redo copied from primary to standby
- Changes converted into logical change records (LCR)
- Logical change records applied on standby (SQL Apply)
- Standby database can be opened for updates
  - Can modify propagated objects
  - Can create new indexes for propagated objects
- May need larger system for logical standby
  - LCR apply can be less efficient than redo apply
  - Array updates on primary become single row updates on standby
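
To illustrate why SQL Apply can need more capacity than Redo Apply, the following sketch models one array update expanding into one row-level change per affected row. `RowChange` is a hypothetical stand-in for a logical change record, not Oracle's actual LCR structure, and the table and values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowChange:
    """Hypothetical stand-in for a logical change record: one row, one change."""
    table: str
    row_id: int
    new_value: str


def expand_array_update(table, row_ids, new_value):
    """Model one array update on the primary as one row change per row."""
    return [RowChange(table, row_id, new_value) for row_id in row_ids]


# One statement on the primary touching 1000 rows (hypothetical table).
changes = expand_array_update("orders", range(1, 1001), "SHIPPED")
print(len(changes), "row-level changes to apply on the standby")
```

The standby does work proportional to the number of rows rather than the number of statements, which is the cost the slide warns about.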

Data Guard - Protection Modes
Three protection modes:
- Maximum protection - zero data loss
  - Redo synchronously transported to standby database
  - Redo must be written to at least one standby before transactions on primary can be committed
  - Processing on primary is suspended if no standby is available
- Maximum availability - minimal data loss
  - Similar to maximum protection mode
  - If no standby database is available processing continues on primary
- Maximum performance (default)
  - Redo asynchronously shipped to standby database
  - If no standby database is available processing continues on primary
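
The difference between the three modes comes down to what the primary does when no standby can acknowledge redo. A minimal sketch of that decision, using hypothetical names and following only the behaviour described on this slide:

```python
from enum import Enum


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "maximum protection"
    MAXIMUM_AVAILABILITY = "maximum availability"
    MAXIMUM_PERFORMANCE = "maximum performance"


def commit_behaviour(mode: ProtectionMode, standby_available: bool) -> str:
    """Describe how a commit on the primary behaves in each mode."""
    if mode is ProtectionMode.MAXIMUM_PERFORMANCE:
        # Asynchronous shipping: the primary never waits for the standby.
        return "commit without waiting for standby"
    if standby_available:
        # Both synchronous modes wait while a standby is reachable.
        return "commit after standby acknowledges redo"
    if mode is ProtectionMode.MAXIMUM_PROTECTION:
        return "primary suspended"
    # Maximum availability continues on the primary alone.
    return "commit without waiting for standby"


for mode in ProtectionMode:
    print(mode.value, "->", commit_behaviour(mode, standby_available=False))
```

Only maximum protection trades primary availability for the zero data loss guarantee.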

Data Guard - Redo Log Shipping
- ARCH background process
  - Copies completed redo log files to standby
- LGWR background process - modes are:
  - ASYNC - asynchronous
    - Oracle 10.1 and below
      - Redo written by LGWR to dedicated area in SGA
      - Read from SGA by LNSn background process
    - Oracle 10.2 and above
      - Redo written by LGWR to local disk
      - Read from disk by LNSn background process
  - SYNC - synchronous
    - Redo written to standby by LGWR - modes are:
      - AFFIRM - wait for confirmation redo written to disk
      - NOAFFIRM - do not wait
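
These transport choices are expressed as attributes of a LOG_ARCHIVE_DEST_n parameter. The sketch below assembles such a value from the options on this slide. The helper and the service name `stby` are hypothetical, and the attribute spellings (SERVICE, ARCH, LGWR, SYNC, ASYNC, AFFIRM, NOAFFIRM) follow the 10g-era syntax, so check the reference for the release in use.

```python
def archive_dest_value(service: str, process: str = "ARCH",
                       sync: bool = False, affirm: bool = False) -> str:
    """Build a LOG_ARCHIVE_DEST_n value string for a standby destination."""
    process = process.upper()
    if process not in ("ARCH", "LGWR"):
        raise ValueError("process must be ARCH or LGWR")
    parts = [f"SERVICE={service}", process]
    if process == "LGWR":
        parts.append("SYNC" if sync else "ASYNC")
        parts.append("AFFIRM" if affirm else "NOAFFIRM")
    elif sync or affirm:
        # This sketch only models SYNC and AFFIRM for LGWR transport.
        raise ValueError("sync/affirm are only modelled for LGWR transport")
    return " ".join(parts)


# Hypothetical service name 'stby'.
print(archive_dest_value("stby"))
print(archive_dest_value("stby", "LGWR", sync=True, affirm=True))
```

The second call yields the LGWR SYNC AFFIRM combination, which waits for confirmation that redo has reached disk on the standby.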

Data Guard - ARCH Redo Transmission
[Diagram. Primary database: LGWR, online redo log, ARC0, ARC1, archived redo logs, LOG_ARCHIVE_DEST_1, LOG_ARCHIVE_DEST_2. Standby database: RFS, standby redo log, ARCn, archived redo logs, MRP / LSP.]

Data Guard - LGWR (ASYNC) Redo Transmission
[Diagram. Primary database: LGWR, online redo log, LNSn, ARCn, archived redo logs, LOG_ARCHIVE_DEST_1. Standby database: RFS, standby redo log, ARCn, archived redo logs, MRP / LSP.]

Data Guard - LGWR (SYNC) Redo Transmission
[Diagram. Primary database: LGWR, LNSn, online redo log, ARCn, archived redo logs, LOG_ARCHIVE_DEST_1. Standby database: RFS, standby redo log, ARCn, archived redo logs, MRP / LSP.]

Data Guard - Role Transitions
There are two types of role transition:
- Switchover
  - Planned failover to standby database
  - Original primary becomes new standby
  - Original standby becomes new primary
  - No data loss
  - Can switchback at any time
- Failover
  - Unplanned failover to standby database
  - Original standby becomes new primary
  - Original primary may need to be rebuilt
  - Possible data loss
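
The two transitions can be modelled as simple role maps over a two-site configuration. This is a sketch with hypothetical site names and role labels, not a representation of how Data Guard tracks roles internally.

```python
def switchover(roles: dict) -> dict:
    """Planned transition: primary and standby swap roles."""
    swapped = {"primary": "standby", "standby": "primary"}
    return {site: swapped[role] for site, role in roles.items()}


def failover(roles: dict) -> dict:
    """Unplanned transition: standby takes over, old primary is lost for now."""
    after = {"primary": "unavailable (may need rebuild)", "standby": "primary"}
    return {site: after[role] for site, role in roles.items()}


before = {"Site1": "primary", "Site2": "standby"}
print(switchover(before))              # roles exchanged
print(switchover(switchover(before)))  # switchback restores the original layout
print(failover(before))                # Site2 is primary, Site1 needs attention
```

Switchover is its own inverse, which is why a switchback is possible at any time, whereas failover leaves the original primary outside the configuration until it is rebuilt.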

Data Guard - Switchover
[Diagram. Before switchover: the primary database (primary instance) at Site1 sends redo to the standby database (physical standby instance) at Site2. After switchover: Site1 hosts the standby database (physical standby instance) and Site2 hosts the primary database (primary instance).]

Data Guard - Failover
[Diagram. Before failover: the primary database (primary instance) at Site1 sends redo to the standby database (physical standby instance) at Site2. After failover: the primary instance at Site1 is unavailable and Site2 hosts the primary database (primary instance).]

Data Guard - Read-Only Mode
- Physical standby database can be opened in read-only mode
  - (Managed) Recovery must be suspended
- Reports can use temporary tablespaces
  - Sorts
  - Temporary tables
- Reports cannot modify permanent objects
- Failover times may be affected
  - Suspended redo must be applied
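
The effect on failover time can be estimated from the redo that accumulates while recovery is suspended. A rough sketch, assuming steady rates and that no new redo arrives once the primary has failed; the function name and all figures are hypothetical.

```python
def catch_up_minutes(read_only_minutes: float,
                     redo_rate_mb_per_min: float,
                     apply_rate_mb_per_min: float) -> float:
    """Minutes needed to apply redo that built up while the standby was read-only."""
    if apply_rate_mb_per_min <= 0:
        raise ValueError("apply rate must be positive")
    backlog_mb = read_only_minutes * redo_rate_mb_per_min
    return backlog_mb / apply_rate_mb_per_min


# Hypothetical: 2 hours read-only, 60 MB/min of redo, 240 MB/min apply rate.
print(catch_up_minutes(120, 60, 240), "minutes of apply before failover can complete")
```

The longer the standby stays open for reporting, the larger the backlog that has to be applied before it can take over.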

Data Guard - Delayed Redo Application
- Delay in redo application can be configured
- Redo is transported immediately
  - Provides protection against site failure
- Redo is not applied immediately
  - Provides protection against human error
  - Increases potential failover times
- In Oracle 10.1 and above flashback database can be used as an alternative to delayed redo application
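
A delay means that redo already present on the standby stays unapplied until its delay has elapsed. The sketch below shows which logs would be eligible for apply at a given moment. It assumes the delay is measured from the time each log arrived on the standby, and the sequence numbers and timestamps are invented for illustration.

```python
from datetime import datetime, timedelta


def eligible_for_apply(received: dict, delay_minutes: int, now: datetime) -> list:
    """Return log sequence numbers whose configured delay has elapsed."""
    delay = timedelta(minutes=delay_minutes)
    return [seq for seq, received_at in sorted(received.items())
            if received_at + delay <= now]


# Hypothetical arrival times on the standby.
received = {
    101: datetime(2008, 2, 1, 9, 0),
    102: datetime(2008, 2, 1, 9, 30),
    103: datetime(2008, 2, 1, 10, 0),
}
now = datetime(2008, 2, 1, 10, 45)
print(eligible_for_apply(received, 60, now))  # logs 101 and 102; 103 is still held back
```

An erroneous change recorded in log 103 could still be avoided at this point because that log has been transported but not yet applied.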

Data Guard - Data Guard Broker
- Introduced in Oracle 9.2
- Stable in Oracle 10.2 and above
- Managed using DGMGRL utility
- Contains Data Guard configuration
- Additional layer of complexity
- Used by Enterprise Manager to manage standby
- Mandatory for some new functionality e.g. Fast Start Failover

Data Guard - Fast Start Failover
[Diagram: primary (Node 1, database) at Site1, standby (Node 2, database) at Site2, observer at Site3.]

Data Guard - Fast Start Failover
- Detects failure of primary database
- Automatically fails over to nominated standby database
- Requirements include
  - Flashback logging must be configured
  - DGMGRL must be used
  - Observer process running in third independent site
    - Highly available in Oracle 11.1 and above
  - MAXIMUM AVAILABILITY protection mode
    - Standby database archive log destination must be configured as LGWR SYNC
  - MAXIMUM PERFORMANCE protection mode
    - Oracle 11.1 and above
- Primary database can potentially be reinstated automatically
  - Using flashback logs
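
The requirements listed on this slide can be read as a checklist. The sketch below encodes them as a validation function over a plain dictionary. The dictionary keys are hypothetical and do not correspond to broker properties; only the rules themselves come from the slide.

```python
def fsf_problems(cfg: dict) -> list:
    """Return the Fast Start Failover requirements a configuration fails to meet."""
    problems = []
    if not cfg.get("flashback_logging"):
        problems.append("flashback logging not configured")
    if not cfg.get("broker_managed"):
        problems.append("configuration not managed through DGMGRL")
    if not cfg.get("observer_at_third_site"):
        problems.append("no observer at an independent third site")
    mode = cfg.get("protection_mode")
    if mode == "MAXIMUM AVAILABILITY":
        if cfg.get("standby_transport") != "LGWR SYNC":
            problems.append("standby destination is not LGWR SYNC")
    elif mode == "MAXIMUM PERFORMANCE":
        if tuple(cfg.get("oracle_version", (0, 0))) < (11, 1):
            problems.append("MAXIMUM PERFORMANCE needs Oracle 11.1 or above")
    else:
        problems.append(f"protection mode {mode!r} not covered by the slide")
    return problems


# Hypothetical 10.2 configuration that satisfies every listed requirement.
good = {
    "flashback_logging": True,
    "broker_managed": True,
    "observer_at_third_site": True,
    "protection_mode": "MAXIMUM AVAILABILITY",
    "standby_transport": "LGWR SYNC",
    "oracle_version": (10, 2),
}
print(fsf_problems(good))  # empty list: nothing to fix
```

Returning every unmet requirement, rather than stopping at the first, makes the function usable as a pre-deployment report.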

Data Guard - Fast Start Failover
- Advantages
  - No interconnect network required between sites
  - No storage network required between sites
  - RAC licences not required if each site is a single instance
- Disadvantages
  - Active / Passive
  - Requires Enterprise Edition licence
  - Remaining infrastructure must also fail over
    - Network
    - Application tier
    - Clients

Data Guard - Oracle 11g New Features
- Snapshot Standby
  - Standby can be converted to snapshot standby
  - Can be opened in read-write mode (for testing)
  - Redo transport continues
  - Redo apply delayed
  - Standby can subsequently be converted back to physical standby
- Active Data Guard
  - Separately licensed option
  - Updates applied to primary
  - Changes can be read immediately on standby databases
  - Standby database can be opened in read-only mode
  - Redo can continue to be applied

Data Guard - Licensing
- Standby database nodes must be fully licensed
  - Same metric as primary (named user, CPU etc)
- Standard Edition
  - Cannot use Data Guard
  - Use user-defined scripts to transport redo
  - Use Automatic Recovery to apply redo
  - Manually resolve archive log gaps
- Enterprise Edition
  - Use Managed Recovery to apply redo
  - Use Fetch Archive Logging to resolve archive log gaps
  - Additional licenses required for Active Data Guard
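
Resolving archive log gaps manually starts with finding which sequence numbers never reached the standby. A minimal sketch of that check, assuming the script already has the list of sequence numbers present on the standby; how that list is obtained is left out, and the numbers are invented.

```python
def find_sequence_gaps(received) -> list:
    """Return missing log sequence numbers between the lowest and highest received."""
    present = set(received)
    if not present:
        return []
    return [seq for seq in range(min(present), max(present) + 1)
            if seq not in present]


# Hypothetical sequence numbers found in the standby archive directory.
print(find_sequence_gaps([100, 101, 104, 106]))  # [102, 103, 105]
```

Each missing sequence would then have to be copied from the primary before recovery can move past it.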

Data Guard - Alternatives
- Standard Edition
  - Manual log shipping using scripts
- SAN level replication technologies
  - NetApp SnapMirror, MetroCluster
  - EMC SRDF, MirrorView
  - HP StorageWorks
- Redo log replication technologies
  - Quest SharePlex

Data Guard - The Reality

Data Guard - The Reality
- Many sites run physical standbys
  - Well proven technology
  - Spare capacity on standby often used for development or testing during normal operations
- Relatively few sites run a logical standby
  - Streams is much more popular
- Many sites enable flashback logging
  - In both development and production environments
- Very few using Automatic Failover
- Very few sites working with Oracle 11g yet
  - Consequently none using Active Data Guard

Data Guard - The Reality
- Failover times
  - Normally dependent on management decisions
  - Usually some investigation before failover
  - Time to fail over database is minimal (5-10 minutes)
  - Time to fail over infrastructure can be hours
    - Network configuration
    - DNS
    - Application / web servers
    - Clients
  - Failover SLAs often up to 48 hours
- Rebuild times
  - Can take minutes using flashback logging
  - Can take much longer depending on reason for failover
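
A quick calculation shows how small the database step is within the overall outage. The sketch below uses hypothetical figures chosen to sit within the ranges on this slide; the decision time in particular is an assumption.

```python
def failover_breakdown(decision_min: float, database_min: float,
                       infrastructure_min: float):
    """Return total elapsed minutes and the database step's share of that total."""
    total = decision_min + database_min + infrastructure_min
    if total <= 0:
        raise ValueError("at least one component must be positive")
    return total, database_min / total


# Hypothetical: 1 hour to investigate and decide, 10 minutes for the database,
# 3 hours for network, DNS, application servers and clients.
total, db_share = failover_breakdown(60, 10, 180)
print(f"total {total} min, database step {db_share:.0%} of elapsed time")
```

With figures like these, tuning the database failover itself changes the outage very little compared with shortening the decision or the infrastructure work.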

Thank you for your interest
References
Questions