DB-15: Inside the Recovery Subsystem
Plan to commit; be prepared to roll back.
Richard Banville, Fellow, Technology and Product Architecture, Progress OpenEdge

Presentation transcript:


© 2007 Progress Software Corporation
DB-15: Inside the Recovery Subsystem

Slide 2: Recovery Types
• Transaction Recovery*
  Before-image rollback/undo and crash recovery
• Hard Failure Recovery
  Roll forward after images
  Point in time, transaction, retry
• Coordinated distributed txn consistency
  OpenEdge® 2PC: Prepare Phase, Commit Phase
• Heterogeneous distributed txn consistency (JTA)
  External distributed transaction coordinator
  Requires application changes
  Available for OpenEdge SQL only
* Before Imaging is the focus of this presentation

Slide 3: Agenda
• The BI Units of Measure
• Some Simple Rules
• General Processing (the fun stuff)
• Reliability Switches
• Summary

Slide 4: BI Layout: Notes and Blocks
• Notes are the basis for recording change in the database
• The BI is made up of many notes
• Notes are variable sized
• Notes are organized in order of operation
• Notes are stored in BI blocks
• BI block size can be customized (1–16 KB)
• I/O is performed in units of the BI block size

Slide 5: BI Layout: Clusters
• Notes are stored in BI blocks
• BI block size can be customized (1–16 KB)
• I/O is performed in units of the BI block size
• Blocks are grouped to form a cluster
• BI cluster size can be customized (16 KB – 256 MB)
• Size affects checkpoint frequency (among other things)
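
The relationship between the two sizes is simple arithmetic, sketched below in Python. The function name and the requirement that a cluster hold a whole number of blocks are assumptions made for illustration; only the size ranges come from the slide.

```python
KB = 1024
MB = 1024 * KB


def blocks_per_cluster(cluster_bytes: int, block_bytes: int) -> int:
    """Return how many BI blocks fit in one BI cluster.

    The ranges checked here are the ones quoted on the slide; the
    whole-number-of-blocks check is an assumption of this sketch.
    """
    if not (1 * KB <= block_bytes <= 16 * KB):
        raise ValueError("BI block size must be within 1-16 KB")
    if not (16 * KB <= cluster_bytes <= 256 * MB):
        raise ValueError("BI cluster size must be within 16 KB - 256 MB")
    if cluster_bytes % block_bytes != 0:
        raise ValueError("cluster size must be a whole number of blocks")
    return cluster_bytes // block_bytes


if __name__ == "__main__":
    # Hypothetical settings: a 512 KB cluster made of 8 KB blocks.
    print(blocks_per_cluster(512 * KB, 8 * KB))
```

With a fixed block size, a larger cluster holds more blocks and therefore fills up less often, which is why cluster size affects checkpoint frequency.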

Slide 6: BI Layout: Clusters
• Clusters are allocated as needed
• Clusters are logically joined and ordered into a ring
• Only one cluster accepts BI writes at any time

Slide 7: BI Layout: Storage
• The Primary Recovery Area (the BI file): BI data is stored in the extents of area #2 of the database
• It grows as needed
• Space is reused when possible

Slide 8: What’s in a note?
Example:
  Trid: code = RL_RMCR version = 2
  Trid: area = 8 dbkey = update counter = 4770
• Header
  Length & note version
  Note code/identifier
    – Associates action
    – Note type
  Transaction Id
  Block pointer & area
  Block update counter
• Note Specific Info
  Record #
  Table number
  Size of record
  Split information
• Data Portion (if needed)
  Block change data, i.e., the record data itself
  Only if needed

Slide 9: AI / BI Relationship
• File I/O
  BI written first
  AI/BI note headers are the same (OE 10.0A)
  Slightly less data written to AI
• Rollforward
  Reads BI for rollback
  Does NOT record AI data
  DOES record “some” BI data (uses -i)
  Why is -i OK?

Slide 10: Agenda
• The BI Units of Measure
• Some Simple Rules
• General Processing (the fun stuff)
• Reliability Switches
• Summary

Slide 11: Rules to live by
• #1 - Write-ahead logging (WAL)
  Recovery log notes written BEFORE data
    – Assures atomic and durable transactions
    – BI, AI: reliable write I/O
    – Can relax data write I/O
      • Write prior to BI reuse
      • Cluster close
      • Missing data applied by redo
      • Deferring writes allows multiple updates to occur with a single I/O
• #2 - Write ordering rule (FS and hardware)
  AI, BI writes get to disk in the order requested
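
Rule #1 can be illustrated with a toy model. The Python sketch below is not the OpenEdge implementation; the class, its fields, and the use of a per-note sequence number are all hypothetical. It only shows the ordering constraint: before a data block is written, every note up to the last one that changed it must already be in the durable log.

```python
class WalStore:
    """Toy write-ahead-logging store (illustrative names throughout)."""

    def __init__(self):
        self.log_buffer = []   # notes recorded but not yet on disk
        self.log_disk = []     # notes that have reached the durable log
        self.cache = {}        # block id -> (value, sequence no. of last note)
        self.data_disk = {}    # blocks that have reached the data files

    def update(self, block, value):
        """Record a note first, then change the cached block."""
        seq = len(self.log_disk) + len(self.log_buffer)
        self.log_buffer.append((seq, block, value))
        self.cache[block] = (value, seq)
        return seq

    def flush_log(self, up_to_seq):
        """Move buffered notes to the durable log, in the order recorded."""
        while self.log_buffer and self.log_buffer[0][0] <= up_to_seq:
            self.log_disk.append(self.log_buffer.pop(0))

    def write_block(self, block):
        """Write a data block, forcing its notes to the log beforehand."""
        value, seq = self.cache[block]
        self.flush_log(seq)            # the WAL rule: log before data
        self.data_disk[block] = value


if __name__ == "__main__":
    store = WalStore()
    store.update("a", 1)
    store.update("a", 2)   # second change to the same block, no data I/O yet
    store.update("b", 3)
    store.write_block("a")  # one data write carries both changes to "a"
    print(store.data_disk, len(store.log_disk), len(store.log_buffer))
```

Because the data write is deferred, the two updates to block "a" reach the data files in a single write, while the note for block "b" can stay buffered until something forces it out.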

Slide 12: Rules to follow
• #3 - BI Space Reuse
  Only when cluster is closed
  Cluster closes when its last transaction ends
    – Checkpoint DOES NOT close a cluster
    – Checkpoint occurs when cluster fills up
• #4 - Exclusive Block Access
  When changing data in database
• #5 - Atomic Physical Changes
  Such as block chain manipulations
  Enforced by internal TXE mechanism
  SYSTEM ERROR: User 5 died during micro txn.
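
Rule #3 can be sketched as a ring of clusters in which the next cluster is reused only if no open transaction has notes in it. The Python below is a simplified model with hypothetical class and method names; it ignores checkpoints and the requirement that changes have reached the data files, and tracks only the "closed cluster" condition.

```python
class Cluster:
    def __init__(self, number):
        self.number = number
        self.open_txns = set()   # open transactions with notes in this cluster

    def is_closed(self):
        return not self.open_txns


class BIRing:
    """Toy ring of BI clusters (illustrative, not the real data structure)."""

    def __init__(self):
        self.clusters = [Cluster(1)]
        self.current = 0         # index of the only cluster accepting writes

    def write_note(self, txn):
        self.clusters[self.current].open_txns.add(txn)

    def end(self, txn):
        for cluster in self.clusters:
            cluster.open_txns.discard(txn)

    def cluster_full(self):
        """Move on when the current cluster fills up.

        Reuse the next cluster in the ring if it is closed; otherwise
        allocate a new cluster and link it in after the current one.
        """
        nxt = (self.current + 1) % len(self.clusters)
        if nxt != self.current and self.clusters[nxt].is_closed():
            self.current = nxt
        else:
            new = Cluster(len(self.clusters) + 1)
            self.clusters.insert(self.current + 1, new)
            self.current += 1
        return self.clusters[self.current].number


if __name__ == "__main__":
    ring = BIRing()
    ring.write_note("t1")
    print(ring.cluster_full())   # cluster 1 still holds open t1: allocate
    ring.write_note("t2")
    print(ring.cluster_full())   # next in ring is cluster 1, still open
    ring.end("t1")
    print(ring.cluster_full())   # cluster 1 is now closed and can be reused
```

In this model a long-running transaction keeps its cluster open and forces new clusters to be allocated, so the ring grows until the oldest cluster closes.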

Slide 13: Rule
• #6 - Without exception: All DB changes are recorded in the recovery log.

Slide 14: Rules were meant to be broken
• #6 - Without exception: All DB changes are recorded in the recovery log.
• Exception: Control Area (area #1) changes are not logged.
    – Why should I care?
    – Allows structural changes w/o affecting recovery
      • Such as adding space while in roll forward.
    – Recovery mechanism: Builddb

Slide 15: Agenda
• The BI Units of Measure
• Some Simple Rules
• General Processing (the fun stuff)
• Reliability Switches
• Summary

Slide 16: Forward Processing
So you want to perform a database action
• Locate/lock the data block to change
  Not all notes require a block
    – Transaction begin, end
  Not all DB changes require a block!
    – Acquiring additional space
    – Certain index sub-operations
• Ensure begin transaction recorded
• Record the change in the BI log (via the BI buffer pool)

Slide 17: BI Buffer Pool – Recording a change
[Diagram labels: Forward Processing; New Notes (Actions); BI buffer pool (-bibufs 10) with Modified Queue, Free List, Current Output Buffer, Current Input Buffer and two Backout Buffers; BI]

Slide 18: BI Buffer Pool – Recording a change
[Diagram labels: Forward Processing; New Notes (Actions); BI buffer pool (-bibufs 10) with Modified Queue, Free List and Current Output Buffer; BI]
PROMON: Total BI Writes; Records (notes) written; Busy buffer waits; Empty buffer waits; Partial Writes
• Is it OK to buffer dirty BI blocks? YES
• Is it OK to buffer committed BI data? Delayed commit is up to you!

Slide 19: Forward Processing (continued)
The BI note has been written…
• Finally perform the DB action (make the change)
  Logical, physical or a mix
• Data block’s update counter is incremented
  Identifies if a noted change made it to disk yet
  Ensures changes are re-applied in order
• Dependency counter maintained in ctlr struct
  Ensures associated BI flushed if -B eviction
• User may be forced to do (expensive) BI I/O
  On -B eviction or when no BI buffers are available
  Avoid with APWs, BIW and -bibufs

Slide 20: Helping avoid OLTP BI I/O

Slide 21: Broker Processing
Helping avoid OLTP BI I/O
[Diagram labels: New Notes (Actions); BI buffer pool (-bibufs 10) with Modified Queue, Current Output Buffer and Free List; Broker; BI]
• Delayed commit (durability)
  Based on the -Mf value, the Broker may flush BI buffers to disk
  For aged txn ends
PROMON: Total BI Writes; Records (notes) written; Partial Writes

Slide 22: BIW Processing
Helping avoid OLTP BI I/O
[Diagram labels: New Notes (Actions); BI buffer pool (-bibufs 10) with Modified Queue, Current Output Buffer and Free List; BIW; BI]
PROMON: Total BI Writes; Records (notes) written; BIW Writes; Partial Writes

Slide 23: APW Processing
Helping avoid OLTP BI I/O
[Diagram labels: New Notes (Actions); BI buffer pool (-bibufs 10) with Modified Queue, Current Output Buffer and Free List; BI; APW; db; Checkpoint Queue; Data Blocks; Associated BI Note (dependency ctr); WAL]

Slide 24: BI Clusters And Checkpointing

Slide 25: The Precious Ring
[Diagram labels: BI Cluster Layout; BI Files (4, 2, 3, 1); Database; -B buffer pool; Modified Queue; Current Out Buffer; -bibufs]
• BI blocks are grouped together to form a cluster of blocks.
• The clusters are logically joined together in a ring.

Slide 26: Checkpoint – Synchronization point
[Diagram labels: BI Cluster Layout; BI Files (4, 2, 3, 1); Database; -B buffer pool; Modified Queue; Current Out Buffer; -bibufs; File System Cache]
• Db buffer pool scanned
• Db buffers previously marked for chkpt are written out (OUCH!)
• Dirty buffers are marked for chkpt & put on checkpoint queue
• File system cache is synchronized
• No more sync delay
• Fuzzy checkpointing avoids I/O
• All database changes halted!
• BI buffer pool flushed

Slide 27: Checkpoint (with -directio)
[Diagram labels: BI Cluster Layout; BI Files (4, 2, 3, 1); Database; -B buffer pool; unbuffered I/O]
• All database changes halted!
• Db buffer pool scanned
• Db buffers marked for chkpt are written out
• Dirty buffers are marked for chkpt & put on checkpoint queue
• Fuzzy checkpointing avoids I/O
• BI buffer pool flushed

Slide 28: The APW
The APWs help w/checkpoints too
[Diagram labels: APW; db; APW Queue; Checkpoint Queue; -B Buffer Pool]
PROMON: Buffers Flushed at checkpoint; BIW Writes

Slide 29: Checkpoint – Size Does Matter
• Larger cluster sizes
  Fewer checkpoints (sync points)
    – Will a crash result in additional lost data?
  Longer recovery time
    – Recovery starts at last cluster - 1
  Longer BI format time (runtime)
  Longer BI format time after truncate
    – Use at least one fixed length extent
      • Also use a variable length extent
    – Use bigrow

Slide 30: Checkpoints and Promon
Seeing is believing…
[PROMON checkpoint table. Columns: Ckpt No., Time, Len, Freq, Dirty, and Database Writes (CPT Q, Scan, APW Q, Flushes). Callout: Ooops!!]

Slide 31: Checkpoints and Promon
Seeing is believing…
[PROMON checkpoint table. Columns: Ckpt No., Time, Len, Freq, Dirty, and Database Writes (CPT Q, Scan, APW Q, Flushes)]
• Len: begin to end time. Time the cluster was actively available for writes
• Freq: begin time to begin time. Time between checkpoints
• Dirty: # data blocks newly updated; not incremented when “made dirtier”
• Time spent performing the checkpoint operation: Freq - Len
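
The last relationship is a plain subtraction. A minimal Python sketch, with a hypothetical function name and made-up sample values rather than figures from the slide:

```python
def checkpoint_seconds(freq_s: int, len_s: int) -> int:
    """Time spent in the checkpoint operation itself: Freq - Len.

    freq_s: seconds from one checkpoint's begin to the next one's begin.
    len_s:  seconds the cluster was actively available for writes.
    """
    if len_s > freq_s:
        raise ValueError("Len cannot exceed Freq")
    return freq_s - len_s


if __name__ == "__main__":
    # Hypothetical row: checkpoints 62 s apart, cluster writable for 60 s.
    print(checkpoint_seconds(62, 60))
```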

Slide 32: Checkpoints and Promon
APW specific activity…
[PROMON checkpoint table. Columns: Ckpt No., Time, Len, Freq, Dirty, and Database Writes (CPT Q, Scan, APW Q, Flushes)]
• CPT Q: # data buffers APW wrote from the checkpoint queue (from prev chkpt)
• Scan: # data buffers APW wrote while scanning -B
• APW Q: # data buffers APW wrote from the APW Q
  Dirty buffers are added to the APW Q on -B LRU eviction

Slide 33: Checkpoints and Promon
To be avoided…
[PROMON checkpoint table. Columns: Ckpt No., Time, Len, Freq, Dirty, and Database Writes (CPT Q, Scan, APW Q, Flushes)]
• Flushes: number of blocks written during checkpoint (marked from previous checkpoint)
• Len: checkpointing too often should be avoided

Slide 34: Reusing space in the BI file

Slide 35: BI Space Reuse
[Diagram: BI Files]

Slide 36: BI Space Reuse
[Diagram: BI Files]

Slide 37: BI Space Reuse
[Diagram: BI Files]
When can BI space be reused?
• No need to “age” a cluster anymore
  -G 0 vs -G 60
  Thanks, fdatasync()
• No open transactions in cluster
  Why? Checkpoint DOES NOT close a cluster!!
  If an outstanding transaction were to roll back, where would the undo action come from?
• Changes have been written to data files
BI files grow to some working set size

Slide 38: Rollback

Slide 39: Rollback Processing
[Diagram labels: BI buffer pool (-bibufs 10) with Modified Queue, Current Output Buffer, Free List, Current Input Buffer and two Backout Buffers; BI; .lbi]
PROMON: Input buffer hits; Output buffer hits; Mod buffer hits; Busy buffer waits; Total BI Reads; Notes read
• ABL sub-transaction rollback: ABL requests a compensating action
• Read backwards & UNDO until tx begin
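
The backward scan can be sketched as follows. This Python example is a simplified illustration, not the engine's rollback code: the note layout (dictionaries with `txn`, `kind`, `key`, `before`) is hypothetical, and it leaves out buffering and any notes generated while undoing.

```python
def rollback(log, txn, db):
    """Undo one transaction by reading the log backwards.

    Each update note carries the value the item had before the change.
    The scan stops at the transaction's begin note. Returns the number
    of updates undone.
    """
    undone = 0
    for note in reversed(log):
        if note["txn"] != txn:
            continue                      # another transaction's note
        if note["kind"] == "begin":
            break                         # reached tx begin: done
        if note["kind"] == "update":
            db[note["key"]] = note["before"]
            undone += 1
    return undone


if __name__ == "__main__":
    log = [
        {"txn": "t1", "kind": "begin"},
        {"txn": "t1", "kind": "update", "key": "x", "before": 1},
        {"txn": "t2", "kind": "begin"},
        {"txn": "t2", "kind": "update", "key": "y", "before": 10},
        {"txn": "t1", "kind": "update", "key": "x", "before": 2},
    ]
    db = {"x": 3, "y": 20}
    rollback(log, "t1", db)
    print(db)
```

Undoing in reverse order matters here: the later change to `x` is backed out first, so the item ends at the value it held before the transaction began, and the other transaction's change to `y` is left alone.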

Slide 40: What about BOB?
[Diagram labels: BI buffer pool (-bibufs 10) with Modified Queue, Free List, Current Output Buffer, Current Input Buffer and two Backout Buffers; BI]
PROMON: Input buffer hits; Output buffer hits; Mod buffer hits; BO Buffer hits

Slide 41: Crash Recovery

Slide 42: BI Note Types
• Physical (purely physical)
  Database extend and raising HWM
  Block chain manipulations
  Do not participate in rollback
  Participate in physical crash recovery
• Logical (purely logical)
  Changes not relying on physical state (dynamic)
  Index sub-operations
  Rollback & logical part of crash recovery
• Physiological (most popular)

Slide 43: Crash Recovery
• Performed on each database startup
  Only needed phases performed
• Brings DB up to last known consistent state
  Physically sound
  In-flight transactions rolled back
  Missing committed transactions re-applied

Slide 44: Physical Redo
Bring DB up to point of crash
[Diagram labels: Before-Image Log; Oldest active txn; Last Recorded Note; redo phase - forward scan]
• Find last active cluster and back up one
• Apply notes based on updctr
• No BI notes generated during redo
Log messages:
  *** Begin Physical Redo Phase, 4 at 0.
  *** Physical Redo Phase Completed at block, off, upd…
  *** At end of Physical Redo, txn table is 128
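
How an update counter makes redo safe to repeat can be sketched as below. This is a toy Python model under stated assumptions: each note stores the counter value its block has after the change, and a note is re-applied only when the block on disk is older than that. The field names (`dbkey`, `updctr`, `after`) and the dictionary layout are illustrative, not the actual note or block format.

```python
def physical_redo(notes, blocks):
    """Forward scan over the log, re-applying only missing changes.

    A block whose update counter is already at or past the note's
    counter saw that change before the crash, so the note is skipped.
    Returns the number of notes applied.
    """
    applied = 0
    for note in notes:                       # notes are in order of operation
        block = blocks[note["dbkey"]]
        if block["updctr"] < note["updctr"]:
            block["value"] = note["after"]
            block["updctr"] = note["updctr"]
            applied += 1
    return applied


if __name__ == "__main__":
    # Block 1 reached disk with two changes; block 2 never reached disk.
    blocks = {
        1: {"updctr": 2, "value": "b"},
        2: {"updctr": 0, "value": None},
    }
    notes = [
        {"dbkey": 1, "updctr": 1, "after": "a"},
        {"dbkey": 1, "updctr": 2, "after": "b"},
        {"dbkey": 2, "updctr": 1, "after": "x"},
        {"dbkey": 1, "updctr": 3, "after": "c"},
    ]
    print(physical_redo(notes, blocks), blocks)
```

Running the same scan a second time applies nothing, which is the property that lets recovery restart safely if it is itself interrupted.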

Slide 45: Physical Undo
Back out physical DB changes (if needed)
[Diagram labels: Before-Image Log; Oldest active txn; Last Note; redo phase - forward scan; Physical undo]
• Starts at crash point
• Undo physical and physiological notes
• Causes new BI notes to be generated
• Ends when the first transaction end is encountered
Log messages:
  *** Begin Physical Undo 10 txns at block 128 offset 1608
  *** Physical Undo Completed at 128 (block #)

Slide 46: Logical Undo
Back out all uncommitted transactions
[Diagram labels: Before-Image Log; Oldest active txn; Last Note; redo phase - forward scan; Physical undo; Logical undo backward scan]
• Starts where physical undo left off
• Undo logical and physiological notes
Log messages:
  *** Begin Logical Undo Phase, 10 incomplete txns are being backed out.
  *** Logical Undo Phase begin at Block 1136 offset
  *** Logical Undo Phase Completed at Block 1135 offset

Slide 47: Agenda
• The BI Units of Measure
• Some Simple Rules
• General Processing
• Reliability Switches
• Summary

Slide 48: Switches: Reliability and Integrity
• -I : No longer a valid parameter.
  Never had anything to do with crash recovery
• -R : Default - Reliable BI I/O
  Writes bypass the FS cache
  Use for OLTP
Log messages:
  *** Before-Image File I/O (-r -R): Reliable.
  *** Crash Recovery (-i): Enabled.

Slide 49: Switches: Reliability and Integrity
• -r : BI writes are buffered (un-reliable) to FS
  Well tuned system overshadows any gain of -r
  All notes recorded
  Rollback will work
  Crash recovery likely to work
  Recovery from OS crash will most likely fail
Log messages:
  *** This session is running with the non-raw (-r) parameter.
  *** Before-Image File I/O (-r -R): Not Reliable.
  *** Crash Recovery (-i): Enabled.
  *** An earlier -r session crashed, the database may be damaged.

Slide 50: Switches: Reliability and Integrity
• -i : Does not record purely physical notes
  BI I/O is buffered (un-reliable) to FS
  No FS sync at checkpoint
  Rollback will work.
  OS or DB crash, abnormal termination
    – Must restore from backup
Log messages:
  *** This session is being run with the no-integrity (-i) option.
  *** Crash Recovery (-i): Not Enabled.
  *** Before-Image File I/O (-r -R): Not Reliable.
Why provide it then?

Slide 51: Switches: Last Resort
• -F (dash Foolish)
  Enter DB without recovery
  Use as a last resort
  Integrity NOT maintained
  Usually need to
    – Validate data integrity
    – Dump and load

Slide 52: Agenda
• The BI Units of Measure
• Some Simple Rules
• General Processing
• Reliability Switches
• Summary

Slide 53: Summary
• Recovery is a complex thing
• You can do things to improve the process
• We make it simple for you

Slide 54: Questions?
[Diagram labels: BI buffer pool (-bibufs 10) with Modified Queue, Current Out Buffer and Free List; BI; APW; db; Checkpoint Queue; Associated BI Note]

Slide 55: Thank you for your time!


Slide 57: Other recovery related switches
• -bi
• -biblocksize
• -directio
  No need for sync at checkpoint time
• -bwdelay
• -bibufs, -aibufs
• -bistall, -bithold

Slide 58: Switches: Transactions
• -Mf : Delayed commit
  # seconds a commit note can reside in -bibufs
  Some commits lost / integrity maintained
• Group commit technique
  -groupdelay only runs w/ -Mf 0
  Only in multi-user mode
  # milliseconds to sleep at commit time
• -G : # seconds to age cluster (use & re-use)
  No longer needed with fdatasync()