Published by Harold Webb. Modified over 9 years ago.
1
Joonwon Lee joon@kaist.ac.kr Recovery
2
Lightweight Recoverable Virtual Memory Rio Vista
3
Introduction
failure
– when a system does not perform in the manner defined
erroneous state
– state that could lead the system to a failure
fault
– anomalous physical condition
– causes
  – design/manufacturing error
  – damage/fatigue
  – external disturbance
faults lead the system to an erroneous state, which may or may not result in a failure
4
Failures
process failure
– deadlock, timeout, protection violation, ...
– OS should confine this failure to the process
system failure
– software and hardware
– amnesia failure: cannot recover the state just before the failure
– pause failure: the state can be reinstated
– halting failure: the system never restarts
disk failure
– serious problem when it is the last backup storage
– usually backed up by tape, OR
– mirrored (it will enhance read throughput anyway)
communication medium failure
– does not cause total system failure
5
Error Recovery
Forward Error Recovery
– allow the process to proceed after fixing errors
– difficult to remove all the errors (in software, procedures to cope with every kind of error would have to be prepared, which is almost impossible)
Backward Error Recovery
– the process restarts from a saved (or predefined) state
– a roll-back mechanism is needed
– easy to cope with any kind of error (it is not necessary to anticipate all kinds of errors)
– overhead to restore the previous state
  – checkpointing is needed
– the same error may occur again
6
Backward Error Recovery
Operation-based approach
– using a log, undo (roll back) what has been done until an error-free state is restored
– write-ahead log (for a write to X)
  – records the new value of X in a log
  – then updates X
State-based approach
– checkpoint the complete state of a process
  – at crash, roll back to the most recent safe state
– needs many checkpoints
– shadow page
  – a copy of a page that is to be updated
  – updates are done only on the original page
  – at crash, go back to the shadow page
  – at commit, keep using the original page
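The operation-based approach can be sketched as a small in-memory store. This is only an illustration: the names `LoggedStore`, `mark`, and `rollback` are hypothetical, and the sketch assumes the log also keeps the old value of X, since undo needs something to restore.

```python
_ABSENT = object()  # sentinel: the key did not exist before the write


class LoggedStore:
    """Toy key-value store that logs each update before applying it."""

    def __init__(self):
        self.data = {}
        self.log = []  # entries: (key, old_value, new_value)

    def write(self, key, new_value):
        old_value = self.data.get(key, _ABSENT)
        self.log.append((key, old_value, new_value))  # 1. log first
        self.data[key] = new_value                    # 2. then update X

    def mark(self):
        """Return a log position that stands for an error-free state."""
        return len(self.log)

    def rollback(self, mark):
        """Undo logged writes, newest first, back to the given mark."""
        while len(self.log) > mark:
            key, old_value, _ = self.log.pop()
            if old_value is _ABSENT:
                self.data.pop(key, None)
            else:
                self.data[key] = old_value


store = LoggedStore()
store.write("x", 1)
safe = store.mark()      # state known to be good
store.write("x", 2)
store.write("y", 7)
store.rollback(safe)     # error detected: undo back to the mark
```

After the rollback only the first write survives. A real log would live on stable storage; here it is a Python list.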
7
Issues in Recovery (1)
failure and recovery of a process affect other processes that exchange data with the failed process
orphan message
– arises when a process rolls back to a point before it sent out a message
– actions of other processes that depend on the orphan message should be rolled back, too (domino effect)
lost message
– node Y receives a message from X
– Y rolls back to a point before receiving the message
– the effect is the same as when the message is lost
8
Issues in Recovery (2)
livelocks
– Y sends out m1, receives an orphan message n1, and rolls back
– m1 becomes an orphan message
– having received m1, X rolls back
[Figure: timelines of X and Y exchanging m1 and n1. 1. failure, and roll back. 2. orphan message, roll back.]
9
Checkpoints
local checkpoint
– snapshot of a single node
– superscalar CPUs and out-of-order memory operations make checkpointing difficult
global checkpoint
– strongly consistent set of checkpoints
  – all the checkpoints are inside a given interval
  – no information is exchanged between any processes during this interval
  – this is the last place any process should roll back to
10
Checkpoints (2)
– consistent set of checkpoints
  – a message recorded as "received" in one checkpoint should be recorded as "sent" in another checkpoint
    – no orphan message
  – a message recorded as "sent" may NOT be recorded as "received" in another checkpoint
    – possible lost message
simple to make this set
– take a checkpoint after sending every message
– or after sending N messages, for better efficiency but with more chance of a domino effect
lost messages can be dealt with as in other network protocols
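The consistency rule above can be checked mechanically. The sketch below assumes each checkpoint is represented as a pair of sets of message ids, which is a hypothetical encoding chosen for illustration, not something the slides prescribe.

```python
def find_orphans(checkpoints):
    """Messages recorded as received somewhere but sent nowhere."""
    sent = set().union(*(c["sent"] for c in checkpoints.values()))
    received = set().union(*(c["received"] for c in checkpoints.values()))
    return received - sent


def find_lost(checkpoints):
    """Messages recorded as sent but not recorded as received (tolerated)."""
    sent = set().union(*(c["sent"] for c in checkpoints.values()))
    received = set().union(*(c["received"] for c in checkpoints.values()))
    return sent - received


def is_consistent(checkpoints):
    """A set of checkpoints is consistent when it has no orphan message."""
    return not find_orphans(checkpoints)


# X's checkpoint was taken after sending m1 and m2; Y's after receiving m1 only.
good = {
    "X": {"sent": {"m1", "m2"}, "received": set()},
    "Y": {"sent": set(), "received": {"m1"}},
}
# Here Y's checkpoint records m2 as received, but X's does not record sending it.
bad = {
    "X": {"sent": {"m1"}, "received": set()},
    "Y": {"sent": set(), "received": {"m1", "m2"}},
}
```

In `good`, m2 is a possible lost message but the set is still consistent; in `bad`, m2 is an orphan.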
11
Synchronous Checkpointing
Assumption
– FIFO delivery of messages
– no lost messages
Operations
– an initiating node P broadcasts a message
– all the other nodes
  – take temporary checkpoints if necessary
  – reply OK to P
  – do not send any message until they hear from P
– P broadcasts either
  – GO: if all the nodes reply OK to P
  – Fail: otherwise
– Nodes
  – make the temporary checkpoint permanent or discard it
  – start to send messages from this point
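The two rounds above can be sketched as a single-process simulation. The class `Node` and the function `coordinate` are hypothetical names; message passing is replaced by direct method calls, so FIFO delivery and loss-free links are assumed rather than modeled.

```python
class Node:
    def __init__(self, name, state, can_checkpoint=True):
        self.name = name
        self.state = state
        self.can_checkpoint = can_checkpoint
        self.tentative = None        # temporary checkpoint
        self.permanent = None        # last committed checkpoint
        self.sending_blocked = False

    def on_request(self):
        """Round 1: take a temporary checkpoint and reply OK (True) or not."""
        if not self.can_checkpoint:
            return False
        self.tentative = self.state
        self.sending_blocked = True  # no messages until P decides
        return True

    def on_decision(self, go):
        """Round 2: make the temporary checkpoint permanent or discard it."""
        if go and self.tentative is not None:
            self.permanent = self.tentative
        self.tentative = None
        self.sending_blocked = False  # normal messages may resume


def coordinate(initiator, others):
    """Run both rounds from initiator P; return True for GO, False for Fail."""
    nodes = [initiator] + list(others)
    replies = [n.on_request() for n in nodes]
    go = all(replies)
    for n in nodes:
        n.on_decision(go)
    return go


p, q, r = Node("P", 10), Node("Q", 20), Node("R", 30)
first = coordinate(p, [q, r])                     # everyone replies OK

q.state, r.can_checkpoint = 21, False
second = coordinate(p, [q, r])                    # R cannot checkpoint
```

After the failed second round every node keeps the checkpoint committed by the first round.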
12
Synchronous Checkpointing
advantages
– easy recovery: all processes restart from the checkpoint
disadvantages
– message overhead
– hinders normal progress (no computational messages are allowed during checkpointing)
13
Asynchronous Checkpointing
the checkpoint at each node is made independently
– no guarantee of a consistent set
– recovery is complex: it must find the nearest consistent set
optimization: all incoming messages are logged after a checkpoint
– the recovery algorithm analyzes the log and finds the most recent consistent set of checkpoints
14
Asynchronous Checkpointing (2)
Y crashes
– Y restarts from the last checkpoint
– Y sends ROLLBACK(Y,2) to X, since the last checkpoint records that Y has sent 2 msgs to X, and ROLLBACK(Y,1) to Z (red lines)
other nodes send back ROLLBACK msgs similarly (blue lines)
– X sends out (X,2) and (X,0) to Y and Z, respectively
each node sets its chkpnt so as to prevent orphan msgs (red brackets)
– the number of msgs received from i recorded in the chkpnt is at most N, where a ROLLBACK(i,N) msg has arrived
loop until a consistent set of checkpoints comes up
– bounded by N (?)
[Figure: timelines of X, Y, and Z, with checkpoints marked by brackets and the crash marked by x]
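The rollback loop can be sketched as a fixed-point computation over per-node checkpoint lists. The data layout and the function name `recovery_line` are assumptions made for illustration. The sketch treats a received count equal to N as safe, since an orphan exists only when more messages are recorded as received than as sent, and it assumes every node has an initial checkpoint with no messages.

```python
def recovery_line(checkpoints):
    """checkpoints[node] is a list, oldest first, of
    {"sent": {peer: count}, "recv": {peer: count}}.
    Returns {node: index of the checkpoint to restart from}."""
    chosen = {node: len(cps) - 1 for node, cps in checkpoints.items()}
    changed = True
    while changed:                      # loop until no node has to move
        changed = False
        for sender in checkpoints:
            sent = checkpoints[sender][chosen[sender]]["sent"]
            for receiver in checkpoints:
                if receiver == sender:
                    continue
                limit = sent.get(receiver, 0)   # N in ROLLBACK(sender, N)
                idx = chosen[receiver]
                # Step back while the checkpoint records an orphan message.
                while idx > 0 and checkpoints[receiver][idx]["recv"].get(sender, 0) > limit:
                    idx -= 1
                if idx != chosen[receiver]:
                    chosen[receiver] = idx
                    changed = True
    return chosen


empty = {"sent": {}, "recv": {}}
history = {
    "X": [empty,
          {"sent": {"Y": 1}, "recv": {"Y": 2}},
          {"sent": {"Y": 2}, "recv": {"Y": 3}}],   # saw a 3rd msg from Y
    "Y": [empty,
          {"sent": {"X": 2, "Z": 1}, "recv": {"X": 1}}],  # Y restarts here
    "Z": [empty,
          {"sent": {}, "recv": {"Y": 1}}],
}
line = recovery_line(history)
```

In this made-up history X's latest checkpoint records three messages from Y, while Y's checkpoint records only two sent, so X steps back one checkpoint. The loop terminates because indices only decrease.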
15
Free Transactions with Rio Vista
crash taxonomy
– hardware: not frequent
– software: frequent, due to bugs in the OS
– power: UPS
motivations
– transactions are useful but have high overhead (disk accesses)
– file cache is useful, but vulnerable to system crashes
16
Traditional Approach: RVM
at the beginning of a transaction, RVM copies the page to the undo log (shadow page)
– user abort is serviced by the undo log
at commit, RVM reclaims undo space and writes updated pages to the redo log on disk
– system/process failure is serviced by the redo log
at leisure time, the database is updated from the redo log
17
Rio file cache
protect cached data from system crashes
– cache is as reliable as a disk
– then, a write-ahead log for recovery is not needed
– writes to disk can be delayed indefinitely
OS errors can corrupt any part of the system
– the issue is how to reduce the chance of corruption at a crash
– a warm reboot process writes the cache to disk
18
file cache vs disk
why do people view memory as more vulnerable than disk?
– a memory access is a simple write
  – an error in the address bits will overwrite the file cache
– the interface to access a disk is complex and explicit
  – the hardware controller is accessed only through the device driver
  – calls to device drivers are checked for their arguments
  – it is extremely unlikely that an accidental error will mimic the logic of a device driver
19
How to protect from system crashes?
prevent the OS from accidentally overwriting the file cache
virtual memory mapping
– turn off the write-permission bits in the page table for the pages in the file cache
– unauthorized accesses will encounter a protection violation
– the file cache module enables the bit before writing and disables it afterwards
the file cache is vulnerable to crashes while being written
– disk has the same problem
– solutions
  – verify after writes
  – use a shadow copy for atomic writes
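The enable/disable discipline around file cache writes can be modeled in user space. This is only a toy: a boolean stands in for the page-table write-permission bit, and `ProtectedPage`, `authorized_write`, and `ProtectionViolation` are hypothetical names, not part of Rio.

```python
from contextlib import contextmanager


class ProtectionViolation(Exception):
    """Raised when a write hits a page whose write bit is off."""


class ProtectedPage:
    def __init__(self, size):
        self._data = bytearray(size)
        self._writable = False          # models the write-permission bit

    def store(self, offset, payload):
        if not self._writable:
            raise ProtectionViolation("write to protected file cache page")
        self._data[offset:offset + len(payload)] = payload

    def read(self):
        return bytes(self._data)


@contextmanager
def authorized_write(page):
    """Enable the bit before writing and disable it afterwards."""
    page._writable = True
    try:
        yield page
    finally:
        page._writable = False


page = ProtectedPage(8)
with authorized_write(page) as p:       # the file cache module's own write
    p.store(0, b"data")

try:
    page.store(0, b"junk")              # a stray write from elsewhere
    stray_write_blocked = False
except ProtectionViolation:
    stray_write_blocked = True
```

The `finally` clause turns the bit off again even if the write itself fails, which keeps the window of vulnerability limited to the write.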
20
How to protect from system crashes?
some kernels bypass the address translations (TLB)
– many systems can disable such bypasses
– otherwise, code insertion (sandboxing)
  – check every kernel write that uses a physical address
  – 20-50% slower
memory-mapped file
– kernel procedures that modify the memory-mapped file should be changed as above
– a faulty user program can still corrupt files to which it has write access
21
Warm Reboot
Recovery needs to access many data structures
– internal file cache lists
– page tables (memory-mapped files)
– all these data must be protected from a crash, but they are scattered inside the kernel
Registry
– a separate physical memory region
– contains all the information needed to recover the file cache
– it is updated only when a buffer is replaced (reloaded)
22
File System Modifications
writes to disk can be avoided
– most disk writes are reliability-induced
writes to disk are needed only when the file cache overflows
writing back dirty copies when the system is idle
– reduces the time taken when a buffer is replaced
23
Vista Recoverable Memory
24
Recovery
operations
– prepare an undo log
– write directly to the DB's mapped image in Rio
  – these updates are persistent
– at commit, discard the undo log
– at abort, restore the undo log to the mapped DB
at recovery
– Rio writes back Vista segments that were mapped at the time of the crash
– Vista examines the segment
  – if there are any uncommitted transactions, roll back (restore the undo log)
– the recovery process should be idempotent
  – a crash can happen while recovering
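The operations above can be sketched with a bytearray standing in for the mapped DB image. `Segment`, `set_range`, and the other names are hypothetical, and the sketch keeps the undo log in ordinary memory, whereas Vista relies on Rio to make it survive a crash.

```python
class Segment:
    def __init__(self, size):
        self.image = bytearray(size)  # stands in for the DB image mapped in Rio
        self.undo = []                # (offset, before-image), oldest first

    def set_range(self, offset, length):
        """Prepare the undo log: save the bytes about to be overwritten."""
        self.undo.append((offset, bytes(self.image[offset:offset + length])))

    def write(self, offset, payload):
        """Update the mapped image directly; no redo log is kept."""
        self.image[offset:offset + len(payload)] = payload

    def commit(self):
        self.undo.clear()             # at commit, discard the undo log

    def _apply_undo(self):
        # Newest record first, so the oldest before-image wins on overlap.
        for offset, before in reversed(self.undo):
            self.image[offset:offset + len(before)] = before

    def abort(self):
        self._apply_undo()
        self.undo.clear()

    # Recovery of an uncommitted transaction does the same work as abort.
    recover = abort


seg = Segment(8)
seg.image[:] = b"oldvalue"
seg.set_range(0, 3)
seg.write(0, b"NEW")                  # uncommitted update, then a "crash"

seg._apply_undo()                     # recovery interrupted before log clear
seg.recover()                         # recovery runs again from the start
```

Because the log is cleared only after every before-image has been restored, repeating an interrupted recovery yields the same image, which is the idempotence the slide asks for.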
25
Persistent Heap
only transactions can use it
– when they abort, all the heap space they used is returned
undo records mentioned above are stored here
programs can store their original data structures
– usually they convert them to record style when stored in a file
metadata for the heap is in user space
– why?
– needs protection from corruption
  – reduce the risk by using an isolated range of addresses
  – software fault isolation
  – virtual memory protection
26
Fault Tolerance with DSM
DSM maintains multiple copies of a page
– if a copy is lost, it can be recovered from another copy
maintain at least two copies of each page
– copes with a single failure
– can be extended to cope with n failures
what about state information?
– can be rebuilt