Bringing Undo to system admin: a new paradigm for recovery

Presentation transcript:

Bringing Undo to system admin: a new paradigm for recovery
Aaron Brown
UC Berkeley CS Division
abrown@cs.berkeley.edu
http://roc.cs.berkeley.edu

Motivation
- Recovery is important
  - people screw up
  - software and hardware break
  - upgrades fail
  - hackers break in
  - etc.
- ... and sysadmins have to clean up the mess
  - can we make life easier?

What makes recovery easy?
- Not having to think about it beforehand
  - do you have a backup strategy to handle your typos?
- Having a consistent strategy system-wide
  - no trying to disambiguate user/system data
- Being familiar with it
  - recovery: it’s not just for catastrophes anymore
  - easy recovery => more freedom to experiment, learn
- Not having to do it at all
  - export recovery to users
- This is not what we have today!

Undo: a new recovery paradigm
- Make system recovery as painless and natural as undoing mistakes in a word processor
- Continuous recovery with undo: the 3 R’s
  - Rewind: roll system state backwards to any time point
  - Repair: fix problem; reconfigure to avoid problem
  - Redo: roll system state forward, replaying user interactions lost during rewind
- Users are familiar with the model

Speaker notes: The reason we call it undo is that it is more like a word processor, where you don’t sit around explicitly defining backups to recover from typos. We’re all familiar with undo in editors and word processors. It’s a much more natural recovery paradigm than what we do today for system recovery. While working in your editor, I doubt many of you worry about a backup strategy for your edits or are afraid to try things.
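The three R's can be sketched on a toy system. This is only an illustration under stated assumptions, not the project's implementation: the class UndoableSystem, the helper deliver, and the idea of rewinding by replaying a log from an empty state are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Action = Callable[[State], None]
LogEntry = Tuple[int, Action]


@dataclass
class UndoableSystem:
    """Hypothetical toy system: state is a dict, every user action is logged."""
    state: State = field(default_factory=dict)
    log: List[LogEntry] = field(default_factory=list)
    clock: int = 0

    def apply(self, action: Action) -> int:
        """Run a user action and record it so it can be replayed later."""
        self.clock += 1
        action(self.state)
        self.log.append((self.clock, action))
        return self.clock

    def rewind(self, t: int) -> List[LogEntry]:
        """Rewind: rebuild the state as of time t; return the actions lost."""
        kept = [(ts, a) for ts, a in self.log if ts <= t]
        lost = [(ts, a) for ts, a in self.log if ts > t]
        self.state = {}
        for _, action in kept:
            action(self.state)
        self.log = kept
        return lost

    def redo(self, lost: List[LogEntry], skip: Tuple[int, ...] = ()) -> None:
        """Redo: replay lost actions, leaving out the ones removed by repair."""
        for ts, action in lost:
            if ts not in skip:
                action(self.state)
                self.log.append((ts, action))


def deliver(msg_id: str, body: str) -> Action:
    """Hypothetical user action: store one mail message."""
    def action(state: State) -> None:
        state[msg_id] = body
    return action


def bad_upgrade(state: State) -> None:
    """Hypothetical operator mistake: an upgrade that wipes all mail."""
    state.clear()


mail = UndoableSystem()
mail.apply(deliver("msg1", "hello"))
bad = mail.apply(bad_upgrade)
mail.apply(deliver("msg2", "world"))
damaged = dict(mail.state)          # msg1 is gone at this point

lost = mail.rewind(bad - 1)         # Rewind to just before the mistake
mail.redo(lost, skip=(bad,))        # Repair (drop the upgrade), then Redo
```

After the redo step, both messages are present again: the message delivered after the bad upgrade is not lost, which is the point of replaying user interactions instead of simply restoring an old snapshot.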

Undo makes recovery easy
- No explicit definition of recovery points
- Covers system and user data
  - repair corruption, virus damage, trojans, ...
- Redo means no loss of user data on rollback
- Provides forgiving environment
  - encourages learning via experimentation
- Can export to users

Status
- Now: defining the conceptual model
  - input welcome!
  - would undo improve your life?
  - where would you like to see it?
- Next: studying implementation techniques
  - no-overwrite storage
  - logging of state and user actions
  - using dependencies between state to guide rollback
- Goals:
  - proof-of-concept implementation (email service)
  - set of design guidelines for building undo-recoverable systems
  - if possible, an API and infrastructure for undoable systems
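As a rough illustration of two of the techniques listed under "Next", the sketch below combines no-overwrite storage with dependency-guided rollback. Everything in it is an assumption made for the example: the class NoOverwriteStore, the read_keys parameter for declaring what a write depended on, and the choice to hide tainted versions instead of deleting them. The talk itself describes these techniques only as topics still under study.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


class NoOverwriteStore:
    """Hypothetical append-only store: a write never destroys an older version."""

    def __init__(self) -> None:
        self.clock = 0
        self.versions: Dict[str, List[Tuple[int, str]]] = {}  # key -> (time, value)
        self.deps: Dict[int, Set[int]] = {}  # write time -> write times it read from
        self.hidden: Set[int] = set()        # writes rolled back (kept, but ignored)

    def _latest(self, key: str, at: Optional[int] = None) -> Optional[Tuple[int, str]]:
        """Newest visible version of key, optionally as of time point `at`."""
        found = None
        for ts, value in self.versions.get(key, []):
            if at is not None and ts > at:
                break
            if ts not in self.hidden:
                found = (ts, value)
        return found

    def write(self, key: str, value: str, read_keys: Iterable[str] = ()) -> int:
        """Append a new version and record which earlier writes it was derived from."""
        self.clock += 1
        sources = [self._latest(k) for k in read_keys]
        self.deps[self.clock] = {src[0] for src in sources if src is not None}
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read(self, key: str, at: Optional[int] = None) -> Optional[str]:
        found = self._latest(key, at)
        return None if found is None else found[1]

    def tainted_by(self, bad_ts: int) -> Set[int]:
        """All writes that transitively depend on the write made at bad_ts."""
        tainted = {bad_ts}
        for ts in sorted(self.deps):  # dependencies only point to earlier writes
            if self.deps[ts] & tainted:
                tainted.add(ts)
        return tainted

    def rollback(self, bad_ts: int) -> Set[int]:
        """Hide the bad write and everything derived from it; keep unrelated writes."""
        tainted = self.tainted_by(bad_ts)
        self.hidden |= tainted
        return tainted


store = NoOverwriteStore()
t1 = store.write("filter", "keep all")
t2 = store.write("inbox:amy", "1 message", read_keys=["filter"])
t3 = store.write("filter", "drop all")                            # the mistake
t4 = store.write("inbox:bob", "0 messages", read_keys=["filter"])
t5 = store.write("inbox:amy", "2 messages", read_keys=["inbox:amy"])

tainted = store.rollback(t3)
```

Here the rollback hides the bad filter setting and the mailbox state derived from it, while the later, unrelated update to the other mailbox survives. Because old versions are never overwritten, reading at an earlier time point (for example `store.read("inbox:amy", at=t2)`) needs no explicitly defined recovery point.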

Contact
Aaron Brown, UC Berkeley
abrown@cs.berkeley.edu
This work is part of the ROC (Recovery-Oriented Computing) Project, run by Dave Patterson
http://roc.cs.berkeley.edu