Building Undo for Operators: An Undoable E-mail Store Aaron Brown ROC Research Group University of California, Berkeley Winter 2003 ROC Research Retreat.

Slides:



Advertisements
Similar presentations
Parallel and Distributed Simulation Time Warp: Basic Algorithm.
Advertisements

Remote Procedure Call (RPC)
Fabián E. Bustamante, Winter 2006 Recovery Oriented Computing Embracing Failure A. B. Brown and D. A. Patterson, Embracing failure: a case for recovery-
Operating-System Structures
Approaches to EJB Replication. Overview J2EE architecture –EJB, components, services Replication –Clustering, container, application Conclusions –Advantages.
Distributed Systems Fall 2010 Replication Fall 20105DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
 Guy Jacob  Roee Shapiro – Project A Spring, 2008 INFINI DRIVE  Project Supervisor: Hai Vortman  Lab Chief Engineer: Dr. Ilana David.
How Clients and Servers Work Together. Objectives Web Server Protocols Examine how server and client software work Use FTP to transfer files Initiate.
CS 582 / CMPE 481 Distributed Systems Fault Tolerance.
Lecture 6 – Google File System (GFS) CSE 490h – Introduction to Distributed Computing, Winter 2008 Except as otherwise noted, the content of this presentation.
Tentative Updates in MINO Steven Czerwinski Jeff Pang Anthony Joseph John Kubiatowicz ROC Winter Retreat January 13, 2002.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Wide-area cooperative storage with CFS
Backup and Recovery Part 1.
CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 Porcupine: A Highly Available Cluster- based Mail Service Y. Saito, B. Bershad, H. Levy U. Washington.
Architecture of SMTP, POP, IMAP, MIME.
Pro Exchange SPAM Filter An Exchange 2000 based spam filtering solution.
Case Study - GFS.
Presenter: Chi-Hung Lu 1. Problems Distributed applications are hard to validate Distribution of application state across many distinct execution environments.
What is in Presentation What is IPsec Why is IPsec Important IPsec Protocols IPsec Architecture How to Implement IPsec in linux.
What Can IP Do? Deliver datagrams to hosts – The IP address in a datagram header identify a host IP treats a computer as an endpoint of communication Best.
Transactions and Reliability. File system components Disk management Naming Reliability  What are the reliability issues in file systems? Security.
WORKFLOW IN MOBILE ENVIRONMENT. WHAT IS WORKFLOW ?  WORKFLOW IS A COLLECTION OF TASKS ORGANIZED TO ACCOMPLISH SOME BUSINESS PROCESS.  EXAMPLE: Patient.
1 The Google File System Reporter: You-Wei Zhang.
Chapter 17 Networking Dave Bremer Otago Polytechnic, N.Z. ©2008, Prentice Hall Operating Systems: Internals and Design Principles, 6/E William Stallings.
Data Communications and Computer Networks Chapter 2 CS 3830 Lecture 9
IT 424 Networks2 IT 424 Networks2 Ack.: Slides are adapted from the slides of the book: “Computer Networking” – J. Kurose, K. Ross Chapter 2: Application.
Application Layer Protocols Simple Mail Transfer Protocol.
The Hadoop Distributed File System
What is a Protocol A set of definitions and rules defining the method by which data is transferred between two or more entities or systems. The key elements.
Recovery-Oriented Computing User Study Training Materials October 2003.
CH2 System models.
04/18/2005Yan Huang - CSCI5330 Database Implementation – Distributed Database Systems Distributed Database Systems.
Designing a global repository using OceanStore Steven Czerwinski, Anthony Joseph, John Kubiatowicz Summer Retreat June 11, 2002 UC Berkeley.
Undo: Update and Futures Aaron Brown ROC Research Group University of California, Berkeley Summer 2003 ROC Retreat 5 June 2003.
Rewind, Repair, Replay: Three R’s to improve dependability Aaron Brown and David Patterson ROC Research Group University of California at Berkeley SIGOPS.
1 The Internet and Networked Multimedia. 2 Layering  Internet protocols are designed to work in layers, with each layer building on the facilities provided.
(Business) Process Centric Exchanges
Lecture 12 Recoverability and failure. 2 Optimistic Techniques Based on assumption that conflict is rare and more efficient to let transactions proceed.
Cohesion and Coupling CS 4311
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
Presenters: Rezan Amiri Sahar Delroshan
Chapter 38 Persistence Framework with Patterns 1CS6359 Fall 2011 John Cole.
Framework for MDO Studies Amitay Isaacs Center for Aerospace System Design and Engineering IIT Bombay.
Lecture 6: Sun: 8/5/1435 Distributed Applications Lecturer/ Kawther Abas CS- 492 : Distributed system & Parallel Processing.
Evaluating Undo: Human-Aware Recovery Benchmarks Aaron Brown with Leonard Chung, Calvin Ling, and William Kakes January 2004 ROC Retreat.
Robustness in the Salus scalable block store Yang Wang, Manos Kapritsos, Zuocheng Ren, Prince Mahajan, Jeevitha Kirubanandam, Lorenzo Alvisi, and Mike.
DATABASE MANAGEMENT SYSTEM ARCHITECTURE
GFS. Google r Servers are a mix of commodity machines and machines specifically designed for Google m Not necessarily the fastest m Purchases are based.
CS 3830 Day 9 Introduction 1-1. Announcements r Quiz #2 this Friday r Demo prog1 and prog2 together starting this Wednesday 2: Application Layer 2.
GLOBAL EDGE SOFTWERE LTD1 R EMOTE F ILE S HARING - Ardhanareesh Aradhyamath.
File Systems cs550 Operating Systems David Monismith.
AMQP, Message Broker Babu Ram Dawadi. overview Why MOM architecture? Messaging broker like RabbitMQ in brief RabbitMQ AMQP – What is it ?
 Introduction  Architecture NameNode, DataNodes, HDFS Client, CheckpointNode, BackupNode, Snapshots  File I/O Operations and Replica Management File.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
Spheres of Undo: A Framework for Extending Undo Aaron Brown January 2004 ROC Retreat.
Draft-lemonade-imap-submit-00.txt “Forward without Download” Allow IMAP client to include previously- received message (or parts) in or as new message.
Undo for Recovery: Approaches and Models Aaron Brown UC Berkeley ROC Group.
What is a Protocol A set of definitions and rules defining the method by which data is transferred between two or more entities or systems. The key elements.
Networking Applications
WWW and HTTP King Fahd University of Petroleum & Minerals
Behavioral Design Patterns
Undo for Recovery: Approaches and Models
Optimize Your Java Code By Tools
EEC 688/788 Secure and Dependable Computing
Parallel and Distributed Simulation
William Stallings Data and Computer Communications
EEC 688/788 Secure and Dependable Computing
Process-to-Process Delivery: UDP, TCP
Abstractions for Fault Tolerance
Presentation transcript:

Building Undo for Operators: An Undoable Store Aaron Brown ROC Research Group University of California, Berkeley Winter 2003 ROC Research Retreat 14 January 2003

Slide 2 Outline Recap of “Three-R’s” Undo model Implementing Undo for Analysis of undo prototype Conclusions and future directions

Slide 3 Recap: Undo for Operators Dependability is achieved at the interface of the system and its operator Undo goal: support the operator – create a forgiving operations environment » tolerate human error, allow trial-and-error experiments – provide last-resort recovery from unknown damage » retroactive repair of software bugs, broken upgrades, unknown operator changes, external attacks – be easily comprehensible by operator and developer – preserve user data while refreshing system state

Slide 4 Recap: The Three-R’s Undo Model Provide Undo as virtual time travel – Rewind: roll back all system state, physically – Repair: make changes to past timeline to avert original disaster – Replay: roll state forward logically, merging original timeline with effects of repairs Key properties of 3R’s Undo – recovery from problems at any system layer – recovery from unanticipated problems – no assumptions about correct application behavior

Slide 5 Outline Recap of “Three-R’s” Undo model Implementing Undo for Analysis of undo prototype Conclusions and future directions

Slide 6 Building an Undo Prototype Target: an undoable store service – a leaf node in the Internet network – accessed via IMAP and SMTP Built around existing store service – e.g., sendmail+imapd, iPlanet, Exchange Extensible to other apps – architected around a reusable undo core with - specific extensions Written in Java

Slide 7 Undo System Architecture Service Time-travel Storage Undo Manager Verbs control Timeline Log Control UI User Proxy IMAP, SMTP Includes: - user state - application - OS

Slide 8 Architecture Design Points Proxy-based design targeted at services Application-neutral core – undo manager, timeline log, time-travel storage, UI – contains all undo-cycle logic -specific semantics encapsulated – proxy: encodes interactions into verbs – verbs: define semantics » model of preserved state » model of acceptable external (observed) consistency

Slide 9 Verbs Key construct in undo architecture – represent end-user state changes, state exposure – building block of timeline and history – used to detect and manage inconsistencies Essential for reasoning about undo – verbs define exactly what state is preserved during an undo cycle – provide framework for defining application’s model of acceptable external inconsistency

Slide 10 Verbs and the Undo Cycle Verb flow Service Includes: - user state - application - OS Time-travel Storage Undo Manager Verbs control Timeline Log Control UI User Proxy Replay Normal operation

Slide 11 Comparison to Optimistic Replication Insight: undo can be recast as replication— not in space, but in time – two virtual replicas diverge at the Rewind point – verb history defines an update log to one replica – repairs define updates to the other replica – Replay involves reconciling the two update streams, detecting and handling inconsistencies Verb-based design is heavily influenced by replica systems – verbs map to replica system “actions” or “updates”

Slide 12 Comparison to Optimistic Replication (2) But Undo has key differences: – reconciliation between a logical log (verb history) and a physical state (repaired system) » so log-merging schemes (e.g., IceCube) don’t apply – not all inconsistencies are bad » no need to check for inconsistency until externalization » no complex, error-prone guards on verb execution – inconsistencies can be handled after-the-fact » post-conditions on externalized data are sufficient » fewer assumptions of application correctness » if necessary, can rewind undesired inconsistency- producing verb

Slide 13 example: original timeline System boundary System state Verbs History log Time Inbox Folder1 olleH Copy Externalizes:— olleH Fetch m Externalizes:m + Signature(m)=“olleH” Hello olleH Deliver m Externalizes:—  + input “Hello”

Slide 14 olleH Copy Externalizes:— olleH Fetch m Externalizes:m + Signature(m)=“olleH” Hello olleH Deliver m Externalizes:—  + input “Hello” X Hello Deliver Externalizes:— + input “Hello” Hello Deliver m example: replay timeline System boundary System state Verbs History log Time Inbox Folder1 Hello Copy Externalizes:— Hello Fetch m Externalizes:m + Signature(m)=“olleH” mismatch! => inconsistency

Slide 15 Verbs and External Inconsistency Detecting inconsistency – verbs that externalize state record a signature of externalized data and define a comparison function » comparison fn. defines the external consistency model Handling inconsistency – verbs define compensation routines, invoked when inconsistencies are found – later non-commuting verbs are squashed to prevent data loss

Slide 16 Verbs for SMTP & IMAP protocols mapped into 13 verbs – SMTP: Deliver – IMAP: Append, Fetch, Store, Copy, List, Status, Select, Expunge, Close, Create, Rename, Delete – set could easily be extended with user and subscription management functionality Each verb is a Java class implementing a generic, app-neutral Verb interface

Slide 17 Example Verb: STORE (set msg flags) Tag – input: target folder, message IDs, flag value – externalized output: resultant flags of messages Sequencing tests – commutes with & independent of SMTP verbs, all IMAP verbs except Fetch/Store/Close on same folder Consistency check – new flags must match original flags for all messages in common; no original messages should be missing Inconsistency handler – leave user a message explaining and listing changes

Slide 18 An External Consistency Model for Retrieval (IMAP) – tracked state includes message bodies, key message headers (to/from/subject/cc), flags, folder lists, and execution status of state-altering commands – inconsistency if objects are missing or altered on replay or commands fail; order & new objects ignored – compensation via explanatory messages and creation of “lost-and-found” containers Delivery (SMTP) – only possible inconsistency is in execution status – one tricky case: originally-failed delivery succeeds » delay bounce to provide window for undo-repair

Slide 19 Undo System Architecture Next up: – proxy – undo manager – time-travel storage layer Service Time-travel Storage Undo Manager Verbs control Timeline Log Control UI User Proxy IMAP, SMTP

Slide 20 Proxying IMAP proxy loop is straightforward – accept client connection and open server connection – parse commands and instantiate corresponding verbs – invoke undo manager to schedule, execute, log verb » passing a handle to the client and server connections SMTP is more complicated – failed SMTP deliveries must be logged for later retry – proxy poses as a server that always accepts deliveries » deliver verb is created after client has been ACK’d » some finesse to avoid being an open-relay – if verb fails, proxy sends a bounce after a delay

Slide 21 Undo Manager Implementation Timeline log – BerkeleyDB recno database storing serialized verbs – also stores control records to make undo manager state persistent and recoverable Verb scheduling – all verb execution passes through undo manager – scoreboard-like structure sequences verbs according to independence, commutativity, ordering properties External inconsistency management – undo manager coordinates verb APIs to ensure that external inconsistencies are detected and handled

Slide 22 A Time-Travel Storage Layer Base: Network Appliance Filer with snapshots Java wrapper for Filer’s management CLI – provides direct control of snapshot create/restore – periodically takes snapshots, aging out old ones » hierarchical scheduling allows the 31 snapshots to span one month, with density inversely proportional to age Rewind restores the closest prior snapshot, then replays up to the exact rewind point Challenge: making snapshot restore undoable – solution: implement rollback by copying old snapshot forward to present (expensive, but necessary)

Slide 23 Outline Recap of “Three-R’s” Undo model Implementing Undo for Analysis of undo prototype Conclusions and future directions

Slide 24 Code complexity Entire system is about 23K lines of Java – split evenly between app-neutral and -specific code – bulk of code is in verbs Implementing and debugging the verbs and proxy took roughly two man-months

Slide 25 Measuring Overhead for Undo Workload – modified SPECmail2000 benchmark » simulates traffic seen by an ISP mail server » modified to use IMAP instead of POP, all mail local – configured with 5000 users » about 56 SMTP cxns/minute, 149 IMAP cxns/minute Setup – mail server: Linux, sendmail + UW imapd, 5000 users – undo proxy: Win2k, Sun JDK – workload generator: Win2k, Sun JDK – storage: all mailboxes and logs on NetApp Filer » mailboxes accessed via NFS, logs via CIFS

Slide 26 Overhead Results: Space and Time Space overhead – 5GB/day for timeline log = 1GB/day per 1000 users – 325KB per 1000 mail folder name translations – per 120GB disk: » 7 weeks of timeline for 1000 ISP users » or 350 million folder name translations Time overhead – cumulative distribution plot of IMAP and SMTP session lengths

Slide 27 Outline Recap of “Three-R’s” Undo model Implementing Undo for Analysis of undo prototype Conclusions and future directions

Slide 28 Next Steps Perform more analysis of prototype – speed of rewind, replay, entire undo cycle Measure efficacy of undo with human expts. – compare recovery time from pre-defined scenarios with and without undo » based on survey of administrators Examine other applications to understand generality of undo architecture – auctions, e-commerce, others?

Slide 29 Conclusions We can build a real Undo implementation – for a real application – with tolerable overhead – with a reusable, application-independent core – based on a verb model that enables reasoning about state and undo behavior Look forward to final analysis at next retreat!

Building Undo for Operators: An Undoable Store Aaron Brown ROC Research Group University of California, Berkeley

Slide 31 Backup Slides

Slide 32 Summary: What’s in a Verb? Encapsulation of a user-service interaction – action specification – procedure to execute verb via proxy – tag: input parameters, signature of externalized data, verb execution status Commutativity/independence test functions Consistency management functions – routine to check two tags for external inconsistency – inconsistency handler – squash routine

Slide 33 Verbs and the Undo Timeline Recorded history of verbs must be consistent with actual verb execution – but logging at proxy may not match execution order Solution: partial serialization of verbs – undo manager tracks arriving & executing verbs in scoreboard – arriving verbs that commute are executed in parallel – non-commuting verbs stall until dependencies clear Guarantees consistent recorded timeline without requiring hooks into application

Slide 34 Verb Issue: Naming State Verbs need to specify names of target state – folders, messages, users Names must be time-invariant and unaffected by changes to execution timeline – IMAP protocol names can’t guarantee this Solution: allocate “UndoIDs” to objects – UndoID never changes and is restored on replay – undo system maintains translations to IMAP names » messages: embed UndoID in header field » folders: track UndoIDs in parallel database