Chapter 17: Application Recovery

17.2 Stateless Applications Based on Queues
17.3 Stateful Applications Based on Queues
17.4 Workflows Based on Queues
17.5 General Client-Server Applications
17.6 Lessons Learned

“Basic research is what I'm doing when I don't know what I'm doing.” (Wernher von Braun)
“Research is the act of going up alleys to see if they are blind.” (Plutarch)

4/8/2019 Transactional Information Systems

From ACID To Recovery Guarantees
Problem: A client that does not receive a return code from the transactional server cannot easily find out the transaction outcome and may be tempted to re-initiate the (non-idempotent) transaction, thus producing unacceptable effects.
Approach: In addition to atomicity, the transactional server needs to guarantee the exactly-once execution of the transaction, where execution includes the server's reply message.
→ (almost) perfect failure masking

Stateless Applications Based on Queues
Application program (running on the client, an app server, or the data server):
  - user sends input message
  - app program sends request message to data server
  - data server executes transaction and sends reply message to app
  - app program sends output message to user
  - there are no conversations with the user within a transaction, and subsequent transactions are independent
Solution: Queued Transactions
  - message recovery by queue manager with persistent, recoverable message queues
  - exactly-once execution by enclosing message dequeue and enqueue into transaction

Illustration of 2-Tier Queued Transaction
[Figure: User, Application Process (Client), and Database Server. Labels: input, output, enqueue request, dequeue reply, server transaction.]

Illustration of 3-Tier Queued Transaction
[Figure: User, Client, Application, and Database Server. Labels: input, output, enqueue request, dequeue reply, distributed server transaction.]

Correctness of Queued Transaction Protocol
Theorem 17.1: With the queued transaction protocol for stateless applications, the following guarantees hold:
  1. Once the user-input transaction is committed, a request is executed by the server exactly once.
  2. Once the user-input transaction is committed, the user output is delivered at least once.
  3. If user output is testable, the user output is delivered exactly once, provided the user-input transaction has been committed.
Inherent (small window of) uncertainty:
  - (last) user input may get lost
  - (last) user output may be sent more than once
  → can be eliminated with testable output (using special hardware)
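
The difference between guarantees 2 and 3 can be sketched in a few lines of Python. This is only an illustration under assumptions: `TestableDevice` is a hypothetical stand-in for the "special hardware" (for example, a printer that remembers the id of the last output it emitted), and `deliver` and its parameters are names invented for this sketch.

```python
class TestableDevice:
    """Hypothetical output device that remembers the id of its last output."""

    def __init__(self):
        self.last_id = None   # assumed to survive client failures
        self.emitted = []     # everything the user actually saw

    def emit(self, out_id, text):
        self.emitted.append((out_id, text))
        self.last_id = out_id


def deliver(device, out_id, text, testable):
    """Present an output; with testable output, first ask the device."""
    if testable and device.last_id == out_id:
        return False          # already delivered, suppress the duplicate
    device.emit(out_id, text)
    return True


# Without testing, a client that restarts re-presents the reply (at least once).
plain = TestableDevice()
deliver(plain, 45, "booking confirmed", testable=False)
deliver(plain, 45, "booking confirmed", testable=False)  # after a restart

# With testable output, the second attempt is suppressed (exactly once).
tested = TestableDevice()
deliver(tested, 45, "booking confirmed", testable=True)
deliver(tested, 45, "booking confirmed", testable=True)  # after a restart
```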

Client During Normal Operation
user-input processing by client:
    begin transaction;
        enqueue (request);
    commit transaction;
user-output processing by client:
    wait until reply queue is not empty;
    dequeue (reply);
    while user has not acknowledged the reply or sent the next request do
        present reply to user;
    end /*while*/;

Server During Normal Operation
request-reply processing by data server:
    begin transaction;
        dequeue (request);
        perform data operations and generate reply;
        enqueue (reply);
    commit transaction;

Client and Server Restart
Client restart:
    check reply queue;
    if not empty then
        process reply like during normal operation;
    end /*if*/;
Server restart:
    check request queue;
    if not empty then
        initiate processing of requests like during normal operation
    end /*if*/;
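
The client, server, and restart steps above can be sketched with Python's `sqlite3` module. This is a minimal sketch, not the queue manager from the slides: an in-memory database stands in for persistent, recoverable queues that share a transaction with the data server, and the table names, the `account` example, and the `Crash` exception are assumptions made for illustration.

```python
import sqlite3


class Crash(Exception):
    """Simulated process failure (hypothetical, for illustration only)."""


def open_store():
    # Autocommit mode, so BEGIN/COMMIT/ROLLBACK are issued explicitly below.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.executescript("""
        CREATE TABLE request_q (id INTEGER PRIMARY KEY AUTOINCREMENT, amount INTEGER);
        CREATE TABLE reply_q   (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT);
        CREATE TABLE account   (balance INTEGER);
        INSERT INTO account VALUES (0);
    """)
    return db


def client_enqueue(db, amount):
    """User-input processing: enqueue(request) inside its own transaction."""
    db.execute("BEGIN")
    db.execute("INSERT INTO request_q (amount) VALUES (?)", (amount,))
    db.execute("COMMIT")


def server_step(db, crash_before_commit=False):
    """Dequeue, data operation, and enqueue(reply) in ONE transaction."""
    db.execute("BEGIN")
    try:
        row = db.execute(
            "SELECT id, amount FROM request_q ORDER BY id LIMIT 1").fetchone()
        if row is None:
            db.execute("COMMIT")
            return False                      # request queue is empty
        req_id, amount = row
        db.execute("DELETE FROM request_q WHERE id = ?", (req_id,))
        db.execute("UPDATE account SET balance = balance + ?", (amount,))
        balance = db.execute("SELECT balance FROM account").fetchone()[0]
        db.execute("INSERT INTO reply_q (body) VALUES (?)",
                   (f"balance={balance}",))
        if crash_before_commit:
            raise Crash("simulated failure before commit")
        db.execute("COMMIT")
        return True
    except Exception:
        db.execute("ROLLBACK")                # request goes back into the queue
        raise


def server_restart(db):
    """Server restart: process whatever is still in the request queue."""
    while server_step(db):
        pass


def client_dequeue_reply(db):
    """User-output processing: dequeue(reply), or None if no reply yet."""
    db.execute("BEGIN")
    row = db.execute("SELECT id, body FROM reply_q ORDER BY id LIMIT 1").fetchone()
    if row is not None:
        db.execute("DELETE FROM reply_q WHERE id = ?", (row[0],))
    db.execute("COMMIT")
    return None if row is None else row[1]
```

Because the dequeue, the update, and the enqueue commit or roll back together, a failure before commit leaves the request in the queue and the account untouched, so `server_restart` applies the request exactly once.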

Pseudo-Conversational Transactions for Stateful Applications
Queue-based message recovery for entire conversations
  - Conversational “logical unit of work” broken down into a chain of stateless transactions, with (small) application state maintained in the queue (analogous to cookies, but more general and much more reliable)
  - Dequeue of reply and enqueue of next request combined into one transaction for exactly-once execution guarantee
  - Good for apps such as travel reservation, electronic shopping, etc.
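
A pseudo-conversational chain can be sketched in the same style. This is an illustrative sketch under assumptions, not the protocol as implemented in any product: the queues are SQLite tables in an in-memory database, messages are JSON, and the shopping-cart state and all function names are hypothetical. The point to notice is `client_step`, which dequeues the reply and enqueues the next request in one transaction, with the conversation state travelling inside the messages.

```python
import json
import sqlite3


def open_queues():
    db = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
    db.executescript("""
        CREATE TABLE request_q (id INTEGER PRIMARY KEY AUTOINCREMENT, msg TEXT);
        CREATE TABLE reply_q   (id INTEGER PRIMARY KEY AUTOINCREMENT, msg TEXT);
    """)
    return db


def _dequeue(db, table):
    row = db.execute(f"SELECT id, msg FROM {table} ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    db.execute(f"DELETE FROM {table} WHERE id = ?", (row[0],))
    return json.loads(row[1])


def _enqueue(db, table, msg):
    db.execute(f"INSERT INTO {table} (msg) VALUES (?)", (json.dumps(msg),))


def start_conversation(db, first_input):
    """Initial user-input transaction that starts the whole chain."""
    db.execute("BEGIN")
    _enqueue(db, "request_q", {"state": {"cart": []}, "input": first_input})
    db.execute("COMMIT")


def server_step(db):
    """Stateless server transaction: all state arrives in the request."""
    db.execute("BEGIN")
    req = _dequeue(db, "request_q")
    if req is None:
        db.execute("COMMIT")
        return False
    state = req["state"]
    state["cart"].append(req["input"])
    output = f"{len(state['cart'])} item(s) in cart"
    _enqueue(db, "reply_q", {"state": state, "output": output})
    db.execute("COMMIT")
    return True


def client_step(db, next_input):
    """Dequeue reply and enqueue next request in ONE transaction."""
    db.execute("BEGIN")
    reply = _dequeue(db, "reply_q")
    if reply is None:
        db.execute("ROLLBACK")
        return None
    _enqueue(db, "request_q", {"state": reply["state"], "input": next_input})
    db.execute("COMMIT")
    return reply["output"]
```

For example, `start_conversation(db, "flight")`, then `server_step(db)`, then `client_step(db, "hotel")` carries the cart from one step to the next without the server keeping any per-conversation state.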

Illustration of Pseudo-Conversational Transactions
[Figure: User, Application Process (Client), and Database Server.]

Correctness of Pseudo-Conversational Transaction Protocol
Theorem 17.2: With the queue-based message recovery for conversational multi-step transaction chains, the following guarantees hold:
  1. Once the initial user-input transaction that starts the entire conversation is committed, the entire transaction chain is executed by the server exactly once.
  2. Once the initial user-input transaction is committed, each user-output message throughout the conversation is delivered at least once.
  3. If user output is testable, each user-output message is delivered exactly once, provided the initial user-input transaction has been committed.

Queue-based Message Recovery for Exactly-Once Workflow Execution
At end of activity, execute a transaction that combines:
  - writing the activity's modifications of workflow state and context to persistent store
  - writing the state modifications that result from the firing of outgoing transitions to persistent store
  - writing the context modifications that result from the actions of firing transitions to persistent store
  - notifying the follow-up activities by enqueueing messages
Newly invoked activity executes a transaction that combines:
  - dequeueing of notification message
  - writing the workflow state and context to persistent store
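
The two transactions above can be sketched as follows. This is a sketch under assumptions rather than a workflow engine: workflow state, context, and the notification queue are three SQLite tables in one in-memory database so that they can share a transaction, and the table layout, status values, and function names are invented for illustration.

```python
import json
import sqlite3


def open_engine():
    db = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
    db.executescript("""
        CREATE TABLE wf_state   (activity TEXT PRIMARY KEY, status TEXT);
        CREATE TABLE wf_context (name TEXT PRIMARY KEY, value TEXT);
        CREATE TABLE notify_q   (id INTEGER PRIMARY KEY AUTOINCREMENT, activity TEXT);
    """)
    return db


def finish_activity(db, activity, context_updates, successors):
    """End-of-activity transaction: state, context, and notifications together."""
    db.execute("BEGIN")
    try:
        db.execute("INSERT OR REPLACE INTO wf_state VALUES (?, 'done')", (activity,))
        for name, value in context_updates.items():
            db.execute("INSERT OR REPLACE INTO wf_context VALUES (?, ?)",
                       (name, json.dumps(value)))
        for nxt in successors:                 # firing of outgoing transitions
            db.execute("INSERT OR REPLACE INTO wf_state VALUES (?, 'ready')", (nxt,))
            db.execute("INSERT INTO notify_q (activity) VALUES (?)", (nxt,))
        db.execute("COMMIT")
    except Exception:
        db.execute("ROLLBACK")                 # nothing of the activity end survives
        raise


def start_next_activity(db):
    """Start-of-activity transaction: dequeue notification and record new state."""
    db.execute("BEGIN")
    row = db.execute(
        "SELECT id, activity FROM notify_q ORDER BY id LIMIT 1").fetchone()
    if row is None:
        db.execute("COMMIT")
        return None
    db.execute("DELETE FROM notify_q WHERE id = ?", (row[0],))
    db.execute("UPDATE wf_state SET status = 'running' WHERE activity = ?", (row[1],))
    db.execute("COMMIT")
    return row[1]
```

If any write in `finish_activity` fails, the rollback also discards the notification, so a follow-up activity is never started for an activity whose results were not recorded.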

Use of Queued Transactions in Travel Planning Workflow
[Figure: statechart of the travel planning workflow. Labels recoverable from the figure: Select Conference, Check (Flight, Hotel, Cost), Tutorials, Compute Fee, Airfare, CheckConfFee, CheckTravelCost, Go, No. Transition annotations:
  / Budget:=1000; Trials:=1;
  [ConfFound] / Cost:=0
  [!ConfFound]
  / Cost = ConfFee + TravelCost
  [Cost ≤ Budget]
  [Cost > Budget & Trials < 3] / Trials++
  [Cost > Budget & Trials ≥ 3]]
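
The three cost guards of the travel planning workflow can be written as a small decision function. This is a sketch under assumptions: the flattened figure does not show the transition targets, so it is assumed here that a cost within budget leads to Go, that exhausting the trials leads to No, and that the remaining case loops back for another attempt (labelled "Retry", a name invented for this sketch).

```python
def after_cost_check(cost, budget, trials):
    """Return (next_step, trials) according to the guards on the cost check."""
    if cost <= budget:                 # [Cost <= Budget]
        return "Go", trials
    if trials < 3:                     # [Cost > Budget & Trials < 3] / Trials++
        return "Retry", trials + 1
    return "No", trials                # [Cost > Budget & Trials >= 3]


# Initial action from the figure: Budget := 1000; Trials := 1
budget, trials = 1000, 1
conf_fee, travel_cost = 400, 800
cost = conf_fee + travel_cost          # Cost = ConfFee + TravelCost
step, trials = after_cost_check(cost, budget, trials)
```

In the queued-transaction setting, the updated `trials` value would be part of the workflow context written in the same transaction that enqueues the notification for the next activity.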

Problem Scenario with C/S Application Failures
[Figure: User, Application Process (Client), and Database Server exchanging input, output, request, and reply messages. Labels: crash, replay, 2nd App, "?"; phases: during normal operation, during client restart.]

General Considerations for Client-Server Stateful Application Recovery
  - Message logging for message recovery and deterministic program replay (of piecewise deterministic program)
  - Installation points for process recovery and reduced program replay
  - Server processes concurrent threads on behalf of many clients
  - Server “commits its state” upon sending a reply to a client
  - Forced logging should be minimized
  - Server should be able to perform independent recovery

Server Reply Logging Method
  - Client and server each maintain a message lookup table (MLT) and write message log entries to a stable log
  - Client performs lazy, non-forced logging, periodically creates an installation point, and force-logs user-input messages
  - Server forces its log buffer before sending a reply message
  - Server recovery rebuilds the message lookup table and replays incomplete requests to produce replies (may need logging of read/write interleaving among threads)
  - Client recovery rebuilds the MLT, reloads the app from the last installation point, and replays the application, intercepting message events and obtaining the contents of messages from the local MLT or the server
  - Client sends stability notifications to facilitate server log truncation
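
The server side of the method can be sketched as below. This is a simplified, single-threaded sketch under stated assumptions: a Python list stands in for the stable log, request handling is a deterministic function of the request (so the logging of read/write interleavings is left out), and the class, method, and field names are invented for illustration.

```python
class ReplyLoggingServer:
    def __init__(self, stable_log, handler):
        self.stable_log = stable_log   # stand-in for disk; survives a "crash"
        self.log_buffer = []           # volatile; lost on a crash
        self.handler = handler         # assumed deterministic: request -> reply
        self.mlt = {}                  # message lookup table: (client, msn) -> reply
        self.executed = 0              # how often the handler really ran

    def force(self):
        """Write the log buffer to the stable log."""
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def handle(self, client, msn, request):
        key = (client, msn)
        if key in self.mlt:
            return self.mlt[key]       # resent request: answer from the MLT
        self.log_buffer.append(("request", client, msn, request))
        reply = self.handler(request)
        self.executed += 1
        self.log_buffer.append(("reply", client, msn, reply))
        self.mlt[key] = reply
        self.force()                   # force the log BEFORE sending the reply
        return reply

    @classmethod
    def recover(cls, stable_log, handler):
        """Rebuild the MLT from the stable log and replay incomplete requests."""
        server = cls(stable_log, handler)
        pending = {}
        for kind, client, msn, payload in stable_log:
            if kind == "request":
                pending[(client, msn)] = payload
            else:
                server.mlt[(client, msn)] = payload
                pending.pop((client, msn), None)
        for (client, msn), request in pending.items():
            reply = handler(request)   # replay to produce the missing reply
            server.log_buffer.append(("reply", client, msn, reply))
            server.mlt[(client, msn)] = reply
        server.force()
        return server
```

A client that replays its application after its own crash can resend the request with the same MSN; the server answers from its MLT instead of executing the request a second time.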

Data Structures for Server Reply Logging
[Figure: client and server, each with a stable log file and a message lookup table (columns: MSN, Type, ...). Message types shown: request, reply, input, output; MSN values shown: 10, 15, 20, 30, 40, 45, 65, 70, 80. Annotations: installation point, lazy logging, force log upon reply.]

Replaying Incomplete Requests with Server Reply Logging
[Figure: client and server message lookup tables (columns: MSN, Type, ...) with entries 10 request, 20, 30 reply, and 15 input, together with the operation sequence ... R(x) W(x) R(x) R(y) W(y) R(y) ...]

Log Truncation with Server Reply Logging
[Figure: client c and server. Client log and client message lookup table (columns: MSN, Type, ...) with entries 15 input, 20 request, 40 reply, 45 output, 70 request. Further labels: 70 + stability notification, RedoMSN for client c, 80, server log, other clients.]
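
One way to picture the effect of a stability notification on the server log is sketched below. The exact rule is an assumption of this sketch, not taken from the slides: the server keeps one RedoMSN per client and, on a notification, discards that client's log entries with a smaller MSN while leaving other clients' entries alone. All names are hypothetical.

```python
class ServerLog:
    def __init__(self):
        self.entries = []    # (client, msn, kind, payload), shared by all clients
        self.redo_msn = {}   # client -> oldest MSN the server must still keep

    def append(self, client, msn, kind, payload):
        self.entries.append((client, msn, kind, payload))
        self.redo_msn.setdefault(client, msn)

    def stability_notification(self, client, stable_msn):
        """Client reports that it no longer needs entries below stable_msn."""
        # Never move RedoMSN backwards if a stale notification arrives late.
        redo = max(self.redo_msn.get(client, stable_msn), stable_msn)
        self.redo_msn[client] = redo
        self.entries = [e for e in self.entries
                        if e[0] != client or e[1] >= redo]


log = ServerLog()
log.append("c", 20, "request", "r1")
log.append("d", 25, "request", "r2")   # another client, unaffected below
log.append("c", 40, "reply", "p1")
log.append("c", 70, "request", "r3")
log.stability_notification("c", 70)    # e.g. piggybacked on request 70
```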

Lessons Learned
  - Application recovery needs to reconcile data, process, and message recovery
  - Queue-based message recovery is sufficient for stateless and small-state, pseudo-conversational apps (including workflows but disregarding the actual workflow activities)
  - Rich-state applications require a more general approach to message logging and process installation points:
      - efficiently solved for client-server applications
      - hot research topic for advanced multi-tier e-services (e.g., electronic auctions)