Chapter 17: Application Recovery
17.2 Stateless Applications Based on Queues
17.3 Stateful Applications Based on Queues
17.4 Workflows Based on Queues
17.5 General Client-Server Applications
17.6 Lessons Learned
“Basic research is what I’m doing when I don’t know what I’m doing.” (Wernher von Braun)
“Research is the act of going up alleys to see if they are blind.” (Plutarch)
4/8/2019 Transactional Information Systems
From ACID to Recovery Guarantees
Problem: A client that does not receive a return code from the transactional server cannot easily find out the transaction outcome and may be tempted to re-initiate the (non-idempotent) transaction, thus producing unacceptable effects.
Approach: In addition to atomicity, the transactional server needs to guarantee exactly-once execution of the transaction, where execution includes the server’s reply message.
This yields (almost) perfect failure masking.
Stateless Applications Based on Queues
Application (running on client, app server, or data server):
  user sends input message
  app program sends request message to data server
  data server executes transaction and sends reply message to app
  app program sends output message to user
There are no conversations with the user within a transaction, and subsequent transactions are independent.
Solution: Queued Transactions
  message recovery by queue manager with persistent, recoverable message queues
  exactly-once execution by enclosing message dequeue and enqueue in a transaction
Illustration of 2-Tier Queued Transaction
[Figure: User, Application Process (Client), and Database Server, connected by input, enqueue request, server transaction, dequeue reply, and output.]
Illustration of 3-Tier Queued Transaction
[Figure: User, Client, Application, and Database Server, connected by input, enqueue request, distributed server transaction, dequeue reply, and output.]
Correctness of Queued Transaction Protocol
Theorem 17.1: With the queued transaction protocol for stateless applications, the following guarantees hold:
1. Once the user-input transaction is committed, a request is executed by the server exactly once.
2. Once the user-input transaction is committed, the user output is delivered at least once.
3. If user output is testable, the user output is delivered exactly once, provided the user-input transaction has been committed.
Inherent (small window of) uncertainty:
  (last) user input may get lost
  (last) user output may be sent more than once
  can be eliminated with testable output (using special hardware)
Client During Normal Operation
user-input processing by client:
  begin transaction;
    enqueue (request);
  commit transaction;
user-output processing by client:
  wait until reply queue is not empty;
  dequeue (reply);
  while user has not acknowledged the reply or sent the next request do
    present reply to user;
  end /*while*/;
Server During Normal Operation
request-reply processing by data server:
  begin transaction;
    dequeue (request);
    perform data operations and generate reply;
    enqueue (reply);
  commit transaction;
Client and Server Restart
Client restart:
  check reply queue;
  if not empty then
    process reply like during normal operation;
  end /*if*/;
Server restart:
  check request queue;
  if not empty then
    initiate processing of requests like during normal operation;
  end /*if*/;
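The client, server, and restart procedures above can be sketched in Python. This is a minimal illustration, not the protocol's reference implementation: it assumes that the queues and the application data live in one SQLite database so that a single local transaction covers dequeue, data operations, and enqueue. The table names (`request_q`, `reply_q`, `account`), the function names, and the `crash_before_commit` flag (which simulates a server failure by rolling back) are all hypothetical.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    # isolation_level=None selects autocommit mode, so BEGIN/COMMIT/ROLLBACK
    # are issued explicitly below.
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.execute("CREATE TABLE request_q (id INTEGER PRIMARY KEY, amount INTEGER)")
    db.execute("CREATE TABLE reply_q (id INTEGER PRIMARY KEY, body TEXT)")
    db.execute("CREATE TABLE account (balance INTEGER)")
    db.execute("INSERT INTO account VALUES (0)")
    return db


def client_enqueue(db: sqlite3.Connection, amount: int) -> None:
    """User-input processing: enqueue the request inside a transaction."""
    db.execute("BEGIN")
    db.execute("INSERT INTO request_q (amount) VALUES (?)", (amount,))
    db.execute("COMMIT")


def server_step(db: sqlite3.Connection, crash_before_commit: bool = False) -> bool:
    """Dequeue one request, apply it, enqueue the reply, all in one transaction."""
    db.execute("BEGIN")
    row = db.execute(
        "SELECT id, amount FROM request_q ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        db.execute("ROLLBACK")
        return False
    req_id, amount = row
    db.execute("DELETE FROM request_q WHERE id = ?", (req_id,))  # dequeue
    db.execute("UPDATE account SET balance = balance + ?", (amount,))
    balance = db.execute("SELECT balance FROM account").fetchone()[0]
    db.execute("INSERT INTO reply_q (body) VALUES (?)", (f"balance={balance}",))
    if crash_before_commit:
        # Simulated failure: dequeue, update, and enqueue are undone together,
        # so the request is back in the queue.
        db.execute("ROLLBACK")
        return False
    db.execute("COMMIT")
    return True


def server_restart(db: sqlite3.Connection) -> None:
    """Server restart: process whatever is still in the request queue."""
    while server_step(db):
        pass
```

Because the dequeue and the data update commit or abort together, a request that was in flight during the simulated crash is still in `request_q` afterwards and is applied exactly once by `server_restart`.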
Pseudo-Conversational Transactions for Stateful Applications
Queue-based message recovery for entire conversations
Conversational “logical unit of work” broken down into a chain of stateless transactions, with (small) application state maintained in the queue (analogous to cookies, but more general and much more reliable)
Dequeue of reply and enqueue of next request combined into one transaction for exactly-once execution guarantee
Good for apps such as travel reservation, electronic shopping, etc.
Illustration of Pseudo-Conversational Transactions
[Figure: User, Application Process (Client), and Database Server.]
Correctness of Pseudo-Conversational Transaction Protocol
Theorem 17.2: With the queue-based message recovery for conversational multi-step transaction chains, the following guarantees hold:
1. Once the initial user-input transaction that starts the entire conversation is committed, the entire transaction chain is executed by the server exactly once.
2. Once the initial user-input transaction is committed, each user-output message throughout the conversation is delivered at least once.
3. If user output is testable, each user-output message is delivered exactly once, provided the initial user-input transaction has been committed.
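A rough Python sketch of a pseudo-conversational chain follows. It assumes SQLite-backed queues and a shopping-cart conversation whose state travels inside the queued messages as JSON; the table names, function names, message layout, and the `crash_before_commit` flag are illustrative assumptions. The point it shows is the client step that dequeues the reply and enqueues the next request in one transaction.

```python
import json
import sqlite3


def setup() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
    db.execute("CREATE TABLE request_q (id INTEGER PRIMARY KEY, msg TEXT)")
    db.execute("CREATE TABLE reply_q (id INTEGER PRIMARY KEY, msg TEXT)")
    return db


def start_conversation(db: sqlite3.Connection, item: str) -> None:
    """Initial user-input transaction: the conversation state starts empty."""
    db.execute("BEGIN")
    msg = json.dumps({"cart": [], "add": item})
    db.execute("INSERT INTO request_q (msg) VALUES (?)", (msg,))
    db.execute("COMMIT")


def server_step(db: sqlite3.Connection) -> bool:
    """Stateless server transaction: the state comes from, and returns to, the queue."""
    db.execute("BEGIN")
    row = db.execute("SELECT id, msg FROM request_q ORDER BY id LIMIT 1").fetchone()
    if row is None:
        db.execute("ROLLBACK")
        return False
    req_id, msg = row
    request = json.loads(msg)
    cart = request["cart"] + [request["add"]]
    db.execute("DELETE FROM request_q WHERE id = ?", (req_id,))
    db.execute("INSERT INTO reply_q (msg) VALUES (?)", (json.dumps({"cart": cart}),))
    db.execute("COMMIT")
    return True


def client_step(db: sqlite3.Connection, next_item: str,
                crash_before_commit: bool = False):
    """Dequeue the reply and enqueue the next request in ONE transaction."""
    db.execute("BEGIN")
    row = db.execute("SELECT id, msg FROM reply_q ORDER BY id LIMIT 1").fetchone()
    if row is None:
        db.execute("ROLLBACK")
        return None
    reply_id, msg = row
    cart = json.loads(msg)["cart"]
    db.execute("DELETE FROM reply_q WHERE id = ?", (reply_id,))
    next_msg = json.dumps({"cart": cart, "add": next_item})
    db.execute("INSERT INTO request_q (msg) VALUES (?)", (next_msg,))
    if crash_before_commit:
        db.execute("ROLLBACK")  # simulated client failure: the reply stays queued
        return None
    db.execute("COMMIT")
    return cart
```

If the client fails between the dequeue and the enqueue, the rollback leaves the reply in `reply_q`, so the chain resumes from the same step after restart instead of losing or duplicating a request.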
Queue-based Message Recovery for Exactly-Once Workflow Execution
At the end of an activity, execute a transaction that combines:
  writing the activity’s modifications of workflow state and context to persistent store
  writing the state modifications that result from the firing of outgoing transitions to persistent store
  writing the context modifications that result from the actions of firing transitions to persistent store
  notifying the follow-up activities by enqueueing messages
A newly invoked activity executes a transaction that combines:
  dequeueing of notification message
  writing the workflow state and context to persistent store
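The two transactions described above might look as follows in Python. This is a sketch under the assumption that workflow state, workflow context, and the notification queue are tables in one SQLite database; the schema, the function names, the `crash_before_commit` flag, and the use of activity names from the travel planning example are illustrative choices, not part of the original protocol description.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
    db.execute("CREATE TABLE activity_state (activity TEXT PRIMARY KEY, status TEXT)")
    db.execute("CREATE TABLE context (name TEXT PRIMARY KEY, value INTEGER)")
    db.execute("CREATE TABLE notify_q (id INTEGER PRIMARY KEY, activity TEXT)")
    return db


def finish_activity(db, activity, context_updates, followers,
                    crash_before_commit=False) -> None:
    """End-of-activity transaction: state, context, and notifications together."""
    db.execute("BEGIN")
    db.execute("INSERT OR REPLACE INTO activity_state VALUES (?, 'done')", (activity,))
    for name, value in context_updates.items():
        db.execute("INSERT OR REPLACE INTO context VALUES (?, ?)", (name, value))
    for follower in followers:
        db.execute("INSERT INTO notify_q (activity) VALUES (?)", (follower,))
    # A simulated crash rolls back all three kinds of writes at once.
    db.execute("ROLLBACK" if crash_before_commit else "COMMIT")


def start_next_activity(db):
    """Start-of-activity transaction: dequeue notification and record new state."""
    db.execute("BEGIN")
    row = db.execute("SELECT id, activity FROM notify_q ORDER BY id LIMIT 1").fetchone()
    if row is None:
        db.execute("ROLLBACK")
        return None
    msg_id, activity = row
    db.execute("DELETE FROM notify_q WHERE id = ?", (msg_id,))
    db.execute("INSERT OR REPLACE INTO activity_state VALUES (?, 'running')",
               (activity,))
    db.execute("COMMIT")
    return activity
```

For example, `finish_activity(db, "SelectConference", {"Budget": 1000, "Trials": 1}, ["CheckConfFee", "CheckTravelCost"])` either records the finished activity, both context variables, and both notifications, or none of them.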
Use of Queued Transactions in Travel Planning Workflow
[Figure: statechart of the travel planning workflow. State labels: Select Conference, CheckConfFee, Tutorials, Compute Fee, CheckTravelCost, Airfare, Check Flight, Hotel, Cost, Go, No. Transition labels: / Budget:=1000; Trials:=1; [ConfFound] / Cost:=0; [!ConfFound]; / Cost = ConfFee + TravelCost; [Cost ≤ Budget]; [Cost > Budget & Trials < 3] / Trials++; [Cost > Budget & Trials ≥ 3].]
Problem Scenario with C/S Application Failures
[Figure: message flow (input, request, reply, output) among User, Application Process (Client), and Database Server during normal operation, followed by a crash and replay during client restart; further labels read “2nd App” and “?”.]
General Considerations for Client-Server Stateful Application Recovery
Message logging for message recovery and deterministic program replay (of piecewise deterministic program)
Installation points for process recovery and reduced program replay
Server processes concurrent threads on behalf of many clients
Server “commits its state” upon sending a reply to a client
Forced logging should be minimized
Server should be able to perform independent recovery
Server Reply Logging Method
Client and server each maintain a message lookup table (MLT) and write message log entries to a stable log
Client performs lazy, non-forced logging, periodically creates an installation point, and force-logs user-input messages
Server forces its log buffer before sending a reply message
Server recovery rebuilds the message lookup table and replays incomplete requests to produce replies
  may need logging of read/write interleaving among threads
Client recovery rebuilds the MLT, reloads the app from the last installation point, and replays the application, intercepting message events and obtaining the contents of messages from the local MLT or the server
Client sends stability notifications to facilitate server log truncation
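The server side of this method can be illustrated with a deliberately simplified Python model. It assumes a single-threaded server whose only application state is a counter, Python lists standing in for the log buffer and the stable log, and an MLT keyed by (client, MSN); all class, field, and method names are hypothetical. Thread interleaving, incomplete requests, and the client side are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    stable_log: list = field(default_factory=list)  # survives a crash
    log_buffer: list = field(default_factory=list)  # volatile
    mlt: dict = field(default_factory=dict)         # (client, msn) -> reply, volatile
    counter: int = 0                                # application state, volatile

    def force_log(self) -> None:
        """Move buffered log entries to the stable log."""
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def handle(self, client: str, msn: int, increment: int) -> int:
        key = (client, msn)
        if key in self.mlt:
            # Resent request (for example during client replay):
            # return the logged reply instead of executing again.
            return self.mlt[key]
        self.counter += increment
        reply = self.counter
        self.log_buffer.append((client, msn, increment, reply))
        self.force_log()          # force the log BEFORE the reply leaves the server
        self.mlt[key] = reply
        return reply

    def crash_and_recover(self) -> None:
        """Lose volatile state, then rebuild state and MLT from the stable log."""
        self.log_buffer.clear()
        self.mlt.clear()
        self.counter = 0
        for client, msn, increment, reply in self.stable_log:
            self.counter += increment
            self.mlt[(client, msn)] = reply
```

After `crash_and_recover()`, a client that resends request 10 gets the same reply as before the crash, and the counter is not incremented a second time.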
Data Structures for Server Reply Logging
[Figure: client and server, each with a stable log file and a message lookup table (columns MSN, Type, ...) holding request, reply, input, and output entries; annotations: installation point and lazy logging on the client, force log upon reply on the server.]
Replaying Incomplete Requests with Server Reply Logging
[Figure: client and server message lookup tables (columns MSN, Type, ...) with request, reply, and input entries, together with the interleaved operation sequence ... R(x)W(x)R(x)R(y)W(y)R(y) ...]
Log Truncation with Server Reply Logging
[Figure: client c with its client log and client message lookup table (MSN, Type: 15 input, 20 request, 40 reply, 45 output, 70 request); message 70 is sent together with a stability notification; the server log, shared with other clients, records a RedoMSN for client c.]
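One plausible reading of this truncation scheme is sketched below in Python. It assumes that the server tracks one RedoMSN per client, that a stability notification advances that client's RedoMSN, and that the shared server log can be cut in front of the smallest RedoMSN over all clients. These rules, the tuple layout `(msn, client, type)`, and the function name are assumptions made for illustration; the slides do not spell out the exact procedure.

```python
def on_stability_notification(redo_msn: dict, log: list,
                              client: str, stable_msn: int) -> list:
    """Advance the client's RedoMSN and return the truncated server log.

    redo_msn maps each client to the oldest MSN it may still need replayed.
    log is a list of (msn, client, type) tuples in MSN order.
    """
    # A notification never moves a client's RedoMSN backwards.
    redo_msn[client] = max(redo_msn.get(client, 0), stable_msn)
    # Entries older than every client's RedoMSN are no longer needed.
    cut = min(redo_msn.values())
    return [entry for entry in log if entry[0] >= cut]
```

With two clients `c` and `d`, a notification from `c` alone shortens the log only up to `d`'s RedoMSN, because the slower client still pins the older entries.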
Lessons Learned
Application recovery needs to reconcile data, process, and message recovery
Queue-based message recovery is sufficient for stateless and small-state, pseudo-conversational apps (including workflows, but disregarding the actual workflow activities)
Rich-state applications require a more general approach to message logging and process installation points
  efficiently solved for client-server applications
  hot research topic for advanced multi-tier e-services (e.g., electronic auctions)