Brahim Ayari, Abdelmajid Khelil and Neeraj Suri


FT-PPTC: An Efficient and Fault-Tolerant Commit Protocol for Mobile Environments
Brahim Ayari, Abdelmajid Khelil and Neeraj Suri
Department of Computer Science, TU Darmstadt, Germany
{brahim, khelil, suri}@informatik.tu-darmstadt.de

Motivation
- Mobile transactions: mobile inventory, M-commerce, mobile gaming
- Mobile participants
- Atomic commit is a fundamental operation
[Figure label: Wired Network]

Motivation
- Heterogeneity
  - Wireless networks
  - Mobile nodes
  - Slowest process determines resource blocking time
- Perturbations
  - Disconnections
  - Node/communication failures
  - Increase resource blocking time or abort rate
[Figure labels: Wired Network, WLAN, GPRS, UMTS]

Objectives
- Ensure strict atomicity
- Minimize the blocking time of resources in the wired network
- Minimize transaction abort rate (heterogeneity and fault-tolerance)
- Reduce (wireless) message complexity

Outline
- System Model
- Transaction Model
- Fault Model
- Our Approach
- Related Work
- Evaluation
- Summary

System Model
- Mobile Hosts (MH) vs. Fixed Hosts (FH)
  - MHs may not have stable storage
- Heterogeneity in
  - Wireless network (WLAN, UMTS, GPRS ...) => delay and bandwidth
  - Mobile nodes (laptops, PDAs, cellular phones ...) => processing capability
- Autonomy of (mobile) nodes
  - Management and sharing of data
  - Execution/decision to commit or abort the transaction

Transaction Model
- Mobile transaction
  - Distributed transaction where at least one MH participates in its execution
  - Set of execution fragments
- Coordinator (CO) role
  - Distributes execution fragments to the corresponding nodes
  - Takes the final decision either to commit or abort
  - Stores information about the execution state
  - Implemented by a FH
- 2PC in the fixed network
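The transaction model above can be written down as a small data structure. This is only an illustrative sketch: the class and field names (`HostKind`, `Fragment`, `Transaction`, `is_mobile`) are hypothetical and do not come from the slides; the only facts taken from the source are that a transaction is a set of execution fragments, that it counts as mobile when at least one MH participates, and that the coordinator is a fixed host.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class HostKind(Enum):
    MOBILE = "MH"  # mobile host, may lack stable storage
    FIXED = "FH"   # fixed host in the wired network


@dataclass(frozen=True)
class Fragment:
    """One execution fragment assigned to one participant (hypothetical shape)."""
    fragment_id: int
    host: str
    host_kind: HostKind


@dataclass
class Transaction:
    """A distributed transaction as a set of execution fragments."""
    tx_id: str
    coordinator: str  # assumed to name a fixed host, as in the slides
    fragments: List[Fragment] = field(default_factory=list)

    def is_mobile(self) -> bool:
        # Mobile transaction: at least one MH participates in the execution.
        return any(f.host_kind is HostKind.MOBILE for f in self.fragments)

    def fragments_for(self, kind: HostKind) -> List[Fragment]:
        # Fragments the coordinator would hand to hosts of the given kind.
        return [f for f in self.fragments if f.host_kind is kind]
```

Splitting fragments by host kind is what lets a coordinator treat mobile and fixed participants in separate phases.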

Fault Model
- Communication failures
  - Message loss
  - Network disconnection
- Node failures
  - Transient MH failures: SW/HW faults, power loss, user-triggered failures (switch-off) ...
  - Permanent MH failures: loss, damage ...
  - CO failures: crash recovery model
  - FH failures: crash recovery model

Related Work
- Unilateral Commit for Mobile (UCM) [PDCS’00]
  - Supports mobile participants
  - Strict assumptions (two-phase locking for all participants)
- Mobile Two Phase Commit (M-2PC) [ADC’05]
  - Mobile participants are assumed to be connected from initializing the transaction until finishing the execution of their fragments
- Transaction Commit On Timeout (TCOT) [TOC’02]
  - Uses timeouts to reduce the number of messages exchanged
  - Provides only semantic atomicity and not strict atomicity
- Supports only one mobile participant
- Entail high resource blocking time
- Handle (small) subset of failures

Fault-Tolerant Pre-Phase Transaction Commit (FT-PPTC)
Main idea:
- Decouple the commit of mobile participants from that of fixed participants
- Commit the mobile participants first, then delegate the problem to the fixed part of the network
  => Reduction of resource blocking time on fixed participants
  => Re-use of established protocols (commit, recovery, fault-tolerance)
- CO first distributes fragments to the mobile participants and waits for their execution results (Pre-Phase)
- CO waits for each mobile participant for the duration of the timeouts that the participant initializes and updates (execution and shipping timeouts)
  => CO waits for the slowest mobile participant and proceeds with the following phase iff the pre-phase succeeds
  => Handles network and node heterogeneity
- Each mobile participant is represented in the fixed network by a proxy called mobile host agent (MH-Agent)
  => Handles failures: crash, disconnection, wireless message loss ...
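The decoupling idea can be sketched as a coordinator decision function. This is a simplified illustration under stated assumptions, not the authors' implementation: `MobileReport`, `pre_phase_succeeds`, `pre_phase_end` and `decide` are hypothetical names, each participant's execution and shipping timeouts are collapsed into a single deadline, and the transaction is assumed to abort when the pre-phase fails (the slides only say the next phase starts iff the pre-phase succeeds).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MobileReport:
    """Coordinator's view of one mobile participant (hypothetical shape)."""
    deadline: float                    # latest deadline announced by the participant
    vote_time: Optional[float] = None  # arrival time of updates and YES vote, if any


def pre_phase_succeeds(reports: Dict[str, MobileReport]) -> bool:
    # Every mobile participant must deliver its vote within its own deadline.
    return all(
        r.vote_time is not None and r.vote_time <= r.deadline
        for r in reports.values()
    )


def pre_phase_end(reports: Dict[str, MobileReport]) -> float:
    # The slowest mobile participant determines when the pre-phase ends.
    times = (r.vote_time for r in reports.values() if r.vote_time is not None)
    return max(times, default=0.0)


def decide(reports: Dict[str, MobileReport], fixed_votes: List[bool]) -> str:
    # Assumption: a failed pre-phase aborts before fixed participants are involved,
    # so no fixed resources are blocked while waiting on wireless links.
    if not pre_phase_succeeds(reports):
        return "ABORT"
    # Core phase: ordinary 2PC voting among the fixed participants.
    return "COMMIT" if all(fixed_votes) else "ABORT"
```

Because fixed participants are contacted only after `pre_phase_succeeds` holds, their blocking time in this sketch depends on the wired 2PC round alone and not on the slowest wireless link.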

The FT-PPTC Protocol
[Message sequence diagram]
Roles: Initiator, Coordinator, MH-Agent, Mobile Participant, Fixed Participant
Pre-Commit Phase (annotation: Heterogeneity): Begin; Send transaction & own timeouts; Forward fragment; Send corresponding execution fragment; Send timeouts; Forward timeouts; Send updates; Send YES Vote
Core 2PC Phase: Execution fragment; Ack; 2PC Prepare; Vote; Decision; Ack; End; Release resources

FT-PPTC: Failure Resilience Scenarios
[Message sequence diagram]
Roles: Initiator, Coordinator, MH-Agent, Mobile Participant, Fixed Participant
Pre-Commit Phase: Begin; Send transaction and timeouts; Forward fragment; Send corresponding execution fragment; Send timeouts; Forward timeouts; Extend timeouts; Forward extended timeouts; Send updates; Send YES Vote
Annotations: Transient Disconnect.; Permanent failures
Core 2PC Phase: Execution fragment; Ack; 2PC Prepare; Vote; Decision; Ack; End; Release resources
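The timeout-extension scenario can be illustrated with a small deadline tracker. This is a hedged sketch, not the protocol's specification: `TimeoutTracker` and its methods are hypothetical, and it assumes that the deadline is the sum of the execution and shipping timeouts and that an extension simply pushes the deadline back. Under those assumptions a transient disconnection is survivable if an extension arrives in time, while a participant that stays silent past its deadline is treated as failed.

```python
from typing import Optional


class TimeoutTracker:
    """Tracks one mobile participant's deadline (hypothetical helper)."""

    def __init__(self, execution_timeout: float, shipping_timeout: float,
                 start: float = 0.0) -> None:
        # Assumption: the waiting budget is execution time plus shipping time.
        self.deadline = start + execution_timeout + shipping_timeout
        self.extensions = 0  # counts timeout extensions received so far

    def extend(self, extra: float) -> None:
        # Called when an "Extend timeouts" message has been forwarded.
        if extra <= 0:
            raise ValueError("extension must be positive")
        self.deadline += extra
        self.extensions += 1

    def classify(self, now: float, updates_at: Optional[float] = None) -> str:
        # "done": updates arrived in time; "waiting": still within the deadline;
        # "expired": deadline passed without timely updates.
        if updates_at is not None and updates_at <= self.deadline:
            return "done"
        return "waiting" if now <= self.deadline else "expired"
```

For example, a tracker created with timeouts 20 and 10 expires at time 30, but after `extend(15)` it keeps waiting until time 45.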

Comparison to Related Work

Comparison of the message complexity in the failure-free case
(mp: number of mobile participants, sp: number of fixed participants, e: number of timeout extensions):

Protocol | Atomicity | # Phases | Wireless Message Complexity | Overall Message Complexity
FT-PPTC  | strict    | 2        | (2 + 1)mp + e               | (3mp + e) + (4sp + 2mp + e)
M-2PC    |           |          | (2 + 1)mp - 1               | (4mp - 1) + 4sp
TCOT     | semantic  | 1        | 2mp - 1 + e                 | (2mp - 1) + 2sp + e_all
UCM      |           |          | (1 + 1)mp                   | 2mp + 2sp

Coverage of failures:
Failure types compared: Transient MH-failure, Permanent MH-failure, Coordinator failure, Message loss, Delay of messages, Network disconnections
Protocols compared: FT-PPTC (x), M-2PC, TCOT, UCM
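The failure-free message counts from the comparison can be evaluated for concrete sizes with a short helper. The function name `message_counts` is hypothetical; the formulas are copied from the table, with `e_all` kept as a separate parameter because the slides do not define it (it is presumably the total number of timeout extensions over all participants).

```python
from typing import Tuple


def message_counts(protocol: str, mp: int, sp: int,
                   e: int = 0, e_all: int = 0) -> Tuple[int, int]:
    """Return (wireless, overall) message counts in the failure-free case.

    mp: number of mobile participants, sp: number of fixed participants,
    e: number of timeout extensions, e_all: undefined in the slides (assumption).
    """
    if mp < 1 or sp < 0 or e < 0 or e_all < 0:
        raise ValueError("mp must be >= 1; sp, e and e_all must be >= 0")
    if protocol == "FT-PPTC":
        wireless = (2 + 1) * mp + e
        overall = (3 * mp + e) + (4 * sp + 2 * mp + e)
    elif protocol == "M-2PC":
        wireless = (2 + 1) * mp - 1
        overall = (4 * mp - 1) + 4 * sp
    elif protocol == "TCOT":
        wireless = 2 * mp - 1 + e
        overall = (2 * mp - 1) + 2 * sp + e_all
    elif protocol == "UCM":
        wireless = (1 + 1) * mp
        overall = 2 * mp + 2 * sp
    else:
        raise ValueError(f"unknown protocol: {protocol}")
    return wireless, overall
```

With 5 mobile and 4 fixed participants and no timeout extensions, FT-PPTC needs 15 wireless messages against 14 for M-2PC, so its wireless cost differs by a single message while its overall count (41 against 35) is higher because of the extra messages in the fixed network.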

Simulation Settings
Simulator: SimJava 2.0 (University of Edinburgh)
Load model: Each MH initiates a mobile transaction

Parameter                                    | Value
# fixed participants                         | 4
# mobile participants                        | [1, 25]
Average execution time of one fragment (MH)  | 5 ms
Average execution time of one fragment (FH)  | 2 ms
Transmission delay over wireless link        | 10 ms
Transmission delay over wired link           |
Average number of fragments per MT           | 5

Simulation Results
[Plots: Throughput; Resource Blocking Time]
=> FT-PPTC is comparable to 2PC
=> FT-PPTC is scalable

Conclusions and Future Work
- FT-PPTC: Fault-tolerant atomic commit protocol for mobile transactions
  - Decouples commit of mobile participants from fixed participants
    => Minimizes blocking time of resources on the fixed part of the network
    => Provides resilience to both communication and node failures
- Future work
  - Design efficient fault-tolerant recovery mechanisms
  - Comprehensive sensitivity analysis
  - Consider ad-hoc communication between the mobile devices

Questions?