Transis: Dynamic Voting for Consistent Primary Components
PODC 1997 talk slides
Esti Yeger Lotem, Idit Keidar and Danny Dolev, The Hebrew University

Presentation transcript:


Primary Components
Allows one subset of the processes to function when failures occur:
  – Database applications
  – Group communication systems (e.g., ISIS)
Often based on majority (quorum)
  – Difficult to adapt to dynamic changes in the set of participants
  – Problematic in unreliable networks

Dynamic Voting
Defines quorums adaptively: each new quorum is a majority of the previous quorum, e.g., {a,b,c,d,e} → {a,b,c} → {a,b}
Naturally adapts to dynamic changes in the set of participants
In unreliable networks, shown to lead to better performance by simulations, empirical tests, and stochastic analysis
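
The adaptive rule can be illustrated with a minimal Python sketch. The function name `is_majority_of` and the use of plain sets are assumptions made for illustration; they are not taken from the slides. The sketch checks each step of the example chain against its immediate predecessor and shows that the last group would not qualify under a static majority of the original five processes.

```python
def is_majority_of(previous_quorum, candidate):
    """Return True if `candidate` contains more than half of `previous_quorum`.

    Hypothetical helper: the slides state the rule, not this signature.
    """
    previous_quorum, candidate = set(previous_quorum), set(candidate)
    return 2 * len(previous_quorum & candidate) > len(previous_quorum)


# The example chain from the slide: each quorum is judged against the one before it.
chain = [{"a", "b", "c", "d", "e"}, {"a", "b", "c"}, {"a", "b"}]
steps_ok = [is_majority_of(prev, nxt) for prev, nxt in zip(chain, chain[1:])]

# Under a static majority of the original five processes, {a, b} would not qualify.
static_ok = is_majority_of(chain[0], chain[-1])
```

Here `steps_ok` records that every dynamic step is allowed, while `static_ok` records that a fixed majority of the initial group would have blocked the last step.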

Dynamic Linear Voting
Breaks ties between groups of equal size
Uses a linear order, L, on all potential processes in the system
Sub_Quorum(S, T) if:
  – T contains a majority of S, or
  – T contains half the members of S including the member, p, of S with the highest L(p)
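
The Sub_Quorum predicate can be written down directly. The following is a sketch under stated assumptions: the linear order L is represented as a dictionary `rank` from process name to a distinct integer, and the function name `sub_quorum` is hypothetical. "Half" is read as exactly half of S.

```python
def sub_quorum(S, T, rank):
    """Sketch of Sub_Quorum(S, T) with tie-breaking by a linear order.

    `rank` maps each process to a distinct number and stands in for L
    (an assumed representation, not one given in the slides).
    """
    S, T = set(S), set(T)
    common = S & T
    if 2 * len(common) > len(S):
        return True  # T contains a majority of S
    if S and 2 * len(common) == len(S):
        top = max(S, key=lambda p: rank[p])  # member of S with the highest L(p)
        return top in T  # exactly half of S, including the top-ranked member
    return False


rank = {"a": 4, "b": 3, "c": 2, "d": 1}
S = {"a", "b", "c", "d"}
half_with_top = sub_quorum(S, {"a", "b"}, rank)
half_without_top = sub_quorum(S, {"c", "d"}, rank)
```

When S splits into two equal halves, only the half that holds the top-ranked member satisfies the predicate, so two disjoint groups cannot both pass.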

Our Dynamic Voting Protocol for primary components
Consistent
Allows processes to join and leave on the fly
Efficient
  – Low communication overhead
Simple to implement
Robust
  – Processes and links may fail

The Challenge: Coping with failures that occur in the course of the protocol
{a, b, c} attempt to form a quorum
a and b succeed
c detaches, unaware of the attempt

The Challenge (Cont’d)
{a, b} form a quorum
  – majority of {a, b, c}
Concurrently {c, d, e} form a quorum
  – majority of {a, b, c, d, e}
⇒ Inconsistency!
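
The scenario can be replayed in a few lines of Python. This is a sketch of a naive rule, assumed here for illustration, in which each process only checks a candidate group against the last primary it knows of; the names `last_known_primary` and `naive_can_form` are hypothetical.

```python
def is_majority_of(previous_quorum, candidate):
    """True if `candidate` contains more than half of `previous_quorum`."""
    previous_quorum, candidate = set(previous_quorum), set(candidate)
    return 2 * len(previous_quorum & candidate) > len(previous_quorum)


FULL = frozenset("abcde")

# a and b completed the move to {a, b, c}; c, d and e still believe in {a, b, c, d, e}.
last_known_primary = {
    "a": frozenset("abc"),
    "b": frozenset("abc"),
    "c": FULL,
    "d": FULL,
    "e": FULL,
}


def naive_can_form(group):
    """Naive rule: every member checks only its own last known primary."""
    return all(is_majority_of(last_known_primary[p], group) for p in group)


left, right = frozenset("ab"), frozenset("cde")
both_form = naive_can_form(left) and naive_can_form(right)
disjoint = left.isdisjoint(right)
```

Both groups pass the naive check even though they share no member, which is exactly the inconsistency named on the slide.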

Other Protocols
Inconsistent
Two-phase Commit
  – Limits the availability
ISIS - Cold start when primary is lost
Phoenix - Three phase consensus protocol
  – High communication overhead

Our Solution: A Second Level of Knowledge
If a and b succeed in forming {a, b, c} then c is aware of the attempt.
For c, {a, b, c} is ambiguous: {a, b, c} may or may not have been formed.
  – Processes record ambiguous attempts
In our example, c records both: {a, b, c, d, e} and {a, b, c}
  – Requires a majority of both
⇒ c will refuse to form {c, d, e}

A Session of the Protocol: General Scheme
Invoked upon membership changes
1. Exchange information
2. If Sub_Quorum of the last primary and of all ambiguous attempts, Attempt:
  – Record the attempt as ambiguous
3. If all attempted, Form:
  – Become the primary in the system
  – Delete all ambiguous attempts
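
The local bookkeeping behind steps 2 and 3 can be sketched as follows. This is a simplified, single-process view under several assumptions: the class `ProcessState` and its method names are hypothetical, Sub_Quorum is reduced to a plain majority test (tie-breaking omitted), and the message exchange of step 1 and the "all attempted" check of step 3 are left to the caller.

```python
from dataclasses import dataclass, field


def sub_quorum(S, T):
    """Majority-only stand-in for Sub_Quorum (linear tie-breaking omitted)."""
    S, T = set(S), set(T)
    return 2 * len(S & T) > len(S)


@dataclass
class ProcessState:
    """Hypothetical per-process record: last primary plus ambiguous attempts."""
    last_primary: frozenset
    ambiguous: list = field(default_factory=list)

    def may_attempt(self, members):
        # Step 2 condition: Sub_Quorum of the last primary AND of every ambiguous attempt.
        return sub_quorum(self.last_primary, members) and all(
            sub_quorum(a, members) for a in self.ambiguous
        )

    def attempt(self, members):
        # Step 2 action: record the attempt as ambiguous before trying to form it.
        if not self.may_attempt(members):
            return False
        self.ambiguous.append(frozenset(members))
        return True

    def form(self, members):
        # Step 3 action: called only once the caller knows that all members attempted.
        self.last_primary = frozenset(members)
        self.ambiguous.clear()


# c's state in the running example: last primary {a,b,c,d,e}, ambiguous attempt {a,b,c}.
c = ProcessState(frozenset("abcde"), [frozenset("abc")])
refuses_cde = not c.may_attempt(set("cde"))
accepts_abc = c.attempt(set("abc"))
```

With the ambiguous attempt {a, b, c} on record, c rejects {c, d, e} because that group holds only one of the three members of {a, b, c}.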

Storing all the Ambiguous Attempts is not Feasible
When failures cascade, the number of ambiguous attempts may be exponential: {a,b,c,d,e}, {a,b,c}, {a,b,d}, {a,b,e}, {a,b,c,f,g}, {a,b,c,f}, {a,b,d,g}, …
Ambiguous attempts ⇒ constraints
We use a “garbage collection” mechanism to store only a linear number of attempts

Our “Garbage Collection” Mechanism
Resolution rules:
  – If the attempt was formed by some member, adopt it as your primary.
  – If the attempt was not formed by any of the members, delete it.
Learning rules:
  – p learns the status of q w.r.t. attempt A1 during a later attempt A2
Result: only a linear number of attempts is stored
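
The two resolution rules can be expressed as a small decision function. This is a sketch under assumptions: the function name `resolve`, the string results, and the representation of learned statuses as a dictionary (True for "formed", False for "did not form", missing for "not yet learned") are all hypothetical.

```python
def resolve(attempt_members, learned_status):
    """Decide what to do with one ambiguous attempt.

    `learned_status` maps a member to True (it formed the attempt) or
    False (it did not); members not yet heard from are absent.
    """
    statuses = [learned_status.get(m) for m in attempt_members]
    if any(s is True for s in statuses):
        return "adopt"   # formed by some member: adopt it as the primary
    if all(s is False for s in statuses):
        return "delete"  # known not to be formed by any member: drop it
    return "keep"        # still ambiguous: some member's status is unknown


attempt = ["a", "b", "c"]
adopted = resolve(attempt, {"a": True})
deleted = resolve(attempt, {"a": False, "b": False, "c": False})
pending = resolve(attempt, {"a": False, "b": False})
```

The "keep" outcome covers the case where a process has not yet learned the status of every member, so the attempt stays recorded until a later session fills the gap.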

Why is this linear? (In the number of processes)
If p and q participate in two attempts, A1 and A2, then p learned whether q formed A1 before attempting to form A2
Once p learns about A1 from all its members, p can resolve A1
⇒ In each recorded attempt there is a member that does not appear in later attempts

Criticism of Dynamic Voting
Quorums can become very small (even one process)
  – Failure of a single process may cause the rest of the system to block
Desirable solution:
  – Set a threshold, Min_Quorum, on quorum size
  – (N - Min_Quorum) processes are always a quorum
  – Min_Quorum reflects the tradeoff between “static” and “dynamic”

The Challenge
What happens if N (the number of processes) changes on the fly?
  – N - Min_Quorum changes
  – The “truth value” of Sub_Quorum changes (no longer a predicate)
Asynchronous distributed system
  – Different processes may know of different values of N

Adding New Processes Carefully - two steps
W - The set of participating processes
  – may “vote” for quorums
A - Candidates to be added to W
  – do not “vote” for quorums
  – “vote against” large quorums
New processes are added:
  – to A in the attempt step,
  – to W in the form step

The Min_Quorum Requirement
Every quorum must contain more than Min_Quorum members of W
  – at least Min_Quorum “vote for”
Every group that contains all but Min_Quorum members of W and A is a quorum (regardless of past quorums)
  – At most Min_Quorum “vote against”
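
The two conditions can be checked separately, as in the sketch below. The function names are hypothetical, and two readings are assumptions: the lower bound follows the strict wording "more than Min_Quorum members of W", and "all but Min_Quorum members of W and A" is read as missing at most Min_Quorum members of the union of W and A.

```python
def meets_minimum(group, W, min_quorum):
    """Necessary condition: more than Min_Quorum members of W are in the group."""
    return len(set(group) & set(W)) > min_quorum


def guaranteed_quorum(group, W, A, min_quorum):
    """Sufficient condition: at most Min_Quorum members of W and A are missing."""
    missing = (set(W) | set(A)) - set(group)
    return len(missing) <= min_quorum


W = {"a", "b", "c", "d", "e"}  # participating processes
A = {"f"}                      # candidate still waiting to join W
MIN_QUORUM = 2

too_small = meets_minimum({"a", "b"}, W, MIN_QUORUM)
large_ok = guaranteed_quorum({"a", "b", "c", "d"}, W, A, MIN_QUORUM)
with_candidate = guaranteed_quorum({"a", "b", "c"}, W, A, MIN_QUORUM)
without_candidate = guaranteed_quorum({"a", "b", "c"}, W, set(), MIN_QUORUM)
```

The last two values show how a candidate in A "votes against": {a, b, c} is guaranteed to be a quorum when A is empty, but not once f has been added to A.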

Conclusions
Consistently maintaining a primary component using dynamic voting
More available than other protocols
Simple and efficient
No need for cold start
New mechanism: always allowing large groups to be quorums where processes can join on the fly