Synchronizing Processes

Presentation transcript:

Synchronizing Processes
- Clocks
  - External clock synchronization (Cristian)
  - Internal clock synchronization (Gusella & Zatti)
  - Network Time Protocol (Mills)
- Decisions
  - Agreement protocols (Fischer)
- Data
  - Distributed file systems (Satyanarayanan)
- Memory
  - Distributed shared memory (Nitzberg & Lo)
- Schedules
  - Distributed scheduling (Isard et al.)

Agreement Problems
- Require all non-faulty (or correct) processes to come to an agreement
- Three types of problems:
  - Consensus: each process Pi proposes a value vi, and all non-faulty processes agree on a consensus value c
  - Interactive Consistency: each process Pi proposes a value vi, and all non-faulty processes agree on a consensus vector c = <v1, v2, …, vN>
  - Byzantine (Generals or Reliable Broadcast): one process Pg proposes a value vg, and all non-faulty processes agree on a consensus value c (with c = vg when Pg is non-faulty)

Relations Among the Problems
- The interactive consistency problem can be solved with a Byzantine protocol Bz
  - N copies of the Bz protocol are run in parallel, where each processor Pi acts as the commander (Pg) for exactly one copy of the protocol
- The consensus problem can be solved with an interactive consistency protocol
  - The non-faulty processors use the majority vote of the consensus vector as the consensus value
- Therefore the consensus problem can also be solved with a Byzantine protocol Bz
- Hence, a Byzantine protocol can solve all three problems
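
The chain of reductions can be sketched in a few lines of Python. This is only an illustration: `ideal_broadcast` is a hypothetical stand-in for a correct Byzantine protocol Bz (it simply returns the commander's value), and the function names and the `default` fallback are assumptions, not part of the slides.

```python
from collections import Counter

def ideal_broadcast(commander, value):
    """Hypothetical stand-in for Bz: all non-faulty processes agree on `value`."""
    return value

def interactive_consistency(proposals, broadcast):
    """Run one copy of Bz per process; process i is the commander of copy i."""
    return [broadcast(commander=i, value=v) for i, v in enumerate(proposals)]

def consensus(proposals, broadcast, default=None):
    """Take the majority vote of the agreed vector as the consensus value."""
    vector = interactive_consistency(proposals, broadcast)
    value, count = Counter(vector).most_common(1)[0]
    # Require a strict majority; otherwise fall back to an agreed default.
    return value if 2 * count > len(vector) else default
```

Because every non-faulty process ends up with the same vector, each one computes the same majority, which is why agreement on the vector carries over to agreement on a single value.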

The Byzantine Generals Problem [Lamport, Shostak, & Pease, 1982]
- The basic idea is very similar to the consensus problem:
  - Each of N generals has a value v(i) (e.g., “attack” or “retreat”).
- We want an algorithm that allows all generals to exchange their values such that the following hold:
  - All non-faulty generals must agree on the values of v(1), …, v(N).
  - If the i-th general is non-faulty, then the value agreed for v(i) must be the i-th general’s value.

Byzantine Generals Problem
- The problem described earlier can be solved by restricting attention to one commanding general and considering all others to be lieutenants.
- A commanding general must send an order to his N–1 lieutenants, such that:
  - IC1: All loyal lieutenants obey the same order.
  - IC2: If the commander is loyal, then loyal lieutenants obey the order he sends.

Oral Message Algorithm
- Assumptions:
  1. Every message that is sent is delivered correctly
  2. The receiver of a message knows who sent it
  3. The absence of a message can be detected
- Assumptions #1 and #2 prevent a traitor from interfering with the communication between two other generals
- Assumption #3 foils a traitor who tries to prevent a decision by simply not sending messages
- Denoted OM(m), where m is the maximum number of traitors the system can handle

Impossibility Theorem
- If processes can only send unauthenticated messages, more than two thirds of the processes must be non-faulty to derive a solution
- In other words, no solution exists for a system with fewer than 3m + 1 nodes, where m is the number of faulty processes
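
As a quick arithmetic check of the 3m + 1 bound, the helpers below (hypothetical names, written only for illustration) convert between the number of processes and the number of traitors that oral messages can tolerate.

```python
def max_tolerated_traitors(n: int) -> int:
    """Largest m with n >= 3m + 1, i.e. the most traitors OM can handle."""
    if n < 1:
        raise ValueError("need at least one process")
    return (n - 1) // 3

def min_processes(m: int) -> int:
    """Smallest system size that tolerates m traitors with oral messages."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return 3 * m + 1
```

For example, three generals cannot tolerate even one traitor, while four generals can tolerate exactly one.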

Algorithm with Oral Messages
- Algorithm OM(m) (defined recursively) tolerates m traitors.
- Algorithm OM(0):
  1. The commander sends his value to each lieutenant.
  2. Each lieutenant uses the value received from the commander (or “retreat” if no message is received).
- Algorithm OM(m), m > 0:
  1. The commander sends his value to each lieutenant.
  2. Each lieutenant, acting as the commander, uses OM(m–1) to send the value received (take this value to be “retreat” if not received) to the other N–2 lieutenants.
  3. Each lieutenant uses the majority of the values received from the commander and the other lieutenants in the previous two steps.
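
The recursion is easier to follow when simulated. The sketch below is a minimal, single-process simulation of OM(m), not a networked implementation. The traitor behaviour in `send` (telling even-numbered receivers "attack" and odd-numbered receivers "retreat") is an arbitrary assumption chosen to make runs deterministic, and all function names are hypothetical.

```python
from collections import Counter

DEFAULT = "retreat"  # value assumed when no message or no majority

def majority(values):
    """Return the strict-majority value, or DEFAULT if there is none."""
    if not values:
        return DEFAULT
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else DEFAULT

def send(sender, receiver, value, traitors):
    """Assumed traitor model: lie based on the receiver's parity."""
    if sender in traitors:
        return "attack" if receiver % 2 == 0 else "retreat"
    return value

def om(m, commander, lieutenants, value, traitors):
    """Simulate OM(m); return {lieutenant: value it settles on}."""
    # Step 1: the commander sends a value to every lieutenant.
    received = {lt: send(commander, lt, value, traitors) for lt in lieutenants}
    if m == 0:
        return received
    # Step 2: each lieutenant relays what it received using OM(m-1).
    relayed = {lt: [] for lt in lieutenants}
    for j in lieutenants:
        others = [lt for lt in lieutenants if lt != j]
        for lt, v in om(m - 1, j, others, received[j], traitors).items():
            relayed[lt].append(v)
    # Step 3: majority over the direct value and the relayed values.
    return {lt: majority([received[lt]] + relayed[lt]) for lt in lieutenants}
```

With four generals and one traitorous lieutenant, `om(1, 0, [1, 2, 3], "attack", {3})` leaves the loyal lieutenants 1 and 2 on "attack". With only three generals, `om(1, 0, [1, 2], "attack", {2})` leaves lieutenant 1 without a majority, so it falls back to "retreat" even though the loyal commander ordered "attack", which illustrates why N must be at least 3m + 1.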

Intuition
- If the commander is loyal, then he sends the same command to all lieutenants. In this case, the lieutenants all agree on the correct command by majority, as in the example.
- If the commander is a traitor, then he may send different commands to different lieutenants. However, this leaves one fewer traitor among the lieutenants, making it easier to reach agreement among them. (When the commander is a traitor, they can agree on any command.)

Signed Message Algorithm
- Assumptions:
  1. Every message that is sent is delivered correctly
  2. The receiver of a message knows who sent it
  3. The absence of a message can be detected
- Signatures:
  - A loyal general’s signature cannot be forged, and any alteration of the contents of his signed messages can be detected
  - Anyone can verify the authenticity of a general’s signature
- Denoted SM(m), the algorithm can cope with m traitors for any number of generals
  - I.e., it is now possible to tolerate any number of traitors

SM(m) Algorithm
- Initially Vi = { }
- The commander P0 signs and sends his value to every lieutenant
- If lieutenant i receives a message of the form v:0 from the commander, then it adds v to Vi and sends the message v:0:i to every other lieutenant
- If lieutenant i receives a message of the form v:0:j1:…:jk and v is not in Vi, then it adds v to Vi and, if k < m, sends the message v:0:j1:…:jk:i to every lieutenant other than j1, …, jk
- When lieutenant i will receive no more messages, it obeys choice(Vi)
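
A small simulation makes the message flow concrete. The sketch below models a signed message as a value plus a tuple of signer IDs, which stands in for unforgeable signatures. It assumes that traitorous lieutenants simply stay silent (only one of many possible behaviours) and that `choice` falls back to "retreat" unless exactly one order was seen. The names `sm_collect`, `choice` and `DEFAULT` are hypothetical.

```python
DEFAULT = "retreat"  # assumed predetermined value

def choice(orders):
    """One admissible choice function: the single order, else DEFAULT."""
    if len(orders) == 1:
        return next(iter(orders))
    return DEFAULT

def sm_collect(m, n, commander_orders, traitors=frozenset()):
    """Simulate SM(m) for generals 0..n-1 (0 is the commander).

    commander_orders maps each lieutenant to the value the commander
    signs and sends to it. Returns {loyal lieutenant: set V_i}.
    """
    lieutenants = range(1, n)
    V = {i: set() for i in lieutenants if i not in traitors}
    # Each in-flight message: (recipient, value, chain of signers).
    inbox = [(i, v, (0,)) for i, v in commander_orders.items()]
    while inbox:
        outbox = []
        for recipient, value, chain in inbox:
            # Assumption: traitorous lieutenants drop every message.
            if recipient in traitors or value in V[recipient]:
                continue
            V[recipient].add(value)
            k = len(chain) - 1  # number of lieutenant signatures so far
            if k == 0 or k < m:
                signed = chain + (recipient,)
                outbox += [(j, value, signed)
                           for j in lieutenants if j not in signed]
        inbox = outbox
    return V
```

For instance, `sm_collect(1, 3, {1: "attack", 2: "retreat"})` models a traitorous commander who sends conflicting orders: both lieutenants end with the set {"attack", "retreat"}, so they apply `choice` to identical sets and reach the same decision.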

Choice Function
- The choice function is applied to a set of orders to obtain a single one
- Requirements:
  - If the set V consists of a single element v, then choice(V) = v
  - If the set V is empty, then choice(V) = a predetermined value
- Possibilities:
  - choice(V) selects the majority of set V, or a predetermined value if there is no majority
  - choice(V) selects the median of set V, if the elements of V can be ordered

Choice Function
- Basic idea:
  - If the commander is loyal, then all messages will be of the form v:0:w*, because signatures cannot be forged. So all lieutenants end up with Vi = {v}.
  - If the commander is a traitor, then loyal lieutenants can detect it.

SM(m) Example
- [Figure: the commander sends Attack:0 to Lieutenant 1 and Retreat:0 to Lieutenant 2; Lieutenant 1 sends Attack:0:1 to Lieutenant 2; Lieutenant 2 sends Retreat:0:2 to Lieutenant 1]
- After step 3, V1 = V2 = {Attack, Retreat}
- Intuitively, both lieutenants can tell the commander is a traitor
- With no majority, choice would default to Retreat

Synchronizing Processes: Distributed File Systems CS/CE/TE 6378 Advanced Operating Systems

Distributed File Systems
- Distributed File System (DFS): a system that provides access to the same storage for a distributed network of processes
- [Figure: processes sharing common storage]

Benefits of DFSs
- Data sharing is simplified
  - Files appear to be local
  - Users are not required to specify remote servers to access
- User mobility is supported
  - Any workstation in the system can access the storage
- System administration is easier
  - Operations staff can focus on a small number of servers instead of a large number of workstations
- Better security is possible
  - Servers can be physically secured
  - No user programs are executed on the servers
- Site autonomy is improved
  - Workstations can be turned off without disrupting the storage

Design Principles for DFSs
- Utilize workstations when possible
  - Opt to perform operations on workstations rather than servers to improve scalability
- Cache whenever possible
  - This reduces contention on centralized resources and transparently makes data available whenever used
- Exploit file usage characteristics
  - Knowledge of how files are accessed can be used to make better choices
  - Ex: Temporary files are rarely shared, hence can be kept locally
- Minimize system-wide knowledge and change
  - Scalability is enhanced if global information is rarely monitored or updated
- Trust the fewest possible entities
  - Security is improved by trusting a smaller number of processes
- Batch if possible
  - Transferring files in large chunks improves overall throughput

Quiz Question
Which of the following was not a design principle from the Andrew and Coda file systems?
- Cache whenever possible.
- Decentralize operations when possible.
- Minimize system-wide knowledge.
- Trust the most possible entities.

Mechanisms for Building DFSs
- Mount points
  - Enable filename spaces to be “glued” together to provide a single, seamless, hierarchical namespace
- Client caching
  - Contributes the most to better performance in DFSs
- Hints
  - Pieces of information that can substantially improve performance if correct but have no negative consequence if erroneous
  - Ex: Caching mappings of pathname prefixes
- Bulk data transfer
  - Reduces network communication overhead by transferring in bulk
- Encryption
  - Used for remote authentication, either with private or public keys
- Replication
  - Storing the same data on multiple servers increases availability
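
To illustrate how mount points glue namespaces together, the sketch below resolves a path to the server responsible for it by longest-prefix match over a mount table. The table contents, server names and the `resolve` function are hypothetical. A client-side cache of such prefix mappings is the kind of hint mentioned above.

```python
def resolve(path, mount_table):
    """Return (server, path relative to the mount point).

    mount_table maps mount-point prefixes to server names; the
    longest matching prefix wins.
    """
    best = None
    for prefix in mount_table:
        # Match whole path components only ("/afs/u" must not match "/afs/users").
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise FileNotFoundError(path)
    return mount_table[best], path[len(best):].lstrip("/")

# Hypothetical mount table for illustration only.
MOUNTS = {
    "/": "root-server",
    "/afs/users": "server-a",
    "/afs/users/alice": "server-b",
}
```

With this table, `/afs/users/alice/notes.txt` resolves to `server-b`, while `/afs/users/bob/a` resolves to `server-a`, even though both live under one seamless namespace.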

Quiz Question
Which of the following is not an important mechanism for developing distributed file systems?
- Caching data at clients, either entire files or portions of files.
- Encrypting data transmissions, either with private or public keys.
- Read-only data replication for files that change often.
- Transferring data in bulk to reduce communication overheads.

DFS Case Studies
- Two case studies:
  - Andrew (AFS)
  - Coda
- Both were:
  - Developed at Carnegie Mellon University (CMU)
  - A Unix-based DFS
  - Focused on scalability, security, and availability

Andrew File System (AFS)
- Vice is a collection of trusted file servers
- Venus is a service that runs on each workstation to mediate shared file access
- [Figure: Venus workstations connected to Vice servers]

AFS-1
- Used from 1984 through 1985
- Each server contained a local file system mirroring the structure of the shared file system
  - If a file was not on the server, a search would end in a stub directory that identified the server containing the file
- Clients cached pathname prefix information to direct file requests to the appropriate servers
- Venus used a pessimistic approach to maintaining cache coherence
  - All cached file copies were considered suspect
  - Venus would contact Vice to verify that the cached copy was the latest version before accessing the file

AFS-2
- Used from 1985 through 1989
- Venus now used an optimistic approach to maintaining cache coherence
  - All cached files were considered valid
  - Callbacks were used
    - When files are cached on a workstation, the server promises to notify the workstation if the file is to be modified by another machine
- A remote procedure call (RPC) mechanism was used to optimize bulk file transfers
- Mount points and volumes were used instead of stub directories to easily move files around among the servers
  - Each user was normally assigned a volume and a disk quota
- Read-only replication of volumes increased availability
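
The callback mechanism can be sketched with two small in-memory classes. This is an illustrative model only: `Server`, `Client`, and their methods are hypothetical names, whole files are plain strings, and the server breaks callbacks when a new version is stored, which is a simplification of "if the file is to be modified".

```python
class Server:
    """Toy Vice server that tracks callback promises per file."""

    def __init__(self):
        self.files = {}
        self.callbacks = {}  # path -> set of clients holding a callback

    def fetch(self, client, path):
        # Handing out a copy implies a promise to notify this client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, client, path, data):
        self.files[path] = data
        # Break the callbacks held by every other caching client.
        for other in self.callbacks.get(path, set()) - {client}:
            other.break_callback(path)
        self.callbacks[path] = {client}


class Client:
    """Toy Venus cache manager that trusts its cache until notified."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.fetches = 0  # counts round trips to the server

    def read(self, path):
        if path not in self.cache:  # no valid callback: go to the server
            self.cache[path] = self.server.fetch(self, path)
            self.fetches += 1
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        self.server.store(self, path, data)

    def break_callback(self, path):
        self.cache.pop(path, None)  # cached copy is no longer valid
```

Repeated reads of an unchanged file cost a single fetch, which is the saving over the AFS-1 approach of validating the cache on every access.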

Quiz Question
What is a callback?
- A notification from a client that a local cache has been modified.
- A notification from a server that a file or directory is to be modified.
- Both of the above.
- None of the above.

AFS-3
- Used from 1989 through the early 1990s
- Supports multiple administrative cells, each with its own servers, workstations, system admins, and users
  - Each cell is completely autonomous
- Venus now cached files in large chunks instead of in their entirety

Security in Andrew
- Protection domains
  - Each is composed of users and groups
  - Each group is associated with a unique owner (user)
  - A protection server is used to immediately reflect changes in domains
- Authentication
  - Upon login, a user’s password is used to obtain tokens from an authentication server
  - Venus uses these tokens to establish secure RPC connections to the servers
- File system protection
  - Access lists are used to determine access to directories instead of files, including negative rights
- Resource usage
  - Andrew’s protection and authentication mechanisms protect against denials of service and resources
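
A directory access list with negative rights can be evaluated roughly as follows. The sketch assumes that a user's effective rights are the union of the positive rights granted to the user and the user's groups, minus the union of the matching negative rights. The data layout and the name `effective_rights` are hypothetical.

```python
def effective_rights(user, groups, positive, negative):
    """Compute a user's rights on one directory.

    positive / negative map a principal (user or group name) to a
    set of rights; negative entries always override positive ones.
    """
    granted, denied = set(), set()
    for principal in {user} | set(groups):
        granted |= positive.get(principal, set())
        denied |= negative.get(principal, set())
    return granted - denied
```

Negative rights make it possible to revoke access for one member quickly, without first removing that user from every group that grants the right.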

Coda
- A descendant of AFS-2
- Substantially more resilient to server and network failures
  - By relying entirely on local resources (caches) when the servers are inaccessible
- Allows a user to continue working regardless of failures elsewhere in the system

Coda Overview
- Clients cache entire files on their local disks
- Cache coherence is maintained by the use of callbacks
- Clients dynamically find files on servers and cache location information
- Token-based authentication and end-to-end encryption are used for security
- Provides failure resiliency through two mechanisms:
  - Server replication: storing copies of files on multiple servers
  - Disconnected operation: mode of optimistic execution in which the client relies solely on cached data

Server Replication
- Replicated Volume: consists of several physical volumes or replicas that are managed as one logical volume by the system
- Volume Storage Group (VSG): a set of servers maintaining a replicated volume
- Accessible VSG (AVSG): the set of servers currently accessible
  - Venus performs periodic probes to detect AVSGs
  - One member is designated as the preferred server

Quiz Question
What is a VSG?
- Venus Service Group
- Vice Server Group
- Volume Storage Group
- None of the above

Server Replication
- Venus employs a Read-One, Write-All strategy
- For a read request:
  - If a locally cached copy exists, Venus will read it instead of contacting the VSG
  - If a locally cached copy does not exist, Venus will contact the preferred server for its copy
  - Venus will also contact the other AVSG members for their version numbers
  - If the preferred server’s version is stale, a new, up-to-date preferred server is selected from the AVSG and the fetch is repeated

Server Replication
- Venus employs a Read-One, Write-All strategy
- For a write:
  - When a file is closed, it is transferred to all members of the AVSG
  - If the server’s copy does not conflict with the client’s copy, an update operation handles transferring file contents, making directory entries, and changing access lists
  - A data structure called the update set, which summarizes the client’s knowledge of which servers did not have conflicts, is distributed to the servers
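
The read and write paths can be sketched together. This toy model uses a single integer version per file, which is an assumption made for brevity (the slides only say "version numbers"), and it omits conflict detection and the update set. `Replica`, `read`, and `write` are hypothetical names.

```python
class Replica:
    """Toy server holding {path: (version, data)}."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def version(self, path):
        return self.store.get(path, (0, None))[0]


def read(path, avsg, preferred):
    """Read-one: fetch from the preferred server, check the others' versions."""
    newest = max(avsg, key=lambda r: r.version(path))
    if newest.version(path) > preferred.version(path):
        preferred = newest  # preferred copy was stale: switch and refetch
    return preferred.store[path][1], preferred


def write(path, data, avsg):
    """Write-all: on close, push the file to every accessible replica."""
    new_version = max(r.version(path) for r in avsg) + 1
    for r in avsg:
        r.store[path] = (new_version, data)
    return new_version
```

A replica that was unreachable during a write keeps its old version, so a later read that happens to prefer it detects the staleness from the other members' version numbers and switches servers.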

Disconnected Operation
- Begins at a client when no member of a VSG is accessible
- Clients are allowed to rely solely on local caches
  - If the file is not in the cache, the system call that triggered the file access is aborted
- Disconnected operation ends when Venus establishes a connection with the VSG
  - Venus executes a series of update processes to reintegrate the client with the VSG

Disconnected Operation
- Reintegration updates can fail for two reasons:
  - There may be no authentication tokens that Venus can use to communicate securely with AVSG members due to token expirations
  - Conflicts may be detected
- If reintegration fails, a temporary repository is created on the servers to store the data in question until a user can resolve the problem later
  - These temporary repositories are called covolumes
  - Migrate is the operation that transfers a file or directory from a workstation to a covolume

Conflict Resolution
- When a conflict is detected, Coda first attempts to resolve it automatically
  - Ex: partitioned creation of uniquely named files in the same directory can be handled automatically by selectively replaying the missing file creates
- If automated resolution is not possible, Coda marks all accessible replicas inconsistent and moves them to their covolumes
- Coda provides a repair tool to assist users in manually resolving conflicts
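
The automatic case from the example above can be sketched as a directory merge. The model is deliberately simple: each replica of a directory is a dict from file name to a file identifier, and only the "uniquely named creates" case is resolved automatically. The function name and data layout are hypothetical.

```python
def resolve_directory(replica_a, replica_b):
    """Merge two partitioned replicas of one directory.

    Returns (merged, conflicts): names created in only one partition
    are replayed into the merged directory; names bound to different
    files in the two partitions are reported for manual repair.
    """
    merged, conflicts = {}, []
    for name in sorted(set(replica_a) | set(replica_b)):
        in_a, in_b = name in replica_a, name in replica_b
        if in_a and in_b and replica_a[name] != replica_b[name]:
            conflicts.append(name)  # same name, different files
        else:
            # Present in one replica, or identical in both: replay it.
            merged[name] = replica_a[name] if in_a else replica_b[name]
    return merged, conflicts
```

Anything left in `conflicts` corresponds to the case where the replicas are marked inconsistent and a user has to step in with the repair tool.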

Quiz Question
Which of the following is not true about conflict resolution in the Coda DFS?
- Coda attempts to resolve conflicts by recreating any missing files in a directory.
- Coda inspects workstations for the most up-to-date cache of the conflicted file.
- For file-level conflicts, Coda marks all replicas as inconsistent and moves them to a covolume.
- Users manually resolve file-level conflicts using a provided repair tool.