Synchronizing Processes

Clocks
- External clock synchronization (Cristian)
- Internal clock synchronization (Gusella & Zatti)
- Network Time Protocol (Mills)
Decisions
- Agreement protocols (Fischer)
Data
- Distributed file systems (Satyanarayanan)
Memory
- Distributed shared memory (Nitzberg & Lo)
Schedules
- Distributed scheduling (Isard et al.)

Agreement Problems
Agreement problems require all non-faulty (or correct) processes to come to an agreement. There are three types of problems:
- Consensus: each process Pi proposes a value vi, and all non-faulty processes agree on a consensus value c.
- Interactive Consistency: each process Pi proposes a value vi, and all non-faulty processes agree on a consensus vector c = <v1, v2, …, vN>.
- Byzantine (Generals or Reliable Broadcast): one process Pg proposes a value vg, and all non-faulty processes agree on a consensus value c, where c = vg if Pg is non-faulty.

Relations Among the Problems
- The interactive consistency problem can be solved with a Byzantine protocol Bz:
  - N copies of the Bz protocol are run in parallel, where each processor Pi acts as the commander (Pg) for exactly one copy of the protocol.
- The consensus problem can be solved with an interactive consistency protocol:
  - The non-faulty processors use the majority vote of the consensus vector as the consensus value.
- Therefore, the consensus problem can also be solved with a Byzantine protocol Bz.
- Hence, a Byzantine protocol can solve all three problems.

The Byzantine Generals Problem
[Lamport, Shostak, & Pease, 1982]
The basic idea is very similar to the consensus problem:
- Each of N generals has a value v(i) (e.g., “attack” or “retreat”).
- We want an algorithm that lets all generals exchange their values such that the following hold:
  1. All non-faulty generals agree on the values of v(1), …, v(N).
  2. If the i-th general is non-faulty, then the value agreed for v(i) is the i-th general’s value.

Byzantine Generals Problem
- The problem described earlier can be solved by restricting attention to one commanding general and treating all others as lieutenants.
- A commanding general must send an order to his N–1 lieutenants such that:
  - IC1: All loyal lieutenants obey the same order.
  - IC2: If the commander is loyal, then every loyal lieutenant obeys the order he sends.

Oral Message Algorithm
Assumptions:
1. Every message that is sent is delivered correctly.
2. The receiver of a message knows who sent it.
3. The absence of a message can be detected.
Consequences:
- Assumptions #1 and #2 prevent a traitor from interfering with the communication between two other generals.
- Assumption #3 foils a traitor who tries to prevent a decision by simply not sending messages.
The algorithm is denoted OM(m), where m is the maximum number of traitors the system can handle.

Impossibility Theorem
- If processes can only send unauthenticated messages, more than two thirds of the processes must be non-faulty for a solution to exist.
- In other words, no solution exists for a system with fewer than 3m + 1 nodes, where m is the number of faulty processes.
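The 3m + 1 bound can be turned into simple arithmetic. The helper names below are illustrative only and do not come from the slides.

```python
def min_generals(m: int) -> int:
    """Smallest system size for which oral messages can tolerate m traitors."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return 3 * m + 1


def max_tolerable_traitors(n: int) -> int:
    """Largest m satisfying n >= 3m + 1 for a system of n generals."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3
```

For instance, three generals cannot tolerate even one traitor, while four generals can tolerate exactly one.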

Algorithm with Oral Messages
Algorithm OM(m), defined recursively, tolerates m traitors.
Algorithm OM(0):
1. The commander sends his value to each lieutenant.
2. Each lieutenant uses the value received from the commander (or “retreat” if no message is received).
Algorithm OM(m), m > 0:
1. The commander sends his value to each lieutenant.
2. Each lieutenant uses OM(m–1) to send the value received (taken to be “retreat” if none was received) to the other N–2 lieutenants.
3. Each lieutenant uses the majority of the values received from the commander and the other lieutenants in the previous two steps.

Intuition
- If the commander is loyal, then he sends the same command to all lieutenants. In this case, the loyal lieutenants all agree on the correct command by majority vote.
- If the commander is a traitor, then he may send different commands to different lieutenants. However, this leaves one fewer traitor among the lieutenants, making it easier to reach agreement among them. (When the commander is a traitor, the loyal lieutenants may agree on any command.)
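The recursion in OM(m) can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not a reference implementation: generals are integers, `traitors` is a set of general ids, and the lying rule in `send` (a traitor tells even-numbered receivers "attack" and odd-numbered receivers "retreat") is a hypothetical behaviour chosen only to make the run deterministic.

```python
from collections import Counter

DEFAULT = "retreat"  # value assumed when there is no strict majority


def majority(values):
    """Return the strict-majority value, or DEFAULT if none exists."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else DEFAULT


def send(sender, receiver, value, traitors):
    """Value the receiver gets; a traitor lies based on receiver parity (assumed)."""
    if sender in traitors:
        return "attack" if receiver % 2 == 0 else "retreat"
    return value


def om(m, commander, lieutenants, value, traitors):
    """Run OM(m); return {lieutenant: value it settles on for this commander}."""
    # Step 1: the commander sends his value to every lieutenant.
    received = {i: send(commander, i, value, traitors) for i in lieutenants}
    if m == 0:
        return received
    # Step 2: each lieutenant j acts as commander in OM(m-1) to relay its value.
    relayed = {
        j: om(m - 1, j, [i for i in lieutenants if i != j], received[j], traitors)
        for j in lieutenants
    }
    # Step 3: each lieutenant takes the majority of everything it heard.
    decisions = {}
    for i in lieutenants:
        votes = [received[i]] + [relayed[j][i] for j in lieutenants if j != i]
        decisions[i] = majority(votes)
    return decisions
```

With four generals and m = 1, `om(1, 0, [1, 2, 3], "attack", {3})` leaves loyal lieutenants 1 and 2 on "attack", and a traitorous commander (`traitors={0}`) still leaves all three lieutenants in agreement. With only three generals, `om(1, 0, [1, 2], "attack", {2})` makes loyal lieutenant 1 fall back to "retreat" even though the loyal commander said "attack", which illustrates why fewer than 3m + 1 nodes is not enough.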

Signed Message Algorithm
Assumptions:
1. Every message that is sent is delivered correctly.
2. The receiver of a message knows who sent it.
3. The absence of a message can be detected.
4. Signatures:
  - A loyal general’s signature cannot be forged, and any alteration of the contents of his signed messages can be detected.
  - Anyone can verify the authenticity of a general’s signature.
The algorithm is denoted SM(m); it can cope with m traitors for any number of generals. That is, it is now possible to tolerate any number of traitors.

SM(m) Algorithm
Initially Vi = { }.
1. The commander P0 signs and sends his value to every lieutenant.
2. If lieutenant i receives a message of the form v:0 from the commander, then it adds v to Vi and sends the message v:0:i to every other lieutenant.
3. If lieutenant i receives a message of the form v:0:j1:…:jk and v is not in Vi, then it adds v to Vi and, if k < m, sends the message v:0:j1:…:jk:i to every lieutenant other than j1, …, jk.
4. When lieutenant i will receive no more messages, it obeys choice(Vi).
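The message flow of SM(m) can be sketched as a queue-driven simulation. Several things here are assumptions made for illustration: a signed message is modelled as a value plus a tuple of signer ids (so forging is impossible by construction), traitorous lieutenants simply stay silent, m is at least 1, and `choice` is the simplest function meeting the stated requirements.

```python
from collections import deque


def choice(values, default="retreat"):
    """A single order is obeyed as-is; anything else yields the default."""
    return next(iter(values)) if len(values) == 1 else default


def sm(m, n, commander_orders, traitors=frozenset()):
    """Simulate SM(m) with commander 0 and lieutenants 1..n-1.

    commander_orders maps each lieutenant to the value the commander signs
    and sends to it (a loyal commander sends everyone the same value).
    Returns {loyal lieutenant: order it obeys}.
    """
    V = {i: set() for i in range(1, n)}
    # Each queue entry is (receiver, value, signature chain starting with 0).
    queue = deque((i, v, (0,)) for i, v in commander_orders.items())
    while queue:
        i, v, chain = queue.popleft()
        if i in traitors or v in V[i]:
            continue  # silent traitor, or an order this lieutenant already has
        V[i].add(v)
        k = len(chain) - 1  # number of lieutenant signatures on the message
        if k < m:
            for r in range(1, n):
                if r != i and r not in chain:
                    queue.append((r, v, chain + (i,)))
    return {i: choice(V[i]) for i in V if i not in traitors}
```

Calling `sm(1, 3, {1: "attack", 2: "retreat"}, traitors={0})` models a traitorous commander with two loyal lieutenants: both collect {attack, retreat} and both fall back to "retreat", so they still agree.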

Choice Function
The choice function is applied to a set of orders to obtain a single one.
Requirements:
- If the set V consists of a single element v, then choice(V) = v.
- If the set V is empty, then choice(V) = a predetermined value.
Possibilities:
- choice(V) selects the majority of set V, or a predetermined value if there is no majority.
- choice(V) selects the median of set V, if the elements of V can be ordered.
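Both possibilities can be sketched in a few lines. The function names and default values are assumptions; the median variant uses the lower median so that the result is always an element of V.

```python
from collections import Counter
import statistics


def choice_majority(orders, default="retreat"):
    """Strict-majority order, or the predetermined default if there is none."""
    orders = list(orders)
    if not orders:
        return default
    value, count = Counter(orders).most_common(1)[0]
    return value if 2 * count > len(orders) else default


def choice_median(orders, default=0):
    """Lower median of an orderable collection, or the default if it is empty."""
    orders = list(orders)
    if not orders:
        return default
    return statistics.median_low(orders)
```

Both variants satisfy the two requirements: a single-element set returns that element, and an empty set returns the predetermined value.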

Choice Function: Basic Idea
- If the commander is loyal, then all messages will be of the form v:0:w* (signatures cannot be forged), so all loyal lieutenants end up with Vi = {v}.
- If the commander is a traitor, then the loyal lieutenants can detect it.

SM(m) Example
[Figure: the commander sends Attack:0 to Lieutenant 1 and Retreat:0 to Lieutenant 2; Lieutenant 1 sends Attack:0:1 to Lieutenant 2, and Lieutenant 2 sends Retreat:0:2 to Lieutenant 1.]
- After step 3, V1 = V2 = {Attack, Retreat}.
- Intuitively, both lieutenants can tell the commander is a traitor.
- With no majority, choice would default to Retreat.

Synchronizing Processes: Distributed File Systems
CS/CE/TE Advanced Operating Systems

Distributed File Systems
Distributed File System (DFS): a system that provides access to the same (common) storage for a distributed network of processes.

Benefits of DFSs
- Data sharing is simplified
  - Files appear to be local
  - Users are not required to specify remote servers to access
- User mobility is supported
  - Any workstation in the system can access the storage
- System administration is easier
  - Operations staff can focus on a small number of servers instead of a large number of workstations
- Better security is possible
  - Servers can be physically secured
  - No user programs are executed on the servers
- Site autonomy is improved
  - Workstations can be turned off without disrupting the storage

Design Principles for DFSs
- Utilize workstations when possible
  - Opt to perform operations on workstations rather than servers to improve scalability
- Cache whenever possible
  - This reduces contention on centralized resources and transparently makes data available wherever it is used
- Exploit file usage characteristics
  - Knowledge of how files are accessed can be used to make better choices
  - Ex: Temporary files are rarely shared, hence can be kept locally
- Minimize system-wide knowledge and change
  - Scalability is enhanced if global information is rarely monitored or updated
- Trust the fewest possible entities
  - Security is improved by trusting a smaller number of processes
- Batch if possible
  - Transferring files in large chunks improves overall throughput

Quiz Question
Which of the following was not a design principle from the Andrew and Coda file systems?
- Cache whenever possible.
- Decentralize operations when possible.
- Minimize system-wide knowledge.
- Trust the most possible entities.

Mechanisms for Building DFSs
- Mount points
  - Enable file namespaces to be “glued” together to provide a single, seamless, hierarchical namespace
- Client caching
  - Contributes the most to better performance in DFSs
- Hints
  - Pieces of information that can substantially improve performance if correct but have no negative consequence if erroneous
  - Ex: Caching mappings of pathname prefixes
- Bulk data transfer
  - Reduces network communication overhead by transferring in bulk
- Encryption
  - Used for remote authentication, either with private or public keys
- Replication
  - Storing the same data on multiple servers increases availability

Quiz Question
Which of the following is not an important mechanism for developing distributed file systems?
- Caching data at clients, either entire files or portions of files.
- Encrypting data transmissions, either with private or public keys.
- Read-only data replication for files that change often.
- Transferring data in bulk to reduce communication overheads.

DFS Case Studies
Two case studies:
- Andrew (AFS)
- Coda
Both were:
- Developed at Carnegie Mellon University (CMU)
- Unix-based DFSs
- Focused on scalability, security, and availability

Andrew File System (AFS)
- Vice is a collection of trusted file servers.
- Venus is a service that runs on each workstation to mediate shared file access.
[Figure: Venus workstations connected to Vice servers]

AFS-1
- Used from 1984 through 1985
- Each server contained a local file system mirroring the structure of the shared file system
  - If a file was not on the server, a search would end in a stub directory that identified the server containing the file
- Clients cached pathname prefix information to direct file requests to the appropriate servers
- Venus used a pessimistic approach to maintaining cache coherence
  - All cached file copies were considered suspect
  - Venus would contact Vice to verify that the cached copy was the latest version before accessing the file

AFS-2
- Used from 1985 through 1989
- Venus now used an optimistic approach to maintaining cache coherence
  - All cached files were considered valid
  - Callbacks were used: when a file is cached on a workstation, the server promises to notify the workstation if the file is to be modified by another machine
- A remote procedure call (RPC) mechanism was used to optimize bulk file transfers
- Mount points and volumes were used instead of stub directories to easily move files around among the servers
  - Each user was normally assigned a volume and a disk quota
- Read-only replication of volumes increased availability
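The callback mechanism can be sketched with two toy classes. This is an in-memory illustration under stated assumptions, not the AFS implementation: the `Server` and `Client` classes, their method names, and the `fetches` counter are all hypothetical, and a broken callback is modelled by dropping the cached copy.

```python
class Server:
    """Stands in for a Vice server: stores files and remembers callback promises."""

    def __init__(self):
        self.files = {}      # file name -> contents
        self.callbacks = {}  # file name -> set of clients promised a notification

    def fetch(self, client, name):
        # Handing out a copy comes with a promise to notify this client.
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, client, name, data):
        self.files[name] = data
        # Notify every other client holding a callback on this file.
        for other in self.callbacks.get(name, set()) - {client}:
            other.break_callback(name)
        self.callbacks[name] = {client}


class Client:
    """Stands in for Venus: trusts its cache until a callback is broken."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.fetches = 0  # counts round trips to the server

    def read(self, name):
        if name not in self.cache:  # optimistic: a cached copy is assumed valid
            self.cache[name] = self.server.fetch(self, name)
            self.fetches += 1
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)

    def break_callback(self, name):
        self.cache.pop(name, None)  # the next read must refetch
```

Repeated reads of an unchanged file cost one fetch in total; a write by another client invalidates the copy, so the next read goes back to the server.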

Quiz Question
What is a callback?
- A notification from a client that a local cache has been modified.
- A notification from a server that a file or directory is to be modified.
- Both of the above.
- None of the above.

AFS-3
- Used from 1989 through the early 1990s
- Supports multiple administrative cells, each with its own servers, workstations, system admins, and users
  - Each cell is completely autonomous
- Venus now cached files in large chunks instead of in their entirety

Security in Andrew
- Protection domains
  - Each is composed of users and groups
  - Each group is associated with a unique owner (user)
  - A protection server is used to immediately reflect changes in domains
- Authentication
  - Upon login, a user’s password is used to obtain tokens from an authentication server
  - Venus uses these tokens to establish RPC connections
- File system protection
  - Access lists, which may include negative rights, control access to directories rather than individual files
- Resource usage
  - Andrew’s protection and authentication mechanisms guard against denial of service and denial of resources
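One way to picture access lists with negative rights is as two tables per directory. The sketch below assumes that a user's effective rights are the union of the positive rights of the user and of the user's groups, minus the union of the matching negative rights; the function name, the table layout, and the right names are illustrative only.

```python
def effective_rights(user, groups_of, positive, negative):
    """Rights a user holds on one directory.

    groups_of: user -> set of groups the user belongs to
    positive:  principal (user or group) -> set of rights granted
    negative:  principal (user or group) -> set of rights denied
    """
    principals = {user} | groups_of.get(user, set())
    granted = set().union(*(positive.get(p, set()) for p in principals))
    denied = set().union(*(negative.get(p, set()) for p in principals))
    return granted - denied  # negative rights override positive ones (assumed)
```

For example, if group `staff` is granted read and write but user `alice` has a negative write entry, `alice` ends up with read only while other staff members keep both rights.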

Coda
- A descendant of AFS-2
- Substantially more resilient to server and network failures
  - By relying entirely on local resources (caches) when the servers are inaccessible
- Allows a user to continue working regardless of failures elsewhere in the system

Coda Overview
- Clients cache entire files on their local disks
- Cache coherence is maintained by the use of callbacks
- Clients dynamically find files on servers and cache location information
- Token-based authentication and end-to-end encryption are used for security
- Provides failure resiliency through two mechanisms:
  - Server replication: storing copies of files on multiple servers
  - Disconnected operation: a mode of optimistic execution in which the client relies solely on cached data

Server Replication
- Replicated Volume: consists of several physical volumes, or replicas, that are managed as one logical volume by the system
- Volume Storage Group (VSG): a set of servers maintaining a replicated volume
- Accessible VSG (AVSG): the set of servers currently accessible
  - Venus performs periodic probes to detect AVSGs
  - One member is designated as the preferred server

Quiz Question
What is a VSG?
- Venus Service Group
- Vice Server Group
- Volume Storage Group
- None of the above

Server Replication
Venus employs a Read-One, Write-All strategy. For a read request:
- If a local cache exists, Venus will read the cache instead of contacting the VSG
- If a local cache does not exist, Venus will contact the preferred server for its copy
  - Venus will also contact the other AVSG members for their version numbers
  - If the preferred version is stale, a new, up-to-date preferred server is selected from the AVSG and the fetch is repeated
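The read path can be sketched as follows. This is a simplification under stated assumptions: each server is modelled as a dictionary mapping a file name to a `(version, data)` pair, versions are plain integers rather than whatever structure Coda actually uses, and every AVSG member is assumed to hold a copy of the file.

```python
def read(name, cache, avsg, preferred):
    """Return (data, preferred server used), following the read-one strategy.

    cache:     client-side dict, file name -> data
    avsg:      server name -> {file name: (version, data)}
    preferred: name of the currently preferred server
    """
    if name in cache:
        return cache[name], preferred  # cache hit: no server is contacted
    # Collect version numbers from every accessible server.
    newest = max(avsg, key=lambda server: avsg[server][name][0])
    if avsg[newest][name][0] > avsg[preferred][name][0]:
        preferred = newest  # preferred copy was stale: switch and refetch
    version, data = avsg[preferred][name]
    cache[name] = data
    return data, preferred
```

If the preferred server holds version 1 while another AVSG member holds version 2, the call switches the preferred server and returns the newer data; later reads are served from the cache.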

Server Replication
Venus employs a Read-One, Write-All strategy. For a write:
- When a file is closed, it is transferred to all members of the AVSG
- If the server’s copy does not conflict with the client’s copy, an update operation handles transferring file contents, making directory entries, and changing access lists
- A data structure called the update set, which summarizes the client’s knowledge of which servers did not have conflicts, is distributed to the servers
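The write-all step and the update set can be sketched in the same toy model. The assumptions are loud ones: servers are dictionaries of `(version, data)` pairs, versions are plain integers, and a "conflict" simply means the server's version differs from the version the client started from.

```python
def close_file(name, data, base_version, avsg):
    """Push a closed file to every AVSG member; return the update set.

    base_version: version of the copy the client originally fetched
    avsg:         server name -> {file name: (version, data)}
    The returned set names the servers that accepted the update without conflict.
    """
    update_set = set()
    for server, store in avsg.items():
        version, _ = store.get(name, (0, None))
        if version == base_version:  # server copy matches the client's base copy
            store[name] = (base_version + 1, data)
            update_set.add(server)
    return update_set
```

A server whose copy has moved on independently is left untouched and is absent from the returned update set, which is the information the slide says is distributed to the servers.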

Disconnected Operation
- Begins at a client when no member of a VSG is accessible
  - Clients are allowed to rely solely on local caches
  - If a cache does not exist, the system call that triggered the file access is aborted
- Ends when Venus re-establishes a connection with the VSG
  - Venus executes a series of update processes to reintegrate the client with the VSG

Disconnected Operation
- Reintegration updates can fail for two reasons:
  - There may be no authentication tokens that Venus can use to communicate securely with AVSG members, due to token expiration
  - Conflicts may be detected
- If reintegration fails, a temporary repository is created on the servers to store the data in question until a user can resolve the problem later
  - These temporary repositories are called covolumes
  - Migrate is the operation that transfers a file or directory from a workstation to a covolume

Conflict Resolution
- When a conflict is detected, Coda first attempts to resolve it automatically
  - Ex: partitioned creation of uniquely named files in the same directory can be handled automatically by selectively replaying the missing file creates
- If automated resolution is not possible, Coda marks all accessible replicas inconsistent and moves them to their covolumes
- Coda provides a repair tool to assist users in manually resolving conflicts
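The automatic case, partitioned creation of uniquely named files, can be sketched as a directory merge. This is an illustration under assumptions: each replica's directory is a dictionary from file name to a file identifier, and the function name and return convention are hypothetical.

```python
def resolve_directory(replicas):
    """Try to merge directory replicas that diverged during a partition.

    replicas: server name -> {file name: file id}
    Returns True if the replicas were reconciled automatically, or False
    (leaving them untouched) if the same name refers to different files.
    """
    merged = {}
    for listing in replicas.values():
        for name, file_id in listing.items():
            if merged.setdefault(name, file_id) != file_id:
                return False  # true conflict: needs the manual repair path
    for listing in replicas.values():
        listing.update(merged)  # replay the creates this replica missed
    return True
```

Two replicas that each gained a differently named file end up with both entries, whereas two replicas that bound the same name to different files are left as they are for manual repair.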

Quiz Question
Which of the following is not true about conflict resolution in the Coda DFS?
- Coda attempts to resolve conflicts by recreating any missing files in a directory.
- Coda inspects workstations for the most up-to-date cache of the conflicted file.
- For file-level conflicts, Coda marks all replicas as inconsistent and moves them to a covolume.
- Users manually resolve file-level conflicts using a provided repair tool.
