Synchronization
Clock Synchronization
In a centralized system time is unambiguous. In a distributed system, agreement on time is not obvious: when each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time.
Physical Clocks
A computer clock is really a timer: a quartz crystal oscillating under tension, plus two registers, a counter and a holding register. When the counter gets to zero, an interrupt (clock tick) is generated and the counter is reloaded from the holding register. With n clocks there are n slightly different rates (clock skew). UTC (Universal Coordinated Time) is the current universal time standard; it can be received by radio pulse (about ±10 msec) or by satellite (about ±0.5 msec).
Clock Synchronization Algorithms
A clock with maximum drift rate r satisfies 1 – r ≤ dC/dt ≤ 1 + r. If two clocks drift from UTC in opposite directions, after a time Δt they may be as much as ε = 2rΔt apart. To guarantee that no two clocks ever differ by more than d (the maximum tolerated time difference), they must resynchronize at least every d/2r. This is the relation between clock time and UTC when clocks tick at different rates.
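As a worked example of the resynchronization bound above (the numeric values are assumed for illustration, not from the text):

```python
# Worked example of the resynchronization bound: with maximum drift rate r,
# two clocks drifting in opposite directions can differ by 2*r*dt after dt
# seconds, so keeping them within d seconds requires resyncing every d/(2*r).
r = 1e-5          # assumed drift rate: 10 ppm, a typical crystal
d = 1e-3          # maximum tolerated difference: 1 ms
interval = d / (2 * r)
print(interval)   # 50.0 -- resynchronize at least every 50 seconds
```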
Cristian's Algorithm: synchronization with a time server
Periodically (at least every d/2r seconds) each machine sends a synchronization request to the time server. Time must never run backward, so the clock is adjusted gradually. Message propagation time is non-zero: with the request sent at T0, the reply received at T1, and server handling time I, the one-way delay is estimated as (T1 – T0 – I)/2.
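The offset estimate can be sketched as follows; `request_server_time` is a hypothetical helper standing in for the real network exchange, and `fake_server` is a stand-in that answers a fixed time.

```python
import time

def cristian_sync(request_server_time):
    """Sketch of Cristian's algorithm. request_server_time() is an assumed
    helper returning (server_time, server_handling_time I)."""
    t0 = time.monotonic()                        # T0: request sent
    server_time, handling = request_server_time()
    t1 = time.monotonic()                        # T1: reply received
    # one-way propagation estimated as (T1 - T0 - I) / 2
    propagation = (t1 - t0 - handling) / 2
    return server_time + propagation             # the client's new clock value

# Fake server: always answers "100.0 s" with negligible handling time.
def fake_server():
    return 100.0, 0.0

print(cristian_sync(fake_server))   # approximately 100.0
```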
The Berkeley Algorithm: synchronization without a time server
The time daemon asks all the other machines for their clock values, the machines answer, and the time daemon tells everyone how to adjust their clock to the computed average. It works even when no Universal Coordinated Time source is available.
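The averaging step can be sketched in a few lines; `berkeley_adjustments` is a hypothetical name, and the example clock values are illustrative.

```python
# Sketch of the Berkeley averaging step: the time daemon polls the others,
# averages all reported clock values (including its own), and sends each
# machine the adjustment to apply -- not the absolute time.
def berkeley_adjustments(daemon_time, reported_times):
    clocks = [daemon_time] + reported_times
    average = sum(clocks) / len(clocks)
    return [average - t for t in clocks]   # per-machine corrections

# The daemon reads 180 s; the other two machines report 170 s and 205 s.
print(berkeley_adjustments(180, [170, 205]))   # [5.0, 15.0, -20.0]
```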
Logical Clocks
Centralized algorithms have disadvantages; decentralized algorithms can use averaging methods. NTP (Network Time Protocol) provides an accuracy of 1–50 msec using advanced algorithms. For many purposes, however, it is sufficient that all machines agree on the same time: often processes only need to agree on the order in which events occur.
Lamport Timestamps
Each event a is assigned a time value C(a) such that:
if a happens before b (a → b), then C(a) < C(b);
if a is the sending of a message and b its receipt, then C(a) < C(b);
on receipt, if the local clock would give C(b) ≤ C(a), the receiver sets C(b) = C(a) + 1.
We obtain a total ordering of all events in the system, but Lamport timestamps do not capture causality.
Vector Timestamps
Each process Pi has a vector Vi such that:
Vi[i] is the number of events that have occurred so far at Pi;
if Vi[j] = k, then Pi knows that k events have occurred at Pj.
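Both clock schemes can be sketched in Python; `LamportClock` and `vector_receive` are hypothetical names, and the example values are made up.

```python
class LamportClock:
    """Minimal sketch of Lamport's rules: increment on each local event,
    stamp outgoing messages, and on receive jump past the sender's stamp."""
    def __init__(self):
        self.time = 0

    def tick(self):                 # any local event
        self.time += 1
        return self.time

    def send(self):                 # timestamp carried by the message
        return self.tick()

    def receive(self, msg_time):    # ensure C(receive) > C(send)
        self.time = max(self.time, msg_time) + 1
        return self.time

def vector_receive(v_local, v_msg, i):
    """Vector-timestamp receive at process Pi: component-wise max,
    then count the receive event at Pi itself."""
    merged = [max(a, b) for a, b in zip(v_local, v_msg)]
    merged[i] += 1
    return merged

a, b = LamportClock(), LamportClock()
t_send = a.send()                               # a's clock becomes 1
b.time = 5
print(b.receive(t_send))                        # 6: already ahead, just ticks
print(vector_receive([2, 0, 1], [1, 3, 1], 0))  # [3, 3, 1]
```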
Global State
The global state is the local state of each process, together with the messages currently in transit in the distributed system. Knowing the state a distributed system is currently in is useful; a distributed snapshot reflects a consistent global state.
Any process P may initiate the algorithm by recording its local state; P then sends a marker along each of its outgoing channels.
Organization of a process and channels for a distributed snapshot.
Global State
When a process Q receives a marker for the first time, it records its local state and sends a marker along each of its outgoing channels; from then on, Q records all incoming messages. When Q receives a marker on an incoming channel, it finishes recording the state of that channel. A process has finished its part of the algorithm when it has received a marker along each of its incoming channels and processed each one; its recorded state can then be collected. Many snapshots may be in progress at the same time.
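The marker rules above can be simulated single-threaded; this is a hypothetical sketch in which FIFO deques stand in for network channels and an integer counter stands in for real local state.

```python
from collections import deque

MARKER = "MARKER"

class Proc:
    """Sketch of the distributed-snapshot marker rules (assumed names)."""
    def __init__(self, name):
        self.name = name
        self.state = 0
        self.inbox = {}             # sender name -> FIFO channel (deque)
        self.peers = {}             # name -> Proc (outgoing channels)
        self.snap_state = None      # recorded local state
        self.snap_channels = {}     # sender -> messages recorded in transit
        self.open_channels = set()  # incoming channels still being recorded

    def connect(self, other):       # create one outgoing channel to `other`
        self.peers[other.name] = other
        other.inbox[self.name] = deque()

    def send(self, to, msg):
        self.peers[to].inbox[self.name].append(msg)

    def record(self):               # record local state, flood markers
        self.snap_state = self.state
        self.open_channels = set(self.inbox)
        self.snap_channels = {s: [] for s in self.inbox}
        for to in self.peers:
            self.send(to, MARKER)

    def step(self, sender):         # consume one message from a channel
        msg = self.inbox[sender].popleft()
        if msg == MARKER:
            if self.snap_state is None:
                self.record()                   # first marker seen
            self.open_channels.discard(sender)  # this channel's state is done
        else:
            self.state += msg                   # ordinary application work
            if sender in self.open_channels:
                self.snap_channels[sender].append(msg)  # caught in transit

p, q = Proc("P"), Proc("Q")
p.connect(q); q.connect(p)
p.record()          # P initiates: records state 0, marker goes to Q
q.send("P", 5)      # Q, not yet recording, sends an application message
q.step("P")         # Q sees P's marker: records state 0, marker back to P
p.step("Q")         # P consumes the 5 (queued before Q's marker): in transit
p.step("Q")         # P consumes Q's marker: channel Q->P recording done
print(p.snap_state, p.snap_channels)   # 0 {'Q': [5]}
print(q.snap_state, q.snap_channels)   # 0 {'P': []}
```

The snapshot is consistent: both recorded states are 0 and the 5 is captured as in transit on the Q→P channel.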
Election Algorithms: The Bully Algorithm (1/2)
Selecting a coordinator with the bully election algorithm:
Process 4 holds an election.
Processes 5 and 6 respond, telling 4 to stop.
Now 5 and 6 each hold an election.
The Bully Algorithm (2/2)
Process 6 tells 5 to stop.
Process 6 wins and tells everyone.
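The scenario above can be sketched as a recursion over process numbers; `hold_election` is a hypothetical name and the message exchange is abstracted away.

```python
# Sketch of the bully election: a process holds an election by messaging
# every process with a higher number; if nobody answers it wins, otherwise
# a responder takes over and holds its own election.
def hold_election(initiator, alive, n):
    """alive is the set of live process ids among 0..n-1.
    Returns the elected coordinator's id."""
    higher = [p for p in range(initiator + 1, n) if p in alive]
    if not higher:
        return initiator                 # no bigger process answers: it wins
    # each responder bullies the initiator and holds its own election;
    # the recursion bottoms out at the highest live process
    return hold_election(max(higher), alive, n)

# Processes 0..7; 7 (the old coordinator) has crashed and 4 notices.
print(hold_election(4, alive={0, 1, 2, 3, 4, 5, 6}, n=8))   # 6
```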
A Ring Algorithm
Each process knows who its successor is.
Election algorithm using a ring (without token).
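The circulating ELECTION message can be sketched as a loop; `ring_election` is a hypothetical name and the ring layout is an assumed example.

```python
# Sketch of the ring election: the ELECTION message circulates past each
# live process, which appends its own id; one full circle later the list
# is complete and the highest id becomes the coordinator.
def ring_election(ring, start, alive):
    """ring: process ids in ring (successor) order; start: initiator index."""
    n = len(ring)
    candidates = []
    i = start
    while True:
        pid = ring[i]
        if pid in alive:
            if candidates and pid == ring[start]:
                break                    # message came all the way around
            candidates.append(pid)
        i = (i + 1) % n                  # dead successors are skipped
    return max(candidates)               # carried in the COORDINATOR message

# Assumed ring 0-1-...-7; process 7 is down; process 5 starts the election.
print(ring_election(list(range(8)), start=5, alive={0, 1, 2, 3, 4, 5, 6}))  # 6
```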
Mutual Exclusion: critical regions in distributed systems
A Centralized Algorithm (simulates a single-processor system; needs a coordinator)
Process 1 asks the coordinator for permission to enter a critical region; permission is granted.
Process 2 then asks permission to enter the same critical region; the coordinator does not reply.
When process 1 exits the critical region, it tells the coordinator, which then replies to 2.
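The coordinator's bookkeeping is just a holder plus a queue; this is a sketch with hypothetical names, abstracting the actual message passing.

```python
from collections import deque

class Coordinator:
    """Sketch of the centralized mutual-exclusion coordinator:
    grant if the region is free, otherwise queue the request."""
    def __init__(self):
        self.holder = None
        self.queue = deque()

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "OK"          # permission granted
        self.queue.append(pid)
        return None              # no reply: the requester blocks

    def release(self, pid):
        assert self.holder == pid
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder       # who gets the OK next, if anyone

c = Coordinator()
print(c.request(1))   # OK   -- process 1 enters
print(c.request(2))   # None -- process 2 blocks, queued
print(c.release(1))   # 2    -- coordinator now replies OK to process 2
```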
A Distributed Algorithm
Requires a total ordering of all events in the system. A request message contains the critical region name, the process number, and the current (logical) time.
Two processes (0 and 2) want to enter the same critical region at the same moment. Process 0 has the lowest timestamp, so it wins and enters the critical region. When process 0 is done, it sends an OK as well, so 2 can now enter the critical region.
This algorithm is worse than the centralized one (n points of failure, poor scaling, multiple messages…).
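The heart of the algorithm is the rule for answering another process's request; this sketch (hypothetical names, with process ids breaking timestamp ties) shows only that decision.

```python
# Sketch of the request-handling rule in the distributed algorithm: on
# receiving a request for a region it also wants, a process compares
# (timestamp, process id) pairs and the lower pair wins.
def handle_request(my_state, my_ts, my_id, req_ts, req_id):
    """my_state: 'RELEASED', 'WANTED', or 'HELD'."""
    if my_state == "RELEASED":
        return "OK"              # not interested: grant at once
    if my_state == "HELD":
        return "DEFER"           # inside the region: queue the reply
    # both WANTED: the lowest (timestamp, id) pair wins the tie
    return "OK" if (req_ts, req_id) < (my_ts, my_id) else "DEFER"

# Processes 0 and 2 request at the same moment; 0 has the lower timestamp.
print(handle_request("WANTED", my_ts=8, my_id=2, req_ts=6, req_id=0))  # OK
print(handle_request("WANTED", my_ts=6, my_id=0, req_ts=8, req_id=2))  # DEFER
```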
A Token Ring Algorithm
Start from an unordered group of processes on a network; a logical, ordered ring is constructed in software, and each process knows who is next in line. When a process acquires the token, it accesses the critical region (if needed), then passes the token to its successor.
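One trip of the token can be sketched as a loop over the ring; `circulate` and the ring layout are assumed for illustration.

```python
# Sketch of one circulation of the token: each process may enter the
# critical region only while holding the token, then passes it on.
def circulate(ring, wants_region, enter):
    """ring: process ids in ring order; wants_region: ids needing the
    critical region this round; enter(pid) performs the critical work."""
    for pid in ring:                 # the token visits each process in order
        if pid in wants_region:
            enter(pid)               # safe: only the token holder may enter
        # at the end of the loop body the token passes to the successor

order = []
circulate([2, 4, 9, 7, 1, 5, 8, 6, 3, 0], {9, 1, 3}, order.append)
print(order)   # [9, 1, 3] -- entries happen in ring order, not request order
```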
A comparison of the three mutual exclusion algorithms:

Algorithm     Messages per entry/exit   Delay before entry* (in message times)   Problems
Centralized   3                         2                                        Coordinator crash
Distributed   2 (n – 1)                 2 (n – 1)                                Crash of any process
Token ring    1 to ∞                    0 to n – 1                               Lost token, process crash

* Depending on the mechanism
The Transaction Model
A transaction guarantees that either a whole set of related instructions completes successfully, or none of it does. Special primitives are required (supplied by the system or the language):

Primitive           Description
BEGIN_TRANSACTION   Mark the start of a transaction
END_TRANSACTION     Terminate the transaction and try to commit
ABORT_TRANSACTION   Kill the transaction and restore the old values
READ                Read data from a file, a table, or otherwise
WRITE               Write data to a file, a table, or otherwise

Examples of primitives for transactions.
The Transaction Model
Transactions are ACID:
Atomic: to the outside world, they happen indivisibly;
Consistent: they do not violate system invariants;
Isolated: concurrent transactions do not interfere with each other;
Durable: once committed, the changes are permanent.
These flat transactions have some limitations: no partial result is permitted, and in a large system they can take a lot of time.
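The primitives map directly onto what a database engine offers; a minimal sketch with Python's built-in sqlite3 module, where ABORT_TRANSACTION corresponds to rollback() (the account table and the overdraft rule are invented for the example).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (name TEXT, balance INT)")
db.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
db.commit()

try:                                            # BEGIN_TRANSACTION (implicit)
    db.execute("UPDATE account SET balance = balance - 150 WHERE name = 'a'")
    db.execute("UPDATE account SET balance = balance + 150 WHERE name = 'b'")
    (bal,) = db.execute(
        "SELECT balance FROM account WHERE name = 'a'").fetchone()
    if bal < 0:
        raise ValueError("overdraft")           # would violate an invariant
    db.commit()                                 # END_TRANSACTION
except Exception:
    db.rollback()                               # ABORT_TRANSACTION: old values

print(db.execute("SELECT balance FROM account ORDER BY name").fetchall())
# [(100,), (0,)] -- atomic: neither of the two updates survived
```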
Nested and Distributed Transactions
A nested transaction follows a logical (hierarchical) division of the work. A distributed transaction is logically flat but operates on distributed data.
How to implement transactions? Private Workspace
Copying everything into a private workspace is prohibitive, so only what changes is copied. Example, dealing with a file system:
the file index and disk blocks for a three-block file;
the situation after a transaction has modified block 0 and appended block 3;
after committing.
Write-ahead Log
Files are actually modified in place, but before each change a record is written to a log.

x = 0; y = 0;
BEGIN_TRANSACTION;
  x = x + 1;      Log: [x = 0/1]
  y = y + 2;      Log: [y = 0/2]
  x = y * y;      Log: [x = 1/4]
END_TRANSACTION;

(a) A transaction; (b)–(d) the log before each statement is executed. If the transaction aborts, the log is used to roll back to the old values. In distributed systems, each machine keeps its own log of changes to its local file system.
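The log discipline above can be sketched directly; `wal_set` and `rollback` are hypothetical names, and a dict stands in for the file being modified in place.

```python
# Sketch of a write-ahead log: every assignment first appends an
# [old/new] record, so an abort can roll the values back in reverse order.
log = []

def wal_set(vars, name, value):
    log.append((name, vars[name], value))   # [name = old / new], logged first
    vars[name] = value                      # then modified in place

def rollback(vars):
    for name, old, _new in reversed(log):   # undo in reverse order
        vars[name] = old
    log.clear()

v = {"x": 0, "y": 0}
wal_set(v, "x", v["x"] + 1)       # log: [x = 0/1]
wal_set(v, "y", v["y"] + 2)       # log: [y = 0/2]
wal_set(v, "x", v["y"] * v["y"])  # log: [x = 1/4]
print(v)        # {'x': 4, 'y': 2}
rollback(v)     # ABORT_TRANSACTION
print(v)        # {'x': 0, 'y': 0}
```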
Concurrency Control
Concurrency control allows several transactions to be executed simultaneously while leaving the data in a consistent state, by making transactions access the data in a specific order.
General (layered) organization of managers for handling transactions.
Concurrency Control
General organization of managers for handling distributed transactions.
Serializability
(a)–(c) Three transactions T1, T2, and T3:

(a) BEGIN_TRANSACTION; x = 0; x = x + 1; END_TRANSACTION;
(b) BEGIN_TRANSACTION; x = 0; x = x + 2; END_TRANSACTION;
(c) BEGIN_TRANSACTION; x = 0; x = x + 3; END_TRANSACTION;

(d) Possible schedules:

Schedule 1:  x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 2:  x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 3:  x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;   Illegal

A legal, serialized result is x = 1, 2, or 3, depending upon which transaction runs last. Synchronization can be achieved either with mutual exclusion (locking) or with explicit timestamps, following a pessimistic or an optimistic approach.
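A simplified final-state check of the schedules above can be sketched by comparing each interleaving against every serial execution (this checks only final values, a simplification of full serializability; the names are hypothetical):

```python
from itertools import permutations

def run(ops):
    """Execute a schedule: a list of functions, each mapping old x to new x."""
    x = 0
    for op in ops:
        x = op(x)
    return x

T1 = [lambda x: 0, lambda x: x + 1]
T2 = [lambda x: 0, lambda x: x + 2]
T3 = [lambda x: 0, lambda x: x + 3]

# Every serial order gives x = 1, 2, or 3: whichever transaction runs last.
serial_results = {run([op for t in perm for op in t])
                  for perm in permutations([T1, T2, T3])}
print(serial_results)                      # {1, 2, 3}

schedule1 = T1 + T2 + T3                                  # fully serial
schedule3 = [T1[0], T2[0], T1[1], T3[0], T2[1], T3[1]]    # interleaved
print(run(schedule1) in serial_results)    # True  -- legal
print(run(schedule3) in serial_results)    # False -- illegal: x ends at 5
```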