Distributed Systems Lecture 5: Time and synchronization
Previous lecture
Failure detection
– Approaches
– Metrics
Gossip protocols
Motivation
You want both audio and video to be synchronized when watching a movie.
You want to catch the 6:05 train but your clock is off by 7 minutes.
– What if it is slow by 7 minutes?
– What if it is fast by 7 minutes?
Motivation: cloud airline reservation system
– Server A receives a client request to purchase the last ticket on flight ABC 123.
– Server A timestamps the purchase using its local clock (9h:15m:32.45s) and logs it. It replies OK to the client.
– That was the last seat. Server A sends a message to Server B saying “flight full.”
– B enters “Flight ABC 123 full” plus its local clock value (which reads 9h:10m:10.11s) into its log.
– Server C queries A’s and B’s logs. It is confused that a client purchased a ticket after the flight became full, and may execute incorrect or unfair actions.
Synchronization
– At any time, the values of all non-faulty clocks must be approximately equal (within Δ max).
– There is a small bound on the amount by which a non-faulty process's clock is changed during each resynchronization.
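The first condition can be checked mechanically. The sketch below is only an illustration: the function name `is_synchronized` and the sample readings are hypothetical, and it assumes the faulty clocks have already been excluded from the list.

```python
def is_synchronized(clock_values, delta_max):
    """Return True if every pair of readings differs by at most delta_max.

    Comparing the largest and smallest reading is enough, because every
    other pairwise difference is smaller than that one.
    """
    return max(clock_values) - min(clock_values) <= delta_max


# Hypothetical readings (in seconds) taken from three non-faulty clocks.
readings = [100.002, 100.010, 99.998]

within_15_ms = is_synchronized(readings, delta_max=0.015)  # spread is about 12 ms
within_5_ms = is_synchronized(readings, delta_max=0.005)
print(within_15_ms, within_5_ms)
```

The same readings satisfy a 15 ms bound but violate a 5 ms bound, which is why Δ max has to be chosen with the application in mind.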
System definition
– A set of N distributed processes
– Every process has a local physical clock
– No direct access to a shared global clock
– Communication between processes is message based
Applications
– At-most-once message delivery
– Cache consistency
– Active replication
– Medium access control
– GPS
– Global System for Mobile Communications (2nd generation)
Time division multiple access
Requirement:
– Real-time communication using a shared medium
Problem:
– Collisions can arbitrarily delay messages
Solution:
– Synchronize clocks to determine access slots
Time division multiple access
Synchronize clocks to determine access slots
1. Frame-based data flow
2. Divide frames into slots
3. Scheduler assigns processes to slots
4. Clock synchronization: collision-free schedule execution
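The four steps can be sketched as a slot lookup driven by the local clock. This is a minimal sketch under assumptions, not a real MAC implementation: the names `slot_owner`, `frame_ms` and `schedule`, the 100 ms frame, and the equal-length slots are all hypothetical.

```python
def slot_owner(now_ms, frame_ms, schedule):
    """Return the process that owns the slot containing local time now_ms.

    The frame is divided into len(schedule) equal slots (assumed to divide
    frame_ms evenly), and schedule[i] names the process assigned to slot i.
    """
    slot_ms = frame_ms // len(schedule)
    slot_index = (now_ms % frame_ms) // slot_ms
    return schedule[slot_index]


schedule = ["P0", "P1", "P2", "P3"]  # scheduler's assignment, one process per slot
frame_ms = 100                       # four slots of 25 ms each

true_time_ms = 120
# A process with a correct clock sees offset 20 ms into the frame: slot 0.
owner = slot_owner(true_time_ms, frame_ms, schedule)
# P1's clock runs 10 ms fast, so it sees offset 30 ms: slot 1, its own slot.
skewed_view = slot_owner(true_time_ms + 10, frame_ms, schedule)
print(owner, skewed_view)
```

With a 10 ms skew, P1 believes its slot has started while P0 is still transmitting, so the schedule is collision free only if the clocks agree to within a tolerance smaller than the guard time between slots.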
Problem: unstable clocks
Quartz oscillators oscillate at slightly different frequencies, so clocks drift apart
Keeping time Oscillator – Pendulum – Quartz crystal CMOS – Microwave In a computer
Clock drifts in distributed systems
Clock resynchronization
Should we set back time?
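One common answer is to avoid stepping a clock backwards and instead let it run slower until the error is absorbed; the Berkeley slide later in this lecture notes that a negative correction is usually applied this way. The sketch below illustrates the idea with a hypothetical `SlewedClock` class. The slew rate of 0.5 is chosen only to keep the numbers readable; real systems slew far more gently.

```python
class SlewedClock:
    """Software clock that absorbs corrections gradually instead of jumping."""

    def __init__(self, start, max_slew=0.5):
        self.value = start
        self.pending = 0.0        # correction not yet applied
        self.max_slew = max_slew  # fraction of elapsed time that may be added or withheld (< 1)

    def adjust(self, correction):
        """Schedule a correction; negative values mean the clock is ahead."""
        self.pending += correction

    def tick(self, elapsed):
        """Advance by `elapsed` real time, applying a bounded part of the correction."""
        limit = self.max_slew * elapsed
        step = max(-limit, min(limit, self.pending))
        self.pending -= step
        self.value += elapsed + step  # always > 0 because max_slew < 1
        return self.value


clock = SlewedClock(start=10.0)
clock.adjust(-2.0)  # the clock is 2 time units fast
readings = [clock.tick(1.0) for _ in range(6)]
print(readings)
```

The clock advances at half speed for four ticks and then resumes its normal rate, so the 2-unit error disappears without any reading ever being smaller than the one before it.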
Synchronization algorithms
Internal vs. external synchronization
– External: synchronize within Δ max w.r.t. an external time reference
– Internal: synchronize within Δ max w.r.t. the other system members
Externally synchronized clocks are also internally synchronized: each clock is within Δ max of the reference, so any two clocks are within 2Δ max of each other. The converse is not true.
Hardware vs. software synchronization
Hardware (assisted) clock
– Precise (e.g., phase locking)
– Expensive (extra hardware needed), e.g., attach GPS receivers to every machine
Software clock
– Less precise
– More flexible
– Cheap
Simple method
– Issue an RPC to a server to query the time
– Does not account for network or processing latency
Cristian’s algorithm
1989 – Flaviu Cristian
Probabilistic algorithm
– Only achieves synchronization if RTT << required accuracy
Used in low-latency intranets
Algorithm
1. P requests the time from S, which is connected to a UTC time source
2. S timestamps its reply with its current time T and responds immediately
3. P measures the round-trip time RTT and sets its clock to T + RTT/2
If min is the minimum time for a one-way transmission, then the time at S when P receives the reply lies in [T + min, T + RTT - min]
Therefore the accuracy of the algorithm is ±(RTT/2 - min)
If the RTT is long, repeat the request
– Take the reply with the smallest RTT
– If RTTs are non-uniform, take a weighted average to compute the error
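The steps above can be sketched in a few lines. This is a minimal sketch rather than a full client: `request_time` stands in for the RPC to S, and the names `cristian_round`, `best_round`, `local_clock` and `min_one_way` are hypothetical. A fake clock and a fake server are used so the numbers are reproducible.

```python
import time


def cristian_round(request_time, local_clock=time.monotonic, min_one_way=0.0):
    """Run one request/reply exchange and return (estimate, accuracy, rtt).

    request_time: callable that performs the request and returns the
                  server's timestamp T (an assumed stand-in for the RPC).
    """
    sent = local_clock()
    server_time = request_time()       # T, stamped by S just before replying
    received = local_clock()
    rtt = received - sent
    estimate = server_time + rtt / 2   # value P would set its clock to
    accuracy = rtt / 2 - min_one_way   # half-width of [T + min, T + RTT - min]
    return estimate, accuracy, rtt


def best_round(request_time, attempts=3, **options):
    """Repeat the exchange and keep the round with the smallest RTT."""
    rounds = [cristian_round(request_time, **options) for _ in range(attempts)]
    return min(rounds, key=lambda r: r[2])


# Fake local clock: three rounds with RTTs of 0.5, 0.125 and 0.25 seconds.
fake_clock = iter([0.0, 0.5, 1.0, 1.125, 2.0, 2.25])
estimate, accuracy, rtt = best_round(
    lambda: 500.0,                     # fake server that always reports T = 500
    attempts=3,
    local_clock=lambda: next(fake_clock),
    min_one_way=0.03125,
)
print(estimate, accuracy, rtt)
```

The round with the 0.125 s RTT wins, giving an estimate of T + 0.0625 with an accuracy of ±0.03125 s. The longer rounds would have produced wider error intervals, which is the reason for repeating the request.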
Berkeley algorithm
1989 – Gusella and Zatti
Assumes no machine has an accurate time source
For intranets
Algorithm
1. Pick a master S using an election process
2. S polls the other processes (the Ps), which reply with their time
3. S observes the RTTs and estimates its own and the Ps’ times
4. S averages the clock times, ignoring outliers
5. S sends out an individual adjustment (+/-) to each P
– The method cancels out individual clock drifts
– Initial experiments showed clocks synchronized to within 20-25 ms
– P usually applies a negative correction over a period of time by slowing down its clock, so as not to break the monotonic time property
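Steps 4 and 5 can be illustrated with a small calculation. The sketch below is a simplification under stated assumptions: the readings are taken to be already compensated for RTT, and an outlier is defined as any reading further than `max_deviation` from the median, which is a rule chosen for this example and not necessarily the one used in the original system. The function name and the sample values are hypothetical.

```python
from statistics import mean, median


def berkeley_adjustments(readings, max_deviation):
    """Return the adjustment each machine should apply to its clock.

    readings: dict mapping machine name to its estimated clock value,
              including the master's own clock.
    """
    reference = median(readings.values())
    # Ignore outliers when averaging (assumed rule: distance from the median).
    trusted = [t for t in readings.values() if abs(t - reference) <= max_deviation]
    average = mean(trusted)
    # Every machine, outliers included, is told how far it is from the average.
    return {name: average - t for name, t in readings.items()}


# Clock values in minutes since midnight: 3:00, 3:25, 2:50 and a faulty 10:00.
readings = {"S": 180, "P1": 205, "P2": 170, "P3": 600}
adjustments = berkeley_adjustments(readings, max_deviation=60)
print(adjustments)
```

The faulty clock P3 is left out of the average (185, i.e. 3:05) but still receives a correction. P1 is told to move back by 20 minutes, which it would normally do by slowing its clock rather than stepping it.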
Network Time Protocol (NTP)
1985 – one of the oldest Internet protocols in use
Coordinates participating computers to within a few ms of UTC
– Tens of ms over the Internet
– <1 ms in LANs
Current version is NTPv4
NTP
Uses a network of time servers to synchronize all processes on a network.
Time servers are connected by a synchronization subnet tree. The root is in touch with UTC.
Each node synchronizes its child nodes.
– Stratum 1: primary server, synchronized directly to the UTC source
– Stratum 2: secondary servers, synchronized by the primary server
– Stratum 3: servers synchronized by the secondary servers
A typical client will poll 3 or more servers to sync its time
Time offset and round-trip delay:
[Figure: messages m and m' exchanged between Server A and Server B, with timestamps T_{i-3}, T_{i-2}, T_{i-1}, T_i]
– t and t': actual transmission times for m and m' (unknown)
– o: true offset of the clock at B relative to the clock at A
– o_i: estimate of the actual offset between the two clocks
– d_i: estimate of the accuracy of o_i; total transmission time for m and m'; d_i = t + t'
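The estimates o_i and d_i can be computed from the four timestamps alone. The sketch below assumes the usual textbook reading of the figure, which the extracted slide text does not state explicitly: A sends m at T_{i-3}, B receives it at T_{i-2}, B sends m' at T_{i-1}, and A receives it at T_i, with each timestamp taken on the local clock of the machine that records it. Under that assumption T_{i-2} = T_{i-3} + t + o and T_i = T_{i-1} + t' - o. The function name and sample values are hypothetical.

```python
def ntp_offset_delay(t_i3, t_i2, t_i1, t_i):
    """Return (o_i, d_i) for one message pair.

    t_i3: A sends m      (A's clock)
    t_i2: B receives m   (B's clock)
    t_i1: B sends m'     (B's clock)
    t_i:  A receives m'  (A's clock)
    """
    a = t_i2 - t_i3        # equals t + o
    b = t_i1 - t_i         # equals o - t'
    offset = (a + b) / 2   # o_i = o + (t - t') / 2
    delay = a - b          # d_i = t + t'
    return offset, delay


# Hypothetical exchange: true offset o = 5, t = 2, t' = 4, B takes 1 unit to reply.
offset, delay = ntp_offset_delay(t_i3=100, t_i2=107, t_i1=108, t_i=107)
print(offset, delay)
```

Here o_i = 4 and d_i = 6, and the true offset 5 lies inside [o_i - d_i/2, o_i + d_i/2] = [1, 7]. The estimate is exact only when t = t', and B's processing time between receiving m and sending m' does not count towards d_i.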
NTP accuracy problem
Obtain an accurate offset from a suitable sample population
Solution: minimum filter
– Order the m readings according to RTT
– Select the reading with the lowest RTT
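The minimum filter amounts to a sort on the delay. The sketch below assumes each reading is an (offset, RTT) pair; the function name and the sample values are hypothetical.

```python
def minimum_filter(samples):
    """Return the (offset, rtt) reading with the lowest RTT.

    samples: list of (offset, rtt) pairs from recent exchanges.
    """
    ordered = sorted(samples, key=lambda sample: sample[1])  # order by RTT
    return ordered[0]                                        # lowest RTT first


# Hypothetical readings in milliseconds: (offset estimate, round-trip time).
samples = [(4.0, 6.0), (5.2, 2.5), (3.1, 9.0)]
best_offset, best_rtt = minimum_filter(samples)
print(best_offset, best_rtt)
```

The reading with the 2.5 ms RTT is selected because a shorter round trip leaves less room for asymmetry between the two transmission times, so its offset estimate has the tightest error bound.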
Next lecture Global states and snapshots