Clock-RSM: Low-Latency Inter-Datacenter State Machine Replication Using Loosely Synchronized Physical Clocks
Jiaqing Du, Daniele Sciascia, Sameh Elnikety, Willy Zwaenepoel, Fernando Pedone
EPFL, University of Lugano, Microsoft Research
Replicated State Machines (RSM)
Strong consistency
– Execute same commands in same order
– Reach same state from same initial state
Fault tolerance
– Store data at multiple replicas
– Failure masking / fast failover
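To make the consistency property concrete, here is a minimal sketch (illustrative names, not the paper's code) of a deterministic key-value state machine: any two replicas that apply the same commands in the same order from the same initial state end up identical.

```python
class KVStateMachine:
    """Deterministic key-value store: identical command sequences
    applied from the same initial state yield identical states."""

    def __init__(self):
        self.state = {}

    def apply(self, cmd):
        # cmd is ("put", key, value) or ("get", key); both deterministic.
        if cmd[0] == "put":
            _, key, value = cmd
            self.state[key] = value
            return "OK"
        elif cmd[0] == "get":
            return self.state.get(cmd[1])

# Two replicas applying the same log converge to the same state.
log = [("put", "x", 1), ("put", "y", 2), ("put", "x", 3)]
r1, r2 = KVStateMachine(), KVStateMachine()
for c in log:
    r1.apply(c)
    r2.apply(c)
assert r1.state == r2.state == {"x": 3, "y": 2}
```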
Geo-Replication
[Figure: replicas placed in data centers around the world]
– High latency among replicas
– Messaging dominates replication latency
Leader-Based Protocols
– Order commands by a leader replica
– Require extra ordering messages at followers
[Figure: a follower forwards the client request to the leader for ordering; replication follows ordering as a separate step]
High latency for geo-replication
Clock-RSM
– Orders commands using physical clocks
– Overlaps ordering and replication
[Figure: the replica receiving the client request performs ordering and replication in one combined step]
Low latency for geo-replication
Outline
– Clock-RSM
– Comparison with Paxos
– Evaluation
– Conclusion
Properties and Assumptions
– Provides linearizability
– Tolerates failure of a minority of replicas
Assumptions
– Asynchronous FIFO channels
– Non-Byzantine faults
– Loosely synchronized physical clocks
Protocol Overview
– Each replica timestamps incoming commands with its physical clock: cmd.ts = Clock()
– All replicas log and execute commands in timestamp order
[Figure: two replicas receive client requests concurrently, assign cmd1.ts and cmd2.ts from their local clocks, and replicate both commands to all replicas, acknowledged by PrepOK messages]
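The ordering idea is compact in code. A minimal sketch, assuming a hypothetical clock_ms() wrapper over the local physical clock: the replica that receives a command timestamps it itself, and all replicas compare commands by (timestamp, replica id), so there is no leader and no separate ordering round before replication starts.

```python
import time
from dataclasses import dataclass, field

def clock_ms():
    # Loosely synchronized physical clock (e.g., NTP-disciplined),
    # in milliseconds; name and granularity are illustrative.
    return int(time.time() * 1000)

@dataclass(order=True)
class Command:
    ts: int                            # physical-clock timestamp
    rid: int                           # issuing replica's id, breaks ties
    op: tuple = field(compare=False)   # the state machine operation

def on_client_request(op, my_rid):
    # The receiving replica orders the command itself: ordering and
    # replication can then overlap instead of running sequentially.
    return Command(ts=clock_ms(), rid=my_rid, op=op)
```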
Major Message Steps
– Prep: ask every replica to log a command
– PrepOK: tell every replica after logging a command
[Figure: R0 assigns cmd1.ts = 24 and broadcasts Prep; the other replicas log cmd1 and broadcast PrepOK; concurrently R4 receives another request and assigns cmd2.ts = 23]
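A sketch of the two message handlers, with assumed helper names (replica.broadcast, replica.log, replica.acks are illustrative, not the paper's interfaces): on Prep a replica logs the command in timestamp order and tells everyone; PrepOKs are tallied per command to detect majority replication.

```python
def on_prep(replica, cmd):
    # Prep: the sender asks everyone to log cmd.
    replica.log.append(cmd)
    replica.log.sort(key=lambda c: (c.ts, c.rid))  # keep timestamp order
    replica.broadcast(("PrepOK", cmd.ts, cmd.rid, replica.rid))

def on_prep_ok(replica, cmd_ts, cmd_rid, sender):
    # PrepOK: sender has logged the command identified by (ts, rid).
    replica.acks.setdefault((cmd_ts, cmd_rid), set()).add(sender)
```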
Commit Conditions
A command is committed if
– It is replicated by a majority
– All commands ordered before it are committed
Wait until three conditions hold (sketched in code below):
– C1: Majority replication
– C2: Stable order
– C3: Prefix replication
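The three conditions translate directly into a commit check. A minimal sketch with hypothetical bookkeeping (acks filled in by PrepOK handling, latest_ts tracking the greatest timestamp seen from each peer, and commit_index counting the committed log prefix); none of these names come from the paper.

```python
def is_committed(cmd, replica):
    """Clock-RSM commit check for cmd at this replica (sketch)."""
    majority = replica.n // 2 + 1

    # C1: majority replication -- a majority of replicas logged cmd.
    if len(replica.acks.get((cmd.ts, cmd.rid), set())) < majority:
        return False

    # C2: stable order -- every other replica has sent a timestamp
    # greater than cmd.ts (via Prep, PrepOK, or ClockTime messages),
    # so no command ordered before cmd can still arrive.
    if any(ts <= cmd.ts for peer, ts in replica.latest_ts.items()
           if peer != replica.rid):
        return False

    # C3: prefix replication -- every command ordered before cmd is
    # itself committed (and hence replicated by a majority).
    pos = replica.log.index(cmd)
    return replica.commit_index >= pos
```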
C1: Majority Replication
– More than half of the replicas log cmd1
[Figure: R0 broadcasts Prep for cmd1 (ts = 24) and collects PrepOKs; cmd1 is replicated by R0, R1, R2]
– Latency: 1 RTT between R0 and the closest majority
C2: Stable Order
– A replica knows all commands ordered before cmd1 once it receives a greater timestamp from every other replica (timestamps ride on Prep, PrepOK, and ClockTime messages)
[Figure: R0 assigns cmd1.ts = 24; once the timestamps received from all peers exceed 24, cmd1 is stable at R0]
– Latency: 0.5 RTT between R0 and the farthest peer
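The "greater timestamp from every other replica" rule only makes progress if replicas keep advertising their clocks even when idle; the slide's ClockTime message serves that purpose. A sketch of the bookkeeping, reusing clock_ms() from the earlier sketch (field and attribute names are assumed):

```python
def on_message(replica, sender, ts):
    # Every message (Prep, PrepOK, ClockTime) carries the sender's
    # current clock reading; remember the greatest seen per peer.
    replica.latest_ts[sender] = max(replica.latest_ts.get(sender, 0), ts)

def idle_tick(replica):
    # Idle replicas periodically broadcast ClockTime so peers can
    # still establish stable order (C2) without waiting for commands.
    replica.broadcast(("ClockTime", replica.rid, clock_ms()))

def is_stable(replica, cmd):
    # cmd is stable once every other replica has sent ts > cmd.ts.
    return all(replica.latest_ts.get(peer, 0) > cmd.ts
               for peer in replica.peers)
```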
C3: Prefix Replication
– All commands ordered before cmd1 are replicated by a majority
[Figure: cmd2 (ts = 23) from R4 is ordered before cmd1 (ts = 24); cmd2 must be replicated by a majority (here R1, R2, R3) before cmd1 can commit at R0]
– Latency: 1 RTT (R4 to a majority, plus the majority to R0)
Overlapping Steps
[Figure: for cmd1 (ts = 24) at R0, majority replication, stable order, and prefix replication all proceed in parallel on the same Prep/PrepOK messages]
– Latency of cmd1: about 1 RTT to the majority
Commit Latency

Step                   Latency
Majority replication   1 RTT (majority1)
Stable order           0.5 RTT (farthest)
Prefix replication     1 RTT (majority2)

Overall latency = MAX{ 1 RTT (majority1), 0.5 RTT (farthest), 1 RTT (majority2) }
If 0.5 RTT (farthest) < 1 RTT (majority), then overall latency ≈ 1 RTT (majority).
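A worked instance of the formula, with assumed RTT figures chosen for illustration only (not measurements from the paper):

```python
def commit_latency(rtt_majority1, rtt_farthest, rtt_majority2):
    # Overall latency = MAX of the three overlapped steps.
    return max(rtt_majority1,        # C1: 1 RTT to closest majority
               0.5 * rtt_farthest,   # C2: 0.5 RTT to farthest peer
               rtt_majority2)        # C3: 1 RTT via majority

# Assumed numbers: majority RTT 100 ms, farthest-peer RTT 180 ms.
# 0.5 * 180 = 90 < 100, so overall latency is ~1 RTT to the majority.
print(commit_latency(100, 180, 100))  # -> 100
```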
Topology Examples
[Figure: two five-replica topologies showing, for a client request at R0, which replicas form the closest majority ("Majority1") and which peer is farthest]
Paxos 1: Multi-Paxos
– A single leader orders commands
– Logical clock: 0, 1, 2, 3, ...
[Figure: R0 forwards the client request to leader R2; the leader runs Prep/PrepOK with the replicas and then sends Commit; R0 replies to the client]
Latency at followers: 2 RTTs (follower to leader, then leader to majority)
Paxos 2: Paxos-bcast
– Every replica broadcasts PrepOK
– Trades higher message complexity for lower latency
[Figure: R0 forwards the client request to leader R2; every replica broadcasts PrepOK, so R0 learns the commit without a separate Commit message]
Latency at followers: 1.5 RTTs (follower to leader, then leader to majority)
Clock-RSM vs. Paxos
With realistic topologies, Clock-RSM has
– Lower latency at Paxos follower replicas
– Similar or slightly higher latency at the Paxos leader

Protocol      Latency
Clock-RSM     All replicas: 1 RTT (majority), if 0.5 RTT (farthest) < 1 RTT (majority)
Paxos-bcast   Leader: 1 RTT (majority); followers: 1.5 RTTs (leader & majority)
Experiment Setup
– Replicated key-value store
– Deployed on Amazon EC2 in five regions: California (CA), Virginia (VA), Ireland (IR), Singapore (SG), Japan (JP)
Latency (1/2)
– All replicas serve client requests
Overlapping vs. Separate Steps
[Figure: per-region latency breakdown (CA, VA, IR, SG, JP; VA is the Paxos leader)]
– Clock-RSM latency: the max of the three steps
– Paxos-bcast latency: the sum of the three steps
Latency (2/2)
– Paxos leader is changed to CA
Throughput
– Five replicas on a local cluster
– Message batching is key to high throughput
Also in the Paper
– A reconfiguration protocol
– Comparison with Mencius
– Latency analysis of the protocols
Conclusion
Clock-RSM: low-latency geo-replication
– Uses loosely synchronized physical clocks
– Overlaps ordering and replication
Leader-based protocols can incur high latency