Download presentation
Presentation is loading. Please wait.
Published byKristian Turner Modified over 6 years ago
1
Tolerating Latency in Replicated State Machines through Client Speculation
April 22, 2009 Benjamin Wester1, James Cowling2, Edmund B. Nightingale3, Peter M. Chen1, Jason Flinn1, Barbara Liskov2 University of Michigan1, MIT CSAIL2, Microsoft Research3
2
Simple Service Configuration
1 x=1 x=0 ++x NSDI'09 Benjamin Wester University of Michigan CSE
3
Replicated State Machines (RSM)
2 x=1 x=2 ++x 2 x=1 ++x 2 Agree on request All non-faulty replies are identical 2 NSDI'09 Benjamin Wester University of Michigan CSE
4
Benjamin Wester University of Michigan CSE
RSMs have high latency x=2 2 2 2 Need many replies Agreement Geographic Distribution NSDI'09 Benjamin Wester University of Michigan CSE
5
Benjamin Wester University of Michigan CSE
Hide the Latency Use speculative execution inside RSM Speculate before consensus is reached Without faults, any reply predicts consensus value Let client continue after receiving one reply NSDI'09 Benjamin Wester University of Michigan CSE
6
Benjamin Wester University of Michigan CSE
Overview Introduction Improving RSMs with speculation Application to PBFT Performance Conclusion NSDI'09 Benjamin Wester University of Michigan CSE
7
Speculative Execution in RSM
Take Checkpoint Predict: 1 Speculate! Blocked Commit Rollback x=1 x=2 x=1 Continue processing while waiting NSDI'09 Benjamin Wester University of Michigan CSE
8
Critical path: first reply
1 Completion latency less relevant First reply latency sets critical path Speed Accuracy Other desirable properties Throughput Stability under contention Smaller number of replicas NSDI'09 Benjamin Wester University of Michigan CSE
9
Requests while speculative
while !check_lottery(): submit_tps() buy_corvette() Predict win? = yes yes buy win? What do we do with this? Hold request Bad performance Distributed commit/rollback State tracking complex NSDI'09 Benjamin Wester University of Michigan CSE
10
Resolve speculations on the replicas
while !check_lottery(): submit_tps() buy_corvette() Predict win? = yes win? = yes buy = keys yes keys yes buy if win?=yes: buy buy win? Explicitly encode dependencies as predicates No special request handling needed Replicas need to log past replies Local decision at replicas matches client NSDI'09 Benjamin Wester University of Michigan CSE
11
Benjamin Wester University of Michigan CSE
Overview Introduction Improving RSMs with speculation Application to PBFT Performance Conclusion NSDI'09 Benjamin Wester University of Michigan CSE
12
Benjamin Wester University of Michigan CSE
Practical BFT -CS [Castro and Liskov 1999] client primary f=1 NSDI'09 Benjamin Wester University of Michigan CSE
13
Benjamin Wester University of Michigan CSE
Additional Details Tentative execution PBFT/PBFT-CS complete in 4 phases Read-only optimization Accurate answer from backup replica Failure threshold Bound worst-case failure Correctness NSDI'09 Benjamin Wester University of Michigan CSE
14
Benjamin Wester University of Michigan CSE
Overview Introduction Improving RSMs with speculation Application to PBFT Performance Conclusion NSDI'09 Benjamin Wester University of Michigan CSE
15
Benjamin Wester University of Michigan CSE
Benchmarks Shared counter Simple checkpoint No computation NFS: Apache httpd build Complex checkpoint Significant computation NSDI'09 Benjamin Wester University of Michigan CSE
16
Benjamin Wester University of Michigan CSE
Topology Primary-local Primary-remote Uniform 2.5 or 15 ms Primary NSDI'09 Benjamin Wester University of Michigan CSE
17
Base case: no replication
Primary-local Primary-remote Uniform 2.5 or 15 ms NSDI'09 Benjamin Wester University of Michigan CSE
18
Benjamin Wester University of Michigan CSE
Shared Counter Primary-local topology NSDI'09 Benjamin Wester University of Michigan CSE
19
Benjamin Wester University of Michigan CSE
Shared Counter Primary-local topology [Kotla et al. 07] NSDI'09 Benjamin Wester University of Michigan CSE
20
Shared Counter Uniform & Primary-remote topology NSDI'09
Benjamin Wester University of Michigan CSE
21
Shared Counter Uniform & Primary-remote topology NSDI'09
Benjamin Wester University of Michigan CSE
22
Benjamin Wester University of Michigan CSE
NFS: Apache build Primary-local topology NSDI'09 Benjamin Wester University of Michigan CSE
23
Benjamin Wester University of Michigan CSE
NFS: Apache build Uniform topology NSDI'09 Benjamin Wester University of Michigan CSE
24
NFS: Apache build Primary-remote topology NSDI'09
Benjamin Wester University of Michigan CSE
25
Benjamin Wester University of Michigan CSE
NFS: With Failure Primary-local topology (1% fail) NSDI'09 Benjamin Wester University of Michigan CSE
26
Throughput (Shared Counter)
LAN topology NSDI'09 Benjamin Wester University of Michigan CSE
27
Conclusion Makes WAN deployments more practical
Integrate client speculation within RSMs Predicated requests: performance without complexity Clients less sensitive to latency between replicas 5x speedup over non-speculative protocol Makes WAN deployments more practical NSDI'09 Benjamin Wester University of Michigan CSE
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.