Effectively Model Checking Real-World Distributed Systems Junfeng Yang Joint work with Huayang Guo, Ming Wu, Lidong Zhou, Gang Hu, Lintao Zhang, Heming Cui, Jingyue Wu, Chia-che Tsai, John Gallagher 1
One-slide Summary
Distributed systems: important, but hard to get right
Model checking: finds serious bugs, but is slow
Dynamic Interface Reduction: the first new type of state-space reduction technique in 25 years [DeMeter SOSP 11]
– Exponentially speeds up model checking
– One data point: 34 years → 18 hours
Stable Multithreading: a radically new approach [Tern OSDI '10] [Peregrine SOSP '11] [PLDI '12] [Parrot SOSP '13] [CACM '13]
– What-you-check-is-what-you-run
– Billions of years → 7 hours
2
Distributed Systems: Pervasive and Critical 3
Distributed Systems: Hard to Get Right
No node has a centralized view of the entire system
Code must correctly handle many failures
– Link failures, network partitions
– Message loss, delay, or reordering
– Machine crashes
Worse: geo-distributed, larger deployments make weird failures more likely
Complex protocols → more complex code → bugs
4
Model Checking Distributed Systems Implementations
Choices of actions
– Send message
– Recv message
– Run thread
– Delay message
– Fail link
– Crash machine
– …
Run checkers on states
– E.g., assertions
[Diagram: exploration tree branching on send, fail link, run thread, crash, …]
5
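The exploration loop above can be sketched as a toy depth-first search over action choices. This is illustrative only; `enabled`, `apply_action`, and `check` are hypothetical stand-ins, not MoDist's or dBug's actual API:

```python
# Toy sketch of a model checker's exploration loop (not real MoDist code):
# at each state, branch on every enabled action (send, fail link, crash, ...),
# run checkers on the state, and backtrack to cover the other choices.

def explore(state, enabled, apply_action, check, depth=0, limit=2):
    check(state)                          # run assertions on every visited state
    if depth == limit:                    # bound the search depth
        return
    for action in enabled(state):         # branch on each possible action
        explore(apply_action(state, action), enabled, apply_action,
                check, depth + 1, limit)

# Tiny instantiation: two choices at every state, states are action tuples.
visited = []
explore((), lambda s: ("send", "crash"),
        lambda s, a: s + (a,), visited.append, limit=2)
print(len(visited))  # 1 + 2 + 4 = 7 states visited
```

With M choices per state and depth d, the number of visited states grows as M^d, which is the state explosion the next slides address.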
Good Error Detection Results
E.g., [MoDist NSDI 09] [dBug SSV 10]
– Easy: check unmodified, real code in native environment ("in-situ" [eXplode OSDI 06])
– Comprehensive: check many corner cases
– Deterministic: detected errors can be replayed
MoDist results
– Checked Berkeley DB replication, MPS (Microsoft production), PacificA
– Found 35 bugs; 10 protocol flaws
– Protocol flaws found in every system checked
– Transferred to Microsoft product groups
6
But, the State Explosion Problem
Real-world distributed systems have too many states to completely explore
– Even for conceptually small state spaces
– 3-node MPS: 34 years for MoDist!
Incompleteness → low assurance
Prior model checkers explored many redundant states
7
This Talk: Two Techniques to Effectively Reduce State Space
Dynamic Interface Reduction: check components separately to avoid costly global exploration [DeMeter SOSP 11]
– 34 years → 18 hours, a 10^5 reduction
Leverage Stable Multithreading [Tern OSDI '10] [Peregrine SOSP '11] [PLDI '12] [Parrot SOSP '13] [CACM '13] to make what-you-check what-you-run (ongoing)
8
Dynamic Interface Reduction (DIR)
Insight: system builders decompose a system into components with narrow interfaces
– e.g., [Clarke, Long, McMillan 87] [Laster, Grumberg 98]
Distinguish global and local actions
Check local actions via a conceptually local fork()
[Code sketch: a main thread runs n = recv(); total += n; Send(n); a ckpt thread runs Log(total)]
9
Reduction Analysis
N components, each having M local actions
w/o DIR: M * M * … * M = M^N
w/ DIR: M + M + … + M = M * N
Exponential reduction
10
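A back-of-the-envelope check of the counts above, with hypothetical values for M and N:

```python
# Reduction analysis as arithmetic: N components, M local actions each.
M, N = 10, 3

without_dir = M ** N   # global exploration interleaves all components: M^N
with_dir = M * N       # DIR explores each component's local actions alone: M*N

print(without_dir, with_dir)  # 1000 vs. 30 -- the gap grows exponentially in N
```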
Challenge in Implementing DIR How to automatically compute interfaces from real code w/o causing false positives or missing bugs? Manual spec: tedious, costly, error-prone – Required by prior compositional or modular model checking work Made-up interfaces: difficult-to-diagnose false positives [Guerraoui and Yabandeh, NSDI 11] 11
Automatically Discover Interfaces by Running Code
Global Explorer: explores global actions
Local Explorers: explore local actions
Explorers exchange message traces
Insight: message traces collectively define interfaces
12
13 Example
Client C:
if (Toss(2) == 0) { Send(P, 1); Send(P, 2); }
else { Send(P, 1); Send(P, 3); }
Primary P:
// main: while (n = recv()) { total += n; Send(S, n); }
// ckpt: Log(total)
Secondary S:
// main: while (n = recv()) { total += n; }
// ckpt: Log(total)
14 Global Explorer: Compute Initial Global Trace
(Client C / Primary P / Secondary S code as on the Example slide)
Global trace:
C.Toss(2) = 0
C.Send(P, 1), P.Recv(C, 1), P.Log, P.total += 1, P.Send(S, 1), S.Recv(P, 1), S.Log, S.total += 1
C.Send(P, 2), P.Recv(C, 2), P.total += 2, P.Send(S, 2), S.Recv(P, 2), S.total += 2
15 Global Explorer: Project Message Traces
Project the global trace onto each component, keeping only its message actions:
P: P.Recv(C, 1), P.Send(S, 1), P.Recv(C, 2), P.Send(S, 2)
S: S.Recv(P, 1), S.Recv(P, 2)
C: C.Send(P, 1), C.Send(P, 2)
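Projection can be sketched in a few lines. The trace representation and the `project` function below are illustrative, not DeMeter's actual data structures:

```python
# A minimal sketch of trace projection: a global trace is a list of
# (component, action, detail) tuples; projection keeps only a component's
# message (interface) actions, Send and Recv.

def project(global_trace, component):
    """Return the component's message trace from a global trace."""
    return [(c, act, d) for (c, act, d) in global_trace
            if c == component and act in ("Send", "Recv")]

# The initial global trace from the example, local actions included:
g = [("C", "Toss", 0),
     ("C", "Send", ("P", 1)), ("P", "Recv", ("C", 1)),
     ("P", "Log", None), ("P", "total+=", 1), ("P", "Send", ("S", 1)),
     ("S", "Recv", ("P", 1)), ("S", "Log", None), ("S", "total+=", 1),
     ("C", "Send", ("P", 2)), ("P", "Recv", ("C", 2)),
     ("P", "total+=", 2), ("P", "Send", ("S", 2)),
     ("S", "Recv", ("P", 2)), ("S", "total+=", 2)]

print(project(g, "P"))  # P's message trace: Recv(C,1), Send(S,1), Recv(C,2), Send(S,2)
```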
16 Local Explorers: Explore Local Actions Using Message Traces
Each local explorer replays its component's message trace while exploring that component's local actions.
17 Local Explorer of Primary: Explore Local Trace 1
P's local actions (P.Log, P.total += 1, P.total += 2), interleaved consistently with P's message trace.
18 Local Explorer of Primary: Explore Local Trace 2
(Another interleaving of P.Log with P.total += 1 and P.total += 2.)
19 Local Explorer of Primary: Explore Local Trace 3
(A third interleaving of P's local actions.)
20 Local Explorer of Client
C's message trace: C.Send(P, 1), C.Send(P, 2)
21 Local Explorer of Client
Explores C's local action: C.Toss(2) = 0
22 Local Explorer of Client Found New Message Trace
Exploring C.Toss(2) = 1 yields a new message trace: C.Send(P, 1), C.Send(P, 3)
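The client's local explorer can be sketched as follows: it exhaustively tries each local `Toss` outcome and reports any run whose message actions differ from every known trace. All names here are illustrative, not DeMeter's API:

```python
# Illustrative sketch of a local explorer for client C: explore every
# local choice (the Toss outcome) and detect runs whose message actions
# deviate from the known message traces.

def run_client(toss):
    """C's code from the example, with the Toss outcome forced."""
    if toss == 0:
        return [("Send", "P", 1), ("Send", "P", 2)]
    else:
        return [("Send", "P", 1), ("Send", "P", 3)]

known = {(("Send", "P", 1), ("Send", "P", 2))}   # from the initial global trace
new_traces = set()
for toss in (0, 1):                               # explore both local choices
    trace = tuple(run_client(toss))
    if trace not in known:
        new_traces.add(trace)                     # Toss = 1 yields Send(P, 3)

print(new_traces)  # {(('Send','P',1), ('Send','P',3))}
```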
23 Global Explorer: Composition
The new message trace from C's local explorer is reported back to the global explorer, which composes it with the other components' behaviors.
24 Global Explorer: New Global Trace
C.Toss(2) = 1
C.Send(P, 1), P.Recv(C, 1), P.Log, P.total += 1, P.Send(S, 1), S.Recv(P, 1), S.Log, S.total += 1
C.Send(P, 3), …
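The whole global/local iteration on this toy example can be sketched as a fixed-point loop: explorers exchange message traces until no new ones appear. This is a drastically simplified illustration, not DeMeter's real code; here C's Toss is the only nondeterministic local action, so a global run is fully determined by the toss outcome:

```python
# End-to-end sketch of the DIR loop on the toy example (illustrative only).

def client_msg_trace(toss):
    """C's message trace for one Toss outcome, per the example code."""
    return tuple(("C.Send", "P", n) for n in ([1, 2] if toss == 0 else [1, 3]))

known_traces = set()   # message traces seen by the global explorer
frontier = [0]         # start from one initial global run (Toss = 0)
explored = set()       # local choices C's local explorer has tried
while frontier:
    toss = frontier.pop()
    known_traces.add(client_msg_trace(toss))      # project global run onto C
    for t in (0, 1):                              # local explorer: try every Toss
        if t not in explored:
            explored.add(t)
            if client_msg_trace(t) not in known_traces:
                frontier.append(t)                # new trace -> new global run

print(len(known_traces))  # 2: both of C's message traces are discovered
```

The loop terminates once the local explorers stop discovering new message traces, which is why DIR's cost is the sum, not the product, of the components' local explorations.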
Implementation
7,279 lines of C++
Integrated DIR with
– MoDist [MoDist NSDI 09]: 757 lines
– MaceMC [MaceMC NSDI 07]: 1,114 lines
– Easy to integrate
Orthogonal to partial-order reduction, via vector-clock tricks
25
Verification/Reduction Results
MPS: Microsoft production system
BDB: Berkeley DB replication
Chord: Chord implementation in Mace
*-n: n nodes
Results of other benchmarks in [DeMeter SOSP 11]
[Chart: reduction and speedup of DIR-MoDist and DIR-MaceMC on MPS-2, MPS-3, BDB-2, BDB-3, Chord-2, Chord-3]
26
DIR Summary
Proven sound (introduces no false positives) and complete (introduces no false negatives)
Fully automatic, real, exponential reduction
Works seamlessly w/ existing model checkers
– Integrated into MoDist and MaceMC; easy
Results
– Verified instances of real-world systems
– Empirically observed large reduction: 34 years → 18 hours (10^5) on MPS
27
This Talk: Two Techniques to Effectively Reduce State Space
Dynamic Interface Reduction: check components separately to avoid costly global exploration [DeMeter SOSP 11]
– 34 years → 18 hours, a 10^5 reduction
Leverage Stable Multithreading [Tern OSDI '10] [Peregrine SOSP '11] [PLDI '12] [Parrot SOSP '13] [CACM '13] to make what-you-check what-you-run (ongoing)
28
Threads: Difficult to Model Check
Many thread interleavings, or schedules
– To verify, a local explorer must explore all schedules
Wide interfaces between threads
– Any shared-memory load/store
– Tracing loads/stores is costly
– DIR may not work well
29
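To see why thread schedules explode, a quick count (standard combinatorics, not a figure from the talk): k threads of n steps each admit (n·k)! / (n!)^k interleavings.

```python
# Counting thread interleavings: k threads, n steps each, gives the
# multinomial coefficient (n*k)! / (n!)^k possible schedules.
from math import factorial

def num_schedules(k, n):
    """Number of interleavings of k threads, each executing n steps."""
    return factorial(n * k) // factorial(n) ** k

print(num_schedules(2, 8))   # 2 threads x 8 steps each: 12,870 schedules
print(num_schedules(4, 8))   # 4 threads: already ~10^17 schedules
```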
What-you-check is what-you-run
Coverage = C / R, where R = all possible runtime schedules and C = model-checked schedules
Reduction: enlarge C by exploiting equivalence
But equivalence is rare and hard to find!
– DIR took us 2–3 years
Can we increase coverage w/o equivalence?
Shrink R w/ Stable Multithreading [Tern OSDI '10] [Peregrine SOSP '11] [PLDI '12] [Parrot SOSP '13] [CACM '13]
30
Stable Multithreading
Reuse well-checked schedules on different inputs
How does it work? See papers [Tern OSDI '10] [Peregrine SOSP '11] [PLDI '12] [Parrot SOSP '13] [CACM '13]
So much easier that it feels like cheating
[Spectrum: Nondeterministic → Stable → Deterministic]
31
Conclusion
Dynamic Interface Reduction: check components separately to avoid costly global exploration [DeMeter SOSP 11]
– Automatic, real, exponential reduction
– Proven sound and complete
– 34 years → 18 hours, a 10^5 reduction
Leverage Stable Multithreading [Tern OSDI '10] [Peregrine SOSP '11] to make what-you-check what-you-run (ongoing)
32
Key Challenge
Make stable multithreading work with real-world distributed systems
– Physical time?
– Message passing?
– Dynamic load balancing?
– Overhead?
33