DTHREADS: Efficient Deterministic Multithreading
Tongping Liu, Charlie Curtsinger, Emery Berger
"Insanity: doing the same thing over and over again and expecting different results."
2 In the Beginning…
3 There was the Core.
4 And it was Good.
5 It gave us our Daily Speed.
6 Until the Apocalypse.
7 And the Speed was no Moore.
8 And then came a False Prophet…
10 Want speed?
11 I BRING YOU THE GIFT OF PARALLELISM!
12 JUST USE THREADS…
   color = [color]; row = 0; // globals
   void nextStripe() {
     for (c = 0; c < Width; c++)
       drawBox(c, row, color);
     color = (color == [color]) ? [color] : [color];
     row++;
   }
   for (n = 0; n < 9; n++)
     pthread_create(t[n], nextStripe);
   for (n = 0; n < 9; n++)
     pthread_join(t[n]);
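The slide's code is shorthand and leaves `color` and `row` unprotected. As a rough sketch of what a race-free pthreads version might look like, the block below uses the real `pthread_create` and `pthread_join` signatures and guards the shared globals with a mutex. The colors `'R'` and `'W'`, the `WIDTH` value, the character canvas, and the helper names `draw_flag` and `stripe` are assumptions made for illustration; they do not come from the talk.

```c
#include <pthread.h>
#include <string.h>

#define WIDTH   8   /* assumed canvas width, not from the slide */
#define STRIPES 9   /* one stripe per thread, as on the slide */

static char canvas[STRIPES][WIDTH + 1];  /* each row is a C string */
static char color = 'R';                 /* shared global; 'R'/'W' are assumed colors */
static int  row   = 0;                   /* shared global */
static pthread_mutex_t stripe_lock = PTHREAD_MUTEX_INITIALIZER;

/* Draw one stripe. The lock makes "read row and color, draw, update"
 * a single atomic step, so no two threads draw the same row. */
static void *next_stripe(void *arg) {
    (void)arg;
    pthread_mutex_lock(&stripe_lock);
    memset(canvas[row], color, WIDTH);
    canvas[row][WIDTH] = '\0';
    color = (color == 'R') ? 'W' : 'R';
    row++;
    pthread_mutex_unlock(&stripe_lock);
    return NULL;
}

/* Spawn one thread per stripe and wait for all of them.
 * Returns 0 on success, -1 if a thread could not be created. */
int draw_flag(void) {
    pthread_t t[STRIPES];
    int started = 0;

    row = 0;
    color = 'R';
    for (int n = 0; n < STRIPES; n++) {
        if (pthread_create(&t[n], NULL, next_stripe, NULL) != 0)
            break;
        started++;
    }
    for (int n = 0; n < started; n++)
        pthread_join(t[n], NULL);
    return started == STRIPES ? 0 : -1;
}

/* Read back one finished stripe. */
const char *stripe(int r) {
    return canvas[r];
}
```

With the lock in place the finished canvas alternates colors regardless of scheduling, although which thread draws which stripe still varies from run to run.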
18 pthreads: race conditions, atomicity violations, deadlock, order violations
19 Salvation?
21 pthreads: race conditions, atomicity violations, deadlock, order violations. DTHREADS: deterministic.
22 DTHREADS Enables… race-free executions; replay debugging without logging; replicated state machines
23 DTHREADS: Efficient Determinism. Usually faster than the state of the art. [Chart annotation: Overhead with CoreDet 7.8]
24 DTHREADS: Efficient Determinism. Generally as fast or faster than pthreads. [Chart annotation: Overhead with CoreDet 7.8]
25 DTHREADS: Easy to Use. Replace -lpthread with -ldthread: % g++ myprog.cpp -ldthread
26 Isolation: shared address space → disjoint address spaces
27 Performance: Processes vs. Threads [Figure: threads vs. processes; axes: Thread Execution Time (ms), Normalized Execution Time]
30 “Shared Memory”
31 “Shared Memory”: snapshot pages before modifications
32 “Shared Memory”: write back diffs
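The snapshot-and-diff idea on these two slides can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the DTHREADS implementation: `PAGE_SIZE`, `snapshot_page`, and `commit_diffs` are hypothetical names, pages are plain byte arrays, and the write detection that would trigger the snapshot is left out.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096  /* assumed page size */

/* Save a pristine copy (the "twin") of a page before it is first modified. */
void snapshot_page(const unsigned char *working, unsigned char *twin) {
    memcpy(twin, working, PAGE_SIZE);
}

/* Write back only the bytes this "thread" changed: a byte is copied to
 * the shared page when the working copy differs from the twin.
 * Returns the number of bytes written. */
size_t commit_diffs(unsigned char *shared,
                    const unsigned char *working,
                    const unsigned char *twin) {
    size_t changed = 0;
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        if (working[i] != twin[i]) {
            shared[i] = working[i];
            changed++;
        }
    }
    return changed;
}
```

Because only the differing bytes are copied, two writers that touch different bytes of the same page can both commit without overwriting each other's updates.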
33 Update in Deterministic Time & Order. “Thread” 1, “Thread” 2, “Thread” 3: Parallel → Serial → Parallel. Synchronization points: mutex_lock, cond_wait, pthread_create
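One way to picture the serial phase is a token that moves from thread to thread in a fixed order, so updates always land in the same sequence no matter how the scheduler interleaves the parallel work. The sketch below is an assumption-laden illustration using ordinary pthreads primitives, not the DTHREADS mechanism itself: `NTHREADS`, `token_holder`, `commit_log`, and `run_round` are hypothetical names, and the fence that would make every thread finish its parallel phase first is omitted.

```c
#include <pthread.h>

#define NTHREADS 3

static pthread_mutex_t token_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  token_moved = PTHREAD_COND_INITIALIZER;
static int token_holder;            /* id of the thread allowed to commit */
static int commit_log[NTHREADS];    /* order in which commits happened */
static int commits;

static void *worker(void *arg) {
    int id = *(int *)arg;

    /* Parallel phase: private work would happen here. */

    /* Serial phase: wait for the token, commit, pass the token on. */
    pthread_mutex_lock(&token_lock);
    while (token_holder != id)
        pthread_cond_wait(&token_moved, &token_lock);
    commit_log[commits++] = id;
    token_holder = id + 1;
    pthread_cond_broadcast(&token_moved);
    pthread_mutex_unlock(&token_lock);
    return NULL;
}

/* Run one round and report the commit order.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_round(int order[NTHREADS]) {
    pthread_t t[NTHREADS];
    int ids[NTHREADS];
    int started = 0;

    token_holder = 0;
    commits = 0;
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        if (pthread_create(&t[i], NULL, worker, &ids[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    for (int i = 0; i < commits; i++)
        order[i] = commit_log[i];
    return started == NTHREADS ? 0 : -1;
}
```

Whichever thread reaches the serial phase first, the recorded order is always 0, 1, 2, because a thread can only commit while it holds the token.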
34 DTHREADS performance analysis
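35 The Culprit: False Sharing [Diagram: Thread 1 on Core 1 and Thread 2 on Core 2 access main memory; each write invalidates the other core's copy]
36 The Culprit: False Sharing, 20x [Diagram: Thread 1 on Core 1 and Thread 2 on Core 2 access main memory; each write invalidates the other core's copy]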
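False sharing arises when two threads update different variables that happen to sit on the same cache line. The sketch below contrasts a packed layout with a padded one; it is an illustration under assumptions, not code from the talk. `CACHE_LINE` assumes a 64-byte line, and `packed_counters`, `padded_counters`, `bump`, and `run_pair` are hypothetical names. Timing the two layouts is left to the reader, since the slowdown depends on the hardware.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_LINE 64        /* assumed cache line size in bytes */
#define ITERS      1000000L

/* Both counters fit in one cache line: writes by one thread
 * invalidate the line in the other thread's core. */
struct packed_counters {
    volatile long a;
    volatile long b;
};

/* Padding pushes b a full cache line away from a. */
struct padded_counters {
    volatile long a;
    char pad[CACHE_LINE - sizeof(long)];
    volatile long b;
};

/* Each thread increments only its own counter. */
static void *bump(void *p) {
    volatile long *c = p;
    for (long i = 0; i < ITERS; i++)
        (*c)++;
    return NULL;
}

/* Run two threads, one per counter, and return the sum of both
 * counters, or -1 if a thread could not be created. */
long run_pair(volatile long *x, volatile long *y) {
    pthread_t t1, t2;

    *x = 0;
    *y = 0;
    if (pthread_create(&t1, NULL, bump, (void *)x) != 0)
        return -1;
    if (pthread_create(&t2, NULL, bump, (void *)y) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return *x + *y;
}
```

Both layouts compute the same result; only the padded one keeps the two counters on separate cache lines, which is the property that avoids the invalidation traffic shown on the slide.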
37 DTHREADS: Eliminates False Sharing! [Diagram: Process 1 on Core 1 and Process 2 on Core 2, with a separate Global State]
38 DTHREADS: Detailed Analysis
41 DTHREADS: Scalable Determinism
44 DTHREADS % g++ myprog.cpp -ldthread
45 End
46 Scheduler Determinism
47 DTHREADS: Without Outliers. Excluding outliers, just 5% slower than pthreads.
48 Commit Protocol [Diagram over time: Twin Page, Diff, Global State, Local State]
49 DTHREADS Example Execution. Global state: a=0, b=0. Thread 1: if (a == 0) b = 1; Thread 2: if (b == 0) a = 1; Each thread sees a=0, b=0. Committed state: a=1, b=1.
50 No Problem. Start: a=0, b=0. Thread 1: if (a == 0) b = 1; Thread 2: if (b == 0) a = 1; Result: a=1, b=1.
51 That’s Better. Start: a=0, b=0. Thread 1: lock(); if (a == 0) b = 1; unlock(); Thread 2: lock(); if (b == 0) a = 1; unlock(); Result: b=1.
52 Or is it? Start: a=0, b=0. Thread 1: lock(); if (a == 0) b = 1; unlock(); Thread 2: lock(); if (b == 0) a = 1; unlock(); Result: a=1.
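The example execution can be replayed as a small single-threaded simulation: each "thread" runs on a private copy of the state, and only the fields it changed are committed back. This is a sketch under assumptions rather than the DTHREADS code; `struct state`, `thread1`, `thread2`, `commit`, and `run_isolated` are hypothetical names, and the diff is done per field instead of per byte.

```c
/* Shared variables from the slide. */
struct state {
    int a;
    int b;
};

/* The two code fragments from the slide, each run on a private copy. */
static void thread1(struct state *s) { if (s->a == 0) s->b = 1; }
static void thread2(struct state *s) { if (s->b == 0) s->a = 1; }

/* Commit only the fields that differ from the snapshot (the "twin"). */
static void commit(struct state *global,
                   const struct state *local,
                   const struct state *twin) {
    if (local->a != twin->a) global->a = local->a;
    if (local->b != twin->b) global->b = local->b;
}

/* Both threads start from a=0, b=0, run in isolation, then commit
 * in a fixed order. Returns the committed global state. */
struct state run_isolated(void) {
    struct state global = {0, 0};
    struct state twin = global;   /* snapshot taken before any writes */
    struct state local1 = global; /* private copy for thread 1 */
    struct state local2 = global; /* private copy for thread 2 */

    thread1(&local1);
    thread2(&local2);
    commit(&global, &local1, &twin);
    commit(&global, &local2, &twin);
    return global;
}
```

Neither thread can observe the other's write before the commit, so the outcome is a=1, b=1 on every run.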
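The two lock-based slides differ only in which thread acquires the lock first. A small simulation makes that dependence explicit; it is an illustrative sketch, and `struct vars`, `run_locked`, and the `first` parameter are hypothetical names. The lock is modeled by running the two critical sections one after the other.

```c
/* Shared variables from the slides. */
struct vars {
    int a;
    int b;
};

/* Simulate the two critical sections running under a lock.
 * first == 1: thread 1 wins the lock first; otherwise thread 2 does.
 * Returns the final shared state. */
struct vars run_locked(int first) {
    struct vars v = {0, 0};

    if (first == 1) {
        if (v.a == 0) v.b = 1;  /* thread 1 */
        if (v.b == 0) v.a = 1;  /* thread 2 now sees b == 1 */
    } else {
        if (v.b == 0) v.a = 1;  /* thread 2 */
        if (v.a == 0) v.b = 1;  /* thread 1 now sees a == 1 */
    }
    return v;
}
```

With thread 1 first the result is a=0, b=1; with thread 2 first it is a=1, b=0. The lock removes the data race but leaves the outcome dependent on scheduling.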
53 Determinism Is this enough?
54 Robust Determinism
55 External Nondeterminism: socket = open_socket(80); listen(socket);
56 Problem already solved
57 Overhead
58 Wrap-Up: determinism, robust determinism, internal determinism
59 Wrap-Up: threads to processes; commit before synchronization; commit in token order
60 Performance: DTHREADS & CoreDet [ASPLOS 10] vs. pthreads [Chart annotation: Overhead with CoreDet 7.8]
61 How DTHREADS Provides Determinism: isolation, deterministic time, deterministic order
62 Evaluation: Phoenix