Published by Godfrey Gilmore. Modified over 8 years ago.
1 Kendo: Efficient Deterministic Multithreading in Software
M. Olszewski, J. Ansel, S. Amarasinghe (MIT)
To be presented at ASPLOS 2009
Slides by Evangelos
2 Motivation
- Parallel applications
  - Non-determinism is inherent in threaded applications
  - Hard to develop, debug, test, and maintain
- Modify the runtime environment so that the parallel application runs deterministically
  - Make thread communication through shared memory deterministic
  - Deterministic interleaving of lock acquisitions
3 Deterministic Multithreading
- Strong determinism
  - Same output for every run on the same input, but too costly
- Weak determinism
  - Same output for every input that leads to a race-free execution under the deterministic scheduler
4 Benefits of Deterministic Multithreading
- Repeatability
  - Closest existing approach: record/replay systems, which provide determinism only for a single recorded run
- Debugging
  - Cyclic debugging methodology
- Testing
  - Test the output or intermediate states of a program to establish correctness
- Multithreaded replicas
  - Replica-based fault tolerance
  - Give the same input to all replicas and expect the same behavior
5 Deterministic Logical Time
- P monotonically increasing clocks, one per thread
- Each clock counts arbitrary per-thread events that are repeatable across executions, e.g. writes performed or instructions committed
- Each clock is a measure of progress for its thread
- The thread interleaving (order of lock acquisition) is decided from logical time
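A minimal sketch of the idea on this slide, assuming the repeatable event being counted is a store. The names `LogicalClock`, `on_store`, and `clock_at_lock_requests` are hypothetical and chosen only for illustration. The point is that the clock value at each lock request depends only on how many events the thread performed, not on wall-clock timing.

```python
class LogicalClock:
    """Per-thread counter of repeatable events (here: stores)."""

    def __init__(self):
        self.value = 0

    def on_store(self, count=1):
        # The clock only ever moves forward.
        self.value += count


def clock_at_lock_requests(stores_between_locks):
    """Return the logical time at which a thread issues each lock request.

    stores_between_locks[i] is the number of stores the thread performs
    before its i-th lock request (a made-up workload description).
    """
    clock = LogicalClock()
    stamps = []
    for stores in stores_between_locks:
        clock.on_store(stores)
        stamps.append(clock.value)
    return stamps
```

For a thread that performs 3, 1, and 4 stores before its three lock requests, the requests carry logical times 3, 4, and 8 on every run, however the operating system happens to schedule the thread.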
6 Simplified Locking Algorithm
- At any given point it is only one thread's turn to acquire a lock. It is a thread's turn when:
  - all threads with a smaller ID have strictly greater deterministic logical clocks
  - all threads with a larger ID have greater or equal deterministic logical clocks
- Waiting for one's turn enforces a first-come-first-served ordering of threads in logical time
7 Pseudocode for simplified locking algorithm (figure not preserved)
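The turn rule of the simplified locking algorithm can be sketched as follows. This is an illustrative reconstruction from the rule stated on the previous slide, not the pseudocode shown on the original slide. Everything named here (`DetRuntime`, `is_my_turn`, `det_lock`, `det_unlock`, `run`) is hypothetical. The sketch also assumes a single shared lock, clocks advanced by hand instead of by a hardware counter, and a finished thread setting its clock to infinity so that the others are not left waiting on it.

```python
import random
import threading
import time

INF = float("inf")


class DetRuntime:
    """Toy runtime: one logical clock per thread and a single shared lock."""

    def __init__(self, num_threads):
        self.clock = [0] * num_threads  # each entry is written only by its owner
        self.mutex = threading.Lock()
        self.order = []                 # thread IDs in lock-acquisition order

    def is_my_turn(self, tid):
        mine = self.clock[tid]
        for other, value in enumerate(self.clock):
            if other < tid and value <= mine:
                return False  # smaller IDs must be strictly ahead
            if other > tid and value < mine:
                return False  # larger IDs must be ahead or equal
        return True

    def det_lock(self, tid):
        # The caller's own clock does not move while it waits for its turn.
        while not self.is_my_turn(tid):
            time.sleep(0)
        self.mutex.acquire()
        self.order.append(tid)
        self.clock[tid] += 1  # passing the turn on to the next thread

    def det_unlock(self, tid):
        self.mutex.release()


def run(work_per_thread, rounds, seed):
    """Run the toy workload with random timing jitter; return acquisition order."""
    rt = DetRuntime(len(work_per_thread))

    def worker(tid):
        rng = random.Random(seed * 1000 + tid)
        for _ in range(rounds):
            time.sleep(rng.uniform(0, 0.002))     # timing noise
            rt.clock[tid] += work_per_thread[tid]  # "stores" done before locking
            rt.det_lock(tid)
            rt.det_unlock(tid)
        rt.clock[tid] = INF  # assumption: a finished thread never blocks others

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(len(work_per_thread))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rt.order
```

Calling `run([3, 1, 2], rounds=3, seed=1)` and `run([3, 1, 2], rounds=3, seed=2)` should give the same acquisition order even though the sleep jitter differs, because the order depends only on the logical clocks.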
8 Improved Locking Algorithm (figure not preserved)
10 Optimizations
- Queueing for fairness
  - A queue structure in every lock
  - The thread at the head of the queue gets the lock; the other threads spin, increasing their logical clocks
- Deterministic logical clock fast-forwarding
  - A thread advances its clock to lock.released_logical_time to avoid wasting time spinning
- Lock priority boosting (?)
  - If the next thread to acquire a lock can be predicted, decrease its clock to give it higher priority
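A rough sketch of why fast-forwarding saves work. It assumes that a lock attempt fails while the thread's clock is not yet past `released_logical_time`, and that each failed attempt advances the clock by one tick. Both assumptions, and both function names, are illustrative and are not taken from the slides.

```python
def retries_without_fast_forward(clock, released_logical_time):
    """Advance one tick per failed attempt until the clock passes the release time."""
    retries = 0
    while clock <= released_logical_time:
        clock += 1
        retries += 1
    return clock, retries


def retries_with_fast_forward(clock, released_logical_time):
    """Jump to the release time first, then take the usual one-tick step."""
    retries = 0
    while clock <= released_logical_time:
        clock = max(clock, released_logical_time)  # fast-forward
        clock += 1
        retries += 1
    return clock, retries
```

Under these assumptions, a thread at logical time 10 that wants a lock released at logical time 50 reaches time 51 either way, but needs 41 attempts without fast-forwarding and a single attempt with it.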
11 Implementation
- Deterministic logical clocks
  - retire_stores hardware counter; on an overflow, increment the software counter maintained in shared memory
  - Chunk size: the number of stores needed to cause an overflow
    - A small chunk size means higher overhead from interrupt handlers
  - Increment amount: determines the fidelity of the logical clock
    - Can differ between a counter overflow and a lock acquisition attempt
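The relationship between chunk size, overflow interrupts, and the software clock can be sketched with a small simulation. The hardware counter is modeled as a plain integer, and the class and attribute names are hypothetical; a real implementation would program a performance counter and handle its overflow interrupt.

```python
class ChunkedLogicalClock:
    """Software clock driven by a simulated overflowing store counter."""

    def __init__(self, chunk_size, increment=1):
        self.chunk_size = chunk_size  # stores needed to cause one overflow
        self.increment = increment    # amount added to the clock per overflow
        self.hw_counter = 0           # stands in for the hardware counter
        self.logical_clock = 0        # the software counter
        self.interrupts = 0           # number of overflow handlers run

    def retire_stores(self, count):
        """Simulate the thread retiring `count` stores."""
        self.hw_counter += count
        while self.hw_counter >= self.chunk_size:
            self.hw_counter -= self.chunk_size
            self._on_overflow()

    def _on_overflow(self):
        # Stand-in for the interrupt handler.
        self.interrupts += 1
        self.logical_clock += self.increment
```

With a chunk size of 4, retiring 10 stores triggers 2 overflow handlers and leaves 2 stores uncounted until the next overflow. With a chunk size of 1, the same 10 stores trigger 10 handlers, which is the overhead trade-off the slide describes.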
12 Implementation
- Thread creation
  - New threads must be created carefully: the parent thread needs to wait for its turn before starting a new thread
- Lazy reads (unprotected reads)
  - Provide an API for deterministically reading unprotected data; writes are always done with a lock held
  - Keep a table of all
13 Evaluation
- 2.66 GHz Intel Core 2 Quad running Debian
- SPLASH-2 benchmark suite
- Also a parallel traveling salesperson solver (tsp) and a parallel quicksort
14-16 Evaluation (result figures not preserved)
17 Conclusions
- Software-only solution that provides weak deterministic multithreading
- Controls the interleaving of lock acquisitions to make it deterministic
- Low overhead (16%) for up to four threads (?) on the SPLASH benchmarks