Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin
Detecting data races in production
Overhead FastTrack [Flanagan & Freund ’09] 80x 8x
Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Number of threads
Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Problem in future Problem today
Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Pacer(c reads&writes + c sync n) r + c non-sampling (1 – r) Sampling rate
Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Pacer(c reads&writes + c sync n) r + c non-sampling (1 – r) Sampling periods Non-sampling periods
Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Pacer(c reads&writes + c sync n) r + c non-sampling (1 – r) Probability (detecting any race) FastTrack1 Pacerr
Detect race first access sampled
Sampling period Thread AThread B Non-sampling period Sampling period Non-sampling period
Thread AThread B write x read x read y write y
Insight #1: Stop tracking variable after non-sampled access
Thread A write x unlock m Thread B
Thread A write x unlock m Thread B lock m
Thread A write x unlock m Thread B lock m write x
Thread A write x unlock m read x Thread B lock m write x
Thread A write x unlock m read x Thread B lock m write x Race!
Thread A write x unlock m read x Thread B lock m write x Race!
Thread A write x unlock m read x Thread B lock m write x ABAB Vector clocks
Thread A write x unlock m read x Thread B lock m write x ABAB Vector clocks
Thread A write x unlock m read x Thread B lock m write x ABAB Vector clocks
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x Increment clock Increment clock ABAB
Thread A write x unlock m read x Thread B lock m write x Join clocks Join clocks ABAB
Thread A write x unlock m read x Thread B lock m write x Happens before? ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x No work performed ABAB
Thread A write x unlock m read x Thread B lock m write x Race uncaught ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x Happens before? Race! ABAB
Insight #2: We only care whether “A happens before B” if A is sampled
Thread AThread B Do these events happen before other events? We don’t care!
Increment clocks Thread AThread B Don’t increment clocks Increment clocks Don’t increment clocks Do these events happen before other events? We don’t care!
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m No clock increment ABAB
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m Unnecessary join ABAB
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m O(n) O(1) ABAB
1
Qualitative improvement in time & space
Probability (detecting any race) = r ?
LiteRace [Marino et al. ’09] Cold-region hypothesis [Chilimbi & Hauswirth ’04] Full analysis at synchronization operations
Accuracy, time, space sampling rate Detect race first access sampled
Accuracy, time, space sampling rate Detect race first access sampled Qualitative improvement
Accuracy, time, space sampling rate Detect race first access sampled Qualitative improvement Help developers fix difficult-to-reproduce bugs
Accuracy, time, space sampling rate Detect race first access sampled Qualitative improvement Help developers fix difficult-to-reproduce bugs Thank you
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB v6 Vector clock versions
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB v
Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB v Join unnecessary v6
Qualitative improvement
Core 2 Quad (4 cores) Multithreaded benchmarks (DaCapo & SPECjbb2000) Evaluating sampling-based race detection Need 100s of trials to evaluate Some races are rare Evaluate only frequent races
Two accesses to same variable (one is a write) One access doesn’t happen before the other Program order Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write
Thread A write x unlock m Thread B Two accesses to same variable (one is a write) One access doesn’t happen before the other Program order Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write
Thread A write x unlock m Thread B lock m write x Two accesses to same variable (one is a write) One access doesn’t happen before the other Program order Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write
Thread A write x unlock m read x Thread B lock m write x Two accesses to same variable (one is a write) One access doesn’t happen before the other Program order Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write
Thread A write x unlock m read x Thread B lock m write x Race! Two accesses to same variable (one is a write) One access doesn’t happen before the other Program order Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write
Races indicate Atomicity violations Order violations
Races indicate Atomicity violations Order violations Races lead to Sequential consistency violations No races sequential consistency (Java/C++) Races writes observed out of order
Races indicate Atomicity violations Order violations Races lead to Sequential consistency violations No races sequential consistency (Java/C++) Races writes observed out of order Most races potentially harmful [Flanagan & Freund ’10]
class ProducerConsumer { boolean ready; int x; produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }
class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }
class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }
class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }
class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; } Can read old value
class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { … = x; while (!ready) { } } Legal reordering by compiler or hardware
class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }
class ProducerConsumer { volatile boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; } Happens- before edge
class LibraryBook { Set borrowers; }
class LibraryBook { Set borrowers; addBorrower(Person p) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }
class LibraryBook { Set borrowers; addBorrower(Person p) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }
class LibraryBook { Set borrowers; addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }
class LibraryBook { Set borrowers; addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }
addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet(); }... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }
addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { HashSet obj = alloc HashSet; obj. (); borrowers = obj; }... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }
addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { HashSet obj = alloc HashSet; borrowers = obj; obj. (); }... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }
addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { HashSet obj = alloc HashSet; borrowers = obj; obj. (); }}}... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }
33% base overhead ~50% overhead
Program alone FastTrackPacer Detection rate 0occurrence rateoccurrence rate × r Running time tt(c 1 + c 2 n)t[(c 1 + c 2 n)r + c 3 ] Evaluate only frequent races Evaluate scaling with r Don’t evaluate scaling with n
50 million people
Energy Management System Alarm and Event Processing Routine (1 MLOC)
Energy Management System Alarm and Event Processing Routine (1 MLOC) Post-mortem analysis: 8 weeks "This fault was so deeply embedded, it took them weeks of poring through millions of lines of code and data to find it.” –Ralph DiNicola, FirstEnergy
Race condition Two threads writing to data structure simultaneously Usually occurs without error Small window for causing data corruption
Tracks happens-before: sound & precise 80X slowdown Each analysis step: O(n) time (n = # of threads)
Tracks happens-before: sound & precise 80X slowdown Each analysis step: O(n) time (n = # of threads) FastTrack [Flanagan & Freund ’09] Reads & writes (97%): O(1) time Synchronization (3%): O(n) time 8X slowdown
Tracks happens-before: sound & precise 80X slowdown Each analysis step: O(n) time (n = # of threads) FastTrack [Flanagan & Freund ’09] Reads & writes (97%): O(1) time Synchronization (3%): O(n) time 8X slowdown Problem today Problem in future
Tracks happens-before: sound & precise 80X slowdown Each analysis step: O(n) time (n = # of threads) FastTrack [Flanagan & Freund ’09] Reads & writes (97%): O(1) time Synchronization (3%): O(n) time 8X slowdown
Thread AThread B ABAB Vector clocks
Thread AThread B ABAB Vector clocks Thread A’s logical time Thread B’s logical time
Thread AThread B ABAB Vector clocks Last logical time “received” from B Last logical time “received” from A
ABAB Thread A unlock m Thread B lock m Increment clock
ABAB Thread A unlock m Thread B lock m Join clocks Join clocks
ABAB Thread A unlock m Thread B lock m n = # of threads O(n) time
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?
Thread A write x unlock m read x Thread B lock m write x ABAB Happens before? Race!
FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Sampling rate
FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n) No. of threads
FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n) Reads & writesSynchronization
FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n) Reads & writes Problem today Problem in future Synchronization
FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n)t[(c 1 + c 2 n)r + c 3 ] Overhead in sampling periods
FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n)t[(c 1 + c 2 n)r + c 3 ] Overhead in sampling periods Overhead in non-sampling periods (small)
Data race occurs extremely rarely Data race occurs periodically Pre-deploymentDeployed
“We test exhaustively … we had in excess of three million online operational hours [342 years] in which nothing had ever exercised that bug.” –Mike Unum, manager of commercial solutions, GE Energy
Data race buggy execution
Thread AThread B ABAB Vector clocks
Thread AThread B ABAB Vector clocks Thread A’s logical time Thread B’s logical time
Thread AThread B ABAB Vector clocks Last logical time “received” from B Last logical time “received” from A
ABAB Thread A unlock m Thread B lock m Increment clock
ABAB Thread A unlock m Thread B lock m Join clocks Join clocks
ABAB Thread A unlock m Thread B lock m n = # of threads O(n) time
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?
Thread A write x unlock m read x Thread B lock m write x ABAB
Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?
Thread A write x unlock m read x Thread B lock m write x ABAB Happens before? Race!