Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin.

Slides:



Advertisements
Similar presentations
50.003: Elements of Software Construction Week 6 Thread Safety and Synchronization.
Advertisements

Michael Bond (Ohio State) Milind Kulkarni (Purdue)
4 December 2001 SEESCOASEESCOA STWW - Programma Debugging of Real-Time Embedded Systems: Experiences from SEESCOA Michiel Ronsse RUG-ELIS.
Goldilocks: Efficiently Computing the Happens-Before Relation Using Locksets Tayfun Elmas 1, Shaz Qadeer 2, Serdar Tasiran 1 1 Koç University, İstanbul,
A Randomized Dynamic Program Analysis for Detecting Real Deadlocks Koushik Sen CS 265.
D u k e S y s t e m s Time, clocks, and consistency and the JMM Jeff Chase Duke University.
Chapter 6 Process Synchronization Bernard Chen Spring 2007.
Background Concurrent access to shared data can lead to inconsistencies Maintaining data consistency among cooperating processes is critical What is wrong.
Process Synchronization. Module 6: Process Synchronization Background The Critical-Section Problem Peterson’s Solution Synchronization Hardware Semaphores.
CSE , Autumn 2011 Michael Bond.  Name  Program & year  Where are you coming from?  Research interests  Or what’s something you find interesting?
Week 9, Class 3: Model-View-Controller Final Project Worth 2 labs Happens-Before ( SE-2811 Slide design: Dr. Mark L. Hornick Content: Dr. Hornick Errors:
Dynamic Data Race Detection. Sources Eraser: A Dynamic Data Race Detector for Multithreaded Programs –Stefan Savage, Michael Burrows, Greg Nelson, Patric.
SOS: Saving Time in Dynamic Race Detection with Stationary Analysis Du Li, Witawas Srisa-an, Matthew B. Dwyer.
An Case for an Interleaving Constrained Shared-Memory Multi- Processor CS6260 Biao xiong, Srikanth Bala.
Atomicity in Multi-Threaded Programs Prachi Tiwari University of California, Santa Cruz CMPS 203 Programming Languages, Fall 2004.
ADVERSARIAL MEMORY FOR DETECTING DESTRUCTIVE RACES Cormac Flanagan & Stephen Freund UC Santa Cruz Williams College PLDI 2010 Slides by Michelle Goodstein.
Cormac Flanagan and Stephen Freund PLDI 2009 Slides by Michelle Goodstein 07/26/10.
S. Narayanasamy, Z. Wang, J. Tigani, A. Edwards, B. Calder UCSD and Microsoft PLDI 2007.
Semaphores CSCI 444/544 Operating Systems Fall 2008.
CS510 Concurrent Systems Class 5 Threads Cannot Be Implemented As a Library.
Cormac Flanagan UC Santa Cruz Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs Jaeheon Yi UC Santa Cruz Stephen Freund.
/ PSWLAB Eraser: A Dynamic Data Race Detector for Multithreaded Programs By Stefan Savage et al 5 th Mar 2008 presented by Hong,Shin Eraser:
C. FlanaganType Systems for Multithreaded Software1 Cormac Flanagan UC Santa Cruz Stephen N. Freund Williams College Shaz Qadeer Microsoft Research.
15-740/ Oct. 17, 2012 Stefan Muller.  Problem: Software is buggy!  More specific problem: Want to make sure software doesn’t have bad property.
Object Oriented Analysis & Design SDL Threads. Contents 2  Processes  Thread Concepts  Creating threads  Critical sections  Synchronizing threads.
Accelerating Precise Race Detection Using Commercially-Available Hardware Transactional Memory Support Serdar Tasiran Koc University, Istanbul, Turkey.
The University of Adelaide, School of Computer Science
Eraser: A Dynamic Data Race Detector for Multithreaded Programs STEFAN SAVAGE, MICHAEL BURROWS, GREG NELSON, PATRICK SOBALVARRO, and THOMAS ANDERSON Ethan.
Pallavi Joshi* Mayur Naik † Koushik Sen* David Gay ‡ *UC Berkeley † Intel Labs Berkeley ‡ Google Inc.
50.530: Software Engineering Sun Jun SUTD. Week 8: Race Detection.
DoubleChecker: Efficient Sound and Precise Atomicity Checking Swarnendu Biswas, Jipeng Huang, Aritra Sengupta, and Michael D. Bond The Ohio State University.
Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT.
COMP 111 Threads and concurrency Sept 28, Tufts University Computer Science2 Who is this guy? I am not Prof. Couch Obvious? Sam Guyer New assistant.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Mutual Exclusion.
ECE122 Feb. 22, Any question on Vehicle sample code?
Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond and Milind Kulkarni ASPLOS 2015, ISTANBUL, TURKEY Hybrid Static-Dynamic Analysis for Statically.
Dynamic Data Race Detection. Sources Eraser: A Dynamic Data Race Detector for Multithreaded Programs –Stefan Savage, Michael Burrows, Greg Nelson, Patric.
Data races, informally [More formal definition to follow] “race condition” means two different things Data race: Two threads read/write, write/read, or.
Drinking from Both Glasses: Adaptively Combining Pessimistic and Optimistic Synchronization for Efficient Parallel Runtime Support Man Cao Minjia Zhang.
Multiprocessor Cache Consistency (or, what does volatile mean?) Andrew Whitaker CSE451.
Effective Static Deadlock Detection Mayur Naik* Chang-Seo Park +, Koushik Sen +, David Gay* *Intel Research, Berkeley + UC Berkeley.
HARD: Hardware-Assisted lockset- based Race Detection P.Zhou, R.Teodorescu, Y.Zhou. HPCA’07 Shimin Chen LBA Reading Group Presentation.
Effective Static Deadlock Detection Mayur Naik (Intel Research) Chang-Seo Park and Koushik Sen (UC Berkeley) David Gay (Intel Research)
Aritra Sengupta, Man Cao, Michael D. Bond and Milind Kulkarni PPPJ 2015, Melbourne, Florida, USA Toward Efficient Strong Memory Model Support for the Java.
Specifying Multithreaded Java semantics for Program Verification Abhik Roychoudhury National University of Singapore (Joint work with Tulika Mitra)
Week 9, Class 3: Java’s Happens-Before Memory Model (Slides used and skipped in class) SE-2811 Slide design: Dr. Mark L. Hornick Content: Dr. Hornick Errors:
CSCI1600: Embedded and Real Time Software Lecture 17: Concurrent Programming Steven Reiss, Fall 2015.
Week 8, Class 3: Model-View-Controller Final Project Worth 2 labs Cleanup of Ducks Reducing coupling Finishing FactoryMethod Cleanup of Singleton SE-2811.
Concurrent Revisions: A deterministic concurrency model. Daan Leijen & Sebastian Burckhardt Microsoft Research (OOPSLA 2010, ESOP 2011)
FastTrack: Efficient and Precise Dynamic Race Detection [FlFr09] Cormac Flanagan and Stephen N. Freund GNU OS Lab. 23-Jun-16 Ok-kyoon Ha.
IThreads A Threading Library for Parallel Incremental Computation Pramod Bhatotia Pedro Fonseca, Björn Brandenburg (MPI-SWS) Umut Acar (CMU) Rodrigo Rodrigues.
Debuggers. Errors in Computer Code Errors in computer programs are commonly known as bugs. Three types of errors in computer programs –Syntax errors –Runtime.
December 1, 2006©2006 Craig Zilles1 Threads & Atomic Operations in Hardware  Previously, we introduced multi-core parallelism & cache coherence —Today.
Presenter: Godmar Back
Lightweight Data Race Detection for Production Runs
Concurrency 2 CS 2110 – Spring 2016.
An Operational Approach to Relaxed Memory Models
Threads Cannot Be Implemented As a Library
Atomic Operations in Hardware
Atomic Operations in Hardware
Effective Data-Race Detection for the Kernel
Specifying Multithreaded Java semantics for Program Verification
Jipeng Huang, Michael D. Bond Ohio State University
CSCI1600: Embedded and Real Time Software
CS333 Intro to Operating Systems
Why we have Counterintuitive Memory Models
Compilers, Languages, and Memory Models
CSCI1600: Embedded and Real Time Software
Lecture 20: Synchronization
Problems with Locks Andrew Whitaker CSE451.
Presentation transcript:

Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin

Detecting data races in production

Overhead FastTrack [Flanagan & Freund ’09] 80x  8x

Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Number of threads

Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Problem in future Problem today

Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Pacer(c reads&writes + c sync n) r + c non-sampling (1 – r) Sampling rate

Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Pacer(c reads&writes + c sync n) r + c non-sampling (1 – r) Sampling periods Non-sampling periods

Overhead FastTrack [Flanagan & Freund ’09] c reads&writes + c sync n Pacer(c reads&writes + c sync n) r + c non-sampling (1 – r) Probability (detecting any race) FastTrack1 Pacerr

Detect race  first access sampled

Sampling period Thread AThread B Non-sampling period Sampling period Non-sampling period

Thread AThread B write x read x read y write y

Insight #1: Stop tracking variable after non-sampled access

Thread A write x unlock m Thread B

Thread A write x unlock m Thread B lock m

Thread A write x unlock m Thread B lock m write x

Thread A write x unlock m read x Thread B lock m write x

Thread A write x unlock m read x Thread B lock m write x Race!

Thread A write x unlock m read x Thread B lock m write x Race!

Thread A write x unlock m read x Thread B lock m write x ABAB Vector clocks

Thread A write x unlock m read x Thread B lock m write x ABAB Vector clocks

Thread A write x unlock m read x Thread B lock m write x ABAB Vector clocks

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x Increment clock Increment clock ABAB

Thread A write x unlock m read x Thread B lock m write x Join clocks Join clocks ABAB

Thread A write x unlock m read x Thread B lock m write x Happens before? ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x No work performed ABAB

Thread A write x unlock m read x Thread B lock m write x Race uncaught ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x Happens before? Race! ABAB

Insight #2: We only care whether “A happens before B” if A is sampled

Thread AThread B Do these events happen before other events? We don’t care!

Increment clocks Thread AThread B Don’t increment clocks Increment clocks Don’t increment clocks Do these events happen before other events? We don’t care!

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m No clock increment ABAB

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m Unnecessary join ABAB

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m O(n)  O(1) ABAB

1

Qualitative improvement in time & space

Probability (detecting any race) = r ?

LiteRace [Marino et al. ’09] Cold-region hypothesis [Chilimbi & Hauswirth ’04] Full analysis at synchronization operations

Accuracy, time, space  sampling rate Detect race  first access sampled

Accuracy, time, space  sampling rate Detect race  first access sampled Qualitative improvement

Accuracy, time, space  sampling rate Detect race  first access sampled Qualitative improvement Help developers fix difficult-to-reproduce bugs

Accuracy, time, space  sampling rate Detect race  first access sampled Qualitative improvement Help developers fix difficult-to-reproduce bugs Thank you

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB v6 Vector clock versions

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB v

Thread A unlock m1 … unlock m2 Thread B lock m1 … lock m ABAB v Join unnecessary v6

Qualitative improvement

 Core 2 Quad (4 cores)  Multithreaded benchmarks (DaCapo & SPECjbb2000)  Evaluating sampling-based race detection  Need 100s of trials to evaluate  Some races are rare  Evaluate only frequent races

 Two accesses to same variable (one is a write)  One access doesn’t happen before the other  Program order  Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write

Thread A write x unlock m Thread B  Two accesses to same variable (one is a write)  One access doesn’t happen before the other  Program order  Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write

Thread A write x unlock m Thread B lock m write x  Two accesses to same variable (one is a write)  One access doesn’t happen before the other  Program order  Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write

Thread A write x unlock m read x Thread B lock m write x  Two accesses to same variable (one is a write)  One access doesn’t happen before the other  Program order  Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write

Thread A write x unlock m read x Thread B lock m write x Race!  Two accesses to same variable (one is a write)  One access doesn’t happen before the other  Program order  Synchronization order ▪ Acquire-release ▪ Wait-notify ▪ Fork-join ▪ Volatile read-write

Races indicate  Atomicity violations  Order violations

Races indicate  Atomicity violations  Order violations Races lead to  Sequential consistency violations  No races  sequential consistency (Java/C++)  Races  writes observed out of order

Races indicate  Atomicity violations  Order violations Races lead to  Sequential consistency violations  No races  sequential consistency (Java/C++)  Races  writes observed out of order Most races potentially harmful [Flanagan & Freund ’10]

class ProducerConsumer { boolean ready; int x; produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }

class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }

class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }

class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }

class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; } Can read old value

class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { … = x; while (!ready) { } } Legal reordering by compiler or hardware

class ProducerConsumer { boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; }

class ProducerConsumer { volatile boolean ready; int x; T1 T2 produce() { x = … ; ready = true; } consume() { while (!ready) { } … = x; } Happens- before edge

class LibraryBook { Set borrowers; }

class LibraryBook { Set borrowers; addBorrower(Person p) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }

class LibraryBook { Set borrowers; addBorrower(Person p) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }

class LibraryBook { Set borrowers; addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }

class LibraryBook { Set borrowers; addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet (); } borrowers.add(p); }

addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { borrowers = new HashSet(); }... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }

addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { HashSet obj = alloc HashSet; obj. (); borrowers = obj; }... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }

addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { HashSet obj = alloc HashSet; borrowers = obj; obj. (); }... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }

addBorrower(Person p) { if (borrowers == null) { synchronized (this) { if (borrowers == null) { HashSet obj = alloc HashSet; borrowers = obj; obj. (); }}}... borrowers.add(p); } addBorrower(Person p) { if (borrowers == null) {... } borrowers.add(p); }

33% base overhead ~50% overhead

Program alone FastTrackPacer Detection rate 0occurrence rateoccurrence rate × r Running time tt(c 1 + c 2 n)t[(c 1 + c 2 n)r + c 3 ]  Evaluate only frequent races  Evaluate scaling with r  Don’t evaluate scaling with n

50 million people

 Energy Management System  Alarm and Event Processing Routine (1 MLOC)

 Energy Management System  Alarm and Event Processing Routine (1 MLOC)  Post-mortem analysis: 8 weeks "This fault was so deeply embedded, it took them weeks of poring through millions of lines of code and data to find it.” –Ralph DiNicola, FirstEnergy

 Race condition  Two threads writing to data structure simultaneously  Usually occurs without error  Small window for causing data corruption

 Tracks happens-before: sound & precise  80X slowdown  Each analysis step: O(n) time (n = # of threads)

 Tracks happens-before: sound & precise  80X slowdown  Each analysis step: O(n) time (n = # of threads)  FastTrack [Flanagan & Freund ’09]  Reads & writes (97%): O(1) time  Synchronization (3%): O(n) time  8X slowdown

 Tracks happens-before: sound & precise  80X slowdown  Each analysis step: O(n) time (n = # of threads)  FastTrack [Flanagan & Freund ’09]  Reads & writes (97%): O(1) time  Synchronization (3%): O(n) time  8X slowdown Problem today Problem in future

 Tracks happens-before: sound & precise  80X slowdown  Each analysis step: O(n) time (n = # of threads)  FastTrack [Flanagan & Freund ’09]  Reads & writes (97%): O(1) time  Synchronization (3%): O(n) time  8X slowdown

Thread AThread B ABAB Vector clocks

Thread AThread B ABAB Vector clocks Thread A’s logical time Thread B’s logical time

Thread AThread B ABAB Vector clocks Last logical time “received” from B Last logical time “received” from A

ABAB Thread A unlock m Thread B lock m Increment clock

ABAB Thread A unlock m Thread B lock m Join clocks Join clocks

ABAB Thread A unlock m Thread B lock m n = # of threads O(n) time

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?

Thread A write x unlock m read x Thread B lock m write x ABAB Happens before? Race!

FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Sampling rate

FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n) No. of threads

FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n) Reads & writesSynchronization

FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n) Reads & writes Problem today Problem in future Synchronization

FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n)t[(c 1 + c 2 n)r + c 3 ] Overhead in sampling periods

FastTrack [Flanagan & Freund ’09] Pacer Detection rateoccurrence rateoccurrence rate × r Running timet(c 1 + c 2 n)t[(c 1 + c 2 n)r + c 3 ] Overhead in sampling periods Overhead in non-sampling periods (small)

Data race occurs extremely rarely Data race occurs periodically Pre-deploymentDeployed

“We test exhaustively … we had in excess of three million online operational hours [342 years] in which nothing had ever exercised that bug.” –Mike Unum, manager of commercial solutions, GE Energy

Data race  buggy execution

Thread AThread B ABAB Vector clocks

Thread AThread B ABAB Vector clocks Thread A’s logical time Thread B’s logical time

Thread AThread B ABAB Vector clocks Last logical time “received” from B Last logical time “received” from A

ABAB Thread A unlock m Thread B lock m Increment clock

ABAB Thread A unlock m Thread B lock m Join clocks Join clocks

ABAB Thread A unlock m Thread B lock m n = # of threads O(n) time

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?

Thread A write x unlock m read x Thread B lock m write x ABAB

Thread A write x unlock m read x Thread B lock m write x ABAB Happens before?

Thread A write x unlock m read x Thread B lock m write x ABAB Happens before? Race!