Effective Static Race Detection for Java Mayur Naik Alex Aiken Stanford University.

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

A Randomized Dynamic Program Analysis for Detecting Real Deadlocks Pallavi Joshi  Chang-Seo Park  Koushik Sen  Mayur Naik ‡  Par Lab, EECS, UC Berkeley‡
Effective Static Deadlock Detection
1 Chao Wang, Yu Yang*, Aarti Gupta, and Ganesh Gopalakrishnan* NEC Laboratories America, Princeton, NJ * University of Utah, Salt Lake City, UT Dynamic.
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
Conditional Must Not Aliasing for Static Race Detection Mayur Naik Alex Aiken Stanford University.
A Randomized Dynamic Program Analysis for Detecting Real Deadlocks Koushik Sen CS 265.
Scalable and Precise Dynamic Datarace Detection for Structured Parallelism Raghavan RamanJisheng ZhaoVivek Sarkar Rice University June 13, 2012 Martin.
2 Motivation Distributed Systems Notoriously difficult to build without appropriate assistance. First ones were based on low-level message-passing mechanisms.
Iterative Context Bounding for Systematic Testing of Multithreaded Programs Madan Musuvathi Shaz Qadeer Microsoft Research.
CORK: DYNAMIC MEMORY LEAK DETECTION FOR GARBAGE- COLLECTED LANGUAGES A TRADEOFF BETWEEN EFFICIENCY AND ACCURATE, USEFUL RESULTS.
Scaling Model Checking of Dataraces Using Dynamic Information Ohad Shacham Tel Aviv University IBM Haifa Lab Mooly Sagiv Tel Aviv University Assaf Schuster.
Types for Atomicity in Multithreaded Software Shaz Qadeer Microsoft Research (Joint work with Cormac Flanagan)
Atomicity in Multi-Threaded Programs Prachi Tiwari University of California, Santa Cruz CMPS 203 Programming Languages, Fall 2004.
SYNAR Systems Networking and Architecture Group CMPT 886: Special Topics in Operating Systems and Computer Architecture Dr. Alexandra Fedorova School of.
C. FlanaganSAS’04: Type Inference Against Races1 Type Inference Against Races Cormac Flanagan UC Santa Cruz Stephen N. Freund Williams College.
Thread-modular Abstraction Refinement Tom Henzinger Ranjit Jhala Rupak Majumdar [UC Berkeley] Shaz Qadeer [Microsoft Research]
Mayur Naik Alex Aiken John Whaley Stanford University Effective Static Race Detection for Java.
Race Checking by Context Inference Tom Henzinger Ranjit Jhala Rupak Majumdar UC Berkeley.
Software Reliability Methods Sorin Lerner. Software reliability methods: issues What are the issues?
1 RELAY: Static Race Detection on Millions of Lines of Code Jan Voung, Ranjit Jhala, and Sorin Lerner UC San Diego speaker.
1 A Modular Checker for Multithreaded Programs Cormac Flanagan HP Systems Research Center Joint work with Shaz Qadeer Sanjit A. Seshia.
Cormac Flanagan UC Santa Cruz Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs Jaeheon Yi UC Santa Cruz Stephen Freund.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
INTEL CONFIDENTIAL Why Parallel? Why Now? Introduction to Parallel Programming – Part 1.
/ PSWLAB Eraser: A Dynamic Data Race Detector for Multithreaded Programs By Stefan Savage et al 5 th Mar 2008 presented by Hong,Shin Eraser:
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
C. FlanaganType Systems for Multithreaded Software1 Cormac Flanagan UC Santa Cruz Stephen N. Freund Williams College Shaz Qadeer Microsoft Research.
Finding Optimum Abstractions in Parametric Dataflow Analysis Xin Zhang Georgia Tech Mayur Naik Georgia Tech Hongseok Yang University of Oxford.
Introduction and Overview Questions answered in this lecture: What is an operating system? How have operating systems evolved? Why study operating systems?
Programming Paradigms for Concurrency Part 2: Transactional Memories Vasu Singh
Accelerating Precise Race Detection Using Commercially-Available Hardware Transactional Memory Support Serdar Tasiran Koc University, Istanbul, Turkey.
Uncovering the Multicore Processor Bottlenecks Server Design Summit Shay Gal-On Director of Technology, EEMBC.
1 Effective Static Race Detection for Java Mayur, Alex, CS Department Stanford University Presented by Roy Ganor 14/2/08 Point-To Analysis Seminar.
Deadlock Detection Nov 26, 2012 CS 8803 FPL 1. Part I Static Deadlock Detection Reference: Effective Static Deadlock Detection [ICSE’09]
Eraser: A Dynamic Data Race Detector for Multithreaded Programs STEFAN SAVAGE, MICHAEL BURROWS, GREG NELSON, PATRICK SOBALVARRO, and THOMAS ANDERSON Ethan.
/ PSWLAB Type-Based Race Detection for J AVA by Cormac Flanagan, Stephen N. Freund 22 nd Feb 2008 presented by Hong,Shin Type-Based.
Dynamic Analysis of Multithreaded Java Programs Dr. Abhik Roychoudhury National University of Singapore.
Pallavi Joshi* Mayur Naik † Koushik Sen* David Gay ‡ *UC Berkeley † Intel Labs Berkeley ‡ Google Inc.
Race Checking by Context Inference Tom Henzinger Ranjit Jhala Rupak Majumdar UC Berkeley.
DoubleChecker: Efficient Sound and Precise Atomicity Checking Swarnendu Biswas, Jipeng Huang, Aritra Sengupta, and Michael D. Bond The Ohio State University.
COMP 111 Threads and concurrency Sept 28, Tufts University Computer Science2 Who is this guy? I am not Prof. Couch Obvious? Sam Guyer New assistant.
Colorama: Architectural Support for Data-Centric Synchronization Luis Ceze, Pablo Montesinos, Christoph von Praun, and Josep Torrellas, HPCA 2007 Shimin.
Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond and Milind Kulkarni ASPLOS 2015, ISTANBUL, TURKEY Hybrid Static-Dynamic Analysis for Statically.
Drinking from Both Glasses: Adaptively Combining Pessimistic and Optimistic Synchronization for Efficient Parallel Runtime Support Man Cao Minjia Zhang.
Dataflow Analysis for Concurrent Programs using Datarace Detection Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner LBA Reading Group Michelle Goodstein.
Michael Bond Katherine Coons Kathryn McKinley University of Texas at Austin.
Parallelism without Concurrency Charles E. Leiserson MIT.
Detecting Atomicity Violations via Access Interleaving Invariants
Effective Static Deadlock Detection Mayur Naik* Chang-Seo Park +, Koushik Sen +, David Gay* *Intel Research, Berkeley + UC Berkeley.
Threads. Readings r Silberschatz et al : Chapter 4.
Effective Static Deadlock Detection Mayur Naik (Intel Research) Chang-Seo Park and Koushik Sen (UC Berkeley) David Gay (Intel Research)
Pointer Analysis – Part I CS Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization.
Aritra Sengupta, Man Cao, Michael D. Bond and Milind Kulkarni PPPJ 2015, Melbourne, Florida, USA Toward Efficient Strong Memory Model Support for the Java.
Gauss Students’ Views on Multicore Processors Group members: Yu Yang (presenter), Xiaofang Chen, Subodh Sharma, Sarvani Vakkalanka, Anh Vo, Michael DeLisi,
Using Escape Analysis in Dynamic Data Race Detection Emma Harrington `15 Williams College
Silberschatz, Galvin and Gagne ©2009Operating System Concepts – 8 th Edition Chapter 4: Threads.
A User-Guided Approach to Program Analysis Ravi Mangal, Xin Zhang, Mayur Naik Georgia Tech Aditya Nori Microsoft Research.
Making k-Object-Sensitive Pointer Analysis More Precise with Still k-Limiting Tian Tan, Yue Li and Jingling Xue SAS 2016 September,
Lightweight Data Race Detection for Production Runs
A brief intro to: Parallelism, Threads, and Concurrency
Effective Data-Race Detection for the Kernel
Amir Kamil and Katherine Yelick
Threads and Memory Models Hal Perkins Autumn 2011
Over-Approximating Boolean Programs with Unbounded Thread Creation
Threads and Memory Models Hal Perkins Autumn 2009
Background and Motivation
Amir Kamil and Katherine Yelick
Tools for the development of parallel applications
Compiler-assisted debugging of concurrent programs
Presentation transcript:

Effective Static Race Detection for Java Mayur Naik Alex Aiken Stanford University

The Hardware Concurrency Revolution Clock speeds have peaked But #transistors continues to grow exponentially Vendors are shipping multi-core processors Intel CPU Introductions

The Software Concurrency Revolution Until now: Increasing clock speeds => even sequential programs ran faster Henceforth: Increasing #CPU cores => only concurrent programs will run faster

x=t1 t2 = x; t2 = t2 + 1; x = t2; t1 = x; t1 = t1 + 1; x = t1; t2++ Reasoning about Concurrent Programs is Hard t1=x t1++ x=t1 t2=x t2++ x=t2 t2=x t2++ x=t2 t1=x t1++ x=t1 t1=x t2=x x=t2 t1++ (20 total)... x==k x==k+2 x==k x==k+2 x==k x==k+1 

Race Conditions The same location may be accessed by different threads simultaneously. (And at least one access is a write.)

Particularly insidious concurrency bug –Triggered non-deterministically –No fail-stop behavior even in safe languages like Java Fundamental in concurrency theory and practice –Lies at heart of many concurrency problems atomicity checking, deadlock detection,... –Today’s concurrent programs riddled with races “… most Java programs are so rife with concurrency bugs that they work only ‘by accident’.” – Brian Goetz, Java Concurrency in Practice, Addison-Wesley, 2006 Race Conditions

The same location may be accessed by different threads simultaneously without holding a common lock. (And at least one access is a write.) The same location may be accessed by different threads simultaneously. (And at least one access is a write.) x=t1 Locking for Race Freedom t1 = x; t1 = t1 + 1; x = t1; t2 = x; t2 = t2 + 1; x = t2; t2++ Example t1=x t1++ x=t1 t2=x t2++ x=t2 t2=x t2++ x=t2 t1=x t1++ x=t1 t1=x t2=x x=t2 t1++ (20 total)... x==k x==k+2 x==k x==k+2 x==k x==k+1  sync (l) { }}

Previous Work Allen & Padua 1987; Miller & Choi 1988; Balasundaram & Kennedy 1989; Karam et al. 1989; Emrath et al. 1989; Schonberg 1989; … Dinning & Schonberg 1990; Hood et al. 1990; Choi & Min 1991; Choi et al. 1991; Netzer & Miller 1991; Netzer & Miller 1992; Sterling 1993; Mellor-Crummey 1993; Bishop & Dilger 1996; Netzer et al. 1996; Savage et al. 1997; Cheng et al. 1998; Richards & Larus 1998; Ronsse & De Bosschere 1999; Flanagan & Abadi 1999; … Aiken et al. 2000; Flanagan & Freund 2000; Christiaens & Bosschere 2001; Flanagan & Freund 2001; von Praun & Gross 2001; Choi et al. 2002; Engler & Ashcraft 2003; Grossman 2003; O’Callahan & Choi 2003; Pozniansky & Schuster 2003; von Praun & Gross 2003; Agarwal & Stoller 2004; Flanagan & Freund 2004; Henzinger et al. 2004; Nishiyama 2004; Qadeer & Wu 2004; Yu et al. 2005; Sasturkar et al. 2005; Elmas et al. 2006; Pratikakis et al. 2006; Sen & Agha 2006; Zhou et al. 2007;... Observation: Existing techniques and tools find relatively few bugs

Our Results 392 bugs in mature Java programs comprising 1.5 MLOC –Many fixed within a week by developers

Our Race Detection Approach all pairs racing pairs

Precision –Showed precise may alias analysis is central (PLDI’06) –low false-positive rate (20%) Soundness –Devised conditional must not alias analysis (POPL’07) –Circumvents must alias analysis Challenges Handle multiple aspects –Same location accessed –… by different threads –… simultaneously Correlate locks with locations they guard –… without common lock held Same location accessedby different threadssimultaneously without common lock held

Our Race Detection Approach all pairs aliasing pairs shared pairs parallel pairs unlocked pairs Same location accessedby different threadssimultaneously without common lock held racing pairs False Pos. Rate: 20%

MUST-NOT-ALIAS( e1, e2 ) ¬ MAY-ALIAS( e1, e2 ) Field f is race-free if: Alias Analysis for Race Detection // Thread 1: // Thread 2: sync (l1) { sync (l2) { … e1.f … … e2.f … } e1 and e2 never refer to the same value

k-Object-Sensitive May Alias AnalysisMay Alias Analysis Large body of work Idea #1: Context-insensitive analysis –Abstract value = set of allocation sites foo() { bar() { … e1.f … … e2.f … } ¬ MAY-ALIAS( e1, e2 ) if Sites( e1 ) ∩ Sites( e2 ) = ∅ Idea #2: Context-sensitive analysis (k-CFA) –Context (k=1) = call site foo() { bar() { e1.baz(); e2.baz(); } Analyze function baz in two contexts Problem: Too few abstract values! Problem: Too few or too many contexts! –Abstract value = set of strings of ≤ k allocation sites –Context (k=1) = allocation site of this parameter Recent may alias analysis [Milanova et al. ISSTA’03] Solution:

k-Object-Sensitive Analysis: Our Contributions No scalable implementations for even k = 1 Insights: –Symbolic representation of relations BDDs [Whaley-Lam PLDI’04, Lhotak-Hendren PLDI’04] –Demand-driven race detection algorithm Begin with k = 1 for all allocation sites Increment k only for those involved in races Allow scalability to k = 5

Our Race Detection Approach Same location accessedby different threadssimultaneously without common lock held all pairs aliasing pairs shared pairs parallel pairs unlocked pairs racing pairs

¬ MAY-ALIAS( e1, e2 ) l1 and l2 always refer to the same value Field f is race-free if: Alias Analysis for Race Detection // Thread 1: // Thread 2: sync (l1) { sync (l2) { … e1.f … … e2.f … } MUST-ALIAS( l1, l2 ) OR e1 and e2 never refer to the same value

Must Alias Analysis Small body of work –Much harder problem than may alias analysis Impediment to many previous race detection approaches –Folk wisdom: Static race detection is intractable Insight: Must alias analysis not necessary for race detection!

Field f is race-free if: New Idea: Conditional Must Not Alias Analysis Whenever l1 and l2 refer to different values, e1 and e2 also refer to different values MUST-NOT-ALIAS( l1, l2 ) => MUST-NOT-ALIAS( e1, e2 ) // Thread 1: // Thread 2: sync (l1) { sync (l2) { … e1.f … … e2.f … }

Example a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } … … a[0] h1 h0 a[N-1] h2 h1 g h2 g … … a[i] h1 g x2 = a[*]; sync (?) { x2.g.f = …; } x1 = a[*]; sync (?) { x1.g.f = …; }

MUST-NOT-ALIAS( l1, l2 ) => MUST-NOT-ALIAS( e1, e2 )MUST-NOT-ALIAS( a, a ) => MUST-NOT-ALIAS( x1.g, x2.g ) true Field f is race-free if: Easy Case: Coarse-grained Locking h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g x2 = a[*]; sync (a) { x2.g.f = …; } a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } x1 = a[*]; sync (a) { x1.g.f = …; }

Example x2 = a[*]; sync (?) { x2.g.f = …; } a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } x1 = a[*]; sync (?) { x1.g.f = …; } h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g

MUST-NOT-ALIAS( x1.g, x2.g ) => MUST-NOT-ALIAS( x1.g, x2.g ) MUST-NOT-ALIAS( l1, l2 ) => MUST-NOT-ALIAS( e1, e2 ) true Easy Case: Fine-grained Locking Field f is race-free if: x2 = a[*]; sync (x2.g) { x2.g.f = …; } a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } x1 = a[*]; sync (x1.g) { x1.g.f = …; } h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g

Example x2 = a[*]; sync (?) { x2.g.f = …; } a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } x1 = a[*]; sync (?) { x1.g.f = …; } h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g

MUST-NOT-ALIAS( l1, l2 ) => MUST-NOT-ALIAS( e1, e2 )MUST-NOT-ALIAS( x1, x2 ) => MUST-NOT-ALIAS( x1.g, x2.g ) true (field g of distinct h1 values linked to distinct h2 values) Hard Case: Medium-grained Locking Field f is race-free if: x2 = a[*]; sync (x2) { x2.g.f = …; } a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } x1 = a[*]; sync (x1) { x1.g.f = …; } h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g

h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g … … … … h0 h2 h1 h2 h1 ► ►► ► ► ► Disjoint Reachability –from distinct h1 values –we can reach (via 1 or more fields) –only distinct h2 values then {h2} ⊆ DR({h1}) In every execution, if: Note: Values abstracted by sets of allocation sites

MUST-NOT-ALIAS( l1, l2 ) => MUST-NOT-ALIAS( e1, e2 ) –( Sites ( e1 ) ∩ Sites ( e2 )) ⊆ DR( Sites ( l1 ) ∪ Sites ( l2 )) –e1 reachable from l1 and e2 reachable from l2 // Thread 1: // Thread 2: sync (l1) { sync (l2) { … e1.f … … e2.f … } Conditional Must Not Alias Analysis using Disjoint Reachability Sites ( l1 ) Sites ( e1 ) Sites ( e2 ) Sites ( l2 ) Field f is race-free if: ⊆ DR

–( Sites ( x1.g ) ∩ Sites ( x2.g )) ⊆ DR( Sites ( x1 ) ∪ Sites ( x2 )) –x1.g reachable from x1 and x2.g reachable from x2 –( Sites ( e1 ) ∩ Sites ( e2 )) ⊆ DR( Sites ( l1 ) ∪ Sites ( l2 )) –e1 reachable from l1 and e2 reachable from l2 –({h2}) ⊆ DR({h1}) –x1.g reachable from x1 and x2.g reachable from x2 Hard Case: Medium-grained Locking Field f is race-free if: –true x2 = a[*]; sync (x2) { x2.g.f = …; } a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } x1 = a[*]; sync (x1) { x1.g.f = …; } h0 … … a[0] h1 a[N-1] h2 h1 g h2 g … … a[i] h1 g

Experience with Chord Experimented with 12 multi-threaded Java programs –smaller programs used in previous work –larger, mature and widely-used open-source programs –whole programs and libraries Tool output and developer discussions available at: Programs being used by other researchers in race detection

Benchmarks vect1.1 htbl1.1 htbl1.4 vect1.4 tsp hedc ftp pool jdbm jdbf jtds derby classes KLOC description JDK 1.1 java.util.Vector JDK 1.1 java.util.Hashtable JDK 1.4 java.util.Hashtable JDK 1.4 java.util.Vector Traveling Salesman Problem Web crawler Apache FTP server Apache object pooling library Transaction manager O/R mapping system JDBC driver Apache RDBMS time 0m28s 0m27s 2m04s 2m02s 3m03s 9m10s 11m17s 10m29s 9m33s 9m42s 10m23s 36m03s

Pairs Retained After Each Stage (Log scale)

Classification of Unlocked Pairs vect1.1 htbl1.1 htbl1.4 vect1.4 tsp hedc ftp pool jdbm jdbf jtds derby harmful benign false # bugs

Developer Feedback 16 bugs in jTDS –Before: “As far as we know, there are no concurrency issues in jTDS …” –After: “It is probably the case that the whole synchronization approach in jTDS should be revised from scratch...” 17 bugs in Apache Commons Pool –“Thanks to an audit by Mayur Naik many potential synchronization issues have been fixed” -- Release notes for Commons Pool bugs in Apache Derby –“This looks like *very* valuable information and I for one appreciate you using Derby … Could this tool be run on a regular basis? It is likely that new races could get introduced as new code is submitted...”

Related Work Static (compile-time) race detection –Need to approximate multiple aspects –Need to perform must alias analysis –Sacrifice precision, soundness, scalability Dynamic (run-time) race detection –Current state of the art –Inherently unsound –Cannot analyze libraries Shape Analysis –much more expensive than disjoint reachability

Summary of Contributions Precise race detection (PLDI’06) –Key idea: k-object-sensitive may alias analysis –Important client for may alias analyses Sound race detection (POPL’07) –Key idea: Conditional must not alias analysis –Has applications besides race detection Effective race detection –392 bugs in mature Java programs comprising 1.5 MLOC –Many fixed within a week by developers

Future Work Analysis of higher-level concurrency properties –transactions, atomicity, … –Races fundamental but low-level Language idioms for concurrent programming –Facilitate program analysis for tools –Facilitate program understanding for programmers

The End