1 Effective Static Race Detection for Java. Mayur, Alex, CS Department, Stanford University. Presented by Roy Ganor, 14/2/08, Point-To Analysis Seminar, Tel Aviv University.
2 A Few Definitions (not!)
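3 Motivation: Concurrent = Hard. Thread 1 runs t1 = x; t1 = t1 + 1; x = t1; and thread 2 runs t2 = x; t2 = t2 + 1; x = t2; Sample interleavings (20 total), each starting from x==k: [t1=x t1++ x=t1 t2=x t2++ x=t2] ends with x==k+2; [t2=x t2++ x=t2 t1=x t1++ x=t1] ends with x==k+2; [t1=x t2=x t2++ x=t2 t1++ x=t1] ends with x==k+1 ×. Guarding each update with sync (l) { … } rules out the × interleavings.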
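The interleaving count on this slide can be checked mechanically. The following is a minimal sketch, not part of the original talk: the class name `Interleavings` and its methods are hypothetical, and it simulates the two three-step threads deterministically rather than running real threads.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates every interleaving of two threads that each run: t = x; t++; x = t. */
final class Interleavings {
    static final int STEPS = 3; // read, increment, write

    /** Returns the final value of x for each interleaving, starting from x == k. */
    static List<Integer> finalValues(int k) {
        List<Integer> results = new ArrayList<>();
        explore(k, new int[2], new int[2], results);
        return results;
    }

    // pc[i] is the next step of thread i; t[i] is that thread's local variable.
    private static void explore(int x, int[] pc, int[] t, List<Integer> results) {
        if (pc[0] == STEPS && pc[1] == STEPS) {
            results.add(x);
            return;
        }
        for (int i = 0; i < 2; i++) {
            if (pc[i] == STEPS) continue; // thread i has finished
            int[] pc2 = pc.clone();
            int[] t2 = t.clone();
            int x2 = x;
            switch (pc[i]) {
                case 0: t2[i] = x; break;          // t = x
                case 1: t2[i] = t2[i] + 1; break;  // t++
                default: x2 = t2[i]; break;        // x = t
            }
            pc2[i]++;
            explore(x2, pc2, t2, results);
        }
    }

    public static void main(String[] args) {
        List<Integer> r = finalValues(0);
        long lost = r.stream().filter(v -> v == 1).count();
        System.out.println(r.size() + " interleavings, " + lost + " lose an update");
    }
}
```

Only the two fully serial schedules reach x==k+2; every schedule in which both reads precede both writes loses one update.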
4 Definition: A data race occurs when two threads access the same memory location, without ordering constraints, and at least one access is a write. Because thread scheduling is non-deterministic, races are difficult to reproduce and fix.
5 Dynamic vs. Static Race Detection. Dynamic (lock-set*, happens-before*): the program is executed; memory accesses and synchronization operations are recorded and analyzed; "post-mortem" variants record critical events and analyze them later. Static: employs compile-time analysis on the program source, reporting all potential races that could occur in any possible program execution.
6 Pros and Cons. Dynamic: reports come from feasible executions; lower false positive rate; not all paths are considered, so it is not sound; cannot certify a program to be race free; adds overhead to program execution. Static: false positives (reporting a potential data race when none exists); scaling is also difficult; frameworks / open libraries; sound?
7 Key Problems: Precision, Scalability, Synchronization Idioms, Open Programs, Counterexamples
8 Harness Synthesis. Problem: detecting races in open programs is important, but such programs have missing callees and callers. Solution: synthesize a harness that simulates scenarios exercising the program's interfaces. For each interface, the harness: 1. Declares a local variable of each type. 2. Assigns to each local variable of reference type T an object of each concrete class of type T. 3. Invokes each method on each combination of local variables, and assigns the return value, if any, to each local variable respecting the result type of the method. 4. Simulates executing each pair of calls in separate threads on shared data.
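To make the harness steps concrete, here is a heavily simplified sketch. It is an illustration only, not the tool's actual generator: the class `HarnessSketch` and its `synthesize` method are hypothetical, it handles a single class, and it only considers public no-argument instance methods (so it skips the argument combinations, return-value assignments, and thread simulation of steps 3 and 4). The nested class `A` mirrors the running example, with `synchronized` written out.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class HarnessSketch {
    /** Class under test, mirroring the running example. */
    static class A {
        private int f;
        public A() { f = 0; }
        public int get() { return rd(); }
        public synchronized int inc() { int t = rd() + (new A()).wr(1); return wr(t); }
        private int rd() { return f; }
        private int wr(int x) { f = x; return x; }
    }

    /** Emits harness statements: declare a local, allocate it, call each public no-arg method. */
    static List<String> synthesize(Class<?> c, String var) {
        List<String> lines = new ArrayList<>();
        String type = c.getSimpleName();
        lines.add(type + " " + var + ";");
        lines.add(var + " = new " + type + "();");
        Method[] methods = c.getDeclaredMethods();
        // Reflection returns methods in unspecified order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            boolean callable = Modifier.isPublic(m.getModifiers())
                    && !Modifier.isStatic(m.getModifiers())
                    && !m.isSynthetic()
                    && m.getParameterCount() == 0;
            if (callable) lines.add(var + "." + m.getName() + "();");
        }
        return lines;
    }

    public static void main(String[] args) {
        synthesize(A.class, "a").forEach(System.out::println);
    }
}
```

For class `A` this emits the same four statements that appear in the body of `main()` in the running example.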
9 Original Pairs. F – get / set instance field x.f; G – get / set static field Class.f; A – get / set array element x[i]
10 Algorithm Outline. Starting with all (possible) pairs (JDBM: 11,189,853 pairs), reduce the pairs step by step (JDBM: 33,443 → 7,511 → 2,756 → 91 pairs). A pair is kept only if: each access is reachable from a thread-spawning call site that is itself reachable from main(); both access the same location (x.f == y.f); both access thread-shared data; and they do so without holding a common lock.
11 Running Example public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
12 Computing Original Pairs. All pairs of accesses such that: –Both access the same instance field or the same static field or array elements –At least one is a write
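The two conditions above can be sketched as a simple pairwise filter. This is an illustrative sketch: the `Access` class, the site labels, and the decision to count a statement paired with itself (two threads running the same write) are assumptions of this example, not details taken from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OriginalPairs {
    /** One field access: where it occurs, which field, and whether it writes. */
    static final class Access {
        final String site;
        final String field;
        final boolean isWrite;

        Access(String site, String field, boolean isWrite) {
            this.site = site;
            this.field = field;
            this.isWrite = isWrite;
        }

        @Override
        public String toString() {
            return site + (isWrite ? "[Wr " : "[Rd ") + field + "]";
        }
    }

    /** Keeps every pair that touches the same field with at least one write. */
    static List<String> originalPairs(List<Access> accesses) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < accesses.size(); i++) {
            // j starts at i: assumed here that two threads may run the same statement.
            for (int j = i; j < accesses.size(); j++) {
                Access a = accesses.get(i);
                Access b = accesses.get(j);
                if (a.field.equals(b.field) && (a.isWrite || b.isWrite)) {
                    pairs.add(a + " / " + b);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The three accesses to f in the running example.
        List<Access> accesses = Arrays.asList(
                new Access("A()", "f", true),
                new Access("rd", "f", false),
                new Access("wr", "f", true));
        originalPairs(accesses).forEach(System.out::println);
    }
}
```

The read in `rd` paired with itself is the only combination dropped, since neither side is a write.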
13 Example: Original Pairs public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
14 Computing Reachable Pairs Step 1 –Access pairs with at least one write to same field Step 2 –Consider access pair (e1, e2) –To have a race, e1 must be reachable from a thread-spawning call site s1 without “switching” threads –And s1 must be reachable from main –And similarly for e2
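Step 2 is essentially graph reachability. The sketch below is a simplification under stated assumptions: the call graph is context-insensitive, spawn sites are modeled at method granularity (a spawning method maps to the thread entry methods it starts), and all names (`ReachabilitySketch`, `reach`, `threadReachable`, the toy method names) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReachabilitySketch {
    /** Methods reachable from start by following the given edges (start included). */
    static Set<String> reach(String start, Map<String, List<String>> edges) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String m = work.pop();
            if (!seen.add(m)) continue; // already visited
            for (String next : edges.getOrDefault(m, Collections.emptyList())) work.push(next);
        }
        return seen;
    }

    /** Methods some spawned thread can run without switching threads. */
    static Set<String> threadReachable(Map<String, List<String>> calls,
                                       Map<String, List<String>> spawns) {
        // Spawn sites must be reachable from main, through calls or earlier spawns.
        Map<String, List<String>> all = new HashMap<>();
        for (Map<String, List<String>> g : Arrays.asList(calls, spawns)) {
            g.forEach((m, out) -> all.computeIfAbsent(m, k -> new ArrayList<>()).addAll(out));
        }
        Set<String> fromMain = reach("main", all);
        Set<String> result = new TreeSet<>();
        spawns.forEach((spawner, entries) -> {
            if (fromMain.contains(spawner)) {
                // Within a thread, follow call edges only ("without switching threads").
                for (String entry : entries) result.addAll(reach(entry, calls));
            }
        });
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = new HashMap<>();
        calls.put("run1", Arrays.asList("get"));
        calls.put("get", Arrays.asList("rd"));
        calls.put("run2", Arrays.asList("wr"));
        Map<String, List<String>> spawns = new HashMap<>();
        spawns.put("main", Arrays.asList("run1"));
        spawns.put("dead", Arrays.asList("run2")); // "dead" is never reached from main
        System.out.println(threadReachable(calls, spawns)); // [get, rd, run1]
    }
}
```

An access pair (e1, e2) would then survive this step only if the methods containing e1 and e2 both appear in the returned set.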
15 Example: Reachable Pairs public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
16 Example: Two Object-Sensitive Contexts public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
17 Example: 1st Context public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
18 Example: 2nd Context * public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
19 Example: Reachable Pairs public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
20 Computing Aliasing Pairs Steps 1-2 –Access pairs with at least one write to same field –And both are reachable from some thread Step 3 –To have a race, both must access the same memory location –Use alias analysis
21 Example: Aliasing Pairs public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
22 Computing Escaping Pairs Steps 1-3 –Access pairs with at least one write to same field –And both are reachable from some thread –And both can access the same memory location Step 4 –To have a race, the memory location must also be thread-shared –Use thread-escape analysis
23 Example: Escaping Pairs public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
24 Computing Unlocked Pairs Steps 1-4 –Access pairs with at least one write to same field –And both are reachable from some thread –And both can access the same memory location –And the memory location is thread-shared Step 5 –Discard pairs where the memory location is guarded by a common lock in both accesses –Needs must-alias analysis –We use an approximation based on may-alias analysis, which is unsound
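Step 5 can be pictured as a lock-set intersection test. The sketch below is illustrative only: `UnlockedPairs` and `isUnlocked` are hypothetical names, and locks are represented by abstract names such as `"a"`. Treating two equal abstract names as the same runtime lock is exactly the kind of may-alias approximation the slide flags as unsound.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class UnlockedPairs {
    /** True if the two accesses hold no lock in common, so the pair is retained. */
    static boolean isUnlocked(Set<String> locksHeld1, Set<String> locksHeld2) {
        // Equal abstract lock names are assumed to denote the same runtime lock.
        return Collections.disjoint(locksHeld1, locksHeld2);
    }

    public static void main(String[] args) {
        Set<String> viaGet = new HashSet<>();               // get() is not synchronized
        Set<String> viaInc = new HashSet<>();
        viaInc.add("a");                                    // inc() locks its receiver
        System.out.println(isUnlocked(viaGet, viaInc));     // true: pair is retained
        System.out.println(isUnlocked(viaInc, viaInc));     // false: pair is discarded
    }
}
```

In the running example, the read reached through `get()` holds no lock, so its pair with the write reached through `inc()` is retained as a potential race.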
25 Example: Unlocked Pairs public A() { f = 0; } public int get() { return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); return wr(t); } private int rd() { return f; } private int wr(int x) { f = x; return x; } static public void main() { A a; a = new A(); a.get(); a.inc(); }
26 Alias Analysis // Thread 1: sync (l1) { … e1.f … } // Thread 2: sync (l2) { … e2.f … } Field f is race-free if: MUST-ALIAS( l1, l2 ) (l1 and l2 always refer to the same value) OR ¬ MAY-ALIAS( e1, e2 ) (e1 and e2 never refer to the same value)
27 Must Alias Analysis Small body of work –Much harder problem than may alias analysis Impediment to many previous race detection approaches –Folk wisdom: Static race detection is intractable Insight: Must alias analysis not necessary for race detection!
28 New Idea: Conditional Must Not Alias Analysis // Thread 1: sync (l1) { … e1.f … } // Thread 2: sync (l2) { … e2.f … } Field f is race-free if: MUST-NOT-ALIAS( l1, l2 ) => MUST-NOT-ALIAS( e1, e2 ), i.e., whenever l1 and l2 refer to different values, e1 and e2 also refer to different values
29 Example a = new h0[N]; for (i = 0; i < N; i++) { a[i] = new h1; a[i].g = new h2; } Heap: the h0 array has elements a[0] … a[N-1]; each a[i] points to its own h1 object, whose field g points to its own h2 object. // Thread 1: x1 = a[*]; sync (?) { x1.g.f = … ; } // Thread 2: x2 = a[*]; sync (?) { x2.g.f = … ; } MUST-NOT-ALIAS( ?, ? ) => MUST-NOT-ALIAS( x1.g.f, x2.g.f )
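The heap shape in this example can be built and checked directly. The sketch below is an illustration under assumptions: the slide leaves the lock expression open (`?`), and this code picks the element itself (`x1`, `x2`) as one possible lock purely for demonstration. The class names `H1`, `H2` and all method names are hypothetical. It checks the conditional property at runtime on a concrete heap; it does not perform the static analysis.

```java
final class ConditionalMustNotAlias {
    static final class H2 { int f; }
    static final class H1 { H2 g; }

    /** Builds the heap from the example: each element owns a fresh g. */
    static H1[] build(int n) {
        H1[] a = new H1[n];
        for (int i = 0; i < n; i++) {
            a[i] = new H1();
            a[i].g = new H2();
        }
        return a;
    }

    /** Checks: whenever two locks (elements) differ, the guarded g objects differ too. */
    static boolean distinctLocksImplyDistinctData(H1[] a) {
        for (H1 x1 : a) {
            for (H1 x2 : a) {
                if (x1 != x2 && x1.g == x2.g) return false;
            }
        }
        return true;
    }

    /** Two threads each add `times` to x.g.f while holding x; returns x1.g.f afterwards. */
    static int incrementInTwoThreads(H1 x1, H1 x2, int times) {
        Thread t1 = new Thread(() -> bump(x1, times));
        Thread t2 = new Thread(() -> bump(x2, times));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return x1.g.f;
    }

    private static void bump(H1 x, int times) {
        for (int i = 0; i < times; i++) {
            synchronized (x) { // assumed lock choice: the element that owns g
                x.g.f = x.g.f + 1;
            }
        }
    }

    public static void main(String[] args) {
        H1[] a = build(4);
        System.out.println(distinctLocksImplyDistinctData(a));
        System.out.println(incrementInTwoThreads(a[0], a[0], 10000)); // same lock, same data
        System.out.println(incrementInTwoThreads(a[1], a[2], 10000)); // different locks, different data
    }
}
```

With this lock choice, two threads either hold the same lock (and are serialized) or hold different locks and therefore touch different `g` objects, so no update is lost in either case.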
30 Example: Counterexample public A() { f = 0; } public int get() { 4:return rd(); } public sync int inc() { int t = rd() + (new A()).wr(1); 7:return wr(t); } private int rd() { 10: return f; } private int wr(int x) { 12: f = x; return x; } static public void main() { A a; a = new A(); 4:a.get(); 5:a.inc(); } Reported race: field reference A.f (A.java:10) [Rd] A.get(A.java:4) Harness.main(Harness.java:4); field reference A.f (A.java:12) [Wr] A.inc(A.java:7) Harness.main(Harness.java:5)
31 Benchmarks (name: description; the classes and KLOC columns are not recoverable): vect1.1: JDK 1.1 java.util.Vector; htbl1.1: JDK 1.1 java.util.Hashtable; htbl1.4: JDK 1.4 java.util.Hashtable; vect1.4: JDK 1.4 java.util.Vector; tsp: Traveling Salesman Problem; hedc: Web crawler; ftp: Apache FTP server; pool: Apache object pooling library; jdbm: Transaction manager; jdbf: O/R mapping system; jtds: JDBC driver; derby: Apache RDBMS
32 Pairs Retained After Each Stage
33 Conclusions A scalable and precise approach to static race detection –Largest program analyzed: ~ 650 KLOC (derby) Handles common synchronization idioms, analyzes open programs, and generates counterexamples An example where precise alias analysis is key –Not just any alias analysis (k-object sensitivity)
34 Implementation
35 References. R. Agarwal, A. Sasturkar, L. Wang, and S. Stoller. Optimized run-time race detection and atomicity checking using partially discovered types. In Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering (ASE’05), pages 233–242, 2005.