Dataflow Analysis for Datarace-Free Programs (ESOP 2011). Arnab De. Joint work with Deepak D'Souza and Rupesh Nasre. Indian Institute of Science, Bangalore.



Why Datarace-Free Programs?
Java, C++, … programs:
–DRF programs: sequentially consistent semantics.
–Racy programs: very weak guarantees.
Dataraces are often indicators of bugs.

SC for DRF
Verifier: is the program DRF?
–No: a bug, or memory-model-specific reasoning is required.
–Yes: the compiler performs optimizations assuming DRF and produces optimized code.
Analysis for DRF programs!

Datarace-Free Programs
–In an execution, a release action synchronizes-with (sw) all acquire actions on the same variable that come after it.
–In an execution, the happens-before (hb) relation is the reflexive, transitive closure of synchronizes-with and program-order (po).
–A program is datarace-free if, in all SC executions, all conflicting accesses are ordered by happens-before.
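The three definitions above can be checked mechanically on a single SC execution. The following is a minimal sketch, not the authors' tooling: the event encoding `(thread, kind, variable)` and the names `build_edges`, `happens_before`, `races`, `locked`, and `unlocked` are assumptions made for illustration. It checks one given execution only, whereas datarace-freedom quantifies over all SC executions.

```python
def build_edges(trace):
    """Derive po and sw edges from one SC execution (a list of events)."""
    po, sw, last = [], [], {}
    for j, (thread, kind, var) in enumerate(trace):
        if thread in last:                      # program order within a thread
            po.append((last[thread], j))
        last[thread] = j
        if kind == "acq":                       # every earlier release on var
            sw.extend((i, j) for i in range(j)
                      if trace[i][1] == "rel" and trace[i][2] == var)
    return po, sw

def happens_before(n, po, sw):
    """Reflexive, transitive closure of po and sw over events 0..n-1."""
    reach = [[i == j for j in range(n)] for i in range(n)]
    for a, b in list(po) + list(sw):
        reach[a][b] = True
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach

def races(trace):
    """Pairs of conflicting accesses in the trace that hb does not order."""
    po, sw = build_edges(trace)
    hb = happens_before(len(trace), po, sw)
    found = []
    for i, (ti, ki, vi) in enumerate(trace):
        for j in range(i + 1, len(trace)):
            tj, kj, vj = trace[j]
            accesses = ki in ("read", "write") and kj in ("read", "write")
            conflict = accesses and ti != tj and vi == vj and "write" in (ki, kj)
            if conflict and not hb[i][j]:
                found.append((i, j))
    return found

# Thread 1 runs t1++; lock l; x = 1; unlock l; then thread 2 does the same with x = 2.
locked = [(1, "write", "t1"), (1, "acq", "l"), (1, "write", "x"), (1, "rel", "l"),
          (2, "write", "t2"), (2, "acq", "l"), (2, "write", "x"), (2, "rel", "l")]
# Same, but thread 2 writes x without taking the lock.
unlocked = locked[:5] + [(2, "write", "x")]

print(races(locked))    # no unordered conflicting pair
print(races(unlocked))  # the two writes to x are unordered
```

In the locked trace the unlock in thread 1 synchronizes-with the lock in thread 2, so the two writes to `x` are ordered by hb; without the lock there is no sw edge and the pair is reported.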

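Datarace-Free Programs
Thread 1: t1++; lock l; x = 1; unlock l;
Thread 2: t2++; lock l; x = 2; unlock l;
[Figure: the two threads drawn twice, with po edges between consecutive instructions of a thread and sw edges between an unlock and a lock in different threads.]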

buf *p;
lock l;
p = new (...);
p->data = new (...);
*p->data = VAL;
spawn (prod);
spawn (cons);
prod () {
  while (1) {
    lock (l);
    oldv = *p->data;
    free (p->data);
    newv = nextv (oldv);
    p->data = new (...);
    *p->data = newv;
    unlock (l);
  }
}
cons () {
  while (1) {
    lock (l);
    v = *p->data;
    unlock (l);
  }
}

Dataflow Analysis for Concurrent Programs
Kill dataflow facts conservatively.
–More precise.
Track interleavings precisely.
–More efficient.
Handle simple program constructs.
–Handle modern language constructs.
Handle simple analyses.
–Handle more complex analyses.

buf *p;
lock l;
p = new (...);
p->data = new (...);
*p->data = VAL;
spawn (prod);
spawn (cons);
prod () {
  while (1) {
    lock (l);
    oldv = *p->data;
    free (p->data);
    newv = nextv (oldv);
    p->data = new (...);
    *p->data = newv;
    unlock (l);
  }
}
cons () {
  while (1) {
    lock (l);
    v = *p->data;
    unlock (l);
  }
}
[Slide annotations (dataflow facts at program points; their positions were lost in extraction): {p}, {p, p->data}, {p}, {p, p->data}, {p, p->data}, {p, p->data}]


buf *p;
lock l;
p = new (...);
p->data = new (...);
*p->data = VAL;
spawn (prod);
spawn (cons);
prod () {
  while (1) {
    lock (l);
    oldv = *p->data;
    free (p->data);
    unlock (l);
    newv = nextv (oldv);
    lock (l);
    p->data = new (...);
    *p->data = newv;
    unlock (l);
  }
}
cons () {
  while (1) {
    lock (l);
    v = *p->data;
    unlock (l);
  }
}
[Slide annotations (dataflow facts at program points; their positions were lost in extraction): {p}, {p, p->data}, {p}, {p, p->data}, {p, p->data}, {p, p->data}, {p}]

Our Algorithm for Lifting Sequential Analyses for Concurrent Programs
Build the sync-CFG: add may-synchronize-with edges from release instructions to the corresponding acquire instructions, if they can run in parallel.
–From a fork to the first instruction of the child thread.
–From an unlock to lock instructions on the same lock variable.
–From the last instruction of a child thread to the join instruction waiting for it.
–…
–The edges may need to be over-approximated.
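The edge-construction rules above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation from the paper: each thread is a flat list of `(op, arg)` instructions, nodes are `(thread, index)` pairs, and the hypothetical `parallel` parameter stands in for a may-happen-in-parallel analysis, defaulting to the over-approximation that any two distinct threads may run in parallel.

```python
def sync_edges(threads, parallel=lambda t, u: t != u):
    """Return the may-synchronize-with edges of a program.

    threads maps a thread name to its instruction list; each instruction is
    an (op, arg) pair such as ("fork", "prod") or ("unlock", "l").
    """
    edges = set()
    for t, body in threads.items():
        for i, (op, arg) in enumerate(body):
            if op == "fork":
                # fork -> first instruction of the child thread
                edges.add(((t, i), (arg, 0)))
            elif op == "join":
                # last instruction of the child thread -> join waiting for it
                edges.add(((arg, len(threads[arg]) - 1), (t, i)))
            elif op == "unlock":
                # unlock -> every lock on the same lock variable in a thread
                # that may run in parallel with this one
                for u, other in threads.items():
                    if not parallel(t, u):
                        continue
                    for j, (op2, arg2) in enumerate(other):
                        if op2 == "lock" and arg2 == arg:
                            edges.add(((t, i), (u, j)))
    return edges

threads = {
    "main": [("fork", "prod"), ("fork", "cons")],
    "prod": [("lock", "l"), ("stmt", "update"), ("unlock", "l")],
    "cons": [("lock", "l"), ("stmt", "read"), ("unlock", "l")],
}
edges = sync_edges(threads)
for edge in sorted(edges):
    print(edge)
```

These edges are added on top of the ordinary control-flow edges of each thread, which this sketch does not build.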

Our Algorithm for Lifting Sequential Analyses for Concurrent Programs (2)
Run the sequential analysis on the sync-CFG:
–Treat the flow function of each synchronization instruction as the identity (id).
–Construct the flow equations on the sync-CFG.
–Compute the least fixed point (lfp) of the flow equations.
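A minimal sketch of these three steps, applied to a simplified rendering of the producer/consumer program from the earlier slides. Everything here is an assumption made for illustration: the fact at a node is the set of lvalues that may be invalid (uninitialized or freed), lvalues are plain strings with no aliasing, and the names `lfp`, `transfer`, `unsafe_derefs`, `PROD_SAFE`, and `PROD_SPLIT` are hypothetical.

```python
def lfp(nodes, edges, transfer, entry, entry_fact):
    """Worklist solver: least fixed point of the flow equations (join = union)."""
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for a, b in edges:
        succs[a].append(b)
        preds[b].append(a)
    fact_in = {n: frozenset() for n in nodes}
    fact_out = {n: frozenset() for n in nodes}
    work = list(nodes)
    while work:
        n = work.pop()
        joined = frozenset(entry_fact) if n == entry else frozenset()
        for p in preds[n]:
            joined |= fact_out[p]
        fact_in[n] = joined
        out = transfer(nodes[n], joined)
        if out != fact_out[n]:
            fact_out[n] = out
            work.extend(succs[n])
    return fact_in

def transfer(instr, fact):
    op, arg = instr
    if op == "alloc":            # arg = new (...): arg is valid afterwards
        return fact - {arg}
    if op == "free":             # free (arg): arg may dangle afterwards
        return fact | {arg}
    return fact                  # lock, unlock, spawn, deref: identity

MAIN = [("alloc", "p"), ("alloc", "p->data"), ("deref", "p->data"),
        ("spawn", "prod"), ("spawn", "cons")]
CONS = [("lock", "l"), ("deref", "p->data"), ("unlock", "l")]
PROD_SAFE = [("lock", "l"), ("deref", "p->data"), ("free", "p->data"),
             ("alloc", "p->data"), ("deref", "p->data"), ("unlock", "l")]
PROD_SPLIT = [("lock", "l"), ("deref", "p->data"), ("free", "p->data"),
              ("unlock", "l"), ("lock", "l"), ("alloc", "p->data"),
              ("deref", "p->data"), ("unlock", "l")]

def unsafe_derefs(prod):
    threads = {"main": MAIN, "prod": prod, "cons": CONS}
    nodes = {(t, i): ins for t, body in threads.items()
             for i, ins in enumerate(body)}
    edges = []
    for t, body in threads.items():
        edges += [((t, i), (t, i + 1)) for i in range(len(body) - 1)]
        if t != "main":
            edges.append(((t, len(body) - 1), (t, 0)))   # while (1) back edge
    for (t, i), (op, arg) in nodes.items():
        if op == "spawn":                                # fork -> child entry
            edges.append(((t, i), (arg, 0)))
        if op == "unlock":                               # unlock -> other locks
            edges += [((t, i), m) for m, (op2, a2) in nodes.items()
                      if op2 == "lock" and a2 == arg and m[0] != t]
    fact_in = lfp(nodes, edges, transfer, ("main", 0), {"p", "p->data"})
    return sorted(n for n, (op, arg) in nodes.items()
                  if op == "deref" and arg in fact_in[n])

print(unsafe_derefs(PROD_SAFE))   # no dereference is flagged
print(unsafe_derefs(PROD_SPLIT))  # derefs reachable from the early unlock are flagged
```

With the single critical section, the only fact that leaves the producer's unlock is the empty set, so no dereference is flagged. When the critical section is split after the `free`, the fact "p->data may be invalid" flows along the sync edge into the consumer. The flag on the producer's first dereference comes from a sync-CFG path (consumer unlock to the producer's first lock) that does not correspond to a real execution, so it is an instance of imprecision rather than a real bug.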

Restrictions on Analysis
Value Set analysis:
–Collects the set of values for each lvalue at each program point, but loses the correlation between lvalues.
–l := e : evaluate e on the input value sets and update the value set of l.
–if(e) : propagate the values that can make e true to the true branch, and similarly for the false branch.
–The join operation is point-wise union.
–Treats aliases conservatively.
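The flow functions described above can be sketched for integer-valued variables as follows. This is a simplified illustration, not the paper's formalization: states are dictionaries from lvalue names to frozensets, expressions are passed as Python functions over named variables, branches test a single variable, and aliasing is ignored. The names `assign`, `branch`, and `join` are hypothetical.

```python
from itertools import product

def assign(state, lhs, variables, fn):
    """l := e. Evaluate e on every combination of input values, update l."""
    combos = product(*(state[v] for v in variables))
    return {**state, lhs: frozenset(fn(*combo) for combo in combos)}

def branch(state, var, pred):
    """if (e), with e a test on one variable. Returns (true state, false state)."""
    taken = {**state, var: frozenset(v for v in state[var] if pred(v))}
    not_taken = {**state, var: frozenset(v for v in state[var] if not pred(v))}
    return taken, not_taken

def join(a, b):
    """Point-wise union of two states."""
    return {k: a.get(k, frozenset()) | b.get(k, frozenset())
            for k in a.keys() | b.keys()}

s0 = {"x": frozenset({1, 2})}
s1 = assign(s0, "y", ("x",), lambda x: x + 1)          # y := x + 1
s2 = assign(s1, "z", ("y", "x"), lambda y, x: y - x)   # z := y - x
taken, not_taken = branch(s2, "x", lambda x: x > 1)    # if (x > 1)

print(sorted(s1["y"]))                        # values of y
print(sorted(s2["z"]))                        # values of z
print(sorted(join(taken, not_taken)["x"]))    # values of x after the join
```

In every concrete run `z` equals 1, but the analysis reports {0, 1, 2} because it combines the value sets of `y` and `x` independently. This is the loss of correlation mentioned in the first bullet.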

Restrictions on Analysis (2)
Abstractions of value set analysis:
–A is an abstraction of VS if there are α and γ such that α(lfp of VS) ⊑ lfp of A and lfp of VS ⊑ γ(lfp of A).
–Examples: null-pointer analysis, interval analysis, constant propagation, may-pointer analysis, …
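As a concrete instance of such an abstraction, constant propagation can be related to value sets for a single integer lvalue as sketched below. The encoding is an assumption made for illustration: the sentinels `BOTTOM` and `TOP`, and the helper names `alpha`, `in_gamma`, and `leq` are hypothetical, and γ is represented by a membership test because γ(TOP) is the set of all values.

```python
BOTTOM, TOP = "bottom", "top"   # sentinels; concrete values are integers

def alpha(values):
    """Abstract a value set to a constant-propagation element."""
    if not values:
        return BOTTOM
    if len(values) == 1:
        return next(iter(values))
    return TOP

def in_gamma(value, abstract):
    """Membership test for the concretization gamma(abstract)."""
    if abstract == BOTTOM:
        return False
    if abstract == TOP:
        return True
    return value == abstract

def leq(a, b):
    """Ordering of the flat constant-propagation lattice."""
    return a == BOTTOM or b == TOP or a == b

value_set = frozenset({1, 2})            # a value-set result for some lvalue

print(alpha(frozenset({5})))             # a singleton abstracts to its constant
print(leq(alpha(value_set), TOP))        # reporting TOP covers {1, 2}
print(leq(alpha(value_set), 1))          # reporting the constant 1 would not
```

The two checks correspond to the condition on the slide: an abstract result is acceptable for this lvalue when the abstraction of the value set is below it, or equivalently when every concrete value lies in its concretization.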

Interpreting the Result
We assume that the value set of an lvalue (or its abstraction) is relevant only at those program points where that lvalue is read.
–The result of null-pointer analysis (NPA) is important only where the pointer is dereferenced.
–The result of constant propagation (CP) is important only where that variable is read.
Our result is sound only for the lvalues that are relevant at a given program point.

Why does it work?
For Value Set analysis:
–The LFP of the sequential analysis over-approximates the join-over-all-paths solution on the sync-CFG.
–It is enough to show that if an execution produces a value v for an lvalue l relevant at a program point E, then there is a path in the sync-CFG that includes v in VS(l) at E.

Path in Sync-CFG
W: x = y
R: … = x
–Induction over execution length.
–W and R are conflicting accesses, so in a DRF program they are related by hb.
–hb = (po ∪ sw)*
–Flow functions of po edges over-approximate execution behavior.
–Flow functions of sw edges are the identity.

Context-Sensitive Analysis
Analysis domain:
–call string -> abstract state
At a call site c:
–[s -> a] -> [sc -> a]
On return to call site c:
–[sc -> a] -> [s -> a]
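The two mappings above can be sketched with call strings represented as tuples of call-site labels. This is an illustrative sketch assuming unbounded call strings; the function names `on_call` and `on_return` and the site labels are hypothetical.

```python
def on_call(states, site):
    """At call site c: [s -> a] becomes [sc -> a]."""
    return {s + (site,): a for s, a in states.items()}

def on_return(states, site):
    """On return to call site c: [sc -> a] becomes [s -> a].

    Entries whose call string does not end in this site belong to other
    callers and are not propagated to this return point.
    """
    return {s[:-1]: a for s, a in states.items() if s and s[-1] == site}

at_call = {(): "a"}                          # abstract state under the empty call string
in_callee = on_call(at_call, "c1")
at_exit = {("c1",): "a", ("c3",): "b"}       # callee exit, reached from two call sites
back_in_caller = on_return(at_exit, "c1")

print(in_callee)        # state keyed by the extended call string
print(back_in_caller)   # only the state that entered through c1
```

Keying states by call string is what keeps the state that entered through `c3` from leaking back to the caller at `c1`.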

Context-Sensitive Analysis for Concurrent Programs
–Use a summary component at each may-synchronize-with edge.
–Join all the states at the release and put the result in the summary.
–Join the summary with all (non-bottom) states at the acquire.

Results
[Figure: chart with series labeled "seq analysis", "actually safe", "our analysis", and "all derefs".]

Comparison with RADAR

Sources of Imprecision
–Alias analysis, may-happen-in-parallel analysis, …
–Representation of multiple dynamic threads by a single static thread.
–Paths in the sync-CFG that do not correspond to any real execution.

foo() { lock l; x++; unlock l; }
main() { fork(foo); … fork(foo); }
bar() { lock l; x++; unlock l; }
baz() { lock l; x++; unlock l; }

Conclusion
–A dataflow analysis technique for DRF programs.
–Defined the conditions for soundness.
–Demonstrated scalability and precision.