Michael Ernst, page 1 Static and dynamic analysis: synergy and duality Michael Ernst CSE 503, Winter 2010 Lecture 1.

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

Runtime Techniques for Efficient and Reliable Program Execution Harry Xu CS 295 Winter 2012.
Partial Order Reduction: Main Idea
Greta YorshEran YahavMartin Vechev IBM Research. { ……………… …… …………………. ……………………. ………………………… } P1() Challenge: Correct and Efficient Synchronization { ……………………………
Greta YorshEran YahavMartin Vechev IBM Research. { ……………… …… …………………. ……………………. ………………………… } T1() Challenge: Correct and Efficient Synchronization { ……………………………
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Chair of Software Engineering From Program slicing to Abstract Interpretation Dr. Manuel Oriol.
INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.
ECE 720T5 Fall 2012 Cyber-Physical Systems Rodolfo Pellizzoni.
Program Slicing Mark Weiser and Precise Dynamic Slicing Algorithms Xiangyu Zhang, Rajiv Gupta & Youtao Zhang Presented by Harini Ramaprasad.
Hastings Purify: Fast Detection of Memory Leaks and Access Errors.
Bug Isolation via Remote Program Sampling Ben Liblit, Alex Aiken, Alice X.Zheng, Michael I.Jordan Presented by: Xia Cheng.
© Janice Regan Problem-Solving Process 1. State the Problem (Problem Specification) 2. Analyze the problem: outline solution requirements and design.
Precise Inter-procedural Analysis Sumit Gulwani George C. Necula using Random Interpretation presented by Kian Win Ong UC Berkeley.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Building Reliable Software Requirements and Methods.
Program analysis Mooly Sagiv html://
CSE 830: Design and Theory of Algorithms
Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael Ernst, Jake Cockrell, William Griswold, David Notkin Presented by.
Michael Ernst, page 1 Improving Test Suites via Operational Abstraction Michael Ernst MIT Lab for Computer Science Joint.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Overview of program analysis Mooly Sagiv html://
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Data Flow Analysis Compiler Design Nov. 8, 2005.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Composing Dataflow Analyses and Transformations Sorin Lerner (University of Washington) David Grove (IBM T.J. Watson) Craig Chambers (University of Washington)
Winter 2012SEG Chapter 11 Chapter 1 (Part 2) Introduction to Requirements Modeling.
DATA STRUCTURE Subject Code -14B11CI211.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
Automated Diagnosis of Software Configuration Errors
CS527: (Advanced) Topics in Software Engineering Overview of Software Quality Assurance Tao Xie ©D. Marinov, T. Xie.
Computer Programming and Basic Software Engineering 4. Basic Software Engineering 1 Writing a Good Program 4. Basic Software Engineering.
An Introduction to MBT  what, why and when 张 坚
P ARALLEL P ROCESSING I NSTITUTE · F UDAN U NIVERSITY 1.
Benjamin Gamble. What is Time?  Can mean many different things to a computer Dynamic Equation Variable System State 2.
Introduction Algorithms and Conventions The design and analysis of algorithms is the core subject matter of Computer Science. Given a problem, we want.
Inferring Specifications to Detect Errors in Code Mana Taghdiri Presented by: Robert Seater MIT Computer Science & AI Lab.
Major objective of this course is: Design and analysis of modern algorithms Different variants Accuracy Efficiency Comparing efficiencies Motivation thinking.
Problem Solving Techniques. Compiler n Is a computer program whose purpose is to take a description of a desired program coded in a programming language.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
1 Test Selection for Result Inspection via Mining Predicate Rules Wujie Zheng
Page 1 5/2/2007  Kestrel Technology LLC A Tutorial on Abstract Interpretation as the Theoretical Foundation of CodeHawk  Arnaud Venet Kestrel Technology.
Convergence of Model Checking & Program Analysis Philippe Giabbanelli CMPT 894 – Spring 2008.
90-723: Data Structures and Algorithms for Information Processing Copyright © 1999, Carnegie Mellon. All Rights Reserved. 1 Lecture 1: Introduction Data.
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
What’s Ahead for Embedded Software? (Wed) Gilsoo Kim
D A C U C P Speculative Alias Analysis for Executable Code Manel Fernández and Roger Espasa Computer Architecture Department Universitat Politècnica de.
Model Checking Lecture 1. Model checking, narrowly interpreted: Decision procedures for checking if a given Kripke structure is a model for a given formula.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Lecture #1: Introduction to Algorithms and Problem Solving Dr. Hmood Al-Dossari King Saud University Department of Computer Science 6 February 2012.
BITS Pilani Pilani Campus Data Structure and Algorithms Design Dr. Maheswari Karthikeyan Lecture1.
Abstraction and Abstract Interpretation. Abstraction (a simplified view) Abstraction is an effective tool in verification Given a transition system, we.
Jeremy Nimmer, page 1 Automatic Generation of Program Specifications Jeremy Nimmer MIT Lab for Computer Science Joint work with.
Optimistic Hybrid Analysis
State your reasons or how to keep proofs while optimizing code
The Future of Software Engineering: Tools
Iterative Program Analysis Abstract Interpretation
Objective of This Course
Static and dynamic analysis: synergy and duality
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, Thanassis.
The Challenge of Cross - Language Interoperability
Advanced Compiler Design
String Analysis for JavaScript Programs Using JSAI
Pointer analysis John Rollinson & Kaiyuan Li
Presentation transcript:

Michael Ernst, page 1 Static and dynamic analysis: synergy and duality Michael Ernst CSE 503, Winter 2010 Lecture 1

Michael Ernst, page 2 Goals Theme: Static and dynamic analyses are more similar than many people believe One person’s view of their relationship Goals: Encourage blending of the techniques and communities Start productive discussions

Michael Ernst, page 3 Outline Review of static and dynamic analysis Synergy: combining static and dynamic analysis Aggregation Analogies Hybrids Duality: subsets of behavior Conclusion

Michael Ernst, page 4 Static analysis Examples: compiler optimizations, program verifiers Examine program text (no execution) Build a model of program state An abstraction of the run-time state Reason over possible behaviors E.g., “run” the program over the abstract state

Michael Ernst, page 5 Abstract interpretation Typically implemented via dataflow analysis Each program statement’s transfer function indicates how it transforms state Example: What is the transfer function for y = x++; ?

Michael Ernst, page 6 Selecting an abstract domain  x = { 3, 5, 7 }; y = { 9, 11, 13 }  y = x++;  x = { 4, 6, 8 }; y = { 3, 5, 7 }   x is prime; y is prime  y = x++;  x is anything; y is prime   x is odd; y is odd  y = x++;  x is even; y is odd   x n = f(a n-1,…,z n-1 ); y n = f(a n-1,…,z n-1 )  y = x++;  x n+1 = x n +1; y n+1 = x n   x=3, y=11 ,  x=5, y=9 ,  x=7, y=13  y = x++;  x=4, y=3 ,  x=6, y=5 ,  x=8, y=7 

Michael Ernst, page 7 Research challenge: Choose good abstractions The abstraction determines the expense (in time and space) The abstraction determines the accuracy (what information is lost) Less accurate results are poor for applications that require precision Cannot conclude all true properties in the grammar

Michael Ernst, page 8 Static analysis recap Slow to analyze large models of state, so use abstraction Conservative: account for abstracted-away state Sound: (weak) properties are guaranteed to be true *Some static analyses are not sound

Michael Ernst, page 9 Dynamic analysis Examples: profiling, testing Execute program (over some inputs) The compiler provides the semantics Observe executions Requires instrumentation infrastructure Must choose what to measure, and what test runs

Michael Ernst, page 10 Research challenge: What to measure? Coverage or frequency Statements, branches, paths, procedure calls, types, method dispatch Values computed Parameters, array indices Run time, memory usage Test oracle results Similarities among runs [Podgurski 99, Reps 97] Like abstraction, determines what is reported

Michael Ernst, page 11 Research challenge: Choose good tests The test suite determines the expense (in time and space) The test suite determines the accuracy (what executions are never seen) Less accurate results are poor for applications that require correctness Many domains do not require correctness! *What information is being collected also matters

Michael Ernst, page 12 Dynamic analysis recap Can be as fast as execution (over a test suite, and allowing for data collection) Example: aliasing Precise: no abstraction or approximation Unsound: results may not generalize to future executions Describes execution environment or test suite

Michael Ernst, page 13 Static analysis Abstract domain slow if precise Conservative due to abstraction Sound due to conservatism Concrete execution slow if exhaustive Precise no approximation Unsound does not generalize Dynamic analysis

Michael Ernst, page 14 Outline Review of static and dynamic analysis Synergy: combining static and dynamic analysis Aggregation Analogies Hybrids Duality: subsets of behavior Conclusion 

Michael Ernst, page 15 Combining static and dynamic analysis 1.Aggregation: Pre- or post-processing 2.Inspiring analogous analyses: Same problem, different domain 3.Hybrid analyses: Blend both approaches

Michael Ernst, page Aggregation: Pre- or post-processing Use output of one analysis as input to another Dynamic then static Profile-directed compilation: unroll loops, inline, reorder dispatch, … Verify properties observed at run time Static then dynamic Reduce instrumentation requirements Efficient branch/path profiling Discharge obligations statically (type/array checks) Type checking (e.g., Java, including generics) Indicate suspicious code to test more thoroughly

Michael Ernst, page Analogous analyses: Same problem, different domain Any analysis problem can be solved in either domain Type safety: no memory corruption or operations on wrong types of values Static type-checking Dynamic type-checking Slicing: what computations could affect a value Static: reachability over dependence graph Dynamic: tracing

Michael Ernst, page 18 Memory checking Goal: find array bound violations, uses of uninit. memory Purify [Hastings 92] : run-time instrumentation Tagged memory: 2 bits (allocated, initialized) per byte Each instruction checks/updates the tags Allocate: set “A” bit, clear “I” bit Write: require “A” bit, set “I” bit Read: require “I” bit Deallocate: clear “A” bit LCLint [Evans 96] : compile-time dataflow analysis Abstract state contains allocated and initialized bits Each transfer function checks/updates the state Identical analyses! Another example: atomicity checking [Flanagan 2003]

Michael Ernst, page 19 Specifications Specification checking Statically: theorem-proving Dynamically: assert statement Specification generation Statically: by hand or abstract interpretation [Cousot 77] Dynamically: by invariant detection [Ernst 99], reporting unfalsified properties

Michael Ernst, page 20 Your analogous analyses here Look for gaps with no analogous analyses! Try using the same analysis But be open to completely different approaches There is still low-hanging fruit to be harvested

Michael Ernst, page Hybrid analyses: Blending static and dynamic Combine static and dynamic analyses Not mere aggregation, but a new analysis Disciplined trade-off between precision and soundness: find the sweet spot between them

Michael Ernst, page 22 Possible starting points Analyses that trade off run-time and precision Different abstractions (at different program points) Switch between static and dynamic at analysis time Ignore some available information Examine only some paths [Evans 94, Detlefs 98, Bush 00] Merge based on observation that both examine only a subset of executions (next section of talk) Problem: optimistic vs. pessimistic treatment More examples: (bounded) model checking, security analyses, delta debugging [Zeller 99], etc.

Michael Ernst, page 23 Outline Review of static and dynamic analysis Synergy: combining static and dynamic analysis Aggregation Analogies Hybrids Duality: subsets of behavior Conclusion 

Michael Ernst, page 24 Static analysis Abstract domain slow if precise Conservative due to abstraction Sound due to conservatism Concrete execution slow if exhaustive Precise no approximation Unsound does not generalize Dynamic analysis

Michael Ernst, page 25 Sound dynamic analysis Observe every possible execution! Problem: infinite number of executions Solution: test case selection and generation Efficiency tweaks to an algorithm that works perfectly in theory but exhausts resources in practice

Michael Ernst, page 26 Precise static analysis Reason over full program state! Problem: infinite number of executions Solution: data or execution abstraction Efficiency tweaks to an algorithm that works perfectly in theory [Cousot 77] but exhausts resources in practice

Michael Ernst, page 27 Dynamic analysis focuses on a subset of executions The executions in the test suite Easy to enumerate Characterizes program use Typically optimistic for other executions

Michael Ernst, page 28 Static analysis focuses on a subset of data structures More precise for data or control described by the abstraction Concise logical description Typically conservative elsewhere (safety net) Example: k-limiting [Jones 81] Represents each object reachable by ≤k pointers Groups together (approximates) more distant objects

Michael Ernst, page 29 Dual views of subsets Execution and data subsets are views on the same space Every execution subset corresponds to a data subset Executions induce data structures and control flow Every data subset corresponds to an execution subset A set of objects represents the executions that generate them Subset description may be concise in one domain but complex in the other What if the test suite was generated from a specification? Any analysis may be conservative over other behaviors

Michael Ernst, page 30 Differences between the approaches Static and dynamic analysis communities work with different subsets Each subset and characterization is better for certain uses What subsets have a concise description in both domains? Augment a test suite to fill out the data structures that it creates, making the data structure description a smaller logical formula

Michael Ernst, page 31 A hybrid view of subsets Bring together static and dynamic analysis by unifying their subset descriptions Find subsets with small descriptions with respect to both data structures and executions Find a new, smaller description Advantages of this approach Directly compare previous disparate analyses Directly apply analyses to other domain Switch between the approaches Obtain insight in order to devise and optimize analyses

Michael Ernst, page 32 Outline Review of static and dynamic analysis Synergy: combining static and dynamic analysis Aggregation Analogies Hybrids Duality: subsets of behavior Conclusion 

Michael Ernst, page 33 Potential pitfalls Analogies between analyses What applications tolerate unsoundness/imprecision? Any more low-hanging fruit? Most static and dynamic approaches differ Hybrid analyses How to measure and trade off precision and soundness What is “partial soundness”? What is in between? Not all static analyses are abstract interpretation Optimistic vs. pessimistic treatment of unseen executions Subset characterization Find the unified characterization of behavior

Michael Ernst, page 34 Conclusion Static and dynamic analysis share many similarities Communities should be closer Create analogous analyses Many successes so far Hybrid approach holds great promise Analyses increasingly look like points in this continuum Unified theory of subsets of executions/data is key (Our) future work: explore this space

Michael Ernst, page 35 Discussion