Crystal-izing Sophisticated Code Analyses Ciera Jaspan Kevin Bierhoff Jonathan Aldrich

Slides:



Advertisements
Similar presentations
OO Programming in Java Objectives for today: Overriding the toString() method Polymorphism & Dynamic Binding Interfaces Packages and Class Path.
Advertisements

Control Structures Ranga Rodrigo. Control Structures in Brief C++ or JavaEiffel if-elseif-elseif-else-end caseinspect for, while, do-whilefrom-until-loop-end.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Chapter 1 Object-Oriented Concepts. A class consists of variables called fields together with functions called methods that act on those fields.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Chapter 10- Instruction set architectures
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
CS162 Week 2 Kyle Dewey. Overview Continuation of Scala Assignment 1 wrap-up Assignment 2a.
1 Mooly Sagiv and Greta Yorsh School of Computer Science Tel-Aviv University Modern Compiler Design.
Written by: Dr. JJ Shepherd
Program Representations. Representing programs Goals.
FIT FIT1002 Computer Programming Unit 19 Testing and Debugging.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Administrative info Subscribe to the class mailing list –instructions are on the class web page, which is accessible from my home page, which is accessible.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
Another example p := &x; *p := 5 y := x + 1;. Another example p := &x; *p := 5 y := x + 1; x := 5; *p := 3 y := x + 1; ???
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
From last lecture We want to find a fixed point of F, that is to say a map m such that m = F(m) Define ?, which is ? lifted to be a map: ? = e. ? Compute.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
1 Flash Programming Introduction Script Assist. 2 Course Description This course concentrates on the teaching of Actionscript, the programming language.
CS0007: Introduction to Computer Programming Introduction to Arrays.
Precision Going back to constant prop, in what cases would we lose precision?
Language Evaluation Criteria
Crystal-izing Sophisticated Code Analyses Ciera Jaspan Kevin Bierhoff Jonathan Aldrich
Software Systems Verification and Validation Laboratory Assignment 3
REFACTORING Lecture 4. Definition Refactoring is a process of changing the internal structure of the program, not affecting its external behavior and.
CS4311 Spring 2011 Unit Testing Dr. Guoqiang Hu Department of Computer Science UTEP.
CMSC 202 Exceptions. Aug 7, Error Handling In the ideal world, all errors would occur when your code is compiled. That won’t happen. Errors which.
Polymorphism, Inheritance Pt. 1 COMP 401, Fall 2014 Lecture 7 9/9/2014.
Netprog: Java Intro1 Crash Course in Java. Netprog: Java Intro2 Why Java? Network Programming in Java is very different than in C/C++ –much more language.
CS 206 Introduction to Computer Science II 09 / 10 / 2009 Instructor: Michael Eckmann.
Interpretation Environments and Evaluation. CS 354 Spring Translation Stages Lexical analysis (scanning) Parsing –Recognizing –Building parse tree.
P.R. James © P.Chalin et al.1 An Integrated Verification Environment for JML: Architecture and Early Results Patrice Chalin, Perry R. James, and George.
CSE 425: Data Types I Data and Data Types Data may be more abstract than their representation –E.g., integer (unbounded) vs. 64-bit int (bounded) A language.
White-box Testing.
COP-3330: Object Oriented Programming Flow Control May 16, 2012 Eng. Hector M Lugo-Cordero, MS.
Java development environment and Review of Java. Eclipse TM Intergrated Development Environment (IDE) Running Eclipse: Warning: Never check the “Use this.
Eclipse 24-Apr-17.
Object Oriented Software Development
SilkTest 2008 R2 SP1: Silk4J Introduction. ConfidentialCopyright © 2008 Borland Software Corporation. 2 What is Silk4J? Silk4J enables you to create functional.
Convergence of Model Checking & Program Analysis Philippe Giabbanelli CMPT 894 – Spring 2008.
1 CSCD 326 Data Structures I Software Design. 2 The Software Life Cycle 1. Specification 2. Design 3. Risk Analysis 4. Verification 5. Coding 6. Testing.
An Undergraduate Course on Software Bug Detection Tools and Techniques Eric Larson Seattle University March 3, 2006.
90-723: Data Structures and Algorithms for Information Processing Copyright © 1999, Carnegie Mellon. All Rights Reserved. 1 Lecture 1: Introduction Data.
IBM TSpaces Lab 2 Customizing tuples and fields. Summary Blocking commands Tuple Expiration Extending Tuples (The SubclassableTuple) Reading/writing user.
Interfaces About Interfaces Interfaces and abstract classes provide more structured way to separate interface from implementation
 In the java programming language, a keyword is one of 50 reserved words which have a predefined meaning in the language; because of this,
 Control Flow statements ◦ Selection statements ◦ Iteration statements ◦ Jump statements.
Java Annotations for Types and Expressions Mathias Ricken October 24, 2008 COMP 617 Seminar.
PROGRAMMING TESTING B MODULE 2: SOFTWARE SYSTEMS 22 NOVEMBER 2013.
Quick Review of OOP Constructs Classes:  Data types for structured data and behavior  fields and methods Objects:  Variables whose data type is a class.
/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.
CS 367 Introduction to Data Structures Charles N. Fischer Fall s367.html.
1 Test Coverage Coverage can be based on: –source code –object code –model –control flow graph –(extended) finite state machines –data flow graph –requirements.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
COMPOSITE PATTERN NOTES. The Composite pattern l Intent Compose objects into tree structures to represent whole-part hierarchies. Composite lets clients.
© Dr. A. Williams, Fall Present Software Quality Assurance – Clover Lab 1 Tutorial / lab 2: Code instrumentation Goals of this session: 1.Create.
Notices Assn 2 is due tomorrow, 7pm. Moodle quiz next week – written in the lab as before. Everything up to and including today’s lecture: Big Topics are.
SWE 434 SOFTWARE TESTING AND VALIDATION LAB2 – INTRODUCTION TO JUNIT 1 SWE 434 Lab.
CS 215 Final Review Ismail abumuhfouz Fall 2014.
Software Testing and Maintenance 1
Spark Presentation.
Structural testing, Path Testing
Topic 10: Dataflow Analysis
Another example: constant prop
Data Flow Analysis Compiler Design
Fundaments of Game Design
CMSC 202 Exceptions.
Presentation transcript:

Crystal-izing Sophisticated Code Analyses Ciera Jaspan Kevin Bierhoff Jonathan Aldrich

Crystal A framework for creating static analyses Primarily for educational purposes Easy startup into a static analysis Direct from theory to implementation Incremental development of analyses “Wow factor”: power of Eclipse Also found to be useful for research prototypes 2

This tutorial Install Crystal Register an analysis Create a simple AST walker for nullness Add a simple flow analysis Add annotations Add branch-sensitivity We’ll provide sample analyses and sample assignments for classes For more information, visit our wiki 3

Installation Pre-requisites Eclipse with Java Development Tools (JDT) Plugin Development Environment (PDE) Crystal Available from our Eclipse update site The USB drive contains: A version of Eclipse with Crystal already installed Sample code for several null pointer analyses Sample test code to run the null analysis on 4

Crystal in the classroom Crystal is used in a professional masters program in software engineering Course: Analysis of Software Artifacts Students: Real-world experience, may not have (or want!) a theoretical background How can program analysis be used in industry, now and in the near future? Students should Know what can affect the usability and precision of a static analysis Understand what kind of problems static analysis can solve 5

Crystal in research High transferability from paper to code made Crystal natural choice for research Currently 4 research analyses written in Crystal 3 are published (OOPSLA 08, ECOOP 09, OOPSLA 09) Allows incremental development to more sophisticated features Annotations with custom parsers Branch-sensitivity Automated testing Widening v. regular join 6

An incremental approach Students typically answer questions on paper first We’ll note questions we can ask students in red! Then, students transfer that knowledge directly to code We’ll note how to code the answer in blue! Use an incremental approach Instructions on the wiki with verification points Today, everyone will make a simple flow analysis We’ll move a little faster to create a smart flow analysis You can follow along with the code samples 7

Everything installed right? Crystal menu is where analyses appear Several built-in ones already available 8

Steps for making an analysis Install Crystal Register an analysis Create a simple AST walker for nullness Add a simple flow analysis Add annotations Add branch-sensitivity 9

Register and Run Create a new plugin project Make it depend on Crystal and JDT Implement ICrystalAnalysis Register with the extension-point CrystalAnalysis in the plugin.xml file Select Run -> Run Configuration… Make a new Eclipse configuration Run! Your analysis name should appear in the Crystal menu. 10

Steps for making an analysis Install Crystal Register an analysis Create a simple AST walker for nullness Add a simple flow analysis Add annotations Add branch-sensitivity 11

Use an AST walker We’re ready to make an analysis now. What kinds of expressions do we want to check for an null pointer analysis? Method calls Field access Array access In analyzeMethod(), create an ASTVisitor that gives an error when it encounters these operations. 12

Everything running? Create a new project in the child Eclipse Add code to analyze Test with Crystal->MyAnalysis Can also run automated tests in JUnit Will not cover that today See wiki for more information 13

Steps for making an analysis Install Crystal Register an analysis Create a simple AST walker for nullness Add a simple flow analysis Add annotations Add branch-sensitivity 14

Dataflow concepts: review Lattice The abstract states the program can be in Control flow graph The order of control flow through the nodes of the AST Transfer functions How the states change as the analysis encounters new program instructions Worklist algorithm Traverses the control flow graph and runs the transfer functions 15

Lattice A finite lattice of the abstract states the program can be in Control flow graph The order of control flow through the nodes of the AST Transfer functions How the states change as the analysis encounters new program instructions Worklist algorithm Traverses the control flow graph and runs the transfer functions Dataflow concepts: review 16

Lattice review Top of lattice represents least precise info Bottom of lattice represents an unanalyzed element Must have finite height to ensure termination Unique least upper bound must exist for any two elements  a b  17

Transfer function review Given An instruction An incoming lattice element  Produce An outgoing lattice element  ’  ( instr,  ) =  ’ Make a different transfer function on each type of instruction 18

Map all variables to an element in the lattice above Tuple Lattice: A lattice which maps a key to an element in another lattice A simple null analysis  ( x = null,  ) =  [ x  NULL ]  ( x = y,  ) =  [ x   ( y )]  ( x = new C(),  ) =  [ x  NOT_NULL ]  ( x = y.m(z1,…,zn),  ) =  [ y  NOT_NULL ] (array access, field access, etc.) MAYBE_NULL NOT_NULLNULL  19

The lattice What are the elements in the lattice? Bottom, null, not null, and maybe null Create a type which represents the elements Crystal allows this to be an arbitrary type This is likely an immutable type, like an enum Tuple Lattices We will use TupleLatticeElement Just create the type which represents the sub-lattice public enum NullLatticeElement { BOTTOM, NULL, NOT_NULL, MAYBE_NULL; } 20

The lattice What is the bottom-most element? What is the the top-most? What is the ordering of elements? What does the join operation look like? Extend SimpleLatticeOperations LE bottom() boolean atLeastAsPrecise(LE, LE) LE join(LE, LE) LE copy(LE) 21

Setting up the flow analysis What is the lattice element at the start of the method? Everything may be null (except this is not null) If using a tuple lattice, what is the default? Maybe null Extend AbstractingTransferFunction Implement createEntryValue() and getLatticeOperations() (Don’t override the transfer functions yet) We’ll also now have the visitor call the flow analysis 22

Transfer functions Which instructions cause the lattice element to change? How do they change the lattice element? Null: makes target null Constructor: makes target non-null Copying assignment: makes target same as right side In the derived transfer function, override the relevant instructions 23

You now have a Crystal flow analysis And that’s it! From here, we’re just going to improve it Annotations: teach students power of specifications Branch-sensitivity: teach students power of abstractions closer to code We’ll move a little faster Assistants are ready to help if you wish to follow along 24

Why Three Address Code Does mean students work with both Eclipse AST and TAC However TAC has no sub-expressions TAC has many fewer kinds of nodes Students able to understand TAC as it matched what they wrote down on paper 25

Relevant packages edu.cmu.cs.crystal Core package for analyses org.eclipse.jdt.core.dom The Eclipse AST edu.cmu.cs.crystal.simple Simple interfaces for flow analyses edu.cmu.cs.crystal.tac.model The interfaces for three address code instructions 26

Steps for making an analysis Install Crystal Register an analysis Create a simple AST walker for nullness Add a simple flow analysis Add annotations Add branch-sensitivity 27

Annotations What specifications could we add to make the analysis more precise? Non-null on method parameters Create a Java annotation Put it in a jar separate from your analysis Make it available to the code being ElementType.PARAMETER}) NonNull {} 28

Annotations What transfer functions can use this annotation to improve precision? Initial lattice information Method call instruction (return value) Annotations are available from the Eclipse AST But hard to get No desugaring Use the AnnotationDatabase Pass in an AnnotationDatabase to the transfer functions Query it to find instances of annotation Can also give Crystal a custom parser for complex == foo and y != bar”) 29

Annotations Where can the visitor use annotations for additional checking? Method call parameters Constructor call parameters Use the AnnotationDatabase Query it to find instances of annotation Check that parameter is not null 30

Steps for making an analysis Install Crystal Register an analysis Create a simple AST walker for nullness Add a simple flow analysis Add annotations Add branch-sensitivity 31

Branch-sensitivity Take advantage of knowledge gained through tests Specify different exit paths through a method An invariant that doesn’t hold on exceptional exit Labeled branches let us distinguish these Must return different lattice elements for each label if (x != null) { //hey, it’s safe //to use x in here! } else { //but it’s an //error in here! } 32

On paper… No branch sensitivity  ( x == y,  ) =  Branch sensitivity  T ( x == y,  ) = if (  ( x ) != MAYBE_NULL )  [ y   ( x )] else if (  ( y ) != MAYBE_NULL )  [ x   ( y )] else  Separate definition for the false branch  F ( x == y,  ) 33

Types of labels True/false All conditionals (if, while, ?:, etc.) Methods calls that return a boolean Binary relational operators (&&, <, ==, etc.) Exceptional Methods calls that throw exceptions Throw statements Catch and Finally statements Switch (used on switch) Iterator (used on enhanced for) Normal 34

Changing to branch-sensitive analyses 1.Implement AbstractTACBranchSensitiveTransferFunction 2.Change signatures on transfer functions 3.Wrap return lattice in an IResult At this point, transfer functions run as they did before public LE transfer(TACInstruction instr, LE value)  public IResult transfer(TACInstruction instr, List labels, LE value) return value;  return LabeledSingleResult.createResult(value, labels); 35

Using the branches Which instructions can provide different information on each branch? x == y x != y Create a new LabeledResult with the labels and a default value Copy the lattice element for each branch Change the lattice elements Put them into the labeled result with the right label 36

Crystal Static Analysis Framework Fast startup into a simple analysis Direct from theory to implementation Incremental sophistication of analysis Full power of Eclipse infrastructure Proven useful for both teaching and research 4 research analyses Used for several years in a professional master’s course 37