Evaluating and Tuning a Static Analysis to Find Null Pointer Bugs
Dave Hovemeyer, Bill Pugh, Jaime Spacco

How hard is it to find null-pointer exceptions?
Large body of work
–academic research: too much to list on one slide
–commercial applications: PREfix / PREfast, Coverity, PolySpace

Lots of hard problems
–Aliasing
–Infeasible paths
–Resolving call targets
–Providing feedback to developers: under what conditions can an error happen?

Can we use simple techniques to find NPEs? Yes, when you have code like:

// Eclipse
if (in == null)
    try { in.close(); }
    catch (IOException e) {}

Easy to confuse == and != (the check should be in != null)

Easy to confuse && with ||

// JBoss 4.0.0RC1
if (header != null || header.length > 0) {
    ...
}

If header is null, the || still evaluates header.length and throws an NPE; the guard needs &&. This type of error (and less obvious bugs) occurs in production code more frequently than you might expect.

The FindBugs Project
Open-source static bug finder
–findbugs.sourceforge.net
–127,394 downloads as of Saturday
–analyzes Java bytecode
Used at several companies
–Goldman Sachs
Bug-driven bug finding
–start with a bug
–what's the simplest analysis that finds the bug?

FindBugs null pointer analysis
Intra-procedural dataflow analysis
–compute null-ness along all reaching paths for a value, taking conditionals into account
–use a value-numbering analysis to update all copies of an updated value
No modeling of heap values
Don't report warnings that might be false positives due to infeasible paths
Extended the basic analysis with limited inter-procedural analysis using annotations
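
A minimal sketch (ours, not from the slides) of the value-numbering point: because s and t share one value number, a null check applied to t also refines every copy of the same value.

class ValueNumberDemo {
    static String lookup(java.util.Map<String, String> m, String key) {
        return m.get(key); // may return null
    }

    static void demo(java.util.Map<String, String> m) {
        String s = lookup(m, "k"); // s may be null
        String t = s;              // copy: same value number as s
        if (t != null) {
            // the check on t also proves s non-null, so no warning here
            System.out.println(s.length());
        }
    }
}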

Dataflow Lattice
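
The original slide shows the lattice as a figure; a simplified text reconstruction from the values these slides name (the exact shape is our assumption):

        NCP            (most conservative: no useful information)
       /    \
    NSP      NonNull / No-Kaboom NonNull
     |
    Null     (definitely null)

Merging Null with anything else yields NSP; an NSP value that reaches a conditional is widened to NCP.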

Null on a Simple Path (NSP)
Merge null with anything else
We only care that there is control flow where the value is null
–we don't try to identify infeasible paths
–the NPE happens if the program achieves full branch coverage

Null on a Simple Path (NSP)
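
The original slide illustrates NSP with a figure; a minimal Java sketch (ours) of the pattern it depicts:

class NspDemo {
    static void demo(boolean flag) {
        String s = null;     // s : Null
        if (flag) {
            s = "hello";     // s : NonNull on this branch
        }
        // after the merge point, s : NSP (null on a simple path)
        System.out.println(s.length()); // NPE whenever flag is false
    }
}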

Null on a Complex Path (NCP)
Most conservative approximation
–tells the analysis we lack sufficient information to justify a warning when the value is dereferenced, so we issue none
Used for:
–method parameters
–instance variables
–NSP values that reach a conditional
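
A small sketch (ours) of how a value ends up NCP: once an NSP value flows through an unrelated conditional, its null-ness can no longer be tied to a single branch, so the analysis stays quiet.

class NcpDemo {
    static void demo(Object param, boolean flag) {
        // param is a method parameter: NCP from the start, never warned about
        Object x = flag ? null : param;   // x : NSP after the merge
        if (System.nanoTime() % 2 == 0) { // x reaches an unrelated conditional: widened to NCP
            System.out.println(x.hashCode()); // no warning issued
        }
    }
}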

No-Kaboom Non-Null
Definitely non-null because the pointer was already dereferenced (no kaboom occurred)
Suspicious when the programmer compares a No-Kaboom value against null
–suggests confusion about the program's specification or contracts

// Eclipse
// fTableViewer is a method parameter
property = fTableViewer.getColumnProperties();
...
if (fTableViewer != null) {
    ...
}

// Eclipse
// fTableViewer is a method parameter
// fTableViewer : NCP
property = fTableViewer.getColumnProperties();
...
if (fTableViewer != null) {
    ...
}

// Eclipse
// fTableViewer is a method parameter
// fTableViewer : NCP
property = fTableViewer.getColumnProperties();
// fTableViewer : NoKaboom nonnull
...
if (fTableViewer != null) {
    ...
}

// Eclipse
// fTableViewer is a method parameter
// fTableViewer : NCP
property = fTableViewer.getColumnProperties();
// fTableViewer : NoKaboom nonnull
...
// redundant null-check => warning!
if (fTableViewer != null) {
    ...
}

Redundant Checks for Null (RCN)
Compare a value statically known to be null (or non-null) with null
Does not necessarily indicate a problem
–defensive programming
Assume programmers don't intend to write (non-trivial) dead code

Extremely Defensive Programming

// Eclipse
File dir = new File(...);
// new never returns null, so the null check is redundant
if (dir != null && dir.isDirectory()) {
    ...
}

Non-trivial dead code

x = null;
... code that does not assign x ...
if (x != null) {
    // non-trivial dead code
    x.importantMethod();
}

What do we report?
Dereference of a value known to be null
–guaranteed NPE if the dereference is executed
–highest priority
Dereference of a value known to be NSP
–guaranteed NPE if the path is ever executed
–exploitable NPE assuming full branch coverage
–medium priority
If the path can only be reached when an exception occurs
–lower priority
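
A small sketch (ours) of the lower-priority case: the null value reaches the dereference only on an exception path.

class ExceptionPathDemo {
    static void demo() {
        java.io.InputStream in = null;
        try {
            in = new java.io.FileInputStream("data.txt");
            System.out.println(in.read());
        } catch (java.io.IOException e) {
            // if the constructor threw, in is still null here:
            // this dereference fails only on the exception path
            System.err.println("failed reading " + in.toString());
        }
    }
}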

Reporting RCNs
No-Kaboom RCNs
–higher priority
RCNs that create dead code
–medium priority
Other RCNs
–low priority

Evaluate our analysis using:
Production software
–jdk1.6.0-b48
–glassfish-9.0-b12 (Sun's application server)
–Eclipse
–manually classified each warning
Student programming projects

Production Results

Software                              | # NP derefs and RCN warnings
JDK 1.6.0-b48                         | 242
Glassfish-9.0-b12 (Sun's app server)  | 317
Eclipse                               |

Eclipse Results with Manual Inspection of Warnings

Warning Type   | Accurate Warnings | False Positives | Precision
Null Deref     |                   |                 |
No KaBoom RCN  |                   |                 |
Other RCN      | 15                | 17              | 47%

How many of the existing NPEs are we detecting?
Difficult question for production software
The student code base lets us study all NPEs produced by a large code base covered by fairly complete unit tests
–how many NP warnings correspond to a run-time fault? (false positives)
–how many NPEs do we issue a warning for? (false negatives)

The Marmoset Project
Automated snapshot, submission, and testing system
–an Eclipse plug-in captures a snapshot of every save to a central repository
Students submit code to a central server for testing against a suite of unit tests
–at the end of the semester, we ran all snapshots against the tests
–we also ran FindBugs on all intermediate snapshots

students             | 73
snapshots            | 51,484
compilable           | 40,742
unique               | 33,015
total test outcomes  | 505,423
not implemented      | 67,650
exception thrown     | 63,488
  NP exception       | 29,467
assertion failed     | 138,834
passed               | 235,448

Overall numbers, Fall 2004, 2nd-semester OOP course

Analyzing Marmoset results
Analyze two projects
–Binary Search Tree
–WebSpider
Difficult to decide what to count
–per snapshot, per warning, per NPE?
–false positives persist and get over-counted
–multiple warnings / NPEs per snapshot
–exceptions can mask each other
–difficult to match warnings and NPEs

Marmoset Results

Recall:
project    | snapshots with NPE | snapshots with warning | recall
BST        | 71                 | 1                      | 1%
WebSpider  |                    |                        | 29%

Precision:
project    | snapshots with warning | snapshots with NPE | precision
BST        | 2                      | 2                  | 100%
WebSpider  | 77                     | 75                 | 97%

What are we missing?
Projects have Javadoc specifications about which parameters and return values can be null
Encode specifications into a format FindBugs can use for limited inter-procedural analysis
Easy to add annotations to the interface students were to implement
–though we did this after the semester

Annotations
Lightweight way to communicate specifications about method parameters or return values
–@NonNull: issue a warning if ever passed a null value
–@CheckForNull: issue a warning if unconditionally dereferenced
–@Nullable: null in a complicated way; no warnings issued

@CheckForNull
By default, all values are @Nullable
Mark an entire class or package @CheckForNull by default
–must explicitly mark some values
–Map.get() can return null
Not every application needs to check every call to Map.get()
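
A self-contained sketch (ours) of how the annotations drive the limited inter-procedural analysis; the annotation declarations below stand in for FindBugs' own:

@interface NonNull {}
@interface CheckForNull {}

class AnnotationDemo {
    // warn if any caller can pass null
    static int keyLength(@NonNull String key) {
        return key.length();
    }

    // warn if any caller dereferences the result unconditionally
    @CheckForNull
    static String find(java.util.Map<String, String> m, String k) {
        return m.get(k); // Map.get() can return null
    }

    static void demo(java.util.Map<String, String> m) {
        String v = find(m, "k");
        if (v != null) {                      // required: v is @CheckForNull
            System.out.println(keyLength(v)); // ok: v proven non-null
        }
    }
}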

Marmoset Results with Annotations

Precision:
project    | snapshots with warning | snapshots with NPE | precision | previous precision
BST        | 40                     | 36                 | 90%       | 100%
WebSpider  |                        |                    |           | 97%

Recall:
project    | snapshots with NPE | snapshots with warning | recall | previous recall
BST        | 71                 | 38                     | 54%    | 1%
WebSpider  |                    |                        |        | 29%

Related Work
LCLint (Evans)
Metal (Engler et al.)
–"Bugs as Deviant Behavior"
ESC/Java
–more general annotations
Fähndrich and Leino
–non-null types for C#

Conclusions
We can find bugs with simple methods
–in student code
–in production code
–student bug patterns can often be generalized into patterns found in production code
Annotations look promising
–lightweight way of simplifying inter-procedural analysis
–helpful when assigning blame

Thank you! Questions?

Difficult to decide what to count
False positives tend to persist
–over-counted
Students fix NPEs quickly
–under-counted
Multiple warnings / exceptions per snapshot
Some exceptions can mask other exceptions