Annotation-Assisted Lightweight Static Checking
  David Evans
  evans@cs.virginia.edu
  http://lclint.cs.virginia.edu
  Collaborators: John Knight, David Larochelle
  LCLint
  University of Virginia, Department of Computer Science
  Notes: Thanks. Thesis work on policy-directed code safety. Goal: prevent programs from doing bad things but allow good programs to do useful work.
A Gross Oversimplification
  [Figure: Bugs Detected (none to all) plotted against Effort Required (Low to Unfathomable), with Compilers and Formal Verifiers marked]
  Notes: State of the world before LCLint (gross simplification):
    Static checking as part of compilers or standard lint
    Formal verification tools: range from fiendishly expensive to unfathomable; not used much except in academic research projects and when taxpayers are paying
    Just to prove I have some PowerPoint animation skills, even if they aren’t as impressive as Nick McKeown’s
  4 June 2000 David Evans
  [Figure, next animation step: LCLint added to the chart]
Requirements
  No interaction required: as easy to use as a compiler
  Fast checking: as fast as a compiler
  Gradual learning/effort curve
    Little needed to start
    Clear payoff relative to user effort
Approach
  Programmers add annotations (formal specifications)
    Simple and precise
    Describe the programmer’s intent: types, memory management, data hiding, aliasing, modification, nullity, etc.
  LCLint detects inconsistencies between annotations and code
    Simple (fast!) dataflow analyses
Sample Annotation: only
  extern only char *gptr;
  extern only out null void *malloc (int);
  Reference (return value) owns storage
  No other persistent (non-local) references to it
  Implies obligation to transfer ownership
  Transfer ownership by:
    Assigning it to an external only reference
    Returning it as an only result
    Passing it as an only parameter, e.g., extern void free (only void *);
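As a rough illustration of the three transfer routes listed on this slide, here is a minimal C sketch. It writes the annotations in LCLint's stylized-comment form (`/*@only@*/`, `/*@null@*/`) rather than as the bare keywords shown on the slide, so an ordinary compiler ignores them. The names `gname`, `make_name`, `set_name`, and `drop_name` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical global that owns its storage (an external only reference). */
/*@only@*/ /*@null@*/ char *gname = NULL;

/* Return the storage as an only result: the caller becomes the owner. */
/*@only@*/ /*@null@*/ char *make_name(const char *src)
{
    char *copy;

    assert(src != NULL);
    copy = malloc(strlen(src) + 1);
    if (copy != NULL) {
        strcpy(copy, src);
    }
    return copy;
}

/* Assign to an external only reference: gname takes over ownership.
   The previous value is released first so that it does not leak. */
void set_name(/*@only@*/ /*@null@*/ char *name)
{
    free(gname);            /* free(NULL) is a no-op */
    gname = name;
}

/* Pass as an only parameter: free takes ownership and releases it. */
void drop_name(void)
{
    free(gname);
    gname = NULL;
}
```

A call such as `set_name(make_name("abc"))` moves ownership from the result of `make_name` into `gname` without ever leaving two owners for the same storage.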
Example
  extern only null void *malloc (int);   (in library)
  1 int dummy (void) {
  2   int *ip= (int *) malloc (sizeof (int));
  3   *ip = 3;
  4   return *ip;
  5 }
  LCLint output:
  dummy.c:3:4: Dereference of possibly null pointer ip: *ip
     dummy.c:2:13: Storage ip may become null
  dummy.c:4:14: Fresh storage ip not released before return
     dummy.c:2:43: Fresh storage ip allocated
  Note: the user didn’t have to write any annotations to discover these bugs!
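One plausible repair for the two problems reported on this slide is sketched below: test the pointer before dereferencing it, and release the storage before returning. The name `dummy_fixed` and the `-1` failure value are assumptions made for this example, not part of the original slide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical repaired version of dummy(); returns -1 if allocation fails. */
int dummy_fixed(void)
{
    int result;
    int *ip = (int *) malloc(sizeof (int));

    if (ip == NULL) {       /* after this guard, ip is known to be non-null */
        return -1;
    }
    *ip = 3;
    result = *ip;           /* copy the value out before releasing */
    free(ip);               /* discharge the ownership obligation */
    return result;
}

/* Usage: a caller can tell success from allocation failure. */
void example(void)
{
    int v = dummy_fixed();
    assert(v == 3 || v == -1);
}
```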
Checking Examples
  Encapsulation: abstract types (rep exposure), global variables, documented modifications
  Memory management: leaks, dead references
  Dereferencing null pointers, dangerous aliasing, undefined behavior (order of modifications, etc.)
Unsoundness & Incompleteness are Good!
  Okay to miss errors
    Report as many as possible
  Okay to issue false warnings
    But don’t annoy the user with too many
    Make it easy to configure checking and override warnings
  Design tradeoff: do more ambitious checking the best you can
LCLint Status
  Public distribution since 1993
  Effective checking of programs over 100K lines (checks about 1K lines per second)
  Detects lots of real bugs in real programs (including itself, of course)
  Over 1000 users
  More information: lclint.cs.virginia.edu; PLDI ’96, FSE ’94
Where do we go from here?
  Extensible checking
    Allow users to define new annotations and associated checking
  Integrate run-time checking
    Combine static and run-time checking to enable additional checking and completeness guarantees
  Generalize framework
    Support static checking for multiple source languages in a principled way
Extensible Checking
  LCLint engine provides analysis core
    Events associated with code points
    Control flow analysis, alias analysis
  User (library) defines checking rules
    Introduces state associated with types of objects (e.g., references, numeric values)
    Defines annotations for initializing and constraining that state
    Defines checking rules associated with code events
Example: Nullity
  state nullity {
    applies to reference
    oneof { notnull, maybenull, isnull }
    default maybenull
    merge {
      notnull + (maybenull | isnull) = maybenull
      maybenull + * = maybenull
      isnull + (maybenull | notnull) = maybenull
    }
    guard {
      != NULL => notnull
      == NULL => isnull
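To make the merge and guard rules on this slide concrete, the following C sketch models them as plain functions. It is not the LCLint engine: the names `nullity`, `merge`, `refine`, and `example` are hypothetical, and it assumes that merging two identical states keeps that state, which the slide only spells out for maybenull.

```c
#include <assert.h>

/* The three nullity states named on the slide. */
typedef enum { NOTNULL, MAYBENULL, ISNULL } nullity;

/* Merge the states arriving on two control-flow paths.
   Identical states are kept (assumption); any disagreement gives
   MAYBENULL, which is what each listed merge rule reduces to. */
nullity merge(nullity a, nullity b)
{
    return (a == b) ? a : MAYBENULL;
}

/* State of x where a comparison with NULL is known to hold:
   x == NULL holds => ISNULL, x != NULL holds => NOTNULL. */
nullity refine(int equals_null)
{
    return equals_null ? ISNULL : NOTNULL;
}

/* Usage: p is tested with "if (p != NULL)" and the branches rejoin. */
void example(void)
{
    nullity in_then = refine(0);    /* p != NULL holds */
    nullity in_else = refine(1);    /* p == NULL holds */

    assert(merge(in_then, in_else) == MAYBENULL);
    assert(merge(in_then, in_then) == NOTNULL);
}
```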
Nullity Checks
  checks (reference x) nullity {
    transfer (notnull, { notnull, maybenull })
    transfer (maybenull, maybenull)
    transfer (isnull, { isnull, maybenull });
    *x requires notnull
  }
  annotation null : reference declaration
    requires state nullity
    nullity = maybenull;
  annotation notnull : reference declaration
    nullity = notnull;
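The transfer and dereference rules on this slide can be read as two predicates. The C sketch below is one such reading, under the assumption that `transfer (from, { ... })` lists the destination states a value in state `from` may flow into. The names `transfer_ok`, `deref_ok`, and `example` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { NOTNULL, MAYBENULL, ISNULL } nullity;

/* May a value in state `from` flow into a reference declared `to`? */
bool transfer_ok(nullity from, nullity to)
{
    switch (from) {
    case NOTNULL:   return to == NOTNULL || to == MAYBENULL;
    case MAYBENULL: return to == MAYBENULL;
    case ISNULL:    return to == ISNULL || to == MAYBENULL;
    }
    return false;   /* not reached for valid enum values */
}

/* "*x requires notnull": only a NOTNULL reference may be dereferenced. */
bool deref_ok(nullity x)
{
    return x == NOTNULL;
}

/* Usage: a result declared null starts out MAYBENULL, so dereferencing
   it without a guard is rejected, as is storing it where notnull is
   required. */
void example(void)
{
    nullity malloc_result = MAYBENULL;

    assert(!deref_ok(malloc_result));
    assert(!transfer_ok(malloc_result, NOTNULL));
    assert(transfer_ok(malloc_result, MAYBENULL));
}
```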
Example: Sharing
  state sharing {
    applies to reference
    oneof { only, temp, shared, dead }
    default { result, global, field } only
            parameter temp
  }
  annotation only : reference x { x.sharing = only; }
  annotation temp : reference x { x.sharing = temp; }
Sharing Checks
  checks (reference x) {
    must transfer
    transfer (only, only) becomes dead
    transfer (temp, only)
    use x requires not dead
  }
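A small C model of the sharing state may help show how these checks interact. It covers only the rules whose meaning is clear from the slide: an only-to-only transfer makes the source dead, and any use requires a reference that is not dead. It also assumes that "must transfer" means an only reference may not still be only when its scope ends. The `transfer (temp, only)` rule is left out because the slide does not say what it implies. All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The four sharing states named in the state definition. */
typedef enum { ONLY, TEMP, SHARED, DEAD } sharing;

/* transfer (only, only) becomes dead: after ownership is handed to
   another only reference, the source may no longer be used.
   This sketch models only the case where the source is ONLY. */
void transfer_only_to_only(sharing *src)
{
    assert(*src == ONLY);
    *src = DEAD;
}

/* "use x requires not dead" */
bool use_ok(sharing s)
{
    return s != DEAD;
}

/* Assumed reading of "must transfer": at scope exit, a reference
   that is still ONLY has an undischarged ownership obligation. */
bool exit_ok(sharing s)
{
    return s != ONLY;
}

/* Usage: p = malloc(...); ... free(p); seen as state changes. */
void example(void)
{
    sharing p = ONLY;               /* fresh storage owned by p */

    assert(use_ok(p));
    assert(!exit_ok(p));            /* returning now would leak */
    transfer_only_to_only(&p);      /* e.g. passing p to free   */
    assert(!use_ok(p));             /* use after release        */
    assert(exit_ok(p));
}
```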
Open Questions
  What are the primitives?
  What are the limits on checking?
    Without sacrificing the requirements that checking be static and efficient
  Can decent messages be generated automatically from checking definitions?
  Is this framework sufficiently powerful to describe a useful class of checks?
  Is this framework sufficiently accessible to allow programmers to define application-specific checks?
Test Cases
  Built-in LCLint annotations and checks
  Buffer Overflows [David Larochelle]
    Adapted from [Wagner00]
    Depends on numerical range analysis
    Can this be defined or must it be a primitive?
  Information flow
    Based on JFlow [Myers99]
    Need to provide parameterized annotations
    Support for polymorphism
    Can this be defined or must it be added to the engine?
Summary
  A little redundancy goes a long way
    Don’t need a full specification to do useful checking
    Gradually add more redundancy to get better checking
  Lots of opportunity for user-defined checking
    But many open questions to answer
Credits
  LCLint: David Larochelle
  LCL: Yang Meng Tan, John Guttag, Jim Horning
  Funding: DARPA, NSF, ONR, NASA
State, Annotations and Checking
  A large class of useful checking involves:
    Associating state with references at execution points
    Providing type-judgment rules for the new state
    Providing state transition rules
  Can we define this class?
  Can we describe new annotations and checks in a way that is accessible to normal programmers?
    Don’t want rules for every grammar production
Concrete Example: Buffer Overflow Errors [credits]
  State: reference { bool nullterminated; int allocated; int