2
ESP: Program Verification of Millions of Lines of Code
Manuvir Das, Researcher
PPRC Reliability Team, Microsoft Research
3
Motivation
- No Buffer Overruns!
- No Resource Leaks!
- No Privilege Misuse!
4
Approach
- Redundancy is good
- Redundancy exposes inconsistency
- Inconsistency points to errors
- Compare
  - what the programmer should do
  - what her code actually does
5
Lightweight specifications
- Rules
  - Describe correct behavior
  - Readable/writable by programmers
- Specify limited properties
  - not total correctness/verification
- Compare rules against code
6
Types are rules
- Programmers use types to
  - document interface syntax
  - represent program abstractions
- Types are written, read and checked
  - routine part of development process
- Why are types successful?
  - types are lightweight specifications
  - type checking is fast & routine
  - errors are found early, at compile-time
7
Can we extend this approach?
- Specify and check other properties
  - languages to express rules
  - tools to check that code obeys rules
- Goal is partial correctness
  - detect and report important classes of errors
  - no guarantee of program correctness
- Systematic tools of various flavors
  - compile-time verifiers and bug-finders
  - run-time monitors and fault injectors
  - document generators
8
Rule-based programming
[Diagram labels: Source Code; Development; Testing; Precise Rules; Program Analysis Engine; Static Verification Tool; Read for understanding; New API rules; Drive testing tools; Rules; 100% path coverage; Defects]
9
ESP
[Diagram labels: C/C++ Code; OPAL Rules; ESP; Path-sensitive Dataflow Analysis; Rules; 100% path coverage; Defects]
10
Requirements
- Scalability
  - Complete coverage
  - Millions of lines of code
  - All features of C/C++
- Usability
  - Low number of false positives
  - Simple rule description language
  - Informative error reports
11
The bottom line
- Can ESP verify a million lines of code?
- We're not sure ... yet
- We've done 150 KLOC in 70 s and 50 MB
- So, we're cautiously optimistic
12
Are we running into a wall?
- Verification demands precision
  - Need to minimize false error reports
  - Must analyze each execution path
- Big programs demand scalability
  - Exponentially/infinitely many paths
  - Cannot analyze each execution path
  - Must use approximate analysis
13
Research problem
- Can we invent a verification method that
  - is always conservative,
  - is always scalable,
  - is almost always precise, and
  - matches our intuition?
- Yes, for a certain class of rules
  - Finite state, temporal safety properties
14
Finite state safety properties
- Property is described by an FSA
- As the program executes, a monitor
  - tracks the current state of the FSA
  - updates the current state
  - signals an error when the FSA transitions into special error states
- Goal of verification:
  - Is there some execution path that would cause the monitor to signal an error?
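The monitor described on this slide can be sketched in a few lines of C. This is only an illustration, not ESP's implementation: the names `State`, `Event`, `step`, and `trace_signals_error` are hypothetical, and the transition out of Opened on Open is an assumed reading of the FSA diagram used in the stdio example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* States of the property FSA (Closed, Opened, Error). */
typedef enum { ST_CLOSED, ST_OPENED, ST_ERROR } State;

/* Events the monitor observes; names are hypothetical. */
typedef enum { EV_OPEN, EV_PRINT, EV_CLOSE } Event;

/* One monitor step: returns the FSA state after seeing event e in state s. */
State step(State s, Event e) {
    switch (s) {
    case ST_CLOSED:
        /* Print or Close on a closed file is an error. */
        return e == EV_OPEN ? ST_OPENED : ST_ERROR;
    case ST_OPENED:
        if (e == EV_PRINT) return ST_OPENED;
        if (e == EV_CLOSE) return ST_CLOSED;
        return ST_ERROR;   /* Open while Opened: assumed to be an error */
    default:
        return ST_ERROR;   /* Error is absorbing */
    }
}

/* Runs the monitor over one execution trace.
   Returns true if the monitor would signal an error. */
bool trace_signals_error(const Event *trace, int n) {
    assert(trace != NULL || n == 0);
    State s = ST_CLOSED;
    for (int i = 0; i < n; i++) {
        s = step(s, trace[i]);
        if (s == ST_ERROR) return true;
    }
    return false;
}
```

A run-time monitor checks one trace at a time; verification asks the same question about every trace the program could produce.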
15
Example: stdio usage in gcc

    void main () {
        if (dump)
            fil = fopen(dumpFile,"w");
        if (p)
            x = 0;
        else
            x = 1;
        if (dump)
            fclose(fil);
    }

[FSA diagram: states Closed, Opened, Error; edge labels Open, Print, Open, Close, Print/Close, *]

    void main () {
        if (dump)
            Open;
        if (p)
            x = 0;
        else
            x = 1;
        if (dump)
            Close;
    }
16
Path-sensitive property analysis
- Symbolically evaluate the program
- Track FSA state and execution state
- At branch points:
  - Execution state implies branch direction?
  - Yes: process appropriate branch
  - No: split state and process both branches
17
Example
[Figure: control-flow graph with nodes entry, dump, Open, p, x = 0, x = 1, dump, Close, exit; branch edges labeled T/F]
Symbolic states shown on the graph: [Opened|dump=T], [Closed|dump=T], [Opened|dump=T,p=T], [Opened|dump=T,p=T,x=0], [Closed|dump=T,p=T,x=0], [Closed], [Opened|dump=T,p=F], [Opened|dump=T,p=F,x=1], [Closed|dump=T,p=F,x=1]
18
Dataflow property analysis
- Track only FSA state
- Ignore non-state-changing code
- At control flow join points:
  - Accumulate FSA states
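A minimal sketch of this dataflow scheme, assuming FSA state sets are encoded as bit masks and that joins are set unions. All names here (`StateSet`, `transfer`, `join`, `analyze_example`) are hypothetical. The sketch is hard-wired to the abstracted stdio example (`if (dump) Open; if (p) ...; if (dump) Close;`) and shows how accumulating states at joins lets Error appear even though no real execution reaches it.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ST_CLOSED, ST_OPENED, ST_ERROR, NUM_STATES } State;
typedef enum { EV_OPEN, EV_CLOSE } Event;
typedef unsigned StateSet;            /* bit i set <=> FSA state i is possible */

#define BIT(s) (1u << (s))

static State step(State s, Event e) {
    if (s == ST_CLOSED && e == EV_OPEN)  return ST_OPENED;
    if (s == ST_OPENED && e == EV_CLOSE) return ST_CLOSED;
    return ST_ERROR;                  /* everything else, including Error itself */
}

/* Transfer function: apply the event to every state in the set. */
StateSet transfer(StateSet in, Event e) {
    assert(in < BIT(NUM_STATES));     /* only valid state bits */
    StateSet out = 0;
    for (int s = 0; s < NUM_STATES; s++)
        if (in & BIT(s)) out |= BIT(step((State)s, e));
    return out;
}

/* Join at a control-flow merge point: accumulate (union) FSA states. */
StateSet join(StateSet a, StateSet b) { return a | b; }

/* Analyzes: if (dump) Open; if (p) x = 0; else x = 1; if (dump) Close; */
StateSet analyze_example(void) {
    StateSet s = BIT(ST_CLOSED);          /* entry */
    s = join(transfer(s, EV_OPEN), s);    /* if (dump) Open: then-arm and skip-arm */
    /* if (p) x = 0; else x = 1;  changes no FSA state and is ignored */
    s = join(transfer(s, EV_CLOSE), s);   /* if (dump) Close */
    return s;
}

bool reports_error(StateSet s) { return (s & BIT(ST_ERROR)) != 0; }
```

Because the analysis forgets that the two tests of `dump` are correlated, the final set contains Error: a false error report for code that is in fact correct.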
19
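Example
[Figure: control-flow graph with nodes entry, dump, Open, p, x = 0, x = 1, dump, Close, exit; branch edges labeled T/F]
FSA state sets shown on the graph: {Closed,Opened}, {Error,Closed,Opened}, {Closed}, {Closed,Opened}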
20
Why is this code correct?

    void main () {
        if (dump)
            Open;
        if (p)
            x = 0;
        else
            x = 1;
        if (dump)
            Close;
    }

[FSA diagram: states Closed, Opened, Error; edge labels Open, Print, Open, Close, Print/Close, *]
21
When is a branch relevant?
- Precise answer
  - When the value of the branch condition determines the property FSA state
- Heuristic answer
  - When the property FSA is driven to different states along the arms of the branch statement
22
Property simulation
- Modification of path-sensitive analysis
- At control flow join points:
  - States agree on property FSA state?
  - Yes: merge states
  - No: process states separately
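The join rule above can be sketched as follows. This is a simplified illustration under stated assumptions, not ESP's algorithm: the execution state tracks only two boolean variables (`dump`, `p`), the assignments to `x` are ignored, merging keeps only the facts both states agree on, and every name (`SymState`, `StateList`, `simulate_if`, `simulate_example`) is hypothetical.

```c
#include <assert.h>

typedef enum { ST_CLOSED, ST_OPENED, ST_ERROR } State;
typedef enum { EV_NONE, EV_OPEN, EV_CLOSE } Event;
typedef enum { V_UNKNOWN, V_FALSE, V_TRUE } Val;
enum { VAR_DUMP, VAR_P, NUM_VARS };
enum { MAX_STATES = 8 };

/* Symbolic state: property FSA state plus known facts about variables. */
typedef struct { State fsa; Val vars[NUM_VARS]; } SymState;
typedef struct { SymState items[MAX_STATES]; int count; } StateList;

static State step(State s, Event e) {
    if (e == EV_NONE) return s;
    if (s == ST_CLOSED && e == EV_OPEN)  return ST_OPENED;
    if (s == ST_OPENED && e == EV_CLOSE) return ST_CLOSED;
    return ST_ERROR;
}

/* Join rule: if an existing state has the same FSA state, merge into it
   (dropping facts that disagree); otherwise keep the new state separately. */
static void add_at_join(StateList *list, SymState s) {
    for (int i = 0; i < list->count; i++) {
        SymState *t = &list->items[i];
        if (t->fsa == s.fsa) {
            for (int v = 0; v < NUM_VARS; v++)
                if (t->vars[v] != s.vars[v]) t->vars[v] = V_UNKNOWN;
            return;
        }
    }
    assert(list->count < MAX_STATES);
    list->items[list->count++] = s;
}

/* Follows one arm of `if (var)` unless the execution state rules it out. */
static void take_arm(StateList *out, SymState s, int var, Val want, Event e) {
    if (s.vars[var] != V_UNKNOWN && s.vars[var] != want) return;  /* infeasible */
    s.vars[var] = want;
    s.fsa = step(s.fsa, e);
    add_at_join(out, s);
}

/* Simulates `if (var) then_ev; else else_ev;` and joins after the branch. */
StateList simulate_if(const StateList *in, int var, Event then_ev, Event else_ev) {
    StateList out = { .count = 0 };
    for (int i = 0; i < in->count; i++) {
        take_arm(&out, in->items[i], var, V_TRUE, then_ev);
        take_arm(&out, in->items[i], var, V_FALSE, else_ev);
    }
    return out;
}

/* if (dump) Open; if (p) x = 0; else x = 1; if (dump) Close; */
StateList simulate_example(void) {
    StateList entry = { .count = 1 };   /* zero-init: [Closed], nothing known */
    StateList a = simulate_if(&entry, VAR_DUMP, EV_OPEN, EV_NONE);
    StateList b = simulate_if(&a, VAR_P, EV_NONE, EV_NONE);
    return simulate_if(&b, VAR_DUMP, EV_CLOSE, EV_NONE);
}
```

After the first branch the two states differ in FSA state, so `dump` stays tracked; after the branch on `p` the arms agree on FSA state and are merged, so `p` is forgotten. The final result is the single state [Closed], with no Error.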
23
Example
[Figure: control-flow graph with nodes entry, dump, Open, p, x = 0, x = 1, dump, Close, exit; branch edges labeled T/F]
Symbolic states shown on the graph: [Opened|dump=T], [Closed], [Closed|dump=T], [Closed|dump=F], [Opened|dump=T,p=T,x=0], [Opened|dump=T,p=F,x=1], [Opened|dump=T], [Closed|dump=F], [Closed|dump=T], [Closed|dump=F], [Closed]
24
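Loop example
[Figure: control-flow graph with nodes entry, new = old, *, Open, new++, new != old, Close, exit; branch edges labeled T/F]
Symbolic states shown on the graph: [Closed], [Opened|new=old], [Closed|new=old+1], [Opened|new=old], [Closed|new=old], [Closed|new=old+1]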
25
Making property simulation work
- Real programs are complex
  - Multiple FSAs
  - Aliasing
- Real code bases are very large
  - Well beyond a million lines
- ESP = Property Simulation + Multiple FSAs + Aliasing + Component-wise Analysis
26
Problem: Multiple FSAs

    void main () {
        if (dump1)
            fil1 = fopen(dumpFile1,"w");
        if (dump2)
            fil2 = fopen(dumpFile2,"w");
        if (dump1)
            fclose(fil1);
        if (dump2)
            fclose(fil2);
    }

    Transition   Source code pattern
    Open         e = fopen(_)
    Close        fclose(e)

    void main () {
        if (dump1)
            Open(fil1);
        if (dump2)
            Open(fil2);
        if (dump1)
            Close(fil1);
        if (dump2)
            Close(fil2);
    }

[FSA diagram: states Closed, Opened, Error; edge labels Open, Print, Open, Close, Print/Close, *]
27
Property simulation, bit by bit

    void main () {
        if (dump1) Open;
        if (dump2) ID;
        if (dump1) Close;
        if (dump2) ID;
    }

    void main () {
        if (dump1) ID;
        if (dump2) Open;
        if (dump1) ID;
        if (dump2) Close;
    }

- Problem: property state can be exponential
- Solution: track one FSA at a time
28
Property simulation, bit by bit
- One FSA at a time
  + Avoids exponential property state
  + Fewer branches are relevant
  + Lifetimes are often short
  + Smaller memory footprint
  + Embarrassingly parallel
  − Cannot correlate FSAs
29
Problem: Aliasing

    void main () {
        if (dump1)
            fil1 = fopen(dumpFile1,"w");
        if (dump2)
            fil2 = fopen(dumpFile2,"w");
        fil3 = fil1;
        if (dump1)
            fclose(fil3);
        if (dump2)
            fclose(fil2);
    }
30
ESP Model: Values Have State
- During execution, the program
  - creates stateful values
  - changes the state of stateful values
- The programmer defines
  - how values are created (syntactic patterns)
  - how values change state (syntactic patterns)
- Syntactic expressions are aliases for values
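To make the "values have state" model concrete, here is a small sketch that keeps the FSA state on values and treats expressions as aliases for them. It follows a single concrete run rather than performing static analysis, and the `Tracker` type and the `on_open`, `on_assign`, and `on_close` helpers are hypothetical names, not part of ESP.

```c
#include <assert.h>

typedef enum { ST_CLOSED, ST_OPENED, ST_ERROR } State;
enum { EXPR_FIL1, EXPR_FIL2, EXPR_FIL3, NUM_EXPRS };
enum { NO_VALUE = -1, MAX_VALUES = 4 };

typedef struct {
    State value_state[MAX_VALUES];  /* state belongs to values, not expressions */
    int num_values;
    int alias_of[NUM_EXPRS];        /* expression -> value id, or NO_VALUE */
} Tracker;

void tracker_init(Tracker *t) {
    t->num_values = 0;
    for (int e = 0; e < NUM_EXPRS; e++) t->alias_of[e] = NO_VALUE;
}

/* Pattern `e = fopen(_)`: creates a new stateful value in state Opened
   and makes expression e an alias for it. */
void on_open(Tracker *t, int expr) {
    assert(t->num_values < MAX_VALUES);
    t->value_state[t->num_values] = ST_OPENED;
    t->alias_of[expr] = t->num_values++;
}

/* Assignment `dst = src`: dst becomes an alias for the value src holds. */
void on_assign(Tracker *t, int dst, int src) {
    t->alias_of[dst] = t->alias_of[src];
}

/* Pattern `fclose(e)`: changes the state of the value that e aliases. */
State on_close(Tracker *t, int expr) {
    int v = t->alias_of[expr];
    if (v == NO_VALUE) return ST_ERROR;  /* assumption: closing a non-file is an error */
    t->value_state[v] = (t->value_state[v] == ST_OPENED) ? ST_CLOSED : ST_ERROR;
    return t->value_state[v];
}

/* State of the value currently held by an expression. */
State state_of(const Tracker *t, int expr) {
    int v = t->alias_of[expr];
    assert(v != NO_VALUE);
    return t->value_state[v];
}
```

In the aliasing example, `fil3 = fil1` followed by `fclose(fil3)` closes the value that `fil1` also names, so a later query on `fil1` sees Closed while `fil2` is unaffected.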
31
OPAL Rule Descriptions
- Object Property Automata Language

    State Closed
    State Opened
    State Error

    Initial Event Open {
        _object_ ASTFUNCTIONCALL { ASTSYMBOL "fopen" } { _anyargs_ }
    }
    Event Close {
        ASTFUNCTIONCALL { ASTSYMBOL "fclose" } { _object_ }
    }

    Transition _ -> Opened on Open
    Transition Opened -> Closed on Close
    Transition Closed -> Error on Close "File already closed"
32
Parameterized transitions

    void main () {
        if (dump1)
            fil1 = fopen(dumpFile1,"w");
        if (dump2)
            fil2 = fopen(dumpFile2,"w");
        fil3 = fil1;
        if (dump1)
            fclose(fil3);
        if (dump2)
            fclose(fil2);
    }
33
Parameterized transitions

    void main () {
        if (dump1) {
            t1 = fopen(dumpFile1,"w");
            Open(t1);
            fil1 = t1;
        }
        if (dump2) {
            t2 = fopen(dumpFile2,"w");
            Open(t2);
            fil2 = t2;
        }
        fil3 = fil1;
        if (dump1) {
            fclose(fil3);
            Close(fil3);
        }
        if (dump2) {
            fclose(fil2);
            Close(fil2);
        }
    }
34
Expressions are value aliases

    void main () {
        if (dump1) {
            t1 = fopen(dumpFile1,"w");
            Open(t1);
            fil1 = t1;
        }
        if (dump2) {
            t2 = fopen(dumpFile2,"w");
            Open(t2);
            fil2 = t2;
        }
        fil3 = fil1;
        if (dump1) {
            fclose(fil3);
            Close(fil3);
        }
        if (dump2) {
            fclose(fil2);
            Close(fil2);
        }
    }
35
Value-alias analysis
- Is expression e an alias for value v?
- ESP uses GOLF to answer this query
- Generalized One Level Flow
  - Context-sensitive
  - Largely flow-insensitive
  - Millions of lines of code, in seconds
36
Putting it all together
- Property simulation
  - Identify and track relevant execution state
- Syntactic patterns + value-alias analysis
  - Identify and isolate individual FSAs
- One FSA at a time
  - Bit vector analysis for safety properties
37
Case study: stdio usage in gcc
- cc1 from gcc version 2.5.3 (Spec95)
- Does cc1 always print to opened files?
- cc1 is a complex program:
  - 140K non-blank, non-comment lines of C
  - 2149 functions, 66 files, 1086 globals
  - Call graph includes one 450-function SCC
38
Skeleton of cc1 source

    FILE *f1, …, *f15;
    int p1, …, p15;

    void compileFile() {
        if (p1) f1 = fopen(…);
        …
        if (p15) f15 = fopen(…);
        restOfComp();
        if (p1) fclose(f1);
        …
        if (p15) fclose(f15);
    }

    void restOfComp() {
        if (p1) printRtl(f1);
        …
        if (p15) printRtl(f15);
        restOfComp();
    }

    void printRtl(FILE *f) {
        fprintf(f);
    }
39
OPAL rules for stdio usage

    State Uninit
    State Closed
    State Opened
    State Error

    Initial Event Decl {ASTDECLARATION {_object_ ASTSYMBOL _any_}}
    Initial Event Open {_object_ ASTFUNCTIONCALL {ASTSYMBOL "fopen"} {_anyargs_}}
    Event Print {ASTFUNCTIONCALL {ASTSYMBOL "fprintf"} {_object_,_anyargs_}}
    Event Close {ASTFUNCTIONCALL {ASTSYMBOL "fclose"} {_object_}}

    Transition _ -> Uninit on Decl
    Transition _ -> Opened on Open
    Transition Uninit -> Error on Print "File not opened"
    Transition Opened -> Opened on Print
    Transition Closed -> Error on Print "Printing to closed file"
    Transition Opened -> Closed on Close
    Transition Closed -> Error on Close "File already closed"
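One way to read the transition list above is as a lookup table. The sketch below encodes the seven transitions in C; it is not how ESP represents OPAL rules internally. The `Transition` struct, the `ST_ANY` wildcard (standing in for `_`), and `lookup` are hypothetical, and since the slide does not say what happens for state/event pairs with no listed transition, the sketch simply returns `NULL` for them.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ST_UNINIT, ST_CLOSED, ST_OPENED, ST_ERROR, ST_ANY } State;
typedef enum { EV_DECL, EV_OPEN, EV_PRINT, EV_CLOSE } Event;

typedef struct {
    State from;           /* ST_ANY mirrors the "_" wildcard in the rule */
    State to;
    Event on;
    const char *message;  /* NULL unless the transition reports an error */
} Transition;

/* The seven transitions of the stdio rule, in the order listed. */
static const Transition RULES[] = {
    { ST_ANY,    ST_UNINIT, EV_DECL,  NULL },
    { ST_ANY,    ST_OPENED, EV_OPEN,  NULL },
    { ST_UNINIT, ST_ERROR,  EV_PRINT, "File not opened" },
    { ST_OPENED, ST_OPENED, EV_PRINT, NULL },
    { ST_CLOSED, ST_ERROR,  EV_PRINT, "Printing to closed file" },
    { ST_OPENED, ST_CLOSED, EV_CLOSE, NULL },
    { ST_CLOSED, ST_ERROR,  EV_CLOSE, "File already closed" },
};

/* Returns the first transition matching (s, e), or NULL if the rule
   lists none for that pair. */
const Transition *lookup(State s, Event e) {
    assert(s != ST_ANY);  /* ST_ANY is a pattern, never a real state */
    for (size_t i = 0; i < sizeof RULES / sizeof RULES[0]; i++)
        if (RULES[i].on == e && (RULES[i].from == ST_ANY || RULES[i].from == s))
            return &RULES[i];
    return NULL;
}
```

For instance, `lookup(ST_CLOSED, EV_PRINT)` yields the transition to Error carrying the message "Printing to closed file".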
40
Experimental results
- Precision
  - Verification succeeds for every file handle
  - No transitions to Error; no false errors
- Scalability
  - Average per handle: 72.9 seconds, 49.7 MB
  - Single 1 GHz PIII laptop with 512 MB RAM
- We have proved that:
  - Each of the 646 calls to fprintf in the source code prints to a valid, open file
41
Ongoing research
- Path-sensitive value-alias analysis
  - Value-alias sets: expressions that hold tracked value
  - Track value-alias sets during simulation
  - Add value-alias sets to property state
  - When things get complicated, use GOLF
- Component-wise analysis
  - Identify and analyze components
  - Link using less precise analysis