Download presentation
Presentation is loading. Please wait.
Published byAbagail Garnes Modified over 10 years ago
1
Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International Static Analysis Symposium – September 2011 Microsoft Research University of Wisconsin – Madison
2
Valid input Constraints Recorded trace Background Systematic Dynamic Test Generation (= DART) 2 Used in many tools o EXE, CUTE, SAGE, PEX, KLEE, BitScope, Apollo, etc. Run program Symbolically execute program Negate and solve constraints New inputs And the process repeats (possibly forever!)
3
o 200+ machines (since 2008) #1 application for SMT solvers today (CPU usage) o 1 st whitebox fuzzer for security testing 3 SAGE @ Microsoft o 1 billion+ constraints o 100s of apps, 100s of security bugs Example: Win7 file fuzzing Found ~1/3 of all fuzzing bugs o Millions of dollars saved for Microsoft + time/energy for the world
4
Compositional Test Generation Compositional Dynamic Test Generation Compute summaries that can be reused later Avoid retesting Can provide the same path coverage exponentially faster! 4 Systematically executing all feasible paths does not scale
5
Example of Function Summary 5 1 int is_positive(int x) { 2 if (x > 0) return 1; 3 return 0; 4 } Where ret denotes the value returned by the function is_positive
6
Function Summaries 6 Conjunction of constraints on the inputs of f Conjunction of constraints on the outputs of f
7
Must Summaries 7 1 int g(int x, int y) { 2 if ((x > 0) && (hash(y) > 10)) 3 return 1; 4 return 0; 5 } Under-approximate with smaller precondition Assume hash is a complex or unknown function Assume if g is invoked with y = 45, then hash(45) = 987
8
Must Summaries 8 Defined as quadruple lp, P, lq, Q where: Prog Ip lq P summary precondition holding at lp Q summary postcondition holding at lq
9
Some Facts About Summaries Time to be produced: weeks/months 9 Number of summaries: millions Number of instructions executed between lp and lq: can be hundreds of thousands
10
Incremental Compositional Test Generation 10 Have to start from scratch if there is a small code change Incremental compositional test generation As in smart/selective regression testing Reuse summaries still valid in new program Recompute invalid summaries
11
Must Summary Checking 11 Given a valid must summary for a program and a new version of the program, is the summary still valid for the new version? Intraprocedural summaries o locations lp and lq are in a same function f o function f does not return between lp to lq when the summary is generated
12
Some proposals Naïve o For each summary, record executed instructions Too expensive, ~100K of instructions executed Runtime overhead 12 Our proposal o Verify statically what summaries are valid in order to reuse them Less precise than recomputing summaries from scratch, but cheaper
13
Algorithms 1. Static Change Impact Analysis 13 2. Predicate-Sensitive Change Impact Analysis 3. Must Summary Validity Checking Analysis
14
Phase 1: Static Change Impact Analysis Impact analysis of code changes in the control- flow and call graphs of the program 14 Old programNew program Ip lq Ip lq
15
Modified Instructions and Functions Instruction i of a program Prog is modified if: o i is changed or deleted in Prog or o Its ordered set of immediate successors has changed 15 Function f in a program Prog is modified if f: o contains a modified instruction o calls a modified function o calls an unknown function
16
Phase 1: Static Change Impact Analysis 16... Construct call graph for the program 1
17
17... U MMU M IM IU IM S S S S SS Find modified and unknown functions 2 Find indirectly modified and unknown functions 3 Phase 1: Static Change Impact Analysis 4 Map summaries, construct control-flow graphs
18
18... U MMU M IM IU IM S S S S SS Find summaries as valid or invalid 5 Phase 1: Static Change Impact Analysis
19
Phase 2: Predicate-Sensitive Change Impact Analysis 19 Exploit the predicates P and Q in a summary if(x > 0) if (y==0) w = w + 1w = 0w = 1... Ip lq Q: w = 0 Old program Invalidated by Phase 1
20
Phase 2: Predicate-Sensitive Change Impact Analysis 20... if (x > 0) { if (y == 10) w++; // MODIFIED else w = 0; } else { w = 1; // MODIFIED }... Old program void foo() { return; } Ip lq Q: w = 0
21
Phase 2: Predicate-Sensitive Change Impact Analysis 21 Instrumented old program 1 2 4 3 3 void foo() { return; } Ip lq Q: w = 0
22
Phase 2: Predicate-Sensitive Change Impact Analysis Check assertion in instrumented code does not fail for all possible inputs 22 Verification-condition based program verifier o Create logic formula from program with assertions o Check formula validity using theorem prover o If valid, the assertion does not fail in any execution
23
Phase 3: Must Summary Validity Checking 23 Check must summary validity against some code, independently of code changes if(x < 0) if (y < 0) r = 1r = 0w = 1... Ip lq P: x < 0 Old program r = 4 New program Invalidated by Phase 1 and Phase 2
24
Phase 3: Must Summary Validity Checking 24... if (x < 0) { if (y < 0) r = 1; else { r = 4; // r = 0 in old code }... New program void bar() { return; } Ip lq P: x < 0
25
Phase 3: Must Summary Validity Checking 25 reach_lq = false; goto lp;... assume P; if (x < 0) { if (y < 0) r = 1; else { r = 4; // r = 0 in old code } assert(Q); reach_lq = true;... assert(reach_lq); Instrumented new program 1 2 3 4 void bar() { return; } Ip lq P: x < 0
26
Phase 3: Must Summary Validity Checking Check that assertions hold in the instrumented program for all possible inputs 26
27
Result 27 Validated summaries can be reused o Because of soundness Invalidated summaries are discarded and need to be recomputed o New tests are generated to cover their preconditions Algorithms can be used in isolation or in a pipeline
28
Experimental Results 28
29
Implementation Details 29 Map summaries, find modified insts and funcs (C++) Old DLL Summaries Old DLL New DLL Vulcan Produced by SAGE Phase 1 Change Impact Phase 2 Predicate Sensitive Phase 3 Validity Checking Valid/Invalid Summaries Library to statically analyze Windows binaries Used in pipeline or isolation
30
Implementation Details Translator from X86 to BoogiePL Procedure (x86) Vulcan Summary lp,P,lq,Q Sound translation Instrumented BPL file (Phase 2 or Phase 3) Boogie/Z3
31
Benchmarks 31 Image parsers embedded in Windows o ANI, GIF and JPEG Ran SAGE to generate summaries (small sample) o 286 for ANI, 288 for GIF and 517 for JPEG Identified the DLLs involved o 3 for ANI, 4 for GIF and 8 for JPEG Compared old version against a randomly picked newer version o Delta ~1 to 3 years
32
Difference Between Program Versions 32 Modified functions: 3% - 10%Indirectly modified functions: 30% - 45% Unknown functions: 27% - 37% Indirectly unknown functions: 60% - 74%
33
Applying Phases in Isolation 33 # Validated Summaries 58%85%30%69%92% 31% 61%94%33% Total Validated: 256/286 (90% ) Total Validated: 274/288 (95% ) Total Validated: 501/517 (97% ) Phase 1: Change Impact Phase 2: Predicate Sensitive Phase 3: Validity Checking
34
Applying Phases in Pipeline Fashion Phase 1 Phase 2 Phase 3 34 # Validated Summaries 58%27%4% Total Validated: 256/286 (90% ) 69%25%1% 61%35%1% Total Validated: 274/288 (95% ) Total Validated: 501/517 (97% ) Phase 1: Change Impact Phase 2: Predicate Sensitive Phase 3: Validity Checking
35
Running Time (Isolation) 35 # Minutes Phase 1: Change Impact Phase 2: Predicate Sensitive Phase 3: Validity Checking
36
Running Time Phase 1 Phase 2 Phase 3 36 # Minutes 43 min28min41min Preliminary results show that statically validating must summaries is up to 20 times faster than recomputing them! Phase 1: Change Impact Phase 2: Predicate Sensitive Phase 3: Validity Checking
37
Summary Formulated the problem of statically validating must summaries 37 Demonstrated the effectiveness of static must summary checking o Validated hundreds of must summaries in minutes Described three approaches for validating must summaries Presented a preliminary evaluation on three large Windows image parsers
38
Questions? 38 Map summaries, find modified insts and funcs (C++) Old DLL Summaries Old DLL New DLL Vulcan Phase 1 Change Impact Phase 2 Predicate Sensitive Phase 3 Validity Checking Valid/Invalid Summaries
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.