Willem Visser, Corina Pasareanu and Radek Pelanek

Going from Concrete to Symbolic Model Checking via Predicate Abstraction
Willem Visser, Corina Pasareanu and Radek Pelanek
Automated Software Engineering Group, NASA Ames Research Center

Overview
- Abstraction
  - Classic over-approximation based
    - Counter-example based refinement
  - Under-approximation based
    - Refinement based on abstraction's exactness
- Lightweight framework for testing
  - Test generation environment built around JPF with symbolic execution
  - Measure predicate coverage
  - Evaluate against other test-case generation methods
    - Java Container classes

Predicate Abstraction
Concrete program:
  1: x = 2;
  2: while (x>0)
  3:   x = x - 1;
  4: assert false;
Abstraction Mapping: p = (x>0)
Abstract program:
  1: p = T;
  2: while (p)
  3:   p = !p ? F : T | F;

Abstraction Mapping
For a, a' in 2^{preds}: if wp(a',T) /\ a is satisfiable, add transition a → a'
- May transition a → a': a overlaps wp(a',T)
- Must transition a → a': a is contained in wp(a',T)

Example Abstraction
Concrete program:
  1: x = 2;
  2: while (x>0)
  3:   x = x - 1;
  4: assert false;
Abstraction Mapping: p = (x>0)
Abstract program:
  1: p = T;
  2: while (p)
  3:   p = !p ? F : T | F;
Weakest preconditions for line 3:
  {x - 1 > 0} x = x - 1 {p}
  {x - 1 <= 0} x = x - 1 {!p}
Transitions:
  wp(!p,x=x-1) /\ p is satisfiable: add p → !p
  wp(p,x=x-1) /\ p is satisfiable: add p → p
  wp(!p,x=x-1) /\ !p is satisfiable: add !p → !p
  wp(p,x=x-1) /\ !p is unsatisfiable: no transition !p → p
  !p → wp(!p,x=x-1) holds: !p → !p is a must transition
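The may/must distinction in this example can be checked by brute force. The following is a minimal sketch and not the authors' implementation: the class `MayMust`, its method names and the bounded range of x values are assumptions made for illustration, and a real tool would ask a decision procedure about the weakest precondition instead of enumerating states.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Hypothetical helper: classifies abstract transitions for a deterministic statement
// by enumerating x over a small, assumed range instead of calling a decision procedure.
final class MayMust {
    static final IntPredicate P = x -> x > 0;        // p = (x > 0), as on the slide
    static final IntPredicate NOT_P = P.negate();    // !p
    static final IntPredicate Q = x -> x > 1;        // refinement predicate x > 1
    static final IntUnaryOperator DEC = x -> x - 1;  // statement x = x - 1

    /** May transition: some state in 'from' steps into 'to' (wp(to, stmt) /\ from is satisfiable). */
    static boolean may(IntPredicate from, IntPredicate to, IntUnaryOperator stmt, int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (from.test(x) && to.test(stmt.applyAsInt(x))) return true;
        }
        return false;
    }

    /** Must transition: 'from' is non-empty in the range and every state in it steps into 'to'. */
    static boolean must(IntPredicate from, IntPredicate to, IntUnaryOperator stmt, int lo, int hi) {
        boolean nonEmpty = false;
        for (int x = lo; x <= hi; x++) {
            if (!from.test(x)) continue;
            nonEmpty = true;
            if (!to.test(stmt.applyAsInt(x))) return false;
        }
        return nonEmpty;
    }

    public static void main(String[] args) {
        System.out.println("p -> p:   may=" + may(P, P, DEC, -8, 8) + ", must=" + must(P, P, DEC, -8, 8));
        System.out.println("p -> !p:  may=" + may(P, NOT_P, DEC, -8, 8) + ", must=" + must(P, NOT_P, DEC, -8, 8));
        System.out.println("!p -> !p: may=" + may(NOT_P, NOT_P, DEC, -8, 8) + ", must=" + must(NOT_P, NOT_P, DEC, -8, 8));
        System.out.println("x>1 -> p: must=" + must(Q, P, DEC, -8, 8));
    }
}
```

On this range the enumeration agrees with the slide: p → p and p → !p are may transitions only, !p → !p is a must transition, and strengthening the source to x > 1 turns the step into p into a must transition.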

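Infeasible Counter Example Refinement
Abstract program:
  1: p = T;
  2: while (p)
  3:   p = !p ? F : T | F;
  4: assert false;
Infeasible Counter Example: 1,2,3(F),2,4
Concrete replay: 1: x=2 {x>0}; 2: x=2 {x>0}; 3: x=1 {x<=0}
Refinement: {x > 1} x = x - 1 {x > 0}
[Figure: may and must transitions between the regions x>0 and x<=0 before refinement, and between x>1, x<=1 with x>0, and x<=0 after refinement]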

Let's Go Outside the Box
- Rather than over-approximate and refine, we under-approximate and refine
  - Clearly complements existing techniques
- If we restrict ourselves only to feasible behaviors when under-approximating then all safety property violations will be preserved
- Build on top of classic explicit-state model checking infrastructure

Classic Explicit-State Search
PROCEDURE dfs() {
  s = top(Stack);
  FOR all transitions t enabled in s DO
    s' = successor(s) after executing t;
    IF s' NOT IN VisitedStates THEN
      Enter s' into VisitedStates;
      Push s' onto Stack;
      dfs();
    END
  END;
  Pop s from Stack;
}
INIT {
  Enter s0 into VisitedStates;
  Push s0 onto Stack;
  dfs();
}

Explicit-State (1-step) αSearch
PROCEDURE dfs() {
  s = top(Stack);
  FOR all transitions t enabled in s DO
    s' = successor(s) after executing t;
    IF α(s') NOT IN VisitedStates THEN
      Enter α(s') into VisitedStates;
      Push s' onto Stack;
      dfs();
    END
  END;
  Pop s from Stack;
}
INIT {
  Enter α(s0) into VisitedStates;
  Push s0 onto Stack;
  dfs();
}

αSearch
- Map concrete states to abstract states for state storing
  1: x = 2;
  2: while (x>0)
  3:   x = x - 1;
  4: assert false;
  Abstraction Mapping: p = (x>0)
  Abstract states: 1,p 2,p 3,p
- Under-approximation of the behaviors
- Always traverse only feasible paths

Concrete, May & Must
[Figure: a concrete state graph (states include B,1 C,0 D,0 D,1 E,1 E,2) shown next to its may-transition and must-transition abstractions under p = (x < 2), with abstract states A,p B,p C,p D,p E,p and E,!p]

Concrete & αSearch
[Figure: the concrete state graph (A,0 B,1 C,0 D,0 D,1 E,1 E,2) and the abstraction search under p = (x<2), which stores A,p B,p C,p D,p and E,!p; the transition from D,p (D,1 and D,0) to E,!p (E,2) is marked: Transition not "exact"]

Refinement & αSearch
[Figure: the concrete state graph (A,0 B,1 C,0 D,0 D,1 E,1 E,2) after the refinement step, with p = (x<2) and q = (x<1); states are grouped by the predicate combinations p,q / p,!q / !p,!q]

Example
  1: x = 2;
  2: while (x>0)
  3:   x = x - 1;
  4: assert false;
Abstraction Mapping: p = (x>0)

Refinement
- Check whether the induced abstract transition is a must transition
  - Only 1 DP (decision procedure) call
- If not, add new predicates
Example with abstract states 1,p 2,p 3,p and the statement x = x - 1:
  p is {x > 0}
  wp(p,x=x-1) is {x - 1 > 0}
- Add x > 1 to abstraction predicates and repeat search
  - Globally for all transitions
  - Locally only for the transition (location) it refines
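The effect of this refinement step on the running example can be reproduced with a few lines of Java. This is a sketch under stated assumptions rather than the tool described in the talk: the class `AlphaSearch`, the encoding of the four-line program in `step`, and the initial value x = 0 are all invented for illustration, and since the example program is deterministic the recursive dfs collapses into a simple loop over an explicit stack.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

// Hypothetical sketch of the search with state matching on alpha(s) instead of s.
final class AlphaSearch {
    static final IntPredicate P = x -> x > 0;  // p = (x > 0)
    static final IntPredicate Q = x -> x > 1;  // predicate added by refinement

    /** Concrete state: program location and the value of x. */
    static final class State {
        final int pc;
        final int x;
        State(int pc, int x) { this.pc = pc; this.x = x; }
    }

    /** Concrete successor in the example program; null when there is none. */
    static State step(State s) {
        switch (s.pc) {
            case 1: return new State(2, 2);                  // 1: x = 2;
            case 2: return new State(s.x > 0 ? 3 : 4, s.x);  // 2: while (x > 0)
            case 3: return new State(2, s.x - 1);            // 3: x = x - 1;
            default: return null;                            // 4: assert false;
        }
    }

    /** Abstraction mapping: location plus the truth value of each predicate. */
    static String alpha(State s, List<IntPredicate> preds) {
        StringBuilder sb = new StringBuilder().append(s.pc);
        for (IntPredicate p : preds) sb.append(p.test(s.x) ? 'T' : 'F');
        return sb.toString();
    }

    /** Returns true when the search reaches location 4 (the failing assertion). */
    static boolean reachesAssert(List<IntPredicate> preds) {
        Set<String> visited = new HashSet<>();
        Deque<State> stack = new ArrayDeque<>();
        State s0 = new State(1, 0);  // assumed initial value of x
        visited.add(alpha(s0, preds));
        stack.push(s0);
        while (!stack.isEmpty()) {
            State s = stack.pop();
            if (s.pc == 4) return true;
            State next = step(s);
            // Only concrete (feasible) successors are explored; matching is on alpha(next).
            if (next != null && visited.add(alpha(next, preds))) stack.push(next);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("with p only:   " + reachesAssert(Arrays.asList(P)));
        System.out.println("with p and q:  " + reachesAssert(Arrays.asList(P, Q)));
    }
}
```

With only p, the state at location 2 with x = 1 matches the stored abstract state 2,p and the search stops before the assertion; once x > 1 is added, the states with x = 2, 1 and 0 are distinguished and location 4 is reached.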

Predicate Abstraction versus αSearch
Predicate Abstraction:
- Showing property holds
- Over-approximation based
- Counter-example driven refinement
- Expensive computation to calculate abstraction
αSearch:
- Finding defects
- Under-approximation based
- Abstraction driven refinement
- Trivial computation to calculate abstraction mapping

Issue
- αSearch tries to compute a finite reachable bisimulation quotient
  - This is only possible if a finite reachable bisimulation quotient exists
- If refining the unreachable area keeps requiring new predicates, the algorithm will not terminate
[Figure: reachable and unreachable regions of the state space, with labels p, T and wp(p,T)]

Example
  x = 0; y = 0;
  while (y >= 0)
    y = x + y;
Predicates after successive refinements:
  p: y >= 0
  p,q: y >= 0, x+y >= 0
  p,q,r: y >= 0, x+y >= 0, 2x+y >= 0
  …
The refinement only refines the unreachable state space!

Modified Bakery: Search Order Matters!!
Process 1: while true { x = y; x = x + 1; wait (x<=y); x = 0; }
Process 2: while true { y = x; y = y + 1; wait (y<x); y = 0; }
BFS
- 1st iteration: 18 concrete states and 12 abstract; x+1 <= y, x <= y+1 and y >= 0
- 2nd iteration: 26 concrete states and 19 abstract; x+2 <= y, y >= 1 and x <= 1
- 3rd iteration: 44 concrete states and 32 abstract; y <= 1, x <= 0 and y >= 2
- 4th iteration: 48 concrete states and 36 abstract
DFS
- 1st iteration: 14 concrete states and 20 abstract; x+1 <= y, x <= y+1, y >= 0 and x <= 0
- 2nd iteration: 29 concrete states and 21 abstract; x+2 <= y, y <= 0, x <= -1 and x <= 1
- 3rd iteration: 45 concrete states and 33 abstract

Symbolic Execution and αSearch
- Current implementation is for a simple input language
  - OCaml, using Simplify as a decision procedure
- We would like to integrate the technique in Java Pathfinder (JPF), which supports symbolic execution (using the Omega Library)
  - To allow application to programs with complex data structures (objects)

From Concrete to Symbolic
- Concrete Behavior: X = 1, Y = 0
- Symbolic Behavior: X > Y

Possible Approach
1. Execute the concrete program on valid inputs
2. Collect all predicates in path condition
3. Solve constraints over all combinations of these predicates
4. Use results as inputs for step 1
5. When no new predicates are found, or if an error is found, terminate

Example
Input: method(1,1) + {true}
  public static void method(int x, int y) {
    if ((x > 0) && (y < 10)) {
      if (y < 5) { … } else { … }
    } else {
      if (x > 0) { … } else { … }
    }
  }
Path: x > 0 && y < 10, y < 5, end
Predicates: p1 = x > 0 && y < 10, p2 = y < 5
Result: method(1,1) + {p1,p2}
Solve Constraints:
  p1,!p2 → method(1,6)
  !p1,p2 → method(-1,1)
  !p1,!p2 → method(-1,6)

Example (2)
Input: method(1,6) + {p1,!p2}
  public static void method(int x, int y) {
    if ((x > 0) && (y < 10)) {
      if (y < 5) { … } else { … }
    } else {
      if (x > 0) { … } else { … }
    }
  }
Path: x > 0 && y < 10, y < 5, end
Predicates: p1 (x > 0 && y < 10 holds), !p2 (y < 5 does not hold)
Result: method(1,6) + {p1,!p2}

Example (3)
Input: method(-1,1) + {!p1,p2}
  public static void method(int x, int y) {
    if ((x > 0) && (y < 10)) {
      if (y < 5) { … } else { … }
    } else {
      if (x > 0) { … } else { … }
    }
  }
Path: x > 0 && y < 10, x > 0, end
Predicates: !p1 (x > 0 && y < 10 does not hold), !p3 (x > 0 does not hold)
Result: method(-1,1) + {!p1,p2,!p3}
Solve Constraints:
  !p1,p3 → method(1,11)

Example (4)
Input: method(1,11) + {!p1,p3}
  public static void method(int x, int y) {
    if ((x > 0) && (y < 10)) {
      if (y < 5) { … } else { … }
    } else {
      if (x > 0) { … } else { … }
    }
  }
Path: x > 0 && y < 10, x > 0, end
Predicates: !p1 (x > 0 && y < 10 does not hold), p3 (x > 0 holds)
Result: method(1,11) + {!p1,p3}
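The loop walked through in these examples (run concretely, collect predicates, solve for unseen combinations, feed the results back) can be sketched as follows. Everything here is a hedged illustration: the class `PredicateDrivenInputs` and its helpers are hypothetical, the elided branch bodies are ignored, and the "constraint solver" is a brute-force scan over an assumed input range [-bound, bound] rather than a real decision procedure, so the generated inputs differ from the ones on the slides.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PredicateDrivenInputs {
    /** A branch predicate over the two inputs of the example method. */
    interface Pred { boolean test(int x, int y); }

    // Branch predicates of the example method; the names p1..p3 follow the slides.
    static final Map<String, Pred> PREDS = new LinkedHashMap<>();
    static {
        PREDS.put("p1", (x, y) -> x > 0 && y < 10);
        PREDS.put("p2", (x, y) -> y < 5);
        PREDS.put("p3", (x, y) -> x > 0);
    }

    /** Steps 1-2: run concretely and report which predicates the path evaluated. */
    static List<String> run(int x, int y) {
        return PREDS.get("p1").test(x, y) ? Arrays.asList("p1", "p2") : Arrays.asList("p1", "p3");
    }

    /** The path taken through the method, written as predicate outcomes. */
    static String path(int x, int y) {
        if (PREDS.get("p1").test(x, y)) return PREDS.get("p2").test(x, y) ? "p1,p2" : "p1,!p2";
        return PREDS.get("p3").test(x, y) ? "!p1,p3" : "!p1,!p3";
    }

    /** Truth values of all known predicates at (x, y), for example "TF". */
    static String signature(Set<String> known, int x, int y) {
        StringBuilder sb = new StringBuilder();
        for (String name : known) sb.append(PREDS.get(name).test(x, y) ? 'T' : 'F');
        return sb.toString();
    }

    static Set<List<Integer>> generate(int x0, int y0, int bound) {
        Set<String> known = new LinkedHashSet<>();
        Set<List<Integer>> inputs = new LinkedHashSet<>();
        Deque<List<Integer>> queue = new ArrayDeque<>();
        List<Integer> seed = Arrays.asList(x0, y0);
        inputs.add(seed);
        queue.add(seed);
        while (!queue.isEmpty()) {
            List<Integer> in = queue.poll();
            // Step 5: an input that reveals no new predicate needs no further solving.
            if (!known.addAll(run(in.get(0), in.get(1)))) continue;
            // Step 3: find one witness per feasible combination of the known predicates.
            Map<String, List<Integer>> witness = new LinkedHashMap<>();
            for (List<Integer> old : inputs) {
                witness.putIfAbsent(signature(known, old.get(0), old.get(1)), old);
            }
            for (int x = -bound; x <= bound; x++) {
                for (int y = -bound; y <= bound; y++) {
                    witness.putIfAbsent(signature(known, x, y), Arrays.asList(x, y));
                }
            }
            // Step 4: new witnesses become inputs for further concrete runs.
            for (List<Integer> w : witness.values()) {
                if (inputs.add(w)) queue.add(w);
            }
        }
        return inputs;
    }

    public static void main(String[] args) {
        for (List<Integer> in : generate(1, 1, 12)) {
            System.out.println("method" + in + " covers " + path(in.get(0), in.get(1)));
        }
    }
}
```

Starting from method(1,1) with a bound of 12, the generated inputs exercise all four paths through the example method. Infeasible combinations such as p1 together with !p3 simply never receive a witness.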

End of Part One
- Showed under-approximation based search with refinement
  - Backward weakest precondition based
  - Forward symbolic execution based

Part Two
- Rather than automated refinement we use user-provided abstractions
- Motivation is to generate test-cases to achieve high behavioral coverage for Java container classes

Explicit-State (1-step) αSearch
PROCEDURE dfs() {
  s = top(Stack);
  FOR all transitions t enabled in s DO
    s' = successor(s) after executing t;
    IF α(s') NOT IN VisitedStates THEN
      Enter α(s') into VisitedStates;
      Push s' onto Stack;
      dfs();
    END
  END;
  Pop s from Stack;
}
INIT {
  Enter α(s0) into VisitedStates;
  Push s0 onto Stack;
  dfs();
}

General Idea
[Figure: the environment ENV(m,n) drives the SUT (system under test) through its API calls put(v), del(v), …]
- m is the sequence length of API calls and n is the number of values used in the parameters of the calls
- Evaluate different techniques for selecting test-cases from ENV(m,n) to obtain maximum coverage

Predicate Coverage
- Cover all combinations of a given set of predicates at each branch in the code
- Red-Black Tree Predicates: root = null, e.left = null, e.right = null, e.parent = null, e.color = BLACK
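One simple way to track this kind of coverage is to record each (branch, predicate valuation) pair the first time it is seen. The sketch below is an assumption about how such bookkeeping could look, not the coverage manager used in the talk; the class `PredicateCoverage` and its `record` method are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical coverage bookkeeping: one entry per (branch id, predicate valuation) pair.
final class PredicateCoverage {
    private final Set<String> covered = new LinkedHashSet<>();

    /**
     * Records that 'branch' was reached with the given predicate truth values.
     * Returns true when this combination had not been covered before.
     */
    boolean record(int branch, boolean... predicateValues) {
        StringBuilder key = new StringBuilder().append(branch).append(',');
        for (boolean v : predicateValues) key.append(v ? '+' : '-');
        return covered.add(key.toString());
    }

    /** Number of distinct (branch, valuation) pairs covered so far. */
    int size() {
        return covered.size();
    }

    public static void main(String[] args) {
        PredicateCoverage coverage = new PredicateCoverage();
        System.out.println(coverage.record(15, true, true, false));  // first time at branch 15
        System.out.println(coverage.record(15, true, true, false));  // same combination again
        System.out.println(coverage.size());
    }
}
```

A test generator built on this idea would keep a test case only when `record` returns true for at least one branch it executes.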

Techniques Considered
- Random selection
- Classic model checking
  - State matching on complete state
- Abstraction search
  - State matching on abstract (partial) state
- Symbolic Execution
  - Complete matching using subsumption checks
  - Abstract matching

Framework
[Figure: ENV and the SUT (with minor instrumentation) run on JPF, together with the Coverage Manager, Abstraction Mapping + State Storage, and TestListener]

Sample Output
Concrete:
  Test case number 77 for '15,L+R+P-REDroot':
  put(0);put(4);put(5);put(1);put(2);put(3);remove(4);
- 77 is the unique ID for the test, 15 the branch number and L+R+P-REDroot the predicate values
- The call sequence is the test-case to achieve the above coverage
Symbolic:
  Test case number 7 for '32,L-R-P+RED':
  X2(0) == X1(0) && X2(0) < X0(1) && X1(0) < X0(1)
  put(X0);put(X1);remove(X2);
  put(1);put(0);remove(0);
- The lines after the header are the path condition with solutions, the symbolic test case, and the concrete test case

Environment Skeleton
M : sequence length
N : parameter values
A : abstraction used
  for (int i = 0; i < M; i++) {
    int x = Verify.random(N - 1);
    switch (Verify.random(1)) {
      case 0: put(x); break;
      case 1: remove(x); break;
    }
  }
  Verify.ignoreIf(checkAbstractState(A));

Symbolic Environment Skeleton
M : sequence length
A : abstraction used
  for (int i = 0; i < M; i++) {
    SymbolicInteger x = new SymbolicInteger("X" + i);
    switch (Verify.random(1)) {
      case 0: put(x); break;
      case 1: remove(x); break;
    }
  }
  Verify.ignoreIf(checkAbstractState(A));

Abstraction Search
- Map state to an abstract version and backtrack if the abstract state was seen before, i.e. discard test-case
- Mapping can be lossy or not
- Abstraction mappings can be created by the user/tester
- Default abstraction mappings are provided

Default Mappings
- Structure of the heap of the program
  - e.g. structure of the containers
- Structure augmented with non-data fields
- Structure augmented with symbolic constraints on the data in the structure
  - This requires checking constraint subsumption

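Linearization: Comparing Structures
- Two trees with the same shape (root 1 with children 2 and 5; node 2 with children 3 and 4) give the same sequence:
    1 2 3 -1 -1 4 -1 -1 5 -1 -1
    1 2 3 -1 -1 4 -1 -1 5 -1 -1
- Two trees with different shapes give different sequences:
    1 2 3 -1 -1 4 -1 -1 5 -1 -1
    1 2 3 -1 -1 4 -1 5 -1 -1 -1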

Linearization + Mapping
  1b 2b 3r -1 -1 4r -1 -1 5b -1 -1
  1b 2r 3r -1 -1 4r -1 -1 5r -1 -1
[Figure: the two trees (root 1 with children 2 and 5; node 2 with children 3 and 4) that produce these sequences]
Linearization takes a mapping object as parameter to indicate how each node in the heap should be linearized. In the example above each node gets, besides the unique identifier, a mapping of "r" if the original structure had a red node and "b" if the original structure had a black node in that position. If we also added the key values for each node, the linearization might have looked something like:
  1b6 2b4 3r3 -1 -1 4r5 -1 -1 5b7 -1 -1
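A pre-order traversal that numbers nodes in visit order and writes -1 for a missing child reproduces the sequences shown on these slides. The code below is an illustrative sketch only: the `Linearizer` and `Node` classes and the `sample` tree are hypothetical, and the actual JPF-based implementation works on the program heap rather than on a hand-built tree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical linearizer for binary trees with a pluggable per-node mapping.
final class Linearizer {
    static final class Node {
        final int key;
        final boolean red;
        Node left;
        Node right;
        Node(int key, boolean red) { this.key = key; this.red = red; }
    }

    /** Pre-order linearization: identifier in visit order plus the mapping; -1 for null. */
    static String linearize(Node root, Function<Node, String> mapping) {
        List<String> out = new ArrayList<>();
        int[] nextId = {1};
        visit(root, mapping, nextId, out);
        return String.join(" ", out);
    }

    private static void visit(Node n, Function<Node, String> mapping, int[] nextId, List<String> out) {
        if (n == null) {
            out.add("-1");
            return;
        }
        out.add((nextId[0]++) + mapping.apply(n));
        visit(n.left, mapping, nextId, out);
        visit(n.right, mapping, nextId, out);
    }

    /** A tree shaped like the slide example: black root, black left child with two red children, black right child. */
    static Node sample() {
        Node root = new Node(6, false);
        root.left = new Node(4, false);
        root.left.left = new Node(3, true);
        root.left.right = new Node(5, true);
        root.right = new Node(7, false);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(linearize(sample(), n -> ""));                          // shape only
        System.out.println(linearize(sample(), n -> n.red ? "r" : "b"));           // shape + color
        System.out.println(linearize(sample(), n -> (n.red ? "r" : "b") + n.key)); // shape + color + key
    }
}
```

Because the identifiers come from visit order and not from the keys, two containers holding different data but having the same shape (and, with the color mapping, the same coloring) linearize to the same string and are matched as one abstract state.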

Symbolic Execution
Symbolic State = Shape + Symbolic Constraints (x1 > x2 & x2 > x3 & …)

Subsumption Checking
[Figure: two symbolic states, each a shape plus symbolic constraints (x1 > x2 & …), are compared]
If only it was this simple!

Getting Ready for Checking
PC: s1 < s2 & s4 > s3 & s4 < s1 & s4 < s5 & s7 < s2 & s7 > s1
[Figure: the shape, with nodes x2, x3, x4, x5 holding s4, s3, s5, s2]
∃ s1,s2,s3,s4,s5 such that x1 = s1 & x2 = s4 & x3 = s3 & x4 = s5 & x5 = s2 & PC
Existential Elimination: x1 > x2 & x2 > x3 & x2 < x4 & x5 > x1

Bidirectional Subsumption Checking
- If new => old: backtrack
- If old => new: new is more general than old
  - Replace old with new to increase chances of getting a match in the future
  - Continue on path from new, i.e. don't backtrack
- Ultimately for each shape we want to use a disjunction of constraints
  - Small technicality prevents us: bug in the Omega library
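The two-way check can be sketched with a store that keeps one constraint per shape. This is a hypothetical illustration, not the JPF implementation: `SubsumptionStore`, `Constraint` and `Action` are invented names, implication is decided by enumerating two variables over an assumed small range instead of calling the Omega library, and continuing without replacement when neither constraint implies the other is an assumption, since the slide does not cover that case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store: one symbolic constraint per heap shape, checked in both directions.
final class SubsumptionStore {
    /** A constraint over two symbolic values; a stand-in for a real constraint representation. */
    interface Constraint { boolean holds(int x1, int x2); }

    enum Action { BACKTRACK, REPLACED, CONTINUE }

    private static final int LO = -6;  // assumed enumeration range
    private static final int HI = 6;

    private final Map<String, Constraint> byShape = new HashMap<>();

    /** Brute-force check that a implies b on the assumed range. */
    static boolean implies(Constraint a, Constraint b) {
        for (int x1 = LO; x1 <= HI; x1++) {
            for (int x2 = LO; x2 <= HI; x2++) {
                if (a.holds(x1, x2) && !b.holds(x1, x2)) return false;
            }
        }
        return true;
    }

    Action visit(String shape, Constraint fresh) {
        Constraint old = byShape.get(shape);
        if (old == null) {                 // first visit of this shape
            byShape.put(shape, fresh);
            return Action.CONTINUE;
        }
        if (implies(fresh, old)) {         // new => old: already covered
            return Action.BACKTRACK;
        }
        if (implies(old, fresh)) {         // old => new: keep the more general constraint
            byShape.put(shape, fresh);
            return Action.REPLACED;
        }
        return Action.CONTINUE;            // assumption: incomparable, keep old and go on
    }

    public static void main(String[] args) {
        SubsumptionStore store = new SubsumptionStore();
        System.out.println(store.visit("shape", (x1, x2) -> x1 > x2));
        System.out.println(store.visit("shape", (x1, x2) -> x1 > x2 + 1));
        System.out.println(store.visit("shape", (x1, x2) -> x1 >= x2));
    }
}
```

Storing a disjunction of constraints per shape, as the slide proposes, would replace the single `Constraint` value with a collection and turn the incomparable case into an append.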

Evaluation
- Red-Black Trees
- Out of Memory runs are not reported
- Breadth-first Search unless stated
- Sequence Length = Values for the non-symbolic searches
- First compare under Branch Coverage

Exhaustive Techniques: Branch Coverage
Columns: Seq, Cov, Len, Time (s), Mem (MB)
- Full MC: 7, 39, 4.3, 536, 584
- S+C+V: Time 10.635, Mem 17.47
- Sym - S+Sub: Time 14.201, Mem 16.95
Optimal Branch Coverage is 39

Under-Approximation Techniques: Branch Coverage
Columns: Seq, Cov, Len, Time (s), Mem (MB)
- S: 21, 39, 6.1, 57.353, 72.07
- S+C: 18, 5.8, 32.577, 21.16
- Sym - S: 7, 4.3, 10.054, 15.43
- Sym - S+C: 11.998, 10.76
- Random: 9, 40.429, 3.06
Optimal Branch Coverage is 39

Exhaustive Techniques: Predicate Coverage
Columns: Seq, Cov, Len, Time (s), Mem (MB)
- Full MC: 7, 79, 5.2, 543, 309
- S+C+V: 10, 95, 5.7, 350, 228
- Sym - S+Sub: 11, 102, 6.1, 222, 117
Optimal Predicate Coverage is 106

Under-Approximation Techniques: Predicate Coverage
Columns: Seq, Cov, Len, Time (s), Mem (MB)
- S: 25d, 106, 21.7, 90, 13.31
- S+C: 30, 8.3, 354, 100
- Sym - S: 12, 6.1, 230, 123.27
- Sym - S+C: 104, 6.2, 356, 138
- Random: 60, 30.1, 61.459, 7.74
Optimal Predicate Coverage is 106

Observations
- For a simple coverage such as branch coverage, all the techniques work well, including the exhaustive ones
- But making the coverage more "behavioral", even by a small increment, kills off the exhaustive techniques

Observations
- Full Blown Model Checking doesn't work here
- Its close cousin, that only looks at the relevant state at the relevant time, scales much better
- Branch, full coverage after:
  - MC: 536s & 584MB
  - Complete: 10s & 17MB
- Predicate, best coverage after:
  - MC: 79 covered with 543s & 309MB
  - Complete: 95 covered with 350s & 228MB

Observations
- Symbolic techniques have a slight edge over concrete ones for exhaustive analysis
  - Comparing for Predicate Coverage (10):
    - Full Concrete(95): 350s & 228MB
    - Full Symbolic(95): 123s & 62MB
- Current results indicate symbolic under-approximation based search is less efficient than concrete
  - Further experimentation required

Observations: Random Search?
- Seems to work rather well here
- It will always have an edge on memory, since it uses almost none
- It will most likely have an edge on speed, since it needs to do little additional work; it will however redo work often
- It will in general do worse on test-case length, since it requires longer sequences to achieve more complex coverage

Observations
- Search Order Matters for the lossy techniques
  - BFS is inherently better than DFS
  - On occasion though it is the other way round

Conclusions & Future Work
- Showed how predicate abstraction can be used for an under-approximation based search with refinement
- Showed how a lightweight variant, where the abstraction mapping is given and no refinement is done, can be used for bug-finding and test-case generation
- Goal: Derive predicates for analyzing containers automatically through the use of symbolic execution during refinement
  - Can we derive shape predicates automatically?