Model Checking Java Programs (Java PathFinder) CS 510/10 Model Checking Java Programs (Java PathFinder) Slides partially compiled from the NASA JavaPathFinder project and E. Clarke’s course material
Java PathFinder JPF is an explicit state software model checker for Java bytecode JPF is a Java virtual machine that executes your program not just once (like a normal VM), but theoretically in all possible ways, checking for property violations like deadlocks or unhandled exceptions along all potential execution paths.
Symbolic Model Checking Program Analysis Engine CNF SAT Solver Claim Cbmc, smv SAT (counterexample exists) UNSAT (no counterexample found)
Explicit State Model Checking The program is indeed executing jpf <your class> <parameters> Very similar to “java <your class> <parameters> Execute in a way that all possible scenarios are explored Thread interleaving Undeterministic values (random values) Concrete input is provided A state is indeed a concrete state, consisting of Concrete values in heap/stack memory Jpf, spin, slam
JPF Status developed at the Robust Software Engineering Group at NASA Ames Research Center currently in it’s fourth development cycle v1: Spin/Promela translator - 1999 v2: backtrackable, state matching JVM - 2000 v3: extension infrastructure (listeners, MJI) - 2004 v4: symbolic execution, choice generators - 4Q 2005 open sourced since 04/2005 under NOSA 1.3 license: <javapathfinder.sourceforge.net> it’s a first: no NASA system development hosted on public site before 11100 downloads since publication 04/2005
An Example
One execution corresponds to one path. An Example (cont.) One execution corresponds to one path.
JPF explores multiple possible executions GIVEN THE SAME CONCRETE INPUT
Another Example
Choice points are those that jpf VM selects which thread to proceed Choice points are those that jpf VM selects which thread to proceed. They are usually thread start, join points, synchronization points, shared variable reference points. Why? Because they are the points that matter. We will come back to this issue when we are going to talk about partial order reduction. For this moment, students should understand that either main thread or the racer thread gets executed at each choice point. For example, if you have two threads. T1 { int i, j; i=10; j= 20; } T2 { int x, y; y=20;
Two Essential Capabilities Backtracking Means that JPF can restore previous execution states, to see if there are unexplored choices left. While this is theoretically can be achieved by re-executing the program from the beginning, backtracking is a much more efficient mechanism if state storage is optimized. State matching JPF checks every new state if it already has seen an equal one, in which case there is no use to continue along the current execution path, and JPF can backtrack to the nearest non-explored non-deterministic choice Heap and thread-stack snapshots.
The Challenge
The Challenge (cont.) State Explosion!! Random variables A program that potentially has infinite long execution State Explosion!!
JPF’s Solution Configurable search strategy Host VM Execution Directing the search so that defects can be found quicker A debugging tool instead of a “proof” system. User can easily develop his/her own strategy Host VM Execution Delegate execution to the underlying host VM (no state tracking). Reducing state storage State collapsing Premise: only a tiny part of the state is changed upon each transaction. (e.g. a single stack frame) Dividing a state into components, use hashtable to index a specific value for a component.
Solution- State Collapsing For example, x, y, z are static fields, they comprise a component. That is, a hash value is created for a unique value assignment for the component. If i, j, k are fields in a class C, a hash value is used for a dynamic object, representing that object. Question: why do you use a vector instead of an hash value to represent a state? In one extreme, we can use one hash value to represent the whole state. But that is useless as each time a single variable is changed, a new entry (the entire image) of the state has to be stored in the pool.
Solution (3) – State Reduction Orthogonal (our focus) State Abstraction Partial Order Reduction
Abstraction Eliminate details irrelevant to the property Obtain simple finite models sufficient to verify the property Disadvantage Loss of Precision: False positives/negatives
Data Abstraction h S S’ Abstraction Function h : from S to S’
Data Abstraction Example Abstraction proceeds component-wise, where variables are components Even Odd …, -2, 0, 2, 4, … x:int …, -3, -1, 1, 3, … Pos Neg Zero …, -3, -2, -1 y:int 1, 2, 3, …
How do we Abstract Behaviors? Abstract domain A Abstract concrete values to those in A Then compute transitions in the abstract domain
Data Type Abstraction Code Abstract Data domain int x = 0; if (x == 0) x = x + 1; we transform the code so that to operate on the abs domain and it looks like this; here the concrete type int was replaced by abs type signs, concrete constants 0 and 1 were replaced with abs ct 0 and pos; and primitive ops on ints were replaced with calls to some methods that implement the abs.ops.that manipulate abstract values. Ex : equality operator was replaced with a call to method signs.eq and + was replace by signs.add . So, how do we apply this abstraction technique to the DEOS example ? We have to decide which variables to abstract, what abstrations to use and then we have to effectively transform the system to encode the abstractions. The translation s homomorphic. h(v*u)=h(v)@h(u), * is an operator on the original domain and @ is an operator on the abstract domain. (n<0) : NEG (n==0): ZERO (n>0) : POS Signs NEG POS ZERO Signs x = ZERO; if (Signs.eq(x,ZERO)) x = Signs.add(x,POS);
Existential/Universal Abstractions Make a transition from an abstract state if at least one corresponding concrete state has the transition. Abstract model M’ simulates concrete model M Universal Make a transition from an abstract state if all the corresponding concrete states have the transition.
Existential Abstraction (Over-approximation) Use the x=x+1 example (abstracted to the Signs domain) to explain the idea. Red is a faulty state x=x+1; y=10/x This makes x=0 a faulty state. I h I S’
Universal Abstraction (Under-Approximation) abstract state 1 does not have a self-loop, last state does not have a successor abs state 1 does have a transition to the next state. no self loop on the red state h I S’
Guarantees from Abstraction Assume M’ is an abstraction of M Strong Preservation: P holds in M’ iff P holds in M Weak Preservation: P holds in M’ implies P holds in M
Guarantees from Exist. Abstraction Let φ be a hold-for-all-paths property M’ existentially abstracts M M’ Preservation Theorem M’ ⊨ φ M ⊨ φ M Converse does not hold M’ ⊭ φ M ⊭ φ M’ ⊭ φ : counterexample may be spurious
Spurious counterexample in Over-approximation Deadend states I Bad States I Failure State f
Refinement Problem: Deadend and Bad States are in the same abstract state. Solution: Refine abstraction function. The sets of Deadend and Bad states should be separated into different abstract states.
Refinement h’ Refinement : h’
Automated Abstraction/Refinement Good abstractions are hard to obtain Automate both Abstraction and Refinement processes Counterexample-Guided AR (CEGAR) Build an abstract model M’ Model check property P, M’ ⊨ P? If M’ ⊨ P, then M ⊨ P by Preservation Theorem Otherwise, check if Counterexample (CE) is spurious Refine abstract state space using CE analysis results Repeat
Counterexample-Guided Abstraction-Refinement (CEGAR) Build New Abstract Model Model Check M M’ Pass No Bug Fail Check Counterexample Obtain Refinement Cue Spurious CE Real CE Bug