Bandera: Extracting Finite-state Models from Java Code


1 Bandera: Extracting Finite-state Models from Java Code
Faculty: James Corbett, Matthew Dwyer, John Hatcliff
Students and Post-docs: Shawn Laubach, Corina Pasareanu, Robby, Hongjun Zheng, Roby Joehanes, Ritesh Desai, Venkatesh Ranganath, Oksana Tkachuk

2 Goal: Increase Software Reliability
Trends: size, complexity, concurrency, distribution. Cost of a software engineer: rising. Cost of a CPU cycle: falling. Future: Automated Fault Detection.
The broad goal of the Bandera project is to increase software reliability, and the approach we are taking is influenced by several current trends. First of all, there’s been a lot of discussion over the past few days about how software is changing: it’s getting bigger, and it’s becoming more concurrent and distributed. The problem is that traditional testing techniques have trouble dealing with things like concurrency, because it’s often difficult to choose your test cases so that the threads in your system are forced through all possible interleavings. Let’s go to another trend: what about the cost of a software engineer, the person that we pay to develop, test, and maintain our systems? Well, as everyone knows, the cost of an engineer is going through the roof. About the only good news that we have is that Moore’s law continues to hold, and the cost of a CPU cycle continues to drop. If you put these trends together, this suggests that there is a future in automated fault detection. That is, rather than pay those increasingly expensive warm bodies to track down difficult-to-find bugs, we might find it more beneficial to give up a few engineers and instead buy cheaper, faster CPUs and have them automatically check the code using various forms of static analysis.

3 The Dream
Diagram: a Program and a Requirement (Property 1: …, Property 2: …) are fed to a Checker, which answers either OK or an Error trace.
void add(Object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; }
For many people, the dream of automated checking would go something like this: take a program and some software requirements rendered in a particular formalism, submit the software and the requirements to the checker, and the checker would run and then answer either “OK, the requirements are satisfied,” or “You have an error, and by the way, here is an error trace: a path through your program that led to the requirement violation.” Time: 35 secs

4 Model Checking
Diagram: a Finite-state model and a Temporal logic formula are fed to a Model Checker, which answers either OK or an Error trace (Line 5: … Line 12: … Line 15: … Line 21: … Line 25: … Line 27: … Line 41: … Line 47: …).
We are interested in model-checking because it works a lot like the dream. However, instead of processing source code, a model-checker processes a finite-state machine that abstractly models the computation of some system. Instead of a software specification, the checker processes a specification in temporal logic that describes some invariant or some event sequencing property that you would like to hold in the model. <click> When we feed these to the model-checker, the checker will explore all possible paths in the finite-state machine to see if the specification is satisfied. The model-checker will then answer either OK, your model satisfies the temporal specification, or No, the model doesn’t satisfy the specification, and by the way, here’s a listing of the transitions in the model that cause the specification to be violated. Time: 1 min

5 Spin Example
Fragment of Alternating Bit Protocol
proctype A(chan in, out) {
  byte mt; /* message data */
  bit vr;
  L1: mt = (mt+1%MAX); out!mt,1; goto L2;
  L2: in?vr;
      if
      :: (vr == 1) goto L1
      :: (vr == 0) goto L3
      :: printf("Error"); goto L5
      fi;
  L3: out!mt,1;
  L4: in?vr; :: goto L1;
  L5: out!mt,0; goto L4
}
State diagram: locations L1, L2, L3, L4, L5 with transitions labeled ?b1, ?err, ?b0, !a1, ?a1, !a0.

6 Explicit State Model-checking
Conceptual view: the explored state space (computation tree) of the Alternating Bit Protocol fragment (locations L1 to L5; transitions ?b1, ?err, ?b0, !a1, ?a1, !a0).
Implementation view: a Pending list and a Seen Before set of states.
Explored: [L1, (mt1, vr1), ….]
Pending and Seen Before: nothing shown yet

7 Explicit State Model-checking
Explored: [L1, (mt1, vr1), ….]
Pending: [L2, (mt2, vr2), ….]
Seen Before: [L1, (mt1, vr1), ….]

8 Explicit State Model-checking
Explored: [L1, (mt1, vr1), ….] [L2, (mt2, vr2), ….]
Pending: [L3, (mt3, vr3), ….] [L5, (mt5, vr5), ….] [L1, (mt1’, vr1’), ..]
Seen Before: [L1, (mt1, vr1), ….] [L2, (mt2, vr2), ….]

9 Explicit State Model-checking
Explored: [L1, (mt1, vr1), ….] [L2, (mt2, vr2), ….] [L3, (mt3, vr3), ….]
Pending: [L5, (mt5, vr5), ….] [L1, (mt1’, vr1’), ..]
Seen Before: [L1, (mt1, vr1), ….] [L2, (mt2, vr2), ….] [L3, (mt3, vr3), ….]

10 Explicit State Model-checking
Explored: [L1, (mt1, vr1), ….] [L2, (mt2, vr2), ….] [L3, (mt3, vr3), ….] [L5, (mt5, vr5), ….]
Pending: [L1, (mt1’, vr1’), ..]
Seen Before: [L1, (mt1, vr1), ….] [L2, (mt2, vr2), ….] [L3, (mt3, vr3), ….] [L5, (mt5, vr5), ….]
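The Pending / Seen Before bookkeeping on slides 6 to 10 can be sketched in Java roughly as follows. This is only an illustrative sketch, not Spin's implementation: the class name `ExplicitStateSearch`, the use of plain strings as states, and the successor map are all assumptions made for the example (a real checker stores the control location together with the variable values, such as mt and vr).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical explicit-state explorer; a state is just a string label here.
final class ExplicitStateSearch {
    /** Returns every state reachable from initial, expanding each state at most once. */
    static Set<String> reachable(String initial, Map<String, List<String>> successors) {
        Deque<String> pending = new ArrayDeque<>(); // states discovered but not yet expanded
        Set<String> seenBefore = new HashSet<>();   // every state discovered so far
        pending.push(initial);
        seenBefore.add(initial);
        while (!pending.isEmpty()) {
            String state = pending.pop();
            for (String next : successors.getOrDefault(state, List.of())) {
                // Set.add returns false when the state was already discovered,
                // which is what keeps the search finite on cyclic models.
                if (seenBefore.add(next)) {
                    pending.push(next);
                }
            }
        }
        return seenBefore;
    }
}
```

Calling `reachable("L1", graph)` with a successor map for the five control locations visits each location once, even though the protocol loops back to L1.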

11 Why Try to Use Model Checking for Software?
Automatically check, e.g., invariants, simple safety & liveness properties, absence of dead-lock and live-lock, complex event sequencing properties: “Between the window open and the window close, button X can be pushed at most twice.” In contrast to testing, gives complete coverage by exhaustively exploring all paths in the system. It’s been used for years with good success in hardware and protocol design. This suggests that model-checking can complement existing software quality assurance techniques.
So why might model-checking be useful for checking software systems? <click> A lot of people like model-checking because you can use it to automatically check things like invariants, simple safety and liveness properties, absence of deadlock and livelock, and rather complex event-sequencing properties. For example, if you are modeling the actions in a GUI, you might use a model-checker to check a requirement that between the window open event and the window close event, button X can be pushed at most twice. Model-checking is also nice because, in contrast to testing, it gives complete coverage by exhaustively exploring all paths through the transition system. Finally, model-checking has been used for years with good success in hardware and protocol design. In summary, this suggests that model-checking can complement existing software quality assurance techniques. Time: 1 min

12 What makes model-checking software difficult?
OK Error trace or Finite-state model Temporal logic formula Model Checker (F W) Line 5: … Line 12: … Line 15:… Line 21:… So we have these model-checkers out there that have been used successfully in other domains. What makes it difficult to use these to check software systems? I’ve listed four problems on this slide, and I’ll go through each one in detail. <click> First, let’s take the model construction problem --- if we use existing checkers, where do we get the input model? Time: 30 secs Problems using existing checkers: Model construction State explosion Property specification Output interpretation

13 Model Construction Problem
Diagram: a Program and a Model Description, separated by a Gap, in front of the Model Checker.
void add(Object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; }
Semantic gap: Programming Languages (methods, inheritance, dynamic creation, exceptions, etc.) versus Model Description Languages (automata).
The dream would be to simply feed the program to the checker. But existing model-checkers don’t process software source code; they process model-description languages. <click> And unfortunately, there is a large semantic gap between the two. Programming languages have complex features like methods, inheritance, dynamic thread and object creation, and exceptions. Model description languages are tailored for describing automata. In the past, when people wanted to model-check source code, they built a model by hand using clever encodings. Of course, this is error prone and won’t scale. Time: 40 secs.

14 What makes model-checking software difficult?
OK Error trace or Finite-state model Temporal logic formula Model Checker (F W) Line 5: … Line 12: … Line 15:… Line 21:… The next problem deals with the second input to the model-checker -- the temporal logic specification. Time: 10 sec Problems using existing checkers: Model construction State explosion Property specification Output interpretation

15 Property Specification Problem
Difficult to formalize a requirement in temporal logic. “Between the window open and the window close, button X can be pushed at most twice.” …is rendered in LTL as...
[]((open /\ <>close) -> ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ (!pushX U close))))))))))
Many people, even experienced users, find it difficult to formalize requirements in temporal logic. <click> For example, let’s take the GUI requirement from a couple of slides ago. If we had to formalize this requirement in Linear Temporal Logic, we would write something like this. And you can see that this is a fairly complex formula and the structure is rather unintuitive. Time: 30 secs
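To make the intent of the GUI requirement concrete, here is a small Java sketch that checks it on one finite, recorded event trace. It is not a model checker and does not implement LTL semantics: the class `BoundedExistenceCheck`, the event names as strings, and the choice to count every pushX event separately are assumptions made for illustration. An interval that is opened but never closed is left unconstrained, mirroring the `<>close` premise of the formula.

```java
import java.util.List;

// Hypothetical finite-trace check of: between open and close,
// pushX occurs at most `bound` times.
final class BoundedExistenceCheck {
    static boolean holds(List<String> trace, int bound) {
        boolean inScope = false; // true between an open and the next close
        int pushes = 0;          // pushX events seen in the current interval
        for (String event : trace) {
            switch (event) {
                case "open":
                    if (!inScope) {
                        inScope = true;
                        pushes = 0;
                    }
                    break;
                case "pushX":
                    if (inScope) {
                        pushes++;
                    }
                    break;
                case "close":
                    if (inScope && pushes > bound) {
                        return false; // this interval violates the bound
                    }
                    inScope = false;
                    break;
                default:
                    break; // unrelated events are ignored
            }
        }
        return true; // intervals that never close are not constrained
    }
}
```

A model checker establishes the property for every path of the model; a check like this one only covers the single trace it is given.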

16 Property Specification Problem
Forced to state property in terms of model rather than source:
We want to write source level specifications...
Heap.b.head == Heap.b.tail
We are forced to write model level specifications...
(((_collect(heap_b) == 1)\
  && (BoundedBuffer_col.instance[_index(heap_b)].head == BoundedBuffer_col.instance[_index(heap_b)].tail) )\
|| ((_collect(heap_b) == 3)\
  && (BoundedBuffer_col_0.instance[_index(heap_b)].head == BoundedBuffer_col_0.instance[_index(heap_b)].tail) )\
|| ((_collect(heap_b) == 0) && TRAP))

17 What makes model-checking software difficult?
OK Error trace or Finite-state model Temporal logic formula Model Checker (F W) Line 5: … Line 12: … Line 15:… Line 21:… The next problem is the state explosion problem which is related to the cost of actually running the model-checker on the inputs. Time: 10 secs Problems using existing checkers: Model construction State explosion Property specification Output interpretation

18 State Explosion Problem
Cost is exponential in the number of components: bits x1,…,xN give 2^N states. Moore’s law and algorithm advances can help. Holzmann: 7 days (1980) ==> 7 seconds (2000). Explosive state growth in software limits scalability.
We already noted that model-checking explores all possible paths through the given transition system, and so the cost is exponential in the number of components. <click> For example, if we have a system with N components and each component contains only a single bit, we have a possible state-space of 2^N states. However, the situation is not hopeless, thanks to Moore’s law and improvements in algorithms. Gerard Holzmann has noted that a system that was taking 7 days to check in 1980 can now be checked in 7 seconds. However, there is explosive state growth for even small systems, and this is a huge problem that must be attacked on multiple fronts. Time: 50 sec
Misc notes: Holzmann: by Moore’s law, the increase in CPU speed over twenty years means that a problem that took seven days now takes seven seconds (even without the CPU speed-up, algorithmic improvements alone would bring it to 8 hours). Jackson: 3 hours (1980) --> 1 sec (2000).
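The arithmetic behind this slide can be sketched in a few lines of Java. The class and method names are invented for the example, and `extraBits` rests on a simplifying assumption that checking time is proportional to the number of states. Under that assumption, the 7 days to 7 seconds improvement (a factor of 86,400) pays for only about 16 additional one-bit components, which is one way to see why faster hardware alone does not defeat the explosion.

```java
import java.math.BigInteger;

// Hypothetical helper for the state-count arithmetic on this slide.
final class StateSpaceSize {
    /** Number of global states of n independent one-bit components: 2^n. */
    static BigInteger states(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return BigInteger.ONE.shiftLeft(n);
    }

    /**
     * Largest k with 2^k <= speedup: the number of extra one-bit components
     * a given speed-up pays for, assuming cost proportional to state count.
     */
    static int extraBits(long speedup) {
        if (speedup < 1) {
            throw new IllegalArgumentException("speedup must be at least 1");
        }
        return 63 - Long.numberOfLeadingZeros(speedup);
    }
}
```

For example, `StateSpaceSize.extraBits(7L * 24 * 60 * 60 / 7)` evaluates the 7 days to 7 seconds case.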

19 What makes model-checking software difficult?
OK Error trace or Finite-state model Temporal logic formula Model Checker (F W) Line 5: … Line 12: … Line 15:… Line 21:… The final problem that I’m going to discuss is the output interpretation problem --- the problem of using the model-checker’s counterexample trace to locate the actual defect in the software source code. Time: 15 sec Problems using existing checkers: Model construction State explosion Property specification Output interpretation

20 Output Interpretation Problem
Diagram: an Error trace (Line 5: … Line 12: … Line 15: … Line 21: … Line 25: … Line 27: … Line 41: … Line 47: …) must be carried back across the Gap from the Model Description to the Program.
void add(Object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; }
Raw error trace may be 1000’s of steps long. Must map line listing onto model description. Mapping to source is made difficult by: semantic gap & clever encodings of complex features; multiple optimizations and transformations.
The problem starts with the fact that the raw error trail from the model-checker can be thousands of steps long. For example, a group from NASA Ames and Honeywell used SPIN to check properties of the DEOS real-time scheduler, and the bug that they found had an error trace 2700 steps long. <click> First, it’s fairly tedious to even map the error trace back to the model description, because you are usually dealing with multiple threads, each with its own program counter, so you can imagine that this would be tedious to simulate by hand. Mapping back to the source is even more challenging because a lot of clever encodings are typically used when modeling things like method call stacks and heap structures. The problem is somewhat related to debugging in the presence of a highly optimizing compiler, but the problem here is probably even more difficult because more aggressive abstractions are being used. Time: 1 minute

21 Bandera: An open tool set for model-checking Java source code
Diagram: Java Source and a Bandera Temporal Specification enter the Transformation & Abstraction Tools (under Optimization Control), which produce Checker Inputs for the Model Checkers; Checker Outputs return through Error Trace Mapping; a Graphical User Interface wraps the whole of Bandera.
void add(Object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; }
To address these problems, we’ve built a tool set called Bandera that provides a number of forms of support for model-checking Java source code. <click> It’s designed to allow easy experimentation with software model-checking using existing model-checkers, which can be inserted into the framework as pluggable components. <click> To use Bandera, the user inputs the Java source and a property to be checked, which is written in Bandera’s own model-checker-independent temporal specification language. <click> Bandera contains several different transformations and abstraction mechanisms that are designed to extract tractable models from Java source. Once the user has selected appropriate settings for these transformations, Bandera runs the transformations to produce a model description and a property specification for a particular model-checker that’s been chosen by the user. <click> The selected model-checker is then run on these inputs. <click> If the checker produces a counter-example, Bandera will map it all the way back to the source level. <click> The entire collection of tools is encapsulated by a GUI that provides a consistent interface regardless of the model-checker used. Now, I want to talk about how Bandera addresses each of the four problems that we saw on the previous slides. Time: 1 min 10 secs

22 Addressing the Model Construction Problem
Java Source void add(Object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; Model Compiler Static Analyses Abstract Interpretation Slicing Optimizations Model Description Model extraction: compiling to model checker inputs: First of all, let’s take the model construction problem. Bandera addresses the model construction problem by automatically compiling source code to model checker inputs. <click> Just like a traditional compiler, it includes numerous static analysis and optimizing transformations, two intermediate languages and multiple backends. In fact, we believe that almost any analysis that you would want to do in the front end of a traditional Java compiler, you would also want to do in model compilation. Bandera also includes slicing, abstract interpretation, and program specialization that you usually would not find in a compiler to further compress the code to a tractable model. Just like a compiler, you can use Bandera in some default modes that are rather simple minded, or alternatively, there are a number of knobs that you can turn to tune the various optimizations and transformations. Time: 1 min Numerous analyses, optimizations, two intermediate languages, multiple back-ends Slicing, abstract interpretation, specialization Variety of usage modes: simple...highly tuned

23 Addressing the Property Specification Problem
An extensible language based on field-tested temporal property specification patterns.
[]((open /\ <>close) -> ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ (!pushX U close))))))))))
Using the pattern system: 2-bounded existence
Between {open} and {close} {pushX} exists atMost {2} times;
Bandera addresses the property specification problem by providing an extensible language based on a collection of field-tested temporal property specification patterns. <click> Let’s go back to the atrocious LTL formula that we had a few slides ago. It turns out that this is an instance of what’s called the 2-bounded existence property, and you would write it in Bandera’s specification language as follows. Here, the curly brackets and the yellow fonts indicate the parameters to the pattern. Time: 45 sec

24 Addressing the State Explosion Problem
Diagram: the Java Source and the Property both enter the Model Compiler, which emits Model Descriptions.
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Generate models customized with respect to the property! Result: multiple models, even as many as one per property. Aggressive customization via slicing, abstract interpretation, program specialization.
Now to the state explosion problem. I’ve already described Bandera as a model compiler. We believe that to have any hope of tackling the state explosion problem, the compiler should not only take the source code as input, <click> it should also take the property specification as an input. This allows the model compiler to produce not just a single general model, but a model that is customized with respect to the given property. Therefore, if you want to check multiple properties, Bandera produces not a single model, but multiple models, where each model is customized for the respective property. Now of course, this would be impractical if you were hand-coding models, but with Bandera’s support for automated model construction, it’s perfectly feasible. In each case, the models are customized using slicing, abstract interpretation, and program specialization (more about that later). Time: 1 min

25 Addressing the Output Interpretation Problem
Diagram: Java Source passes through the Model Compiler and its Intermediate Representations to a Model Description; the Model Checker + simulator returns an Error trace (Line 5: … Line 12: … Line 15: … Line 21: …) along the same route.
void add(Object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; }
Like a debugger: error traces mapped back to source. Run error traces forwards and backwards. Program state queried. Heap structures navigated. Locks, wait sets, blocked sets displayed.
Finally, we have the output interpretation problem: if the model checker gives you this error trail consisting of steps through the model, how do you figure out what that corresponds to at the source level? I’ve said already that Bandera is like a compiler, but it’s also like a debugger: it will map the error traces produced by any of the model-checkers back to the source level. <click> In addition, there is a simulator or interpreter for the lowest-level intermediate language that allows the error traces to be run both forwards and backwards, with breakpoints, watch variables, and so forth. Just like a debugger, you can query the program state, look at the value of individual variables, and navigate heap structures. Bandera also allows you to see the implicit information associated with each Java object, such as which thread is holding a lock, who’s waiting on a lock, and who’s blocked on a lock. Time: 1 min

26 Bandera Architecture
Diagram: front end (Java-to-Jimple Parser, Property Tool, Error Trace Display); middle end (Analyses, Slicer, Abstraction Engine); back end (BIRC, BIR, Simulator, and Translators to SPIN, dSPIN, SMV, JPF).
Now that you’ve seen the main idea of Bandera, as time permits, I’ll show you the details of the internal architecture. Just like a compiler, Bandera is organized into a front-end, a middle-end, and a back-end. For each of these, we have the code flowing from front to back, but we’ll also have error trace information flowing from back to front. <click> The Bandera front end translates Java source into a 3-address code intermediate language called Jimple. Jimple is part of a Java compiler infrastructure developed by Laurie Hendren’s Sable group at McGill University. The front end also includes a tool for declaring observable features of the source code, which we’ll use in our temporal property specs, and there are also data structures required for the error trace display. <click> The middle-end includes standard data and control-flow analyses, including a flexible points-to analysis, as well as the slicing and abstraction components. <click> The back end includes a translation from Jimple to our low-level intermediate representation called BIR. I’ll show you more about BIR later. In this diagram you also see the simulator for BIR that’s used to generate the error trace information. <click> Finally, we have four checkers plugged into the back-end, and now you can see the rationale for BIR: to incorporate a new model-checker, you only need to write a translator from BIR to the model-checker’s input language. Time: 2 minutes

27 Front End Translates Java source to Jimple IR
Supports specification of property Provides debugger-like step facilities for error traces Label1: if (x <= 0) goto Label2; t0 = y * 2; x = t0; Label2: … if (x > 0) x = y * 2; Java Jimple

28 Property Specification
/**
 * observable
 * EXP Full: (head == tail);
 */
class BoundedBuffer {
  Object [] buffer;
  int head, tail, bound;
  public synchronized void add(Object o) {…}
  public synchronized Object take () {…}
}
Requirement: If a buffer becomes full, it will eventually become non-full.
Bandera Specification: FullToNonFull: forall[b:BoundedBuffer]. {Full(b)} leads to {!Full(b)} globally;
First, let’s consider how properties are specified in Bandera. Here’s a little snippet of code for a simple BoundedBuffer object that’s implemented using an array and integer values for the size of the buffer and indices for the current head and tail. You can also see that we have methods add and take which operate on the buffer. <click> Now let’s say that we have a requirement that says if a buffer becomes full, it will eventually become non-full. This is reasonable if we use the buffer in an environment where we assume that we have an infinite and fair sequence of add and take operations on the buffer. Notice that, in this requirement, the main proposition or observable feature of the program is the notion of “full”, which we can express in terms of the variables head and tail. Since this is a circular buffer, the buffer is going to be full when head is equal to tail. Bandera provides a system of predicates for reasoning about program features, and I’ve used one of these predicates to define the observable notion of full in Javadoc comments above the class. Since we defined the relevant observable features for the requirement, we can now write the requirement in the Bandera specification language. And it reads…. Bandera’s specification language has an interesting notion of quantification that allows you to quantify over collections of objects, and you can see that in action here. Time: 2 min
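For readers who want to run the example, here is one way the elided method bodies might be filled in so that head == tail really does mean "full". This is a sketch under assumptions, not the authors' code: the initial values of head and tail, the emptiness test in take(), and the isFull() accessor are all invented here, chosen to be consistent with the add/take fragments shown on the slides (tail is advanced before the read, so tail names the slot most recently taken).

```java
// Hypothetical completion of the BoundedBuffer from the slide.
// Usable capacity is bound - 1; bound must be at least 2.
final class BoundedBuffer {
    private final Object[] buffer;
    private final int bound;
    private int head; // next slot to write
    private int tail; // slot most recently read

    BoundedBuffer(int bound) {
        if (bound < 2) {
            throw new IllegalArgumentException("bound must be at least 2");
        }
        this.bound = bound;
        this.buffer = new Object[bound];
        this.head = 0;
        this.tail = bound - 1; // assumption: this makes the buffer start empty
    }

    /** The observable from the slide: EXP Full: (head == tail). */
    synchronized boolean isFull() {
        return head == tail;
    }

    synchronized void add(Object o) throws InterruptedException {
        while (tail == head) { // full: wait for a take()
            wait();
        }
        buffer[head] = o;
        head = (head + 1) % bound;
        notifyAll();
    }

    synchronized Object take() throws InterruptedException {
        while ((tail + 1) % bound == head) { // assumed emptiness test
            wait();
        }
        tail = (tail + 1) % bound;
        notifyAll();
        return buffer[tail];
    }
}
```

With bound 3, two add calls make isFull() true, and a single take makes it false again, which is the Full to !Full transition the FullToNonFull property talks about.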

29 Property-directed Slicing
indirectly relevant Resulting slice Slice mentioned in property Once the source code and the property to check have been supplied, a slicing phase is used to remove all sections of the code that are irrelevant for the property being checked. We call this “property-directed slicing” because it’s driven by the observables or primitive propositions that appear in the specification. <click> To start off, a slicing criterion is generated automatically from the observables mentioned in the property. For example, in the bounded buffer property, the observables are the variables head and tail, and the slicing criterion will include all the statements in the program that assign values to head and tail. Given this slicing criterion, the slicer will walk backwards through the program’s control-flow graph and find all the other nodes in the program that potentially influence the criterion nodes. Then, a slice is generated by copying all the relevant nodes, while throwing away the irrelevant nodes. Time: 1 min Source program slicing criterion generated automatically from observables mentioned in the property backwards slicing automatically finds all components that might influence the observables.

30 Property-directed Slicing
Slicing Criterion: all statements that assign to head, tail.
/** EXP Full: (head == tail) */
class BoundedBuffer {
  Object [] buffer_;
  int bound;
  int head, tail;
  public synchronized void add(Object o) {
    while ( tail == head )
      try { wait(); } catch ( InterruptedException ex) {}
    buffer_[head] = o;
    head = (head+1) % bound;
    notifyAll();
  }
  ...
Legend: included in slicing criterion; indirectly relevant; removed by slicing.
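The backward walk described for property-directed slicing can be sketched as a worklist over a dependence map. This is a toy illustration, not Bandera's slicer: the class `BackwardSlicer`, the statement labels S1 to S5, and in particular the hand-written dependence edges in `addMethodExample` are assumptions; a real slicer for concurrent Java derives such edges from control-flow, data-flow, and synchronization analyses.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical backward slicer: keeps every statement the criterion
// transitively depends on and drops the rest.
final class BackwardSlicer {
    static Set<String> slice(Set<String> criterion, Map<String, List<String>> dependsOn) {
        Deque<String> worklist = new ArrayDeque<>(criterion);
        Set<String> slice = new LinkedHashSet<>(criterion);
        while (!worklist.isEmpty()) {
            String stmt = worklist.pop();
            for (String dep : dependsOn.getOrDefault(stmt, List.of())) {
                if (slice.add(dep)) { // newly found relevant statement
                    worklist.push(dep);
                }
            }
        }
        return slice;
    }

    /** Toy dependence map for the add method; the edges are assumed, not computed. */
    static Set<String> addMethodExample() {
        Map<String, List<String>> dependsOn = new HashMap<>();
        dependsOn.put("S1: while (tail == head)", List.of("S4: head = (head+1) % bound", "S2: wait()"));
        dependsOn.put("S2: wait()", List.of("S1: while (tail == head)", "S5: notifyAll()"));
        dependsOn.put("S3: buffer_[head] = o", List.of("S1: while (tail == head)", "S4: head = (head+1) % bound"));
        dependsOn.put("S4: head = (head+1) % bound", List.of("S1: while (tail == head)"));
        dependsOn.put("S5: notifyAll()", List.of("S1: while (tail == head)"));
        // Criterion: the statement that assigns to head.
        return slice(Set.of("S4: head = (head+1) % bound"), dependsOn);
    }
}
```

In this toy map nothing relevant depends on the array store S3, so it falls outside the slice, while the loop test, wait(), and notifyAll() are kept as indirectly relevant.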

31 Abstraction Specializer
Collapses data domains via abstract interpretation.
Code (concrete): int x = 0; if (x == 0) x = x + 1;
Data domains: int is mapped onto Signs = {neg, zero, pos}, with (n<0) : neg, (n==0) : zero, (n>0) : pos.
Code (abstracted): Signs x = zero; if (x == zero) x = pos;
Even after slicing, you still may have a state-space that is intractable due to the large data domains associated with your program variables. Bandera provides an abstract interpretation component that allows you to collapse these data domains down to small collections of symbolic values. For example, on the slide you see a little snippet of code that manipulates integer values. Often, in model checking, one only wants to keep enough information around to allow conditional tests to be decided. So you can see that for this example, to decide the conditional we only need to know if x is zero or non-zero. <click> Now, for this example, an appropriate abstraction would be the familiar Signs abstraction that maps integers onto a set of three symbolic values: neg, zero, pos. Once the user specifies that x is to be abstracted with the Signs abstraction, the code is transformed <click> so that it looks like this. Here you can see that the concrete type int has been replaced with the abstract type Signs, and the concrete constants 0 and 1 have been replaced with the abstract constants zero and pos. Moreover, you can see that as we enter the branch of the conditional, <click> we know x has the value zero, so we are able to replace the computation zero + pos (for one) with pos. Time: 1 min 30 sec

32 Abstraction Component Functionality
Diagram: the BASL Compiler (Bandera Abstraction Specification Language) and PVS populate the Abstraction Library; Jimple enters the Abstraction Engine and leaves as Abstracted Jimple.
Table columns: Variable, Concrete Type, Abstract Type, Inferred Type. Rows shown include x (int, Signs), y (int, Signs), done (bool, Bool), count (int, intAbs), o (Object, Point), b (Buffer, Buffer).
So now that you’ve seen the motivation for abstractions, let’s look at the interface that Bandera presents to the user. When you want to invoke the abstraction component, the GUI will bring up a nice hierarchical table of all the variables in the program along with their types. Now, to work correctly, the abstraction engine needs to know an abstract type for each variable. Of course, you don’t want the user to have to actually select abstractions for each variable. To address this problem, Bandera allows the user to select an abstraction from an abstraction library for just a few variables that the user deems particularly relevant. <click> So in this little example, we might select the Signs abstraction for variable x. Once this initial selection is made, Bandera uses a type inference phase to attach abstract types to the remaining variables. This process is meant to be iterative, so you can refine the types that Bandera has inferred and run the type inference again. <click> Once you are satisfied with the abstract types, the abstraction engine is run to perform the type of code transformation that I showed you on the previous slide. So one remaining issue is: where do the abstractions in this library come from? The library is populated by an expert user who defines abstractions using a little special-purpose language that we’ve designed. For base types, the user only needs to specify the domain of tokens and the abstraction function, and PVS is used to infer the definition of the abstract operations. The abstractions are then compiled to a Java representation and held in the abstraction library. Time: 1 min 45 sec

33 Abstraction Specification
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
    n < 0 -> {NEG};
    n == 0 -> {ZERO};
    n > 0 -> {POS};
  end
  operator + add
    (NEG , NEG) -> {NEG} ;
    (NEG , ZERO) -> {NEG} ;
    (ZERO, NEG) -> {NEG} ;
    (ZERO, ZERO) -> {ZERO} ;
    (ZERO, POS) -> {POS} ;
    (POS , ZERO) -> {POS} ;
    (POS , POS) -> {POS} ;
    (_,_)-> {NEG, ZERO, POS}; /* case (POS,NEG), (NEG,POS) */
Compiled:
public class Signs {
  public static final int NEG = 0; // mask 1
  public static final int ZERO = 1; // mask 2
  public static final int POS = 2; // mask 4
  public static int abstract(int n) {
    if (n < 0) return NEG;
    if (n == 0) return ZERO;
    if (n > 0) return POS;
  }
  public static int add(int arg1, int arg2) {
    if (arg1==NEG && arg2==NEG) return NEG;
    if (arg1==NEG && arg2==ZERO) return NEG;
    if (arg1==ZERO && arg2==NEG) return NEG;
    if (arg1==ZERO && arg2==ZERO) return ZERO;
    if (arg1==ZERO && arg2==POS) return POS;
    if (arg1==POS && arg2==ZERO) return POS;
    if (arg1==POS && arg2==POS) return POS;
    return Bandera.choose(7); /* case (POS,NEG), (NEG,POS) */
  }
}
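The compiled class on the slide is schematic (for instance, `abstract` is a reserved word in Java, and `Bandera.choose` belongs to Bandera's runtime). A self-contained variant of the same Signs abstraction, written as an ordinary Java enum, might look like the sketch below. The names `Sign`, `of`, and `add` are choices made for this example, and returning a set of tokens stands in for the nondeterministic choice: where the slide calls `Bandera.choose(7)`, this sketch returns all three tokens.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-alone version of the Signs abstraction.
enum Sign {
    NEG, ZERO, POS;

    /** Abstraction function: maps a concrete int to its token. */
    static Sign of(int n) {
        if (n < 0) {
            return NEG;
        }
        return n == 0 ? ZERO : POS;
    }

    /** Abstract addition: every token the concrete sum could map to. */
    static Set<Sign> add(Sign a, Sign b) {
        if (a == ZERO) {
            return EnumSet.of(b); // 0 + b has the sign of b
        }
        if (b == ZERO) {
            return EnumSet.of(a); // a + 0 has the sign of a
        }
        if (a == b) {
            return EnumSet.of(a); // NEG+NEG or POS+POS, as in the slide's table
        }
        return EnumSet.allOf(Sign.class); // (POS,NEG) or (NEG,POS): sign unknown
    }
}
```

The abstraction is safe in the sense that, for small integers a and b, `Sign.of(a + b)` is always a member of `Sign.add(Sign.of(a), Sign.of(b))`.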

34 Specification Creation Tools
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
    n < 0 -> {NEG};
    n == 0 -> {ZERO};
    n > 0 -> {POS};
  end
  operator + add
  begin
    (NEG , NEG) -> {NEG} ;
    (NEG , ZERO) -> {NEG} ;
    (ZERO, NEG) -> {NEG} ;
    (ZERO, ZERO) -> {ZERO} ;
    (ZERO, POS) -> {POS} ;
    (POS , ZERO) -> {POS} ;
    (POS , POS) -> {POS} ;
    (_,_)-> {NEG, ZERO, POS};
  end
Automatic Generation
Example: Start safe, then refine: +(NEG,NEG)={NEG,ZERO,POS}
Forall n1,n2: neg?(n1) and neg?(n2) implies not pos?(n1+n2)
Forall n1,n2: neg?(n1) and neg?(n2) implies not zero?(n1+n2)
Forall n1,n2: neg?(n1) and neg?(n2) implies not neg?(n1+n2)
Proof obligations submitted to PVS...

35 Back End
Bandera Intermediate Representation (BIR): a guarded command language that includes locks, threads, references, and heap, plus info to help translators (live vars, invisible).
Jimple:
entermonitor r0
r1.count = 0;
BIR:
loc s5: live { r0, r1 } when lockAvail(r0.lock) do { lock(r0.lock); } goto s6;
loc s6: live { r1 } when true do invisible { r1.count = 0;} goto s7;
After the code has been sliced, abstracted, and otherwise optimized, it’s compiled down to the low-level intermediate language called BIR. BIR is essentially a language of guarded commands that also includes primitives for implementing the low-level synchronization and memory model of Java. For example, here you can see that the entermonitor construct in Java has been implemented in terms of primitives that check the availability of the lock and that acquire the lock. BIR also includes annotations that help optimize the translations to Spin, SMV, etc. For example, here you see that BIR includes information about live variables, as well as information about the granularity of operations.
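The guarded-command reading of the two BIR locations above can be mimicked in Java as follows. This is a loose sketch and not Bandera's BIR simulator: the class `GuardedCommands`, its fields, and the single-threaded `step` method are assumptions made for illustration, with a boolean standing in for r0.lock and an int standing in for r1.count. The live and invisible annotations are not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical interpreter for two BIR-like guarded transitions.
final class GuardedCommands {
    /** One transition: enabled when the guard holds; then runs the action and jumps. */
    static final class Transition {
        final BooleanSupplier guard;
        final Runnable action;
        final String target;

        Transition(BooleanSupplier guard, Runnable action, String target) {
            this.guard = guard;
            this.action = action;
            this.target = target;
        }
    }

    boolean lockHeld = false; // stands in for r0.lock
    int count = 7;            // stands in for r1.count
    String pc = "s5";         // current location
    final Map<String, Transition> program = new LinkedHashMap<>();

    GuardedCommands() {
        // loc s5: when lockAvail(r0.lock) do { lock(r0.lock); } goto s6;
        program.put("s5", new Transition(() -> !lockHeld, () -> lockHeld = true, "s6"));
        // loc s6: when true do { r1.count = 0; } goto s7;
        program.put("s6", new Transition(() -> true, () -> count = 0, "s7"));
    }

    /** Fires the transition at the current location; returns false if blocked or finished. */
    boolean step() {
        Transition t = program.get(pc);
        if (t == null || !t.guard.getAsBoolean()) {
            return false;
        }
        t.action.run();
        pc = t.target;
        return true;
    }
}
```

If another thread already held the lock (lockHeld set to true before the first step), step() would return false and the location would stay at s5, which is how a blocked entermonitor appears in the model.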

36 Bounded Buffer BIR process BoundedB()
BoundedBuffer_ref = ref { BoundedBuffer_col, BoundedBuffer_col_0 }; BoundedBuffer_rec = record { bound_ : range -1..4; head_ : range -1..4; tail_ : range -1..4; BIRLock : lock wait reentrant; }; BoundedBuffer_col : collection [3] of BoundedBuffer_rec; BoundedBuffer_col_0 : collection [3] of BoundedBuffer_rec; ……. ………. loc s34: live { b2, b1, add_JJJCTEMP_0, add_JJJCTEMP_6, add_JJJCTEMP_8 } when true do invisible { add_JJJCTEMP_8 := (add_JJJCTEMP_6 % add_JJJCTEMP_8); } goto s35; loc s35: live { b2, b1, add_JJJCTEMP_0, add_JJJCTEMP_8 } when true do { add_JJJCTEMP_0.head_ := add_JJJCTEMP_8; } goto s36; loc s36: live { b2, b1, add_JJJCTEMP_0 } when true do { notifyAll(add_JJJCTEMP_0.BIRLock); } goto s37; loc s37: live { b2, b1, add_JJJCTEMP_0 } when true do { unlock(add_JJJCTEMP_0.BIRLock); } goto s38;

37 Bounded Buffer Promela
typedef BoundedBuffer_rec { type_8 bound_; type_8 head_; type_8 tail_; type_18 BIRLock; } loc_25: atomic { printf("BIR: OK\n"); if :: (_collect(add_JJJCTEMP_0) == 1) -> add_JJJCTEMP_8 = BoundedBuffer_col. instance[_index(add_JJJCTEMP_0)].tail_; :: (_collect(add_JJJCTEMP_0) == 2) -> add_JJJCTEMP_8 = BoundedBuffer_col_0. :: else -> printf("BIR: NullPointerException\n"); assert(0); fi; goto loc_26; }

38 Translators Plug-in component that interfaces to specific model checker Translates BIR to checker input language Parses output of checker for error trace Currently SPIN, dSPIN, SMV translators complete JPF (from NASA Ames) integrated XMC, FDR translators in progress

39 Case Studies Small examples thus far (< 2000 loc)
illustrating use of property-pattern system and other components Scheduler from DEOS real-time OS kernel (1600, 22 classes, seven tasks) Now trying systems up to 20,000 loc collection of 15 open-source 100% pure Java Jigsaw web-server from W3C Tomcat, James (from Apache/Jakarta) In general, 1-2 minutes for model extraction on (~2000k systems) State space reductions can dramatically reduce cost

40 Summary Bandera provides an open platform for experimentation
Separates model checking from extraction uses existing model checkers supports multiple model checkers Specialize models for specific properties using automated support for slicing, abstraction, etc. Designed for extensibility well-defined internal representations and interfaces We hope this will contribute to the definition of APIs for software model-checkers

41 Context of Project Researchers with different backgrounds (programming languages, static analysis, verification of concurrent systems, software engineering) Started on Bandera in November 1998 (previously built verification tools for Ada) Funding from NASA, National Science Foundation, Honeywell, US Air Force

42 Current Status A reasonable subset of concurrent Java
not handled: recursive methods, exceptions, inner classes, native methods, libraries(*) You can play around with a “pre-alpha” version of the tools accompanied by a draft tutorial Public release: October 2000

43 Schedule of BRICS Mini-Course
Monday -- Overview overview talk basic demo Tuesday -- Specifying Temporal Properties of Software overview of temporal specification review of CTL, LTL temporal specification design patterns example driven presentation of Bandera’s specification tools Wednesday -- Details of Bandera Components slicing concurrent Java programs Bandera abstraction tools model generation via Bandera’s back-end summary of case studies (e.g., space-craft controller examples)

