Model Checking Doron A. Peled University of Warwick Coventry, UK

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

Translating from logic to automata Book: Chapter 6.
Lecture 2: testing Book: Chapter 9 What is testing? Testing is not showing that there are no errors in the program. Testing cannot show that the program.
Model Checking and Testing combined
Black Box Checking Book: Chapter 9 Model Checking Finite state description of a system B. LTL formula. Translate into an automaton P. Check whether L(B)
Automatic Verification Book: Chapter 6. How can we check the model? The model is a graph. The specification should refer the the graph representation.
Formal Methods and Testing Goal: software reliability Use software engineering methodologies to develop the code. Use formal methods during code development.
Modeling issues Book: chapters 4.12, 5.4, 8.4, 10.1.
CS 267: Automated Verification Lecture 8: Automata Theoretic Model Checking Instructor: Tevfik Bultan.
Partial Order Reduction: Main Idea
1 Model checking. 2 And now... the system How do we model a reactive system with an automaton ? It is convenient to model systems with Transition systems.
Automatic Verification Book: Chapter 6. What is verification? Traditionally, verification means proof of correctness automatic: model checking deductive:
PROTOCOL VERIFICATION & PROTOCOL VALIDATION. Protocol Verification Communication Protocols should be checked for correctness, robustness and performance,
Hoare’s Correctness Triplets Dijkstra’s Predicate Transformers
Mutual Exclusion By Shiran Mizrahi. Critical Section class Counter { private int value = 1; //counter starts at one public Counter(int c) { //constructor.
Timed Automata.
Annoucements  Next labs 9 and 10 are paired for everyone. So don’t miss the lab.  There is a review session for the quiz on Monday, November 4, at 8:00.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 13.
CS 267: Automated Verification Lecture 10: Nested Depth First Search, Counter- Example Generation Revisited, Bit-State Hashing, On-The-Fly Model Checking.
1 Temporal Claims A temporal claim is defined in Promela by the syntax: never { … body … } never is a keyword, like proctype. The body is the same as for.
1 Formal Methods in SE Qaisar Javaid Assistant Professor Lecture 05.
1 Introduction to Computability Theory Lecture12: Decidable Languages Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture12: Reductions Prof. Amos Israeli.
Introduction to Computability Theory
1 Formal Methods in SE Qaisar Javaid Assistant Professor Lecture # 11.
Specification Formalisms Book: Chapter 5. Properties of formalisms Formal. Unique interpretation. Intuitive. Simple to understand (visual). Succinct.
Abstractions. Outline Informal intuition Why do we need abstraction? What is an abstraction and what is not an abstraction A framework for abstractions.
1 Carnegie Mellon UniversitySPINFlavio Lerda SPIN An explicit state model checker.
CS 536 Spring Global Optimizations Lecture 23.
Modeling Software Systems Lecture 2 Book: Chapter 4.
1 Translating from LTL to automata Book: Chapter 6.
Specification Formalisms Book: Chapter 5. Properties of formalisms Formal. Unique interpretation. Intuitive. Simple to understand (visual). Succinct.
Review of the automata-theoretic approach to model-checking.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Testing an individual module
CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG)
Describing Syntax and Semantics
1 Formal Engineering of Reliable Software LASER 2004 school Tutorial, Lecture1 Natasha Sharygina Carnegie Mellon University.
Flavio Lerda 1 LTL Model Checking Flavio Lerda. 2 LTL Model Checking LTL –Subset of CTL* of the form: A f where f is a path formula LTL model checking.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
1 Functional Testing Motivation Example Basic Methods Timing: 30 minutes.
Software Testing Sudipto Ghosh CS 406 Fall 99 November 9, 1999.
15-820A 1 LTL to Büchi Automata Flavio Lerda A 2 LTL to Büchi Automata LTL Formulas Subset of CTL* –Distinct from CTL AFG p  LTL  f  CTL. f.
Regular Model Checking Ahmed Bouajjani,Benget Jonsson, Marcus Nillson and Tayssir Touili Moran Ben Tulila
CMSC 345 Fall 2000 Unit Testing. The testing process.
Basics of automata theory
Doron Peled, Bar Ilan University. Testing of black box finite state machine Know: lTransition relation lSize or bound on size Wants to know: lIn what.
Copyright , Doron Peled and Cesare Tinelli. These notes are based on a set of lecture notes originally developed by Doron Peled at the University.
Formal methods: Model Checking and Testing Prof. Doron A. Peled University of Warwick, UK and Bar Ilan University, Israel.
The Complexity of Distributed Algorithms. Common measures Space complexity How much space is needed per process to run an algorithm? (measured in terms.
Software Engineering1  Verification: The software should conform to its specification  Validation: The software should do what the user really requires.
CIS 540 Principles of Embedded Computation Spring Instructor: Rajeev Alur
Hwajung Lee. Why do we need these? Don’t we already know a lot about programming? Well, you need to capture the notions of atomicity, non-determinism,
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
Translating from logic to automata (Book: Chapter 6)
1 Temporal logic. 2 Prop. logic: model and reason about static situations. Example: Are there truth values that can be assigned to x,y simultaneously.
Model Checking Lecture 1. Model checking, narrowly interpreted: Decision procedures for checking if a given Kripke structure is a model for a given formula.
Program Correctness. The designer of a distributed system has the responsibility of certifying the correctness of the system before users start using.
CS357 Lecture 13: Symbolic model checking without BDDs Alex Aiken David Dill 1.
Today’s Agenda  Quiz 4  Temporal Logic Formal Methods in Software Engineering1.
Model Checking Lecture 1: Specification Tom Henzinger.
SOFTWARE TESTING LECTURE 9. OBSERVATIONS ABOUT TESTING “ Testing is the process of executing a program with the intention of finding errors. ” – Myers.
1 Software Testing. 2 What is Software Testing ? Testing is a verification and validation activity that is performed by executing program code.
Automatic Verification
Objective of This Course
Alternating tree Automata and Parity games
Test Case Test case Describes an input Description and an expected output Description. Test case ID Section 1: Before execution Section 2: After execution.
An explicit state model checker
Presentation transcript:

Model Checking Doron A. Peled University of Warwick Coventry, UK

Modelling and specification for verification and validation How to specify what the software is supposed to do? How to model it in a way that allows us to check it?

Sequential systems. Perform some computational task. Have some initial condition, e.g.,  0  i  n A[i] integer. Have some final assertion, e.g.,  0  i  n-1 A[i]  A[i+1]. (What is the problem with this spec?) Are supposed to terminate.

Concurrent Systems Involve several computation agents. Termination may indicate an abnormal event (interrupt, strike). May exploit diverse computational power. May involve remote components. May interact with users (Reactive). May involve hardware components (Embedded).

Problems in modeling systems Representing concurrency: - Allow one transition at a time, or - Allow coinciding transitions. Granularity of transitions. Assignments and checks? Application of methods? Global (all the system) or local (one thread at a time) states.

Modeling. The states based model. V={v 0,v 1,v 2, …} - set of variables. p(v 0, v 1, …, v n ) - a parametrized assertion, e.g., v 0 =v 1 +v 2 /\ v 3 >v 4. A state is an assignment of values to the program variables. For example: s= For predicate (first order assertion) p: p (s ) is p under the assignment s. Example: p is x>y /\ y>z. s=. Then we have 4>3 /\ 3>5, which is false.

State space The state space of a program is the set of all possible states for it. For example, if V={a, b, c} and the variables are over the naturals, then the state space includes:,,, …

Atomic Transitions Each atomic transition represents a small piece of code such that no smaller piece of code is observable. Is a:=a+1 atomic? In some systems, e.g., when a is a register and the transition is executed using an inc command.

Non atomicity Execute the following when x=0 in two concurrent processes: P1:a=a+1 P2:a=a+1 Result: a=2. Is this always the case? Consider the actual translation: P1:load R1,a inc R1 store R1,a P2:load R2,a inc R2 store R2,a a may be also 1.

Scenario P1:load R1,a inc R1 store R1,a P2:load R2,a inc R2 store R2,a a=0 R1=0 R2=0 R1=1 R2=1 a=1

Representing transitions Each transition has two parts: The enabling condition: a predicate. The transformation: a multiple assignment. For example: a>b  (c,d):=(d,c) This transition can be executed in states where a>b. The result of executing it is switching the value of c with d.

Initial condition A predicate I. The program can start from states s such that I (s ) holds. For example: I (s )=a>b /\ b>c.

A transition system A (finite) set of variables V over some domain. A set of states . A (finite) set of transitions T, each transition e  t has an enabling condition e, and a transformation t. An initial condition I.

Example V={a, b, c, d, e}.  : all assignments of natural numbers for variables in V. T={c>0  (c,e):=(c-1,e+1), d>0  (d,e):=(d-1,e+1)} I : c=a /\ d=b /\ e=0 What does this transition system do?

The interleaving model An execution is a finite or infinite sequence of states s 0, s 1, s 2, … The initial state satisfies the initial condition, I.e., I (s 0 ). Moving from one state s i to s i+1 is by executing a transition e  t: E (s i ), I.e., s i satisfies e. s i+1 is obtained by applying t to s i.

Example: s0= s1= s2= s3= T={c>0  (c,e):=(c-1,e+1), d>0  (d,e):=(d-1,e+1)} I: c=a /\ d=b /\ e=0

L0:While True do NC0:wait(Turn=0); CR0:Turn=1 endwhile || L1:While True do NC1:wait(Turn=1); CR1:Turn=0 endwhile T0:PC0=L0  PC0:=NC0 T1:PC0=NC0/\Turn=0  PC0:=CR0 T2:PC0=CR0  (PC0,Turn):=(L0,1) T3:PC1=L1  PC1=NC1 T4:PC1=NC1/\Turn=1  PC1:=CR1 T5:PC1=CR1  (PC1,Turn):=(L1,0) Initially: PC0=L0/\PC1=L1 The transitions

The state graph:Successor relation between states. Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Some observations Executions : the set of maximal paths (finite or terminating in a node where nothing is enabled). Nondeterministic choice : when more than a single transition is enabled at a given state. We have a nondeterministic choice when at least one node at the state graph has more than one successor.

Always ¬(PC0=CR0/\PC1=CR1) (Mutual exclusion) Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Always if Turn=0 the at some point Turn=1 Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Always if Turn=0 the at some point Turn=1 Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Interleaving semantics: Execute one transition at a time. Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=1 L0,CR1 Turn=1 L0,NC1 Need to check the property for every possible interleaving!

Interleaving semantics Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=1 L0,CR1 Turn=1 L0,NC1 Turn=0 L0,L1 Turn=0 L0,NC1

L0:While True do NC0:wait(Turn=0); CR0:Turn=1 endwhile || L1:While True do NC1:wait(Turn=1); CR1:Turn=0 endwhile T0:PC0=L0  PC0:=NC0 T1:PC0=NC0/\Turn=0  PC0:=CR0 T1’:PC0=NC0/\Turn=1  PC0:=NC0 T2:PC0=CR0  (PC0,Turn):=(L0,1) T3:PC1==L1  PC1=NC1 T4:PC1=NC1/\Turn=1  PC1:=CR1 T4’:PC1=NC1/\Turn=0  PC1:=N1 T5:PC1=CR1  (PC1,Turn):=(L1,0) Initially: PC0=L0/\PC1=L1 Busy waiting

Always when Turn=0 then sometimes Turn=1 Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1 Now it does not hold! (Red subgraph generates a counterexample execution.)

How can we check the model? The model is a graph. The specification should refer the the graph representation. Apply graph theory algorithms.

What properties can we check? Invariants: a property that need to hold in each state. Deadlock detection: can we reach a state where the program is blocked? Dead code: does the program have parts that are never executed.

How to perform the checking? Apply a search strategy (Depth first search, Breadth first search). Check states/transitions during the search. If property does not hold, report counter example!

If it is so good, why learn deductive verification methods? Model checking works only for finite state systems. Would not work with Unconstrained integers. Unbounded message queues. General data structures: queues trees stacks parametric algorithms and systems.

The state space explosion Need to represent the state space of a program in the computer memory. Each state can be as big as the entire memory! Many states: Each integer variable has 2^32 possibilities. Two such variables have 2^64 possibilities. In concurrent protocols, the number of states usually grows exponentially with the number of processes.

If it is so constrained, is it of any use? Many protocols are finite state. Many programs or procedure are finite state in nature. Can use abstraction techniques. Sometimes it is possible to decompose a program, and prove part of it by model checking and part by theorem proving. Many techniques to reduce the state space explosion.

Depth First Search Program DFS For each s such that Init(s) dfs(s) end DFS Procedure dfs(s) for each s’ such that R(s,s’) do If new(s’) then dfs(s’) end dfs.

How can we check properties with DFS? Invariants: check that all reachable states satisfy the invariant property. If not, show a path from an initial state to a bad state. Deadlocks: check whether a state where no process can continue is reached. Dead code: as you progress with the DFS, mark all the transitions that are executed at least once.

¬ (PC0=CR0/\PC1=CR1) is an invariant! Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Want to do more! Want to check more properties. Want to have a unique algorithm to deal with all kinds of properties. This is done by writing specification in more complicated formalisms. We will see that in the next lecture.

[](Turn=0 --> <>Turn=1) Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1 init

Turn=0 L0,L1 Turn=1 L0,L1 init Add an additional initial node. Propositions are attached to incoming nodes. All nodes are accepting. Turn=1 L0,L1 Turn=0 L0,L1

Correctness condition We want to find a correctness condition for a model to satisfy a specification. Language of a model: L(Model) Language of a specification: L(Spec). We need: L(Model)  L(Spec).

Correctness All sequences Sequences satisfying Spec Program executions

How to prove correctness? Show that L(Model)  L(Spec). Equivalently: ______ Show that L(Model)  L(Spec) = Ø. Also: can obtain L(Spec) by translating from LTL!

What do we need to know? How to intersect two automata? How to complement an automaton? How to translate from LTL to an automaton?

Specification Formalisms

Properties of formalisms Formal. Unique interpretation. Intuitive. Simple to understand (visual). Succinct. Spec. of reasonable size. Effective. Check that there are no contradictions. Check that the spec. is implementable. Check that the implementation satisfies spec. Expressive. May be used to generate initial code. Specifying the implementation or its properties?

A transition system A (finite) set of variables V. A set of states . A (finite) set of transitions T, each transition e==>t has an enabling condition e and a transformation t. An initial condition I. Denote by R(s, s’) the fact that s’ is a successor of s.

The interleaving model An execution is a finite or infinite sequence of states s 0, s 1, s 2, … The initial state satisfies the initial condition, I.e., I (s 0 ). Moving from one state s i to s i+1 is by executing a transition e==>t: e(s i ), I.e., s i satisfies e. s i+1 is obtained by applying t to s i. Lets assume all sequences are infinite by extending finite ones by “stuttering” the last state.

Temporal logic Dynamic, speaks about several “worlds” and the relation between them. Our “worlds” are the states in an execution. There is a linear relation between them, each two sequences in our execution are ordered. Interpretation: over an execution, later over all executions.

LTL: Syntax  ::= (  ) | ¬  |  /\   \/  U   |O  | p  “box”, “always”, “forever”  “diamond”, “eventually”, “sometimes” O  “nexttime”  U  “until” Propositions p, q, r, … Each represents some state property (x>y+1, z=t, at-CR, etc.)

Semantics over suffixes of execution   O   U        

Combinations []<>p “p will happen infinitely often” <>[]p “p will happen from some point forever”. ([]<>p) --> ([]<>q) “If p happens infinitely often, then q also happens infinitely often”.

Some relations: [](a/\b)=([]a)/\([]b) But <>(a/\b)  (<>a)/\(<>b) <>(a\/b)=(<>a)\/(<>b) But [](a\/b)  ([]a)\/([]b) aba a abb b b a b

What about ([]<>A)/\([]<>B)=[]<>(A/\B)? ([]<>A)\/([]<>B)=[]<>(A\/B)? (<>[]A)/\(<>[]B)=<>[](A/\B)? (<>[]A)\/(<>[]B)=<>[](A\/B)? No, just <-- Yes!!! No, just -->

Can discard some operators Instead of <>p, write true U p. Instead of []p, we can write ¬<>¬p, or ¬(true U ¬p). Because []p=¬¬[]p. ¬[]p means it is not true that p holds forever, or at some point ¬p holds or <>¬p.

Formal semantic definition Let  be a sequence s 0 s 1 s 2 … Let  i be a suffix of  : s i s i+1 s i+2 … (  0 =  )  i |= p, where p a proposition, if s i |=p.  i |=  /\  if  i |=  and  i |= .  i |=  \/  if  i |=  or  i |= .  i |= <>  if for some j  i,  j |= .  i |= []  if for each j  i,  j |= .  i |=  U  if for some j  i,  j |=  and for each i  k<j,  k |= .

Then we interpret: s|=p as in propositional logic.  |=  is interpreted over a sequence, as in previous slide. P|=  holds if  |=  for every sequence  of P.

Spring Example s1s3s2 pull release extended malfunction extended r0 = s1 s2 s1 s2 s1 s2 s1 … r1 = s1 s2 s3 s3 s3 s3 s3 … r2 = s1 s2 s1 s2 s3 s3 s3 … …

LTL satisfaction by a single sequence malfunction s1s3s2 pull release extended r2 = s1 s2 s1 s2 s3 s3 s3 … r2 |= extended ?? r2 |= O extended ?? r2 |= O O extended ?? r2 |= <> extended ?? r2 |= [] extended ?? r2 |= <>[] extended ?? r2 |= ¬ <>[] extended ?? r2 |= (¬extended) U malfunction ?? r2 |= [](¬extended->O extended) ??

LTL satisfaction by a system malfunction s1s3s2 pull release extended P |= extended ?? P |= O extended ?? P |= O O extended ?? P |= <> extended ?? P|= [] extended ?? P |= <>[] extended ?? P |= ¬ <>[] extended ?? P |= (¬extended) U malfunction ?? P |= [](¬extended->O extended) ??

The state space Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

[] ¬(PC0=CR0/\PC1=CR1) (Mutual exclusion) Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

[](Turn=0 --> <>Turn=1) Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Interleaving semantics: Execute one transition at a time. Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=1 L0,CR1 Turn=1 L0,NC1 Need to check the property for every possible interleaving!

More specifications [](PC0=NC0  <> PC0=CR0) [](PC0=NC0 U Turn=0) Try at home: - The processes alternate in entering their critical sections. - Each process enters its critical section infinitely often.

Proof system ¬<>p []¬p [](p  q)  ([]p  []q) []p  (p/\O[]p) O¬p ¬Op [](p  Op)  (p  []p) (pUq) (q\/(p/\O(pUq))) (pUq)  <>q + propositional logic axiomatization. + axiom: p []p

Traffic light example Green --> Yellow --> Red --> Green Always has exactly one light: [](¬(gr/\ye)/\¬(ye/\re)/\¬(re/\gr)/\(gr\/ye\/re)) Correct change of color: []((grUye)\/(yeUre)\/(reUgr))

Another kind of traffic light Green-->Yellow-->Red-->Yellow-->Green First attempt: [](((gr\/re) U ye)\/(ye U (gr\/re))) Correct specification: []( (gr-->(gr U (ye /\ ( ye U re )))) /\(re-->(re U (ye /\ ( ye U gr )))) /\(ye-->(ye U (gr \/ re)))) Needed only when we can start with yellow

Properties of sequential programs init-when the program starts and satisfies the initial condition. finish-when the program terminates and nothing is enabled. Partial correctness: init/\[](finish   ) Termination: init/\<>finish Total correctness: init/\<>(finish/\  ) Invariant: init/\[] 

Automata over finite words A=  (finite) the alphabet, S (finite) the states.  : S x  x S is the transition relation I : S are the starting states F: S are the accepting states. a a b bS0 S1

The transition relation (S0, a, S0) (S0, b, S1) (S1, a, S0) (S1, b, S1) a a b bS0 S1

A run over a word A word over , e.g., abaab. A sequence of states, e.g. S0 S0 S1 S0 S0 S1. Starts with an initial state. Accepting if ends at accepting state. a a b bS0 S1

The language of an automaton The words that are accepted by the automaton. Includes aabbba, abbbba. Does not include abab, abbb. What is the language? a a b bS0 S1

Nondeterministic automaton Transitions: (S0,a,S0), (S0,b,S0), (S0,a,S1),(S1,a,S1). What is the language of this automaton? A,ba aS0 S1

Equivalent deterministic automaton a,ba aS0 S1 b a aS0 S1 b

Automata over infinite words Similar definition. Runs on infinite words over . Accepts when an accepting state occurs infinitely often in a run. a a b bS0 S1

Automata over infinite words Consider the word a b a b a b a b … There is a run S0 S0 S1 S0 S1 S0 S1 … This run in accepting, since S0 appears infinitely many times. A A B BS0 S1

Other runs For the word b b b b b … the run is S0 S1 S1 S1 S1… and is not accepting. For the word a a a b b b b b …, the run is S0 S0 S0 S0 S1 S1 S1 S1 … What is the run for a b a b b a b b b …? a a b bS0 S1

Nondeterministic automaton What is the language of this automaton? What is the LTL specification if b -- PC0=CR0, a=¬b? a,ba aS0 S1 Can you find a deterministic automaton with same language? Can you prove there is no such deterministic automaton?

Specification using Automata Let each letter correspond to some propositional property. Example: a -- P0 enters critical section, b -- P0 does not enter section. []<>PC0=CR0 a a b bS0 S1

Mutual Exclusion A -- PC0=CR0/\PC1=CR1 B -- ¬(PC0=CR0/\PC1=CR1) C -- TRUE []¬(PC0=CR0/\PC1=CR1) ba cS0 S1

L0:While True do NC0:wait(Turn=0); CR0:Turn=1 endwhile || L1:While True do NC1:wait(Turn=1); CR1:Turn=0 endwhile T0:PC0=L0==>PC0=NC0 T1:PC0=NC0/\Turn=0==> PC0:=CR0 T2:PC0=CR0==> (PC0,Turn):=(L0,1) T3:PC1==L1==>PC1=NC1 T4:PC1=NC1/\Turn=1==> PC1:=CR1 T5:PC1=CR1==> (PC1,Turn):=(L1,0) Initially: PC0=L0/\PC1=L1

The state space Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

[] ¬ (PC0=CR0/\PC1=CR1) Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

[](Turn=0 --> <>Turn=1) Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1

Correctness condition We want to find a correctness condition for a model to satisfy a specification. Language of a model: L(Model) Language of a specification: L(Spec). We need: L(Model)  L(Spec).

Correctness All sequences Sequences satisfying Spec Program executions

Incorrectness All sequences Sequences satisfying Spec Program executions Counter examples

Intersecting M 1 =(S 1, ,T 1,I 1,A 1 ) and M 2 =(S 2, ,T 2,I 2,S 2 ) Run the two automata in parallel. Each state is a pair of states: S 1 x S 2 Initial states are pairs of initials: I 1 x I 2 Acceptance depends on first component: A 1 x S 2 Conforms with transition relation: (x 1,y 1 )-a->(x 2,y 2 ) when x 1 -a->x 2 and y 1 -a->y 2.

Example (all states of second automaton accepting!) a b ct0 t1 a a b,c s0 s1 States: (s0,t0), (s0,t1), (s1,t0), (s1,t1). Accepting: (s0,t0), (s0,t1). Initial: (s0,t0).

a b ct0 t1 a a b,c s0 s1 s0,t0 s0,t1 s1,t1 s1,t0 b b a c a c Also state (s0,t1) is unreachable but we will keep it for the time being!

More complicated when A 2  S 2 a b ct0 t1 a a b,c s0 s1 Should we have acceptance when both components accepting? I.e., {(s0,t1)}? No, consider (ba)  It should be accepted, but never passes that state. s0,t0 s0,t1 s1,t1 b a c a c

More complicated when A 2  S 2 a b ct0 t1 A a b,c s0 s1 Should we have acceptance when at least one components is accepting? I.e., {(s0,t0),(s0,t1),(s1,t1)}? No, consider b c  It should not be accepted, but here will loop through (s1,t1) s0,t0 s0,t1 s1,t1 b c a c a

Intersection - general case Also: propositions on nodes q0q0 q2q2 q3q3 q1q1 A/\¬B q 0,q3 ¬A/\¬B q 1,q 3 ¬A/\B q 1,q 2 A/\¬B ¬A ¬A/\B A\/¬B q 0 and q 2 give false.

Version 0: to catch q 0 Version 1: to catch q 2 A/\¬B q 0,q3 ¬A/\¬B q 1,q 3 ¬A/\B q 1,q 2 A/\¬B q 0,q3 ¬A/\¬B q 1,q 3 ¬A/\B q 1,q 2 Move when see accepting of left (q 0 )Move when see accepting of right (q 2 ) Version 0 Version 1

Make an accepting state in one of the version according to a component accepting state A/\¬B q 0,q3,0 ¬A/\¬B q 1,q 3,0 ¬A/\B q 1,q 2,0 A/\¬B q 0,q 3,1 ¬A/\¬B q 1,q 3,1 ¬A/\B q 1,q 2,1 Version 1 Version 0

How to check for emptiness? s0,t0 s0,t1 s1,t1 b b a c a c

Emptiness... Need to check if there exists an accepting run (passes through an accepting state infinitely often).

Finding accepting runs If there is an accepting run, then at least one accepting state repeats on it forever. This state appears on a cycle. So, find a reachable accepting state on a cycle.

Equivalently... A strongly connected component: a set of nodes where each node is reachable by a path from each other node. Find a reachable strongly connected component with an accepting node.

How to complement? Complementation is hard! Can ask for the negated property (the sequences that should never occur). Can translate from LTL formula  to automaton A, and complement A. But: can translate ¬  into an automaton directly!

Model Checking under Fairness Express the fairness as a property φ. To prove a property ψ under fairness, model check φ  ψ. Fair (φ) Bad (¬ψ)Program Counter example

Model Checking under Fairness Specialize model checking. For weak process fairness: search for a reachable strongly connected component, where for each process P either it contains on occurrence of a transition from P, or it contains a state where P is disabled.

Dekker’s algorithm P1::while true do begin non-critical section 1 c1:=0; while c2=0 do begin if turn=2 then begin c1:=1; wait until turn=1; c1:=0; end end critical section 1 c1:=1; turn:=2 end. P2::while true do begin non-critical section 2 c2:=0; while c1=0 do begin if turn=1 then begin c2:=1; wait until turn=2; c2:=0; end end critical section 2 c2:=1; turn:=1 end. boolean c1 initially 1; boolean c2 initially 1; integer (1..2) turn initially 1;

Dekker’s algorithm P1::while true do begin non-critical section 1 c1:=0; while c2=0 do begin if turn=2 then begin c1:=1; wait until turn=1; c1:=0; end end critical section 1 c1:=1; turn:=2 end. P2::while true do begin non-critical section 2 c2:=0; while c1=0 do begin if turn=1 then begin c2:=1; wait until turn=2; c2:=0; end end critical section 2 c2:=1; turn:=1 end. boolean c1 initially 1; boolean c2 initially 1; integer (1..2) turn initially 1; c1=c2=0, turn=1

Dekker’s algorithm P1::while true do begin non-critical section 1 c1:=0; while c2=0 do begin if turn=2 then begin c1:=1; wait until turn=1; c1:=0; end end critical section 1 c1:=1; turn:=2 end. P2::while true do begin non-critical section 2 c2:=0; while c1=0 do begin if turn=1 then begin c2:=1; wait until turn=2; c2:=0; end end critical section 2 c2:=1; turn:=1 end. boolean c1 initially 1; boolean c2 initially 1; integer (1..2) turn initially 1; c1=c2=0, turn=1

Dekker’s algorithm P1::while true do begin non-critical section 1 c1:=0; while c2=0 do begin if turn=2 then begin c1:=1; wait until turn=1; c1:=0; end end critical section 1 c1:=1; turn:=2 end. P2::while true do begin non-critical section 2 c2:=0; while c1=0 do begin if turn=1 then begin c2:=1; wait until turn=2; c2:=0; end end critical section 2 c2:=1; turn:=1 end. P1 waits for P2 to set c2 to 1 again. Since turn=1 (priority for P1), P2 is ready to do that. But never gets the chance, since P1 is constantly active checking c2 in its while loop. c1=c2=0, turn=1

What went wrong? The execution is unfair to P2. It is not allowed a chance to execute. Such an execution is due to the interleaving model (just picking an enabled transition. If it did, it would continue and set c2 to 0, which would allow P1 to progress. Fairness = excluding some of the executions in the interleaving model, which do not correspond to actual behavior of the system. while c1=0 do begin if turn=1 then begin c2:=1; wait until turn=2; c2:=0; end end

Recall: The interleaving model An execution is a finite or infinite sequence of states s 0, s 1, s 2, … The initial state satisfies the initial condition, I.e., I (s 0 ). Moving from one state s i to s i+1 is by executing a transition e  t: e(s i ), I.e., s i satisfies e. s i+1 is obtained by applying t to s i. Now: consider only “fair” executions. Fairness constrains sequences that are considered to be executions. Fair executions Sequences Executions

Some fairness definitions Weak transition fairness: It cannot happen that a transition is enabled indefinitely, but is never executed. Weak process fairness: It cannot happen that a process is enabled indefinitely, but non of its transitions is ever executed Strong transition fairness: If a transition is infinitely often enabled, it will get executed. Strong process fairness: If at least one transition of a process is infinitely often enabled, a transition of this process will be executed.

How to use fairness constraints? Assume in the negative that some property (e.g., termination) does not hold, because of some transitions or processes prevented from execution. Show that such executions are impossible, as they contradict fairness assumption. We sometimes do not know in reality which fairness constraint is guaranteed.

Example P1::x=1 P2: do :: y==0  if :: true :: x==1  y=1 fi od Initially: x=0; y=0; In order for the loop to terminate we need P1 to execute the assignment. But P1 may never execute, since P2 is in a loop executing true. Consequently, x==1 never holds, and y is never assigned a 1. pc1=l0-->(pc1,x):=(l1,1) /* x=1 */ pc2=r0/\y=0-->pc2=r1 /* y==0*/ pc2=r1  pc2=r0 /* true */ pc2=r1/\x=1  (pc2,y):=(r0,1) /* x==1 --> y:=1 */

Weak transition fairness P1::x=1 P2: do :: y==0  if :: true :: x==1  y=1 fi od Initially: x=0; y=0; Under weak transition fairness, P1 would assign 1 to x, but this does not guarantee that 1 is assigned to y and thus the P2 loop will terminates, since the transition for checking x==1 is not continuously enabled (program counter not always there). Weak process fairness only guarantees P2 to execute, but it can still choose to do the true. Strong process fairness: same.

Strong transition fairness P1::x=1P2: do :: y==0  if :: true :: x==1  y=1 fi od Initially: x=0; y=0; Under strong transition fairness, P1 would assign 1 to x. If the execution was infinite, the transition checking x==1 was infinitely often enabled. Hence it would be eventually selected. Then assigning y=1, the main loop is not enabled anymore.

Some fairness definitions Weak transition fairness: /\   T (<>[]en   []<>exec  ). Equivalently: /\ a  T ¬<>[](en  /\¬exec  ) Weak process fairness: /\ Pi (<>[]en Pi  []<>exec Pi ) Strong transition fairness: /\   T ([]<>en   []<>exec  ) Strong process fairness: /\ Pi ([]<>en Pi  []<>exec Pi ) exec   is executed. exec Pi some transition of Pi is executed. en   is enabled. en Pi some transition of process Pi is enabled. en Pi = \/   T en  exec Pi = \/   T exec 

“Weaker fairness condition” A is weaker than B if B  A. Consider the executions L(A) and L(B). Then L(B)  L(A). An execution is strong {process,transition} fair implies that it is also weak {process,transition} fair. There are fewer strong {process,transition} fair executions. Strong transition fair execs Weak process fair execs Weak transition fair execs Strong process fair execs

Model Checking under fairness Instead of verifying that the program satisfies , verify it satisfies fair   Problem: may be inefficient. Also fairness formula may involves special arrangement for specifying what exec means. May specialize model checking algorithm instead.

Model Checking under Fairness Specialize model checking. For weak process fairness: search for a reachable strongly connected component, where for each process P either it contains on occurrence of a transition from P, or it contains a state where P is disabled. Weak transition fairness: similar. Strong fairness: much more difficult algorithm.

Abstractions

Problems with software analysis Many possible outcomes and interactions. Not manageable by an algorithm (undecideable, complex). Requires a lot of practice and ingenuity (e.g., finding invariants).

More problems Testing methods fail to cover potential errors. Deductive verification techniques require too much time, mathematical expertise, ingenuity. Model checking requires a lot of time/space and may introduce modeling errors.

How to alleviate the complexity? Abstraction Compositionality Partial Order Reduction Symmetry

Abstraction Represent the program using a smaller model. Pay attention to preserving the checked properties. Do not affect the flow of control.

Main idea Use smaller data objects. x:= f(m) y:=g(n) if x*y>0 then … else … x, y never used again.

How to abstract? Assign values {-1, 0, 1} to x and y. Based on the following connection: sgn(x) = 1 if x>0, 0 if x=0, and -1 if x<0. sgn(x)*sgn(y)=sgn(x*y).

Abstraction mapping S - states, I - initial states. L(s) - labeling. R(S,S) - transition relation. h(s) maps s into its abstract image. Full model --h--> Abstract model I(s) --> I(h(s)) R(s, t) --> R(h(s),h(t)) L(h(s))=L(s)

Traffic light example go stop

go stop go stop

What do we preserve? go stop go stop Every execution of the full model can be simulated by an execution of the reduced one. Every LTL property that holds in the reduced model hold in the full one. But there can be properties holding for the original model but not the abstract one.

Preserved: [](go->O stop) go stop go stop Not preserved: []<>go Counterexamples need to be checked.

Symmetry A permutation is a one-one and onto function p:A->A. For example, 1->3, 2->4, 3->1, 4->5, 5->2. One can combine permutations, e.g., p1: 1->3, 2->1, 3->2 p2: 1->2, 2->1, 3->3 1->3, 2->2, 3->1 A set of permutations is called a symmetry group.

Using symmetry in analysis Want to find some symmetry group such that for each permutation p in it, R(s,t) if and only if R(p(s), p(t)) and L(p(s))=L(s). Let K(s) be all the states that can be permuted to s. This is a set of states such that each one can be permuted to the other.

Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 Turn=1 L0,CR1 Turn=1 NC0,CR1 Turn=1 L0,NC1 Turn=1 NC0,NC1 Turn=1 NC0,L1 Turn=1 L0,L1 init

Turn=0 L0,L1 Turn=0 L0,NC1 Turn=0 NC0,L1 Turn=0 CR0,NC1 Turn=0 NC0,NC1 Turn=0 CR0,L1 init The quotient model

What is preserved in the following buffer abstraction? What is not preserved? e empty quasi full q q q f

Translating from logic to automata Comment: the language of LTL is a proper subset of the language of Buchi Automata. For example, one cannot express in LTL the following automaton [Wolper]: a a,¬a

Why translating? Want to write the specification in some logic. Want model-checking tools to be able to check the specification automatically.

Preprocessing Convert into normal form, where negation only applies to propositional variables. ¬[]  becomes <>¬ . ¬<>  becomes [] ¬ . What about ¬ (  U  )? Define operator V such that ¬ (  U  ) = (¬  ) R (¬  ), ¬ (  R  ) = (¬  ) U (¬  ).

Semantics of pR q (the Release operator, dual to Until) p qqqqqqq q qq qqqq ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p ¬p¬p

Replace ¬ T by F, and ¬ F by T. Replace ¬ (  \/  ) by ( ¬  ) /\ ( ¬  ) and ¬ (  /\  ) by ( ¬  ) \/ ( ¬  )

Eliminate implications, <>, [] Replace  ->  by ( ¬  ) \/ . Replace <>  by (T U  ). Replace []  by (F R  ).

Example Translate ( []<>P )  ( []<>Q ) Eliminate implication ¬ ( []<>P ) \/ ( []<>Q ) Eliminate [], <>: ¬ ( F R ( T U P ) ) \/ ( F R ( T U Q ) ) Push negation inwards: (T U (F R ¬ P ) ) \/ ( F R ( T U Q ) )

The data structure Incoming NewOld Next Name

The main idea  U  =  \/ (  /\ O (  U  ) )  R  =  /\ (  \/ O (  R  ) ) This separates the formulas to two parts: one holds in the current state, and the other in the next state.

How to translate? Take one formula from “New” and add it to “Old”. According to the formula, either Split the current node into two, or Evolve the node into a new version.

Splitting Incoming New Old Next Incoming New Old Next Incoming New Old Next Copy incoming edges, update other field.

Evolving Incoming New Old Next Incoming New Old Next Copy incoming edges, update other field.

Possible cases:  U , split: Add  to New, add  U  to Next. Add  to New. Because  U  =  \/ (  /\ O (  U  )).  R , split: Add  to New. Add  to New,  R  to Next. Because  R  =  /\ (  \/ O (  R  )).

More cases:  \/ , split: Add  to New. Add  to New.  /\ , evolve: Add  to New. O , evolve: Add  to Next.

How to start? Incoming NewOld Next init aU(bUc)

Incoming init aU(bUc) Incoming aU(bUc) bUcbUc a init

Incoming aU(bUc)bUcbUc init Incoming aU(bUc) c (bUc) b init

When to stop splitting? When “New” is empty. Then compare against a list of existing nodes “Nodes”: If such a with same “Old”, “Next” exists, just add the incoming edges of the new version to the old one. Otherwise, add the node to “Nodes”. Generate a successor with “New” set to “Next” of father.

Incoming a,aU(bUc) aU(bUc) init Incoming aU(bUc) Creating a successor node.

How to obtain the automaton? There is an edge from node X to Y labeled with propositions P (negated or non negated), if X is in the incoming list of Y, and Y has propositions P in field “Old”. Initial nodes are those marked as init. Incoming NewOld Next X Node Y a, b, ¬c

The resulted nodes. a, aU(bUc) b, bUc, aU(bUc)c, bUc, aU(bUc) b, bUcc, bUc

a, aU(bUc) b, bUc, aU(bUc)c, bUc, aU(bUc) b, bUcc, bUc Initial nodes: All nodes with incoming edge from “init”.

Acceptance conditions Use “generalized Buchi automata”, where there are several acceptance sets F 1, F 2, …, F n, and each accepted infinite sequence must include at least one state from each set infinitely often. Each set corresponds to a subformula of form  U . Guarantees that it is never the case that  U  holds forever, without . Translate to Buchi acceptance condition (will be shown later).

Accepting w.r.t. bUc (green) a, aU(bUc) b, bUc, aU(bUc)c, bUc, aU(bUc) b, bUcc, bUc All nodes with c, or without bUc.

Acceptance w.r.t. aU(bUc) (blue) a, aU(bUc) b, bUc, aU(bUc)c, bUc, aU(bUc) b, bUcc, bUc All nodes with bUc or without aU(bUc).

Translating Multiple Buchi into Buchi automata For acceptance sets F 1, F 2, …, F n, generate n copies of the automaton. Move from the ith copy to the i+1th copy when passing a state of F i. In the first copy, the states of F 1 are accepting. Same intuition as the unrestricted intersection: Need to see each accepting set infinitely often, and it does not matter that we miss some of them (there must be an infinite supply anyway…).

Conclusions Model checking: automatic verification of finite state systems. Modeling: affects how we can view and check the system’s properties. Specification formalisms, e.g., LTL or Buchi automata Apply graph algorithms (or BDD, or BMD). Methods to alleviate state space explosion. Translate LTL into Buchi automata.

Algorithmic Testing and Black Box Checking

Why testing? Reduce design/programming errors. Can be done during development, before production/marketing. Practical, simple to do. Check the real thing, not a model. Scales up reasonably. Being state of the practice for decades.

Part 1: Testing of black box finite state machine Know: lTransition relation lSize or bound on size Wants to know: lIn what state we started? lIn what state we are? lTransition relation lConformance lSatisfaction of a temporal property

Finite automata (Mealy machines) S - finite set of states. (size n)  – set of inputs. (size d) O – set of outputs, for each transition. (s 0  S - initial state).    S    S - transition relation.   S    O – output on edge.

Why deterministic machines? Otherwise no amount of experiments would guarantee anything. If dependent on some parameter (e.g., temperature), we can determinize, by taking parameter as additional input. We still can model concurrent system. It means just that the transitions are deterministic. All kinds of equivalences are unified into language equivalence. Also: connected machine (otherwise we may never get to the completely separate parts).

Determinism When the black box is nondeterministic, we might never test some choices. b/1 a/1

Preliminaries: separating sequences s1s1 s3s3 s2s2 a/0 b/1 b/0 b/1 a/0 Start with one block containing all states {s 1, s 2, s 3 }.

A: separate to blocks of states with different output. s1s1 s3s3 s2s2 a/0 b/1 b/0 b/1 a/0 Two sets, separated using the string b {s 1, s 3 }, {s 2 }.

Repeat B: Separate blocks based on moving to different blocks. s1s1 s3s3 s2s2 a/0 b/1 b/0 b/1 a/0 Separate first block using b to three singleton blocks. Separating sequences: b, bb. Max rounds: n-1, sequences: n-1, length: n-1. For each pair of states there is a separating sequence.

Want to know the state of the machine (at end). Homing sequence. Depending on output, would know in what state we are. Algorithm: Put all the states in one block (initially we do not know what is the state). Then repeatedly partitions blocks of states, as long as they are not singletons, as follows: Take a non singleton block, append a distinguishing sequence  that separates at least two states. Update all blocks to the states after executing . Max length: (n-1) 2 (Lower bound: n(n-1)/2.)

Example (homing sequence) s1s1 s3s3 s2s2 a/0 b/1 b/0 b/1 a/0 {s 1, s 2, s 3 } {s 1, s 2 } {s 3 } {s 1 } {s 2 } {s 3 } b b On input b and output 1, still don’t know if was in s 1 or s 3, i.e., if currently in s 2 or s 1. So separate these cases with another b.

Synchronizing sequence One sequence takes the machine to the same final state, regardless of the initial state or the outputs. Not every machine has a synchronizing sequence. Can be checked whether exists and can be found in polynomial time.

State identification: Want to know in which state the system has started (was reset). Can be a preset distinguishing sequence (fixed), or a tree (adaptive). May not exist (PSPACE complete to check if preset exists, polynomial for adaptive). Best known algorithm: exponential length for preset, polynomial for adaptive [LY].

Sometimes cannot identify initial state b/1 a/1 s1s1 s3s3 s2s2 b/0 b/1 a/1 Start with a: in case of being in s 1 or s 3 we’ll move to s 1 and cannot distinguish. Start with b: In case of being in s 1 or s 2 we’ll move to s 2 and cannot distinguish. The kind of experiment we do affects what we can distinguish. Much like the Heisenberg principle in Physics.

Conformance testing Unknown deterministic finite state system B. Known: n states and alphabet . An abstract model C of B. C satisfies all the properties we want from B. C has m states. Check conformance of B and C. Another version: only a bound n on the number of states l is known. 

Check conformance with a given state machine Black box machine has no more states than specification machine (errors are mistakes in outputs, mistargeted edges). Specification machine is reduced, connected, deterministic. Machine resets reliably to a single initial state (or use homing sequence). s1s1 s3s3 s2s2 a/1 b/0 b/1 a/1 ?=?= b/1

Conformance testing [Ch,V] a/1 b/1 Cannot distinguish if reduced or not.   a/1 b/1 a/1 b/1 a/1 b/1 a/1 b/1

Conformance testing (cont.) a bb a a a a b b b a Need: bound on number of states of B.   a

Preparation: Construct a spanning tree b/1 a/1 s1s1 s3s3 s2s2 b/0 b/1 a/1 s1s1 s2s2 s3s3 b/1a/1

How the algorithm works? According to the spanning tree, force a sequence of inputs to go to each state. 1. From each state, perform the distinguishing sequences. 2. From each state, make a single transition, check output, and use distinguishing sequences to check that in correct target state. s1s1 s2s2 s3s3 b/1a/1 Reset or homing Distinguishing sequences

Comments 1. Checking the different distinguishing sequences (m-1 of them) means each time resetting and returning to the state under experiment. 2. A reset can be performed to a distinguished state through a homing sequence. Then we can perform a sequence that brings us to the distinguished initial state. 3. Since there are no more than m states, and according to the experiment, no less than m states, there are m states exactly. 4. Isomorphism between the transition relation is found, hence from minimality the two automata recognize the same languages.

Combination lock automaton Assume accepting states. Accepts only words with a specific suffix (cdab in the example). s1s1 s2s2 s3s3 s4s4 s5s5 bdca Any other input

When only a bound on size of black box is known… Black box can “pretend” to behave as a specification automaton for a long time, then upon using the right combination, make a mistake. b/1 a/1 s1s1 s3s3 s2s2 b/0 b/1 a/1 b/1 Pretends to be S 3 Pretends to be S 1 a/1

Conformance testing algorithm [VC] The worst that can happen is a combination lock automaton that behaves differently only in the last state. The length of it is the difference between the size n of the black box and the specification m. Reach every state on the spanning tree and check every word of length n-m+1 or less. Check that after the combination we are at the state we are supposed to be, using the distinguishing sequences. No need to check transitions: already included in above check. Complexity: m 2 n d n-m+1 Probabilistic complexity: Polynomial. Distinguishing sequences s1s1 s2s2 s3s3 b/1a/1 Words of length  n-m+1 Reset or homing

Model Checking Finite state description of a system B. LTL formula . Translate  into an automaton P. Check whether L(B)  L(P)= . If so, S satisfies . Otherwise, the intersection includes a counterexample. Repeat for different properties.  

Buchi automata (  -automata) S - finite set of states. (B has l  n states) S 0  S - initial states. (P has m states)  - finite alphabet. (contains p letters)    S    S - transition relation. F  S - accepting states. Accepting run: passes a state in F infinitely often. System automata: F=S, deterministic, one initial state. Property automaton: not necessarily deterministic.

Example: check a  a, a aa a <>  a

Example: check <>  a aa aa a a  <>  a

Example: check  <>a Use automatic translation algorithms, e.g., [Gerth,Peled,Vardi,Wolper 95] aa aa  a, a <>   a

System cb a

Every element in the product is a counter example for the checked property. cb a aa aa a a s1s1 s2s2 s3s3 q2q2 q1q1 s 1,q 1 s 1,q 2 s 3,q 2 s 2,q 1 a b c a Acceptance is determined by automaton P.  <>  a

Model Checking / Testing Given Finite state system B. Transition relation of B known. Property represent by automaton P. Check if L(B)  L(P)= . Graph theory or BDD techniques. Complexity: polynomial. Unknown Finite state system B. Alphabet and number of states of B or upper bound known. Specification given as an abstract system C. Check if B  C. Complexity: polynomial if number states known. Exponential otherwise.

Black box checking [PVY] Property represent by automaton P. Check if L(B)  L(P)= . Graph theory techniques. Unknown Finite state system B. Alphabet and Upper bound on Number of states of B known. Complexity: exponential.  

Experiments aa bb cc reset a a b b c c try b a a b b c c try c fail

Simpler problem: deadlock? Nondeterministic algorithm: guess a path of length  n from the initial state to a deadlock state. Linear time, logarithmic space. Deterministic algorithm: systematically try paths of length  n, one after the other (and use reset), until deadlock is reached. Exponential time, linear space.

Deadlock complexity Nondeterministic algorithm: Linear time, logarithmic space. Deterministic algorithm: Exponential (p n-1 ) time, linear space. Lower bound: Exponential time (use combination lock automata). How does this conform with what we know about complexity theory?

Modeling black box checking Cannot model using Turing machines: not all the information about B is given. Only certain experiments are allowed. We learn the model as we make the experiments. Can use the model of games of incomplete information.

Games of incomplete information Two players:  player,  player (here, deterministic). Finitely many configurations C. Including: Initial C i, Winning : W + and W -. An equivalence relation  on C (the  player cannot distinguish between equivalent states). Labels L on moves (try a, reset, success, fail). The  player has the moves labeled the same from configurations that are equivalent. Deterministic strategy for the  player: will lead to a configuration in W +  W -. Cannot distinguish between equivalent configurations. Nondeterministic strategy: Can distinguish between equivalent configurations..

Modeling BBC as games Each configuration contains an automaton and its current state (and more). Moves of the  player are labeled with try a, reset... Moves of the  -player with success, fail. c 1  c 2 when the automata in c 1 and c 2 would respond in the same way to the experiments so far.

A naive strategy for BBC Learn first the structure of the black box. Then apply the intersection. Enumerate automata with  n states (without repeating isomorphic automata). For a current automata and new automata, construct a distinguishing sequence. Only one of them survives. Complexity: O((n+1) p (n+1) /n!)

On-the-fly strategy Systematically (as in the deadlock case), find two sequences v 1 and v 2 of length <=m n. Applying v 1 to P brings us to a state t that is accepting. Applying v 2 to P brings us back to t. Apply v 1 v 2 n to B. If this succeeds, there is a cycle in the intersection labeled with v 2, with t as the P (accepting) component. Complexity: O(n 2 p 2mn m). v1v1 v2v2

Learning an automaton Use Angluin’s algorithm for learning an automaton. The learning algorithm queries whether some strings are in the automaton B. It can also conjecture an automaton M i and asks for a counterexample. It then generates an automaton with more states M i+1 and so forth.

A strategy based on learning Start the learning algorithm. Queries are just experiments to B. For a conjectured automaton M i, check if M i  P =  If so, we check conformance of M i with B ([VC] algorithm). If nonempty, it contains some v 1 v 2 . We test B with v 1 v 2 n. If this succeeds: error, otherwise, this is a counterexample for M i.

Complexity l - actual size of B. n - an upper bound of size of B. d - size of alphabet. Lower bound: reachability is similar to deadlock. O(l 3 d l + l 2 mn) if there is an error. O(l 3 d l + l 2 n d n-l+1 + l 2 mn) if there is no error. If n is not known, check while time allows. Probabilistic complexity: polynomial.

Some experiments Basic system written in SML (by Alex Groce, CMU). Experiment with black box using Unix I/O. Allows model-free model checking of C code with inter-process communication. Compiling tested code in SML with BBC program as one process.

Conclusions Black Box Checking: automatic verification of unspecified system. A hard problem, exponential in number of states, but polynomial on average. Implemented, tested. Another use: when the model is given, but is not exact.

The rest is not part of this lecture

Part 2: Software testing Testing is not about showing that there are no errors in the program. Testing cannot show that the program performs its intended goal correctly. So, what is software testing? Testing is the process of executing the program in order to find errors. A successful test is one that finds an error.

Some software testing stages Unit testing – the lowest level, testing some procedures. Integration testing – different pieces of code. System testing – testing a system as a whole. Acceptance testing – performed by the customer. Regression testing – performed after updates. Stress testing – checking the code under extreme conditions. Mutation testing – testing the quality of the test suite.

Some drawbacks of testing There are never sufficiently many test cases. Testing does not find all the errors. Testing is not trivial and requires considerable time and effort. Testing is still a largely informal task.

Black-Box (data-driven, input- output) testing The testing is not based on the structure of the program (which is unknown). In order to ensure correctness, every possible input needs to be tested - this is impossible! The goal: to maximize the number of errors found.

testing Is based on the internal structure of the program. There are several alternative criterions for checking “enough” paths in the program. Even checking all paths (highly impractical) does not guarantee finding all errors (e.g., missing paths!)

Some testing principles A programmer should not test his/her own program. One should test not only that the program does what it is supposed to do, but that it does not do what it is not supposed to. The goal of testing is to find errors, not to show that the program is errorless. No amount of testing can guarantee error-free program. Parts of programs where a lot of errors have already been found are a good place to look for more errors. The goal is not to humiliate the programmer!

Inspections and Walkthroughs Manual testing methods. Done by a team of people. Performed at a meeting (brainstorming). Takes minutes. Can find 30%-70% of errors.

Code Inspection Team of 3-5 people. One is the moderator. He distributes materials and records the errors. The programmer explains the program line by line. Questions are raised. The program is analyzed w.r.t. a checklist of errors.

Checklist for inspections Data declaration All variables declared? Default values understood? Arrays and strings initialized? Variables with similar names? Correct initialization? Control flow Each loop terminates? DO/END statements match? Input/output OPEN statements correct? Format specification correct? End-of-file case handled?

Walkthrough Team of 3-5 people. Moderator, as before. Secretary, records errors. Tester, play the role of a computer on some test suits on paper and board.

Selection of test cases (for white-box testing) The main problem is to select a good coverage criterion. Some options are: Cover all paths of the program. Execute every statement at least once. Each decision has a true or false value at least once. Each condition is taking each truth value at least once. Check all possible combinations of conditions in each decision.

Cover all the paths of the program Infeasible. Consider the flow diagram on the left. It corresponds to a loop. The loop body has 5 paths. If the loops executes 20 times there are 5^20 different paths! May also be unbounded!

How to cover the executions? IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END; Choose values for A,B,X. Value of X may change, depending on A,B. What do we want to cover? Paths? Statements? Conditions?

Statement coverage Execute every statement at least once By choosing A=2,B=0,X=3 each statement will be chosen. The case where the tests fail is not checked! IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END; Now x=1.5

Decision coverage Each decision has a true and false outcome at least once. Can be achieved using A=3,B=0,X=3 A=2,B=1,X=1 Problem: Does not test individual conditions. E.g., when X>1 is erroneous in second decision. IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END;

Decision coverage A=3,B=0,X=3  IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END; Now x=1

Decision coverage A=2,B=1,X=1  The case where A  1 and the case where x>1 where not checked! IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END;

Condition coverage Each condition has a true and false value at least once. For example: A=1,B=0,X=3 A=2,B=1,X=0 lets each condition be true and false once. Problem:covers only the path where the first test fails and the second succeeds. (A>1) IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END;

Condition coverage A=1,B=0,X=3  (A>1) IF (A>1) & (B=0) THEN X=X/A; END; IF (A=2) | (X>1) THEN X=X+1; END;

Condition coverage A=2,B=1,X=0  Did not check the first THEN part at all!!! Can use condition+decisio n coverage. (A>1) IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END;

Multiple Condition Coverage Test all combinations of all conditions in each test. A>1,B=0 A>1,B≠0 A  1,B=0 A  1,B≠0 A=2,X>1 A=2,X  1 A≠2,X>1 A≠2,X  1 IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END;

A smaller number of cases: A=2,B=0,X=4 A=2,B=1,X=1 A=1,B=0,X=2 A=1,B=1,X=1 Note the X=4 in the first case: it is due to the fact that X changes before being used! IF (A>1)&(B=0) THEN X=X/A; END; IF (A=2)|(X>1) THEN X=X+1; END; Further optimization: not all combinations. For C /\ D, check (C, D), (  C, D), (C,  D). For C \/ D, check (  C,  D), (  C, D), (C,  D).

Preliminary: Relativizing assertions  (B) : x1= y1 * x2 + y2 /\ y2 >= 0 Relativize  B) w.r.t. the assignment becomes  B) [Y\g(X,Y)]  e  (  B) expressed w.r.t. variables at A.)  (B) A =  x1=0 * x2 + x1 /\ x1>=0 Think about two sets of variables, before={x, y, z, …} after={x’,y’,z’…}. Rewrite  (B) using after, and the assignment as a relation between the set of variables. Then eliminate after. Here: x1’=y1’ * x2’ + y2’ /\ y2’>=0 /\ x1=x1’ /\ x2=x2’ /\ y1’=0 /\ y2’=x1 now eliminate x1’, x2’, y1’, y2’. (y1,y2)=(0,x1) A B A B Y=g(X,Y)

Verification conditions: tests  C)   B)= t(X,Y) /\  C)  D)   B)=  t(X,Y) /\  D)  B)=  D) /\  y2  x2 y2>=x2 B C D B C D t(X,Y) F T FT

How to find values for coverage? Put true at end of path. Propagate path backwards. On assignment, relativize expression. On “yes” edge of decision, add decision as conjunction. On “no” edge, add negation of decision as conjunction. Can be more specific when calculating condition with multiple condition coverage. A>1 & B=0 A=2 | X>1 X=X+1 X=X/A no yes true

How to find values for coverage? A>1 & B=0 A=2 | X>1 X=X+1 X=X/A no yes true A≠2 /\ X>1 (A≠2 /\ X/A>1) /\ (A>1 & B=0) A≠2 /\ X/A>1 Need to find a satisfying assignment: A=3, X=6, B=0 Can also calculate path condition forwards.

How to cover a flow chart? Cover all nodes, e.g., using search strategies: DFS, BFS. Cover all paths (usually impractical). Cover each adjacent sequence of N nodes. Probabilistic testing. Using random number generator simulation. Based on typical use. Chinese Postman: minimize edge traversal Find minimal number of times time to travel each edge using linear programming or dataflow algorithms. Duplicate edges and find an Euler path.

Test cases based on data-flow analysis Partition the program into pieces of code with a single entry/exit point. For each piece find which variables are set/used/tested. Various covering criteria: from each set to each use/test From each set to some use/test. X:=3 z:=z+x x>y t>y

Test case design for black box testing Equivalence partition Boundary value analysis Cause-effect graphs

Equivalence partition Goals: Find a small number of test cases. Cover as much possibilities as you can. Try to group together inputs for which the program is likely to behave the same. Specification condition Valid equivalence class Invalid equivalence class

Example: A legal variable Begins with A-Z Contains [A-Z0-9] Has 1-6 characters. Specification condition Valid equivalence class Invalid equivalence class Starting char Chars Length Starts A-ZStarts other [A-Z0-9]Has others 1-6 chars0 chars, >6 chars

Equivalence partition (cont.) Add a new test case until all valid equivalence classes have been covered. A test case can cover multiple such classes. Add a new test case until all invalid equivalence class have been covered. Each test case can cover only one such class. Specification condition Valid equivalence class Invalid equivalence class

Example AB36P (1,3,5) 1XY12 (2) A17#%X (4) Specification condition Valid equivalence class Invalid equivalence class Starting char Chars Length Starts A-ZStarts other [A-Z0-9]Has others 1-6 chars0 chars, >6 chars (6) VERYLONG (7)

Boundary value analysis In every element class, select values that are closed to the boundary. If input is within range -1.0 to +1.0, select values , -1.0, , 0.999, 1.0, If needs to read N data elements, check with N-1, N, N+1. Also, check with N=0.

Test case generation based on LTL specification Compiler Model Checker Path condition calculation First order instantiator Test monitoring Transitions Path Flow chart LTL  Aut

Goals Verification of software. Compositional verification. Only a unit of code. Parametrized verification. Generating test cases. A path found with some truth assignment satisfying the path condition. In deterministic code, this assignment guarantees to derive the execution of the path. In nondeterministic code, this is one of the possibilities. Can transform the code to force replying the path.

Divide and Conquer property automaton flow chart Intersect property automaton with the flow chart, regardless of the statements and program variables expressions. Add assertions from the property automaton to further restrict the path condition. Calculate path conditions for sequences found in the intersection. Calculate path conditions on-the-fly. Backtrack when condition is false. Thus, advantage to forward calculation of path conditions (incrementally).

Spec: ¬at l 2 U (at l 2 /\  ¬at l 2 /\(¬at l 2 U at l 2 )) ¬ at l 2 at l 2 ¬ at l 2 at l 2 l 2 :x:=x+z l 3 :x<t l1:…l1:… l 2 :x:=x+z l 3 :x<t l 2 :x:=x+z X =

Spec: ¬at l 2 U (at l 2 /\ x  y /\  (¬at l 2 /\(¬at l 2 U at l 2 /\ x  2  y ))) ¬ at l 2 at l 2 /\ x  y ¬ at l 2 at l 2 /\ x  2  y l 2 :x:=x+z l 3 :x<t l1:…l1:… l 2 :x:=x+z l 3 :x<t l 2 :x:=x+z X = xyxy x2yx2y

Example: GCD l 1 :x:=a l 5 :y:=z l 4 :x:=y l 3 :z:=x rem y l 2 :y:=b l 6 :z=0? yesno l0l0 l7l7

Example: GCD l 1 :x:=a l 5 :x:=y l 4 :y:=z l 3 :z:=x rem y l 2 :y:=b l 6 :z=0? yesno Oops … with an error (l4 and l5 were switched). l0l0 l7l7

Why use Temporal specification Temporal specification for sequential software? Deadlock? Liveness? – No! intuition Captures the tester’s intuition about the location of an error: “I think a problem may occur when the program runs through the main while loop twice, then the if condition holds, while t>17.”

Example: GCD l 1 :x:=a l 5 :x:=y l 4 :y:=z l 3 :z:=x rem y l 2 :y:=b l 6 :z=0? yesno l0l0 l7l7 a>0/\b>0/\at l 0 /\  at l 7 at l 0 /\ a>0/\ b>0 at l 7

Example: GCD l 1 :x:=a l 5 :x:=y l 4 :y:=z l 3 :z:=x rem y l 2 :y:=b l 6 :z=0? yesno l0l0 l7l7 a>0/\b>0/\at l 0 /\  at l 7 Path 1: l 0 l 1 l 2 l 3 l 4 l 5 l 6 l 7 a>0/\b>0/\a rem b=0 Path 2: l 0 l 1 l 2 l 3 l 4 l 5 l 6 l 3 l 4 l 5 l 6 l 7 a>0/\b>0/\a rem b  0

Potential explosion Bad point: potential explosion Good point: may be chopped on-the-fly

Drivers and Stubs Driver: represents the program or procedure that called our checked unit. Stub: represents a procedure called by our checked unit. In our approach: replace both of them with a formula representing the effect the missing code has on the program variables. Integrate the driver and stub specification into the calculation of the path condition. l 1 :x:=a l 5 :x:=y l 4 :y:=z l 3 :z ’ =x rem y /\x ’ =x/\y ’ =x l 2 :y:=b l 6 :z=0? yesno l0l0 l7l7

Conclusions Black box testing: Know transition relation, or bound on number of states, want to find initial state, structure, conformance, temporal property. Software testing: Unit testing, code inspection, coverage, test case generation. Model checking and testing have a lot in common.