Model Checking From Tools to Theory University of Pennsylvania


Model Checking: From Tools to Theory
Rajeev Alur, University of Pennsylvania
25MC, FLOC, August 2006

[Diagram labels: CS Theory (Automata/Logics, Algorithms/Complexity); Model Checking, Databases, Compilers, Linguistics/NLP]
How has model checking influenced basic theory?
  Connecting tree automata, fixpoint logics, parity games
  Temporal logics and automata over infinite words
  Data structures: OBDDs
  Timed and hybrid automata
  Many more examples

Software Analysis
Program P
Specification S: logics/automata, ad-hoc patterns, implicit (built into the tool), program annotations
Product M
Analysis tool: model checking, static analysis, deductive reasoning, testing, runtime monitoring

Checking Structured Programs
Classical model checking: both program and specification define regular languages
Control flow requires a stack, so the program defines a context-free language
Algorithms exist for checking regular specifications against context-free models
  Emptiness of pushdown automata is decidable
  The intersection (product) of a regular language and a context-free language is context-free
But checking a context-free spec against a context-free model is undecidable!
  Context-free languages are not closed under intersection
  Inclusion as well as emptiness of intersection are undecidable
Existing software model checkers: pushdown models (Boolean programs) and regular specifications
  SLAM, BLAST, F-SOFT, jMoped …
Even in the absence of recursion, hierarchical structure is retained for analysis

Are Context-free Specs Interesting?
Classical Hoare-style pre/post conditions
  If p holds when procedure A is invoked, q holds upon return
  Total correctness: every invocation of A terminates
  Integral part of the emerging standard JML
Stack inspection properties (security/access control)
  If the setuid bit is being set, root must be in the call stack
Interprocedural data-flow analysis
All of these need matching of calls with returns, or finding unmatched calls
Recall: the language of words over [, ] such that brackets are well matched is not regular, but context-free
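The bracket language recalled on this slide can be illustrated with a short sketch. The function name `is_well_matched` is made up for this example. The point is that the checker needs a counter that can grow without bound, which is the intuition for why no finite automaton recognizes the language while a pushdown automaton can.

```python
def is_well_matched(word: str) -> bool:
    """Check that every '[' is closed by a later ']' and no ']' is unmatched."""
    depth = 0  # number of '[' seen so far that are still open
    for ch in word:
        if ch == "[":
            depth += 1
        elif ch == "]":
            if depth == 0:
                return False  # a ']' with nothing left to match
            depth -= 1
        else:
            raise ValueError(f"unexpected symbol: {ch!r}")
    return depth == 0  # every '[' must have been closed
```

Reading `[` as a call and `]` as a return, the same counter tracks the depth of the call stack.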

Checking Context-free Specs
Many tools exist for checking specific properties
  Security research on stack inspection properties
  Annotating programs with asserts and local variables
  Inter-procedural data-flow analysis algorithms
What is common to checkable properties?
  Both model M and spec S have their own stacks, but the two stacks are synchronized, so a product is possible
As a generator, the program should expose the matching structure of calls and returns
Solution: nested words and the theory of regular languages over nested words

Nested Words
Nested word: linear sequence + well-nested edges
Positions labeled with symbols in Σ
[Figure: a nested word over positions a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12]
Positions classified as:
  Call positions: both linear and hierarchical successors
  Return positions: both linear and hierarchical predecessors
  Internal positions: otherwise
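One possible way to represent such an object in code is sketched below. The class name `NestedWord`, its field names, and the 0-based positions are assumptions made for this illustration, not notation from the talk. The sketch stores the linear sequence together with the nesting edges, classifies each position, and checks that the edges are well nested (no position on two edges, no crossing edges).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NestedWord:
    symbols: tuple      # symbols[i] labels position i (0-based in this sketch)
    edges: frozenset    # nesting edges as (call_position, return_position)

    def kind(self, i: int) -> str:
        """Classify position i as call, return, or internal."""
        if any(c == i for c, _ in self.edges):
            return "call"
        if any(r == i for _, r in self.edges):
            return "return"
        return "internal"

    def is_well_nested(self) -> bool:
        endpoints = [p for edge in self.edges for p in edge]
        if len(endpoints) != len(set(endpoints)):
            return False  # some position lies on two nesting edges
        if any(not (0 <= c < r < len(self.symbols)) for c, r in self.edges):
            return False  # edges must point forward and stay in range
        # Two edges cross if one starts inside the other and ends outside it.
        return not any(c1 < c2 < r1 < r2
                       for c1, r1 in self.edges for c2, r2 in self.edges)


# Hypothetical example: six positions, an outer edge 1->4 and an inner edge 2->3.
w = NestedWord(tuple("abcdef"), frozenset({(1, 4), (2, 3)}))
kinds = [w.kind(i) for i in range(len(w.symbols))]
```

This sketch only models matched calls and returns; pending (unmatched) calls or returns would need an extra convention.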

Program Executions as Nested Words
Program:
  global int x;
  bool P() { … x = 3; if (Q()) x = 1; }
  bool Q() { local int y; x = y; return (x == 0); }
Symbols: w : write x, r : read x, s : other
[Figure: an execution as a word over s, w, r]
[Figure: the same execution as a nested word, with summary edges from calls to returns]

RNA as a Nested Word
Primary structure: linear sequence of nucleotides (A, C, G, U)
Secondary structure: hydrogen bonds between complementary nucleotides (A-U, G-C, G-U)
[Figure: example sequence A C G A U C G G A U G C U C C G]
In the literature, this is modeled as trees
Algorithmic question: find similarity between RNAs using edit distances

Model for Linear Hierarchical Data
Nested words: both linear and hierarchical structure are made explicit. This seems natural in many applications
  Executions of structured programs
  RNA: primary backbone is linear, secondary bonds are well-nested
  XML documents: matching of open/close tags
Words: only linear structure is explicit
  Pushdown automata add/discover hierarchical structure
  Parentheses languages: implicit nesting edges
Ordered trees: only hierarchical structure is explicit
  Ordering of siblings imparts explicit partial order
  Linear order is implicit, and can be recovered by infix traversal

Nested Word Automata (NWA)
[Figure: a run over a1 … a9 with linear edges labeled q0 … q9 and nesting edges labeled q29 and q47; for example (q2,q29)=δc(q1,a2), q6=δi(q5,a6), q9=δr(q8,q29,a9)]
States Q, initial state q0, final states F
Starts in the initial state and reads the word from left to right, labeling edges with states, where the states on the outgoing edges are determined from the states on the incoming edges
Transition function:
  δc : Q x Σ -> Q x Q (for call positions)
  δi : Q x Σ -> Q (for internal positions)
  δr : Q x Q x Σ -> Q (for return positions)
A nested word is accepted if the run ends in a final state
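A run of a deterministic NWA can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: the input is given as a list of `(kind, symbol)` pairs, the three transition functions are passed as Python callables, and the states on nesting edges are kept on a list until the matching return is read. The example automaton at the bottom is hypothetical; it tracks whether every call labeled `a` is matched by a return labeled `b`.

```python
def run_nwa(word, q0, finals, delta_c, delta_i, delta_r):
    """Run a deterministic NWA over a list of (kind, symbol) pairs."""
    state = q0
    pending = []  # states on nesting edges whose return has not been read yet
    for kind, sym in word:
        if kind == "call":
            # A call produces a linear successor and a hierarchical successor.
            state, hier = delta_c(state, sym)
            pending.append(hier)
        elif kind == "internal":
            state = delta_i(state, sym)
        elif kind == "return":
            if not pending:
                return False  # this sketch rejects unmatched returns
            # A return consumes the linear and the hierarchical predecessor.
            state = delta_r(state, pending.pop(), sym)
        else:
            raise ValueError(kind)
    return state in finals


# Hypothetical automaton: linear states "ok"/"bad"; the nesting edge
# carries the symbol seen at the call.
def delta_c(q, sym):
    return (q, sym)

def delta_i(q, sym):
    return q

def delta_r(q, hier, sym):
    return "bad" if hier == "a" and sym != "b" else q


accepted = run_nwa([("call", "a"), ("internal", "x"), ("return", "b")],
                   "ok", {"ok"}, delta_c, delta_i, delta_r)
```

The `pending` list is only a convenient way to remember the labels of open nesting edges while scanning left to right; the automaton itself has finitely many states.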

Regular Languages of Nested Words
A set of nested words is regular if there is a finite-state NWA that accepts it
Nondeterministic automata over nested words
  Transition function: δc : Q x Σ -> 2^(Q x Q), δi : Q x Σ -> 2^Q, δr : Q x Q x Σ -> 2^Q
  Can be determinized
Appealing theoretical properties
  Effectively closed under various operations (union, intersection, complement, concatenation, projection, Kleene-* …)
  Decidable decision problems: membership, language inclusion, language equivalence …
  Alternate characterization: MSO, syntactic congruences

Determinization
Goal: given a nondeterministic automaton A with states Q, construct an equivalent deterministic automaton B
Intuition: maintain a set of "summaries" (pairs of states)
State space of B: 2^(Q x Q)
Initially, the state contains q->q, for each q
At a call, if state u splits into (u',u''), summary q->u splits into (q->u', u'->u'')
At a return, summaries q->u' and u'->w join to give q->w
Acceptance: must contain q->q', where q is initial and q' is final
[Figure: summary sets along a run: q->w q->w' q'->w''…; q->q q'->q'…; q->u q'->v…; q->u' q'->v'…; u'->u'' v'->v''…; u'->w u'->w' v'->w''…]
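The summary idea can be sketched as an on-the-fly simulation of a nondeterministic NWA. This is a variant of the construction, written under assumptions: it seeds the summaries with the initial states only, restarts them at each call, and joins them at the matching return, which is the usual subset-style construction rather than a literal transcription of the slide. The function name `accepts`, the `(kind, symbol)` input format, and the example automaton are all hypothetical.

```python
def accepts(word, initials, finals, delta_c, delta_i, delta_r):
    """Simulate a nondeterministic NWA deterministically via sets of summaries.

    A summary (p, q) means: starting the current level in state p,
    the automaton can now be in state q.
    """
    summaries = frozenset((q, q) for q in initials)
    stack = []  # (summaries at the call, call symbol) for each pending call
    for kind, a in word:
        if kind == "internal":
            summaries = frozenset((p, q2) for p, q in summaries
                                  for q2 in delta_i(q, a))
        elif kind == "call":
            stack.append((summaries, a))
            # Start fresh summaries from every state reachable just after the call.
            summaries = frozenset((q1, q1) for _, q in summaries
                                  for q1, _hier in delta_c(q, a))
        elif kind == "return":
            if not stack:
                return False  # this sketch rejects unmatched returns
            at_call, call_sym = stack.pop()
            # Join: caller summary p->q, call step q->(q1, hier),
            # callee summary q1->q2, return step (q2, hier)->q3 gives p->q3.
            summaries = frozenset(
                (p, q3)
                for p, q in at_call
                for q1, hier in delta_c(q, call_sym)
                for start, q2 in summaries if start == q1
                for q3 in delta_r(q2, hier, a))
        else:
            raise ValueError(kind)
    return any(q in finals for _, q in summaries)


# Hypothetical automaton: guess at a call that its return will be labeled "b".
def delta_c(q, a):
    return {("wait", "guess"), ("wait", "skip")} if q == "wait" else {("done", "skip")}

def delta_i(q, a):
    return {q}

def delta_r(q, hier, a):
    if hier == "guess":
        return {"done"} if a == "b" else set()  # a wrong guess kills the run
    return {q}
```

With initial state `wait` and final state `done`, this automaton accepts exactly the inputs in which some call is matched by a return labeled `b`.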

MSO-based Characterization
Monadic Second Order Logic of Nested Words
  First-order variables: x, y, z; set variables: X, Y, Z …
  Atomic formulas: a(x), X(x), x=y, x < y, x -> y
  Logical connectives and quantifiers
Sample formula: for all x, y. ((a(x) and x -> y) implies b(y))
  Every call labeled a is matched by a return labeled b
Thm: a language L of nested words is regular iff it is definable by an MSO sentence
Robust characterization of regularity, as in the case of languages of words and languages of trees
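The sample formula can be evaluated by brute force on a concrete nested word, which may help in reading the notation. In this sketch a nested word is assumed to be given as a list of symbols plus a set of nesting edges `(x, y)` over 0-based positions; the function name `holds` is made up for the example.

```python
def holds(symbols, edges):
    """Evaluate: for all x, y. ((a(x) and x -> y) implies b(y))."""
    n = len(symbols)
    return all(
        # "P implies Q" is written as "not P or Q"
        not (symbols[x] == "a" and (x, y) in edges) or symbols[y] == "b"
        for x in range(n)
        for y in range(n)
    )
```

For instance, with symbols `a x b c y d` and edges `{(0, 2), (3, 5)}` the formula holds, since the only call labeled `a` returns at a position labeled `b`.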

Application: Software Analysis
A program P with stack-based control is modeled by the set L of nested words it generates
  Choice of Σ depends on the intended application
  Summary edges exposing call/return structure are added (exposure can depend on what needs to be checked)
If P has finite data (e.g. pushdown automata, Boolean programs, recursive state machines), then L is regular
Specification S is given as a regular language of nested words
Verification: does every behavior in L satisfy S? Take the product of P and the complement of S and analyze
Runtime monitoring: check if the current execution is accepted by S (compiled as a deterministic automaton)
Model checking: check if L is contained in S; decidable when P has finite data (no extra cost, as analysis still requires context-free reachability)

Temporal Logic of Nested Time: CaRet
Global paths, local paths, caller paths
Three versions of every temporal modality
Sample CaRet formulas:
  (if p then local-next q) global-unless r
  if p then caller-eventually q
  global-always (if p then local-eventually q)

Connection to Pushdown Automata
[Diagram labels: Words, Parse Trees, Generator; Nested Word Automata, Nested words, Acceptor]
Note: the first formalization of our ideas captured the shape by typing input symbols as calls, internals, and returns. This led to the class of Visibly Pushdown Languages (STOC'04), a subclass of deterministic context-free languages with the same closure/decidability properties.

New Theory Problems
Congruences and minimization
First-order Temporal logics for nested time
Infinite nested words and ω-regular languages
Nested trees for branching-time verification (talk@CAV)
Fixpoint logics

Application: Document Processing
[Diagram labels: XML Document, Query, Processing]
Example document: <conference> <name> CAV 2006 </name> <location> <city> Seattle </city> <hotel> Sheraton </hotel> </location> <sponsor> MSR </sponsor> Cadence </conference>
Model a document d as a nested word
  Nesting edges from <tag> to </tag>
Sample query: find documents related to conferences sponsored by Cadence in Seattle
  Specify the query as a regular language L of nested words
Analysis: membership question
  Does document d satisfy query L?
  Use NWA instead of tree automata! (typically, no recursion, but only hierarchy)
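A rough sketch of this idea: open tags become calls, close tags become returns, and text becomes internal positions, after which the sample query is answered in a single left-to-right pass. Everything here is an assumption made for illustration: the regular-expression tokenizer only handles simple attribute-free tags, the functions `to_nested_word` and `matches_query` are hypothetical names, and the query is a hand-written monitor in the spirit of an NWA rather than an automaton compiled from a specification. The test document wraps both sponsors in `<sponsor>` tags.

```python
import re

# Close tag, open tag, or a run of text (simple attribute-free tags only).
TOKEN = re.compile(r"</(\w+)>|<(\w+)>|([^<>\s][^<>]*)")


def to_nested_word(doc):
    """Turn a tagged document into a list of (kind, symbol) pairs."""
    word = []
    for match in TOKEN.finditer(doc):
        close_tag, open_tag, text = match.groups()
        if open_tag:
            word.append(("call", open_tag))
        elif close_tag:
            word.append(("return", close_tag))
        else:
            word.append(("internal", text.strip()))
    return word


def matches_query(word):
    """Conference documents with sponsor Cadence and city Seattle."""
    open_tags = []  # tags of the pending calls, innermost last
    sponsor_ok = city_ok = False
    for kind, sym in word:
        if kind == "call":
            open_tags.append(sym)
        elif kind == "return":
            if not open_tags or open_tags.pop() != sym:
                return False  # tags are not well matched
        else:
            if open_tags[-2:] == ["conference", "sponsor"] and sym == "Cadence":
                sponsor_ok = True
            if open_tags[:1] == ["conference"] and open_tags[-1:] == ["city"] and sym == "Seattle":
                city_ok = True
    return not open_tags and sponsor_ok and city_ok


doc = ("<conference> <name> CAV 2006 </name> <location> <city> Seattle </city> "
       "<hotel> Sheraton </hotel> </location> <sponsor> MSR </sponsor> "
       "<sponsor> Cadence </sponsor> </conference>")
result = matches_query(to_nested_word(doc))
```

Because the document is read once from left to right, the same loop could process a streamed document without building a tree first.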

Nested Words vs Ordered Trees
[Figure: a tree with root a, children b (with leaves x y z) and c, next to its encoding <a> <b> x y z </b> <c> </c></a>]
Why not use tree encoding and tree automata?
  Notion of regularity is basically the same in both views
  Nested words are more flexible: one can take prefixes, suffixes, and word concatenations, and being well-matched is not a prerequisite
  Reading input from left to right for query processing is more natural with nested words
  Many versions of tree automata with different properties
  NWA have both top-down and bottom-up aspects, and are more succinct!

Recap
Program executions have both linear and hierarchical structure
  Allow specification logics/automata to refer to both structures
  Robust foundation for more expressive specifications
  Analysis techniques for structured programs are already based upon computing summaries, so extra expressiveness comes for free
Nested words are worth theoretical investigations
  New way of looking at pushdown automata
  Nested trees and fixpoint logics
  Temporal modalities for nested time …
Modeling documents as nested words can be fruitful
  Automata have both top-down and bottom-up flavor
  Better ways of querying streaming documents?
Talk based on joint work with P. Madhusudan, S. Chaudhuri, K. Etessami, M. Viswanathan …