(One-Path) Reachability Logic
Grigore Rosu, Andrei Stefanescu, Brandon Moore (University of Illinois at Urbana-Champaign, USA)
Stefan Ciobaca (Alexandru Ioan Cuza University, Romania)
Long-Standing Dream
One long-standing dream of the programming language community is to have a framework that allows and encourages language designers to formally define their languages once and for all, using an intuitive and attractive notation, and then obtain, essentially for free, implementations as well as analysis tools for the defined languages.
[Diagram: Formal Language Definition (Syntax and Semantics) and the tools obtained from it: parser, interpreter, compiler, debugger, (semantic) symbolic execution, model checker, deductive program verifier.]
Language Frameworks
PLT-Redex/Racket (Findler et al.), Ott (Sewell et al.), PLanCompS (Mosses et al.), Rascal (Klint et al.), RLS-Maude (Meseguer et al.), K (Rosu et al.), …
All are based on operational semantics.
The defined semantics serve as reference models of the languages, but are close to useless for verification.
Defining a language takes 1-2 years.
C Semantics (in K)
[Figure: the C configuration]
To give an idea of what it takes to define a large language, here is, for example, the configuration of C. It has more than 70 cells! The heap, which is the subject of so many debates in program verification, is just one of them.
… plus ~1200 user-defined rules
… plus ~1500 automatically generated rules
Operational Semantics
Virtually all operational semantics can be defined with rewrite rules of the form: [formula not preserved]
We would like to reason about programs using precisely such operational semantics!
State-of-the-Art
Many different program logics for “state” properties: FOL, HOL, separation logic, …
Redefine the language using a different semantic approach (Hoare/separation/dynamic logic).
Very language-specific and error-prone; e.g.: [example not preserved]
State-of-the-Art
Thus, these semantics need to be proved sound, and sometimes also relatively complete, with respect to a trusted operational semantics of the language.
Verification tools are then developed using them.
So we have an inherent gap between the trusted operational semantics and the semantics currently used for program verification.
Our Proposal
Use the trusted operational semantics directly!
This has been done before (ACL2), but the proofs are low-level (induction on the structure of the program or on the steps in the transition system) and language-specific.
We give a language-independent proof system that:
takes the unchanged operational semantics as axioms;
derives reachability rules (both the operational semantics rules and the program properties are stated as reachability rules);
is sound (for partial correctness) and relatively complete.
Need a means to specify static and dynamic program properties
[Diagram: Formal Language Definition (Syntax and Semantics) and the tools obtained from it: parser, interpreter, compiler, debugger, (semantic) symbolic execution, model checker, deductive program verifier.]
Matching Logic [Rosu, Ellison, Schulte 2010]
A logic for specifying static properties of program configurations and reasoning about them.
Key insight: configuration terms with variables can be used as predicates, called patterns. Semantically, satisfying a pattern means matching it.
Matching logic is parametric in a (first-order) configuration model, typically the underlying model of the operational semantics.
Configurations
For concreteness, assume configurations having the following syntax (matching logic works with any configurations): [syntax not preserved]
Examples of concrete (ground) configurations: [examples not preserved]
Patterns
Concrete configurations are already patterns, but very simple ones: ground patterns.
Example of a more complex pattern: [example not preserved]
Thus, patterns generalize both terms and [FOL].
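The idea that a pattern is a configuration term with variables, optionally constrained by a formula, and that satisfaction means matching, can be sketched in a few lines of Python. This is only an illustration under assumptions: the tuple encoding of configurations, the names `Var`, `match`, `satisfies` and the `("cfg", code, ("env", name, value))` shape are hypothetical and are not the configuration syntax from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A logical variable occurring in a pattern."""
    name: str


def match(pattern, ground, subst=None):
    """Return a substitution under which `pattern` equals `ground`, else None.

    Terms are variables, constants (str/int), or tuples ("constructor", args...).
    """
    subst = dict(subst or {})
    if isinstance(pattern, Var):
        if pattern.name in subst and subst[pattern.name] != ground:
            return None  # the same variable would need two different values
        subst[pattern.name] = ground
        return subst
    if isinstance(pattern, tuple):
        if not isinstance(ground, tuple) or len(pattern) != len(ground):
            return None
        for p, g in zip(pattern, ground):
            subst = match(p, g, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == ground else None


def satisfies(ground, term, constraint):
    """A ground configuration satisfies (term AND constraint) if it matches
    the term and the resulting substitution makes the constraint true."""
    s = match(term, ground)
    return s is not None and constraint(s)


# Hypothetical configuration shape: ("cfg", code, ("env", name, value)).
PATTERN = ("cfg", "skip", ("env", "x", Var("a")))


def non_negative(s):
    """Constraint part of the pattern: the value bound to `a` is >= 0."""
    return s["a"] >= 0


print(satisfies(("cfg", "skip", ("env", "x", 5)), PATTERN, non_negative))
print(satisfies(("cfg", "skip", ("env", "x", -1)), PATTERN, non_negative))
```

A ground configuration is the special case of a pattern with no variables and a trivially true constraint, which is the sense in which patterns generalize terms; the constraint function plays the role of the first-order part.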
Matching Logic Reasoning
We can now prove (using [FOL] reasoning) properties about configurations, such as: [example not preserved]
Matching Logic vs. Separation Logic
Matching logic achieves separation through matching at the structural (term) level, not through special logical connectives (*).
Separation logic = Matching logic [heap]
SL: [formula not preserved]
ML: [formula not preserved]
Matching logic realizes separation at all levels of the configuration, not only in the heap: the heap was only 1 out of the 75 cells in C’s definition [OOPSLA’12].
Need a means to specify static and dynamic program properties
[Diagram: Formal Language Definition (Syntax and Semantics) and the tools obtained from it: parser, interpreter, compiler, debugger, (semantic) symbolic execution, model checker, deductive program verifier.]
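Reachability Rules - Syntax
“Rewrite” rules over matching logic patterns: [formula not preserved]
Since patterns generalize terms, matching logic reachability rules capture term rewriting rules.
Moreover, they deal naturally with side conditions: a rewrite rule with a side condition turns into a reachability rule over patterns [formulas not preserved].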
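As a hedged illustration of how a side condition can be absorbed into a pattern: the slide's own formulas are not preserved, so the symbols $l$, $r$ (configuration terms) and $b$ (a Boolean side condition) are assumptions. Because a pattern may conjoin a term with a logical constraint, the condition can simply move into the left-hand pattern.

```latex
\[
  l \Rightarrow r \ \ \text{if}\ b
  \qquad\text{turns into}\qquad
  l \wedge b \Rightarrow r
\]
```

Read this way, a configuration matches the left-hand side exactly when it matches $l$ and the instantiated condition $b$ holds, so no separate notion of side condition is needed.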
Conditional Reachability Rules
[Rule schema not preserved]
The involved patterns can share free variables.
They generalize conditional rewrite rules.
Reachability Rules - Semantics
In the transition system generated by the operational semantics on the configuration model, any terminating configuration that matches the LHS pattern reaches a configuration that matches the RHS pattern (the two patterns can share free variables).
That is, partial correctness.
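The semantics above can be made concrete on a small finite transition system. The sketch below is illustrative only and makes simplifying assumptions: configurations are plain hashable values, the two patterns are given as Python predicates `pre` and `post` (so shared free variables are ignored), and "terminating" is read as "no infinite path starts here", which on a finite graph means no cycle is reachable. All names are hypothetical.

```python
def reachable(graph, start):
    """All configurations reachable from `start` in zero or more steps."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def terminates(graph, start):
    """True if no infinite path starts at `start` (no reachable cycle)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: a cycle, hence an infinite path
                return False
            if state == WHITE and not visit(nxt):
                return False
        colour[node] = BLACK
        return True

    return visit(start)


def holds_one_path(graph, pre, post):
    """Every terminating configuration satisfying `pre` reaches, along at
    least one path, a configuration satisfying `post`."""
    nodes = set(graph) | {n for succs in graph.values() for n in succs}
    return all(
        any(post(t) for t in reachable(graph, s))
        for s in nodes
        if pre(s) and terminates(graph, s)
    )


# Toy system: 3 counts down to 0 but may also get stuck; "spin" loops forever.
GRAPH = {3: [2, "stuck"], 2: [1], 1: [0], 0: [], "stuck": [], "spin": ["spin"]}

print(holds_one_path(GRAPH, lambda s: s in (3, "spin"), lambda s: s == 0))
print(holds_one_path(GRAPH, lambda s: s == "stuck", lambda s: s == 0))
```

The first check succeeds even though 3 can also step to `"stuck"`: one-path reachability only asks for some path to the target, and the diverging `"spin"` configuration is ignored because the property is one of partial correctness. The second check fails because `"stuck"` terminates without ever reaching 0.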
Expressivity of Reachability Rules
Capture operational semantics rules: [example not preserved]
Capture Hoare triples: [example not preserved]
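One possible shape of the Hoare-triple encoding, given as a sketch under stated assumptions rather than as the slide's own formula: configurations are assumed to be pairs $\langle \mathit{code}, \sigma \rangle$ of a simple imperative language, $X$ is assumed to be the set of program variables, $\sigma_X$ a state binding each program variable to a logical variable, and $\psi_X$ the formula $\psi$ with program variables replaced by those logical variables.

```latex
\[
  \{\psi\}\ \mathtt{code}\ \{\psi'\}
  \qquad\text{becomes}\qquad
  \exists X\,\bigl(\langle \mathtt{code},\ \sigma_X \rangle \wedge \psi_X\bigr)
  \ \Rightarrow\
  \exists X\,\bigl(\langle \mathtt{skip},\ \sigma_X \rangle \wedge \psi'_X\bigr)
\]
```

The left-hand pattern describes configurations about to run the code in a state satisfying the precondition; the right-hand pattern describes configurations where the code has been consumed and the postcondition holds.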
Hoare Triple = Syntactic Sugar
This is a code fragment from a program reversing a singly-linked list, verified by MatchC (which we will discuss later). The invariant states that p points to the part of the list already reversed, and x points to the part of the list yet to be reversed. This Hoare-style invariant is just syntactic sugar for a reachability rule. The LHS combines the code of the loop (shown in red) with the invariant, while in the RHS the code has been executed, the condition of the loop has been evaluated with the semantics, and its negation has been added as a constraint (shown in blue).
Reachability Logic
Language-independent proof system that derives reachability rules from other reachability rules. [Sequent not preserved; its parts are labeled as follows.]
A: trusted reachability rules (starts with the operational semantics).
C: claimed reachability rules.
Conclusion: the target reachability rule.
Intuitively: symbolic execution with the operational semantics + reasoning with cyclic behaviors.
The main result of the paper is a language-independent proof system which derives reachability rules specifying program properties from trusted reachability rules. In the beginning, the trusted rules are just the operational semantics rules. During the proof one can claim additional reachability rules, which cannot be used right away: the rules in C are added to those in A only after taking at least one step with the trusted rules in A.
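The interplay between trusted rules A and claimed rules C can be sketched with two inference rules. This is a hedged reconstruction: the pattern symbols $\varphi$ are assumptions, and the exact formulation in the paper may differ. Circularity lets the target rule be claimed while it is being proved, and Transitivity releases the claimed rules into the trusted set only for the second part of a derivation, after the first part has made progress.

```latex
\[
\textsf{Circularity:}\quad
\frac{A \vdash_{C \cup \{\varphi \Rightarrow \varphi'\}} \varphi \Rightarrow \varphi'}
     {A \vdash_{C} \varphi \Rightarrow \varphi'}
\qquad
\textsf{Transitivity:}\quad
\frac{A \vdash_{C} \varphi_1 \Rightarrow \varphi_2
      \qquad
      A \cup C \vdash \varphi_2 \Rightarrow \varphi_3}
     {A \vdash_{C} \varphi_1 \Rightarrow \varphi_3}
\]
```

For a loop, the claimed rule plays the role of the invariant: after symbolically executing one iteration with the rules in A, the remaining iterations are discharged by the claim itself.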
7 Proof Rules for Reachability
Traditional Verification vs. Our Approach Traditional proof systems: language-specific Our proof system: language-independent
Results
Soundness (partial correctness): under weak well-definedness conditions on the trusted rules A (see paper). Mechanized in Coq, for verification certificates.
Relative completeness: under weak assumptions on the configuration model (e.g., it can express Gödel’s beta predicate).
Implementation
Being implemented within the K framework.
Symbolic execution uses the operational semantics rules; a custom solver handles the matching part and the Z3 solver handles the model reasoning part (for the Consequence rule).
Circularity steps are given by the user (via pre/post/inv annotations); everything else is automatic.
Online interface available for a fragment of C at http://matching-logic.org
Related Work and Limitations
Hoare logic: already explained.
Dynamic logic: need to redefine language semantics (invariant rules, etc.), but more expressive: [formula not preserved]
CTL*: expressive, but not clear how to integrate with operational semantics; maybe CTL* over ML patterns?
Currently we only support one-path reachability for conditional rules. We have a similar proof system for all-path reachability, but only with unconditional rules.
Previous one-path attempts: [ICALP’12], [OOPSLA’12]
Conclusion
Program verification using the language's operational semantics is possible and feasible.
Language-independent 7-rule reachability proof system, which is sound and relatively complete.
Circularity generalizes the invariant rules.
Being implemented in the K programming language design framework.