Spring 2016 Program Analysis and Verification

Spring 2016 Program Analysis and Verification Lecture 8: Abstract Interpretation I Semantic Domains Roman Manevich Ben-Gurion University

Tentative syllabus. Program Verification: Operational semantics, Hoare Logic, Applying Hoare Logic, Weakest Precondition Calculus, Proving Termination, Data structures, Automated Verification. Program Analysis Basics: From Hoare Logic to Static Analysis, Control Flow Graphs, Equation Systems, Collecting Semantics, Using Soot. Abstract Interpretation fundamentals: Lattices, Fixed-Points, Chaotic Iteration, Galois Connections, Domain constructors, Widening/Narrowing. Analysis Techniques: Numerical Domains, Alias analysis, Interprocedural Analysis, Shape Analysis, CEGAR.

Collecting semantics in equational form. A vector of variables R[0, …, k], one per input/output of a node; R[0] is for entry. For a node n with multiple predecessors, add the equation R[n] = ∪{R[k] | k is a predecessor of n}. For an atomic operation node S with input R[m] and output R[n], add the equation R[n] = ⟦S⟧ R[m]. Transform if b then S1 else S2 to (assume b; S1) or (assume ¬b; S2). [Figure: control-flow graph with nodes entry, if x > 0, x := x-1, and exit, with edges labeled R[0] to R[4].]

Agenda (Appendix A): Semantic domains; Preorders; Partial orders (posets); Pointed posets; Ascending/descending chains; The height of a poset; Join and Meet operators; Complete lattices; Constructing new lattices from old.

Abstract interpretation theory [1977]

Abstract Interpretation [CC77]. A very general mathematical framework for approximating semantics. Generalizes Hoare Logic. Generalizes weakest precondition calculus. Allows designing sound static analysis algorithms, which usually compute by iterating to a fixed-point. Not specific to any programming language style. Results of an abstract interpretation are (loop) invariants; they can be interpreted as axiomatic verification assertions and used for verification.

Annotating programs
Annotate(P, S) =
  case S is x:=aexpr
    return {P} x:=aexpr {F*[x:=aexpr] P}
  case S is S1; S2
    let Annotate(P, S1) be {P} A1 {Q1}
    let Annotate(Q1, S2) be {Q1} A2 {Q2}
    return {P} A1; {Q1} A2 {Q2}
  case S is if bexpr then S1 else S2
    let Pt = F[assume bexpr] P
    let Pf = F[assume ¬bexpr] P
    let Annotate(Pt, S1) be {Pt} A1 {Q1}
    let Annotate(Pf, S2) be {Pf} A2 {Q2}
    return {P} if bexpr then {Pt} A1 {Q1} else {Pf} A2 {Q2} {Q1 ⊔ Q2}
  case S is while bexpr do S
    N := Nc := P // Initialize
    repeat
      let Pt = F[assume bexpr] Nc
      let Annotate(Pt, S) be {Nc} Abody {N}
      Nc := Nc ⊔ N
    until N = Nc
    return {P} INV= {N} while bexpr do {Pt} Abody {F[assume ¬bexpr](N)}
Notes: F*[x:=aexpr] approximates the concrete semantics, that is, sp(x:=aexpr, P) ⇒ F*[x:=aexpr] P. The operator ⊔ approximates disjunction. Consequence rule [consp]: from { P’ } S { Q’ } conclude { P } S { Q } if P ⇒ P’ and Q’ ⇒ Q.

The big picture. Use semantic domains to define both concrete semantics and abstract semantics. Relate the semantics in a sound way. Interpret the program over the abstract semantics. [Figure: a statement S maps a set of states to a set of states under the collecting semantics ⟦S⟧, and maps an abstract representation of sets of states to another under the abstract semantics; the two levels are related by abstraction and meaning.]

A theory of semantic domains: 1. Approximating elements 2. Approximating sets of elements

Overall idea. A semantic domain can be used to define properties (representations of predicates), also called abstract states. We called them assertions in axiomatic semantics. Common representations: logical formulas, automata, specialized graphs.

A taxonomy of semantic domain types, from most to least structured: Complete Lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤); Lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤); Join semilattice (D, ⊑, ⊔, ⊤); Meet semilattice (D, ⊑, ⊓, ⊥); Complete partial order (CPO) (D, ⊑, ⊥); Partial order (poset) (D, ⊑); Preorder (D, ⊑).

Preorders

Preorder. Let D (for semantic domain) be a set of elements. We say that a binary order relation ⊑ over D is a preorder if the following conditions hold for every d, d’, d’’ ∈ D. Reflexive: d ⊑ d. Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’. There may exist d, d’ such that d ⊑ d’ and d’ ⊑ d yet d ≠ d’.
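The two conditions can be checked mechanically on a finite domain. The following is a minimal sketch, not part of the original slides: the helper names `is_preorder` and `equivalent_pairs` and the absolute-value example are illustrative assumptions.

```python
from itertools import product

def is_preorder(elements, leq):
    """Return True if leq is reflexive and transitive on the finite set elements."""
    elements = list(elements)
    reflexive = all(leq(d, d) for d in elements)
    transitive = all(
        leq(a, c)
        for a, b, c in product(elements, repeat=3)
        if leq(a, b) and leq(b, c)
    )
    return reflexive and transitive

def equivalent_pairs(elements, leq):
    """Pairs of distinct elements that are below each other (d ⊑ d' and d' ⊑ d)."""
    elements = list(elements)
    return [(a, b) for a, b in product(elements, repeat=2)
            if a != b and leq(a, b) and leq(b, a)]

# Hypothetical example: compare integers by absolute value.
nums = [-2, -1, 0, 1, 2]

def abs_leq(a, b):
    return abs(a) <= abs(b)

print(is_preorder(nums, abs_leq))       # reflexive and transitive
print(equivalent_pairs(nums, abs_leq))  # distinct yet mutually ordered elements
```

Here -1 and 1 are distinct but each is below the other, which is exactly the situation the last sentence of the definition allows.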

Preorder examples: SAV-predicates. SAV-factoids Φ = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var }. SAV-predicates: 2^Φ. Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2. Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2. Which order relation is stronger (contains more pairs)? Which order relation is easier to check? What if both P1 and P2 are in the image of reduce?

SAV preorder 1: P1 ⊑set P2 iff P1 ⊇ P2. Hasse diagram for Var = {x, y}, listed by level: {}; {x=y} {y=x} {x=x+x} {y=y+y} {y=x+y} {y=y+x} {x=x+y} {x=y+x} …; {x=y, y=x} {x=y, x=x+x} {x=x+y, x=y+x} …; {x=y, x=x+x, x=x+y} …; {x=y, y=x, x=x+x, y=y+y, y=x+y, y=y+x, x=x+y, x=y+x}.

SAV preorder 2: P1 ⊑imp P2 iff P1 ⇒ P2. Diagram for Var = {x, y}, listed by level: {}; {x=y} {y=x} {x=x+x} {y=y+y} {y=x+y} {y=y+x} {x=x+y} {x=y+x} …; {x=y, y=x} {x=y, x=x+x} {x=x+y, x=y+x} …; {x=y, x=x+x, x=x+y} …; {x=y, y=x, x=x+x, y=y+y, y=x+y, y=y+x, x=x+y, x=y+x}.

Preorder examples: CP-predicates. CP-factoids Φ = { x = c | x ∈ Var, c ∈ Z }. CP-predicates: 2^Φ. Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2. Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2. Is there a difference? {x=5, x=7, x=9} ⊇ {x=5, x=7}. {x=5, x=7, x=9} ⇒ {x=5, x=7}. {x=5, x=7} ⇒ {x=5, x=7, x=9}.

CP preorder example, Var = {x}: {}; … {x=-3} {x=-2} {x=-1} {x=0} {x=1} …

CP preorder example, Var = {x, y}: {}; … {x=-3} {x=0} {x=3} {y=-5} …

The problem with preorders. Equivalent elements have different representations: {x=y, x=a+b} S {Q} versus {x=y, y=a+b} S {Q’}. This leads to unpredictability: which result should our static analysis give?

The problem with preorders. Equivalent elements have different representations: {x=y, x=a+b} assume y≠a+b {x=y, x=a+b}, but {x=y, y=a+b} assume y≠a+b {false}. This leads to unpredictability: which result should our static analysis give?

The problem with preorders. Equivalent elements have different representations: {x=y, x=a+b} assume x≠a+b {false}, but {x=y, y=a+b} assume x≠a+b {x=y, y=a+b}. This leads to unpredictability: which result should our static analysis give? It may also turn a terminating analysis into a non-terminating one, because the Hasse diagram contains cycles. In practice some static analyses still use preorders (taking extreme care to ensure termination).

Partial orders

Partially ordered sets (partial orders). A partially ordered set (poset for short) is a pair (D, ⊑) where ⊑ ⊆ D × D has the following properties, for all d, d’, d’’ in D. Reflexive: d ⊑ d. Transitive: d ⊑ d’ and d’ ⊑ d’’ implies d ⊑ d’’. Anti-symmetric: d ⊑ d’ and d’ ⊑ d implies d = d’. If d ⊑ d’ and d ≠ d’ we write d ⊏ d’. Anti-symmetry makes it easier to choose the best element.

SAV partial order. SAV-factoids Φ = { x = y | x, y ∈ Var } ∪ { x = y + z | x, y, z ∈ Var }. SAV-predicates: 2^Φ. Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2. Is this a partial order? Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2, that is, models(P1) ⊆ models(P2). Is this a partial order? Order relation 3: P1 ⊑set* P2 iff reduce(P1) ⊑set reduce(P2). Is this a partial order?

CP partial order. CP-factoids Φ = { x = c | x ∈ Var, c ∈ Z }. CP-predicates: 2^Φ. Order relation 1: P1 ⊑set P2 iff P1 ⊇ P2. Is it a partial order? Order relation 2: P1 ⊑imp P2 iff P1 ⇒ P2. Is it a partial order? Can we define a more precise partial order?

CP partial order. CP-factoids Φ = { x = c | x ∈ Var, c ∈ Z }. CP-predicates: 2^Φ ∪ {false}. Define reduce : 2^Φ → 2^Φ ∪ {false} by reduce(P) = if there exist {x=c1, x=c2} ⊆ P with c1 ≠ c2 then {false} else P. CPfalse = { P ∈ 2^Φ | P = reduce(P) } ∪ {false}. Order relation: P1 ⊑ P2 if P1 ⊇ P2 or P1 = {false}.
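The reduce operator and the resulting order can be sketched in a few lines of Python. This is an illustration under assumptions, not the lecture's own code: factoids x=c are encoded as `(var, const)` tuples, predicates as frozensets, and the special predicate {false} as the frozenset `FALSE`. The function is called `cp_reduce` to avoid clashing with Python's `functools.reduce`.

```python
FALSE = frozenset({"false"})  # assumed encoding of the special predicate {false}

def cp_reduce(p):
    """Map a predicate that assigns two different constants to one variable to {false}."""
    if p == FALSE:
        return FALSE
    seen = {}
    for var, const in p:
        if var in seen and seen[var] != const:
            return FALSE  # contradictory factoids such as x=5 and x=7
        seen[var] = const
    return frozenset(p)

def cp_leq(p1, p2):
    """P1 ⊑ P2 iff P1 ⊇ P2 or P1 = {false}; both arguments are assumed reduced."""
    return p1 == FALSE or p1 >= p2

p = cp_reduce({("x", 5), ("y", 7)})
q = cp_reduce({("x", 5)})
print(cp_leq(p, q))                        # more factoids means lower in the order
print(cp_reduce({("x", 5), ("x", 7)}))     # collapses to FALSE
```

After reduction every contradictory predicate has the single representation {false}, which is what removes the distinct-but-equivalent elements of the preorder.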

Pointed poset. A poset (D, ⊑) with a least element ⊥ is called a pointed poset: for all d ∈ D we have that ⊥ ⊑ d. The pointed poset is denoted by (D, ⊑, ⊥). We can always transform a poset (D, ⊑) into a pointed poset by adding a special bottom element: (D ∪ {⊥}, ⊑ ∪ {⊥ ⊑ d | d ∈ D}, ⊥). Example: CPfalse = { P ∈ 2^Φ | P = reduce(P) } ∪ {false}.

Chains

Chains. If d ⊑ d’ and d ≠ d’ we write d ⊏ d’. Similarly define d ⊐ d’. Let (D, ⊑) be a poset. An ascending chain is a sequence x1 ⊏ x2 ⊏ … ⊏ xk … A descending chain is a sequence x1 ⊐ x2 ⊐ … ⊐ xk … The height of a poset is the length of the maximal ascending chain. What is the height of the SAV poset? What is the height of the CP poset?
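For a finite poset the height can be computed by searching for the longest strictly ascending chain. The sketch below is illustrative only. It assumes the seven-element signs lattice for a variable x, encoded as sets of sign symbols, and it counts the number of elements on the chain (counting edges instead would give one less).

```python
from functools import lru_cache

# Assumed encoding: each element is the set of signs it allows for x.
SIGNS = {
    "false": frozenset(),
    "x<0": frozenset("-"),
    "x=0": frozenset("0"),
    "x>0": frozenset("+"),
    "x<=0": frozenset("-0"),
    "x>=0": frozenset("0+"),
    "true": frozenset("-0+"),
}

def leq(a, b):
    """a ⊑ b iff every sign allowed by a is allowed by b."""
    return SIGNS[a] <= SIGNS[b]

def height(elements, leq):
    """Number of elements on a longest strictly ascending chain of a finite poset."""
    elements = list(elements)

    @lru_cache(maxsize=None)
    def longest_from(d):
        above = [e for e in elements if e != d and leq(d, e)]
        return 1 + max((longest_from(e) for e in above), default=0)

    return max(longest_from(d) for d in elements)

print(height(SIGNS, leq))  # e.g. false ⊏ x=0 ⊏ x>=0 ⊏ true
```

The recursion terminates only because the order is anti-symmetric; on a preorder with cycles it would not.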

Ascending chain example (signs lattice, listed by level): true; x≤0, x≥0; x<0, x=0, x>0; false.

Joining elements

Bounds. Let (D, ⊑) be a poset. Let X ⊆ D be a set of elements from D. An element d ∈ D is an upper bound (ub) of X iff for every x ∈ X we have that x ⊑ d. An element d ∈ D is a lower bound (lb) of X iff for every x ∈ X we have that d ⊑ x.

Bounds. Let (D, ⊑) be a poset. Let X ⊆ D be a set of elements from D. An element d ∈ D is the least upper bound (lub) of X iff d is the minimal of all upper bounds of X. An element d ∈ D is the greatest lower bound (glb) of X iff d is the maximal of all lower bounds of X.
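These definitions translate directly into code for a finite poset. The following sketch is an illustration under assumptions: it uses the signs lattice for x encoded as sets of sign symbols, and the helper names `upper_bounds` and `lub` are hypothetical.

```python
# Assumed encoding of the signs lattice: each element is a set of allowed signs.
SIGNS = {
    "false": frozenset(),
    "x<0": frozenset("-"),
    "x=0": frozenset("0"),
    "x>0": frozenset("+"),
    "x<=0": frozenset("-0"),
    "x>=0": frozenset("0+"),
    "true": frozenset("-0+"),
}

def leq(a, b):
    return SIGNS[a] <= SIGNS[b]

def upper_bounds(xs, elements, leq):
    """All d in elements with x ⊑ d for every x in xs."""
    return [d for d in elements if all(leq(x, d) for x in xs)]

def lub(xs, elements, leq):
    """The upper bound that is below every other upper bound, or None if none exists."""
    ubs = upper_bounds(xs, elements, leq)
    least = [d for d in ubs if all(leq(d, u) for u in ubs)]
    return least[0] if least else None

print(upper_bounds(["x=0", "x>0"], SIGNS, leq))
print(lub(["x=0", "x>0"], SIGNS, leq))

# Dropping the top element leaves some sets without any upper bound.
no_top = [e for e in SIGNS if e != "true"]
print(lub(["x<=0", "x>=0"], no_top, leq))
```

The greatest lower bound is obtained symmetrically by flipping the direction of every comparison.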

Bounds example: the signs lattice (for variable x), listed by level: true; x≤0, x≥0; x<0, x=0, x>0; false.

In the signs lattice, x≥0 and true are upper bounds of {x=0, x>0}.

x≥0 is the least upper bound of {x=0, x>0}. The signs lattice, listed by level: true; x≤0, x≥0; x<0, x=0, x>0; false.

Join (confluence) operator. Assume a poset (D, ⊑). Let X ⊆ D be a subset of D (finite/infinite). The join of X is defined as ⊔X = the least upper bound (LUB) of all elements in X, if it exists: ⊔X = min{ b | for all x ∈ X we have that x ⊑ b }. It is the supremum of the elements in X, a kind of abstract union (disjunction) operator. Properties of a join operator. Commutative: x ⊔ y = y ⊔ x. Associative: (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z). Idempotent: x ⊔ x = x. Also, x ⊔ y = y iff x ⊑ y.

Properties of join. Can be used to define the partial order: x ⊔ y = y iff x ⊑ y. Monotone: if y ⊑ z then (x ⊔ y) ⊑ (x ⊔ z). ⊥ ⊔ x = x. ⊤ ⊔ x = ⊤.

Meet operator. Assume a poset (D, ⊑). Let X ⊆ D be a subset of D (finite/infinite). The meet of X is defined as ⊓X = the greatest lower bound (GLB) of all elements in X, if it exists: ⊓X = max{ b | for all x ∈ X we have that b ⊑ x }. It is the infimum of the elements in X, a kind of abstract intersection (conjunction) operator. Properties of a meet operator. Commutative: x ⊓ y = y ⊓ x. Associative: (x ⊓ y) ⊓ z = x ⊓ (y ⊓ z). Idempotent: x ⊓ x = x.

Complete partial orders

Complete partial order (CPO) A CPO is a partial order where each ascending chain has a supremum

CPO example. Is there a join here? Elements, listed by level: x≤0, x≥0; x<0, x=0, x>0; false.

Lattices

Complete lattice. A complete lattice (D, ⊑, ⊔, ⊓, ⊥, ⊤) is: a set of elements D; a partial order x ⊑ y; a join operator ⊔; a meet operator ⊓.

Join semilattice. A join semilattice (D, ⊑, ⊔, ⊤) is: a set of elements D with ⊤; a partial order x ⊑ y; a join operator ⊔.

Meet semilattice. A meet semilattice (D, ⊑, ⊓, ⊥) is: a set of elements D with ⊥; a partial order x ⊑ y; a meet operator ⊓.

Powerset lattices. For a set of elements X we define the powerset lattice for X as (2^X, ⊆, ∪, ∩, ∅, X). Notice it is a complete lattice. For a set of program states State, we define the collecting lattice (2^State, ⊆, ∪, ∩, ∅, State).
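A powerset lattice over a small set is easy to build with Python frozensets, which makes it a convenient place to check the join and meet laws. This is a minimal sketch: the three-element set `X` and all helper names are assumptions made for illustration.

```python
from itertools import combinations, product

def powerset(xs):
    """All subsets of xs as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def leq(a, b):
    return a <= b      # subset inclusion

def join(a, b):
    return a | b       # union

def meet(a, b):
    return a & b       # intersection

X = {"s1", "s2", "s3"}  # hypothetical set of three states
D = powerset(X)
BOTTOM, TOP = frozenset(), frozenset(X)

def join_all(family):
    """Join of any family of elements; the empty family joins to BOTTOM."""
    return frozenset().union(*family)

def meet_all(family):
    """Meet of any family of elements; the empty family meets to TOP."""
    result = TOP
    for s in family:
        result = result & s
    return result

# The order can be recovered from join: x ⊔ y = y iff x ⊑ y.
print(all((join(a, b) == b) == leq(a, b) for a, b in product(D, repeat=2)))
```

Because `join_all` and `meet_all` are defined for every family of subsets, including the empty one, the structure is complete in the sense used on the slide.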

Composing lattices

One lattice per variable. For x, listed by level: true; x≤0, x≥0; x<0, x=0, x>0; false. For y: true; y≤0, y≥0; y<0, y=0, y>0; false. How can we compose them?

Cartesian product

Cartesian product of complete lattices. For two complete lattices L1 = (D1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1) and L2 = (D2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2), define the poset Lcart = (D1×D2, ⊑cart, ⊔cart, ⊓cart, ⊥cart, ⊤cart) as follows: (x1, x2) ⊑cart (y1, y2) iff x1 ⊑1 y1 and x2 ⊑2 y2. ⊔cart = ? ⊓cart = ? ⊥cart = ? ⊤cart = ? Lemma: Lcart is a complete lattice. Define the Cartesian constructor Lcart = Cart(L1, L2).
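One natural way to fill in the question marks is to define every operation componentwise. The sketch below does that. The `Lattice` record, the `cart` function, and the use of a powerset of sign symbols for each variable are assumptions made for illustration; note that this powerset has one element more (x≠0) than the signs lattice drawn on the slides.

```python
from collections import namedtuple

# Hypothetical record bundling the operations of a lattice.
Lattice = namedtuple("Lattice", "leq join meet bottom top")

def powerset_lattice(universe):
    u = frozenset(universe)
    return Lattice(
        leq=lambda a, b: a <= b,
        join=lambda a, b: a | b,
        meet=lambda a, b: a & b,
        bottom=frozenset(),
        top=u,
    )

def cart(l1, l2):
    """Componentwise product of two lattices over pairs (x1, x2)."""
    return Lattice(
        leq=lambda x, y: l1.leq(x[0], y[0]) and l2.leq(x[1], y[1]),
        join=lambda x, y: (l1.join(x[0], y[0]), l2.join(x[1], y[1])),
        meet=lambda x, y: (l1.meet(x[0], y[0]), l2.meet(x[1], y[1])),
        bottom=(l1.bottom, l2.bottom),
        top=(l1.top, l2.top),
    )

signs_x = powerset_lattice("-0+")
signs_y = powerset_lattice("-0+")
L = cart(signs_x, signs_y)

neg, pos = frozenset("-"), frozenset("+")
both = L.join((neg, neg), (pos, pos))  # join of (x<0, y<0) and (x>0, y>0)
print(both)
print(L.leq((neg, pos), both))         # (x<0, y>0) is also covered
```

The join of (x<0, y<0) and (x>0, y>0) also covers (x<0, y>0), so the product forgets that the signs of x and y were correlated.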

Cartesian product example. Hasse diagram, listed by level: (true, true); (x≤0, true) (x≥0, true) (true, y≤0) (true, y≥0); (x≤0, y≤0) (x≤0, y≥0) (x≥0, y≤0) (x≥0, y≥0) …; (x≤0, y<0) (x≤0, y=0) (x≤0, y>0) (x>0, y≤0) …; (x<0, y<0) (x<0, y=0) (x<0, y>0) (x=0, y<0) (x=0, y=0) (x=0, y>0) (x>0, y<0) (x>0, y=0) (x>0, y>0); … (x<0, false) (false, y>0) …; (false, false). How does it represent (x<0 ∧ y<0) ∨ (x>0 ∧ y>0)?

Disjunctive completion

Disjunctive completion. For a complete lattice L = (D, ⊑, ⊔, ⊓, ⊥, ⊤), define the powerset lattice L∨ = (2^D, ⊑∨, ⊔∨, ⊓∨, ⊥∨, ⊤∨). ⊑∨ = ? ⊔∨ = ? ⊓∨ = ? ⊥∨ = ? ⊤∨ = ? Lemma: L∨ is a complete lattice. L∨ contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates. Define the disjunctive completion constructor L∨ = Disj(L).

The base lattice CPfalse, listed by level: true; … {x=-2} {x=-1} {x=0} {x=1} {x=2} …; false.

The disjunctive completion of CPfalse. What is the height of this lattice? Listed by level: true; … {x is even} {x is odd} {x is prime} …; … {x=-1 ∨ x=1 ∨ x=-2} {x=0 ∨ x=1 ∨ x=2} …; … {x=-2 ∨ x=-1} {x=-2 ∨ x=0} {x=-2 ∨ x=1} {x=1 ∨ x=2} …; … {x=-2} {x=-1} {x=0} {x=1} {x=2} …; false.

Relational product

Relational product of lattices. L1 = (D1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1), L2 = (D2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2), Lrel = (2^(D1×D2), ⊑rel, ⊔rel, ⊓rel, ⊥rel, ⊤rel) as follows: Lrel = ?

Relational product of lattices. L1 = (D1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1), L2 = (D2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2), Lrel = (2^(D1×D2), ⊑rel, ⊔rel, ⊓rel, ⊥rel, ⊤rel) as follows: Lrel = Disj(Cart(L1, L2)). Lemma: Lrel is a complete lattice. What does it buy us?
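A small experiment suggests what the relational product buys. The sketch below is a simplification under assumptions: elements of Disj(Cart(L1, L2)) are modelled as plain sets of pairs, the order is taken to be subset inclusion, and pairs are written as hypothetical string labels such as `("x<0", "y<0")`. A full construction might also identify sets that denote the same disjunction, which is not attempted here.

```python
def disj_leq(a, b):
    """Assumed order on sets of pairs: plain subset inclusion."""
    return a <= b

def disj_join(a, b):
    """Join keeps every disjunct of both arguments."""
    return a | b

def disj_meet(a, b):
    return a & b

# Each element is a set of pairs from the Cartesian product of two signs lattices.
p = frozenset({("x<0", "y<0")})
q = frozenset({("x>0", "y>0")})
r = disj_join(p, q)  # represents (x<0 ∧ y<0) ∨ (x>0 ∧ y>0)

print(sorted(r))
print(("x<0", "y>0") in r)  # the mixed-sign case is not introduced
```

Unlike a componentwise join, the result keeps the two disjuncts apart, so the correlation between the signs of x and y is preserved at the cost of a much larger lattice.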

Cartesian product example. [Figure: Hasse diagram of the Cartesian product of the signs lattices for x and y, from (true, true) at the top down to (false, false) at the bottom.] How does it represent (x<0 ∧ y<0) ∨ (x>0 ∧ y>0)? What is the height of this lattice?

Relational product example. Listed by level: true; x≤0, x≥0, y≤0, y≥0; (x<0 ∧ y<0) ∨ (x>0 ∧ y>0), (x<0 ∧ y<0) ∨ (x>0 ∧ y=0), (x<0 ∧ y≤0) ∨ (x<0 ∧ y≥0), …; false. How does it represent (x<0 ∧ y<0) ∨ (x>0 ∧ y>0)? What is the height of this lattice?

A lattice for collecting semantics

Collecting semantics. Program: label0: if x <= 0 goto label1; x := x - 1; goto label0; label1: … [Figure: control-flow graph with nodes entry, 2 (if x > 0), 3 (x := x - 1), and exit, each edge annotated with the set of states that reach it, drawn from …, [x↦-2], [x↦-1], [x↦0], [x↦1], [x↦2], [x↦3], …]

Defining the collecting semantics How should we represent the set of states at a single control-flow node by a lattice? How should we represent the sets of states at all control-flow nodes by a lattice?

Finite maps. For a complete lattice L = (D, ⊑, ⊔, ⊓, ⊥, ⊤) and a finite set V, define the poset LV→L = (V→D, ⊑V→L, ⊔V→L, ⊓V→L, ⊥V→L, ⊤V→L) as follows: f1 ⊑V→L f2 iff for all v ∈ V, f1(v) ⊑ f2(v). ⊔V→L = ? ⊓V→L = ? ⊥V→L = ? ⊤V→L = ? Lemma: LV→L is a complete lattice. Define the map constructor LV→L = Map(V, L).
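As with the Cartesian product, a natural candidate for the missing operations is the pointwise lifting of the operations of L. The following sketch represents a map as a Python dict over a fixed key set V; the helper names and the two-node example are assumptions for illustration.

```python
def map_leq(f1, f2, leq):
    """f1 ⊑ f2 iff f1(v) ⊑ f2(v) for every v; both maps are assumed to share the keys V."""
    return all(leq(f1[v], f2[v]) for v in f1)

def map_join(f1, f2, join):
    """Pointwise join."""
    return {v: join(f1[v], f2[v]) for v in f1}

def map_meet(f1, f2, meet):
    """Pointwise meet."""
    return {v: meet(f1[v], f2[v]) for v in f1}

def map_const(nodes, value):
    """Constant map; with value = ⊥ (or ⊤) of L this is the bottom (or top) map."""
    return {v: value for v in nodes}

# Hypothetical instance: V is two control-flow nodes, L is a powerset lattice of integers.
def subset(a, b):
    return a <= b

def union(a, b):
    return a | b

V = ["entry", "exit"]
f1 = {"entry": frozenset({1}), "exit": frozenset()}
f2 = {"entry": frozenset({1, 2}), "exit": frozenset({0})}

print(map_leq(f1, f2, subset))
print(map_join(f1, f2, union))
```

Choosing V as the set of control-flow nodes and L as a powerset of states gives one set of states per node, which is the shape needed for a collecting semantics.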

The collecting lattice Lattice for a given control-flow node v: ? Lattice for entire control-flow graph with nodes V: ? We will use this lattice as a baseline for static analysis and define abstractions of its elements

The collecting lattice. Lattice for a given control-flow node v: Lv = (2^State, ⊆, ∪, ∩, ∅, State). Lattice for the entire control-flow graph with nodes V: LCFG = Map(V, Lv). We will use this lattice as a baseline for static analysis and define abstractions of its elements.

Equational definition of the semantics. Define variables of type set of states for each control-flow node. Define constraints between them. [Figure: control-flow graph with nodes entry, 2 (if x > 0), 3 (x := x - 1), and exit, with variables R[entry], R[2], R[3], R[exit].]

Equational definition of the semantics. R[entry] = State. R[2] = R[entry] ∪ ⟦x:=x-1⟧ R[3]. R[3] = ⟦assume x>0⟧ R[2]. R[exit] = ⟦assume x≤0⟧ R[2]. A recursive system of equations. How can we approximate it using what we have learned so far? [Figure: control-flow graph with nodes entry, 2 (if x > 0), 3 (x := x - 1), and exit, with variables R[entry], R[2], R[3], R[exit].]
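To see the recursive system in action, the sketch below solves it by repeated substitution starting from empty sets. It is an illustration only: the real State is infinite, so a hypothetical bounded stand-in `STATE` (values of x from -3 to 3) is assumed, and the node names are taken from the figure.

```python
# Hypothetical bounded stand-in for State: the possible values of x.
STATE = frozenset(range(-3, 4))
NODES = ("entry", "2", "3", "exit")

def step(r):
    """Evaluate the right-hand side of every equation once."""
    return {
        "entry": STATE,
        "2": r["entry"] | frozenset(x - 1 for x in r["3"]),  # join of entry and x := x-1
        "3": frozenset(x for x in r["2"] if x > 0),          # assume x > 0
        "exit": frozenset(x for x in r["2"] if x <= 0),      # assume x <= 0
    }

def solve():
    """Iterate from the all-empty assignment until nothing changes."""
    r = {n: frozenset() for n in NODES}
    while True:
        nxt = step(r)
        if nxt == r:
            return r
        r = nxt

solution = solve()
print(sorted(solution["3"]))
print(sorted(solution["exit"]))
```

The iteration stops here only because the stand-in state space is finite; over the real, infinite State this naive loop need not terminate, which is one motivation for moving to an abstract domain.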

An abstract semantics. R[entry] = ⊤. R[2] = R[entry] ⊔ ⟦x:=x-1⟧# R[3], where ⟦x:=x-1⟧# is the abstract transformer for x:=x-1. R[3] = ⟦assume x>0⟧# R[2]. R[exit] = ⟦assume x≤0⟧# R[2]. A recursive system of equations. [Figure: control-flow graph with nodes entry, 2 (if x > 0), 3 (x := x - 1), and exit, with variables R[entry], R[2], R[3], R[exit].]
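The abstract system can be solved the same way once an abstract domain is fixed. The slide does not name one, so the sketch below assumes a simple sign domain whose elements are sets of the symbols `-`, `0`, `+`, with union as join. The transformers `dec`, `assume_pos`, and `assume_nonpos` are hypothetical names for ⟦x:=x-1⟧#, ⟦assume x>0⟧#, and ⟦assume x≤0⟧#.

```python
TOP = frozenset("-0+")     # any sign
BOTTOM = frozenset()       # no value at all
NODES = ("entry", "2", "3", "exit")

def dec(a):
    """Abstract transformer for x := x - 1 on sets of signs."""
    out = set()
    if "-" in a:
        out |= {"-"}        # negative minus one stays negative
    if "0" in a:
        out |= {"-"}        # zero minus one is negative
    if "+" in a:
        out |= {"0", "+"}   # positive minus one is zero or positive
    return frozenset(out)

def assume_pos(a):
    return a & frozenset("+")

def assume_nonpos(a):
    return a & frozenset("-0")

def step(r):
    """Evaluate the right-hand side of every abstract equation once."""
    return {
        "entry": TOP,
        "2": r["entry"] | dec(r["3"]),
        "3": assume_pos(r["2"]),
        "exit": assume_nonpos(r["2"]),
    }

def solve():
    r = {n: BOTTOM for n in NODES}
    while True:
        nxt = step(r)
        if nxt == r:
            return r
        r = nxt

result = solve()
print({n: "".join(sorted(result[n])) for n in NODES})
```

The domain has finitely many elements, so the iteration terminates for any initial set of values of x, and the result at exit says that x is negative or zero there.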

The meaning of a sound analysis result. R[entry] = ⊤. R[2] ⊒ R[entry] ⊔ ⟦x:=x-1⟧# R[3]. R[3] ⊒ ⟦assume x>0⟧# R[2]. R[exit] ⊒ ⟦assume x≤0⟧# R[2]. A recursive system of inequations. [Figure: control-flow graph with nodes entry, 2 (if x > 0), 3 (x := x - 1), and exit, with variables R[entry], R[2], R[3], R[exit].]

see you next time