Lecture 01 - Introduction. Eran Yahav. Goal: understand program analysis & synthesis; apply these techniques in your research; understand jargon/papers.

Presentation transcript:

Lecture 01 - Introduction
Eran Yahav

Goal
• Understand program analysis & synthesis
  - apply these techniques in your research
  - understand jargon/papers
  - conduct research in this area
• We will cover some areas in more depth than others
• What will help us
  - TA: Nimrod Partush
  - lecture summaries
  - 3-5 homework assignments
  - Small lightweight project
  - No exam

December 31, 2008

Zune Bug
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }

Zune Bug
    while (366 > 365) {
        if (IsLeapYear(2008)) {
            if (366 > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
On day 366 of leap year 2008 the inner test 366 > 366 fails, so neither days nor year changes and the loop never terminates.
Suggested solution: wait for tomorrow
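To see the hang without freezing anything, the loop can be transcribed with an iteration cap. This is only a sketch: the base year 1980, the cap, and the function names are assumptions for illustration, and the "fixed" variant is just one possible repair, not the vendor's patch.

```python
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_from_days_buggy(days, year=1980, max_iterations=1000):
    """Transcription of the slide's loop; the cap makes the hang observable."""
    for _ in range(max_iterations):
        if days <= 365:
            return year, days
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # days == 366 in a leap year: nothing changes, so the loop spins
        else:
            days -= 365
            year += 1
    return None  # cap reached: the original loop would never terminate


def year_from_days_fixed(days, year=1980):
    """One possible repair: compare against the length of the current year."""
    while True:
        year_length = 366 if is_leap_year(year) else 365
        if days <= year_length:
            return year, days
        days -= year_length
        year += 1


# Day numbers counted from January 1, 1980 (day 1); 1980-2007 span 10227 days.
DEC_30_2008 = 10227 + 365
DEC_31_2008 = 10227 + 366
```

Calling `year_from_days_buggy(DEC_31_2008)` hits the cap, while the same call for `DEC_30_2008` returns normally, which matches the "wait for tomorrow" workaround.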

February 25, 1991

Patriot Bug - Rounding Error
• Time measured in 1/10 seconds
• Binary expansion of 1/10: 0.0001100110011001100110011001100…
• 24-bit register: error of 0.0000000000000000000000011001100… binary, or ~0.000000095 decimal
• After 100 hours of operation the error is 0.000000095×100×3600×10 = 0.34 seconds
• A Scud travels at about 1,676 meters per second, and so travels more than half a kilometer in this time
Suggested solution: reboot every 10 hours
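The arithmetic on this slide can be replayed exactly with rational numbers. The sketch below assumes that the 24-bit register keeps 23 bits after the binary point and chops (rather than rounds) the value of 1/10; the variable names are illustrative only.

```python
from fractions import Fraction

FRACTION_BITS = 23  # assumption: bits kept after the binary point

tenth = Fraction(1, 10)
# Chop 1/10 to FRACTION_BITS fractional bits (int() truncates toward zero).
stored = Fraction(int(tenth * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = tenth - stored          # about 9.5e-8 seconds per 0.1 s tick

ticks = 100 * 3600 * 10                  # 100 hours, counted in tenths of a second
drift_seconds = float(error_per_tick * ticks)
distance_m = drift_seconds * 1676        # Scud speed quoted on the slide, m/s

print(f"error per tick: {float(error_per_tick):.3e} s")
print(f"clock drift after 100 h: {drift_seconds:.4f} s")
print(f"distance travelled in that time: {distance_m:.0f} m")
```

Under these assumptions the drift comes out at roughly a third of a second, consistent with the 0.34 quoted above.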

August 13, 2003
I just want to say LOVE YOU SAN!! (W32.Blaster.Worm)

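Windows Exploit(s): Buffer Overflow
    #include <string.h>

    void foo (char *x) {
        char buf[2];
        strcpy(buf, x);   /* no bounds check: copies past the end of buf */
    }
    int main (int argc, char *argv[]) {
        foo(argv[1]);
    }

    ./a.out abracadabra
    Segmentation fault
[Figure: stack diagram with axes "Stack grows this way" and "Memory addresses"; frame contents: Previous frame, Return address, Saved FP, char* x, buf[2]. The copied input "ab ra ca da br …" spills past buf[2] into the neighbouring slots (YMMV).]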

(In)correct Usage of APIs
• Application Trend: Increasing number of libraries and APIs
  - Non-trivial restrictions on permitted sequences of operations
• Typestate: Temporal safety properties
  - Which sequences of operations are permitted on an object?
  - Encoded as a DFA, e.g. "Don't use a Socket unless it is connected"
[Figure: typestate DFA with states init, connected, closed, err and transitions labeled connect(), close(), getInputStream(), getOutputStream(), and *.]
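A typestate property of this kind can be written down as a small transition table. The sketch below is a simplified reading of the Socket example: the exact edge set of the slide's DFA is not fully recoverable, so the assumption here is that every operation not listed leads to the sink state err.

```python
# Assumed transition table; any (state, operation) pair not listed goes to "err".
TRANSITIONS = {
    ("init", "connect"): "connected",
    ("connected", "getInputStream"): "connected",
    ("connected", "getOutputStream"): "connected",
    ("connected", "close"): "closed",
}


def run_typestate(operations, state="init"):
    """Return the typestate reached after applying the operations in order."""
    for op in operations:
        # "err" has no outgoing entries, so it behaves as a sink state.
        state = TRANSITIONS.get((state, op), "err")
    return state


ok_trace = ["connect", "getOutputStream", "close"]
bad_trace = ["getOutputStream"]            # used before connect()
print(run_typestate(ok_trace), run_typestate(bad_trace))
```

A static typestate verifier has to establish that no object can reach err on any execution, which is much harder than replaying one trace as done here.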

Challenges
    class SocketHolder { Socket s; }

    Socket makeSocket() {
        return new Socket(); // A
    }
    open(Socket l) {
        l.connect();
    }
    talk(Socket s) {
        s.getOutputStream().write("hello");
    }
    main() {
        Set<SocketHolder> set = new HashSet<SocketHolder>();
        while(…) {
            SocketHolder h = new SocketHolder();
            h.s = makeSocket();
            set.add(h);
        }
        for (Iterator<SocketHolder> it = set.iterator(); …) {
            Socket g = it.next().s;
            open(g);
            talk(g);
        }
    }

Testing is Not Enough
• Observe some program behaviors
• What can you say about other behaviors?
• Concurrency makes things worse
• Smart testing is useful
  - requires the techniques that we will see in the course

Program Analysis & Synthesis*
[Figure: two levels, "High-level Language / Specification" and "Low-level language / Implementation", connected by arrows labeled "analysis" and "synthesis".]
* informally speaking

Static Analysis
Reason statically (at compile time) about the possible runtime behaviors of a program.
"The algorithmic discovery of properties of a program by inspection of its source text [1]" -- Manna, Pnueli
[1] Does not have to literally be the source text; it just means without running the program.

Static Analysis
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
• Bad news: problem is generally undecidable

Static Analysis
• Central idea: use approximation
[Figure: nested sets inside the universe: the Under Approximation lies inside the exact set of configurations/behaviors, which lies inside the Over Approximation.]

Over Approximation
    x = ?
    if (x > 0) {
        y = 42;
    } else {
        y = 73;
        foo();
    }
    assert (y == 42);
• Over approximation: assertion may be violated

Precision
• Lose precision only when required
• Understand where precision is lost
    main(…) { printf("assertion may be violated\n"); }

Static Analysis  Formalize software behavior in a mathematical model (semantics)  Prove properties of the mathematical model  Automatically, typically with approximation of the formal semantics  Develop theory and tools for program correctness and robustness 19

Static Analysis  Spans a wide range  type checking … up to full functional verification  General safety specifications  Security properties (e.g., information flow)  Concurrency correctness conditions (e.g., progress, linearizability)  Correct use of libraries (e.g., typestate)  Under-approximations useful for bug-finding, test-case generation,… 20

Static Analysis: Techniques
• Abstract Interpretation
• Dataflow analysis
• Constraint-based analysis
• Type and effect systems
• …
(we will not be able to cover all in depth)

Static Analysis for Verification
[Figure: an Analyzer takes a program and a specification as input and outputs either "Valid" or an abstract counter example.]

Verification Challenge I
    main(int i) {
        int x=3,y=1;
        do {
            y = y + 1;
        } while(--i > 0)
        assert 0 < x + y
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded

Abstract Interpretation
    main(int i) {
        int x=3,y=1;
        do {
            y = y + 1;
        } while(--i > 0)
        assert 0 < x + y
    }
Determine what states can arise during any execution
Challenge: set of states is unbounded
Solution: compute a bounded representation of (a superset of) the program states
Recipe
1) Abstraction
2) Transformers
3) Exploration

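1) Abstraction
    main(int i) {
        int x=3,y=1;
        do {
            y = y + 1;
        } while(--i > 0)
        assert 0 < x + y
    }
• concrete state: $\sigma : \mathit{Var} \to \mathbb{Z}$
• abstract state (sign): $\sigma^{\#} : \mathit{Var} \to \{+, 0, -, ?\}$
[Figure: tables over x, y, i showing a concrete state (x, y, i) = (3, 1, 7) and abstract sign states; the remaining entries were lost in extraction.]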

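2) Transformers
    main(int i) {
        int x=3,y=1;
        do {
            y = y + 1;
        } while(--i > 0)
        assert 0 < x + y
    }
• concrete transformer
• abstract transformer
[Figure: the concrete transformer for y = y + 1 takes (x, y, i) = (3, 1, 0) to (3, 2, 0); the abstract transformer acts on sign states such as (+, +, 0) and (+, ?, 0).]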

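3) Exploration
    main(int i) {
        int x=3,y=1;
        do {
            y = y + 1;
        } while(--i > 0)
        assert 0 < x + y
    }
[Figure: abstract sign states over (x, y, i), such as (+, +, ?), attached to each program point as the exploration proceeds; individual entries are not recoverable.]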
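The three steps of the recipe can be put together in a few lines for this particular program. What follows is a minimal sketch under stated assumptions, not the analysis shown on the slides: the state is a dictionary of signs, the function names are invented for illustration, and the loop structure of the example is hard-coded rather than derived from a control-flow graph.

```python
def join(a, b):
    """Least upper bound in the sign domain {+, 0, -, ?}."""
    return a if a == b else "?"


def add(a, b):
    """Abstract addition on signs."""
    if a == "0":
        return b
    if b == "0":
        return a
    return a if a == b else "?"   # (+)+(+) = +, (-)+(-) = -, otherwise unknown


def analyze():
    entry = {"x": "+", "y": "+", "i": "?"}   # x = 3, y = 1, i is an unknown input
    head = dict(entry)                        # abstract state at the top of the loop
    while True:
        body = dict(head)
        body["y"] = add(body["y"], "+")       # y = y + 1
        body["i"] = add(body["i"], "-")       # --i
        if body["i"] in ("+", "?"):           # back edge is feasible only if i > 0
            back = dict(body, i="+")
            new_head = {v: join(entry[v], back[v]) for v in entry}
        else:
            new_head = dict(entry)
        if new_head == head:                  # fixed point reached
            return body                       # state at loop exit
        head = new_head


exit_state = analyze()
assertion_proved = add(exit_state["x"], exit_state["y"]) == "+"
print(exit_state, assertion_proved)
```

Because the sign domain is finite and `join` only moves states upward, the loop in `analyze` is guaranteed to stop; here it stabilizes after a single round.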

Incompleteness
    main(int i) {
        int x=3,y=1;
        do {
            y = y - 2;
            y = y + 3;
        } while(--i > 0)
        assert 0 < x + y
    }
[Figure: abstract sign states over (x, y, i) at each program point; y becomes ? after y = y - 2, so states such as (+, ?, ?) reach the assertion.]
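The loss of precision can be reproduced directly: in the sign domain, subtracting a positive number from a positive number already yields "unknown", and adding 3 afterwards cannot recover it. The sketch below uses illustrative helper names and compares the abstract result with a bounded set of concrete runs.

```python
def add(a, b):
    """Abstract addition on signs {+, 0, -, ?}."""
    if a == "0":
        return b
    if b == "0":
        return a
    return a if a == b else "?"


def neg(a):
    return {"+": "-", "-": "+"}.get(a, a)   # 0 and ? are unchanged


def sub(a, b):
    return add(a, neg(b))


# Abstract run of one loop body starting from y = +.
y_after_body = add(sub("+", "+"), "+")       # y = y - 2; y = y + 3
abstract_sum = add("+", y_after_body)        # sign of x + y at the assertion


def run_concrete(i):
    """Concrete execution of the program; returns whether the assertion holds."""
    x, y = 3, 1
    while True:
        y = y - 2
        y = y + 3
        i -= 1
        if not i > 0:
            break
    return 0 < x + y


print(abstract_sum, all(run_concrete(i) for i in range(-3, 20)))
```

The assertion holds on every concrete run tried, yet the sign analysis can only report "?": the analysis stays sound but is incomplete for this program.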

Parity Abstraction
    while (x != 1) do {
        if (x % 2) == 0 {
            x := x / 2;
        } else {
            x := x * 3 + 1;
            assert (x % 2 == 0);
        }
    }
Challenge: how to find "the right" abstraction
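For this assertion a parity domain is enough, whereas signs say nothing useful. A minimal sketch, assuming a three-element domain {even, odd, ?} and illustrative function names:

```python
def parity_of(n):
    return "even" if n % 2 == 0 else "odd"


def mul(a, b):
    """Abstract multiplication on parities."""
    if "even" in (a, b):
        return "even"          # an even factor makes the product even
    if a == b == "odd":
        return "odd"
    return "?"


def add(a, b):
    """Abstract addition on parities."""
    if "?" in (a, b):
        return "?"
    return "even" if a == b else "odd"


# In the else branch the test x % 2 == 0 failed, so x is odd.
x_in_else_branch = "odd"
after_update = add(mul(x_in_else_branch, parity_of(3)), parity_of(1))  # x * 3 + 1

# Without using the branch condition, x is unknown and nothing can be proved.
without_refinement = add(mul("?", parity_of(3)), parity_of(1))
print(after_update, without_refinement)
```

The second computation shows that the choice of domain is not the whole story: the analysis also has to exploit the branch condition to learn that x is odd.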

Finding "the right" abstraction?
• pick an abstract domain suited for your property
  - numerical domains
  - domains for reasoning about the heap
  - …
• combination of abstract domains
• another approach
  - abstraction refinement

Example: Shape (Heap) Analysis
    void stack-init(int i) {
        Node* x = null;
        do {
            Node* t = malloc(…)
            t->n = x;
            x = t;
        } while(--i>0)
        Top = x;
    }
    assert(acyclic(Top))
[Figure: abstract heap shapes at each program point, starting from emp; nodes are pointed to by x, t and Top and linked by n fields. The individual graphs are not recoverable.]

Following the Recipe (In a Nutshell)
1) Abstraction
[Figure: a concrete state (a list reachable from x and t through n fields) next to its abstract state.]
2) Transformers
[Figure: effect of the statement t->n = x on an abstract heap.]

3) Exploration
    void stack-init (int i) {
        Node* x = null;
        do {
            Node* t = malloc(…)
            t->n = x;
            x = t;
        } while(--i>0)
        Top = x;
    }
    assert(acyclic(Top))
[Figure: the set of abstract heap shapes (starting from emp, over x, t, Top and n links) accumulated at each program point during exploration. The individual graphs are not recoverable.]

Example: Polyhedra (Numerical) Domain
    proc MC(n:int) returns (r:int)
    var t1:int, t2:int;
    begin
      if (n>100) then
        r = n-10;
      else
        t1 = n + 11;
        t2 = MC(t1);
        r = MC(t2);
      endif;
    end

    var a:int, b:int;
    begin
      b = MC(a);
    end
What is the result of this program?

McCarthy 91 function
    proc MC (n : int) returns (r : int)
    var t1 : int, t2 : int;
    begin
      /* (L6 C5) top */
      if n > 100 then
        /* (L7 C17) [|n-101>=0|] */
        r = n - 10;
        /* (L8 C14) [|-n+r+10=0; n-101>=0|] */
      else
        /* (L9 C6) [|-n+100>=0|] */
        t1 = n + 11;
        /* (L10 C17) [|-n+t1-11=0; -n+100>=0|] */
        t2 = MC(t1);
        /* (L11 C17) [|-n+t1-11=0; -n+100>=0; -n+t2-1>=0; t2-91>=0|] */
        r = MC(t2);
        /* (L12 C16) [|-n+t1-11=0; -n+100>=0; -n+t2-1>=0; t2-91>=0; r-t2+10>=0; r-91>=0|] */
      endif;
      /* (L13 C8) [|-n+r+10>=0; r-91>=0|] */
    end

    var a : int, b : int;
    begin
      /* (L18 C5) top */
      b = MC(a);
      /* (L19 C12) [|-a+b+10>=0; b-91>=0|] */
    end
Result: if (n>=101) then n-10 else 91
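The invariant inferred at the end of MC can be checked against concrete runs. The sketch below simply executes the function on a bounded range of inputs (the range is an arbitrary choice) and tests both the closed form and the two linear constraints from the final annotation.

```python
def mc(n):
    """McCarthy 91 function, transcribed from the procedure MC."""
    if n > 100:
        return n - 10
    return mc(mc(n + 11))


def closed_form(n):
    return n - 10 if n >= 101 else 91


inputs = range(-50, 200)
matches_closed_form = all(mc(n) == closed_form(n) for n in inputs)

# Constraints from the final annotation: -n + r + 10 >= 0 and r - 91 >= 0.
satisfies_invariant = all(
    -n + mc(n) + 10 >= 0 and mc(n) - 91 >= 0 for n in inputs
)
print(matches_closed_form, satisfies_invariant)
```

Testing only covers the inputs tried; the polyhedral invariant, by contrast, holds for every n, although it is weaker than the exact closed form.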

Some things that should trouble you
• does a result always exist?
• does the recipe always converge?
• is the result always "the best"?
• how do I pick my abstraction?
• how do I come up with abstract transformers?

Abstraction Refinement
Change the abstraction to match the program
[Figure: Verify takes a program, a specification and an abstraction; it outputs either "Valid" or an abstract counter example, which is passed to Abstraction Refinement to produce a new abstraction.]

Recap: program analysis
• Reason statically (at compile time) about the possible runtime behaviors of a program
• use sound over-approximation of program behavior
• abstract interpretation
  - abstract domain
  - transformers
  - exploration (fixed-point computation)
• finding the right abstraction?

Program Synthesis  Automatically synthesize a program that is correct-by-construction from a (higher-level) specification 39 programspecification Synthesizer

Program Synthesis: Techniques
• Gen/Test
• Theorem Proving
• Games
• SAT/SMT Solvers
• Transformational Synthesis
• Abstract Interpretation
• …
(we will not be able to cover all in depth)

Synthesis Challenge I
    signum(int x) {
        if (x>0) return 1;
        else if (x<0) return -1;
        else return 0;
    }
Challenge: Generate efficient assembly code for "signum"
    # x in d0
    add.l  d0, d0   | add d0 to itself
    subx.l d1, d1   | subtract (d1+carry) from d1
    negx.l d0       | put (0-d0-carry) into d0
    addx.l d1, d1   | add (d1+carry) to d1
    # signum(x) is now in d1

Superoptimizer [Massalin, 1987]
• exhaustive search over assembly programs
  - order search by increasing program length
  - check input/output "equivalence" with original code
    - boolean test: construct boolean formula for functions and compare them (not practical)
    - probabilistic test: run many times on some inputs and check if the outputs of both programs are the same
• expensive, only applied to critical pieces of code (e.g., common libraries)
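The search strategy can be illustrated on a toy machine. The sketch below is not Massalin's implementation: it assumes a made-up instruction set of four single-register 8-bit operations, and because the input space is tiny it checks equivalence exhaustively on all 256 inputs instead of using the boolean or probabilistic tests described above.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit machine (assumption for illustration)

# Hypothetical single-register instruction set.
OPS = {
    "dbl": lambda v: (v + v) & MASK,
    "inc": lambda v: (v + 1) & MASK,
    "neg": lambda v: (-v) & MASK,
    "not": lambda v: (~v) & MASK,
}


def run(program, value):
    """Execute a straight-line program (a tuple of instruction names)."""
    for name in program:
        value = OPS[name](value)
    return value


def superoptimize(spec, max_length=4):
    """Return the first shortest program that agrees with spec on every input."""
    for length in range(1, max_length + 1):          # increasing program length
        for program in product(OPS, repeat=length):  # exhaustive enumeration
            if all(run(program, x) == spec(x) for x in range(MASK + 1)):
                return program
    return None


result = superoptimize(lambda x: (2 * x + 2) & MASK)
print(result)
```

Even on this toy machine the number of candidates grows as 4^length, which is why the real technique is reserved for short, critical code sequences.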

Denali Superoptimizer [Joshi, Nelson, Randall, 2001]
Turn the search for a program into a search for a counter-example in a theorem prover:
"a refutation-based automatic theorem-prover is in fact a general-purpose goal-directed search engine, which can perform a goal-directed search for anything that can be specified in its declarative input language. Successful proofs correspond to unsuccessful searches, and vice versa."
(more details later in the course…)

Synthesis of Atomic Sections
[Figure: three procedures P1(), P2() and P3() with elided bodies; regions of the code are marked "atomic".]
Safety Specification: S

Synthesis of Atomic Sections
[Figure: the same three procedures P1(), P2() and P3() with elided bodies, shown without atomic sections.]
Safety Specification: S

Semantic Optimized Search [Vechev, Yahav, Bacon and Rinetzky, 2007]
[Figure: search space of candidate programs ordered from "less atomic" to "more atomic".]

Program Repair as a Game [Jobstmann et al. 2005]
    unsigned int got_lock = 0;
    ...
    1: while(*) {
         ...
    2:   if (*) {
    3:     lock();
    4:     got_lock++;
         }
         ...
    5:   if (got_lock != 0) {
    6:     unlock();
         }
    7:   got_lock--;
         ...
       }
    lock()   { lock:   LOCK:=1; }
    unlock() { unlock: LOCK:=0; }
Specification
P1: do not acquire a lock twice
P2: do not call unlock without holding the lock
P1: always( line=lock implies next( line!=lock w-until line=unlock ))
P2: ( line!=unlock w-until line=lock ) and always( line=unlock implies next( line!=unlock w-until line=lock ))
(slide adapted with permission from Barbara Jobstmann)

How to Repair a Reactive System?
1. Add freedom
   • choice for the system, space of permitted modifications to the system
2. Source code ➝ transition system (game)
   • non-determinism in the program (demonic)
   • non-determinism in permitted modification (angelic)
3. Specification ➝ monitor acceptance
4. Check if we can find system choices s.t. model is accepted by monitor
   • product of trans. system and monitor
   • search for winning strategy in game
(slide adapted with permission from Barbara Jobstmann)

Repaired Program
    unsigned int got_lock = 0;
    ...
    1: while(*) {
         ...
    2:   if (*) {
    3:     lock();
    4:     got_lock = 1;
         }
         ...
    5:   if (got_lock != 0) {
    6:     unlock();
         }
    7:   got_lock = 0;
         ...
       }
    lock()   { lock:   LOCK:=1; }
    unlock() { unlock: LOCK:=0; }
Specification
P1: do not acquire a lock twice
P2: do not call unlock without holding the lock
P1: always( line=lock implies next( line!=lock w-until line=unlock ))
P2: ( line!=unlock w-until line=lock ) and always( line=unlock implies next( line!=unlock w-until line=lock ))
(slide adapted with permission from Barbara Jobstmann)
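The difference between the original and the repaired program can be observed by enumerating the outcomes of the nondeterministic `if (*)` over a bounded number of loop iterations. This is a small simulation, not the game-based repair algorithm itself: the monitor is hand-coded for P1 and P2, the bound of four iterations is arbitrary, and the elided "..." parts of the program are assumed to do nothing relevant.

```python
from itertools import product

UINT_MAX = 2**32 - 1  # got_lock is an unsigned int, so arithmetic wraps


def violated_property(choices, repaired):
    """Run one loop iteration per entry of `choices` (outcome of `if (*)`)."""
    lock_held = False
    got_lock = 0
    for take_lock in choices:
        if take_lock:
            if lock_held:
                return "P1"                      # lock acquired twice
            lock_held = True
            got_lock = 1 if repaired else (got_lock + 1) & UINT_MAX
        if got_lock != 0:
            if not lock_held:
                return "P2"                      # unlock without holding the lock
            lock_held = False
        got_lock = 0 if repaired else (got_lock - 1) & UINT_MAX
    return None


def find_violation(repaired, max_iterations=4):
    """Search all choice sequences up to the bound, shortest first."""
    for n in range(1, max_iterations + 1):
        for choices in product([False, True], repeat=n):
            prop = violated_property(choices, repaired)
            if prop is not None:
                return choices, prop
    return None


print(find_violation(repaired=False))
print(find_violation(repaired=True))
```

In the original program, skipping the lock once makes `got_lock--` wrap around to a non-zero value, so the next iteration calls `unlock()` without holding the lock; the bounded search finds no such trace for the repaired version.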

Partial Programs and SKETCH [aLisp: Andre et al 2002, Sketch: Solar-Lezama et al 2006]
• a partial program plays the role of "freedom" in games
  - it defines a space of programs
• Given a partial program P with control variables C ("holes") and a specification S, the goal is to find an assignment for C such that $P[C] \models S$
[Figure: the Synthesizer takes the partial program double(x) { return ?? * x; } together with the specification double(x) { return x + x; } and produces double(x) { return 2 * x; }.]

SKETCH: isolate rightmost 0
    bit[W] isolate0 (bit[W] x) { // W: word size
        bit[W] ret=0;
        for (int i = 0; i < W; i++)
            if (!x[i]) { ret[i] = 1; break; }
        return ret;
    }
    bit[W] isolate0Fast (bit[W] x) implements isolate0 {
        return ~x & (x+1);
    }
    bit[W] isolate0Sketched(bit[W] x) implements isolate0 {
        return ~(x + ??) & (x + ??);
    }
(Hacker's Delight, H.S. Warren)

Synthesis as generalized SAT
• The sketch synthesis problem: $\exists c\ \forall x.\ \mathit{spec}(x) = \mathit{sketch}(x, c)$
• Counter-example driven solver
    $I = \emptyset$
    x = random-input()
    do
        $I = I \cup \{x\}$
        find c such that $\forall i \in I.\ \mathit{spec}(i) = \mathit{sketch}(i, c)$
        if cannot find c then exit("non-satisfiable sketch")
        find x such that $\mathit{spec}(x) \neq \mathit{sketch}(x, c)$
    while x != nil
    return c
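The counter-example driven loop can be tried out on the isolate-rightmost-0 sketch from the previous slide. The code below is a sketch under several assumptions: a 4-bit word, brute-force enumeration in place of a SAT solver for both "find" steps, a fixed first input instead of a random one, and bit 0 taken to be the least significant bit.

```python
W = 4                      # word size (assumption, kept small for brute force)
MASK = (1 << W) - 1


def spec(x):
    """isolate0: set only the position of the rightmost 0 bit of x."""
    for i in range(W):
        if not (x >> i) & 1:
            return 1 << i
    return 0


def sketch(x, c):
    """~(x + ??) & (x + ??) with the two holes filled by c = (c1, c2)."""
    c1, c2 = c
    return ~(x + c1) & (x + c2) & MASK


CANDIDATES = [(c1, c2) for c1 in range(MASK + 1) for c2 in range(MASK + 1)]


def cegis(first_input=0):
    inputs = set()
    x = first_input
    while x is not None:
        inputs.add(x)
        # Inductive synthesis: a candidate that is correct on all inputs seen so far.
        c = next((cand for cand in CANDIDATES
                  if all(spec(i) == sketch(i, cand) for i in inputs)), None)
        if c is None:
            raise ValueError("non-satisfiable sketch")
        # Verification: look for an input on which the candidate is wrong.
        x = next((v for v in range(MASK + 1) if spec(v) != sketch(v, c)), None)
    return c


holes = cegis()
print(holes)
```

Each counter-example is new, because the candidate already agrees with the specification on every input in the set, so the loop terminates after at most 2^W rounds; in practice only a handful of inputs are needed.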

SMARTEdit [Lau et al, 2000]
• synthesize editor macros (programs) from examples
• behind the scenes: machine learning techniques


Recap: program synthesis
• Automatically synthesize a program that is correct-by-construction from a (higher-level) specification
• many techniques
  - games
  - games with abstraction (abstract interpretation)

Coming up (extremely optimistic! more likely, we'll cover half of it)
• principles of program analysis
• overview of dataflow
• why dataflow works?
• abstract interpretation basics
• a taste of operational semantics
• numerical domains
• heap domains
• shape analysis
• approaches to program synthesis
• program synthesis using games
• abstraction-guided synthesis
• games with abstraction
• synthesis with machine learning techniques
• a tiny bit on SAT/SMT based synthesis

References  Patriot bug:   Patrick Cousot’s NYU lecture notes  Zune bug:   Blaster worm:  resources/malwarefaq/w32_blasterworm.php resources/malwarefaq/w32_blasterworm.php  Interesting CACM article  code-later/fulltext code-later/fulltext  MSC19_05%2FS a.pdf&code=d5af66869c188 1e b90c07d0c MSC19_05%2FS a.pdf&code=d5af66869c188 1e b90c07d0c 59