Finding bugs: Analysis Techniques & Tools Comparison of Program Analysis Techniques CS161 Computer Security Cho, Chia Yuan.



Soundness vs. completeness of an analysis:

Sound, complete:     reports all errors; reports no false alarms. Undecidable.
Sound, incomplete:   reports all errors; may report false alarms. Decidable.
Unsound, complete:   may not report all errors; reports no false alarms.
Unsound, incomplete: may not report all errors; may report false alarms. Decidable.

Techniques placed on this grid: Testing, Dynamic Analysis, Static Analysis, Manual Program Verification, Symbolic Execution, Abstract Interpretation, Dynamic Symbolic Execution

int issafe(char c) {
    return ('a' <= c && c <= 'z') || ('0' <= c && c <= '9') || (c == '_');
}

 1: int sanitize(char s[], size_t n) {
 2:     size_t i = 0, j = 0;
 3:     while (j < n) {
 4:         if (issafe(s[j])) {
 5:             s[i] = s[j];
 6:             i++; j++;
 7:         }
 8:         else {
 9:             j++;
10:         }
11:     }
12:     return i;
13: }

Q: Is this code memory safe?

Techniques to try: Manual Program Verification, Symbolic Execution, Abstract Interpretation, Dynamic Symbolic Execution


Q: Is this code memory safe? (Code: the sanitize() listing above.)

Q1. What assertions need to be valid for memory safety? Where should they be inserted?
Ans: The memory accesses are the reads of s[j] (lines 4 and 5) and the write to s[i] (line 5), so the checks go immediately before them:
    assert(0 <= j < n);
    assert(0 <= i < n);
Read 0 <= j < n as a mathematical range check; in C it has to be written 0 <= j && j < n.

Q2. What are the inputs (free variables) to the program?
Ans: s, n, s[0 .. n-1]

Objective: (1) prove the assertions valid, or (2) produce an input that violates an assertion.
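The assertion placement can be sketched directly in C. This is a minimal illustration, not the lecturer's code: the name `sanitize_checked` is hypothetical, the return type is changed from `int` to `size_t` to match the value returned, and the caller is assumed to pass a buffer of at least `n` bytes. Because `size_t` is unsigned, the `0 <=` half of each range check holds trivially and only the upper bound is asserted.

```c
#include <assert.h>
#include <stddef.h>

/* Same predicate as on the slides. */
static int issafe(char c) {
    return ('a' <= c && c <= 'z') || ('0' <= c && c <= '9') || (c == '_');
}

/* sanitize() with the two memory-safety checks made explicit.
   Assumption: s points to at least n writable bytes. */
static size_t sanitize_checked(char s[], size_t n) {
    size_t i = 0, j = 0;
    while (j < n) {
        assert(j < n);       /* guards the read of s[j] */
        if (issafe(s[j])) {
            assert(i < n);   /* guards the write to s[i] */
            s[i] = s[j];
            i++; j++;
        } else {
            j++;
        }
    }
    return i;                /* number of characters kept */
}
```

Running it on the 4-byte input "a?b_" keeps three characters and never trips an assertion. A passing run shows only that this one input is safe, which is exactly the gap the four techniques are meant to close.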

Manual Program Analysis (line numbers refer to the sanitize() listing above)

Q: What is the:
- Precondition at line 1?
- Loop invariant (i.e., invariant at line 4)?
- Postconditions at lines 6, 9, 10?
- Postconditions at lines 3 & 4?

Abstract Interpretation (line numbers refer to the sanitize() listing above)

Let's assume:
- Abstract domain: value-set intervals (i.e., [x, y])
- n == 2

Q: What are the value-set intervals of i and j after lines 2, 6, 9, 10, over iterations 1, 2 & 3?
Q: What if n is 1000?

Dynamic Symbolic Execution (line numbers refer to the sanitize() listing above)

Let's assume: n == 2, s != NULL, sizeof(s) >= 2

Q: How many paths are there in a full exploration?
Q: Start exploring from an initial test case s = "??". For each path, write down the formula that has to be solved to obtain a new input that drives execution down a new path.
Q: What if n is 1000?

Symbolic Execution (line numbers refer to the sanitize() listing above)

Let's assume: n == 2

Q: Transform the program (with assertions) into a single formula.
Q: What if n == 1000?


Answers


Manual Program Analysis: answers (line numbers refer to the sanitize() listing above)

Precondition at line 1:
    Requires: s != NULL && 0 <= n <= sizeof(s)
Loop invariant at line 4 (also the postcondition of the loop test on line 3):
    s != NULL && 0 <= n <= sizeof(s) && 0 <= i < n && 0 <= j < n
Assertions checked before the accesses on lines 4 and 5:
    assert(0 <= j < n); assert(0 <= i < n);
Postcondition at line 6:
    s != NULL && 0 <= n <= sizeof(s) && 0 < i <= n && 0 < j <= n
Postcondition at line 9:
    s != NULL && 0 <= n <= sizeof(s) && 0 <= i < n && 0 < j <= n
Postcondition at line 10:
    s != NULL && 0 <= n <= sizeof(s) && 0 <= i <= n && 0 < j <= n

Here sizeof(s) stands for the number of bytes allocated for s, not the C operator applied to the pointer parameter. Note that 0 <= i < n at line 4 does not follow from the line-10 postcondition and j < n alone; it also needs i <= j, which both branches preserve.
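As a sanity check on the loop invariant, the sketch below tests the strengthened form `i <= j && j < n` at every arrival at line 4, over every safe/unsafe pattern of a given length. The function names and the 16-byte cap are assumptions made for this illustration. Passing the check is evidence, not a proof: it corresponds to the testing corner of the comparison, while the manual argument is what establishes the invariant for all n.

```c
#include <assert.h>
#include <stddef.h>

static int issafe(char c) {
    return ('a' <= c && c <= 'z') || ('0' <= c && c <= '9') || (c == '_');
}

/* Runs the sanitize() loop and returns 1 if i <= j && j < n held at
   every arrival at line 4 and i <= n holds on exit, otherwise 0. */
static int invariant_holds(char s[], size_t n) {
    size_t i = 0, j = 0;
    while (j < n) {
        if (!(i <= j && j < n)) return 0;   /* candidate invariant at line 4 */
        if (issafe(s[j])) { s[i] = s[j]; i++; j++; }
        else { j++; }
    }
    return i <= n;                          /* returned length fits the buffer */
}

/* Tries all 2^n safe/unsafe patterns of length n.
   Returns 0 for n > 16, a limit chosen only to keep this sketch small. */
static int invariant_holds_for_all_patterns(size_t n) {
    char buf[16];
    if (n > sizeof buf) return 0;
    for (unsigned long mask = 0; mask < (1UL << n); mask++) {
        for (size_t k = 0; k < n; k++)
            buf[k] = ((mask >> k) & 1UL) ? 'a' : '?';   /* 'a' is safe, '?' is not */
        if (!invariant_holds(buf, n)) return 0;
    }
    return 1;
}
```

Only the safe/unsafe pattern matters for control flow, so two representative characters are enough to cover every path for a fixed n.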


Abstract Interpretation: answers, first pass (line numbers refer to the sanitize() listing above)

After line 2:   i: [0, 0]   j: [0, 0]
After line 6:   i: [1, 1]   j: [1, 1]
After line 9:   i: [0, 0]   j: [1, 1]
After line 10:  i: [0, 1]   j: [1, 1]

Abstract Interpretation: answers, second pass (line numbers refer to the sanitize() listing above)

At the loop head (line 2 joined with the first pass):   i: [0, 1]   j: [0, 1]
After line 6:   i: [1, 2]   j: [1, 2]
After line 9:   i: [0, 1]   j: [1, 2]
After line 10:  i: [0, 2]   j: [1, 2]

Third pass? What if n == 1000?
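The pass-by-pass intervals can be reproduced mechanically. The sketch below is one possible encoding, assuming a plain interval domain with join and no widening; the types `Interval` and `State` and the function `after_line10` are names invented for this illustration. It does not handle empty intervals, so it is meant for n >= 1 and a handful of passes.

```c
#include <assert.h>

typedef struct { long lo, hi; } Interval;
typedef struct { Interval i, j; } State;

/* Smallest interval containing both a and b. */
static Interval join(Interval a, Interval b) {
    Interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* Abstract effect of x++. */
static Interval add1(Interval a) {
    Interval r;
    r.lo = a.lo + 1;
    r.hi = a.hi + 1;
    return r;
}

/* Abstract state after line 10 once `passes` passes over the loop body
   have been analysed, for loop bound n. */
static State after_line10(int passes, long n) {
    State head = { {0, 0}, {0, 0} };          /* after line 2: i = j = 0 */
    State merged = head;
    for (int p = 0; p < passes; p++) {
        State in = head;
        if (in.j.hi > n - 1) in.j.hi = n - 1; /* line 3: keep only j < n */
        State then_s = in;                    /* lines 5-6: i++, j++ */
        then_s.i = add1(in.i);
        then_s.j = add1(in.j);
        State else_s = in;                    /* line 9: j++ */
        else_s.j = add1(in.j);
        merged.i = join(then_s.i, else_s.i);  /* line 10: merge both branches */
        merged.j = join(then_s.j, else_s.j);
        head.i = join(head.i, merged.i);      /* back edge into the loop head */
        head.j = join(head.j, merged.j);
    }
    return merged;
}
```

With n == 2 this yields the first-pass and second-pass values listed above. On a third pass the upper bound of i keeps growing, because intervals alone do not record that i <= j; this suggests why a plain interval analysis would report a possible violation of i < n here and would need many passes, or widening, for n == 1000.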


Dynamic Symbolic Execution: answers (line numbers refer to the sanitize() listing above)

Q: How many paths are there in a full exploration?
Ans: 4 paths!

1. Input: S0 == "??"
   Get Formula1: i1 == 0 && j1 == 0 && issafe(S0[j1])
   Get Formula2: i1 == 0 && j1 == 0 && !issafe(S0[j1]) && j2 == j1 + 1 && issafe(S0[j2])
2. Solve Formula1. Get input: e.g., S0 == "a?"
   Get Formula3: i1 == 0 && j1 == 0 && issafe(S0[j1]) && S1 == S0 WITH S0[i1] == S0[j1] && i2 == i1 + 1 && j2 == j1 + 1 && issafe(S1[j2])
3. Solve Formula2. Get input: e.g., S0 == "?a"
4. Solve Formula3. Get input: e.g., S0 == "aa"

What if n == 1000?
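The four inputs found above can be replayed to confirm that each one drives execution down a different path. The sketch below only records the outcome of the line-4 branch per iteration; it does not implement the constraint-solving step that produces the inputs. `sanitize_traced` and `count_distinct_paths` are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>

static int issafe(char c) {
    return ('a' <= c && c <= 'z') || ('0' <= c && c <= '9') || (c == '_');
}

/* sanitize() instrumented to log the line-4 branch: 'T' when taken,
   'F' otherwise. trace must have room for n + 1 bytes. */
static size_t sanitize_traced(char s[], size_t n, char trace[]) {
    size_t i = 0, j = 0;
    while (j < n) {
        if (issafe(s[j])) { trace[j] = 'T'; s[i] = s[j]; i++; j++; }
        else              { trace[j] = 'F'; j++; }
    }
    trace[n] = '\0';
    return i;
}

/* Replays the four inputs from the answer (n == 2) and counts how many
   distinct branch traces they produce. */
static int count_distinct_paths(void) {
    char inputs[4][3] = { "??", "a?", "?a", "aa" };
    char traces[4][3];
    int distinct = 0;
    for (int k = 0; k < 4; k++) {
        sanitize_traced(inputs[k], 2, traces[k]);
        int seen = 0;
        for (int m = 0; m < k; m++)
            if (traces[m][0] == traces[k][0] && traces[m][1] == traces[k][1])
                seen = 1;
        if (!seen) distinct++;
    }
    return distinct;
}
```

Each loop iteration contributes one two-way branch, so a loop of n iterations has 2^n paths; that growth is the concern behind the n == 1000 question.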


Symbolic Execution: answer for n == 2 (line numbers refer to the sanitize() listing above)

   i1 == 0 && j1 == 0
&& branchcond1 == issafe(S0[j1])
&& S1 == S0 WITH S0[i1] == S0[j1]
&& i2 == i1 + 1 && j2 == j1 + 1
&& j3 == j1 + 1
&& S2 == (branchcond1 ? S1 : S0)
&& i3 == (branchcond1 ? i2 : i1)
&& j4 == (branchcond1 ? j2 : j3)
&& ( !(0 <= i1 < 2) || !(0 <= j1 < 2) || !(0 <= i3 < 2) || !(0 <= j4 < 2) )

What if n == 1000?
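A solver would be asked whether this formula is satisfiable; a satisfying assignment is an input that violates an assertion. As a rough stand-in for the solver, the sketch below enumerates every value of the first input byte and evaluates the formula as written, which covers the first loop iteration. The array terms S1 and S2 are left out because they do not affect the index variables. The names `in_bounds` and `count_violations_one_iteration` are assumptions made for this illustration.

```c
#include <assert.h>
#include <limits.h>

static int issafe(char c) {
    return ('a' <= c && c <= 'z') || ('0' <= c && c <= '9') || (c == '_');
}

/* The range check 0 <= v < n, spelled out for C. */
static int in_bounds(long v, long n) {
    return 0 <= v && v < n;
}

/* Counts the values of S0[0] for which the violation clause of the
   formula is true, with n == 2. Zero means the formula is unsatisfiable. */
static int count_violations_one_iteration(void) {
    const long n = 2;
    int violations = 0;
    for (int c = CHAR_MIN; c <= CHAR_MAX; c++) {
        long i1 = 0, j1 = 0;
        int branchcond1 = issafe((char)c);
        long i2 = i1 + 1, j2 = j1 + 1;       /* then-branch (lines 5-6) */
        long j3 = j1 + 1;                    /* else-branch (line 9) */
        long i3 = branchcond1 ? i2 : i1;     /* merge at line 10 */
        long j4 = branchcond1 ? j2 : j3;
        if (!in_bounds(i1, n) || !in_bounds(j1, n) ||
            !in_bounds(i3, n) || !in_bounds(j4, n))
            violations++;
    }
    return violations;
}
```

Enumeration works here only because one byte decides everything in the first iteration. For n == 1000 the unrolled formula has a thousand such merges, which is why a real constraint solver is used rather than brute force.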
