CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C Nurit Dor (TAU), Michael Rodeh (IBM Research Haifa), Mooly Sagiv (TAU)

Presentation transcript:

CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C Nurit Dor (TAU), Michael Rodeh (IBM Research Haifa), Mooly Sagiv (TAU) Greta Yorsh (TAU)? Seminar in Program Analysis for Cyber-Security Ittay Eyal, March 2011

High-Level Structure

Example
void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
{
    INT32 indice;
    for (indice=0; indice<NbLine; indice++) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';
    return;
}

Core C
  Control-flow statements: if, goto, break, or continue.
  Expressions are side-effect free and cannot be nested.
  All assignments are statements.
  Declarations do not have initializations.
  Address-of formal variables is not allowed.

void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
{
    INT32 indice;
    for (indice=0; indice<NbLine; indice++) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';
    return;
}

void SkipLine(int NbLine, char** PtrEndText)
{
    int indice;
    char* PtrEndLoc;
    indice=0;
begin_loop:
    if (indice>=NbLine) goto end_loop;
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\n';
    *PtrEndText = PtrEndLoc + 1;
    indice = indice + 1;
    goto begin_loop;
end_loop:
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\0';
}

Contracts
Describe input, side effects and output:
  Requires
  Modifies
  Ensures

void SkipLine(int NbLine, char** PtrEndText)
    requires is_within_bounds(*PtrEndText) && *PtrEndText.alloc > NbLine && NbLine >= 0
    modifies *PtrEndText, *PtrEndText.is_nullt, *PtrEndText.strlen
    ensures  *PtrEndText.is_nullt && *PtrEndText.strlen == 0 && *PtrEndText == [*PtrEndText]_pre + NbLine;

void SkipLine(int NbLine, char** PtrEndText)
{
    int indice;
    char* PtrEndLoc;
    indice=0;
begin_loop:
    if (indice>=NbLine) goto end_loop;
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\n';
    *PtrEndText = PtrEndLoc + 1;
    indice = indice + 1;
    goto begin_loop;
end_loop:
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\0';
}

void main()
{
    char buf[SIZE];
    char *r, *s;
    r = buf;
    SkipLine(1,&r);
    fgets(r,SIZE-1,stdin);
    s = r + strlen(r);
    SkipLine(1,&s);
}

void main()
{
    char buf[SIZE];
    char *r, *s;
    r = buf;
    SkipLine(1,&r);
    fgets(r,SIZE-1,stdin);
    s = r + strlen(r);
    SkipLine(1,&s);
}
void SkipLine(int NbLine, char** PtrEndText)

P → inline(P)
  Function entry point: assume the pre-conditions. Store the inputs ([x]_pre) in temporary variables for the post-condition check.
  Return: set return_value_P.
  Function exit: assert the post-conditions.
  Function call and its result assertion: assert the callee's pre-conditions, then assume its post-conditions (possibly w.r.t. the inputs).

Pointer Analysis
  The goal: determine which objects may be updated through a pointer.
  First the whole-program points-to state is calculated; then a points-to state is derived for each procedure.

Pointer Analysis
foo(char *p, char *q)
{
    char local[100];
    …
    p = local;
    *q = 0;
    …
}
main()
{
    char s[10], t[20], r[30];
    char *temp;
    foo(s,t);
    foo(s,r);
    …
    temp = s;
    …
}
[Figure: points-to graph over s, t, r, temp, local, p, q]

Pointer Analysis: Parametrization for foo
[Figure: points-to graph of foo over PARAM #1, PARAM #2, local, p, q]

C to Integer Program

C2IP
Inputs: inline(P) and pointer info. Output: an integer program.
Constraint variables for each abstract location l:
  l.val: possible values.
  l.offset: offset w.r.t. the base address.
  l.aSize: allocation size.
  l.is_nullt: is it null-terminated?
  l.len: string length (with \0).

C to Integer Program: Expression Check

C to Integer Program: Constructs to Statements

C to Integer Program
Notation
  V: the number of variables and allocation sites.
  S: the number of C expressions.
Integer Program Complexity
  O(V) constraint variables.
  Each pointer may point to O(V) locations.
  Total complexity: O(S·V).

Integer Analysis
  Calculates the inequalities that hold at each program point.
  Conservative.
  Each assertion is verified against the inequalities.

Integer Analysis
*PtrEndText.alloc > NbLine
void main()
{
    char buf[SIZE];
    char *r, *s;
    r = buf;
    SkipLine(1,&r);
    fgets(r,SIZE-1,stdin);
    s = r + strlen(r);
    SkipLine(1,&s);
}

Integer Analysis - Contracts
To optimize the contracts, do the following:
  1. Assume True preconditions.
  2. Use ASPost [1] to calculate the linear inequalities at the exit point.
  3. Deduce the postconditions.
  4. Use AWPre to calculate backwards the most liberal preconditions.
[6] P. Cousot and N. Halbwachs. Automatic discovery of linear constraints among variables of a program. In Symp. on Princ. of Prog. Lang.,

Implementation
  C → CoreC: based on the AST-Toolkit [32]
  Points-to analysis: Golf [8, 9]
  Integer analysis: Polyhedra library [6, 19]
[6] P. Cousot and N. Halbwachs. Automatic discovery of linear constraints among variables of a program. In Symp. on Princ. of Prog. Lang.,
[8] M. Das. Unification-based pointer analysis with directional assignments. In SIGPLAN Conf. on Prog. Lang. Design and Impl.,
[9] M. Das, B. Liblit, M. Fähndrich, and J. Rehof. Estimating the impact of scalable pointer analysis on optimization. In Static Analysis Symp.,
[19] B. Jeannet. New polka library. Available at “
[32] Microsoft Research. AST-toolkit

Empirical Results
Source from two real-world projects:
  String manipulation library from EADS Airbus code: 11 procedures, 400 lines.
  Part of the WEB2c converter: 8 procedures, 460 lines.

Conclusion
  C is not easy to analyze.
  Plenty of techniques and tools.
  High false-positive ratio without hand-crafted contracts.
  The experimental results section is slim: high variance for little data. (They had to write all contracts…)
  What would happen with normal code?