CETS: Compiler-Enforced Temporal Safety for C

CETS: Compiler-Enforced Temporal Safety for C
Santosh Nagarakatte, Jianzhou Zhao, Milo M. K. Martin, Steve Zdancewic
Lara Khamisy, Kevin Matthews, and Chris Pratt

Introduction
Description: Compiler-Enforced Temporal Safety (CETS) for C programs.
Background: Temporal memory safety errors, such as dangling pointer dereferences and double frees, are a prevalent source of software bugs in unmanaged languages such as C. Existing schemes that attempt to retrofit temporal safety for such languages have high runtime overheads and/or are incomplete, thereby limiting their effectiveness as debugging aids.
Solution: CETS is a compiler pass that instruments the IR to detect all temporal safety violations.

Overview of Memory Violations
Spatial: the pointer refers to the wrong place in the address space
- Buffer overflow
- Dereference of an uninitialized pointer
Temporal: the place in the address space is no longer valid
- Dangling pointer dereference
- Double free
Both types of memory errors can result in crashes, silent data corruption, and severe security vulnerabilities.

Examples of Temporal Safety Violations
Heap-based:
    int *a, *b;
    a = malloc(8);
    …
    b = a;
    ...
    free(a);
    … = *b;
Stack-based:
    int *a;
    void foo() {
        int b;
        a = &b;
    }
    int main() {
        foo();
        … = *a;
    }
Both examples dereference a pointer to a deallocated object (see Figure 1 of the paper).

Motivation for Detecting Temporal Safety Violations
- C/C++ provide low-level control over memory, which is used in operating systems, embedded software, etc.
- The lack of runtime checking and manual memory management lead to temporal safety violations.
- Temporal safety violations lead to crashes, silent data corruption, and severe security vulnerabilities.
- Temporal safety violations include:
  - Dangling pointer dereferences (dereferencing a pointer to a deallocated object, where the pointer was not reset to NULL)
  - Double frees (calling free on a dangling pointer)
  - Invalid frees (calling free with a non-heap address)

Issues with Other Methods of Detecting Temporal Violations (e.g. Valgrind Memcheck)
- High runtime overhead
- High memory overhead
- Failure to detect all temporal errors:
  - Dangling pointers to the stack
  - Dangling pointers to reallocated heap addresses
  - Pointers that pass through arbitrary casts
- Requiring annotations inserted by the programmer

Program Instrumentation for Detecting Temporal Safety Violations
- Binary instrumentation: operates on an executable and produces a new one. It works on already linked libraries and even when source code is not available, but can lead to high runtime overhead.
- Hardware-assisted instrumentation: would have the same benefits as binary instrumentation without the runtime overhead, but its major drawback is that it requires entirely new hardware.
- Source-level instrumentation: happens before compilation and is thus independent of any specific compiler or instruction set, but makes it harder for the compiler to optimize the additional memory operations.
- Compiler-based instrumentation: allows the compiler to run its standard optimizations before inserting checks and then again after the checks have been inserted.

CETS Approach: Compiler-based Instrumentation
[Figure: compilation pipeline]
C Program -> IR of Program -> (apply optimizations) -> (apply pass to insert checks) -> IR with Checks -> (apply optimizations again) -> Optimized IR with Checks

Quick Look at CETS Functionality and Limitations
- CETS uses compiler-based instrumentation.
- It is an identifier-based scheme: each allocation is assigned a unique key, which is used to identify dangling pointers.
- Pointer metadata is tracked in a disjoint shadow space, so the program's memory layout is not affected.
Limitations
- CETS does not detect spatial violations. It must be combined with an existing spatial safety mechanism for full memory safety.
Notes on the shadow space: a compiled program that allocates memory (with malloc and so on) occupies memory in a particular layout of addresses. The instrumentation also needs memory of its own for the metadata it keeps about each pointer. If the program and the instrumentation code both allocated from the same heap, the instrumentation would change the program's layout. To keep the layout exactly the same, the metadata is not stored next to each pointer. Instead, the instrumentation code allocates it in a different region of the address space (for example, a range from x to y requested from the OS) that is completely disjoint from the program's own memory, so the two do not corrupt each other.
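To make the disjoint shadow space idea concrete, here is a minimal sketch in C. It assumes a small fixed-size hash table indexed by the address at which the program stores a pointer; the names `ptr_meta`, `shadow_store`, and `shadow_load` and the table size are hypothetical and do not come from the slides, which do not describe how CETS organizes its shadow space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Metadata kept for one pointer: its allocation key and its lock address. */
typedef struct {
    uint64_t key;
    uint64_t *lock_addr;
} ptr_meta;

/* Hypothetical shadow table. It is a separate object, so program data
   structures keep their original size and layout. */
#define SHADOW_SLOTS 1024u /* must be a power of two */

typedef struct {
    const void *slot_addr; /* where the program stores the pointer; NULL = empty */
    ptr_meta meta;
} shadow_entry;

static shadow_entry shadow_table[SHADOW_SLOTS];

static size_t shadow_index(const void *slot_addr) {
    return (size_t)(((uintptr_t)slot_addr >> 3) & (SHADOW_SLOTS - 1u));
}

/* Record metadata for the pointer stored at slot_addr.
   Returns 0 on success, -1 if the table is full. */
int shadow_store(const void *slot_addr, ptr_meta meta) {
    assert(slot_addr != NULL);
    size_t start = shadow_index(slot_addr);
    for (size_t probe = 0; probe < SHADOW_SLOTS; probe++) {
        shadow_entry *e = &shadow_table[(start + probe) & (SHADOW_SLOTS - 1u)];
        if (e->slot_addr == NULL || e->slot_addr == slot_addr) {
            e->slot_addr = slot_addr;
            e->meta = meta;
            return 0;
        }
    }
    return -1;
}

/* Look up metadata for the pointer stored at slot_addr.
   Returns 0 and fills *out on success, -1 if nothing is recorded. */
int shadow_load(const void *slot_addr, ptr_meta *out) {
    size_t start = shadow_index(slot_addr);
    for (size_t probe = 0; probe < SHADOW_SLOTS; probe++) {
        const shadow_entry *e = &shadow_table[(start + probe) & (SHADOW_SLOTS - 1u)];
        if (e->slot_addr == NULL) {
            return -1;
        }
        if (e->slot_addr == slot_addr) {
            *out = e->meta;
            return 0;
        }
    }
    return -1;
}
```

Because the metadata lives in the table rather than inside the pointer, `sizeof(int *)` and the layout of any struct containing pointers stay exactly as in the uninstrumented program.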

Lock-and-Key Identifier Based Approach
CETS augments each pointer with two additional word-sized fields: (1) a unique allocation key and (2) a lock address that points to a lock location.
Heap allocation (every line after the malloc call is instrumentation code added by the compiler pass):
    ptr1 = malloc(size);
    ptr1_key = counter++;
    ptr1_lock_addr = allocate_lock();
    *(ptr1_lock_addr) = ptr1_key;
    freeable_ptrs_map.insert(ptr1_key, ptr1);
[Figure: per-pointer metadata for ptr1 (ptr1_key, lock address), the lock location in memory holding ptr1_key, and the allocated data]
Each pointer is associated with two fields: a key field and a lock-address field. A 64-bit counter is incremented with each allocation so that each allocation gets a unique key. When memory is allocated with a malloc call, a unique key is assigned and a lock location is allocated. The lock location is initialized with ptr1_key. CETS also maintains a map of the keys that are freeable: on each allocation, the new key and the pointer returned by malloc are inserted into this map.

Lock-and-Key Identifier Based Approach - Cont.
CETS propagates metadata to any new pointer derived from an existing pointer to the same allocation.
Pointer metadata propagation (the last two lines are instrumentation code added by the compiler pass):
    ptr2 = ptr1 + offset;
    ptr2_key = ptr1_key;
    ptr2_lock_addr = ptr1_lock_addr;
[Figure: ptr1 and ptr2 both carry ptr1_key and the same lock address; the lock location in memory holds ptr1_key]
ptr2 points into the same memory that was previously allocated and assigned to ptr1, so it shares ptr1's key and lock. Whether ptr1 + offset stays within bounds is a matter for a separate spatial check, which CETS itself does not perform.

Lock-and-Key Identifier Based Approach - Cont.
CETS performs a dangling pointer check.
Dangling pointer check (the if statement is instrumentation code added by the compiler pass):
    if (ptr1_key != *(ptr1_lock_addr)) { abort(); }
    value = *ptr1; // original load
[Figure: ptr1 and ptr2 both carry ptr1_key and the same lock address; the lock location in memory holds ptr1_key]
When memory is referenced, the key associated with the pointer is compared with the key stored in the lock location. If they differ, the pointer is dangling and the program aborts.
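The propagation and check steps can be sketched together in a few lines of C. This is an illustration under assumptions, not the CETS implementation: the `tracked_ptr` struct bundles the metadata with the pointer for readability (the slides keep it in a disjoint shadow space), the function names are hypothetical, and the check returns a boolean instead of calling `abort()` so that it can be exercised in a test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_KEY 0u

/* Pointer value bundled with its metadata (illustrative layout only). */
typedef struct {
    char *ptr;
    uint64_t key;        /* key assigned when the object was allocated */
    uint64_t *lock_addr; /* location that holds the key while the object lives */
} tracked_ptr;

/* ptr2 = ptr1 + offset: the derived pointer inherits the key and lock address. */
tracked_ptr tracked_offset(tracked_ptr base, ptrdiff_t offset) {
    tracked_ptr derived = base;
    derived.ptr = base.ptr + offset;
    return derived;
}

/* Temporal check: the pointer is valid only while the lock still holds its key. */
bool temporal_ok(tracked_ptr p) {
    assert(p.lock_addr != NULL);
    return p.key == *p.lock_addr;
}
```

Since both pointers share one lock location, a single write of `INVALID_KEY` to that location invalidates every pointer derived from the allocation at once, with no need to find and update the pointers themselves.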

Lock-and-Key Identifier Based Approach - Cont.
Heap deallocation actions (every line except the free call is instrumentation code added by the compiler pass):
    if (freeable_ptrs_map.lookup(ptr1_key) != ptr1) { abort(); }
    freeable_ptrs_map.remove(ptr1_key);
    free(ptr1);
    *(ptr1_lock_addr) = INVALID_KEY;
    deallocate_lock(ptr1_lock_addr);
[Figure: ptr1 and ptr2 both carry ptr1_key and the same lock address; the lock location in memory holds ptr1_key until the free]
Before the free, CETS checks that the pointer being freed is in the freeable pointer map and, if so, removes it from the map. This catches double frees and invalid frees. After the free, the lock location associated with the pointer is set to INVALID_KEY, so any remaining copy of the pointer (such as ptr2) fails its next dangling pointer check.
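The heap lifecycle described on these slides (allocate, check, free) can be sketched as one small C module. Several things here are assumptions made for illustration: the `tracked_ptr` struct, the fixed-size linear array standing in for `freeable_ptrs_map`, and the function names are hypothetical; violations are reported through a `false` return value rather than `abort()`; and lock locations are never recycled (they are simply leaked), whereas the slides call `deallocate_lock`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define INVALID_KEY 0u
#define MAX_LIVE 64 /* hypothetical capacity of the freeable-pointer map */

typedef struct {
    void *ptr;
    uint64_t key;
    uint64_t *lock_addr;
} tracked_ptr;

static uint64_t next_key = 1; /* 0 is reserved for INVALID_KEY */
static struct { uint64_t key; void *ptr; } freeable[MAX_LIVE]; /* key 0 = empty slot */

static int freeable_find(uint64_t key) {
    for (int i = 0; i < MAX_LIVE; i++) {
        if (freeable[i].key == key) {
            return i;
        }
    }
    return -1;
}

/* malloc plus instrumentation: fresh key, fresh lock, entry in the map. */
tracked_ptr tracked_malloc(size_t size) {
    tracked_ptr p = { NULL, INVALID_KEY, NULL };
    int slot = freeable_find(INVALID_KEY); /* first empty slot */
    if (slot < 0) {
        return p;
    }
    void *mem = malloc(size);
    uint64_t *lock = malloc(sizeof *lock); /* stands in for allocate_lock() */
    if (mem == NULL || lock == NULL) {
        free(mem);
        free(lock);
        return p;
    }
    p.ptr = mem;
    p.key = next_key++;
    p.lock_addr = lock;
    *lock = p.key;
    freeable[slot].key = p.key;
    freeable[slot].ptr = mem;
    return p;
}

/* Dangling pointer check performed before a dereference. */
bool temporal_ok(tracked_ptr p) {
    return p.lock_addr != NULL && p.key == *p.lock_addr;
}

/* free plus instrumentation. Returns false on a double or invalid free. */
bool tracked_free(tracked_ptr p) {
    int slot = (p.key == INVALID_KEY) ? -1 : freeable_find(p.key);
    if (slot < 0 || freeable[slot].ptr != p.ptr) {
        return false;
    }
    assert(p.lock_addr != NULL); /* every mapped allocation has a lock */
    freeable[slot].key = INVALID_KEY;
    freeable[slot].ptr = NULL;
    free(p.ptr);
    *p.lock_addr = INVALID_KEY; /* invalidates all copies of the pointer */
    /* Simplification: the lock itself is left allocated so stale copies can
       still read it safely. */
    return true;
}
```

Applied to the heap-based example from the earlier slide, `b = a` copies the key and lock address, `tracked_free(a)` overwrites the shared lock, and the later check on `b` fails instead of silently reading freed memory.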

Call Stack Based Allocation / Deallocation
To detect dangling pointers to the call stack, a key and corresponding lock address are also associated with each stack frame. This key and lock address pair is given to any pointer derived from the stack pointer (and which thus points to an object on the stack).
    void func() {
        // Function prologue
        local_key = next_key++;
        local_lock_addr++; // allocate lock address
        *(local_lock_addr) = local_key;

        int var; // local variable
        ptr = &var; // ptr defined in main
        ptr_key = local_key;
        ptr_lock_addr = local_lock_addr;

        // Function epilogue
        *(local_lock_addr) = INVALID_KEY;
        local_lock_addr--; // deallocate lock address
    }
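The per-frame prologue and epilogue can be modeled in C with a small stack of lock locations. In this sketch, `frame_enter` and `frame_exit` stand in for the code the compiler pass would place in a function's prologue and epilogue; those names, `frame_address_of`, the `tracked_ptr` struct, and the fixed stack depth are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_KEY 0u
#define MAX_FRAMES 128 /* hypothetical limit on call depth */

typedef struct {
    int *ptr;
    uint64_t key;
    uint64_t *lock_addr;
} tracked_ptr;

static uint64_t next_key = 1;           /* 0 is reserved for INVALID_KEY */
static uint64_t lock_stack[MAX_FRAMES]; /* one lock per active frame; index 0 unused */
static size_t depth = 0;                /* lock_stack[depth] belongs to the current frame */

/* Prologue: push a lock and store a fresh key in it. */
bool frame_enter(void) {
    if (depth + 1 >= MAX_FRAMES) {
        return false;
    }
    depth++;
    lock_stack[depth] = next_key++;
    return true;
}

/* Epilogue: invalidate the frame's lock, then pop it. */
void frame_exit(void) {
    assert(depth > 0);
    lock_stack[depth] = INVALID_KEY;
    depth--;
}

/* ptr = &var: a pointer to a local gets the current frame's key and lock. */
tracked_ptr frame_address_of(int *addr) {
    assert(depth > 0);
    tracked_ptr p = { addr, lock_stack[depth], &lock_stack[depth] };
    return p;
}

bool temporal_ok(tracked_ptr p) {
    return p.lock_addr != NULL && p.key == *p.lock_addr;
}
```

The key matters as much as the lock: when a later call reuses the same lock slot, it stores a different key there, so a stale pointer into the old frame still fails its check.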

Benchmarks
- Functional correctness: CETS successfully detected all temporal errors without false violations.
- Performance (runtime overhead):
  - 48% overall for CETS
  - 77% with an alternate implementation
  - 116% when checking for both spatial and temporal safety
  - 122% from another checker, such as Valgrind Memcheck
Notes: The key advantage of the compiler-based approach is being able to perform optimizations both before and after the instrumentation code is inserted. For example, no temporal checks are required for any pointer that is directly derived from the stack pointer within the corresponding function call, because the stack frame is guaranteed to live until the function exits.

CETS Optimizations
- Unnecessary checks
  - Pointers to globals
  - Stack pointers within a function call
- Redundant checks
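One plausible instance of a redundant check is sketched below in C: a loop that dereferences the same pointer on every iteration. The slide does not say how CETS identifies redundant checks, so this only illustrates the general idea; `tracked_ptr`, `sum_naive`, `sum_hoisted`, and the `checks_executed` counter are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const int *ptr;
    uint64_t key;
    const uint64_t *lock_addr;
} tracked_ptr;

static unsigned long checks_executed = 0; /* counts dynamic temporal checks */

static bool temporal_ok(tracked_ptr p) {
    checks_executed++;
    return p.key == *p.lock_addr;
}

/* Naive instrumentation: one temporal check before every dereference. */
bool sum_naive(tracked_ptr p, size_t n, long *out) {
    assert(out != NULL);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (!temporal_ok(p)) {
            return false;
        }
        total += p.ptr[i];
    }
    *out = total;
    return true;
}

/* After redundant-check elimination: nothing inside the loop can deallocate
   the object, so one check ahead of the loop covers every load in it. */
bool sum_hoisted(tracked_ptr p, size_t n, long *out) {
    assert(out != NULL);
    if (n > 0 && !temporal_ok(p)) {
        return false;
    }
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += p.ptr[i];
    }
    *out = total;
    return true;
}
```

Both versions return the same results, but for an array of n elements the naive version executes n checks and the hoisted version executes one. The `n > 0` guard keeps the two versions equivalent when the loop body never runs.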

Conclusion
- CETS is a compiler-based method for detecting temporal safety violations.
- It allows optimizations both before and after the instrumentation code is added.
- It doesn't change the memory layout of the original program (compared with source code instrumentation methods).
- It ran correctly on the NIST SAMATE benchmark and was able to find all temporal errors.
- With optimization passes run after the IR is instrumented, it was shown to produce 48% overhead, compared with 77% for existing methods.

Thank You Questions?