SAFECode Memory Safety Without Runtime Checks or Garbage Collection By Dinakar Dhurjati Joint work with Sumant Kowshik, Vikram Adve and Chris Lattner University.

Presentation transcript:

SAFECode Memory Safety Without Runtime Checks or Garbage Collection By Dinakar Dhurjati Joint work with Sumant Kowshik, Vikram Adve and Chris Lattner University of Illinois at Urbana-Champaign

SAFECode Motivation: Secure Online Upgrades in Embedded Systems
Upgrade := new software modules into host application
– Same address space
– Need to protect the host from buggy/untrusted modules
– Need to ensure each module is memory safe
Memory safe :=
– Never access a memory location outside its data area
– Never execute instructions outside its code area
Need language and compiler support for memory safety

SAFECode Existing Solutions
Problem: Current solution (e.g. Java, RT-Java, Modula-3)
– Bad casts: Type checker disallows
– Uninitialized pointers: Runtime null pointer checks
– Array bound violations: Runtime array bound checks
– Dangling pointers to stack locations: Not expressible
– Dangling pointers to freed memory: No free + Garbage Collection [+ scoped regions + runtime checks]
Runtime checks or garbage collection unattractive for embedded code

SAFECode Our Approach
1. Minimal semantic restrictions to enable static checking
   – No new syntax or annotations
2. Aggressive compiler techniques (old and new)
Goal: 100% static checking

SAFECode Our Previous Work [CASES2002]
Static safety for real time control programs
Problem: Our previous solution
– Bad casts: Type checker disallows
– Uninitialized pointers: Initialize to reserved address range
– Array bound violations: Restrict index to be affine in terms of size
– Dangling pointers to stack locations (stack safety): Language rule + Compiler checks
– Dangling pointers to freed memory (heap safety): ???

SAFECode Contributions of this Work
100% static technique for ensuring heap safety for “type safe” C programs
– No runtime checks
– No garbage collection: allow explicit deallocation!
– No programmer annotations
Evaluate our approach on 17 embedded benchmarks
– Array safety
– Heap safety
– Stack safety

SAFECode Talk Outline
– Introduction
– Our Solution
– Results
– Related Work
– Conclusion

SAFECode Methodology
Do not prevent uses of dangling pointers to freed memory
Ensure that they cannot cause a memory safety violation
– Builds on a compiler transformation called “Automatic Pool Allocation”

SAFECode Dangling pointer problem
struct S *p, *q;
….
p = q
….
free(p)
…
r = malloc(sizeof(struct T))
…
q->next->val = … // dangling pointer usage
[Diagram: p, q, q->next, r]

SAFECode Making Dangling Pointers Safe
struct S *p, *q;
….
p = q
….
free(p)
…
s = malloc(sizeof(struct S))
…
q->next->val = … // dangling pointer usage
[Diagram: q, q->next, s, s->next]
Principle: If freed memory is reallocated to any object of the same type with the same alignment, then dereferencing pointers to freed memory is safe.

SAFECode Exploiting the principle
First simple solution
– N different heaps based on type, N = #types
– Never move memory from one heap to another
BUT: increased memory consumption
A more sophisticated solution: use a previously developed compiler transformation called Automatic Pool Allocation

SAFECode Automatic Pool Allocation
Identifies logical data structures not reachable from outside a function
Creates a separate pool (region) for the nodes of that data structure
At the function exit, the entire pool is deallocated
Advantages:
– Fine-grained pools
– Short lifetimes
– Type-homogeneous pools
– Explicit deallocation

SAFECode Pool Allocation Example
f() {
  …
  p = g();
  …
  p->next->val = …..
}
g() {
  …
  p = create_list_10_nodes(p);
  h(p);
  free_everything_but_head(p);
  …
  return p;
}
h(struct S *p) {
  …
  for (j = 0; j < ; j++) {
    tmp = malloc(sizeof(struct S));
    insert_tmp_to_list(tmp,p);
    ….
    q = least_useful_member(p);
    free(q);
  }
  …
}

SAFECode Pool Allocation Example
f() {
  PP = poolinit(struct S);
  …
  p = g(PP);
  …
  p->next->val = …..
  pooldestroy(PP);
}
g(PoolPointer *PP) {
  …
  p = create_list_10_nodes(PP);
  h(p, PP);
  free_everything_but_head(p, PP);
  …
  return p;
}
h(struct S *p, PoolPointer *PP) {
  …
  for (j = 0; j < ; j++) {
    tmp = poolalloc(PP);
    insert_tmp_to_list(tmp,p);
    ….
    q = least_useful_member(p);
    poolfree(q, PP);
  }
  …
}
Not Memory Safe

SAFECode Using Pool Allocation for Safety
Pools are type homogeneous
Restriction: do not release memory from a pool until pooldestroy
=> The principle is satisfied: “Accessible freed memory is reallocated only to objects of the same type”
=> Memory safety guaranteed
Problem
– Could lead to increased memory consumption
– Need to identify when it happens

SAFECode Identifying increased memory usage
f() {
  PP = poolinit(struct S);
  …
  g(p, PP);
  …
  p->next->val = …..
  pooldestroy(PP);
}
g(struct S *p, PoolPointer *PP) {
  …
  create_list_10_nodes(p, PP);
  h(p, PP);
  free_everything_but_head(p, PP);
  …
}
Case 1: No Reuse
h(struct S *p, PoolPointer *PP) {
  …
  // no allocation after free
  for (j = 0; j < 10; j++) {
    ….
    q = least_useful_member(p);
    poolfree(q,PP);
  }
  …
}
Case 2: Self Reuse
h(struct S *p, PoolPointer *PP) {
  …
  for (j = 0; j < ; j++) {
    tmp = poolalloc(PP);
    insert_tmp_to_list(tmp,p);
    ….
    q = least_useful_member(p);
    poolfree(q,PP);
  }
  …
}
Case 3: Cross Reuse
h(struct S *p, PoolPointer *PP) {
  …
  for (j = 0; j < ; j++) {
    tmp = poolalloc(PP);
    tmp2 = poolalloc(PP2);
    insert_tmp_to_list(tmp,p);
    ….
    q = least_useful_member(p);
    poolfree(q,PP);
  }
  …
}

SAFECode Algorithm
On all control flow paths, interprocedurally:
For every poolfree(…, P) on the path, if before the subsequent pooldestroy(P) there is
– no poolalloc: P is Case 1
– poolalloc only from the same pool: P is Case 2
– poolalloc from a different pool: P is Case 3

SAFECode Implementation
[Pipeline diagram. Labels: C, C++, GCC, LLVM, Linker, LLVM object code, Type Safety, Uninit. Variables, Stack Safety, Array Safety, Pool Allocation, Categorizing pools, Safe code with no checks]
– Source language independence
– Link-time analysis => whole program analysis

SAFECode Evaluation
17 applications in the MediaBench and MiBench suites of benchmarks
Studied how easy it is to port them
Results for 6 of them:
Program             Lines of Code   No. of Lines Modified
Basicmath           579             4
Epic                3524            4
Dijkstra            348             0
Gsm                 6038            0
Mpeg                9839            0
Rasta
Total for 17 pgms

SAFECode Results: Heap Safety
All 17 programs are proven heap safe!
15 programs had only Case 1 or Case 2 pools
– Memory safety without increase in memory consumption
2 programs with Case 3 pools
– Rasta: 5 Case 3 pools out of 14
– Epic: 1 Case 3 pool out of 14
– Further analysis can convert some Case 3 pools to Case 2

SAFECode Results - II
How do our previous techniques work?
Stack safety
– 16 codes passed the compiler
– 1 code needs restructuring to pass
Array safety
– Only 8 codes passed
– Indirection vectors caused 5 codes to fail
– Detected 4 bugs in benchmarks
Array bounds checking is the remaining bottleneck for 100% static checking

SAFECode Related Work: Static Heap Safety Checking
Linear and alias type systems: Vault
– Severely restrict aliasing in programs
– Require a lot of annotations
Region-based schemes: Tofte and Talpin [TOPLAS98], Aiken [PLDI98], Cyclone, RT-Java, Boyapati [PLDI03]
– No deallocation within a region
– Manual region annotations in most cases

SAFECode Conclusions
Result: guarantee heap safety statically for “type safe” C
=> New state of the art: 100% static checking for all C codes that
– Are “type safe”
– Use only affine array references

SAFECode URLs
SAFECode: Static Analysis For safe Execution of Code
LLVM

SAFECode Pool Allocation Example
f() {
  …
  g(p);
  …
  p->next->val = …..
}
g(struct S *p) {
  …
  create_list_10_nodes(p);
  h(p);
  free_everything_but_head(p);
  …
}
h(struct S *p) {
  …
  for (j = 0; j < ; j++) {
    tmp = malloc(sizeof(struct S));
    insert_tmp_to_list(tmp);
    ….
    q = least_useful_member(p);
    free(q);
  }
  …
}
Overlays added by pool allocation:
– PP = poolinit(struct S);
– pooldestroy(PP);
– tmp = poolalloc(PP) in place of tmp = malloc(sizeof(struct S))
– poolfree(PP,q) in place of free(q)

SAFECode Results: Heap Safety
All 17 are guaranteed heap safe
15 programs had only Case 1 or Case 2 pools
– Memory safety without increase in memory consumption
2 programs with Case 3 pools
– rasta: 5 Case 3 out of 14 pools
– epic: 1 Case 3 out of 13 pools
Further analysis can convert some Case 3 pools to Case 2 pools

SAFECode Conclusions and Future Work
100% static checking for heap safety
Future work
– Improve the array bounds checker
– Evaluate the actual increase, if any, in the few Case 3 pools
– Detect illegal system calls/illegal sequences of system calls