Cyclone: Safe Programming at the C Level of Abstraction Dan Grossman Cornell University May 2002 Joint work with: Trevor Jim (AT&T), Greg Morrisett, Michael.

Presentation transcript:

Cyclone: Safe Programming at the C Level of Abstraction Dan Grossman Cornell University May 2002 Joint work with: Trevor Jim (AT&T), Greg Morrisett, Michael Hicks, James Cheney, Yanling Wang (Cornell)

7 May 2002, Cyclone, 3rd Annual PL Day
A disadvantage of C
Lack of memory safety means code cannot enforce modularity/abstractions:
    void f(){ *((int*)0xBAD) = 123; }
What might address 0xBAD hold?
Memory safety is crucial for your favorite policy
No desire to compile programs like this

Safety violations rarely local
    void g(void**x,void*y);
    int y = 0;
    int *z = &y;
    g(&z,0xBAD);
    *z = 123;
Might be safe, but not if g does *x=y
Type of g enough for separate code generation
Type of g not enough for separate safety checking

Some other problems
One safety violation can make your favorite policy extremely difficult to enforce
So prohibit: incorrect casts, array-bounds violations, misused unions, uninitialized pointers, dangling pointers, null-pointer dereferences, dangling longjmp, vararg mismatch, not returning pointers, data races, …

What to do?
Stop using C
  – YFHLL is usually a better choice
Compile C more like Scheme
  – type fields, size fields, live-pointer table, …
  – fail-safe for legacy whole programs
Static analysis
  – very hard, less modular
Restrict C
  – not much left

Cyclone in brief
A safe, convenient, and modern language at the C level of abstraction
Safe: memory safety, abstract types, no core dumps
C-level: user-controlled data representation and resource management, easy interoperability, “manifest cost”
Convenient: may need more type annotations, but work hard to avoid it
Modern: add features to capture common idioms
“New code for legacy or inherently low-level systems”

The plan from here
Not-null pointers
Type-variable examples
  – parametric polymorphism
  – region-based memory management
Dataflow analysis
Status
Related work
I will skip many very important features

Not-null pointers
t*   pointer to a t value or NULL
t@   pointer to a t value
Subtyping: t@ < t* but t@@ ≮ t*@
Downcast via run-time check, often avoided via flow analysis

Example
    FILE* fopen(const char@, const char@);
    int fgetc(FILE@);
    int fclose(FILE@);
    void g() {
      FILE* f = fopen("foo", "r");
      while(fgetc(f) != EOF) {…}
      fclose(f);
    }
Gives warning and inserts one null-check
Encourages a hoisted check
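The hoisted check that this example encourages can be imitated in plain C. The sketch below is only an illustration under stated assumptions: `count_bytes` and `count_bytes_checked` are hypothetical names that do not appear in the talk, and because standard C has no not-null pointer type, the contract lives in a comment rather than in the type.

```c
#include <stdio.h>

/* Hypothetical helper: the caller promises f is not NULL, which is the
   contract a not-null pointer type would let the compiler enforce. */
long count_bytes(FILE *f) {
    long n = 0;
    while (fgetc(f) != EOF) {
        n++;
    }
    return n;
}

/* One NULL check, hoisted in front of every use of the pointer.
   Returns -1 if the file could not be opened. */
long count_bytes_checked(const char *path) {
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return -1;
    }
    long n = count_bytes(f);
    fclose(f);
    return n;
}
```

With the check written once before the loop, neither `fgetc` nor `fclose` needs its own test, which is the effect the slide attributes to the stricter interface.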

The same old moral
    FILE* fopen(const char@, const char@);
    int fgetc(FILE@);
    int fclose(FILE@);
Richer types make the interface stricter
A stricter interface makes the implementation easier/faster
Exposing checks to users lets them optimize
Can’t check everything statically (e.g., close-once)

“Change void* to alpha”
C version:
    struct Lst {
      void* hd;
      struct Lst* tl;
    };
    struct Lst* map(void* f(void*), struct Lst*);
    struct Lst* append(struct Lst*, struct Lst*);
Cyclone version:
    struct Lst<`a> {
      `a hd;
      struct Lst<`a>* tl;
    };
    struct Lst<`b>* map(`b f(`a), struct Lst<`a>*);
    struct Lst<`a>* append(struct Lst<`a>*, struct Lst<`a>*);
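For comparison, here is a runnable sketch of the C side of this slide. The helpers `cons` and `twice` are hypothetical additions for illustration, and storing integers in the `void*` slots via `intptr_t` is an assumption of the sketch. Nothing in the types stops a caller from passing a function that expects a different element type, which is the gap the type-variable version closes.

```c
#include <stdint.h>
#include <stdlib.h>

/* C-style "polymorphic" list: every element is an untyped void*. */
struct Lst {
    void *hd;
    struct Lst *tl;
};

/* Hypothetical helper: prepend hd to tl; returns NULL if malloc fails. */
struct Lst *cons(void *hd, struct Lst *tl) {
    struct Lst *cell = malloc(sizeof *cell);
    if (cell != NULL) {
        cell->hd = hd;
        cell->tl = tl;
    }
    return cell;
}

/* Apply f to each element, building a new list in the same order. */
struct Lst *map(void *(*f)(void *), struct Lst *l) {
    struct Lst *head = NULL;
    struct Lst **tail = &head;
    for (; l != NULL; l = l->tl) {
        struct Lst *cell = cons(f(l->hd), NULL);
        if (cell == NULL) {
            break; /* sketch only: stop early on allocation failure */
        }
        *tail = cell;
        tail = &cell->tl;
    }
    return head;
}

/* Hypothetical element function: treats the void* slot as an integer. */
void *twice(void *x) {
    return (void *)((intptr_t)x * 2);
}
```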

Not much new here
Closer to C than ML:
  – less type inference allows first-class polymorphism and polymorphic recursion
  – data representation may restrict α to pointers, int (why not structs? why not float? why int?)
Not C++ templates

Existential types
Programs need a way to express “call-back” types:
    struct T {
      void (*f)(void*, int);
      void* env;
    };
We use an existential type (simplified for now):
    struct T { <`a>
      void (@f)(`a, int);
      `a env;
    };
More C-level than baked-in closures/objects
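The C form of the call-back type can be exercised as follows. This is a minimal sketch: `struct Counter`, `add_to_counter`, and `invoke` are hypothetical names introduced here, not part of the talk. The cast inside `add_to_counter` is exactly the step that is unchecked in C and that the existential type is meant to make safe.

```c
#include <stddef.h>

/* C-style callback: a code pointer plus an untyped environment. */
struct T {
    void (*f)(void *env, int arg);
    void *env;
};

/* Hypothetical environment type for one particular callback. */
struct Counter {
    int total;
};

/* Adds arg to the counter; trusts that env really is a struct Counter. */
void add_to_counter(void *env, int arg) {
    struct Counter *c = env;
    c->total += arg;
}

/* Invoke the callback on its own environment. */
void invoke(struct T cb, int arg) {
    cb.f(cb.env, arg);
}
```

Pairing `add_to_counter` with an `env` of any other type would still compile in C; the code pointer and environment are only related by convention.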

The plan from here
Not-null pointers
Type-variable examples (α, ∀, ∃, λ)
  – parametric polymorphism
  – region-based memory management
Dataflow analysis
Status
Related work
I will skip many very important features

Regions
a.k.a. zones, arenas, …
Every object is in exactly one region
Allocation via a region handle
All objects in a region are deallocated simultaneously (no free on an object)
An old idea with recent support in languages (e.g., RC) and implementations (e.g., ML Kit)

Cyclone regions
heap region: one, lives forever, conservatively GC’d
stack regions: correspond to local-declaration blocks: {int x; int y; s}
dynamic regions: scoped lifetime, but growable: region r {s}
allocation: rnew(r,3), where r is a handle
handles are first-class
  – caller decides where, callee decides how much
  – no handles for stack regions
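The region discipline described above (allocate through a handle, free everything at once) can be approximated in C with a bump allocator. This is a rough sketch, not Cyclone's runtime: the `Region` type and the `region_*` functions are hypothetical, the region here has a fixed capacity rather than being growable, and the 16-byte alignment is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define REGION_ALIGN 16 /* assumed to be sufficient for any object */

/* Hypothetical region handle: one block with bump allocation. */
typedef struct {
    unsigned char *base;
    size_t used;
    size_t cap;
} Region;

/* Create a region of cap bytes; base is NULL if malloc fails. */
Region region_create(size_t cap) {
    Region r = { malloc(cap), 0, cap };
    if (r.base == NULL) {
        r.cap = 0;
    }
    return r;
}

/* Allocate n bytes from r, rounded up to REGION_ALIGN; NULL if full. */
void *region_alloc(Region *r, size_t n) {
    if (n > SIZE_MAX - (REGION_ALIGN - 1)) {
        return NULL; /* rounding would overflow */
    }
    size_t rounded = (n + REGION_ALIGN - 1) / REGION_ALIGN * REGION_ALIGN;
    if (rounded > r->cap - r->used) {
        return NULL;
    }
    void *p = r->base + r->used;
    r->used += rounded;
    return p;
}

/* Deallocate every object in the region simultaneously. */
void region_destroy(Region *r) {
    free(r->base);
    r->base = NULL;
    r->used = 0;
    r->cap = 0;
}
```

Because the handle is an ordinary value, a caller can pass it to a callee that decides how much to allocate. Nothing in C prevents a pointer into the region from being used after `region_destroy`.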

That’s the easy part
The implementation is really simple because the type system statically prevents dangling pointers
    void f() {
      int* x;
      if(1) {
        int y = 0;
        x = &y;   // x not dangling
      }
      *x;         // x dangling
    }

The big restriction
Annotate all pointer types with a region name (a type variable of region kind)
int*`r means “pointer into the region created by the construct that introduces `r”
  – heap introduces `H
  – L:… introduces `L
  – region r {s} introduces `r; r has type region_t<`r>

Region polymorphism
Apply what we did for type variables to region names (only it’s more important and could be more onerous)
    void swap<`r1,`r2>(int*`r1 x, int*`r2 y){
      int tmp = *x;
      *x = *y;
      *y = tmp;
    }
    int*`r sumptr(region_t<`r> r, int x, int y){
      return rnew(r) (x+y);
    }

Type definitions
    struct ILst<`r1,`r2> {
      int*`r1 hd;
      struct ILst<`r1,`r2> *`r2 tl;
    };

Region subtyping
If p points to an int in a region with name `r1, is it ever sound to give p type int*`r2?
If so, let int*`r1 < int*`r2
Region subtyping is the outlives relationship:
    region r1 {… region r2 {…}…}
LIFO makes subtyping common

Soundness
Ignoring ∃, scoping prevents dangling pointers:
    int*`L f() { L: int x; return &x; }
End of story if you don’t use ∃
For ∃, we leak a region bound:
    struct T<`r> { <`a> :regions(`a) > `r
      void (@f)(`a, int);
      `a env;
    };
A powerful effect system is there in case you want it

Regions summary
Annotating pointers with region names (type variables) makes a sound, simple, static system
Polymorphism, type constructors, and subtyping recover much expressiveness
Inference and defaults reduce burden
Other chapters, future features:
  – array bounds: void f(tag_t<`i>,int*`i); default 1
  – mutual exclusion: void f(lock_t<`L>,int*`L); default thread-local

The plan from here
Not-null pointers
Type-variable examples
  – parametric polymorphism
  – region-based memory management
Dataflow analysis
Status
Related work
I will skip many very important features

Example
    int*`r* f(int*`r q) {
      int **p = malloc(sizeof(int*));
      // p not NULL, points to malloc site
      *p = q;
      // malloc site now initialized
      return p;
    }
Harder than in Java because of pointers
Analysis includes must-points-to information
Interprocedural annotation: “initializes” a parameter

Flow-analysis strategy
Current uses: definite-assignment, null checks, array-bounds checks, must return
When invariants are too strong, program-point information is more useful
Sound with respect to aliases (programmers can make copies)
Checked interprocedural annotations keep analysis local

Status
Cyclone really exists
  – 110KLOC, including bootstrapped compiler, web server, multimedia overlay network, …
  – gcc back-end (Linux, Cygwin, OSX, …)
  – user’s manual, mailing lists, …
  – still a research vehicle
  – more features: exceptions, tagged unions, varargs, …
Publications
  – overview: USENIX 2002
  – regions: PLDI 2002
  – existentials: ESOP 2002

Related work: higher and lower
Adapted/extended ideas:
  – polymorphism [ML, Haskell, …]
  – regions [Tofte/Talpin, Walker et al., …]
  – safety via dataflow [Java, …]
  – existential types [Mitchell/Plotkin, …]
  – controlling data representation [Ada, Modula-3, …]
Safe lower-level languages [TAL, PCC, …]
  – engineered for machine-generated code
Vault: stronger properties via restricted aliasing

Related work: making C safer
Compile to make dynamic checks possible
  – Safe-C [Austin et al., …]
  – Purify, Stackguard, Electric Fence, …
  – CCured [Necula et al.]: performance via whole-program analysis; more on array-bounds, less on memory management and polymorphism
RC [Gay/Aiken]: reference-counted regions separate from stack and heap
LCLint [Evans]: unsound-by-design, but very useful
SLAM: checks user-defined property w/o annotations; assumes no bounds errors
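To make "compile to make dynamic checks possible" concrete, the sketch below shows the general idea of carrying bounds alongside a pointer and checking every access. It is an illustration only and is not the representation used by Safe-C, CCured, or any other tool named above; `IntSlice`, `slice_get`, and `slice_set` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical "fat pointer": base address plus element count. */
typedef struct {
    int *base;
    size_t len;
} IntSlice;

/* Checked read: stores the element in *out and returns 1, or returns 0
   without touching memory if the slice is NULL or i is out of bounds. */
int slice_get(IntSlice s, size_t i, int *out) {
    if (s.base == NULL || i >= s.len) {
        return 0;
    }
    *out = s.base[i];
    return 1;
}

/* Checked write with the same return convention. */
int slice_set(IntSlice s, size_t i, int value) {
    if (s.base == NULL || i >= s.len) {
        return 0;
    }
    s.base[i] = value;
    return 1;
}
```

The cost is a wider pointer and a comparison on each access, which is why the slide notes whole-program analysis as a way to recover performance.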

Plenty left to do
Beyond LIFO memory management
Resource exhaustion (e.g., stack overflow)
More annotations for aliasing properties
More “compile-time arithmetic” (e.g., array initialization)
Better error messages (not a beginner’s language)

Summary
Memory safety is essential for your favorite policy
C isn’t safe, but the world’s software-systems infrastructure relies on it
Cyclone combines advanced types, flow analysis, and run-time checks to create a safe, usable language with C-like data, resource management, and control
Best to write some code