Implementation and Evaluation of a Safe Runtime in Cyclone
Matthew Fluet, Cornell University
Greg Morrisett, Harvard University
Daniel Wang, Princeton University

Introduction
- Web-based services and applications
  - Require secure and robust implementations
  - Are often written in high-level languages: C#, Java, Perl, PHP, Python, Tcl
    - These provide memory safety: immune to common security problems
    - These provide automatic memory management: immune to memory leaks
- But …

Introduction
- Web-based services and applications are hosted on servers and executed by interpreters, both written in low-level languages
- Diagram labels: Web-server, Interpreter, Application, Runtime system, Memory Management

Introduction
- Long-term goal: a complete web-application platform written in a high-level language
- Short-term goal: a complete interpreter written in a high-level language
  - Implementing the core of an interpreter is not in itself a significant challenge
  - Implementing the runtime system is a challenge

Technical achievement
- Advances in type systems have made it possible to implement a runtime system that provides garbage-collection services in a programming language that guarantees memory safety
  - Significantly reduces the trusted computing base (TCB) necessary to support web-application platforms
  - Incurs an acceptable performance penalty

Outline
- A Scheme interpreter in Cyclone
  - Why Scheme
  - Key Features of Cyclone
  - Core Scheme Interpreter
  - Garbage Collector
- Performance Evaluation
- Conclusion

Why Scheme?
- Ease of implementation
  - Core interpreter loop is only ~500 lines
  - An external Scheme front-end expands the full Scheme language into a core Scheme subset
- Features desirable for web programming
- But no technical results depend on Scheme as the interpreted language

Key Features of Cyclone
- Safe, C-like language
  - Static type- and control-flow analysis
- Intended for systems programming
  - Data representation
  - Resource management
- Region-based memory management
  - Static, lexical, dynamic, heap, unique, …

Key Features of Cyclone
- Pointers
  - Nullable: t*
  - Non-null: t@
  - Fat: t?
- Regions
  - Region names: `r
  - Pointers: t*`r
  - Polymorphism over region names

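Cyclone: Regions

Region variety     | Allocation (objects) | Deallocation (what) | Deallocation (when)    | Aliasing (objects)
Stack              | static               | whole region        | exit of lexical scope  | unrestricted
Lexical            | dynamic              | whole region        | exit of lexical scope  | unrestricted
Dynamic            | dynamic              | whole region        | manual                 | unrestricted
Heap (`H)          | dynamic              | single objects      | automatic (BDW GC)     | unrestricted
Unique (`U)        | dynamic              | single objects      | manual                 | restricted
Ref-counted (`RC)  | dynamic              | single objects      | manual                 | restricted
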
Core Scheme Interpreter

    void scheme(prog_t prog) {
      heap_t H = new_heap();
      // load the program into the GC-ed heap
      exp_t exp = initial_exp(prog, H);
      while (true) {
        // take a step
        bool done = step(H, &exp);
        // check for termination
        if (done) { goto Finished; }
        // allow a garbage collection
        maybeGC(&H, &exp);
      }
     Finished:
      // free the final heap
      destroy_heap(H);
    }

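The shape of this loop, stepping until done and offering the collector a chance to run between steps, can be sketched in Python. This is only an illustration under stated assumptions: the toy expression language (integers and `("add", e1, e2)` tuples) is not core Scheme, and `step`, `run`, and the `maybe_gc` hook are hypothetical names, not the authors' Cyclone code.

```python
# Toy stand-in for the slide's loop: step until done, offering a GC between steps.
# The expression language (ints and ("add", e1, e2) tuples) is an assumption
# made for illustration; it is not core Scheme.

def step(exp):
    """Perform one reduction. Return (done, new_exp)."""
    if isinstance(exp, int):
        return True, exp                      # a value: nothing left to do
    op, left, right = exp
    assert op == "add"
    if not isinstance(left, int):
        _, left = step(left)                  # reduce inside the left operand
        return False, (op, left, right)
    if not isinstance(right, int):
        _, right = step(right)                # then inside the right operand
        return False, (op, left, right)
    return False, left + right                # both operands are values


def run(exp, maybe_gc=lambda e: e):
    """Drive step() to completion, calling maybe_gc after each unfinished step."""
    steps = 0
    while True:
        done, exp = step(exp)
        if done:
            return exp, steps
        steps += 1
        exp = maybe_gc(exp)                   # the collector may relocate exp


result, steps = run(("add", ("add", 1, 2), ("add", 3, 4)))
print(result, steps)                          # prints "10 3"
```

Because control returns to `run` after every step, the collector only ever sees a single root (`exp`), which is what lets the slide's `maybeGC` take just the heap and the current expression.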
Simple Copying Collector
- From-space and To-space
  - Natural correspondence with regions
  - LIFO discipline of lexical regions is insufficient
  - Dynamic regions appear to be sufficient
- Forwarding pointers

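As a reminder of how a copying collector uses forwarding pointers, here is a minimal Cheney-style sketch in Python. It is an illustration only: `Cell`, `forward`, and `collect` are hypothetical names, a Python list stands in for the to-space region, and Python's dynamic typing sidesteps the typing question that the Cyclone implementation has to answer.

```python
class Cell:
    """A heap object: fields hold ints or references to other Cells."""
    def __init__(self, fields):
        self.fields = list(fields)
        self.forward = None                   # set once the cell has been copied


def forward(obj, to_space):
    """Return the to-space copy of obj, copying it on first visit."""
    if not isinstance(obj, Cell):
        return obj                            # immediate value, nothing to copy
    if obj.forward is None:
        copy = Cell(obj.fields)               # fields still point into from-space
        obj.forward = copy                    # leave a forwarding pointer behind
        to_space.append(copy)
    return obj.forward


def collect(roots):
    """Cheney-style copy: returns (new_roots, to_space)."""
    to_space = []
    new_roots = [forward(r, to_space) for r in roots]
    scan = 0
    while scan < len(to_space):               # to_space doubles as the work queue
        cell = to_space[scan]
        cell.fields = [forward(f, to_space) for f in cell.fields]
        scan += 1
    return new_roots, to_space


# A cyclic, shared structure plus one unreachable cell.
a = Cell([1])
b = Cell([a, a])
a.fields.append(b)
garbage = Cell([99])
(new_b,), to_space = collect([b])
print(len(to_space))                          # 2: only the reachable cells are copied
```

The forwarding pointer is what preserves sharing and terminates on cycles: the second visit to `a` returns the existing copy instead of making a new one.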
Dynamic Regions
- Non-nested lifetimes
  - Manual creation and deallocation
- Represented by a unique pointer (key)
  - Unique pointer ≡ Capability to access the region

Dynamic Regions: Operations
- new: create a fresh dynamic region
  - Produces a unique key
- open: open a dynamic region for allocation
  - Temporarily consumes the key
- free: deallocate a dynamic region
  - Permanently consumes the key

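The key discipline above can be modelled in Python as a capability that is checked at run time. This is a sketch under assumptions, not Cyclone's mechanism: Cyclone enforces the consumption rules statically through unique pointers, whereas this model raises an exception. The function names echo the `new_ukey`, `open_ukey`, and `free_ukey` calls on the slides; the `Key` class and `ConsumedKey` exception are hypothetical.

```python
from contextlib import contextmanager


class ConsumedKey(Exception):
    """Raised when a key is used while open or after being freed."""


class Key:
    """Capability for one dynamic region (hypothetical runtime model)."""
    def __init__(self):
        self._region = []          # the region's storage
        self._available = True     # False while opened or after free


def new_ukey():
    """new: create a fresh region and return its unique key."""
    return Key()


@contextmanager
def open_ukey(key):
    """open: temporarily consume the key, yielding the region for allocation."""
    if not key._available:
        raise ConsumedKey("key is open or freed")
    key._available = False
    try:
        yield key._region
    finally:
        key._available = True      # the key is restored when the scope exits


def free_ukey(key):
    """free: permanently consume the key and drop the region."""
    if not key._available:
        raise ConsumedKey("key is open or freed")
    key._available = False
    key._region = None


key = new_ukey()
with open_ukey(key) as region:
    region.append("value")         # allocation is only possible while open
free_ukey(key)
try:
    free_ukey(key)
except ConsumedKey:
    print("double free rejected")
```

Freeing a region while it is open, or opening it after it has been freed, fails in the same way, which mirrors the "temporarily" and "permanently" consumed distinction.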
GC and Dynamic Regions

    ...
    // create the to-space’s key
    let NewDynamicRegion{to_key} = new_ukey();
    state_t to_state;
    // open the from-space’s key
    { region from_r = open_ukey(from_key);
      // open the to-space’s key
      { region to_r = open_ukey(to_key);
        // copy the state and reachable data
        to_state = copy_state(to_r, from_state);
      }
    }
    // free the from-space
    free_ukey(from_key);
    ...

Forwarding Pointers
- What is the type of a forwarding pointer?
  - A pointer to a Value in To-space, whose forwarding pointer is a pointer to a Value in To-space’s To-space, whose forwarding pointer is a pointer to a Value in To-space’s To-space’s To-space, whose forwarding pointer is a pointer to a Value in To-space’s To-space’s To-space’s To-space, …

Dynamic Region Sequences
- Introduce a new type constructor mapping region names to region names
    typedef _::R next_rgn<ρ::R>
- Although the region names ρ and next_rgn<ρ> are related, the lifetimes of their corresponding regions are not

Dynamic Region Sequences: Operations
- new, open, free: as for dynamic regions
- next: create next_rgn<ρ> from ρ

Dynamic Region Sequences: Operations
- next: create next_rgn<ρ> from ρ
  - Have an infinite supply of region names
  - next will create a fresh dynamic region key
  - Need a linear supply of keys
    - Use Cyclone’s unique pointers

Dynamic Region Sequences: Operations
- next: create next_rgn<ρ> from ρ
- A dynamic region sequence is a pair
  - key: a dynamic region key
  - gen: a unique pointer
    - Unique pointer ≡ Capability to produce the next_rgn<ρ> key and gen
    - Consumed by next

Dynamic Region Sequences: Operations
- new: create a fresh dynamic region sequence
  - Produces a unique key and gen
- next: create the next dynamic region sequence
  - Produces a unique key and gen
  - Permanently consumes gen

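The pairing of a key with a one-shot generator can be modelled in Python as follows. This is a sketch under assumptions: Cyclone enforces the single use of `gen` statically, while this model checks it at run time; integer indices stand in for the region names ρ, next_rgn<ρ>, and so on; and `Gen`, `DRSeq`, `Consumed`, `new_drseq`, and `do_gc` are hypothetical names (only `next_drseq` appears on the slides).

```python
class Consumed(Exception):
    """Raised when a generator capability is used twice."""


class Gen:
    """Unique capability that can produce the next region exactly once."""
    def __init__(self, index):
        self.index = index
        self._used = False


class DRSeq:
    """A dynamic region sequence: a region key paired with a generator."""
    def __init__(self, index):
        self.key = {"region": index, "objects": []}   # stand-in for a region key
        self.gen = Gen(index)


def new_drseq():
    """new: start a sequence at region 0."""
    return DRSeq(0)


def next_drseq(gen):
    """next: permanently consume gen and return the following sequence element."""
    if gen._used:
        raise Consumed("gen already consumed")
    gen._used = True
    return DRSeq(gen.index + 1)


def do_gc(seq, live):
    """Copy live data into the next region and release the old one."""
    to_seq = next_drseq(seq.gen)
    to_seq.key["objects"] = list(live)       # 'copy' the reachable data
    seq.key["objects"] = None                # 'free' the from-space
    return to_seq


seq = new_drseq()
seq.key["objects"] = [1, 2, 3]
seq2 = do_gc(seq, [1, 3])
print(seq2.key["region"], seq2.key["objects"])   # prints "1 [1, 3]"
```

Because each `gen` can be used only once, every region in the sequence has exactly one successor, so "the to-space of this from-space" is well defined.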
GC and Dynamic Region Sequences

    gcstate_t doGC(gcstate_t gcs) {
      // unpack the gc state; `r names the from-space region
      let GCState{<`r> DRSeq{from_key, from_gen}, from_state} = gcs;
      // generate the to-space
      let DRSeq{to_key, to_gen} = next_drseq(from_gen);
      state_t<next_rgn<`r>> to_state;
      // open the from-space
      { region from_r = open_ukey(from_key);
        // open the to-space
        { region to_r = open_ukey(to_key);
          // copy the state and reachable data
          to_state = copy_state(to_r, from_state);
        }
        // pack the new gc state
        gcs = GCState{DRSeq{to_key, to_gen}, to_state};
      }
      // free the from-space
      free_ukey(from_key);
      return gcs;
    }

GC and Dynamic Region Sequences
- Comparison with type-preserving GCs
  - Interpreter can be written in a trampoline style, rather than continuation-passing style
  - Intuitive typing of forwarding pointers

Performance Evaluation

Implementation        | Interpreter | Runtime
Cyclone (Copying GC)  | Checked     | Checked
Cyclone (BDW GC)      | Checked     | Unchecked
SISC (Sun JVM)        | Checked     | Unchecked
MzScheme (BDW GC)     | Unchecked   | Unchecked

Size of Unchecked Code

Implementation        | Interpreter (lines of code) | Runtime System (lines of code)
Cyclone (Copying GC)  |                             |
Cyclone (BDW GC)      |                             |
SISC (Sun JVM)        | 0                           | 229,100
MzScheme (BDW GC)     | 31,                         |

Conclusion
- Significantly reduces the amount of unchecked code needed to implement an interpreter
- May incur a performance penalty for the extra degree of security
- Future work
  - Reduce the performance penalty
  - Per-thread regions providing customization