A Dependently Typed Assembly Language
Hongwei Xi (joint work with Robert Harper)


The general motivation
Q: Why do we want to type low-level languages?
A: We want to reap the benefits of type systems at low levels.

Advantages of type systems
– Capturing program errors at compile time (well-known)
– Enabling aggressive compiler optimizations (recent)
– Supporting sophisticated module systems (SML)
– Facilitating program verification (NuPRL, Coq, PVS)
– Serving as program documentation

The goal of this work
The goal is to capture memory safety of assembly code through a type system.
Memory Safety = Type Safety + Safe Array Access

Array bounds checking problem
Array bounds checking refers to determining whether the value of an expression is within the bounds of an array when the expression is used to index the array.
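As a small illustration of the definition above, here is a sketch in Python of a dynamically checked array read. The name checked_load is hypothetical and does not come from the slides; it only shows the test that a dynamic bounds check performs before each access.

```python
def checked_load(array, index):
    """Return array[index] after an explicit bounds test (a dynamic check)."""
    # Python itself accepts negative indices, so both bounds are tested here.
    if not (0 <= index < len(array)):
        raise IndexError(f"index {index} is outside 0..{len(array) - 1}")
    return array[index]
```

Every call pays for the comparison at run time; the type system described in these slides aims to justify such accesses statically instead.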

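Byte copy: A version in C
#include <stdio.h>
#include <stdlib.h>

/* length() is the slide's helper; it is assumed to return an array's element count. */
void bcopy(int src[], int dst[]) {
    int i;
    if (length(src) != length(dst)) {
        printf("bcopy: unequal lengths\n");
        exit(1);
    }
    /* copy element by element; indices run from 0 to length(src) - 1 */
    for (i = 0; i < length(src); i++)
        dst[i] = src[i];
}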

Dynamic array bounds checking
– is required for safe languages such as Java, Modula-3, ML, Pascal
– can be expensive in practice (e.g. numerical computation)
– bounds violations are a rich source of program errors in unsafe languages such as C and C++ (e.g. off-by-one errors)

Some experimental results

Static array bounds checking
Flow analysis
– no annotations required (fully automatic)
– limited to simple cases and sensitive to program structures
– little or no feedback for detecting program errors

Static array bounds checking
Type-based approaches
– the ML type system is too coarse
– full dependent types are too fine
– Dependent ML provides an intermediate type system with practical type-checking
  – it is adequate for array bounds checking elimination
  – the programmer must write some type annotations

What are dependent types?
Dependent types depend on the values of language expressions. For instance,
  type     dependent type
  array    array(x)
  int      int(x)
  stack    stack(x)

Byte copy: A version in de Caml
let bcopy src dst = begin
  for i = 0 to vect_length(src) - 1 do
    dst..(i) <- src..(i)
  done
end withtype {n:nat} int vect(n) -> int vect(n) -> unit

Array bounds checking in mobile code
– It needs to be enforced for safety reasons
– It is difficult to eliminate, since the machine that executes the code may not trust the source of the code
– It is time-consuming to compile away

Some key applications of DTAL
– Compiler verification
– Mobile code security
– Mobile code efficiency

Increment: A flow chart
  (entry)          sp: i:int, l:label
  pop r1           r1 = i; sp: l:label
  add r1, r1, 1    r1 = i + 1; sp: l:label
  pop r2           r1 = i + 1, r2 = l
  push r1          r2 = l; sp: i+1:int
  jmp r2           continue...

Increment: An assembly version
inc:
  pop r1
  add r1, r1, 1
  pop r2
  push r1
  jmp r2

State types
A state type corresponds to a code continuation. It records the type information about the register file and the stack. For instance,
  [r1: int(i), r2: int array(i)]
  ('a)[r1: 'a, r2: [r1: 'a]]
  ('a,'b)[r1: 'a, r2: 'b, r3: [r1: 'a, r2: 'b]]
  {n:nat}[sp: [sp: stack(n)] :: stack(n)]

Register file
We use an array representation for the register file.
  r1 : tau1, r2 : tau2, ..., rn : taun

Stack
We use a list representation for the stack.
[Diagram: stack cells with types tau1, tau2, ..., taun]

Increment: A version in DTAL
inc: {i:int}{n:nat} [sp: int(i) :: [sp: int(i+1)::stack(n)] :: stack(n)]
  pop r1           // r1: int(i)
  add r1, r1, 1    // r1: int(i+1)
  pop r2           // r2: [sp: int(i+1)::stack(n)]
  push r1          // sp: int(i+1)::stack(n)
  jmp r2
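To make the effect of the increment routine concrete, the following Python sketch runs its five instructions on a stack modeled as a list whose last element is the top. The function run_inc and the dictionary of registers are illustrative assumptions, not part of DTAL.

```python
def run_inc(stack):
    """Run the inc routine; return (jump target, resulting stack)."""
    regs = {}
    regs["r1"] = stack.pop()        # pop r1         r1 = i
    regs["r1"] = regs["r1"] + 1     # add r1, r1, 1  r1 = i + 1
    regs["r2"] = stack.pop()        # pop r2         r2 = return label
    stack.append(regs["r1"])        # push r1        sp: i + 1 on top
    return regs["r2"], stack        # jmp r2
```

For example, run_inc(["bottom", "ret", 41]) jumps to "ret" with 42 on top of the remaining stack, which mirrors the state type [sp: int(i+1)::stack(n)] that the DTAL annotation assigns to the return label.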

Type index objects
index       i, j  ::= a | c | i+j | i-j | i*j | i/j
index prop  P     ::= false | true | i = j | i > j | not P | P1 and P2 | P1 or P2
index sort  gamma ::= int | {a: gamma | P}
For instance, nat is a shorthand for {a: int | a >= 0}
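One way to read the subset sorts above is as predicates on integers. The Python sketch below takes that reading; the names int_sort, subset and index_below are hypothetical helpers introduced only for illustration, and a real type-checker would decide such constraints symbolically rather than by testing values.

```python
def int_sort(a):
    """The base sort int."""
    return isinstance(a, int)

def subset(sort, prop):
    """Build the subset sort {a: sort | prop(a)} as a membership test."""
    return lambda a: sort(a) and prop(a)

# nat is shorthand for {a: int | a >= 0}
nat = subset(int_sort, lambda a: a >= 0)

def index_below(n):
    """Hypothetical sort {a: nat | n > a}: the valid indices of an array of size n."""
    return subset(nat, lambda a: n > a)
```

With these definitions, nat(0) holds, nat(-1) does not, and index_below(3) accepts exactly 0, 1 and 2.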

Types
types        tau   ::= alpha | sigma | int(x) | tau array(x) | stack(x) | prod(tau1,…,taun) | {a:gamma}.tau
state types  sigma ::= [(alpha1,…,alphan){a1:gamma1,…,an:gamman}. state]
state        state ::= (register file, stack)
Note: int stands for {a:int}.int(a); nat stands for {a:nat}.int(a)

Instructions in DTAL
instructions           ins ::= aop rd, rs, v | bop r, v | jmp v | load rd, rs(v) | store r | newtuple[tau] r | newarray[tau] r | mov r, v | pop r | push r
values                 v   ::= () | i | l | r
instruction sequences  I   ::= halt | ins; I | l; I

Programs in DTAL
label maps  Lambda ::= (l1 : sigma1, …, ln : sigman)
programs           ::= (Lambda, I)

Memory allocation
Tuple of type prod(tau1,…,taun): cells tau1 | tau2 | ... | tau(n-1) | taun
Array of type tau array(n): n cells, each of type tau

Memory allocation: an example A tuple of type prod(int, prod(int, int)): int

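Array types are non-variant
tau1 <= tau2 does not imply tau1 array(n) <= tau2 array(n)
Here is a counterexample: r1 and r2 refer to the same two-element array, which holds 0 and 0.
  r1: nat array(2)
  r2: int array(2)
Storing a negative number through r2 would be well-typed, yet it would invalidate the type of r1.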
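The counterexample can be replayed with two Python names bound to one list. This is only a sketch: Python has no nat or int array types, so the intended types appear as comments, and is_nat is a hypothetical helper standing in for the sort nat.

```python
def is_nat(value):
    """Membership test for nat = {a: int | a >= 0}."""
    return isinstance(value, int) and value >= 0

def alias_counterexample():
    """Write through the int view of an array and read through the nat view."""
    cells = [0, 0]
    r1 = cells      # intended type: nat array(2)
    r2 = cells      # intended type: int array(2), the same array as r1
    r2[0] = -1      # fine for an int array
    return r1[0]    # read back through the supposed nat array
```

The value read through r1 is -1, which is not a nat, so treating nat array(2) as a subtype of int array(2) would be unsound.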

State types are contra-variant state state’ implies [state] <= [state’]

Typing unconditional jumps
To type jmp v; I in a state:
– v must have a state type [state']
– the current state must be compatible with state'

Typing conditional jumps
To type beq r, v; I (test: r = 0?) under assumption; state:
– r : int(x)
– v : [state']
– fall-through case: I is typed under assumption, x != 0; state
– branch case: under assumption, x = 0, the state must be compatible with state'

Byte copy: A flow chart
Registers: r1: array size, r2: src, r3: dst
bcopy:   r4 <- 0
loop:    r5 <- r1 - r4
         r5 > 0? If not, go to finish.
         r5 <- r2(r4)
         r3(r4) <- r5
         r4 <- r4 + 1
         (back to loop)
finish

Byte copy: A version in DTAL
bcopy: {i:nat} [r1: int(i), r2: int array(i), r3: int array(i)]
  mov r4, 0    // r4 <- 0
  jmp loop     // start loop

Byte copy: A version in DTAL
loop: {i:nat, k:nat} [r1: int(i), r2: int array(i), r3: int array(i), r4: int(k)]
  sub r5, r1, r4      // r5 = r1 - r4
  blte r5, finish     // r5 <= 0 ?
  load r5, r2(r4)     // safe load
  store r3(r4), r5    // safe store
  add r4, r4, 1       // r4 <- r4 + 1
  jmp loop            // loop again
finish: []
  halt
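The load and store in the loop are safe because k is a nat and, once blte falls through, i - k > 0, so 0 <= k < i. The Python sketch below follows the DTAL code instruction by instruction and asserts that fact before each access. The function name run_bcopy and the use of Python lists for arrays are assumptions made for illustration.

```python
def run_bcopy(src, dst):
    """Simulate the DTAL bcopy and loop blocks on two equal-length lists."""
    assert len(src) == len(dst)          # both have type int array(i)
    r1, r2, r3 = len(src), src, dst      # r1: int(i), r2: src, r3: dst
    r4 = 0                               # mov r4, 0
    while True:                          # loop:
        r5 = r1 - r4                     # sub r5, r1, r4
        if r5 <= 0:                      # blte r5, finish
            break
        # From k: nat and i - k > 0 we get 0 <= k < i.
        assert 0 <= r4 < r1
        r5 = r2[r4]                      # load r5, r2(r4)
        r3[r4] = r5                      # store r3(r4), r5
        r4 = r4 + 1                      # add r4, r4, 1
    return dst                           # finish: halt
```

The assertion never fires for equal-length inputs, which is the run-time counterpart of what the type-checker establishes statically from the loop's state type.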

Operational semantics of DTAL
We use a standard abstract machine for assigning operational semantics to DTAL programs. The machine consists of three finite maps: (Heap, Register File, Stack)

Soundness
The execution of a DTAL program can either
– terminate normally, or
– run forever, or
– stall.
A well-typed DTAL program can never stall.
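A stalled machine is one with no applicable transition. As a rough sketch of one way this could happen, the Python fragment below models a load on a machine whose heap maps labels to arrays and whose register file maps register names to values. The representation, the Stalled exception and the load function are all assumptions for illustration, not the machine defined in the work.

```python
class Stalled(Exception):
    """Raised when the sketch machine has no applicable transition."""

def load(heap, regs, rd, rs, index):
    """Perform load rd, rs(index); stall when the index is out of bounds."""
    array = heap[regs[rs]]               # rs holds a heap label
    if not (0 <= index < len(array)):
        raise Stalled(f"load at index {index} of an array of size {len(array)}")
    regs[rd] = array[index]
    return regs
```

Soundness says that a well-typed program never reaches the stalled case, so an implementation of the machine may omit the test.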

Related work
Here is a (partial) list of some closely related work.
– Dependent types in practical programming (Xi & Pfenning)
– TALC Compiler (Morrisett et al. at Cornell)
– Safe C compiler (Necula & Lee)
– TIL compiler (the Fox project at CMU)

Current status & Future work
We have finished the following.
– Theoretical development of DTAL
– A prototype implementation of a type-checker for DTAL
We are working on the following.
– Designing a dependent type system of JVML (de JVML)
– Compiling (a subset of Java) into de JVML

Conclusion
We have demonstrated some uses of dependent types at assembly level.
– It can help compiler debugging and verification
– It can certify the memory safety property of mobile code
– It can lead to safer programs without compromising efficiency