The Practice of Type Theory in Programming Languages Robert Harper Carnegie Mellon University August, 2000.

Acknowledgements Thanks to Reinhard Wilhelm for inviting me to speak! Thanks to my colleagues and to my former and current students at Carnegie Mellon.

An Old Story Once upon a time (es war einmal), there were those who thought that typed high-level programming languages would save the world. –Ensure safety of executed code. –Support reasoning and verification. –Run efficiently (enough) on stock hardware. “If we all programmed in Pascal (or Algol or Simula or …), all of our problems would be solved.”

What Happened Instead Things didn’t work out quite as expected or predicted. –COTS software is mostly written in low-level, unsafe languages (eg, C, C++) –Some ideas have been adopted (eg, objects and classes), most haven’t. –Developers have learned to work with less-than-perfect languages, achieving astonishing results.

Languages Ride Again But the world has changed: strong safety assurances are more important than ever. –Mobile code on the internet. –Increasing reliance on software in “real life”. Schneider made a strong case for language-based security mechanisms. –“Languages aren’t just languages any more.” –Rich body of work on logics, semantics, type systems, verification, compilation.

Language-Based Security Key idea: program analysis is more powerful than execution monitoring. This talk is about one approach to taking this view seriously, typed certifying compilation.

Type Theory and Languages Type theory has emerged as the central organizing principle for language … –Design: genericity, abstraction, and modularity mechanisms. –Implementation: type inference, flow analysis. –Semantics: domain theory, logical relations.

What is a Type System? A type system is a syntactic discipline for enforcing levels of abstraction. –Ensures that bad things do not happen. A type system rules out programs. –Adding a function to a string –Interpreting an integer as a pointer –Violating interfaces

What is a Type System? How can this be a good thing? –Expressiveness arises from strictures: restrictions entail stronger invariants –Flexibility arises from controlled relaxation of strictures, not from their absence. A type system is fundamentally a verification tool that suffices to ensure invariants on execution behavior.
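The idea that a type system "rules out programs" can be illustrated with a minimal sketch of a checker for a toy expression language. Everything here (the term constructors, `type_of`, `IllTyped`) is a hypothetical illustration and not taken from the talk; it only assumes integers, strings, addition, and typed functions.

```python
from dataclasses import dataclass

# Types of the toy language (names are illustrative assumptions).
@dataclass(frozen=True)
class Base:
    name: str

@dataclass(frozen=True)
class Arrow:
    arg: object
    res: object

INT, STR = Base("int"), Base("string")

# Terms: literals, variables, addition, typed lambda, application.
@dataclass(frozen=True)
class Lit:
    value: object

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Lam:
    param: str
    param_type: object
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

class IllTyped(Exception):
    """Raised when a term is rejected by the type system."""

def type_of(e, env):
    """Return the type of term e under typing environment env, or raise IllTyped."""
    if isinstance(e, Lit):
        return INT if isinstance(e.value, int) else STR
    if isinstance(e, Var):
        if e.name not in env:
            raise IllTyped(f"unbound variable {e.name}")
        return env[e.name]
    if isinstance(e, Add):
        if type_of(e.left, env) != INT or type_of(e.right, env) != INT:
            raise IllTyped("addition expects two integers")
        return INT
    if isinstance(e, Lam):
        body_type = type_of(e.body, {**env, e.param: e.param_type})
        return Arrow(e.param_type, body_type)
    if isinstance(e, App):
        fn_type, arg_type = type_of(e.fn, env), type_of(e.arg, env)
        if not isinstance(fn_type, Arrow) or fn_type.arg != arg_type:
            raise IllTyped("argument does not match the function's interface")
        return fn_type.res
    raise IllTyped("unknown term")

# Accepted: fn (x : int) => x + 1
successor = Lam("x", INT, Add(Var("x"), Lit(1)))
print(type_of(successor, {}))

# Rejected: adding a function to a string.
try:
    type_of(Add(successor, Lit("hello")), {})
except IllTyped as err:
    print("rejected:", err)
```

The checker never runs the program; it inspects syntax only, which is the sense in which the discipline is syntactic.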

Types Induce Invariants Types induce invariants on programs. –If e : int, then its value must be an integer. –If e : int → int, then it must be a function taking and yielding integers. –If e : filedesc, then it must have been obtained by a call to open. –If e : int{H}, then no “low clearance” expression can read its value.

Types Induce Invariants These invariants provide –Safety properties: well-typed programs do not “go wrong”. –Equational properties: when are two expressions interchangeable in all contexts. –Representation independence (parametricity).

Types as Safety Certificates Typing is a sufficient condition for these invariants to hold. –Well-typed implies well-behaved. –Not (necessarily) checkable at run-time! Types form a certificate of safety. –Type checking = safety checking. –A practical sufficient condition for safety.

The HLL Assumption This is well and good, but … –Programs are compiled to unsafe, low-level machine code. –We want to know that the object code is safe. HLL assumption: trust the correctness of the compiler and run-time system. –A huge assumption. –Spurred much research in compiler correctness.

Certifying Compilers Idea: propagate types from the source to the object code. –Can be checked by a code recipient. –Avoids reliance on compiler correctness. Based on a new approach to compilation. –Typed intermediate languages. –Type-directed translation.

Typed Intermediate Languages Generalize syntax-directed translation to type-directed translation. –intermediate languages come equipped with a type system. –compiler transformations translate both a program and its type. –translation preserves typing: if e:T then e*:T* after translation

Typed Intermediate Languages Classical syntax-directed translation:
    Source = L1 → L2 → … → Ln = Target
             :
             T1
Type system applies to the source language only. –Type check, then throw away types.

Typed Intermediate Languages Type-directed translation:
    Source = L1 → L2 → … → Ln = Target
             :    :        :
             T1 → T2 → … → Tn
Maintain types during compilation. –Translate a program and its type. –Types guide translation process.
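The property "if e:T then e*:T*" can be sketched for a single compilation step. The two toy languages below are assumptions made for illustration: the source has int, bool, and pairs; the target has only int and pairs, so booleans are compiled to integers. `translate` returns the target term together with the source type, and `target_type` is an independent checker for the target language.

```python
# Terms and types are plain tuples; all names here are hypothetical.
# Source types: "int", "bool", ("pair", A, B). Target types: "int", ("pair", A, B).

def translate_type(t):
    """T*: bool becomes int; pair types are translated componentwise."""
    if t in ("int", "bool"):
        return "int"
    _, a, b = t
    return ("pair", translate_type(a), translate_type(b))

def translate(e):
    """Type-check source term e; return (e*, T) where T is the source type of e."""
    tag = e[0]
    if tag == "int":
        return ("int", e[1]), "int"
    if tag == "bool":
        return ("int", 1 if e[1] else 0), "bool"
    if tag == "pair":
        a, ta = translate(e[1])
        b, tb = translate(e[2])
        return ("pair", a, b), ("pair", ta, tb)
    if tag == "fst":
        p, tp = translate(e[1])
        if not isinstance(tp, tuple):
            raise TypeError("fst expects a pair")
        return ("fst", p), tp[1]
    if tag == "if":
        c, tc = translate(e[1])
        a, ta = translate(e[2])
        b, tb = translate(e[3])
        if tc != "bool" or ta != tb:
            raise TypeError("ill-typed conditional")
        return ("ifnz", c, a, b), ta
    raise TypeError("unknown source term")

def target_type(e):
    """Independent type checker for the target language."""
    tag = e[0]
    if tag == "int":
        return "int"
    if tag == "pair":
        return ("pair", target_type(e[1]), target_type(e[2]))
    if tag == "fst":
        t = target_type(e[1])
        if not isinstance(t, tuple):
            raise TypeError("fst expects a pair")
        return t[1]
    if tag == "ifnz":
        then_type = target_type(e[2])
        if target_type(e[1]) != "int" or then_type != target_type(e[3]):
            raise TypeError("ill-typed ifnz")
        return then_type
    raise TypeError("unknown target term")

source = ("if", ("bool", True),
          ("pair", ("int", 1), ("bool", False)),
          ("pair", ("int", 2), ("bool", True)))
target, source_type = translate(source)
# Typing is preserved: the translated term has the translated type.
assert target_type(target) == translate_type(source_type)
```

A code recipient only needs `target_type`; it does not have to trust `translate`.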

Typed Closure Conversion The type S → T becomes the ADT
    type Env
    val env : Env
    val code : Env * S -> T
The application F(a) becomes call(F.code,(F.env, a)) Functions are implementations of the ADT. –Essentially, functions are represented as objects. –Environment = private fields, code = method.
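A rough sketch of the closure representation described above, written in Python rather than the ML-style notation of the slide. The `Closure` class, `apply_closure`, and the two example functions are hypothetical names; Python does not enforce the abstraction of `Env`, so the hiding is by convention only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

@dataclass(frozen=True)
class Closure:
    env: Any                                 # val env  : Env (representation differs per function)
    code: Callable[[Tuple[Any, Any]], Any]   # val code : Env * S -> T

def call(code, arg):
    """Jump to closed code: it has no free variables, only its argument."""
    return code(arg)

def apply_closure(f, a):
    """The application F(a) becomes call(F.code, (F.env, a))."""
    return call(f.code, (f.env, a))

# Source function: fn x => x + n, with free variable n.
def adder_code(pair):
    env, x = pair
    (n,) = env          # this function's environment is a 1-tuple
    return x + n

def make_adder(n):
    return Closure(env=(n,), code=adder_code)

# A second function of the same type int -> int with a different environment layout.
def scaler_code(pair):
    env, x = pair
    return x * env["factor"]

def make_scaler(k):
    return Closure(env={"factor": k}, code=scaler_code)

# Callers treat both uniformly and never inspect env.
for f in (make_adder(3), make_scaler(2)):
    print(apply_closure(f, 5))
```

Both closures have different environment representations but the same interface, which is why the environment type is held abstract.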

Typed Object Code Typed Assembly Language (TAL) –type information ensures safety –generated by compiler –very close to standard x86 assembly Type information captures –types of registers and stack –type assumptions at branch targets (including join points) Relies heavily on polymorphism! –eg, callee-saves registers, enforcing abstraction

Typed Assembly Language
    fact: ALL rho.{r1:int, sp:{r1:int, sp:rho}::rho}
        jgz r1, positive
        mov r1,1
        ret
    positive:
        push r1 ; sp : int::{r1:int,sp:rho}::rho
        sub r1,r1,1
        call fact[int::{r1:int,sp:rho}::rho]
        pop r2 ; sp : {r1:int,sp:rho}::
        imul r1,r1,r2
        ret

Tracking Stronger Properties Familiar type systems go a long way. –Ensures minimal sanity of code. –Ensures compliance with interfaces. –Especially if you have polymorphism. Refinement types take a step further. –Track value range invariants. –Array bounds checks, null pointer checks, red-black invariants, etc.

Refinement Types First idea: subset types. e : { x : T | P(x) } iff e:T and |= P(e) Examples: –Pascal-like sub-ranges 0..n = { n : int | 0 ≤ n < length(A) } –Non-null objects –Red-black condition on RBT’s
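As a small sketch of the subset-type reading "e : { x : T | P(x) } iff e:T and P(e)", the following checks membership of concrete values. The `Subset` class and the helper names are assumptions for illustration. Note that this tests values at run time; it does not decide the property for arbitrary expressions, which is the hard part.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Subset:
    """{ x : base | pred(x) } as a run-time membership test."""
    base: type
    pred: Callable[[Any], bool]

    def contains(self, value) -> bool:
        # v belongs to the subset type iff v : base and pred(v) holds.
        return isinstance(value, self.base) and self.pred(value)

def index_of(array):
    """{ n : int | 0 <= n < length(A) }, the valid indices of A."""
    return Subset(int, lambda n: 0 <= n < len(array))

# Non-null objects: { v : object | v is not None }.
NON_NULL = Subset(object, lambda v: v is not None)

A = [10, 20, 30]
print(index_of(A).contains(2))
print(index_of(A).contains(3))
print(NON_NULL.contains(None))
```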

Refinement Types Checking value range properties is undecidable! –eg, cannot decide if 0 ≤ e < 10 for general expressions e Checker must include a theorem prover to validate object code. –either complex and error-prone, or –too weak to be useful

Refinement Types Second idea: proof carrying code. (e, π) : { x:T | P(x) } iff e:T and π |- P(e) Provide a proof of the range property. –How to obtain it? –How to represent it? Verifier checks the types and the proof. –using a proof checker, not a proof finder
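To make the difference between a proof checker and a proof finder concrete, here is a minimal sketch of a checker for range facts. The fact encoding `("le", a, b)` / `("lt", a, b)`, the three proof rules, and the names `infer` and `verify` are all hypothetical choices, not the formalism used in the talk. The checker does no search: it follows the supplied proof term and reports which fact it establishes.

```python
# Facts: ("le", a, b) means a <= b, ("lt", a, b) means a < b.
# Operands are integer constants or variable names (strings).

def infer(proof, hyps):
    """Return the fact established by `proof`, or raise ValueError if it is invalid."""
    kind = proof[0]
    if kind == "hyp":
        # An assumption supplied by the context (eg, a loop guard).
        fact = proof[1]
        if fact not in hyps:
            raise ValueError("unknown hypothesis")
        return fact
    if kind == "const":
        # A fact about two integer constants, checked by arithmetic.
        rel, a, b = proof[1]
        if rel not in ("le", "lt") or not (isinstance(a, int) and isinstance(b, int)):
            raise ValueError("not a constant fact")
        if not (a <= b if rel == "le" else a < b):
            raise ValueError("false constant fact")
        return proof[1]
    if kind == "trans":
        # Transitivity: from a R1 b and b R2 c conclude a R c,
        # where R is strict if either premise is strict.
        r1, a, b = infer(proof[1], hyps)
        r2, b2, c = infer(proof[2], hyps)
        if b != b2:
            raise ValueError("middle terms differ")
        return ("lt" if "lt" in (r1, r2) else "le", a, c)
    raise ValueError("unknown proof rule")

def verify(goal, proof, hyps):
    """Accept the certificate only if the proof establishes exactly the goal."""
    try:
        return infer(proof, hyps) == goal
    except ValueError:
        return False

hyps = {("le", 0, "i"), ("lt", "i", "n"), ("le", "n", "size")}
upper = ("trans", ("hyp", ("lt", "i", "n")), ("hyp", ("le", "n", "size")))
print(verify(("lt", "i", "size"), upper, hyps))
```

The producer of the code may need an arbitrarily clever analysis to find `upper`; the recipient only runs `verify`.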

Finding Proofs To use A[n] safely, we must prove that 0 ≤ n < size(A). If we insert a run-time check, it’s easy! –if 0 ≤ n < size(A) then *(A+4n) else fail In general we must find proofs. –Instrumented analysis methods. –Programmer declarations.

Representing Proofs How do we represent the proofs? –Need a formal logic for reasoning about value range properties (for example). –Need a proof checker for each such formalism. But which logic should we use? –How do we accommodate change? –Which properties are of interest?

Logical Frameworks The LF logical framework is a universal language for defining logical systems. –Captures uniformities of a large class of logical systems. –Provides a formal definition language for logical systems. Proof checking is reduced to a very simple form of type checking. –One type checker yields many proof checkers!

Logical Frameworks Specify the syntactic categories as LF types:
    term : type.
    formula : type.
    proof : formula -> type.
For each formula, there is a type of proofs of that formula in the logic.

Logical Frameworks Specify generators of each category:
    zero : term.
    succ : term -> term.
    implies : formula -> formula -> formula.
    modus_ponens : proof (implies A B) -> proof A -> proof B.
Validity checking = type checking!
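The slogan "validity checking = type checking" can be approximated by a checker that is parameterized on a table of inference rules. This is only a sketch under strong simplifying assumptions: it uses first-order pattern matching instead of LF's dependent types, and the `SIGNATURE` format, the `and_intro` rule, and the function names are hypothetical. Uppercase strings play the role of the schematic variables A and B.

```python
def match(pattern, term, subst):
    """Match pattern against term, extending subst; uppercase strings are metavariables."""
    if isinstance(pattern, str) and pattern.isupper():
        if pattern in subst:
            return subst[pattern] == term
        subst[pattern] = term
        return True
    if isinstance(pattern, tuple):
        return (isinstance(term, tuple) and len(term) == len(pattern)
                and all(match(p, t, subst) for p, t in zip(pattern, term)))
    return pattern == term

def instantiate(pattern, subst):
    """Replace metavariables in pattern by their bindings."""
    if isinstance(pattern, str) and pattern.isupper():
        return subst[pattern]
    if isinstance(pattern, tuple):
        return tuple(instantiate(p, subst) for p in pattern)
    return pattern

# A "signature": each rule maps premise patterns to a conclusion pattern.
SIGNATURE = {
    # modus_ponens : proof (implies A B) -> proof A -> proof B
    "modus_ponens": ([("implies", "A", "B"), "A"], "B"),
    # and_intro is an extra rule added here purely for illustration.
    "and_intro": (["A", "B"], ("and", "A", "B")),
}

def check(proof, hyps, sig=SIGNATURE):
    """Return the formula proved by `proof` (its "type"), or raise ValueError."""
    rule, *subproofs = proof
    if rule == "hyp":
        (formula,) = subproofs
        if formula not in hyps:
            raise ValueError("unknown hypothesis")
        return formula
    if rule not in sig:
        raise ValueError(f"rule {rule} is not in the signature")
    premises, conclusion = sig[rule]
    if len(premises) != len(subproofs):
        raise ValueError("wrong number of premises")
    subst = {}
    for pattern, sub in zip(premises, subproofs):
        if not match(pattern, check(sub, hyps, sig), subst):
            raise ValueError(f"premise of {rule} does not match")
    return instantiate(conclusion, subst)

hyps = {("implies", "p", "q"), "p"}
proof = ("modus_ponens", ("hyp", ("implies", "p", "q")), ("hyp", "p"))
print(check(proof, hyps))
```

Swapping in a different `SIGNATURE` yields a checker for a different logic without touching `check`, which loosely mirrors the "one type checker, many proof checkers" point.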

General Certified Code The logic is part of the safety certificate! –Logic of type safety. –Logic of value ranges. –Logic of space requirements. Proofs are LF terms for that logic. –Checker is parameterized on specification of the logic (an LF “signature”). –LF type checker checks proofs in any logic (provided it is formalized in LF).

Some Challenges Can certified compilation really be made practical? –TALC [Morrisett] for “safe C”. –TILT [CMU] for Standard ML [in progress]. –SML/NJ [Yale] for Standard ML [in progress]. –Touchstone [Necula, Lee] for “safe C”.

Some Challenges Can refinements be made useful and practical? –Dependent ML [Pfenning, Xi] –Dependently-Typed Assembly [Harper, Xi] Experience with ESC is highly relevant. –A difference is that refinements are built into the language.

Some Predictions Certifying compilation will be standard technology. –Code will come equipped with checkable safety certificates. Type systems will become the framework for building practical development tools. –Part of the program text. –Mechanically checkable.

Further Information