Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002.

CS Is About Programming How to build systems. –Better, faster, cheaper. –Reliable, maintainable, extensible. Evaluating and comparing systems. –Performance, behavior, security. –Compliance, usability. What can and can’t be done. –Algorithms and complexity.

Programming Is Linguistic Programming is an explanatory activity. –To yourself, now and in the future. –To whoever has to read your code. –To the compiler, which has to make it run. Explanations require language. –To state clearly what is going on. –To communicate ideas over space and time.

Programming Languages Are Essential Therefore languages are of the essence in computer science. –Programming languages, in the familiar sense. –Description languages, design patterns, architecture languages, specification languages.

Languages Abound New languages crop up all the time. –GPL’s: Java, C#. –API’s: Posix, COM, Corba. –DSL’s: Perl, XML, packet filters. Languages codify abstractions. –Useful programming idioms –Models of program behavior.

Some Conventional Wisdom PL research is over. –The language of the future is X … for various values of X. –It’s not about the one, true language! Anyone can design a PL. –Clearly, anyone does. –But look what you get! TCL, Perl, TeX, C++. PL research is irrelevant to practice. –Seen as a purely academic pursuit. –But the tide is turning as successes accumulate.

Some Accomplishments High-level languages. Static type disciplines. Automatic storage management. Objects, classes, ADT’s. Sophisticated compiler technology. Specification and verification.

Why People Don’t Notice It takes decades to go from research to practice. –Similar to other areas, such as algorithms, AI. –Large installed base of programmers. Ideas become “common sense” long before they are widely used. –Even very simple things, such as lexical scope, were once controversial! –Not to mention data abstraction, OOP, ….

Now More Than Ever PL research is more important, more relevant, more practical than ever! –Far from being “over”, we’re living in a golden age! Why more important? –PL research is all about software quality. –Quality is increasingly important.

Now More Than Ever Why more relevant? –Practice has moved toward theory. –Fundamentals are widely accepted. Why more practical? –New techniques for addressing complex problems are becoming available. –New perspectives on old (and new) problems, such as an emphasis on proof.

Some Important Trends Safety with performance. –High-level languages with good compilers. –Low-level languages with good type systems. Languages, not tools. –The design should live with the code. –Conformance should be checkable throughout the evolution of the code.

Some Important Trends Interplay between theory and practice. –“Prove theorems and report numbers.” –Build systems to assess practicality and to discover new problems. Full-spectrum coverage. –Functional and imperative. –High and low-level languages. –Design and implementation.

Some Important Trends Type theory as the GUT of PL’s. –Provides a precise criterion for safety and sanity of a design. –“Features” correspond to types. –Close connections with logics and semantics.

What Is A Type System? Static semantics: the well-formed programs. Dynamic semantics: the execution model.

What is a Type System? Safety theorem: types predict behavior. –Types describe the states of an abstract machine model. –Execution behavior must cohere with these descriptions. Thus a type is a specification and a type checker is a theorem prover.

Types Are Specifications Examples: –e : float means that e evaluates to a value in, say, floating point register 0. –e : float → int means that e is a procedure that is called with arg in FR0 and returns with result in GR0. –e : queue means that e behaves like a queue.
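The slogan "a type is a specification" can be made concrete with an abstract type: any implementation must honor the stated behavior. A minimal sketch in Rust (the trait and names are illustrative, not from the talk):

```rust
// A type as a behavioral specification: any implementor of this trait
// must behave like a FIFO queue.
trait Queue<T> {
    fn enq(&mut self, x: T);
    fn deq(&mut self) -> Option<T>; // FIFO order
}

// One implementation honoring the specification.
struct VecQueue<T> { items: Vec<T> }

impl<T> Queue<T> for VecQueue<T> {
    fn enq(&mut self, x: T) { self.items.push(x); }
    fn deq(&mut self) -> Option<T> {
        if self.items.is_empty() { None } else { Some(self.items.remove(0)) }
    }
}

fn main() {
    let mut q = VecQueue { items: Vec::new() };
    q.enq(1);
    q.enq(2);
    assert_eq!(q.deq(), Some(1)); // first in, first out
}
```

Checking a client against the trait is exactly "type checking against a specification": the client cannot observe anything the signature does not promise.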

Types Are Formal Methods Type checking is the most successful formal method! –In principle there are no limits. –In practice there is no end in sight. Examples: –Using types for low-level languages, say inside a compiler. –Extending the expressiveness of type systems for high-level languages.

Types in Compilation Conventional compilers: Source = L1 → L2 → … → Ln = Target, where only the source language L1 carries a type system T1. Types apply only to the source code. –Type check, then discard types. –If compiler is correct, target code is safe.

Typed Intermediate Languages Generalize syntax-directed translation to type-directed translation. –Intermediate languages come equipped with a type system. –Compiler transformations translate both a program and its type. –Translation preserves typing: if e:T then e*:T* after translation

Typed Intermediate Languages Type-directed translation: Source = L1 → L2 → … → Ln = Target, with typings T1 → T2 → … → Tn carried through every stage. Transfers typing properties from source code to object code. –Check integrity of compiler. –Exploit types during code generation.
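As a toy illustration of type-directed translation, here is a sketch (not from the talk) in which source booleans compile to target integers; the translation maps types as well as terms, so a well-typed source term yields a well-typed target term:

```rust
// Toy type-directed translation: source booleans become target ints
// (the type translation sends Bool to Int), and the term translation
// preserves typing: if e : Bool then trans(e) : Int.
enum SExp { STrue, SFalse, SIf(Box<SExp>, Box<SExp>, Box<SExp>) }

enum TExp { TConst(i64), TIfz(Box<TExp>, Box<TExp>, Box<TExp>) }

fn trans(e: &SExp) -> TExp {
    match e {
        SExp::STrue => TExp::TConst(1),
        SExp::SFalse => TExp::TConst(0),
        // ifz branches on zero, so the else-branch comes first
        SExp::SIf(c, t, f) =>
            TExp::TIfz(Box::new(trans(c)), Box::new(trans(f)), Box::new(trans(t))),
    }
}

fn eval(e: &TExp) -> i64 {
    match e {
        TExp::TConst(n) => *n,
        TExp::TIfz(c, z, nz) => if eval(c) == 0 { eval(z) } else { eval(nz) },
    }
}

fn main() {
    use SExp::*;
    let e = SIf(Box::new(STrue), Box::new(SFalse), Box::new(STrue));
    // source meaning: if true then false else true  =  false  =  0
    assert_eq!(eval(&trans(&e)), 0);
}
```

Because each target term comes with a translated type, the target program can be re-checked independently of the source — the point of a typed intermediate language.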

Certifying Compilers Types on object code certify its safety. –Type checking ensures safety. –Type information ensures verifiability. Examples of certified object code: –TAL = typed assembly language. –PCC = bare code + proof of safety. –Many variations are being explored.

TAL Example
fact: ∀ρ. {r1:int, sp:{r1:int, sp:ρ}::ρ}
      jgz r1, positive
      mov r1, 1
      ret
positive:
      push r1                           ; sp : int::{r1:int, sp:ρ}::ρ
      sub r1, r1, 1
      call fact[int::{r1:int, sp:ρ}::ρ]
      imul r1, r1, r2
      pop r2                            ; sp : {r1:int, sp:ρ}::ρ
      ret

Types for Low-Level Languages What is a good type system for a low-level language? –Should expose data representations. –Should allow for low-level “hacks”. –Should be verifiably safe. –Should not compromise efficiency. Current systems make serious compromises. –Very weak safety properties. –Force atomicity of complex operations.

Example: Memory Allocation Most type systems take an atomic view of constructors. –Allocate and initialize in “one step”. –Even HLL’s like Java impose restrictions. We’d like to expose the “fine structure”. –Support code motion such as coalescing. –Allow incremental initialization. –But nevertheless ensure safety!

Example: Memory Allocation An allocation protocol (used in TILT): –Reserve: obtain raw, un-initialized space. –Initialize: assign values to the parts. –Allocate: baptize as a valid object. Current type systems cannot handle this. –Partially initialized objects. –In-place modification of parts. –Interaction with collected heap.
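The reserve/initialize/allocate protocol can be sketched in Rust, using move semantics as a stand-in for the linear typing discipline; the TILT protocol itself is lower-level, so take this as an illustrative model only:

```rust
// Typestate sketch of reserve/initialize/allocate. A Reserved object
// permits incremental initialization; only allocate() — which checks
// that every field was initialized — yields a readable Allocated object.
// Rust's move semantics play the role of linearity: allocate consumes
// the reservation, so no writes are possible afterwards.
struct Reserved { fields: Vec<Option<i64>> }
struct Allocated { fields: Vec<i64> }

fn reserve(n: usize) -> Reserved {
    Reserved { fields: vec![None; n] }       // raw, uninitialized space
}

impl Reserved {
    fn initialize(&mut self, i: usize, v: i64) {
        self.fields[i] = Some(v);            // in-place, in any order
    }
    fn allocate(self) -> Allocated {         // consumes the reservation
        Allocated {
            fields: self.fields.into_iter()
                .map(|f| f.expect("uninitialized field"))
                .collect(),
        }
    }
}

fn main() {
    let mut r = reserve(2);
    r.initialize(1, 20);                     // initialization order is free
    r.initialize(0, 10);
    let a = r.allocate();                    // r is moved: no writes after this
    assert_eq!(a.fields, vec![10, 20]);
}
```

The static version in the talk goes further: it proves at compile time that every field is initialized before allocation, where this sketch only checks it at run time.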

Example: Memory Allocation [Diagram not fully transcribed: the heap, with heap pointer (HP) and allocation pointer (AP) delimiting reserved but uninitialized space.]

A Low-Level Type System Borrow two ideas from linear logic. –Restricted and unrestricted variables. –A modality to mediate between them. Restricted variables are resources. –Think: pointer into middle of an object. Unrestricted variables are standard. –Think: heap pointer.

A Low-Level Type System Variables are bound to valid objects. –Can be used freely. –Garbage collected when inaccessible. Resources are bound to parts of objects-in-creation. –Cannot be passed around freely. –Explicitly allocated and disposed.

Restrictions on Resources Linearity: use resources exactly once. –Admits re-typing after initialization. –Ensure allocation before general usage. Ordering: resource adjacency matters. –Admit “pointers into the middle” of objects. –Supports in-place, piecemeal initialization.

Variables and Resources Typing judgments have the form Γ; Δ ⊢ e : τ, with unrestricted variables x in Γ and ordered resources a in Δ. Ordering of x's does not matter. –Abstract "mobile" locations. Ordering and usage of a's does matter. –Abstract "pinned" locations, with (a form of) pointer arithmetic.

Low-Level Type Constructors Contiguous data: τ1 • τ2. –Two contiguous values. –Two adjacent words: int • int. Mobile data object: !τ. –A fully initialized object of type τ. Example: τ1 × τ2 := !(τ1 • τ2). –A pointer to an adjacent pair of values.

Allocating a Pair Allocate (1,2): –Reserve space at a; create names for the parts. –Initialize a1, using it up; re-introduce a1 once initialized. –Fuse the parts and allocate: resource a is used up! –The resource context must be empty on return.

Coalescing Reservation Allocate (0,(1,2)): [diagram not transcribed]

What Have We Gained? The ordered type system ensures: –All reserved data is eventually allocated. –Initialization can happen in any order. –Cannot read un-initialized memory. May also be used for inter-operability. –Adjacency requirements. –Marshalling and un-marshalling. –Precise control over representations.

Types for High-Level Languages Recall: types express invariants. –Calling sequence, data representation. –Abstraction boundaries (polymorphism). Theme: capture more invariants. –Representation invariants. –Protocol compliance. Trade-off expressiveness for simplicity and convenience.

Data Structure Invariants Example: bit strings. –nat : no leading 0's. –pos : a non-zero nat. –bits : any old thing. Goal: check such properties of code. –inc takes nat to pos, preserves pos. –Left shift preserves nat, pos.

Data Structure Invariants Properties of interest: –pos ≤ nat ≤ bits Operations: –ε : bits –add0 : bits → bits ∧ nat → pos –add1 : bits → bits ∧ nat → pos

Data Structure Invariants Logical consequences: –add0 : nat → pos ∧ nat → nat –add1 : pos → pos –ε : bits Type check code to check invariants! –Simple bi-directional algorithm suffices. –Example: increment.

Data Structure Invariants Increment: inc : bits → bits ∧ nat → pos fun inc (ε) ⇒ ε1 | inc (b0) ⇒ b1 | inc (b1) ⇒ (inc b)0
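For the curious, here is a runnable Rust transliteration of inc (names are mine); the refinements nat and pos are checked dynamically here, where the bi-directional checker in the talk would verify them statically:

```rust
// Bit strings, least significant bit outermost: B0(b) is "b followed by a 0".
#[derive(Clone, PartialEq)]
enum Bits { E, B0(Box<Bits>), B1(Box<Bits>) }

use Bits::*;

fn inc(b: &Bits) -> Bits {
    match b {
        E => B1(Box::new(E)),          // ε ⇒ ε1
        B0(p) => B1(p.clone()),        // b0 ⇒ b1
        B1(p) => B0(Box::new(inc(p))), // b1 ⇒ (inc b)0
    }
}

fn to_int(b: &Bits) -> u64 {
    match b {
        E => 0,
        B0(p) => 2 * to_int(p),
        B1(p) => 2 * to_int(p) + 1,
    }
}

// The refinement "nat": no leading zeros (the leading bit is innermost).
fn is_nat(b: &Bits) -> bool {
    match b {
        E => true,
        B0(p) => **p != E && is_nat(p), // a bare leading 0 is rejected
        B1(p) => is_nat(p),
    }
}

fn main() {
    let three = B1(Box::new(B1(Box::new(E)))); // "11" = 3
    let four = inc(&three);                    // "100" = 4
    assert_eq!(to_int(&four), 4);
    assert!(is_nat(&four) && four != E);       // inc takes nat to pos
}
```

The two conjuncts of inc's type correspond to the two passes the talk describes: once at bits → bits, once at nat → pos.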

Data Structure Invariants Fully mechanically checkable! –Type check inc twice, once for each conjunct. –First pass: assume argument is just bits, derive that result is therefore bits. –Second pass: assume argument is nat, derive that result is pos. Requires checking entailment of properties. –Decidable for subtype-like behavior.

Value Range Invariants Array bounds checking (a la Pascal): –[0..size(A)] Null and non-null objects: –null, really(C) Aliasing: –its(c) : the object c itself
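The null/really(C) distinction is exactly what option types give you: the nullability check happens once, and the non-null type records that it happened. A tiny Rust sketch (function names are illustrative):

```rust
// greet demands a non-null name: &str is the "really" type. The check
// happens once, at the match; afterwards the type carries the proof,
// so downstream code cannot forget it and need not re-check.
fn greet(name: &str) -> String {
    format!("hello, {name}")
}

fn main() {
    let maybe: Option<&str> = Some("world"); // the nullable type
    match maybe {
        Some(name) => assert_eq!(greet(name), "hello, world"),
        None => {} // greet cannot even be called here
    }
}
```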

Watch Out! Such types involve dynamic entities. –[0..size(A)] : A is an array. –its(o) : o is a run-time object. But types are static! –What is an expression of type [0..size(if … then A else B)] ??? How to get static checking?

Dependent Types Solution: compile-time proxies. –Track values insofar as possible. –Existentially quantify when it's not. Examples: –0 : its(0) –+ : its(1) × its(2) → its(3) –if … then 1 else 2 : ∃n. its(n)

Types for the World What about the state of the world? –The lock l is held. –The file f is open. –The contents of cell c is positive. But such properties change as execution proceeds! –Here I’m holding the lock, there I’m not. –The file is open now, not later.

Types for the World Want a simple, checkable type system for the “world”. –Types characterize the current state. –Types of the state change over time. But what kind of type system? –Properties of the world are ephemeral. –Facts are no longer persistent. Need: a logic of ephemera.

Types for the World Ephemera behave strangely: –Cannot be replicated: holding twice is not the same as holding once. –Cannot be ignored: must not lose track of whether you hold a lock. Once again linear logic does the trick! –Model parts of the world as resources. –Type changes track state changes.

Types for the World A very simple locking protocol: –acquire : ∀l. its(l) / free(l) → unit / held(l) –release : ∀l. its(l) / held(l) → unit / free(l) –new : unit → ∃l. unit / free(l) Here each type pairs a Value with a World; free(l) and held(l) are "linear" premises that get "used up" and "replaced" by the conclusion.
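The acquire/release typing can be approximated with a typestate sketch in Rust, where moving a handle plays the role of the linear "used up" premise; this is an illustration, not the talk's type system:

```rust
// Typestate sketch of the locking protocol: acquire consumes a Free
// handle and returns a Held one; release does the reverse. Rust's move
// semantics supply the linearity: a consumed handle cannot be used
// again, so acquiring twice or losing track of the state is a
// compile-time error, not a runtime one.
struct Lock { name: &'static str }
struct Free(Lock);
struct Held(Lock);

fn new_lock(name: &'static str) -> Free { Free(Lock { name }) }
fn acquire(l: Free) -> Held { Held(l.0) }   // consumes the Free handle
fn release(l: Held) -> Free { Free(l.0) }   // consumes the Held handle

// A function that requires the lock to be held, in its type.
fn critical_section(l: Held) -> Held { l }

fn main() {
    let l = new_lock("l");
    let h = acquire(l);        // `l` is moved: reusing it will not compile
    let h = critical_section(h);
    let f = release(h);
    assert_eq!(f.0.name, "l");
}
```

As in the slide's judgment, the "world" component (Free/Held) changes from premise to conclusion while the value travels along.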

Types for the World What does this buy you? –Stating and enforcing simple protocol properties. –Locality of reasoning: focus only on what changes, not what stays the same. It’s much harder than it sounds! –Separation logic: Reynolds and O’Hearn. –But it can be done: Vault Project at MSR.

Summary PL research is providing creative, useful solutions to practical problems. –Building on decades of fundamental research, esp. in logic and type theory. There is much, much more coming! –Many ripe fruit, lots of new developments. –Many good problems yet to be addressed.

Acknowledgements Much of the work discussed here was done by or in collaboration with Karl Crary (CMU), Frank Pfenning (CMU), David Walker (Princeton), Greg Morrisett (Cornell). Plus dozens of our current and former students at Carnegie Mellon.

Types and Logic Logic is the science of deduction. A program is a proof of a theorem.

Memory Allocation If allocation and initialization are atomic, the type system can be simple. –Constructors create objects of a type by allocating and initializing them. –Very limited form of control over initialization order.

Selective Memoization A familiar idea: memoization. –To memoize f(x,y), maintain a mapping from pairs (x,y) to values f(x,y). –Check table on call, update on return. Problem: not sufficiently accurate. –Memoization is driven by call history. –This is much too inaccurate!

Selective Memoization Partial dependency: fun f(x,y) = if y=0 then 0 else x/y. –Never “touches” x if y is zero. Memoization succeeds only if (n,0) has been seen before! –Even though n does not matter!

Selective Memoization Dependency on approximations: fun f(x,y,z) = if x>0 then x+y else x+z. –Result depends on sign of x and the value of either y or z. Memoization cannot account for all triples (m,n,*) with m positive! –Can only reproduce previous calls.

Selective Memoization Idea: track dependence of the result on the input. –Keep track of what parts are “touched”. –Key memo table on the control path, not the full data! –Generalizes conventional memoization. How to support this in a language?
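A dynamically checked sketch of the idea (names and structure are mine): key the memo table on the trace of touched values rather than the full argument tuple, so calls that follow the same control path share one entry:

```rust
use std::collections::HashMap;

// Sketch of selective memoization: the memo table is keyed on the
// sequence of values actually touched (the control path), not on the
// whole argument tuple, so f(7,0) and f(99,0) share one entry.
struct Memo {
    table: HashMap<Vec<i64>, i64>,
    calls: usize, // how many times the body really ran
}

impl Memo {
    fn new() -> Self { Memo { table: HashMap::new(), calls: 0 } }

    // f(x, y) = if y == 0 { 0 } else { x / y }:
    // touch y first, and touch x only when it is needed.
    fn f(&mut self, x: i64, y: i64) -> i64 {
        let key = if y == 0 { vec![y] } else { vec![y, x] };
        if let Some(&r) = self.table.get(&key) { return r; }
        self.calls += 1;
        let r = if y == 0 { 0 } else { x / y };
        self.table.insert(key, r);
        r
    }
}

fn main() {
    let mut m = Memo::new();
    assert_eq!(m.f(7, 0), 0);
    assert_eq!(m.f(99, 0), 0); // hit: x was never touched, same path [0]
    assert_eq!(m.calls, 1);
    assert_eq!(m.f(8, 2), 4);
    assert_eq!(m.calls, 2);
}
```

The linguistic support described next makes this safe: the type system guarantees that every value the result depends on really appears in the key.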

Selective Memoization Linguistic support: –Ensure that all true dependencies are properly tracked. –Allow specification of the “granularity” of memoization. Whole pair, or just components. Only the spine of a list. Only an approximation of a value.

Selective Memoization Key idea: resources. –The result of a memoized function cannot depend on a resource. –The argument of a memoized function is a resource.

Selective Memoization To use the value of a resource, it must be explicitly "touched". –Records usage of the resource. –Binds value to an unrestricted variable. "Mark" types on which value may depend. –!(int × int): depends on both arguments. –!int × !int: may depend on either or both arguments.

Selective Memoization Typing: Γ; Δ ⊢ e : τ –Γ : types of unrestricted variables. –Δ : types of restricted variables. "Marking" is a modality! –Specifically, the modal logic of necessitation.

Selective Memoization Critical typing rules:

Selective Memoization Example: partial dependency fun f(a:!int, b:!int):int is let! y:int be b in if y=0 then return (0) else let! x:int be a in return (x/y)

Selective Memoization Example: approximate dependency. fun pos(a:!int):!bool is let! x:int be a in return (x>0) fun f(a:!int, b:!int):int is let! p:bool be pos(a) in if p then return (0) else let! x:int be a in let! y:int be b in return (x+y)