Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002.

Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002

CS Is About Programming How to build systems. –Better, faster, cheaper. –Reliable, maintainable, extensible. Evaluating and comparing systems. –Performance, behavior, security. –Compliance, usability. What can and can’t be done. –Algorithms and complexity.

Programming Is Linguistic Programming is an explanatory activity. –To yourself, now and in the future. –To whoever has to read your code. –To the compiler, which has to make it run. Explanations require language. –To state clearly what is going on. –To communicate ideas over space and time.

Programming Languages Are Essential Therefore languages are of the essence in computer science. –Programming languages, in the familiar sense. –Description languages, design patterns, architecture languages, specification languages.

Languages Abound New languages crop up all the time. –GPL’s: Java, C#. –API’s: Posix, COM, Corba. –DSL’s: Perl, XML, packet filters. Languages codify abstractions. –Useful programming idioms –Models of program behavior.

Some Conventional Wisdom PL research is over. –The language of the future is X … for various values of X. –It’s not about the one, true language! Anyone can design a PL. –Clearly, anyone does. –But look what you get! TCL, Perl, TeX, C++. PL research is irrelevant to practice. –Seen as a purely academic pursuit. –But the tide is turning as successes accumulate.

Some Accomplishments High-level languages. Static type disciplines. Automatic storage management. Objects, classes, ADT’s. Sophisticated compiler technology. Specification and verification.

Why People Don’t Notice It takes decades to go from research to practice. –Similar to other areas, such as algorithms, AI. –Large installed base of programmers. Ideas become “common sense” long before they are widely used. –Even very simple things, such as lexical scope, were once controversial! –Not to mention data abstraction, OOP, ….

Now More Than Ever PL research is more important, more relevant, more practical than ever! –Far from being “over”, we’re living in a golden age! Why more important? –PL research is all about software quality. –Quality is increasingly important.

Now More Than Ever Why more relevant? –Practice has moved toward theory. –Fundamentals are widely accepted. Why more practical? –New techniques for addressing complex problems are becoming available. –New perspectives on old (and new) problems, such as an emphasis on proof.

Some Important Trends Safety with performance. –High-level languages with good compilers. –Low-level languages with good type systems. Languages, not tools. –The design should live with the code. –Conformance should be checkable throughout the evolution of the code.

Some Important Trends Interplay between theory and practice. –“Prove theorems and report numbers.” –Build systems to assess practicality and to discover new problems. Full-spectrum coverage. –Functional and imperative. –High and low-level languages. –Design and implementation.

Some Important Trends Type theory as the GUT of PL’s. –Provides a precise criterion for safety and sanity of a design. –“Features” correspond to types. –Close connections with logics and semantics.

What Is A Type System? Static semantics: the well-formed programs. Dynamic semantics: the execution model.

What is a Type System? Safety theorem: types predict behavior. –Types describe the states of an abstract machine model. –Execution behavior must cohere with these descriptions. Thus a type is a specification and a type checker is a theorem prover.

Types Are Specifications Examples: –e : float means that e evaluates to a value in, say, floating point register 0. –e : float ! int means that e is a procedure that is called with arg in FR0 and returns with result in GR0. –e : queue means that e behaves like a queue.

Types Are Formal Methods Type checking is the most successful formal method! –In principal there are no limits. –In practice there is no end in sight. Examples: –Using types for low-level languages, say inside a compiler. –Extending the expressiveness of type systems for high-level languages.

Types in Compilation Conventional compilers: Source = L 1  L 2  …  L n = Target : T 1 Types apply only to the source code. –Type check, then discard types. –If compiler is correct, target code is safe.

Typed Intermediate Languages Generalize syntax-directed translation to type-directed translation. –Intermediate languages come equipped with a type system. –Compiler transformations translate both a program and its type. –Translation preserves typing: if e:T then e*:T* after translation

Typed Intermediate Languages Type-directed translation: Source = L 1  L 2  …  L n = Target : : : T 1  T 2  …  T n Transfers typing properties from source code to object code. –Check integrity of compiler. –Exploit types during code generation.

Certifying Compilers Types on object code certify its safety. –Type checking ensures safety. –Type information ensures verifiability. Examples of certified object code: –TAL = typed assembly language. –PCC = bare code + proof of safety. –Many variations are being explored.

TAL Example fact: 8 .{r1:int, sp:{r1:int,  }::  } jgz r1, positive mov r1,1 ret positive: push r1 ; sp:int::{t1:int,sp:  }::  sub r1,r1,1 call fact[int::{r1:int,sp:  }::  ] imul r1,r1,r2 pop r2 ; sp:{r1:int,sp:  }::ret

Types for Low-Level Languages What is a good type system for a low-level language? –Should expose data representations. –Should allow for low-level “hacks”. –Should be verifiably safe. –Should not compromise efficiency. Current systems make serious compromises. –Very weak safety properties. –Force atomicity of complex operations.

Example: Memory Allocation Most type systems take an atomic view of constructors. –Allocate and initialize in “one step”. –Even HLL’s like Java impose restrictions. We’d like to expose the “fine structure”. –Support code motion such as coalescing. –Allow incremental initialization. –But nevertheless ensure safety!

Example: Memory Allocation An allocation protocol (used in TILT): –Reserve: obtain raw, un-initialized space. –Initialize: assign values to the parts. –Allocate: baptize as a valid object. Current type systems cannot handle this. –Partially initialized objects. –In-place modification of parts. –Interaction with collected heap.

Example: Memory Allocation Heap HP AP ???? LP AP 120 HP

A Low-Level Type System Borrow two ideas from linear logic. –Restricted and unrestricted variables. –A modality to mediate between them. Restricted variables are resources. –Think: pointer into middle of an object. Unrestricted variables are standard. –Think: heap pointer.

A Low-Level Type System Variables are bound to valid objects. –Can be used freely. –Garbage collected when inaccessible. Resources are bound to parts of objects-in-creation. –Cannot be passed around freely. –Explicitly allocated and disposed.

Restrictions on Resources Linearity: use resources exactly once. –Admits re-typing after initialization. –Ensure allocation before general usage. Ordering: resource adjacency matters. –Admit “pointers into the middle” of objects. –Supports in-place, piecemeal initialization.

Variables and Resources Typing judgments: Ordering of x’s does not matter. –Abstract “mobile” locations. Ordering and usage of a’s does matter. –Abstract “pinned” locations, with (a form of) pointer arithmetic.

Low-Level Type Constructors Contiguous data:  1 ²  2. –Two contiguous values. –Two adjacent words: int ² int. Mobile data object: ! . –A fully initialized object of type . Example:  1 £  2 := ! (  1 ²  2 ). –A pointer to an adjacent pair of values.

Allocating a Pair Allocate (1,2) : Initialize a 1, using it up. Re-introduce a 1. Must be empty on return. Reserve space at a.Create names for parts. Fuse parts and allocate. Resource a is used up!

Coalescing Reservation Allocate (0,(1,2)) :

What Have We Gained? The ordered type system ensures: –All reserved data is eventually allocated. –Initialization can happen in any order. –Cannot read un-initialized memory. May also be used for inter-operability. –Adjacency requirements. –Marshalling and un-marshalling. –Precise control over representations.

Types for High-Level Languages Recall: types express invariants. –Calling sequence, data representation. –Abstraction boundaries (polymorphism). Theme: capture more invariants. –Representation invariants. –Protocol compliance. Trade-off expressiveness for simplicity and convenience.

Data Structure Invariants Example: bit strings. –nat : no leading 0’s. –pos : a non-zero Nat. –bits : any old thing. Goal: check such properties of code. –Incr takes nat to pos, preserves pos. –Left shift preserves nat, pos.

Data Structure Invariants Properties of interest: –pos ` nat ` bits Operations: –  : bits –add0 : bits ! bits Æ nat ! pos –add1 : bits ! bits Æ nat ! pos

Data Structure Invariants Logical consequences: –add0 : nat ! pos Æ nat ! nat –add1 : pos ! pos –  : bits Type check code to check invariants! –Simple bi-directional algorithm suffices. –Example: increment.

Data Structure Invariants Increment: inc: bits ! bits Æ nat ! pos fun inc (  ) )  1 | inc (b0) ) b1 | inc (b1) ) (inc b)0

Data Structure Invariants Fully mechanically checkable! –Type check inc twice, once for each conjunct. –First pass: assume argument is just bits, derive that result is therefore bits. –Second pass: assume argument is nat, derive that result is pos. Requires checking entailment of properties. –Decidable for subtype-like behavior.

Value Range Invariants Array bounds checking (a la Pascal): –[0..size(A)] Null and non-null objects: –null, really(C) Aliasing: –its(c) : the object c itself

Watch Out! Such types involve dynamic entities. –[0..size(A)] : A is an array. –its(o) : o is a run-time object. But types are static! –What is an expression of type [ 0..size(if … then A else B)] ??? How to get static checking?

Dependent Types Solution: compile-time proxies. –Track values insofar as possible. –Existentially quantify when its not. Examples: –0 : its(0) –+ : its(1) £ its(2) ! its(4) –if … then 1 else 2 : 9 n its(n)

Types for the World What about the state of the world? –The lock l is held. –The file f is open. –The contents of cell c is positive. But such properties change as execution proceeds! –Here I’m holding the lock, there I’m not. –The file is open now, not later.

Types for the World Want a simple, checkable type system for the “world”. –Types characterize the current state. –Types of the state change over time. But what kind of type system? –Properties of the world are ephemeral. –Facts are no longer persistent. Need: a logic of ephemera.

Types for the World Ephemera behave strangely: –Cannot be replicated: holding twice is not the same as holding once. –Cannot be ignored: must not lose track of whether you hold a lock. Once again linear logic does the trick! –Model parts of the world as resources. –Type changes track state changes.

Types for the World A very simple locking protocol: –acquire : 8 l its(l) / free(l) ! unit / held(l) –release : 8 l its(l) / held(l) ! free(l) –new : unit ! 9 l unit / free(l) A “linear” function type. Gets “used up”. “Replaces” premise. Value / World

Types for the World What does this buy you? –Stating and enforcing simple protocol properties. –Locality of reasoning: focus only on what changes, not what stays the same. It’s much harder than it sounds! –Separation logic: Reynolds and O’Hearn. –But it can be done: Vault Project at MSR.

Summary PL research is providing creative, useful solutions to practical problems. –Building on decades of fundamental research, esp. in logic and type theory. There is much, much more coming! –Many ripe fruit, lots of new developments. –Many good problems yet to be addressed.

Acknowledgements Much of the work discussed here was done by or in collaboration with Karl Crary (CMU), Frank Pfenning (CMU), David Walker (Princeton), Greg Morrisett (Cornell). Plus dozens of our current and former students at Carnegie Mellon.

Types and Logic Logic is the science of deduction. A program is a proof of a theorem.

Memory Allocation If allocation and initialization are atomic, the type system can be simple. –Constructors create objects of a type by allocating and initializing them. –Very limited form of control over initialization order.

Selective Memoization A familiar idea: memoization. –To memoize f(x,y), maintain a mapping from pairs (x,y) to values f(x,y). –Check table on call, update on return. Problem: not sufficiently accurate. –Memoization is driven by call history. –This is much too inaccurate!

Selective Memoization Partial dependency: fun f(x,y) = if y=0 then 0 else x/y. –Never “touches” x if y is zero. Memoization succeeds only if (n,0) has been seen before! –Even though n does not matter!

Selective Memoization Dependency on approximations: fun f(x,y,z) = if x>0 then x+y else x+z. –Result depends on sign of x and the value of either y or z. Memoization cannot account for all triples (m,n,*) with m positive! –Can only reproduce previous calls.

Selective Memoization Idea: track dependence of the result on the input. –Keep track of what parts are “touched”. –Key memo table on the control path, not the full data! –Generalizes conventional memoization. How to support this in a language?

Selective Memoization Linguistic support: –Ensure that all true dependencies are properly tracked. –Allow specification of the “granularity” of memoization. Whole pair, or just components. Only the spine of a list. Only an approximation of a value.

Selective Memoization Key idea: resources. –The result of a memoized function cannot depend on a resource. –The argument of a memoized function is a resource.

Selective Memoization To use the value of a resource, it must be explicitly “touched”. –Records usage of the resource. –Binds value to an unrestricted variable. “Mark” types on which value may depend. –!(int £ int): depends on both arguments. –!int £ !int: may depend on either or both arguments.

Selective Memoization Typing:  ;  ` e:  –  : types of unrestricted variables. –  : types of restricted variables. “Marking” is a modality! –Specifically, the modal logic of necessitation.

Selective Memoization Critical typing rules:

Selective Memoization Example: partial dependency fun f(a:!int,b:!int):int is let! x:int be a in if a=0 then return (0) else let! y:int be b in return (a/b)

Selective Memoization Example: approximate dependency. fun f(a:!int, b:!int):int = let! p:bool = pos(x) in if p then return(0) else return(x+y) fun pos(a:!int):!bool = let! x:int be a in return (x>0)

Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002.

Similar presentations

Presentation on theme: "Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002.

Similar presentations

Presentation on theme: "Programming Languages: The Essence of Computer Science Robert Harper Carnegie Mellon University October, 2002."— Presentation transcript:

Similar presentations

About project

Feedback