Typed Compilation of Recursive Datatypes Joseph C. Vanderwaart, Derek Dreyer, Leaf Petersen, Karl Crary, Robert Harper, and Perry Cheng Carnegie Mellon.


Typed Compilation of Recursive Datatypes Joseph C. Vanderwaart, Derek Dreyer, Leaf Petersen, Karl Crary, Robert Harper, and Perry Cheng Carnegie Mellon University TLDI 2003

2 SML Datatypes Elegant mechanism for defining recursive variant types, such as: datatype intlist = Nil | Cons of int * intlist Important that constructor applications and pattern matching should be implemented efficiently Subject of this talk: –How to implement SML datatypes efficiently in a type-preserving compiler
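
As a minimal sketch of the two operations this slide says must be efficient, the following builds a value with constructor applications and consumes it by pattern matching. The helper name `sum` and the sample list are illustrative assumptions, not part of the talk.

```sml
(* The datatype from the slide: a recursive variant type. *)
datatype intlist = Nil | Cons of int * intlist

(* Pattern matching destructs a value; sum is an illustrative helper. *)
fun sum Nil = 0
  | sum (Cons (x, rest)) = x + sum rest

(* Constructor applications build a value. *)
val example = Cons (1, Cons (2, Cons (3, Nil)))
val total = sum example
```

Every `Cons` application and every clause of `sum` is a candidate for the per-use overhead that the rest of the talk tries to eliminate.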

3 Formal Framework Harper and Stone’s type-theoretic interpretation of Standard ML: –“Elaborates” SML programs into a type theory Reasons for using HS: –Models first phase of type-preserving compiler, in particular the TILT compiler (developed at CMU) –Can explain datatype semantics in terms of type theory

4 Overview Three interpretations of datatypes: –Harper-Stone interpretation –Transparent interpretation –Coercion interpretation Comparison on three axes: –Efficiency –Fidelity to the Definition of SML –Meta-theoretic complexity

The Harper-Stone Interpretation

6 Datatype Semantics SML datatypes are generative: –Identical datatype declarations in separate modules yield distinct (abstract) types HS elaborates datatypes as modules providing: –The datatype itself defined as a recursive sum type –Functions to construct and destruct values of the datatype HS models generativity by “sealing” the datatype module with an abstract signature
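
Generativity can be observed directly in SML. In this sketch the structure names `A` and `B` and the function `sizeA` are hypothetical; the point is only that two textually identical declarations yield incompatible types.

```sml
(* Two identical datatype declarations in separate modules. *)
structure A = struct datatype t = Leaf | Node of t * t end
structure B = struct datatype t = Leaf | Node of t * t end

(* sizeA accepts only values of type A.t. *)
fun sizeA A.Leaf = 1
  | sizeA (A.Node (l, r)) = sizeA l + sizeA r

val okay = sizeA (A.Node (A.Leaf, A.Leaf))

(* Uncommenting the next line is a type error, because B.t is a
   distinct type even though its declaration is identical:
   val bad = sizeA B.Leaf
*)
```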

7 ExpDec Example datatype exp = VarExp of var | LetExp of dec * exp and dec = ValDec of var * exp | SeqDec of dec * dec VarExp(v) ≈ “v” LetExp(d,e) ≈ “let d in e” ValDec(v,e) ≈ “val v = e” SeqDec(d1,d2) ≈ “d1; d2”

8 ExpDec Implementation structure ExpDec :> EXPDEC = struct type exp = μ_1(α,β).(var + β * α, var * α + β * β) type dec = μ_2(α,β).(var + β * α, var * α + β * β) fun exp_in x = roll_exp(x) fun exp_out x = unroll_exp(x) fun dec_in x = roll_dec(x) fun dec_out x = unroll_dec(x) end

9 ExpDec Interface signature EXPDEC = sig type exp type dec val exp_in : var + (dec * exp) -> exp val exp_out : exp -> var + (dec * exp) val dec_in : (var * exp) + (dec * dec) -> dec val dec_out : dec -> (var * exp) + (dec * dec) end

10 Elaborating Constructor Calls Client of the datatype does the injection into the sum, then calls the datatype’s “in” function: VarExp(v) ⇝ ExpDec.exp_in(inj_1(v)) LetExp(d,e) ⇝ ExpDec.exp_in(inj_2(d,e)) ValDec(v,e) ⇝ ExpDec.dec_in(inj_1(v,e)) SeqDec(d1,d2) ⇝ ExpDec.dec_in(inj_2(d1,d2)) But function calls to the in functions are too expensive.
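
The Harper-Stone elaboration can be approximated in SML itself. This is a sketch under stated assumptions: SML has no anonymous recursive sum types, so the single-constructor datatypes `RollExp` and `RollDec` stand in for roll, the `sum` datatype stands in for binary sums, `var` is taken to be `string`, and the lower-case client functions are hypothetical names for the elaborated constructor calls.

```sml
type var = string   (* assumption: the slides leave var unspecified *)
datatype ('a, 'b) sum = Inl of 'a | Inr of 'b

signature EXPDEC = sig
  type exp
  type dec
  val exp_in  : (var, dec * exp) sum -> exp
  val exp_out : exp -> (var, dec * exp) sum
  val dec_in  : (var * exp, dec * dec) sum -> dec
  val dec_out : dec -> (var * exp, dec * dec) sum
end

(* Opaque sealing (:>) hides the implementation types from clients. *)
structure ExpDec :> EXPDEC = struct
  datatype exp = RollExp of (var, dec * exp) sum
       and dec = RollDec of (var * exp, dec * dec) sum
  fun exp_in x = RollExp x
  fun exp_out (RollExp x) = x
  fun dec_in x = RollDec x
  fun dec_out (RollDec x) = x
end

(* Elaborated constructor calls: the client injects into the sum,
   then calls the in function through the abstract interface. *)
fun varExp v = ExpDec.exp_in (Inl v)
fun letExp (d, e) = ExpDec.exp_in (Inr (d, e))
fun valDec (v, e) = ExpDec.dec_in (Inl (v, e))
fun seqDec (d1, d2) = ExpDec.dec_in (Inr (d1, d2))

(* "let val x = y in x", then read back the variable in the body. *)
val e = letExp (valDec ("x", varExp "y"), varExp "x")
val body =
  case ExpDec.exp_out e of
    Inr (_, b) => (case ExpDec.exp_out b of Inl v => v | Inr _ => "")
  | Inl _ => ""
```

Because `ExpDec.exp` is abstract outside the structure, a client cannot write `RollExp` directly; every construction goes through a function call, which is the cost the slide objects to.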

11 Inlining the Constructor Calls We would like to inline the roll’s to avoid calling the exp_in and dec_in functions: VarExp(v) ⇝ roll_{ExpDec.exp}(inj_1(v)) LetExp(d,e) ⇝ roll_{ExpDec.exp}(inj_2(d,e)) ValDec(v,e) ⇝ roll_{ExpDec.dec}(inj_1(v,e)) SeqDec(d1,d2) ⇝ roll_{ExpDec.dec}(inj_2(d1,d2)) But the definitions of exp and dec are not known outside of ExpDec, so inlining the roll’s is ill-typed!

12 Separate Compilation Not a problem if client of datatype defined in same compilation unit: –Unseal the datatype ⇒ roll’s become well-typed Is a problem if client of datatype is defined in separately compiled module: –Datatype is an abstract import of client –Can’t assume knowledge of implementation –Similar problem for datatypes in functor arguments

A Transparent Interpretation

14 Making Datatypes Transparent Expose the implementation of a datatype as a recursive sum type in its interface: signature EXPDEC = sig type exp = μ_1(α,β).(var + β * α, var * α + β * β) type dec = μ_2(α,β).(var + β * α, var * α + β * β) (* in and out function specs as before *) end Inlining calls to the in and out functions is now well-typed outside of ExpDec

15 Implications of Transparency Datatypes are no longer generative –Identically defined datatypes are “visibly” equal –More types are equivalent, more programs may typecheck Matching a datatype specification is harder –To match a datatype spec, a datatype must now be implemented as a particular recursive sum type –Depending on how you define recursive type equivalence, fewer programs may typecheck!

16 Transparent Matching Example struct datatype exp = VarExp of var | LetExp of dec * exp and dec = ValDec of var * exp | SeqDec of dec * dec end :> sig type exp datatype dec = ValDec of var * exp | SeqDec of dec * dec end ?

17 Transparent Matching Example struct type exp = μ_1(α,β).(var + β * α, var * α + β * β) type dec = μ_2(α,β).(var + β * α, var * α + β * β) end :> sig type exp datatype dec = ValDec of var * exp | SeqDec of dec * dec end ?

18 Transparent Matching Example struct type exp = μ_1(α,β).(var + β * α, var * α + β * β) type dec = μ_2(α,β).(var + β * α, var * α + β * β) end :> sig type exp type dec = μ_1(β).(var * exp + β * β) end ?

19 Transparent Matching Example struct type exp = μ_1(α,β).(var + β * α, var * α + β * β) type dec = μ_2(α,β).(var + β * α, var * α + β * β) end :> sig type exp type dec = μ_1(β).(var * exp + β * β) end ? = ?

20 Notation Use δ to stand for a recursive type, i.e.: δ ::= μ_k(α_1,...,α_n).(τ_1,...,τ_n) (k ∈ 1..n) Expansion of a recursive type: expand(δ) For example, if intlist = μα. 1 + int * α then expand(intlist) = 1 + int * intlist
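
The expansion operation can be sketched as a substitution over a small syntax of types. The datatype `ty`, its constructor names, and the helper `subst` are assumptions made for illustration; the sketch also assumes the recursive type being expanded is closed, so substitution cannot capture variables.

```sml
(* Hypothetical syntax of types; constructor names are illustrative. *)
datatype ty =
    Unit                                (* the type 1 *)
  | Base of string                      (* int, var, ... *)
  | TVar of string                      (* a bound type variable *)
  | Sum of ty * ty
  | Prod of ty * ty
  | Mu of int * string list * ty list   (* mu_k(a_1..a_n).(t_1..t_n) *)

(* Substitute for free variables; inner binders shadow outer ones. *)
fun subst env t =
  case t of
    TVar a => (case List.find (fn (b, _) => a = b) env of
                 SOME (_, u) => u
               | NONE => t)
  | Sum (l, r) => Sum (subst env l, subst env r)
  | Prod (l, r) => Prod (subst env l, subst env r)
  | Mu (k, vars, bodies) =>
      let
        fun bound b = List.exists (fn v => v = b) vars
        val env' = List.filter (fn (b, _) => not (bound b)) env
      in
        Mu (k, vars, map (subst env') bodies)
      end
  | _ => t

(* expand of mu_k: take body k and replace each a_i by mu_i. *)
fun expand (Mu (k, vars, bodies)) =
      let
        val mus = List.tabulate (length vars, fn i => Mu (i + 1, vars, bodies))
      in
        subst (ListPair.zip (vars, mus)) (List.nth (bodies, k - 1))
      end
  | expand t = t

(* The intlist example from the slide. *)
val intlist = Mu (1, ["a"], [Sum (Unit, Prod (Base "int", TVar "a"))])
val unrolled = expand intlist
```

Here `unrolled` is the representation of 1 + int * intlist, with the bound variable replaced by the whole recursive type.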

21 Iso-Recursive Types Iso-recursive equivalence is purely structural: – δ ≠ expand(δ), but the two are isomorphic –roll_δ : expand(δ) → δ –unroll_δ : δ → expand(δ) Works fine for H-S with abstract datatypes, but…

22 Transparent Matching Example struct type exp = μ_1(α,β).(var + β * α, var * α + β * β) type dec = μ_2(α,β).(var + β * α, var * α + β * β) end :> sig type exp type dec = μ_1(β).(var * exp + β * β) end ? X

23 Equi-Recursive Types Another form of recursive type equivalence: – δ = expand(δ) – μα.τ(α) represents unique solution of α = τ(α) – σ = μα.τ(α) iff σ = τ(σ) Equi-recursive equivalence is sufficient: –dec matches its specification –Enables transparent interpretation to accept all valid SML datatype matchings

24 Equi-Recursive Types Recall from the example: dec = μ_2(α,β).(var + β * α, var * α + β * β) and we need dec = μ_1(β).(var * exp + β * β) Suffices to show dec satisfies the fixed point equation: dec = var * exp + dec * dec Which follows from: dec = expand(dec) = var * μ_1(...) + μ_2(...) * μ_2(...) = var * exp + dec * dec
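
The matching question from the running example can be checked mechanically. The following is a naive coinductive equivalence test, not the hybrid algorithm of the paper: it unrolls a recursive type on either side, compares structurally, and treats any pair already under consideration as equal. The type representation is restated so the sketch runs on its own; all names are illustrative, and it assumes closed, contractive recursive types (no body is a bare variable).

```sml
datatype ty =
    Base of string
  | TVar of string
  | Sum of ty * ty
  | Prod of ty * ty
  | Mu of int * string list * ty list   (* mu_k(a_1..a_n).(t_1..t_n) *)

fun subst env t =
  case t of
    TVar a => (case List.find (fn (b, _) => a = b) env of
                 SOME (_, u) => u
               | NONE => t)
  | Sum (l, r) => Sum (subst env l, subst env r)
  | Prod (l, r) => Prod (subst env l, subst env r)
  | Mu (k, vars, bodies) =>
      let
        val env' =
          List.filter (fn (b, _) => not (List.exists (fn v => v = b) vars)) env
      in
        Mu (k, vars, map (subst env') bodies)
      end
  | _ => t

fun expand (Mu (k, vars, bodies)) =
      let
        val mus = List.tabulate (length vars, fn i => Mu (i + 1, vars, bodies))
      in
        subst (ListPair.zip (vars, mus)) (List.nth (bodies, k - 1))
      end
  | expand t = t

(* seen holds the pairs currently assumed equal along this path. *)
fun equiv seen (s, t) =
  s = t orelse List.exists (fn p => p = (s, t)) seen orelse
  let
    val seen' = (s, t) :: seen
  in
    case (s, t) of
      (Mu _, _) => equiv seen' (expand s, t)
    | (_, Mu _) => equiv seen' (s, expand t)
    | (Sum (a, b), Sum (c, d)) => equiv seen' (a, c) andalso equiv seen' (b, d)
    | (Prod (a, b), Prod (c, d)) => equiv seen' (a, c) andalso equiv seen' (b, d)
    | _ => false
  end

(* The exp/dec bundle and the specification of dec from the slides. *)
val bodies = [Sum (Base "var", Prod (TVar "b", TVar "a")),
              Sum (Prod (Base "var", TVar "a"), Prod (TVar "b", TVar "b"))]
val exp = Mu (1, ["a", "b"], bodies)
val dec = Mu (2, ["a", "b"], bodies)
val spec = Mu (1, ["b"], [Sum (Prod (Base "var", exp), Prod (TVar "b", TVar "b"))])

val isoEqual = (dec = spec)            (* purely structural comparison *)
val equiEqual = equiv [] (dec, spec)   (* comparison up to unrolling *)
```

Under these assumptions the structural comparison rejects the match while the unrolling comparison accepts it, mirroring the contrast between the iso-recursive and equi-recursive slides.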

25 A Hybrid Equivalence Equi-recursive equivalence is overkill: –Unnecessary to equate a recursive type with a non-recursive type (its expansion) Hybrid of iso- and equi-recursive equivalence: –Based on FLINT intermediate language [League and Shao] –Restriction of Amadio-Cardelli algorithm –Only equates μ’s with μ’s Paper gives details of the hybrid algorithm, along with formal argument that it is sufficient

26 Complications Strong versions of type equivalence not well studied outside simply typed λ-calculus. (TILT IL’s have higher-order constructors, singleton kinds…) Conflicts with SML semantics: –Datatypes no longer generative. –Problems involving datatypes in sharing and where type constraints. –To implement SML, must handle these issues another way.

The Coercion Interpretation

28 Those in and out Functions Recall the definitions given during elaboration: fun in(x) = roll_δ(x) fun out(x) = unroll_δ(x) Consider the roll and unroll operations. –Commonly implemented as “no-ops”. That is, the values v and roll(v) are represented the same. So, roll and unroll are just “retyping” operators, or coercions. –Untyped machine code for in / out same as for the identity function.

29 ExpDec Revisited signature EXPDEC = sig type exp type dec val exp_in : var + (dec * exp) ⇒ exp val exp_out : exp ⇒ var + (dec * exp) val dec_in : (var * exp) + (dec * dec) ⇒ dec val dec_out : dec ⇒ (var * exp) + (dec * dec) end With -> types, exp_in and exp_out act as the identity at runtime, but this cannot be recognized from the type. New type constructor: τ_1 ⇒ τ_2 – Inhabited only by coercive terms – Coerciveness of exp_in, exp_out reflected in type – Applications can be ignored at runtime

30 Coercions New constructs for the internal language: –Coercion values fold_δ / unfold_δ replace roll_δ / unroll_δ –Special type τ_1 ⇒ τ_2 distinguishes them from functions. –Special application syntax: v@e Define in/out using coercions: val in : expand(δ) ⇒ δ = fold_δ val out : δ ⇒ expand(δ) = unfold_δ Define constructor app’s using coercion app’s: VarExp(x) ⇝ ExpDec.exp_in@(inj_1(x))

31 Coercion Erasure Why are coercion applications better than function applications? Because: –A closed value of coercion type can only be fold or unfold. –No work is required at run time to apply either fold or unfold. –To compile v@e, generate the same code as for e. Safety argument (in the paper) –Formalized via a translation into an untyped target calculus.
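
The erasure idea can be illustrated on a toy term language. This is a sketch, not the translation formalized in the paper: the datatypes `term` and `uterm`, the constructor `CoApp` for a coercion application (written v@e here as an assumed notation), and the choice to erase a stand-alone coercion value to a placeholder `UId` are all assumptions made for illustration.

```sml
(* Typed source terms (types omitted for brevity). *)
datatype term =
    Var of string
  | Inj of int * term
  | Pair of term * term
  | App of term * term       (* ordinary function application *)
  | Fold                     (* coercion value fold *)
  | Unfold                   (* coercion value unfold *)
  | CoApp of term * term     (* coercion application v@e *)

(* Untyped target terms. *)
datatype uterm =
    UVar of string
  | UInj of int * uterm
  | UPair of uterm * uterm
  | UApp of uterm * uterm
  | UId                      (* placeholder for an erased coercion value *)

fun erase (Var x) = UVar x
  | erase (Inj (i, e)) = UInj (i, erase e)
  | erase (Pair (a, b)) = UPair (erase a, erase b)
  | erase (App (f, a)) = UApp (erase f, erase a)
  | erase (CoApp (_, e)) = erase e   (* same code as for e alone *)
  | erase Fold = UId
  | erase Unfold = UId

(* VarExp(v) elaborated with a function call and with a coercion. *)
val viaFunction = erase (App (Var "exp_in", Inj (1, Var "v")))
val viaCoercion = erase (CoApp (Var "exp_in", Inj (1, Var "v")))
```

The function version still contains a call after erasure, while the coercion version erases to the bare injection, which is the efficiency argument made on this slide.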

32 Performance Run times of benchmarks under 3 interpretations. Harper-Stone ≈ 37% slower than the others. Coercion interpretation about the same as transparent. Coercion interpretation is faithful to SML semantics, requires only simple extension to the type theory.

33 Conclusion Comparison table. Rows: Harper-Stone, Transparent, Coercion. Columns: Efficiency, Conformance to SML Semantics, Meta-theoretic Simplicity. (Cell marks not recoverable from the transcript.)

34 Transparent Interpretation –Remove all type abstraction – must recover datatype generativity and sharing constraints by other means. –New point in design space of recursive type equivalence. Coercion Interpretation –Preserve abstract semantics of datatypes. –Contribution: Coercion types may be generally useful.