Monads in Compilation. Nick Benton, Microsoft Research Cambridge.

Presentation transcript:

Outline: intermediate languages in compilation; traditional type and effect systems; monadic effect systems.

Compilation by Transformation: Source Language → (parse, typecheck, translate) → Intermediate Language (analyse, rewrite) → Backend IL → (generate code) → Target Language.

Compilation by Transformation: SML → MIL → BBC → JVM bytecode.

Compilation by Transformation: SML → CPS → MLRISC → Native code.

Compilation by Transformation: Haskell → Core → C → Native code.

Compilation by Transformation: Source Language → Intermediate Language → Backend IL → Target Language.

Transformations and Semantics (Source Language → Intermediate Language): Rewrites should preserve the semantics of the user's program, so they should be observational equivalences. Rewrites are applied locally, so they should be instances of an observational congruence relation.
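For reference, one common way to state the observational (contextual) equivalence that rewrites are expected to respect is sketched below. Taking termination as the only observation, and the notation C[·] for program contexts, are assumptions here rather than definitions taken from the slides.

```latex
\[
M \cong N
\quad\Longleftrightarrow\quad
\forall C[\cdot].\ \bigl( C[M] \Downarrow \iff C[N] \Downarrow \bigr)
\]
```

Because the definition quantifies over all contexts, a rewrite justified this way can be applied to any subterm, which is what makes local rewriting sound.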

Why Intermediate Languages? Couldn't we just rewrite on the original parse tree? Issues: Complexity; Level; Uniformity, Expressivity, Explicitness.

Complexity: pattern-matching; multiple binding forms (val, fun, local, …); equality types, overloading; datatype and record labels; scoped type definitions; …

Level: Multiple arguments. Holes:
    fun map f l = if null l then nil else cons (f (hd l), map f (tl l))
becomes
    fun map f l =
      let fun mp r xs =
                if null xs then *r = []
                else let val c = cons(f (hd xs), -)
                     in *r = c; mp &(c.tl) (tl xs) end
          val h = newhole()
      in mp &h l; *h end

Uniformity, Expressivity, Explicitness: Replace multiple source language concepts with unifying ones in the IL, e.g. polymorphism + modules => Fω. For rewriting, want a “good” equational theory. Need to be able to express rewrites in the first place, and want them to be local. Make explicit in the IL information which is implicit in (derived from) the source.

Trivial example: naming intermediate values
    let val x=((3,4),5) in (#1 x, #1 x) end
    => (#1 ((3,4),5), #1 ((3,4),5))
    => ((3,4),(3,4))
Urk!

Trivial example: naming intermediate values
    let val x=((3,4),5) in (#1 x, #1 x) end
    => let val y = (3,4) val x = (y,5) val w = #1 x val z = #1 x in (w,z) end
    => let val y = (3,4) val x = (y,5) val w = y val z = y in (w,z) end
    => let val y = (3,4) in (y,y) end
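As a quick sanity check on the rewrite sequence above, the following Standard ML sketch evaluates each stage and compares the results. The binding names (`original`, `named`, `projected`, `shared`, `allEqual`) are illustrative and do not come from the slides.

```sml
(* Each binding is one stage of the rewrite sequence. *)
val original =
  let val x = ((3,4),5) in (#1 x, #1 x) end

val named =
  let val y = (3,4)
      val x = (y,5)
      val w = #1 x
      val z = #1 x
  in (w,z) end

val projected =
  let val y = (3,4)
      val x = (y,5)
      val w = y
      val z = y
  in (w,z) end

val shared =
  let val y = (3,4) in (y,y) end

(* All four stages denote the same value. *)
val allEqual =
  original = named andalso named = projected andalso projected = shared
```

Equal results do not show the cost difference: only the named forms make it evident that the pair (3,4) needs to be built once.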

MIL’s try-catch-in construct: Rewrites on ML handle are tricky, e.g. (M handle E => N) P ≠ (M P) handle E => (N P). Introduce a new construct: try x=M catch E=>N in Q. Then: (try x=M catch E=>N in Q) P = try x=M catch E=>(N P) in (Q P).
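The difference between the two handle forms can be observed directly in Standard ML. The sketch below is only an illustration: `observe`, `outcome` and `tryIn` are hypothetical helpers, and `tryIn` is an assumed encoding of try-catch-in (only M is protected by the handler), not MIL's actual definition.

```sml
exception E

(* A value of function type: evaluating f raises nothing; applying it raises E. *)
val f : unit -> int = fn () => raise E
val n : unit -> int = fn () => 0

(* Run a computation and report whether E escaped from it. *)
fun observe (thunk : unit -> int) : string =
  Int.toString (thunk ()) handle E => "uncaught E"

(* (M handle E => N) P : the handler does not cover the application. *)
val handleThenApply = observe (fn () => (f handle E => n) ())

(* (M P) handle E => (N P) : the handler now covers the application. *)
val applyThenHandle = observe (fn () => (f () handle E => n ()))

(* Assumed encoding of  try x=M catch ... in Q : only M is protected. *)
datatype 'a outcome = Val of 'a | Exn of exn

fun tryIn (m : unit -> 'a) (h : exn -> 'b) (q : 'a -> 'b) : 'b =
  case (Val (m ()) handle e => Exn e) of
      Val x => q x
    | Exn e => h e

(* (try x=M catch E=>N in x) P *)
val tryThenApply =
  observe (fn () => (tryIn (fn () => f) (fn E => n | e => raise e) (fn x => x)) ())

(* try x=M catch E=>(N P) in (x P) *)
val applyInsideTry =
  observe (fn () => tryIn (fn () => f) (fn E => n () | e => raise e) (fn x => x ()))
```

With this choice of M, N and P the two handle forms disagree, while the two try-catch-in forms agree, since the application stays outside the scope of the handler on both sides.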

Continuation Passing Style: Some compilers (SML/NJ, Orbit) use CPS as an intermediate language. CBV and CBN translations into CPS. Unrestricted βη is valid on CPS (rather than just βv and ηv), and proves more equations (Plotkin). Evaluation order is explicit, tail-call elimination is just β, useful with call/cc.
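A small hand-written illustration of the style, in Standard ML. This is a sketch, not the output of any particular compiler's CPS translation, and the names `fact` and `factK` are illustrative.

```sml
(* Direct style: the multiplication happens after the recursive call returns. *)
fun fact n = if n = 0 then 1 else n * fact (n - 1)

(* CPS: the continuation k says what to do with the result,
   so every call is a tail call and evaluation order is explicit. *)
fun factK n k =
  if n = 0 then k 1
  else factK (n - 1) (fn r => k (n * r))

val direct = fact 5
val viaCps = factK 5 (fn r => r)
```

Passing the identity continuation recovers the direct-style result.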

CPS: But “administrative redexes”, undoing of CPS in backend. Flanagan et al. showed the same results could be achieved for CBV by adding let and performing A-reductions: E[if V then M else N] → if V then E[M] else E[N]
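The A-reduction above can be seen on a small Standard ML example, where the evaluation context is "apply f to the result". This is a hand-normalised sketch; `f`, `g`, `h`, `source` and `anf` are illustrative names.

```sml
fun f x = x + 1
fun g x = x * 2
fun h x = x - 3

(* Source: the conditional sits inside the context  f [ ]. *)
fun source x = f (if x > 0 then g x else h x)

(* After naming intermediate values and pushing the context into
   both branches, as in  E[if V then M else N] -> if V then E[M] else E[N]. *)
fun anf x =
  let val t = x > 0
  in if t
     then let val a = g x in f a end
     else let val b = h x in f b end
  end

val agree = List.all (fn x => source x = anf x) [~2, 0, 4]
```

The context is duplicated into the branches, which is why real compilers typically bound its size or bind it to a function first.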

Typed Intermediate Languages: Pros. Type-based analysis and representation choices; backend: GC, registers; find compiler bugs; reflection; typed target languages.

Typed Intermediate Languages: Cons. Type information can easily be bigger than the actual program, hence clever tricks are required for efficiency of the compiler. Insisting on typeability can inhibit transformations. Type systems for low-level representations (closures, holes) can be complex.

ML_T as a Typed Intermediate Language: Benton 92 (strictness-based optimisations); Danvy and Hatcliff 94 (relation with CPS and A-normal form); Peyton Jones et al. 98 (common intermediate language for ML and Haskell); Barthe et al. 98 (computational types in PTS).

Combining Polymorphism and Imperative Programming. The following program clearly “goes wrong”:
    let val r = ref (fn x=>x) in (r := (fn n=>n+1); !r true) end
But it seems to be well-typed:
    fn x=>x : α → α
    ref (fn x=>x) : (α → α) ref
    r : ∀α. (α → α) ref (generalized at the let)
    in r := (fn n=>n+1): fn n=>n+1 : int → int and r : (int → int) ref
    in !r true: r : (bool → bool) ref, so !r : bool → bool and !r true : bool

Solution: Restrict Generalization. Type and Effect Systems: Gifford, Lucassen, Jouvelot, Talpin, … Imperative Type Discipline: Tofte (SML’90). Dangerous Type Variables: Leroy and Weis.

Type and Effect Systems:
    let val r = ref (fn x=>x) in (r := (fn n=>n+1); !r true) end
For ref (fn x=>x): Type = (α → α) ref, Effect = “creates an α → α ref”.
No generalization: r : (α → α) ref.
Unify at the assignment: fn n=>n+1 : int → int, so r : (int → int) ref. Now Type = (int → int) ref, Effect = “creates an int → int ref”.
In !r true: !r : int → int is applied to true : bool. Error!

 All very clever, but… Wright (1995) looked at lots of SML code and concluded that nearly all of it would still typecheck and run correctly if generalization were restricted to syntactic values. This value restriction was adopted for SML97. Imperative type variables were “an example of premature optimization in language design”.
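The value restriction can be seen on a small Standard ML example. This is a sketch with illustrative names (`id`, `usedAtTwoTypes`, `r`, `three`); the rejected program appears only in a comment so that the block still compiles.

```sml
(* fn x => x is a syntactic value, so id is generalized to 'a -> 'a. *)
val id = fn x => x
val usedAtTwoTypes = (id 1, id true)

(* ref (fn x => x) is an application, not a syntactic value, so its type
   is not generalized. Used at a single type it is accepted: *)
val r : (int -> int) ref = ref (fn x => x)
val () = r := (fn n => n + 1)
val three = !r 2

(* The program from the earlier slides is rejected by an SML97 type checker:
     let val r = ref (fn x=>x) in (r := (fn n=>n+1); !r true) end        *)
```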

Despite that… Compilers for impure languages still have good reason for inferring static approximations to the set of side effects which an expression may have: let val x = M in N end, where x not in FV(N), is observationally equivalent to N if M doesn’t diverge, perform IO, update the state or throw an exception.
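A minimal Standard ML sketch of why the side condition matters. The names (`counter`, `withEffectfulBinding`, `withPureBinding`, `withoutBinding`) are illustrative; the observable difference is the final value of the counter.

```sml
val counter = ref 0

(* M updates the state, so the unused binding cannot be dropped. *)
fun withEffectfulBinding () =
  let val x = (counter := !counter + 1; 5) in 7 end

(* M is pure and terminating, so dropping the binding is unobservable. *)
fun withPureBinding () =
  let val x = 2 + 3 in 7 end

(* The body N on its own. *)
fun withoutBinding () = 7

val a = withEffectfulBinding ()
val countAfterEffectful = !counter
val b = withPureBinding ()
val c = withoutBinding ()
val countAtEnd = !counter
```

All three functions return the same value, but only the first one changes the store, so only the second can be rewritten to the third.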

“Classic” Type and Effect Systems: Judgements. A judgement assigns a type to each variable in the context, and a type and an effect to the term. Variables don’t have effect annotations because we’re only considering CBV, which means they’ll always be bound to values.

“Classic” Type and Effect Systems: Basic bits. Rule annotations: no effect; effect sequence, typically ∪ again; effect join (union).

“Classic” Type and Effect Systems: Functions. Abstraction is a value, so no effect. The effect of the body becomes the “latent effect” of the function. The “latent effect” is unleashed in application.

“Classic” Type and Effect Systems: Subeffecting. Typically just inclusion on sets of effects. Can further improve precision by adding more general subtyping or effect polymorphism.
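The rules themselves did not survive in this transcript. The following is a sketch of a textbook-style formulation matching the descriptions above; the exact notation, and the choice of ∪ for sequencing, are assumptions and may differ from the slides.

```latex
% Judgement form: \Gamma \vdash M : A,\ \varepsilon
\[
\frac{}{\Gamma, x{:}A \vdash x : A,\ \emptyset}
\qquad
\frac{\Gamma \vdash M : A,\ \varepsilon_1 \quad \Gamma, x{:}A \vdash N : B,\ \varepsilon_2}
     {\Gamma \vdash \mathtt{let}\ x = M\ \mathtt{in}\ N : B,\ \varepsilon_1 \cup \varepsilon_2}
\]
\[
\frac{\Gamma \vdash M : \mathtt{bool},\ \varepsilon \quad
      \Gamma \vdash N_1 : A,\ \varepsilon_1 \quad
      \Gamma \vdash N_2 : A,\ \varepsilon_2}
     {\Gamma \vdash \mathtt{if}\ M\ \mathtt{then}\ N_1\ \mathtt{else}\ N_2 : A,\ \varepsilon \cup \varepsilon_1 \cup \varepsilon_2}
\]
\[
\frac{\Gamma, x{:}A \vdash M : B,\ \varepsilon}
     {\Gamma \vdash \lambda x.\,M : A \xrightarrow{\varepsilon} B,\ \emptyset}
\qquad
\frac{\Gamma \vdash M : A \xrightarrow{\varepsilon} B,\ \varepsilon_1 \quad \Gamma \vdash N : A,\ \varepsilon_2}
     {\Gamma \vdash M\,N : B,\ \varepsilon_1 \cup \varepsilon_2 \cup \varepsilon}
\]
\[
\frac{\Gamma \vdash M : A,\ \varepsilon \quad \varepsilon \subseteq \varepsilon'}
     {\Gamma \vdash M : A,\ \varepsilon'}
\]
```

The abstraction rule records the body's effect on the arrow and reports no effect itself; the application rule adds that latent effect to the effects of evaluating the function and the argument.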

“Classic” Type and Effect Systems: Regions 1. Want the rewrite
    (let x=!r; y=!r in M) = (let x=!r in M[x/y])
in
    fn (r:int ref, s:int ref) => let x = !r; _ = s := 1; y = !r in M end
The three bindings have effects read, write, read. Can’t commute the middle command with either of the other two to enable the rewrite. Quite right too! r and s might be aliased.
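The aliasing problem is easy to reproduce in Standard ML. In this sketch the body M is instantiated to the pair (x, y) so the two reads can be observed; `readWriteRead` and `readOnce` are illustrative names.

```sml
(* The function from the slide, with M instantiated to (x, y). *)
fun readWriteRead (r : int ref, s : int ref) =
  let val x = !r
      val _ = s := 1
      val y = !r
  in (x, y) end

(* The rewritten form: the second read is replaced by the first. *)
fun readOnce (r : int ref, s : int ref) =
  let val x = !r
      val _ = s := 1
  in (x, x) end

(* r and s aliased: the rewrite changes the result. *)
val cell1 = ref 0
val aliased = readWriteRead (cell1, cell1)
val cell2 = ref 0
val aliasedRewritten = readOnce (cell2, cell2)

(* r and s distinct: the rewrite is unobservable. *)
val distinct = readWriteRead (ref 0, ref 0)
val distinctRewritten = readOnce (ref 0, ref 0)
```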

“Classic” Type and Effect Systems: Regions 2. What if we had different colours of reference?
    fn (r:int ref, s:int ref) => let x = !r; _ = s := 1; y = !r in M end
The three bindings have effects read, write, read. Can commute a reading computation with a writing one when they have different colours. The type system ensures we can only assign r and s different colours if they cannot alias.

“Classic” Type and Effect Systems: Regions 3. Colours are called regions, used to index types and effects:
    A ::= int | ref(A, ρ) | A →^ε B | α | …
    ε ::= rd(A, ρ) | wr(A, ρ) | al(A, ρ) | ε ∪ ε | ∅ | e | …

“Classic” Type and Effect Systems: Regions 4. Neat thing about regions is effect masking. It improves accuracy, and is also used for region-based memory management in the ML Kit compiler (Tofte, Talpin).
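The masking rule is not preserved in this transcript. A sketch of one usual formulation follows: if a region ρ appears neither in the context nor in the result type, effects on ρ cannot be observed from outside and may be dropped. The notation fr(·) for free regions is an assumption, as is the exact shape of the rule.

```latex
\[
\frac{\Gamma \vdash M : A,\ \varepsilon
      \qquad
      \rho \notin \mathrm{fr}(\Gamma) \cup \mathrm{fr}(A)}
     {\Gamma \vdash M : A,\ \
      \varepsilon \setminus \{\, \mathrm{rd}(B,\rho),\ \mathrm{wr}(B,\rho),\ \mathrm{al}(B,\rho) \mid B \text{ a type} \,\}}
\]
```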

Monads and Effect Systems:
    Γ ⊢ M : A, where A ::= … | A → B
    Effect inference: Γ ⊢ M : A, ε, where A ::= … | A →^ε B
    CBV translate: Γ^v ⊢ M^v : T A^v, where A^v ::= … | A^v → T B^v

Monads and Effect Systems: Wadler ICFP 1998. Soundness by instrumented semantics and subject reduction.

Monads and Effect Systems: Tolmach TIC 1998. Four monads in linear order: ID ≤ LIFT ≤ EXN ≤ ST. ID: identity. LIFT: nontermination. EXN: exceptions and nontermination. ST: stream output, exceptions and nontermination.

Monads and Effect Systems: Tolmach TIC 1998. Language has explicit coercions between monadic types. Denotational semantics with coercions interpreted by monad morphisms. Emphasis on equations for compilation by transformation.

Monads and Effect Systems: Benton, Kennedy ICFP 1998, HOOTS 1999. The MLj compiler uses MIL (Monadic Intermediate Language) for effect analysis and transformation. MIL-lite is a simplified fragment of MIL about which we can prove some theorems. Still not entirely trivial…

MIL-lite types: value types; computation types; effect annotations (nontermination, reading refs, writing refs, allocating refs, raising particular exceptions); values to computations.

 MIL-lite subtyping

MIL-lite terms 1: Like types, terms are stratified into values and computations. Terms of value types are actually in normal form. (Could allow non-canonical values but this is simpler, if less elegant.)

MIL-lite terms 2: Recursion only at function type because CBV. Very crude termination analysis. Allows lambda abstraction to be defined as syntactic sugar and does the right thing for curried recursive functions.

 MIL-lite terms 3

MIL-lite terms 4: H is shorthand for a set of handlers {E_i ↦ P_i}. try-catch-in generalises handle and monadic let. There’s a more accurate version of this rule. Effect union localised here.

MIL-lite semantics 1: Computations evaluate to values.

 MIL-lite semantics 2

Transforming MIL-lite: Now want to prove that the transformations performed by MLj are contextual equivalences. Giving a sufficiently abstract denotational semantics is jolly difficult (it’s the fresh names, not the monads per se, that make it complex). So we used operational techniques in the style of Pitts.

ciu equivalence: Reformulate the operational semantics using a structurally inductive termination relation. Use that to prove various things, including that contextual equivalence coincides with ciu equivalence ≈, where M1 ≈ M2 iff for all …, H, N …

Semantics of effects: Could use an instrumented operational semantics to prove soundness of the analysis. But that feels too intensional: it ties the meaning of effects and the justification of transformations to the formal system used to infer effect information. For example, having a trace free of writes versus leaving the store observationally unchanged.

Semantics of effects: Instead, define the meaning of each type by a set of termination tests defined in the language.

Definition of Tests

Tests and fundamental theorem. At value types it’s just a logical predicate: Fundamental theorem:

Effect-independent Equivalences

Effect-dependent equivalences 1

 Effect-dependent equivalences 2