Published by Paulina McNamara. Modified over 10 years ago.
1
Type inference in type-based verification
Dimitrios Vytiniotis, Microsoft Research
dimitris@microsoft.com
May 2010
2
Software is hard to get right*
Which tools can help programmers write reliable code?
How can we make these tools more practical and effective to use?
This talk: making programming language types more practical and effective.
* Toyota recalls 2010 models due to faulty software in the brakes. Upgrade your Prius!
3
Programming language types
Why invest in types?
A demonstrably simple technology that can eliminate lots of bugs.
[Figure: verification approaches plotted by complexity against # of bugs: model-driven development, development with proof assistants, verification condition generation and constraint solving, model checking. One point is marked "this talk".]
Other benefits:
1. Integrated verification and development
2. Early error detection
3. Static checks mean fast runtime code
4. Forces you to think about documentation
5. Modular development
6. They scale
4
A brief (hi)story of type expressivity
[Timeline, 1970 to 2015: simple types; Hindley-Milner (ML, Haskell, F#); GADTs, first-class polymorphism, dependent types, type families, type classes, …; OutsideIn(X).]
Sections: The context; My work on expressive types; The future.
Papers: ICFP 2006, ICFP 2009, ICFP 2006, JFP 2007, ICFP 2008, ML 2009, TLDI 2010. NEW: JFP submission.
Example types: inc :: Int -> Int and map :: (a -> b) -> [a] -> [b]
5
A brief (hi)story of type expressivity
The context: keeping types practical.
[Timeline, 1970 to 2015: simple types; Hindley-Milner (ML, Haskell, F#); GADTs, first-class polymorphism, dependent types, type families, type classes, …; OutsideIn(X).]
Sections: The context; My work on expressive types; The future.
Papers: ICFP 2006, ICFP 2009, ICFP 2006, JFP 2007, ICFP 2008, ML 2009, TLDI 2010. NEW: JFP submission.
6
Types express properties
The same value at increasingly coarse types (axis on the slide: # of bugs):
    [1,2,3,4] :: { l :: List Int where forall i < length(l), l[i] <= 4 }
    [1,2,3,4] :: ListWithLength 4 Int
    [1,2,3,4] :: List NONEMPTY Int
    [1,2,3,4] :: List Int
    [1,2,3,4] :: IntList
    [1,2,3,4] :: Object
Hindley-Milner [Hindley, Damas & Milner]: Haskell, ML, F#, also Java, C#, …
Our goal: increase expressivity … but keep the complexity low.
7
Keeping type annotation cost low
How to convince the type checker that programs are well-typed?
Full type checking: explicit types everywhere. Many traditional languages:
    Int inc(Int x) = x+1
    StringBuilder sb = new StringBuilder(256);
In between:
    var sb = new StringBuilder(256);
Full type inference: no user annotations at all. Hindley-Milner:
    inc x = x+1
Full type inference is extremely convenient [no type-induced pain]:
    map f list = case list of
      nil    -> nil
      h:tail -> cons (f h) (map f tail)
The same function with explicit annotations:
    map (f :: S -> T) (list :: [S]) = case list of
      nil    -> nil
      h:tail -> cons (f h) (map f tail)
Increased expressivity requires more checking.
8
Keeping types predictable
With simple, robust, declarative typing rules:
    e :: s -> t     u :: s
    ----------------------
          e u :: t
Behaviour to avoid:
    test1 = … p1 + p2 …   -- ACCEPTED
    test2 = … p2 + p1 …   -- REJECTED
And theorems that connect typing rules to low-level algorithms:
    t <- infer e
    s <- infer u
    α <- fresh
    solve(t = s -> α)
    return α
Behaviour to avoid:
    test1 = p                   -- ACCEPTED
    test2 = let f x = x in f p  -- REJECTED
Hindley-Milner scores perfectly here.
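The application rule and the five-line algorithm above can be sketched as a tiny inference function. This is only an illustration under stated assumptions: the names `Type`, `Expr`, `unify`, `infer` and `typeOf` are hypothetical, types are monomorphic (no let-generalization), and `solve` is played by a plain first-order unifier over an association list.

```haskell
-- Monomorphic types with unification variables (hypothetical encoding).
data Type = TInt | TVar Int | Type :-> Type deriving (Eq, Show)
infixr 5 :->

data Expr = Lit Int | Var String | Lam String Expr | App Expr Expr

type Subst = [(Int, Type)]     -- solved unification variables
type Env   = [(String, Type)]  -- types of variables in scope

-- Replace solved variables by their solutions.
apply :: Subst -> Type -> Type
apply s (TVar v)  = maybe (TVar v) (apply s) (lookup v s)
apply s (a :-> b) = apply s a :-> apply s b
apply _ TInt      = TInt

occurs :: Int -> Type -> Bool
occurs v (TVar w)  = v == w
occurs v (a :-> b) = occurs v a || occurs v b
occurs _ TInt      = False

-- solve(t1 = t2): extend the substitution, or fail with Nothing.
unify :: Type -> Type -> Subst -> Maybe Subst
unify t1 t2 s = go (apply s t1) (apply s t2)
  where
    go TInt TInt           = Just s
    go (TVar v) t          = bind v t
    go t (TVar v)          = bind v t
    go (a :-> b) (c :-> d) = unify a c s >>= unify b d
    go _ _                 = Nothing
    bind v t
      | t == TVar v = Just s
      | occurs v t  = Nothing          -- occurs check
      | otherwise   = Just ((v, t) : s)

-- The state is (next fresh variable, current substitution).
infer :: Env -> Expr -> (Int, Subst) -> Maybe (Type, (Int, Subst))
infer _   (Lit _)   st = Just (TInt, st)
infer env (Var x)   st = do
  t <- lookup x env
  Just (t, st)
infer env (Lam x b) (n, sub) = do
  let a = TVar n
  (t, st') <- infer ((x, a) : env) b (n + 1, sub)
  Just (a :-> t, st')
infer env (App e u) st0 = do
  (t, st1)      <- infer env e st0        -- t <- infer e
  (s, (n, sub)) <- infer env u st1        -- s <- infer u
  let alpha = TVar n                      -- α <- fresh
  sub' <- unify t (s :-> alpha) sub       -- solve(t = s -> α)
  Just (alpha, (n + 1, sub'))             -- return α

-- Infer a closed type in an environment that only knows inc :: Int -> Int.
typeOf :: Expr -> Maybe Type
typeOf e = do
  (t, (_, s)) <- infer [("inc", TInt :-> TInt)] e (0, [])
  Just (apply s t)
```

For instance, `typeOf (App (Var "inc") (Lit 1))` yields `Just TInt`, while applying a literal to a literal yields `Nothing`.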
9
A brief (hi)story of type expressivity
[Timeline, 1970 to 2015: simple types; Hindley-Milner (ML, Haskell, F#); GADTs, first-class polymorphism, dependent types, type families, type classes, …; OutsideIn(X).]
Sections: The context; My work on expressive types; The future.
Papers: ICFP 2006, ICFP 2009, ICFP 2006, JFP 2007, ICFP 2008, ML 2009, TLDI 2010. NEW: JFP submission.
Simple, predictable. No user annotations. Low expressivity.
How to implement GADTs:
1. What are GADTs
2. Why they are difficult for type inference
3. Inference vs checking [ICFP 2006]
4. Simplifying and reducing annotations [ICFP 2009]
10
GADTs in Glasgow Haskell Compiler (GHC)
    -- An Algebraic Datatype: Integer Lists
    data IList where
      Nil  :: IList
      Cons :: Int -> IList -> IList

    -- A Generalized Algebraic Datatype (GADT)
    data IList f where
      Nil  :: IList EMPTY
      Cons :: Int -> IList f -> IList NONEMPTY

    x = Cons 1 (Cons 2 Nil)
    head :: IList NONEMPTY -> Int
    test0 = head x      -- type checker knows x :: IList NONEMPTY
    test0 = head Nil    -- REJECTED!
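A self-contained version of the slide's GADT that loads in GHC might look as follows. It is a sketch: the empty index types `EMPTY` and `NONEMPTY` are declared here as an assumption (the slide does not show them), and the function is named `safeHead` to avoid clashing with the Prelude's `head`.

```haskell
{-# LANGUAGE GADTs #-}

-- Index types with no values; they only label lists at the type level.
data EMPTY
data NONEMPTY

data IList f where
  Nil  :: IList EMPTY
  Cons :: Int -> IList f -> IList NONEMPTY

-- Only non-empty lists are accepted, so no Nil case is needed.
safeHead :: IList NONEMPTY -> Int
safeHead (Cons h _) = h

x :: IList NONEMPTY
x = Cons 1 (Cons 2 Nil)

test0 :: Int
test0 = safeHead x

-- Uncommenting the next line is a compile-time type error,
-- because Nil :: IList EMPTY does not match IList NONEMPTY:
-- bad = safeHead Nil
```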
11
Uses of GADTs
Compiler enforces invariants via type checking:
    tail :: ListWithLength (S n) -> ListWithLength n
    compile :: Term SOURCE -> Maybe (Term TARGET)
Significant number of research papers [Cheney & Hinze, Xi, Pottier & Simonet, Pottier & Régis-Gianas, Sulzmann & Stuckey, …]
Verified compiler transformations, data structure implementations, reflection & generic programming, …
Such a cool feature that people are using GADT-inspired tricks in other languages! For example, C. Russo and A. Kennedy have a C# encoding.
12
Example: evaluation of embedded DSL
A common example, also appearing in [Peyton Jones, Vytiniotis, Weirich, Washburn, ICFP 2006]
A non-GADT representation:
    data Term where
      ILit   :: Int -> Term
      And    :: Term -> Term -> Term
      IsZero :: Term -> Term
      ...
    data Val where
      IVal :: Int -> Val
      BVal :: Bool -> Val
    eval :: Term -> Val
    eval (ILit i)    = IVal i
    eval (And t1 t2) = case eval t1 of
      IVal _  -> error "ill-typed term"
      BVal b1 -> case eval t2 of
        IVal _  -> error "ill-typed term"
        BVal b2 -> BVal (b1 && b2)
    ...
    -- Accepted by the type checker, but fails at run time:
    f = eval (And (ILit 3) (IsZero (ILit 0)))
A GADT representation:
    data Term a where
      ILit   :: Int -> Term Int
      And    :: Term Bool -> Term Bool -> Term Bool
      IsZero :: Term Int -> Term Bool
      ...
    eval :: Term a -> a
    eval (ILit i)    = i
    eval (And t1 t2) = eval t1 && eval t2
    ...
Represents only correct terms. Tagless evaluation: efficient code.
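The GADT representation can be completed into a small program that GHC accepts. This is a sketch: the slide elides the remaining cases, so the `IsZero` clause of `eval` and the sample term `sample` are assumptions added for illustration.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  ILit   :: Int -> Term Int
  And    :: Term Bool -> Term Bool -> Term Bool
  IsZero :: Term Int -> Term Bool

-- Tagless evaluator: the result type follows the term's index,
-- so no Val wrapper and no run-time tag checks are needed.
eval :: Term a -> a
eval (ILit i)    = i
eval (And t1 t2) = eval t1 && eval t2
eval (IsZero t)  = eval t == 0   -- assumed meaning of IsZero

sample :: Bool
sample = eval (And (IsZero (ILit 0)) (IsZero (ILit 3)))

-- The ill-formed term from the non-GADT version no longer type checks:
-- bad = eval (And (ILit 3) (IsZero (ILit 0)))
```

Here `sample` evaluates `True && False`, and the commented-out term is rejected because `ILit 3 :: Term Int` is not a `Term Bool`.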
13
Type checking and GADTs
Possible with the help of programmer annotations.
    data Term a where
      ILit :: Int -> Term Int
    eval :: Term a -> a
    eval (ILit i) = i
    eval _        = …
In the signature, Term a determines the term we analyze and the final a determines the result.
Pattern matching introduces type equalities, available after the =. In the first branch we learn that a ~ Int, and i :: Int.
Right-hand side: we must return type a. That's fine because we know that (a ~ Int) from pattern matching.
14
Type inference and GADTs
Haskell programmers omit type signatures.
    data Term a where
      ILit :: Int -> Term Int
      ...
    -- Get a list of literals in this term
    getILit (ILit i) = [i]
    getILit _        = []
Here is a possible type of getILit: Term a -> [Int]
But if (a ~ Int) is used then there is also another one: Term a -> [a]
BAD!
15
A threat for modularity
Two different "specifications" for getILit:
    btrm :: Term Bool
    f1 = (getILit btrm) ++ [0]      -- works only with: Term a -> [Int]
    f2 = (getILit btrm) ++ [True]   -- works only with: Term a -> [a]
And this one?
    test = let getILit (ILit i) = [i]
               getILit _        = []
           in ...
We want to have a unique principal type that we infer once and use throughout the scope of the function.
16
Separating checking and inference [ICFP 2006]
S. Peyton Jones, D. Vytiniotis, G. Washburn, S. Weirich
Not all programs have principal types, so use annotations to let programmers decide.
No annotation: do not use GADT equalities.
    getILit (ILit i) = [i]   -- inferred: (Term a -> [Int])
To use the other type, supply an annotation:
    getILit :: Term a -> [a]
    getILit (ILit i) = [i]
Annotations determine two interleaved modes of operation: checking mode and inference mode.
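The two incomparable types of getILit can both be checked when written as explicit annotations. In this sketch the two copies are given the hypothetical names `getILitInt` and `getILitAny`, and `btrm` is an arbitrary boolean term chosen for illustration.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  ILit   :: Int -> Term Int
  IsZero :: Term Int -> Term Bool

-- First specification: ignores the equality a ~ Int.
getILitInt :: Term a -> [Int]
getILitInt (ILit i) = [i]
getILitInt _        = []

-- Second specification: uses a ~ Int, learned by matching on ILit.
getILitAny :: Term a -> [a]
getILitAny (ILit i) = [i]
getILitAny _        = []

btrm :: Term Bool
btrm = IsZero (ILit 0)

f1 :: [Int]
f1 = getILitInt btrm ++ [0]      -- needs Term a -> [Int]

f2 :: [Bool]
f2 = getILitAny btrm ++ [True]   -- needs Term a -> [a]
```

Neither signature is an instance of the other, which is why the annotation, not the inference engine, has to pick one.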
17
Discovering a complete implementation
Predictability mandates high-level declarative typing rules.
Needed to design a type system and a sound and complete algorithm.
That turned out to be possible because:
1. Typing rules [and algorithm] can "switch" mode when they meet annotations
2. The GADT checking problem is easy
3. All non-GADT branches are typed as in Hindley-Milner
Theorem: There exists a provably decidable, sound and complete algorithm for the [ICFP 2006] type system.
The first work on type inference and GADTs to achieve this.
This is what GHC has implemented since 2006. Extremely effective and popular: http://darcs.net, commercial users, …
18
[ICFP 2006] was a breakthrough but …
To reduce required annotations it used some ad-hoc annotation propagation.
    opt :: Term b -> Term b

    -- typechecks (quite remarkable)
    eval :: Term a -> a
    eval x = case opt x of
      ILit i -> i

    -- fails, because there is no type annotation for f
    eval :: Term a -> a
    eval x = let f x = x
             in case f (opt x) of
                  ILit i -> i
Quite remarkable, BUT what about predictability?
How to improve this?
19
The Outside-In solution
Schrijvers, Sulzmann, Peyton Jones, Vytiniotis [ICFP 2009]
Perform full inference outside a GADT branch first, and then use what you learnt to go inside the branch.
Very aggressive type information discovery + a simpler "Outside-In" type system.
    eval :: Term a -> a
    eval x = let f x = x
             in case f (opt x) of
                  ILit i -> i
Working on the outside of the branch first determines that f (opt x) :: Term a
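The program above can be tried directly in GHC, whose current inference engine descends from this line of work. The sketch below rests on assumptions: `opt` is a placeholder identity function (the slide gives only its type), the local binder is renamed to `y` to avoid shadowing, and an `IsZero` constructor is added so the match is exhaustive.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  ILit   :: Int -> Term Int
  IsZero :: Term Int -> Term Bool

-- Placeholder optimizer: only its type matters for the example.
opt :: Term b -> Term b
opt t = t

eval :: Term a -> a
eval x =
  let f y = y                 -- local helper without a type annotation
  in case f (opt x) of        -- scrutinee type Term a is found outside the branch
       ILit i   -> i          -- inside the branch, a ~ Int is available
       IsZero t -> eval t == 0
```

As far as I can tell, GHC accepts this without an annotation on `f`, because the type of the scrutinee is settled before the GADT branches are examined.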
20
Simplifying and reducing annotations [ICFP 2009]
Fewer annotations needed. Predictability. Modularity.
Theorem: There exists a provably decidable, sound and complete algorithm for the "Outside-In" type system in [ICFP 2009].
[Figure: nested sets: all type-safe programs; all programs with principal types; "Outside-In" type system.]
Forthcoming implementation in GHC, invited paper in special issue of JFP.
"the system of this paper is the simplest proposal ever made to solve type inference for GADTs" [anonymous reviewer]
21
Inferring principal types in [ICFP 2009]
    data Term a where
      ILit :: Int -> Term Int
      If   :: Term Bool -> Term a -> Term a -> Term a

    -- Get the least number in this term
    findLeast (ILit i) = i
    findLeast (If cond t1 t2) =
      let x1 = findLeast t1
          x2 = findLeast t2
      in if (x1 < x2) then x1 else x2
Because of (x1 < x2), findLeast must return Int. There is a principal type [and ICFP 2009 finds it]: Term a -> Int
Not due to arbitrarily choosing Term a -> Int as previously.
REJECTED in [ICFP 2009]: no ad-hoc assumptions about programmer intentions.
22
The algorithm in [ICFP 2009]
    findLeast (ILit i) = i
    findLeast (If cond t1 t2) =
      let x1 = findLeast t1
          x2 = findLeast t2
      in if (x1 < x2) then x1 else x2
Type checker infers partially known type: findLeast :: Term α -> β
GADT branches introduce implication constraints that we must solve: (α ~ Int) => (β ~ Int)
Implication constraints may have many solutions, β := Int or β := α, which result in different types.
Constraint abduction [Maher] or (rigid) E-unification [Degtyarev & Voronkov, Veanes, Gallier & Snyder, Gurevich].
Detecting incomparable solutions is only possible in special cases. Mostly negative results about complexity or even decidability of the general problem.
NOT VERY ENCOURAGING
23
Restricting implications for Outside-In
    findLeast (ILit i) = i
    findLeast (If cond t1 t2) =
      let x1 = findLeast t1
          x2 = findLeast t2
      in if (x1 < x2) then x1 else x2
Constraint A: [α,β] (α ~ Int) => (β ~ Int)    Interface: [α,β]
Constraint B: [α,β] (β ~ Int)    Interface: [α,β]
Step 1: Introduce special constraints that record the interface of the branch with the outside.
Step 2: Solve the non-implication constraint (B) first. Easy, no multitude of solutions to pick from: β := Int
Step 3: Substitute the solution into implication constraint (A): [α] (α ~ Int) => (Int ~ Int)
Step 4: Solve the remaining implications, fixing the interface variables.
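The four steps can be written out as a short worked derivation for findLeast. This is only a restatement of the slide's constraints in LaTeX; the final generalization to a quantified type is the principal type quoted earlier in the talk, and the labels in the left column are added for readability.

```latex
\begin{align*}
\text{Inferred so far:}\quad
  & \mathit{findLeast} :: \mathrm{Term}\ \alpha \to \beta \\
\text{Constraint A (ILit branch):}\quad
  & [\alpha,\beta]\ (\alpha \sim \mathrm{Int}) \Rightarrow (\beta \sim \mathrm{Int}) \\
\text{Constraint B (from } x_1 < x_2\text{):}\quad
  & [\alpha,\beta]\ (\beta \sim \mathrm{Int}) \\
\text{Step 2, solve B:}\quad
  & \beta := \mathrm{Int} \\
\text{Step 3, substitute into A:}\quad
  & [\alpha]\ (\alpha \sim \mathrm{Int}) \Rightarrow (\mathrm{Int} \sim \mathrm{Int}) \\
\text{Step 4, A holds trivially:}\quad
  & \mathit{findLeast} :: \forall a.\ \mathrm{Term}\ a \to \mathrm{Int}
\end{align*}
```

The point of the ordering is that β is fixed by B before the implication is examined, so the ambiguous choice between β := Int and β := α never arises.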
24
A brief (hi)story of type expressivity
[Timeline, 1970 to 2015: simple types; Hindley-Milner (ML, Haskell, F#); GADTs, first-class polymorphism, dependent types, type families, type classes, …; OutsideIn(X).]
Sections: The context; My work on expressive types; The future.
Papers: ICFP 2006, ICFP 2009, ICFP 2006, JFP 2007, ICFP 2008, ML 2009, TLDI 2010. NEW: JFP submission.
25
The Hindley-Milner type system 25 years later
How does all of the above affect our "gold standard" of modern type systems?
Reminder: Hindley-Milner does not need any annotations, at all.
We had to add user type annotations to HM to get GADTs.
Yet another reason for this is first-class polymorphism [THESIS TOPIC]:
    QML: Explicit first-class polymorphism for ML [Russo, Vytiniotis, ML 2009]
    FPH: First-class polymorphism for Haskell [Vytiniotis, Peyton Jones, Weirich, ICFP 2008]
    Practical type inference for higher-rank types [Peyton Jones, Vytiniotis, Weirich, Shields, JFP 2007]. The canonical reference for higher-rank type systems.
    Boxy Types [Vytiniotis, Peyton Jones, Weirich, ICFP 2006]
… but are we also forced to remove anything?
26
let generalization in Hindley-Milner
    main = let group x y = [x,y]
           in (group 0 1, group False False)
group is polymorphic. We can give it the generalized type
    group :: forall a. a -> a -> [a]
or defer the check to the call sites [Pottier, Sulzmann, HM(X)]:
    group :: forall a b. (a ~ b) => a -> b -> [a]
For some extensions [e.g. GHC's celebrated type families] we must allow deferring because: no deferring means hard to generalize*
… but is it practical to defer?
* trust me
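Both typings of group can be written down in GHC. In this sketch the deferred variant is a separate top-level function with the hypothetical name `groupDeferred`, and `pairs` replaces the slide's `main` so that the result can be inspected; the `TypeOperators` pragma is included only because newer GHC versions ask for it when `~` appears in a signature.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

-- The local definition is closed (it mentions no outer variables),
-- so it is generalized to a -> a -> [a] and used at Int and at Bool.
pairs :: ([Int], [Bool])
pairs =
  let group x y = [x, y]
  in (group 0 1, group False False)

-- The deferred typing, written explicitly: the equality a ~ b is a
-- constraint that each call site has to discharge.
groupDeferred :: (a ~ b) => a -> b -> [a]
groupDeferred x y = [x, y]
```

A call such as `groupDeferred 'a' 'b'` discharges the equality at `Char`, while `groupDeferred 'a' True` is rejected at that call site rather than at the definition.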
27
No generalization for let-bound definitions
    f :: a -> Term a -> Int
    f x y = let g b = x + 1
            in case y of
                 ILit i -> g ()   -- a ~ Int ... errk???
Well-typed if we defer the equality to the call site of g:
    g :: (a ~ Int) => b -> Int
If the typing rules allow deferring, then the algorithm must not solve any equality [BAD!]
The completeness proof reveals a nasty surprise.
28
The proposal [TLDI 2010]
D. Vytiniotis, S. Peyton Jones, T. Schrijvers [TLDI 2010]
Abandon generalization of local definitions.
The only complete algorithms are not practical.
RADICAL: removing a basic ingredient of HM.
But not restrictive in practice: 127 lines affected in 95Kloc of Haskell libraries (0.13%)!
No expressivity loss: polymorphism can be recovered with annotations.
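GHC exposes this proposal as the MonoLocalBinds extension (switched on automatically by GADTs and TypeFamilies). The sketch below illustrates "polymorphism can be recovered with annotations"; the names `pairUp` and `tag` are hypothetical.

```haskell
{-# LANGUAGE MonoLocalBinds #-}

pairUp :: Int -> ([(Int, Char)], [(Int, Bool)])
pairUp n =
  let -- tag mentions the lambda-bound n, so under MonoLocalBinds it
      -- would not be generalized on its own; the signature below
      -- makes it polymorphic in b again.
      tag :: b -> [(Int, b)]
      tag y = [(n, y)]
  in (tag 'c', tag True)
```

Without the local signature, `tag` would be given a single monomorphic type and the two uses at `Char` and `Bool` would conflict.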
29
OutsideIn(X)
Many recent extensions exhibit those problems:
    GADTs [previous slides]
    Type classes: sort :: forall a. Ord a => [a] -> [a]
    Type families: append :: forall n m. (IList n) -> (IList m) -> (IList (Plus n m))
    Units of measure [Kennedy 94], implicit parameters, functional dependencies, impredicative polymorphism, …
OutsideIn(X) [TLDI 2010, new JFP submission]
Parameterize the "Outside-In" type system and infrastructure [implication constraints] by a constraint theory X and its solver, without losing inference.
Do the Hard Work once.
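The type-family signature for append can be made concrete with a length-indexed list. This is a sketch under assumptions: the slide shows only the signature, so the `Nat` kind, the `Plus` equations, the `IList` constructors and the helper `toList` are all filled in here for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

-- Unary naturals, promoted to the type level by DataKinds.
data Nat = Z | S Nat

-- Type-level addition as a closed type family.
type family Plus (n :: Nat) (m :: Nat) :: Nat where
  Plus 'Z     m = m
  Plus ('S n) m = 'S (Plus n m)

-- Integer lists indexed by their length.
data IList (n :: Nat) where
  Nil  :: IList 'Z
  Cons :: Int -> IList n -> IList ('S n)

-- The result length is the sum of the argument lengths; the checker
-- has to solve equalities involving Plus in each branch.
append :: IList n -> IList m -> IList (Plus n m)
append Nil         ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

-- Forget the index, for inspection.
toList :: IList n -> [Int]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

In the `Cons` branch the checker learns `n ~ 'S n1` from the pattern and must reduce `Plus ('S n1) m` to `'S (Plus n1 m)`, which is the kind of local-assumption reasoning the constraint theory X has to support.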
30
OutsideIn(X): new JFP submission
Substantial article that brings together the results of a multi-year collaborative research program.
Many people involved over the years: Simon Peyton Jones, Tom Schrijvers (KU Leuven), Martin Sulzmann (Informatik Consulting Systems AG), Manuel Chakravarty (UNSW), Stephanie Weirich (Penn), Geoff Washburn (LogicBlox), …
Bonus: a new glorious constraint solver to instantiate X, which improves on previous work and for the first time shows how to deal with all of GHC's tricky features.
31
A brief (hi)story of type expressivity
[Timeline, 1970 to 2015: simple types; Hindley-Milner (ML, Haskell, F#); GADTs, first-class polymorphism, dependent types, type families, type classes, …; OutsideIn(X).]
Sections: The context; My work on expressive types; The future.
Papers: ICFP 2006, ICFP 2009, ICFP 2006, JFP 2007, ICFP 2008, ML 2009, TLDI 2010. NEW: JFP submission.
32
What we learned
We now know about:
    Local assumptions [ICFP 2006, ICFP 2009, TLDI 2010]
    Local definitions [TLDI 2010]
    Generalizing Outside-In with OutsideIn(X) [TLDI 2010]
Where to from here?
33
2015 (And ideas for collaborations!)
… towards practical pluggable type systems + inference!
UnitTheory.thy, a theory of units of measure [Kennedy, ESOP94]:
    constant kg,hp,sec,m
    axiom u*1 = u
    axiom u*v = v*u
    axiom …
A program that uses it:
    import UnitTheory.thy
    data Vehicle = Vehicle { weight :: Int[kg], power :: Int[hp], ... }
    ...
[Diagram: roles "DSL Designer / User", "DSL User" and "We (the compiler)"; a solver for UnitTheory constraints; type checker/inference OutsideIn(UnitTheory); output Yes/No; programs with principal types.]
Open: How to design syntactic language extensions
Open: How to trust the solver [proof checking, certificates?]
Open: How to type more programs with principal types [revisiting rigid E-unification, better constraint solvers, ideas from SMT solving]
Open: How to combine multiple theories and solvers [revisiting Nelson-Oppen]
34
Understanding and writing better software
Past: What do GADTs mean? How many functions have type
    forall a. [a] -> a -> a
    forall a. Term a -> a -> a
[Vytiniotis & Weirich, MFPS XXIII, Vytiniotis & Weirich, JFP 2010]
Past: PL proofs are tedious and error-prone. Mechanize them in proof assistants. The POPLMark Challenge [TPHOLS 2005]. Have been using Isabelle/HOL and Coq in recent works with Claudio Russo and Andrew Kennedy.
Ongoing: Typed intermediate languages that better support type equalities and full-blown dependent types [with S. Weirich, S. Zdancewic, S. Peyton Jones]
Ongoing: Adding probabilities to contracts to combine static analysis and testing or statistical methods [with V. Koutavas, TCD]
On the wish list: Macroscopically programming groups of agents of limited computational power
35
Q-A games for encoding and decoding
Imagine a binary format such that every bitstring encodes a non-empty set of type-safe CIL programs.
Not easy to program from first principles!
Instead, understand and program encoders using question-answer games.
A good coding scheme follows from asking good questions!
[Figure: question tree with y/n branches.]
Recent ICFP 2010 submission with A. Kennedy.
36
Programming language types
Making good software easier to write
[Figure: complexity against # bugs.]
A demonstrably simple technology that can already eliminate lots of bugs.
This talk: solving research problems to make types more effective and practical:
    Catch more bugs
    Require little user guidance
    Remain predictable and modular