Dependently Typed Pattern Matching
Hongwei Xi, Boston University
Datatypes
- Available in various functional programming languages such as SML and Haskell
- Convenience in programming
- Clarity in code
An Example: Random-Access Lists
- Cons: O(log n) (amortized: O(1))
- Uncons: O(log n) (amortized: O(1))
- Lookup: O(log n)
- Update: O(log n)
Datatype for Random Lists
    datatype 'a ralist =
        Nil
      | One of 'a
      | Even of 'a ralist * 'a ralist
      | Odd of 'a ralist * 'a ralist
- L1: x_1, …, x_n; L2: y_1, …, y_n
  Even(L1, L2): x_1, y_1, …, x_n, y_n
- L1: x_1, …, x_n, x_{n+1}; L2: y_1, …, y_n
  Odd(L1, L2): x_1, y_1, …, x_n, y_n, x_{n+1}
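The interleaving described above can be sketched in Python. This is only an illustration under stated assumptions: the class names mirror the ML constructors, while the field names (item, left, right) and the helpers to_list and lookup are hypothetical and do not come from the slides. Nothing here enforces the length invariants.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class One:
    item: Any


@dataclass(frozen=True)
class Even:
    left: Any   # elements at positions 0, 2, 4, ...
    right: Any  # elements at positions 1, 3, 5, ...


@dataclass(frozen=True)
class Odd:
    left: Any   # intended to hold one element more than right
    right: Any


def to_list(l) -> List[Any]:
    """Flatten a random-access list by interleaving its two halves."""
    if isinstance(l, Nil):
        return []
    if isinstance(l, One):
        return [l.item]
    xs, ys = to_list(l.left), to_list(l.right)
    out = [None] * (len(xs) + len(ys))
    out[0::2] = xs
    out[1::2] = ys
    return out


def lookup(l, i: int):
    """Return element i; the index is halved at each level, so a
    well-formed list is traversed in O(log n) steps."""
    if isinstance(l, One) and i == 0:
        return l.item
    if isinstance(l, (Even, Odd)):
        half = l.left if i % 2 == 0 else l.right
        return lookup(half, i // 2)
    raise IndexError("index out of range")
```

For example, Odd(Even(One("a"), One("c")), One("b")) stands for the list a, b, c: the left half supplies positions 0 and 2, the right half position 1.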
Some Inadequacies
- Even should only be applied to two nonempty lists of equal length
- Odd should only be applied to two nonempty lists where the first list contains exactly one more element than the second one
- Unfortunately, these invariants cannot be captured by the type system of ML
Dependent Datatypes for Random Lists
    datatype 'a ralist with nat =
        Nil(0)
      | One(1) of 'a
      | {n:pos} Even(n+n) of 'a ralist(n) * 'a ralist(n)
      | {n:pos} Odd(n+n+1) of 'a ralist(n+1) * 'a ralist(n)
For instance, Even is given the type:
    {n:pos} 'a ralist(n) * 'a ralist(n) -> 'a ralist(n+n)
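To make the role of the index concrete, here is a minimal Python sketch that recomputes the index at run time and rejects values that violate the constraints stated by the dependent datatype. DML establishes these facts statically; the function name index and the field names are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class One:
    item: Any


@dataclass(frozen=True)
class Even:
    left: Any
    right: Any


@dataclass(frozen=True)
class Odd:
    left: Any
    right: Any


def index(l) -> int:
    """Return the length index of l, or raise ValueError if l breaks
    one of the constraints written in the dependent datatype."""
    if isinstance(l, Nil):
        return 0                      # Nil(0)
    if isinstance(l, One):
        return 1                      # One(1)
    if not isinstance(l, (Even, Odd)):
        raise TypeError("not a random-access list")
    m, n = index(l.left), index(l.right)
    if isinstance(l, Even) and m == n and n >= 1:
        return n + n                  # {n:pos} Even(n+n)
    if isinstance(l, Odd) and m == n + 1 and n >= 1:
        return n + n + 1              # {n:pos} Odd(n+n+1)
    raise ValueError("length invariant violated")
```

A value such as Even(One(1), Nil()) is accepted by the plain ML datatype but is rejected here, which is the kind of error the dependent datatype rules out at compile-time.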
uncons in Dependent ML (DML)
    fun('a) uncons (One x) = (x, Nil)
      | uncons (Even (l1, l2)) =
          (case uncons l1 of
             (x, Nil) => (x, l2)
           | (x, l1) => (x, Odd (l2, l1)))
      | uncons (Odd (l1, l2)) =
          let val (x, l1) = uncons l1 in (x, Even (l2, l1)) end
    withtype {n:pos} 'a ralist(n) -> 'a * 'a ralist(n-1)
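For readers who want to run the function, the following is a rough Python transcription of uncons. It is a sketch, not DML: the index n is not tracked, the field names are hypothetical, and the branches are tested in order at run time, which corresponds to the sequential reading of the clauses.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class One:
    item: Any


@dataclass(frozen=True)
class Even:
    left: Any
    right: Any


@dataclass(frozen=True)
class Odd:
    left: Any
    right: Any


def uncons(l):
    """Split a nonempty random-access list into (head, tail)."""
    if isinstance(l, One):
        return l.item, Nil()
    if isinstance(l, Even):
        x, rest = uncons(l.left)
        if isinstance(rest, Nil):
            return x, l.right
        # Reached only when rest is not Nil, because the test above ran
        # first; this is the ordering the type checker does not assume.
        return x, Odd(l.right, rest)
    if isinstance(l, Odd):
        x, rest = uncons(l.left)
        return x, Even(l.right, rest)
    raise ValueError("uncons applied to an empty list")
```

Applying uncons to Odd(Even(One(1), One(3)), One(2)), which stands for 1, 2, 3, gives the head 1 and the tail Even(One(2), One(3)).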
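Pattern Matching in DML
- Nondeterministic at compile-time: each clause is type-checked independently of the clauses before it
- Sequential at run-time
- This can cause an annoying problem in DML: the previous code for uncons does not type-check. In the clause (x, l1) => (x, Odd (l2, l1)) the checker must allow l1 to be Nil, while Odd requires both of its arguments to be nonempty.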
Mutually Disjoint Patterns
- Nondeterministic pattern matching is the same as sequential pattern matching if all patterns are disjoint
- We can manually expand patterns into disjoint ones, but this may be inconvenient and error-prone
An Example of Expansion
    (case uncons l1 of
       (x, Nil) => (x, l2)
     | (x, l1) => (x, Odd (l2, l1)))
is expanded into
    (case uncons l1 of
       (x, Nil) => (x, l2)
     | (x, l1 as One _) => (x, Odd (l2, l1))
     | (x, l1 as Even _) => (x, Odd (l2, l1))
     | (x, l1 as Odd _) => (x, Odd (l2, l1)))
The Problem
- Given patterns p, p_1, …, p_n, we intend to find a list of patterns p'_1, …, p'_{n'} such that a value v matches p but none of the p_i if and only if v matches one of the p'_j.
- Note that p'_1, …, p'_{n'} need not be disjoint.
- An algorithm that generates the least n' is said to be optimal.
The Result
- An algorithm, which is essentially based upon Laville's work, is presented and proven to be optimal.
- Note that this is an exponential algorithm.
- We do handle datatypes with infinitely many constructors (integers).
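The problem statement can be illustrated with a naive pattern-difference procedure in Python. This is not the Laville-based algorithm of the talk and makes no claim of optimality; it only shows what the output looks like. The pattern encoding (the string "_" for a wildcard, tuples for constructor patterns), the SIG table, the type names "pair" and "elem", and the function names are all assumptions made for this sketch.

```python
WILD = "_"

# Hypothetical signature: type name -> {constructor: argument types}.
# "elem" stands for an opaque element type with no constructors to expand.
SIG = {
    "pair": {"Pair": ["elem", "ralist"]},
    "ralist": {
        "Nil": [],
        "One": ["elem"],
        "Even": ["ralist", "ralist"],
        "Odd": ["ralist", "ralist"],
    },
    "elem": None,
}


def diff(p, q, ty):
    """Patterns covering the values of type ty that match p but not q."""
    if q == WILD:
        return []                      # q matches everything
    if p == WILD:
        ctors = SIG[ty]
        if ctors is None:
            raise ValueError("cannot expand a wildcard of opaque type " + ty)
        out = []
        for name, arg_types in ctors.items():
            out.extend(diff((name,) + (WILD,) * len(arg_types), q, ty))
        return out
    if p[0] != q[0]:
        return [p]                     # different constructors never overlap
    out = []
    for i, t in enumerate(SIG[ty][p[0]]):
        # A value escapes q if at least one argument escapes q's argument.
        for r in diff(p[i + 1], q[i + 1], t):
            out.append(p[: i + 1] + (r,) + p[i + 2:])
    return out


def complement(p, qs, ty):
    """Patterns matching p but none of the patterns in qs (may overlap)."""
    current = [p]
    for q in qs:
        current = [r for c in current for r in diff(c, q, ty)]
    return current


def show(p):
    if p == WILD:
        return WILD
    name, args = p[0], p[1:]
    return name if not args else name + "(" + ", ".join(map(show, args)) + ")"
```

Taking p = Pair(_, _) and p_1 = Pair(_, Nil), complement returns Pair(_, One(_)), Pair(_, Even(_, _)) and Pair(_, Odd(_, _)), which are the three clauses of the manual expansion of the uncons case expression.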
A Motivating Example
    fun restore (R(R t, y, c), z, d) = R(B t, y, B(c, z, d))
      | restore (R(a, x, R(b, y, c)), z, d) = R(B (a, x, b), y, B(c, z, d))
      | restore (a, x, R(R(b, y, c), z, d)) = R(B (a, x, b), y, B(c, z, d))
      | restore (a, x, R(b, y, R t)) = R(B (a, x, b), y, B t)
      | restore t == B t   (* == indicates the need for resolving sequentiality *)
    withtype …
The last clause in the above definition needs to be expanded into 36 clauses in order to type-check.
Exhaustiveness of Patterns
    datatype 'a list with nat =
        nil(0)
      | {n:nat} cons(n+1) of 'a * 'a list(n)

    fun('a, 'b) zip ([], []) = []
      | zip (x :: xs, y :: ys) = (x, y) :: zip (xs, ys)
    withtype {n:nat} 'a list(n) * 'b list(n) -> ('a * 'b) list(n)
The pattern matching clauses in the definition of zip are exhaustive: neither ([], _ :: _) nor (_ :: _, []) can have type 'a list(n) * 'b list(n) for any natural number n.
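The exhaustiveness argument for zip amounts to a small constraint check on the shared index n: [] forces n = 0, while _ :: _ forces n >= 1. A minimal Python sketch of that reasoning follows; the interval encoding and the names LENGTHS, intersect and reachable are assumptions for illustration and say nothing about how the DML type checker solves constraints.

```python
from itertools import product

# Lengths n admitted by each top-level list pattern, as a closed interval
# (lo, hi); hi = None means there is no upper bound.
LENGTHS = {
    "[]": (0, 0),         # nil(0)
    "_ :: _": (1, None),  # cons(n+1) with n a natural number
}


def intersect(a, b):
    """Intersect two intervals; return None if the result is empty."""
    lo = max(a[0], b[0])
    bounds = [h for h in (a[1], b[1]) if h is not None]
    hi = min(bounds) if bounds else None
    if hi is not None and lo > hi:
        return None
    return (lo, hi)


def reachable(p1, p2):
    """True if some n lets p1 and p2 both describe lists of length n."""
    return intersect(LENGTHS[p1], LENGTHS[p2]) is not None


# Every combination of top-level patterns for the pair argument of zip.
CASES = {(p1, p2): reachable(p1, p2) for p1, p2 in product(LENGTHS, repeat=2)}
```

Only ([], []) and (_ :: _, _ :: _) come out reachable, so the two clauses of zip cover every value of the given type.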
Exhaustiveness of Patterns
    fun('a) nth_safe (0, x :: _) = x
      | nth_safe (i, _ :: xs) = nth_safe (i-1, xs)
    withtype {i:nat, n:nat | i < n} int(i) * 'a list(n) -> 'a
The pattern matching clauses are also exhaustive since …
Tag Check Elimination
[Figure: decision tree with seven nodes, numbered 1 to 7 in the original]
    Pat = (_, _), Pos = o.0
        Pat = (_ :: _, _), Pos = o.1
            Pat = (_ :: _, _ :: _)
            Pat = (_ :: _, [])
        Pat = ([], _), Pos = o.1
            Pat = ([], [])
            Pat = ([], _ :: _)
Interpreter (I)
    sort typ = Int | Bool | Fun of typ * typ
    sort ctx = nil | :: of typ * ctx

    datatype exp =
        Int of int | Bool of bool
      | Add of exp * exp | Sub of exp * exp | Eq of exp * exp
      | If of exp * exp * exp
      | One | Shift of exp
      | Lam of exp | App of exp * exp | Fix of exp
Interpreter (II)
We can refine exp with a type index expression of sort typ * ctx:
    Add: {c:ctx} exp(Int, c) * exp(Int, c) -> exp(Int, c)
    One: {t:typ, c:ctx} exp(t, t :: c)
    Shift: {ta:typ, tb:typ, c:ctx} exp(ta, c) -> exp(ta, tb :: c)
    Lam: {ta:typ, tb:typ, c:ctx} exp(tb, ta :: c) -> exp(Fun(ta, tb), c)
    …
Interpreter (III)
    fun evaluate e = eval (e, [])
    withtype {t:typ} exp(t, nil) -> value(t)

    and eval (Zero e, env) =
          let val ValInt i = eval (e, env) in ValBool (i = 0) end
      …
    …
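The shape of the evaluator can be sketched in Python, with the tag checks that the refined types make unnecessary written out as assertions. This is an untyped illustration under several assumptions: expressions and values are encoded as tagged tuples, Eq is taken to compare integers, Fix and the Zero form used in the clause above are omitted because their types are not shown, and the names eval_exp and evaluate are hypothetical.

```python
def eval_exp(e, env):
    """Evaluate a tagged-tuple expression; env lists the values of the
    variables in scope, innermost binding first (de Bruijn style)."""
    tag = e[0]
    if tag == "Int":
        return ("ValInt", e[1])
    if tag == "Bool":
        return ("ValBool", e[1])
    if tag in ("Add", "Sub", "Eq"):
        (t1, i), (t2, j) = eval_exp(e[1], env), eval_exp(e[2], env)
        # Tag checks: with exp indexed by (typ, ctx) these cannot fail.
        assert t1 == "ValInt" and t2 == "ValInt"
        if tag == "Add":
            return ("ValInt", i + j)
        if tag == "Sub":
            return ("ValInt", i - j)
        return ("ValBool", i == j)
    if tag == "If":
        t, b = eval_exp(e[1], env)
        assert t == "ValBool"
        return eval_exp(e[2] if b else e[3], env)
    if tag == "One":
        return env[0]                  # the most recently bound variable
    if tag == "Shift":
        return eval_exp(e[1], env[1:])  # skip one binding
    if tag == "Lam":
        body = e[1]
        return ("ValFun", lambda v: eval_exp(body, [v] + env))
    if tag == "App":
        t, f = eval_exp(e[1], env)
        assert t == "ValFun"
        return f(eval_exp(e[2], env))
    raise ValueError("unsupported expression: " + str(tag))


def evaluate(e):
    """Evaluate a closed expression in the empty environment."""
    return eval_exp(e, [])
```

For instance, evaluate(("App", ("Lam", ("Add", ("One",), ("Int", 1))), ("Int", 41))) applies the function that adds 1 to its argument to 41 and yields ("ValInt", 42).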
Untagged Representation
- Obviously, there is no need for tags if we never do tag-checking on the values of a particular datatype
- However, garbage collection makes things much more difficult
Conclusion
- Dependent datatypes can more accurately model data structures
- More program errors can be detected at compile-time
- Code becomes more robust
- This is a case where safer code runs faster