Transformation and Analysis of Haskell Source Code Neil Mitchell www.cs.york.ac.uk/~ndm 

Slides:



Advertisements
Similar presentations
Sml2java a source to source translator Justin Koser, Haakon Larsen, Jeffrey Vaughan PLI 2003 DP-COOL.
Advertisements

Optional Static Typing Guido van Rossum (with Paul Prescod, Greg Stein, and the types-SIG)
Tom Schrijvers K.U.Leuven, Belgium with Manuel Chakravarty, Martin Sulzmann and Simon Peyton Jones.
Introduction to Compilation of Functional Languages Wanhe Zhang Computing and Software Department McMaster University 16 th, March, 2004.
De necessariis pre condiciones consequentia sine machina P. Consobrinus, R. Consobrinus M. Aquilifer, F. Oratio.
Type Checking, Inference, & Elaboration CS153: Compilers Greg Morrisett.
Semantics Static semantics Dynamic semantics attribute grammars
Python Programming Chapter 5: Fruitful Functions Saad Bani Mohammad Department of Computer Science Al al-Bayt University 1 st 2011/2012.
1 Temporal Claims A temporal claim is defined in Promela by the syntax: never { … body … } never is a keyword, like proctype. The body is the same as for.
CATCH: Case and Termination Checker for Haskell Neil Mitchell (Supervised by Colin Runciman)
1 Operational Semantics Mooly Sagiv Tel Aviv University Textbook: Semantics with Applications.
0 PROGRAMMING IN HASKELL Chapter 7 - Higher-Order Functions.
Courtesy Costas Buch - RPI1 Simplifications of Context-Free Grammars.
WRT 2007 Refactoring Functional Programs Huiqing Li Simon Thompson Computing Lab Chris Brown Claus Reinke University of Kent.
CSE 130 : Winter 2006 Programming Languages Ranjit Jhala UC San Diego Lecture 7: Polymorphism.
Sparkle A theorem prover for the functional language Clean Maarten de Mol University of Nijmegen February 2002.
Operational Semantics Semantics with Applications Chapter 2 H. Nielson and F. Nielson
By: Jamie McPeek. 1. Background Information 1. Metasearch 2. Sets 3. Surface Web/Deep Web 4. The Problem 5. Application Goals.
Type Inference: CIS Seminar, 11/3/2009 Type inference: Inside the Type Checker. A presentation by: Daniel Tuck.
Cormac Flanagan University of California, Santa Cruz Hybrid Type Checking.
Database Management Systems (DBMS)
Grammars and Parsing. Sentence  Noun Verb Noun Noun  boys Noun  girls Noun  dogs Verb  like Verb  see Grammars Grammar: set of rules for generating.
Maria-Cristina Marinescu Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology A Synthesis Algorithm for Modular Design of.
Language Evaluation Criteria
Playing with Haskell Data Neil Mitchell. Overview The “boilerplate” problem Haskell’s weakness (really!) Traversals and queries Generic traversals and.
PrasadCS7761 Haskell Data Types/ADT/Modules Type/Class Hierarchy Lazy Functional Language.
Haskell programming language. Haskell is… Memory managed (allocation, collection) “Typeful” (static, strong) – Types are checked at compile-time – Types.
Scrap Your Boilerplate /~simonpj/papers/hmap/
Chapter 3 (Part 3): Mathematical Reasoning, Induction & Recursion  Recursive Algorithms (3.5)  Program Correctness (3.6)
CSC-682 Cryptography & Computer Security Sound and Precise Analysis of Web Applications for Injection Vulnerabilities Pompi Rotaru Based on an article.
A Lightweight Interactive Debugger for Haskell Simon Marlow José Iborra Bernard Pope Andy Gill.
CSC 580 – Theory of Programming Languages, Spring, 2009 Week 9: Functional Languages ML and Haskell, Dr. Dale E. Parson.
Sound Haskell Dana N. Xu University of Cambridge Joint work with Simon Peyton Jones Microsoft Research Cambridge Koen Claessen Chalmers University of Technology.
0 PROGRAMMING IN HASKELL Chapter 9 - Higher-Order Functions, Functional Parsers.
Introduction to Problem Solving. Steps in Programming A Very Simplified Picture –Problem Definition & Analysis – High Level Strategy for a solution –Arriving.
Compiling Functional Programs Mooly Sagiv Chapter 7
A Second Look At ML 1. Outline Patterns Local variable definitions A sorting example 2.
Looping and Counting Lecture 3 Hartmut Kaiser
Chapter 3 Part II Describing Syntax and Semantics.
Chapter 3 Syntax, Errors, and Debugging Fundamentals of Java.
0 Odds and Ends in Haskell: Folding, I/O, and Functors Adapted from material by Miran Lipovaca.
Copyright Curt Hill The C/C++ switch Statement A multi-path decision statement.
1 CS 457/557: Functional Languages Lists and Algebraic Datatypes Mark P Jones Portland State University.
Lee CSCE 314 TAMU 1 CSCE 314 Programming Languages Interactive Programs: I/O and Monads Dr. Hyunyoung Lee.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
First Order Haskell Neil Mitchell York University λ.
1 Total Pasta: Unfailing Pointer Programs Neil Mitchell, ndm AT cs.york.ac.uk Department of Computer Science, University of York.
Chapter SevenModern Programming Languages1 A Second Look At ML.
Advanced Functional Programming Tim Sheard 1 Lecture 17 Advanced Functional Programming Tim Sheard Oregon Graduate Institute of Science & Technology Lecture:
1 Static Contract Checking for Haskell Dana N. Xu University of Cambridge Joint work with Simon Peyton Jones Microsoft Research Cambridge Koen Claessen.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
Operational Semantics Mooly Sagiv Tel Aviv University Textbook: Semantics with Applications Chapter.
0 PROGRAMMING IN HASKELL Based on lecture notes by Graham Hutton The book “Learn You a Haskell for Great Good” (and a few other sources) Odds and Ends,
Operational Semantics Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Operational Semantics Mooly Sagiv Reference: Semantics with Applications Chapter 2 H. Nielson and F. Nielson
Haskell With Go Faster Stripes Neil Mitchell. Catch: Project Overview Catch checks that a Haskell program won’t raise a pattern match error head [] =
Operational Semantics Mooly Sagiv Reference: Semantics with Applications Chapter 2 H. Nielson and F. Nielson
DGrid: A Library of Large-Scale Distributed Spatial Data Structures Pieter Hooimeijer,
1 PROGRAMMING IN HASKELL An Introduction Based on lecture notes by Graham Hutton The book “Learn You a Haskell for Great Good” (and a few other sources)
Functional Programming
Chapter 4 (Part 3): Mathematical Reasoning, Induction & Recursion
Types CSCE 314 Spring 2016.
What, Where and Why Swift
Matching Logic An Alternative to Hoare/Floyd Logic
Reasoning about code CSE 331 University of Washington.
Combining Compile-Time and Run-Time Components
6.001 SICP Interpretation Parts of an interpreter
Programming Languages
Functional Programming and Haskell
Functional Programming and Haskell
Presentation transcript:

Transformation and Analysis of Haskell Source Code Neil Mitchell 

Why Haskell? Functional programming language  Short, beautiful programs Referential transparency  Easier to reason about and manipulate Lazy  Beta-reduction holds  Can inline easily

Goals Transform  Make transformations concise Optimise  Make programs execute faster Analyse  Generate proofs of safety  Pinpoint unsafe aspects 

Haskell Source data Core = Core [Data] [Func] data Func = Func Name [Args] Expr data Expr = Let [(Name,Expr)] Expr | App Expr [Expr] | Case Expr [(Expr,Expr)] | Var Name | Fun Name | Con Name | -- lots more

Find all functions f :: Expr  [String] f (Let x y) = concatMap (f.snd) x ++ f y f (App x y) = f x ++ concatMap f y f (Case x y) = f x ++ concatMap f [[a,b] | (a,b) <- y] f (Fun x) = [x] -- lots more cases

Removing Boilerplate uniplate x = [x | Fun x <- universe x] syb x = everything (++) ([] `mkQ` getFun) where getFun (Fun x) = [x] getFun _ = [] compos :: Tree c -> [Name] compos (Fun x) = [x] compos x = composOpFold [] (++) compos x

Generic Traversals Reduce the quantity of code Make programs more readable Make code more robust My extra goal: Use Haskell 98 (no scary types)

Fewer Extensions Uniplate (GHC, Yhc, nhc, Hugs – H98)  Advanced features require Hugs/GHC – H’ SYB (GHC 6.4+ only)  Requires rank-2 types  Data instances in the compiler Compos (GHC 6.6+ only)  Rank-2 types  GADT’s (very unportable)

Central Idea class Uniplate a where uniplate :: a  ([a], [a]  a) uniplate x = (get,set) Children  maximal contained items of the same type  Get the children  Set a new set of children

Traversals Queries  Extract information out  Already seen an example Transformations  Create a modified value  Some change

Removing Let’s The operation removeLet (Let bind x) = Just $ substitute bind x removeLet _ = Nothing The transformation removeAllLet = rewrite removeLet

Concise and Fast

Uniplate in the World My uses  Optimiser, Analyser  Hoogle (Haskell search engine)  Dr Haskell (Haskell tutorial tool) Matt Naylor’s uses (see next)  Reach, Reduceron Several other projects  Configurations, QHC, Javascript generator…

Optimisation Goal  Haskell code should be as fast a C  Code should remain high-level Central idea  Remove overhead  Remove intermediate steps

Intermediate Steps Eliminate values (data/functions)  length [1..n]  not (not x) INPUTOUTPUT

The Method Remove higher order functions 1.Either: using specialise/inline rule 2.Or: using over/under staturation rules Convert data to functions  Church encoding Remove higher order functions Leaves little data or functions

First Order Haskell Remove lambda abstractions (lambda lift) Leaving only partial application/currying odd = (.) not even (.) f g x = f (g x) Generate templates (specialised bits)

Oversaturation f x y z, where arity(f) < 3 main = odd 12 x = (.) not even x main = 12

Undersaturation f x (g y) z, where arity(g) > 1 x = (.) not even x x = not (even x) x = x

Special Rules let z = f x y, where arity(f) > 2  (let-under) rule  inline z, after sharing x and y d = Ctor (f x) y, where arity(f) > 1  (ctor-under) rule  inline d  The “dictionary” rule

Standard Rules let x = (let y=z in q) in … let/let case (let x=y in z) of … case/let case (case x of …) of … case/case (case x of …) y z app/case case C x of … case/ctor

Church Encoding data List a = Nil | Cons a (List a) len x = case x of Nil  0 Cons y ys  1 + len ys nil= \n c  n cons x y= \n c  c x y len x = x 0 (\y ys  1 + len ys)

The Preliminary Results

Future Work Refactoring  Requires extensible transformations  Needs to integrate with GHC’s IO Monad More Benchmarks Proofs  Correctness  Laziness/strictness preserving  Termination

Analysis: Pattern matching Haskell programs may crash at runtime  Pattern-match errors are quite common head “neil” = ‘n’ head [] =  Can get very complex 

The Goal Statically prove the absence of pattern- match errors  Be conservative  Generate a “proof” of safety Entirely automatic  No annotations Practical  Catch tool has been released 

A Pattern-Match Error In Haskell you match a value with a set of patterns  Patterns do not have to be exhaustive A “default” pattern is inserted, calling error Analysis:  Can the error case be reached?  What are the preconditions on functions? 

Preconditions Calculate a precondition on the input  Sufficient to ensure the output is never   INPUT OUTPUT 

Properties Calculate a precondition on the input  Sufficient to ensure a particular output INPUT OUTPUT 

Automatic inference Can automatically infer the properties and preconditions  Precondition of error is False  Precondition of an expression can be expressed as preconditions of its parts  Properties are used for calculating preconditions on function results 

Constraints All based on the partitioning of a function  Constraints on values are used BP constraints – list of patterns RE constraints – use regular expressions MP constraints – clever list of patterns  Used in Catch 

MP Constraints Haskell has recursive data structures data List  = Nil | Cons  (List  ) MP is: non-recursive  recursive  Non-recursive represents top-level values  Recursive represents all other values (Cons _ *)  (Cons _ * | Nil) 

MP Examples (Cons _ *)  (Cons _ * | Nil)  Non-empty list (Cons True *)  (Cons True *)  Infinite list of True True  _  The value True (Zero | One | Pos)  _  A natural number 

Key MP Property Any proposition on MP constraints of one variable is equivalent to one MP constraint (True  _)  (False  _) = (_  _)  Works in all cases Results in simplification, and fast analysis 

A real-world program XMonad: An window manager for X  Lots of low-level details  A single pure core module “StackSet”  No special annotations Running Catch:  $ catch StackSet.hs --quiet Checking StackSet 14 error calls found All proven safe

One XMonad sample views n | n < 1 = … | otherwise = h : g t where (h:t) = [f i | i ← [1..n]] This is safe for Int, Integer Not safe for all numeric types 

Analysis Times Lines of Code Secs 

Catch in the Real World XMonad was proven safe  Developers have started using it as standard FilePath library checked FiniteMap library checked HsColour program checked  Found 3 previously unknown, genuine bugs 

Conclusions Transform: Uniplate  Concise and fast code  Without scary types (beginner friendly) Optimise: Supero  Fast code, with reasonable compile times Analyse: Catch  Can automatically check real world programs  Can find genuine bugs 