Refactoring Haskell Programs Huiqing Li Computing Lab, University of Kent
21/06/2015TCS2 Outline Refactoring HaRe: The Haskell Refactorer Implementation of HaRe Formalisation of Haskell Refactorings Future Work
21/06/2015TCS3 Refactoring What? Changing the structure of existing code without changing its behaviour. module Main (main) where f z y = y : f z (y + z) main y = print $ f 1 10 module Main (main) where f y = y : f (y + 1) main = print $ f 10 Example:
21/06/2015TCS4 HaRe – The Haskell Refactorer A tool for refactoring Haskell 98 programs. Full Haskell 98 coverage. Driving concerns: usability and extensibility. Integrated with the two program editors: (X)Emacs and Vim. Preserves both comments and layout style of the source.
21/06/2015TCS5 HaRe – The Haskell Refactorer Implemented in Haskell, using Programatica’s frontends and Strafunski’s generic traversals. Programatica: a system implemented in Haskell for the development of high-assurance software in Haskell. Strafunski: A Haskell library for supporting generic programming in application areas that involve term traversals over large abstract syntaxes.
21/06/2015TCS6 Refactorings Implemented in HaRe Structural Refactorings Module Refactorings Data-Oriented Refactorings
21/06/2015TCS7 Refactorings Implemented in HaRe Structural Refactorings Generalise a definition module Main (main) where f y = y : f (y + 1) main = print $ f 10 module Main (main) where f z y = y : f z (y + z) main y = print $ f 1 10
21/06/2015TCS8 Refactorings Implemented in HaRe Structural Refactorings (cont.) Rename an identifier Promote/demote a definition to widen/narrow its scope Delete an unused function Duplicate a definition Unfold a definition Introduce a definition to name an identified expression Add an argument to a function Remove an unused argument from a function
21/06/2015TCS9 Refactorings Implemented in HaRe Module Refactorings module Test (f) where f y = y : f (y + 1) module Main where import Test main = print $ f 10 module Test ( ) where module Main where import Test f y = y : f (y + 1) main = print $ f 10 Move a definition from one module to another module
21/06/2015TCS10 Refactorings Implemented in HaRe Module Refactorings (cont.) Clean the imports Make the used entities explicitly imported Add an item to the export list Remove an item from the export list
21/06/2015TCS11 Refactorings Implemented in HaRe Data-oriented Refactorings From concrete to abstract data-type (ADT), which is a composite refactoring built from a sequence of primitive refactorings.
21/06/2015TCS12 Refactorings Implemented in HaRe
21/06/2015TCS13 Refactorings Implemented in HaRe
21/06/2015TCS14 TS: token stream; AST: annotated abstract syntax tree; MI: module information; Programatica (lexer, parser, type checker, module analysis) TS AST + MI Program TS’ analysis and transformation using Strafunski (with DrIFT) AST’ extract program from token stream Program Implementation of HaRe
21/06/2015TCS15 The Architecture of HaRe Composite Refactorings Elementary Refactorings RefacUtils (HaRe_API) Programatica Strafunski (with DrIFT) RefacLocUtils
21/06/2015TCS16 Formalisation of Refactorings Advantages: Clarify the definition of refactorings in terms of side-conditions and transformations. Improve our confidence in the behaviour-preservation of refactorings. Guide the implementation of refactorings. Challenges: Haskell is a non-trivial language. Haskell does not have an officially defined semantics.
21/06/2015TCS17 Formalisation of Refactorings Our Strategy: Start from a simple language ( letrec ). Extend the language gradually to formalise more complex refactorings.
21/06/2015TCS18 Formalisation of Refactorings The specification of a refactoring contains four parts: The representation of the program before the refactorings, say P 1 The side-conditions for the refactoring. The representation of the program after the refactorings, say P 2. A proof showing that P 1 and P 2 have the same functionality under the side-conditions.
21/06/2015TCS19 Formalisation of Refactorings The -calculus with letrec ( letrec ) Syntax of letrec terms. E ::= x | x.E | E 1 E 2 | letrec D in E D ::= | x i =E i | D, D Use the call-by-name semantics developed by Zena M. Ariola and Stefan Blom in the paper Lambda Calculi plus letrec.
21/06/2015TCS20 Formalisation of Generalisation Recall the example module Main (main) where f y = y : f (y + 1) main = print $ f 10 module Main (main) where f z y = y : f z (y + z) main = print $ f 1 10
21/06/2015TCS21 Formalisation of Generalisation Formal definition of Generalisation using letrec Given the expression: Assume E is a sub-expression of E i, and E i = C[E]. letrec x 1 =E 1,..., x i =E i,..., x n =E n in E 0
21/06/2015TCS22 Formalisation of Generalisation Formal definition of Generalisation using letrec The condition for generalising the definition x i =E i on E is: x i FV(E ) Æ 8 x, e: (x 2 FV(E ) Æ e 2 sub(E i, C) ) x 2 FV(e)) module Main (main) where f y = y : f (y + 1) main = print $ f 10
21/06/2015TCS23 Formalisation of Generalisation Formal definition of Generalisation using letrec The condition for generalising the definition x i =E i on E is: x i FV(E ) Æ 8 x, e: (x 2 FV(E ) Æ e 2 sub(E i, C) ) x 2 FV(e)) module Main (main) where f y = y : f (y + 1) main = print $ f 10
21/06/2015TCS24 Formalisation of Generalisation Formal definition of Generalisation using letrec The condition for generalising the definition x i =E i on E is: x i FV(E ) Æ 8 x, e: (x 2 FV(E ) Æ e 2 sub(E i, C) ) x 2 FV(e)) module Main (main) where f y = y : f (y + 1) main = print $ f 10
21/06/2015TCS25 Formalisation of Generalisation Formal definition of Generalisation using letrec After generalisation, the original expression becomes: letrec x 1 = E 1 [x i := x i E],..., x i = z.C[z][x i :=x i z],..., x n = E n [x i := x i E] in E 0 [x i := x i E], where z is a fresh variable. module Main (main) where f z y = y : f z (y + z) main = print $ f 1 10 module Main (main) where f y = y : f (y + 1) main = print $ f 10
21/06/2015TCS26 Formalisation of Generalisation Formal definition of Generalisation using letrec Proof. Decompose the transformation into a number of sub steps, if each sub step is behaviour-preserving, then the transformation is behaviour-preserving.
21/06/2015TCS27 Formalisation of Generalisation Step1: add definition x = z.C[z], where x and z are fresh variables, and C[E]=E i. module Main (main) where f y = y : f (y +1) x z y = y : f ( y + z) main = print $ f 10 letrec x 1 =E 1,..., x i =E i, x = z.C[z],..., x n =E n in E 0 Step 2: Replace E i with x E. (Note: E i = x E)
21/06/2015TCS28 Step 2: Replace E i with x E. (Note: E i = x E) module Main (main) where f y = x 1 y x z y = y : f ( y + z) main = print $ f 10 letrec x 1 =E 1,..., x i = x E, x = z.C[z],..., x n =E n in E 0 Step 3: Unfolding x i in the right-hand side of x. Formalisation of Generalisation
21/06/2015TCS29 Formalisation of Generalisation Step 3: Unfolding x i in the right-hand side of x. module Main (main) where f y = x 1 y x z y = y : x 1 ( y + z) main = print $ f 10 letrec x 1 =E 1,..., x i = x E, x = z.C[z] [x_i:= x E],..., x n =E n in E 0 Step 4: In the definition of x, replace E with z, and prove this does not change the semantics of x E.
21/06/2015TCS30 Formalisation of Generalisation Step 4: In the definition of x, replace E with z. and prove this does not change the semantics of x E. module Main (main) where f y = x 1 y x z y = y : x z ( y + z) main = print $ f 10 letrec x 1 =E 1,..., x i = x E, x = z.C[z] [x_i:= x z],..., x n =E n in E 0 Step 5: Unfolding the occurrences of x i.
21/06/2015TCS31 Formalisation of Generalisation Step 5: Unfolding the occurrences of x i. module Main (main) where f y = x 1 y x z y = y : x z ( y + z) main = print $ x 1 10 letrec x 1 =E 1 [x i := x E],..., x i = x E, x = z.C[z] [x i := x z],..., x n =E n [x i := x E] in E 0 [x i := x E] Step 6: Remove the definition of x i.
21/06/2015TCS32 Formalisation of Generalisation Step 6: Remove the definition of x i. module Main (main) where x z y = y : x z ( y + z) main = print $ x 1 10 letrec x 1 =E 1 [x i := x E],..., x = z.C[z] [x i := x z],..., x n =E n [x i := x E] in E 0 [x i := x E] Step 7: Rename x to x i and simplify the substitution.
21/06/2015TCS33 Formalisation of Generalisation module Main (main) where f z y = y : f z ( y + z) main = print $ f 1 10 letrec x 1 =E 1 [x i := x E] [x:=x i ],..., x = z.C[z] [x i := x z] [x:=x i ],..., x n =E n [x i := x E] [x:=x i ] in E 0 [x i := x E] [x:=x i ] letrec x 1 = E 1 [x i := x i E],..., x i = z.C[z][x i :=x i z],..., x n = E n [x i := x i E] in E 0 [x i := x i E]
21/06/2015TCS34 Formalisation of Refactorings letrec has been extended to model the Haskell module system ( M ). The move a definition from one module to another refactoring has also been formalised using M.
21/06/2015TCS35 Future Work Adding more refactoring to HaRe. Making use of type information. Coping with modules without source code. More interaction between HaRe and the user. Further development of the formalisation of refactorings. Metric-based refactoring. Support for scripting refactorings. Porting HaRe to GHC...
21/06/2015TCS36 Questions?