PROGRAMMING IN HASKELL
Types, Modules, and I/O

Based on lecture notes by Graham Hutton, the book “Learn You a Haskell for Great Good”, and a few other sources

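Binary Trees

In computing, it is often useful to store data in a two-way branching structure or binary tree.

[Figure: a binary tree with root 5; its left child 3 has leaves 1 and 4, and its right child 7 has leaves 6 and 9.]
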
Using recursion, a suitable new type to represent such binary trees can be declared by:

    data Tree = Leaf Int
              | Node Tree Int Tree

For example, the tree on the previous slide would be represented as follows:

    Node (Node (Leaf 1) 3 (Leaf 4))
         5
         (Node (Leaf 6) 7 (Leaf 9))

We can now define a function that decides if a given integer occurs in a binary tree:

    occurs :: Int -> Tree -> Bool
    occurs m (Leaf n)     = m==n
    occurs m (Node l n r) = m==n
                            || occurs m l
                            || occurs m r

But in the worst case, when the integer does not occur, this function traverses the entire tree.

Now consider the function flatten that returns the list of all the integers contained in a tree:

    flatten :: Tree -> [Int]
    flatten (Leaf n)     = [n]
    flatten (Node l n r) = flatten l
                           ++ [n]
                           ++ flatten r

A tree is a search tree if it flattens to a list that is ordered. Our example tree is a search tree, as it flattens to the ordered list [1,3,4,5,6,7,9].

Search trees have the important property that when trying to find a value in a tree we can always decide which of the two sub-trees it may occur in:

    occurs m (Leaf n)           = m==n
    occurs m (Node l n r) | m==n = True
                          | m<n  = occurs m l
                          | m>n  = occurs m r

This new definition is more efficient, because it only traverses one path down the tree.

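The pieces from the last few slides can be collected into one small file to try in ghci. This is only a sketch: the names isSearchTree and exampleTree are made up for illustration and are not part of the original notes, and the last guard is written with otherwise rather than m>n.

```haskell
data Tree = Leaf Int | Node Tree Int Tree

-- All the integers in a tree, left to right.
flatten :: Tree -> [Int]
flatten (Leaf n)     = [n]
flatten (Node l n r) = flatten l ++ [n] ++ flatten r

-- Hypothetical helper: a tree is a search tree if its
-- flattened list is ordered.
isSearchTree :: Tree -> Bool
isSearchTree t = and (zipWith (<=) xs (drop 1 xs))
  where xs = flatten t

-- Search-tree lookup: follows a single path down the tree.
occurs :: Int -> Tree -> Bool
occurs m (Leaf n) = m == n
occurs m (Node l n r)
  | m == n    = True
  | m < n     = occurs m l
  | otherwise = occurs m r

-- The example tree from the slides.
exampleTree :: Tree
exampleTree = Node (Node (Leaf 1) 3 (Leaf 4)) 5 (Node (Leaf 6) 7 (Leaf 9))
```

Note that this version of occurs is only correct when isSearchTree holds for its argument; on an unordered tree it may miss values that are present.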
Exercise

A binary tree is complete if the two sub-trees of every node are of equal size. Define a function that decides if a binary tree is complete.

For reference, the example tree and the definitions so far:

    Node (Node (Leaf 1) 3 (Leaf 4))
         5
         (Node (Leaf 6) 7 (Leaf 9))

    data Tree = Leaf Int
              | Node Tree Int Tree

    occurs :: Int -> Tree -> Bool
    occurs m (Leaf n)     = m==n
    occurs m (Node l n r) = m==n
                            || occurs m l
                            || occurs m r

Modules

So far, we’ve been using built-in functions provided in the Haskell prelude. This is a subset of a larger library that is provided with any installation of Haskell. (Google for Hoogle to see a handy search engine for these.)

Examples of other modules:
- lists
- concurrent programming
- complex numbers
- char
- sets
- …

Example: Data.List

To load a module, we need to import it:

    import Data.List

All the functions in this module are immediately available:

    numUniques :: (Eq a) => [a] -> Int
    numUniques = length . nub

Here . is function composition, and nub is a function in Data.List that removes duplicates from a list.

You can also load modules from the command prompt:

    ghci> :m + Data.List

Or several at once:

    ghci> :m + Data.List Data.Map Data.Set

Or import only some, or all but some:

    import Data.List (nub, sort)
    import Data.List hiding (nub)

If duplication of names is an issue, we can keep the imported names in their own namespace:

    import qualified Data.Map

This imports the functions, but we have to prefix them with Data.Map to use them, like Data.Map.filter.

When typing Data.Map gets a bit long, we can provide an alias:

    import qualified Data.Map as M

And now we can just type M.filter, and the normal list filter will just be filter.

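A minimal sketch of the two filters living side by side. The map contents and the names ages, adults, and evens are invented for illustration, and this assumes the containers package (which provides Data.Map) is available, as it is in a standard GHC installation.

```haskell
import qualified Data.Map as M

-- Hypothetical data, for illustration only.
ages :: M.Map String Int
ages = M.fromList [("ann", 31), ("bob", 17), ("cat", 45)]

-- M.filter is the Data.Map version: it keeps the entries
-- whose values satisfy the predicate.
adults :: M.Map String Int
adults = M.filter (>= 18) ages

-- The unqualified filter is still the ordinary list function.
evens :: [Int]
evens = filter even [1 .. 10]
```

Without the qualified import, the two names would clash and any use of filter would be reported as ambiguous.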
Data.List has a lot more functionality than we’ve seen. A few examples:

    ghci> intersperse '.' "MONKEY"
    "M.O.N.K.E.Y"
    ghci> intersperse 0 [1,2,3,4,5,6]
    [1,0,2,0,3,0,4,0,5,0,6]
    ghci> intercalate " " ["hey","there","guys"]
    "hey there guys"
    ghci> intercalate [0,0,0] [[1,2,3],[4,5,6],[7,8,9]]
    [1,2,3,0,0,0,4,5,6,0,0,0,7,8,9]

And even more:

    ghci> transpose [[1,2,3],[4,5,6],[7,8,9]]
    [[1,4,7],[2,5,8],[3,6,9]]
    ghci> transpose ["hey","there","guys"]
    ["htg","ehu","yey","rs","e"]
    ghci> concat ["foo","bar","car"]
    "foobarcar"
    ghci> concat [[3,4,5],[2,3,4],[2,1,1]]
    [3,4,5,2,3,4,2,1,1]

And even more:

    ghci> and $ map (>4) [5,6,7,8]
    True
    ghci> and $ map (==4) [4,4,4,3,4]
    False
    ghci> any (==4) [2,3,5,6,1,4]
    True
    ghci> all (>4) [6,9,10]
    True

A nice example: adding polynomials

Polynomials are often represented as vectors of coefficients: 8x^3 + 5x^2 + x - 1 is [8,5,1,-1].

So we can easily use list functions to add these vectors:

    ghci> map sum $ transpose [[0,3,5,9],[10,0,0,9],[8,5,1,-1]]
    [18,8,6,17]

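The ghci example above works because all three coefficient lists have the same length. As a hedged sketch (the name addPolys and the padding step are additions, not part of the original notes), shorter polynomials can be padded with leading zeros first so that coefficients of the same power line up:

```haskell
import Data.List (transpose)

-- Add polynomials given as coefficient lists, highest power first.
-- Shorter lists are padded on the left with zeros so that equal
-- powers end up in the same column before transposing.
addPolys :: [[Int]] -> [Int]
addPolys ps = map sum (transpose (map pad ps))
  where
    width = maximum (0 : map length ps)
    pad p = replicate (width - length p) 0 ++ p
```

For instance, addPolys [[8,5,1,-1],[2,3]] adds 2x + 3 to the cubic and gives [8,5,3,2]. Without the padding, transpose would pair the 2 with the 8, which is not what we want.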
There are a ton of these functions, so I could spend all semester covering just lists.

More examples: group, sort, dropWhile, takeWhile, partition, isPrefixOf, find, findIndex, delete, words, insert,…

Instead, I’ll make sure to post a link to a good overview of lists on the webpage, in case you need them.

In essence, if it’s a useful thing to do to a list, Haskell probably supports it!

The Data.Char module includes a lot of useful functions that will look similar to Python, actually.

Examples: isAlpha, isLower, isSpace, isDigit, isPunctuation,…

    ghci> all isAlphaNum "bobby283"
    True
    ghci> all isAlphaNum "eddy the fish!"
    False
    ghci> groupBy ((==) `on` isSpace) "hey guys its me"
    ["hey"," ","guys"," ","its"," ","me"]

(Here groupBy comes from Data.List and on comes from Data.Function.)

The Data.Char module has a datatype, GeneralCategory, that enumerates the Unicode categories a character can belong to. The function generalCategory returns the category of a character. (This is a bit like the Ordering type for comparisons, whose values are LT, EQ, and GT.)

    ghci> generalCategory ' '
    Space
    ghci> generalCategory 'A'
    UppercaseLetter
    ghci> generalCategory 'a'
    LowercaseLetter
    ghci> generalCategory '.'
    OtherPunctuation
    ghci> generalCategory '9'
    DecimalNumber
    ghci> map generalCategory " \t\nA9?|"
    [Space,Control,Control,UppercaseLetter,DecimalNumber,OtherPunctuation,MathSymbol]

There are also functions that can convert between Ints and Chars:

    ghci> map digitToInt "FF85AB"
    [15,15,8,5,10,11]
    ghci> intToDigit 15
    'f'
    ghci> intToDigit 5
    '5'
    ghci> chr 97
    'a'
    ghci> map ord "abcdefgh"
    [97,98,99,100,101,102,103,104]

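As a small illustration of what digitToInt and intToDigit are good for, here is a sketch of hexadecimal conversion in both directions. The names hexToInt and intToHex are made up for this example, and hexToInt assumes its input contains only hex digits (digitToInt raises an error otherwise).

```haskell
import Data.Char (digitToInt, intToDigit)

-- Read a string of hex digits as a number, most significant digit first.
-- Assumes every character is a valid hex digit.
hexToInt :: String -> Int
hexToInt = foldl (\acc c -> acc * 16 + digitToInt c) 0

-- Write a number as lowercase hex digits.
intToHex :: Int -> String
intToHex n
  | n < 0     = '-' : intToHex (negate n)
  | n == 0    = "0"
  | otherwise = reverse (go n)
  where
    -- Produces the digits least significant first.
    go 0 = ""
    go k = intToDigit (k `mod` 16) : go (k `div` 16)
```

For example, hexToInt "FF" is 255 and intToHex 255 is "ff"; digitToInt accepts both upper and lower case, while intToDigit always produces lower case.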
Neat application: Caesar ciphers

A primitive encryption cipher which encodes messages by shifting them a fixed amount in the alphabet.

Example: hello with a shift of 3 becomes khoor.

    import Data.Char (chr, ord)

    encode :: Int -> String -> String
    encode shift msg =
        let ords = map ord msg
            shifted = map (+ shift) ords
        in  map chr shifted

Now to use it:

    ghci> encode 3 "Heeeeey"
    "Khhhhh|"
    ghci> encode 4 "Heeeeey"
    "Liiiii}"
    ghci> encode 1 "abcd"
    "bcde"
    ghci> encode 5 "Marry Christmas! Ho ho ho!"
    "Rfww~%Hmwnxyrfx&%Mt%mt%mt&"

Decoding just reverses the encoding:

    decode :: Int -> String -> String
    decode shift msg = encode (negate shift) msg

    ghci> encode 3 "Im a little teapot"
    "Lp#d#olwwoh#whdsrw"
    ghci> decode 3 "Lp#d#olwwoh#whdsrw"
    "Im a little teapot"
    ghci> decode 5 . encode 5 $ "This is a sentence"
    "This is a sentence"

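The encode function above shifts raw character codes, which is why spaces turn into # and y turns into |. A classical Caesar cipher instead wraps around within the alphabet. The following is a sketch of that variant, not the version from the notes; the names shiftLetter, encodeWrap, and decodeWrap are invented here, and only ASCII letters are shifted.

```haskell
import Data.Char (chr, isAsciiLower, isAsciiUpper, ord)

-- Shift one character, wrapping around within a-z or A-Z.
-- Anything that is not an ASCII letter is left unchanged.
shiftLetter :: Int -> Char -> Char
shiftLetter n c
  | isAsciiLower c = rotate 'a'
  | isAsciiUpper c = rotate 'A'
  | otherwise      = c
  where
    -- mod keeps the offset in the range 0..25, also for negative shifts.
    rotate base = chr (ord base + (ord c - ord base + n) `mod` 26)

encodeWrap :: Int -> String -> String
encodeWrap n = map (shiftLetter n)

-- As before, decoding is encoding with the negated shift.
decodeWrap :: Int -> String -> String
decodeWrap n = encodeWrap (negate n)
```

With this version, encodeWrap 3 "xyz" gives "abc", and punctuation and spaces pass through untouched.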
Making our own modules

We specify our own modules at the beginning of a file. For example, if we had a set of geometry functions:

    module Geometry
    ( sphereVolume
    , sphereArea
    , cubeVolume
    , cubeArea
    , cuboidArea
    , cuboidVolume
    ) where

Then, we put the functions that the module defines:

    sphereVolume :: Float -> Float
    sphereVolume radius = (4.0 / 3.0) * pi * (radius ^ 3)

    sphereArea :: Float -> Float
    sphereArea radius = 4 * pi * (radius ^ 2)

    cubeVolume :: Float -> Float
    cubeVolume side = cuboidVolume side side side

    …

Note that we can have “private” helper functions, also. Here rectangleArea is not in the export list, so it can only be used inside the module:

    cuboidVolume :: Float -> Float -> Float -> Float
    cuboidVolume a b c = rectangleArea a b * c

    cuboidArea :: Float -> Float -> Float -> Float
    cuboidArea a b c = rectangleArea a b * 2 + rectangleArea a c * 2 + rectangleArea c b * 2

    rectangleArea :: Float -> Float -> Float
    rectangleArea a b = a * b

Can also nest these. Make a folder called Geometry, with 3 files inside it:
- Sphere.hs
- Cuboid.hs
- Cube.hs

Each will hold a separate group of functions. To load:

    import Geometry.Sphere

Or (if functions have same names):

    import qualified Geometry.Sphere as Sphere

The modules:

    module Geometry.Sphere
    ( volume
    , area
    ) where

    volume :: Float -> Float
    volume radius = (4.0 / 3.0) * pi * (radius ^ 3)

    area :: Float -> Float
    area radius = 4 * pi * (radius ^ 2)

    module Geometry.Cuboid
    ( volume
    , area
    ) where

    volume :: Float -> Float -> Float -> Float
    volume a b c = rectangleArea a b * c

    …

File I/O

So far, we’ve worked mainly at the prompt, and done very little true input or output. This is logical in a functional language, since nothing has side effects!

However, this is a problem for I/O, since the whole point is to take input (and hence change some value) and then output something (which requires changing the state of the screen or other I/O device).

Luckily, Haskell offers work-arounds that separate out the more imperative I/O.

A simple example: save the following file as helloworld.hs

    main = putStrLn "hello, world"

Now we actually compile a program:

    $ ghc --make helloworld
    [1 of 1] Compiling Main ( helloworld.hs, helloworld.o )
    Linking helloworld ...
    $ ./helloworld
    hello, world

What are these functions?

    ghci> :t putStrLn
    putStrLn :: String -> IO ()
    ghci> :t putStrLn "hello, world"
    putStrLn "hello, world" :: IO ()

So putStrLn takes a string and returns an I/O action (which has a result type of (), the empty tuple).

In Haskell, an I/O action is one with a side effect, usually either reading or printing. An action usually also carries some kind of return value, where () is a dummy value for no return.

An I/O action will only be performed when you give it the name “main” and then run the program.

A more interesting example:

    main = do
        putStrLn "Hello, what's your name?"
        name <- getLine
        putStrLn ("Hey " ++ name ++ ", you rock!")

Notice the do statement, which gives a more imperative style. Each step is an I/O action, and these glue together.

More on getLine:

    ghci> :t getLine
    getLine :: IO String

This is the first I/O action we’ve seen that doesn’t have an empty tuple type: it has a String. Once the string is returned, we use the <- to bind the result to the specified identifier.

Notice this is the first non-functional action we’ve seen, since it will NOT give the same value every time it is run! This is called “impure” code, and the value name is “tainted”.

An invalid example:

    nameTag = "Hello, my name is " ++ getLine

What’s the problem? Well, ++ requires both parameters to have the same type. What is the return type of getLine?

Another word of warning: what does the following do?

    name = getLine

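Returning to the invalid nameTag example: one way to make it type-check is to unwrap the String with <- inside a do block and only then use ++ on it. This is a sketch of one possible fix, and the helper name makeTag is made up for illustration.

```haskell
-- Pure part: ordinary string concatenation on two Strings.
makeTag :: String -> String
makeTag name = "Hello, my name is " ++ name

-- Impure part: read the line first, then build the tag from the
-- plain String that <- gives us. The result is an IO String,
-- not a String.
nameTag :: IO String
nameTag = do
  name <- getLine
  return (makeTag name)
```

Keeping makeTag separate means the string logic stays pure and can be tried at the prompt without typing any input.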
Just remember that I/O actions are only performed in a few possible places:
- In a main function
- Inside a bigger I/O block that we have composed with a do (and remember that the last action can’t be bound to a name, since its result is the result of the whole block)
- At the ghci prompt:

    ghci> putStrLn "HEEY"
    HEEY

You can use let statements inside do blocks, to call other functions (and with no “in” part required):

    import Data.Char

    main = do
        putStrLn "What's your first name?"
        firstName <- getLine
        putStrLn "What's your last name?"
        lastName <- getLine
        let bigFirstName = map toUpper firstName
            bigLastName = map toUpper lastName
        putStrLn $ "hey " ++ bigFirstName ++ " " ++ bigLastName ++ ", how are you?"

Note that <- is for I/O, and let for expressions.

Return in Haskell: NOT like other languages.

    main = do
        line <- getLine
        if null line
            then return ()
            else do
                putStrLn $ reverseWords line
                main

    reverseWords :: String -> String
    reverseWords = unwords . map reverse . words

Note: reverseWords = unwords . map reverse . words is the same as reverseWords st = unwords (map reverse (words st))

What is return?

Does NOT signal the end of execution! Return instead makes an I/O action out of a pure value.

    main = do
        a <- return "hell"
        b <- return "yeah!"
        putStrLn $ a ++ " " ++ b

In essence, return is the opposite of <-. Instead of “unwrapping” I/O Strings, it wraps them.

Last example was a bit redundant, though. We could use a let instead:

    main = do
        let a = "hell"
            b = "yeah"
        putStrLn $ a ++ " " ++ b

Usually, you’ll use return to create I/O actions that don’t do anything (but you have to have one anyway, like in an if-then-else), or for the last line of a do block, so it returns some value we want.

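A short sketch of both points about return. The names describe and notAnExit are invented for illustration. The first uses return on the last line of a do block to hand back a value; the second shows that a return in the middle of a block does not stop anything.

```haskell
-- Hypothetical helper: the last line of the do block uses return
-- to turn a pure String into the result of the action.
describe :: Int -> IO String
describe n = do
  let label = if even n then "even" else "odd"
  return (show n ++ " is " ++ label)

-- return does not end the block: both lines run, and the result
-- of the whole action is the result of the last line.
notAnExit :: IO Int
notAnExit = do
  _ <- return 1
  return 2
```

Binding with s <- describe 4 inside another do block gives s the value "4 is even", and running notAnExit yields 2, not 1.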
Other I/O functions:
- print: works on any type in the Show class (it calls show first, then prints the result)
- putStr: like putStrLn, but no newline
- putChar and getChar

    main = do
        print True
        print 2
        print "haha"
        print 3.2
        print [3,4,3]

    main = do
        c <- getChar
        if c /= ' '
            then do
                putChar c
                main
            else return ()

More advanced functionality is available in Control.Monad:

    import Control.Monad
    import Data.Char

    main = forever $ do
        putStr "Give me some input: "
        l <- getLine
        putStrLn $ map toUpper l

(Will indefinitely ask for input and print it back out in upper case.)

Other functions:
- sequence: takes a list of I/O actions and does them one after the other
- mapM: takes a function (which returns an I/O action) and maps it over a list

Others available in Control.Monad:
- when: takes a Boolean and an I/O action. If the Boolean is true, it returns the same I/O action, and if false, it does a return () instead

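The three functions just listed can be seen together in one small sketch. The names shout and demo are made up for this example; sequence, mapM, and mapM_ come from the Prelude, and when from Control.Monad.

```haskell
import Control.Monad (when)

-- Hypothetical action-returning function to map over a list.
shout :: String -> IO String
shout s = return (s ++ "!")

demo :: Bool -> IO [String]
demo verbose = do
  -- sequence runs a list of actions in order and collects the results.
  xs <- sequence [return "a", return "b"]
  -- mapM applies shout to every element and collects the results.
  ys <- mapM shout xs
  -- when only performs the printing if verbose is True;
  -- otherwise it behaves like return ().
  when verbose (mapM_ putStrLn ys)
  return ys
```

Running demo True prints a! and b! on separate lines; demo False prints nothing. In both cases the result is ["a!","b!"].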
System Level programming

Scripting functionality deals with I/O as a necessity. The module System.Environment has several functions to help with this:
- getArgs: returns a list of the arguments that the program was run with
- getProgName: returns the string which is the program name

(Note: I’ll be assuming you compile using “ghc --make myprogram” and then run “./myprogram”. But you could also do “runhaskell myprogram.hs”.)

An example:

    import System.Environment
    import Data.List

    main = do
        args <- getArgs
        progName <- getProgName
        putStrLn "The arguments are:"
        mapM putStrLn args
        putStrLn "The program name is:"
        putStrLn progName

The output:

    $ ./arg-test first second w00t "multi word arg"
    The arguments are:
    first
    second
    w00t
    multi word arg
    The program name is:
    arg-test

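Arguments always arrive as Strings, so a script usually has to parse them. Below is a sketch, with invented names sumArgs and demoSumArgs, that adds up whichever arguments are integers and silently skips the rest. It uses readMaybe from Text.Read and mapMaybe from Data.Maybe; withArgs from System.Environment is used only to try the action out without going through the command line.

```haskell
import Data.Maybe (mapMaybe)
import System.Environment (getArgs, withArgs)
import Text.Read (readMaybe)

-- Sum the command-line arguments that parse as Ints.
-- Arguments that are not integers are ignored.
sumArgs :: IO Int
sumArgs = do
  args <- getArgs
  return (sum (mapMaybe readMaybe args))

-- Run sumArgs as if the program had been started with these
-- three arguments (handy for experimenting in ghci).
demoSumArgs :: IO Int
demoSumArgs = withArgs ["1", "2", "w00t"] sumArgs
```

Here demoSumArgs yields 3, since "w00t" does not parse as an Int and is dropped. Whether to skip bad arguments or report them is a design choice; skipping just keeps the sketch short.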