Functional Programming And Why It Matters
In the beginning… There were the forces of light… There were the forces of darkness… Actually, there were two very early higher level languages, Fortran and Lisp Fortran was an imperative language, and became mainstream Lisp was a functional language, with fewer followers Both languages had many descendants Fortran: C, Algol, Pascal, Python, etc., plus modern versions of Fortran Lisp: Erlang, Haskell, OCaml, etc., plus modern versions of Lisp (such as Clojure)
Why did Fortran pull ahead? Fortran was slightly earlier in time A great deal of work was put into optimizing Fortran, so that it could compete with assembly language Fortran was (and is) better for heavy number crunching, which made it more suitable for military uses
The Lisp influence “All languages converge to Lisp.” -- Anonymous Modern languages all contain many features that originated in Lisp What did Lisp have that Fortran didn’t? Recursion Automatic garbage collection Data structures other than arrays What does Lisp have that Java 7 still doesn’t? Functions (not the same as methods!) Ability to treat functions as values Immutable (persistent) data structures Macros C has “macros”, but they are difficult and badly integrated into the language Homoiconicity
Functions as values All modern languages have recursion, automatic garbage collection (except C++), and a variety of data structures Many languages now have the ability to treat a function as an ordinary value (like an integer or a string) There is a way to write a “literal” (anonymous) function Functions can be stored in variables and in collections Functions can be passed as parameters to functions Functions can be returned as the result of a function There are operators to combine functions into new functions
Functions in FP languages Given a set of input values, a function produces an output value Given the same input values, a function always produces the same output value Functions can use only the information provided by their parameters This excludes “functions” that return the date, the time, or a random number Functions have no side effects Functions don’t do input or output Functions don’t change the values of any variables or data Functions only return a value Consequently, functions are easier to reason about To understand a function, you need examine only the function itself A function can use other functions, and of course you need to know what those functions are supposed to compute (but nothing about how they do it) In addition, functions can be called in any order, including in parallel
Immutable and persistent data structures An immutable data structure is one that, once created, cannot be modified Immutable data structures can (usually) be copied, with modifications, to create a new version The modified version takes up as much memory as the original version If all data is immutable, we may need infeasible amounts of memory A persistent data structure is one that, when modified, retains both the old and the new values Persistent data structures are effectively immutable, in that prior references to it do not see any change Modifying a persistent data structure may copy part of the original, but the new version shares memory with the original We generally talk about functional programming as having immutable data, but really it’s persistent
Lists Lists are the original persistent data structures, and are very heavily used in functional programming w [w, x, y, z] x z y [x, y, z] [y, z] The tail of [x, y, z] is [y, z]. Adding an element w to [x, y, z] doesn’t change [x, y, z] itself. 8
Immutable values Fundamental to functional programming is that all data is immutable or persistent You can only compute new data from it Structure sharing makes this feasible Consequences: Lists, not arrays, are the fundamental data structure Arrays are designed for random access; lists are designed for working from one end Complex data structures can be built more easily from lists than from arrays Recursion replaces loops as a fundamental “control structure” You never need to protect data with mutexes (locks)
“No silver bullet” There are language zealots that will try to convince you that some particular language (usually Lisp) will solve all the worlds problems. If Lisp is so great, why isn’t everybody using it? In a sense, both Lisp and Prolog are “perfect,” or nearly so Both are direct computer implementations of beautiful mathematical structures Both are extremely powerful in certain domains “In theory, there is no difference between theory and practice. But in practice, there is.” -- Jan L. A. van de Snepscheut Functional programming will not solve all the world’s problems, but… It solves an awful lot of problems when doing concurrent programming It provides some really nice tools even when you aren’t doing concurrent programming
Working with lists Arrays: Loop over the array, doing something with each element in turn Lists: Do something with the head, and recur with the tail Head: The first element in the list Tail: The list that remains when you step past the head. Occasionally you may need to work at the “wrong end” of a list General solution: Reverse the list, work at the head, and reverse the result
Basic list operations There aren’t very many Get the head of the list (hd in Erlang) Get the tail of the list (tl in Erlang) Add a new element E to the list L ([E|L] in Erlang) Test if a list is empty (L==[] in Erlang)
Recursion on lists Basic strategy: Return some value if the list is empty Do something with the head Recur with the tail Finding the length of a list (Do It Yourself version): myLength([]) -> 0; myLength([_|T]) -> 1 + myLength(T). Again, but with tail recursion: myLengthTR(List) -> myLengthHelper(0, List). myLengthHelper(Acc, []) -> Acc; myLengthHelper(Acc, [_|T]) -> myLengthHelper(Acc + 1, T).
Standard higher-order functions A higher-order function is a function that (1) takes a function as a parameter, or (2) returns a function as its value, or both map(Fun, List) Applies Fun to each element of List, returning a list of the results. The result may contain values of a type different that those in List . filter(Pred, List) Returns a list of the elements of List that satisfy the predicate Pred. The result will be a sublist of List . The following standard function comes in a variety of flavors—whether it works left to right or right to left, and whether it needs an explicit accumulator foldl(Fun, Acc, List) Calls Fun on successive pairs of elements of List , starting with Acc The type returned is the type of Acc
More higher-order functions all(Pred, List) any(Pred, List) takewhile(Pred, List) dropwhile(Pred, List) flatten(DeepList) flatmap(Fun, List) foreach(Fun, List) partition(Pred, List) zip(List1, List2) unzip(List)
takewhile in Java Suppose you have a list of numbers, and you want a new list consisting of only the positive numbers at the beginning of the given list List newList<Integer> = new LinkedList<>(); for (int e : oldList) { if (e <= 0) break; newList.append(e); } That’s not as easy as takewhile(fun(X) -> X > 0 end, List) but you can work through it without too much difficulty Suppose you also want to get only the even numbers at the beginning, or only the prime numbers, or only ones less than 100 You cannot reuse the above Java code; you have to do it all over again!
Loop elimination Good code is clear and easy to understand takewhile(fun(X) -> X > 0 end, List) is clear and easy to understand (once you know the syntax) Loops always need to be worked through in order to understand them (well, almost always) for (int i; i < array.length; i++) array[i] = 0; is pretty clear Functional programming languages typically provide quite a few very useful higher-order functions In a lot of cases, these functions can replace loops FP languages may also provide list comprehensions, which again are clearer than loops, and can often replace them When all else fails, loops can be replaced by recursion, which is (arguably) no harder to understand than loops
Functions that return functions Erlang has a built-in version of quicksort, but you could write your own: quicksort([]) -> []; quicksort([H | T]) -> quicksort( [ X || X <- T, X < H ]) ++ [H] ++ quicksort([ X || X <- T, X >= H ]). But what if you wanted to sort something other than numbers, say, student records? The key point to notice is that X < H (and X >= H, which is really just not(X < H)) is just a binary predicate You could rewrite this function to pass in some binary predicate You could also write a function that takes in just a binary predicate, and returns a sort function
Summary In a functional language, especially a “pure” one, you lose: Global variables The ability to change the values of variables The ability to write loops You gain: The ability to treat functions just like any other values A useful set of higher-order functions More ways to avoid code duplication (DRY) Far better ways to deal with concurrency
The End
I’m thinking of something. Animal Vegetable Mineral 16