How do Haskell developers use monads? A preliminary assessment
Ismael Figueroa
ismael.figueroa@pucv.cl
http://zeus.inf.ucv.cl/~ifigueroa
http://zeus.inf.ucv.cl/~ifigueroa/doku.php/research/empirical-monads
About this work
Programming with monads is perhaps one of the defining characteristics of the Haskell language.
However, despite the prevalence of monads in Haskell software, we are not aware of any empirical studies on how Haskell developers use monads in their software.
This work aims to bridge this gap in empirical knowledge.
According to our results, the State monad is by far the most used one.
Monads in a nutshell
Monads come from Category Theory (Moggi, ’91)
Notions of computation: a value of type A is different from a computation of type T A.
Monads provide a mechanism for reasoning about non-pure programs in the lambda calculus.
Monads in Functional Programming (Wadler, ’92)
Wadler proposes the use of monads for practical functional programming with notions of computation. In Haskell this is reflected by the Monad typeclass:
    class Monad m where
      return :: a -> m a
      (>>=)  :: m a -> (a -> m b) -> m b
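To make the typeclass concrete, here is a minimal sketch of a hand-rolled state monad that uses only the base library. The names `State`, `get`, `put` and `tick` are illustrative assumptions; this is not mtl's actual implementation, which is defined via the more general `StateT` transformer.

```haskell
-- A hand-rolled state monad: a computation is a function from an input
-- state to a result paired with the output state.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  -- Run the first computation, then feed its result and state to the next.
  State g >>= k = State $ \s ->
    let (a, s') = g s
    in runState (k a) s'

-- Hypothetical helpers mirroring the usual state operations.
get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- Increment a counter and return its previous value, written with
-- explicit (>>=) instead of do-notation.
tick :: State Int Int
tick = get >>= \n -> put (n + 1) >>= \_ -> return n

main :: IO ()
main = print (runState (tick >>= \a -> tick >>= \b -> return (a + b)) 10)
-- prints (21,12)
```

The explicit `>>=` chains in `tick` and `main` are what a `do` block would desugar to.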
mtl: monad transformers library
Haskell provides a standardized programming interface for monadic programming: the mtl library.
It is based on monad transformers (Liang+ ’96), a mechanism for modular composition of monads.
Each notion of computation is implemented in several modules, data types and typeclasses.
Monads provided by mtl:
Identity: pure computations
Error: computations that may fail
List: computations that may yield multiple non-deterministic values
Continuations: computations that can be suspended, passed around, and resumed
State: mutable access to a collection of values
Reader: read-only access to a collection of values
Writer: write-only access to a collection of values
RWS: combines State, Reader, and Writer
Monads as a “Design Pattern”
    dequeue :: StateT [Int] Identity Int
    dequeue = do
      queue <- get
      put (tail queue)
      return (head queue)
“Monad stack”: the StateT transformer applied to the Identity monad.
A do block is a sequence of applications of the monad's >>= function.
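Monads as a “Design Pattern”
    dequeue :: ErrorT String (StateT [Int] Identity) Int
    dequeue = do
      queue <- get
      put (tail queue)
      return (head queue)
“Monad stack”: the ErrorT transformer on top of the StateT transformer, applied to the Identity monad.
The code is uniform with respect to the monad stack in use, which avoids code repetition and boilerplate: the body of dequeue is unchanged even though the stack now includes ErrorT.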
Monads as a “Design Pattern”
    dequeue :: ErrorT String (StateT [Int] Identity) Int
    dequeue = do
      queue <- get
      if null queue
        then throwError "Empty queue"
        else do
          put (tail queue)
          return (head queue)
“Monad stack”: the ErrorT transformer on top of the StateT transformer, applied to the Identity monad.
The code is uniform with respect to the monad stack in use, which avoids code repetition and boilerplate.
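The stack above can be run by unwrapping each layer from the outside in. The following is a sketch under one stated assumption: it uses `ExceptT` from `Control.Monad.Except` in place of the slide's `ErrorT`, since recent mtl releases no longer ship `ErrorT`. The helper `runQueue` is a hypothetical name introduced for illustration.

```haskell
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.State (StateT, runStateT, get, put)
import Data.Functor.Identity (Identity, runIdentity)

-- ExceptT stands in for the slide's ErrorT (assumption about the mtl version).
dequeue :: ExceptT String (StateT [Int] Identity) Int
dequeue = do
  queue <- get
  case queue of
    []       -> throwError "Empty queue"
    (x : xs) -> put xs >> return x

-- Hypothetical helper: unwrap ExceptT, then StateT, then Identity.
runQueue :: ExceptT String (StateT [Int] Identity) a
         -> [Int]
         -> (Either String a, [Int])
runQueue action initial = runIdentity (runStateT (runExceptT action) initial)

main :: IO ()
main = do
  print (runQueue dequeue [1, 2, 3])  -- prints (Right 1,[2,3])
  print (runQueue dequeue [])         -- prints (Left "Empty queue",[])
```

With this layering the final state is returned even when the computation fails, because the error layer sits on top of the state layer.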
Research Questions
RQ1: How many packages directly depend on the mtl library? What is their distribution with respect to package metadata?
RQ2: What is the usage distribution of mtl’s monads in packages that directly import mtl?
RQ3: What is the situation of alternative implementations to the mtl, and their usage in existing packages?
Methodology
Methodology / Processing Pipeline
Part 1: simpler, answers just RQ1
Part 2: complex, answers RQ2 and RQ3
Methodology / Package Index
hackage.haskell.org is the de facto repository for open-source Haskell software.
The package index is a tar.gz file with the hierarchical structure of Hackage and the package metadata.
Methodology / Package Index
Each package has a folder.
Each subfolder is a version.
Each version is described by a .cabal file.
Methodology / .cabal files key/value metadata
Methodology / Initial Package Data
Name
Version
Stability
Categories
Dependencies
Modules provided
Main modules, for executables
Other info for further processing…
This data is enough to answer RQ1. It only requires processing the .cabal files, which is simple with the Cabal API.
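As a rough illustration of the key/value extraction, the sketch below reads flat `key: value` lines and checks whether `mtl` appears among the dependencies. It is a deliberate simplification that uses only base and does not use the Cabal API: real .cabal files have sections, continuation lines and conditionals that this ignores. The sample package and the helper names (`parseField`, `dependsOn`) are hypothetical.

```haskell
import Data.Char (isAlphaNum, isSpace, toLower)
import Data.Maybe (mapMaybe)

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropEnd . dropWhile isSpace
  where dropEnd = reverse . dropWhile isSpace . reverse

-- Split a flat "key: value" line at the first colon; keys are lower-cased.
parseField :: String -> Maybe (String, String)
parseField line =
  case break (== ':') line of
    (key, ':' : value)
      | not (null (trim key)) -> Just (map toLower (trim key), trim value)
    _ -> Nothing

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s =
  case break (== c) s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitOn c rest

-- Does any build-depends entry name the given package?
dependsOn :: String -> [(String, String)] -> Bool
dependsOn pkg fields =
  or [ pkgName dep == pkg
     | (key, value) <- fields
     , key == "build-depends"
     , dep <- splitOn ',' value ]
  where pkgName = takeWhile (\ch -> isAlphaNum ch || ch == '-') . trim

-- Hypothetical package description, flattened to one field per line.
sample :: String
sample = unlines
  [ "name: example-pkg"
  , "version: 0.1.0.0"
  , "stability: experimental"
  , "category: Data, Control"
  , "build-depends: base >= 4 && < 5, mtl >= 2.1"
  ]

main :: IO ()
main = do
  let fields = mapMaybe parseField (lines sample)
  print (lookup "name" fields)     -- prints Just "example-pkg"
  print (dependsOn "mtl" fields)   -- prints True
```

A production pipeline would rely on the Cabal library's own parser, as the slide indicates, rather than on line splitting.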
Methodology / Package Archives Processing
Full source code of the package
Custom analysis program, using haskell-src-exts
Set of modules imported by the package modules
This is quite a complex step!
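The idea of collecting imported modules can be approximated without a full parser. The sketch below is a crude, line-based stand-in for the haskell-src-exts analysis, using only base; it assumes each import sits on its own line and will miss imports affected by CPP, package-qualified imports, or unusual layout. The function name `importedModules` is hypothetical.

```haskell
import Data.List (nub, sort)

-- Collect the module names that follow "import" (optionally "qualified"),
-- one per source line. Duplicates are removed and the result is sorted.
importedModules :: String -> [String]
importedModules src =
  sort (nub [ m | l <- lines src, Just m <- [moduleOf (words l)] ])
  where
    moduleOf ("import" : "qualified" : m : _) = Just (clean m)
    moduleOf ("import" : m : _)               = Just (clean m)
    moduleOf _                                = Nothing
    -- Drop an import list glued to the name, as in "Data.Map(Map)".
    clean = takeWhile (/= '(')

-- Hypothetical module source used as input.
exampleSource :: String
exampleSource = unlines
  [ "module Queue where"
  , "import Control.Monad.State (get, put)"
  , "import qualified Data.Map as M"
  , "import Data.List(sort)"
  , "import Control.Monad.State (StateT)"
  ]

main :: IO ()
main = mapM_ putStrLn (importedModules exampleSource)
```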
Methodology / Packages with monad usage
Initial Package Data, plus:
Set of imported modules
Indicator flags, 0 or 1, for each module that is available in the mtl
With this we can answer RQ2 and RQ3! But it is complex and costly: we need to download and analyse the source code of each package version.
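The indicator flags can be sketched as a lookup of each tracked mtl module name in a package's set of imports. The module list below is a subset chosen for illustration, and matching on exact module names is an assumption; the study's actual matching rules (for example, for submodules) are not stated here.

```haskell
-- mtl module names tracked by this sketch (illustrative subset).
mtlModules :: [String]
mtlModules =
  [ "Control.Monad.Identity"
  , "Control.Monad.Error"
  , "Control.Monad.List"
  , "Control.Monad.Cont"
  , "Control.Monad.State"
  , "Control.Monad.Reader"
  , "Control.Monad.Writer"
  , "Control.Monad.RWS"
  , "Control.Monad.Trans"
  ]

-- One 0/1 flag per tracked module: 1 if the package imports it.
usageFlags :: [String] -> [(String, Int)]
usageFlags imports = [ (m, fromEnum (m `elem` imports)) | m <- mtlModules ]

-- Number of tracked mtl modules a package uses.
modulesUsed :: [String] -> Int
modulesUsed = sum . map snd . usageFlags

main :: IO ()
main = do
  -- Hypothetical import set for a single package.
  let imports = ["Data.List", "Control.Monad.State", "Control.Monad.Reader"]
  mapM_ print (usageFlags imports)
  print (modulesUsed imports)  -- prints 2
```

Summing the flags per package gives the kind of per-package count behind the RQ2 distribution, and summing per module gives the usage ranking.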
Results
Results / RQ1
How many packages directly depend on the mtl library? What is their distribution with respect to package metadata?
[Figures: mtl-packages by stability, threshold at 3%; stability distribution of relevant categories; mtl-packages by category, threshold at 5%]
Of the 11171 packages analysed, we found 2803 mtl-packages.
Some packages have several categories; after considering this multiplicity we got 3670 unique package/category entries.
Results / RQ2
What is the usage distribution of mtl’s monads in packages that directly import mtl?
Surprisingly, 600+ packages import mtl but use no monad.
Most packages use between 0 and 3 monads, and only outliers use more than 6.
The State monad is the most used one, followed by the Reader monad.
The Trans typeclass, used to create monad transformers, is also widely used.
Results / RQ3
What is the situation of alternative implementations to the mtl, and their usage in existing packages?
We found 104 packages that do not import mtl but use a module whose name matches that of an mtl module.
78 of these packages use an alternative implementation of mtl.
The remaining 26 appear to implement their own customized version of the monads they use.
Limitations
Only the “latest” version of each package was analysed
11171 packages
~3GB of data
436 packages with parsing errors
451231 provided modules
Summary / Conclusions
packages analysed: 11171
mtl-packages: 2803 (3670 single category)
alt-mtl packages: 78 (104 with custom impl)
parsed modules: 451231
pkgs with parsing errors: 438
Around 25% of sampled packages import the mtl.
The State monad is the most used one.
The monad transformers module is also widely used.
Around 1% of sampled packages use alternatives to mtl.
We would like to answer qualitative questions:
What are developers actually doing with the monad transformer class?
How and why are developers creating their own monads?
How does the usage of monads evolve over time and across versions?
http://zeus.inf.ucv.cl/~ifigueroa/doku.php/research/empirical-monads