Debugging Haskell by Observing Intermediate Data Structures
Andy Gill
Galois Connections
andy@galconn.com
www.cse.ogi.edu/~andy

Debugging any program…
- Debugging is locating a misunderstanding
  - Finding the difference between
    - What you told the computer to do
    - What you think you told the computer to do
  - Involves running code
  - Sometimes about finding out what someone else told the computer to do
- A debugger must never, never lie
  - You are gathering evidence; the debugger is under oath

Traditional debugging tools
- The "printf" debuggers
  - Insert printf statements into code
  - Generally the dump-and-inspect approach
  - Analogy: written statements
- CLI or GUI interactive debuggers
  - GDB
  - Visual Studio
  - Analogy: witnesses

Tracer debuggers
- Trace reduction steps, producing extra output
- Present the internal reduction graph dynamically
- Two ways of getting the trace, both compiler specific
  - Instrumenting the code via transformations
  - Using a modified internal reduction interpreter
  - … There is another way …
- Two major technical problems
  - The size of the traces can be huge
  - The structures that the user might browse are also huge

Algorithmic debugging
- Also called declarative debugging
- Semi-automated binary search
  - The system asks Yes/No questions, starting at "main"
  - If a function is giving incorrect results
    - One of the sub-functions must be faulty
    - Or this function must be faulty
- Can use oracles to good effect
  - Expensive asserts
- Strange interactions with IO
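
The search described above can be sketched in a few lines. This is only an illustration of the idea, not how Freja or Buddha work: the `Call` record, the `Oracle` type, and `locate` are hypothetical names, and the oracle stands in for the user answering Yes/No questions.

```haskell
-- Hypothetical record of one function call and the calls it made.
data Call = Call
  { callName :: String   -- description of the call, e.g. "natural 3408"
  , subCalls :: [Call]   -- calls made while evaluating this one
  }

-- The oracle answers the question "was this call's result correct?".
type Oracle = Call -> Bool

-- If the call is wrong, blame the first wrong sub-call (recursively);
-- if every sub-call is right, the call itself must be faulty.
locate :: Oracle -> Call -> Maybe String
locate ok c
  | ok c      = Nothing
  | otherwise =
      case [ r | Just r <- map (locate ok) (subCalls c) ] of
        (r : _) -> Just r
        []      -> Just (callName c)
```

Because the list of sub-results is lazy, the oracle is only consulted until the first faulty sub-call is found.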

Work on Haskell debuggers
- Tracing
  - ART/HAT project – York (nhc)
  - HOOD – OGI
    - GHood, HOOD in NHC, HOOD in Hugs
- Declarative
  - Freja – Henrik Nilsson
  - Buddha – Melbourne, Australia
- Testing frameworks
  - QuickCheck – Chalmers
  - Auburn – York
- Look at www.haskell.org/debugging

Understanding Haskell execution

    natural :: Int -> [Int]
    natural = reverse
            . map (`mod` 10)
            . takeWhile (/= 0)
            . iterate (`div` 10)

    Main> natural 3408
    [3,4,0,8] :: [Int]

Intermediate values, from the result back to the input:

    3 : 4 : 0 : 8 : []               (after reverse)
    8 : 0 : 4 : 3 : []               (after map)
    3408 : 340 : 34 : 3 : []         (after takeWhile)
    3408 : 340 : 34 : 3 : 0 : …      (after iterate)
    3408                             (input)
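
One way to look at these intermediate lists without any debugger is to name each stage of the pipeline. The stage names below are illustrative and do not appear in the slides; this is a sketch of manual inspection, which is what observe later automates.

```haskell
-- Each intermediate list of `natural`, named so it can be inspected.
afterIterate, afterTakeWhile, afterMap, afterReverse :: Int -> [Int]
afterIterate   = iterate (`div` 10)               -- infinite list
afterTakeWhile = takeWhile (/= 0) . afterIterate  -- cut at the first 0
afterMap       = map (`mod` 10) . afterTakeWhile  -- digits, least significant first
afterReverse   = reverse . afterMap               -- digits, most significant first

natural :: Int -> [Int]
natural = afterReverse
```

Only a prefix of `afterIterate` can be printed, for example `take 5 (afterIterate 3408)`, because that stage never ends.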

The Haskell Object Observation Debugger
- Provides combinators for debugging
  - STG Hugs or GHC or NHC (others will follow)
- Frustrated Haskell programmer uses combinators to annotate their code, and re-runs the program
- The execution of the Haskell program allows observations to be made internally
- At the termination of the Haskell execution, observed structures are displayed to stderr in a pretty-printed form
- Think written statements, but not witnesses…

Using trace for debugging
- Function called trace

      trace :: String -> a -> a

- Prints the string to stderr and returns the second argument
- All Haskell systems have this function

trace can help…

    foo (x:xs) (y:ys) = …
    foo []     []     = …

Can rewrite this by adding a clause in front of the others, whose guard prints the arguments and then always fails:

    foo x y | trace ("foo:" ++ show (x,y)) False = undefined
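
A complete, runnable instance of this idiom, assuming GHC, where trace is exported from `Debug.Trace`. The function `addLists` is a made-up example, not from the slides.

```haskell
import Debug.Trace (trace)

-- Hypothetical example function: pairwise sums of two lists.
addLists :: [Int] -> [Int] -> [Int]
addLists xs ys
  -- The guard prints the arguments to stderr and is always False,
  -- so this clause is never selected and `undefined` is never evaluated.
  | trace ("addLists:" ++ show (xs, ys)) False = undefined
addLists (x:xs) (y:ys) = (x + y) : addLists xs ys
addLists _      _      = []
```

Note that `show (xs, ys)` fully evaluates both arguments on every call, which is one of the problems with trace.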

Problems with trace
- Invasive to your code

      foo (n + 1)   =>   let r = foo (n + 1) in trace (show (n+1,r)) r

- Evaluation of the first argument can cause other traces to fire, mid-trace
  - You end up with spaghetti output
  - Perhaps trace should evaluate its argument first?
- Can change the evaluation order of your program!
  - Very bad news
- The best we had – until now…

Our debugging combinator
- Provides a way of annotating a specific expression with a label

      observe :: (Observable a) => String -> a -> a

      boring = observe "after" . reverse . observe "during" . observe "before"

- Similar to probe in Hawk, but works over most data types

First example

    printO :: Show a => a -> IO ()

    Main> printO (take 2 (observe "list" [1..10]))

- printO
  - Opens the debugging context
  - Evaluates the argument, with possible observations made
  - Closes the debugging context and pretty prints the observed structures

Observed structure:

    ( 1 : 2 : _)

Second example

    printO (
      let lst  = [1..10]
          lst1 = take 2 (observe "list" lst)
      in sum (lst ++ lst1)
      )

Observed structure:

    ( 1 : 2 : _)

Annotating natural
- Add observe annotations to natural, capturing the flow between the subcomponents

      natural = reverse
              . observe "after map …"
              . map (`mod` 10)
              . observe "after takeWhile …"
              . takeWhile (/= 0)
              . observe "after iterate …"
              . iterate (`div` 10)

- Focuses on the intermediate data structures

Output from debugging natural

    -- after iterate (`div` 10)
    { (3408 : 340 : 34 : 3 : 0 : _) }
    -- after takeWhile (/= 0)
    { (3408 : 340 : 34 : 3 : []) }
    -- after map (`mod` 10)
    { (8 : 0 : 4 : 3 : []) }

What can we observe?
- Base types
  - Int, Char, Integer, Bool, Float, Double, ()
- Structured types
  - Tuples, Lists, Maybe, Arrays
- Exceptions?
- What about Monads? (List and Maybe are Monads.)
- What about functions?

      observe :: (Observable a, Observable b) => String -> (a -> b) -> a -> b

  - What does this definition mean?

Observing functions
- An observed function
  - Is only called at specific arguments
  - Is a finite map from argument to result
- For observational purposes, functions are
  - A set of pairs, representing argument and result
  - Isomorphic to {(a,b)}
  - Is this a Set or a Bag?
- We can pretty print in a Haskell-like manner

Example of function observation

    Main> printO (map (observe "null" null) [[],[1..]])

    -- null
    { \ [] -> True
    , \ (_ : _) -> False
    }

- Shows only the specific invocations used
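
The view of a function as a finite collection of argument/result pairs can be sketched with a mutable log. This is an assumption-laden illustration, not HOOD's implementation: `recordCalls` and `nullCalls` are hypothetical names, and unlike HOOD the log keeps whole arguments rather than only the parts that were demanded.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Wrap a function so that every call appends (argument, result) to a log.
-- Hypothetical helper; the side effect fires when the result is demanded.
recordCalls :: IORef [(a, b)] -> (a -> b) -> a -> b
recordCalls ref f x = unsafePerformIO $ do
  let y = f x
  modifyIORef ref (++ [(x, y)])
  return y
{-# NOINLINE recordCalls #-}

-- Apply a recorded `null` to two lists and return the calls actually made.
nullCalls :: IO [([Int], Bool)]
nullCalls = do
  ref <- newIORef []
  let results = map (recordCalls ref null) [[], [1, 2, 3]]
  mapM_ evaluate results   -- force each result so each call happens
  readIORef ref
```

The log is a list, so repeated calls with the same argument appear more than once; that is the Bag reading of the Set-or-Bag question.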

Example of higher order function observation

    Main> printO $ (observe "map null" map null [[],[1..]])

    -- map null
    { \ { \ [] -> True
        , \ (_ : _) -> False
        } ([] : (_ : _) : [])
        -> (True : False : [])
    }

Debugging natural (again)
- Now use HO debugging

      natural = observe "reverse" reverse
              . observe "map (`mod` 10)" map (`mod` 10)
              . observe "takeWhile (/= 0)" takeWhile (/= 0)
              . observe "iterate (`div` 10)" iterate (`div` 10)

- Focus is now on what the sub-components do

Output from debugging natural

    -- iterate (`div` 10)
    \ { \ 3408 -> 340
      , \ 340 -> 34
      , \ 34 -> 3
      , \ 3 -> 0
      } 3408
      -> (3408 : 340 : 34 : 3 : 0 : _)

    -- takeWhile (/= 0)
    \ { \ 3408 -> True
      , \ 340 -> True
      , \ 34 -> True
      , \ 3 -> True
      , \ 0 -> False
      } (3408 : 340 : 34 : 3 : 0 : _)
      -> (3408 : 340 : 34 : 3 : [])

Bad implementation of observe

    observe :: (Show a) => String -> a -> a
    observe label a = trace (label ++ ":" ++ show a) a

- We've changed the strictness of the second argument
  - Would fail on our natural example, because one of the intermediate lists is infinite
- The argument needs to be a member of the class Show
  - Functions are not Show-able
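
The strictness change can be made visible with a finite example. The sketch below assumes GHC's `Debug.Trace` and `Control.Exception`; `badObserve` is the definition above under a different name, and `throws`, `plain`, and `observed` are hypothetical helpers. It uses `undefined` elements instead of an infinite list so the effect shows up as an exception rather than non-termination.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Debug.Trace (trace)

-- The Show-based definition from the slide, renamed.
badObserve :: Show a => String -> a -> a
badObserve label a = trace (label ++ ":" ++ show a) a

-- True if evaluating the value to weak head normal form raises an exception.
throws :: a -> IO Bool
throws x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  return (either (const True) (const False) r)

-- length never looks at the elements, so this is fine on its own.
plain :: Int
plain = length [undefined, undefined :: Int]

-- show inside badObserve forces every element before length can run.
observed :: Int
observed = length (badObserve "xs" [undefined, undefined :: Int])
```

Here `throws plain` yields `False` while `throws observed` yields `True`: adding the annotation changed what the program evaluates.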
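
How does lazy evaluation work?

[Diagram: a list is evaluated step by step. Each node is identified by a path (p; p,1; p,2; p,2,1; p,2,2) and starts as a Thunk. As a node is forced it reports its constructor together with its path: p => (:), then p,2 => (:), then p,2,2 => []. Nodes that are never demanded remain Thunk.]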

Structural observers

    class Observer a where
      observer :: a -> (PATH,LABEL) -> a

    -- Example for (a,b)
    instance (Observer a,Observer b) => Observer (a,b) where
      -- observer :: (a,b) -> (PATH,LABEL) -> (a,b)
      observer (a,b) (path,label) = unsafePerformIO $ do
        { sendPacket "… (,) … path … label … "
        ; return (observer a (1 : path,label), observer b (2 : path,label))
        }

Systematic side effects

      fst (observe "…" (f x,44))
    = fst (observer (f x,44) cxt)
    = fst (unsafe $ do { "tell about tuple"
                       ; return (observer (f x) (1:cxt),observer 44 (2:cxt)) })
          -- I'm a 2-tuple at path "cxt"        ( _ , _ )
    = fst (observer (f x) (1:cxt),observer 44 (2:cxt))
    = observer (f x) (1:cxt)
    = seq (f x) $ (unsafe $ do { "tell about the number" ; return (f x) })
    = unsafe $ do { "tell about the number" ; return 99 }
          -- I'm a 99 at path "1 : cxt"         ( 99 , _ )
    = 99
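
The observer class and the evaluation steps above can be tried out with a small self-contained sketch. It is not HOOD's code: the global `eventLog`, the shape of `sendPacket`, the `Int` instance, and the `observations` helper are all assumptions made for illustration, with packets stored in memory instead of being pretty printed.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

type Path  = [Int]
type Label = String

-- Assumed global log of (label, path, constructor) packets, newest first.
{-# NOINLINE eventLog #-}
eventLog :: IORef [(Label, Path, String)]
eventLog = unsafePerformIO (newIORef [])

sendPacket :: Label -> Path -> String -> IO ()
sendPacket label path what = modifyIORef eventLog ((label, path, what) :)

class Observer a where
  observer :: a -> (Path, Label) -> a

-- Base type: force the number, then report it.
instance Observer Int where
  observer n (path, label) =
    n `seq` unsafePerformIO (sendPacket label path (show n) >> return n)

-- Tuple: report the constructor, then observe the components lazily.
instance (Observer a, Observer b) => Observer (a, b) where
  observer (a, b) (path, label) = unsafePerformIO $ do
    sendPacket label path "(,)"
    return (observer a (1 : path, label), observer b (2 : path, label))

observe :: Observer a => Label -> a -> a
observe label a = observer a ([], label)

-- Packets in the order they were sent.
observations :: IO [(Label, Path, String)]
observations = fmap reverse (readIORef eventLog)
```

Evaluating `fst (observe "pair" (99 :: Int, undefined :: Int))` returns 99 and logs only the tuple and its first component; the second component is never touched, so no packet is sent for it.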

Status and future of HOOD
- Current status
  - V0.1 - Available on web: www.haskell.org/hood
  - Works with Hugs, GHC, STG Hugs, NHC
  - Handles GHC threads fine
  - Can catch and observe exceptions (errors, ^C, etc.)

        -- list
        ( 1 : 2 : error "boom" : _ )

- The future
  - V0.2 – Interactive browser via XML file (already in NHC's version)
  - Polymorphic observations (using RTS hooks)
  - Operational semantics for the debugger?
- Extensions
  - GHood – Shows trees instead of pretty-printed Haskell
  - QuickCheck – for showing counterexamples

Conclusions
- Haskell has something resembling a debugging tool that works on real Haskell
- Type classes can be used to augment a data structure evaluation with side-effecting functions
- Observation of non-trivial examples is possible

Demo …

    import Observe

    n = 10

    x1 = foldr (+) 0 [1..n]
    y1 = foldr (observe "Add" (+)) 0 (observe "input" [1..n])
    z1 = printO y1

    x2 = foldl (+) 0 [1..n]

    x3 = foldr (+) 0 (reverse [1..n])
    y3 = foldr (+) 0 (take 4 (observe "revlist" (reverse (observe "input" [1..n]))))
    z3 = printO x3

    x4 = foldl (+) 0 (reverse [1..n])
    y4 = foldl (+) 0 (observe "revlist" (reverse (observe "input" [1..n])))
    z4 = printO y4

Homework
- Download HOOD
- Download HOOD documentation
- Investigate the following

      foldr (+) 0 [1..n]
      foldl (+) 0 [1..n]
      foldr (+) 0 (reverse [1..n])
      foldl (+) 0 (reverse [1..n])