Semi-Explicit Parallel Programming in Haskell. Satnam Singh, Microsoft Research Cambridge. Leeds 2009.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.



public class ArraySummer {
    private double[] a;    // Encapsulated array
    private double sum;    // Variable used to compute sum

    // Constructor requiring an initial value for the array
    public ArraySummer(double[] values) {
        a = values;
    }

    // Method to compute the sum of a segment of the array
    public void SumArray(int fromIndex, int toIndex, out double arraySum) {
        sum = 0;
        for (int i = fromIndex; i < toIndex; i++)
            sum = sum + a[i];
        arraySum = sum;
    }
}

[Diagram: thread 1 and thread 2, showing ThreadCreate, thread.Start and thread.Join]

class Program {
    static void Main(string[] args) {
        const int testSize = …;  // value lost in transcript
        double[] testValues = new double[testSize];
        for (int i = 0; i < testSize; i++)
            testValues[i] = i / testSize;
        ArraySummer summer = new ArraySummer(testValues);
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();
        double testSum;
        summer.SumArray(0, testSize, out testSum);
        TimeSpan ts = stopWatch.Elapsed;
        Console.WriteLine("Sum duration (milliseconds) = " + stopWatch.ElapsedMilliseconds);
        Console.WriteLine("Sum value = " + testSum);
        Console.ReadKey();
    }
}

class Program {
    static void Main(string[] args) {
        const int testSize = …;  // value lost in transcript
        double[] testValues = new double[testSize];
        for (int i = 0; i < testSize; i++)
            testValues[i] = i / testSize;
        ArraySummer summer = new ArraySummer(testValues);
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();
        double testSumA = 0;
        double testSumB;
        Thread sumThread = new Thread(delegate() {
            summer.SumArray(0, testSize / 2, out testSumA);
        });
        sumThread.Start();
        summer.SumArray(testSize / 2, testSize, out testSumB);
        sumThread.Join();
        TimeSpan ts = stopWatch.Elapsed;
        Console.WriteLine("Sum duration (milliseconds) = " + stopWatch.ElapsedMilliseconds);
        Console.WriteLine("Sum value = " + (testSumA + testSumB));
        Console.ReadKey();
    }
}

The Accidental Semi-colon ;

A ; B                  run A to completion, then run B
createThread (A) ; B   A runs in a new thread, concurrently with B

Execution Model

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

A "thunk" for "fib 10" contains:
- a pointer to the implementation
- a storage slot for the result
- values for free variables

wombat and numbat

wombat :: Int -> Int    -- pure function
wombat n = 42 * n

numbat :: Int -> IO Int -- side-effecting function: a computation inside a 'monad'
numbat n = do
  c <- getChar
  return (n + ord c)

IO (), pronounced "IO unit"

numbat :: IO ()
numbat = do
  c <- getChar
  putChar (chr (1 + ord c))

Infer the type: f (g + h), z !! 2, mapM f [a, b, ..., g]

A type like [Int] -> Bool marks a pure, deterministic function; a type like IO String marks a stateful operation that may be non-deterministic.

Functional Programming to the Rescue?

Why not evaluate every sub-expression of our pure functional programs in parallel?
- execute each sub-expression in its own thread?
The 80s dream does not work:
- granularity: most sub-expressions are too small to repay the cost of a thread
- data dependency: sub-expressions often need each other's results, forcing a sequential order

Infix Operators

Prefix: mod a b, e.g. mod 7 3 = 1
Infix with backquotes: a `mod` b, e.g. 7 `mod` 3 = 1
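
Not in the original slides: a minimal runnable check that the prefix and backquote-infix forms of `mod` agree.

```haskell
-- Any two-argument function can be written between its arguments
-- using backquotes; both forms below denote the same application.
main :: IO ()
main = do
  print (mod 7 3)    -- prefix form:  1
  print (7 `mod` 3)  -- infix form:   1
```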

x `par` y

- x is sparked for speculative evaluation
- a spark can potentially be instantiated on a thread running in parallel with the parent thread
- x `par` y = y
- typically x is used inside y, e.g.:
  blurRows `par` (mix blurCols blurRows)

x `par` (y + x)

On one processor: x is sparked, y is evaluated first, then x is evaluated second; the spark for x fizzles (it is never picked up).

x `par` (y + x)

On two processors: x is sparked on P1, y is evaluated on P1, and x is taken up for evaluation on P2 in parallel.

par is Not Enough

pseq :: a -> b -> b
pseq is strict in its first argument but not in its second argument.

Related function:
- seq :: a -> b -> b
- strict in both arguments
- the compiler may transform seq x y into seq y x
- so seq is no good for controlling the order of evaluation in parallel programs
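
Not in the original slides: a small base-only sketch of what "strict in its first argument" buys you. seq (like pseq) forces only weak head normal form, i.e. the outermost constructor, not the values inside it.

```haskell
main :: IO ()
main = do
  let xs = [1, undefined, 3] :: [Int]
  -- Forcing xs with seq exposes only the first cons cell, so the
  -- undefined element is never touched and this prints safely.
  print (xs `seq` length xs)  -- 3
```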

Parallel fib with threshold (due to Don Stewart)

-- threshold for parallel evaluation (value lost in transcript)
cutoff = …

-- sequential fib
fib' :: Int -> Integer
fib' 0 = 0
fib' 1 = 1
fib' n = fib' (n-1) + fib' (n-2)

-- parallel fib with thresholding
fib :: Int -> Integer
fib n | n < cutoff = fib' n
      | otherwise  = r `par` (l `pseq` l + r)
  where
    l = fib (n-1)
    r = fib (n-2)

-- main program
main = forM_ [0..45] $ \i ->
  printf "n=%d => %d\n" i (fib i)

Parallel fib performance

Parallel quicksort (wrong)

quicksortN :: (Ord a) => [a] -> [a]
quicksortN [] = []
quicksortN [x] = [x]
quicksortN (x:xs) = losort `par` hisort `par` losort ++ (x:hisort)
  where
    losort = quicksortN [y | y <- xs, y < x]
    hisort = quicksortN [y | y <- xs, y >= x]

What went wrong? par evaluates its argument only to weak head normal form, so sparking losort evaluates just the first cons cell; the rest of the list remains an unevaluated thunk.

forceList

forceList :: [a] -> ()
forceList [] = ()
forceList (x:xs) = x `seq` forceList xs
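
Not in the original slides: a minimal usage sketch of forceList. Unlike a bare seq, it walks the entire spine and forces every element, so the list is fully evaluated afterwards.

```haskell
forceList :: [a] -> ()
forceList [] = ()
forceList (x:xs) = x `seq` forceList xs

main :: IO ()
main = do
  let ys = map (*2) [1..5] :: [Int]
  -- forceList traverses the whole list, forcing each element in turn,
  -- so after this ys is fully evaluated, not a chain of thunks.
  forceList ys `seq` print (sum ys)  -- 30
```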

Parallel quicksort (right)

quicksortF [] = []
quicksortF [x] = [x]
quicksortF (x:xs) = (forceList losort) `par` (forceList hisort) `par`
                    losort ++ (x:hisort)
  where
    losort = quicksortF [y | y <- xs, y < x]
    hisort = quicksortF [y | y <- xs, y >= x]

parSumArray :: Array Int Double -> Double
parSumArray matrix = lhs `par` (rhs `pseq` lhs + rhs)
  where
    lhs = seqSum 0 (nrValues `div` 2) matrix
    rhs = seqSum (nrValues `div` 2 + 1) (nrValues - 1) matrix

Strategies

Haskell provides a collection of evaluation strategies for controlling the evaluation order of various data types. Users indicate how their own types are evaluated to a normal form.

Algorithms + Strategy = Parallelism, P. W. Trinder, K. Hammond, H.-W. Loidl and S. L. Peyton Jones.

Explicitly Creating Threads

forkIO :: IO () -> IO ThreadId

Creates a lightweight Haskell thread, not an operating system thread.

Inter-thread Communication

putMVar :: MVar a -> a -> IO ()
takeMVar :: MVar a -> IO a
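
Not in the original slides: a minimal, runnable sketch of inter-thread communication with these two operations. A child thread puts a value into an empty MVar; the main thread blocks in takeMVar until the value arrives.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

main :: IO ()
main = do
  box <- newEmptyMVar
  _ <- forkIO (putMVar box (21 * 2 :: Int))  -- child thread sends a value
  v <- takeMVar box                          -- main thread blocks until it arrives
  print v  -- 42
```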

MVars

An MVar mv is a box that is either full or empty: putMVar mv 52 fills the empty box with 52, and v <- takeMVar mv removes the value, leaving the box empty again.

Rendezvous

threadA :: MVar Int -> MVar Float -> IO ()
threadA valueToSendMVar valueReceivedMVar = do
  -- some work
  -- now perform rendezvous by sending 72
  putMVar valueToSendMVar 72   -- send value
  v <- takeMVar valueReceivedMVar
  putStrLn (show v)

Rendezvous

threadB :: MVar Int -> MVar Float -> IO ()
threadB valueToReceiveMVar valueToSendMVar = do
  -- some work
  -- now perform rendezvous by waiting on value
  z <- takeMVar valueToReceiveMVar
  putMVar valueToSendMVar (1.2 * fromIntegral z)
  -- continue with other work

Rendezvous

main :: IO ()
main = do
  aMVar <- newEmptyMVar
  bMVar <- newEmptyMVar
  forkIO (threadA aMVar bMVar)
  forkIO (threadB aMVar bMVar)
  threadDelay …   -- BAD! (delay value lost in transcript)
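
Not in the original slides: a sketch of the standard fix for the BAD threadDelay above. Instead of guessing how long the children take, each worker signals completion through an MVar and the main thread blocks until both have finished. The worker function here is a hypothetical stand-in, not from the talk.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Hypothetical worker: does some computation and signals completion
-- by putting its result into a per-thread MVar.
worker :: Int -> MVar Int -> IO ()
worker n done = putMVar done (n * n)

main :: IO ()
main = do
  d1 <- newEmptyMVar
  d2 <- newEmptyMVar
  _ <- forkIO (worker 3 d1)
  _ <- forkIO (worker 4 d2)
  a <- takeMVar d1  -- blocks until the first worker finishes
  b <- takeMVar d2  -- blocks until the second worker finishes
  print (a + b)  -- 25
```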

fib again

fib :: Int -> Int
-- as before

fibThread :: Int -> MVar Int -> IO ()
fibThread n resultMVar = putMVar resultMVar (fib n)

sumEuler :: Int -> Int
-- as before

fib fixed

fibThread :: Int -> MVar Int -> IO ()
fibThread n resultMVar = do
  pseq f (return ())   -- force the computation in this thread
  putMVar resultMVar f
  where
    f = fib n

$ time fibForkIO +RTS -N1
real 0m40.473s
user 0m0.000s
sys  0m0.031s

$ time fibForkIO +RTS -N2
real 0m38.580s
user 0m0.000s
sys  0m0.015s

"STM"s in Haskell

data STM a
instance Monad STM
-- Monads support "do" notation and sequencing

-- Exceptions
throw :: Exception -> STM a
catch :: STM a -> (Exception -> STM a) -> STM a

-- Running STM computations
atomically :: STM a -> IO a
retry :: STM a
orElse :: STM a -> STM a -> STM a

-- Transactional variables
data TVar a
newTVar :: a -> STM (TVar a)
readTVar :: TVar a -> STM a
writeTVar :: TVar a -> a -> STM ()

Transactional Memory

do {...this...} orelse {...that...} tries to run "this". If "this" retries, it runs "that" instead. If both retry, the do-block retries. GetEither() will thereby wait for there to be an item in either queue Q1 or Q2.

void GetEither() {
  atomic {
    do { i = Q1.Get(); }
    orelse { i = Q2.Get(); }
    R.Put(i);
  }
}
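
Not in the original slides: a sketch of GetEither in Haskell using the STM API from the previous slide. The queues are modelled as TVars holding a Maybe value (an assumption for brevity, not the talk's queue type); takeFrom and getEither are hypothetical names.

```haskell
import Control.Concurrent.STM

-- Take the item from a "queue" (a TVar holding at most one value),
-- or retry if it is empty.
takeFrom :: TVar (Maybe a) -> STM a
takeFrom q = do
  m <- readTVar q
  case m of
    Nothing -> retry
    Just x  -> writeTVar q Nothing >> return x

-- Try the first queue; if it retries, fall back to the second.
-- If both retry, the whole transaction retries.
getEither :: TVar (Maybe a) -> TVar (Maybe a) -> STM a
getEither q1 q2 = takeFrom q1 `orElse` takeFrom q2

main :: IO ()
main = do
  q1 <- newTVarIO Nothing
  q2 <- newTVarIO (Just (7 :: Int))
  i  <- atomically (getEither q1 q2)
  print i  -- 7
```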

ThreadScope

The GHC run-time can generate event logs. Instrumented events include:
- thread creation, start/stop, migration
- GCs
ThreadScope is a graphical viewer for these logs.
Q: how to mine / understand the information?

Lots Unsaid xperf / VTune correlation Verification Debugging Parallel garbage collection

Summary

Three ways of writing parallel and concurrent programs in Haskell:
- `par` and `pseq` (semi-explicit parallelism)
- MVars (explicit concurrency)
- STM (explicit concurrency with transactions)
Implicit concurrency remains open. Pure functional programming has pros and cons for parallel programming. Can mainstream languages take advantage of the same techniques? How can visualization help with performance tuning?