Shared Memory Programming
Michael Cormier

Memory in Message-Passing Computers
What we've been using
[Diagram: a cluster of five nodes, each pairing a processor with its own local memory (P1/M1 through P5/M5)]

Memory in Shared Memory Systems
The actual topic of the lecture
[Diagram: five processors, P1 through P5, all connected to a single shared memory]

Systems With Shared Memory
Where is this stuff, anyway?
--Single system
  --Special-purpose systems
  --Off-the-shelf hardware (Core 2 Quad, etc.)
  --This lecture assumes this type
--Distributed
  --Makes a cluster look like a single machine, as above
  --Can use hardware or software
  --Tune in next class for more on these

Advantages of Shared Memory Systems
Why do we care?
--Eliminates the cost of passing data between processors as messages (except in distributed shared memory systems)
--Common in desktops
--Can use existing thread/process methods to write programs

Disadvantages of Shared Memory Systems
What's the catch?
--Can be difficult to maintain memory consistency
--With large numbers of processors, memory access times can differ from one processor to another
--Can't scale up by just adding another node, as in a cluster

Programming in Shared Memory Systems
These aren't as funny to you as they are to me, are they?
--Two key requirements
  --Consistent
  --Deadlock-free
--Many ways of implementing parallelism
  --Processes
  --Threads
  --Specialized languages
--Many versions of each of the above

Consistency and Deadlocking
Keeping things straight
--Consistency
  --Processes accessing shared variables can store incorrect values

    Process A    Process B
    load x       load x
    x = 4        x = 5
    store x      store x

  --What is x? Either 4 or 5, depending on which store happens last
  --Must avoid this situation (race condition)
--Deadlock
  --Process A is waiting for process B, but process B is waiting for process A

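The race on x can be replayed deterministically. The following is a minimal sketch, not a real multi-process program: the function `run` and its schedule strings are hypothetical names introduced here, and each character in the schedule says which process executes its next step (load, assign, store).

```python
def run(schedule):
    """Replay one interleaving of processes A and B; return the final x."""
    memory = {"x": 0}                   # the shared variable
    register = {"A": None, "B": None}   # each process's private copy of x
    new_value = {"A": 4, "B": 5}        # the value each process assigns
    next_step = {"A": 0, "B": 0}        # per-process program counter

    for proc in schedule:
        step = next_step[proc]
        if step == 0:                   # load x
            register[proc] = memory["x"]
        elif step == 1:                 # x = 4 (or 5), in the private copy
            register[proc] = new_value[proc]
        else:                           # store x
            memory["x"] = register[proc]
        next_step[proc] += 1
    return memory["x"]


print(run("AAABBB"))  # A runs to completion first; B's store lands last
print(run("ABABBA"))  # steps interleaved; A's store lands last
```

The same two programs give different results under different schedules, which is exactly what makes a race condition hard to reproduce on real hardware.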
Locks
...And throw away the key!
--Method of ensuring consistency
--Spin locks: Processes continue to loop until they can enter the critical section
  --Effective, but inefficient: the waiting process burns CPU time (busy waiting)
--Descheduling: A blocked process is halted until it can resume
  --Eliminates busy waiting
  --Complex, and has overhead in switching between processes
--Must decide which properties are most important

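The two waiting strategies can be compared side by side. This is a rough sketch using Python's `threading.Lock` as a stand-in for whatever lock the system provides; the function names (`spin_increment`, `blocking_increment`, `run`) are made up for illustration.

```python
import threading

counter = 0
lock = threading.Lock()


def spin_increment(n):
    """Spin lock style: keep polling until the lock is free."""
    global counter
    for _ in range(n):
        while not lock.acquire(blocking=False):
            pass                      # busy waiting
        try:
            counter += 1              # critical section
        finally:
            lock.release()


def blocking_increment(n):
    """Descheduling style: the thread sleeps until the lock is free."""
    global counter
    for _ in range(n):
        with lock:                    # blocks without busy waiting
            counter += 1              # critical section


def run(worker, n_threads=4, n=200):
    """Run n_threads copies of worker and return the final counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run(spin_increment), run(blocking_increment))
```

Both versions protect the counter equally well; they differ only in what a waiting thread does with its time.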
Semaphores
SOS! SOS!
--Another method for synchronization
--Integer variable that is tested, waited for, decremented, and incremented atomically
  --P(s): Wait until s > 0, then decrement s
  --V(s): Increment s
--Can be used for mutual exclusion
--Can be used to control access to buffers, etc.
--Can be used to control order of execution

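Controlling the order of execution is the easiest use to show. A minimal sketch, assuming Python's `threading.Semaphore`, where `acquire()` plays the role of P(s) and `release()` plays the role of V(s); the function `ordered_run` is a hypothetical name.

```python
import threading


def ordered_run():
    """Force 'second' to run after 'first', whichever thread starts first."""
    s = threading.Semaphore(0)   # s = 0, so P(s) must wait for a V(s)
    log = []

    def first():
        log.append("first")
        s.release()              # V(s): increment s, waking the waiter

    def second():
        s.acquire()              # P(s): wait until s > 0, then decrement
        log.append("second")

    t2 = threading.Thread(target=second)
    t1 = threading.Thread(target=first)
    t2.start()                   # started first on purpose
    t1.start()
    t1.join()
    t2.join()
    return log


print(ordered_run())
```

Starting the semaphore at 1 instead of 0 would turn the same P/V pair into a mutual exclusion lock.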
Monitors
Not the lizards, the other kind
--Semaphores are versatile but prone to error in practice
--Monitors are intended to make it easier to synchronize processes
--At most one process at a time can be executing any of the methods a monitor protects
--Can be inefficient: assume F1 accesses x and y, F2 accesses x, and F3 accesses y
  --They are all under the same monitor because F1's access must be protected
  --F2 and F3 do not conflict, but they cannot run simultaneously either

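The F1/F2/F3 situation above can be sketched as a class whose methods all share one lock. Python has no built-in monitor construct, so this is an approximation under that assumption; the class name `XYMonitor` and the helper `hammer` are invented for the example.

```python
import threading


class XYMonitor:
    """Monitor-style object: one lock guards every method."""

    def __init__(self):
        self._lock = threading.Lock()
        self.x = 0
        self.y = 0

    def f1(self):                # touches x and y
        with self._lock:
            self.x += 1
            self.y += 1

    def f2(self):                # touches only x
        with self._lock:
            self.x += 1

    def f3(self):                # touches only y, yet still excludes f2
        with self._lock:
            self.y += 1


def hammer(monitor, calls=100):
    """Call f1, f2 and f3 concurrently, `calls` times each."""
    def repeat(method):
        for _ in range(calls):
            method()

    threads = [threading.Thread(target=repeat, args=(m,))
               for m in (monitor.f1, monitor.f2, monitor.f3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return monitor.x, monitor.y


print(hammer(XYMonitor()))
```

The results are always consistent, but f2 and f3 serialize on the shared lock even though they never touch the same variable.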
Bernstein's Conditions
They may be boring but they're important
--Used to determine if two processes can be safely run in parallel
--Two processes Pi and Pj can be safely run in parallel if all of the following hold
  --None of the inputs to Pi are outputs of Pj
  --None of the inputs to Pj are outputs of Pi
  --None of the outputs of Pi are outputs of Pj

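With inputs and outputs written as sets of variable names, the conditions reduce to three empty intersections. A small sketch; the function name `can_run_in_parallel` and the example statements are illustrative assumptions.

```python
def can_run_in_parallel(in_i, out_i, in_j, out_j):
    """Check Bernstein's conditions for two processes Pi and Pj."""
    return (
        not (in_i & out_j)       # Pj writes nothing that Pi reads
        and not (in_j & out_i)   # Pi writes nothing that Pj reads
        and not (out_i & out_j)  # Pi and Pj write no common variable
    )


# Pi: a = x + y      Pj: b = x * z   (only a shared input, which is fine)
print(can_run_in_parallel({"x", "y"}, {"a"}, {"x", "z"}, {"b"}))

# Pi: a = x + y      Pj: c = a + 1   (Pj reads what Pi writes)
print(can_run_in_parallel({"x", "y"}, {"a"}, {"a"}, {"c"}))
```

Sharing an input is harmless; only a write on at least one side creates a conflict.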
Deadlock
Just about as bad as it sounds
--Occurs when two or more processes are waiting for each other to finish before continuing
--Program ceases to function
--Must design programs carefully to avoid deadlock

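One common design rule is to always acquire locks in the same order. The sketch below is one way to see why, assuming Python threads and two locks; all names are hypothetical, and timeouts and barriers are used only so the demonstration terminates instead of hanging.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()


def opposite_order():
    """Each thread holds one lock and wants the other: neither succeeds."""
    results = {}
    both_hold_first = threading.Barrier(2)
    both_tried = threading.Barrier(2)

    def attempt(name, first, second):
        with first:
            both_hold_first.wait()             # both now hold their first lock
            got = second.acquire(timeout=0.1)  # would block forever without timeout
            results[name] = got
            both_tried.wait()                  # keep holding until both have tried
            if got:
                second.release()

    threads = [threading.Thread(target=attempt, args=("A", lock_a, lock_b)),
               threading.Thread(target=attempt, args=("B", lock_b, lock_a))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def same_order():
    """Both threads take lock_a, then lock_b: no circular wait is possible."""
    done = []

    def worker(name):
        with lock_a:
            with lock_b:
                done.append(name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(done)


print(opposite_order(), same_order())
```

Without the timeout, the first version would simply stop forever, which is the deadlock described above.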
UNIX Processes
Luke! Use the fork()!
--Operating system feature
--fork() makes a copy of the parent process's memory at creation
--Shared memory must be explicitly created
--Much slower to create a new process than a new thread
--Can be very useful, if the OS supports them
--Most other operating systems have processes of some sort

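The difference between copied and explicitly shared memory shows up in a few lines. This sketch assumes a UNIX-like system, since it relies on `os.fork()` and on an anonymous `mmap` region being shared across the fork; `fork_demo` is a made-up name.

```python
import mmap
import os


def fork_demo():
    """Return (private value, shared value) as seen by the parent."""
    private = [0]                 # ordinary memory: the child gets a copy
    shared = mmap.mmap(-1, 1)     # anonymous mapping, shared on UNIX by default
    shared[0] = 0

    pid = os.fork()
    if pid == 0:                  # child process
        private[0] = 42           # changes only the child's copy
        shared[0] = 42            # visible to the parent
        os._exit(0)               # leave without running the parent's code

    os.waitpid(pid, 0)            # parent waits for the child to finish
    return private[0], shared[0]


print(fork_demo())
```

The child's write to the ordinary variable is lost when it exits, while the write to the explicitly shared region survives.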
Threads
Nice threads, man!
--Threads can be created much faster than processes
--PThreads
  --C library for thread support
  --Standardized by POSIX, so available on POSIX-compliant OSes
--Java threads
  --Created with Runnable objects
  --Portable

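The create/start/join pattern is much the same across thread libraries. Here is a minimal sketch using Python's `threading` module as a stand-in for PThreads or Java threads (an assumption made for consistency with the other examples); `parallel_sum` is a hypothetical function.

```python
import threading


def parallel_sum(values, n_threads=4):
    """Sum a list by giving each thread an interleaved slice of it."""
    partial = [0] * n_threads     # shared memory: one slot per thread

    def worker(k):
        # Each thread writes only its own slot, so no lock is needed.
        partial[k] = sum(values[k::n_threads])

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(n_threads)]
    for t in threads:
        t.start()                 # fork: threads begin running
    for t in threads:
        t.join()                  # join: wait for all of them
    return sum(partial)


print(parallel_sum(list(range(101))))
```

All threads see the same `partial` list without any copying, which is the main contrast with fork(). In standard CPython the global interpreter lock limits speedup for CPU-bound work, so this shows the structure rather than the performance.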
Parallel Languages
What would perpendicular languages be like?
--Constructs:
  --Declaring a variable to be shared
  --Executing a list of statements in parallel
  --forall: Executing iterations of a for loop in parallel
--Ada
  --Developed for the US Dept. of Defense
  --Not widely adopted outside military and other safety-critical applications
--OpenMP
  --Uses compiler directives and an existing sequential programming language (C, C++, or Fortran)
  --Thread-based, but follows a fork/join model, as processes do
  --Includes easy-to-use synchronization constructs
  --Many other constructs are implemented

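A forall construct can be imitated in a general-purpose language. The following is only a rough analogue, not Ada or OpenMP syntax: it assumes Python's `concurrent.futures.ThreadPoolExecutor`, and `forall` and `squares` are names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def forall(indices, body, max_workers=4):
    """Run body(i) for every i in indices, with iterations spread over threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Consuming the map waits for every iteration and re-raises errors.
        list(pool.map(body, indices))


def squares(n):
    """Fill a shared list in parallel; iteration i writes only result[i]."""
    result = [0] * n              # the "shared" variable

    def body(i):
        result[i] = i * i

    forall(range(n), body)
    return result


print(squares(5))
```

This is safe only because the iterations are independent: no iteration reads or writes a location that another one writes.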