U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Computer Systems Principles Processes & Threads Emery Berger and Mark Corner University of Massachusetts Amherst
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 2 Outline Processes Threads Basic synchronization Bake-off
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science What is a Process? One or more threads running in an address space 3
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science What is an address space? The collection of data and code for the program organized in an organized (addressable) region. 4
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Why do we need processes? A process (with a single thread) supports a serial flow of execution Is that good enough? What if something goes wrong? What if something takes a long time? What if we have more than one user? 5
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 6 Processes vs. Threads… Both useful –parallel programming & concurrency –Hide latency –Maximize CPU utilization –Handle multiple, asynchronous events But: different programming styles, performance characteristics, and more
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 7 Processes Process: execution context (PC, registers) + address space, files, etc. Basic unit of execution
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 8 Process API UNIX: –fork() – create copy of current process Different return value Copy-on-write –exec() – replace process w/ executable Windows: –CreateProcess (…) 10 arguments
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 9 Translation Lookaside Buffer TLB: fast, fully associative memory –Stores page numbers (key), frame (value) in which they are stored Copy-on-write: protect pages, copy on first write
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 10 Processes Example Do we need the sleep?
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 11 Quick Activity Print out fib(0) … fib(100), in parallel –Main loop should fork off children (returns pid of child) int pid; pid = fork(); if (pid == 0) { // I’m child process } else { // I’m parent process } –Wait for children (waitpid(pid,0,0))
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 12 Exercise // Print fib(0)... fib(100) in parallel. int main() { int cids[101]; // Saved child pids. for (int i = 0; i < 101; i++) { // Fork off a bunch of processes to run fib(i). int cid = fork(); if (cid == 0) { // I am the child. cout << "Fib(" << i << ") = " << fib(i) << endl; return 0; // Now the child is done, so exit (IMPORTANT!). } else { // I am the parent. Store the child’s pid somewhere. cids[i] = cid; } // Wait for all the children. for (int i = 0; i < 101; i++) { waitpid (cids[i], 0, 0); // Wait for the ith child. } return 0; }
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 13 Communication Processes: –Input = state before fork() –Output = return value argument to exit() But: how can processes communicate during execution?
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 14 IPC signals –Processes can send & receive ints associated with particular signal numbers Process must set up signal handlers –Best for rare events (SIGSEGV) Not terribly useful for parallel or concurrent programming pipes –Communication channels – easy & fast –Just like UNIX command line ls | wc -l
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 15 int main() { int pfds[2]; pipe(pfds); if (!fork()) { close(1); /* close normal stdout */ dup(pfds[1]); /* make stdout same as pfds[1] */ close(pfds[0]); /* we don't need this */ execlp("ls", "ls", NULL); } else { close(0); /* close normal stdin */ dup(pfds[0]); /* make stdin same as pfds[0] */ close(pfds[1]); /* we don't need this */ execlp("wc", "wc", "-l", NULL); } } Pipe example
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 16 IPC, continued sockets –Explicit message passing –Can distribute processes anywhere mmap (common and aws0m hack) –All processes map same file into fixed memory location –Objects in region shared across processes –Use process-level synchronization Much more expensive than threads…
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 17 Threads Processes - everything in distinct address space Threads – same address space (& files, sockets, etc.)
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 18 Threads API UNIX (POSIX): –pthread_create() – start separate thread executing function –pthread_join() – wait for thread to complete Windows: –CreateThread (…) only 6 arguments!
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 19 Threads example #include void * run (void * d) { int q = ((int) d); int v = 0; for (int i = 0; i < q; i++) { v = v + expensiveComputation(i); } return (void *) v; } main() { pthread_t t1, t2; int *r1, *r2; pthread_create (&t1, NULL, run, (int *)100); pthread_create (&t2, NLL, run, (int *)100); pthread_join (t1, (void **) &r1); pthread_join (t2, (void **) &r2); cout << "r1 = " << *r1 << ", r2 = " << *r2 << endl; }
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 20 Exercise Compute fib(100), in parallel –Function definition: void * funName (void * arg) –Creating thread: int tid; pthread_create(&tid, NULL, funName, argument); –Wait for child: pthread_join(tid,&result)
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 21 Exercise
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 22 Communication In threads, everything shared except: stacks, registers & thread-specific data –Old way: pthread_setspecific pthread_getspecific –New way: __thread static __thread int x; Easier in Java… Updates of shared state must be synchronized
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 23 Bake-off Processes or threads? –Performance –Flexibility / Ease-of-use –Robustness
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 24 Scheduling
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 25 Context Switch Cost Threads – much cheaper –Stash registers, PC (“IP”), stack pointer Processes – above plus –Process context –TLB shootdown Process switches more expensive, or require long quanta
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 26 Flexibility / Ease-of-use Processes – more flexible –Easy to spawn remotely ssh foo.cs.umass.edu “ls -l” –Can communicate via sockets = can be distributed across cluster / Internet –Requires explicit communication or risky hackery Threads –Communicate through memory – must be on same machine –Require thread-safe code
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 27 Robustness Processes – far more robust –Processes isolated from other processes Process dies – no effect –Apache 1.x Threads: –If one thread crashes (e.g., derefs NULL), whole process terminates –Then there’s the stack size problem –Apache 2.x…
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 28 The End