Chapter 4: Threads
Topics:
- Overview
- Multithreading Models
- Thread Libraries
- Threading Issues
- Operating System Examples
  - Windows XP Threads
  - Linux Threads
Objectives:
- To introduce the notion of a thread, a fundamental unit of CPU utilization that forms the basis of multithreaded computer systems
- To discuss the APIs for the Pthreads, Win32, and Java thread libraries
- To examine issues related to multithreaded programming
Threads: Motivation
- Threads run within an application
- Multiple tasks within the application can be implemented by separate threads:
  - Update display
  - Fetch data
  - Spell checking
  - Answer a network request
- Process creation is heavy-weight while thread creation is light-weight
- Can simplify code and increase efficiency
- Kernels are generally multithreaded
Benefits
- Responsiveness
- Resource Sharing
- Economy
- Scalability
Question
Which pieces of information from a process can be shared by all threads of a process and which ones need to be unique?
Process info:
- State
- Program counter
- Registers
- Open files
- Stack
- Text section (code)
- Data
Single and Multithreaded Processes
Multithreaded Server Architecture
Multicore Programming
Multicore systems put pressure on programmers; challenges include:
- Dividing activities
- Balance
- Data splitting
- Data dependency
- Testing and debugging
Concurrency Single-core System Multicore System
Multithreading Models
User Threads
- Thread management done by user-level threads library
- Three primary thread libraries:
  - POSIX Pthreads
  - Win32 threads
  - Java threads
Kernel Threads
- Supported by the kernel
- Examples:
  - Windows XP/2000
  - Solaris
  - Linux
  - Tru64 UNIX
  - Mac OS X
Multithreading Models
- Many-to-One
- One-to-One
- Many-to-Many
- Two-level
- Main considerations:
  - Limited number of threads?
  - Concurrency: what if a thread makes a blocking system call?
  - Allows for parallel execution on multiprocessor machines?
Many-to-One
- Many user-level threads mapped to a single kernel thread
- Examples:
  - Solaris Green Threads
  - GNU Portable Threads
- Only 1 thread can access the kernel at a time
One-to-One
- Each user-level thread maps to a kernel thread
- Examples: Windows NT/XP/2000, Linux, Solaris 9 and later
Many-to-Many Model
- Allows many user-level threads to be mapped to many kernel threads
- Allows the operating system to create a sufficient number of kernel threads
- Examples:
  - Solaris prior to version 9
  - Windows NT/2000 with the ThreadFiber package
Two-level Model
- Similar to M:M, except that it allows a user thread to be bound to a kernel thread
- Examples:
  - IRIX
  - HP-UX
  - Tru64 UNIX
  - Solaris 8 and earlier
Thread Libraries
Thread Libraries
- Thread library provides the programmer with an API for creating and managing threads
- Two primary ways of implementing:
  - Library entirely in user space
  - Kernel-level library supported by the OS
Pthreads
- May be provided either as user-level or kernel-level
- A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
- API specifies the behavior of the thread library; implementation is up to the developers of the library
  - Policy vs. mechanism
- Common in UNIX operating systems (Solaris, Linux, Mac OS X) and available on Windows via shareware implementations
Examples
- Pthreads: http://www.cs.mtsu.edu/~hcarroll/3250/private/code/chapter04/pthreadsExample.c
- Win32 API: http://www.cs.mtsu.edu/~hcarroll/3250/private/code/chapter04/win32ThreadsExample.c
- Java: http://www.cs.mtsu.edu/~hcarroll/3250/private/code/chapter04/Driver.java
Java Threads
- Java threads are managed by the JVM
- Typically implemented using the threads model provided by the underlying OS
- Java threads may be created by:
  - Extending the Thread class
  - Implementing the Runnable interface
Tuesday, January 31, 2012 lab1 – graded copies pushed out today @ 4:00 PM lab2 – due tomorrow hwk04 – due Thursday http://www.youtube.com/watch?v=0QRO3gKj3qw “What is Google Chrome OS?”
Threading Issues
Threading Issues
- Scheduler activations
- Semantics of fork() and exec() system calls
- Thread cancellation of target thread
  - Asynchronous or deferred
- Signal handling
  - Synchronous and asynchronous
- Thread pools
- Thread-specific data
  - Facility needed for data private to a thread
Scheduler Activations
- Both the many-to-many and two-level models require communication to maintain the appropriate number of kernel threads allocated to the application
- Scheduler activations provide upcalls, a communication mechanism from the kernel to the thread library
- This communication allows an application to maintain the correct number of kernel threads
Lightweight Processes
- A lightweight process (LWP) is a virtual processor that sits between the thread library (user threads) and the kernel
- Not present in all OSes
- In some contexts, LWP refers to the kernel thread
Tracking Threads on ranger
ps -eLf command:
    $ ps -eLf | head -n 1; ps -eLf | grep $USER
    UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
    hcarroll 13291 14715 13291  0    2 13:23 pts/22   00:00:00 ./pthreadsExample 55555555555
    hcarroll 13291 14715 13292 98    2 13:23 pts/22   00:00:31 ./pthreadsExample 55555555555
    hcarroll 13458 14715 13458  0    1 13:24 pts/22   00:00:00 ps -eLf
    hcarroll 13459 14715 13459  0    1 13:24 pts/22   00:00:00 grep hcarroll
    root     14712     1 14712  0    1 12:21 ?        00:00:00 sshd: hcarroll [priv]
    hcarroll 14714 14712 14714  0    1 12:21 ?        00:00:00 sshd: hcarroll@pts/22
    hcarroll 14715 14714 14715  0    1 12:21 pts/22   00:00:00 -bash
    hcarroll 30081 14715 30081  0    1 12:45 pts/22   00:00:00 emacs
Semantics of fork() and exec()
- What does fork() do?
- What does exec() do?
- Does fork() duplicate only the calling thread or all threads of that process?
- Discuss with your neighbor(s): how can you test this?
- Q: How can we learn more about these calls?
  A: "man -S 3 exec" for the exec() family (fork() is in section 2: "man 2 fork")
Multi-processes with Multi-threads
- http://www.cs.mtsu.edu/~hcarroll/3250/private/code/chapter04/webServerSkeleton.c
- http://www.cs.mtsu.edu/~hcarroll/3250/private/code/chapter04/webServerSkeleton-withFork.c
- ps -eLf | head -n 1; ps -eLf | grep $USER
Thread Cancellation
- Terminating a thread before it has finished
- Two general approaches:
  - Asynchronous cancellation terminates the target thread immediately
  - Deferred cancellation allows the target thread to periodically check if it should be cancelled
Signal Handling
- Signals are used in UNIX systems to notify a process that a particular event has occurred
- A signal handler is used to process signals:
  1. Signal is generated by a particular event
  2. Signal is delivered to a process
  3. Signal is handled
- Options for multi-threaded processes:
  - Deliver the signal to the thread to which the signal applies
  - Deliver the signal to every thread in the process
  - Deliver the signal to certain threads in the process
  - Assign a specific thread to receive all signals for the process
Thread Pools
- Create a number of threads in a pool where they await work
- Advantages:
  - Usually slightly faster to service a request with an existing thread than to create a new thread
  - Allows the number of threads in the application(s) to be bounded by the size of the pool
Thread-Specific Data
- Allows each thread to have its own copy of data
- Useful when you do not have control over the thread creation process (e.g., when using a thread pool)
Operating System Examples
Windows XP Threads
- Implements the one-to-one mapping, kernel-level
- Each thread contains:
  - A thread id
  - Register set
  - Separate user and kernel stacks
  - Private data storage area
- The register set, stacks, and private storage area are known as the context of the thread
- The primary data structures of a thread include:
  - ETHREAD (executive thread block)
  - KTHREAD (kernel thread block)
  - TEB (thread environment block)
Windows XP Threads Data Structures
Linux Threads
- fork() and clone() system calls
- Linux does not distinguish between processes and threads; it uses the term task for both
- clone() takes flags that determine how much is shared between parent and child when the task is created
- struct task_struct points to process data structures (shared or unique)
Recap
- Provide two programming examples in which multithreading provides better performance than a single-threaded solution.
- What are at least two differences between a kernel thread and a user thread?
- What is the difference between creating and managing processes and threads (e.g., what do they share)?
- Which of the following components of program state are shared across threads in a multithreaded process? (Exercise 4.10)
  - Register values, Heap memory, Global variables, Stack memory
- Consider a multiprocessor system and a multithreaded program written using the many-to-many threading model. Let the number of user-level threads in the program be more than the number of processors in the system. Discuss the performance implications of the following scenarios: (Exercise 4.14)
  - The number of kernel threads allocated to the program is less than the number of processors.
  - The number of kernel threads allocated to the program is equal to the number of processors.
  - The number of kernel threads allocated to the program is greater than the number of processors but less than the number of user-level threads.
Recap
- Provide two programming examples in which multithreading provides better performance than a single-threaded solution.
  - A Web server that services each request in a separate thread.
  - A parallelized application such as matrix multiplication, where different parts of the matrix may be worked on in parallel.
  - An interactive GUI program such as a debugger, where one thread monitors user input, another thread represents the running application, and a third thread monitors performance.
- What are at least two differences between a kernel thread and a user thread?
  - User-level threads are unknown to the kernel, whereas the kernel is aware of kernel threads.
  - On systems using either M:1 or M:N mapping, user threads are scheduled by the thread library and the kernel schedules kernel threads.
  - Kernel threads need not be associated with a process, whereas every user thread belongs to a process.
  - Kernel threads are generally more expensive to maintain than user threads, as they must be represented with a kernel data structure.
Recap
- What is the difference between creating and managing processes and threads (e.g., what do they share)?
  - Process (using fork()):
    - Makes a COPY of the process (global and local variables)
    - Shares open files (e.g., pipes)
    - Execution continues directly after fork() in both parent and child
  - Threads (using pthread_create() / CreateThread()):
    - Share global variables, memory space (e.g., things on the heap), open files, etc.
    - Execution starts in the function passed to pthread_create() / CreateThread()
  - <Timing example: timing-*.c>
- Which of the following components of program state are shared across threads in a multithreaded process? (Exercise 4.10)
  - Register values, Heap memory, Global variables, Stack memory
  - Shared: heap memory and global variables. Each thread has its own register values and stack.
Recap
- Consider a multiprocessor system and a multithreaded program written using the many-to-many threading model. Let the number of user-level threads in the program be more than the number of processors in the system. Discuss the performance implications of the following scenarios: (Exercise 4.14)
  - The number of kernel threads allocated to the program is less than the number of processors.
    - Not all of the processors will be utilized, because only kernel threads are scheduled onto processors
  - The number of kernel threads allocated to the program is equal to the number of processors.
    - All processors can be used, but whenever a kernel thread blocks, a processor will go unutilized
  - The number of kernel threads allocated to the program is greater than the number of processors but less than the number of user-level threads.
    - Allows for the greatest utilization of the processors: when one kernel thread blocks, another ready kernel thread can be scheduled in its place
End of Chapter 4