Threads
Thread
– A thread is a basic unit of CPU utilization.
– It is an abstract data type representing an independent flow of control within a process.
– A traditional (or heavyweight) process has a single thread of control.
– If a process has multiple threads of control, it can perform more than one task at a time.
– Threads are a way for a program to split itself into two or more concurrently running tasks. That is the real excitement surrounding threads.
Thread Examples
– A word processor may have one thread for displaying graphics, another thread for responding to keystrokes from the user, and a third thread for performing spelling and grammar checking in the background.
Thread Example: Multithreaded Server Architecture
Single and Multithreaded Processes
One or More Threads in a Process
Each thread has:
– an execution state (Running, Ready, etc.)
– a saved thread context when not running, kept in a thread control block (TCB)
– an execution stack
– some per-thread static storage for local variables
– access to the memory and resources of its process, which are shared by all threads of that process
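The split between per-thread and shared state can be illustrated with a minimal Pthreads sketch. The names `run_sharing_demo`, `DEMO_THREADS`, and `ITERATIONS` are hypothetical and chosen only for this illustration. Each worker keeps a loop counter on its own stack, while all workers update one global counter; the mutex is there only because the shared variable is written by several threads.

```c
#include <assert.h>
#include <pthread.h>

#define DEMO_THREADS 4     /* hypothetical thread count for this sketch */
#define ITERATIONS   1000  /* hypothetical amount of work per thread */

static int shared_counter; /* one copy, visible to every thread of the process */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    /* 'i' lives on this thread's own stack, so no other thread can disturb it. */
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;  /* the same global variable in every thread */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs the workers and returns the final value of the shared counter. */
int run_sharing_demo(void)
{
    pthread_t tid[DEMO_THREADS];
    int started = 0;

    shared_counter = 0;
    for (int i = 0; i < DEMO_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return shared_counter;
}
```

Calling `run_sharing_demo()` should return `DEMO_THREADS * ITERATIONS`, since every increment lands in the same shared variable even though each thread counts its own iterations privately.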
Threads vs. Processes
Benefits
– Responsiveness
– Resource sharing
– Economy
– Scalability
Benefits of Threads
– Creating a new thread takes less time than creating a new process.
– Terminating a thread takes less time than terminating a process.
– Switching between two threads of the same process takes less time than switching between processes.
– Threads enhance efficiency in communication between executing programs: threads of the same process share memory and files, so they can communicate without invoking the kernel.
Types of Threads
– User-Level Thread (ULT)
– Kernel-Level Thread (KLT)
User-Level Threads (ULTs)
– Thread management is done by the application.
– The kernel is not aware of the existence of threads.
– Not the kind we’ve discussed so far.
Advantages of ULTs
– Thread switching does not require kernel-mode privileges (no mode switches).
– Scheduling can be application specific.
– ULTs can run on any OS.
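A rough sketch of what "thread management is done by the application" can look like, assuming a Linux/glibc system that provides the `ucontext` functions (`getcontext`, `makecontext`, `swapcontext`). Everything named `ult_*`, `sched_ctx`, and `trace` is hypothetical. Two user-level threads yield cooperatively to a tiny round-robin scheduler that lives entirely in the program; the kernel sees only one thread. Note that on Linux `swapcontext` may still issue a system call to save the signal mask, so this illustrates application-level scheduling rather than a fully kernel-free switch.

```c
#include <assert.h>
#include <ucontext.h>

#define ULT_COUNT      2
#define ULT_STACK_SIZE (64 * 1024)  /* hypothetical per-thread stack size */

static ucontext_t sched_ctx;                      /* the scheduler's context */
static ucontext_t ult_ctx[ULT_COUNT];             /* one context per ULT */
static char ult_stack[ULT_COUNT][ULT_STACK_SIZE]; /* one stack per ULT */
static int ult_done[ULT_COUNT];
static char trace[16];                            /* records who ran, in order */
static int trace_len;

static void ult_body(int id)
{
    for (int step = 0; step < 2; step++) {
        assert(trace_len < (int)sizeof trace - 1);
        trace[trace_len++] = (char)('A' + id);
        swapcontext(&ult_ctx[id], &sched_ctx);    /* cooperative yield */
    }
    ult_done[id] = 1;
    /* Returning resumes uc_link, which is the scheduler. */
}

/* Runs both ULTs round-robin and returns the order in which they ran. */
const char *ult_demo(void)
{
    trace_len = 0;
    for (int id = 0; id < ULT_COUNT; id++) {
        ult_done[id] = 0;
        getcontext(&ult_ctx[id]);
        ult_ctx[id].uc_stack.ss_sp = ult_stack[id];
        ult_ctx[id].uc_stack.ss_size = ULT_STACK_SIZE;
        ult_ctx[id].uc_stack.ss_flags = 0;
        ult_ctx[id].uc_link = &sched_ctx;
        makecontext(&ult_ctx[id], (void (*)(void))ult_body, 1, id);
    }
    /* Application-specific scheduling policy: plain round-robin. */
    while (!ult_done[0] || !ult_done[1])
        for (int id = 0; id < ULT_COUNT; id++)
            if (!ult_done[id])
                swapcontext(&sched_ctx, &ult_ctx[id]);
    trace[trace_len] = '\0';
    return trace;
}
```

With the round-robin policy above, `ult_demo()` returns the string "ABAB". Changing the loop in `ult_demo` is all it takes to change the scheduling policy, which is the sense in which ULT scheduling is application specific.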
Disadvantages of ULTs
– In a typical OS many system calls are blocking. As a result, when a ULT executes a system call, not only is that thread blocked, but all of the threads within the process are blocked.
– In a pure ULT strategy, a multithreaded application cannot take advantage of multiprocessing, because the kernel assigns the whole process to one processor at a time.
Overcoming ULT Disadvantages
– Jacketing: converts a blocking system call into a non-blocking system call.
– Writing the application as multiple processes rather than multiple threads.
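Jacketing can be sketched as a wrapper that asks the kernel whether a call would block before issuing it. The sketch below assumes a POSIX system and uses the real `poll()` and `read()` calls; `jacket_read`, `ult_yield`, `demo_yield`, and `jacket_demo` are hypothetical names. In a real ULT library the yield hook would switch to another user-level thread; here a stand-in hook simply writes a byte to the pipe, playing the role of the other thread that produces the data.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical hook: a ULT library would switch to another user thread here. */
static void (*ult_yield)(void) = NULL;

/* Jacket around read(): only calls read() once poll() reports it will not block.
   With no hook installed this degenerates to busy-waiting. */
ssize_t jacket_read(int fd, void *buf, size_t count)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(buf != NULL);
    for (;;) {
        int ready = poll(&pfd, 1, 0);  /* timeout 0: poll() itself never blocks */
        if (ready < 0)
            return -1;
        if (ready > 0)
            return read(fd, buf, count);
        if (ult_yield)
            ult_yield();               /* let another user-level thread run */
    }
}

static int demo_pipe[2];
static int yield_calls;

/* Stand-in for "another user-level thread": it produces the awaited byte. */
static void demo_yield(void)
{
    ssize_t written = write(demo_pipe[1], "k", 1);
    (void)written;
    yield_calls++;
}

/* Returns the byte obtained through the jacket, or -1 on failure. */
int jacket_demo(void)
{
    char c = 0;
    ssize_t n;

    yield_calls = 0;
    if (pipe(demo_pipe) != 0)
        return -1;
    ult_yield = demo_yield;
    n = jacket_read(demo_pipe[0], &c, 1);
    close(demo_pipe[0]);
    close(demo_pipe[1]);
    return (n == 1) ? c : -1;
}
```

In `jacket_demo()` the first `poll()` finds the pipe empty, so the jacket yields instead of blocking the whole process; after the hook has written a byte, the second `poll()` succeeds and `read()` returns immediately.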
Kernel-Level Threads (KLTs)
– Thread management is done by the kernel (could call them KMT).
– No thread management is done by the application.
– Windows is an example of this approach.
Advantages of KLTs
– The kernel can simultaneously schedule multiple threads from the same process on multiple processors.
– If one thread in a process is blocked, the kernel can schedule another thread of the same process.
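The second advantage can be observed with a small sketch, assuming a system where Pthreads are kernel-level threads (as is commonly the case on Linux). The names `klt_blocking_demo`, `blocked_reader`, `writer`, and `chan` are hypothetical. One thread blocks in `read()` on an empty pipe; the program can only finish because the kernel keeps scheduling the second thread, which eventually writes the byte the first one is waiting for.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() under strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

static int chan[2];  /* pipe: chan[0] is the read end, chan[1] the write end */

static void *blocked_reader(void *arg)
{
    char *out = arg;

    assert(out != NULL);
    if (read(chan[0], out, 1) != 1)  /* blocks until a byte or EOF arrives */
        *out = '\0';
    return NULL;
}

static void *writer(void *arg)
{
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
    ssize_t written;

    (void)arg;
    nanosleep(&pause, NULL);         /* give the reader time to block first */
    written = write(chan[1], "z", 1);
    (void)written;
    return NULL;
}

/* Returns the byte the blocked thread finally received, or '\0' on failure. */
char klt_blocking_demo(void)
{
    pthread_t r, w;
    char got = '\0';

    if (pipe(chan) != 0)
        return '\0';
    if (pthread_create(&r, NULL, blocked_reader, &got) != 0) {
        close(chan[0]);
        close(chan[1]);
        return '\0';
    }
    if (pthread_create(&w, NULL, writer, NULL) == 0)
        pthread_join(w, NULL);
    close(chan[1]);                  /* if no writer ran, EOF releases the reader */
    pthread_join(r, NULL);
    close(chan[0]);
    return got;
}
```

Under a pure ULT library without jacketing, the blocking `read()` would suspend the whole process and the writer would never get to run.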
Disadvantage of KLTs
– The transfer of control from one thread to another within the same process requires a mode switch to the kernel.
Multiple Cores & Multithreading
– Multithreading and multicore chips have the potential to improve the performance of applications that have large amounts of parallelism.
– Gaming, simulations, etc. are examples.
– Performance doesn’t necessarily scale linearly with the number of cores …
Amdahl’s Law
– Speedup depends on the amount of code that must be executed sequentially.
– Formula:
\[ \text{Speedup} = \frac{\text{time to execute on a single processor}}{\text{time to execute on } N \text{ parallel processors}} = \frac{1}{(1 - f) + \frac{f}{N}} \]
where f is the fraction of the code that is parallelizable.
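As a quick numerical check of the formula, here is a minimal sketch in C. The function name `amdahl_speedup` is hypothetical; `f` and `n` correspond to f and N above.

```c
#include <assert.h>

/* Speedup predicted by Amdahl's Law for a parallelizable fraction f
   (0 <= f <= 1) running on n processors (n >= 1). */
double amdahl_speedup(double f, int n)
{
    assert(f >= 0.0 && f <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - f) + f / n);
}
```

For example, with f = 0.9 and N = 8 the denominator is 0.1 + 0.1125 = 0.2125, so the speedup is about 4.7 rather than 8. As N grows, the speedup approaches 1 / (1 - f) = 10, because the sequential 10% of the code never gets faster.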
Multithreading Models
Three common ways of establishing a relationship between user and kernel threads:
– Many-to-One
– One-to-One
– Many-to-Many
Many-to-One (User-Level Threads)
– Many user-level threads are mapped to a single kernel thread.
Many-to-One (User-Level Threads)
Advantages
– Thread switching does not involve the kernel: no mode switching.
– Scheduling can be application specific: choose the best algorithm.
– ULTs can run on any OS: only a thread library is needed.
Disadvantages
– Most system calls are blocking and the kernel blocks processes: all threads within the process will be blocked.
– The kernel can only assign processes to processors: threads within the same process cannot run simultaneously on multiple processors.
One-to-One (Kernel-Level Threads)
– Each user-level thread maps to a kernel thread.
One-to-One (Kernel-Level Threads)
Advantages
– Multiple threads can run concurrently on different processors.
– When one user thread and its kernel thread block, the other user threads can continue to execute, since their kernel threads are unaffected.
Disadvantages
– Creating a user thread requires creating the corresponding kernel thread.
– The overhead of creating kernel threads can burden the performance of the application.
Many-to-Many Model
– Allows many user-level threads to be mapped to many kernel threads.
– The idea is to combine the best of both approaches.
Thread Libraries
A thread library provides the programmer with an API for creating and managing threads.
Approaches
1) Provide a library entirely in user space with no kernel support. All code and data structures for the library exist in user space, so invoking a library function is a local function call in user space and not a system call.
2) Implement a kernel-level library supported directly by the OS. Code and data structures for the library exist in kernel space, so invoking a library function typically results in a system call to the kernel.
Thread libraries
a) POSIX Pthreads (may be provided as either a user-level or a kernel-level library)
b) Windows (a kernel-level library)
c) Java (the Java thread API allows threads to be created and managed directly in Java programs. However, because in most instances the JVM is running on top of a host OS, the Java thread API is generally implemented using a thread library available on the host system. E.g., on Windows systems Java threads are typically implemented using the Windows API, while UNIX and Linux systems often use Pthreads.)
Strategies for creating multiple threads
– Asynchronous threading: once the parent creates a child thread, the parent resumes its execution, so the parent and child execute concurrently. Each thread runs independently, and the parent thread need not know when its child terminates. There is typically little data sharing between the threads.
– Synchronous threading: the parent thread creates one or more children and then must wait for all of its children to terminate before it resumes (the so-called fork-join strategy). The threads created by the parent perform work concurrently, but the parent cannot continue until this work has been completed. Once a thread has finished its work, it terminates and joins with its parent. Only after all of the children have joined can the parent resume execution. Synchronous threading typically involves significant data sharing among threads; e.g., the parent thread may combine the results calculated by its various children.
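The fork-join strategy with several children can be sketched as follows, assuming Pthreads. The names `fork_join_sum`, `sum_range`, `struct range`, and `NUM_CHILDREN` are hypothetical. The parent splits the range 1..upper into chunks, each child sums one chunk into its own slot, and only after joining all children does the parent combine the partial results.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_CHILDREN 4  /* hypothetical number of child threads */

struct range {
    long lo, hi;   /* inclusive bounds handled by one child */
    long partial;  /* result written by that child */
};

static void *sum_range(void *arg)
{
    struct range *r = arg;

    r->partial = 0;
    for (long i = r->lo; i <= r->hi; i++)
        r->partial += i;
    return NULL;
}

/* Returns 1 + 2 + ... + upper, computed by NUM_CHILDREN joined children. */
long fork_join_sum(long upper)
{
    pthread_t child[NUM_CHILDREN];
    struct range part[NUM_CHILDREN];
    int started[NUM_CHILDREN];
    long chunk, total = 0;

    assert(upper >= 0);
    chunk = upper / NUM_CHILDREN;

    /* Fork: hand each child its own chunk; the last child takes the remainder. */
    for (int i = 0; i < NUM_CHILDREN; i++) {
        part[i].lo = i * chunk + 1;
        part[i].hi = (i == NUM_CHILDREN - 1) ? upper : (i + 1) * chunk;
        part[i].partial = 0;
        started[i] = (pthread_create(&child[i], NULL, sum_range, &part[i]) == 0);
        if (!started[i])
            sum_range(&part[i]);  /* fall back to doing this chunk in the parent */
    }

    /* Join: the parent resumes only after every child has terminated. */
    for (int i = 0; i < NUM_CHILDREN; i++) {
        if (started[i])
            pthread_join(child[i], NULL);
        total += part[i].partial;  /* the parent combines the children's results */
    }
    return total;
}
```

Because each child writes only to its own `struct range`, no locking is needed; the joins guarantee that every partial result is complete before the parent reads it. For instance, `fork_join_sum(100)` yields 5050.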
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum;                   /* this data is shared by the thread(s) */
void *runner(void *param); /* threads call this function */

int main(int argc, char *argv[])
{
    pthread_t tid;       /* the thread identifier */
    pthread_attr_t attr; /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }
    if (atoi(argv[1]) < 0) {
        fprintf(stderr, "%d must be >= 0\n", atoi(argv[1]));
        return -1;
    }

    pthread_attr_init(&attr);                     /* get the default attributes */
    pthread_create(&tid, &attr, runner, argv[1]); /* create the thread */
    pthread_join(tid, NULL);                      /* wait for the thread to exit */

    printf("sum = %d\n", sum);
    return 0;
}

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);

    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;

    pthread_exit(0);
}
A single thread of control begins in main(). After initialization, main() creates a second thread that begins control in the runner() function. Both threads share the global data sum.
– pthread_t tid declares the identifier for the thread we will create.
– pthread_attr_t attr declares the attributes for the thread (such as stack size and scheduling information). The call pthread_attr_init(&attr) sets them to the default attributes provided.
– pthread_create() creates the thread. Its arguments are the thread identifier, the attributes for the thread, the name of the function where the new thread will begin execution (here runner()), and the argument passed to that function (here argv[1], the integer parameter provided on the command line).
The program now has two threads: the initial (or parent) thread in main() and the summation (or child) thread performing the summation in runner(). The program follows the fork-join strategy: after creating the summation thread, the parent thread waits for it to terminate by calling pthread_join(). The summation thread terminates when it calls pthread_exit(). Once the summation thread has returned, the parent thread outputs the value of the shared data sum.
With the growing dominance of multicore systems, writing programs containing several threads has become increasingly common. A simple method for waiting on several threads using the pthread_join() function is to enclose the operation within a simple for loop.
Pthread code for joining ten threads.
#define NUM_THREADS 10

/* an array of threads to be joined upon */
pthread_t workers[NUM_THREADS];

for (int i = 0; i < NUM_THREADS; i++)
    pthread_join(workers[i], NULL);
Question 1
Which of the following components of program state are shared across threads in a multithreaded process?
a. Register values
b. Heap memory
c. Global variables
d. Stack memory
Question 2
Can a multithreaded solution using multiple user-level threads achieve better performance on a multiprocessor system than on a single-processor system?
Question 3
Consider a multiprocessor system and a multithreaded program written using the many-to-many threading model. Let the number of user-level threads in the program be more than the number of processors in the system. Discuss the performance implications of the following scenarios.
a. The number of kernel threads allocated to the program is less than the number of processors.
b. The number of kernel threads allocated to the program is equal to the number of processors.
c. The number of kernel threads allocated to the program is greater than the number of processors but less than the number of user-level threads.