Chapter 4: Threads. Book: Operating System Principles, 9th Edition, Abraham Silberschatz, Peter Baer Galvin, Greg Gagne.

Threads
A thread is a basic unit of CPU utilization. It comprises a thread ID, a program counter, a register set, and a stack. A thread shares with the other threads belonging to the same process the code section, the data section, and operating-system resources such as signals and open files. A heavyweight (single-threaded) process can perform one task at a time, whereas a multithreaded process can perform more than one task at a time.
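The split between shared and private state can be sketched in a few lines of Python. This is only an illustration: the names `shared_total`, `worker`, and `run_workers` are hypothetical, and Python's `threading` module stands in for whatever thread library the host system provides.

```python
import threading

shared_total = 0               # process-wide data: visible to every thread
total_lock = threading.Lock()  # protects updates to the shared variable

def worker(amount, repeats):
    """Accumulate privately, then publish the result to the shared total."""
    global shared_total
    local_sum = 0              # local variable: private to this thread's stack
    for _ in range(repeats):
        local_sum += amount
    with total_lock:           # shared data needs synchronization
        shared_total += local_sum

def run_workers():
    """Start three threads that all update the same shared variable."""
    global shared_total
    shared_total = 0
    threads = [threading.Thread(target=worker, args=(n, 1000)) for n in (1, 2, 3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_total
```

Each thread keeps its own `local_sum`, while all three write to the single `shared_total`, which is why only the shared update is guarded by a lock.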

Threads (Contd.)
Many applications are implemented as a separate process with multiple threads of control (e.g. a web browser or a word processor). Multithreaded processes reduce the time a client spends waiting, because several tasks are processed concurrently. Threads play a vital role in remote procedure call (RPC) systems by allowing requests to be processed concurrently. Most operating-system kernels are now multithreaded, with each kernel thread performing a specific task.

Benefits
Responsiveness: a multithreaded web browser, for example, can still allow user interaction in one thread while an image is being loaded in another thread. Resource sharing: threads share the memory and resources of the process to which they belong, so an application can have several different threads of activity within the same address space. Economy: because threads share resources, creating and managing them takes less time than creating and managing processes (in Solaris 2, creating a process is about 30 times slower than creating a thread, and context switching between processes is about five times slower). Utilization of multiprocessor architectures: threads can run in parallel on different processors, which increases concurrency and efficiency.

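Multithreaded programming provides a mechanism for more efficient use of multiple computing cores and improved concurrency. Concurrency: more than one task makes progress, possibly by interleaving on a single core. Parallelism: more than one task executes at the same instant, which requires multiple cores.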

The trend towards multicore systems continues to place pressure on system designers and application programmers to make better use of multiple computing cores. Designers of operating systems must write scheduling algorithms that use multiple processing cores to allow parallel execution. Five areas present challenges in programming for multicore systems: identifying tasks, balance, data splitting, data dependency, and testing and debugging.

Data parallelism: distribute subsets of the same data across multiple computing cores and perform the same operation on each core. Task parallelism: distribute tasks (threads) across multiple computing cores, with each thread performing a unique operation. Different threads may operate on the same data or on different data.
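The two decompositions can be sketched as follows. The function names are hypothetical, and the sketch assumes Python's `concurrent.futures` thread pool. In the standard CPython build the global interpreter lock keeps these threads from executing bytecode in parallel, so the example shows how the work is divided rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def data_parallel_sum(values, workers=2):
    """Data parallelism: the same operation (sum) on different slices of the data."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    slices = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(sum, slices)   # one slice per task
        return sum(partial_sums)

def task_parallel_stats(values):
    """Task parallelism: different operations (min, max, sum) on the same data."""
    operations = (("min", min), ("max", max), ("sum", sum))
    with ThreadPoolExecutor(max_workers=len(operations)) as pool:
        futures = {name: pool.submit(fn, values) for name, fn in operations}
        return {name: future.result() for name, future in futures.items()}
```

For instance, `data_parallel_sum(list(range(1, 101)), workers=4)` splits the list into four slices and adds the partial sums, while `task_parallel_stats([3, 1, 2])` runs three different operations over the same list.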

User Threads
User-level threads are supported above the kernel and are implemented by a thread library at the user level. The library provides thread creation, scheduling, and management in user space, with no support from the kernel. User-level threads are therefore generally fast to create and manage. If the kernel is single-threaded, a user-level thread that performs a blocking system call causes the entire process to block. User-level thread libraries include POSIX Pthreads, Mach C-threads, and Solaris 2 UI-threads.
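A toy illustration of scheduling entirely in user space is given below. It is an analogy, not how any of the libraries named above is implemented: Python generators stand in for user-level threads, and the hypothetical `run_round_robin` function plays the role of the library's scheduler. The kernel sees only one thread throughout.

```python
from collections import deque

def user_thread(name, steps, log):
    """A user-level 'thread': a generator that yields to give up the CPU."""
    for i in range(steps):
        log.append(f"{name}{i}")   # one unit of work
        yield                      # voluntary switch back to the scheduler

def run_round_robin(threads):
    """User-space scheduler: a ready queue, with no kernel involvement."""
    ready = deque(threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the thread until its next yield
            ready.append(current)  # still runnable: back of the queue
        except StopIteration:
            pass                   # the thread has finished

log = []
run_round_robin([user_thread("A", 2, log), user_thread("B", 3, log)])
# log is now ["A0", "B0", "A1", "B1", "B2"]
```

Switching between these threads is just a function call, which is why user-level threads are cheap. The same design also shows the weakness noted above: if one generator made a blocking call, the scheduler loop, and with it every other thread, would stall.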

Kernel Threads
Kernel threads are supported directly by the kernel, which performs thread creation, scheduling, and management in kernel space. Kernel threads are generally slower to create and manage than user threads because thread management is done by the operating system. If a thread performs a blocking system call, the kernel can schedule another thread in the application for execution, in either a single-processor or a multiprocessor environment. Kernel threads are supported by Windows 2000, Windows NT, Solaris 2, Tru64 UNIX (Digital UNIX), BeOS, and Linux.

A relationship exists between user threads and kernel threads. Three common models describe it: many-to-one, one-to-one, and many-to-many.

Many to One Model
Maps many user-level threads to a single kernel thread. Advantage: thread management is done in user space, so the model is efficient. Disadvantages: the entire process blocks if a thread makes a blocking system call, and multiple threads cannot run in parallel on multiprocessors. Example: Green threads (a thread library) in Solaris 2 uses this model.

[Figure: Many to One Model]

One to One Model
Maps each user thread to a kernel thread. It provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call, and it allows multiple threads to run in parallel on multiprocessors. Disadvantage: creating a user thread requires creating the corresponding kernel thread, which can burden the performance of an application, so the number of threads supported by the system is restricted. Examples: Windows NT, Windows 2000, OS/2.

[Figure: One to One Model]

Many to Many Model
Multiplexes many user-level threads onto a smaller or equal number of kernel threads. The number of kernel threads may be specific to either a particular application or a particular machine. Developers can create as many user threads as necessary, and the corresponding kernel threads can run in parallel on multiprocessors. When a thread performs a blocking system call, the kernel can schedule another thread for execution. Examples: Solaris 2, IRIX, HP-UX, and Tru64 UNIX.

[Figure: Many to Many Model]

Two-level Model
Similar to the many-to-many model, except that it also allows a user thread to be bound to a kernel thread. Examples: IRIX, HP-UX, Tru64 UNIX, and Solaris 8 and earlier.

[Figure: Two-level Model]

Thread Library
A thread library provides the programmer with an API for creating and managing threads. There are two primary ways of implementing a thread library. The first is to provide a library entirely in user space with no kernel support: the code and data structures for the library exist in user space, and invoking a function in the library results in a local function call, not a system call. The second is to implement a kernel-level library supported directly by the operating system: the code and data structures for the library exist in kernel space, and invoking a function in the library results in a system call to the kernel.

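Common thread libraries: POSIX Pthreads (provided as either a user-level or a kernel-level library); the Win32 thread library (a kernel-level library); and Java threads (typically implemented using the thread library of the host system, for example the Win32 API on Windows or Pthreads on UNIX and Linux).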

Pthreads
Pthreads refers to the POSIX standard (IEEE 1003.1c) that defines an API for thread creation and synchronization. The API specifies the behavior of the thread library; the implementation is up to the developers of the library. Pthreads is common in UNIX-type operating systems (Solaris, Linux, Mac OS X). A multithreaded C program can be implemented using the Pthreads API.

Win32 Threads
The technique used for creating threads with the Win32 API is similar to the one used in Pthreads.

Each Windows thread contains a thread ID, a register set, separate user and kernel stacks, and a private data storage area. The register set, stacks, and private storage area are known as the context of the thread. The primary data structures of a thread are ETHREAD (executive thread block), KTHREAD (kernel thread block), and TEB (thread environment block).

The Java API provides a rich set of features for the creation and management of threads. Java threads may be created by extending the Thread class or by implementing the Runnable interface. Data is shared by passing a reference to the shared object to the appropriate threads.

Asynchronous threading: once the parent creates a child thread, the parent resumes its execution, so the parent and child execute concurrently. The threads are independent of each other. Synchronous threading: when the parent thread creates one or more children, it must wait for all of its children to terminate before it resumes.
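The synchronous strategy can be sketched with Python's `threading` module, where `join()` is the point at which the parent waits. The function `fork_join_squares` and its helper `child` are hypothetical names chosen for this example.

```python
import threading

def fork_join_squares(numbers):
    """Parent creates one child per number, waits for all, then combines results."""
    results = [None] * len(numbers)

    def child(index, value):
        results[index] = value * value   # each child writes only its own slot

    children = [threading.Thread(target=child, args=(i, n))
                for i, n in enumerate(numbers)]
    for t in children:
        t.start()                        # the parent creates the children
    for t in children:
        t.join()                         # the parent waits for every child
    return sum(results)                  # the parent resumes and combines
```

Dropping the `join()` loop would turn this into the asynchronous strategy: the parent would continue immediately and could not rely on `results` being complete.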

Threading issues: thread cancellation; signal handling; thread pools; thread-specific data; scheduler activations.

The fork and exec System Calls
Two versions of the fork system call exist. In the first, if one thread in a program calls fork, the new process duplicates all threads; this is useful when the new process does not call exec after forking. In the second, the new process is single-threaded and contains only the thread that called fork; this is useful when exec is called immediately after forking. If a thread invokes the exec system call, the program specified in the parameter to exec replaces the entire process (including all threads and LWPs).
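The fork-then-exec pattern can be sketched with Python's `os` module on a POSIX system. The function name `fork_and_exec` and the tiny child program are assumptions made for illustration. On POSIX systems `fork` duplicates only the calling thread, which corresponds to the second version described above.

```python
import os
import sys

def fork_and_exec(exit_code=7):
    """Fork a child that immediately execs a new program; return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the whole process image with a new Python interpreter.
        try:
            os.execv(sys.executable,
                     [sys.executable, "-c", f"import sys; sys.exit({exit_code})"])
        finally:
            os._exit(127)            # reached only if exec failed
    # Parent: wait for the child and decode how it exited.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

Because the child calls exec straight away, duplicating every thread of the parent would be wasted work: the new program replaces them all.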

Cancellation
Thread cancellation is the task of terminating a thread before it has completed (for example, when several threads search a database concurrently and one of them returns the result, or when a user stops a web page from loading). A thread that is to be cancelled is often referred to as the target thread. Cancellation of a target thread may occur in two different scenarios. Asynchronous cancellation: one thread immediately terminates the target thread. Deferred cancellation: the target thread periodically checks whether it should terminate, which gives it an opportunity to terminate itself in an orderly fashion.

Cancellation (Contd.)
Cancelling a thread asynchronously may not free a necessary system-wide resource, because the operating system often does not reclaim all resources from the cancelled thread (many operating systems use this mechanism). Deferred cancellation allows a thread to check whether it should be cancelled at a point where it can be cancelled safely (Pthreads refers to such points as cancellation points).
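Deferred cancellation can be sketched with a shared flag that the target thread polls at a safe point. Python's `threading` module has no call for asynchronous cancellation, so the example uses a `threading.Event` as the cancellation request. The names `run_cancellable_worker`, `cancel_requested`, and `progress` are hypothetical.

```python
import threading

def run_cancellable_worker(max_steps=10_000):
    """Start a target thread, request cancellation, and report what it did."""
    cancel_requested = threading.Event()
    started = threading.Event()
    progress = {"steps": 0, "cleaned_up": False}

    def target():
        started.set()
        try:
            for _ in range(max_steps):
                # Cancellation point: check the flag where stopping is safe.
                if cancel_requested.wait(timeout=0.001):
                    return
                progress["steps"] += 1      # one unit of work
        finally:
            progress["cleaned_up"] = True   # orderly release of resources

    worker = threading.Thread(target=target)
    worker.start()
    started.wait()             # make sure the target is running
    cancel_requested.set()     # request cancellation; do not kill the thread
    worker.join()              # the target terminates itself
    return progress
```

The target thread always runs its `finally` block, which is the orderly shutdown that asynchronous cancellation cannot guarantee.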

Signal Handling
A signal is used in UNIX systems to notify a process that a particular event has occurred. A signal may be received either synchronously or asynchronously. All signals follow the same pattern: a signal is generated by the occurrence of a particular event; the generated signal is delivered to a process; once delivered, the signal must be handled. Synchronous signals (for example, an illegal memory access or division by zero) are caused by an event internal to a running process and are delivered to the same process that performed the operation causing the signal.

Signal Handling (Contd.)
Asynchronous signals (for example, terminating a process with specific keystrokes or having a timer expire) are generated by an event external to a running process and are typically delivered to another process. Every signal may be handled by one of two possible handlers: a default signal handler, which the kernel runs when handling the signal, or a user-defined signal handler, a user-supplied function that is called to handle the signal instead. In single-threaded programs, signals are always delivered to the process, which is straightforward.

Signal Handling (Contd.)
Delivering signals in multithreaded programs is more complicated, because a process may have several threads. The following options exist: deliver the signal to the thread to which the signal applies (synchronous signals); deliver the signal to every thread in the process (some asynchronous signals, such as a signal that terminates the process); deliver the signal to certain threads in the process (UNIX allows a thread to specify which signals it will accept and which it will block); assign a specific thread to receive all signals for the process (Solaris 2, for asynchronous signals).
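The last two options can be sketched on a POSIX system with Python's `signal` module: the creating thread blocks a signal, and a dedicated thread accepts it with `sigwait`. This is an assumption-laden illustration, not the Solaris mechanism itself; the function name and the choice of `SIGUSR1` are hypothetical.

```python
import signal
import threading

def receive_signal_in_dedicated_thread(signum=signal.SIGUSR1):
    """Block `signum`, then let one dedicated thread accept it with sigwait()."""
    received = []
    # Block the signal in this thread; threads created from here inherit the mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        def signal_thread():
            # sigwait() suspends this thread until the signal is pending for it.
            received.append(signal.sigwait({signum}))

        receiver = threading.Thread(target=signal_thread)
        receiver.start()
        # Direct the signal at the dedicated thread rather than the whole process.
        signal.pthread_kill(receiver.ident, signum)
        receiver.join()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)  # restore the mask
    return received
```

The signal is never handled by a default or user-defined handler here: it stays pending because it is blocked, and the dedicated thread consumes it synchronously.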

Thread Pools
A multithreaded server (e.g. a web server) creates a separate thread to service each request, which is more efficient than creating a separate process. A multithreaded server still has potential problems: the time required to create the thread before serving the request, and the risk that an unlimited number of concurrently active threads exhausts system resources such as CPU time or memory. Thread pools resolve these issues. A number of threads are created at process startup and placed into a pool, where they wait for work. When the server receives a request, it awakens a thread from the pool (if one is available) and passes it the request. Once the thread completes its service, it returns to the pool and waits for more work.

Servicing a request with an existing thread is usually faster than waiting for a new thread to be created. A thread pool also limits the number of threads that exist at any one point, which is particularly important on systems that cannot support a large number of concurrent threads. The number of threads in the pool can be set according to several factors: the number of CPUs in the system, the amount of physical memory, and the expected number of concurrent client requests.

Thread Pools (Contd.)
More sophisticated thread-pool architectures adjust the number of threads in the pool dynamically according to usage patterns (for example, a smaller pool when the load on the system is low).
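A bounded pool can be sketched with `concurrent.futures.ThreadPoolExecutor` from the Python standard library. The function `serve_requests` and the request strings are hypothetical. One difference from the description above: this executor creates its worker threads lazily as work arrives, not all at startup, but it never exceeds `max_workers`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def serve_requests(requests, pool_size=3):
    """Service every request with at most `pool_size` reusable worker threads."""
    workers_used = set()
    lock = threading.Lock()

    def handle(request):
        with lock:
            workers_used.add(threading.get_ident())  # record which worker ran
        return f"handled {request}"

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        responses = list(pool.map(handle, requests))  # results in request order
    return responses, len(workers_used)
```

Calling `serve_requests(["a", "b", "c", "d", "e"], pool_size=3)` services five requests while never using more than three distinct threads, which is the resource limit a pool is meant to enforce.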

Thread-Specific Data
Threads belonging to a process share the data of the process. Thread-specific data is required in circumstances where each thread needs its own copy of certain data (for example, a transaction-processing system that services each transaction in a separate thread). Thread libraries that support thread-specific data include Win32, Pthreads, and Java.
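The transaction example can be sketched with `threading.local` from the Python standard library, which gives each thread its own view of an object's attributes. The names `transaction`, `process`, `describe`, and `run_transactions` are hypothetical.

```python
import threading

transaction = threading.local()   # attributes set here are private to each thread

def describe():
    """Read the current thread's own transaction id; no parameter is passed."""
    return f"txn-{transaction.id}"

def process(txn_id, results):
    transaction.id = txn_id       # visible only to the thread that set it
    results[txn_id] = describe()

def run_transactions(ids):
    """Service each transaction in a separate thread and collect the results."""
    results = {}
    threads = [threading.Thread(target=process, args=(i, results)) for i in ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Although `transaction` is a single global name, a thread that never assigned `transaction.id` does not see the values set by the others.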

Both the many-to-many and two-level models require communication between the kernel and the thread library to maintain the appropriate number of kernel threads allocated to the application. Scheduler activations provide upcalls, a communication mechanism from the kernel to the thread library. This communication allows an application to maintain the correct number of kernel threads.

Many systems implementing either the many-to-many or two-level model place an intermediate data structure between the user and kernel threads, typically known as a lightweight process, or LWP. To the user-thread library, the LWP appears to be a virtual processor on which the application can schedule a user thread to run. Each LWP is attached to a kernel thread, and it is kernel threads that the operating system schedules to run on physical processors. If a kernel thread blocks (such as while waiting for an I/O operation to complete), the LWP blocks as well. Up the chain, the user-level thread attached to the LWP also blocks.

An application may require any number of LWPs to run efficiently. Consider a CPU-bound application running on a single processor. In this scenario, only one thread can run at once, so one LWP is sufficient. An application that is I/O intensive, however, may require multiple LWPs to execute. Typically, an LWP is required for each concurrent blocking system call. Suppose, for example, that five different file-read requests occur simultaneously. Five LWPs are needed, because all five could be waiting for I/O completion in the kernel. If a process has only four LWPs, then the fifth request must wait for one of the LWPs to return from the kernel.
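The five-requests, four-LWPs scenario can be simulated as follows. This is a model only: a counting semaphore stands in for the pool of LWPs, `time.sleep` stands in for a blocking read in the kernel, and the function name and timings are assumptions.

```python
import threading
import time

def simulate_blocking_calls(requests=5, lwps=4, io_time=0.05):
    """Return (peak number of requests blocked at once, number completed)."""
    lwp_slots = threading.Semaphore(lwps)   # stand-in for the available LWPs
    lock = threading.Lock()
    state = {"in_kernel": 0, "peak": 0, "done": 0}

    def blocking_read():
        with lwp_slots:                     # wait here until an LWP is free
            with lock:
                state["in_kernel"] += 1
                state["peak"] = max(state["peak"], state["in_kernel"])
            time.sleep(io_time)             # the LWP is blocked on "I/O"
            with lock:
                state["in_kernel"] -= 1
                state["done"] += 1

    threads = [threading.Thread(target=blocking_read) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"], state["done"]
```

With the default arguments all five requests complete, but no more than four are ever blocked at the same time: the fifth proceeds only after one of the first four returns.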

