Chapter 4 – Threads (Pgs 153 – 174)
Threads
- A "basic unit of CPU utilization"
- A technique that assists in performing parallel computation by setting up sharing for you
- A thread consists of:
  1. Register set (values), including the program counter (PC)
  2. A stack
  3. Shared code, data, and files, with the other threads in the same process
- A sub-component of a process
Threading
- Until now, all our applications have been single-threaded (as opposed to multi-threaded)
- Threads are sometimes called "lightweight" processes
- Threads are less useful on single-CPU (single-core) systems
- On multi-CPU/multi-core systems, threads allow a single process to use multiple CPUs
Figure 4.1: Threading
Why threads?
Benefits
- Responsiveness: e.g., MS Word saving a file with one thread and handling input with another
- Resource sharing: automatic sharing of code and (some) data within an application
- Economy: cheaper to create and less memory-intensive than a process
- Scalability(?): allows a process to use multiple CPUs/cores
Threads vs. Processes
- Threads can have somewhat less overhead than processes
- Threads can use less memory than processes
- Threads can require more synchronisation on non-stack variables (e.g., globals, objects)
- Differences are minimal in many modern OSes (e.g., some versions of Linux)
- Threads may not be fully available in some OSes (i.e., limited functionality)
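The synchronisation point above can be sketched with a Pthreads mutex. This is a minimal illustration, not a prescribed pattern: the names `counter`, `counter_lock`, `add_to_counter`, and `run_counter_demo`, and the constants `NUM_THREADS` and `INCREMENTS`, are hypothetical and chosen only for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4        /* hypothetical values, chosen for illustration */
#define INCREMENTS  100000

static long counter = 0;     /* global: shared by every thread in the process */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: adds INCREMENTS to the shared counter, one step at a time. */
static void *add_to_counter(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {   /* i lives on this thread's own stack */
        pthread_mutex_lock(&counter_lock);   /* only one thread updates at a time */
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NUM_THREADS workers; returns the final counter value, or -1 on error. */
long run_counter_demo(void)
{
    pthread_t tid[NUM_THREADS];
    int started = 0;

    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&tid[started], NULL, add_to_counter, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return (started == NUM_THREADS) ? counter : -1;
}
```

The loop index needs no protection because each thread has its own stack; only the global `counter` does. Without the lock, increments from different threads could interleave and some updates could be lost.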
Programming Challenges
- Finding independent activities that can be run in parallel
- Ensuring that an activity does enough work to justify the overhead of creating a thread
- Dividing data sets to support the threads and avoiding data dependencies
- Synchronisation of the threads
- Testing/Debugging: thread scheduling (ordering) permutations, reproducing an error
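One way to picture "dividing data sets" and "avoiding data dependencies" is to give each thread a private slice of an array and a private result slot. The sketch below assumes a fixed thread count; `struct slice`, `sum_slice`, `parallel_sum`, and `NUM_THREADS` are hypothetical names for this illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4   /* hypothetical thread count, chosen for illustration */

/* Work description for one thread: a private slice of the array. */
struct slice {
    const int *data;
    size_t     begin;   /* first index (inclusive) */
    size_t     end;     /* last index (exclusive) */
    long       sum;     /* result slot written by exactly one thread */
};

/* Thread body: sums its own slice and stores the result in its own slot. */
static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    long total = 0;
    for (size_t i = s->begin; i < s->end; i++)
        total += s->data[i];
    s->sum = total;
    return NULL;
}

/* Sums n integers using up to NUM_THREADS threads; returns the total. */
long parallel_sum(const int *data, size_t n)
{
    pthread_t    tid[NUM_THREADS];
    struct slice part[NUM_THREADS];
    int          started[NUM_THREADS];
    size_t       chunk = n / NUM_THREADS;
    long         total = 0;

    for (int t = 0; t < NUM_THREADS; t++) {
        part[t].data  = data;
        part[t].begin = (size_t)t * chunk;
        /* the last thread also takes any leftover elements */
        part[t].end   = (t == NUM_THREADS - 1) ? n : (size_t)(t + 1) * chunk;
        part[t].sum   = 0;
        started[t] = (pthread_create(&tid[t], NULL, sum_slice, &part[t]) == 0);
        if (!started[t])
            sum_slice(&part[t]);   /* fall back to doing this slice inline */
    }
    for (int t = 0; t < NUM_THREADS; t++) {
        if (started[t])
            pthread_join(tid[t], NULL);   /* wait before reading the slot */
        total += part[t].sum;
    }
    return total;
}
```

Because no two threads write the same memory, the only synchronisation needed is the join. For a small array the cost of creating the threads would likely outweigh the work done, which is the overhead concern listed above.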
User vs. Kernel Threads
- If a thread can be independently scheduled by the OS, it is a kernel thread
  - This is really what is meant when we say "lightweight process"
- If creation and scheduling are done in a library or by a "user" application, the threads are called user threads
  - Lightweight, easy to make, no OS support needed
  - All threads block if one blocks (e.g., in a system call), can't use multiple CPUs, best used for process organisation
Multithreading Models
- Many:1 Model
  - No OS thread support, only user threads
  - Usually a library, e.g., GNU Portable Threads
- 1:1 Model
  - User thread is just an interface to the OS (kernel thread), true lightweight process
  - Can overload the OS, so limits exist
- Many:Many (Hybrid) Model
  - Arbitrary mapping, best of both worlds
  - Complicated to implement and use
Thread Libraries
- An API for programmers to use threads in their applications
- Pthreads – part of POSIX, may be user or kernel level
- Win32 Threads – Windows kernel thread library
- Many others, e.g., GNU Portable Threads, Green Threads
- Some programming languages provide threads as a language feature (e.g., Java, µC++)
- Use man pthreads for info about the library on cs.smu.ca
Pthreads
- Specification, NOT implementation
- Use
  - Each thread is created with a pthread_attr_t attributes object (NULL selects the defaults)
  1. Initialise: pthread_attr_init()
  2. Create: pthread_create()
  3. Exit: pthread_exit()
  4. Wait: pthread_join()
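The four steps can be sketched in one short function. This is a minimal illustration: `square` and `square_in_thread` are hypothetical names, and passing an integer through the `void *` argument via `intptr_t` is a common shortcut rather than something the Pthreads specification requires.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread body: squares the integer passed in and hands it back on exit. */
static void *square(void *arg)
{
    intptr_t n = (intptr_t)arg;
    pthread_exit((void *)(n * n));      /* step 3: exit, passing back a result */
}

/* Runs the four steps once; returns n*n, or -1 if any call fails. */
long square_in_thread(long n)
{
    pthread_attr_t attr;
    pthread_t      tid;
    void          *result;
    int            rc;

    if (pthread_attr_init(&attr) != 0)                  /* step 1: initialise */
        return -1;
    rc = pthread_create(&tid, &attr, square,            /* step 2: create */
                        (void *)(intptr_t)n);
    pthread_attr_destroy(&attr);   /* attributes are not needed after creation */
    if (rc != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)                /* step 4: wait */
        return -1;
    return (long)(intptr_t)result;
}
```

Returning from the thread function has the same effect as calling `pthread_exit()` with the returned value, so step 3 is often implicit in real code.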
Issues in Threading
- fork(): When a copy of a process is made, should a copy of all its threads also be made, or of just the thread calling fork()?
- Cancellation (killing a thread): resources (e.g., disk buffers) are shared between threads but not between processes, so cancelling one thread does not automatically free them
- Scheduling in many:many models
- Signals: which thread gets a signal?
  - The thread to which the signal applies, or the currently executing thread(s)
  - All threads, or some subset of threads
  - A signal-handling thread
  - What to do really depends on the signal generated
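The "signal-handling thread" option can be sketched with POSIX calls: block the signal in every thread, then let one dedicated thread collect it with `sigwait()`. The example assumes SIGUSR1 as the signal of interest; `signal_waiter` and `run_signal_thread_demo` are hypothetical names for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Thread body: waits synchronously for SIGUSR1 and records what arrived. */
static void *signal_waiter(void *arg)
{
    int     *received = arg;
    sigset_t set;
    int      sig = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigwait(&set, &sig) == 0)
        *received = sig;
    return NULL;
}

/* Returns the signal number seen by the waiter thread, or -1 on error. */
int run_signal_thread_demo(void)
{
    sigset_t  set, old;
    pthread_t tid;
    int       received = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block SIGUSR1 here; threads created afterwards inherit this mask. */
    if (pthread_sigmask(SIG_BLOCK, &set, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_waiter, &received) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    kill(getpid(), SIGUSR1);      /* signal sent to the process as a whole */
    pthread_join(tid, NULL);      /* the waiter returns once it has the signal */
    pthread_sigmask(SIG_SETMASK, &old, NULL);   /* restore the original mask */
    return received;
}
```

Because the signal is blocked in every thread, it stays pending on the process until the waiter picks it up, so the answer to "which thread gets it" is decided by the program rather than by the scheduler.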
Thread Pools
- Automatically create a set (pool) of threads when a process is created
- Processes can use and reuse the threads in their pool, but cannot create more
- Extra startup overhead, but better runtime performance if many threads are started/stopped (e.g., web browsers)
- Pool size can be dynamic, with changes based on number of processes, CPU usage, free memory, etc.
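A fixed-size pool can be sketched with a mutex, a condition variable, and a small task queue. This is one possible design under stated assumptions, not how any particular OS or library implements pools: every name here (`struct pool`, `pool_init`, `pool_submit`, `pool_shutdown`, `pool_demo`, `POOL_SIZE`, `QUEUE_CAP`) is hypothetical, and the pool size is fixed rather than dynamic.

```c
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 3      /* hypothetical sizes, chosen for illustration */
#define QUEUE_CAP 16

typedef void (*task_fn)(void *);
struct task { task_fn fn; void *arg; };

struct pool {
    pthread_t       workers[POOL_SIZE];
    size_t          nworkers;           /* threads actually created */
    struct task     queue[QUEUE_CAP];   /* ring buffer of pending tasks */
    size_t          head, count;
    int             shutting_down;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
};

/* Each pool thread loops: wait for a task, run it, repeat until shutdown. */
static void *worker_loop(void *arg)
{
    struct pool *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->shutting_down)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {            /* shutting down and queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        struct task t = p->queue[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_mutex_unlock(&p->lock);
        t.fn(t.arg);                    /* run the task outside the lock */
    }
}

/* Lets queued tasks finish, then joins every worker. */
void pool_shutdown(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->shutting_down = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (size_t i = 0; i < p->nworkers; i++)
        pthread_join(p->workers[i], NULL);
    pthread_mutex_destroy(&p->lock);
    pthread_cond_destroy(&p->not_empty);
}

/* Creates all POOL_SIZE threads up front. Returns 0 on success, -1 on failure. */
int pool_init(struct pool *p)
{
    p->nworkers = p->head = p->count = 0;
    p->shutting_down = 0;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pthread_create(&p->workers[p->nworkers], NULL, worker_loop, p) != 0) {
            pool_shutdown(p);           /* stop the threads that did start */
            return -1;
        }
        p->nworkers++;
    }
    return 0;
}

/* Queues a task for an existing thread; returns -1 if the queue is full. */
int pool_submit(struct pool *p, task_fn fn, void *arg)
{
    int rc = -1;
    pthread_mutex_lock(&p->lock);
    if (p->count < QUEUE_CAP && !p->shutting_down) {
        size_t tail = (p->head + p->count) % QUEUE_CAP;
        p->queue[tail].fn  = fn;
        p->queue[tail].arg = arg;
        p->count++;
        rc = 0;
        pthread_cond_signal(&p->not_empty);
    }
    pthread_mutex_unlock(&p->lock);
    return rc;
}

/* Example task: each task marks its own slot, so no extra locking is needed. */
static void mark_done(void *arg) { *(int *)arg = 1; }

/* Runs n (at most QUEUE_CAP) tasks on the pool; returns how many completed. */
int pool_demo(int n)
{
    struct pool p;
    int done[QUEUE_CAP] = {0};
    int total = 0;

    if (n > QUEUE_CAP) n = QUEUE_CAP;
    if (pool_init(&p) != 0) return -1;
    for (int i = 0; i < n; i++)
        pool_submit(&p, mark_done, &done[i]);
    pool_shutdown(&p);
    for (int i = 0; i < QUEUE_CAP; i++) total += done[i];
    return total;
}
```

The thread-creation cost is paid once in `pool_init()`; after that, `pool_submit()` only touches the queue, which is where the runtime saving described above would come from.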
Thread Data
- Threads all share the data of a process (except that each has its own stack)
- Sometimes, a thread needs its own data (i.e., like a process, but with shared code)
- Not easy to achieve, and often more work than using processes (particularly when the OS shares code pages among processes)
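For small amounts of per-thread data, Pthreads provides thread-specific data keys as one possible mechanism. The sketch below shows two threads storing different values under the same key; `slot`, `make_slot`, `my_value`, `worker`, and `run_thread_data_demo` are hypothetical names, and storing an integer directly in the `void *` value is a shortcut for illustration.

```c
#include <pthread.h>
#include <stdint.h>

static pthread_key_t  slot;   /* one key, but a separate value per thread */
static pthread_once_t slot_once = PTHREAD_ONCE_INIT;

/* Creates the key; run exactly once via pthread_once(). */
static void make_slot(void) { pthread_key_create(&slot, NULL); }

/* Reads the calling thread's private value back through the key. */
static long my_value(void)
{
    return (long)(intptr_t)pthread_getspecific(slot);
}

/* Thread body: stores its argument privately, then returns ten times it. */
static void *worker(void *arg)
{
    pthread_setspecific(slot, arg);       /* visible only to this thread */
    return (void *)(intptr_t)(my_value() * 10);
}

/* Starts two threads with different private values.
   Returns 1 if each saw only its own value, 0 if not, -1 on error. */
int run_thread_data_demo(void)
{
    pthread_t a, b;
    void *ra, *rb;

    pthread_once(&slot_once, make_slot);
    if (pthread_create(&a, NULL, worker, (void *)(intptr_t)1) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, (void *)(intptr_t)2) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, &ra);
    pthread_join(b, &rb);
    /* The calling thread never set a value, so its slot is still NULL (0). */
    return (intptr_t)ra == 10 && (intptr_t)rb == 20 && my_value() == 0;
}
```

This covers individual values only. Giving a thread a fully private copy of the process's data, as the slide describes, remains more work than simply using separate processes.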
Cloning (Linux)
- fork() calls the clone() system call with minimal sharing
- pthread_create() calls clone() with maximal sharing
- Various "halfway" points exist and can be created
- Processes and threads are not very different
- Probably what future operating systems will generally be like
To Do:
- Work on Assignment 1
- Finish reading Chapter 4 (pgs ; this lecture) if you haven’t already
- Read Chapter 5 (pgs ; next lecture)