Thread & Processor Scheduling
CSSE 332 Operating Systems
Rose-Hulman Institute of Technology
Thread Scheduling
- Scheduling differs for user-level and kernel-level threads.
- In the many-to-one and many-to-many models, the thread library schedules user-level threads to run on an available LWP (lightweight process).
- This is known as process-contention scope (PCS), because the scheduling competition is among threads within the same process.
- Scheduling a kernel thread onto an available CPU uses system-contention scope (SCS): the competition is among all threads in the system.
Pthread Scheduling
- The API allows specifying either PCS or SCS during thread creation.
- PTHREAD_SCOPE_PROCESS schedules threads using PCS scheduling.
- PTHREAD_SCOPE_SYSTEM schedules threads using SCS scheduling.
- On some systems, only certain contention scopes are allowed. For example, Linux and Mac OS X allow only PTHREAD_SCOPE_SYSTEM.
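Because some systems accept only one contention scope, it can help to check what pthread_attr_setscope() actually returns. The sketch below is one way to do that; the helper names scope_name() and request_scope() are made up for illustration and are not part of the Pthreads API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Illustrative helper: printable name for a contention-scope constant. */
const char *scope_name(int scope)
{
    if (scope == PTHREAD_SCOPE_PROCESS) return "PTHREAD_SCOPE_PROCESS";
    if (scope == PTHREAD_SCOPE_SYSTEM)  return "PTHREAD_SCOPE_SYSTEM";
    return "unknown";
}

/*
 * Illustrative helper: ask for `requested` on a fresh attribute object and
 * return the scope the object holds afterwards, or -1 on error.
 * If the system rejects the request, the default scope is left in place.
 */
int request_scope(int requested)
{
    pthread_attr_t attr;
    int scope = -1;
    int rc;

    assert(requested == PTHREAD_SCOPE_PROCESS ||
           requested == PTHREAD_SCOPE_SYSTEM);

    if (pthread_attr_init(&attr) != 0)
        return -1;

    /* Pthreads functions return an error number instead of setting errno. */
    rc = pthread_attr_setscope(&attr, requested);
    if (rc != 0)
        fprintf(stderr, "%s was rejected (error %d)\n",
                scope_name(requested), rc);

    rc = pthread_attr_getscope(&attr, &scope);
    pthread_attr_destroy(&attr);
    return rc == 0 ? scope : -1;
}
```

Calling request_scope(PTHREAD_SCOPE_PROCESS) on a system that supports only SCS should print the rejection message and return PTHREAD_SCOPE_SYSTEM. Compile with -pthread.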
Pthread Scheduling API

    #include <pthread.h>
    #include <stdio.h>

    #define NUM_THREADS 5

    /* thread entry point, defined after main */
    void *runner(void *param);

    int main(int argc, char *argv[]){
        int i;
        pthread_t tid[NUM_THREADS];
        pthread_attr_t attr;

        /* get the default attributes */
        pthread_attr_init(&attr);

        /* set the scheduling scope to PROCESS or SYSTEM */
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

        /* set the scheduling policy - FIFO, RR, or OTHER */
        pthread_attr_setschedpolicy(&attr, SCHED_OTHER);

        /* create the threads */
        for (i = 0; i < NUM_THREADS; i++){
            pthread_create(&tid[i], &attr, runner, NULL);
        }

        /* main continues on the next slide */

Notes: Scheduling attributes are among the thread attributes that can be set. SCHED_OTHER is the most appropriate policy here because it selects the system's default scheduling policy.
Pthread Scheduling API

        /* now join on each thread */
        for (i = 0; i < NUM_THREADS; i++){
            pthread_join(tid[i], NULL);
        }

        return 0;
    }

    /* Each thread will begin control in this function */
    void *runner(void *param){
        printf("I am a thread\n");
        pthread_exit(0);
    }
Multiple-Processor Scheduling
- CPU scheduling is more complex when multiple CPUs are available.
- Homogeneous processors within a multiprocessor
- Asymmetric multiprocessing: only one processor accesses the system data structures, alleviating the need for data sharing.
- Symmetric multiprocessing (SMP): each processor is self-scheduling; all processes are in a common ready queue, or each processor has its own private queue of ready processes.
- Processor affinity: a process has affinity for the processor on which it is currently running.
  - soft affinity
  - hard affinity

Notes:
Homogeneous: all processors are identical in functionality, so any processor can run any process from the ready queue.
Asymmetric: one master processor handles all scheduling decisions, I/O processing, and other system activities; the other processors execute only user code. This form of multiprocessing is simple because only one processor accesses the system data structures, which reduces the need for data sharing.
Most modern systems are SMP systems.
Because invalidating and repopulating caches is costly, most SMP systems try to avoid migrating a process from one processor to another. This is known as processor affinity, and it takes several forms:
Soft affinity: the OS tries to keep a process on the same processor but makes no guarantees.
Hard affinity: some systems, e.g., Linux, provide system calls that let a process specify that it is not to migrate to other processors.
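On Linux, hard affinity is requested through the sched_setaffinity() and sched_getaffinity() system calls. The following is a minimal sketch that pins the calling thread to a single CPU; the helper names are made up for illustration, and the code is Linux-specific (it relies on _GNU_SOURCE and the CPU_* macros from glibc).

```c
#define _GNU_SOURCE   /* must precede every #include; exposes the affinity API */
#include <assert.h>
#include <sched.h>

/* Illustrative helper: lowest-numbered CPU the caller may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)  /* pid 0 = caller */
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Illustrative helper: number of CPUs in the caller's affinity mask, or -1. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/*
 * Illustrative helper: restrict the caller to `cpu` only (hard affinity).
 * Returns 0 on success and -1 on error, as sched_setaffinity() does.
 */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

A typical use is pin_to_cpu(first_allowed_cpu()), after which allowed_cpu_count() should report 1. Picking the CPU from the current mask rather than hard-coding CPU 0 avoids failures in environments where some CPUs are off limits to the process.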
Moving processes between CPUs
- Migration causes cache misses, which hurts performance.
- Processor affinity: a process has affinity for the processor on which it is currently running.
  - soft affinity
  - hard affinity
- But without migration, individual processors can become overloaded.
- Migration can happen in two ways:
  - Push: a system process notices the imbalance and moves some processes around.
  - Pull: an idle processor takes some processes from a busy one.
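The push and pull approaches can be illustrated with a toy model in which each CPU's ready queue is reduced to a task count. This is only a sketch under simplifying assumptions: the CPU count, the "take half of the busiest queue" rule, and the "balance until queues differ by at most one" rule are all choices made for the example, not the policy of any real scheduler.

```c
#include <assert.h>

#define NCPU 4   /* assumed number of CPUs for this sketch */

/*
 * Pull migration: if `idle_cpu` has an empty queue, it takes half of the
 * tasks from the busiest queue. Returns the number of tasks moved.
 */
int pull_migrate(int queue_len[NCPU], int idle_cpu)
{
    int busiest = 0;
    int moved;

    assert(idle_cpu >= 0 && idle_cpu < NCPU);
    if (queue_len[idle_cpu] != 0)
        return 0;                        /* not idle: nothing to pull */

    for (int c = 1; c < NCPU; c++)
        if (queue_len[c] > queue_len[busiest])
            busiest = c;

    moved = queue_len[busiest] / 2;
    queue_len[busiest] -= moved;
    queue_len[idle_cpu] += moved;
    return moved;
}

/*
 * Push migration: a balancer moves one task at a time from the longest
 * queue to the shortest until they differ by at most one task.
 * Returns the number of tasks moved.
 */
int push_balance(int queue_len[NCPU])
{
    int moved = 0;

    for (;;) {
        int hi = 0, lo = 0;

        for (int c = 1; c < NCPU; c++) {
            if (queue_len[c] > queue_len[hi]) hi = c;
            if (queue_len[c] < queue_len[lo]) lo = c;
        }
        if (queue_len[hi] - queue_len[lo] <= 1)
            return moved;                /* balanced enough */
        queue_len[hi]--;
        queue_len[lo]++;
        moved++;
    }
}
```

Every task counted in the return value is a migration, so each one stands for a cold cache on its new CPU. That is the cost that has to be weighed against the benefit of keeping all processors busy.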
NUMA and CPU Scheduling
- NUMA: Non-Uniform Memory Access.
- CPUs on a board have slower access to memory on other boards.
- This occurs on systems containing combined CPU and memory boards.

Notes: The main-memory architecture of a system can affect processor affinity. If the OS CPU scheduler and memory-placement algorithms work together, a process with affinity to a particular CPU can be allocated memory on the board where that CPU resides.
Multicore Processors
- Recent trend: place multiple processor cores on the same physical chip.
- Faster and consumes less power than systems in which each processor has its own physical chip.
- Multiple hardware threads per core is also a growing trend.
- Takes advantage of a memory stall to make progress on another thread while the memory retrieval happens.

Notes: A memory stall occurs when a processor accesses memory and must wait a significant amount of time for the data to become available. Cache misses are one possible cause of memory stalls. A multithreaded core can switch to another hardware thread while one is stalled waiting for memory.
Multithreaded Multicore System
[Figure: timeline of a single-threaded processor core. C = compute cycle, M = memory stall cycle.]
Multithreaded Multicore System
Example of a dual-threaded processor core: while one thread is stalled waiting on memory, the other hardware thread is scheduled to run its compute cycle.

Notes: If time is available, live code the example from trunk/Solutions/pThreadScheduling and test with students.
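The benefit of interleaving hardware threads can be estimated with a small cycle-by-cycle simulation. This is a toy model, not a description of real hardware. It assumes that every thread repeats a fixed pattern of compute cycles followed by memory-stall cycles, that switching threads costs nothing, and that the core always runs the lowest-numbered ready thread. The function name busy_cycles() is made up for the example.

```c
#include <assert.h>

#define MAX_HW_THREADS 8   /* assumed upper bound for this sketch */

/*
 * Simulate `total_cycles` on one core with `nthreads` hardware threads.
 * Each thread repeats: `compute` cycles of work (C), then `stall` cycles
 * waiting on memory (M). Returns the number of cycles in which the core
 * did useful work.
 */
int busy_cycles(int nthreads, int compute, int stall, int total_cycles)
{
    int work_left[MAX_HW_THREADS];    /* C cycles left in the current burst */
    int stall_left[MAX_HW_THREADS];   /* M cycles left; 0 means ready */
    int busy = 0;

    assert(nthreads >= 1 && nthreads <= MAX_HW_THREADS);
    assert(compute >= 1 && stall >= 0);

    for (int t = 0; t < nthreads; t++) {
        work_left[t] = compute;
        stall_left[t] = 0;
    }

    for (int cycle = 0; cycle < total_cycles; cycle++) {
        int ran = -1;

        /* Run the first thread that is not waiting on memory. */
        for (int t = 0; t < nthreads; t++) {
            if (stall_left[t] == 0) {
                ran = t;
                break;
            }
        }

        /* Outstanding memory requests make progress in the background. */
        for (int t = 0; t < nthreads; t++)
            if (t != ran && stall_left[t] > 0)
                stall_left[t]--;

        if (ran >= 0) {
            busy++;
            if (--work_left[ran] == 0) {   /* burst done: start a stall */
                work_left[ran] = compute;
                stall_left[ran] = stall;
            }
        }
    }
    return busy;
}
```

With 2 compute cycles followed by 2 stall cycles, busy_cycles(1, 2, 2, 8) returns 4 and busy_cycles(2, 2, 2, 8) returns 8. In this model the second hardware thread fills every stall of the first. When stalls are longer than compute bursts, two threads are not enough to keep the core busy: busy_cycles(2, 1, 3, 8) returns 4.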