CPU Scheduling Presentation by Colin McCarthy
Runqueues
● Foundation of the Linux scheduler algorithm
● Keeps track of all runnable tasks assigned to a CPU
● One runqueue for each CPU
Runqueues
[Figure: a runqueue with its lock and two priority arrays, Active and Expired, holding Task 1 through Task 5]
Priority Arrays
● O(1) performance
● Multiple tasks with the same priority are scheduled round robin
Priority Arrays
[Figure: Bitmap and Priority Array, with task labels T1, T2, T]
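The bitmap-plus-queues idea can be sketched as follows. This is a simplified model rather than kernel code: the class name, the method names, and the choice of 140 priority levels with 0 as the highest are assumptions made for illustration.

```python
from collections import deque

NUM_PRIOS = 140  # assumed number of priority levels; 0 is the highest


class PriorityArray:
    """One FIFO queue per priority plus a bitmap of non-empty queues."""

    def __init__(self):
        self.bitmap = 0  # bit p is set when queues[p] is non-empty
        self.queues = [deque() for _ in range(NUM_PRIOS)]

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def dequeue(self, task, prio):
        # deque.remove is linear; a kernel would use a linked list
        # so that removal is constant time as well.
        self.queues[prio].remove(task)
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)

    def pick_next(self):
        """Return the first task at the highest priority, or None."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit; the bitmap has a fixed width,
        # so this does not depend on how many tasks are queued.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        return self.queues[prio][0]

    def rotate(self, prio):
        """Round robin: move the head of a priority's queue to its tail."""
        self.queues[prio].rotate(-1)
```

Picking the next task only inspects the bitmap and the head of one queue, which is where the O(1) behavior comes from; `rotate` models the round-robin turn among tasks of equal priority.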
Calculating Priority
● Static task priority
– Nice value: changed with the nice() system call
– The scheduler never changes static priority, but the user can
● Dynamic task priority
– Adds to or subtracts from static priority with rewards and penalties
– I/O-bound tasks are rewarded, CPU-bound tasks are penalized
I/O-bound vs. CPU-bound
● Keep track of how much time a task spent sleeping (blocked on I/O) and how much time it spent running on the CPU
● The higher a task’s sleep average, the higher its dynamic priority
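A minimal sketch of how a sleep average might be turned into a reward or penalty. The constants (a priority range of 100 to 139 for normal tasks, a bonus range of ten levels, a one-second cap on the sleep average) are assumptions modeled loosely on the 2.6 scheduler and should be read as illustrative. Note that a numerically lower value means a higher priority here.

```python
MAX_RT_PRIO = 100        # assumed: first priority value for normal tasks
MAX_PRIO = 140           # assumed: one past the lowest priority
MAX_BONUS = 10           # assumed: width of the reward/penalty range
MAX_SLEEP_AVG_MS = 1000  # assumed: cap on the tracked sleep average


def nice_to_static_prio(nice):
    """Map a nice value in [-20, 19] onto the static priority range."""
    return MAX_RT_PRIO + nice + 20


def dynamic_prio(static_prio, sleep_avg_ms):
    """Apply a bonus in [-5, +5] derived from the sleep average."""
    sleep_avg_ms = max(0, min(sleep_avg_ms, MAX_SLEEP_AVG_MS))
    # Scale the sleep average to 0..MAX_BONUS, then center it on zero.
    bonus = sleep_avg_ms * MAX_BONUS // MAX_SLEEP_AVG_MS - MAX_BONUS // 2
    # Subtracting the bonus lowers the number, i.e. raises the priority.
    prio = static_prio - bonus
    return max(MAX_RT_PRIO, min(prio, MAX_PRIO - 1))
```

Under these assumptions a nice-0 task that sleeps most of the time moves from 120 to 115, while one that never sleeps drops to 125; the static priority itself is left untouched.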
Calculating Timeslice
● Scale a task’s static priority onto the possible timeslice range (max and min)
● The higher the task’s static priority, the larger the timeslice it gets
● An interactive task’s timeslice may be broken up into chunks
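The scaling step can be sketched as a linear map from static priority to a timeslice. The 10 ms and 200 ms bounds and the 100 to 139 priority range are assumed values for illustration; actual kernels have used different constants and formulas across versions.

```python
MIN_TIMESLICE_MS = 10   # assumed lower bound
MAX_TIMESLICE_MS = 200  # assumed upper bound
MAX_PRIO = 140          # assumed: one past the lowest priority value
MAX_USER_PRIO = 40      # assumed: number of normal (non-RT) priority levels


def base_timeslice_ms(static_prio):
    """Linearly scale a static priority in [100, 139] onto the range.

    A numerically lower static_prio is a higher priority, so it
    receives the larger timeslice.
    """
    span = MAX_TIMESLICE_MS - MIN_TIMESLICE_MS
    return MIN_TIMESLICE_MS + span * (MAX_PRIO - 1 - static_prio) // (MAX_USER_PRIO - 1)
```

With these numbers the highest static priority (100) gets the full 200 ms, the lowest (139) gets 10 ms, and the default (120) lands near the middle at 102 ms.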
Other effects on calculations
● When a task forks, the sleep averages of both the parent and the child are reduced
– Prevents highly interactive parents from hogging the CPU with similarly interactive children
– Timeslice is not reduced, since timeslice is based only on static priority
● Interactive tasks whose timeslice has run out are reinserted into the active priority array, as long as no tasks are starving in the expired array
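Both rules can be written as small functions. The percentages below are hypothetical values chosen only to show the shape of the calculation, and the function names are invented for this sketch.

```python
PARENT_KEEP_PCT = 90  # hypothetical: share of sleep average the parent keeps
CHILD_KEEP_PCT = 80   # hypothetical: share of sleep average the child inherits


def sleep_avgs_after_fork(parent_sleep_avg):
    """Return (parent, child) sleep averages after a fork."""
    parent = parent_sleep_avg * PARENT_KEEP_PCT // 100
    child = parent_sleep_avg * CHILD_KEEP_PCT // 100
    return parent, child


def array_after_expiry(is_interactive, expired_starving):
    """Choose which priority array a task joins when its timeslice ends."""
    if is_interactive and not expired_starving:
        return "active"
    return "expired"
```

Because the timeslice depends only on static priority, neither function touches it; only the interactivity credit and the choice of array change.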
The schedule() Function
Picks a new task to run and switches to it
1. Check that it’s not being called during an atomic period
2. Disable preemption and check how long the current task has been running
3. Look for runnable tasks. If there are none, try load balancing. If there are still none, run the idle task; if only the active array is empty, swap the priority arrays
4. Check the active priority array’s bitmap to find the highest-priority runnable task
5. Switch tasks and re-enable preemption
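The control flow of these steps can be sketched in a few lines. This is a toy model under stated assumptions: the priority arrays are plain dictionaries mapping priority to a non-empty task list, `min()` stands in for the bitmap scan, preemption control is omitted, and `load_balance` is a hypothetical callback.

```python
class RunQueue:
    def __init__(self):
        self.active = {}   # priority -> non-empty list of runnable tasks
        self.expired = {}  # same layout, for tasks out of timeslice
        self.current = None


def schedule(rq, in_atomic=False, load_balance=lambda rq: None):
    # Step 1: refuse to run from an atomic context.
    if in_atomic:
        raise RuntimeError("scheduling while atomic")
    # Step 2 (disabling preemption, run-time accounting) is omitted here.
    # Step 3: with nothing runnable, try to pull work from another CPU.
    if not rq.active and not rq.expired:
        load_balance(rq)
    if not rq.active and not rq.expired:
        rq.current = "idle"
        return rq.current
    # The active array has run dry: swap the two arrays.
    if not rq.active:
        rq.active, rq.expired = rq.expired, rq.active
    # Step 4: the lowest priority number wins (stand-in for the bitmap).
    best = min(rq.active)
    # Step 5: "switch" to the chosen task.
    rq.current = rq.active[best][0]
    return rq.current
```

Swapping the arrays is just an exchange of two references, so starting a new round of timeslices costs the same regardless of how many tasks are waiting.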
Load Balancing
● Tasks tend to stay on the same CPU in the interest of cache hotness and memory bank proximity
● Load balancing distributes tasks more evenly between CPUs
● Since there is a single runqueue for each CPU, the runqueue also keeps track of that CPU’s load
● A migration thread is kept running on each CPU to handle load balancing
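A minimal sketch of a pull-style balancing pass, assuming each runqueue is a plain list and that load is simply the queue length. The policy of moving half the difference is an assumption for illustration, not the kernel's actual heuristic.

```python
def load_balance(runqueues, this_cpu):
    """Pull tasks from the busiest runqueue onto this CPU's runqueue.

    runqueues: list of task lists, indexed by CPU number.
    Returns the list of tasks that were migrated.
    """
    this_rq = runqueues[this_cpu]
    busiest = max(range(len(runqueues)), key=lambda c: len(runqueues[c]))
    # Moving half the difference evens out the two queues.
    imbalance = (len(runqueues[busiest]) - len(this_rq)) // 2
    moved = []
    for _ in range(imbalance):
        # Take from the tail; a real balancer would also weigh
        # cache hotness before migrating a task.
        task = runqueues[busiest].pop()
        this_rq.append(task)
        moved.append(task)
    return moved
```

When this CPU is already the busiest, or the difference is less than two tasks, nothing moves, which matches the preference for leaving tasks where their cache contents are.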
Soft RT Scheduling
● The Linux 2.6 scheduler provides soft real-time (RT) scheduling support in the same manner described in Nathan’s presentation on Linux 2.2
Linux 2.2 Scheduler
● Processes can be in three scheduling classes:
– SCHED_FIFO – “Real time.”
– SCHED_RR – “Real time.”
– SCHED_OTHER – Normal process.
● Real time processes always win out over normal processes.
● Real time processes with higher priority always win out over those with lower priority.
● The FIFO/RR distinction only applies when there are multiple RT processes with the same priority.
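The selection rules above can be modeled in a short sketch. The `Proc` type and both helper functions are invented for illustration, and normal processes are picked in simple list order as a simplification; the three policy values match those defined in the Linux headers.

```python
from dataclasses import dataclass

SCHED_OTHER, SCHED_FIFO, SCHED_RR = 0, 1, 2


@dataclass
class Proc:
    name: str
    policy: int
    rt_priority: int = 0  # only meaningful for FIFO/RR; higher wins


def pick_next(ready):
    """Choose the next process from a ready list kept in queue order."""
    rt = [p for p in ready if p.policy in (SCHED_FIFO, SCHED_RR)]
    if rt:
        # Any real-time process beats every normal one; among RT
        # processes the highest priority wins, ties go to queue order.
        best = max(p.rt_priority for p in rt)
        return next(p for p in rt if p.rt_priority == best)
    return ready[0] if ready else None


def on_timeslice_expired(ready, proc):
    """RR processes go to the back of the queue; FIFO ones keep their place."""
    if proc.policy == SCHED_RR:
        ready.remove(proc)
        ready.append(proc)
```

The only place the two real-time policies differ is `on_timeslice_expired`: a SCHED_RR process takes turns with equal-priority peers, while a SCHED_FIFO process keeps running until it blocks or yields.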
NUMA Scheduling
● Intended for large systems with many nodes
● Scheduler domains are created to account for how proximity affects access speed
● The top-level domain contains all CPUs, with one group for each node; each group may contain multiple CPUs
● Each group provides a CPU mask
NUMA Task Migration
● First balances the base domain of each individual CPU
● Then balances the parent domain (which could span multiple CPUs), and so on up the hierarchy
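A rough sketch of bottom-up balancing across domains. It assumes each domain is a list of groups, each group is a tuple of CPU numbers (standing in for the CPU mask), and load is a per-CPU task count; the one-task-at-a-time policy is an assumption for illustration.

```python
def group_load(load, group):
    return sum(load[cpu] for cpu in group)


def balance_domain(load, groups):
    """Shift tasks between groups until their loads differ by at most one."""
    while True:
        busiest = max(groups, key=lambda g: group_load(load, g))
        idlest = min(groups, key=lambda g: group_load(load, g))
        if group_load(load, busiest) - group_load(load, idlest) <= 1:
            return
        # Move one task from the busiest CPU of the busiest group
        # to the idlest CPU of the idlest group.
        src = max(busiest, key=lambda cpu: load[cpu])
        dst = min(idlest, key=lambda cpu: load[cpu])
        load[src] -= 1
        load[dst] += 1


def balance_hierarchy(load, domains):
    """Balance base domains first, then each parent domain in turn."""
    for groups in domains:
        balance_domain(load, groups)


# Two nodes with two CPUs each: per-node base domains, then the top level.
domains = [[(0,), (1,)], [(2,), (3,)], [(0, 1), (2, 3)]]
load = {0: 6, 1: 0, 2: 0, 3: 0}
balance_hierarchy(load, domains)
```

Balancing within a node first keeps most migrations local; only the remaining imbalance is resolved by the costlier moves between nodes.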