Scheduling of Regular Tasks in Linux
  David Ferry, Chris Gill, Brian Kocoloski
  CSE 422S - Operating Systems Organization
  Washington University in St. Louis
  St. Louis, MO 63143

Traditional Scheduling Concerns
  Throughput: maximize tasks finished per unit time
  Latency: minimize time between creation and completion
  Response time: minimize time between wakeup and execution
  Starvation: all tasks guaranteed some processor time
  Fairness: all tasks given equal processor time
  Overhead: multicore scalability, efficiency
  A scheduler must compromise!

Important Scheduling Scenarios
  Pure compute bound (e.g. while(1))
    Wants to keep the cache hot
  Pure I/O bound (e.g. always waits for keyboard)
    Wants fast response
  Server
    Minimize outstanding requests (throughput & latency)
  Desktop
    Maximize interactivity
    Heterogeneous workload
  Real-time
    Minimize response time
    Guarantee timeliness of high-priority tasks

Big Two Scheduling Operations
  Which task should run next?
  How long should it run?

Normal Task Priorities
  Based on niceness levels
  Levels range over [-20, 19]; the default is 0
  “More nice” (higher nice value) => lower priority
  “Less nice” (lower nice value) => higher priority
  Can be adjusted heuristically for interactive and CPU-bound tasks

O(1) Scheduler
  Fundamental idea: map nice values to fixed timeslices
    E.g., 0: 100 ms, 1: 95 ms, 2: 90 ms, …
  When a task exhausts its timeslice, it moves to the expired array; a task that blocks before using up its timeslice stays in the active array
  Next task to run: the remaining task with the highest priority
  When all tasks have run and the active array is empty, the two array pointers are swapped: the expired array becomes active, and scheduling starts again from the front
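
The active/expired mechanism above can be sketched in a few lines of Python. This is only an illustration, not the kernel's code: the function names are hypothetical, the timeslice formula simply extends the slide's 100, 95, 90 ms pattern across nice 0..19, and every task is assumed to use its full timeslice without blocking.

```python
def timeslice_ms(nice):
    """Illustrative mapping from the slide: 100 ms at nice 0, 5 ms less per level."""
    if not 0 <= nice <= 19:
        raise ValueError("the slide's mapping only covers nice 0..19")
    return 100 - 5 * nice


def run_epochs(tasks, epochs):
    """tasks maps a task name to its nice value.

    Returns the (name, timeslice in ms) pairs in the order the tasks run.
    """
    # Lowest nice value means highest priority, so it runs first.
    active = sorted(tasks, key=lambda name: tasks[name])
    expired = []
    schedule = []
    for _ in range(epochs):
        while active:
            name = active.pop(0)          # highest-priority remaining task
            schedule.append((name, timeslice_ms(tasks[name])))
            expired.append(name)          # timeslice used up
        active, expired = expired, active  # the "pointer swap"
    return schedule


if __name__ == "__main__":
    for name, ms in run_epochs({"editor": 0, "encoder": 19}, epochs=2):
        print(f"{name} runs for {ms} ms")
```

Swapping the two lists costs the same regardless of how many tasks exist, which is the point of keeping two arrays instead of recomputing timeslices for every task.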

Simple Example
  Linux characterizes tasks as either compute bound or I/O bound
  Compute bound: makes heavy use of the processor, is non-interactive, does not care about latency
  I/O bound: makes only sporadic use of the processor; reads/writes storage or network data, or waits for user input; cares about latency
  Example (LKD p. 45)
    App 1: text editor (I/O bound)
    App 2: video encoder (compute bound)

Problems with O(1) Scheduler
  Recall the O(1) philosophy: fixed timeslices for different priority levels
  What would the timeslice allocations be for:
    One video encoder (nice 19) and one text editor (nice 0)?
    Two video encoder tasks?
    Two text editor tasks?

Problems with O(1) Scheduler
  Inverted switching rates
    High-priority (low nice value) tasks are generally interactive and I/O intensive
    Low-priority (high nice value) tasks are generally compute bound
    Yet two low-priority (nice 19) processes switch every 5 ms, while two high-priority (nice 0) processes switch only every 100 ms
  Additional problems?
    Variance across nice intervals
      Nice values of 0 and 1 get timeslices of 100 and 95 ms (5% decrease)
      Nice values of 18 and 19 get timeslices of 10 and 5 ms (50% decrease)
    Timeslices must be absolute values, which are limited by hardware capability
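
The numbers on this slide can be checked with a short calculation. The sketch below assumes the same illustrative mapping as the O(1) slide (100 ms at nice 0, 5 ms less per nice level); the function names are made up for this example.

```python
def timeslice_ms(nice):
    """Illustrative O(1)-style mapping: 100 ms at nice 0, 5 ms less per level."""
    return 100 - 5 * nice


def relative_drop(nice):
    """Fraction of the timeslice lost when going from nice to nice + 1."""
    here, there = timeslice_ms(nice), timeslice_ms(nice + 1)
    return (here - there) / here


if __name__ == "__main__":
    # Two tasks at the same nice level alternate once per timeslice.
    print("two nice-19 tasks switch every", timeslice_ms(19), "ms")
    print("two nice-0 tasks switch every", timeslice_ms(0), "ms")
    # The same one-step change in nice has very different relative effects.
    print("nice 0 -> 1 drop:", relative_drop(0))
    print("nice 18 -> 19 drop:", relative_drop(18))
```

The same one-level change in niceness costs 5% of the timeslice at one end of the range and 50% at the other, which is the variance the slide points out.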

Completely Fair Scheduler (CFS)
  Goal: all tasks receive a weighted proportion of processor time
    On a system with N tasks of equal weight, each task should be promised 1/N of the processor time, i.e. “completely fair”
  Allows interactive tasks to run at high priority while sharing the CPU equally between CPU-bound tasks
  Fundamental idea: abandon the notion of a fixed timeslice (and varying fairness) in favor of fixed fairness (and a varying timeslice)

Same Example, but with CFS
  Consider our video encoder and text editor once again
  Now, rather than fixed timeslices, we need a target latency: a single absolute value that reflects how “responsive” the system should feel, e.g. 20 ms
  Assume nice values of 0 and 19
  This works out to about 95% of the processor for nice 0 and 5% of the processor for nice 19
  So the timeslices would be 19 ms and 1 ms
  What about two text editors? Two video encoders?
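
The arithmetic on this slide is just "share of total weight times target latency". A minimal sketch follows; the weights 95 and 5 are placeholders that reproduce the slide's approximate proportions, not the kernel's actual weight table, and the function name is hypothetical.

```python
def cfs_timeslices(target_latency_ms, weights):
    """Split the target latency among tasks in proportion to their weights."""
    total = sum(weights.values())
    return {name: target_latency_ms * w / total for name, w in weights.items()}


if __name__ == "__main__":
    # The slide's example: roughly 95% versus 5% of a 20 ms target latency.
    print(cfs_timeslices(20, {"editor": 95, "encoder": 5}))
    # Two tasks with equal weight split the target latency evenly.
    print(cfs_timeslices(20, {"encoder_a": 1, "encoder_b": 1}))
```

Note that two identical tasks get 10 ms each no matter what their shared nice value is, which answers the slide's closing questions: only the ratio of weights matters.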

Virtual Runtime
  Virtual runtime: the actual running time of a process weighted by its priority, stored as a nanosecond value
  If all tasks have nice priority 0, their virtual runtime is equal to their actual runtime
  If some task has a nonzero nice priority, then:
    virtual runtime = actual runtime × (weight of nice 0 / weight of task)
  where the weights are determined by nice priority
  Updated in update_curr() in fair.c
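
A small sketch of this weighting, under stated assumptions: the nice-0 weight is taken to be 1024, the other weights are passed in by the caller rather than read from the kernel's table, and the function is an illustration of the idea, not a transcription of update_curr().

```python
NICE_0_WEIGHT = 1024  # assumed weight of a nice-0 task


def vruntime_delta(delta_exec_ns, weight):
    """Virtual time charged for delta_exec_ns nanoseconds of real execution.

    A heavier (higher-priority) task is charged less virtual time per real
    nanosecond; a lighter task is charged more.
    """
    return delta_exec_ns * NICE_0_WEIGHT // weight


if __name__ == "__main__":
    one_ms = 1_000_000
    print(vruntime_delta(one_ms, 1024))  # nice 0: virtual time equals real time
    print(vruntime_delta(one_ms, 2048))  # twice the weight: half the charge
    print(vruntime_delta(one_ms, 512))   # half the weight: twice the charge
```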

CFS Scheduling Operations
  Which task? Pick the task with the lowest virtual runtime
  How long to run?
    Keeps virtual runtime as fair as possible, so tasks get swapped out each tick
    Uses a minimum tick length to avoid thrashing
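
The two operations can be tried out in a toy simulation: at every tick, run whichever task has the lowest virtual runtime, then charge it for the tick in inverse proportion to its weight. Everything here is a simplification for illustration (always-runnable tasks, a fixed 1 ms tick, an assumed nice-0 weight of 1024, ties broken by name).

```python
def simulate(weights, ticks, tick_ns=1_000_000):
    """Return the real time (ns) each task ran over the given number of ticks."""
    vruntime = {name: 0 for name in weights}
    ran = {name: 0 for name in weights}
    for _ in range(ticks):
        # Which task? The one with the lowest virtual runtime.
        current = min(vruntime, key=lambda name: (vruntime[name], name))
        # How long? One tick, after which the choice is re-evaluated.
        ran[current] += tick_ns
        vruntime[current] += tick_ns * 1024 // weights[current]
    return ran


if __name__ == "__main__":
    print(simulate({"a": 1024, "b": 1024}, ticks=10))  # equal weights
    print(simulate({"a": 2048, "b": 1024}, ticks=30))  # a has twice b's weight
```

With equal weights the tasks alternate; when one task has twice the weight, it ends up with twice the real running time even though both virtual runtimes stay close together.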

CFS Run Queue Implementation
  Needs to pick the task with the lowest virtual runtime in constant time
  Which efficient data structure always yields the lowest value?

Red-Black Binary Tree
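
Python's standard library has no red-black tree, so the sketch below substitutes a binary heap (heapq) to show the interface the run queue needs: insert in O(log n) and look up the lowest virtual runtime in O(1). It is a stand-in for the idea, not the structure the kernel uses, and the class and method names are hypothetical.

```python
import heapq
import itertools


class RunQueue:
    """Toy run queue ordered by virtual runtime (a heap, not a red-black tree)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal vruntimes

    def enqueue(self, vruntime, name):
        heapq.heappush(self._heap, (vruntime, next(self._seq), name))

    def peek_min(self):
        """Name of the task with the lowest vruntime, without removing it."""
        return self._heap[0][2] if self._heap else None

    def pick_next(self):
        """Remove and return (name, vruntime) for the lowest-vruntime task."""
        vruntime, _, name = heapq.heappop(self._heap)
        return name, vruntime


if __name__ == "__main__":
    rq = RunQueue()
    rq.enqueue(300, "encoder")
    rq.enqueue(100, "editor")
    rq.enqueue(200, "shell")
    print(rq.peek_min())
    print(rq.pick_next())
```

A balanced tree offers the same lookup behavior when the leftmost node is cached, and additionally keeps all tasks in sorted order, which a heap does not.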

CFS Example
  Consider a video encoder and a text editor
  Video encoder
    Entitled proportion: 50%
    Actual proportion: 95%
    Has low priority
  Text editor
    Entitled proportion: 50%
    Actual proportion: 5%
    Has high priority when it wants to run
  [Figure legend: Used, Unused, Over-use]
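
The behavior in this example can be reproduced with a toy model: both tasks have equal weight, the encoder is always runnable, and the editor becomes runnable once every 20 ticks and needs a single tick. The wake-up period, the tick granularity, and the rule that ties go to the editor are assumptions chosen for illustration; the real scheduler also adjusts the virtual runtime of waking tasks, which this sketch ignores.

```python
def simulate(ticks, wake_every=20):
    """Return how many ticks each task ran."""
    vruntime = {"editor": 0, "encoder": 0}
    ran = {"editor": 0, "encoder": 0}
    editor_pending = False
    for t in range(ticks):
        if t % wake_every == 0:
            editor_pending = True  # e.g. a keypress arrives
        runnable = ["editor", "encoder"] if editor_pending else ["encoder"]
        # Lowest virtual runtime wins; the editor is listed first, so it wins ties.
        current = min(runnable, key=vruntime.get)
        ran[current] += 1
        vruntime[current] += 1     # equal weights: virtual time equals real time
        if current == "editor":
            editor_pending = False  # input handled, editor sleeps again
    return ran


if __name__ == "__main__":
    print(simulate(100))
```

Because the editor uses far less than its entitled 50%, its virtual runtime stays well below the encoder's, so it is selected on the very tick it wakes up.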

Today’s Studio
  Monitor the CFS scheduler with user-level workloads and priorities on your Raspberry Pis
  Gain experience with CPU pinning, priority setting, and performance monitoring
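
As a warm-up for the three skills named above, here is one possible way to try them from Python on Linux. The helper names are hypothetical and this is not the studio's required approach; it only uses os.sched_setaffinity, os.sched_getaffinity, os.nice, and the time module, all of which are Linux-specific or standard.

```python
import os
import time


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU; return the new CPU set."""
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return os.sched_getaffinity(0)


def lower_priority(increment):
    """Add increment to the process's niceness and return the new value.

    Negative increments (raising priority) normally require privileges.
    """
    return os.nice(increment)


def busy_cpu_seconds(wall_seconds):
    """Spin for wall_seconds of wall-clock time; return CPU time consumed."""
    start_cpu = time.process_time()
    deadline = time.monotonic() + wall_seconds
    while time.monotonic() < deadline:
        pass
    return time.process_time() - start_cpu


if __name__ == "__main__":
    cpu = min(os.sched_getaffinity(0))  # pick a CPU we are allowed to use
    print("pinned to", pin_to_cpu(cpu))
    print("niceness is now", lower_priority(5))
    print("CPU seconds used in 1 s of wall time:", busy_cpu_seconds(1.0))
```

Running two copies pinned to the same CPU with different niceness values, and comparing the CPU seconds each one reports, gives a rough view of how the scheduler divides the processor.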