W4118 Operating Systems
Instructor: Junfeng Yang
Logistics CLIC machines are ready
Last lecture: concurrency errors
- Why synchronization is hard
- Concurrency error patterns: deadlock; race (data race, atomicity, order)
- Concurrency error detection
  - Deadlock detection: lock order
  - Race detection: happens-before, Eraser
Today Eraser (cont.) Intro to scheduling
Recall: Lockset algorithm v1: infer the locks
Intuition: the protecting lock must be one of the locks held at the time of access
C(v): set of candidate locks for protecting v
- Initialize C(v) to the set of all locks
- On each access to v by thread t, refine C(v): C(v) = C(v) ∩ locks_held(t)
- If C(v) = {}, report error
Sounds good! But …
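The refinement loop above can be sketched in a few lines of Python (a toy sketch: `ALL_LOCKS`, `access`, and the lock names are illustrative, not part of Eraser, which instruments binaries rather than calling a helper):

```python
# Toy sketch of lockset-v1 refinement; lock names are illustrative.
ALL_LOCKS = frozenset({"m1", "m2", "m3"})  # assumed universe of locks

candidate = {}  # C(v): variable -> candidate lockset

def access(var, locks_held):
    """Refine C(v) on one access; return True iff a race is reported."""
    c = candidate.get(var, ALL_LOCKS)      # initialize C(v) to all locks
    c = c & frozenset(locks_held)          # C(v) = C(v) ∩ locks_held(t)
    candidate[var] = c
    return len(c) == 0                     # empty lockset -> report error

# Thread 1 holds m1; thread 2 holds m1 and m2: C(v) stays {m1}, consistent.
assert access("v", {"m1"}) is False
assert access("v", {"m1", "m2"}) is False
# Thread 3 holds only m2: C(v) becomes {}, so a warning is reported.
assert access("v", {"m2"}) is True
```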
Recall: Problems with lockset v1
- Initialization: when shared data is first created and initialized
- Read-shared data: shared data that is only read (once initialized)
- Read/write locks: locks can be held in either write mode or read mode (we saw these last week)
Recall: State transitions
Each shared data value (memory location) is in one of four states:
- Virgin: not yet accessed; the first access (by the first thread) moves it to Exclusive
- Exclusive: accessed by a single thread only; a read by a new thread moves it to Shared, a write by a new thread moves it to Shared-Modified
- Shared: refine C(v), but emit no warnings; a write moves it to Shared-Modified
- Shared-Modified: refine C(v) and check (emit warning if C(v) is empty)
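The four-state machine can be sketched as follows (simplified: state names are from the slide, while the `Shadow` class and `on_access` helper are invented for illustration):

```python
from enum import Enum

class State(Enum):
    VIRGIN = 0
    EXCLUSIVE = 1
    SHARED = 2
    SHARED_MODIFIED = 3

class Shadow:
    """Per-location tracking state (simplified from Eraser's shadow word)."""
    def __init__(self):
        self.state = State.VIRGIN
        self.owner = None
        self.lockset = None  # C(v); initialized when leaving Exclusive

def on_access(sh, tid, is_write, locks_held, all_locks):
    """Apply one access; return True iff a race warning should be emitted."""
    if sh.state is State.VIRGIN:
        sh.state, sh.owner = State.EXCLUSIVE, tid   # first access
        return False
    if sh.state is State.EXCLUSIVE:
        if tid == sh.owner:
            return False                            # still a single thread
        sh.lockset = set(all_locks)                 # initialize C(v)
        sh.state = State.SHARED_MODIFIED if is_write else State.SHARED
    sh.lockset &= set(locks_held)                   # refine C(v)
    if sh.state is State.SHARED:
        if not is_write:
            return False                            # read-shared: no check
        sh.state = State.SHARED_MODIFIED            # first write: start checking
    return len(sh.lockset) == 0                     # check in Shared-Modified

# Read-shared data never triggers a warning, even with no locks held:
sh = Shadow()
assert on_access(sh, 1, True, set(), {"m"}) is False   # t1 initializes v
assert on_access(sh, 2, False, set(), {"m"}) is False  # t2 reads: Shared
assert on_access(sh, 2, True, set(), {"m"}) is True    # t2 writes: warning
```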
Read-write locks
Read-write locks allow a single writer or multiple readers
Locks can be held in read mode or write mode:
  read_lock(m); read v; read_unlock(m)
  write_lock(m); write v; write_unlock(m)
Locking discipline
- Lock can be held in either mode (read or write) for a read access
- Lock must be held in write mode for a write access
- A write access with the lock held only in read mode is an error
Handling read-write locks
Idea: distinguish read and write accesses when refining the lockset
- On each read of v by thread t (same as before): C(v) = C(v) ∩ locks_held(t)
- On each write of v by thread t: C(v) = C(v) ∩ write_locks_held(t)
- If C(v) = {}, report error
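A sketch of this refinement rule (the `refine` function name and set representation are illustrative):

```python
def refine(candidate, locks_held, write_locks_held, is_write):
    """One lockset refinement step under the read-write-lock discipline.
    Returns the new C(v); an empty result means a warning is reported."""
    if is_write:
        # Writes count only locks held in write mode
        return candidate & write_locks_held
    # Reads count locks held in either mode
    return candidate & locks_held

# A reader holding m in read mode keeps m as a candidate lock:
assert refine({"m"}, {"m"}, set(), is_write=False) == {"m"}
# A writer holding m only in read mode empties C(v): report error
assert refine({"m"}, {"m"}, set(), is_write=True) == set()
```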
Results
Eraser works: finds bugs in mature software
- Though with many limitations; the major one: benign races (intended races)
However, slow: 10-30X slowdown
- Instrumenting each memory access is costly
- Can be made faster with static analysis and smarter instrumentation
The lockset algorithm is influential and used by many tools
- E.g., Helgrind (a race detection tool in Valgrind)
Benign race example: double-checked locking
Faster if v is often 0
Doesn’t work with compiler/hardware reordering

if (v) {        // benign race
    lock(m);
    if (v)
        ...;
    unlock(m);
}
Today Eraser (cont.) Intro to scheduling Basic concepts Different scheduling algorithms Advantages and disadvantages
Direction within Course
Until now: interrupts, processes, threads (mostly mechanisms)
From now on: resources and policy
- Resources: things processes operate upon, e.g., CPU time, memory, disk space
- We’ll look at how to manage CPU time with scheduling today
Types of Resource
- Preemptible: OS can take the resource away, use it for something else, and give it back later; e.g., CPU
- Non-preemptible: once given, the resource cannot be reused until voluntarily relinquished; e.g., disk space
Given a set of resources and a set of requests for them, the type of resource determines how the OS manages it
Decisions about Resources
Allocation: which process gets which resources
- Which resources should each process receive?
- Space sharing: control access to the resource
- Implication: resources are not easily preemptible; e.g., disk space
Scheduling: how long a process keeps a resource
- In which order should requests be serviced?
- Time sharing: more resources are requested than can be granted
- Implication: resource is preemptible; e.g., processor scheduling
Alternating Sequence of CPU And I/O Bursts
Role of Dispatcher vs. Scheduler
Dispatcher: low-level mechanism
- Responsibility: context switch
  - Save previous process state in PCB
  - Load next process state from PCB into registers
  - Change scheduling state of process (running, ready, or blocked)
  - Migrate processes between different scheduling queues
  - Switch from kernel to user mode
Scheduler: high-level policy
- Responsibility: deciding which process to run
Could have an Allocator for CPUs as well (parallel and distributed systems)
Preemptive vs. Nonpreemptive Scheduling
When does the scheduler need to make a decision? When a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready state
4. Terminates
Nonpreemptive: scheduler decides only in the minimal circumstances, 1 and 4
Preemptive: scheduler may also decide in the additional circumstances, 2 and 3
Scheduling Performance Metrics
- Min waiting time: don’t make processes wait long in the ready queue (wait time: time spent waiting in the ready queue for service)
- Max CPU utilization: keep the CPU busy (CPU utilization: % of time the CPU is busy)
- Max throughput: complete as many processes as possible per unit time (throughput: processes completed per unit time)
- Min turnaround time: complete each process as fast as possible (turnaround time: submission to completion)
- Min response time: respond immediately (response time: submission to beginning of response)
- Fairness: give each process (or user) the same percentage of CPU
Scheduling Algorithms Next, we’ll look at different scheduling algorithms and their advantages and disadvantages
First-Come, First-Served (FCFS)
Simplest CPU scheduling algorithm: the first job that requests the CPU gets the CPU; nonpreemptive
Advantage: simple implementation with a FIFO queue
Disadvantage: waiting time depends on arrival order; short jobs that arrive later have to wait long
Example of FCFS
Process  Burst Time
P1       24
P2       3
P3       3
Suppose the processes arrive in the order P1, P2, P3. The Gantt chart for the schedule is:
| P1 (0-24) | P2 (24-27) | P3 (27-30) |
Waiting time for P1 = 0; P2 = 24; P3 = 27
Average waiting time: (0 + 24 + 27)/3 = 17
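The waiting times above can be reproduced with a small helper (an illustrative sketch that assumes all jobs arrive at time 0, as in this example):

```python
def fcfs_waiting_times(bursts):
    """Waiting time of each job under FCFS, jobs listed in arrival order,
    all arriving at time 0."""
    waits, t = [], 0
    for burst in bursts:
        waits.append(t)   # each job waits for all earlier jobs to finish
        t += burst
    return waits

waits = fcfs_waiting_times([24, 3, 3])       # arrival order P1, P2, P3
assert waits == [0, 24, 27]
assert sum(waits) / len(waits) == 17
```

Running it with burst order [3, 3, 24] gives waits [0, 3, 6] and average 3, the case on the next slide.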
Example of FCFS (Cont.)
Suppose the processes arrive in the order P2, P3, P1. The Gantt chart for the schedule is:
| P2 (0-3) | P3 (3-6) | P1 (6-30) |
Waiting time for P1 = 6; P2 = 0; P3 = 3
Average waiting time: (6 + 0 + 3)/3 = 3, much better than the previous case
Convoy effect: short processes stuck waiting for a long process (also called head-of-line blocking)
Shortest Job First (SJF)
Schedule the process with the shortest CPU burst next (FCFS to break ties)
Advantage: minimizes average wait time; provably optimal
- Moving a shorter job before a longer one decreases the short job’s waiting time more than it increases the long job’s
Disadvantages
- Not practical: difficult to predict burst time (possible heuristic: the past predicts the future)
- May starve long jobs
Shortest Remaining Time First (SRTF)
SRTF = SJF with preemption
When a new process arrives with a CPU burst shorter than the remaining time of the current process:
- SJF without preemption: let the current process run
- SRTF: preempt the current process
Further reduces average wait time
Example of Nonpreemptive SJF
Process  Arrival Time  Burst Time
P1       0.0           7
P2       2.0           4
P3       4.0           1
P4       5.0           4
SJF schedule (nonpreemptive):
| P1 (0-7) | P3 (7-8) | P2 (8-12) | P4 (12-16) |
Average waiting time = (0 + 6 + 3 + 7)/4 = 4
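This schedule can be checked with a small nonpreemptive-SJF simulation (a sketch; the function name and (name, arrival, burst) tuple layout are illustrative):

```python
def sjf_waiting_times(jobs):
    """jobs: list of (name, arrival, burst). Nonpreemptive SJF:
    when the CPU frees up, pick the ready job with the shortest burst."""
    pending = sorted(jobs, key=lambda j: j[1])   # by arrival time
    waits, t = {}, 0
    while pending:
        ready = [j for j in pending if j[1] <= t]
        if not ready:                            # CPU idle until next arrival
            t = min(j[1] for j in pending)
            continue
        name, arrival, burst = min(ready, key=lambda j: j[2])
        waits[name] = t - arrival                # time spent in the ready queue
        t += burst                               # run to completion
        pending.remove((name, arrival, burst))
    return waits

w = sjf_waiting_times([("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)])
assert w == {"P1": 0, "P2": 6, "P3": 3, "P4": 7}
assert sum(w.values()) / 4 == 4
```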
Example of Preemptive SJF (SRTF)
Process  Arrival Time  Burst Time
P1       0.0           7
P2       2.0           4
P3       4.0           1
P4       5.0           4
SRTF schedule (preemptive):
| P1 (0-2) | P2 (2-4) | P3 (4-5) | P2 (5-7) | P4 (7-11) | P1 (11-16) |
Average waiting time = (9 + 1 + 0 + 2)/4 = 3
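The preemptive schedule can likewise be checked with a unit-step SRTF simulation (a sketch that advances one time unit per step for clarity, not efficiency; names are illustrative):

```python
import heapq

def srtf_waiting_times(jobs):
    """jobs: list of (name, arrival, burst). Shortest Remaining Time First:
    always run the ready job with the least remaining work, preempting on
    arrivals. Simulated one time unit at a time."""
    jobs = sorted(jobs, key=lambda j: j[1])
    remaining = {name: burst for name, _, burst in jobs}
    finish, ready, t, i = {}, [], 0, 0     # ready: heap of (remaining, name)
    while len(finish) < len(jobs):
        while i < len(jobs) and jobs[i][1] <= t:
            heapq.heappush(ready, (remaining[jobs[i][0]], jobs[i][0]))
            i += 1
        if not ready:
            t = jobs[i][1]                 # idle until the next arrival
            continue
        _, name = heapq.heappop(ready)     # least remaining work runs
        remaining[name] -= 1
        t += 1
        if remaining[name] == 0:
            finish[name] = t
        else:
            heapq.heappush(ready, (remaining[name], name))
    return {name: finish[name] - arrival - burst
            for name, arrival, burst in jobs}

w = srtf_waiting_times([("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)])
assert w == {"P1": 9, "P2": 1, "P3": 0, "P4": 2}
assert sum(w.values()) / 4 == 3
```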
Round-Robin (RR)
Practical approach to support time sharing
- Run each process for a time slice, then move it to the back of a FIFO queue
- A process is preempted if still running at the end of its time slice
Advantages
- Fair allocation of CPU across processes
- Low average waiting time when job lengths vary widely
Example of RR with Time Slice = 20
Process  Arrival Time  Burst Time
P1       0             400
P2       20            60
P3       20            60
The Gantt chart cycles P1, P2, P3 in 20 ms slices: P1 (0-20), P2 (20-40), P3 (40-60), P1 (60-80), P2 (80-100), P3 (100-120), …; P2 finishes at 160, P3 at 180, then P1 runs alone until 520
Waiting times: P1 = 40 + 40 + 40 = 120; P2 = 40 + 40 = 80; P3 = 20 + 40 + 40 = 100
Average waiting time: (120 + 80 + 100)/3 = 100
Compare to FCFS: (0 + (400-20) + (400+60-20))/3 = 273 1/3
Compare to SJF (preemptive): (120 + 0 + 60)/3 = 60
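The round-robin numbers above can be checked with a small simulator (a sketch; names and conventions are illustrative, with new arrivals enqueued before a preempted process re-enters the queue):

```python
from collections import deque

def rr_waiting_times(jobs, quantum):
    """jobs: list of (name, arrival, burst), sorted by arrival time.
    Round-robin: run the head of the FIFO queue for up to one quantum,
    then move it to the back if unfinished."""
    remaining = {name: burst for name, _, burst in jobs}
    finish, queue, t, i = {}, deque(), 0, 0
    while len(finish) < len(jobs):
        while i < len(jobs) and jobs[i][1] <= t:
            queue.append(jobs[i][0])
            i += 1
        if not queue:
            t = jobs[i][1]                    # idle until next arrival
            continue
        name = queue.popleft()
        run = min(quantum, remaining[name])
        t += run
        remaining[name] -= run
        while i < len(jobs) and jobs[i][1] <= t:
            queue.append(jobs[i][0])          # arrivals during this slice
            i += 1
        if remaining[name] == 0:
            finish[name] = t
        else:
            queue.append(name)                # back of the FIFO queue
    return {name: finish[name] - arrival - burst
            for name, arrival, burst in jobs}

w = rr_waiting_times([("P1", 0, 400), ("P2", 20, 60), ("P3", 20, 60)], 20)
assert w == {"P1": 120, "P2": 80, "P3": 100}
assert sum(w.values()) / 3 == 100
```

With a very large quantum the same function degenerates to FCFS, matching the tradeoff discussed on the next slide.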
Disadvantage of Round-Robin
Poor average waiting time when jobs have similar lengths
- Imagine N jobs, each requiring T time slices
- RR: all complete at roughly time N*T
- Average waiting time is even worse than FCFS!
Performance depends on the length of the time slice
- Too long: degenerates to FCFS
- Too short: too many context switches, costly
How to set the time-slice length?
Time Slice and Context Switch Time
In real OSes: time slice is 10 to 100 ms; a context switch takes about 10 µs
Priorities
A priority is associated with each process
- Run the highest-priority ready job (some jobs may be blocked)
- Round-robin among processes of equal priority
- Can be preemptive or nonpreemptive
Representing priorities: typically an integer, but is larger higher or lower?
- Solaris: 0-59, 59 is the highest
- Linux: 0-139, 0 is the highest
Question: what data structure should be used for priority scheduling? One queue for all processes?
Setting Priorities Priority can be statically assigned Some processes always have higher priority than others Problem: starvation Priority can be dynamically changed by OS Aging: increase the priority of processes that wait in the ready queue for a long time
Priority Inversion
A high-priority process depends on a low-priority process (e.g., waits for it to release a lock); what if another process with in-between priority arrives?
Example (lower number = higher priority):
- Pri 2 process: lock(m)
- Pri 0 process: lock(m), blocks waiting for the pri 2 process to release m
- Pri 1 process: runs, preempting the pri 2 process, so the pri 0 process effectively waits on pri 1
Solution: priority inheritance
- A process inherits the priority of the highest-priority process waiting for a resource it holds
- Must be able to chain multiple inheritances
- Must ensure that priority reverts to its original value
Multilevel Queue
Ready queue is partitioned into separate queues, e.g.:
- foreground (interactive)
- background (batch)
Each queue has its own scheduling algorithm
- foreground: RR
- background: FCFS
Scheduling must also be done between the queues
- Fixed priority scheduling (i.e., serve all from foreground, then from background); possibility of starvation
- Time slicing: each queue gets a certain fraction of CPU time to schedule among its processes; e.g., 80% to foreground in RR, 20% to background in FCFS
Multilevel Queue Scheduling
Multilevel Feedback Queue
A process can move between the various queues; aging can be implemented this way
A multilevel-feedback-queue scheduler is defined by the following parameters:
- number of queues
- scheduling algorithm for each queue
- method used to determine when to upgrade a process
- method used to determine when to demote a process
- method used to determine which queue a process enters when it needs service
Example of Multilevel Feedback Queue
Three queues:
- Q0: RR with time quantum 8 milliseconds
- Q1: RR with time quantum 16 milliseconds
- Q2: FCFS
Scheduling
- A new job enters queue Q0, which is served FCFS; when it gains the CPU, the job receives 8 milliseconds
- If it does not finish in 8 milliseconds, the job is moved to queue Q1
- At Q1 the job is again served FCFS and receives 16 additional milliseconds; if it still does not complete, it is preempted and moved to queue Q2
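For a single job with no competitors, the demotion path through these three queues can be sketched as follows (an illustrative helper, not part of any real scheduler):

```python
def mlfq_trace(burst):
    """How a single job's CPU burst is split across the example's queues
    (Q0: 8 ms quantum, Q1: 16 ms quantum, Q2: FCFS), assuming no other
    jobs compete for the CPU."""
    trace = []
    for queue, quantum in [("Q0", 8), ("Q1", 16)]:
        run = min(quantum, burst)
        trace.append((queue, run))
        burst -= run
        if burst == 0:
            return trace                 # finished before being demoted
    trace.append(("Q2", burst))          # remainder runs to completion in Q2
    return trace

assert mlfq_trace(5) == [("Q0", 5)]                          # short: stays in Q0
assert mlfq_trace(30) == [("Q0", 8), ("Q1", 16), ("Q2", 6)]  # long: demoted twice
```

This captures the key property of the example: short interactive bursts finish in Q0 at high priority, while CPU-bound jobs sink to Q2.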
Multilevel Feedback Queues
Implementation (Eraser)
How to monitor variable accesses? Binary instrumentation
How to represent state? For each memory word, keep a shadow word
- First two bits: which state the word is in
How to represent the lockset? The remaining 30 bits: a lockset index
- A table maps each lockset index to a set of locks
- Assumption: not many distinct locksets, so the lockset index saves memory
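The 2-bit-state plus 30-bit-index packing can be sketched as follows (the exact bit placement inside Eraser's shadow word is an assumption here; the slide only says two bits hold the state and the rest hold the index):

```python
STATE_BITS, INDEX_BITS = 2, 30   # 2-bit state + 30-bit lockset index = 32 bits

def pack_shadow(state, lockset_index):
    """Pack a state and a lockset index into one 32-bit shadow word
    (state in the top bits; the real layout may differ)."""
    assert 0 <= state < (1 << STATE_BITS)
    assert 0 <= lockset_index < (1 << INDEX_BITS)
    return (state << INDEX_BITS) | lockset_index

def unpack_shadow(word):
    return word >> INDEX_BITS, word & ((1 << INDEX_BITS) - 1)

# A table maps each lockset index to the actual set of locks:
lockset_table = [frozenset(), frozenset({"m1"}), frozenset({"m1", "m2"})]
w = pack_shadow(3, 2)                       # state 3, lockset index 2
state, idx = unpack_shadow(w)
assert (state, lockset_table[idx]) == (3, frozenset({"m1", "m2"}))
```

Storing an index rather than the lockset itself is what keeps the shadow word one word wide: many locations share the same few distinct locksets.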
Example of RR with Time Slice = 20
Process  Burst Time
P1       53
P2       17
P3       68
P4       24
The Gantt chart is:
| P1 (0-20) | P2 (20-37) | P3 (37-57) | P4 (57-77) | P1 (77-97) | P3 (97-117) | P4 (117-121) | P1 (121-134) | P3 (134-154) | P3 (154-162) |
Typically, RR has higher average turnaround time than SJF, but better response time