Outline for Today Objectives: Linux scheduler Lottery scheduling


Outline for Today
Objectives:
- Linux scheduler
- Lottery scheduling
BRING CARDS TO SHUFFLE

Linux Scheduling Policy
The runnable process with the highest priority and timeslice remaining runs (SCHED_OTHER policy).
Dynamically calculated priority:
- Starts with the nice value
- Bonus or penalty reflecting whether the process is I/O-bound or compute-bound, by tracking sleep time vs. runnable time (sleep_avg):
  - accumulated during sleep, up to MAX_SLEEP_AVG (10 ms default)
  - decremented by timer tick while running

Linux Scheduling Policy
Dynamically calculated timeslice: the higher the dynamic priority, the longer the timeslice (from 10 ms at the low end, through 150 ms, to 300 ms at the high end).
- High priority means more interactive; low priority means more CPU-bound
- Recalculated every round, when the "expired" and "active" arrays swap
- Exception for expired interactive tasks: they go back on the active array unless there are starving expired tasks

Runqueue for O(1) Scheduler
[Diagram sequence: a per-CPU runqueue holds two priority arrays, "active" and "expired", each an array of per-priority queues. Higher priorities correspond to more I/O-bound tasks with longer timeslices (300 ms); lower priorities to more CPU-bound tasks with shorter ones (10 ms). The slides step through a task exhausting its timeslice on the active array and moving to the corresponding queue on the expired array.]

Linux Real-time
No guarantees.
SCHED_FIFO:
- Static priority, effectively higher than that of SCHED_OTHER processes*
- No timeslice: runs until it blocks or yields voluntarily
- FIFO order among tasks at the same priority level
SCHED_RR:
- As above, but with a timeslice (round-robin within the same priority level)
* Although their priority number ranges overlap

Diversion: Synchronization
- Disable interrupts
- Busywaiting solutions (spinlocks): execute a tight loop while the critical section is busy; benefits from specialized atomic (read-modify-write) instructions
- Blocking synchronization: sleep (enqueued on a wait queue) while the critical section is busy

Support for SMP
- Every processor has its own private runqueue
- Locking: a spinlock protects each runqueue
- Load balancing: pulls tasks from the busiest runqueue into mine
- Affinity: the cpus_allowed bitmask constrains a process to a particular set of processors
[Diagram: symmetric multiprocessor; four processors, each with its own cache, sharing one memory.]
load_balance runs from schedule() when the runqueue is empty, and periodically, especially during idle. It prefers to pull processes that are expired, not cache-hot, high priority, and allowed by affinity.

Lottery Scheduling Waldspurger and Weihl (OSDI 94)

Claims
Goal: responsive control over the relative rates of computation.
- Support for modular resource management
- Generalizable to diverse resources
- Efficient implementation of proportional-share resource management: consumption rates of resources by active computations are proportional to their relative allocated shares

Basic Idea
Resource rights are represented by lottery tickets:
- abstract
- relative (vary dynamically with contention)
- uniform (handle heterogeneous resources)
Responsiveness: adjusting the relative number of tickets is reflected immediately in the next lottery.
At allocation time: hold a lottery; the resource goes to the computation holding the winning ticket.

Fairness
Expected allocation is proportional to the number of tickets held; the actual allocation approaches it over time.
Notation: w = number of wins, t = a client's tickets, T = total tickets, n = number of lotteries.
Number of lotteries won by a client: E[w] = n p, where p = t/T.
Response time (number of lotteries to wait for the first win): E[n] = 1/p.

Example: List-based Lottery
Clients hold 10, 2, 5, 1, and 2 tickets (total 20). Walking the list accumulates the partial sums 10, 12, 17, 18, 20. The draw Random(0, 19) = 15 falls within the third client's range (12-16), so that client wins.

Bells and Whistles
- Ticket transfers: objects that can be explicitly passed in messages; can be used to solve priority inversion
- Ticket inflation: creating more tickets, used among mutually trusting clients to dynamically adjust ticket allocations
- Currencies: "local" control, with exchange rates
- Compensation tickets: to maintain its share, a client that uses only a fraction f of its quantum has its ticket inflated by 1/f in the next lottery

Kernel Objects
[Diagram: a currency object has a name and an amount. It is funded by backing tickets (e.g., 1000 base) and issues tickets denominated in it (e.g., a ticket for 300 units of the currency); its active amount tracks the currently active issued tickets.]

[Diagram: a currency hierarchy rooted at the base currency (3000 base outstanding). base funds alice with 1000 base and bob with 2000 base; alice issues 200 tickets (1 alice = 5 base) and bob issues 100 (1 bob = 20 base). alice's tickets fund task1 (100 alice) and task2 (200 alice); bob's fund task3. task2 issues 500 tickets (1 task2 = 0.4 alice = 2 base), which in turn fund threads (e.g., 300 task2 and 100 task2).]

[Diagram: the same hierarchy after alice issues 100 more tickets to fund task3 (300 alice tickets outstanding). Each alice ticket dilutes to 1 alice = 3.33 base, and 1 task2 = 0.4 alice = 1.33 base; bob's tickets remain worth 20 base. The inflation is contained within alice.]

Example: List-based Lottery with Currencies
[Diagram: the ticket list holds tickets denominated in different currencies (1 base, 2 bob, 5 task3, 2 bob, 10 task2); each is converted to base units as the list is walked. The draw is Random(0, 2999) = 1500 over the 3000 base units outstanding.]

Compensation
A holds 400 base; B holds 400 base.
A runs its full 100 ms quantum; B yields at 20 ms.
B uses only 1/5 of its allotted time, so it gets 400/(1/5) = 2000 base at each subsequent lottery for the rest of this quantum: its own 400 plus a compensation ticket valued at 2000 - 400 = 1600.

Ticket Transfer
During a synchronous RPC between client and server:
- the client creates a ticket in its own currency and sends it to the server to fund the server's currency
- on reply, the transfer ticket is destroyed

Control Scenarios
- Dynamic control: conditionally and dynamically grant tickets; adaptability
- Resource abstraction barriers, supported by currencies: insulate tasks

UI
Shell commands: mktkt, rmtkt, mkcur, rmcur, fund, unfund, lstkt, lscur, fundx.

Prototype Implemented in the Mach microkernel

Relative Rate Accuracy Figure 4: Relative Rate Accuracy. For each allocated ratio, the observed ratio is plotted for each of three 60 second runs. The gray line indicates the ideal where the two ratios are identical.

Fairness Over Time
8-second time windows over a 200-second execution.
Figure 5: Fairness Over Time. Two tasks executing the Dhrystone benchmark with a 2:1 ticket allocation. Averaged over the entire run, the two tasks executed 25378 and 12619 iterations/sec., for an actual ratio of 2.01:1.

Client-Server Query Processing Rates Figure 7: Query Processing Rates. Three clients with an 8 : 3 : 1 ticket allocation compete for service from a multithreaded database server. The observed throughput and response time ratios closely match this allocation.

Controlling Video Rates Figure 8: Controlling Video Rates. Three MPEG viewers are given an initial A: B : C = 3 : 2 : 1 allocation, which is changed to 3 : 1 : 2 at the time indicated by the arrow. The total number of frames displayed is plotted for each viewer. The actual frame rate ratios were 1.92 : 1.50 : 1 and 1.92 : 1 : 1.53, respectively, due to distortions caused by the X server.

Insulation Figure 9: Currencies Insulate Loads. Currencies A and B are identically funded. Tasks A1 and A2 are respectively allocated tickets worth 100:A and 200:A. Tasks B1 and B2 are respectively allocated tickets worth 100:B and 200:B. Halfway through the experiment, task B3 is started with an allocation of 300:B. The resulting inflation is locally contained within currency B, and affects neither the progress of tasks in currency A, nor the aggregate A:B progress ratio.

Other Kinds of Resources
Claim: the approach can be used for any resource where queuing occurs.
Controlling relative waiting times for mutex locks:
- the mutex currency is funded out of the currencies of the waiting threads
- the holder gets an inheritance ticket in addition to its own funding, passed on to the next holder (chosen by lottery) on release
Space sharing: an inverse lottery, where the loser is the victim (e.g., in a page-replacement decision, or processor-node preemption in multiprocessor partitioning).

Lock Funding
[Diagram pair: while a thread holds the lock, each waiting thread funds the lock's currency with a ticket, and the lock issues a backing ticket (bt) that funds the holding thread on top of its own funding. On release, a lottery among the waiters selects the new holder, and the inheritance ticket moves from the old holder to the new one.]

Mutex Waiting Times
Figure 11: Mutex Waiting Times. Eight threads compete to acquire a lottery-scheduled mutex. The threads are divided into two groups (A, B) of four threads each, with the ticket allocation A:B = 2:1. For each histogram, the solid line indicates the mean; the dashed lines indicate one standard deviation about the mean. The ratio of average waiting times is A:B = 1:2.11; the mutex acquisition ratio is 1.80:1.

Synchronization

The Trouble with Concurrency in Threads...
Shared data: x, initially 0.
Thread 0: while (i < 10) { x = x + 1; i++; }
Thread 1: while (j < 10) { x = x + 1; j++; }
What is the value of x when both threads leave their while loops?

Range of Answers
With ten increments per thread, the final value of x can range from 2 to 20. One interleaving that yields the minimum:
1. Process 0: LD x (x currently 0), then is preempted.
2. Process 1: LD x (0), Add 1, ST x (x now 1); does 8 more full loops (x = 9).
3. Process 0: Add 1, ST x (x now 1, stored over the 9).
4. Process 1, last iteration: LD x (x now 1), Add 1; preempted.
5. Process 0: does 9 more full loops, leaving x at 10.
6. Process 1: ST x (x = 2, stored over the 10).

Nondeterminism
while (i < 10) { x = x + 1; i++; }
What unit of work can be performed without interruption? Indivisible or atomic operations.
Interleavings: the possible execution sequences of operations drawn from all threads.
Race condition: the final results depend on ordering and may not be "correct".
Even a single increment decomposes into interruptible steps:
  load value of x into reg
  yield()
  add 1 to reg
  yield()
  store reg value at x

Reasoning about Interleavings
On a uniprocessor, the possible execution sequences depend on when context switches can occur:
- Voluntary context switch: the process or thread explicitly yields the CPU (blocking on a system call it makes, or invoking a Yield operation)
- Interrupts or exceptions: an asynchronous handler is activated, disrupting the execution flow
- Preemptive scheduling: a timer interrupt may cause an involuntary context switch at any point in the code
On multiprocessors, the ordering of operations on shared memory locations is the important factor.

Critical Sections
If a sequence of non-atomic operations must be executed as if it were atomic in order to be correct, then we need a way to constrain the possible interleavings in this critical section of our code.
- Critical sections are code sequences that contribute to "bad" race conditions.
- Synchronization is needed around such critical sections.
- Mutual exclusion: the goal is to ensure that critical sections execute atomically w.r.t. related critical sections in other threads or processes.
How?

The Critical Section Problem
Each process follows this template:

    while (1) {
        ...other stuff...    // processes in here shouldn't stop others
        enter_region();
        // critical section
        exit_region();
    }

The problem is to define enter_region and exit_region to ensure mutual exclusion with some degree of fairness.

Implementation Options for Mutual Exclusion
- Disable interrupts
- Busywaiting solutions (spinlocks): execute a tight loop while the critical section is busy; benefits from specialized atomic (read-modify-write) instructions
- Blocking synchronization: sleep (enqueued on a wait queue) while the critical section is busy
Synchronization primitives (abstractions such as locks) provided by a system may be implemented with some combination of these techniques.