Operating Systems
Lecture 3a: Linux Schedulers
William M. Mongan


Material drawn in part from ux_Scheduler.pdf

The Linux Scheduler (kernel/sched.c)
Information for the scheduler is stored in our struct task_struct from before:
–need_resched: tells the kernel that the scheduler (schedule()) needs to be called
  When can a process be preempted? When it runs out of timeslice, or when a higher-priority process appears.
  Why is this stored in every task_struct instead of in a static or global variable? Typically, we set need_resched in the currently running process only, and that task_struct is already in cache, so it is faster to implement this way.
  After an interrupt, this flag is checked on the current process.
  –Checked by ret_from_intr()
–counter: number of clock ticks left in this timeslice
  Decremented by a timer
  Upon reaching 0, need_resched is set
–priority and rt_priority: static and realtime priority (a realtime priority is always higher than a normal priority)
–policy: the scheduling policy (typically normal or realtime)
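The interplay between counter and need_resched can be sketched in user-space C. This is a simplified model under stated assumptions, not kernel code: struct task, timer_tick(), and return_from_interrupt() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the scheduling fields of task_struct. */
struct task {
    int  counter;       /* clock ticks left in the current timeslice */
    bool need_resched;  /* set when schedule() should be called */
};

/* Models the timer interrupt: charge one tick to the running task. */
static void timer_tick(struct task *cur)
{
    assert(cur != NULL);
    if (cur->counter > 0 && --cur->counter == 0)
        cur->need_resched = true;   /* timeslice used up */
}

/* Models the check made on the way out of an interrupt.
 * Returns true if the scheduler should run now, and clears the flag. */
static bool return_from_interrupt(struct task *cur)
{
    assert(cur != NULL);
    if (!cur->need_resched)
        return false;
    cur->need_resched = false;
    return true;
}
```

Because the flag lives in the task that is running anyway, the check on the interrupt return path touches only data that is likely already in cache.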

need_resched
When do we set this bit?
–Timeslice expired
  time.c: update_process_time()
–Higher priority process is awoken
  wake_up_process() or scheduler_tick() -> resched_task() in sched.c
  –try_to_wake_up() -> TASK_RUNNING -> activate_task()
Preemptive user processes
–On call to sched_setscheduler() or sched_yield()
Kernel is also preemptive
–Thread safe at all times
–Lock held for unsafe operations

The Linux Scheduler
Linux implements an O(1) scheduler. This scheduler:
–Chooses the next process to run
  Which? The highest-priority process with time left to run (counter > 0)
–Recalculates the timeslice of each process to be run again
–Computes the effective (“dynamic”) priority of the process
  Based on what? How long it has used the CPU (or not) in the last timeslice
  Recall that this is our method to predict the interactivity (CPU or I/O “boundness”) of the process
  This is done by giving each process a bonus in [-5, 5] to its static priority (nice value)
How do we do all this in O(1) time?
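A minimal sketch of applying the interactivity bonus to a static priority follows. The function name effective_prio_sketch() is hypothetical, and the value MAX_RT_PRIO = 100 (normal priorities occupying 100..139) is an assumption not stated in the slides; only MAX_PRIO = 140 and the [-5, 5] bonus range come from the lecture.

```c
#include <assert.h>

#define MAX_RT_PRIO 100   /* assumed: priorities 0..99 are realtime */
#define MAX_PRIO    140   /* from the slides: 140 priority levels */

/* Hypothetical helper: apply an interactivity bonus in [-5, 5].
 * A lower number means a higher priority, so a positive bonus
 * (an interactive task) is subtracted from the static priority. */
static int effective_prio_sketch(int static_prio, int bonus)
{
    assert(bonus >= -5 && bonus <= 5);
    int prio = static_prio - bonus;
    if (prio < MAX_RT_PRIO)          /* never promote into the realtime range */
        prio = MAX_RT_PRIO;
    if (prio > MAX_PRIO - 1)         /* never fall below the lowest priority */
        prio = MAX_PRIO - 1;
    return prio;
}
```

Under these assumptions, a task at static priority 120 that sleeps a lot could run at 115, while a CPU-bound task at the same static priority could drop to 125.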

schedule()
Each CPU has its own runqueue, and thus schedules independently
Each runqueue has a unique subset of TASK_RUNNING processes (which can be moved between runqueues by a load balancer)
–Actually, it keeps 2 lists: one for the runnable processes yet to be run in this timeslice, and one for those that already have
  This is key for O(1) performance
Each runqueue has a lock. When more than one runqueue must be locked, the locks are taken in a fixed order to prevent deadlocks!

The Runqueues (kernel/sched.c)
    struct runqueue {
        spinlock_t lock;
        unsigned long nr_running;          // number of runnable processes
        unsigned long long nr_switches;    // number of context switches
        unsigned long expired_timestamp;   // time of last array swap
        task_t *curr,                      // task currently running
               *idle;                      // idle task
        prio_array_t *active,              // active priority array
                     *expired,             // expired priority array
                     arrays[2];            // the two priority arrays
    };

Priority Array (kernel/sched.c)
    struct prio_array {
        unsigned int nr_active;               // number of tasks in this array
        unsigned long bitmap[BITMAP_SIZE];    // one bit per priority level
        struct list_head queue[MAX_PRIO];     // one task list per priority level
    };

schedule()
A bitmap is created such that there is 1 bit for every priority level.
–At 140 (MAX_PRIO) priority levels, that’s 140 bits / 32 bits per word = 4.375 words, rounded up to 5 words (BITMAP_SIZE)
If a TASK_RUNNING process exists at that level with timeslice available, the corresponding bit is true
Notice the list of queues, struct list_head queue[MAX_PRIO], in struct prio_array, and the relationship between BITMAP_SIZE and MAX_PRIO
–How does it work?

schedule()
When a priority level has a TASK_RUNNING process with timeslice available, that bit in the bitmap is set to 1.
–If a priority 50 process can be run next, bit 50 of the bitmap is 1
The position of the first 1 bit is the index, in the list of queues, of the queue on which the available process is stored. The processes on that queue are run round-robin.
The bitmap has a fixed size, so finding its first set bit takes a bounded number of steps no matter how many processes exist. Thus, constant time!
–But wait… how do we know which ones on this queue have timeslice available? We’d potentially have to search the whole list, yielding O(n) time performance.
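The bitmap lookup can be sketched in portable user-space C. This is an illustration under stated assumptions rather than the kernel's implementation: struct prio_bitmap and the bitmap_* helpers are hypothetical names, and the word count is derived from the host's unsigned long width (so it is 5 words on a 32-bit machine and 3 on a 64-bit one). The kernel's sched_find_first_bit() is a dedicated routine; the plain loops here only show why the work is bounded by a constant.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_PRIO      140
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
/* Round up: enough words to hold MAX_PRIO bits. */
#define BITMAP_WORDS  ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct prio_bitmap { unsigned long word[BITMAP_WORDS]; };

/* Mark priority level prio as having at least one runnable task. */
static void bitmap_set(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    b->word[prio / BITS_PER_WORD] |= 1UL << (prio % BITS_PER_WORD);
}

/* Mark priority level prio as empty. */
static void bitmap_clear(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    b->word[prio / BITS_PER_WORD] &= ~(1UL << (prio % BITS_PER_WORD));
}

/* Return the lowest set bit (the highest priority with work),
 * or MAX_PRIO if no bit is set. The loops run at most
 * BITMAP_WORDS and BITS_PER_WORD times, independent of task count. */
static int bitmap_find_first(const struct prio_bitmap *b)
{
    for (size_t w = 0; w < BITMAP_WORDS; w++) {
        unsigned long v = b->word[w];
        if (v == 0)
            continue;
        int bit = 0;
        while (!(v & 1UL)) {
            v >>= 1;
            bit++;
        }
        return (int)(w * BITS_PER_WORD) + bit;
    }
    return MAX_PRIO;
}
```

The returned index would then select the queue to take the next task from, as in queue[idx].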

Timeslice Management
This is the reason for having 2 arrays in the runqueue:
    prio_array_t *active,     // active priority array
                 *expired,    // expired priority array
                 arrays[2];   // the two priority arrays
Initially, all the processes have timeslice, so they are stored in the active array. As their timeslice expires, they are moved into the expired array.
Therefore, all processes in the active array (selected through the prio_array structure previously) are guaranteed to have timeslice available.
–If the list at a priority level is empty, the corresponding bitmap bit would have been 0.
But recalculating timeslices is an O(n) job, right?

Timeslice Management
When a process’s timeslice reaches 0, the process is moved to the expired array and its timeslice is recalculated at that moment, so no separate O(n) recalculation pass is needed.
But wait again! Moving the processes back to the active array is also O(n), right?
After all processes in the active array have run, they have all been moved to expired with their timeslices restored.
–Therefore, just swap the two array pointers (all of this is O(1) time).
    prio_array_t *array;
    array = rq->active;
    if (unlikely(!array->nr_active)) {
        // active array is empty: swap active and expired
        rq->active = rq->expired;
        rq->expired = array;
        array = rq->active;
    }
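The pointer swap can be modelled in a few lines of user-space C. This is a sketch, assuming a stripped-down runqueue that only counts tasks; struct runqueue_sketch, rq_init(), rq_expire_one(), and rq_swap_if_empty() are hypothetical names, not kernel functions.

```c
#include <assert.h>

/* Hypothetical reduction of prio_array: only the task count is kept. */
struct prio_array_sketch { unsigned int nr_active; };

struct runqueue_sketch {
    struct prio_array_sketch *active, *expired;
    struct prio_array_sketch arrays[2];
};

/* Start with every task in the active array. */
static void rq_init(struct runqueue_sketch *rq, unsigned int ntasks)
{
    rq->arrays[0].nr_active = ntasks;
    rq->arrays[1].nr_active = 0;
    rq->active  = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* One task used up its timeslice: account for its move to expired. */
static void rq_expire_one(struct runqueue_sketch *rq)
{
    assert(rq->active->nr_active > 0);
    rq->active->nr_active--;
    rq->expired->nr_active++;
}

/* Mirrors the check in schedule(): if the active array is empty,
 * exchange the two pointers. Returns 1 if a swap happened. */
static int rq_swap_if_empty(struct runqueue_sketch *rq)
{
    if (rq->active->nr_active != 0)
        return 0;
    struct prio_array_sketch *tmp = rq->active;
    rq->active  = rq->expired;
    rq->expired = tmp;
    return 1;
}
```

No task is touched during the swap; only two pointers change, which is why the cost does not depend on the number of processes.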

schedule()
    struct task_struct *prev, *next;
    struct list_head *queue;
    struct prio_array *array;
    int idx;

    prev = current;
    if (!rq->nr_running)
        next = rq->idle;                              // nothing runnable: run the idle task
    else {
        array = rq->active;
        idx = sched_find_first_bit(array->bitmap);    // highest-priority non-empty queue
        queue = array->queue + idx;
        next = list_entry(queue->next, struct task_struct, run_list);
    }
    if (prev != next) {
        rq->nr_switches++;
        rq->curr = next;
        ++*switch_count;
        prev = context_switch(rq, prev, next);
    }

scheduler_tick()
    if (!--p->time_slice) {
        dequeue_task(p, rq->active);
        set_tsk_need_resched(p);
        p->prio = effective_prio(p);
        p->time_slice = task_timeslice(p);
        p->first_time_slice = 0;
        if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
            enqueue_task(p, rq->expired);
            if (p->static_prio < rq->best_expired_prio)
                rq->best_expired_prio = p->static_prio;
        } else {
            enqueue_task(p, rq->active);
        }
    }
Recalculates the time slice as soon as it expires
Gives the process a second chance on the active array if the process is interactive (and the expired array is not starving)
This is the reason for the previous if (unlikely(!array->nr_active)) check
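The requeue decision in scheduler_tick() reduces to a two-input choice, sketched here in user-space C. The names requeue_target() and enum requeue are hypothetical; the two boolean parameters stand in for the results of TASK_INTERACTIVE(p) and EXPIRED_STARVING(rq), whose definitions are not given in the slides.

```c
#include <assert.h>
#include <stdbool.h>

enum requeue { TO_ACTIVE, TO_EXPIRED };

/* Hypothetical distillation of the branch in scheduler_tick():
 * a task whose timeslice ran out goes back to the active array only
 * if it is interactive AND the expired array is not being starved. */
static enum requeue requeue_target(bool interactive, bool expired_starving)
{
    if (!interactive || expired_starving)
        return TO_EXPIRED;
    return TO_ACTIVE;
}
```

The starvation test keeps a stream of interactive tasks from postponing the array swap indefinitely.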

Context switch
sched.c: context_switch()
–Change the memory map
–Call switch_to to “unpack” the new process and “pack up” the old process
  –__switch_to in process.c

Other Schedulers
Completely Fair Scheduler (CFS)
SCHED_FIFO
SCHED_RR

Waiting
Wait on a wait queue (wait_queue_head_t):
    DECLARE_WAITQUEUE(wait, current);          // wait queue entry for the current task
    add_wait_queue(q, &wait);
    set_current_state(TASK_INTERRUPTIBLE);     // mark the task as sleeping
    while (!condition)
        schedule();                            // give up the CPU until woken
    set_current_state(TASK_RUNNING);
    remove_wait_queue(q, &wait);

Kernel Locking
The kernel defines sequences of instructions that handle a particular operation, called kernel control paths (kcp)
These are like threads and can be interwoven
Include the kernel’s atomic header to get atomic_t, a thread-safe integer type
–ATOMIC_INIT(int i)
–int atomic_read(atomic_t *v)
–void atomic_set(atomic_t *v, int i)
–int atomic_sub_and_test(int i, atomic_t *v)    // subtract i from v and return true if the result is 0
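The kernel's atomic_t API cannot be run outside the kernel, so the following sketch uses the C11 <stdatomic.h> facilities as a user-space analog. It is an assumption-laden illustration of what atomic_sub_and_test() does, not the kernel's implementation; the function name sub_and_test() is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space analog of atomic_sub_and_test(): atomically subtract i
 * from *v and report whether the new value is zero. */
static bool sub_and_test(atomic_int *v, int i)
{
    assert(v != NULL);
    /* atomic_fetch_sub returns the value held BEFORE the subtraction,
     * so the new value is (old - i). */
    return atomic_fetch_sub(v, i) - i == 0;
}
```

A typical use is reference counting: the caller that drops the count to zero, and only that caller, learns it must free the object.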

Spinlock
Earlier we said cli/sti is a bad idea on multicore environments (why?)
We can use spinlocks on multicore environments instead
–and not on single core! (why not?)
Simple true/false lock with a busy wait
Include the kernel’s spinlock header to get the spinlock_t type (values 0 and 1), initialized with the SPIN_LOCK_UNLOCKED macro
–spin_lock(spinlock_t *lock)
–spin_unlock(spinlock_t *lock)
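A true/false lock with a busy wait can be sketched in user space with the C11 atomic_flag type. This is an analog of the idea, not the kernel's spinlock_t; the names spin_sketch_t, SPIN_SKETCH_INIT, and the spin_sketch_* functions are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { atomic_flag locked; } spin_sketch_t;
#define SPIN_SKETCH_INIT { ATOMIC_FLAG_INIT }

/* Try once: test-and-set returns the PREVIOUS value, so false
 * means the flag was clear and we now hold the lock. */
static bool spin_sketch_trylock(spin_sketch_t *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy wait until the lock is acquired. */
static void spin_sketch_lock(spin_sketch_t *l)
{
    while (!spin_sketch_trylock(l))
        ;   /* spin */
}

/* Release the lock so that another spinner can take it. */
static void spin_sketch_unlock(spin_sketch_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The spinning loop makes sense only when the holder can make progress on another core at the same time, which is the point behind the "not on single core" remark.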

R/W Locks
Allows multiple kcp’s to execute concurrently in a read-only fashion, blocking writes
–Increases parallelism
rwlock_t structure, initialized by RW_LOCK_UNLOCKED
–read_lock(rwlock_t *lock) / write_lock(rwlock_t *lock)
–read_unlock(rwlock_t *lock) / write_unlock(rwlock_t *lock)
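The many-readers-or-one-writer rule can be sketched with a single C11 atomic counter. This is a user-space illustration under stated assumptions, not the kernel's rwlock_t: the type rw_sketch_t and the rw_* functions are hypothetical, and only non-blocking "try" variants are shown, whereas the kernel's read_lock() and write_lock() spin until they succeed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* state > 0: that many readers hold the lock
 * state == 0: free
 * state == -1: one writer holds the lock */
typedef struct { atomic_int state; } rw_sketch_t;

static void rw_init(rw_sketch_t *l)
{
    assert(l != NULL);
    atomic_init(&l->state, 0);
}

/* Readers may enter whenever no writer is present. */
static bool rw_try_read_lock(rw_sketch_t *l)
{
    int s = atomic_load(&l->state);
    while (s >= 0) {
        /* On failure, s is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
    }
    return false;   /* a writer holds the lock */
}

static void rw_read_unlock(rw_sketch_t *l)
{
    atomic_fetch_sub(&l->state, 1);
}

/* A writer may enter only when the lock is completely free. */
static bool rw_try_write_lock(rw_sketch_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

static void rw_write_unlock(rw_sketch_t *l)
{
    atomic_store(&l->state, 0);
}
```

Two readers can hold the lock at once, which is where the extra parallelism comes from; a writer excludes everyone.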

Kernel Semaphores
Include the kernel’s semaphore header to get struct semaphore
–down() and up() operations, similar to P and V

Kernel Condition Variables
Include the kernel’s completion header
–wait_for_completion() and complete(), similar to wait and signal
  –Used by do_fork()