Slide 1: Advanced Operating Systems, Lecture 6: Scheduling
University of Tehran, Dept. of EE and Computer Engineering
By: Dr. Nasser Yazdani
Slide 2: Using Resources Efficiently
Sharing the CPU and the other resources of the system.
References:
- Surplus Fair Scheduling: A Proportional-Share CPU Scheduling Algorithm for Symmetric Multiprocessors
- Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
- Condor: A Hunter of Idle Workstations
- Virtual-Time Round-Robin: An O(1) Proportional Share Scheduler
- A SMART Scheduler for Multimedia Applications
- Linux CPU scheduling
Slide 3: Outline
- Scheduling
- Scheduling policies
- Scheduling on multiprocessors
- Thread scheduling
Slide 4: What is Scheduling?
- Policies and mechanisms to allocate resources to entities.
- It is a general problem in any field. Why? An OS often has many pending tasks: threads, async callbacks, device input.
- The order may matter for policy, correctness, or efficiency.
- Providing sufficient control is not easy: the mechanisms must allow the policy to be expressed.
- A good scheduling policy ensures that the most important entity gets the resources it needs.
Slide 5: Why Scheduling?
- This topic was popular in the days of time sharing, when there was a shortage of resources.
- Is it irrelevant now, in the era of PCs when resources are plentiful?
- The topic is back, to handle massive Internet servers with paying customers, where some customers are more important than others.
- New areas such as multicores also need scheduling.
Slide 6: Resources to Schedule
- Resources: CPU time, physical memory, disk and network I/O, and I/O bus bandwidth.
- Entities to give resources to: users, processes, threads, web requests.
Slide 7: Key Problems
- A gap between the desired policy and the available mechanism; the implementation is the problem.
- Many conflicting goals (low latency, high throughput, and fairness); a trade-off must be made between them.
- Interaction between different schedulers: one has to take a systems view. Just optimizing the CPU scheduler may do little for the overall desired policy.
Slide 8: Scheduling Policy Examples
- Allocate cycles in proportion to money.
- Maintain high throughput under high load.
- Never delay a high-priority thread by more than 1 ms.
- Maintain good interactive response.
- Can we enforce such policies with the thread scheduler alone?
Slide 9: General Plan
- Understand where scheduling is occurring.
- Expose scheduling decisions and allow control.
- Account for resource consumption, to allow intelligent control.
- Where to schedule? The application is in the best position to know its scheduling requirements: which threads run best simultaneously, and which are on the critical path. But the kernel must make sure that all applications play fairly.
Slide 10: Example (Round Robin)
- Each process gets an equal CPU time slice, e.g. 10 ms. This works if processes are compute-bound.
- What if a process waits for I/O? How long should the quantum be; is 10 ms the right answer? A shorter quantum gives better interactive response but lower throughput.
- What if an environment computes for 1 ms and then sends an IPC to the file-server environment? Shouldn't the file server get more CPU time, because it operates on behalf of all other environments?
- Improvement: track "recent" CPU use and run the process with the least recent CPU use.
- Other solution: directed yield; specify the next process to run (e.g., yield to the file server so that it can compute on the environment's behalf).
- A minimal simulation sketch follows this slide.
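To make the round-robin quantum discussion concrete, here is a minimal simulation sketch in Python. It is not from the lecture; the job names, CPU demands, and the 10 ms quantum are illustrative assumptions.

```python
from collections import deque

def round_robin(jobs, quantum=10):
    """Simulate round-robin scheduling.

    jobs: dict name -> remaining CPU time (ms). quantum: time slice (ms).
    Returns the completion time of each job. Hypothetical helper for illustration.
    """
    queue = deque(jobs.items())          # FIFO ready queue of (name, remaining)
    clock, finish = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)    # run for one quantum or until done
        clock += run
        remaining -= run
        if remaining:                    # not finished: back to the end of the queue
            queue.append((name, remaining))
        else:
            finish[name] = clock
    return finish

print(round_robin({"A": 30, "B": 10, "C": 20}, quantum=10))
# {'B': 20, 'C': 50, 'A': 60}
```

Shrinking the quantum in this sketch lets the short job B finish sooner at the cost of more context switches, which is exactly the latency versus throughput trade-off described on the slide.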
Slide 11: Scheduling is a System Problem
- The thread/process scheduler cannot enforce policies by itself.
- It needs cooperation from all resource schedulers and from the software structure.
- Conflicting goals may limit its effectiveness.
Slide 12: Goals
- Low latency: people typing at editors want fast response; network services can be latency-bound, not CPU-bound.
- High throughput: minimize context switches to avoid wasting CPU on TLB misses, cache misses, even page faults.
- Fairness.
Slide 13: Scheduling Approaches
- FIFO: + fair; - high latency.
- Round robin: + fair, + low latency; - poor throughput.
- STCF/SRTCF (shortest time / remaining time to completion first): + low latency, + high throughput; - unfair (starvation).
Slide 14: Shortest Job First (SJF)
- Two types: non-preemptive and preemptive.
- Requirement: the elapsed (run) time needs to be known in advance.
- Provably optimal if all jobs are available simultaneously.
- Is SJF still optimal if the jobs are not all available simultaneously?
Slide 15: Preemptive SJF
- Also called Shortest Remaining Time First (SRTF).
- Schedule the job with the shortest remaining time required to complete.
- Requirement: the elapsed (run) time needs to be known in advance.
- A small simulation sketch follows this slide.
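As an illustration of preemptive SJF, here is a small SRTF simulation sketch. The arrival times and burst lengths are made up, and the code assumes burst times are known in advance, as the slide requires.

```python
import heapq

def srtf(arrivals):
    """Preemptive SJF (SRTF) simulation sketch.

    arrivals: list of (arrival_time, name, burst). Returns completion times.
    """
    arrivals = sorted(arrivals)
    ready, done, clock, i = [], {}, 0, 0
    while i < len(arrivals) or ready:
        if not ready:                      # idle until the next arrival
            clock = max(clock, arrivals[i][0])
        while i < len(arrivals) and arrivals[i][0] <= clock:
            t, name, burst = arrivals[i]
            heapq.heappush(ready, (burst, name))   # keyed by remaining time
            i += 1
        remaining, name = heapq.heappop(ready)
        # Run until the job finishes or the next arrival may preempt it.
        next_arrival = arrivals[i][0] if i < len(arrivals) else float("inf")
        run = min(remaining, next_arrival - clock)
        clock += run
        if remaining - run > 0:
            heapq.heappush(ready, (remaining - run, name))
        else:
            done[name] = clock
    return done

print(srtf([(0, "A", 8), (1, "B", 4), (2, "C", 1)]))
# {'C': 3, 'B': 6, 'A': 13}
```

When C arrives at t=2 with only one unit of work left, it preempts B; the longest job A finishes last, which is the SRTF behavior, and also why long jobs can starve under it.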
Slide 16: Interactive Scheduling
- Usually preemptive: time is sliced into quanta (time intervals), and a decision is made at the beginning of each quantum.
- Performance criteria: minimum response time, best proportionality.
- Representative algorithms: priority-based, round-robin, multi-queue and multilevel feedback, shortest process next, guaranteed scheduling, lottery scheduling, fair-share scheduling.
Slide 17: Priority Scheduling
- Each job is assigned a priority, with FCFS within each priority level.
- Select the highest-priority job over lower-priority ones.
- Rationale: higher-priority jobs are more mission-critical. Example: a DVD movie player vs. sending email.
- Problems: may not give the best AWT (average waiting time); indefinite blocking or starvation of a process.
Slide 18: Setting Priorities
- Two approaches: static (for systems with well-known and regular application behavior) and dynamic (otherwise).
- Priority may be based on: cost to the user, importance of the user, aging, or the percentage of CPU time used in the last X hours.
Slide 19: Pitfall: Priority Inversion
- Low-priority thread X holds a lock.
- High-priority thread Y waits for the lock.
- Medium-priority thread Z preempts X.
- Y is indefinitely delayed despite its high priority.
- Solution: priority inheritance. When a lower-priority process holds a resource that a higher-priority process is waiting for, it inherits the higher priority until it is done with the resource in question; then its priority reverts to its natural value (a toy sketch follows).
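A toy sketch of priority inheritance. The Lock and Thread classes and their fields are hypothetical, not a real kernel API; the point is only to show the bookkeeping that removes the inversion described above.

```python
class Thread:
    def __init__(self, name, priority):
        self.name, self.base, self.effective = name, priority, priority

class Lock:
    """Toy lock that propagates priority to its holder (priority inheritance)."""
    def __init__(self):
        self.holder = None
        self.waiters = []

    def acquire(self, thread):
        if self.holder is None:
            self.holder = thread
        else:
            self.waiters.append(thread)
            # The holder temporarily inherits the highest waiting priority.
            self.holder.effective = max(self.holder.effective, thread.effective)

    def release(self):
        self.holder.effective = self.holder.base   # revert to natural priority
        self.holder = self.waiters.pop(0) if self.waiters else None

lock = Lock()
x, y = Thread("X", priority=1), Thread("Y", priority=10)
lock.acquire(x)       # low-priority X holds the lock
lock.acquire(y)       # high-priority Y waits; X now runs at priority 10
print(x.effective)    # 10 -- a medium-priority Z can no longer preempt X
lock.release()
print(x.effective)    # 1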
Slide 20: Pitfall: Long Code Paths
- Large-granularity locks are convenient; non-preemptable threads are an extreme case.
- They may delay high-priority processing.
- Every resource with multiple waiting threads has a scheduler: locks, the disk driver, the memory allocator.
- These schedulers may not cooperate, or may not even be explicit.
Slide 21: Pitfall: Efficiency
- Efficient disk use requires unfairness: shortest-seek-first vs. FIFO; read-ahead vs. data needed now.
- An efficient paging policy creates delays: the OS may swap out my idle Emacs to free memory; what happens when I type a key?
- The thread scheduler does not control any of these.
Slide 22: Pitfall: Server Processes
- User-level servers schedule requests: X11, DNS, NFS.
- They usually do not know about the kernel's scheduling policy.
- Network packet scheduling also interferes.
Slide 23: Pitfall: Hardware Schedulers
- The memory system is scheduled among CPUs.
- The I/O bus is scheduled among devices.
- The interrupt controller chooses the next interrupt.
- Hardware does not know about OS policy, and the OS often does not understand the hardware.
Slide 24: Time Quantum
- Time slice too large: FIFO behavior, poor response time.
- Time slice too small: too many context switches (overhead), inefficient CPU utilization.
- Heuristic: 70-80% of jobs should block within the time slice.
- Typical time slice: 10 to 100 ms.
- Time spent in the system depends on the size of the job.
Slide 25: Multi-Queue Scheduling
- A hybrid between priority and round-robin; processes are assigned to one queue permanently.
- Scheduling between queues: fixed priorities, or a percentage of CPU spent on each queue.
- Example of different priorities: system processes, interactive programs, background processes, student processes.
- Addresses the starvation and indefinite-blocking problems.
Slide 26: Multilevel Feedback Queues
- A job starts in the highest-priority queue.
- If its time slice expires, lower its priority by one level.
- If its time slice does not expire, raise its priority by one level.
- Age long-running jobs.
- A simulation sketch follows this slide.
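A small multilevel-feedback-queue simulation sketch. The quantum, number of levels, and burst patterns are assumptions, and aging of long-running jobs is omitted for brevity.

```python
from collections import deque

def mlfq(jobs, quantum=10, levels=3):
    """Multilevel feedback queue sketch (aging omitted).

    jobs: dict name -> list of CPU bursts; after each burst the job blocks
    for I/O (modeled as instantaneous). A job that exhausts its slice is
    demoted one level; one that yields before the slice expires is promoted
    one level, as the slide describes.
    """
    queues = [deque() for _ in range(levels)]
    for name, bursts in jobs.items():
        queues[0].append((name, list(bursts)))      # everyone starts on top
    clock, finished = 0, {}
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)
        name, bursts = queues[level].popleft()
        run = min(quantum, bursts[0])
        clock += run
        bursts[0] -= run
        if bursts[0] == 0:
            bursts.pop(0)                            # burst done: job blocks or exits
            if not bursts:
                finished[name] = clock
                continue
            new_level = max(level - 1, 0)            # yielded early: promote
        else:
            new_level = min(level + 1, levels - 1)   # used the full slice: demote
        queues[new_level].append((name, bursts))
    return finished

# An interactive job (many 2 ms bursts) stays near the top; a CPU hog sinks.
print(mlfq({"editor": [2, 2, 2, 2], "compile": [60]}))
# {'editor': 18, 'compile': 68}
```

The I/O-bound "editor" keeps yielding before its slice expires and stays at the top level, while the CPU-bound "compile" sinks, which is the behavior the slide's rules aim for.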
Slide 27: Multiprocessor Scheduling: Load Sharing
Decides:
- Which process to run?
- How long does it run?
- Where to run it (on which CPU)?
[Figure: processes 1..n competing for CPU horsepower.]
Slide 28: Multiprocessor Scheduling Choices
- Self-scheduled: each CPU dispatches a job from the ready queue.
- Master-slave: one CPU schedules the other CPUs.
- Asymmetric: one CPU runs the kernel and the others run user applications; or one CPU handles the network and the other handles applications.
Slide 29: Gang Scheduling for Multiprocessors
- A collection of processes belonging to one job run at the same time.
- If one process is preempted, all processes of the gang are preempted.
- Helps eliminate the time a process spends waiting for the other processes in its parallel computation.
Slide 30: Lottery Scheduling
- Claim: priority-based schemes are ad hoc.
- Lottery scheduling is a randomized scheme based on a currency abstraction; a process is scheduled to run if it holds the winning ticket.
- Idea: processes own lottery tickets; the CPU randomly draws a ticket and executes the corresponding process (see the sketch below).
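A minimal lottery-draw sketch; the ticket counts are illustrative.

```python
import random

def lottery_pick(tickets):
    """Draw one winner for the next quantum (lottery scheduling sketch).

    tickets: dict process -> number of tickets. Each process's chance of
    running is proportional to its ticket count.
    """
    total = sum(tickets.values())
    winner = random.randrange(total)          # draw a ticket number
    for process, count in tickets.items():
        if winner < count:
            return process
        winner -= count

# Over many quanta, A gets about 75% of the CPU and B about 25%.
counts = {"A": 0, "B": 0}
for _ in range(10_000):
    counts[lottery_pick({"A": 3, "B": 1})] += 1
print(counts)
```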
Slide 31: Properties of Lottery Scheduling
- Guarantees fairness, through probability, in the long run.
- Guarantees no starvation, as long as each process owns at least one ticket.
- To approximate SRTCF: give short jobs more tickets and long jobs fewer.
Slide 32: Partially Consumed Tickets
- What if a process is chosen but does not consume its entire time slice? The process receives compensation tickets.
- Idea: it gets chosen more frequently, but with a shorter time slice.
- Different implementations: sort the tickets and start with the largest; or generate a random number and see who holds that ticket.
Slide 33: Ticket Currencies
- Load insulation: a process can dynamically change its ticketing policy without affecting other processes.
- Currencies need to be converted before tickets are transferred (see the conversion sketch below).
- Example hierarchy: the base currency issues 3000 tickets. Alice's currency is funded with 1000 base tickets and issues 200; Bob's is funded with 2000 base tickets and issues 100. process1 is funded with 200 Alice tickets and issues 500, split 200 to thread1 and 300 to thread2; process2 is funded with 100 Bob tickets and issues 100, all held by thread3.
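A sketch of converting tickets into base units for the hierarchy above. The funding and issue numbers follow the reconstruction on this slide and should be treated as illustrative.

```python
def base_value(currency, currencies):
    """Value of one ticket of `currency`, expressed in base units (sketch).

    currencies: name -> (funding_amount, funding_currency, tickets_issued).
    The base currency is its own unit.
    """
    if currency == "base":
        return 1.0
    funding, funder, issued = currencies[currency]
    return funding * base_value(funder, currencies) / issued

currencies = {
    "Alice":    (1000, "base",  200),
    "Bob":      (2000, "base",  100),
    "process1": (200,  "Alice", 500),
    "process2": (100,  "Bob",   100),
}
# thread2 holds 300 process1 tickets, thread3 holds 100 process2 tickets:
print(300 * base_value("process1", currencies))   # 600.0 base units
print(100 * base_value("process2", currencies))   # 2000.0 base units
```

Because everything bottoms out in base units, Alice can reissue or reassign her own currency's tickets without changing the share that Bob's threads receive, which is the load-insulation property.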
Slide 34: Condor
- Identifies idle workstations and schedules background jobs on them.
- Guarantees that a job will eventually complete.
- Analysis of workstation usage patterns: only about 30% of the capacity is actually used.
- Remote capacity allocation: the Up-Down algorithm allows fair access to remote capacity.
- Remote execution facilities: Remote Unix (RU).
Slide 35: Condor Issues
- Leverage (a performance measure): the ratio of the capacity consumed by a job remotely to the capacity consumed on the home station to support remote execution.
- Checkpointing: save the state of a job so that its execution can be resumed.
- Transparent placement of background jobs; automatic restart if a background job fails.
- Users expect to receive fair access.
- Small overhead.
Slide 36: Condor Scheduling
- A hybrid of a centralized static approach and a distributed approach.
- Each workstation keeps its own state information and schedule.
- A central coordinator assigns capacity to workstations; the workstations use that capacity to schedule.
Slide 37: Real-Time Systems
- The issues are scheduling and interrupts: a task must complete by a particular deadline.
- Examples: accepting input from real-time sensors, process-control applications, responding to environmental events.
- How does one support real-time systems? If deadlines are short, often use a dedicated system; give real-time tasks absolute priority; do not use virtual memory; use early binding.
Slide 38: Real-Time Scheduling
- To initiate a job, one must specify its deadline and an estimate or upper bound on its resource needs.
- The system accepts or rejects the job. If accepted, it agrees that it can meet the deadline: it places the job in a calendar, blocking out the resources the job will need and planning when they will be allocated.
- Some systems support priorities, but this can violate the real-time guarantees for already-accepted jobs.
Slide 39: User-Level Thread Scheduling
[Figure: a possible scheduling with a 50 ms process quantum; threads run in 5 ms CPU bursts.]
Slide 40: Kernel-Level Thread Scheduling
[Figure: a possible scheduling with a 50 ms process quantum; threads run in 5 ms CPU bursts.]
Slide 41: Thread Scheduling Examples
- Solaris 2: priority-based process scheduling with four scheduling classes (real-time, system, time-sharing, interactive) and a set of priorities within each class. The scheduler converts the class-specific priorities into global priorities and selects the thread with the highest global priority to run. The thread runs until (1) it blocks, (2) it uses up its time slice, or (3) it is preempted by a higher-priority thread.
- JVM: schedules threads using a preemptive, priority-based algorithm; it runs the "runnable" thread with the highest priority and applies FIFO among threads of equal priority. It schedules a new thread when (1) the current thread leaves the "runnable" state due to the block(), exit(), suspend(), or stop() methods, or (2) a thread with higher priority enters the "runnable" state.
Slide 42: Surplus Fair Scheduling: Motivation
- Diverse web and multimedia applications are popular: HTTP, streaming, e-commerce, games, etc.
- Applications are hosted on large servers (typically multiprocessors).
- Key challenge: design OS mechanisms for resource management.
[Figure: end-stations connect over the network to a server hosting streaming, e-commerce, and web applications.]
Slide 43: Requirements for OS Resource Management
- Fair, proportionate allocation: e.g., 20% for HTTP, 30% for streaming, etc.
- Application isolation: misbehaving or overloaded applications should not affect other applications.
- Efficiency: OS mechanisms should have low overhead.
- Focus: achieving these objectives for CPU scheduling on multiprocessor machines.
Slide 44: Proportional-Share Scheduling
- Associate a weight with each application and allocate CPU bandwidth in proportion to the weight.
- Existing algorithms approximate the ideal algorithm, Generalized Processor Sharing (GPS): e.g., WFQ, SFQ, SMART, BVT, etc.
- Question: are the existing algorithms adequate for multiprocessor systems?
[Figure: two applications with weights 2 and 1 receive 2/3 and 1/3 of the CPU bandwidth.]
Slide 45: Starvation Problem
- SFQ (start-time fair queueing): each thread has a start tag (its service divided by its weight). The scheduler runs the thread with the minimum start tag, and the tag is advanced by service time / weight.
- Example on two CPUs: A (wt=100), B (wt=1), and a late-arriving C (wt=1). A's tag grows 100 times more slowly than B's, so when C arrives (with its tag set to the minimum tag, A's), B's much larger tag keeps it off both CPUs: B starves (see the sketch below).
[Timeline figure: start tags of A (S1), B (S2), and C (S3) over time on CPUs 1 and 2; C arrives around t=1000 and B starves.]
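A sketch of the SFQ start-tag bookkeeping that produces the starvation above. The weights match the slide's example; the code is illustrative and not the kernel implementation.

```python
def sfq_tags(threads, quantum, steps, cpus=2):
    """Start-time fair queueing sketch.

    threads: dict name -> weight. Start tags begin at 0; after a thread runs
    for `quantum`, its tag increases by quantum / weight. On each step the
    `cpus` threads with the smallest tags run.
    """
    tags = {name: 0.0 for name in threads}
    for _ in range(steps):
        running = sorted(tags, key=tags.get)[:cpus]   # lowest start tags run
        for name in running:
            tags[name] += quantum / threads[name]
    return tags

# A (wt=100) and B (wt=1) on 2 CPUs: both always run, so A's tag grows
# 100 times more slowly than B's.
print(sfq_tags({"A": 100, "B": 1}, quantum=1, steps=100))
# {'A': 1.0, 'B': 100.0}
```

A thread C arriving now starts at the minimum tag (A's 1.0), so for a long stretch the two CPUs go to A and C while B, stuck at tag 100, starves: the weights demand more than one CPU's worth of bandwidth for A, which no schedule can deliver.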
Slide 46: Weight Readjustment
- Reason for the starvation: an infeasible weight assignment (e.g., 1:100 for 2 CPUs); the accounting differs from the actual allocation.
- Observation: a thread cannot consume more than one CPU's worth of bandwidth, so a thread can be assigned at most 1/p of the total bandwidth of p CPUs.
- Feasibility constraint: w_i / (sum_j w_j) <= 1/p for every thread i.
Slide 47: Weight Readjustment (contd.)
- Goal: convert the given weights into feasible weights.
- The algorithm examines threads in decreasing order of weight and caps any weight that violates the feasibility constraint (a sketch follows).
- Efficient: the algorithm is O(p), and it can be combined with existing algorithms.
[Figure: the largest weights, in decreasing order, matched against CPUs 1..p.]
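One possible reconstruction of the O(p) readjustment pass, under the feasibility constraint above. This is an illustrative sketch, not the paper's code; the exact formulation in the paper may differ in details.

```python
def readjust_weights(weights, p):
    """Sketch of weight readjustment for p CPUs.

    Walks the p largest weights in decreasing order; whenever a weight would
    demand more than one CPU's worth of bandwidth relative to the threads not
    yet examined, it is capped so that it asks for exactly one CPU. The loop
    itself is O(p); the paper keeps the run queue ordered, so no full sort
    is needed there.
    """
    w = sorted(weights, reverse=True)
    remaining = sum(w)                       # sum of weights from index i onward
    for i in range(min(p, len(w))):
        cpus_left = p - i
        # Feasibility among the threads from i onward: w[i] / remaining <= 1 / cpus_left
        if w[i] * cpus_left > remaining:
            rest = remaining - w[i]
            w[i] = rest / (cpus_left - 1) if cpus_left > 1 else w[i]
            remaining = w[i] + rest
        remaining -= w[i]
    return w

print(readjust_weights([100, 1], p=2))   # [1.0, 1]: the infeasible 100:1 becomes 1:1
```

On the slide's example (weights 100:1 on 2 CPUs), the cap turns the assignment into 1:1, after which SFQ or surplus fair scheduling no longer starves the low-weight thread.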
Slide 48: Effect of Readjustment
[Plots: number of iterations (x10^5) vs. time (s) for threads A (wt=10), B (wt=1), C (wt=1), under SFQ without and with weight readjustment.]
Weight readjustment gets rid of the starvation problem.
Slide 49: Short Jobs Problem
[Plots: number of iterations (x10^5) vs. time (s) for J1 (wt=20), J2-J21 (wt=1 each), and J_short (wt=5), under SFQ and under the ideal allocation.]
With frequent arrivals and departures of short jobs, SFQ does unfair allocation.
Slide 50: Surplus Fair Scheduling
- Surplus = actual service received - ideal service.
- The scheduler picks the threads with the least surplus values: lagging threads get closer to their due, and threads that are ahead are restrained.
[Figure: ideal vs. actual service received by thread i over time; the gap at time t is the surplus.]
Slide 51: Surplus Fair Scheduling (contd.)
- Start tag S_i: the weighted service of thread i, S_i = Service_i / w_i.
- Virtual time v: the minimum start tag over all runnable threads.
- Surplus alpha_i = Service_i - w_i * v = w_i * S_i - w_i * v.
- The scheduler selects threads in increasing order of surplus (see the sketch below).
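A sketch of the surplus computation and selection rule defined above. The service values and weights are made up for illustration; this is a helper, not kernel code.

```python
def pick_next(threads, cpus):
    """Surplus fair scheduling selection sketch.

    threads: dict name -> {"service": CPU time received, "weight": w}.
    Start tag S_i = service_i / w_i; virtual time v = min S_i over runnable
    threads; surplus a_i = w_i * S_i - w_i * v. The scheduler runs the
    `cpus` threads with the smallest surplus.
    """
    start = {n: t["service"] / t["weight"] for n, t in threads.items()}
    v = min(start.values())                                   # virtual time
    surplus = {n: threads[n]["weight"] * (start[n] - v) for n in threads}
    return sorted(surplus, key=surplus.get)[:cpus]

threads = {
    "A": {"service": 20.0, "weight": 10},
    "B": {"service": 3.0,  "weight": 1},
    "C": {"service": 2.0,  "weight": 1},
}
print(pick_next(threads, cpus=2))   # ['A', 'C'] -- B is ahead of its share
```

Because the surplus is weighted by w_i, a heavy thread like A is not penalized for having received more absolute service, while B, which has run ahead of its 1-weight share, is held back.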
Slide 52: Surplus Fair Scheduling with Short Jobs
[Plots: number of iterations (x10^5) vs. time (s) for J1 (wt=20), J2-J21 (wt=1 each), and J_short (wt=5), under surplus fair scheduling and under the ideal allocation.]
Surplus fair scheduling does proportionate allocation.
Slide 53: Proportionate Allocation
[Bar chart: normalized processor allocation received by two web servers for weight assignments 1:1, 1:2, 1:4, and 1:7.]
Slide 54: Application Isolation
[Plot: frame rate (frames/sec) of an MPEG decoder vs. the number of background compilations (0-10), under surplus fair scheduling and under time-sharing.]
Slide 55: Scheduling Overhead
[Plot: context-switch time (microseconds) vs. number of processes, for surplus fair scheduling and time-sharing.]
Context-switch time (~10 microseconds) is small compared to the quantum size (~100 ms).
Slide 56: Summary
- Existing proportional-share algorithms are inadequate for multiprocessors.
- The weight readjustment algorithm can reduce unfairness.
- Surplus fair scheduling is practical for multiprocessors: it achieves proportional fairness and isolation, has low overhead, and includes heuristics for incorporating processor affinity.
- Source code available at: http://lass.cs.umass.edu/software/gms
Slide 57: Scheduler Activations
In a multiprocessor system, threads can be managed in:
- User space only (key feature: cooperative).
- Kernel space only (key feature: preemptive).
- User space on top of kernel space: some user-level threads, multiplexed on some kernel-level threads, which run on some CPUs.
Slide 58: Scheduler Activations
- User-level scheduling of threads: the application maintains the scheduling queue.
- The kernel allocates threads to tasks and makes an upcall to the scheduling code in the application when a thread is blocked for I/O or preempted.
- Only the user level is involved when a thread blocks on a critical section.
- The user level will block on kernel calls; the kernel returns control to the application scheduler.
Slide 59: User-Level Thread Management
Sample measurements obtained on a Firefly running Topaz (in microseconds):
- Procedure call: 7. Kernel trap: 19.

Operation      FastThreads   Topaz Threads   Ultrix Processes
Null Fork      34            948             11300
Signal-Wait    37            441             1840
Slide 60: User-Level Threads on Top of Kernel Threads
- Three layers: some user-level threads (how many?) on some kernel-level threads (how many?) on some CPUs (how many?).
- Problems are caused by: kernel threads being scheduled obliviously with respect to the user-level thread state; and kernel threads blocking, resuming, and being preempted without notification to the user level.
Slide 61: The Way Out: Scheduler Activations
- Processor allocation is done by the kernel.
- Thread scheduling is done by each address space.
- The kernel notifies the address-space thread scheduler of every event affecting that address space.
- The address space notifies the kernel of the subset of user-level events that can affect processor-allocation decisions.
Slide 62: Scheduler Activations (cont.)
- Goal: design a kernel interface and a user-level thread package that combine the functionality of kernel threads with the performance and flexibility of user-level threads.
- Secondary goal: if thread operations do not involve kernel intervention, the achieved performance should be similar to that of user-level threads.
Slide 63: Scheduler Activations (cont.)
- The difficulty is in achieving all of the above: in a multiprogrammed multiprocessor system, the required control and scheduling information is distributed between the kernel and user space.
- To manage the application's parallelism successfully, the user-level support routines (software) must be aware of kernel events (processor reallocations, I/O requests and completions, etc.), all of which are normally hidden from the application.
Slide 64: Scheduler Activations (cont.)
1. Provide each application with a virtual multiprocessor. The application knows how many processors it has been allocated and has total control over those processors and over its own scheduling. The OS controls the allocation of processors among address spaces, including the ability to change the number of processors assigned to an application during its execution.
2. To achieve this, the kernel notifies the address-space thread scheduler of the kernel events affecting it, so the application has complete knowledge of its scheduling state.
3. The user-space thread system notifies the kernel of operations that may affect processor-allocation decisions (helping achieve good performance).
The kernel mechanism that achieves all of this: scheduler activations.
Slide 65: Scheduler Activation Data Structures
- Each scheduler activation has two execution stacks: one mapped into the kernel and one mapped into the application address space. Each user-level thread gets its own stack at creation time. When a user-level thread calls into the kernel, it uses its activation's kernel stack; the user-level thread scheduler runs on the activation's user-level stack.
- The kernel maintains an activation control block, which records the state of the scheduler activation's thread when it blocks in the kernel or is preempted, and keeps track of which thread is running on every scheduler activation.
Slide 66: When a New Program Is Started
- The kernel creates a scheduler activation, assigns it a processor, and makes an upcall into the user-space application at a fixed entry point.
- The user-level system receives the upcall and uses the scheduler activation as the context in which to initialize itself and start running the first (main) thread.
- The first thread may ask for the creation of more threads and additional processors. For each new processor, the kernel creates a new activation and upcalls the user level to say that the new processor is there; the user level picks a thread and executes it on the new processor.
Slide 67: Notifying the User Level of an Event
- The kernel creates a new scheduler activation, assigns it a processor, and upcalls into user space. As soon as the upcall happens, the event can be processed and a user-level thread can run (it may later trap and block in the kernel).
- Key difference between scheduler activations and kernel threads: if an activation's user-level thread is stopped by the kernel, the thread is never directly resumed by the kernel. Instead, a new scheduler activation is created whose prime objective is to notify user space that the thread has been suspended. The user-level thread system then removes the thread's state from the old activation, tells the kernel that the old activation can be reused, and decides which thread to run on the processor.
- Invariant: the number of activations equals the number of processors allocated to the job.
Slide 68: Kernel Vectors to User Space as Activations
- Add-this-processor(processor #): execute a runnable user-level thread.
- Processor-has-been-preempted(preempted activation # and its machine state): return to the ready list the user-level thread that was executing in the context of the preempted scheduler activation.
- Scheduler-activation-has-blocked(blocked activation #): the blocked activation no longer uses its processor.
- Scheduler-activation-has-unblocked(unblocked activation # and its machine state): return to the ready list the user-level thread that was executing in the context of the blocked scheduler activation.
A skeleton of these upcall handlers follows.
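A skeleton, in Python pseudocode, of how a user-level thread system might structure handlers for these four upcalls. The method names paraphrase the slide; the state and dispatch logic are placeholders, not a real API.

```python
from collections import deque

class UserLevelScheduler:
    """Sketch of the user-level side of the four kernel-to-user upcalls."""

    def __init__(self):
        self.ready = deque()                 # runnable user-level threads

    def add_this_processor(self):
        # Execute a runnable user-level thread on the newly granted processor.
        if self.ready:
            self.dispatch(self.ready.popleft())

    def processor_has_been_preempted(self, preempted_thread_state):
        # Return the preempted user-level thread to the ready list.
        self.ready.append(preempted_thread_state)

    def scheduler_activation_has_blocked(self, activation_id):
        # The blocked activation no longer uses its processor; the thread's
        # state is still in the kernel, so there is nothing to requeue yet.
        pass

    def scheduler_activation_has_unblocked(self, unblocked_thread_state):
        # The thread that was blocked can run again.
        self.ready.append(unblocked_thread_state)

    def dispatch(self, thread_state):
        print("running", thread_state)       # stand-in for a real context switch
```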
Slide 69: I/O Happens for a Thread (T1)
[Figure T1: the kernel adds two processors via scheduler activations A and B; the user-level runtime system picks user-program threads (1)-(4) to run on them.]
Slide 70: A's Thread Blocks on an I/O Request (T2)
[Figure T2: the thread running on activation A blocks in the kernel; the kernel creates activation C to notify the user-level runtime system, which runs another ready thread on that processor.]
Slide 71: A's Thread's I/O Completes (T3)
[Figure T3: the I/O for thread (1) completes; the kernel notifies the user-level runtime system on a new activation D, so the thread can be returned to the ready list.]
Slide 72: A's Thread Resumes on Scheduler Activation D (T4)
[Figure T4: the user-level runtime system resumes thread (1) in the context of activation D.]
Slide 73: User-Level Events Notifying the Kernel
- Add-more-processors(additional # of processors needed): allocate more processors to this address space and start them running scheduler activations.
- This-processor-is-idle(): preempt this processor if another address space needs it.
- The kernel's processor allocator can favor address spaces that use fewer processors (and penalize those that use more).
Slide 74: Thread Operation Latencies (microseconds)
Slide 75: Speedup
Slide 76: Next Lecture
- Concurrency
- References: Fast Mutual Exclusion for Uniprocessors; On Optimistic Methods for Concurrency Control.