Dynamic Processor Allocation for Adaptively Parallel Jobs


Dynamic Processor Allocation for Adaptively Parallel Jobs
Kunal Agrawal, Siddhartha Sen

What is the problem?
[kunal@ygg ~]$ ./strassen --nproc 4
[sidsen@ygg ~]$ ./nfib --nproc 32
[bradley@ygg ~]$ ./nfib --nproc 16
Allocate the processors fairly and efficiently.

Why Dynamic Scheduling?
Considers all the jobs in the system.
The programmer doesn't have to specify the number of processors.
Parallelism can change during execution.
[kunal@ygg ~]$ ./strassen --nproc 4
[kunal@ygg ~]$ ./strassen

Allocation vs. Scheduling
[Figure: Jobs 1 … n run on top of the operating system, which allocates processors P1 … Pk among them.]

Terminology
Adaptively parallel jobs: jobs for which the number of processors that can be used without waste varies during execution; the parallelism of such a job is dynamic.
At any given time, each job j has a desire d_j: the maximum number of efficiently usable processors, i.e., the current parallelism of the job.
The allocation a_j is the number of processors allotted to the job.
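
As a concrete reference for the sketches that follow, this per-job state could be captured in a small C record; the names here (job_t, desire, alloc) are illustrative, not the names used in the actual implementation.

    /* Hypothetical per-job bookkeeping used by the sketches below. */
    typedef struct job {
        int id;      /* job identifier                                  */
        int desire;  /* d_j: maximum number of efficiently usable procs */
        int alloc;   /* a_j: number of processors currently allotted    */
    } job_t;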

Terminology
We want to allocate processors to jobs in a way that is:
Fair: whenever a job receives fewer processors than it desires, all other jobs receive at most one more processor than this job received; that is, if a_j < d_j, then a_j + 1 is the maximum allocation of any job.
Efficient: no job receives more processors than it desires, and we use as many processors as possible; that is, a_j ≤ d_j for every job j, and if a_j < d_j for some job j, then there are no free processors.
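
As a sanity check on the two definitions, here is a minimal sketch that tests a given allocation for fairness and efficiency, assuming the hypothetical job_t record from the previous sketch and a machine with nprocs processors; it only illustrates the definitions and is not part of the allocation system itself.

    #include <stdbool.h>

    /* Fair: whenever some job gets fewer processors than it desires,
     * no job gets more than one processor beyond that job's allocation. */
    static bool is_fair(const job_t *jobs, int n) {
        for (int j = 0; j < n; j++) {
            if (jobs[j].alloc >= jobs[j].desire)
                continue;
            for (int i = 0; i < n; i++)
                if (jobs[i].alloc > jobs[j].alloc + 1)
                    return false;
        }
        return true;
    }

    /* Efficient: no job exceeds its desire, and if any job is deprived
     * (alloc < desire), there are no free processors left over. */
    static bool is_efficient(const job_t *jobs, int n, int nprocs) {
        int used = 0;
        bool deprived = false;
        for (int j = 0; j < n; j++) {
            if (jobs[j].alloc > jobs[j].desire)
                return false;
            used += jobs[j].alloc;
            if (jobs[j].alloc < jobs[j].desire)
                deprived = true;
        }
        return !deprived || used == nprocs;
    }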

Overall Goal Design and implement a fair and efficient dynamic processor allocation system for adaptively parallel jobs.

Example: Fair and Efficient Allocation
[Figure: six jobs (Job 1 – Job 6) sharing the machine's processors under a fair and efficient allocation.]

Assumptions
All jobs are Cilk jobs.
Jobs can enter and leave the system at will.
All jobs are mutually trusting, in that they stay within the bounds of their allocations and communicate their desires honestly.
Each job has at least one processor.
Jobs have some amount of time to reach their allocations.

High-Level Sequence of Events
Each of jobs 1 … N interacts with the processor allocation system as follows:
1. Estimate desire.
2. Report the current desire.
3. Recalculate allocations (done by the allocation system).
4. Get the allocation.
5. Adjust the allocation (add/remove processors).
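
A sketch of one round of this protocol from a single job's point of view; estimate_desire, report_desire, get_allocation, and adjust_allocation are hypothetical stubs standing in for steps 1, 2, 4, and 5, while step 3 happens inside the processor allocation system.

    /* One allocation round as seen by a job (steps 1, 2, 4, 5).
     * All four helpers are placeholder stubs for illustration only. */
    static int  estimate_desire(void)    { return 4; }    /* step 1 */
    static void report_desire(int d)     { (void)d; }     /* step 2 */
    static int  get_allocation(void)     { return 2; }    /* step 4 */
    static void adjust_allocation(int a) { (void)a; }     /* step 5 */

    static void allocation_round(void) {
        int desire = estimate_desire();  /* 1. estimate current desire      */
        report_desire(desire);           /* 2. report it to the allocator   */
                                         /* 3. allocator recalculates       */
        int alloc = get_allocation();    /* 4. read back granted allocation */
        adjust_allocation(alloc);        /* 5. add or remove processors     */
    }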

Main Algorithms
(1, 2) Dynamically estimate the current desire of a job:
  Steal rate (Bin Song)
  Number of threads in the ready deques
(3) Dynamically determine the allotment for each job such that the resulting allocation is fair and efficient:
  SRLBA algorithm (Bin Song)
  Global allocation algorithm
(4, 5) Converge to the granted allocation by increasing/decreasing the number of processors in use:
  While work-stealing?
  Periodically, by a background thread?

Desire Estimation
(1) Estimate the processor desire d_j: add up the number of threads in the ready deques of each processor and divide by a constant.
(2) Report the desire to the processor allocation system.
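
A minimal sketch of this estimation rule, assuming a hypothetical array holding the ready-deque length of each processor the job currently runs on; the divisor K is an illustrative constant, not a value from the talk.

    #define K 2   /* illustrative smoothing constant */

    /* Desire = (total threads across all ready deques) / K, clamped so
     * that a running job always desires at least one processor. */
    static int estimate_desire_from_deques(const int *deque_len, int nprocs) {
        int total = 0;
        for (int p = 0; p < nprocs; p++)
            total += deque_len[p];
        int desire = total / K;
        return desire > 0 ? desire : 1;
    }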

Adjusting the Allocation
(4) Get the new allocation a_new.
(5) Adjust the allocation:
If a_new < a_old, remove (a_old – a_new) processors.
If a_new > a_old, add (a_new – a_old) processors.
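
Sketched in C, with add_processors and remove_processors as hypothetical stand-ins for whatever mechanism (work-stealing hooks or a background thread) actually parks or wakes workers:

    static void remove_processors(int n) { (void)n; /* park n workers (stub) */ }
    static void add_processors(int n)    { (void)n; /* wake n workers (stub) */ }

    /* Move from the old allocation to the newly granted one. */
    static void adjust_allocation_delta(int a_old, int a_new) {
        if (a_new < a_old)
            remove_processors(a_old - a_new);
        else if (a_new > a_old)
            add_processors(a_new - a_old);
    }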

Implementation Details
When to add up the number of threads in the ready deques:
  While work-stealing
  Periodically, by a background thread
Other tasks: removing processors, adding processors.
(The slide marks some of these options with verdicts such as "Too late!", "Complicated", and "Bad idea".)

Processor Allocation: Start-up
[Animated figure: Jobs 1–4 arrive and report their desires; each receives an allocation, and the free-processor count drops from 16 as the jobs start up.]

Processor Allocation
Job 2 decreases its desire: no reallocation!
[Figure: the remaining jobs' Desire/Alloc values and the free-processor count are unchanged.]

Processor Allocation
Job 1 decreases its desire: reallocate!
[Figure: the processors released by Job 1 are redistributed to the other jobs; per-job Desire/Alloc values and the free-processor count are updated.]

Processor Allocation
Job 2 increases its desire: no reallocation!
[Figure: per-job Desire/Alloc values and the free-processor count are unchanged.]

Processor Allocation
Job 1 increases its desire: reallocate!
[Figure: processors are taken from other jobs and given to Job 1; per-job Desire/Alloc values and the free-processor count are updated.]

Implementation Details
min_depr_alloc: 4, max_alloc: 5
Example state: Job 1 (Desire 6, Alloc 4); Job 2 (Desire 2, Alloc 2); Job 3 (Desire 7, Alloc 5).
When the desire of job j decreases: if (new_desire < alloc), take processors from j and give them to jobs having min_depr_alloc.

Processor Allocation
Job 1 decreases its desire.
[Figure: the running min_depr_alloc (mda) and max_alloc (ma) values as the processors released by Job 1 are handed to jobs with the minimum deprived allocation.]

Implementation
min_depr_alloc: 4, max_alloc: 5
Example state: Job 1 (Desire 6, Alloc 4); Job 2 (Desire 2, Alloc 2); Job 3 (Desire 7, Alloc 5).
When the desire of job j decreases: if (new_desire < alloc), take processors from j and give them to jobs having min_depr_alloc.
When the desire of job j increases: if (alloc < min_depr_alloc), take processors from jobs having max_alloc and give them to j until j reaches min_depr_alloc or new_desire.
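
A simplified sketch of these two rules over the hypothetical job_t array introduced earlier. Processors move one at a time: a processor released by a shrinking job goes to a deprived job with the smallest allocation, and a growing job pulls from a job with the largest allocation while fairness allows. The real system maintains min_depr_alloc and max_alloc incrementally; recomputing them on every transfer, as done here, is purely for clarity.

    /* Index of a deprived job (alloc < desire) with the smallest allocation,
     * or -1 if no job is deprived. */
    static int min_deprived(const job_t *jobs, int n) {
        int best = -1;
        for (int i = 0; i < n; i++)
            if (jobs[i].alloc < jobs[i].desire &&
                (best < 0 || jobs[i].alloc < jobs[best].alloc))
                best = i;
        return best;
    }

    /* Index of a job (other than skip) with the largest allocation. */
    static int max_allocated(const job_t *jobs, int n, int skip) {
        int best = -1;
        for (int i = 0; i < n; i++)
            if (i != skip && (best < 0 || jobs[i].alloc > jobs[best].alloc))
                best = i;
        return best;
    }

    /* Job j lowered its desire: hand surplus processors to deprived jobs. */
    static void desire_decreased(job_t *jobs, int n, int j, int new_desire) {
        jobs[j].desire = new_desire;
        while (jobs[j].alloc > jobs[j].desire) {
            int k = min_deprived(jobs, n);
            jobs[j].alloc--;
            if (k >= 0)
                jobs[k].alloc++;  /* otherwise the processor returns to the
                                     free pool (not modeled in this sketch) */
        }
    }

    /* Job j raised its desire: pull from the most-allocated jobs while j
     * still has strictly fewer processors than they do (keeps fairness). */
    static void desire_increased(job_t *jobs, int n, int j, int new_desire) {
        jobs[j].desire = new_desire;
        for (;;) {
            int k = max_allocated(jobs, n, j);
            if (k < 0 || jobs[j].alloc >= jobs[j].desire ||
                jobs[k].alloc <= jobs[j].alloc + 1)
                break;
            jobs[k].alloc--;
            jobs[j].alloc++;
        }
    }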

Processor Allocation
Job 1 increases its desire.
[Figure: the running min_depr_alloc (mda) and max_alloc (ma) values as processors are taken from the jobs with the maximum allocation and given to Job 1.]

Experiments
Correctness: does it work?
Effectiveness: are there cases where it is better than static allocation?
Responsiveness: how long does it take the jobs to reach their allocations?

Conclusions The desire estimation and processor allocation algorithms are simple and easy to implement. We’ll see how well they do in practice once we’ve performed the experiments. There are many ways of improving the algorithms and in many cases it is not clear what we should do.

Job Tasks (Extensions)
Incorporate heuristics on steal rate (Bin Song's idea).
Remove processors in the background thread, not while work-stealing.
Need a mechanism for putting processors with pending work to sleep.
When adding processors, wake up processors with pending work first.

Processor Allocation System (Extensions)
Use a sorted data structure for job entries:
  Sort by desires
  Sort by allocations
Group jobs:
  Desires satisfied (a_j = d_j)
  Minimum deprived allocation (a_j = min_depr_alloc)
  Maximum allocation (a_j = max_alloc)
Need fast inserts/deletes and a fast sequential walk.

Processor Allocation System (Extensions)
Rethink the definitions of fairness and efficiency:
  Incorporate histories of processor usage for each job
  Implement a mechanism for assigning different priorities to users or jobs
Move the processor allocation system into the kernel (jobs still report desires, since they know best).
How to group the jobs?
  Make classes of jobs (Cilk, Emacs, etc.)
  Group by user (sidsen, kunal, etc.)

Questions?