Scheduler Activations: Effective Kernel Support for the User-level Management of Parallelism Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska, and Henry M. Levy

Presentation transcript:

Scheduler Activations: Effective Kernel Support for the User-level Management of Parallelism Thomas E. Anderson, Brian N. Bershad, Edward D. Lazowska, and Henry M. Levy Presenter: Yi Qiao

Outline
- Introduction
- User-level threads: advantages and limitations
- Effective kernel support for user-level management of parallelism
- Implementation
- Performance
- Summary

Introduction
- The effectiveness of parallel computing depends largely on the performance and cost of the primitives used to express and control the parallelism within programs
- Shared memory between multiple processes: better suited to a uniprocessor environment
- Threads separate the notion of a sequential execution stream from other aspects such as address spaces and I/O descriptors
  - A significant performance advantage over traditional processes

Problem with Threads
- User-level threads
  - Execute within the context of traditional processes
  - Thread management requires no kernel intervention
  - Flexible, easily customized without kernel modification
  - Each process serves as a virtual processor, but multiprogramming, I/O, and page faults can lead to poor performance or incorrect behavior of user-level threads
- Kernel-level threads
  - Avoid the system integration problem: threads are directly mapped onto physical processors
  - Too heavyweight: an order of magnitude worse than the best-case performance of user-level threads

Goal of the Work
- A kernel interface and a user-level thread package that combine the functionality of kernel threads with the performance and flexibility of user-level threads
  - When no kernel intervention is needed: the same performance as the best user-level threads
  - When the kernel must be involved: mimic a kernel thread management system
    - No idle processors while there is runnable work
    - No high-priority thread waits for a low-priority one
    - A thread trapping into the kernel does not block the others
  - Simple, easy application-specific customization
- Challenge: the necessary control and scheduling information is distributed between the kernel and the application address space

Approach
- Each application is provided with a virtual multiprocessor and controls which of its threads run on those processors
- The OS kernel controls the allocation of processors among address spaces
- The kernel notifies the address-space scheduler of relevant kernel events via scheduler activations
  - A scheduler activation vectors control to the user-level thread scheduler on each kernel event
- The thread system notifies the kernel of the user-level thread events that affect processor allocation decisions
- The user-level thread scheduler
  - Executes user-level threads
  - Makes requests to the kernel
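The paper's kernel interface delivers four kinds of upcalls (add this processor, processor has been preempted, scheduler activation has blocked, scheduler activation has unblocked). The toy C sketch below models these as events applied to per-address-space bookkeeping; the names, struct, and counting logic are illustrative simplifications, not the Topaz implementation.

```c
#include <assert.h>

/* The four upcall points, expressed as an event enum (names illustrative). */
typedef enum {
    SA_ADD_THIS_PROCESSOR,   /* a new processor was assigned: run a ready thread */
    SA_PROCESSOR_PREEMPTED,  /* a processor was taken away: its thread is ready again */
    SA_ACTIVATION_BLOCKED,   /* a thread blocked in the kernel: reuse the processor */
    SA_ACTIVATION_UNBLOCKED  /* a blocked thread can run again */
} sa_event;

/* Bookkeeping kept by the user-level thread system for one address space. */
typedef struct {
    int processors;     /* processors currently assigned by the kernel */
    int ready_threads;  /* runnable user-level threads not currently running */
} thread_system;

/* Each upcall makes the user-level scheduler reconsider its decision. */
void sa_upcall(thread_system *ts, sa_event ev) {
    switch (ev) {
    case SA_ADD_THIS_PROCESSOR:
        ts->processors++;
        if (ts->ready_threads > 0) ts->ready_threads--;  /* start one thread */
        break;
    case SA_PROCESSOR_PREEMPTED:
        ts->processors--;
        ts->ready_threads++;  /* preempted thread goes back on the ready list */
        break;
    case SA_ACTIVATION_BLOCKED:
        /* delivered on a fresh activation: the processor stays with this
           address space, so another ready thread can be started on it */
        if (ts->ready_threads > 0) ts->ready_threads--;
        break;
    case SA_ACTIVATION_UNBLOCKED:
        ts->ready_threads++;  /* the thread is runnable again */
        break;
    }
}
```

Note how the handler never "resumes" a kernel-stopped thread directly; it only adjusts what runs next, which matches the slide's point that scheduling decisions stay at user level.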

User-level Threads: Performance Advantages and Functionality Limitations
- Inherent costs of kernel thread management
  - Accessing thread management operations: kernel trap, parameter copying and checking
  - Cost of generality: a single underlying implementation is used by all applications
- User-level threads improve both performance and flexibility

User-level Threads: Performance Advantages and Functionality Limitations (Cont.)
- Poor integration of user-level threads with the kernel interface
  - Kernel threads are the wrong abstraction for supporting user-level thread systems
    - Kernel threads block, resume, and are preempted without notification to the user level
    - Kernel threads are scheduled obliviously to user-level thread state
  - This causes problems for both uniprogrammed and multiprogrammed systems
    - I/O
    - Page faults

Effective Kernel Support for the User-level Management of Parallelism
- A new kernel interface plus a user-level thread system
  - The functionality of kernel threads
  - The performance and flexibility of user-level threads
- Each user-level thread system is provided with its own virtual multiprocessor: the abstraction of a dedicated physical machine
  - The kernel allocates processors to address spaces and has complete control over that allocation
  - Each user-level thread system has complete control over which threads run on its allocated processors
- The kernel vectors events to the appropriate thread scheduler
  - Changes in the number of processors, I/O, page faults
- The user-level thread system notifies the kernel when needed
  - Only for the subset of user-level operations that may affect processor allocation
- The application programmer writes code exactly as if programming with kernel threads
  - Programmers are provided with a normal Topaz thread interface

Explicit Vectoring of Kernel Events to the User-level Thread Scheduler
- Scheduler activation
  - Each vectored event causes the user-level thread system to reconsider its scheduling decision
- Three roles
  - Serves as a vessel (execution context) for running user-level threads
  - Notifies the user-level thread system of a kernel event
  - Saves the processor context of the activation's current user-level thread when the thread is stopped by the kernel (I/O or processor preemption)
- Similar data structure to a traditional kernel thread

Scheduler Activations (Cont.)
- Distinction between scheduler activations and kernel threads
  - Once an activation's user-level thread is stopped by the kernel, the thread is never directly resumed by the kernel
  - This maintains the invariant that there are always exactly as many running scheduler activations as processors assigned to the address space
- Events are vectored to the user level wherever a scheduling decision needs to be made

Example: I/O Request/Completion
- T1: the kernel allocates two processors, performing two upcalls
- T2: thread 1 blocks in the kernel; the kernel makes another upcall
- T3: the I/O completes; the kernel preempts one processor and performs the upcall
- T4: the upcall takes a thread off the ready list and runs it
- The same mechanism reallocates a processor from one address space to another

Scheduler Activations (Cont.)
- Reallocating a processor from one address space to another (multiprogramming)
  - Stop the old activation and use the processor to make an upcall into the new address space with a new activation
  - A second processor in the old address space is needed for an upcall there, notifying it that two user-level threads have been stopped
- Some minor points
  - If threads have priorities, an additional preemption may be needed
  - The application is free to build any other concurrency model on top of scheduler activations
  - A user-level thread blocked in the kernel may sometimes need to execute further in kernel mode when its I/O completes

Notifying the Kernel of User-level Events
- Only the small subset of user-level events that affect the kernel's processor allocation decision needs to be reported
  - Transition to a state where the address space has more runnable threads than processors
  - Transition to a state where the address space has more processors than runnable threads
- How to keep applications honest?
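These two transitions can be captured by a pure decision function. This is a hypothetical helper for illustration: the real system issues a downcall only at the moment of a state transition, whereas this function merely classifies a given state.

```c
#include <assert.h>

/* The two conditions worth reporting to the kernel (names illustrative). */
typedef enum {
    NOTIFY_NONE,                 /* threads and processors match: say nothing */
    NOTIFY_ADD_MORE_PROCESSORS,  /* more runnable threads than processors */
    NOTIFY_PROCESSOR_IDLE        /* more processors than runnable threads */
} kernel_notification;

/* Classify the current state of one address space. */
kernel_notification notify_kernel(int runnable_threads, int processors) {
    if (runnable_threads > processors)
        return NOTIFY_ADD_MORE_PROCESSORS;
    if (processors > runnable_threads)
        return NOTIFY_PROCESSOR_IDLE;
    return NOTIFY_NONE;
}
```

Keeping the set of reportable events this small is what lets the common-case thread operations stay entirely at user level.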

Critical Sections
- Blocking or preempting a user-level thread inside a critical section can cause
  - Poor performance
  - Deadlock
- Solutions
  - Prevention: requires the kernel to yield control over processor allocation to the user level
  - Recovery: the thread system checks whether the stopped thread was executing in a critical section
    - If so, the thread is continued temporarily via a user-level context switch
    - Then another context switch relinquishes control back to the original upcall
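A minimal sketch of the recovery check, assuming a per-thread flag that is set on entering a critical section and cleared on exit. The struct, field, and function names here are invented for illustration; the paper's actual mechanism avoids even this flag in the common case.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-thread state inspected by the upcall handler (names illustrative). */
typedef struct {
    bool in_critical_section;  /* set on lock acquire, cleared on release */
    bool resumed_temporarily;  /* continued past the upcall to finish the CS */
} uthread;

/* Recovery check: if the stopped thread held a lock, it must be continued
   via a user-level context switch until it leaves the critical section,
   at which point it yields back to the original upcall. */
bool must_resume_temporarily(uthread *t) {
    if (t->in_critical_section) {
        t->resumed_temporarily = true;
        return true;  /* upcall switches to this thread first */
    }
    return false;     /* upcall may schedule freely */
}
```

The design choice here is recovery over prevention: the kernel never has to ask the user level for permission before preempting, so processor allocation stays entirely under kernel control.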

Implementation
- Modifying Topaz
  - Changed the Topaz kernel thread management routines to implement scheduler activations
  - Added explicit allocation of processors to address spaces
- Modifying FastThreads
  - Processes upcalls and provides Topaz with information relevant to processor allocation decisions
- A few hundred lines of code were added to FastThreads, and about 1200 lines to Topaz

Implementation (Cont.)
- Processor allocation policy
  - Processors are divided evenly among the highest-priority address spaces, then evenly among the remainder
  - Processors are time-sliced only if the number available is not an integer multiple of the number of address spaces that want them
  - An address space can still use kernel threads instead of scheduler activations: binary compatibility with existing Topaz applications
- Thread scheduling policy
  - An application can choose any scheduling policy
  - Default: per-processor ready lists served in FIFO order
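The even-division rule reduces to integer division plus a remainder: each address space gets the quotient outright, and only the leftover processors need to be time-sliced. A hypothetical helper, not the Topaz allocator:

```c
#include <assert.h>

/* Result of dividing processors among equal-priority address spaces
   (struct and names are illustrative). */
typedef struct {
    int dedicated;    /* processors each address space holds outright */
    int time_sliced;  /* leftover processors shared by time-slicing */
} allocation;

/* Space-sharing: divide `processors` evenly among `address_spaces`.
   If the division is exact, nothing is time-sliced. */
allocation allocate_evenly(int processors, int address_spaces) {
    allocation a = {0, 0};
    if (address_spaces <= 0)
        return a;  /* nothing to allocate to */
    a.dedicated = processors / address_spaces;
    a.time_sliced = processors % address_spaces;
    return a;
}
```

For example, 6 processors among 2 address spaces gives 3 dedicated processors each and no time-slicing, while 6 among 4 gives 1 each plus 2 time-sliced.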

Implementation (Cont.)
- Performance enhancements
  - Critical sections: need to check whether a preempted user-level thread holds a lock
    - Simple approach: each thread sets a flag when entering a critical section and clears it on exit, which adds overhead and latency to every lock operation
    - Instead: make a copy of every low-level critical section by post-processing the compiler-generated code, and continue the preempted thread at the copy of the critical section
      - No overhead on lock latency in the common case
  - Management of scheduler activations: discarded scheduler activations are cached for later reuse

Performance
- Goal: combine the functionality of kernel threads with the performance and flexibility of user-level threads
- Evaluation questions
  - What is the cost of user-level thread operations (fork, block)?
  - What is the cost of communication between the kernel and the user level?
  - What is the overall effect on application performance?

Performance (Cont.)
- Thread performance
  - The cost of user-level thread operations is close to that of the original FastThreads package
  - The order-of-magnitude advantage over kernel threads is preserved
- Upcall performance
  - Helps determine the "break-even" point at which scheduler activations outperform kernel threads
  - Benchmark: two user-level threads signal and wait through the kernel
  - 2.4 milliseconds, about five times worse than Topaz kernel threads
    - Built as a quick modification to the existing Topaz thread system
    - Written in Modula-2+, which is much slower than assembler
  - A production scheduler activation implementation could be significantly faster

Application Performance
- Compares Topaz kernel threads, original FastThreads, and FastThreads on top of scheduler activations
- Application: an O(N log N) solution to the N-body problem
  - Can be made either compute-bound or I/O-bound
  - The memory available to the application can be controlled
- All tests run on a six-processor CVAX Firefly

Application Performance
- Case 1: the application makes minimal use of kernel services
  - Enough memory, negligible I/O, and no other applications
  - New FastThreads runs as fast as the original FastThreads
  - With 1 processor, all systems perform worse than a sequential implementation
  - With more processors, kernel threads prevent good performance
  - Original and new FastThreads diverge slightly at 4 or 5 processors

Application Performance
- Case 2: kernel involvement is required for I/O
  - New FastThreads performs best
  - As less and less memory is available, all three systems degrade quickly
  - Original FastThreads is the worst: when a user-level thread blocks, its kernel thread also blocks
  - New FastThreads and Topaz threads can overlap I/O with useful computation

Application Performance
- Case 3: multiprogrammed environment
  - Two copies of the N-body application share the six processors
  - The speedup of new FastThreads is within 5% of a uniprogrammed run on 3 processors
  - Original FastThreads and Topaz perform much worse
    - Original FastThreads: physical processors idle waiting for a lock to be released while the lock holder is descheduled
    - Topaz: common thread operations are more expensive
- Limitation of the experiments
  - The limited number of processors rules out large parallel applications or higher multiprogramming levels

Conclusion
- Scheduler activations: a kernel interface combined with a user-level thread package
  - Achieves the performance of user-level threads in the common case with the functionality of kernel threads (correct behavior in the infrequent cases)
- Division of responsibility
  - Kernel
    - Processor allocation
    - Kernel event notification
  - Application address space
    - Thread scheduling
    - Notifying the kernel of the subset of user-level events that affect processor allocation decisions
- Any user-level concurrency model can be supported