User-Level Interprocess Communication for Shared Memory Multiprocessors by Bershad, B.N., Anderson, T.E., Lazowska, E.D., and Levy, H.M.

Introduction: RPC
- RPC helps in implementing distributed applications by eliminating the need to implement a communication mechanism.
- A decomposed system provides the advantages of failure isolation, extensibility, and modularity, so RPC is used even when the call stays on the same machine.

Introduction: RPC Costs
- Stub overhead
- Message buffer overhead (4 copies)
- Access validation
- Message transfer
- Scheduling
- Context switch
- Dispatch

Introduction: LRPC Costs
- Stub overhead
- Message buffer overhead (1 copy)
- Only necessary access validation
- Message transfer
- Only necessary scheduling
- Context switch minimized by domain caching

Introduction: IPC
- Main components (all handled in the kernel):
  - Processor reallocation (process context switch)
  - Data transfer
  - Thread management
- Problems:
  - Processor reallocation is expensive
  - Parallel applications need user-level thread management

URPC
- User-Level Remote Procedure Call for shared memory multiprocessors
- Processor reallocation: minimized
- Data transfer: at user level (in a package called URPC)
- Thread management: at user level (in a package called FastThreads)

User-level components

Processor Reallocation
- Limit the frequency of processor reallocation
- Why:
  - A process context switch costs much more than a thread context switch
  - Invoking the kernel is itself expensive
- Kernel-mediated call path:
  - Client makes a procedure call into the server address space
  - Kernel is invoked and reallocates the processor to the server address space
  - Server finishes the job
  - Kernel is invoked again and reallocates the processor back to the client address space
  - Client resumes its work

Processor Reallocation
- Limit the frequency of processor reallocation
- How: an optimistic reallocation policy, which assumes that
  - the client has other work to do, and
  - the server has, or will soon have, a processor available to handle the call
  - (even on a uniprocessor, processor reallocation can be delayed)
- User-level call path (sketched in code below):
  - Client makes a procedure call into the server address space
  - Client does something else
  - Server finishes the job
  - Client resumes its work
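
To make the user-level call path above concrete, here is a minimal C sketch. The names urpc_msg, urpc_channel, and server_loop are illustrative, not the paper's API, and a pthread stands in for the server's address space. The client posts its call into a channel in shared memory and keeps doing other work instead of trapping into the kernel, while a processor already running in the server's context picks the request up and posts the reply.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    typedef struct { int op; int arg; int result; } urpc_msg;

    typedef struct {
        urpc_msg slot;        /* room for one outstanding call, for simplicity */
        atomic_int state;     /* 0 = empty, 1 = request posted, 2 = reply ready */
    } urpc_channel;

    static urpc_channel chan;

    /* "Server address space": a thread that polls the shared channel. */
    static void *server_loop(void *unused) {
        (void)unused;
        while (atomic_load(&chan.state) != 1)
            ;                                  /* a real server would run other threads here */
        chan.slot.result = chan.slot.arg * 2;  /* perform the requested procedure */
        atomic_store(&chan.state, 2);          /* publish the reply */
        return NULL;
    }

    int main(void) {
        pthread_t server;
        pthread_create(&server, NULL, server_loop, NULL);

        chan.slot.op = 1;
        chan.slot.arg = 21;
        atomic_store(&chan.state, 1);          /* post the call; no kernel trap */

        long other_work = 0;                   /* the client does something else */
        while (atomic_load(&chan.state) != 2)
            other_work++;

        printf("reply = %d, units of other work done while waiting = %ld\n",
               chan.slot.result, other_work);
        pthread_join(server, NULL);
        return 0;
    }

In the paper, the channel is a queue in memory shared pair-wise between the two address spaces, and the busy-wait loops here would instead switch to other ready user-level threads.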

Processor Reallocation: Problems
- Inappropriate situations: single-threaded clients, real-time applications, and high-latency I/O applications
  - Solution: allow the client to force a processor reallocation
- Underpowered address spaces: no processor is available to handle a client's pending request
  - Solution: an idle processor donates itself to the underpowered address space

Processor Reallocation: Problems
- Voluntary return of processors
  - A processor working in the server may never return to the client because it stays busy with requests from other clients
  - Solution: enforce processor reallocation when necessary, e.g., when a high-priority thread is waiting while a low-priority job is running or a processor is idle

Processor Reallocation: LRPC vs. URPC
- LRPC's domain caching looks for an idle processor already in the server context
- URPC's optimistic reallocation assumes a processor will become available in the server context and queues the request to be handled later
- URPC requires two-level scheduling decisions (finding an idle processor and finding an underpowered address space), while LRPC does not

Data Transfer
- Use pair-wise shared memory to avoid copying data through the kernel (see the sketch below)
- Security is unchanged, since data must pass through the stubs before it can be used
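
A minimal sketch of this point, assuming a POSIX mmap-based setup and an illustrative urpc_buffer layout (neither is the paper's actual code): the client stub marshals arguments directly into a buffer visible to both address spaces, so the data is copied once rather than passing through the kernel, and the server stub copies the arguments into private memory before using them so a client cannot change them mid-call.

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>

    typedef struct {
        size_t len;
        char   data[256];                      /* in-line argument area */
    } urpc_buffer;

    /* Map a region that client and server would both have in their address spaces. */
    static urpc_buffer *create_shared_buffer(void) {
        void *p = mmap(NULL, sizeof(urpc_buffer), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : (urpc_buffer *)p;
    }

    /* Client stub: marshal arguments directly into the shared buffer (the one copy). */
    static void client_stub_marshal(urpc_buffer *b, const char *arg, size_t n) {
        if (n > sizeof b->data)
            n = sizeof b->data;
        memcpy(b->data, arg, n);
        b->len = n;
    }

    /* Server stub: copy the arguments into private memory before using them, so a
       client mutating the shared buffer concurrently cannot corrupt the call. */
    static size_t server_stub_unmarshal(const urpc_buffer *b, char *out, size_t cap) {
        size_t n = b->len < cap ? b->len : cap;
        memcpy(out, b->data, n);
        return n;
    }

    int main(void) {
        urpc_buffer *b = create_shared_buffer();
        char out[16];
        if (b == NULL)
            return 1;
        client_stub_marshal(b, "hello", 5);
        return server_stub_unmarshal(b, out, sizeof out) == 5 ? 0 : 1;
    }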

Thread Management
- Arguments:
  - Fine-grained parallel applications need high-performance thread management, which can only be achieved by implementing it at user level
  - Communication and thread management both achieve very good performance when both are implemented at user level

Thread Management
- Kernel features such as time slicing degrade the performance of parallel applications
- Invoking a kernel-level thread management operation requires a kernel trap
- A thread management policy implemented in the kernel is unlikely to be efficient for all parallel applications

Thread Management
- Threads block in order to:
  - synchronize their activities within the same address space, or
  - wait for external events from a different address space
- If communication is implemented at kernel level, blocking results in synchronization at both user level and kernel level (see the sketch below)
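
The sketch below illustrates blocking at user level, with POSIX ucontext standing in for a FastThreads-style user-level scheduler (the names waiter, worker, and reply_ready are illustrative, not the package's API): a thread whose condition is not yet satisfied, for example a URPC reply that has not arrived, simply switches to another ready user-level thread instead of trapping into the kernel.

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, waiter_ctx, worker_ctx;
    static char waiter_stack[64 * 1024], worker_stack[64 * 1024];
    static int reply_ready = 0;                /* condition the waiting thread blocks on */

    static void waiter(void) {
        while (!reply_ready) {                 /* would block: yield at user level instead */
            printf("waiter: reply not ready, switching to another user-level thread\n");
            swapcontext(&waiter_ctx, &worker_ctx);
        }
        printf("waiter: reply arrived\n");
        swapcontext(&waiter_ctx, &main_ctx);
    }

    static void worker(void) {
        printf("worker: producing the reply\n");
        reply_ready = 1;                       /* e.g. the server stub posted a reply */
        swapcontext(&worker_ctx, &waiter_ctx); /* resume the blocked thread */
    }

    int main(void) {
        getcontext(&waiter_ctx);
        waiter_ctx.uc_stack.ss_sp = waiter_stack;
        waiter_ctx.uc_stack.ss_size = sizeof waiter_stack;
        waiter_ctx.uc_link = &main_ctx;
        makecontext(&waiter_ctx, waiter, 0);

        getcontext(&worker_ctx);
        worker_ctx.uc_stack.ss_sp = worker_stack;
        worker_ctx.uc_stack.ss_size = sizeof worker_stack;
        worker_ctx.uc_link = &main_ctx;
        makecontext(&worker_ctx, worker, 0);

        swapcontext(&main_ctx, &waiter_ctx);   /* run the waiting thread first */
        return 0;
    }

The same user-level mechanism serves both cases listed above: waiting on synchronization within the same address space and waiting on a reply from another one.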

URPC

Performance
- Thread management is faster at user level
- Component breakdown

Performance
- Call latency and throughput are at their worst when S = 0

Conclusion
- Move as much functionality as possible out of the kernel and into user level to improve performance
- To achieve good performance on multiprocessors, the system must be designed to support their capabilities