Presented by: SHILPI AGARWAL


User-Level Interprocess Communication for Shared Memory Multiprocessors
Bershad, B. N., Anderson, T. E., Lazowska, E. D., and Levy, H. M.
Presented by: SHILPI AGARWAL

OUTLINE
- Interprocess Communication
  - Its problems
- URPC and its components
  - Processor Reallocation
  - Data Transfer
  - Thread Management
- Performance
  - Latency
  - Throughput
- Conclusion

IPC: INTERPROCESS COMMUNICATION
- Central to the design of operating systems.
- Communication between different address spaces on the same machine.
- Allows system decomposition across address-space boundaries:
  - Failure isolation
  - Extensibility
  - Modularity
- The usability of decomposed address spaces depends on the performance of the communication primitives.

Problems:
- IPC has traditionally been the responsibility of the kernel.
- Switching from one address space to another on the calling processor, running the receiving thread there, and then returning to the calling thread all require kernel intervention.
- Invoking the kernel and reallocating a processor to a different address space are expensive.
- LRPC measurements indicate that 70% of call overhead can be attributed to kernel mediation.
- Performance degrades and complexity grows when user-level threads communicate across address-space boundaries.

Solution: URPC for shared-memory multiprocessors
- The user-level thread package in each address space can cheaply switch to a different thread whenever a caller or callee thread blocks.
- The kernel can thus be eliminated from the path of cross-address-space communication.
- Use shared memory to send messages directly between address spaces.
- Avoid processor reallocation: use a processor already active in the target address space.

URPC:
- A client thread invokes a procedure at the server and blocks, waiting for the reply.
- While it is blocked, another ready thread in the same address space can run on its processor.
- When the reply arrives, the blocked thread can be rescheduled on any processor allocated to its address space.
- On the server side, the call can be executed by a processor already running in the server's address space.
- In LRPC, the blocked thread and the running thread are the same thread, continuing in a different address space; in URPC, another thread from the client's own address space is scheduled on the client's processor.
- Advantage: a context switch costs far less than a processor reallocation.
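The client-side call path described above can be sketched as a single-process simulation in C. All names here (urpc_call, channel_t, the doubling demo "server") are illustrative, not the paper's actual API; a real URPC channel lives in memory shared between two address spaces.

```c
#include <stdbool.h>

#define QCAP 16

/* A one-direction message queue standing in for a shared-memory channel. */
typedef struct { int msgs[QCAP]; int head, tail; } channel_t;

static bool chan_put(channel_t *c, int msg) {
    if ((c->tail + 1) % QCAP == c->head) return false;  /* full */
    c->msgs[c->tail] = msg;
    c->tail = (c->tail + 1) % QCAP;
    return true;
}

static bool chan_get(channel_t *c, int *msg) {
    if (c->head == c->tail) return false;               /* empty */
    *msg = c->msgs[c->head];
    c->head = (c->head + 1) % QCAP;
    return true;
}

/* Client side: enqueue the request, then run other ready threads instead
 * of trapping to the kernel; poll for the reply between switches. */
static int urpc_call(channel_t *req, channel_t *reply, int arg,
                     void (*run_other_thread)(void)) {
    chan_put(req, arg);
    int result;
    while (!chan_get(reply, &result))
        run_other_thread();   /* the context switch stays at user level */
    return result;
}

/* Demo: a stand-in "server" that doubles each request it finds. */
static channel_t demo_req, demo_rep;
static void demo_server(void) {
    int m;
    if (chan_get(&demo_req, &m)) chan_put(&demo_rep, m * 2);
}
static int demo(int arg) {
    return urpc_call(&demo_req, &demo_rep, arg, demo_server);
}
```

In the demo, run_other_thread happens to execute server work; in the real system it would switch to an unrelated ready thread while a server processor drains the channel.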

URPC Division of Responsibilities:
- Processor reallocation
- Thread management
- Data transfer
Only processor reallocation requires the kernel; thread management and data transfer move to user level.

Components of URPC

Processor Reallocation: why should it be avoided?
- Deciding on and transferring a processor between threads in different address spaces requires privileged kernel mode to access protected mapping registers.
- It diminishes the value of the cache and TLB contents built up in the old address space.
- Even at minimal latency, a same-address-space context switch takes about 15 microseconds on the C-VAX, while a cross-address-space processor reallocation takes 55 microseconds (and this does not count the long-term cache/TLB costs).

URPC: Optimistic reallocation policy
Assumptions:
- The client has other work to do.
- The server will soon have a processor with which to service a message.
These assumptions do not hold in all situations:
- Uniprocessors
- Real-time applications
- High-latency I/O operations (which should be initiated early)
- High-priority invocations
URPC therefore allows forced processor reallocation to handle some of these cases.

Advantages over:
- Handoff scheduling, in which a single kernel operation blocks the client and reallocates its processor directly to the server.
- Centralized kernel data structures (thread run queues and message channels), which create a performance bottleneck through lock contention.

When needed, processor reallocation is done via the kernel, e.g., for load balancing:
- An idle processor on the client side can donate itself to an "underpowered" address space.
- The kernel is required to change the donating processor's virtual-memory context to that of the underpowered address space.
- The identity of the donating processor is made known to the receiver.
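The scheduling decision implied by the optimistic policy and kernel-mediated donation above can be sketched as a small decision function: keep running ready threads while there are any, pay the kernel trap to donate the processor only when the address space has no ready work but calls are still outstanding, and otherwise idle. The enum and function names are illustrative, not the paper's.

```c
typedef enum {
    RUN_READY_THREAD,   /* user-level context switch: the cheap, common case */
    DONATE_PROCESSOR,   /* kernel trap: reallocate to an underpowered space  */
    SPIN_IDLE           /* nothing to run, nothing outstanding               */
} action_t;

/* Decide what an idle-ish client processor should do next. */
static action_t next_action(int ready_threads, int outstanding_calls) {
    if (ready_threads > 0)
        return RUN_READY_THREAD;   /* optimistic: assume replies will come */
    if (outstanding_calls > 0)
        return DONATE_PROCESSOR;   /* last resort: help the server along   */
    return SPIN_IDLE;
}
```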

Voluntary Return of Processors
A donated processor should be returned to the client:
- when all outstanding messages from the client have generated replies, or
- when the client itself has become "underpowered".
However, voluntary return of processors cannot be enforced. URPC handles load balancing only between communicating applications; preemptive policies that forcibly reallocate processors from one address space to another are needed to avoid starvation. No global processor allocator is required (the decision can be made by the client itself).

Sample execution
- Client: an editor
- Two servers: a window manager and a file cache manager
- Two client threads: T1 and T2

Data transfer using shared memory
In traditional RPC:
- The kernel copies the data between address spaces.
- Clients and servers can still abuse each other (deny service, fail to release channel locks, provide bogus results); filtering such abuses is left to higher-level protocols, up to the application layer.
In URPC:
- Messages travel over logical channels of pair-wise shared memory.
- Applications access URPC procedures through a stub layer; stubs copy data into and out of the channel, so applications make no direct use of the shared memory.
- Arguments are passed in buffers that are allocated and pair-wise mapped during binding; channels are created and mapped once for every client/server pairing.
- Data queues are monitored by the user-level thread management system.
- Each channel is a bidirectional shared-memory queue guarded by test-and-set locks.
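The test-and-set locking on each channel can be sketched with C11 atomics; the paper's actual Firefly implementation differs, and a real URPC lock would yield to another user-level thread rather than spin. The lock_demo function is a single-threaded sanity check, not part of the mechanism.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A test-and-set spinlock guarding one direction of a shared queue. */
typedef struct { atomic_flag locked; } tas_lock_t;

static void tas_acquire(tas_lock_t *l) {
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;  /* spin; a real implementation would switch to another user thread */
}

static void tas_release(tas_lock_t *l) {
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Demo: acquire, observe that the lock is held, release, observe free. */
static bool lock_demo(void) {
    tas_lock_t l = { ATOMIC_FLAG_INIT };
    tas_acquire(&l);
    /* test_and_set returns the previous value: true means already held */
    bool held = atomic_flag_test_and_set(&l.locked);
    tas_release(&l);
    /* previous value false means the lock was free after release */
    bool free_after = !atomic_flag_test_and_set(&l.locked);
    return held && free_after;
}
```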

Thread Management
- There is a strong interaction between thread management (start/stop) and cross-address-space communication (send/receive).
- This close interaction can be exploited, by implementing both together at user level, to achieve extremely good performance for each.
- Thread management facilities can be provided either in the kernel or at user level, but high performance requires the user level.
- Thread overhead can be classified at three points of reference:
  - Heavyweight: no distinction between a thread and its address space.
  - Middleweight: threads and address spaces are decoupled, but threads are managed by the kernel.
  - Lightweight: threads are managed by user-level libraries, implying two-level scheduling (lightweight threads on top of weightier kernel threads).
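The lightweight case above means the ready queue itself lives in the application's address space, so picking the next thread needs no kernel call. A minimal FIFO ready queue of that kind might look like the following; it is an illustrative sketch (thread IDs stand in for thread control blocks), not the paper's code.

```c
#define MAXT 8

/* A user-level ready queue: plain memory, no system calls involved. */
typedef struct { int ready[MAXT]; int head, tail, len; } runq_t;

static void runq_push(runq_t *q, int tid) {
    if (q->len == MAXT) return;           /* sketch only: drop on overflow */
    q->ready[q->tail] = tid;
    q->tail = (q->tail + 1) % MAXT;
    q->len++;
}

/* Returns the next ready thread ID, or -1 when none is ready
 * (the point at which URPC would consider donating the processor). */
static int runq_pop(runq_t *q) {
    if (q->len == 0) return -1;
    int tid = q->ready[q->head];
    q->head = (q->head + 1) % MAXT;
    q->len--;
    return tid;
}
```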

Performance of URPC

Call Latency and Throughput
- With one client processor, one server processor, and one thread per processor (T = C = S = 1), call latency is 93 microseconds.
- Latency increases when T > C + S, and is proportional to the number of threads per processor.
- C = 1, S = 0 gives the worst performance, since processors must be reallocated frequently.
- For both latency and throughput, C = 2, S = 2 yields the best performance.

Problems with URPC:
- When T = 1, latency is 373 microseconds: every call requires two kernel traps and two processor reallocations. At this point URPC performs worse than LRPC (157 microseconds). Why?
  1. Processor reallocation in URPC is based on that of LRPC.
  2. URPC is integrated with two-level scheduling: it must ask whether there is an idle processor, and whether there is an underpowered address space to which it can be reallocated.
- Because of the synchronous nature of RPC, two processors are tied up for a single computation, with only one active at a time.
- URPC is not ideal for all application types: single-threaded applications, high-latency I/O.

Conclusion
- Performance and flexibility improve when traditional operating system functions are moved out of the kernel.
- URPC defines an appropriate division of responsibility between user level and kernel.
- URPC demonstrates a design specific to multiprocessors, rather than a uniprocessor design that merely runs on multiprocessor hardware.