Lightweight Remote Procedure Call (Bershad et al.)
Andy Jost
CS 533, Winter 2012
Introduction
Preliminary definitions
– Monolithic kernels and microkernels
– Capability systems
– Remote procedure calls (RPC)
Motivation
– Analysis of the common case
– Performance of RPC
– Sources of overhead
Lightweight RPC (LRPC)
Performance
OS Kernel Paradigms
Monolithic OS
– All (or nearly all) services built into the kernel
– One level of protection, but typically no internal firewalls
– E.g., BSD UNIX (millions of LOC)
– The hardware is exposed to a great deal of complex software
– Hard to debug and extend
Microkernel (or “small-kernel”) OS
– The kernel provides only the minimum services necessary to support independent application programs (address space management, thread management, IPC)
– Additional services are provided by user-space daemons
– Certain daemons are granted special permissions (e.g., hardware access) by the kernel
– Service requests use IPC
[Figure: monolithic vs. microkernel OS structure, from http://en.wikipedia.org/wiki/File:OS-structure.svg]
Separate address spaces are used to establish protection domains:
1. No cross-domain read/write
2. The kernel mediates IPC
3. RPC can be used to give a procedural interface to IPC
Capability Systems
A capability system is a security model
– A security model specifies and enforces a security policy
Provides a fine-grained protection model
A capability is a communicable, unforgeable token of authority representing an access right
In a capability system, the exchange of capabilities among mutually untrusting entities is used to manage privileged access throughout the system
One possibility (access control list):
– write("/etc/passwd"): on whose authority do we write /etc/passwd? The kernel must consult an access control list
Capability system:
– fd = open("/etc/passwd", O_RDWR); write(fd): the open file descriptor proves that write access was previously granted
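To make the contrast concrete, here is a minimal C sketch of the file-descriptor case above (my illustration, not from the paper): the ACL-style check happens once, inside open(), and the returned descriptor is then the unforgeable, kernel-protected token that authorizes later writes.

#include <fcntl.h>
#include <unistd.h>

/* Sketch: a POSIX file descriptor acting as a capability.
 * Access rights are checked once, inside open(); afterwards,
 * mere possession of fd proves write authority was granted,
 * and no access control list is consulted again. */
int main(void) {
    int fd = open("/etc/passwd", O_RDWR);  /* the one ACL check */
    if (fd < 0)
        return 1;                          /* authority never granted */
    write(fd, "", 0);                      /* zero-byte write: exercises the
                                              capability without changing the file */
    close(fd);
    return 0;
}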
Remote Procedure Call
An IPC mechanism that allows a program to invoke a subroutine in another address space
– The receiver might reside on the same physical system or across a network
Provides a large-grained protection model
The call semantics make it appear as though an ordinary procedure call was performed
– Stubs interface to a runtime environment, which handles data marshalling; the OS handles low-level IPC (see the sketch below)
– Protection domain boundaries are hidden by stubs
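As a rough, runnable illustration (a toy under my own assumptions, not a real RPC runtime), the C program below plays out a single-machine RPC: a client stub marshals two integers into a message, a socketpair stands in for the kernel-mediated IPC channel, and a forked server process unmarshals the message, runs the procedure, and sends the result back.

#include <stdio.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Client-side stub: marshal the arguments into a message, let the
 * kernel carry it across the domain boundary, unmarshal the reply.
 * To the caller this looks like an ordinary procedure call. */
static int add_stub(int sock, int a, int b) {
    int msg[2] = { a, b }, reply;
    write(sock, msg, sizeof msg);       /* sending path: copy into the kernel */
    read(sock, &reply, sizeof reply);   /* return path: copy back out */
    return reply;
}

int main(void) {
    int sv[2];
    socketpair(AF_UNIX, SOCK_STREAM, 0, sv);  /* the "transport layer" */

    if (fork() == 0) {                  /* server in its own protection domain */
        int msg[2], result;
        read(sv[1], msg, sizeof msg);   /* dispatch: receive and interpret */
        result = msg[0] + msg[1];       /* the actual procedure */
        write(sv[1], &result, sizeof result);
        _exit(0);
    }

    printf("Add(3, 4) = %d\n", add_stub(sv[0], 3, 4));
    wait(NULL);
    return 0;
}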
Steps in a Traditional RPC
[Figure: on the sending path, a call descends from the client application through the client stub and client runtime library into the client kernel, crosses the transport layer, and ascends through the server kernel, server runtime library, and server stub to the server application; the return path retraces these steps. In a single-system RPC the kernel transport layer is potentially shared.]
The Use of RPC in Microkernel Systems
Small-kernel systems can and do use RPC to borrow its large-grained protection model
– Separate components are placed in disjoint address spaces (protection domains)
– Communication between components is mediated by RPC, using messages
– Advantages include modularity, design simplification, failure isolation, and transparency (of network services)
But this approach simultaneously borrows the control transfer facilities of RPC
– These are not optimized for same-machine control transfer
– This leads to an unnecessary loss of efficiency
The Use of RPC Systems (I)
Bershad argues that the common case for RPC:
– is cross-domain (not cross-machine)
– involves relatively simple parameters
– can be optimized
1. Frequency of Cross-Machine Activity
Frequency of remote activity:
Operating System   Percentage of operations that cross machine boundaries
V                  3.0
Taos               5.3
Sun UNIX+NFS       0.6
The Use of RPC Systems (II)
2. Parameter Size and Complexity
– 1,487,105 cross-domain procedure calls were observed during one four-day period
– 95% were to 10 procedures; 75% were to 3 procedures
– None of them involved complex arguments
– Furthermore, most RPCs involve a relatively small amount of data transfer
The Use of RPC Systems (III)
3. The Performance of Cross-Domain RPC
– The theoretical minimum time for a null cross-domain operation includes time for two procedure calls, two traps, and two virtual memory context switches
– The cross-domain performance, measured across six systems using the Null RPC, varies from over 300% to over 800% of the theoretical minimum
Sources of Overhead in Cross-Domain RPC
Stub overhead: stubs are general enough for cross-machine RPC, but inefficient for the common case of local calls
Message buffer overhead: a message may be copied four times (client to kernel, kernel to server, server to kernel, kernel to client)
Access validation: the kernel must validate the message sender on call and again on return
Message transfer: messages are enqueued by the sender and dequeued by the receiver
Scheduling: separate, concrete threads run in the client and server domains
Context switch: one occurs in going from client to server, and another on the way back
Dispatch: the server must receive and interpret the message
Lightweight RPC (LRPC)
LRPC aims to improve the performance of cross-domain communication relative to RPC
The execution model is borrowed from a protected procedure call
– Control transfer proceeds by way of a kernel trap; the kernel validates the call and establishes a linkage
– The client provides an argument stack and its own concrete thread of execution
The programming semantics and large-grained protection model are borrowed from RPC
– Servers execute in private protection domains
– Each server exports a specific set of interfaces to which clients may bind
– By allowing a binding, the server authorizes a client to access its procedures
LRPC High-Level Design
[Figure: the client's thread follows the sending path through the kernel into the server domain and returns along the return path. The A-stack is mapped into both virtual address spaces and backed by the same physical memory, while the E-stack is private to the server.]
Implementation Details
Execution of the server procedure is made by way of a kernel trap
The client provides the server with an argument stack and its own concrete thread of execution
The argument stacks (A-stacks) are shared between client and server; the execution stacks (E-stacks) belong exclusively to the server domain
– A-stacks and E-stacks are associated at call time
– Each A-stack queue is guarded by a single lock
The client must bind to an LRPC interface before using it; binding:
– establishes shared segments between client and server
– allocates bookkeeping structures in the kernel
– returns a non-forgeable binding object to the client, which serves as the key for accessing the server (recall capability systems)
On multiprocessors, domains are cached on idle processors to reduce latency
A user-space sketch of the shared A-stack idea follows.
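The toy C program below is only a loose analogy: real LRPC migrates the client's own thread into the server domain through a kernel trap, which portable user-level code cannot do, so this version keeps a separate server process and uses a pipe byte to stand in for the trap. What it does show is the key data-path idea: arguments and results live in one shared region, so nothing is copied into messages.

#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* A toy "A-stack": one shared region holds arguments and results,
 * visible in both the client and server address spaces. */
struct astack { int arg1, arg2, result; };

int main(void) {
    struct astack *as = mmap(NULL, sizeof *as, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    int call[2], ret[2];
    pipe(call); pipe(ret);              /* pipe bytes stand in for the traps */

    if (fork() == 0) {                  /* "server domain" */
        char t;
        read(call[0], &t, 1);           /* wait for the call "trap" */
        as->result = as->arg1 + as->arg2;  /* runs on its own private stack,
                                              as with an E-stack */
        write(ret[1], "r", 1);          /* return "trap" */
        _exit(0);
    }

    as->arg1 = 3;                       /* client pushes arguments directly */
    as->arg2 = 4;                       /*   onto the shared A-stack */
    write(call[1], "c", 1);             /* enter the server via the "kernel" */
    char t;
    read(ret[0], &t, 1);                /* block until the call returns */
    printf("Add(3, 4) = %d\n", as->result);
    wait(NULL);
    return 0;
}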
Performance
The measurements below were taken across 100,000 cross-domain calls in a tight loop
LRPC/MP uses the domain-caching optimization for multiprocessors; LRPC performs a context switch on each call
Even without that optimization, LRPC completes the Null call in roughly a third of the time Taos RPC takes (157 vs. 464 microseconds)
Table IV. LRPC performance on four tests (in microseconds)
Test       Description                                               LRPC/MP   LRPC   Taos
Null       The null cross-domain call                                125       157    464
Add        Takes two 4-byte arguments, returns one 4-byte argument   130       164    480
BigIn      Takes one 200-byte argument                               173       192    539
BigInOut   Takes and returns one 200-byte argument                   219       227    636
Discussion Items
When the client thread is executing an LRPC, does the scheduler know it has changed context?
Who is the parent of the server process? What is its main thread doing?