1 Reliable Distributed Systems RPC and Client-Server Computing Chapter 4

2 Chapter 3: Defining Reliability. High assurance: guarantees that the system will remain continuously available despite minor disruptions; rapid restart of failed components or rollover to a healthy environment, with service continuously maintained for applications.

3 Chapter 3: Defining Reliability (continued). Scalability and performance: replication of data, soft state, and services, while maintaining consistency among the replicas. Security: secure communications (e.g., HTTPS), secure servers, and application security policies with compliance and enforcement.

4 Remote Procedure Call: basic concepts; implementation issues and the usual optimizations; where are the costs?; Firefly RPC, Lightweight RPC, Winsock Direct, and VIA; reliability and consistency; the multithreading debate.

5 A brief history of RPC. Introduced by Birrell and Nelson in 1984. Pre-RPC, most applications were built directly over the Internet primitives. Their idea: mask the distributed system behind a “transparent” abstraction that looks like a normal procedure call, hides all aspects of the distributed interaction, and supports an easy programming model. Today, RPC is at the core of many distributed systems.

6 More history. The early focus was on RPC “environments”, culminating in DCE (the Distributed Computing Environment), which standardized many aspects of RPC. Then the emphasis shifted to performance, and many systems improved by a factor of 10 to 20. Today, RPC is often used from object-oriented systems employing the CORBA or COM standards, and reliability issues are more evident than in the past.

7-11 The basic RPC protocol (a client/server timeline, built up across five slides): the server registers with the name service; the client “binds” to the server; the client prepares and sends a request; the server receives the request, invokes the handler, and sends a reply; the client unpacks the reply.
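
To make the exchange concrete, here is a minimal sketch of the client side of this protocol in C, assuming UDP as the transport; the server address (127.0.0.1:9000) and the text form of the request are illustrative stand-ins for a real binding and a real marshalled message.

```c
/* Minimal sketch of the client side of a request/reply RPC exchange over UDP.
 * The server address, port, and message contents are illustrative assumptions. */
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void) {
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    /* "Bind" to the server: here just a hard-coded address standing in
     * for a name-service lookup. */
    struct sockaddr_in srv = {0};
    srv.sin_family = AF_INET;
    srv.sin_port = htons(9000);            /* hypothetical server port */
    inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);

    /* Prepare and send the request (a marshalled procedure id plus arguments). */
    char request[] = "proc=add a=2 b=3";
    sendto(s, request, sizeof request, 0, (struct sockaddr *)&srv, sizeof srv);

    /* Block until the reply arrives, then unpack it. */
    char reply[512];
    ssize_t n = recvfrom(s, reply, sizeof reply - 1, 0, NULL, NULL);
    if (n >= 0) {
        reply[n] = '\0';
        printf("reply: %s\n", reply);
    }
    close(s);
    return 0;
}
```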

12 Compilation stage. The server defines and “exports” a header file giving the interfaces it supports and the arguments expected, written in an “interface definition language” (IDL). The client includes this information and invokes server procedures through “stubs”, which provide an interface identical to the server version, are responsible for building the request messages and interpreting the reply messages, pass arguments by value (never by reference), and may limit the total size of the arguments in bytes.
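
As an illustration of the stub idea, here is a hand-written sketch of what a generated client stub for a hypothetical procedure int add(int a, int b) might look like; the procedure number, message layout, and simulated transport are all assumptions made so the example is self-contained.

```c
/* Illustrative client stub for a hypothetical procedure int add(int a, int b).
 * A real stub would be generated from an IDL description; this sketch only
 * shows the shape: identical interface, arguments passed by value, marshalled
 * into a request message, reply unmarshalled into the return value. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PROC_ADD 1  /* hypothetical procedure number from the interface definition */

/* Stand-in for the real transport: it simulates the server locally so the
 * sketch compiles and runs on its own. */
static size_t rpc_transport_call(const void *req, size_t req_len,
                                 void *reply, size_t reply_cap)
{
    (void)req_len;
    uint32_t proc, a, b, result;
    memcpy(&proc, req, 4);
    memcpy(&a, (const uint8_t *)req + 4, 4);
    memcpy(&b, (const uint8_t *)req + 8, 4);
    result = (proc == PROC_ADD) ? a + b : 0;
    memcpy(reply, &result, reply_cap < 4 ? reply_cap : 4);
    return 4;
}

/* Client stub: same signature the server exports. */
int add(int a, int b)
{
    uint8_t req[12], reply[4];
    uint32_t proc = PROC_ADD, ua = (uint32_t)a, ub = (uint32_t)b;

    memcpy(req,     &proc, 4);   /* which procedure to invoke     */
    memcpy(req + 4, &ua,   4);   /* arguments copied by value ... */
    memcpy(req + 8, &ub,   4);   /* ... never by reference        */

    rpc_transport_call(req, sizeof req, reply, sizeof reply);

    uint32_t result;
    memcpy(&result, reply, 4);   /* unpack the reply message      */
    return (int)result;
}

int main(void)
{
    printf("add(2, 3) = %d\n", add(2, 3));
    return 0;
}
```

A real stub would be produced by the IDL compiler and would hand the request to the RPC runtime instead of the simulated transport shown here.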

13 Binding stage. Occurs when the client and server programs first start execution. The server registers its network address with a name directory, perhaps along with other information; the client scans the directory to find an appropriate server. Depending on how the RPC protocol is implemented, binding may establish a “connection” to the server, but this is not mandatory.
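
The sketch below illustrates the binding step, using DNS via getaddrinfo as a stand-in for the name directory; the hostname rpc.example.com and port 9000 are hypothetical.

```c
/* Sketch of the binding stage, with DNS (via getaddrinfo) standing in for the
 * name directory; the server name and port are illustrative assumptions. */
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>

int bind_to_server(const char *server_name, struct sockaddr_storage *out)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM;  /* datagram transport, as in the UDP sketch */

    /* "Scan the directory" for the server's registered network address. */
    if (getaddrinfo(server_name, "9000", &hints, &res) != 0 || res == NULL)
        return -1;

    memcpy(out, res->ai_addr, res->ai_addrlen);  /* remember the binding */
    freeaddrinfo(res);
    return 0;                                    /* no connection made yet */
}

int main(void)
{
    struct sockaddr_storage srv;
    if (bind_to_server("rpc.example.com", &srv) == 0)
        puts("bound: server address cached for later requests");
    else
        puts("bind failed: server not found in directory");
    return 0;
}
```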

14 Data in messages. We say that data is “marshalled” into a message and “demarshalled” from it. The representation needs to deal with byte-ordering issues (big-endian versus little-endian), strings (some CPUs require padding), alignment, etc. The goal is to be as fast as possible on the most common architectures, yet the scheme must also be very general.
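
A minimal marshalling example for a small fixed-layout record, using network (big-endian) byte order; the record fields are an illustrative assumption.

```c
/* Sketch of marshalling a small record into network (big-endian) byte order.
 * The record layout (a 32-bit id and a 16-bit count) is illustrative. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl/htons/ntohl/ntohs */

struct record { uint32_t id; uint16_t count; };

/* Marshal: fixed layout, explicit byte order, no struct padding copied. */
static size_t marshal(const struct record *r, uint8_t buf[6])
{
    uint32_t id = htonl(r->id);
    uint16_t count = htons(r->count);
    memcpy(buf,     &id,    4);
    memcpy(buf + 4, &count, 2);
    return 6;
}

/* Demarshal on the receiving side, whatever its native endianness. */
static void demarshal(const uint8_t buf[6], struct record *r)
{
    uint32_t id; uint16_t count;
    memcpy(&id,    buf,     4);
    memcpy(&count, buf + 4, 2);
    r->id = ntohl(id);
    r->count = ntohs(count);
}

int main(void)
{
    struct record in = { 42, 7 }, out;
    uint8_t wire[6];
    marshal(&in, wire);
    demarshal(wire, &out);
    printf("id=%u count=%u\n", out.id, out.count);
    return 0;
}
```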

15 Request marshalling. The client builds a message containing the arguments and indicates which procedure to invoke; due to the need for generality, data representation is a potentially costly issue! It then performs a send I/O operation to transmit the message, performs a receive I/O operation to accept the reply, unpacks the result from the reply message, and returns it to the client program.

16 Costs in the basic protocol? Allocation and marshalling of data into the message (costs can be reduced if you are certain the client and server have identical data representations); two system calls, one to send and one to receive, and hence context switching; much copying all through the O/S: application to UDP, UDP to IP, IP to the Ethernet interface, and back up to the application on the other side.

17 Schroeder and Burrows. Studied RPC performance in the O/S kernel and suggested a series of major optimizations, resulting in performance improvements of about 10-fold on the DEC SRC Firefly workstation (from 10ms to below 1ms).

18 Typical optimizations? Compile the stub “inline” to put arguments directly into the message; keep two versions of the stub, and if (at bind time) the sender and destination are found to have the same data representation, use the host-specific representation; use a special “send, then receive” system call (requires an O/S extension); optimize the O/S kernel path itself to eliminate copying, treating RPC as the most important task the kernel will do.
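
The second optimization, choosing between a host-specific and a canonical representation at bind time, might look roughly like this sketch; the representation tag and the bind-time check are assumptions made for illustration.

```c
/* Sketch of the "two stubs" idea: at bind time the peers compare data
 * representations; if they match, values are copied verbatim, otherwise the
 * slower converting path is used. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>

enum rep { REP_BIG_ENDIAN, REP_LITTLE_ENDIAN };

static enum rep local_rep(void)
{
    uint16_t probe = 1;
    return *(uint8_t *)&probe ? REP_LITTLE_ENDIAN : REP_BIG_ENDIAN;
}

static int same_rep;  /* decided once at bind time and remembered with the binding */

static size_t marshal_u32(uint32_t v, uint8_t *buf)
{
    if (same_rep) {
        memcpy(buf, &v, 4);          /* fast path: host-specific representation */
    } else {
        uint32_t net = htonl(v);     /* general path: canonical byte order */
        memcpy(buf, &net, 4);
    }
    return 4;
}

int main(void)
{
    enum rep peer = REP_BIG_ENDIAN;  /* would be learned during binding */
    same_rep = (local_rep() == peer);

    uint8_t buf[4];
    marshal_u32(0x12345678u, buf);
    printf("fast path used: %s\n", same_rep ? "yes" : "no");
    return 0;
}
```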

19 RPC “semantics”. At most once: the request is processed 0 or 1 times. Exactly once: the request is always processed exactly 1 time. At least once: the request is processed 1 or more times. But exactly once is impossible, because we can’t distinguish packet loss from true failures! In both of the achievable cases, the RPC protocol simply times out.
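
A server can approximate at-most-once semantics by remembering which request identifiers it has already executed and replaying the cached reply for retransmissions; the sketch below shows the idea, with the table size, identifier scheme, and sample procedure chosen purely for illustration.

```c
/* Sketch of at-most-once handling on the server: requests carry an id, and a
 * retransmission of an already-executed id gets the cached reply rather than
 * being executed again. */
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 128

struct done_entry { uint64_t req_id; int32_t reply; int valid; };
static struct done_entry done[TABLE_SIZE];

static int32_t execute(int32_t arg) { return arg * 2; }  /* the "procedure" */

int32_t handle_request(uint64_t req_id, int32_t arg)
{
    struct done_entry *e = &done[req_id % TABLE_SIZE];
    if (e->valid && e->req_id == req_id)
        return e->reply;             /* duplicate: replay cached reply, do not re-execute */

    int32_t reply = execute(arg);    /* first arrival: run the handler */
    e->req_id = req_id;
    e->reply = reply;
    e->valid = 1;
    return reply;
}

int main(void)
{
    printf("%d\n", handle_request(7, 21));  /* executes */
    printf("%d\n", handle_request(7, 21));  /* retransmission: replayed, not re-executed */
    return 0;
}
```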

20 RPC versus local procedure call. There are restrictions on argument sizes and types, and new error cases: the bind operation failed; the request timed out; an argument became “too large” (which can occur if, e.g., a table grows). Costs may be very high... so RPC is actually not very transparent!

21 RPC costs in the case of a local destination process. Often, the destination is right on the caller’s machine! The caller builds the message and issues a send system call, blocks, context switch; the message is copied into the kernel, then out to the destination; the destination is blocked... wake it up, context switch; the destination computes the result; the entire sequence is repeated in the reverse direction. If the scheduler is itself a process, there are 6 context switches!

22 Important optimizations: LRPC. Lightweight RPC (LRPC), for the case where the sender and destination are on the same machine (Bershad et al.). Uses memory mapping to pass data; reuses the same kernel thread to reduce context-switching costs (the user thread suspends and the server wakes up on the same kernel thread or “stack”); a single system call: send_rcv or rcv_send.

23 LRPC performance impact. On the same platform, LRPC offers about a 10-fold improvement over a hand-optimized RPC implementation: it does two memory remappings and no context switch. It runs about 50 times faster than the standard RPC from the same vendor (at the time of the research). The semantics are also stronger: with a local destination it is easy to ensure exactly-once execution.

24 Active messages. A concept developed by Culler and von Eicken for parallel machines. Assumes the sender knows all about the destination, including its memory layout and data formats; the message header gives the address of the handler to run on arrival; applications copy data directly into and out of the network interface.
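
A conceptual sketch of the dispatch step: the arriving message names its handler and the receiver invokes it immediately. Real active messages carry the handler’s address and run in the network polling or interrupt path; this sketch substitutes a handler table and an index so it stays self-contained.

```c
/* Conceptual sketch of active-message dispatch: no buffering or scheduling,
 * the handler named in the header runs immediately on arrival. The handler
 * table and message layout are illustrative assumptions. */
#include <stdint.h>
#include <stdio.h>

struct am_msg { uint32_t handler_index; uint32_t payload; };

typedef void (*am_handler)(uint32_t payload);

static void incr_handler(uint32_t payload) { printf("incr -> %u\n", payload + 1); }
static void echo_handler(uint32_t payload) { printf("echo -> %u\n", payload); }

/* Both sides agree on this table (the "same program" runs on all nodes). */
static am_handler handlers[] = { incr_handler, echo_handler };

static void on_arrival(const struct am_msg *m)
{
    handlers[m->handler_index](m->payload);  /* run the named handler directly */
}

int main(void)
{
    struct am_msg m = { 0, 41 };  /* as if it had just arrived from the network */
    on_arrival(&m);
    return 0;
}
```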

25 Performance impact? Even with optimizations, standard RPC requires about 1000 instructions to send a null message; active messages need as few as 6 instructions, with one-way latency as low as 35 microseconds. But the model works only if the “same program” runs on all nodes and the application has direct control over the communication hardware.

26 U-Net. Low-latency, high-performance communication for ATM on normal UNIX machines, later extended to Fast Ethernet. Developed by von Eicken, Vogels and others at Cornell (1995). The idea is that the application and the ATM controller share a memory-mapped region; I/O is done by adding messages to a queue or reading them from a queue. Latency is reduced 50-fold relative to UNIX, and throughput is 10-fold better for small messages!

27 U-Net concepts. Normally, data flows through the O/S to the driver and is then handed to the device controller; in U-Net the device controller sees the data directly in the shared memory region. The normal architecture derives its protection from trust in the kernel; U-Net gets protection through a form of cooperation between the controller and the device driver.

28 U-Net implementation. Reprogram the ATM controller to understand special data structures in the memory-mapped region; rebuild the ATM device driver to match this model; pin the shared memory pages and leave them mapped into the I/O DMA map; disable memory caching for these pages (otherwise changes won’t be visible to the ATM controller).

29 U-Net architecture. The user’s address space has a direct-mapped communication region; the ATM device controller sees the whole region and can transfer directly in and out of it. The region is organized as an in-queue, an out-queue, and a free list.

30 U-Net protection guarantees. No user can see the contents of any other user’s mapped I/O region (the U-Net controller sees the whole region, but the user programs cannot). The driver mediates to create “channels”, and a user can only communicate over the channels it owns. The U-Net controller uses the channel code on incoming and outgoing packets to rapidly find the region in which to store them.

31 U-Net reliability guarantees. With space available, U-Net has the same properties as the underlying ATM (which should be nearly 100% reliable); when the queues fill up, it will lose packets. It also loses packets if the channel information is corrupted, etc.

32 Minimum U-Net costs? Build the message in a preallocated buffer in the shared region and enqueue a descriptor on the “out queue”; the ATM controller immediately notices and sends it. The remote machine was polling its “in queue”; its ATM controller builds a descriptor for the incoming message, and the application sees it immediately: 35 microseconds of latency.
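
The send-side fast path can be pictured as the sketch below: a descriptor ring and preallocated buffers in a shared communication region, with the application enqueuing and a polling network interface dequeuing. The structure layout, queue length, and buffer size are illustrative assumptions, and the real region would be pinned and memory-mapped rather than a plain global.

```c
/* Sketch of the send-side fast path over a U-Net-style shared region: the
 * application writes a message into a preallocated buffer and pushes a
 * descriptor onto the out-queue; the network interface polls and picks it up. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_LEN 32
#define BUF_SIZE  2048

struct descriptor { uint32_t buf_index; uint32_t length; };

/* Communication region shared between the application and the interface. */
struct comm_region {
    volatile uint32_t out_head, out_tail;             /* out-queue indices   */
    struct descriptor out_queue[QUEUE_LEN];
    uint8_t           buffers[QUEUE_LEN][BUF_SIZE];   /* preallocated buffers */
};

static struct comm_region region;  /* would really be a pinned, mapped region */

int unet_send(const void *msg, uint32_t len)
{
    uint32_t head = region.out_head;
    if (head - region.out_tail == QUEUE_LEN)
        return -1;                                   /* out-queue full: message dropped */

    uint32_t slot = head % QUEUE_LEN;
    memcpy(region.buffers[slot], msg, len);          /* build message in shared buffer */
    region.out_queue[slot] = (struct descriptor){ slot, len };
    region.out_head = head + 1;                      /* polling interface notices this */
    return 0;
}

int main(void)
{
    const char msg[] = "hello";
    printf("enqueued: %s\n", unet_send(msg, sizeof msg) == 0 ? "yes" : "no");
    return 0;
}
```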

33 Protocols over U-Net. Von Eicken and Vogels support IP, UDP and TCP over U-Net; these versions run the TCP stack in user space!

34 VIA and Winsock Direct. A Windows consortium (Microsoft, Intel and others) commercialized U-Net as the Virtual Interface Architecture (VIA), which runs in NT clusters. But most applications run over UNIX-style sockets (the “Winsock” interface on NT); Winsock Direct automatically senses and uses VIA where available. Today this is widely used on clusters and may be a key reason that they have been successful.

35 Broad comments on RPC. RPC is not very transparent, and failure handling is not evident at all: if an RPC times out, what should the developer do? Reissuing the request only makes sense if another server is available; and anyhow, what if the request was finished but the reply was lost? Do it twice? Try to duplicate the lost reply? Meanwhile, performance work is producing enormous gains: from the old 75ms RPC to RPC over U-Net with a 75-microsecond round-trip time, a factor of 1000!

36 Contents of an RPC environment. Standards for data representation; stub compilers and IDL databases; services to manage the server directory and clock synchronization; tools for visualizing system state and managing servers and applications.

37 Closely related topic. Multithreading is a common performance-enhancing technique. The idea is that a server is often idle while doing I/O for one client, so extra threads are used to allow concurrent request processing. In the limit this leads to the database transactional concurrency model, but many non-transactional servers use threads for enhanced performance.

38 Multithreading debate. Three major options. Single-threaded server: only does one thing at a time, uses send/recv system calls and blocks while waiting. Multi-threaded server: internally concurrent; each request spawns a new thread to handle it. Upcalls: an event dispatch loop does a procedure call for each incoming event, as in X11 or PCs running Windows.

39 Single threading: drawbacks. Applications can deadlock if a request cycle forms: I’m waiting for you, and you send me a request, which I can’t handle. Much of the system may sit idle waiting for replies to pending requests. It is also harder to implement the RPC protocol itself (a timer interrupt is needed to trigger acks and retransmissions, which is awkward).

40 Multithreading. The idea is to support internal concurrency, as if each process were really multiple processes sharing one address space. The thread scheduler uses timer interrupts and context switching to mimic a physical multiprocessor with the smaller number of CPUs actually available.

41 Multithreaded RPC. Each incoming request is handled by spawning a new thread. The designer must implement appropriate mutual exclusion to guard against “race conditions” and other concurrency problems. Ideally, the server is more active because it can process new requests while waiting for its own RPCs to complete on other pending requests.
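
A minimal sketch of thread-per-request dispatch with a mutex protecting shared server state, using POSIX threads; the request structure and the handler body are placeholders.

```c
/* Sketch of thread-per-request dispatch with mutual exclusion around shared
 * state; the request type and handler body are illustrative assumptions. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long requests_served;              /* shared state needing protection */

struct request { int id; };

static void *handle_request(void *arg)
{
    struct request *req = arg;

    /* ... do the real work for req->id, possibly issuing RPCs of its own ... */

    pthread_mutex_lock(&lock);            /* guard against races on shared state */
    requests_served++;
    pthread_mutex_unlock(&lock);

    printf("finished request %d\n", req->id);
    free(req);
    return NULL;
}

int main(void)
{
    pthread_t tids[4];
    for (int i = 0; i < 4; i++) {         /* as if four requests had just arrived */
        struct request *req = malloc(sizeof *req);
        req->id = i;
        pthread_create(&tids[i], NULL, handle_request, req);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(tids[i], NULL);

    printf("total served: %ld\n", requests_served);
    return 0;
}
```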

42 Recent RPC history. RPC was once touted as the transparent answer to distributed computing. Today the protocol is very widely used, but it isn’t very transparent, and reliability issues can be a major problem. Currently the strongest interest is in Web Services and CORBA, which use RPC as the mechanism to implement object invocation.