Reliable Distributed Systems RPC and Client-Server Computing.



Remote Procedure Call
- Basic concepts
- Implementation issues, usual optimizations
- Where are the costs?
- Reliability and consistency
- Multithreading debate

A brief history of RPC
- Introduced by Birrell and Nelson in 1984
- Pre-RPC: most applications were built directly over the Internet primitives
- Their idea: mask the distributed computing system using a “transparent” abstraction
  - Looks like a normal procedure call
  - Hides all aspects of distributed interaction
  - Supports an easy programming model
- Today, RPC is the core of many distributed systems

More history
- Early focus was on RPC “environments”
- Culminated in DCE (Distributed Computing Environment), which standardizes many aspects of RPC
- Then emphasis shifted to performance; many systems improved by a factor of 10 to 20
- Today, RPC is often used from object-oriented systems employing the CORBA or COM standards
- Reliability issues are more evident than in the past

The basic RPC protocol
[Figure: client and server timelines]
1. Server registers with name service
2. Client “binds” to server
3. Client prepares, sends request
4. Server receives request
5. Server invokes handler
6. Server sends reply
7. Client unpacks reply
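
The steps on this slide can be sketched as a small in-process simulation. This is only an illustration under stated assumptions: the `name_service` dictionary, the `Server` and `Client` classes, and the JSON message layout are hypothetical stand-ins, and a direct method call replaces the network.

```python
import json

# Hypothetical stand-in for a name service: service name -> server object.
name_service = {}


class Server:
    def __init__(self, name, handlers):
        self.handlers = handlers
        name_service[name] = self                        # registers with name service

    def receive(self, request_bytes):
        request = json.loads(request_bytes)              # receives request
        handler = self.handlers[request["proc"]]
        result = handler(*request["args"])               # invokes handler
        return json.dumps({"result": result}).encode()   # sends reply


class Client:
    def __init__(self, name):
        self.server = name_service[name]                 # "binds" to server

    def call(self, proc, *args):
        # Prepare the request message, then "send" it (a direct call here).
        request = json.dumps({"proc": proc, "args": args}).encode()
        reply_bytes = self.server.receive(request)
        return json.loads(reply_bytes)["result"]         # unpacks reply


Server("calculator", {"add": lambda a, b: a + b})
client = Client("calculator")
print(client.call("add", 2, 3))
```

Note that the server must register before the client binds; reversing the order would raise a `KeyError` in this sketch, which corresponds to a failed bind.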

Compilation stage
- Server defines and “exports” a header file giving the interfaces it supports and the arguments expected; uses an “interface definition language” (IDL)
- Client includes this information
- Client invokes server procedures through “stubs”; each stub:
  - provides an interface identical to the server version
  - is responsible for building the messages and interpreting the reply messages
  - passes arguments by value, never by reference
  - may limit total size of arguments, in bytes
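
A hand-written stub gives a feel for what a stub compiler would generate. The sketch below is an assumption-laden illustration: `append_total`, `server_append_total`, `MAX_ARG_BYTES`, and `ArgumentTooLarge` are hypothetical names, `pickle` stands in for a real wire format (it is not safe for untrusted data), and the network hop is replaced by a local call.

```python
import pickle

MAX_ARG_BYTES = 1024  # assumed limit; real systems choose their own


class ArgumentTooLarge(Exception):
    pass


def server_append_total(values):
    """Server-side procedure: mutates its own copy of the argument."""
    values.append(sum(values))
    return values


def append_total(values):
    """Client stub: same signature as the server procedure."""
    message = pickle.dumps(values)            # build the request message
    if len(message) > MAX_ARG_BYTES:
        raise ArgumentTooLarge(len(message))
    server_copy = pickle.loads(message)       # the server only ever sees a copy
    reply = pickle.dumps(server_append_total(server_copy))
    return pickle.loads(reply)                # interpret the reply message


local = [1, 2, 3]
result = append_total(local)
print(result, local)
```

Because the argument travels by value, the caller's list is unchanged after the call, unlike an ordinary local call to `server_append_total`.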

Binding stage
- Occurs when client and server programs first start execution
- Server registers its network address with the name directory, perhaps with other information
- Client scans the directory to find an appropriate server
- Depending on how the RPC protocol is implemented, the client may make a “connection” to the server, but this is not mandatory

Data in messages
- We say that data is “marshalled” into a message and “demarshalled” from it
- Representation needs to deal with byte ordering issues (big-endian versus little-endian), strings (some CPUs require padding), alignment, etc.
- Goal is to be as fast as possible on the most common architectures, yet it must also be very general
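
As a small illustration of byte ordering and padding, the sketch below marshals two integers and a string using Python's standard `struct` module. The message layout (big-endian fields, a length prefix, and string padding to a 4-byte boundary) is an assumption chosen for the example, not the format of any particular RPC system; `marshal_point` and `demarshal_point` are hypothetical names.

```python
import struct


def marshal_point(x, y, label):
    """Pack two 32-bit ints and a string in network (big-endian) byte order."""
    data = label.encode("utf-8")
    padding = (-len(data)) % 4                 # pad the string to a 4-byte boundary
    header = struct.pack("!iiI", x, y, len(data))
    return header + data + b"\x00" * padding


def demarshal_point(message):
    """Inverse of marshal_point: read the header, then the unpadded string."""
    x, y, length = struct.unpack_from("!iiI", message, 0)
    label = message[12:12 + length].decode("utf-8")
    return x, y, label


msg = marshal_point(1, -2, "origin")
print(len(msg), demarshal_point(msg))
```

The `!` prefix fixes the byte order regardless of the host CPU, which is the general (and potentially slower) choice the slide refers to.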

Request marshalling
- Client builds a message containing the arguments and indicates which procedure to invoke
- Due to the need for generality, data representation is a potentially costly issue!
- Performs a send I/O operation to send the message
- Performs a receive I/O operation to accept the reply
- Unpacks the reply from the reply message
- Returns the result to the client program

Costs in basic protocol?
- Allocation and marshalling of data into the message (costs can be reduced if you are certain that client and server have identical data representations)
- Two system calls, one to send and one to receive, hence context switching
- Much copying all through the O/S: application to UDP, UDP to IP, IP to Ethernet interface, and back up to the application

Schroeder and Burrows
- Studied RPC performance in the O/S kernel
- Suggested a series of major optimizations
- Resulted in performance improvements of about 10-fold for the DEC Firefly workstation (from 10 ms to below 1 ms)

Typical optimizations?
- Compile the stub “inline” to put arguments directly into the message
- Two versions of the stub; if (at bind time) sender and destination are found to have the same data representation, use the host-specific representation
- Use a special “send, then receive” system call (requires an O/S extension)
- Optimize the O/S kernel path itself to eliminate copying: treat RPC as the most important task the kernel will do
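
The "two versions of the stub" idea can be sketched with `struct` format prefixes: `=` selects the host's native byte order and `!` selects network byte order. The functions `choose_format` and `marshal_args`, and the idea that the server reports its byte order as a string at bind time, are assumptions made for this illustration.

```python
import struct
import sys


def choose_format(client_order, server_order):
    """Decide once, at bind time, which representation the stub will use."""
    if client_order == server_order:
        return "="   # host-specific representation: native byte order
    return "!"       # general representation: network (big-endian) byte order


def marshal_args(prefix, a, b):
    """Pack two 32-bit ints using the representation chosen at bind time."""
    return struct.pack(prefix + "ii", a, b)


# Assume the server reported the same byte order as this host when we bound.
prefix = choose_format(sys.byteorder, sys.byteorder)
print(prefix, marshal_args(prefix, 1, 2))
```

In Python the two paths cost about the same; the sketch only shows where the decision is made, not the speedup a compiled stub would see.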

Fancy argument passing
- RPC is transparent for simple calls with a small amount of data passed
  - “Transparent” in the sense that the interface to the procedure is unchanged
  - But exceptions thrown will include new exceptions associated with the network
- What about complex structures, pointers, big arrays? These will be very costly, and perhaps impractical to pass as arguments
- Most implementations limit the size and types of RPC arguments. Very general systems are less limited but much more costly.

Overcoming lost packets
[Figure: message sequence between client and server]
- Client sends request
- Timeout! Client retransmits the request
- Server sends ack for request; a duplicate request is ignored
- Server sends reply
- Client sends ack for reply
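
The timeout-and-retransmit loop can be sketched with a deterministic simulation. Everything here is hypothetical: `LossyServer` scripts which transmissions are lost instead of using a real network or timer, and the server resends its cached reply on a duplicate rather than re-running the handler.

```python
class LossyServer:
    """Hypothetical server reached over a channel with scripted packet loss."""

    def __init__(self, fates):
        self.fates = list(fates)   # per attempt: "lose_request", "lose_reply", "ok"
        self.executions = 0
        self.replies = {}          # request id -> reply already computed

    def deliver(self, request_id, value):
        fate = self.fates.pop(0) if self.fates else "ok"
        if fate == "lose_request":
            return None                        # request never arrives
        if request_id not in self.replies:     # first copy: run the handler
            self.executions += 1
            self.replies[request_id] = value * 2
        if fate == "lose_reply":
            return None                        # handler ran, but the reply is lost
        return self.replies[request_id]        # duplicates get the stored reply


def call_with_retransmit(server, request_id, value, max_attempts=5):
    """Client side: treat a missing reply as a timeout and retransmit."""
    for attempt in range(1, max_attempts + 1):
        reply = server.deliver(request_id, value)
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after %d attempts" % max_attempts)


server = LossyServer(["lose_request", "lose_reply", "ok"])
print(call_with_retransmit(server, request_id=1, value=21), server.executions)
```

The client cannot tell the first two failures apart, which is why the server has to recognize the retransmitted request by its id.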

Costs in fault-tolerant version?
- Acks are expensive. Try to avoid them, e.g. if the reply will be sent quickly, suppress the initial ack
- Retransmission is costly. Try to tune the delay to be “optimal”
- For big messages, send packets in bursts and ack a burst at a time, not one by one

Big packets
[Figure: message sequence between client and server]
- Client sends request as a burst
- Server acks the entire burst
- Server sends reply
- Client sends ack for reply

RPC “semantics”
- At most once: request is processed 0 or 1 times
- Exactly once: request is always processed 1 time
- At least once: request is processed 1 or more times
- ... but exactly once is impossible because we can’t distinguish packet loss from true failures! In both cases, the RPC protocol simply times out.

Implementing at most/least once
- Use a timer (clock) value and a unique id, plus sender address
- Server remembers recent ids and replies with the same data if a request is repeated
- Also uses the id to identify duplicates and reject them
- Very old requests are detected and ignored by checking the time
- Assumes that the clocks are working
- In particular, requires “synchronized” clocks
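
A minimal sketch of the server-side bookkeeping, assuming synchronized clocks as the slide requires. The class `AtMostOnceServer`, the 60-second window, and passing `now` explicitly (to keep the example deterministic) are all choices made for illustration.

```python
class AtMostOnceServer:
    def __init__(self, handler, window_seconds=60.0):
        self.handler = handler
        self.window = window_seconds
        self.seen = {}          # (sender, request id) -> (timestamp, reply)
        self.executions = 0

    def handle(self, sender, request_id, timestamp, arg, now):
        if now - timestamp > self.window:
            return None                         # very old request: ignore it
        # Forget entries older than the window; any duplicate of them
        # would be rejected by the age check above anyway.
        self.seen = {key: entry for key, entry in self.seen.items()
                     if now - entry[0] <= self.window}
        key = (sender, request_id)
        if key in self.seen:
            return self.seen[key][1]            # repeated request: same reply
        self.executions += 1
        reply = self.handler(arg)
        self.seen[key] = (timestamp, reply)
        return reply


server = AtMostOnceServer(lambda x: x + 1)
first = server.handle("client-a", 1, timestamp=100.0, arg=41, now=101.0)
again = server.handle("client-a", 1, timestamp=100.0, arg=41, now=105.0)
stale = server.handle("client-a", 1, timestamp=100.0, arg=41, now=200.0)
print(first, again, stale, server.executions)
```

The time check is what lets the server discard old ids safely; if clocks drift by more than the window, a late duplicate could be executed a second time.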

RPC versus local procedure call
- Restrictions on argument sizes and types
- New error cases:
  - Bind operation failed
  - Request timed out
  - Argument “too large” can occur if, e.g., a table grows
- Costs may be very high
- ... so RPC is actually not very transparent!

RPC costs in case of local destination process
- Often, the destination is right on the caller’s machine!
  - Caller builds message
  - Issues send system call, blocks, context switch
  - Message copied into kernel, then out to dest.
  - Dest is blocked... wake it up, context switch
  - Dest computes result
  - Entire sequence repeated in reverse direction
- If the scheduler is a process, context switch 6 times!

RPC in normal case
[Figure: source process, destination process on the same site, and the O/S. Source does xyz(a, b, c).]
- Destination and O/S are blocked
- Source, dest both block. O/S runs its scheduler, copies message from source out-queue to dest in-queue
- Dest runs, copies in message
- Same sequence needed to return results

Broad comments on RPC
- RPC is not very transparent
- Failure handling is not evident at all: if an RPC times out, what should the developer do?
  - Reissuing the request only makes sense if there is another server available
  - Anyhow, what if the request was finished but the reply was lost? Do it twice? Try to duplicate the lost reply?
- Performance work is producing enormous gains: from the old 75 ms RPC to RPC over U/Net with a 75 µs round-trip time, a factor of 1000!

Contents of an RPC environment
- Standards for data representation
- Stub compilers, IDL databases
- Services to manage server directory, clock synchronization
- Tools for visualizing system state and managing servers and applications

Closely Related Topic
- Multithreading is a common performance-enhancing technique
- Idea is that the server is often idle while doing I/O for one client, so use extra threads to allow concurrent request processing
- In the limit, leads to the database transactional concurrency model, but many non-transactional servers use threads for enhanced performance

Multithreading debate
Three major options:
- Single-threaded server: only does one thing at a time, uses send/recv system calls and blocks while waiting
- Multi-threaded server: internally concurrent, each request spawns a new thread to handle it
- Upcalls: event dispatch loop does a procedure call for each incoming event, as in X11 or on PCs running Windows

Single threading: drawbacks
- Applications can deadlock if a request cycle forms: I’m waiting for you and you send me a request, which I can’t handle
- Much of the system may be idle waiting for replies to pending requests
- Harder to implement the RPC protocol itself (need to use a timer interrupt to trigger acks and retransmission, which is awkward)

Multithreading
- Idea is to support internal concurrency as if each process were really multiple processes that share one address space
- Thread scheduler uses timer interrupts and context switching to mimic a physical multiprocessor using the smaller number of CPUs actually available

Multithreaded RPC
- Each incoming request is handled by spawning a new thread
- Designer must implement appropriate mutual exclusion to guard against “race conditions” and other concurrency problems
- Ideally, server is more active because it can process new requests while waiting for its own RPCs to complete on other pending requests
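
A thread-per-request server with a lock around shared state might look like the sketch below. `CounterServer` and its methods are hypothetical, and the list of requests stands in for messages arriving from clients.

```python
import threading


class CounterServer:
    def __init__(self):
        self.total = 0
        self.lock = threading.Lock()

    def handle_request(self, amount):
        # Mutual exclusion: without the lock, two threads could both read the
        # same old total and one update would be lost (a race condition).
        with self.lock:
            current = self.total
            self.total = current + amount

    def serve(self, requests):
        # Spawn one new thread per incoming request, then wait for all of them.
        threads = [threading.Thread(target=self.handle_request, args=(amount,))
                   for amount in requests]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return self.total


server = CounterServer()
print(server.serve([1] * 100))
```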

Negatives to multithreading
- Users may have little experience with concurrency and will then make mistakes
- Concurrency bugs are very hard to find due to non-reproducible scheduling orders
- Reentrancy can come as an undesired surprise
- Threads need stacks, hence consumption of memory can be very high
- Deadlock remains a risk, now associated with concurrency control
- Stacks for threads must be finite and can overflow, corrupting the address space

Threads: can spawn too many
[Figure: scheduler (SCHED) receiving events and spawning threads]
- Thread spawned, but blocks
- Eventually, application becomes bloated, begins to thrash. Performance drops and clients may think the server has failed
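
One common way to keep thread counts in check, which the slides do not themselves prescribe, is a fixed-size worker pool. The sketch below uses the standard `concurrent.futures.ThreadPoolExecutor`; `MAX_WORKERS`, `handle`, and the counters are hypothetical names used only to show that concurrency stays bounded.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4          # assumed cap on concurrent handler threads
active = 0               # handlers running right now
peak = 0                 # highest value 'active' ever reached
counter_lock = threading.Lock()


def handle(event):
    global active, peak
    with counter_lock:
        active += 1
        peak = max(peak, active)
    try:
        return event * 2             # stand-in for real request processing
    finally:
        with counter_lock:
            active -= 1


# Fifty events arrive, but at most MAX_WORKERS threads ever exist;
# the remaining events wait in the pool's queue instead of spawning threads.
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    results = list(pool.map(handle, range(50)))

print(len(results), peak <= MAX_WORKERS)
```

The trade-off is that queued events wait longer when all workers are blocked, so the cap has to be chosen with the workload in mind.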

Upcall model
- Common in windowing systems
- Each incoming “event” is encoded as a small descriptive data structure
- User registers event handling procedures
- Dispatch loop calls the procedures as new events arrive, waits for the call to finish, then dispatches a new event
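
The upcall model reduces to a table of registered procedures and a loop. In this sketch, `register`, `dispatch_loop`, and the dictionary-shaped events are hypothetical; a real windowing system would block waiting for events rather than drain a prepared list.

```python
from collections import deque

handlers = {}   # event type -> user-registered procedure


def register(event_type, procedure):
    handlers[event_type] = procedure


def dispatch_loop(events):
    """Call the registered procedure for each event, one at a time."""
    log = []
    queue = deque(events)
    while queue:
        event = queue.popleft()               # small descriptive data structure
        procedure = handlers.get(event["type"])
        if procedure is not None:
            log.append(procedure(event))      # upcall: returns before next event
    return log


register("key", lambda event: "key:" + event["char"])
register("click", lambda event: "click:%d,%d" % (event["x"], event["y"]))

print(dispatch_loop([
    {"type": "key", "char": "a"},
    {"type": "click", "x": 3, "y": 4},
    {"type": "scroll"},                       # no handler registered: skipped
]))
```

Because each upcall runs to completion before the next event is dispatched, a slow handler stalls every event behind it.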

Upcalls combined with threads
- Perhaps the best model for RPC programming
- Each handler can be tagged: needs thread, or can be executed “unthreaded”
- Developer must still be very careful where threads are used
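
Tagging handlers could be sketched as a flag stored next to each registered procedure. The `Dispatcher` class, the `needs_thread` flag, and the `record` helper are hypothetical names for this illustration.

```python
import threading

results = []                      # (payload, id of the thread that ran the handler)
results_lock = threading.Lock()


def record(payload):
    with results_lock:
        results.append((payload, threading.get_ident()))


class Dispatcher:
    def __init__(self):
        self.handlers = {}        # event type -> (procedure, needs_thread)

    def register(self, event_type, procedure, needs_thread=False):
        self.handlers[event_type] = (procedure, needs_thread)

    def run(self, events):
        threads = []
        for event_type, payload in events:
            procedure, needs_thread = self.handlers[event_type]
            if needs_thread:
                thread = threading.Thread(target=procedure, args=(payload,))
                thread.start()                # tagged handler gets its own thread
                threads.append(thread)
            else:
                procedure(payload)            # untagged handler runs in the loop
        for thread in threads:
            thread.join()


dispatch_thread = threading.get_ident()
dispatcher = Dispatcher()
dispatcher.register("quick", record)                      # executed "unthreaded"
dispatcher.register("slow", record, needs_thread=True)    # tagged: needs thread
dispatcher.run([("quick", "q1"), ("slow", "s1"), ("quick", "q2")])
print(sorted(payload for payload, _ in results))
```

Only the handlers tagged `needs_thread` require mutual exclusion, which narrows the places where concurrency bugs can hide.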

Recent RPC history
- RPC was once touted as the transparent answer to distributed computing
- Today the protocol is very widely used... but it isn’t very transparent, and reliability issues can be a major problem
- Today the strongest interest is in Web Services and CORBA, which use RPC as the mechanism to implement object invocation