Remote Procedure Call (RPC) and Transparency Brad Karp UCL Computer Science CS GZ03 / M030 10th October, 2006

2 Transparency in Distributed Systems Programmers are accustomed to writing code for a single box Transparency: retain the “feel” of writing for one box when writing code that runs across many boxes Goals: –Preserve original, unmodified client code –Preserve original, unmodified server code –RPC should glue together client and server without changing the behavior of either –Programmer shouldn’t have to think about the network How achievable is true transparency? We will use NFS as a case study. But first, an introduction to RPC itself.

3 Remote Procedure Call: Central Idea Within a single program, running on a single box, well-known notion of procedure call (aka function call): –Caller pushes arguments onto stack –Jumps to address of callee function –Callee reads arguments from stack –Callee executes, puts return value in register –Callee returns to next instruction in caller RPC aim: let distributed programming look no different from local procedure calls
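For concreteness, a minimal C version of the local case (the compiler emits the push-arguments/jump/read-arguments/return sequence in the steps above). RPC’s aim is that the call in main() could stay exactly as written even if add() actually ran on another machine:

    /* an ordinary local procedure call */
    int add(int a, int b) {       /* callee: reads arguments, computes */
        return a + b;             /* return value placed in a register */
    }

    int main(void) {
        int r = add(3, 4);        /* caller: pushes args, jumps to add, resumes here */
        return r;
    }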

4 RPC Abstraction Library makes an API available to locally running applications Let servers export their local APIs to be accessible over the network, as well On client, procedure call generates request over network to server On server, called procedure executes, result returned in response to client
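A hedged sketch of what the client-side half of this might look like; the helpers (marshal_int, send_request, recv_reply, unmarshal_int) and the ADD_PROC constant are invented for illustration, not a real RPC library’s API:

    /* client stub: same signature as the local procedure it replaces */
    int add(int a, int b) {
        unsigned char req[8], reply[4];
        marshal_int(req, a);                 /* pack arguments into request buffer */
        marshal_int(req + 4, b);
        send_request(server, ADD_PROC, req, sizeof(req));  /* request over network */
        recv_reply(server, reply, sizeof(reply));          /* block until reply */
        return unmarshal_int(reply);         /* unpack the procedure's return value */
    }

The calling application is unchanged: it still just writes add(3, 4).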

5 RPC Implementation Details Data types may be different sizes on different machines (e.g., 32-bit vs. 64-bit integers) Little-endian vs. big-endian machines –Big-endian: 0x11223344 is stored as bytes 0x11, 0x22, 0x33, 0x44 –Little-endian: 0x11223344 is stored as bytes 0x44, 0x33, 0x22, 0x11 Need mechanism to pass procedure parameters and return values in machine-independent fashion Solution: Interface Description Language (IDL)
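For integers, the standard fix is a canonical network byte order (big-endian) with conversion at each end; the POSIX htonl()/ntohl() routines do exactly this, and are no-ops on big-endian hosts:

    #include <arpa/inet.h>   /* htonl/ntohl: host <-> network byte order */
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint32_t x    = 0x11223344;
        uint32_t wire = htonl(x);    /* big-endian on the wire, whatever the host */
        uint32_t back = ntohl(wire); /* receiver converts back to host order */
        printf("host 0x%08x -> wire -> host 0x%08x\n", x, back);
        return 0;
    }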

6 Interface Description Languages Compile interface description, produces: –Types in native language (e.g., Java, C, C++) –Code to marshal native data types into machine-neutral byte streams for network (and vice-versa) –Stub routines on client to forward local procedure calls as requests to server For Sun RPC, IDL is XDR (eXternal Data Representation) Now, back to our NFS case study…
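To give the flavor, here is a minimal sketch of a Sun RPC interface description in XDR (.x) syntax; the program, version, and procedure names and numbers are invented for illustration. rpcgen compiles a file like this into C types, marshaling code, and client/server stubs:

    /* add.x -- hypothetical interface for a remote ADD procedure */
    struct intpair {
        int a;
        int b;
    };

    program ADD_PROG {
        version ADD_VERS {
            int ADD(intpair) = 1;   /* procedure number 1 */
        } = 1;                      /* interface version 1 */
    } = 0x20000099;                 /* program number, user-defined range */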

7 “Non-Distributed” NFS Layered stack: Applications → Syscalls → Kernel filesystem implementation → Local disk RPC must “split up” this stack Where does NFS make the split?

8 NFS Structure on Client NFS splits client at vnode interface, below syscall implementation Client-side NFS code essentially stubs for system calls: –Package up arguments, send them to server
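A heavily simplified sketch of the idea (names and structures are illustrative, not the actual kernel code): the client-side vnode operation for read just packages its arguments into an NFS READ request and awaits the reply:

    /* hypothetical client-side vnode op: forwards read() to the server */
    int nfs_read(struct vnode *vp, char *buf, int count, long offset) {
        struct nfs_read_req req;
        req.fh     = vp->fh;       /* file handle: how the server names the file */
        req.offset = offset;
        req.count  = count;
        /* send READ request, block for the reply, copy returned bytes into buf */
        return nfs_rpc(vp->server, NFSPROC_READ, &req, buf);
    }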

9 NFS and Syntactic Transparency Does NFS preserve the syntax of the client function call API (as seen by applications)? –Yes! –Arguments and return values of system calls not changed in form or meaning

10 NFS and Server-Side Transparency Does NFS require changes to pre-existing filesystem code on server? –Some, but not much. –NFS adds in-kernel threads (to block on I/O, much like user-level processes do) –Server filesystem implementation changes: File handles over wire, not file descriptors Generation numbers added to on-disk i-nodes User IDs carried as arguments, rather than implicit in process owner Support for synchronous updates (e.g., for WRITE)
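A simplified sketch of what goes into a file handle (real handles are opaque byte arrays and layouts vary): the server names files by filesystem and i-node rather than by per-process descriptor, and the generation number lets it detect that an i-node has been deleted and reused:

    struct nfs_fhandle {
        uint32_t fsid;        /* which exported filesystem */
        uint32_t inode;       /* which i-node within it */
        uint32_t generation;  /* bumped when the i-node is recycled, so
                                 handles to the deleted file become stale */
    };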

11 NFS and File System Semantics You don’t get transparency merely by preserving the same API System calls must mean the same thing! If they don’t, pre-existing code may compile and run, but yield incorrect results! Does NFS preserve the UNIX filesystem’s semantics? No! Let us count the ways…

12 NFS’ New Semantics: Server Failure On one box, open() only fails if file doesn’t exist Now open() and all other syscalls can fail if server has died! –Apps must know how to retry or fail gracefully Or open() could hang forever—never the case before! –Apps must know how to set their own timeouts if they don’t want to hang This is not a quirk of NFS—it’s fundamental!
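One classic idiom for bounding a call that may hang is an alarm: installing the SIGALRM handler without SA_RESTART makes the blocked syscall return early with EINTR. A sketch, assuming a hard-mounted NFS path (whether a hung NFS open() is actually interruptible depends on mount options):

    #include <errno.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_alarm(int sig) { (void)sig; }  /* only interrupts the syscall */

    int open_with_timeout(const char *path, unsigned secs) {
        struct sigaction sa;
        sa.sa_handler = on_alarm;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;                   /* no SA_RESTART: let open() fail */
        sigaction(SIGALRM, &sa, NULL);
        alarm(secs);
        int fd = open(path, O_RDONLY);     /* may block if the server is dead */
        alarm(0);
        if (fd < 0 && errno == EINTR)
            fprintf(stderr, "open() timed out; server unreachable?\n");
        return fd;
    }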

13 NFS’ New Semantics: close() Might Fail Suppose server is out of disk space But client sends WRITEs asynchronously, flushing only on close(), for performance Client waits in close() for WRITEs to finish close() never returns this error for a local fs! –Apps must check not only write(), but also close(), for disk full! Reason: NFS batches WRITEs –If WRITEs were synchronous, close() couldn’t fill the disk, but performance would be awful
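Correct application code on NFS therefore checks both calls; a minimal sketch:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int save(int fd, const char *buf, size_t n) {
        if (write(fd, buf, n) != (ssize_t)n)  /* may only fill the local buffer */
            return -1;
        if (close(fd) < 0) {                  /* batched WRITEs flushed here... */
            /* ...so disk-full surfaces here, e.g. errno == ENOSPC */
            fprintf(stderr, "close failed: %s\n", strerror(errno));
            return -1;
        }
        return 0;
    }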

14 NFS’ New Semantics: Errors Returned for Successful Operations Suppose you call rename(“a”, “b”) on a file in an NFS-mounted fs Suppose server completes the RENAME, then crashes before replying The NFS client times out and resends the RENAME “a” no longer exists; error returned! Never happens on a local fs… Side effect of statelessness of NFS server: –Server could remember all ops it’s completed, but that’s hard –Must keep that state consistent and persistent across crashes (i.e., on disk)! –Update the state first, or perform the operation first?

15 NFS’ New Semantics: Deletion of Open Files Client A open()s a file for reading Client B deletes it while A has it open Local UNIX fs: A’s subsequent reads work. NFS: A’s subsequent reads fail Side effect of statelessness of NFS server: –Could have fixed this—server could track open()s –AFS tracks the state required to solve this problem
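The local semantics are easy to demonstrate: the sketch below unlinks a file while a descriptor is still open, and the read succeeds on a local UNIX fs. Over NFS, if the delete comes from a different client, A’s subsequent READ fails with a stale-file-handle error (ESTALE), since the server no longer knows the i-node:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        char buf[6] = {0};
        int fd = open("f.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
        write(fd, "hello", 5);
        unlink("f.txt");                 /* delete while still open */
        lseek(fd, 0, SEEK_SET);
        ssize_t n = read(fd, buf, 5);    /* local fs: returns 5 and "hello" */
        printf("read returned %zd: %s\n", n, n > 0 ? buf : "(error)");
        return 0;
    }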

16 Semantics vs. Performance Insight: preserving semantics produces poor performance e.g., for write() to local fs, UNIX can delay actual write to disk –Gather writes to multiple adjacent blocks, and so write them with one disk seek –If box crashes, you lose both the running app and its dirty buffers in memory Can we delay WRITEs in this way on NFS server?

17 NFS Server and WRITE Semantics Suppose a WRITE RPC stores client data in a buffer in memory, returns success to client Now server crashes and reboots –App doesn’t crash—in fact, doesn’t notice! –And written data mysteriously disappear! Solution: NFS server does synchronous WRITEs –Doesn’t reply to WRITE RPC until data are on disk –If write() returns on client, data are safe on disk even if server crashes –Per previous lecture: 3 seeks ≈ 45 ms per WRITE, so 22 WRITEs/s, 180 KB/s max throughput! –< 10% of max disk throughput NFS v3 and AFS fix this problem (more complex)
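Unpacking those numbers, assuming the previous lecture’s figures of roughly 15 ms per seek and 8 KB per WRITE:

    3 seeks/WRITE × 15 ms/seek = 45 ms/WRITE
    1000 ms/s ÷ 45 ms/WRITE   ≈ 22 WRITEs/s
    22 WRITEs/s × 8 KB/WRITE  ≈ 180 KB/s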

18 Semantics vs. Performance (2) Insight: improving performance changes consistency semantics! Suppose clients cache disk blocks when they read them, but writes always go through to the server Not enough to get consistency! –Write editor buffer on one box, run make on another –Does make/the compiler see the changes? Ask server “has file changed?” at every read()? –Almost as slow as just reading from the server…

19 NFS: Semantics vs. Performance NFS’ solution: close-to-open consistency –Ask server “has file changed?” at each open() –Don’t ask on each read() after open() –If B changes file while A has it open, A doesn’t see the changes OK for emacs/make, but not always what you want: –make > make.log (on server) –tail -f make.log (on my desktop) Side effect of statelessness of NFS server: –Server could track who has cached blocks on reads –Send “invalidate” messages to clients on changes
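In pseudocode, the open()-time check might look like this (a sketch of the idea, with invented helper names; NFS clients fetch the server’s attributes with the GETATTR RPC):

    /* sketch: close-to-open consistency check at open() */
    int nfs_open(struct vnode *vp) {
        struct fattr a = getattr_rpc(vp->server, vp->fh); /* “has file changed?” */
        if (a.mtime != vp->cached_mtime) {
            invalidate_cached_blocks(vp);   /* cached copies are out of date */
            vp->cached_mtime = a.mtime;
        }
        return 0;  /* later read()s hit the cache without asking the server */
    }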

20 Security Radically Different Local system: UNIX enforces read/write protections per-user –Can’t read my files without my password How does NFS server authenticate a user? Easy to send requests to the NFS server, and to forge NFS replies to the client Does it help for the server to look at the source IP address? So why aren’t NFS servers ridiculously vulnerable? –Hard to guess correct file handles! Fixable: SFS, AFS, and some NFS versions use cryptography to authenticate the client. Very hard to reconcile with statelessness!

21 NFS Still Very Useful People fix programs to handle the new semantics –Must mean NFS is useful enough to motivate them to do so! People install firewalls for security NFS still gives many of the advantages of a transparent client/server system

22 Multi-Module Distributed Systems NFS is in fact rather simple: –One server, one data type (file handle) What if interactions are symmetric, with many data types? Say you build a system with three modules in one address space: –Web front end, customer DB, order DB Represent user connections with an object: class connection { int fd; int state; char *buf; }; Easy to pass object references among the three modules (e.g., pointer to current connection) What if we split the system into three separate servers?

23 Multi-Module Systems: Challenges How do you pass class connection between servers? –Could RPC stub just send object’s elements? What if processing flow for connection goes: order DB -> customer DB -> front end to send reply? Front end only knows contents of passed connection object; underlying connection may have changed! Wanted to pass object references, not object contents NFS solution: file handles –No support from RPC to help with this!
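One way to mimic NFS’s trick here: the owning server hands out small opaque handles and keeps the live objects in a table; other servers pass the handle around, and only the owner dereferences it. A sketch with invented names:

    #include <stddef.h>
    #include <stdint.h>

    struct connection { int fd; int state; char *buf; };

    #define MAX_CONN 1024
    /* lives only inside the owning server */
    static struct connection *conn_table[MAX_CONN];

    /* a reference other servers can safely pass around and hand back */
    uint32_t conn_to_handle(struct connection *c) {
        for (uint32_t h = 0; h < MAX_CONN; h++)
            if (conn_table[h] == c)
                return h;
        return UINT32_MAX;   /* not registered */
    }

    /* only the owner turns a handle back into a live object */
    struct connection *handle_to_conn(uint32_t h) {
        return (h < MAX_CONN) ? conn_table[h] : NULL;
    }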

24 Summary: RPC Non-Transparency Partial failure, network failure Latency Efficiency/semantics tradeoff Security—rarely transparent! Pointers: write-sharing, portable object references Concurrency (if multiple clients) Solutions: –Expose “remoteness” of RPC to application, or –Work harder to achieve transparent RPC

25 Conclusions Of RPC’s goals, automatic marshaling most successful. Mimicking procedure call interface in practice not so useful. Attempt at full transparency mostly a failure! –(You can try hard: consider Java RMI) Next time: implicit communication through distributed shared memory!

26 RPC: Failure Happens New failure modes not seen in simple, same-host procedure calls: –Remote server failure –Communication (network) failure RPCs can return “failure” instead of results Possible failure outcomes: –Procedure didn’t execute –Procedure executed once –Procedure executed multiple times –Procedure partially executed Generally, “at most once” semantics preferred

27 Achieving At-Most-Once Semantics Risk: Request message lost –Client must retransmit requests when no reply received Risk: Reply message lost –Client may retransmit previously executed request –OK when operations idempotent; some aren’t, though (e.g., “charge customer”) –Server can keep “replay cache” to reply to repeated requests without re-executing them
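A minimal sketch of such a replay cache (layout and names are assumptions): the server remembers the last reply for each (client, request-id) pair; a retransmitted request gets the cached reply instead of being re-executed:

    #include <stdint.h>
    #include <string.h>

    #define SLOTS 1024

    struct cache_entry {
        uint64_t client_id;
        uint64_t xid;         /* per-client request id; retransmissions reuse it */
        char     reply[512];
        int      valid;
    };
    static struct cache_entry cache[SLOTS];

    /* non-NULL means: duplicate request, resend old reply, do NOT re-execute */
    const char *check_replay(uint64_t client_id, uint64_t xid) {
        struct cache_entry *e = &cache[xid % SLOTS];
        if (e->valid && e->client_id == client_id && e->xid == xid)
            return e->reply;
        return NULL;
    }

    void record_reply(uint64_t client_id, uint64_t xid, const char *reply) {
        struct cache_entry *e = &cache[xid % SLOTS];
        e->client_id = client_id;
        e->xid = xid;
        strncpy(e->reply, reply, sizeof e->reply - 1);
        e->reply[sizeof e->reply - 1] = '\0';
        e->valid = 1;
    }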