Networking Implementations (part 1) CPS210 Spring 2006
Papers
  - The Click Modular Router (Robert Morris)
  - Lightweight Remote Procedure Call (Brian Bershad)

Procedure Calls

    int main(int argc, char **argv) {
        char *p = malloc(64);
        foo(p);
    }

    void foo(char *p) {
        p[0] = '\0';
    }

  [Figure: address space layout with Code + Data, Heap, and Stack regions. Stack frames: main: argc, argv; foo: 0xf33c, 0xfa3964.]

RPC basics
  - Want network code to look local
  - Leverage language support
  - 3 components on each side
    - User program (client or server)
    - Stub procedures
    - RPC runtime support

Building an RPC server
  - Define interface to server
    - IDL (Interface Definition Language)
  - Use stub compiler to create stubs
    - Input: IDL, Output: client/server stub code
  - Server code linked with server stub
  - Client code linked with client stub

RPC Binding
  - Binding connects clients to servers
  - Two phases: server export, client import
  - In Java RMI
    - rmic compiles IDL into ServerObj_{Skel,Stub}
    - Export looks like this
        Naming.bind("Service", new ServerObj());
    - ServerObj_Skel dispatches requests to input ServerObj
    - Import looks like this
        Naming.lookup("rmi://host/Service");
    - Returns a ServerObj_Stub (subtype of ServerObj)

Remote Procedure Calls (RPC)
  1) Bind  2) Invoke and reply  3) Terminate

  Client:

    main (int, char**) {
        char *p = malloc (64);
        foo (p);
    }

    // client stub
    foo (char *p) {
        // bind to server
        socket s ("remote");
        // invoke remote server
        s.send(FOO);
        s.send(marsh(p));
        // copy reply
        memcpy(p, unmarsh(s.recv()), 64);
        // terminate
        s.close();
    }

  Server:

    // server
    foo (char *p) {
        p[0] = '\0';
    }

    foo_stub (s) {
        // alloc, unmarshall
        char *p2 = malloc(64);
        s.recv(p2, 64);
        // call server
        foo(p2);
        // return reply
        s.send(p2, 64);
    }

    RPC_dispatch (s) {
        int call;
        s.recv (&call);
        // do dispatch
        switch (call) {
        …
        case FOO:
            // call stub
            foo_stub(s);
        …
        }
        s.close ();
    }

  [Figure: client and server address spaces, each with Code + Data, Heap, and Stack. Client stack: main: argc, argv; foo: 0xf33c, 0xfa3964. Server stack: RPC_dispatch: socket; stub: 0xd23c, &s; foo: 0xd23c, 0xfb3268.]

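The client stub, server stub, and dispatcher on this slide can be sketched in plain C, with an in-memory byte buffer standing in for the socket. Everything named here (`wire_t`, `OP_FOO`, `rpc_dispatch`, `foo_remote`, and the one-byte opcode plus 64-byte payload layout) is an assumption made for illustration, not the wire format of any real RPC system.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ARG_SIZE 64          /* matches the malloc(64) buffer on the slide */
enum { OP_FOO = 1 };         /* hypothetical procedure identifier */

/* Hypothetical "wire": a byte buffer standing in for the socket. */
typedef struct {
    uint8_t bytes[1 + ARG_SIZE];
    size_t  len;
} wire_t;

/* The real server procedure: identical to the local version. */
static void foo(char *p) { p[0] = '\0'; }

/* Server stub: unmarshal the argument, call foo, marshal the reply. */
static void foo_stub(const wire_t *req, wire_t *reply) {
    char p2[ARG_SIZE];
    memcpy(p2, req->bytes + 1, ARG_SIZE);   /* unmarshal */
    foo(p2);                                /* call server procedure */
    memcpy(reply->bytes, p2, ARG_SIZE);     /* marshal reply */
    reply->len = ARG_SIZE;
}

/* Dispatcher: read the opcode and route to the matching stub. */
static int rpc_dispatch(const wire_t *req, wire_t *reply) {
    if (req->len < 1) return -1;
    switch (req->bytes[0]) {
    case OP_FOO:
        if (req->len != 1 + ARG_SIZE) return -1;
        foo_stub(req, reply);
        return 0;
    default:
        return -1;                          /* unknown procedure */
    }
}

/* Client stub: looks like a local call, but goes through the wire. */
static int foo_remote(char *p) {
    wire_t req, reply;
    req.bytes[0] = OP_FOO;                  /* invoke: opcode first */
    memcpy(req.bytes + 1, p, ARG_SIZE);     /* marshal argument */
    req.len = 1 + ARG_SIZE;
    if (rpc_dispatch(&req, &reply) != 0)    /* stands in for send/recv */
        return -1;
    memcpy(p, reply.bytes, ARG_SIZE);       /* copy reply back */
    return 0;
}
```

A caller passes a 64-byte buffer to `foo_remote` exactly as it would to `foo`. The argument is copied three times along the way (into the request, into the server's `p2`, and back from the reply), which is the kind of copying overhead the LRPC slides go on to attack.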
RPC Questions
  - Does this abstraction make sense?
    - You always know when a call is remote
  - What is the advantage over raw sockets?
    - When are sockets more appropriate?
  - What about strongly typed languages?
    - Can type info be marshaled efficiently?

LRPC Context
  - In 1990, micro-kernels were all the rage
  - Split OS functionality between "servers"
  - Each server runs in a separate address space
  - Use RPC to communicate
    - Between apps and micro-kernel
    - Between micro-kernel and servers

Micro-kernels argument
  - Easy to protect OS from applications
    - Run in separate protection modes
    - Use HW to enforce
  - Easy to protect apps from each other
    - Run in separate address spaces
    - Use naming to enforce
  - How do we protect OS from itself?
    - Why is this important?

Mach architecture
  [Figure: Mach architecture diagram. Labels: Kernel, User process, File server, Pager, Memory server, Process sched., Comm., Network.]

LRPC Motivation
  - Overwhelmingly, RPCs are intra-machine
  - RPC on a single machine is very expensive
    - Many context switches
    - Much data copying between domains
  - Result: monolithic kernels make a comeback
    - Run servers in kernel to minimize overhead
    - Sacrifices safety of isolation
  - How can we make intra-machine RPC fast?
    - (without chucking microkernels altogether)

Baseline RPC cost
  - Null RPC call
      void null () { return; }
  1. Procedure call
  2. Client to server: trap + context switch
  3. Server to client: trap + context switch
  4. Return to client

Sources of extra overhead
  - Stub code
    - Marshaling and unmarshaling arguments
    - User1 → Kernel, Kernel → User2, and back again
  - Access control (binding validation)
  - Message enqueuing and dequeuing
  - Thread scheduling
    - Client and server have separate thread pools
  - Context switches
    - Change virtual memory mappings
  - Server dispatch

LRPC Approach
  - Optimize for the common case:
    - Intra-machine communication
  - Idea: decouple threads from address spaces
  - For LRPC call, client provides server
    - Argument stack (A-stack)
    - Concrete thread (one of its own)
  - Kernel regulates transitions between domains

1) Binding

    // server code
    char name[8];

    int set_name(char *newname) {
        int i, valid = 0;
        for (i = 0; i < 8; i++) {
            if (newname[i] == '\0') {
                valid = 1;
                break;
            }
        }
        if (valid) {
            strcpy(name, newname);
            return 0;
        }
        return -EINVAL;
    }

  [Figure: client (C) and server (S) address spaces, each with Code + Data, Heap, and Stack, plus the LRPC runtime, the kernel, and the clerk for S. Labels: C sends import "S" (import req.); PDL(S) holds set_name{addr:0x54320, conc:1, A_stack_sz:12}; server code at 0x54320; A-stack (12 bytes) with addresses 0x74a28 and 0x761c2; linkage record LR{}; BindObj. Client stack: main: argc, argv.]

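The bookkeeping created at bind time can be sketched as a few C structures. This is a minimal sketch under stated assumptions: the names `pd_t`, `linkage_t`, `binding_t`, `lrpc_import`, and `lrpc_unbind` are hypothetical, ordinary heap memory stands in for the pairwise-shared A-stack region, and a single procedure is imported at a time rather than a whole interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical procedure descriptor: one PDL entry per exported procedure. */
typedef struct {
    const char *name;            /* e.g. "set_name" */
    int       (*entry)(char *);  /* server entry point ("addr" on the slide) */
    size_t      conc;            /* simultaneous calls allowed ("conc") */
    size_t      astack_size;     /* bytes per A-stack ("A_stack_sz") */
} pd_t;

/* Hypothetical linkage record: caller state saved by the kernel per call. */
typedef struct {
    void *caller_sp;             /* Csp */
    void *caller_ra;             /* Cra */
    int   in_use;
} linkage_t;

/* Hypothetical binding object handed back to the client by import. */
typedef struct {
    const pd_t *pd;
    char       *astacks;         /* conc * astack_size bytes */
    linkage_t  *links;           /* one linkage record per A-stack */
} binding_t;

/* "Kernel" side of import: find the procedure in the PDL, then allocate
 * its A-stacks and linkage records. Returns NULL if not exported. */
static binding_t *lrpc_import(const pd_t *pdl, size_t n, const char *proc) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(pdl[i].name, proc) != 0) continue;
        binding_t *b = malloc(sizeof *b);
        if (!b) return NULL;
        b->pd = &pdl[i];
        b->astacks = calloc(pdl[i].conc, pdl[i].astack_size);
        b->links = calloc(pdl[i].conc, sizeof *b->links);
        if (!b->astacks || !b->links) {
            free(b->astacks); free(b->links); free(b);
            return NULL;
        }
        return b;
    }
    return NULL;
}

/* Release everything allocated by lrpc_import. */
static void lrpc_unbind(binding_t *b) {
    if (!b) return;
    free(b->astacks); free(b->links); free(b);
}
```

With the slide's PDL entry (`conc` of 1, `A_stack_sz` of 12), `lrpc_import` would allocate one 12-byte A-stack and one linkage record. Doing this allocation once at bind time is what lets each later call skip buffer management.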
2) Calling

  Server code: set_name, as on the "1) Binding" slide.

  [Figure: same layout as the binding slide, now during a call to set_name("foo"). Labels: client stack main: argc, argv and set_name: "foo"; A-stack (12 bytes) at 0x74a28 holding "foo"; trap arguments &BindObj, 0x74a28, set_name; linkage record LR{Csp,Cra}; server stack server_stub: and set_name: 0x761c2; A-stack contents shown as "foo", 0; PDL(S) set_name{addr:0x54320, conc:1, A_stack_sz:12}; BindObj.]

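The call path on this slide can be approximated in a single process. This is a sketch, not the LRPC implementation: `binding_t`, `lrpc_transfer`, and `set_name_stub` are hypothetical names, the `in_use` flag stands in for the linkage record LR{Csp,Cra}, and the domain switch itself is not modeled. The server procedure is the slide's `set_name`, restructured to copy as soon as a terminator is found.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define ASTACK_SIZE 12       /* A_stack_sz from the slide's PDL entry */

/* Server state and procedure, following the slide. */
static char name[8];

static int set_name(char *newname) {
    for (int i = 0; i < 8; i++) {
        if (newname[i] == '\0') {      /* fits in name[8] with terminator */
            strcpy(name, newname);
            return 0;
        }
    }
    return -EINVAL;                    /* no terminator in first 8 bytes */
}

/* Hypothetical binding state: one A-stack and one in-use flag. */
typedef struct {
    int  (*entry)(char *);
    char   astack[ASTACK_SIZE];        /* shared by client and server */
    int    in_use;                     /* stands in for LR{Csp,Cra} */
} binding_t;

/* "Kernel" transfer: validate the binding and A-stack, mark the call in
 * progress, run the server procedure directly on the A-stack, release. */
static int lrpc_transfer(binding_t *b, char *astack) {
    if (b == NULL || b->entry == NULL) return -EINVAL;    /* bad binding */
    if (astack != b->astack || b->in_use) return -EINVAL; /* bad A-stack */
    b->in_use = 1;
    int rc = b->entry(astack);
    b->in_use = 0;
    return rc;
}

/* Client stub: copy the argument onto the A-stack once, then "trap". */
static int set_name_stub(binding_t *b, const char *newname) {
    size_t n = strlen(newname) + 1;
    if (n > ASTACK_SIZE) return -EINVAL;   /* does not fit on the A-stack */
    memcpy(b->astack, newname, n);
    return lrpc_transfer(b, b->astack);
}
```

The argument is copied once, from the client's memory onto the A-stack, and the server reads it in place. Because the A-stack stays writable by the client during the call, a server that validates an argument and then uses it in place, as `set_name` does, is trusting the client not to change it in between.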
Data copying
Questions
  - Is fast IPC still important?
  - Are the ideas here useful for VMs?
  - Just how safe are servers from clients?