Distributed (Operating) Systems
Communication in Distributed Systems
Fall 2011, Kocaeli University, Computer Engineering Department

Communication in Distributed Systems
Communication is done through message passing
Expressing communication through message passing is harder than using primitives based on shared memory
Remote Procedure Calls
  – Transparency, but poor for passing references
  – Ideal for client-server applications
Message-oriented Communication
Stream-oriented Communication
  – Continuous media

Communication Between Processes
Unstructured communication
  – Use shared memory or shared data structures
Structured communication
  – Use explicit messages (IPCs)
Distributed Systems: both need low-level communication support (why?)

Types of Communication -1
Transient Communication
  – A message is stored by the communication system only as long as the sending and the receiving applications are executing
  – Typically, all transport-level communication is transient
Persistent Communication
  – A message that has been submitted for transmission is kept by the middleware for as long as it takes to deliver it to the receiver
  – The receiving application does not need to be executing when the message is submitted

Types of Communication -2
Asynchronous Communication
  – Sender continues immediately after it has submitted its message for transmission
  – This means the message is temporarily stored by the middleware upon submission
Synchronous Communication
  – Sender is blocked until its request is known to be accepted
  – Three points at which the sender can be unblocked:
      1. The middleware confirms that it has accepted the message for transmission
      2. The message has been delivered to the receiver
      3. The receiver has processed the message and returned a response

Types of Communication -3
Connection-oriented (like a telephone call)
  – TCP
Connectionless (like dropping a letter in a mailbox)
  – IP, UDP

Persistence and Synchronicity in Communication
Persistence
Persistent communication
  – Messages are stored until the (next) receiver is ready
  – Examples: email, pony express

Transient Communication
Transient communication
  – Message is stored only as long as the sending and receiving applications are executing
  – The message is discarded if it cannot be delivered to the next server or to the receiver
  – Example: transport-level communication services offer transient communication
  – Example: a typical network router discards a message if it cannot deliver it to the next router or to the destination

Synchronicity
Asynchronous communication
  – Sender continues immediately after it has submitted the message
  – Needs a local buffer at the sending host
Synchronous communication
  – Sender blocks until the message is stored in a local buffer at the receiving host or is actually delivered to the receiver
  – Variant: block until the receiver processes the message
Six combinations of persistence and synchronicity

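The "local buffer at the sending host" can be pictured with a small sketch. The code below is only an illustration under assumed names and sizes (send_queue, async_send, dequeue_for_transmission, QUEUE_SLOTS and MSG_MAX are all hypothetical): the sender copies its message into a ring buffer and returns at once, and the communication system drains the buffer later.

```c
#include <stddef.h>
#include <string.h>

#define QUEUE_SLOTS 4   /* assumed capacity of the sender-side buffer */
#define MSG_MAX     64  /* assumed maximum message size in bytes */

/* Hypothetical sender-side buffer: a small ring of fixed-size slots. */
struct send_queue {
    char   data[QUEUE_SLOTS][MSG_MAX];
    size_t len[QUEUE_SLOTS];
    size_t head;   /* index of the oldest queued message */
    size_t count;  /* number of messages currently queued */
};

/* Asynchronous send: copy the message into the local buffer and return
 * at once. Returns 0 on success, -1 if the message is too large or the
 * buffer is full (a real system might block or drop the message here). */
int async_send(struct send_queue *q, const void *msg, size_t len)
{
    if (len > MSG_MAX || q->count == QUEUE_SLOTS)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_SLOTS;
    memcpy(q->data[tail], msg, len);
    q->len[tail] = len;
    q->count++;
    return 0;
}

/* Called later by the communication system: remove the oldest message so
 * it can be transmitted. Returns its length, or -1 if nothing is queued
 * or the output buffer is too small. */
int dequeue_for_transmission(struct send_queue *q, void *out, size_t cap)
{
    if (q->count == 0 || q->len[q->head] > cap)
        return -1;
    size_t len = q->len[q->head];
    memcpy(out, q->data[q->head], len);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return (int)len;
}
```

A synchronous variant would differ only in that async_send would not return until the message had left the buffer (or until the receiver had answered).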
Persistence and Synchronicity Combinations
a) Persistent asynchronous communication (e.g., email)
b) Persistent synchronous communication
c) Transient asynchronous communication (e.g., UDP)
d) Receipt-based transient synchronous communication
e) Delivery-based transient synchronous communication (e.g., asynchronous RPC)
f) Response-based transient synchronous communication (RPC)

Layered Protocols 1
Communication messaging rules are defined in protocols
Due to the absence of shared memory, all communication in distributed systems is based on sending and receiving (low-level) messages
Many different agreements are needed
  – Character encoding: IBM's EBCDIC or ASCII
  – How many volts should be used to signal a 0 bit
  – How does the receiver know which is the last bit
  – How can the receiver detect if a message has been damaged

Layered Protocols 2
In the OSI model, communication is divided into 7 layers
The ISO/OSI protocols were never widely used and are essentially dead
In contrast, the protocols developed for the Internet, such as TCP and IP, are the ones mostly used

Communication Protocols
Communicating processes must adhere to a set of rules, known as protocols
Protocols are agreements/rules on communication
Protocols can be connection-oriented or connectionless

Layered Protocols A typical message as it appears on the network.
Layers -1
Physical layer
  – Transmitting 0s and 1s
  – How many bits per second are transferred
  – Whether transmission can take place in both directions
Data link layer
  – Puts a special bit pattern at the start and end of each frame
  – Computes a checksum
Network layer
  – Routing

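As a rough illustration of what the data link layer does, the sketch below wraps a payload between two delimiter bytes and appends a checksum. It is not any real link protocol: the flag value 0x7E, the 8-bit additive checksum, and the function names are assumptions chosen for brevity, and the sketch assumes the payload never contains the flag byte (real protocols use bit or byte stuffing for that).

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_FLAG 0x7E  /* assumed delimiter byte */

/* Simple additive checksum: sum of all payload bytes modulo 256. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return sum;
}

/* Build FLAG | payload | checksum | FLAG into out.
 * Returns the frame length, or 0 if out is too small. */
size_t build_frame(const uint8_t *payload, size_t len, uint8_t *out, size_t cap)
{
    if (cap < len + 3)
        return 0;
    out[0] = FRAME_FLAG;
    for (size_t i = 0; i < len; i++)
        out[1 + i] = payload[i];
    out[1 + len] = checksum8(payload, len);
    out[2 + len] = FRAME_FLAG;
    return len + 3;
}

/* Receiver side: returns 1 if the frame is delimited correctly and the
 * stored checksum matches the payload, 0 otherwise. */
int frame_is_valid(const uint8_t *frame, size_t len)
{
    if (len < 3 || frame[0] != FRAME_FLAG || frame[len - 1] != FRAME_FLAG)
        return 0;
    return checksum8(frame + 1, len - 3) == frame[len - 2];
}
```

If a bit of the payload is flipped in transit, frame_is_valid returns 0 and the receiver can discard the frame or ask for a retransmission.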
Layers -2
Transport layer
  – Turns the underlying network into something that an application developer can use
  – Messages from the application are broken into small pieces (packets)
  – Keeps track of which packets have been sent, which have been received, and which should be retransmitted
  – Connection-oriented: messages arrive in the same order (as in TCP)
  – Connectionless: messages can arrive in a different order (as in UDP)
  – Another example of a transport protocol is RTP
  – The combination TCP/IP is now the de facto standard for network communication

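The idea of breaking a message into packets and restoring the original order can be sketched as follows. This is a simplified illustration, not how TCP is implemented: the packet layout, the 4-byte payload size, and the names segment and reassemble are assumptions, and loss and retransmission are left out.

```c
#include <stddef.h>
#include <string.h>

#define PKT_PAYLOAD 4  /* assumed tiny payload size, to keep the demo short */

/* Hypothetical packet layout: a sequence number plus a small payload. */
struct packet {
    unsigned seq;               /* position of this piece in the message */
    size_t   len;               /* number of valid bytes in data */
    char     data[PKT_PAYLOAD];
};

/* Sender side: split msg into packets. Returns the number of packets
 * written, or 0 if max_pkts is too small. */
size_t segment(const char *msg, size_t len, struct packet *pkts, size_t max_pkts)
{
    size_t n = (len + PKT_PAYLOAD - 1) / PKT_PAYLOAD;
    if (n > max_pkts)
        return 0;
    for (size_t i = 0; i < n; i++) {
        size_t off = i * PKT_PAYLOAD;
        size_t chunk = (len - off < PKT_PAYLOAD) ? len - off : PKT_PAYLOAD;
        pkts[i].seq = (unsigned)i;
        pkts[i].len = chunk;
        memcpy(pkts[i].data, msg + off, chunk);
    }
    return n;
}

/* Receiver side: place each packet at the offset given by its sequence
 * number, so the arrival order does not matter. Returns the number of
 * bytes placed, or 0 if a packet would not fit into out. */
size_t reassemble(const struct packet *pkts, size_t n, char *out, size_t cap)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        size_t off = (size_t)pkts[i].seq * PKT_PAYLOAD;
        if (off + pkts[i].len > cap)
            return 0;
        memcpy(out + off, pkts[i].data, pkts[i].len);
        total += pkts[i].len;
    }
    return total;
}
```

With an 11-byte message the sender produces three packets; the receiver rebuilds the same bytes even if the packets are handed to reassemble in a different order.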
Client-Server TCP
Layers -3
Session layer
  – Enhanced version of the transport layer
  – Keeps track of which party is currently talking and provides synchronization
  – Inserts checkpoints, so that after a crash the transfer can go back to the last checkpoint
Presentation layer
  – Concerned with the meaning of the bits

Middleware Protocols
Middleware: a layer that resides between an OS and an application
  – May implement general-purpose protocols that warrant their own layers

OSI Model summary
The model draws a clear distinction between applications, application-specific protocols, and general-purpose protocols
Application-specific protocols
  – FTP
  – HTTP
General-purpose protocols
  – Useful to many applications, but cannot be qualified as transport protocols
  – These protocols fall into the category of middleware protocols

Middleware Protocols 1
Some services are not tied to any specific application, but can instead be integrated into a middleware system as a general service
  – Authentication protocols: proof of a claimed identity
  – Atomicity protocols: widely applied in transactions
  – Distributed locking protocols: resources can be protected against simultaneous access

Middleware Protocols 2
Middleware also supports high-level communication services
  – Remote Procedure Call (RPC)
  – Message-Oriented Middleware (MOM)
  – RTP: streams for transferring real-time data, such as needed for multimedia applications

Remote Procedure Calls
Goal: make distributed computing look like centralized computing
Allow remote services to be called as procedures
  – Transparency with regard to location, implementation, language
Issues
  – How to pass parameters
  – Bindings
  – Semantics in the face of errors

Example of an RPC No message passing at all is visible to the programmer.
Divide programs up and add communication protocols
Client:
    blah, blah, blah
    bar = add(i,j);
    blah, blah, blah
Server:
    int add(int x, int y) {
        if (x>100) return(y-2);
        else if (x>10) return(y-x);
        else return(x+y);
    }
Client and server communicate through a protocol

RPC Semantics Principle of RPC between a client and server program [Birrell&Nelson 1984]
Other RPC Models
Asynchronous RPC
  – Request-reply behavior is often not needed
  – Server can reply as soon as the request is received and execute the procedure later
Deferred-synchronous RPC
  – Uses two asynchronous RPCs
  – Client needs a reply but cannot wait for it; server sends the reply via another asynchronous RPC
One-way RPC
  – Client does not even wait for an ACK from the server
  – Limitation: reliability is not guaranteed (the client does not know whether the procedure was executed by the server)

Asynchronous RPC
a) The interaction between client and server in a traditional RPC
b) The interaction using asynchronous RPC

Deferred Synchronous RPC A client and server interacting through two asynchronous RPCs
Conventional Procedure Call
count = read(fd, buf, nbytes)
a) Parameter passing in a local procedure call: the stack before the call to read
b) The stack while the called procedure is active

Possible Issues
Calling and called procedures run on different machines
They execute in different address spaces
Parameters and results have to be passed, which can be complicated when the machines are not identical
  – How are integers represented: big-endian or little-endian?
Either or both machines can crash, and each possible failure causes different problems

Parameter Passing
Local procedure parameter passing
  – Call-by-value
  – Call-by-reference: arrays, complex data structures
  – A reference makes no sense in a different address space, so call-by-reference does not carry over to RPC; in RMI, however, passing references is OK
Remote procedure calls simulate this through:
  – Stubs (proxies)
  – Flattening (marshalling)
Related issue: global variables are not allowed in RPCs

Client and Server Stubs
Client makes a procedure call (just like a local procedure call) to the client stub
Server is written as a standard procedure
Stubs take care of packaging arguments and sending messages
Packaging parameters is called marshalling
A stub compiler generates the stubs automatically from specs in an Interface Definition Language (IDL)
  – Simplifies the programmer's task

Steps of a Remote Procedure Call
1. Client procedure calls client stub in the normal way
2. Client stub builds message, calls local OS
3. Client's OS sends message to remote OS
4. Remote OS gives message to server stub
5. Server stub unpacks parameters, calls server
6. Server does work, returns result to the stub
7. Server stub packs it in message, calls local OS
8. Server's OS sends message to client's OS
9. Client's OS gives message to client stub
10. Stub unpacks result, returns to client

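The ten steps can be traced in a single-process sketch. Everything here is a simplified stand-in: the add() procedure is the one from the earlier example slide, while rpc_add, server_stub, transport, pack32 and unpack32 are hypothetical names, and the "network" is just a function call that hands the request buffer to the server stub. Real stubs would be generated by a stub compiler and the message would cross two operating systems.

```c
#include <stdint.h>

/* Step 6: the remote procedure itself (server side). */
static int add(int x, int y)
{
    if (x > 100) return y - 2;
    else if (x > 10) return y - x;
    else return x + y;
}

/* Marshalling helpers: 32-bit values in big-endian order. */
static void pack32(uint8_t *b, int32_t v)
{
    uint32_t u = (uint32_t)v;
    b[0] = (uint8_t)(u >> 24);
    b[1] = (uint8_t)(u >> 16);
    b[2] = (uint8_t)(u >> 8);
    b[3] = (uint8_t)u;
}

static int32_t unpack32(const uint8_t *b)
{
    uint32_t u = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                 ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    return (int32_t)u;
}

/* Steps 5 to 7: server stub unpacks the parameters, calls the server
 * procedure, and packs the result into a reply message. */
static void server_stub(const uint8_t request[8], uint8_t reply[4])
{
    int32_t x = unpack32(request);
    int32_t y = unpack32(request + 4);
    pack32(reply, add(x, y));
}

/* Steps 3, 4, 8 and 9: stand-in for both operating systems and the
 * network. Here it simply hands the request to the server stub. */
static void transport(const uint8_t request[8], uint8_t reply[4])
{
    server_stub(request, reply);
}

/* Steps 1, 2 and 10: client stub. To the caller it looks like add(). */
int rpc_add(int x, int y)
{
    uint8_t request[8], reply[4];
    pack32(request, x);
    pack32(request + 4, y);
    transport(request, reply);
    return unpack32(reply);
}
```

The client code would simply write bar = rpc_add(i, j); no message passing is visible at the call site.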
Marshalling
Problem: different machines have different data formats
  – Intel: little endian, SPARC: big endian
Solution: use a standard representation
  – Example: external data representation (XDR)

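One way to obtain a standard representation for integers is to write the bytes out explicitly in an agreed order instead of copying memory. The sketch below uses big-endian order; the function names put_u32_be and get_u32_be are hypothetical. Because it works with shifts rather than with the in-memory layout, it should behave the same on little-endian and big-endian hosts.

```c
#include <stdint.h>

/* Write v into buf in big-endian order, independent of host byte order. */
void put_u32_be(uint8_t buf[4], uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);  /* most significant byte first */
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;          /* least significant byte last */
}

/* Read a big-endian 32-bit value back into the host representation. */
uint32_t get_u32_be(const uint8_t buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

For example, the value 0x01020304 is always sent as the bytes 01 02 03 04, whichever kind of machine does the sending.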
Binding
Problem: how does a client locate a server?
  – Use bindings
Server
  – Export server interface during initialization
  – Send name, version number, unique identifier, and handle (address) to the binder
Client
  – First RPC: send a message to the binder to import the server interface
  – Binder: check whether the server has exported the interface
  – If so, return the handle and unique identifier to the client

Binder: Port Mapper
Server start-up: create port
Server stub calls svc_register to register prog #, version # with the local port mapper
Port mapper stores prog #, version #, and port
Client start-up: call clnt_create to locate the server port
Upon return, client can call procedures at the server

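The bookkeeping a port mapper performs can be pictured as a small lookup table. The sketch below is only a model of that table, not the Sun RPC library: pm_register and pm_lookup are hypothetical names (they are not svc_register or clnt_create), the table size is an assumption, and a real port mapper is a separate process that is contacted over the network.

```c
#include <stddef.h>

#define MAX_SERVICES 16  /* assumed table size */

/* One entry per registered service: (program, version) -> port. */
struct mapping {
    unsigned long  prog;
    unsigned long  vers;
    unsigned short port;
};

struct port_map {
    struct mapping entries[MAX_SERVICES];
    size_t count;
};

/* Server start-up: record which port serves (prog, vers).
 * Returns 0 on success, -1 if the table is full. */
int pm_register(struct port_map *pm, unsigned long prog,
                unsigned long vers, unsigned short port)
{
    if (pm->count == MAX_SERVICES)
        return -1;
    pm->entries[pm->count].prog = prog;
    pm->entries[pm->count].vers = vers;
    pm->entries[pm->count].port = port;
    pm->count++;
    return 0;
}

/* Client start-up: look up the port for (prog, vers).
 * Returns the port, or 0 if no such service is registered. */
unsigned short pm_lookup(const struct port_map *pm,
                         unsigned long prog, unsigned long vers)
{
    for (size_t i = 0; i < pm->count; i++) {
        if (pm->entries[i].prog == prog && pm->entries[i].vers == vers)
            return pm->entries[i].port;
    }
    return 0;
}
```

Once the lookup returns a port, the client knows where to send its requests for that program and version.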
Case Study: SUNRPC
One of the most widely used RPC systems
Developed for use with NFS
Built on top of UDP or TCP
Multiple arguments are marshalled into a single structure
At-least-once semantics if a reply is received, at-least-zero semantics if no reply; with UDP, tries at-most-once
Uses Sun's eXternal Data Representation (XDR)
  – Big-endian order for 32-bit integers; handles arbitrarily large data structures
XDR was originally designed for specifying external data representations
XDR has been extended to become the Sun RPC IDL
An interface contains a program number, version number, procedure definitions, and required type definitions

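To give a feel for what an XDR encoding looks like on the wire, the sketch below encodes a string by hand: a 4-byte big-endian length, the bytes themselves, then zero padding up to a multiple of 4 bytes. The function name encode_xdr_string is hypothetical and this is not the Sun XDR library; in practice the conversion routines are generated by rpcgen or taken from the library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode a C string in the XDR string layout: 4-byte big-endian length,
 * the bytes, then zero padding to a 4-byte boundary.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t encode_xdr_string(const char *s, uint8_t *out, size_t cap)
{
    size_t len = strlen(s);
    size_t padded = (len + 3) & ~(size_t)3;  /* round up to a multiple of 4 */

    if (cap < 4 || cap - 4 < padded)
        return 0;

    uint32_t n = (uint32_t)len;
    out[0] = (uint8_t)(n >> 24);
    out[1] = (uint8_t)(n >> 16);
    out[2] = (uint8_t)(n >> 8);
    out[3] = (uint8_t)n;

    memcpy(out + 4, s, len);
    memset(out + 4 + len, 0, padded - len);  /* pad bytes are zero */
    return 4 + padded;
}
```

The 5-character string "hello" therefore occupies 12 bytes: 4 for the length, 5 for the characters, and 3 of padding.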
Rpcgen: generating stubs
Q_xdr.c: does the XDR conversion
Detailed example: later in this course