Distributed Systems CS 15-440 Communication Paradigms (Cont’d) & Naming Lecture 7, Sep 15, 2014 Mohammad Hammoud.

Distributed Systems CS 15-440 Communication Paradigms (Cont’d) & Naming Lecture 7, Sep 15, 2014 Mohammad Hammoud

Today…  Last Session:  A Complete Routing Example using the Bellman-Ford Algorithm  Communication in Distributed Systems  Inter-Process Communication  Today’s Session:  Part I: Communication Paradigms  Remote Invocation and Indirect Communication  Part II: Naming  Naming Conventions and Name Resolution Algorithms  Announcements:  P1 design report grades are out  PS1 grades will be out on Wednesday  PS2 is now posted. It is due on Saturday, Sep 27 by 11:59PM  Quiz 1 will be on Thursday, Sep 25 (during the recitation time)  Midterm exam will be on Wednesday, Oct 15 (during the class time)

Today… Part I: Remote Invocation and Indirect Communication

Communication Paradigms Socket communication Remote invocation Indirect communication

Remote Invocation Remote invocation enables an entity to call a procedure that typically executes on an another computer without the programmer explicitly coding the details of communication The underlying middleware will take care of raw-communication Programmer can transparently communicate with remote entity We will study two types of remote invocations: a.Remote Procedure Calls (RPC) b.Remote Method Invocation (RMI)

Machine B – Server Machine A – Client Remote Procedure Calls (RPC) RPC enables a sender to communicate with a receiver using a simple procedure call No communication or message-passing is visible to the programmer Basic RPC Approach int add(int x, int y) { return x+y; } … add(a,b) ; … Client Stub Server Stub (Skeleton) Communication Module Client Program Server Procedure Communication Module Client process Server process Request Response

Challenges in RPC Parameter passing via Marshaling Procedure parameters and results have to be transferred over the network as bits Data representation Data representation has to be uniform Architecture of the sender and receiver machines may differ

Parameter Passing via Marshaling Packing parameters into a message that will be transmitted over the network is called parameter marshalling The parameters to the procedure and the result have to be marshaled before transmitting them over the network Two types of parameters can passed 1. Value parameters 2. Reference parameters

1. Passing Value Parameters Value parameters have complete information about the variable, and can be directly encoded into the message e.g., integer, float, character Values are passed through call-by-value The changes made by the callee procedure are not reflected in the caller procedure

2. Passing Reference Parameters Passing reference parameters like value parameters in RPC leads to incorrect results due to two reasons: a.Invalidity of reference parameters at the server Reference parameters are valid only within client’s address space Solution: Pass the reference parameter by copying the data that is referenced b.Changes to reference parameters are not reflected back at the client Solution: “Copy/Restore” the data Copy the data that is referenced by the parameter Copy-back the value at server to the client

Challenges in RPC Parameter passing via Marshaling Procedure parameters and results have to be transferred over the network as bits Data representation Data representation has to be uniform Architecture of the sender and receiver machines may differ

Data Representation Computers in DS often have different architectures and operating systems The size of the data-type differ e.g., A long data-type is 4-bytes in 32-bit Unix, while it is 8-bytes in 64-bit Unix systems The format in which the data is stored differ e.g., Intel stores data in little-endian format, while SPARC stores in big-endian format The client and server have to agree on how simple data is represented in the message e.g., format and size of data-types such as integer, char and float

Remote Procedure Call Types Remote procedure calls can be: Synchronous Asynchronous (or Deferred Synchronous)

Synchronous vs. Asynchronous RPCs An RPC with strict request-reply blocks the client until the server returns Blocking wastes resources at the client Asynchronous RPCs are used if the client does not need the result from server The server immediately sends an ACK back to client The client continues the execution after an ACK from the server Synchronous RPCsAsynchronous RPCs

Deferred Synchronous RPCs Asynchronous RPC is also useful when a client wants the results, but does not want to be blocked until the call finishes Client uses deferred synchronous RPCs Single request-response RPC is split into two RPCs First, client triggers an asynchronous RPC on server Second, on completion, server calls-back client to deliver the results

Remote Method Invocation (RMI) In RMI, a calling object can invoke a method on a potentially remote object RMI is similar to RPC, but in a world of distributed objects The programmer can use the full expressive power of object-oriented programming RMI not only allows to pass value parameters, but also pass object references

Remote Objects and Supporting Modules In RMI, objects whose methods can be invoked remotely are known as “remote objects” Remote objects implement remote interface During any method call, the system has to resolve whether the method is being called on a local or a remote object Local calls should be called on a local object Remote calls should be called via remote method invocation Remote Reference Module is responsible for translating between local and remote object references

RMI Control Flow Machine B – Server Machine A – Client Proxy for B Communication Module Skeleton and Dispatcher for B’s class Request Response Remote Obj B Obj A Remote Reference Module

Communication Paradigms Socket communication Remote invocation Indirect communication

Indirect Communication Recall: Indirect communication uses middleware to Provide one-to-many communication Mechanisms eliminate space and time coupling Space coupling: Sender and receiver should know each other’s identities Time coupling: Sender and receiver should be explicitly listening to each other during communication Approach used: Indirection Sender  A Middle-Man  Receiver

Middleware for Indirect Communication Indirect communication can be achieved through: 1.Message-Queuing Systems 2.Group Communication Systems

Message-Queuing (MQ) Systems Message Queuing (MQ) systems provide space and time decoupling between sender and receiver They provide intermediate-term storage capacity for messages (in the form of Queues), without requiring sender or receiver to be active during communication Traditional Request Model Receiver Message-Queuing Model MQ Sender 1. Put message into the queue 2. Get message from the queue ReceiverSender 1. Send message to the receiver

Space and Time Decoupling MQ enables space and time decoupling between sender and receivers Sender and receiver can be passive during communication However, MQ has other types of coupling Sender and receiver have to know the identity of the queue The middleware (queue) should be always active

Space and Time Decoupling (cont’d) Four combinations of loosely-coupled communications are possible in MQ: Recv MQ Sender 1. Sender active; Receiver active Recv MQ Sender 2. Sender active; Receiver passive Recv MQ Sender 3. Sender passive; Receiver active Recv MQ Sender 4. Sender passive; Receiver passive

Interfaces Provided by the MQ System Message Queues enable asynchronous communication by providing the following primitives to the applications: PrimitiveMeaning PUTAppend a message to a specified queue GETBlock until the specified queue is nonempty, and remove the first message POLLCheck a specified queue for messages, and remove the first. Never block NOTIFYInstall a handler (call-back function) to be called when a message is put into the specified queue

Architecture of an MQ System The architecture of an MQ system has to address the following challenges: a.Placement of the Queue Is the queue placed near to the sender or receiver? b.Identity of the Queue How can sender and receiver identify the queue location? c.Intermediate Queue Managers Can MQ be scaled to a large-scale distributed system?

a. Placement of the Queue Each application has a specific pattern of inserting and receiving the messages MQ system is optimized by placing the queue at a location that improves performance Typically, a queue is placed in one of the two locations Source queues: Queue is placed near the source Destination queues: Queue is placed near the destination Examples: “Email Messages” is optimized by the use of destination queues “RSS Feeds” requires source queuing

b. Identity of the Queue In MQ systems, queues are generally addressed by names However, the sender and the receiver should be aware of the network location of the queue A naming service for queues is necessary Database of queue names to network locations is maintained Database can be distributed (similar to DNS)

c. Intermediate Queue Managers Queues are managed by Queue Managers Queue Managers directly interact with sending and receiving processes However, Queue Managers are not scalable in dynamic large-scale Distributed Systems (DSs) Computers participating in a DS may change (thus changing the topology of the DS) There is no general naming service available to dynamically map queue names to network locations Solution: To build an overlay network (e.g., Relays)

c. Intermediate Queue Managers (Cont’d) Relay queue managers (or relays) assist in building dynamic scalable MQ systems Relays act as “routers” for routing the messages from sender to the queue manager Machine A Application 1 Machine B Application Machine C Application Application 2 Relay 1

Middleware for Indirect Communication Indirect communication can be achieved through: 1.Message-Queuing Systems 2.Group Communication Systems

Group Communication Systems Group Communication systems enable one-to-many communication A prime example is multicast communication support for applications Multicast can be supported using two approaches 1.Network-level multicasting 2.Application-level multicasting

1. Network-Level Multicast Each multicast group is assigned a unique IP address Applications “join” the multicast group Multicast tree is built by connecting routers and computers in the group Network-level multicast is not scalable Each DS may have a number of multicast groups Each router on the network has to store information for multicast IP address for each group for each DS Sender Recv 2Recv 1 Recv 3

2. Application-Level Multicast (ALM) ALM organizes the computers involved in a DS into an overlay network The computers in the overlay network cooperate to deliver messages to other computers in the network Network routers do not directly participate in the group communication The overhead of maintaining information at all the Internet routers is eliminated Connections between computers in an overlay network may cross several physical links. Hence, ALM may not be optimal

Building Multicast Tree in ALM Initiator (root) generates a multicast identifier mid If node P wants to join, it sends a join request to the root When request arrives at Q (any node): If Q has not seen a join request before, it becomes a forwarder P becomes child of Q. Join request continues to be forwarded N0 N1 N2 N5N4 N1 N3 N6 N4 N7 = Forwarder = Member = Non-member = Overlay link = Active link for Multicasting

Today… Part II: Naming (Introduction)

Naming Names are used to uniquely identify entities in Distributed Systems Entities may be processes, remote objects, newsgroups, … Names are mapped to entities’ locations using name resolution An example of name resolution Name http://www.cdk5.net:8888/WebExamples/earth.html 55.55.55.55 WebExamples/earth.html8888 DNS Lookup 02:60:8c:02:b0:5a Resource ID (IP Address, Port, File Path) MAC address Host

Names, Addresses and Identifiers An entity can be identified by three types of references 1.Name A name is a set of bits or characters that references an entity Names can be human-friendly (or not) 2.Address Every entity resides on an access point, and access point has an address Addresses may be location-dependent (or not) e.g., IP Address + Port 3.Identifier Identifiers are names that uniquely identify entities A true identifier is a name with the following properties: a.An identifier refers to at-most one entity b.Each entity is referred to by at-most one identifier c.An identifier always refers to the same entity (i.e. it is never reused)

Naming Systems A naming system is simply a middleware that assists in name resolution Naming systems are classified into three classes based on the type of names used: a.Flat naming b.Structured naming c.Attribute-based naming

Classes of Naming Flat naming Structured naming Attribute-based naming

Flat Naming In Flat Naming, identifiers are simply random bits of strings (known as unstructured or flat names) A flat name does not contain any information on how to locate an entity We will study four types of name resolution mechanisms for flat names: 1.Broadcasting 2.Forwarding pointers 3.Home-based approaches 4.Distributed Hash Tables (DHTs)

Approach: Broadcast the name/address to the complete network. The entity associated with the name responds with its current identifier Example: Address Resolution Protocol (ARP) Resolve an IP address to a MAC address In this application, IP address is the address of the entity MAC address is the identifier of the access point Challenges: Not scalable in large networks This technique leads to flooding the network with broadcast messages Requires all entities to listen to all requests 1. Broadcasting Who has the address 192.168.0.1? I am 192.168.0.1. My identifier is 02:AB:4A:3C:59:85

2. Forwarding Pointers Forwarding Pointers enable locating mobile entities Mobile entities move from one access point to another When an entity moves from location A to location B, it leaves behind (at A) a reference to its new location at B Name resolution mechanism Follow the chain of pointers to reach the entity Update the entity’s reference when the present location is found Challenges: Reference to at-least one pointer is necessary Long chains lead to longer resolution delays Long chains are prone to failure due to broken links

Forwarding Pointers – An Example Stub-Scion Pair (SSP) chains implement remote invocation for mobile entities using forwarding pointers Server stub is referred to as Scion in the original paper Each forwarding pointer is implemented as a pair: (client stub, server stub) The server stub contains a local reference to the actual object or a local reference to another client stub When object moves from A to B, It leaves a client stub at A It installs a server stub at B Process P1Process P2Process P3Process P4 = Client stub = Server stub; n = Process n; = Remote Object; = Caller Object;

3. Home-Based Approaches Each entity is assigned a home node Home node is typically static (has fixed access point and address) Home node keeps track of current address of the entity Entity-home interaction: Entity’s home address is registered at a naming service Entity updates the home about its current address (foreign address) whenever it moves Name resolution Client contacts the home to obtain the foreign address Client then contacts the entity at the foreign location

3. Home-Based Approaches – An example Example: Mobile-IP Home node Mobile entity 1. Update home node about the foreign address 2. Client sends the packet to the mobile entity at its home node 3a. Home node forwards the message to the foreign address of the mobile entity 3b. Home node replies the client with the current IP address of the mobile entity 4. Client directly sends all subsequent packets directly to the foreign address of the mobile entity

3. Home-Based Approaches – Challenges Home address is permanent for an entity’s lifetime If the entity permanently moves, then a simple home-based approach incurs higher communication overhead Connection set-up overheads due to communication between the client and the home can be excessive Consider the scenario where the clients are nearer to the mobile entity than the home entity

Next Class Flat naming (Cont’d) Structured naming Attribute-based naming

Distributed Systems CS 15-440 Communication Paradigms (Cont’d) & Naming Lecture 7, Sep 15, 2014 Mohammad Hammoud.

Similar presentations

Presentation on theme: "Distributed Systems CS 15-440 Communication Paradigms (Cont’d) & Naming Lecture 7, Sep 15, 2014 Mohammad Hammoud."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Distributed Systems CS 15-440 Communication Paradigms (Cont’d) & Naming Lecture 7, Sep 15, 2014 Mohammad Hammoud.

Similar presentations

Presentation on theme: "Distributed Systems CS 15-440 Communication Paradigms (Cont’d) & Naming Lecture 7, Sep 15, 2014 Mohammad Hammoud."— Presentation transcript:

Similar presentations

About project

Feedback