Chapter 7 The Application Layer: Message Passing
Message passing

Message passing is a form of interaction between two or more processes that provides both communication and synchronization: a message can only be received after it has been sent. It is usually provided through two primitives: send (destination, message) and receive (source, message). After a send, the process either continues immediately (non-blocking) or continues only once the message has been received (blocking). A receive can likewise be blocking or non-blocking. Usually the send is non-blocking, so that the sender can send one or more messages to various destinations as quickly as possible. The receive is usually blocking, because the receiving process needs input data before it can do useful work.
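The difference between the two primitives can be sketched with a toy in-process mailbox. This is only an illustration under stated assumptions: the names mailbox, mbox_send and mbox_try_recv, and the sizes MBOX_CAP and MSG_LEN, are hypothetical and do not belong to PVM or MPI. The send queues the message and returns at once; the receive shown here is the non-blocking variant, which reports "nothing yet" instead of waiting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MBOX_CAP 8      /* hypothetical fixed queue capacity */
#define MSG_LEN  32     /* hypothetical maximum message length */

/* A toy in-process mailbox: a ring buffer of fixed-size messages. */
typedef struct {
    char   slots[MBOX_CAP][MSG_LEN];
    size_t head;    /* index of the oldest queued message */
    size_t count;   /* number of queued messages */
} mailbox;

/* Non-blocking send: queue the message and return at once.
 * Returns 1 on success, 0 if the mailbox is full. */
int mbox_send(mailbox *mb, const char *msg)
{
    assert(mb != NULL && msg != NULL);
    if (mb->count == MBOX_CAP)
        return 0;
    size_t tail = (mb->head + mb->count) % MBOX_CAP;
    strncpy(mb->slots[tail], msg, MSG_LEN - 1);
    mb->slots[tail][MSG_LEN - 1] = '\0';
    mb->count++;
    return 1;
}

/* Non-blocking receive: copies the oldest message to out and returns 1,
 * or returns 0 immediately if nothing has been sent yet.
 * A blocking receive would wait at this point instead of returning 0. */
int mbox_try_recv(mailbox *mb, char *out)
{
    assert(mb != NULL && out != NULL);
    if (mb->count == 0)
        return 0;
    memcpy(out, mb->slots[mb->head], MSG_LEN);
    mb->head = (mb->head + 1) % MBOX_CAP;
    mb->count--;
    return 1;
}
```

A receive on an empty mailbox fails, which is the synchronization property in miniature: the message has to be sent before it can be received.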
Parallel Virtual Machine

PVM (Parallel Virtual Machine) is a de facto standard system for message passing; another is MPI (Message Passing Interface).
PVM principles

- User-configured host pool: the hosts are selected by the user for a given run of the PVM program. The host pool may be altered by adding and deleting machines during operation (an important feature for fault tolerance).
- Translucent access to hardware: application programs may either view the hardware environment as an attributeless collection of virtual processing elements or choose to exploit the capabilities of specific machines in the host pool.
- Process-based computation: the unit of parallelism in PVM is a task (often but not always a Unix process). No process-to-processor mapping is implied or enforced by PVM; in particular, multiple tasks may execute on a single processor.
- Explicit message-passing model: tasks cooperate by explicitly sending messages to and receiving messages from one another.
- Heterogeneity support: in terms of machines, networks, and applications. PVM permits messages containing more than one data type to be exchanged between machines having different data representations.
- Multiprocessor support: PVM uses the native message-passing facilities on multiprocessors to take advantage of the underlying hardware. Vendors often supply their own optimized PVM for their systems, which can still communicate with the public PVM version.
Support for

- Resource management: add/delete hosts from a virtual machine
- Process control: spawn/kill tasks dynamically
- Message passing: blocking send, blocking and non-blocking receive, multicast messages
- Dynamic task groups: a task can join or leave a group at any time
- Fault tolerance: the VM automatically detects faults and adjusts
Popular PVM Uses

- Poor man’s supercomputer
  - Beowulf (PC) clusters, Linux, Solaris, NT
  - Cobble together whatever resources you can get
- Metacomputer linking multiple supercomputers
  - Ultimate performance: e.g. nearly 3000 processors and up to 53 supercomputers have been combined
- Education tool
  - Teaching parallel programming
  - Academic and thesis research
Message buffers

int bufid = pvm_initsend( int encoding )

If the user is using only a single send buffer (the typical case), this is the only required buffer routine. The new buffer identifier is returned in bufid. The encoding options are as follows:

- PvmDataDefault: XDR encoding is used by default. This encodes integers, floats, etc. in a machine-independent format, so the message can be read by any machine in a heterogeneous environment.
- PvmDataRaw: no encoding is done. Messages are sent in their original format, so only a machine of the same type can read them.
- PvmDataInPlace: data is left in place to save on packing costs. The buffer contains only the sizes of and pointers to the items to be sent. When pvm_send() is called, the items are copied directly out of the user's memory.
Packing of data

Each of the following C routines packs an array of the given data type into the active send buffer. They can be called multiple times to pack data into a single message, so a message can contain several arrays, each with a different data type. The arguments for each routine are a pointer to the first item to be packed, nitem, which is the total number of items to pack from this array, and stride, which is the stride to use when packing. A stride of 1 means a contiguous vector is packed, a stride of 2 means every other item is packed, and so on.

int info = pvm_pkbyte( char *cp, int nitem, int stride )
int info = pvm_pkcplx( float *xp, int nitem, int stride )
int info = pvm_pkdcplx( double *zp, int nitem, int stride )
int info = pvm_pkdouble( double *dp, int nitem, int stride )
int info = pvm_pkfloat( float *fp, int nitem, int stride )
etc.

There are also tools to ease packing of structures.
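The meaning of nitem and stride can be illustrated without PVM itself. The sketch below uses a hypothetical helper, pack_doubles, that copies into a plain array instead of a PVM send buffer; it only mirrors the indexing described above and is not how PVM implements pvm_pkdouble().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PVM): copy nitem doubles from src into
 * dst, taking every stride-th element starting at src[0].
 * Returns the number of items written to dst. */
size_t pack_doubles(double *dst, const double *src, int nitem, int stride)
{
    assert(dst != NULL && src != NULL && nitem >= 0 && stride >= 1);
    for (int i = 0; i < nitem; i++)
        dst[i] = src[(size_t)i * (size_t)stride];
    return (size_t)nitem;
}
```

With a six-element source array, nitem = 3 and stride = 2 select the elements at indices 0, 2 and 4. Note that the source must hold at least (nitem - 1) * stride + 1 elements.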
Sending and receiving messages

int info = pvm_send( int tid, int msgtag )
int info = pvm_mcast( int *tids, int ntask, int msgtag )

The routine pvm_send() labels the message with a positive integer identifier msgtag and sends it immediately to the task with identifier tid. The routine pvm_mcast() sends the message to all tasks specified in the integer array tids (except the sender itself).

int bufid = pvm_recv( int tid, int msgtag )

This blocking receive routine waits until a message with label msgtag has arrived from tid. A value of -1 in msgtag or tid matches anything (wildcard). It then places the message in a newly created active receive buffer.

A non-blocking version (pvm_nrecv()) and a time-out version (pvm_trecv()) of receive are also provided. With pvm_probe() the receive queue can be checked to see whether messages of a certain type or from a certain sender have arrived. Functions that combine packing with sending (pvm_psend()) and receiving with unpacking (pvm_precv()) are also provided. On multiprocessor systems they can often make better use of the native message-passing facilities, so they can be faster.
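The wildcard rule for a receive can be written down as a small predicate. This is a sketch of the matching logic only; the helper recv_matches and the macro ANY are hypothetical names, and PVM's internal queue handling is not shown.

```c
#include <assert.h>

#define ANY (-1)   /* wildcard value, as described for pvm_recv() */

/* Hypothetical helper: does a queued message (sent by msg_tid, labelled
 * msg_tag) satisfy a receive request for (want_tid, want_tag)? */
int recv_matches(int want_tid, int want_tag, int msg_tid, int msg_tag)
{
    int tid_ok = (want_tid == ANY) || (want_tid == msg_tid);
    int tag_ok = (want_tag == ANY) || (want_tag == msg_tag);
    return tid_ok && tag_ok;
}
```

A request with both arguments set to -1 therefore accepts the next message from any sender with any label, while a request with only one wildcard still filters on the other field.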
XPVM Graphical Console and Monitor

XPVM provides a graphical interface to the PVM console commands and information, along with several animated views to monitor the execution of PVM programs. These views assist in debugging and performance tuning.

XPVM provides point-and-click access to the PVM console commands. A pull-down menu allows users to add or delete hosts to configure the virtual machine. Tasks can be spawned using a dialog box that prompts for all spawn options, including the trace mask that determines which PVM routines to trace for XPVM.

The Active color implies that at least one task on that host is busy executing useful work. The System color means that at least one task is busy executing PVM system routines.
Space-time view

The Space-Time View shows the status of individual tasks as they execute across all hosts. The Computing color shows those times when the task is busy executing useful user computations. The Overhead color marks the places where the task executes PVM system routines for communication, task control, etc. The Waiting color indicates those time periods spent waiting for messages from other tasks.
Remote Procedure Calls

For the calling program on the client, a remote procedure call looks like a normal function call, e.g. int func (parameter list). But instead of a local function, a stub function is called, which passes the information to a similar stub function on the server machine. There the function func (parameter list) is called, and its return value is transported back to the calling program on the client. Similar approaches are now also available for object-oriented languages and systems.
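The stub mechanism can be sketched in a single process. In the illustration below, everything is an assumption made for the example: add stands in for func, add_impl is the server-side implementation, and a direct call to server_stub replaces the network round trip. A real system would also encode the parameters portably rather than copying raw bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The "remote" procedure as implemented on the server (hypothetical). */
static int32_t add_impl(int32_t a, int32_t b)
{
    return a + b;
}

/* Server stub: unpack the request, call the real function, pack the reply. */
static void server_stub(const unsigned char *req, unsigned char *reply)
{
    int32_t a, b, result;
    memcpy(&a, req, sizeof a);
    memcpy(&b, req + sizeof a, sizeof b);
    result = add_impl(a, b);
    memcpy(reply, &result, sizeof result);
}

/* Client stub: has the signature of an ordinary local function, but only
 * marshals the parameters, hands them over, and unmarshals the result. */
int32_t add(int32_t a, int32_t b)
{
    unsigned char req[2 * sizeof(int32_t)];
    unsigned char reply[sizeof(int32_t)];
    int32_t result;

    memcpy(req, &a, sizeof a);
    memcpy(req + sizeof a, &b, sizeof b);
    server_stub(req, reply);   /* stands in for the network round trip */
    memcpy(&result, reply, sizeof result);
    return result;
}
```

The caller simply writes add(2, 3); nothing at the call site reveals that the work is done behind a pair of stubs.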
SUN RPC

One of the first commercially available examples. It is used in the Network File System (NFS), allowing workstations to easily use file systems on other workstations or servers. SUN RPC consists of the following parts:

1. RPCGEN: a compiler that takes the definition of a remote procedure interface in a C-like language and generates the client stubs and the server stubs.
2. XDR (eXternal Data Representation): a standard way of encoding data in a portable fashion between different systems. It imposes a big-endian byte ordering, and the minimum size of any field is 32 bits. This means that both the client and the server may have to perform some translation.
3. A run-time library.

The RPC protocol can be implemented on any transport protocol. In the case of TCP/IP, it can use either TCP or UDP as the transport vehicle. If UDP is used, remember that it does not provide reliability, so it is up to the calling program itself to ensure this (using timeouts and retransmissions, usually implemented in RPC library routines). Note that even with TCP, the calling program still needs a timeout routine to deal with exceptional situations such as a server crash.
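The two XDR rules mentioned above, big-endian byte order and a 32-bit minimum field size, can be sketched as follows. The helper names encode_u32_be, decode_u32_be and padded_len are hypothetical and are not the routines of the XDR library; padded_len assumes the XDR convention that variable-length data is padded to a multiple of four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit value most significant byte first (big-endian),
 * independent of the byte order of the host. */
void encode_u32_be(unsigned char *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Read a 32-bit big-endian value back into host representation. */
uint32_t decode_u32_be(const unsigned char *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Round a byte count up to the next multiple of 4, the XDR unit size. */
size_t padded_len(size_t n)
{
    return (n + 3) & ~(size_t)3;
}
```

Because the shifts operate on values rather than on memory layout, the same code produces identical bytes on little-endian and big-endian hosts, which is the translation step both sides may have to perform.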
RPC message

The RPC call message consists of several fields:

- Identification: the remote program number identifies a functional group of procedures, for instance a file system, which would include individual procedures like read and write. The individual procedures are identified by a unique procedure number. As the remote program evolves, a version number is assigned to the different releases.
- Authentication fields: two fields, credentials and verifier, are provided for the authentication of the caller to the service. It is up to the server to use this information for user authentication. Some authentication protocols are null authentication, UNIX authentication, and DES authentication.
- Procedure parameters: data (parameters) in XDR format passed to the remote procedure.

Portmap is a server application (on port 111) that maps a program number and its version number to the port number (TCP or UDP) used by the program.
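The mapping that Portmap maintains can be pictured as a small lookup table. The sketch below is an in-memory illustration only: pmap_entry, pmap_table and pmap_lookup are hypothetical names, and the two entries (the portmapper itself and NFS with their customary program numbers and ports) are example data, not a query against a running server.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One registration: (program number, version number) -> port. */
typedef struct {
    uint32_t prog;
    uint32_t vers;
    uint16_t port;
} pmap_entry;

/* Example registrations, assumed for illustration. */
static const pmap_entry pmap_table[] = {
    { 100000u, 2u, 111u  },   /* portmapper */
    { 100003u, 3u, 2049u },   /* NFS */
};

/* Return the registered port, or 0 if the program/version pair is unknown. */
uint16_t pmap_lookup(uint32_t prog, uint32_t vers)
{
    size_t n = sizeof pmap_table / sizeof pmap_table[0];
    for (size_t i = 0; i < n; i++) {
        if (pmap_table[i].prog == prog && pmap_table[i].vers == vers)
            return pmap_table[i].port;
    }
    return 0;
}
```

A client would first ask for the port of the program and version it wants, then send the actual RPC call message to that port.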
Client and server code
RMI (Remote Method Invocation)

- Objects are remote if they reside in a different JVM (Java Virtual Machine).
- If the marshalled parameters are local objects, they are passed by copy using object serialization. These objects must implement the java.io.Serializable interface. Remote objects are passed by reference.
- The stub and skeleton class files are generated using the rmic compiler.
- Remote objects are defined by first declaring the interface that specifies the methods that may be invoked remotely. This interface must extend java.rmi.Remote, and each method must throw java.rmi.RemoteException.
- The implementation class must extend java.rmi.server.UnicastRemoteObject, allowing the creation of a single remote object that listens for network requests using RMI's default scheme of sockets for network communication.
- A client can get a reference to a remote object using the Naming.lookup(name) method, where the name is URL-like: rmi://host/objectName.
CORBA

CORBA (Common Object Request Broker Architecture) allows heterogeneous client and server applications to communicate, e.g. a C++ program accessing a database written in COBOL. The stubs and skeletons are generated by an IDL (Interface Definition Language) compiler. Microsoft has its own standard, the Component Object Model (COM), which is the basis for Object Linking and Embedding (OLE).
(A)synchronous RPC