Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 1 Introduction Elliott mods with Comic Sans font 1.0
Definition of a Distributed System (1)
A distributed system is a collection of independent computers that appears to its users as a single coherent system.

Definition of a Distributed System (2)
Figure 1-1. A distributed system organized as middleware. The middleware layer extends over multiple machines, and offers each application the same interface.

Transparency in a Distributed System
Figure 1-2. Different forms of transparency in a distributed system (ISO, 1995).

Transparency is not always good
It might be useful to know that the T3 server in your local city is down and that you are now connected over a phone line in Kenya. That way you can make an informed decision about whether you are getting good service.
Open system
Offer services.
Publish the interface requirements, often captured in an IDL (Interface Definition Language).
The rules of an IDL are specified so carefully that they can be used as a language fit for input to a compiler.
But the main purpose is to define the interface, and we could do this with pencil and paper. The rest is serendipity for computer scientists.

IDL: example
Name of procedure.
Arg one is a 32-bit unsigned integer giving the length of arg two.
Arg two is an ASCII character string whose length is given by arg one.

Hello-IDL-World
Procedure name: Hello-IDL-World
Arg one: Length-of-string
Arg two: Hello-Message-String
Client Stub
Could generate this client stub for calling the server. C identifiers cannot contain hyphens, so the generated names use underscores; Remote_opsys_marshaling_call_FB23AA is the slide's placeholder for the RPC runtime entry point.

    /* Supplied by the RPC runtime: marshals the arguments and sends them. */
    extern int Remote_opsys_marshaling_call_FB23AA(unsigned int len, char *msg);

    int Hello_IDL_World(unsigned int len, char *Hello_Message_String)
    {
        return Remote_opsys_marshaling_call_FB23AA(len, Hello_Message_String);
    }

Server Skeleton (Stub)
Could generate this server skeleton stub, called by the (un)marshaling RPC OpSys subsystem on the server:

    int Remote_Hello(unsigned int len, char *HelloMsg)
    {
        int return_val = 0;   /* 0 reports success to the caller */
        /* application code that uses len and HelloMsg goes here */
        return return_val;
    }
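The stubs above leave the marshaling step to the RPC runtime. As a rough illustration of what that step might do for the Hello-IDL-World interface, the sketch below packs the two arguments into a byte buffer and unpacks them again. The function names, the big-endian length prefix, and the buffer layout are assumptions made for this example; they are not taken from the slides or from any particular RPC system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format: 4-byte big-endian length, then the ASCII bytes. */

/* Marshal (len, msg) into buf. Returns bytes written, or 0 if buf is too small. */
size_t marshal_hello(uint32_t len, const char *msg, unsigned char *buf, size_t cap)
{
    assert(msg != NULL && buf != NULL);
    if (cap < 4 || (size_t)len > cap - 4)
        return 0;
    buf[0] = (unsigned char)(len >> 24);
    buf[1] = (unsigned char)(len >> 16);
    buf[2] = (unsigned char)(len >> 8);
    buf[3] = (unsigned char)len;
    memcpy(buf + 4, msg, len);
    return 4 + (size_t)len;
}

/* Unmarshal n bytes from buf into *len and a NUL-terminated string in out.
 * Returns 0 on success, -1 if the message is malformed or out is too small. */
int unmarshal_hello(const unsigned char *buf, size_t n,
                    uint32_t *len, char *out, size_t out_cap)
{
    assert(buf != NULL && len != NULL && out != NULL);
    if (n < 4)
        return -1;
    uint32_t l = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if ((size_t)l > n - 4 || (size_t)l >= out_cap)
        return -1;
    memcpy(out, buf + 4, l);
    out[l] = '\0';
    *len = l;
    return 0;
}
```

Fixing the byte order in the wire format is one way to let machines with different native integer layouts agree on the length field; a real IDL compiler would generate code like this from the interface description rather than have it written by hand.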
Policy separated from Implementation
Not only the interfaces are defined, but also the parts of the system itself.
One part of the local system can then be replaced or updated without affecting the other parts.

Scalability Problems
Figure 1-3. Examples of scalability limitations.

Scalability Problems
Characteristics of decentralized (distributed) algorithms:
No machine has complete information about the system state.
Machines make decisions based only on local information.
Failure of one machine does not ruin the algorithm.
There is no implicit assumption that a global clock exists.
Scaling
Size
Geography
Administration
Example: SE435

Scale-up of SE435
Size: how many students can fit in a classroom?
Size: how many DL students can the server handle?
- Bandwidth for the AV
- Room in the grading links
- Etc.

Scaling of Geography
Instructor drives to remote campuses.
DL: what happens when DL becomes global?
- Timing problems: deadlines are in the middle of the night, fast turn-around mail, foreign customs
- Suppose there is a lecture on security?

Scaling of admin
How to give grades for 4,000 students?
Where are the assignments stored?
Hierarchy of instructors: how do they coordinate?
Administration is often forgotten. Elliott notes that this is often the most critical bottleneck in scaling up DS.
Geography: blocking
Blocking: make a procedure call, and do nothing until the call returns.
This works fine on a local system: if the called procedure fails, the calling procedure is probably failing too, so who cares?
Distributed system: the caller is still alive, but the called procedure on the remote system fails.
Geography: more failures, and a longer wait for the call to return.

Synchronous vs. Asynchronous
Blocking calls are synchronous.
The solution is non-blocking, or asynchronous, calls.
This is a new programming paradigm, and the logic is potentially much more complex.
The idea is that the calling process continues without waiting for the called process to return.
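To make the contrast concrete, here is a minimal sketch of a blocking call next to a non-blocking one, using a POSIX thread to stand in for the remote side. Everything named here (slow_remote_square, async_start, async_wait, sync_call) is hypothetical and exists only for illustration; a real system would send a request over the network instead of starting a local thread. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for a remote procedure (assumption: a real one crosses the network). */
static int slow_remote_square(int x) { return x * x; }

typedef struct {
    pthread_t thread;  /* carries out the call in the background */
    int arg;
    int result;
} async_call_t;

static void *run_call(void *p)
{
    async_call_t *c = p;
    c->result = slow_remote_square(c->arg);
    return NULL;
}

/* Synchronous: the caller does nothing until the call returns. */
int sync_call(int arg) { return slow_remote_square(arg); }

/* Asynchronous: start the call and return at once. Returns 0 on success. */
int async_start(async_call_t *c, int arg)
{
    assert(c != NULL);
    c->arg = arg;
    c->result = 0;
    return pthread_create(&c->thread, NULL, run_call, c);
}

/* Collect the result later; blocks only if the call has not finished yet. */
int async_wait(async_call_t *c)
{
    assert(c != NULL);
    pthread_join(c->thread, NULL);
    return c->result;
}
```

Between async_start and async_wait the caller is free to do other work, which is where the extra complexity comes from: the program now has to remember that a result is outstanding and decide when to collect it.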
Software Concepts
An overview of DOS (Distributed Operating Systems), NOS (Network Operating Systems), and middleware.

System | Description | Main goal
DOS | Tightly-coupled operating system for multiprocessors and homogeneous multicomputers | Hide and manage hardware resources
NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients
Middleware | Additional layer atop NOS implementing general-purpose services | Provide distribution transparency
Scaling Techniques (1)
Figure 1-4. The difference between letting (a) a server or (b) a client check forms as they are being filled.

Scaling Techniques (2)
Figure 1-5. An example of dividing the DNS name space into zones.

Pitfalls when Developing Distributed Systems
False assumptions made by a first-time developer:
The network is reliable.
The network is secure.
The network is homogeneous.
The topology does not change.
Latency is zero.
Bandwidth is infinite.
Transport cost is zero.
There is one administrator.

Cluster Computing Systems
Figure 1-6. An example of a cluster computing system.

Grid Computing Systems
Figure 1-7. A layered architecture for grid computing systems.

Transaction Processing Systems (1)
Figure 1-8. Example primitives for transactions.
Transaction Processing Systems (2)
Characteristic properties of transactions:
Atomic: To the outside world, the transaction happens indivisibly.
Consistent: The transaction does not violate system invariants.
Isolated: Concurrent transactions do not interfere with each other.
Durable: Once a transaction commits, the changes are permanent.

Transaction Processing Systems (2b)
Transfer Sam's money from checking to savings:
Add $3,000 to Sam's savings account.
Remove $3,000 from Sam's checking account.
Between the two steps, there is a time when there is an extra $3,000 in the bank. The transaction must keep that intermediate state invisible to the outside world.
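The transfer of Sam's money can be sketched as an all-or-nothing update. The version below is a deliberately small, single-threaded, in-memory illustration: it works on a private copy and publishes the copy only if every step succeeds, so a caller never observes the state with the extra $3,000. The type and function names are made up for this example, and the sketch says nothing about isolation between concurrent transactions or about durability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    long checking;
    long savings;
} accounts_t;

/* Move amount from checking to savings as one indivisible step.
 * Returns true if the transfer committed, false if it was aborted. */
bool transfer_to_savings(accounts_t *acct, long amount)
{
    assert(acct != NULL);
    accounts_t work = *acct;     /* begin: operate on a private copy */
    work.savings  += amount;     /* the extra money exists only in the copy */
    work.checking -= amount;
    if (amount < 0 || work.checking < 0)
        return false;            /* abort: *acct was never touched */
    *acct = work;                /* commit: publish both changes together */
    return true;
}
```

The invariant here is that checking plus savings is the same before and after the call, whether the transfer commits or aborts.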
Transaction Processing Systems (3)
Figure 1-9. A nested transaction.

Transaction Processing Systems (4)
Figure. The role of a TP monitor in distributed systems.

Shared memory
IPCs (InterProcess Communications) are used to communicate between processes through shared memory. One writes, the other reads, then vice versa.
In a multicomputer OS, if we do not have some emulation of shared memory, we have to send messages.
Messages require buffering. This is the classic producer-consumer problem.
The producer must block if the buffer is full.
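The buffering problem can be sketched with a small fixed-size ring buffer. In this illustration the put and get operations do not sleep; they return false at exactly the points where a producer (buffer full) or a consumer (buffer empty) would have to block. The names, the int message type, and the four-slot capacity are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_SLOTS 4

typedef struct {
    int    slots[BUF_SLOTS];
    size_t head;    /* index of the oldest message */
    size_t count;   /* number of messages currently buffered */
} msg_buffer_t;

/* Producer side. Returns false when the buffer is full: this is the
 * point at which a blocking send would put the producer to sleep. */
bool buffer_put(msg_buffer_t *b, int msg)
{
    assert(b != NULL);
    if (b->count == BUF_SLOTS)
        return false;
    b->slots[(b->head + b->count) % BUF_SLOTS] = msg;
    b->count++;
    return true;
}

/* Consumer side. Returns false when the buffer is empty: this is the
 * point at which a blocking receive would put the consumer to sleep. */
bool buffer_get(msg_buffer_t *b, int *msg)
{
    assert(b != NULL && msg != NULL);
    if (b->count == 0)
        return false;
    *msg = b->slots[b->head];
    b->head = (b->head + 1) % BUF_SLOTS;
    b->count--;
    return true;
}
```

Messages come out in the order they went in. Turning the two false returns into real blocking requires a wait and signal mechanism such as a monitor.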
Multicomputer Operating Systems: hey, where's my IPC?
General structure of a multicomputer operating system. (1.14)

Atomic Action
The unit cannot be broken down further. The term derives from the idea of an atom, which at the time we believed was the basic building block of the universe.
For application systems, this means that any process will be allowed to complete an atomic section of code as though it ran without being swapped out of the processor and superseded by another process, or interleaved with another process at the software level.
Monitor: the toaster example
Wait means go to sleep. Signal means wake someone up.
Suppose you have a four-slot toaster, and ten people in the kitchen who want to drop bread slices into the toaster. If the toaster is full, you have to wait, and while you are waiting, you get some sleep.
If you are done, you wake up the next person who is waiting for a toaster slot.
Count keeps track of how many slots are available: it starts at 4, the maximum is 4, and the minimum is 0.

Monitor: example cont.
When you are done, you increment count to show that another slot is now available.
When you put bread in a slot, you first decrement count. If you cannot because it is already zero, you wait.

Monitor: Increment
If no one is waiting, count = count + 1, showing that you freed a slot.
If someone is waiting, wake them up and leave count alone. This is the same as adding 1 for the slot that you have now freed and subtracting 1 for the person you woke up.

Monitor: Decrement
If a slot is free (count > 0), count = count - 1. You are now using one of the slots.
If there are no slots, then:
- blocked = blocked + 1, showing that you are waiting.
- Wait.
When you wake up, someone else will have already decremented count for you, but you have to set blocked = blocked - 1.
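The toaster monitor can be sketched with a POSIX mutex and condition variable. One deliberate difference from the slides: POSIX condition variables may wake a sleeper spuriously, so this sketch does not hand the freed slot directly to the person it wakes. Instead, the increment always adds 1 to count and the sleeper re-checks count in a loop before taking a slot, which gives the same net effect on count. The type and function names are invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOASTER_SLOTS 4

typedef struct {
    pthread_mutex_t lock;        /* only one person inside the monitor at a time */
    pthread_cond_t  slot_freed;  /* sleepers wait here */
    int count;                   /* free slots: starts at 4, never below 0 */
    int blocked;                 /* how many people are asleep waiting */
} toaster_t;

void toaster_init(toaster_t *t)
{
    assert(t != NULL);
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->slot_freed, NULL);
    t->count = TOASTER_SLOTS;
    t->blocked = 0;
}

/* Decrement: claim a slot, sleeping while none is free. */
void toaster_take_slot(toaster_t *t)
{
    pthread_mutex_lock(&t->lock);
    while (t->count == 0) {      /* loop guards against spurious wakeups */
        t->blocked++;
        pthread_cond_wait(&t->slot_freed, &t->lock);
        t->blocked--;
    }
    t->count--;
    pthread_mutex_unlock(&t->lock);
}

/* Increment: give a slot back and wake one sleeper, if there is one. */
void toaster_free_slot(toaster_t *t)
{
    pthread_mutex_lock(&t->lock);
    assert(t->count < TOASTER_SLOTS);
    t->count++;
    if (t->blocked > 0)
        pthread_cond_signal(&t->slot_freed);
    pthread_mutex_unlock(&t->lock);
}
```

Each of the ten people in the kitchen would call toaster_take_slot before dropping bread in and toaster_free_slot when the toast pops up.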
Test and Set
The naive gatekeeper protocol:
A. Look at gatekeeper.
B. If TRUE (busy), then return to A.
C. If FALSE (free), then
D. Set gatekeeper to TRUE.
E. Complete the critical section.
F. Set gatekeeper back to FALSE.
What can go wrong:
1. Process X completes up to C, then is swapped out.
2. Process Y completes D and enters the critical section, then is swapped out.
3. Process X completes D and enters the critical section too. Big trouble!
The fix is hardware support for an atomic "look and set to TRUE" instruction.
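Steps A through F can be made safe with the atomic look-and-set operation that C11 exposes as atomic_flag_test_and_set, which sets the flag and returns its previous value in one indivisible step. The sketch below is a plain spinlock built on it; the function names are made up for this example, and a busy-waiting loop like this is only reasonable when critical sections are very short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* The gatekeeper: clear means FALSE (free), set means TRUE (busy). */
static atomic_flag gatekeeper = ATOMIC_FLAG_INIT;

/* Steps A to D as one atomic action. Returns true if we got in. */
bool try_enter(void)
{
    /* Sets the flag to TRUE and tells us what it was before. If it was
     * already TRUE, someone else is inside and nothing has changed. */
    return !atomic_flag_test_and_set(&gatekeeper);
}

/* Spin until the gatekeeper was observed free and is now ours. */
void enter(void)
{
    while (!try_enter())
        ;  /* busy: go back to step A */
}

/* Step F: set the gatekeeper back to FALSE. */
void leave(void)
{
    atomic_flag_clear(&gatekeeper);
}

/* Step E wrapped between entry and exit. */
void run_critical_section(void (*body)(void))
{
    assert(body != NULL);
    enter();
    body();
    leave();
}
```

Because the look and the set can no longer be separated, process X in the scenario above cannot be swapped out between seeing FALSE and writing TRUE.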
Monitors in DS: hey, where's my hardware support?
Locked or shared resources may be on a remote machine.
They may involve processes on different machines with different memory spaces.
This is relevant to transactions and distributed transactions, covered later.
Dead processes locking a resource can be a big problem, especially on a remote machine.
No test and set instruction!

Distributed Shared Memory Systems (2)
False sharing of a page between two independent processes. (1.18)

Positioning Middleware
General structure of a distributed system as middleware. (1-22)

Middleware and Openness
In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications. (1.23)
Enterprise Application Integration
Figure. Middleware as a communication facilitator in enterprise application integration.

RPC / RMI / etc. MOM
Problems using send/receive:
The sender and receiver both have to be active.
The sender and receiver have to know each other's address or endpoint.
Buffering concerns must always be considered.
This is the push toward MOMs (message-oriented middleware) that handle some of these problems.
Trade control for convenience.
Multicomputer Operating Systems (1)
Relation between blocking, buffering, and reliable communications.

Synchronization point | Send buffer | Reliable comm. guaranteed?
Block sender until buffer not full | Yes | Not necessary
Block sender until message sent | No | Not necessary
Block sender until message received | No | Necessary
Block sender until message delivered | No | Necessary
Multicomputer Operating Systems (2)
Alternatives for blocking and buffering in message passing. (1.15)

Clients and Servers
General interaction between a client and a server. (1.25)
Comparison between Systems
A comparison between multiprocessor operating systems, multicomputer operating systems, network operating systems, and middleware-based distributed systems.

Item | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS | Middleware-based OS
Degree of transparency | Very high | High | Low | High
Same OS on all nodes | Yes | Yes | No | No
Number of copies of OS | 1 | N | N | N
Basis for communication | Shared memory | Messages | Files | Model specific
Resource management | Global, central | Global, distributed | Per node | Per node
Scalability | No | Moderately | Yes | Varies
Openness | Closed | Closed | Open | Open
Distributed Pervasive Systems
Requirements for pervasive systems (wireless, small, battery-powered, mobile):
Discover the environment.
Embrace contextual changes.
Encourage ad hoc composition.
Recognize sharing as the default.
Personal space vs. shared space.
No central admin.

Electronic Health Care Systems (1)
Questions to be addressed for health care systems:
Where and how should monitored data be stored?
How can we prevent loss of crucial data?
What infrastructure is needed to generate and propagate alerts?
How can physicians provide online feedback?
How can extreme robustness of the monitoring system be realized?
What are the security issues and how can the proper policies be enforced?

Electronic Health Care Systems (2)
Figure. Monitoring a person in a pervasive electronic health care system, using (a) a local hub or (b) a continuous wireless connection.

Sensor Networks (1)
Questions concerning sensor networks:
How do we (dynamically) set up an efficient tree in a sensor network?
How does aggregation of results take place? Can it be controlled?
What happens when network links fail?

Sensor Networks (2)
Figure. Organizing a sensor network database, while storing and processing data (a) only at the operator's site or …

Sensor Networks (3)
Figure. Organizing a sensor network database, while storing and processing data … or (b) only at the sensors.