
1 Outline Announcement Existing distributed systems Final Review

2 Announcement The final exam will be
5:30 – 7:30 pm on April 29 in LOV 103 The exam will be open-book and open-note You can have at most three hours (if necessary)

3 Existing Distributed Systems
There are commercial as well as experimental distributed systems available In general, these systems are not generic distributed systems as defined in this class Rather, they are designed to provide particular services well (with the hope of generating profit from those services) This material is mainly based on chapters from the book by A. S. Tanenbaum and M. van Steen

4 Distributed Object-based Systems
In these systems, everything is treated as an object and clients are offered services and resources in the form of objects they can invoke Examples CORBA Microsoft's DCOM Globe (Global Object-Based Environment)

5 CORBA

6 Distributed Component Object Model

7 Distributed Component Object Model – cont.

8 GLOBE

9 Object Model The general organization of a CORBA system.

10 Distributed Document-based Systems
A simple paradigm Everything is a document A distributed document-based system allows users to exchange information Examples World Wide Web Can be viewed as a huge distributed system consisting of millions of clients and servers Lotus Notes Primarily based on databases

11 World Wide Web WWW is essentially a huge client-server system with millions of servers distributed worldwide Each server maintains a collection of documents Each document is stored as a file Or can be generated on request (ASP, JSP, PHP, ....) The simplest way to refer to a document is by means of a reference called a Uniform Resource Locator (URL) A client interacts with servers through a browser

12 The World Wide Web Overall organization of the Web.

13 Architectural Overview
The principle of using server-side CGI programs.

14 Architectural Overview – cont.
Architectural details of a client and server in the Web.

15 Web Proxy Caching The principle of cooperative caching

16 Server Replication The principal working of the Akamai CDN.

17 Security

18 Lotus Notes The general organization of a Lotus Notes system.

19 Processes in Lotus Notes

20 Processes in Lotus Notes – cont.
Request handling in a cluster of Domino servers.

21 Distributed Coordination-based Systems
A newer generation of distributed systems Various components of a system are inherently distributed One of the major issues is coordination of activities

22 TIB / Rendezvous

23 TIB / Rendezvous – cont.

24 Overview of Jini

25 Jini Architecture

26 Communication Events in Jini

27 Announcement The final exam will be on April 29, 2003
From 5:30 to 7:30 PM (you may stay until 8:30 PM at most) In LOV 103 It is an open-book, open-note exam Please plan to arrive around 5:20 PM so that we can start as early as possible

28 Important Topics These are the main focus for the purpose of the final exam However, other topics are not excluded entirely Important ones Logical clocks/vector clocks Causal ordering of messages Consistent global state/checkpoint Lamport's algorithm for mutual exclusion (Lab 1 also included) Lab 2 Distributed scheduling algorithms / issues Synchronous checkpointing and recovery/asynchronous checkpointing and recovery Commit protocols / non-blocking commit protocols Static voting protocols Access matrix model / Implementation issues

29 Distributed Systems A distributed system is a collection of independent computers that appears to its users as a single coherent system Independence means that the computers do not share memory or a clock The computers communicate with each other by exchanging messages over a communication network Distributed operating systems are much like traditional operating systems Resource management User friendliness The key concept is transparency

30 Clients and Servers General interaction between a client and a server.

31 Layered Protocols Layers, interfaces, and protocols in the OSI model.

32 Network Layer The primary task of the network layer is routing
The most widely used network protocol is the connectionless IP (Internet Protocol) Each IP packet is routed to its destination independently of all others Connection-oriented protocols are also gaining popularity Virtual channels in ATM networks

33 Transport Layer This layer is the last part of a basic network protocol stack In other words, this layer can be used by application developers An important aspect of this layer is to provide end-to-end communication The Internet transport protocol is called TCP (Transmission Control Protocol) The Internet protocol suite also supports a connectionless transport protocol called UDP (User Datagram Protocol)

34 Sockets – cont. Connection-oriented communication pattern using sockets.
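The figure referenced above corresponds to the usual socket() / bind() / listen() / accept() sequence on the server and socket() / connect() on the client. The following minimal Python sketch illustrates that pattern; the port number and message contents are arbitrary choices for illustration, not something taken from the slides.

import socket

def server():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # socket()
    s.bind(("", 9000))                                       # bind()
    s.listen(1)                                              # listen()
    conn, addr = s.accept()                                  # accept() blocks until a client connects
    data = conn.recv(1024)                                   # receive the request
    conn.sendall(b"reply to " + data)                        # send the reply
    conn.close()
    s.close()

def client():
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket()
    c.connect(("localhost", 9000))                           # connect()
    c.sendall(b"request")                                    # send the request
    print(c.recv(1024))                                      # receive the reply
    c.close()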

35 A Multithreaded Server

36 The Message Passing Model
The message passing model provides two basic communication primitives Send and receive Send has two logical parameters, a message and its destination Receive has two logical parameters, the source and a buffer for storing the message

37 Semantics of Send and Receive Primitives
There are several design issues regarding send and receive primitives Buffered or unbuffered Blocking vs. non-blocking primitives With blocking primitives, the send does not return control until the message has been sent or received, and the receive does not return control until a message is copied into the buffer With non-blocking primitives, the send returns control as soon as the message is copied out, and the receive merely signals the intention to receive a message and provides a buffer for it

38 Semantics of Send and Receive Primitives – cont.
Synchronous vs. asynchronous primitives With synchronous primitives, a SEND primitive is blocked until a corresponding RECEIVE primitive is executed With asynchronous primitives, a SEND primitive does not block even if there is no corresponding execution of a RECEIVE primitive The messages are buffered

39 Remote Procedure Call RPC is designed to hide all the communication details from programmers Overcoming the difficulties of the message-passing model It extends conventional local procedure calls to calling procedures on remote computers

40 Steps of a Remote Procedure Call – cont.

41 Inherent Limitations of a Distributed System
Absence of a global clock In a centralized system, time is unambiguous In a distributed system, there exists no system-wide common clock In other words, the notion of global time does not exist Impact of the absence of global time Difficult to reason about the temporal order of events Makes it harder to collect up-to-date information on the state of the entire system

42 Inherent Limitations of a Distributed System
Absence of shared memory An up-to-date state of the entire system is not available to any individual process This information, however, is necessary for reasoning about the system's behavior, debugging, and recovering from failures

43 Lamport’s Logical Clocks
For a wide class of algorithms, what matters is the internal consistency of clocks, not whether they are close to real time For these algorithms, the clocks are often called logical clocks Lamport proposed a scheme to order events in a distributed system using logical clocks

44 Lamport’s Logical Clocks – cont.
Definitions Happened before relation The happened before relation (→) captures the causal dependencies between events It is defined as follows a → b, if a and b are events in the same process and a occurred before b a → b, if a is the event of sending a message m in a process and b is the event of receipt of the same message m by another process If a → b and b → c, then a → c, i.e., “→” is transitive

45 Lamport’s Logical Clocks – cont.
Definitions – continued Causally related events Event a causally affects event b if a → b Concurrent events Two distinct events a and b are said to be concurrent (denoted by a || b) if a ↛ b and b ↛ a For any two events, either a → b, b → a, or a || b

46 Lamport’s Logical Clocks – cont.
Implementation rules [IR1] Clock Ci is incremented between any two successive events in process Pi: Ci := Ci + d (d > 0) [IR2] If event a is the sending of message m by process Pi, then message m is assigned a timestamp tm = Ci(a). On receiving the same message m by process Pj, Cj is set to Cj := max(Cj, tm + d)
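Purely as an illustration (not part of the slides), the two implementation rules with d = 1 can be written as a small Python class; the class and method names are made up for this sketch.

class LamportClock:
    def __init__(self):
        self.c = 0

    def internal_event(self):     # IR1: tick between any two successive events
        self.c += 1
        return self.c

    def send_event(self):         # IR1, then the current value is attached as the timestamp tm
        self.c += 1
        return self.c

    def receive_event(self, tm):  # IR2: Cj := max(Cj, tm + d) with d = 1
        self.c = max(self.c, tm + 1)
        return self.c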

47 An Example

48 Total Ordering Using Lamport’s Clocks
If a is any event at process Pi and b is any event at process Pj, then a ⇒ b if and only if either Ci(a) < Cj(b), or Ci(a) = Cj(b) and Pi ≺ Pj, where ≺ is any arbitrary relation that totally orders the processes to break ties
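A hedged sketch of the resulting total order, representing each event by a (Lamport timestamp, process id) pair and breaking ties by the smaller process id, which is one possible choice for the arbitrary ordering of processes:

def totally_precedes(a, b):
    ci_a, pid_a = a          # a = (Ci(a), i)
    cj_b, pid_b = b          # b = (Cj(b), j)
    # lexicographic comparison: timestamp first, process id as the tie-breaker
    return (ci_a, pid_a) < (cj_b, pid_b)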

49 A Limitation of Lamport’s Clocks
In Lamport’s system of logical clocks If a → b, then C(a) < C(b) The reverse is not necessarily true if the events have occurred on different processes

50 A Limitation of Lamport’s Clocks

51 Vector Clocks Implementation rules
[IR1] Clock Ci is incremented between any two successive events in process Pi: Ci[i] := Ci[i] + d (d > 0) [IR2] If event a is the sending of message m by process Pi, then message m is assigned a timestamp tm = Ci(a). On receiving the same message m by process Pj, Cj is set to Cj[k] := max(Cj[k], tm[k])
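Again as an illustration only, the vector clock rules can be sketched in Python with d = 1. Treating the receive as an event of its own (so IR1 fires before IR2) is a common convention rather than something stated on the slide.

class VectorClock:
    def __init__(self, n, i, d=1):
        self.v = [0] * n              # Ci[1..n], one entry per process
        self.i, self.d = i, d

    def tick(self):                   # IR1: Ci[i] := Ci[i] + d between successive events
        self.v[self.i] += self.d

    def send(self):                   # IR1, then attach the whole vector as the timestamp tm
        self.tick()
        return list(self.v)

    def receive(self, tm):            # IR2: Cj[k] := max(Cj[k], tm[k]) for every k
        self.tick()                   # assumption: the receive counts as an event (IR1)
        self.v = [max(a, b) for a, b in zip(self.v, tm)]
        return list(self.v)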

52 Vector Clocks – cont.

53 Vector Clocks – cont. Assertion
At any instant, Ci[i] ≥ Cj[i] for all i and j Events a and b are causally related if ta < tb or tb < ta; otherwise, these events are concurrent In a system of vector clocks, a → b if and only if ta < tb

54 Causal Ordering of Messages
The causal ordering of messages tries to maintain the same causal relationship that holds among “message send” events with the corresponding “message receive” events In other words, if Send(M1) → Send(M2), then Receive(M1) → Receive(M2) This is different from causal ordering of events

55 Causal Ordering of Messages – cont.

56 Causal Ordering of Messages – cont.
The basic idea It is very simple Deliver a message only when no causality constraints are violated Otherwise, the message is not delivered immediately but is buffered until all the preceding messages are delivered

57 Birman-Schiper-Stephenson Protocol
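The Birman-Schiper-Stephenson slide is a figure in the original deck. As a rough Python sketch under the usual description of the protocol (broadcast messages carry a vector of per-sender delivery counts, and a message is delivered only when it is the next one from its sender and all of its causal predecessors have already been delivered), the delivery test might look like this; all names are illustrative.

class BSSProcess:
    def __init__(self, n, i):
        self.n, self.i = n, i
        self.vt = [0] * n          # vt[k] = number of broadcasts delivered from Pk
        self.buffer = []           # received messages that cannot be delivered yet

    def broadcast_timestamp(self):
        self.vt[self.i] += 1       # count our own broadcast
        return list(self.vt)       # attach this vector to the outgoing message

    def can_deliver(self, sender, vtm):
        # next message from that sender, and all causal predecessors delivered
        return (vtm[sender] == self.vt[sender] + 1 and
                all(vtm[k] <= self.vt[k] for k in range(self.n) if k != sender))

    def on_receive(self, sender, vtm, msg):
        self.buffer.append((sender, vtm, msg))
        progressed = True
        while progressed:          # drain the buffer until nothing more is deliverable
            progressed = False
            for entry in list(self.buffer):
                s, v, m = entry
                if self.can_deliver(s, v):
                    self.buffer.remove(entry)
                    self.vt = [max(a, b) for a, b in zip(self.vt, v)]
                    progressed = True
                    # the delivered message m would be handed to the application here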

58 Schiper-Eggli-Sandoz Protocol

59 Schiper-Eggli-Sandoz Protocol – cont.

60 Schiper-Eggli-Sandoz Protocol – cont.

61 Local State Local state More notations
For a site Si, its local state at a given time is defined by the local context of the distributed application, denoted by LSi. More notations mij denotes a message sent by Si to Sj send(mij) and rec(mij) denote the corresponding sending and receiving events.

62 Definitions – cont.

63 Definitions – cont.

64 Global State – cont.

65 Definitions – cont. Strongly consistent global state:
A global state is strongly consistent if it is consistent and transitless

66 Global State – cont.

67 Chandy-Lamport’s Global State Recording Algorithm
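The Chandy-Lamport slide is also a figure; the Python sketch below illustrates the marker-sending and marker-receiving rules as they are usually stated. The send() callback, the channel identifiers, and record_local_state() are placeholders for this sketch, not a real API.

class SnapshotProcess:
    def __init__(self, incoming, outgoing, send, record_local_state):
        self.incoming, self.outgoing = incoming, outgoing
        self.send = send                          # send(channel, message)
        self.record_local_state = record_local_state
        self.recorded = False                     # has this process recorded its own state?
        self.channel_state = {}                   # incoming channel -> in-transit messages
        self.closed = set()                       # incoming channels whose recording is finished

    def start_snapshot(self):                     # marker-sending rule
        self.local_state = self.record_local_state()
        self.recorded = True
        self.channel_state = {ch: [] for ch in self.incoming}
        for ch in self.outgoing:
            self.send(ch, "MARKER")               # one marker on every outgoing channel

    def on_message(self, ch, msg):                # marker-receiving rule
        if msg == "MARKER":
            if not self.recorded:
                self.start_snapshot()
                self.channel_state[ch] = []       # state of ch is recorded as empty
            self.closed.add(ch)                   # stop recording channel ch
            # locally done when self.closed == set(self.incoming)
        elif self.recorded and ch not in self.closed:
            self.channel_state[ch].append(msg)    # message was in transit before the marker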

68 Cuts of a Distributed Computation
A cut is a graphical representation of a global state A consistent cut is a graphical representation of a consistent global state Definition A cut of a distributed computation is a set C={c1, c2, ...., cn}, where ci is a cut event at site Si in the history of the distributed computation

69 Cuts of a Distributed Computation – cont.

70 Cuts of a Distributed Computation – cont.

71 Cuts of a Distributed Computation – cont.

72 Cuts of a Distributed Computation – cont.

73 Cuts of a Distributed Computation – cont.

74 The Critical Section Problem
When processes (centralized or distributed) interact through shared resources, the integrity of the resources may be violated if the accesses are not coordinated The resources may not record all the changes A process may obtain inconsistent values The final state of the shared resource may be inconsistent

75 Mutual Exclusion One solution to the problem is to allow at most one process to access the shared resources at any time This solution is known as mutual exclusion A critical section is a code segment in a process in which shared resources are accessed A process can have more than one critical section There are problems involving shared resources for which mutual exclusion is not the optimal solution

76 The Structure of Processes
Structure of process Pi repeat entry section critical section exit section remainder section until false;

77 Requirements of Mutual Exclusion Algorithms
Freedom from deadlocks Two or more sites should not endlessly wait for messages Freedom from starvation No site should wait indefinitely to execute its critical section Fairness Requests are executed in the order given by the logical clocks Fault tolerance The algorithm continues to work when some failures occur

78 Performance Measure for Distributed Mutual Exclusion
The number of messages per CS invocation Synchronization delay The time required after a site leaves the CS and before the next site enters the CS System throughput 1/(sd+E), where sd is the synchronization delay and E the average CS execution time Response time The time interval a request waits for its CS execution to be over after its request messages have been sent out

79 Performance Measure for Distributed Mutual Exclusion

80 Distributed Solutions
Non-token-based algorithms Use timestamps to order requests and resolve conflicts between simultaneous requests Lamport’s algorithm and the Ricart-Agrawala algorithm Token-based algorithms A unique token is shared among the sites A site is allowed to enter the CS only if it possesses the token and it continues to hold the token until its CS execution is over; then it passes the token to the next site

81 Lamport’s Distributed Mutual Exclusion Algorithm
This algorithm is based on the total ordering obtained using Lamport’s clocks Each process keeps a Lamport logical clock Each process is associated with a unique id that can be used to break ties In the algorithm, each process keeps a queue, request_queuei, which contains mutual exclusion requests ordered by their timestamps and associated ids The request set Ri of each process consists of all the processes The communication channels are assumed to be FIFO
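A hedged sketch of one site under the usual description of the algorithm: a REQUEST is broadcast and queued everywhere, every REQUEST is answered with a timestamped REPLY, the CS is entered when the site's own request heads its queue and a later-timestamped message has arrived from every other site, and a RELEASE removes the request from all queues. The send callback and message names are illustrative.

import heapq

class LamportMutex:
    def __init__(self, my_id, all_ids, send):
        self.id = my_id
        self.others = [i for i in all_ids if i != my_id]
        self.send = send                               # send(dest, kind, timestamp, sender)
        self.clock = 0
        self.queue = []                                # heap of pending (timestamp, site) requests
        self.last_seen = {j: 0 for j in self.others}   # highest timestamp received from each site
        self.my_request = None

    def request_cs(self):
        self.clock += 1
        self.my_request = (self.clock, self.id)
        heapq.heappush(self.queue, self.my_request)
        for j in self.others:
            self.send(j, "REQUEST", self.clock, self.id)

    def can_enter_cs(self):
        # L1: own request at the head of the queue
        # L2: a later-timestamped message has arrived from every other site (FIFO channels)
        return (self.my_request is not None and self.queue and
                self.queue[0] == self.my_request and
                all(self.last_seen[j] > self.my_request[0] for j in self.others))

    def release_cs(self):
        self.queue.remove(self.my_request)
        heapq.heapify(self.queue)
        self.my_request = None
        self.clock += 1
        for j in self.others:
            self.send(j, "RELEASE", self.clock, self.id)

    def on_message(self, kind, ts, sender):
        self.clock = max(self.clock, ts) + 1           # Lamport clock update on receive
        self.last_seen[sender] = ts
        if kind == "REQUEST":
            heapq.heappush(self.queue, (ts, sender))
            self.clock += 1
            self.send(sender, "REPLY", self.clock, self.id)
        elif kind == "RELEASE":
            self.queue = [r for r in self.queue if r[1] != sender]
            heapq.heapify(self.queue)
        # a REPLY needs no action beyond the clock and last_seen updates above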

82 Lamport’s Distributed Mutual Exclusion Algorithm – cont.

83 Lamport’s Distributed Mutual Exclusion Algorithm – cont.

84 Ricart-Agrawala Algorithm

85 A Simple Token Ring Algorithm
When the ring is initialized, one process is given the token The token circulates around the ring It is passed from process k to process k+1 (modulo the ring size) When a process acquires the token from its neighbor, it checks to see whether it is waiting to enter its critical section If so, it enters its CS and, when exiting from its CS, passes the token to the next process Otherwise, it passes the token to the next process immediately
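The token-passing step just described can be sketched in a few lines of Python; the wants_cs, critical_section, and pass_token hooks are placeholders for illustration.

def on_token_received(k, n, wants_cs, critical_section, pass_token):
    if wants_cs(k):              # is this process waiting to enter its CS?
        critical_section(k)      # enter and then exit the critical section
    pass_token((k + 1) % n)      # in either case, forward the token to the next process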

86 Suzuki-Kasami’s Algorithm
Data structures Each site maintains a vector consisting of the largest sequence number received so far from each other site The token consists of a queue of requesting sites and an array of integers recording, for each site, the sequence number of the request that the site executed most recently

87 Suzuki-Kasami’s Algorithm – cont.

88 Agreement Protocols In distributed systems, sites are often required to reach mutual agreement In distributed database systems, data managers must agree on whether to commit or to abort a transaction Reaching an agreement requires that the sites have knowledge about values at other sites Agreement when the system is free from failures Agreement when the system is prone to failures

89 Agreement Problems There are three well-known agreement problems
Byzantine agreement problem Consensus problem Interactive consistency problem

90 Lamport-Shostak-Pease Algorithm

91 Lamport-Shostak-Pease Algorithm – cont.

92 Distributed File Systems
A distributed file system is a resource management component of a distributed operating system It implements a common file system shared by all the computers in the system Two important goals Network transparency High availability

93 Architecture – cont.

94 Caching Caching is commonly used in distributed file systems to reduce delays in accessing the data In file caching, a copy of the data stored at a remote file server is brought to the client, reducing access delays due to network latency The effectiveness of caching is based on the temporal locality in programs Files can also be cached at the server side

95 Client Caching

96 Cache Consistency

97 Naming in Distributed File Systems – cont.
Three approaches to naming in distributed file systems The simplest scheme is to concatenate the host name with the names of files Not network transparent Not location-independent Mounting remote directories onto local directories Location transparent but not network transparent A single global directory Limited to a few cooperating computers

98 Writing Policy This is related to cache consistency
It decides what to do when a cache block at the client is modified Several different policies Write-through Delayed writing for some time Delayed writing until the file is closed

99 Cache Consistency Schemes to guarantee consistency
Server-initiated approach Servers inform the cache managers whenever the data in client caches become stale Cache managers can retrieve the new data when needed Client-initiated approach Cache managers validate data with the server before returning it to the clients Limited caching

100 Availability Availability is an important issue in distributed file systems Replication is the primary mechanism for enhancing the availability of files in distributed file systems Replication Unit of replication Replica management

101 Scalability Scalability deals with the suitability of the design to support more clients Caching helps reduce the client response time Server-initiated cache invalidation Some clients can be used as servers The structure of the server process also plays a major role in scalability

102 Distributed Shared Memory
Distributed computing is mainly based on the message passing model Client/server model Remote procedure calls Distributed shared memory is a resource management component that implements the shared memory model in distributed systems, where there is no physically shared memory

103 Distributed Shared Memory – cont.
This is a further extension of virtual memory management on a single computer When a process accesses data in the shared address space, a mapping manager maps the shared memory address to physical memory, which can be local or remote

104 Distributed Shared Memory – cont.
Central implementation issues in DSM How to keep track of the location of remote data How to overcome the communication delays and high overhead associated with communication protocols How to improve system performance

105 The Central-Server Algorithm
A central server maintains all the shared data It serves the read requests from other nodes or clients by returning the data items to them It updates the data on write requests by clients
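A toy sketch of the central-server idea: one authoritative copy at the server, and every client read or write is a request to it. The classes and dictionaries here are illustrative only; a real DSM would operate on pages or memory blocks through fault handlers.

class CentralServer:
    def __init__(self):
        self.data = {}                   # the single authoritative copy of shared data

    def handle_read(self, address):
        return self.data.get(address)    # return the data item to the requesting node

    def handle_write(self, address, value):
        self.data[address] = value       # update in place and acknowledge
        return "ACK"

class ClientNode:
    def __init__(self, server):
        self.server = server             # in a real system this would be a remote stub

    def read(self, address):
        return self.server.handle_read(address)     # every access is a round trip

    def write(self, address, value):
        return self.server.handle_write(address, value)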

106 The Central-Server Algorithm – cont.

107 The Migration Algorithm
In contrast to the central-server algorithm, data in the migration algorithm is shipped to the location of the data access request Subsequent accesses can then be performed locally Thrashing can be a problem

108 The Migration Algorithm – cont.

109 The Migration Algorithm – cont.

110 The Read-Replication Algorithm
The read-replication algorithm allows multiple nodes to have read access or one node to have read-write access

111 The Read-Replication Algorithm – cont.

112 The Read-Replication Algorithm – cont.

113 The Full-Replication Algorithm
This is a further extension of the read-replication algorithm It allows multiple nodes to have both read and write access to shared data blocks Data consistency becomes an issue

114 The Full-Replication Algorithm – cont.

115 Memory Coherence Coherence protocols Write-invalidate protocol
A protocol to keep replicas coherent Write-invalidate protocol A write to shared data causes the invalidation of all copies except the one where the write occurs Write-update protocol A write to shared data causes all copies of that data to be updated

116 Memory Coherence – cont.

117 False Sharing

118 Distributed Scheduling
A distributed scheduler is a resource management component of a distributed operating system that focuses on judiciously and transparently redistributing the load of the system among the computers to maximize overall performance

119 Components of a Load Distributing Algorithm
Four components Transfer policy Determines when a node needs to send tasks to other nodes or can receive tasks from other nodes Selection policy Determines which task(s) to transfer Location policy Finds suitable nodes for load sharing Information policy

120 Stability The queuing-theoretic perspective Algorithmic stability
The CPU queues grow without bound if the arrival rate is greater than the rate at which the system can perform work A load distributing algorithm is effective under a given set of conditions if it improves the performance relative to that of a system not using load distribution Algorithmic stability An algorithm is unstable if it can perform fruitless actions indefinitely with finite probability Processor thrashing

121 Sender-Initiated Algorithms

122 Receiver-Initiated Algorithms

123 Empirical Comparison of Sender-Initiated and Receiver-Initiated Algorithms

124 Symmetrically Initiated Algorithms
Sender-initiated component A sender broadcasts a TooHigh message, sets a TooHigh timeout alarm, and listens for an Accept message A receiver that receives a TooHigh message cancels its TooLow timeout, sends an Accept message to the sender, and increases its load value On receiving an Accept message, if the site is still a sender, it chooses the best task to transfer and transfers it If no Accept has been received before the timeout, the sender broadcasts a ChangeAverage message to increase the average load estimates at the other nodes

125 Symmetrically Initiated Algorithms – cont.
Receiver-initiated component A receiver broadcasts a TooLow message, sets a TooLow timeout alarm, and starts listening for a TooHigh message If a TooHigh message is received, it cancels its TooLow timeout, sends an Accept message to the sender, and increases its load value If no TooHigh message is received before the timeout, the receiver broadcasts a ChangeAverage message to decrease the average load estimates at the other nodes

126 Comparison

127 Adaptive Algorithms A stable symmetrically initiated algorithm
Each node keeps a senders list, a receivers list, and an OK list Nodes in the system are classified as Sender/overloaded, Receiver/underloaded, or OK using the information gathered through polling

128 A Stable Symmetrically Initiated Algorithm – cont.
Sender-initiated component The sender polls the node at the head of its receivers list The polled node moves the sender to the head of its senders list and sends a message indicating whether it is a receiver, a sender, or an OK node The sender updates its lists for the polled node based on the reply If the polled node is a receiver, the sender transfers a task The polling process stops when the sender’s receivers list becomes empty or the number of polls reaches a PollLimit

129 A Stable Symmetrically Initiated Algorithm – cont.
Receiver-initiated component The nodes are polled in the following order Head to tail of its senders list Tail to head of its OK list Tail to head of its receivers list

130 A Stable Sender-Initiated Algorithm
This algorithm uses the sender-initiated component of the stable symmetrically initiated algorithm Each node is augmented with an array called the state vector It keeps track of the node’s status as known at all the other nodes in the system It is updated based on the information exchanged during polling The receiver-initiated component is replaced by the following protocol When a node becomes a receiver, it informs all the nodes that are misinformed

131 A Stable Sender-Initiated Algorithm – cont.

132 A Stable Sender-Initiated Algorithm – cont.

133 Comparison

134 Task Placement vs. Task Migration
Task placement refers to the transfer of a task that is yet to begin execution to a new location and starting its execution there Task migration refers to the transfer of a task that has already begun execution to a new location and continuing its execution there

135 Task Migration State transfer Unfreeze
The task’s state includes the contents of registers, the task stack, the task’s status, the virtual memory address space, file descriptors, any temporary files, and buffered messages In addition, the current working directory, signal masks and handlers, resource usage statistics, and references to child and parent processes Unfreeze The task is installed at the new machine and is put in the ready queue

136 Issues in Task Migration
State transfer The cost to support remote execution, including delays due to freezing the task, obtaining and transferring the state, and unfreezing the task Residual dependencies Transferring pages in the virtual memory space Redirection of messages Location-dependent system calls Residual dependencies are undesirable

137 State Transfer Mechanisms

138 Fault Tolerance and Recovery
An important goal of distributed systems is to increase availability and reliability by automatically recovering from partial failures without seriously affecting the overall performance Closely related to dependable systems Two broad classes of algorithms Recovery When a failure has occurred, recover the system from the erroneous state to an error-free state Fault tolerance Failure masking

139 Classification of Failures

140 Backward and Forward Error Recovery
Forward-error recovery If the errors and damage caused by faults can be completely and accurately assessed, then those errors can be removed and the resulting state is error-free Backward-error recovery If the errors and damage cannot be assessed, the system can be restored to a previous error-free state Problems include performance penalty, recurrence of faults, and unrecoverable components

141 Basic Approach in Backward Error Recovery
The basic idea is to save recovery points on stable storage Operation-based approach State-based approach

142 The Operation-based Approach
In this approach, all the modifications that are made to the state of a process are recorded so that a previous state can be restored by reversing all the changes made to the state The record of the system activity is known as an audit trail or log

143 The Operation-based Approach – cont.
Updating-in-place Every update operation on an object updates the object and results in a log record being written to stable storage with enough information to completely undo and redo the operation The information needs to include the name of the object, the old state of the object, and the new state of the object It can be implemented as a collection of operations Do, undo, redo, and display operations The major problem is that a do operation cannot be undone if the system crashes after the update operation but before the log record is stored

144 The Operation-based Approach – cont.
The write-ahead-log protocol To overcome the problem in the updating-in-place scheme, a recoverable operation is implemented by the following rules Update an object only after the undo log is recorded Before committing the updates, record the redo and undo logs Problems with the operation-based approach Writing a log on every update operation is expensive in terms of storage requirements and CPU overhead

145 The State-based Approach
In this approach, the complete state of a process is saved when a recovery point is established Recovery involves reinstating its saved state and resuming the execution of the process from that state The process of saving state is referred to as checkpointing or taking a checkpoint A recovery point at which checkpointing occurs is referred to as a checkpoint The process of restoring a process to a prior state is referred to as rolling back the process

146 Issues in Recovery in Concurrent Systems
How and when to do checkpointing Independently or collectively Asynchronous vs. synchronous checkpointing Recovery Which checkpoints should be used when processes roll back? Strongly consistent checkpoints Consistent checkpoints

147 Orphan Messages and Domino Effect

148 Lost Messages

149 Consistent Set of Checkpoints

150 Synchronous Checkpointing
The processes are coordinated so that the set of the most recent checkpoints is guaranteed to be consistent Assumptions Processes communicate by exchanging messages Communication channels are FIFO Communication failures do not partition the network Two kinds of checkpoints Permanent checkpoints Tentative checkpoints Two phases in checkpointing

151 Synchronous Checkpointing – cont.
First phase An initiating process Pi takes a tentative checkpoint and requests all other processes to take tentative checkpoints Each process informs Pi whether it succeeded in taking a tentative checkpoint If Pi learns that all the processes have successfully taken tentative checkpoints, Pi decides that all checkpoints should be made permanent; otherwise, Pi decides that all the tentative checkpoints should be discarded

152 Synchronous Checkpointing – cont.
Second phase Pi informs all the processes of its decision A process, on receiving the message from Pi, acts accordingly Therefore, either all the processes take permanent checkpoints or none do No process may send messages related to the computation after taking its tentative checkpoint and before receiving Pi’s decision Correctness A set of permanent checkpoints taken by the algorithm is consistent

153 Synchronous Checkpointing – cont.

154 The Rollback Recovery Algorithm
Assumptions A single process invokes the algorithm Rollback recovery and checkpointing are not invoked concurrently First phase A process Pi checks whether all the processes are willing to restart from their previous checkpoints If all processes reply “yes”, Pi decides to restart; otherwise, it decides not to Second phase Pi propagates its decision to all the processes

155 The Rollback Recovery Algorithm – cont.

156 Asynchronous Checkpointing and Recovery
In asynchronous checkpointing and recovery, checkpoints are taken independently A recovery algorithm has to search for the most recent consistent set of checkpoints before it can initiate recovery

157 Asynchronous Checkpointing and Recovery – cont.
Two types of log storage: volatile log and stable log Each processor, after an event, records a triplet Its state before the event The message The set of messages sent by the processor

158 Asynchronous Checkpointing and Recovery – cont.
The basic idea is to find out whether there are orphan messages if processors roll back to a set of checkpoints The existence of orphan messages is discovered by comparing the number of messages sent and received
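As a rough illustration of that sent/received comparison (the exact bookkeeping of the slides' algorithm is not reproduced here), a set of candidate checkpoints can be tested for orphan messages as follows; the dictionary layout is an assumption made for this sketch.

def find_orphans(received, sent):
    """received[i][j]: messages from Pj that Pi had received at its candidate
    checkpoint; sent[j][i]: messages to Pi that Pj had sent at its candidate
    checkpoint. Returns the (i, j) pairs that force Pi to roll back further."""
    orphans = []
    for i in received:
        for j in received[i]:
            if received[i][j] > sent[j][i]:   # Pi received more than Pj remembers sending
                orphans.append((i, j))
    return orphans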

159 Asynchronous Checkpointing and Recovery – cont.

160 Introduction to Fault Tolerance
A system is fault-tolerant If it can mask failures It continues to perform its specified function in the event of a failure Mainly through redundancy Or it exhibits a well-defined failure behavior in the event of a failure Distributed commit: either all sites commit a particular operation or none of them do

161 Fault Tolerance Through Redundancy
The key approach to fault tolerance is redundancy Three kinds of redundancy Information redundancy Time redundancy Physical redundancy A system can have Multiple processes Multiple hardware components Multiple copies of data

162 Failure Resilient Processes
A process is resilient if it masks failures and guarantees progress despite a certain number of system failures Backup processes In this approach, each resilient process is implemented by a primary process and one or more backup processes The state of the primary process is saved at intervals If the primary terminates, one of the backup processes becomes active and takes over

163 Failure Resilient Processes – cont.
Replicated execution Several processes execute the same program concurrently It can increase reliability and availability It requires that all requests arrive at all processes in the same order Nonidempotent operations need to be taken care of

164 Distributed Commit The distributed commit problem involves having an operation performed either by every member of a process group or by none at all This is referred to as global atomicity Commit protocols Given that each site has a recovery strategy at the local level, commit protocols ensure that all the sites either commit or abort the transaction unanimously, even in the presence of multiple and repetitive failures

165 Two-Phase Commit Protocol
In this protocol, one of the processes acts as the coordinator The other processes are referred to as cohorts Cohorts are assumed to be executing at different sites Stable storage is available at each site The write-ahead-log protocol is used There are two phases involved in the protocol
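A minimal sketch of the coordinator's two phases; send_to_all, collect_votes, and write_log stand in for real communication and stable-storage operations, and the timeout value is an arbitrary choice for illustration.

def two_phase_commit(cohorts, send_to_all, collect_votes, write_log):
    # Phase 1: ask every cohort to prepare and wait for their votes.
    write_log("prepare")                        # write-ahead log before messaging
    send_to_all(cohorts, "PREPARE")
    votes = collect_votes(cohorts, timeout=5)   # each vote is "YES" or "NO"

    # Phase 2: commit only if every cohort voted YES; otherwise abort.
    if len(votes) == len(cohorts) and all(v == "YES" for v in votes):
        write_log("commit")
        send_to_all(cohorts, "COMMIT")
        return "committed"
    else:
        write_log("abort")
        send_to_all(cohorts, "ABORT")
        return "aborted"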

166 Two-Phase Commit Protocol – cont.
Coordinator

167 Non-blocking Commit Protocols
To be non-blocking in the event of site failures Operational sites should agree on the outcome of the transaction by examining their local states Failed sites, upon recovery, must reach the same conclusion regarding the outcome of the transaction as the operational sites Independent recovery refers to the situation in which a recovering site can decide the final outcome of the transaction based solely on its local state

168 Three-Phase Commit Protocol for Single Site Failure

169 Voting Protocols Distributed commit protocols are resilient to single site failures But they are not resilient to multiple site failures, communication failures, or network partitioning Voting protocols are more fault tolerant They allow data accesses under network failures, multiple site failures, and message losses without compromising the integrity of the data The basic idea is that each replica is assigned some number of votes and a majority of votes must be collected before a process can access a replica

170 Static Voting System model
The replicas of a file are stored at different sites Every file access operation requires that an appropriate lock be obtained The locking rule allows either “one writer and no readers” or “multiple readers and no writers” Every file is associated with a version number, which indicates the number of times the file has been updated Version numbers are stored on stable storage Every write operation updates the version number

171 Static Voting – cont. Basic idea
Every replica is assigned a certain number of votes This information is stored on stable storage A read or write operation is permitted if a certain number of votes, called the read quorum or write quorum, is collected by the requesting process
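As an illustration, the quorum test might look like the following; the constraints r + w > total votes and 2w > total votes are the usual weighted-voting conditions that keep conflicting operations from both collecting a quorum, and the concrete numbers in the comment are only an example, not values from the slides.

def can_access(operation, votes_collected, r, w):
    if operation == "read":
        return votes_collected >= r     # read quorum reached
    if operation == "write":
        return votes_collected >= w     # write quorum reached
    return False

# Example: 5 replicas with one vote each, r = 2, w = 4 (so r + w > 5 and 2w > 5).
# A reader needs any 2 replicas, a writer needs 4, and the current state is taken
# from the replica with the highest version number in the quorum.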

172 Static Voting – cont.

173 Static Voting – cont.

174 Static Voting – cont.

175 Vote Assignment

176 Vote Assignment Examples

177 Security and Protection Introduction
Protection and security deal with controlling unauthorized use of, and access to, hardware and software resources Confidentiality Information should only be disclosed to authorized parties Integrity Alterations can be made only in an authorized way Authorization Authentication

178 Security and Protection Introduction – cont.
Security policy Specifies which actions on what entities in a system are allowed Security mechanisms The means by which a policy can be enforced Important ones include Encryption Authentication Authorization Auditing

179 Access Control Typical distributed systems are organized as client-server architectures A request for a service generally involves invoking a method of a specific object Verifying access rights is referred to as access control, whereas authorization is about granting access rights

180 Access Control – cont. General issues
The system consists of subjects that issue requests to access objects Subjects are processes acting on behalf of users, but can also be objects that need the services of other objects Objects are entities with their own state and operations

181 Access Control Matrix Three components Current objects
Current subjects Generic rights

182 Access Control Matrix – cont.

183 Access Control Matrix – cont.

184 Capabilities The capability-based method corresponds to the row-wise decomposition of the access matrix Each subject s is assigned a list of pairs (o, P[s,o]) for all objects o that it is allowed to access The pairs are referred to as capabilities

185 Capabilities – cont. Advantages of capabilities Drawbacks Efficiency
Simplicity Flexibility Drawbacks Control of propagation Review is difficult Revocation of access rights is difficult Garbage collection is difficult

186 The Access Control List Method
Corresponds to the column-wise decomposition of the access matrix Each object o is assigned a list of pairs (s, P[s, o]) for all subjects s that are allowed to access the object When a subject s requests access a to object o, the system checks the access control list of o to see whether an entry (s, F) exists; if so, it then checks whether a belongs to F

187 Access Control List Method – cont.

188 Access Control List Method – cont.
Advantages Easy revocation Easy review of an access Implementation issues Efficiency of execution Efficiency of storage Protection groups Authority to change an access control list Self control Hierarchical control

189 The Lock-Key Method A hybrid of the capability-based method and the access control list method Every subject has a capability list that contains tuples of the form (O, k), indicating that the subject can access object O using key k Every object has an access control list that contains tuples of the form (l, γ), called lock entries, indicating that any subject which can open the lock l can access this object in the modes in γ

190 The Lock-Key Method – cont.
When a subject s makes a request to access object o in mode a, the system does the following The system locates a tuple (o, k) in the capability list of the subject If no such tuple is found, the access is not permitted Otherwise, the access is permitted only if there exists a lock entry (l, γ) in the access control list of the object o such that k = l and a ∈ γ
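The check just described can be sketched directly; the list-of-tuples representation of capability lists and access control lists is an assumption made for this illustration.

def lock_key_access_allowed(subject_capabilities, object_acl, obj, mode):
    """subject_capabilities: list of (object, key) tuples held by the subject;
    object_acl: list of (lock, modes) lock entries attached to the object."""
    keys = [k for (o, k) in subject_capabilities if o == obj]
    if not keys:
        return False                       # no capability for this object at all
    for lock, modes in object_acl:
        if lock in keys and mode in modes:
            return True                    # a held key opens a lock that permits this mode
    return False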

191 Authentication

192 Design Principles Shannon’s principle Exhaustive search principle
Shannon’s principle of diffusion – Spread the correlation and dependencies among key-string variables over substrings as much as possible Shannon’s principle of confusion – Change a piece of information so that the output has no obvious relation to the input Exhaustive search principle The determination of the key should require an exhaustive search of an extremely large space

193 Private Key Cryptography
Data Encryption Standard (DES) It is a block cipher that encrypts 64-bit data blocks using a 56-bit key Two basic operations Permutation Substitution Three stages Initial permutation stage Complex transformation stage Final permutation stage

194 Public Key Cryptography – cont.
It is now possible for two users to communicate securely even if they have not communicated before Implementation issues One-way functions

195 RSA Method The encryption key is a pair (e, n)
The decryption key is a pair (d, n)

196 RSA Method – cont. Generating the private and public keys requires four steps Choose two very large prime numbers, p and q Compute n = p x q and z = (p – 1) x (q – 1) Choose a number d that is relatively prime to z Compute the number e such that e x d = 1 mod z
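A toy numeric walk-through of the four steps with deliberately small primes (real RSA keys use primes that are hundreds of digits long); pow(d, -1, z) needs Python 3.8 or later, and the particular values chosen here are only an example.

p, q = 61, 53
n = p * q                      # step 2: n = 3233
z = (p - 1) * (q - 1)          # step 2: z = 3120
d = 2753                       # step 3: d chosen relatively prime to z
e = pow(d, -1, z)              # step 4: e x d = 1 mod z, giving e = 17 here

m = 65                         # a message block, interpreted as a number < n
c = pow(m, e, n)               # encryption with the public key (e, n)
assert pow(c, d, n) == m       # decryption with the private key (d, n) recovers m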

197 Authentication In distributed systems, authentication means verifying the identities of communicating entities to each other The assumption is that the communication network is not secure, in that an intruder can copy and play back a message on the network The textbook calls it “interactive secure connections”

198 Authentication – cont. Authentication based on a shared secret key.

199 Authentication Using a Key Distribution Center
The principle of using a KDC.

200 Authentication Using Public-Key Cryptography
Mutual authentication in a public-key cryptosystem.

201 Message Integrity and Confidentiality
Message integrity means that messages are protected against modification Confidentiality ensures that messages cannot be intercepted and read by eavesdroppers Digital signatures A user cannot forge the signature of another user A sender of a signed message cannot deny the validity of his signature on the message A recipient of a signed message cannot modify the signature in the message

202 Digital Signatures Digitally signing a message using public-key cryptography.

203 Kerberos

204 Kerberos – cont. Setting up a secure channel in Kerberos.

205 Thank You I have learned many things from this class
I hope you have learned some things that are useful to your career, either directly or indirectly I have enjoyed teaching this class I hope you have enjoyed it too – at least to some degree If you have comments/suggestions/criticisms, please let me know I am open to novel ideas

