Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring 2007 Principles of Reliable Distributed Systems Lecture 1: Introduction Spring 2007 Idit Keidar
Staff
  Lecturer: Idit Keidar
    –Office hours: Mon 14:30-15:30, Meyer 902
  TA: Alex Shraer
Material
  Textbooks:
    –Distributed Systems, 2nd edition, Sape Mullender (Editor), ACM Press Frontier Series, Addison Wesley
    –Distributed Computing: Fundamentals, Simulations and Advanced Topics, Hagit Attiya and Jennifer Welch, McGraw Hill
  Research papers
    –See links on course web page
  Lecture slides
    –Do NOT cover all the material!
Grading and Requirements
  Final exam – 50%-80%
    –Allowed material: annotated lecture slides
  Dry homework assignments (4 of 5) – 30% MAGEN
    –Will count only if difference from exam score is < 30
    –Good practice for exam!
    –Submit individually
      You may discuss with others, but write by yourself
  Wet homework assignments – 20% TAKEF
    –Two assignments, larger one in Passover
    –Submit in pairs or individually
Prerequisites
  You need background in algorithms and operating systems
  You need some programming experience
  If you do not have the prerequisites (or CS equivalents), you need explicit permission from me to take the course
Bird's-Eye View of Course Syllabus
  Distributed systems, models, basic concepts
  Replication, atomic broadcast
  The consensus problem in different models
  Shared memory models and storage-based systems
  Peer-to-peer computing
Distributed Systems
  Characteristics, Issues, Availability
  Material: Chapter 1 of Mullender
Characteristics
  Multiple computers
    –each having CPU, local memory, stable storage (disk), I/O to the environment
  Interconnections (networked system)
  Shared state
    –correct operation of the system described in terms of global invariants
    –maintaining these requires coordination
Issues to Address
  Independent failure
  Unreliable communication
  Costly communication
  Insecure communication
  Software bugs
  Malicious intrusion
Interesting System Properties
  Performance
    –metrics: latency (time complexity), overhead (message complexity), throughput
  Scalability
  Reliability
    –under what circumstances the system is reliable (i.e., satisfies its specification)
    –the probability that the system is reliable
    –can the user know when the system is not reliable (fail-awareness)
Advantages of Distributed (Networked) Systems
  Sharing of information and resources over wide geographical spread
  Small cost-effective computers close to data
  Incremental growth
  Management autonomy for components
  Independent failure
Advantages of Centralized Systems
  All information and resources equally accessible
  Functions always work the same way
  Object names are always the same
  Easy management
  Goal for distributed system: provide the above abstractions
What About Availability? Security?
  Is it better to put all the eggs in one basket?
    –independent failure vs. fate-sharing
  Independent failure can be a problem: “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.” [Lamport 87]
    –but one can exploit independent failure to provide better availability
  And communication can fail too…
How do you survive failures and achieve high availability?
Replication
  Multiple copies of data/service
    –synchronize for consistency
Availability
  Overcoming independent failure with redundancy
  Spatial redundancy: multiple servers for the same service
    –failing independently
    –degree of replication defines availability level
  Temporal redundancy: repeat operations
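To make "degree of replication defines availability level" concrete, here is a minimal sketch under two assumptions that the slide does not state: every replica is down with the same probability `p_fail`, independently of the others, and the service counts as available whenever at least one replica is up. The function name `availability` is hypothetical.

```python
def availability(p_fail: float, replicas: int) -> float:
    """Probability that at least one of `replicas` servers is up.

    Assumes (for illustration only) identical, independent failure
    probabilities; real failures are often correlated.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability")
    if replicas < 1:
        raise ValueError("need at least one replica")
    # The service is down only if every replica is down at once.
    return 1.0 - p_fail ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "replica(s):", availability(0.01, n))
```

Under these assumptions each extra replica multiplies the unavailability by `p_fail`, which is why independent failure is listed as an advantage as well as an issue.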
Examples of Reliable Distributed Computing Paradigms
Primary-Backup (Passive) Replication
  “Hot” standby
  Client talks to primary server
  Primary updates backup(s)
  Client detects server failure using timeout
    –performs “fail-over” to backup server
    –may need to repeat last operation(s)
  Can be a problem with “false suspicions”
  Works with benign servers only
State Machine Replication
  Replicas are identical deterministic state machines
  Process operations in the same order to remain consistent
State Machine (Active) Replication
  Clients send updates to all servers
  All servers are identical deterministic state machines
    –perform operations in the same order to remain consistent
  May be slower than primary backup, but provides quicker, smoother fail-over
  Can overcome false suspicions and tolerate malicious servers
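The requirement that replicas perform operations in the same order can be illustrated with a toy deterministic state machine. This is only a sketch: the class `CounterReplica`, the helper `run`, and the operation names `add` and `mul` are hypothetical and were chosen because the two operations do not commute.

```python
class CounterReplica:
    """A deterministic state machine holding a single integer."""

    def __init__(self):
        self.value = 0

    def apply(self, op):
        kind, arg = op
        if kind == "add":
            self.value += arg
        elif kind == "mul":
            self.value *= arg
        else:
            raise ValueError(f"unknown operation: {kind}")
        return self.value


def run(ops):
    """Feed a sequence of operations to a fresh replica; return its state."""
    replica = CounterReplica()
    for op in ops:
        replica.apply(op)
    return replica.value


ops = [("add", 2), ("mul", 3), ("add", 1)]
same_order_a = run(ops)            # replica 1
same_order_b = run(ops)            # replica 2, same order
reordered = run([("mul", 3), ("add", 2), ("add", 1)])  # same ops, other order

if __name__ == "__main__":
    print(same_order_a, same_order_b, reordered)
```

Two replicas fed the same sequence end in the same state, while a replica that sees the same operations in a different order diverges.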
Supporting State Machine Replication
  Send update operations in messages so that
    –messages are reliable
    –messages arrive at all replicas in the same order
  This is called Atomic Broadcast
  It requires the receivers to agree on the message order
  Consensus is a service that lets processes reach agreement
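One way to picture the ordering requirement is a fixed sequencer that stamps each message with a sequence number, with every replica delivering strictly in stamp order. This technique is not taken from the slides; it is a sketch that assumes a single sequencer that never fails and a network that loses no messages, which is exactly the part that consensus is needed for in general. The names `Replica` and `receive` are hypothetical.

```python
class Replica:
    """Delivers messages in sequence-number order, buffering early arrivals."""

    def __init__(self):
        self.next_seq = 0      # next sequence number to deliver
        self.pending = {}      # seq -> message, received but not yet delivered
        self.delivered = []    # messages in delivery order

    def receive(self, seq, msg):
        self.pending[seq] = msg
        # Deliver as long as the next expected message is available.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


# The (assumed) sequencer assigns numbers 0, 1, 2 to the updates.
stamped = list(enumerate(["a", "b", "c"]))

r1, r2 = Replica(), Replica()
for seq, msg in stamped:            # the network delivers in order to r1
    r1.receive(seq, msg)
for seq, msg in reversed(stamped):  # and in reverse order to r2
    r2.receive(seq, msg)

if __name__ == "__main__":
    print(r1.delivered, r2.delivered)
```

Both replicas end up delivering the updates in the same order even though the network reordered them for one of the two.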
Peer-to-Peer Systems
  Decentralized, self-organizing, distributed systems in which most communication is symmetric
  Popularized by music swapping software
    –Napster, Gnutella, KaZaA
  Lots of nodes (e.g., millions)
  Dynamic: frequent join, leave, failure
  Little or no infrastructure (no central server)
  All nodes are “peers” – have same role; don’t have lots of resources
Data-Centric Systems
  Ephemeral processes (clients) are not all around at the same time
  Clients share state (data)
  Clients synchronize among each other using the shared data
  Data can be stored at “dumb” shared disks
Example: Coordinated Attacks
Example: Coordinated Attack
  [Figure: two generals, A and B, with the message “Let’s attack”]
The Coordinated Attack Problem
  Requirements:
    –both generals must decide the same: either to attack or not to attack
    –if both are not ready to attack they must not attack
    –if both are ready to attack then they must attack
  Motivation: atomic transaction commit in distributed databases [Gray 78]
Properties of Coordinated Attack
  Agreement: If both generals decide, they decide the same
  Termination: Every general eventually decides
  Validity: If both inputs are “not ready” the decision is “no attack”; if both inputs are “ready” then the decision is “attack”
A Simple Solution
  General A sends vote (“yes” or “no”)
  General B responds with his vote
  If both say yes, they attack
  Otherwise they do not
  Aka 2-phase commit
  Problems?
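The vote exchange above can be sketched in a few lines, which also makes it easy to experiment with the "Problems?" question. This is an illustration only: the functions `decide` and `exchange`, the delivery flags, and the use of `None` for "no decision yet" are assumptions of this sketch and do not come from the slides.

```python
def decide(my_vote, other_vote):
    """Attack only if both votes are "yes".

    `other_vote` is None when the other general's vote never arrived;
    in that case this general cannot decide (modelled here as None).
    """
    if other_vote is None:
        return None
    if my_vote == "yes" and other_vote == "yes":
        return "attack"
    return "no attack"


def exchange(vote_a, vote_b, a_to_b_delivered=True, b_to_a_delivered=True):
    """Run one vote exchange; return (decision of A, decision of B)."""
    seen_by_b = vote_a if a_to_b_delivered else None
    # B responds only after hearing from A.
    b_responds = seen_by_b is not None
    seen_by_a = vote_b if (b_responds and b_to_a_delivered) else None
    return decide(vote_a, seen_by_a), decide(vote_b, seen_by_b)


if __name__ == "__main__":
    print(exchange("yes", "yes"))
    print(exchange("yes", "no"))
    print(exchange("yes", "yes", b_to_a_delivered=False))
```

In the last call B has both votes and decides to attack, while A never learns B's vote, so the two generals are no longer guaranteed to act together.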
Failure Model 1
  Generals may die
  Subordinates eventually replace them, need to know correct result
  Crash-recovery model
Failure Model 2
  Any number of messengers can be captured (message loss)
  Proof on board
Coordinated Attack Definition: Take II
  Revised requirements:
    –both generals must decide the same: either to attack or not to attack
    –if both are not ready to attack they must not attack
    –if both are ready to attack and no messages are lost then they must attack
  Note: this is not an assumption about the model. It’s a conditional requirement that has to hold only in runs in which no messages are lost.
  Proof on the board!
Models
  Material: Chapter 2 of Mullender
Theory vs. Practice
  Distributed systems are not as intuitive as centralized ones
  Two approaches to understanding them:
    –Experimental observation (“practice”)
    –Modeling and analysis (“theory”)
      to be useful, models need to characterize reality
  Complement each other
Theory Needs Models
  In order to design an algorithm, we need a model of the system where the algorithm will be deployed
  Model captures assumptions on process capabilities, timing, types and number of failures
    –E.g., assume that at most one server crashes
    –This means that the system is allowed to be unreliable if other failures occur (two servers crash, one server is infiltrated)
Good Models
  Accurate – analysis yields truths about the analyzed system/object
  Tractable – analysis is possible
  Accurate and tractable models are hard to define
    –need to abstract away issues that do not affect the phenomena of interest
    –include exactly those attributes that do
Questions We’d Like to Answer
  1. Feasibility – what classes of problems can be solved
  2. Cost – how expensive must the solution be
  Computation and complexity models for centralized systems do not help us answer these questions for distributed systems
Synchronous vs. Asynchronous
  Synchrony assumptions:
    –message latency is bounded
    –processes have synchronized clocks
    –processing times are bounded
  Asynchrony: non-assumption
Asynchronous Model
  Unbounded message delay, processor speed
  Desirable: an algorithm for this model works also in synchronous model
  Alas, too strong
    Consensus impossible even when one process can crash [FLP85]
Round Lock-Step Synchronous Model
  Algorithm runs in synchronous rounds:
    –send messages to any set of processes,
    –receive messages from previous round,
    –do local processing (possibly decide, halt)
  Easy to work with
  But, may lead to inefficiency when implemented over slow network
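The three steps of a round can be simulated directly. The sketch below is an assumption-laden illustration, not an algorithm from the course: it uses a fully connected, failure-free set of processes that each flood the largest value seen so far, and it reads "receive messages from previous round" literally, so a value sent in round r is processed in round r+1. The function name `run_rounds` is hypothetical.

```python
def run_rounds(inputs, num_rounds):
    """Simulate lock-step rounds; return each process's final value."""
    state = dict(inputs)
    in_flight = {p: [] for p in state}  # messages sent in the previous round

    for _ in range(num_rounds):
        # Step 1: every process sends its current value to all others.
        sent = {p: [] for p in state}
        for sender, value in state.items():
            for receiver in state:
                if receiver != sender:
                    sent[receiver].append(value)

        # Step 2: every process receives the previous round's messages.
        received = in_flight

        # Step 3: local processing, here "keep the maximum seen so far".
        for p in state:
            state[p] = max([state[p]] + received[p])

        in_flight = sent

    return state


inputs = {"p1": 3, "p2": 7, "p3": 5}

if __name__ == "__main__":
    print(run_rounds(inputs, 1))
    print(run_rounds(inputs, 2))
```

With this reading of the model, nothing changes after one round because no message has been received yet, and after two rounds every process holds the maximum input.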
Stronger Models
  Is the result still valid if we assume each of the following?
    –bounded loss: at most 10 messages are lost on each channel
    –eventual delivery: an unknown finite number of messages are lost on each channel
We’ll talk a lot about models later in the course….
Specifications
  Material: Chapter 3 of Mullender
Specifications
  A specification of a module is an abstraction
  To use a module, all we need to know is its specification
    –abstract away implementation
  Managing complex systems via modularity requires clear component specifications
Step 1: Define Interfaces
  Example: Coordinated Attack
  There are two generals A, B
  Each has an input variable inp_A, inp_B ∈ {“ready”, “not ready”}
  Possible actions for P ∈ {A, B}:
    –Decide_P(v), v ∈ {“attack”, “no attack”} (Output)
      writes v to a write-once variable dec_P
    –Send_P(m) (Output)
    –Deliver_P(m) (Input)
System Example: A Shared Counter
  Interface: inc_P(), P ∈ {A, B}
  Code:
    –int x, initially 0
    –on inc_P() do: x++
Traces and Runs
  A run is an alternating sequence of events and states the system goes through
    –example run of the counter: 0, inc_A(), 1, inc_B(), 2, inc_B(), 3, inc_B(), 4
  A trace is the sequence of events in a run
    –trace of the above run: inc_A() inc_B() inc_B() inc_B()
    –e.g., trace of a coordinated attack algorithm: send_A(“yes”) send_B(“no”) deliver_B(“yes”) deliver_A(“no”) decide_B(“no attack”) decide_A(“no attack”)
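The relation between a run and its trace can be reproduced for the shared counter. In this sketch events are plain strings such as `"inc_A()"`, and the helper names `run_of` and `trace_of` are hypothetical.

```python
def run_of(events, initial=0):
    """Build the run: states and events alternate, starting from `initial`."""
    run = [initial]
    x = initial
    for event in events:
        x += 1                    # every inc_P() increments the counter
        run.extend([event, x])
    return run


def trace_of(run):
    """The trace keeps only the events, which sit at the odd positions."""
    return run[1::2]


events = ["inc_A()", "inc_B()", "inc_B()", "inc_B()"]
run = run_of(events)

if __name__ == "__main__":
    print(run)
    print(trace_of(run))
```

The run built here matches the example run of the counter given above, and projecting it onto its events gives back the trace.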
Sequences, Prefixes, and Predicates
  Sequence: a_1 a_2 a_3 a_4 a_5, …
  Prefixes of this sequence: a_1, a_1 a_2, a_1 a_2 a_3, etc.
  A predicate is a formula evaluated to a boolean value (true or false)
  The predicate “if m is delivered it was previously sent” evaluates to true over the trace: send_A(“yes”) send_B(“no”) deliver_B(“yes”) deliver_A(“no”) decide_B(“no attack”) decide_A(“no attack”)
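A predicate over a trace is just a boolean function of the event sequence. The sketch below encodes each event as a hypothetical `(kind, process, value)` tuple and checks "if m is delivered it was previously sent"; for simplicity it matches messages by content only, which is an assumption of this example.

```python
def delivered_only_if_sent(trace):
    """True if every delivered message appears in an earlier send event."""
    sent = set()
    for kind, _process, message in trace:
        if kind == "send":
            sent.add(message)
        elif kind == "deliver" and message not in sent:
            return False
    return True


trace = [
    ("send", "A", "yes"),
    ("send", "B", "no"),
    ("deliver", "B", "yes"),
    ("deliver", "A", "no"),
    ("decide", "B", "no attack"),
    ("decide", "A", "no attack"),
]

# A trace in which a message is delivered out of thin air.
bogus = [("deliver", "B", "yes"), ("send", "A", "yes")]

if __name__ == "__main__":
    print(delivered_only_if_sent(trace), delivered_only_if_sent(bogus))
```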
Step 2: Specify Properties
  Concurrent systems can be specified using trace properties
  A trace property is a predicate evaluated over a trace of the concurrent system
  Example property: every message that is received was previously sent
  Not a property: the average number of messages sent in a run is 34
Safety Properties
  A safety property is of the form nothing bad happens (that is, all states are safe)
  Examples:
    –The number of processes in a critical section is always less than 2
    –No two processes decide on different values
    –Every delivered message was previously sent
Liveness Properties
  A liveness property is of the form something good happens (that is, an interesting state is eventually achieved)
  Examples:
    –A process that wishes to enter the critical section eventually does so
    –p grows without bound
    –Every general eventually decides
More Formally
  A safety property is prefix-closed:
    –if it holds in a run, it holds in every prefix
    –you can’t “fix” it after it’s “broken”
  Every run can be extended to satisfy a given liveness property:
    –no matter how “broken”, you can always “fix” it
  Any property is either a safety property, a liveness property, or equivalent to a conjunction of a safety and a liveness property
    –e.g., Critical Section is a conjunction of Mutual Exclusion and Progress
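The difference between "can't fix it once broken" and "can always fix it" shows up on small finite traces. The sketch below uses a hypothetical `(kind, process, value)` event encoding, takes "no two processes decide on different values" as the safety property and "every general eventually decides" as the liveness property, and treats a finite trace as a prefix of a run.

```python
def prefixes(trace):
    """All prefixes of a finite trace, from the empty one to the whole trace."""
    return [trace[:i] for i in range(len(trace) + 1)]


def agreement(trace):
    """Safety: no two decide events carry different values."""
    decided = {value for kind, _, value in trace if kind == "decide"}
    return len(decided) <= 1


def termination(trace, generals=("A", "B")):
    """Liveness, checked on a finite trace: every general has decided."""
    deciders = {who for kind, who, _ in trace if kind == "decide"}
    return set(generals) <= deciders


good = [("decide", "A", "attack"), ("decide", "B", "attack")]
broken = [("decide", "A", "attack"), ("decide", "B", "no attack")]

# Safety is prefix-closed: it holds on every prefix of a good trace ...
safety_holds_on_all_prefixes = all(agreement(p) for p in prefixes(good))
# ... and no extension repairs a trace that already violates it.
still_broken = not agreement(broken + [("send", "A", "yes")])

# Liveness: an unfinished prefix fails the check but can be extended to pass.
unfinished = good[:1]
fixed_by_extension = (not termination(unfinished)) and termination(unfinished + good[1:])

if __name__ == "__main__":
    print(safety_holds_on_all_prefixes, still_broken, fixed_by_extension)
```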
Timing Properties
  If a message is sent, then it arrives within five minutes.
  Safety or liveness?
  Can be expressed only in a timed model
Properties of Coordinated Attack
  Agreement: If both generals decide, they decide the same
  Termination: Every general eventually decides
  Validity: If both inputs are “not ready” the decision is “no attack”; if both inputs are “ready” and every message sent is delivered then the decision is “attack”
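The three properties can be written as checks over the outcome of a single finished run. This is a sketch with assumed encodings: `inputs` and `decisions` are dictionaries keyed by general, a general that never decided is simply absent from `decisions`, and `all_delivered` records whether every message sent in the run was delivered. Termination is a liveness property, so checking it this way only makes sense for a run that is taken to be complete.

```python
GENERALS = ("A", "B")


def agreement(decisions):
    """All generals that decided chose the same value."""
    return len(set(decisions.values())) <= 1


def termination(decisions):
    """Every general has decided (meaningful for a completed run only)."""
    return all(g in decisions for g in GENERALS)


def validity(inputs, decisions, all_delivered):
    """The conditional validity requirement of the revised definition."""
    made = set(decisions.values())
    if all(inputs[g] == "not ready" for g in GENERALS):
        return made <= {"no attack"}
    if all(inputs[g] == "ready" for g in GENERALS) and all_delivered:
        return made <= {"attack"}
    return True  # mixed inputs, or ready inputs with message loss


ready = {"A": "ready", "B": "ready"}
not_ready = {"A": "not ready", "B": "not ready"}
attack = {"A": "attack", "B": "attack"}
hold = {"A": "no attack", "B": "no attack"}

if __name__ == "__main__":
    print(validity(ready, attack, all_delivered=True))
    print(validity(ready, hold, all_delivered=False))
    print(validity(ready, hold, all_delivered=True))
```

With both generals ready, deciding "no attack" satisfies validity only in a run where some message was lost, which is the conditional nature of the requirement.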