CSE 812. Outline What is a distributed system/program? Program Models Program transformation.

Slides:



Advertisements
Similar presentations
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Set 14: Simulations 1.
Advertisements

Timed Distributed System Models  A. Mok 2014 CS 386C.
Global States.
Impossibility of Distributed Consensus with One Faulty Process
CS 542: Topics in Distributed Systems Diganta Goswami.
Teaser - Introduction to Distributed Computing
Ch. 7 Process Synchronization (1/2) I Background F Producer - Consumer process :  Compiler, Assembler, Loader, · · · · · · F Bounded buffer.
Leader Election Let G = (V,E) define the network topology. Each process i has a variable L(i) that defines the leader.  i,j  V  i,j are non-faulty.
Distributed Systems and Algorithms Sukumar Ghosh University of Iowa Spring 2014.
Byzantine Generals Problem: Solution using signed messages.
CIS 540 Principles of Embedded Computation Spring Instructor: Rajeev Alur
CPSC 668Set 14: Simulations1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.
1 Complexity of Network Synchronization Raeda Naamnieh.
CPSC 668Set 1: Introduction1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
CS 582 / CMPE 481 Distributed Systems
Ordering and Consistent Cuts Presented By Biswanath Panda.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Ordering and Consistent Cuts Presented by Chi H. Ho.
Lecture 12 Synchronization. EECE 411: Design of Distributed Software Applications Summary so far … A distributed system is: a collection of independent.
Composition Model and its code. bound:=bound+1.
Time, Clocks, and the Ordering of Events in a Distributed System Leslie Lamport (1978) Presented by: Yoav Kantor.
Election Algorithms and Distributed Processing Section 6.5.
1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems Author: Vassos Hadzilacos and Sam Toueg Distributed Systems: 526 U1580 Professor:
Inter-process Communication and Coordination Chaitanya Sambhara CSC 8320 Advanced Operating Systems.
Representing distributed algorithms Why do we need these? Don’t we already know a lot about programming? Well, you need to capture the notions of atomicity,
Why do we need models? There are many dimensions of variability in distributed systems. Examples: interprocess communication mechanisms, failure classes,
Consensus and Its Impossibility in Asynchronous Systems.
Review for Exam 2. Topics included Deadlock detection Resource and communication deadlock Graph algorithms: Routing, spanning tree, MST, leader election.
Distributed Systems and Algorithms Sukumar Ghosh University of Iowa Spring 2011.
Byzantine fault-tolerance COMP 413 Fall Overview Models –Synchronous vs. asynchronous systems –Byzantine failure model Secure storage with self-certifying.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 10, 2005 Session 9.
The Complexity of Distributed Algorithms. Common measures Space complexity How much space is needed per process to run an algorithm? (measured in terms.
CSE 812. Outline Defining Programs, specifications, faults, etc. Safety and Liveness based on the work of Alpern and Schneider Defining fault-tolerance.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Program correctness The State-transition model A global states S  s 0 x s 1 x … x s m {s k = set of local states of process k} S0  S1  S2  Each state.
ITEC452 Distributed Computing Lecture 2 – Part 2 Models in Distributed Systems Hwajung Lee.
Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
Physical clock synchronization Question 1. Why is physical clock synchronization important? Question 2. With the price of atomic clocks or GPS coming down,
Hwajung Lee.  Models are simple abstractions that help understand the variability -- abstractions that preserve the essential features, but hide the.
Hwajung Lee. Well, you need to capture the notions of atomicity, non-determinism, fairness etc. These concepts are not built into languages like JAVA,
Hwajung Lee. Why do we need these? Don’t we already know a lot about programming? Well, you need to capture the notions of atomicity, non-determinism,
Leader Election (if we ignore the failure detection part)
Several sets of slides by Prof. Jennifer Welch will be used in this course. The slides are mostly identical to her slides, with some minor changes. Set.
Hwajung Lee.  Models are simple abstractions that help understand the variability -- abstractions that preserve the essential features, but hide the.
Hwajung Lee. Let G = (V,E) define the network topology. Each process i has a variable L(i) that defines the leader.   i,j  V  i,j are non-faulty ::
ITEC452 Distributed Computing Lecture 2 – Part 2 Models in Distributed Systems Hwajung Lee.
Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb
What is a distributed system? A network of processes. The nodes are processes, and the edges are communication channels.
Agenda  Quick Review  Finish Introduction  Java Threads.
Introduction to distributed systems description relation to practice variables and communication primitives instructions states, actions and programs synchrony.
Understanding Models. Modeling Communication: A message passing model System topology is a graph G = (V, E), where V = set of nodes ( sequential processes.
Distributed Systems and Algorithms Sukumar Ghosh University of Iowa Fall 2011.
Classifying fault-tolerance Masking tolerance. Application runs as it is. The failure does not have a visible impact. All properties (both liveness & safety)
Fundamentals of Fault-Tolerant Distributed Computing In Asynchronous Environments Paper by Felix C. Gartner Graeme Coakley COEN 317 November 23, 2003.
ITEC452 Distributed Computing Lecture 2 – Part 3 Models in Distributed Systems Hwajung Lee.
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Model and complexity Many measures Space complexity Time complexity
Understanding Models.
CS 5620 Distributed Systems and Algorithms
Leader Election (if we ignore the failure detection part)
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Atomicity, Non-determinism, Fairness
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
ITEC452 Distributed Computing Lecture 3 Models in Distributed Systems
CS 5620 Distributed Systems and Algorithms
CS 5620 Distributed Systems and Algorithms
Distributed Systems and Algorithms
Presentation transcript:

CSE 812

Outline What is a distributed system/program? Program Models Program transformation

What is a distributed system? A network of processes. The nodes are processes, and the edges are communication channels

Code for Distributed Systems Each node would need some code. –CSE 231? Communication between nodes –Network level issues –CSE 824 Protocols/Programs for allowing this system to function –This course –Focus on abstract protocols (in my presentations) –Focus on implementation issues (Student presentations)

Important issues Knowledge is local Clocks are not synchronized No shared address space Most of the time Topology and routing Scalability Fault tolerance

Some common subproblems Leader election Mutual exclusion Time synchronization Distributed snapshot Replica management

A classification Client-server modelPeer-to-peer model: Notion of neighbors Server Clients

Outline What is a distributed system/program? Program Models Program transformation

Defining Programs Goal of this discussion is to extend the concept of programs from programs such as that in C/C++/… to more abstract programs Consider the map of MSU shown on the next page. A robot needs to be programmed so that it can go from point A to point B –Identify a program for such robot

A B

A B

Issues Non determinism –At a certain state, the program is presented from a set of multiple options. –Assumption: the program may choose either of these options non- deterministically. Note that this does not imply any fairness unless assumed otherwise.

Abstraction Thinking of the program as a finite automata

Defining Safety Example: –Consider the arrows in previous picture Intuitively, safety identifies transitions that should not be executed by the program

Example 2 Peterson’s mutual exclusion algorithm –Two processes State can be n (non critical), t (trying), or c (critical) It is necessary to ensure that both processes are not in state c simultaneously If a process is in state t, then it must eventually go to state c When a process in state n (respectively, t, c) changes its state, it must change it to t (respectively, c, n) –Additional variable turn

Automata for Peterson’s Mutual Exclusion n1,n2 t1,n2 n1,t2 c1,n2 n1,c2t1,t2 turn=1 t1,t2 turn=2 c1,t2 t1,c2

Safety Examples –[c1, c2] not reached –([n1, n2], [c1, n2]) is not permitted

Example 3 Car climate control –Driver side temperature –Passenger side temperature –Controls for increasing and decreasing temperature –A button for `Sync’ –Minimum and maximum temperature

Automata for Car Climate Control

Use of Invariants Sync ==> d_temp = p_temp d_temp >= min p_temp >= min d_temp <= max p_temp <= max

Designing Programs Given an Invariant

Stepping back Overview of different program models –Message passing model –Shared memory model Different semantics

Why do we need models? There are many dimensions of variability in distributed systems. Examples: - interprocess communication mechanisms, - failure classes, - security mechanisms etc.

Why do we need models? Models are simple abstractions that help understand the variability -- abstractions that preserve the essential features, but hide the implementation details from observers who view the system at a higher level. For example C program is an abstraction of the assembly program that preserves certain properties

A message passing model System topology is a graph G = (V, E), where V = set of nodes (sequential processes) E = set of edges Four types of actions by a process: - Internal action-input action - Communication action-output action

A Reliable FIFO channel Axiom 1. Message m sent  message m received Axiom 2. Message propagation delay is arbitrary but finite. Axiom 3. m1 sent before m2  m1 received before m2. P Q

Life of a process in message passing model When a message m is rec’d 1.Evaluate a predicate with m and the local variables; 2. if predicate = true then -update internal variables; -send zero or more messages; else skip {do nothing} end if

Synchrony vs. Asynchrony Send & receive can be blocking or non- blocking Postal communication is asynchronous: Telephone communication is synchronous Synchronous communication or not? (1)Remote Procedure Call, (2) Synchronous clocks Physical clocks are synchronized Synchronous processes Lock-step synchrony Synchronous channels Bounded delay Synchronous message-order First-in first-out channels Synchronous communication Communication via handshaking

Shared memory model Address spaces of processes overlap M M2 Concurrent operations on a shared variable are serialized

Variations of shared memory models State reading model Each process can read the states of its neighbors Link register model Each process can read from and write to adjacent registers. The entire local state is not shared.

Weak vs. Strong Models One object (or operation) of a strong model = More than one objects (or operations) of a weaker model. Often, weaker models are synonymous with fewer restrictions. One can add layers (additional restrictions) to create a stronger model from weaker one. Examples HLL model is stronger than assembly language model. Asynchronous is weaker than synchronous. Bounded delay is stronger than unbounded delay (channel)

Model transformation Stronger models - simplify reasoning, but - needs extra work to implement Weaker models - are easier to implement. - Have a closer relationship with the real world “Can model X be implemented using model Y?” is an interesting question in computer science. Sample problems Non-FIFO to FIFO channel Message passing to shared memory Non-atomic broadcast to atomic broadcast

Non-FIFO to FIFO channel P Q buffer m1m4m3m2

Non-FIFO to FIFO channel {Sender process P}{Receiver process Q} var i : integer {initially 0}var k : integer {initially 0} buffer: buffer[0..∞] of msg {initially  k: buffer [k] = emptyrepeat send m[i],i to Q; {STORE} receive m[i],i from P; i := i+1 store m[i] into buffer[i]; forever {DELIVER} while buffer[k] ≠ empty do begin deliver content of buffer [k]; buffer [k] := empty  k := k+1; end forever

Observations (a) Needs Unbounded sequence numbers and (b) Unbounded number of buffer slots (Both are bad) Now solve the same problem on a model where (a) The propagation delay has a known upper bound of T. (b) The messages are sent per unit time. (c) The messages are received at a rate faster than r. The buffer requirement drops to r.T. Synchrony pays. Question. How to solve the problem using bounded buffer space if the propagation delay is arbitrarily large?

Message-passing to Shared memory {Read X by process i}: read x[i] {Write X:= v by process i} - x[i] := v; -Atomically broadcast v to every other process j (j ≠ i); -After receiving broadcast, process j (j ≠ i) sets x[j] to v. Understand the significance of atomic operations. It is not trivial, but is very important in distributed systems This is incomplete. There are more pitfalls here.

Non-atomic to atomic broadcast Atomic broadcast = either everybody or nobody receives {process i is the sender} for j = 1 to N-1 (j ≠ i) send message m to neighbor[j] (Easy!) Now include crash failure as a part of our model. What if the sender crashes at the middle? Implement atomic broadcast in presence of crash.

Mobile-agent based communication Communicates via messengers instead of (or in addition to) messages. What is the lowest Price of an iPod in Michigan? Carries both program and data

June 20, 2007ProSe: Programming Tool for Sensor Networks; 41 Computational Model in Sensor Networks More generally in wireless networks –Communication pattern is local broadcast However, if multiple nodes broadcast simultaneously then a collision may occur

Write-All with Collision Model Thus, communication in sensor networks is local broadcast with collision In one action, –A sensor can (1) write its own state and (2) the state of all its neighbors at once –However, simultaneous execution of 2 or more sensors may cause the state of the common neighbors to remain unchanged Can we convert a program in shared memory to write-all with collision model? –Why?

Other classifications of models Reactive vs Transformational systems A reactive system never sleeps (like: a server) A transformational (or non-reactive systems) reaches a fixed point after which no further change occurs in the system Named vs Anonymous systems In named systems, process id is a part of the algorithm. In anonymous systems, it is not so. All are equal. (-) Symmetry breaking is often a challenge. (+) Easy to switch one process by another with no side effect.