Advanced Operating System

Presentation transcript:

Advanced Operating System
Group Communication

Hello, everyone. My name is Shinya Kobayashi. Today, I am going to present our paper titled “Inter-Cluster Job Coordination Using Mobile Agents” on behalf of the first author, Munehiro Fukuda. Munehiro was hoping to attend and present the paper at AMS2001; however, he has to wait in Japan until he receives his H1B visa. Since I received the presentation materials from him quite recently, please allow me to present this paper using this script. I will answer your questions as far as I can, but you can also ask Munehiro by email. His email address is on the title page of our paper. (time 1:05)

7/6/2018 Advanced Operating System

Group Communication

Communication types:
  - One-to-many: broadcast
  - Many-to-one: synchronization, collective communication
  - Many-to-many: gather and scatter

Group addressing:
  - Multicast address
  - Broadcasting
  - One-to-one communication
      - Performance drawback on bus-type networks
      - Simpler for switching-based networks

Semantics:
  - Send-to-all, bulletin-board semantics
  - 0-, 1-, m-out-of-n, all-reliable

Atomic Multicast

  - Send-to-all semantics and all-reliable
  - Simple emulation: a repetition of one-to-one communication with acknowledgment
  - What if a receiver fails?
      - Time-out retransmission
  - What if a sender fails before all receivers receive a message?
      - All receivers forward the message to the same group.
      - A receiver discards the second and any later copies of the message.
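The forwarding rule above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the names `Receiver`, `atomic_multicast`, and the `reached` parameter are hypothetical, the network is replaced by a queue, and receiver failures and retransmission are left out.

```python
from collections import deque


class Receiver:
    """Group member that accepts each message once and drops duplicates."""

    def __init__(self, name):
        self.name = name
        self.seen = set()      # ids of messages already accepted
        self.delivered = []    # payloads handed to the application

    def receive(self, msg_id, payload):
        """Return True if the message is new and must be forwarded."""
        if msg_id in self.seen:
            return False       # second or later copy: discard
        self.seen.add(msg_id)
        self.delivered.append(payload)
        return True


def atomic_multicast(group, msg_id, payload, reached):
    """Simulate a sender that crashes after reaching only `reached`.

    Each receiver that accepts a new message forwards it to the rest of
    the group, so the message still arrives at every member.
    """
    pending = deque(reached)
    while pending:
        name = pending.popleft()
        if group[name].receive(msg_id, payload):
            pending.extend(n for n in group if n != name)


group = {name: Receiver(name) for name in ("R1", "R2", "R3")}
# The sender fails after only R1 has received the message.
atomic_multicast(group, msg_id=1, payload="update", reached=["R1"])
for member in group.values():
    print(member.name, member.delivered)
```

Forwarding costs extra messages for every multicast, which is the price of tolerating a sender failure in this scheme.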

Message Ordering

R1 and R2 receive m1 and m2 in a different order! Some message ordering is required:
  - Absolute ordering
  - Consistent ordering
  - Causal ordering
  - FIFO ordering

[Figure: senders S1 and S2, receivers R1 and R2, messages m1 and m2 arriving in different orders]

Absolute Ordering

Rule:
  - $m_i$ must be delivered before $m_j$ if $T_i < T_j$

Implementation:
  - A clock synchronized among machines
  - A sliding time window used to commit delivery of the messages whose timestamps fall within the window

Example:
  - Distributed simulation

Drawback:
  - The constraint is too strict
  - No absolute synchronized clock exists
  - No guarantee to catch all tardy messages

[Figure: messages $m_i$ and $m_j$ with timestamps $T_i < T_j$; $m_i$ is delivered before $m_j$]
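The sliding-window idea can be sketched as follows. This is an illustration under assumptions, not the slide author's implementation: the class name `WindowedDelivery`, the `window` parameter, and the choice to reject a message older than the last committed timestamp are all hypothetical, and the clocks are assumed to be perfectly synchronized.

```python
import heapq


class WindowedDelivery:
    """Buffer messages and commit them in timestamp order.

    A message is committed once the local clock is `window` time units
    past its timestamp; the window is the grace period for late arrivals.
    """

    def __init__(self, window):
        self.window = window
        self.buffer = []                     # min-heap of (timestamp, payload)
        self.last_committed = float("-inf")  # newest timestamp delivered

    def receive(self, timestamp, payload):
        """Return False if the message is too tardy to be ordered."""
        if timestamp < self.last_committed:
            return False
        heapq.heappush(self.buffer, (timestamp, payload))
        return True

    def commit(self, now):
        """Deliver every buffered message with timestamp <= now - window."""
        out = []
        while self.buffer and self.buffer[0][0] <= now - self.window:
            timestamp, payload = heapq.heappop(self.buffer)
            self.last_committed = timestamp
            out.append(payload)
        return out


receiver = WindowedDelivery(window=5)
receiver.receive(12, "mj")               # arrives first, later timestamp
receiver.receive(10, "mi")               # arrives second, earlier timestamp
first = receiver.commit(now=17)          # delivered in timestamp order
late = receiver.receive(11, "tardy")     # older than the last commit
print(first, late)
```

The rejected message shows the last drawback on the slide: a message delayed beyond the window cannot be placed in order any more.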

Consistent Ordering

Rule:
  - Messages are received in the same order by all receivers (regardless of their timestamps).

Implementation:
  - A message is sent to a sequencer, assigned a sequence number, and finally multicast to the receivers
  - A receiver retrieves messages in increasing sequence-number order

Example:
  - Replicated database update

Drawback:
  - A centralized algorithm

[Figure: messages $m_i$ and $m_j$ with timestamps $T_i < T_j$; both receivers get $m_j$ before $m_i$]
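A minimal sketch of the sequencer scheme follows, assuming a single in-process sequencer and reliable delivery. The names `Sequencer`, `OrderedReceiver`, and `stamp` are hypothetical and chosen only for this illustration.

```python
import itertools


class Sequencer:
    """Central process that assigns a global sequence number to each message."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, payload):
        return next(self._counter), payload


class OrderedReceiver:
    """Delivers messages strictly in increasing sequence-number order."""

    def __init__(self):
        self.next_seq = 1
        self.held = {}          # out-of-order messages keyed by sequence number
        self.delivered = []

    def receive(self, seq, payload):
        self.held[seq] = payload
        # Release every message that is now contiguous with the last delivery.
        while self.next_seq in self.held:
            self.delivered.append(self.held.pop(self.next_seq))
            self.next_seq += 1


sequencer = Sequencer()
first = sequencer.stamp("mj")    # reaches the sequencer first
second = sequencer.stamp("mi")   # reaches the sequencer second

r1, r2 = OrderedReceiver(), OrderedReceiver()
for seq, payload in (first, second):    # network order at R1
    r1.receive(seq, payload)
for seq, payload in (second, first):    # reversed network order at R2
    r2.receive(seq, payload)
print(r1.delivered, r2.delivered)
```

Both receivers end up with the same order even though the network delivered the messages differently, but every message has to pass through the one sequencer.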

Causal Ordering

Rule: happened-before relation
  - If $e_i^k, e_i^l \in h_i$ and $k < l$, then $e_i^k \to e_i^l$.
  - If $e_i = \mathrm{send}(m)$ and $e_j = \mathrm{receive}(m)$, then $e_i \to e_j$.
  - If $e \to e'$ and $e' \to e''$, then $e \to e''$.

Implementation:
  - Use of a vector message

Example:
  - Distributed file system

Drawback:
  - Vector as an overhead
  - Broadcast assumed

[Figure: senders S1 and S2, receivers R1, R2, and R3, messages m1 to m4; from R2’s viewpoint, $m_1 \to m_2$]
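One common way to realize causal delivery is to attach a vector of per-process broadcast counts to each message and hold back a message until everything it depends on has been delivered. The sketch below assumes that approach; the class `CausalReceiver`, its method names, and the process numbering are hypothetical, and every message is assumed to be broadcast to the whole group.

```python
class CausalReceiver:
    """Holds back broadcast messages until their causal past is delivered."""

    def __init__(self, n_processes):
        self.clock = [0] * n_processes   # broadcasts delivered per sender
        self.held = []                   # messages waiting for predecessors
        self.delivered = []

    def _deliverable(self, sender, vector):
        for k, count in enumerate(vector):
            if k == sender:
                # Must be the next broadcast from this sender.
                if count != self.clock[k] + 1:
                    return False
            elif count > self.clock[k]:
                # Depends on a broadcast we have not delivered yet.
                return False
        return True

    def receive(self, sender, vector, payload):
        self.held.append((sender, vector, payload))
        progress = True
        while progress:                  # one delivery may unblock others
            progress = False
            for item in list(self.held):
                src, vec, data = item
                if self._deliverable(src, vec):
                    self.held.remove(item)
                    self.clock[src] = vec[src]
                    self.delivered.append(data)
                    progress = True


# Process 0 broadcasts m1; process 1 delivers m1 and then broadcasts m2,
# so m1 happened before m2. The receiver below gets m2 first.
r3 = CausalReceiver(n_processes=3)
r3.receive(sender=1, vector=[1, 1, 0], payload="m2")
held_after_m2 = len(r3.held)
r3.receive(sender=0, vector=[1, 0, 0], payload="m1")
print(r3.delivered)
```

The vector grows with the number of processes, which is the overhead the slide lists as a drawback.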

FIFO Ordering

Rule:
  - Messages are received in the same order as they were sent.

Implementation:
  - Messages are assigned a sequence number

Example:
  - TCP

This is the weakest ordering.

[Figure: sender S sends m1 to m4 to receiver R through Router 1 and Router 2]
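Sequence-number reordering at the receiver can be sketched as below. This is a simplified illustration with hypothetical names (`FifoReceiver`, `receive`); it assumes each sender numbers its messages from 1 and it is not a model of how TCP itself is implemented.

```python
class FifoReceiver:
    """Delivers each sender's messages in the order they were sent."""

    def __init__(self):
        self.next_seq = {}     # next expected sequence number per sender
        self.held = {}         # early messages keyed by (sender, seq)
        self.delivered = []

    def receive(self, sender, seq, payload):
        expected = self.next_seq.setdefault(sender, 1)
        if seq < expected:
            return             # duplicate of an already delivered message
        self.held[(sender, seq)] = payload
        # Release the contiguous run starting at the expected number.
        while (sender, self.next_seq[sender]) in self.held:
            key = (sender, self.next_seq[sender])
            self.delivered.append(self.held.pop(key))
            self.next_seq[sender] += 1


r = FifoReceiver()
# Two routes reorder the messages in transit: m2, m1, m4, m3.
for seq in (2, 1, 4, 3):
    r.receive("S", seq, f"m{seq}")
print(r.delivered)
```

Only messages from the same sender are ordered here; nothing is guaranteed about the relative order of messages from different senders, which is why FIFO is the weakest of the four orderings.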