Ziv Dayan 200388130, Tom Afek Kafka 200637247. Instructor: Ittay Eyal.



- What is a failure detector?
- Our failure detector
- Software implementation
- Gossip style
- Independent local unit

- Communication is by messages
- Each message contains a list of heartbeats
- Each heartbeat contains:
  - IP of creator
  - Time since creation
- Each node contains its own Local Node

[Diagram labels: Local Node, Net Members, Node, Neighbors, Versions, Neighbor, Version]
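The message layout above can be sketched in Python as follows. This is only an illustration: the class and field names (`Heartbeat`, `Message`, `LocalNode`, `last_created`, `merge`) are hypothetical and not taken from the slides, and the sketch assumes that a receiver converts "time since creation" into a creation time on its own clock.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Heartbeat:
    creator_ip: str  # IP of the node that created the heartbeat
    age: float       # time since creation, in seconds


@dataclass
class Message:
    # Each message carries a list of heartbeats.
    heartbeats: List[Heartbeat] = field(default_factory=list)


@dataclass
class LocalNode:
    # Hypothetical local view: for every known creator, the local clock time
    # at which its freshest known heartbeat was created.
    last_created: Dict[str, float] = field(default_factory=dict)

    def merge(self, message: Message, now: float) -> None:
        """Fold a received message into the local view, keeping the freshest heartbeat."""
        for hb in message.heartbeats:
            created_local = now - hb.age  # creation time expressed on the local clock
            known = self.last_created.get(hb.creator_ip, float("-inf"))
            if created_local > known:
                self.last_created[hb.creator_ip] = created_local
```

Sending an age rather than an absolute timestamp means the nodes do not need synchronized clocks; each receiver only compares values on its own clock.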

Repeat periodically:
- Choose the node whose threshold is closest to expiration
- Wait until the threshold has expired
- Check the local creation time of the last received heartbeat created by the suspected node:
  - If it has changed: the node is OK
  - Otherwise: the suspected node has crashed
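One iteration of this loop might look like the Python sketch below. It is a minimal illustration under stated assumptions, not the authors' implementation: the `Detector` class, its method names, and the injectable `clock` and `sleep` parameters are hypothetical, and the sketch assumes a separate message handler updates `last_created` whenever a fresher heartbeat arrives.

```python
import time
from typing import Callable, Dict, Optional


class Detector:
    def __init__(self, threshold: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.threshold = threshold
        self.clock = clock
        self.sleep = sleep
        # Local creation time of the freshest heartbeat per node
        # (assumed to be updated elsewhere, by the message handler).
        self.last_created: Dict[str, float] = {}
        self._armed_value: Dict[str, float] = {}  # last_created when the timer was armed
        self._deadline: Dict[str, float] = {}     # local time at which the threshold expires

    def watch(self, ip: str) -> None:
        """Start (or restart) the threshold timer for a node."""
        now = self.clock()
        self.last_created.setdefault(ip, now)
        self._armed_value[ip] = self.last_created[ip]
        self._deadline[ip] = now + self.threshold

    def step(self) -> Optional[str]:
        """Run one iteration; return the IP of a node judged crashed, else None."""
        if not self._deadline:
            return None
        # Choose the node whose threshold is closest to expiration.
        suspect = min(self._deadline, key=self._deadline.get)
        # Wait until the threshold has expired.
        remaining = self._deadline[suspect] - self.clock()
        if remaining > 0:
            self.sleep(remaining)
        # Has the creation time of the last heartbeat changed meanwhile?
        if self.last_created[suspect] != self._armed_value[suspect]:
            self.watch(suspect)  # changed: the node is OK, re-arm its timer
            return None
        del self._deadline[suspect]
        del self._armed_value[suspect]
        return suspect           # unchanged: the node is considered crashed
```

Passing `clock` and `sleep` as parameters is a design choice made here so the loop can be driven by a simulated clock in tests; a real deployment would use the defaults.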

[Architecture diagram labels: Computer, Listener, Main, Message Handler, Message Sender, Sender, Detector]

- A new abstract class is added: NetMessage
  - Method 1: Handle() decodes the received message using the proper version and returns a Message
  - Method 2: toString() is used for serialization

[Class diagram labels: NetMessage, SHA1Message, NormalMessage, Message]
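The class hierarchy above could be sketched in Python roughly as follows. Only the class names come from the slides; everything else is an assumption made for illustration. The text wire format (`ip=age;ip=age`), the `digest|body` layout of `SHA1Message`, and the helper `_parse` are all hypothetical, and `handle()` and `to_string()` stand in for Handle() and toString().

```python
import hashlib
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    heartbeats: List[Tuple[str, float]]  # (creator IP, time since creation)


class NetMessage(ABC):
    """A received message in one specific wire-format version."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    @abstractmethod
    def handle(self) -> Message:
        """Decode the raw text according to this version and return a Message."""

    def to_string(self) -> str:
        """Serialize back to the wire text."""
        return self.raw


def _parse(body: str) -> Message:
    # Hypothetical wire format: "ip=age;ip=age;..."
    pairs = [item.split("=") for item in body.split(";") if item]
    return Message([(ip, float(age)) for ip, age in pairs])


class NormalMessage(NetMessage):
    def handle(self) -> Message:
        return _parse(self.raw)


class SHA1Message(NetMessage):
    # Assumed layout: "<sha1 hex digest of body>|<body>"
    def handle(self) -> Message:
        digest, _, body = self.raw.partition("|")
        if hashlib.sha1(body.encode()).hexdigest() != digest:
            raise ValueError("SHA-1 digest mismatch")
        return _parse(body)
```

With this shape, the rest of the program only sees `Message`, so a new wire-format version is added by writing one more `NetMessage` subclass.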

- H = f(P, n, threshold)
- Assumptions required
- Simplicity vs. efficiency
- Full topology
- Spread time << threshold

- Assumption: local information
- Strong assumption
- Reliability
- x: number of messages
- Probability for false detection
- We want
- Result:

- Linear performance
- The larger P is, the steeper the slope

- Assumptions
  - Synchrony
  - Consistency
- Calculation for average case

- High performance

- Comparison categories
  - Efficiency
  - Scalability
  - Dynamism
  - Reliability