CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)


CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy) Sep 12, 2017 Lecture 5: Gossiping All slides © IG

Multicast

Fault-tolerance and Scalability
Needs:
- Reliability (atomicity): 100% receipt
- Speed

Centralized

Tree-Based

Tree-based Multicast Protocols
- Build a spanning tree among the processes of the multicast group
- Use the spanning tree to disseminate multicasts
- Use either acknowledgements (ACKs) or negative acknowledgements (NAKs) to repair multicasts not received
- SRM (Scalable Reliable Multicast)
  - Uses NAKs
  - But adds random delays, and uses exponential backoff, to avoid NAK storms
- RMTP (Reliable Multicast Transport Protocol)
  - Uses ACKs
  - But ACKs are sent only to designated receivers, which then retransmit missing multicasts
- These protocols still cause an O(N) ACK/NAK overhead [Birman99]

A Third Approach

“Epidemic” Multicast (or “Gossip”)

Push vs. Pull
- So that was “Push” gossip
  - Once you have a multicast message, you start gossiping about it
  - Multiple messages? Gossip a random subset of them, or recently-received ones, or higher-priority ones
- There’s also “Pull” gossip
  - Periodically poll a few randomly selected processes for new multicast messages that you haven’t received
  - Get those messages
- Hybrid variant: Push-Pull
  - As the name suggests
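A minimal simulation sketch of push gossip, to make the description concrete. The function name, the uniform random target choice (which may hit the sender itself or an already-infected node), and the `fanout`, `loss`, and `seed` parameters are assumptions for illustration, not taken from the slides.

```python
import random

def push_gossip_rounds(n, fanout, seed=0, loss=0.0, max_rounds=1000):
    """Simulate push gossip among n nodes.

    Node 0 starts with the multicast. In every round, each infected node
    sends the message to `fanout` targets chosen uniformly at random.
    Each message is dropped independently with probability `loss`.
    Returns the number of infected nodes after each round (index 0 = start).
    """
    rng = random.Random(seed)
    infected = {0}
    history = [1]
    for _ in range(max_rounds):
        if len(infected) == n:
            break
        newly_infected = set()
        for _node in infected:
            for _ in range(fanout):
                target = rng.randrange(n)
                if rng.random() >= loss:  # message survived the network
                    newly_infected.add(target)
        infected |= newly_infected
        history.append(len(infected))
    return history

history = push_gossip_rounds(n=1000, fanout=2, seed=1)
print("rounds needed:", len(history) - 1)
print("infected per round:", history)
```

The per-round counts typically show a slow start, a rapid middle phase, and a short tail while the last few nodes are reached.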

Properties
Claim that the simple Push protocol
- Is lightweight in large groups
- Spreads a multicast quickly
- Is highly fault-tolerant

Analysis
- From old mathematical branch of Epidemiology [Bailey 75]
- Population of (n+1) individuals mixing homogeneously
- Contact rate between any individual pair is $\beta$
- At any time, each individual is either uninfected (numbering $x$) or infected (numbering $y$)
- Then, $x_0 = n$, $y_0 = 1$, and at all times $x + y = n + 1$
- Infected–uninfected contact turns latter infected, and it stays infected

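Analysis (contd.)
- Continuous time process
- Then $\frac{dx}{dt} = -\beta x y$ (why?)
- with solution: $x = \frac{n(n+1)}{n + e^{\beta(n+1)t}}$, $y = \frac{n+1}{1 + n e^{-\beta(n+1)t}}$ (can you derive it?)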
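A small numerical check, assuming the standard epidemic model dx/dt = -beta*x*y with x(0) = n and y(0) = 1, whose closed form is y(t) = (n+1) / (1 + n*exp(-beta*(n+1)*t)). The sketch compares that closed form against a plain forward-Euler integration; the function names and step count are illustrative choices.

```python
import math

def infected_closed_form(n, beta, t):
    """Closed-form number of infected individuals at time t."""
    return (n + 1) / (1 + n * math.exp(-beta * (n + 1) * t))

def infected_euler(n, beta, t_end, steps=100_000):
    """Integrate dx/dt = -beta*x*y with forward Euler, keeping x + y = n + 1."""
    x, y = float(n), 1.0
    dt = t_end / steps
    for _ in range(steps):
        dx = -beta * x * y * dt  # uninfected lost in this time step
        x += dx
        y -= dx                  # every lost uninfected becomes infected
    return y

n, beta, t = 1000, 2 / 1000, 5.0
print(infected_closed_form(n, beta, t), infected_euler(n, beta, t))
```

The two values should agree closely, which is a quick sanity check on the solution before attempting the derivation by hand.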

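Epidemic Multicast

Epidemic Multicast Analysis
- $\beta = \frac{b}{n}$, where b is the number of gossip messages each node sends per round (why?)
- Substituting, at time $t = c\log(n)$, the number of infected is $y \approx (n+1) - \frac{1}{n^{cb-2}}$ (correct? can you derive it?)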

Analysis (contd.)
- Set c, b to be small numbers independent of n
- Within $c\log(n)$ rounds [low latency]
  - all but $\frac{1}{n^{cb-2}}$ nodes receive the multicast [reliability]
  - each node has transmitted no more than $cb\log(n)$ gossip messages [lightweight]
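To get a feel for these three bounds, the sketch below plugs concrete numbers into the epidemic formula y(t) = (n+1) / (1 + n*exp(-beta*(n+1)*t)) with beta = b/n and t = c*log(n). It assumes the natural logarithm and uses a hypothetical helper name; the values of n, b, and c are arbitrary examples.

```python
import math

def push_gossip_bounds(n, b, c):
    """Return (rounds, missed_exact, missed_approx, messages_per_node).

    rounds            = c * ln(n)                        [low latency]
    missed_exact      = (n+1) - y(rounds) from the closed form
    missed_approx     = 1 / n**(c*b - 2)                 [reliability]
    messages_per_node = c * b * ln(n)                    [lightweight]
    """
    rounds = c * math.log(n)
    beta = b / n
    infected = (n + 1) / (1 + n * math.exp(-beta * (n + 1) * rounds))
    missed_exact = (n + 1) - infected
    missed_approx = 1 / n ** (c * b - 2)
    messages_per_node = c * b * math.log(n)
    return rounds, missed_exact, missed_approx, messages_per_node

rounds, exact, approx, msgs = push_gossip_bounds(n=1000, b=2, c=2)
print(f"rounds={rounds:.1f} missed(exact)={exact:.2e} "
      f"missed(approx)={approx:.2e} messages/node={msgs:.1f}")
```

With n = 1000 and b = c = 2, roughly 14 rounds and under 30 messages per node leave far less than one node expected to miss the multicast.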

Why is log(N) low?
- log(N) is not constant in theory
- But pragmatically, it is a very slowly growing number
- Base 2:
  - log(1000) ~ 10
  - log(1M) ~ 20
  - log(1B) ~ 30
  - log(all IPv4 addresses) = 32
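The base-2 figures can be reproduced in a few lines. The helper name and the labels are illustrative; the IPv4 row assumes the full 32-bit address space of 2**32 addresses.

```python
import math

def log2_table():
    """Base-2 logarithms of the group sizes mentioned on the slide."""
    sizes = {
        "1000": 10**3,
        "1M": 10**6,
        "1B": 10**9,
        "all IPv4 addresses": 2**32,
    }
    return {label: math.log2(size) for label, size in sizes.items()}

for label, value in log2_table().items():
    print(f"log2({label}) = {value:.2f}")
```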

Fault-tolerance
- Packet loss
  - 50% packet loss: analyze with b replaced with b/2
  - To achieve the same reliability as with 0% packet loss, takes twice as many rounds
- Node failure
  - 50% of nodes fail: analyze with n replaced with n/2 and b replaced with b/2
  - Same as above
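The "twice as many rounds" claim follows from the reliability bound of about 1/n**(c*b - 2) missed nodes after c*log(n) rounds: halving b requires doubling c to keep the product c*b fixed. A rough sketch under that assumption (natural log, hypothetical helper names):

```python
import math

def effective_fanout(b, packet_loss=0.0):
    """Fanout that actually gets through when a fraction of messages is lost."""
    return b * (1.0 - packet_loss)

def rounds_needed(n, b, exponent):
    """Rounds c*ln(n) so that about 1/n**exponent nodes miss the multicast.

    Solves c*b - 2 = exponent for c, per the push-gossip bound.
    """
    c = (exponent + 2) / b
    return c * math.log(n)

no_loss = rounds_needed(1000, effective_fanout(4), exponent=2)
half_loss = rounds_needed(1000, effective_fanout(4, packet_loss=0.5), exponent=2)
print(no_loss, half_loss)  # the second value is twice the first
```

For the node-failure case, the same function can be called with n/2 and b/2; the slightly smaller log term means the round count comes out marginally below double.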

Fault-tolerance
- With failures, is it possible that the epidemic might die out quickly?
- Possible, but improbable:
  - Once a few nodes are infected, with high probability, the epidemic will not die out
  - So the analysis we saw in the previous slides is actually behavior with high probability [Galey and Dani 98]
- Think:
  - Why do rumors spread so fast?
  - Why do infectious diseases cascade quickly into epidemics?
  - Why does a virus or worm spread rapidly?

Pull Gossip: Analysis
- In all forms of gossip, it takes O(log(N)) rounds before about N/2 processes get the gossip
  - Why? Because that’s the fastest you can spread a message: a spanning tree with constant fanout (degree) over N nodes has O(log(N)) height
- Thereafter, pull gossip is faster than push gossip
- After the i-th round, let $p_i$ be the fraction of non-infected processes. Let each round have k pulls. Then $p_{i+1} = (p_i)^{k+1}$
  - (a process stays non-infected only if it was non-infected and all k processes it pulled from were also non-infected)
- This is super-exponential
- Second half of pull gossip finishes in time O(log(log(N)))
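A short sketch that iterates the recurrence p_{i+1} = p_i**(k+1), starting from the point where half the processes are infected (p = 1/2), and counts rounds until fewer than one process is expected to remain non-infected. The function name and the stopping rule p*N < 1 are assumptions made for illustration.

```python
def pull_rounds_second_half(n, k=1):
    """Rounds of pull gossip needed after half of the n processes are infected.

    Iterates p_{i+1} = p_i ** (k + 1) from p = 0.5 until the expected
    number of non-infected processes, p * n, drops below 1.
    """
    p = 0.5
    rounds = 0
    while p * n >= 1:
        p = p ** (k + 1)
        rounds += 1
    return rounds

for n in (10**3, 10**6, 10**9):
    print(n, pull_rounds_second_half(n, k=1))
```

With k = 1 the non-infected fraction after r rounds is 2**(-2**r), so the round count grows like log(log(N)): a million-fold increase in N adds only a round or two.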

Topology-Aware Gossip
- Network topology is hierarchical
- Random gossip target selection => core routers face O(N) load (Why?)
- Fix: in subnet i, which contains $n_i$ nodes, pick a gossip target in your own subnet with probability $(1 - 1/n_i)$
- Router load = O(1)
- Dissemination time = O(log(N))
[Figure: two subnets of N/2 nodes each, connected through a router]
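The router-load claim can be checked with a back-of-the-envelope expectation. The sketch assumes every node gossips once per round with the given fanout, and that a gossip crosses the router exactly when its target lies outside the sender's subnet; the function name and parameters are hypothetical.

```python
def expected_router_load(subnet_sizes, fanout=1, topology_aware=True):
    """Expected number of gossip messages crossing the router per round."""
    total = sum(subnet_sizes)
    load = 0.0
    for n_i in subnet_sizes:
        if topology_aware:
            # Leave the subnet with probability 1/n_i (stay with 1 - 1/n_i).
            p_out = 1 / n_i
        else:
            # Uniform random target over all nodes in the system.
            p_out = (total - n_i) / total
        load += n_i * fanout * p_out
    return load

subnets = [500, 500]  # two subnets of N/2 nodes each, N = 1000
print(expected_router_load(subnets, topology_aware=False))  # grows with N
print(expected_router_load(subnets, topology_aware=True))   # about one per subnet
```

Under uniform selection half of all gossips cross the router, so the load is N/2 per round; with the 1/n_i rule each subnet contributes about one crossing per round regardless of its size.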

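Answer: Push Analysis (contd.)
- Using: $\beta = \frac{b}{n}$
- Substituting, at time $t = c\log(n)$:
  $y = \frac{n+1}{1 + n e^{-\frac{b}{n}(n+1) c\log(n)}} \approx \frac{n+1}{1 + \frac{1}{n^{cb-1}}} \approx (n+1)\left(1 - \frac{1}{n^{cb-1}}\right) \approx (n+1) - \frac{1}{n^{cb-2}}$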

So... is this all theory and a bunch of equations? Or are there implementations yet?

Some implementations
- Clearinghouse and Bayou projects: email and database transactions [PODC ‘87]
- refDBMS system [Usenix ‘94]
- Bimodal Multicast [ACM TOCS ‘99]
- Sensor networks [Li Li et al, Infocom ‘02, and PBBF, ICDCS ‘05]
- AWS EC2 and S3 Cloud (rumored) [‘00s]
- Cassandra key-value store (and others) use gossip for maintaining membership lists
- Usenet NNTP (Network News Transport Protocol) [‘79]

NNTP Inter-server Protocol
1. Each client uploads and downloads news posts from a news server
2. Exchange between an upstream server and a downstream server:
   - Upstream -> Downstream: CHECK <Message IDs>
   - Downstream -> Upstream: 238 {Give me!}
   - Upstream -> Downstream: TAKETHIS <Message>
   - Downstream -> Upstream: 239 OK
Server retains news posts for a while, transmits them lazily, deletes them after a while.
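A toy, in-memory sketch of the CHECK / TAKETHIS exchange, to show why it behaves like lazy gossip between servers. Only the 238 and 239 replies come from the slide; the 438 (not wanted) and 439 (rejected) replies are filled in from the NNTP streaming extension as an assumption, and the class and function names are hypothetical. No real network I/O is performed.

```python
class DownstreamServer:
    """Minimal model of a news server receiving a feed."""

    def __init__(self):
        self.articles = {}  # message-id -> article body

    def check(self, message_id):
        """Reply to CHECK: 238 = send it, 438 = already have it (assumed code)."""
        if message_id in self.articles:
            return f"438 {message_id}"
        return f"238 {message_id}"

    def takethis(self, message_id, body):
        """Reply to TAKETHIS: 239 = stored, 439 = rejected (assumed code)."""
        if message_id in self.articles:
            return f"439 {message_id}"
        self.articles[message_id] = body
        return f"239 {message_id}"


def feed(upstream_articles, downstream):
    """Offer every upstream article; transfer only those the peer asks for."""
    transferred = []
    for message_id, body in upstream_articles.items():
        if downstream.check(message_id).startswith("238"):
            if downstream.takethis(message_id, body).startswith("239"):
                transferred.append(message_id)
    return transferred


upstream = {"<a@example>": "first post", "<b@example>": "second post"}
peer = DownstreamServer()
peer.articles["<a@example>"] = "first post"  # peer already holds this one
print(feed(upstream, peer))                  # only the missing post is sent
```

Offering message IDs before sending bodies keeps duplicate transfers cheap, which matters when each server hears about the same post from several peers.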

Summary
- Multicast is an important problem
- Tree-based multicast protocols
- When concerned about scale and fault-tolerance, gossip is an attractive solution
- Also known as epidemics
- Fast, reliable, fault-tolerant, scalable, topology-aware