CS514: Intermediate Course in Operating Systems Professor Ken Birman Ben Atkin: TA Lecture 20 Nov. 2.


Scalable Reliable Multicast
Became a hot topic with the introduction of Berkeley’s SRM
Idea is to provide a primitive that
  – Has best-effort reliability (tries to repair problems… doesn’t promise)
  – Is receiver driven
      Sender doesn’t know who is in the group
      Receiver must solicit repairs
  – Has localized costs

SRM goal compared to virtual synchrony
Virtual synchrony provides a strong guarantee
  – Either all processes that stay operational receive the message
  – Or none does
But this seems to limit scalability
Virtual synchrony can be fast but apparently is limited to groups of about 50 processes or fewer

Stock Exchange Problem: Vsync. multicast is too “fragile”
Most members are healthy…
… but one is slow

Figure 1: Multicast throughput in an 8-member group perturbed by transient failures (curves: ideal, actual)

The problem gets worse as the system scales up
[Figure: virtually synchronous Ensemble multicast protocols; average throughput on nonperturbed members versus perturb rate, for group sizes 32, 64 and 96]

SRM
Unlike virtual synchrony, the membership of the group will be loosely defined
  – Do it in the style of IP multicast
  – Receivers join at will
Receiver has a way to request missing data

Application Level Framing
Really, a separate idea
But was bundled with SRM:
  – Application is assumed to be doing video multicast streaming
  – Perhaps it has its own way to recover lost data
So instead of retransmitting precisely the lost data…
… we solicit retransmission but let the application decide what to send in response

Comparison
Virtual synchrony:
  – Strong membership
  – Each multicast has its own identity
  – Strong guarantees of ordering, synchronization, atomicity
SRM:
  – Weak membership
  – Multicast has a sequence number but retransmission determined separately
  – Weak guarantees

Flow control
Virtual synchrony does flow control
  – Sender has some amount of space
  – If it runs low, new multicasts are stopped
SRM lacks any form of flow control
Sender just runs at whatever rate it may select, oblivious to network
In fact, SRM doesn’t even use acknowledgements from receiver to sender!
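The sender-side rule on this slide (bounded space, stop when it runs low) can be sketched in a few lines. This is only an illustration: the class name `BoundedSender`, the `capacity` parameter, and the idea that the space holds messages until they are reported stable are assumptions, not details taken from the lecture.

```python
class BoundedSender:
    """Hypothetical sender that refuses new multicasts when its buffer is full."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed fixed amount of buffer space
        self.unstable = {}         # seq -> payload still held by the sender
        self.next_seq = 0

    def try_multicast(self, payload):
        # Flow control: when space has run out, the new multicast is stopped.
        if len(self.unstable) >= self.capacity:
            return None
        seq = self.next_seq
        self.unstable[seq] = payload
        self.next_seq += 1
        return seq

    def on_stable(self, seq):
        # Space is reclaimed once the message no longer needs to be kept.
        self.unstable.pop(seq, None)
```

An SRM sender has no counterpart to `try_multicast` returning `None`: it keeps sending at its chosen rate.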

How SRM works
Assumes that network has an IP multicast routing tree within it
So, IP multicast is reasonably cheap
SRM is built using broadcasts within this tree
  – One exception: so-called “scalable session messages”
  – We’ll return to this later

Basic algorithm
Application generates and numbers a packet
Sends it using IP multicast
Receiver receives it, puts it in order, delivers it
… cost is one packet per link in the IP multicast tree, one send and one receive per process
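The receiver's part of the basic algorithm (receive, put in order, deliver) can be sketched as below. The class `InOrderReceiver` and its fields are hypothetical names chosen for illustration, and the list `delivered` merely stands in for handing data to the application.

```python
class InOrderReceiver:
    """Buffers out-of-order packets and delivers them in sequence order."""

    def __init__(self):
        self.next_seq = 0     # next sequence number to deliver
        self.pending = {}     # seq -> payload, held until earlier packets arrive
        self.delivered = []   # stand-in for delivery to the application

    def receive(self, seq, payload):
        # Ignore duplicates of packets already delivered or already buffered.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver every packet that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

A packet that arrives ahead of a missing one simply waits in `pending`; nothing here requests the missing packet, which is the job of the error recovery described later.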

Basic algorithm: application perspective
[Diagram: one sender (S) and six receivers (R); the legend also marks network routers]

Basic algorithm: network perspective
[Diagram: sender (S) and six receivers (R) connected through network routers]

Things to notice
IP multicast lives “in the network”
SRM protocol lives in the application
There may be many more network nodes involved than application nodes
Similar observation applies to any protocol unless implemented by the router itself

SRM error recovery
Suppose a packet is lost…
This will either happen
  – In the O/S near the receiver
      Affects just one process
  – Or in the network or the O/S of the sender
      Affects a subtree
In SRM, subtree might be very large

Basic algorithm: network perspective
[Diagram: sender (S) and six receivers (R) connected through network routers]

Receiver notices
Either it receives the next packet in the stream
  – In this case it notices a gap
Or it receives a “session message”
  – Again, this causes it to see a gap
  – Sent periodically
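Both ways of noticing a loss reduce to the same check: the receiver learns that some sequence number exists and compares it with what it holds. The sketch below assumes that a session message carries the highest sequence number sent so far; the class `GapDetector` and its method names are hypothetical.

```python
class GapDetector:
    """Tracks which sequence numbers are known to exist but have not arrived."""

    def __init__(self):
        self.received = set()
        self.highest_known = -1   # highest sequence number known to have been sent

    def _missing(self, highest):
        self.highest_known = max(self.highest_known, highest)
        return [s for s in range(self.highest_known + 1) if s not in self.received]

    def on_data(self, seq):
        # A later packet in the stream reveals any gap before it.
        self.received.add(seq)
        return self._missing(seq)

    def on_session_message(self, highest_sent):
        # Assumed content of a session message: the highest number sent so far.
        return self._missing(highest_sent)
```

The session message matters most at the tail of a stream, where no later data packet would otherwise expose the loss.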

A wave of discovery
Receivers have some distance from the sender
Can measure it in milliseconds
Close receivers notice the problem first.

Basic algorithm: network perspective
[Diagram: sender (S) and six receivers (R) connected through network routers]

Basic algorithm: network perspective
[Diagram: sender (S) and six receivers (R) connected through network routers, with a NAK marked]

NAK spreads like a wave
Other receivers see the NAK
  – It inhibits them from sending their retransmission request
Some upstream sender sees it too
  – It generates a repair

NAK or Repair Flood
But why won’t all receivers at a given distance NAK simultaneously?
Or all senders retransmit?
  – In fact this would be a problem as protocol was described

SRM borrows an idea from the Ethernet
Introduce a few milliseconds of randomness
Instead of sending the NAK after timeout δ…
… pick a delay ε and send after a randomly selected delay in time interval [δ … δ + ε]
With any luck at all, first to NAK inhibits the others
Similarly for retransmissions

Why will this work?
Ideally, packet loss is low
  – Hence zero or one losses in the entire tree
  – But many impacted receivers
With this scheme
  – Ideally, one NAK inhibits others
  – One retransmission inhibits others
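A toy simulation shows why the randomized timer helps. It is a sketch under simplifying assumptions that are not in the lecture: every receiver hears every NAK after the same fixed latency, and timers are drawn uniformly from the interval starting at the base timeout. The function `count_naks` and its parameters are hypothetical.

```python
import random


def count_naks(n_receivers, base_ms, spread_ms, latency_ms, rng):
    """Count how many receivers actually send a NAK for one lost packet."""
    # Each receiver picks its own firing time in [base, base + spread].
    timers = sorted(base_ms + rng.uniform(0, spread_ms) for _ in range(n_receivers))
    sent = []
    for t in timers:
        # A receiver is inhibited if an earlier NAK has reached it by time t.
        if any(s + latency_ms <= t for s in sent):
            continue
        sent.append(t)
    return len(sent)
```

With no spread, every receiver fires at the same instant and all of them send. As the spread grows relative to the latency, the count tends toward one, at the price of a longer wait before the first NAK goes out.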

Scalable Session Messages
In a large tree may see losses in different subtrees
Want to limit cost of
  – NAK
  – Repair
Do this using the regional TTL setting for IP multicast
Limits the propagation of the various messages, except for initial multicast of the message itself
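At the socket level, limiting how far a multicast datagram travels is done with the standard `IP_MULTICAST_TTL` option. The sketch below only shows how a sender might open a socket with a small TTL for NAKs and repairs; the helper name `make_scoped_sender` and the TTL value in the usage note are illustrative assumptions, not values from SRM.

```python
import socket


def make_scoped_sender(ttl):
    """Return a UDP socket whose multicast datagrams expire after `ttl` hops."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Routers decrement the TTL on each hop and drop the datagram when it
    # reaches zero, so a small value keeps the message within a region.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

A process could keep two such sockets, for example one with a small TTL for recovery traffic and one with a large TTL for the original data, matching the exception noted on the slide.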

Experience?
SRM works well in networks with low loss rates
But if packet loss becomes higher
  – SRM overheads begin to soar
  – May see many solicitations, repairs on each network link
  – Eventually, this overloads the network
Example: with 100 receivers in a large network we easily provoked scenarios where each packet triggers 5 or 10 NAKs and a similar number of repairs

Why is this bad?
All processes see these NAK and repair messages
Hence costs are global

A quadratic cost!
Risk of a “multi-loss” rises with network size, roughly linearly
And cost of a NAK or repair is also linear
Could argue that SRM has an O(n²) mechanism buried within
In practice, dies around 50 processes… similar to vsync!
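The argument can be written out as a rough estimate. The constants c_1 and c_2 are placeholders introduced here for illustration; the lecture gives no values for them.

```latex
\begin{align*}
\text{recovery events per multicast} &\approx c_1\, n
  && \text{(risk of loss grows roughly linearly with size } n\text{)} \\
\text{cost per NAK or repair} &\approx c_2\, n
  && \text{(every process sees each message)} \\
\text{recovery cost per multicast} &\approx (c_1 n)(c_2 n) = c_1 c_2\, n^2 = O(n^2)
\end{align*}
```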

Status?
In the Internet today, IP multicast is mostly disabled
If used at all, limited to individual companies
  – But even then is usually disabled
Internet mbone set aside for multicast
But has a rigid structure, used for a single Internet “video feed”

Alternatives to SRM
RMTP was developed at Lucent
  – Reliable Multicast Transport Protocol
Supports 1-many file transfer
Uses notion of regional repair servers
  – Each region has a repair server that handles processes in its vicinity
  – Hierarchical repair architecture
  – Also has flow-control built in

RMTP status
Experience suggests it works well
  – But placement of repair servers is very important
  – Also, not very helpful for arbitrary uses; most experience is with file transfer applications
An IETF standard as of 1999

PGM
A Cisco proposal
Idea is to put a form of reliable multicast right into the routers
  – Routers talk to each other
  – Recovery of lost packets occurs locally
  – But mechanism is very simple and offers no guarantees

PGM
Cisco promoting it as a standard
But seems not to be taking off
To be useful, needs to work on all routers in a network
  – Cisco has a tunneling scheme to leapfrog small numbers of non-compliant routers

LGM
Lightweight group multicast: another network-level proposal
Getting a lot of attention now
But no implementations in the field
And there are three or four other proposals

Forward Error Correction
Can be used with any protocol
Idea is:
  – Instead of just sending data: m0 … m1 … m2 …
  – Send an error correction code from time to time: m0 … m1 … m2 … ECC012 … m3 …
Receiver can use code to correct for packet loss

FEC continued
Example?
  – Suppose that ECC simply is XOR of prior k packets
  – Then with any k-1 of them and ECC we can reconstruct the missing packet
  – Hence would need to lose 2 to have an unrecoverable gap
Can use more powerful codes but they tend to be slower to compute
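The XOR example can be made concrete with a few lines of code. This is a minimal sketch that assumes all k packets have the same length; the function names `make_parity` and `recover` are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """ECC packet: the XOR of the k data packets (assumed equal length)."""
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Rebuild the single missing packet, marked as None in `received`."""
    present = [p for p in received if p is not None]
    if len(present) != len(received) - 1:
        raise ValueError("XOR parity can repair exactly one lost packet")
    # XOR-ing the parity with the k-1 surviving packets cancels them out,
    # leaving only the missing one.
    return reduce(xor_bytes, present, parity)
```

For example, with three packets and their parity, `recover([m0, None, m2], parity)` returns `m1`; losing two of the three leaves the gap unrecoverable, as the slide notes.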

FEC continued
With genuinely random packet loss, FEC works really well
No need to send a NAK or a repair
Good match to Internet router behavior
  – RED drops ’em
  – FEC fixes ’em!
One idea: FEC plus congestion control equals SRM!

Issues?
Dependency on a single route or single IP multicast forwarding tree
  – Routers rarely fail…
  – But if router becomes heavily loaded, may drop many packets
  – Similarly if routes change – cf. Jahanian study

Wishful thinking?
Recall our idea of a network routing scheme that would offer failure-independent redundant routes
  – (A,B) and (A’,B’) “independent”
If we had such a scheme for IP multicast…
… we could retransmit over it and would have a good chance of recovery even if network becomes very overloaded!

IBM Gryphon Routers
Gryphon: high speed publish-subscribe project at IBM
  – Idea was to build hardware
  – Started with software on UNIX boxes… highly optimized
Special purpose “network” just for publish-subscribe applications
  – Targets stock markets, financial systems
  – Likely to have a substantial market
  – But expensive: dedicated infrastructure

What will future bring?
Topic of much debate
Two broad schools of thought
  – Mine (pessimistic)
  – Theirs (optimistic)

Mine?
Perhaps I’m too negative
But I suspect:
  – The Internet is a mess
  – QoS is not coming any time soon
  – Multicast isn’t about to start working
  – Situation broken and we need a whole new “web” design
But perhaps can reuse many pieces of existing architecture and technology
Much early evidence suggests this can work

Theirs
The prevailing academic research perspective
They see
  – DiffServ, multilevel queuing…
  – Various multicast mechanisms
And they expect
  – Internet telephony
  – Publish-subscribe hardware
  – Streaming media over IP
Vast riches for all!