15-441: Computer Networking Lecture 25: Multicast Challenges, CDN and P2P systems
Overview: Multicast Challenges; Content Distribution Networks; Peer-to-Peer Networks
Multicast Issues: reliable transfer (ACK/NACK implosion, exposure); reliable multicast protocols (Scalable Reliable Multicast, Reliable Multicast Transport Protocol, Pragmatic General Multicast, Lightweight Multicast Service); congestion control
Loss Recovery. Sender-reliable: wait for ACKs from all receivers and re-send on timeout or selective ACK; per-receiver state in the sender is not scalable; ACK implosion. Receiver-reliable: the receiver NACKs (sends a resend request for) lost packets; does not provide 100% reliability; NACK implosion.
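To make the receiver-reliable idea concrete, here is a minimal Python sketch (names and structure are illustrative assumptions, not any specific protocol) of a receiver that infers losses from sequence-number gaps and records the NACKs it would send instead of ACKing every packet.

```python
# Minimal sketch of receiver-reliable loss detection: a gap in sequence
# numbers triggers a NACK (resend request) for each missing packet.

class NackReceiver:
    def __init__(self):
        self.expected = 0          # next sequence number we expect
        self.nacks = []            # resend requests we would send

    def on_packet(self, seqno):
        if seqno > self.expected:
            # Gap detected: NACK every missing sequence number.
            self.nacks.extend(range(self.expected, seqno))
        self.expected = max(self.expected, seqno + 1)

r = NackReceiver()
for s in [0, 1, 3, 4, 7]:
    r.on_packet(s)
print(r.nacks)   # [2, 5, 6]
```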
Implosion (figure): packet 1 is lost; all 4 receivers send resend requests back to the sender.
Retransmission. The re-transmitter: options are the sender or other receivers. How to retransmit: unicast, multicast, scoped multicast, a retransmission group, ... Problem: exposure.
Exposure (figure): packet 1 does not reach R1; R1 requests a resend; packet 1 is resent to all 4 receivers.
Ideal Recovery Model (figure): packet 1 reaches R1 but is lost before reaching the other receivers; only one receiver sends a NACK, to the nearest sender or receiver that has the packet; the repair is sent only to those that need it.
Aside: Using the Routers. Routers do transport-level processing: buffer packets, combine ACKs, send retransmissions. This model solves implosion and exposure, but it is not scalable and violates the end-to-end argument. (Figure: sender S, a router that absorbs NACKs and retransmits, receivers R1-R4.)
Multicast Issues: reliable transfer (ACK/NACK implosion, exposure); reliable multicast protocols (Scalable Reliable Multicast, Reliable Multicast Transport Protocol, Pragmatic General Multicast, Lightweight Multicast Service); congestion control
Scalable Reliable Multicast (SRM). Originally designed for wb, a shared whiteboard application. Receiver-reliable and NACK-based; every member may multicast a NACK or a retransmission.
SRM Request Suppression (figure): packet 1 is lost; R1 requests a resend from the source and the other receivers; the request delay varies by distance, so when packet 1 is resent, R2 and R3 no longer have to request a resend.
Request Damping. Deterministic suppression: receivers start timers with delay = C1 × d_{s,r}. Stochastic suppression: receivers start timers with delay = U[0, D2] × d_{s,r}, where d_{s,r} is the estimated distance from sender s to receiver r.
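A minimal Python sketch of this damping rule, assuming illustrative values for the constants C1 and D2 and a per-receiver distance estimate d_sr; real SRM also damps repairs and reschedules timers, which is omitted here.

```python
import random

# Request timer = deterministic + stochastic component, both scaled by the
# estimated distance d_sr, so closer receivers tend to time out (and NACK)
# first and suppress the others.

def request_delay(d_sr, c1=2.0, d2=2.0):
    deterministic = c1 * d_sr                   # deterministic suppression
    stochastic = random.uniform(0, d2) * d_sr   # stochastic suppression
    return deterministic + stochastic

class SrmReceiver:
    def __init__(self, d_sr):
        self.d_sr = d_sr
        self.pending = {}               # seqno -> time the NACK would fire

    def detect_loss(self, seqno, now):
        self.pending[seqno] = now + request_delay(self.d_sr)

    def hear_request_or_repair(self, seqno):
        # Someone else's NACK or retransmission suppresses our own request.
        self.pending.pop(seqno, None)

    def due_requests(self, now):
        return [s for s, t in self.pending.items() if t <= now]
```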
SRM Star Topology (figure): the delay to every receiver is the same length; packet 1 is lost, all receivers request resends, and packet 1 is resent to all receivers.
SRM (Summary). NACK/retransmission suppression: delay before sending, with the delay based on RTT estimation and made of deterministic + stochastic components. Periodic session messages: full reliability, estimation of the distance matrix among members.
What's Missing? (Figure: sender S, nodes A-F, with a lossy link (A,C).) A loss at link (A,C) causes retransmission to the whole group. Goal: only retransmit to those members who lost the packet, and only request from the nearest responder.
Local Recovery. Options: application-level hierarchy; TTL-scoped multicast (fixed vs. dynamic scope); router-supported recovery.
Reliable Multicast Transport Protocol (RMTP). Developed by Purdue and AT&T Research Labs; designed for file dissemination (single sender); deployed in AT&T's billing network.
RMTP: Fixed Hierarchy (figure: sender, Designated Receivers, receivers, routers). Each receiver unicasts periodic ACKs to its Designated Receiver (DR); the DR unicasts its own ACK to its parent. A receiver chooses the closest statically configured DR. Retransmission is multicast or unicast based on the percentage of requests, with scoped multicast used for local recovery.
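A minimal sketch of the "based on percentage of requests" decision a DR makes; the threshold value and function names are assumptions for illustration, not RMTP's actual parameters.

```python
# An RMTP-style DR repairs either by unicasting to the few children that
# asked, or by scoped multicast to its whole local region when many did.

def choose_repair(num_nacking, num_children, threshold=0.5):
    """Return how a DR would retransmit a lost packet to its children."""
    if num_children == 0 or num_nacking == 0:
        return "none"
    if num_nacking / num_children >= threshold:
        return "scoped-multicast"   # many children lost it: repair the region
    return "unicast"                # few children lost it: repair individually

print(choose_repair(1, 10))   # unicast
print(choose_repair(8, 10))   # scoped-multicast
```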
RMTP: Comments. Pro: handles heterogeneity, since a lossy link or slow receiver will only affect a local region. Con: the position of the DR is critical, and the static hierarchy cannot adapt the local recovery zone to the loss points.
Pragmatic General Multicast (PGM). Cisco's reliable multicast protocol; NACK-based, with suppression; repairs are only forwarded to the NACKers.
Pragmatic General Multicast (figure): packet 1 reaches only R1; R2, R3, R4 request resends and the routers remember the resend requests; packet 1 is resent to R2, R3, R4 but not to R1.
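A minimal sketch of the router state this relies on, with illustrative interface names and API; a real PGM router also performs NACK suppression, NACK confirmation, and state timeouts, which are omitted.

```python
from collections import defaultdict

# A PGM-style router remembers which downstream interfaces NACKed a given
# (source, sequence number) and forwards the repair only on those.

class PgmRouter:
    def __init__(self):
        self.repair_state = defaultdict(set)   # (source, seqno) -> interfaces

    def on_nack(self, source, seqno, downstream_iface):
        """Record which interface asked for the repair."""
        self.repair_state[(source, seqno)].add(downstream_iface)

    def on_repair(self, source, seqno):
        """Return the interfaces to forward the retransmission on, clear state."""
        return sorted(self.repair_state.pop((source, seqno), set()))

router = PgmRouter()
router.on_nack("S", 1, "eth1")
router.on_nack("S", 1, "eth2")
print(router.on_repair("S", 1))   # ['eth1', 'eth2']
```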
Light-weight Multicast Service (LMS). Enhances multicast routing with selective forwarding; LMS extends router forwarding, which is what routers are meant to do in the first place. No packet storing or processing at routers; strictly IP, with no peeking into higher layers.
LMS: Definitions. Replier: a receiver that has volunteered to answer requests. Turning point: the router where requests start to move downstream. Directed mcast: multicast to a subtree. (Figure: sender S, receivers R1-R6, a replier link and a turning point marked on the tree.)
LMS with Replier Links (figure): packet 1 reaches only R1; R2 requests a resend; resend requests from each receiver follow the replier links.
LMS with Replier Links (figure): requests from the replier links go up towards the source; at the turning point the repair is directed back, and packet 1 is resent to the receivers.
Multicast Issues: reliable transfer (ACK/NACK implosion, exposure); reliable multicast protocols (Scalable Reliable Multicast, Reliable Multicast Transport Protocol, Pragmatic General Multicast, Lightweight Multicast Service); congestion control
Multicast Congestion Control. What if receivers have very different bandwidths? Send at the max? At the min? At the average? (Figure: a sender, whose sending rate is the question, with receivers behind 100 Mb/s, 1 Mb/s, and 56 Kb/s links.)
Video Adaptation: RLM (Receiver-driven Layered Multicast). Layered video encoding, with each layer sent to its own multicast group; on spare capacity, receivers add a layer; on congestion, receivers drop a layer; join experiments are used for shared learning.
Layered Media Streams (figure: links of 10 Mbps, 512 Kbps, and 128 Kbps): R1 joins layers 1, 2, and 3; R2 joins layers 1 and 2 but fails at layer 3; R3 joins layer 1 but fails at layer 2.
Drop Policies for Layered Multicast. Priority: packets for low-bandwidth layers are kept and queued packets for higher layers are dropped; requires router support. Uniform (e.g., drop-tail, RED): packets arriving at a congested router are dropped regardless of their layer. Which is better? Intuition vs. reality!
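A minimal sketch of what a priority drop policy means in practice (queue size and layer numbering are illustrative assumptions): when the queue is full, the highest-layer packet is the one sacrificed, so base layers survive congestion. Uniform dropping, by contrast, is just drop-tail or RED with no knowledge of layers.

```python
# Priority drop queue for layered multicast: on overflow, drop the packet
# belonging to the highest (least important) layer.

class PriorityDropQueue:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.queue = []   # list of (layer, packet)

    def enqueue(self, layer, packet):
        if len(self.queue) >= self.capacity:
            # Find the highest-layer packet currently queued.
            worst = max(range(len(self.queue)), key=lambda i: self.queue[i][0])
            if self.queue[worst][0] >= layer:
                self.queue.pop(worst)    # drop a queued high-layer packet
            else:
                return False             # incoming packet is the worst: drop it
        self.queue.append((layer, packet))
        return True
```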
RLM Intuition. Uniform dropping: better incentives for well-behaved users (if you oversend, performance rapidly degrades); clearer congestion signal; allows shared learning; already deployed; RLM approaches the optimal operating point. Priority dropping: can waste upstream resources; hard to deploy.
RLM Intuition (figure: performance vs. offered load under uniform and priority dropping).
Receiver-Driven Layered Multicast. Each layer is a separate group; a receiver subscribes to the largest set of layers that will get through with minimal drops, dynamically adapting to available capacity. Packet losses are used as the congestion signal; no special router support is assumed, so packets are dropped independently of layer.
RLM Join Experiment. Receivers periodically try subscribing to the next higher layer. If there is enough capacity (no congestion, no drops), keep the layer and later try the next one. If there is not enough capacity (congestion, drops), drop the layer and increase the time until the next retry. What about the impact on other receivers?
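A minimal sketch of this control loop, assuming illustrative timer values, backoff factor, and loss threshold rather than the constants from the RLM work.

```python
import random

# An RLM-style receiver: add a layer during periodic join experiments, drop
# the top layer (and back off its retry timer) when loss indicates congestion.

class RlmReceiver:
    def __init__(self, max_layers=4):
        self.max_layers = max_layers
        self.layers = 1                 # currently subscribed layers
        self.join_timer = {}            # per-layer retry backoff (seconds)
        self.next_experiment = 0.0

    def loss_observed(self, loss_rate, now, threshold=0.05):
        if loss_rate > threshold and self.layers > 1:
            self.layers -= 1                       # drop the top layer
            failed = self.layers + 1
            self.join_timer[failed] = self.join_timer.get(failed, 5.0) * 2
            self.next_experiment = now + self.join_timer[failed]

    def maybe_join(self, now):
        # Periodically run a join experiment on the next layer up.
        if now >= self.next_experiment and self.layers < self.max_layers:
            self.layers += 1
            self.next_experiment = now + random.uniform(5.0, 10.0)
```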
Join Experiments (figure: subscribed layer vs. time during a sequence of join experiments).
Overview: Multicast Challenges; Content Distribution Networks; Peer-to-Peer Networks
Motivation. Problems of the traditional client-server model: single point of failure (DoS attacks); not scalable. Solutions: replication (CDN); hosts connect to peers directly (P2P).
Content Distribution Networks. Replicate content on many servers. Challenges: how to replicate content; where to replicate it; how to find replicated content; how to choose among known replicas (discussed in the DNS/server selection lecture); how to direct clients towards a replica (DNS, HTTP 304 responses, anycast, etc.). Example: Akamai.
How Akamai Works. How is content replicated? Akamai only replicates static content, and the modified name contains the original file name. When an Akamai server is asked for content, it first checks its local cache; if the content is not in the cache, it requests the file from the primary server and caches it.
How Akamai Works. Clients fetch the HTML document from the primary server, e.g. index.html from cnn.com. URLs for replicated content are rewritten in the HTML, e.g. <img src="http://cnn.com/af/x.gif"> is replaced with <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">, so the client is forced to resolve the aXYZ.g.akamaitech.net hostname.
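A minimal sketch of that rewriting step. The server id (a73) and the 7/23 path prefix are simply the values from the example above, hard-coded here rather than computed; real Akamaization chooses them per customer and per object.

```python
import re

# Rewrite embedded image URLs to point at an Akamai-style hostname, keeping
# the original origin and path inside the new URL.

AKAMAI_PREFIX = "http://a73.g.akamaitech.net/7/23/"

def akamaize(html: str) -> str:
    return re.sub(r'src="http://([^"]+)"',
                  lambda m: f'src="{AKAMAI_PREFIX}{m.group(1)}"',
                  html)

page = '<img src="http://cnn.com/af/x.gif">'
print(akamaize(page))
# <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
```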
How Akamai Works. The root server gives an NS record for akamai.net. The akamai.net name server returns an NS record for g.akamaitech.net, choosing a name server in the region of the client's name server; this record has a large TTL. The g.akamaitech.net name server then chooses a server in that region; it should try to choose a server that already has the file in cache. How? It uses the aXYZ name and consistent hashing; this record has a small TTL.
How Akamai Works (figure: the end-user fetches index.html from cnn.com, then resolves the Akamai name via the DNS root server, the Akamai high-level DNS server, and the Akamai low-level DNS server, and finally gets /cnn.com/foo.jpg from the closest Akamai server).
Akamai – Subsequent Requests (figure: the end-user again fetches index.html from cnn.com, but the cached NS records let it skip the DNS root and Akamai high-level DNS servers; it queries the Akamai low-level DNS server and gets /cnn.com/foo.jpg from the closest Akamai server).
Consistent Hash. A "view" is a subset of all hash buckets that are visible. Desired features: smoothness – little impact on hash bucket contents when buckets are added or removed; spread – the set of hash buckets that may hold an object is small, regardless of views; load – across all views, the number of objects assigned to a hash bucket is small.
Consistent Hash – Example Construction. Assign each of C hash buckets to random points on the mod 2^n circle, where n is the hash key size. Map each object to a point on the same circle; the hash of the object is the closest bucket. (Figure: buckets at points 4, 8, 12, and 14 on the circle.) Monotone: adding a bucket does not cause movement between existing buckets. Spread and load: a small set of buckets lie near an object. Balance: no bucket is responsible for a large number of objects.
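A minimal sketch of this construction in Python. The use of md5 and the "closest bucket clockwise" rule are illustrative assumptions; the slide's construction only requires some hash onto the mod 2^n circle and a notion of the nearest bucket.

```python
import bisect
import hashlib

# Buckets are placed on a mod 2^n circle; an object is assigned to the first
# bucket at or after its own point, wrapping around the circle.

class ConsistentHash:
    def __init__(self, n_bits=16):
        self.mod = 2 ** n_bits
        self.points = []          # sorted bucket positions on the circle
        self.buckets = {}         # position -> bucket name

    def _point(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % self.mod

    def add_bucket(self, name):
        p = self._point(name)
        bisect.insort(self.points, p)
        self.buckets[p] = name

    def lookup(self, obj):
        i = bisect.bisect_left(self.points, self._point(obj)) % len(self.points)
        return self.buckets[self.points[i]]

ring = ConsistentHash()
for server in ["a73", "a74", "a75"]:
    ring.add_bucket(server)
print(ring.lookup("cnn.com/af/x.gif"))   # always maps to the same server
```

Adding a bucket only claims the objects between it and its predecessor, which is the "monotone" property from the slide.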
Overview: Multicast Challenges; Content Distribution Networks; Peer-to-Peer Networks
Peer-to-peer networks. Typically each member stores the content that it desires, so the system is basically a replication system for files. There is always a tradeoff between where files may be located and how hard searching is: peer-to-peer allows files to be anywhere, so searching is the challenge. Other challenges: dynamic membership and scale.
Example: Napster. Centralized indexing: on startup, a client contacts the central server and reports its list of files. To download a file, the client first contacts the centralized server to find the location of the file; the transfer itself is done peer-to-peer. A hybrid scheme: what are the advantages and disadvantages?
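A minimal sketch of the centralized index, with illustrative class and method names: peers register their file lists with the central server, lookups return peer addresses, and the transfer happens directly between peers.

```python
from collections import defaultdict

# Napster-style central index: filename -> set of peers advertising it.

class CentralIndex:
    def __init__(self):
        self.index = defaultdict(set)

    def register(self, peer_addr, filenames):
        for name in filenames:
            self.index[name].add(peer_addr)

    def lookup(self, filename):
        return sorted(self.index.get(filename, set()))

index = CentralIndex()
index.register("10.0.0.5:6699", ["song.mp3", "talk.mp3"])
index.register("10.0.0.7:6699", ["song.mp3"])
print(index.lookup("song.mp3"))   # both peers; the download is peer-to-peer
```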
Example: Gnutella. Distributed file location; the idea is to multicast the request. How to find a file: send the request to all neighbors; neighbors recursively multicast the request; eventually a machine that has the file receives the request and sends back the answer. Advantages: totally decentralized, highly robust. Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL).
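A minimal sketch of this flooding search with a TTL; the node structure, TTL value, and loop-detection set are illustrative assumptions.

```python
# Gnutella-style flooding: forward the query to all neighbors until the TTL
# expires or a node holding the file answers.

class Node:
    def __init__(self, name, files=()):
        self.name, self.files, self.neighbors = name, set(files), []

    def query(self, filename, ttl, seen=None):
        seen = seen if seen is not None else set()
        if self.name in seen:            # loop detection
            return []
        seen.add(self.name)
        hits = [self.name] if filename in self.files else []
        if ttl > 0:                      # recursively flood to neighbors
            for nbr in self.neighbors:
                hits += nbr.query(filename, ttl - 1, seen)
        return hits

a, b, c = Node("A"), Node("B"), Node("C", files=["song.mp3"])
a.neighbors, b.neighbors = [b], [c]
print(a.query("song.mp3", ttl=3))   # ['C']
```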
Example: Freenet. Additional goals beyond file location: provide publisher anonymity and security; resistance to attacks – a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines. Architecture: each file is identified by a unique identifier; each machine stores a set of files and maintains a "routing table" to route individual requests.
Freenet Query. A user requests key XYZ, which is not in the local cache. The node looks up the nearest key in its routing table and forwards the request to the corresponding node. If the request reaches a node with the data, that node forwards the data back to the upstream requestor; each requestor adds the file to its cache and adds an entry to its routing table. Any node forwarding the reply may change the source of the reply, which helps anonymity. If the data is not found, a failure is reported back.
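A minimal sketch of the routing step above. Keys are plain integers and "nearest" is absolute difference, both simplifying assumptions; the caching of data and routing entries on the reply path is the part being illustrated.

```python
# Freenet-style request routing: forward toward the neighbor whose known key
# is nearest, cache data (and learn a route) as the reply comes back.

class FreenetNode:
    def __init__(self, name):
        self.name = name
        self.store = {}        # key -> data held locally
        self.routing = {}      # known key -> neighbor believed to hold it

    def request(self, key, ttl=10):
        if key in self.store:
            return self.store[key]
        if ttl == 0 or not self.routing:
            return None                       # failure reported back upstream
        nearest = min(self.routing, key=lambda k: abs(k - key))
        data = self.routing[nearest].request(key, ttl - 1)
        if data is not None:
            self.store[key] = data            # cache on the reply path
            self.routing[key] = self.routing[nearest]   # learn a route (simplified)
        return data

a, b = FreenetNode("A"), FreenetNode("B")
b.store[42] = "data"
a.routing[40] = b              # A believes B holds keys near 40
print(a.request(42))           # "data", now also cached at A
```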
Freenet Features. Nodes tend to specialize in searching for similar keys over time. LRU cache: files are not guaranteed to live forever. Files can be encrypted. Messages have a random 64-bit ID for loop detection, and a random initial TTL for strong anonymity.
Freenet Summary. Advantages: provides publisher anonymity; totally decentralized architecture, robust and scalable; resistant to malicious file deletion. Disadvantages: does not always guarantee that a file is found, even if the file is in the network.
Conclusions. The key challenge in building wide-area P2P systems is a scalable and robust location service. Solutions covered in this lecture: Napster – centralized location service; Gnutella – broadcast-based decentralized location service; Freenet – intelligent-routing decentralized solution (but correctness is not guaranteed; queries for existing items may fail). Other solutions: Chord, CAN.