Lecture 6: Multicast
- Challenge: how do we efficiently send messages to a group of machines?
  - Need to revisit all aspects of networking
    - Routing
    - Autonomous systems and admin control
    - Address allocation
    - Congestion control (next time?)
    - Reliable delivery (next time)
    - Ordered delivery (next time)

Multicast Motivation
- Send data to multiple receivers at once
  - broadcasting, narrowcasting
  - telecollaboration
  - software update
  - group coordination, subcasting
- Send question to unknown receiver
  - resource discovery
  - distributed database
  - anonymous directory services

Multicast Efficiency
- Send data only once down a link shared by multiple receivers
[Figure: nodes labeled R, R, R, R and Sender]

MBone Deployment
- How do we provide multicast services without router support?
- MBone as virtual network
  - PC routers forward multicast traffic by tunneling over the Internet
  - Distance vector for routing between PCs
  - Limit tunnel bandwidth to avoid interfering with TCP
- Widespread use!

IP Multicast Service Model
- Provided by the (inter)network
  - with help from LAN
  - deployed as virtual network (MBone)
- Delivery
  - best effort (unreliable, unordered, …)
  - packets addressed to group
    - group address allocated from special range in IP
- TTL for scope control (ex: send one hop)

IP Multicast Service Model (continued)
- Receivers
  - zero, one or many receivers
  - dynamic -- anyone can join, leave
  - can belong to any number of groups
- Senders
  - Any number of senders
    - just send packet to group address
  - May or may not be in group
- What about multicast virtual circuits?

LAN Multicast Routing
- Directly supported on shared media LANs (Ethernet, FDDI)
  - App subscribes to group
  - IP layer notifies Ethernet card to listen to packets with group address
- What about switched LANs? (switched Ethernet, ATM)
  - ARP already requires broadcast support!

Flooding
- If haven’t seen a packet before
  - forward it on every link but incoming
  - requires switches to remember each pkt!
[Figure: nodes labeled R, R, R and Sender]

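The flooding rule can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical four-node topology and an in-memory `seen` set standing in for the per-packet state each switch must keep; the names `flood`, `links`, and `copies` are not from the lecture.

```python
from collections import deque

def flood(links, sender):
    """Flood one packet from `sender`; return how many copies each node receives."""
    seen = set()                        # nodes that have already forwarded this packet
    copies = {node: 0 for node in links}
    queue = deque([(sender, None)])     # (node, neighbor the packet arrived from)
    while queue:
        node, came_from = queue.popleft()
        if came_from is not None:
            copies[node] += 1
        if node in seen:
            continue                    # seen before: drop the duplicate
        seen.add(node)
        for nbr in links[node]:
            if nbr != came_from:        # every link but the incoming one
                queue.append((nbr, node))
    return copies

# Hypothetical topology: triangle A-B-C with a leaf D hanging off C.
links = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
copies = flood(links, "A")
print(copies)
```

In this topology the cycle A-B-C makes B and C each receive two copies, which is the waste a spanning tree avoids.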
Spanning Tree
- Ensures every host gets one copy (retransmission needed for drops)

Computing a Spanning Tree
- One (simple) approach
  - Elect a leader (ex: node with lowest id)
  - Spanning tree is shortest path to leader
  - Re-compute if topology changes
- Another (simple) approach
  - distribute topology everywhere, compute in parallel (aka link state)

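A small sketch of the first approach, assuming every link has unit cost so that "shortest path to leader" is a breadth-first search from the lowest-id node. The function name and the example topology are hypothetical, and a real protocol computes this in a distributed way rather than from a global map.

```python
from collections import deque

def spanning_tree(links):
    """Return {node: parent} for a fewest-hops tree rooted at the lowest-id node."""
    leader = min(links)                 # leader election: lowest id wins
    parent = {leader: None}
    queue = deque([leader])
    while queue:
        node = queue.popleft()
        for nbr in sorted(links[node]):  # sorted only to make ties deterministic
            if nbr not in parent:
                parent[nbr] = node       # first time reached = shortest path to leader
                queue.append(nbr)
    return parent

# Hypothetical topology with one redundant link (2-3).
links = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
tree = spanning_tree(links)
print(tree)
```

The link 2-3 is left out of the tree, so a packet forwarded only along tree links reaches each node exactly once.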
Switched LAN Multicast Routing
- What if only few receivers?
  - Data sent needlessly over some links
  - => prune links if no receivers
- What if large scale LAN?
  - No single tree efficient for all conditions
  - => spanning tree per group
  - => spanning tree per source
- Same solutions as in Internet!

Internet Multicast Routing
- How do we distribute packets across thousands of LANs?
  - Each router responsible for its attached LAN
- Reduces to:
  - How do we forward packets to all interested routers? (DVMRP, MOSPF, MBone)
  - How do hosts declare interest to their routers? (IGMP)

Spanning Trees
- Tree per group or tree per source?
  - Scalability: router table entry per group vs. entry per (group, source) pair
  - Efficiency: worst case factor of 2 latency
  - Bandwidth: can spread load with per-source trees
  - Rendezvous: per source can leverage unicast routing tables

Basic Approaches
- Broadcast data and prune where there are no members
  - based on distance vector routing (DVMRP)
- Broadcast membership and compute forwarding tables
  - based on link state routing (MOSPF)
- Broadcast rendezvous for group addresses
  - independent of unicast routing (PIM)

Distance Vector Multicast
- Intuition: unicast routing tables form inverse tree from senders to destination
  - why not use backwards for multicast?
  - Various refinements to eliminate useless transfers
- Implemented in DVMRP (Distance Vector Multicast Routing Protocol)
  - in use in MBone

Reverse Path Flooding (RPF)
- Router forwards packet from S iff packet came via shortest path back to S
[Figure: nodes labeled R, R, R and S]

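The RPF check needs only the router's existing unicast table. The sketch below assumes a hypothetical table `next_hop_to[router][destination]` and a three-node topology; neither appears in the lecture.

```python
def rpf_forward(next_hop_to, links, router, source, arrived_from):
    """Return the neighbors `router` forwards to under reverse path flooding."""
    # Accept only if the packet arrived from the neighbor this router
    # would itself use to send unicast traffic back to the source.
    if next_hop_to[router][source] != arrived_from:
        return []                       # off the reverse path: discard
    return [n for n in links[router] if n != arrived_from]

# Hypothetical topology: S, A and B are all directly connected to each other.
links = {"A": ["S", "B"], "B": ["S", "A"]}
next_hop_to = {"A": {"S": "S"}, "B": {"S": "S"}}

on_path = rpf_forward(next_hop_to, links, "A", "S", arrived_from="S")
off_path = rpf_forward(next_hop_to, links, "A", "S", arrived_from="B")
print(on_path, off_path)
```

Here A accepts the copy that came straight from S and forwards it to B, while the copy B relays to A is discarded. B's copy was still sent over the A-B link, so plain RPF wastes a transmission there.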
Redundant Sends
- RPF will forward packet to a router even if that router will discard it
  - each router gets pkt on all of its input links!
- Each router connected to LAN will broadcast packet
[Figure: routers attached to a shared Ethernet]

Reverse Path Broadcast (RPB)
- With distance vector, neighbors exchange routing tables
- Only send to neighbor if on its shortest path back to source
- Only send on LAN if have shortest path back to source
  - break ties arbitrarily

Truncated RPB
- End hosts tell routers if interested
- Routers forward on LAN iff there are receivers
- Challenges:
  - robust to router/host failures
  - avoid overloading LAN with announcements

Internet Group Management Protocol (IGMP)
- To join, host multicasts join requests to hosts in group + routers on LAN, TTL=1
- Routers periodically query hosts for group membership
  - stop forwarding if receiver crashes
- Hosts set random timer
  - When timer expires, multicast join
  - Other hosts in group disable timer

IGMP (step 1)
- Elect one router to periodically query about all groups (TTL=1)
[Figure: router sending query Q on the LAN]

IGMP (step 2)
- One group member responds to group with TTL=1, everyone else defers
[Figure: query Q and group members G, G, G]

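The random-timer suppression can be sketched as follows. This is a simplified model, assuming every member hears the first report instantly (no propagation delay); `answer_query`, `max_delay`, and the host names are hypothetical.

```python
import random

def answer_query(members, max_delay=10.0, rng=None):
    """Model one query round: return (reporting host, hosts that were suppressed)."""
    rng = rng or random.Random()
    # Each group member picks a random delay before answering the query.
    timers = {host: rng.uniform(0, max_delay) for host in members}
    # The first timer to expire sends the report to the group address...
    reporter = min(timers, key=timers.get)
    # ...and every other member hears it and cancels its own timer.
    suppressed = [host for host in members if host != reporter]
    return reporter, suppressed

members = ["h1", "h2", "h3"]
reporter, suppressed = answer_query(members, rng=random.Random(1))
print(reporter, suppressed)
```

However many members the group has, the router sees one report per group per query, which keeps announcement traffic on the LAN bounded.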
Reverse Path Multicast (RPM)
- Forward packets only to those areas of the network with receivers
- “Broadcast and prune”
  - Use IGMP to tell if LAN has no members
  - If no children are members, propagate prune to parent in tree
- Assume membership and prune if wrong vs. assume non-membership and explicit join

How does a receiver join?
- Graft (prune cancellation)
  - Routers remember where they sent prunes
    - where multicast traffic came from
  - If child joins, send graft in same direction(s)
- Requires ARQ
  - What if graft is dropped?
  - What if prune is dropped?

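Phase 1: Truncated Broadcast
[Figure: source s and group members g]

Phase 2: Pruning
[Figure: source s and group members g; messages labeled prune(s,g)]

Phase 3: Grafting
[Figure: source s and group members g; a new member's "report g" and messages labeled graft(s,g)]

Phase 4: Steady State
[Figure: source s and group members g, including the new member]
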
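The effect of the four phases on the delivery tree can be sketched with a recursive walk. This is a sketch under assumptions: the broadcast tree is given as a hypothetical `children` map, `has_members` is the set of routers whose LANs reported members via IGMP, and the result is computed centrally rather than by exchanging prune and graft messages.

```python
def prune(children, has_members, node):
    """Return the routers that stay on the tree at or below `node` after pruning."""
    kept = set()
    for child in children.get(node, []):
        kept |= prune(children, has_members, child)
    # A router stays if its own LAN has members or some child did not prune.
    if kept or node in has_members:
        kept.add(node)
    return kept

# Hypothetical broadcast tree rooted at the source's router "s".
children = {"s": ["a", "b"], "a": ["c", "d"], "b": ["e"]}

after_prune = prune(children, {"c"}, "s")        # only c's LAN has a member
after_graft = prune(children, {"c", "e"}, "s")   # a host behind e joins later
print(sorted(after_prune), sorted(after_graft))
```

With one member behind c, routers d, e and b drop off the tree. When a host behind e joins, the graft restores the branch s-b-e.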
RPM Mechanics
- Data-driven -- prune only when packet arrives
  - Why?
- Periodically time out prunes
  - Why?
- Are loops possible in routing tree?

Link State Multicast
- MOSPF (Multicast OSPF)
- Use IGMP to determine LAN members
- Flood topology/group changes
- Each router gets complete topology, group membership
  - compute shortest path spanning tree
  - recompute tree every time topology changes
  - add/delete links if membership changes

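Once a router holds the full topology and membership, the tree computation is ordinary Dijkstra followed by keeping only branches that lead to members. The sketch below assumes a hypothetical weighted graph and is not MOSPF's actual data structures or message formats.

```python
import heapq

def shortest_path_tree(graph, source):
    """Dijkstra: return {node: parent} for the shortest-path tree rooted at `source`."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                    # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr] = nd
                parent[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return parent

def multicast_links(parent, members):
    """Keep only the tree links on a path from the source to some group member."""
    links = set()
    for node in members:
        while parent.get(node) is not None:
            links.add((parent[node], node))
            node = parent[node]
    return links

# Hypothetical topology with link costs.
graph = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 1, "c": 5},
    "b": {"s": 4, "a": 1, "c": 1},
    "c": {"a": 5, "b": 1},
}
parent = shortest_path_tree(graph, "s")
tree_links = multicast_links(parent, {"c"})
print(sorted(tree_links))
```

Because every router runs the same computation on the same flooded data, they all agree on the tree. A membership change only requires rerunning `multicast_links`, while a topology change requires rerunning Dijkstra.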
Link State Mechanics
- Unicast link state
  - precompute routes for all destinations
    - all-to-all shortest path
  - Use CIDR to reduce table size
- Multicast link state
  - precompute routes for all source-group pairs?
  - compute on demand when packet arrives?
- Are loops possible in routing tables?

PIM Motivation
- PIM = Protocol Independent Multicast
- What about sparse groups -- groups that involve only a few members?
  - Ex: narrowcasting -- chat rooms, virtual corporations, … => could be common use
- Distance vector + link state =>
  - periodic broadcast to every router in Internet!

PIM Approach
- Every multicast group has a set of predefined rendezvous points (RPs)
  - need multiple for failure redundancy
- Receiver join: send packet to one RP
- Source join: send to all RPs
- If infrequent traffic, use tree per group
  - rooted at rendezvous points
- Add/delete links via reference counting

What if frequent traffic?
- Want to specialize with tree per source
  - All routers below RP observe traffic
  - If lots of packets from one source
    - router sends join to source
    - source sends traffic to router in addition to RPs
    - once router gets traffic from source, prune

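PIM Shared Tree
[Figure: source s, group members g, and rendezvous point RP]

PIM Receiver Join
[Figure: source s, group members g, and RP; a new member's "Report g" triggers "Join *,g"; annotation: "What if join here?"]

PIM Shared Tree After Join
[Figure: source s, group members g including the new member, and RP]

PIM Source Based Tree
[Figure: source s, group members g, and RP; message labeled "Join s,g"]

PIM Source Based Tree (continued)
[Figure: source s, group members g, and RP; annotations: "What if join s,g?", "What if source?"]
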
PIM Protocol Independence?
- Packets sent to specific targets using unicast routing services
  - Receiver join: send to RP
  - Source join: send to RPs
  - Source based tree creation: send to source
  - Multicast packets: send to RPs, routers
- Multiaccess links confuse reference counting: all routers listen for add/prune

How do we decide on RPs?
- Need consensus on which RP to use for each group, or nothing works
- Current thinking
  - Replicated bootstrap manager
  - RP candidates register with manager
  - Manager broadcasts candidate RP list
  - Hosts use consistent hash to select RP

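One way to get agreement without further messages is for everyone to hash the group address against the same candidate list. The sketch below uses highest-random-weight hashing over SHA-256, which is an assumption chosen for illustration and not necessarily the hash function PIM specifies; the addresses are made up.

```python
import hashlib

def select_rp(group, candidates):
    """Pick the candidate RP with the highest hash score for this group."""
    def score(rp):
        digest = hashlib.sha256(f"{group}|{rp}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(candidates, key=score)

# Hypothetical candidate list as broadcast by the bootstrap manager.
candidates = ["10.0.0.1", "10.0.1.1", "10.0.2.1"]
rp = select_rp("224.2.0.1", candidates)
print(rp)
```

Any two parties holding the same list pick the same RP regardless of list order. If a candidate that was not selected fails and is removed, the choice for this group does not change, so only groups mapped to the failed RP have to move.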
Hierarchical Routing Motivation
- Decompose routing problem into sub-problems
- Allow each organization to choose its own algorithm
- Reduce table size by aggregating addresses

Hierarchical Broadcast and Prune
- Reverse Path Flooding
  - Discard incoming packet if not from reverse path
  - Multicast incoming packet to all borders
- Reverse Path Multicast
  - For each neighbor AS, compute if we’re on reverse path to source
  - Multicast incoming packets to useful borders
  - Propagate prunes across AS back to source

Hierarchical Link State
- Broadcast topology, group membership within each AS
- Broadcast AS topology, group membership, to all ASes
- Compute multicast tree across ASes
  - identify incoming/outgoing borders
- Compute multicast tree within each AS

Hierarchical PIM
- Explicit sender and receiver joins
- Protocol independence
- => border routers propagate joins across boundaries
- Hard to interoperate PIM with broadcast & prune

Hierarchical Routing Mechanics
- What if each AS uses its own algorithm?
  - Incoming multicast packet may arrive on any border router
  - How do we detect routing loops?
  - Propagate explicit joins in PIM
- How do we reduce table size using aggregates?
  - Combine nearby sources? similar groups?

Multicast Address Allocation
- More dynamic than unicast allocation
- Groups cross organizational boundaries
- Central server?
  - Blocks of addresses for internal groups
  - Leases for global addresses
- Random allocation?
  - Can listen to tell if address is in use
  - Need conflict resolution protocol

Scope Control Motivation
- Efficiency with reverse path multicast
  - sender prunes receivers
- Administrative control over listeners
  - anyone can listen to multicast conversation!
  - snooping more difficult in unicast
- Coordinate sub-group actions
  - elect a leader/suppress duplicate actions
  - locate nearest receiver

Scope Control Mechanisms
- Administrative TTL boundaries
  - Sender uses TTL = max diameter of local intranet
  - At border router, forward pkts iff TTL > local TTL threshold
- Allocate block of “local” addresses
  - At border router, forward only global addresses

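Both mechanisms reduce to a filter at the border router, sketched below in one function for brevity even though a deployment might use only one of them. The threshold value and the choice of 239.0.0.0/8 as the "local" block are assumptions for illustration; the lecture does not name a specific range.

```python
import ipaddress

# Assumed block of "local" group addresses (illustrative choice).
LOCAL_BLOCK = ipaddress.ip_network("239.0.0.0/8")

def border_forwards(group, ttl, ttl_threshold):
    """Decide whether a border router lets a multicast packet leave the intranet."""
    if ipaddress.ip_address(group) in LOCAL_BLOCK:
        return False                   # locally scoped address: never leaves
    return ttl > ttl_threshold         # TTL boundary: must exceed the local threshold

local_group = border_forwards("239.1.2.3", ttl=64, ttl_threshold=16)
low_ttl = border_forwards("224.2.0.1", ttl=16, ttl_threshold=16)
global_pkt = border_forwards("224.2.0.1", ttl=64, ttl_threshold=16)
print(local_group, low_ttl, global_pkt)
```

A sender that sets its TTL to the intranet diameter (16 in this example) therefore stays inside, while address-based scoping works regardless of the TTL the sender chose.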
Expanding Ring Multicast
- Locate “nearest” receiver
- Progressively send to more and more of group
  - Send first to all receivers within one hop, then two hops, then three, …

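The search loop can be sketched as follows, assuming a known hop-count map stands in for what the network would do with real TTL-limited packets. The topology, `max_ttl`, and function names are hypothetical.

```python
from collections import deque

def hop_counts(links, start):
    """Breadth-first search: hops from `start` to every reachable node."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in links[node]:
            if nbr not in hops:
                hops[nbr] = hops[node] + 1
                queue.append(nbr)
    return hops

def expanding_ring(links, sender, receivers, max_ttl=8):
    """Raise the TTL one hop at a time until some receiver is reached."""
    hops = hop_counts(links, sender)
    for ttl in range(1, max_ttl + 1):
        reached = sorted(r for r in receivers if hops.get(r, float("inf")) <= ttl)
        if reached:
            return ttl, reached        # nearest receiver(s) found at this radius
    return None                        # nobody within max_ttl hops

# Hypothetical chain s - a - b - c with group members at b and c.
links = {"s": ["a"], "a": ["s", "b"], "b": ["a", "c"], "c": ["b"]}
result = expanding_ring(links, "s", {"b", "c"})
print(result)
```

The search stops at TTL 2, so the query never travels the extra hop to c.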
Layered Multicast
- What if receivers have different bandwidths?
[Figure: source S and receivers R on links of 100Mb/s, 1Mb/s, and 56Kb/s]

Layered Multicast (continued)
- Transmit signal at multiple granularities
  - 56Kb/s - voice only
  - 1Mb/s - choppy video
  - 100Mb/s - high quality video
- Drop packets at router if can’t handle rate?
  - Which packets?
  - Leads to persistent congestion at bottleneck
  - Wastes upstream resources

Receiver-Driven Layered Multicast
- Receivers dynamically adapt to available capacity
- Each layer a separate group
  - receiver subscribes to max layer that will get through with minimal drops
- Use packet drops as congestion signal
  - No drops => try subscribing to higher layer
  - Drops => unsubscribe from layer

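The receiver's adaptation rule can be sketched as a single step function. This is a simplified model, assuming layers are cumulative, drops occur exactly when the subscribed rate exceeds the bottleneck capacity, and the base layer is never dropped; the per-layer rates are hypothetical increments chosen to total 56Kb/s, 1Mb/s and 100Mb/s, and the sketch omits timers and coordination between receivers.

```python
def adapt(layers, rates_kbps, capacity_kbps):
    """One adaptation step: return the new number of subscribed layers."""
    load = sum(rates_kbps[:layers])            # layers are cumulative
    if load > capacity_kbps:                   # drops seen: congestion signal
        return max(layers - 1, 1)              # unsubscribe from the top layer
    return min(layers + 1, len(rates_kbps))    # no drops: try the next layer

# Hypothetical incremental rates per layer, in Kb/s.
rates_kbps = [56, 944, 99000]

history = [1]                                  # start with the base layer only
for _ in range(4):
    history.append(adapt(history[-1], rates_kbps, capacity_kbps=1000))
print(history)
```

Behind a 1Mb/s bottleneck the receiver settles around two layers but keeps probing the third, and each failed probe briefly congests the bottleneck.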
RLM Mechanics
- Receivers periodically try subscribing to higher layer
- If unsuccessful, overwhelms bottleneck router
  - Causes packet losses for other receivers?
  - => other receivers drop layers?
  - What if two conflicting subscribes at same time?

RLM Receiver Coordination
- Receiver advertises intent to add layer
- Other receivers
  - ignore congestion signal during experiment
  - avoid conflicting experiments
- Receiver advertises result when done. Example of shared learning:
  - if successful, other receivers add layer
  - if unsuccessful, wait longer until next retry