1 IP Multicasting
  By Behzad Akbari
  These slides are based on the slides of J. Kurose (UMASS) and Shivkumar (RPI).
2 Broadcast Routing
  - Deliver packets from the source to all other nodes.
  - Source duplication is inefficient: the source creates and transmits every duplicate itself.
  - Source duplication also raises a question: how does the source determine the recipient addresses?
  [Figure: source duplication vs. in-network duplication among routers R1-R4, marking where duplicates are created and transmitted]
3 In-network duplication
  - Flooding: when a node receives a broadcast packet, it sends a copy to all neighbors.
    - Problems: cycles and broadcast storms.
  - Controlled flooding: a node broadcasts a packet only if it has not broadcast the same packet before.
    - The node keeps track of the IDs of packets it has already broadcast, or
    - it uses reverse path forwarding (RPF): forward a packet only if it arrived on the shortest path between the node and the source.
  - Spanning tree: no redundant packets are received by any node.
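Sketch (not from the original slides): the sequence-number variant of controlled flooding can be simulated on a small graph to see how many copies cross links. The topology `GRAPH` and the function name `controlled_flood` are assumptions made for illustration, and each node is assumed not to send the packet back to the neighbor it came from.

```python
from collections import deque

# Hypothetical 5-node topology; it is not the one drawn on the slides.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

def controlled_flood(graph, source):
    """Each node rebroadcasts the packet at most once.

    Returns (number of link transmissions, set of nodes reached).
    """
    seen = {source}                   # nodes that already handled this packet id
    queue = deque([(source, None)])   # (node, neighbor the packet came from)
    transmissions = 0
    while queue:
        node, came_from = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == came_from:
                continue              # assumption: no copy back to the sender
            transmissions += 1        # one copy crosses the link node -> neighbor
            if neighbor not in seen:  # first arrival: remember it and rebroadcast
                seen.add(neighbor)
                queue.append((neighbor, node))
            # otherwise the neighbor recognizes the packet id and drops the copy
    return transmissions, seen

count, reached = controlled_flood(GRAPH, "A")
```

On this graph every node is reached, but more copies are sent than the four that a spanning tree over five nodes would need, which is the redundancy the spanning-tree approach removes.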
4 Spanning Tree
  - First construct a spanning tree.
  - Nodes forward copies only along the spanning tree.
  [Figure: nodes A-G; (a) broadcast initiated at A, (b) broadcast initiated at D]
5 Spanning Tree: Creation
  - Center node:
    - Each node sends a unicast join message to the center node.
    - The message is forwarded until it arrives at a node already belonging to the spanning tree.
  [Figure: nodes A-G; (a) stepwise construction of spanning tree, (b) constructed spanning tree]
6 Multicast Routing: Problem Statement
  - Goal: find a tree (or trees) connecting routers that have local mcast group members.
    - tree: not all paths between routers are used
    - source-based: a different tree from each sender to the receivers
    - shared-tree: the same tree is used by all group members
  [Figure: a shared tree and source-based trees]
7 Approaches for building mcast trees
  - Source-based tree: one tree per source
    - shortest path trees
    - reverse path forwarding
  - Group-shared tree: group uses one tree
    - minimal spanning (Steiner)
    - center-based trees
  We first look at the basic approaches, then at specific protocols adopting these approaches.
8 Shortest Path Tree
  - mcast forwarding tree: tree of shortest path routes from the source to all receivers
  - Dijkstra’s algorithm
  [Figure: source S and routers R1-R6. Legend: router with attached group member; router with no attached group member; link used for forwarding, where i indicates the order in which the algorithm added the link]
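Sketch (not from the original slides): a shortest path tree can be computed with Dijkstra's algorithm and then trimmed to the links that lead to receivers. The weighted topology `NET`, the router names, and the helper names are assumptions for illustration; the link costs in the slide's figure are not recoverable.

```python
import heapq

# Hypothetical weighted topology: NET[u][v] is the cost of link u-v.
NET = {
    "S":  {"R1": 1, "R2": 4},
    "R1": {"S": 1, "R2": 2, "R3": 5},
    "R2": {"S": 4, "R1": 2, "R3": 1},
    "R3": {"R1": 5, "R2": 1, "R4": 3},
    "R4": {"R3": 3},
}

def shortest_path_tree(graph, source):
    """Dijkstra: return (parent, dist) describing the tree rooted at source."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue                  # stale heap entry
        done.add(node)
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if neighbor not in dist or nd < dist[neighbor]:
                dist[neighbor] = nd
                parent[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    return parent, dist

def multicast_links(parent, receivers):
    """Keep only tree links that lie on a path from the source to a receiver."""
    links = set()
    for node in receivers:
        while parent[node] is not None:
            links.add((parent[node], node))
            node = parent[node]
    return links

parent, dist = shortest_path_tree(NET, "S")
tree = multicast_links(parent, ["R3"])
```

With R3 as the only receiver, the forwarding tree follows S, R1, R2, R3 and leaves the branch to R4 unused.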
9 Reverse Path Forwarding
  - Relies on each router’s knowledge of the unicast shortest path from it to the sender.
  - Each router has simple forwarding behavior:
      if (mcast datagram received on incoming link on shortest path back to source)
        then flood datagram onto all outgoing links
        else ignore datagram
10 Reverse Path Forwarding: example
  - The result is a source-specific reverse SPT.
    - It may be a bad choice with asymmetric links.
  [Figure: source S and routers R1-R7. Legend: router with attached group member; router with no attached group member; datagram will be forwarded; datagram will not be forwarded]
11 Reverse Path Forwarding: pruning
  - The forwarding tree contains subtrees with no mcast group members.
    - There is no need to forward datagrams down such a subtree.
    - “Prune” msgs are sent upstream by a router with no downstream group members.
  [Figure: source S and routers R1-R7. Legend: router with attached group member; router with no attached group member; prune message (P); links with multicast forwarding]
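Sketch (not from the original slides): the RPF check and the end state after pruning can be expressed with a per-router "next hop back to the source" table. The topology `LINKS`, hop-count routing, and all function names are assumptions for illustration. The `after_pruning` helper computes which routers remain on the tree once prunes have propagated; it does not model individual prune messages.

```python
from collections import deque

# Hypothetical topology; the layout of the slide's figure is not recoverable.
LINKS = {
    "S":  ["R1", "R2"],
    "R1": ["S", "R2", "R3"],
    "R2": ["S", "R1", "R4"],
    "R3": ["R1"],
    "R4": ["R2", "R5"],
    "R5": ["R4"],
}

def next_hop_to_source(links, source):
    """For each router, the neighbor on its hop-count shortest path to the source."""
    hop = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in hop:
                hop[neighbor] = node
                queue.append(neighbor)
    return hop

def rpf_accepts(hop, router, arrived_from):
    """RPF check: accept only datagrams arriving from the next hop toward the source."""
    return hop[router] == arrived_from

def after_pruning(hop, members):
    """Routers still on the tree once every memberless subtree has been pruned."""
    keep = set()
    for node in members:
        while node is not None and node not in keep:
            keep.add(node)
            node = hop[node]          # walk upstream toward the source
    return keep

hop = next_hop_to_source(LINKS, "S")
remaining = after_pruning(hop, {"R3"})
```

Here R2 accepts the datagram from S but ignores the copy relayed by R1, and with R3 as the only member the subtree through R2, R4, and R5 is pruned.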
12 Shared-Tree: Steiner Tree
  - Steiner Tree: minimum cost tree connecting all routers with attached group members
  - The problem is NP-complete.
  - Excellent heuristics exist.
  - Not used in practice:
    - computational complexity
    - information about the entire network is needed
    - monolithic: must be rerun whenever a router needs to join/leave
13 Center-based trees
  - A single delivery tree is shared by all.
  - One router is identified as the “center” of the tree.
  - To join:
    - the edge router sends a unicast join-msg addressed to the center router
    - the join-msg is “processed” by intermediate routers and forwarded towards the center
    - the join-msg either hits an existing tree branch for this center, or arrives at the center
    - the path taken by the join-msg becomes a new branch of the tree for this router
14 Center-based trees: an example
  - Suppose R6 is chosen as center.
  [Figure: routers R1-R7. Legend: router with attached group member; router with no attached group member; path order in which join messages are generated]
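Sketch (not from the original slides): the join procedure can be simulated by walking each join message along the unicast path toward the center and stopping as soon as it reaches a router already on the tree. The topology `LINKS`, the join order, and the helper names are assumptions for illustration; only the choice of R6 as center comes from the slide.

```python
from collections import deque

# Hypothetical topology; the links of the slide's figure are not recoverable.
LINKS = {
    "R1": ["R2", "R4"],
    "R2": ["R1", "R3", "R5"],
    "R3": ["R2", "R6"],
    "R4": ["R1", "R5"],
    "R5": ["R2", "R4", "R6"],
    "R6": ["R3", "R5"],
}

def unicast_path(links, start, goal):
    """Hop-count shortest path (BFS), standing in for unicast routing."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for neighbor in links[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def build_center_tree(links, center, joiners):
    """Return the set of (router, upstream) branches created by the joins."""
    on_tree = {center}
    branches = set()
    for router in joiners:
        path = unicast_path(links, router, center)
        for here, upstream in zip(path, path[1:]):
            if here in on_tree:
                break                 # join hit an existing branch; stop forwarding
            on_tree.add(here)
            branches.add((here, upstream))
    return branches

tree = build_center_tree(LINKS, "R6", ["R1", "R5", "R4"])
```

In this run the join from R4 stops at R5, which is already on the tree, so only the single branch R4 to R5 is added for it.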
15 IP Multicast Architecture
  - Hosts
  - Routers
  - Service model
  - Host-to-router protocol (IGMP)
  - Multicast routing protocols (various)
16 Internet Group Management Protocol
  - IGMP: “signaling” protocol to establish, maintain, and remove groups on a subnet.
  - Objective: keep the router up to date with the group membership of the entire LAN.
    - Routers need not know who all the members are, only that members exist.
  - Each host keeps track of which mcast groups it is subscribed to.
    - The socket API informs the IGMP process of all joins.
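Sketch (not from the original slides): on most systems an application joins a group through the socket option `IP_ADD_MEMBERSHIP`, and the host's IP stack then handles the IGMP signaling. The group address `239.1.2.3`, the port, and the helper names `make_mreq` and `join_group` are assumptions for illustration.

```python
import socket
import struct

def make_mreq(group, interface="0.0.0.0"):
    """Build the 8-byte membership request: group address, then local interface.

    "0.0.0.0" lets the operating system pick the interface.
    """
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))

def join_group(group, port):
    """Open a UDP socket and ask the kernel to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # This call is what informs the host's IGMP machinery of the join.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return sock

# Example use (requires a multicast-capable interface):
#   sock = join_group("239.1.2.3", 5007)   # hypothetical group and port
#   data, sender = sock.recvfrom(1500)
```

The application never names the other members; it only declares interest in the group, which matches the service model described above.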
17 How IGMP Works
  - On each link, one router is elected the “querier”.
  - The querier periodically sends a Membership Query message to the all-systems group ( ), with TTL = 1.
  - On receipt, hosts start random timers (between 0 and 10 seconds) for each multicast group to which they belong.
  [Figure: routers (Q marks the querier) and hosts on a link]
18 How IGMP Works (cont.)
  - When a host’s timer for group G expires, it sends a Membership Report to group G, with TTL = 1.
  - Other members of G hear the report and stop (suppress) their timers.
  - Routers hear all reports, and time out non-responding groups.
  [Figure: routers (Q marks the querier) and hosts, four of which are members of G]
19 How IGMP Works (cont.)
  - Normal case: only one report message per group present is sent in response to a query.
  - Query interval is typically seconds
  - When a host first joins a group, it sends immediate reports, instead of waiting for a query.
  - IGMPv2: hosts may send a “Leave group” message to the “all routers” ( ) address.
    - The querier responds with a Group-specific Query message to see if any group members are present.
    - Lower leave latency.
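Sketch (not from the original slides): the query, random timer, and report suppression cycle can be simulated to show why only one report per group is normally sent. The host names, group names, and the function `respond_to_query` are assumptions for illustration; the 0 to 10 second timer range is taken from the slides.

```python
import random

def respond_to_query(hosts, max_delay=10.0, rng=None):
    """Simulate one Membership Query on a link.

    hosts maps a host name to the set of groups it belongs to.
    Returns {group: host that sent the Membership Report}.
    """
    rng = rng or random.Random()
    timers = []
    for host, groups in hosts.items():
        for group in groups:
            # Each host starts an independent random timer per group.
            timers.append((rng.uniform(0, max_delay), host, group))

    reports = {}
    for _delay, host, group in sorted(timers):    # timers expire in time order
        if group not in reports:
            reports[group] = host                 # first expiry sends the report
        # Later members heard that report and suppressed their own timers.
    return reports

hosts = {"h1": {"G1"}, "h2": {"G1", "G2"}, "h3": {"G2"}}
reports = respond_to_query(hosts, rng=random.Random(0))
```

Whichever timers fire first, the router learns that G1 and G2 have members from exactly two reports, even though four host memberships exist on the link.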
20 IP Multicast Architecture
  - Hosts
  - Routers
  - Service model
  - Host-to-router protocol (IGMP)
  - Multicast routing protocols
21 Multicast Routing
  - Basic objective: build a distribution tree for multicast packets.
    - The “leaves” of the distribution tree are the subnets containing at least one group member (detected by IGMP).
  - The multicast service model makes it hard:
    - anonymity
    - dynamic join/leave
22 Routing Techniques
  - Flood and prune
    - Begin by flooding traffic to the entire network.
    - Prune branches with no receivers.
    - Examples: DVMRP, PIM-DM
  - Link-state multicast protocols
    - Routers advertise the groups for which they have receivers to the entire network.
    - Compute trees on demand.
    - Example: MOSPF
23 Routing Techniques (…)
  - Core-based protocols
    - Specify a “meeting place”, aka “core” or “rendezvous point (RP)”.
    - Sources send initial packets to the core.
    - Receivers join the group at the core.
    - Requires a mapping between the multicast group address and the “meeting place”.
    - Examples: CBT, PIM-SM
24 Routing Techniques (…)
  - Tree building methods:
    - Data-driven: calculate the tree only when the first packet is seen. Eg: DVMRP, MOSPF
    - Control-driven: build the tree in the background before any data is transmitted. Eg: CBT
  - Join-styles:
    - Explicit-join: the leaves explicitly join the tree. Eg: CBT, PIM-SM
    - Implicit-join: all subnets are assumed to be receivers unless they say otherwise (eg via tree pruning). Eg: DVMRP, MOSPF
25 Shared vs. Source-based Trees
  - Source-based trees
    - Separate shortest path tree for each sender
    - (S,G) state at intermediate routers
    - Eg: DVMRP, MOSPF, PIM-DM, PIM-SM
  - Shared trees
    - Single tree shared by all members
    - Data flows on the same tree regardless of sender
    - (*,G) state at intermediate routers
    - Eg: CBT, PIM-SM
26 Source-based Trees
  [Figure: network of routers with sources (S) and receivers (R)]
27 A Shared Tree
  [Figure: network of routers with an RP, sources (S), and receivers (R)]
28 Shared vs. Source-Based Trees
  - Source-based trees
    - Shortest path trees: low delay, better load distribution
    - More state at routers (per-source state)
    - Efficient in dense-area multicast
  - Shared trees
    - Higher delay (bounded by factor of 2), traffic concentration
    - Choice of core affects efficiency
    - Per-group state at routers
    - Efficient for sparse-area multicast
29 Distance-Vector Multicast Routing
  - DVMRP consists of two major components:
    - a conventional distance-vector routing protocol (like RIP)
    - a protocol for determining how to forward multicast packets, based on the unicast routing table
  - A DVMRP router forwards a packet if:
    - the packet arrived from the link used to reach the source of the packet (reverse path forwarding check, RPF), and
    - downstream links have not pruned the tree.
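30 Example Topology
  [Figure: source S and group members G]
31 Flood with Truncated Broadcast
  [Figure: source S and group members G]
32 Prune
  [Figure: source S and group members G; label: Prune (s,g)]
33 Graft
  [Figure: source S and group members G, including a new member; labels: Report (g), Graft (s,g)]
34 Steady State
  [Figure: source S and group members G]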
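Sketch (not from the original slides): the DVMRP forwarding rule together with the prune and graft steps can be modeled as per-router state keyed by (source, group). The class `DvmrpRouter`, its field names, and the interface names are assumptions for illustration, not DVMRP message formats.

```python
from dataclasses import dataclass, field

@dataclass
class DvmrpRouter:
    interfaces: list                             # all interfaces of this router
    rpf_iface: dict                              # source -> interface used to reach it
    pruned: dict = field(default_factory=dict)   # (source, group) -> pruned interfaces

    def receive_prune(self, source, group, iface):
        """A downstream neighbor reports no members behind iface."""
        self.pruned.setdefault((source, group), set()).add(iface)

    def receive_graft(self, source, group, iface):
        """A downstream neighbor has a new member; cancel the earlier prune."""
        self.pruned.get((source, group), set()).discard(iface)

    def forward(self, source, group, in_iface):
        """Return the interfaces a datagram from source to group is copied onto."""
        if self.rpf_iface.get(source) != in_iface:
            return []                            # fails the RPF check: drop
        blocked = self.pruned.get((source, group), set())
        return [i for i in self.interfaces if i != in_iface and i not in blocked]

router = DvmrpRouter(["eth0", "eth1", "eth2"], {"S": "eth0"})
flooded = router.forward("S", "G", "eth0")       # initial flood
router.receive_prune("S", "G", "eth2")
pruned = router.forward("S", "G", "eth0")        # after the prune
router.receive_graft("S", "G", "eth2")
grafted = router.forward("S", "G", "eth0")       # after the graft
```

Note that the router holds (S,G) prune state even for branches that carry no traffic, which is one of the scaling costs of flood-and-prune.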
35 DVMRP limitations
  - Like distance-vector protocols, affected by count-to-infinity and transient looping.
  - Shares the scaling limitations of RIP.
  - New scaling limitations:
    - (S,G) state in routers, even in pruned parts!
    - Broadcast-and-prune has an initial broadcast.
  - No hierarchy: flat routing domain.
36 Multicast Backbone (MBone)
  - An overlay network of IP multicast-capable routers using DVMRP.
  - Tools: sdr (session directory), vic, vat, wb.
  [Figure: hosts (H) and routers (R). Legend: host/router; MBone router; physical link; tunnel; part of MBone]
37 MBone Tunnels
  - A method for sending multicast packets through multicast-ignorant routers.
  - The IP multicast packet is encapsulated in a unicast IP packet (IP-in-IP) addressed to the far end of the tunnel:
      | IP header, dest = unicast | IP header, dest = multicast | Transport header and data… |
  - The tunnel acts like a virtual point-to-point link.
  - Intermediate routers see only the outer header.
  - The tunnel endpoint recognizes IP-in-IP (protocol type = 4) and de-capsulates the datagram for processing.
  - Each end of the tunnel is manually configured with the unicast address of the other end.
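Sketch (not from the original slides): IP-in-IP encapsulation amounts to prepending a second IPv4 header whose protocol field is 4 and whose destination is the unicast address of the far tunnel end. The addresses, the helper names, and the minimal 20-byte header without options are assumptions for illustration; this builds bytes in memory and sends nothing.

```python
import socket
import struct

IPPROTO_IPIP = 4                     # protocol number for IP-in-IP
HEADER_FORMAT = "!BBHHHBBH4s4s"      # 20-byte IPv4 header without options

def checksum(header):
    """Internet checksum: ones' complement of the ones' complement 16-bit sum."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def ipv4_header(src, dst, protocol, payload_len, ttl=64):
    """Build a minimal IPv4 header (version 4, IHL 5, no fragmentation)."""
    fields = [0x45, 0, 20 + payload_len, 0, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)

def encapsulate(inner_packet, tunnel_src, tunnel_dst):
    """Wrap a multicast datagram in a unicast outer header."""
    outer = ipv4_header(tunnel_src, tunnel_dst, IPPROTO_IPIP, len(inner_packet))
    return outer + inner_packet

def decapsulate(outer_packet):
    """Tunnel endpoint: check protocol 4, then strip the outer header."""
    if outer_packet[9] != IPPROTO_IPIP:
        raise ValueError("not an IP-in-IP packet")
    header_len = (outer_packet[0] & 0x0F) * 4
    return outer_packet[header_len:]

# Hypothetical addresses: a UDP (protocol 17) datagram to a multicast group,
# tunneled between two documentation-range unicast endpoints.
inner = ipv4_header("10.0.0.1", "224.2.0.1", 17, 0)
outer = encapsulate(inner, "192.0.2.1", "198.51.100.1")
```

Routers between the tunnel endpoints forward `outer` using only its first 20 bytes, so they need no multicast support.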
38 Protocol Independent Multicast (PIM)
  - Support for both shared and per-source trees
    - Dense mode (per-source tree): similar to DVMRP
    - Sparse mode (shared tree): core = rendezvous point (RP)
  - Independent of the unicast routing protocol
    - Just uses the unicast forwarding table
39 PIM Protocol Overview
  - Basic protocol steps:
    - Routers with local members Join toward the Rendezvous Point (RP) to join the shared tree.
    - Routers with local sources encapsulate data in Register messages to the RP.
    - Routers with local members may initiate a data-driven switch to source-specific shortest path trees.
  - PIM v.2 Specification (RFC2362)
40 PIM Example: Build Shared Tree
  [Figure: Source 1, Receivers 1-3, and the RP. (*,G) Join messages travel toward the RP; the shared tree is shown after R1 and R2 join.]
41 Data Encapsulated in Register
  - The unicast-encapsulated data packet is sent to the RP in a Register message.
  - The RP de-capsulates it and forwards it down the shared tree.
  [Figure: Source 1, Receivers 1-3, and the RP, with (*,G) state along the shared tree]
42 RP Send Join to High Rate Source
  [Figure: Source 1, Receivers 1-3, and the RP. The RP sends an (S1,G) Join message toward S1; the shared tree is also shown.]
43 Build Source-Specific Distribution Tree
  - Build a source-specific tree for a high data rate source.
  [Figure: Source 1, Receivers 1-3, and the RP. Join messages and the shared tree are shown; routers hold (S1,G) or (S1,G),(*,G) state.]
44 Forward On “Longest-match” Entry
  - The source-specific entry (S1,G) is a “longer match” for source S1 than the shared tree entry (*,G), which can be used by any source.
  [Figure: Source 1, Receivers 1-3, and the RP. The Source 1 distribution tree and the shared tree are shown; routers hold (S1,G) or (S1,G),(*,G) state.]
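Sketch (not from the original slides): "longest-match" forwarding can be read as a two-step lookup that prefers the (S,G) entry and falls back to the wildcard (*,G) entry. The table layout, interface names, and the function `lookup` are assumptions for illustration.

```python
# Hypothetical multicast forwarding table of one router.
# Keys are (source, group); "*" stands for any source.
TABLE = {
    ("*", "G"):  {"iif": "toward_RP", "oifs": ["eth1"]},
    ("S1", "G"): {"iif": "toward_S1", "oifs": ["eth1"]},
}

def lookup(table, source, group):
    """Prefer the source-specific entry; fall back to the shared-tree entry."""
    entry = table.get((source, group))
    if entry is None:
        entry = table.get(("*", group))   # None if the group is unknown too
    return entry

from_s1 = lookup(TABLE, "S1", "G")   # uses the (S1,G) entry
from_s2 = lookup(TABLE, "S2", "G")   # no (S2,G) entry, so uses (*,G)
```

Traffic from S1 is therefore expected on the interface toward S1, while traffic from any other source is still expected on the interface toward the RP.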
45 Prune S1 off Shared Tree
  - Prune S1 off the shared tree where the iif (incoming interface) of the S1 and RP entries differ.
  [Figure: Source 1, Receivers 1-3, and the RP. The Source 1 distribution tree and the shared tree are shown; label: Prune S1.]
46 Reliable Multicast Transport
  - Problems:
    - Retransmission can make reliable multicast as inefficient as replicated unicast.
    - Ack-implosion if all destinations ack at once.
    - Source does not know the number of destinations.
    - “Crying baby”: a bad link affects the entire group.
    - Heterogeneity: receivers, links, group sizes.
  - Not all multicast applications need strong reliability of the type provided by TCP. Some can tolerate reordering, delay, etc.
47 Reliability Models
  - Reliability requires redundancy to recover from uncertain loss or other failure modes.
  - Two types of redundancy:
    - Spatial redundancy: independent backup copies
      - Forward error correction (FEC) codes
      - Problem: requires huge overhead, since the FEC is also part of the packet(s), and it cannot recover from erasure of all packets.
    - Temporal redundancy: retransmit if packets are lost or in error
      - Lazy: trades off response time for reliability
      - Design of status reports and retransmission optimization
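Sketch (not from the original slides): the simplest FEC code is a single XOR parity packet over a block of equal-length packets, which lets a receiver repair one loss without a retransmission. The function names and the sample payloads are assumptions for illustration; practical schemes use stronger codes.

```python
def xor_parity(packets):
    """XOR equal-length packets byte by byte into one parity packet."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)

def recover(received, parity):
    """Repair a block in which exactly one packet (marked None) was lost."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("a single parity packet repairs exactly one loss")
    present = [packet for packet in received if packet is not None]
    repaired = list(received)
    # XOR of all surviving packets and the parity equals the lost packet.
    repaired[missing[0]] = xor_parity(present + [parity])
    return repaired

block = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(block)                       # sent as a fourth packet
repaired = recover([b"abcd", None, b"ijkl"], parity)
```

The same parity packet repairs a different lost packet at each receiver, which is why FEC suits multicast; the cost is one extra packet per block, and two losses in the same block still need temporal redundancy.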