
CHAPTER 5 Advanced Networking Technologies




1 CHAPTER 5 Advanced Networking Technologies
C. Develder and M. Pickavet

2 Content Traffic Engineering Failure Recovery Multicast Ethernet IPv6
Technologies

3 Traffic Engineering How to route the traffic (or, more generally: engineer the traffic)? Now: shortest-path (hop-count) routing! Alternatives: constraint-based routing (use other metrics), load balancing (use different routes), MPLS (Multi-Protocol Label Switching) as supporting technology.

So far, routing in the Internet is based on a shortest-path algorithm (typically OSPF) using a simple metric (typically hop count). In addition, traffic between a certain source and destination (area) follows a single route (although this might change over time due to reconfigurations). The use of a simple shortest-path algorithm has a number of important limitations. Today a shortest path is usually defined by the number of hops from source to destination (the metric is the hop count). This says nothing about the delay along the path or the available bandwidth. It therefore becomes interesting to consider constraint-based routing, where different metrics can be used (for time-critical traffic the delay is important; for high-bandwidth traffic the available bandwidth is important). Of course these metrics are not fully independent: higher available bandwidth will in general result in lower delay, but a high-bandwidth route may still have a long delay if its physical length (e.g. via satellite) is much greater than that of a lower-bandwidth route (e.g. via fiber). A second important remark is that classical (single-route) routing may concentrate traffic on certain links and routers while plenty of resources remain available elsewhere in the network. Load balancing (or traffic engineering in general) can help distribute the traffic more evenly over the network, reducing congestion on certain links. To support load balancing, MPLS (Multi-Protocol Label Switching) is a very useful supporting technology: it allows operators to set up explicit routes in the network (not only the shortest routes).

4 How to find the route with the required QoS?
(Figure: two routes between edge routers: a low-delay fiber link with little spare bandwidth, and a high-bandwidth satellite link with a large delay.) QoS routing: taking certain constraints into account (bandwidth, delay, cost, …) → CONSTRAINT-BASED ROUTING (can be very complex) (additive [hop count, delay], multiplicative [loss rate], concave constraints [bandwidth]).

An important question is how to find the best route that fulfils a certain constraint (e.g. delay, bandwidth). Both IntServ and DiffServ rely on the standard routing protocols used in the current Internet (e.g. OSPF). This results, however, in a route that takes only one metric into account (usually hop count). To find the optimum route through a network it may be beneficial to take several, more advanced metrics into account (e.g. delay, bandwidth, packet loss rate). The example shows two routes in the network. The first (low-delay) route passes over a fiber link that is close to congestion (due to traffic already using it), leaving only limited bandwidth. The second route runs over a satellite link (large propagation delay) but is hardly used (large bandwidth still available). Depending on the application, one may opt for one route or the other; this is supported by constraint-based routing.

To use constraint-based routing, information about the different metrics has to be distributed in the network. This is possible by extending the existing routing protocols. OSPF, for example, currently provides only topology information with a single metric (typically set to 1 for the hop count). By adding other metrics one can extend OSPF (Q-OSPF, where Q stands for QoS) and in this way obtain a topology database with multiple metrics per link. From that database it is possible to calculate generalized shortest paths (e.g. the path with the smallest delay or the largest bandwidth). These calculations can become very complex depending on the specifics of the metrics. For bandwidth one takes the smallest value along the route (a concave constraint), for delay one adds the contributions of the different links (an additive constraint), and for loss rate one multiplies the link-specific values (a multiplicative constraint). These are known to be computationally intensive calculations. Constraint-based routing is clearly very attractive for IntServ and DiffServ, but also useful on its own (with the normal best-effort forwarding strategy). Note: constraint-based routing results in much more complex routing tables, and the stability of the protocol becomes more problematic. How to distribute "constraint" information? (e.g. BW on links) Add information on link state during OSPF (Q-OSPF). Very useful for both DiffServ and IntServ! Routing table gets much more complex!
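As a small illustration of how the three constraint types compose along a path, the following sketch computes the end-to-end delay, loss and bandwidth of the two routes from the figure. The per-link values are invented for the example, not taken from the slides:

```python
from math import prod

def path_metrics(links):
    """links: per-link dicts with 'delay_ms', 'loss' and 'bw_mbps'."""
    return {
        "delay_ms": sum(l["delay_ms"] for l in links),       # additive constraint
        "loss": 1 - prod(1 - l["loss"] for l in links),      # multiplicative constraint
        "bw_mbps": min(l["bw_mbps"] for l in links),         # concave (bottleneck) constraint
    }

# invented example values for the two routes in the figure
fiber_route = [{"delay_ms": 2, "loss": 0.001, "bw_mbps": 10},
               {"delay_ms": 3, "loss": 0.001, "bw_mbps": 8}]
satellite_route = [{"delay_ms": 250, "loss": 0.01, "bw_mbps": 500}]

assert path_metrics(fiber_route)["delay_ms"] == 5       # low delay...
assert path_metrics(fiber_route)["bw_mbps"] == 8        # ...but little spare bandwidth
assert path_metrics(satellite_route)["bw_mbps"] == 500  # high bandwidth, high delay
```

The fiber route wins on delay and the satellite route on bandwidth, which is exactly the trade-off a constraint-based routing algorithm has to resolve.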

5 Load balancing
(Figure: core and edge routers with traffic concentrated on the shortest path between the edge routers.) Shortest-path problem: certain links become overloaded. An important problem in current Internet routing is that shortest paths are used, concentrating traffic on certain links (resulting in congestion, delay and packet loss). To counteract this, one can use load balancing to distribute the traffic more evenly over the network. This requires, however, multiple paths between source and destination (area). A first possibility is the equal-cost multipath (ECMP) routing algorithm. This is useful when there are several shortest routes between the same two endpoints: one can then try to use the different shortest routes instead of choosing just one. A possibility is to use a hash function (e.g. half of the destination domains go on shortest path 1 and the other half on shortest path 2). A second possibility is to set up explicit tunnels in the network, which may follow a route different from the shortest one. A supporting technology for this is MPLS, or Multi-Protocol Label Switching. Load balancing: distribute traffic more evenly over the network: equal-cost multipath (use of a hash function), or use of MPLS.
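The hash-based splitting mentioned above can be sketched in a few lines; the flow key and path names are illustrative. Hashing a flow identifier (rather than deciding per packet) keeps all packets of one flow on the same path, which avoids reordering:

```python
import hashlib

def ecmp_pick(src, dst, paths):
    """Pick one of several equal-cost paths based on a hash of the flow."""
    digest = hashlib.md5(f"{src}-{dst}".encode()).hexdigest()
    return paths[int(digest, 16) % len(paths)]

paths = ["shortest-path-1", "shortest-path-2"]
# all packets of the same flow consistently map to the same path
assert ecmp_pick("10.0.0.1", "10.0.1.7", paths) == ecmp_pick("10.0.0.1", "10.0.1.7", paths)
```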

6 MPLS: Multi Protocol Label Switching
(Figure: a Label Switched Router (LSR) with input ports IN 1 / IN 2, output ports OUT 1 / OUT 2, a Label Information Base (LIB), and an MPLS header inserted in front of the IP header.) MPLS replaces normal packet forwarding based on a routing-table look-up with forwarding based on a label (and label swapping), as illustrated in the figure. An MPLS packet carries an additional header in front of the IP packet header (in fact, the MPLS header is inserted between the IP header and the layer-2 header, e.g. the Ethernet header). This 32-bit header contains a label (20 bits), 3 bits for experimental use, a stacking bit and an 8-bit TTL. The label plays the central role in the forwarding process (also called label switching). Suppose an MPLS packet arrives at a router supporting MPLS (an LSR, or Label Switched Router). On arrival, the router looks up a database, the Label Information Base (LIB), to learn where the packet has to go and what the new label is on the outgoing link. The example shows an incoming label of 5 on input link 1 being mapped to an outgoing label of 4 on output link 1 (as indicated in the LIB). Another MPLS packet arriving on IN 2 with label 3 is mapped to OUT 1 with label 5. Important: labels have only local significance. An end-to-end path (LSP, or Label Switched Path) consists of labels on each link and a relation between them defined in the Label Information Base (the "label swapping table"). As a result, the same label may be reused on different links. Note: ATM (Asynchronous Transfer Mode) uses a similar concept. MPLS header (32 bit): Label (20 bit): MPLS label, local significance; Exp (3 bit): experimental use, e.g. for DiffServ; S (1 bit): stacking bit; TTL (8 bit): time to live.
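The LIB of the figure can be modelled as a plain lookup table; this sketch uses the port and label values from the slide's example:

```python
# (input port, incoming label) -> (output port, outgoing label)
LIB = {
    ("IN1", 5): ("OUT1", 4),
    ("IN2", 3): ("OUT1", 5),
}

def forward(in_port, label):
    # a single exact-match lookup plus a label swap; no longest-prefix match
    return LIB[(in_port, label)]

assert forward("IN1", 5) == ("OUT1", 4)
assert forward("IN2", 3) == ("OUT1", 5)
```

Note that label 5 appears both as an incoming and an outgoing label: labels have only local (per-link) significance, so the same value may be reused on different links.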

7 Routing <> Label Switching
(Figure: side-by-side comparison of forwarding in plain IP routers and in IP/MPLS Label Switched Routers (LSRs); the LSRs forward on per-link label-swapping entries of the form "incoming label → outgoing label" instead of IP routing-table look-ups.)

8 MPLS: Path set-up (LSP)
(Figure: path set-up from X to Y through LSRs A, B and C: PATH messages carry a LABEL_REQUEST object toward Y, and RESV messages travel back carrying a LABEL object, filling the Label Information Base at each hop.) The goal of MPLS is to set up dedicated routes in the network where forwarding is based on a label-swapping operation (not on a longest-prefix match as in normal IP forwarding). The principle of path set-up in MPLS (Multi-Protocol Label Switching) is as follows. Suppose one wants to set up a Label Switched Path (LSP) from X to Y. First, a label request message is sent from source X to destination Y using the normal forwarding of IP packets (following the shortest path as obtained from e.g. OSPF). When the label request arrives at Y, this computer (it could also be a router, typically an edge router) assigns a label for the request (e.g. 100). This should be a label that is not yet used on the link from the computer to the last router (C). The computer sends a label mapping message back to that router. The router assigns a label (e.g. 200) on the link towards the previous router B (a label not yet used on that link) and keeps a label-swapping table (Label Information Base) in its memory (swapping 200 to 100 for traffic coming from source X). A label mapping message is sent from C to B, and B assigns a label on the link towards router A (e.g. 100). Router A takes similar actions, assigning a label on the link to X (e.g. 300) and filling in its label-swapping table (300 to 100). In this way an LSP is set up from source to destination. The protocol used to set up the path (to assign and distribute the labels) is RSVP-TE (in the past, the Label Distribution Protocol (LDP) was also considered).
Some terminology: MPLS-capable router: Label Switched Router (LSR); MPLS path: Label Switched Path (LSP).
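The hop-by-hop label assignment described above can be sketched as follows. A trivial counter stands in for each node's label allocator (so the label values differ from the slide's example), but the mechanism is the same: the label mapping travels back from the destination, and every LSR records an (incoming label, outgoing label) pair in its LIB:

```python
def setup_lsp(path):
    """path: nodes from ingress to egress, e.g. ["X", "A", "B", "C", "Y"]."""
    lib = {}            # node -> (incoming label, outgoing label)
    next_label = 100    # toy allocator; a real node picks any free label per link
    out_label = None
    # the label mapping (RESV) travels from the egress back towards the ingress
    for node in reversed(path[1:]):
        in_label = next_label
        next_label += 100
        if out_label is not None:      # the egress only terminates labels
            lib[node] = (in_label, out_label)
        out_label = in_label
    return lib, out_label              # out_label: the label the ingress must push

lib, first_label = setup_lsp(["X", "A", "B", "C", "Y"])
assert first_label == 400
assert lib == {"C": (200, 100), "B": (300, 200), "A": (400, 300)}
```

Following the LIB entries from the ingress: X pushes 400, A swaps 400 → 300, B swaps 300 → 200, C swaps 200 → 100, and Y terminates the path.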

9 MPLS: Path set-up (LSP)
(Printing version of the figure on the previous slide: PATH messages carrying a LABEL_REQUEST object for destination Y travel hop by hop from X through A, B and C toward Y.) Need label for Destination Y (LABEL_REQUEST object in PATH msg). For printing.

10 MPLS: Path set-up (LSP)
(Printing version of the figure: RESV messages carrying LABEL objects travel back from Y toward X, filling each router's Label Information Base.) Respond with a label (LABEL object in RESV msg). For printing.

11 MPLS: support of TE
(Figure: two LSPs from X to Y: an upper route with label sequence 50 – 150 – 450 – 100 and a lower route with label sequence 300 – 100 – 200 – 100; the MPLS label sits in front of the IP header.) This figure illustrates the load-balancing support of MPLS. From terminal X one can send IP packets to terminal Y via two different routes (supported by MPLS). Sending an IP packet with MPLS label 300 makes it follow the lower route (label-swapping sequence: 300 – 100 – 200 – 100); sending an IP packet with label 50 makes it follow the upper route (label-swapping sequence: 50 – 150 – 450 – 100). Choosing the appropriate label thus allows traffic to be sent over two different routes, spreading it over the network.

12 MPLS: Example MPLS “tunnel”
(Figure: an MPLS tunnel (LSP) from A to C via D, E and F, bypassing the OSPF shortest path via B.) MPLS may be used to set up explicit paths in a network that do not follow the shortest path illustrated previously. In this case the path is set up using explicit routing (via an EXPLICIT_ROUTE object in the RSVP PATH message): the route is calculated at the source (e.g. with constraint-based routing) and carried explicitly in the path set-up message (the label request message). The set-up message follows this explicit route rather than the "OSPF-based" shortest route. In this way, a Label Switched Path is set up along a specific route (with some specific features). The example shows the routing table when MPLS is not used (table on top): a packet arriving at the incoming interface of router A for the given destination network is forwarded to router B, as indicated in the routing table. When MPLS is used, a Label Switched Path (LSP) is set up between router A and router C, and the routing-table entry for that network directs the packets into the LSP from A to C via D, E and F. MPLS tunnel (LSP) set-up via explicit routing: during path set-up an explicit path is used (not the OSPF shortest path, but e.g. a constraint-based path with the lowest delay).
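A hedged sketch of what explicit routing changes: the hop list travels inside the set-up message, so each LSR consults the message rather than its OSPF table for the next hop. The message fields below are illustrative, not the actual RSVP-TE object layout:

```python
# PATH message carrying a label request plus an explicit hop list
# (cf. the EXPLICIT_ROUTE object); node names follow the figure
path_msg = {
    "type": "PATH",
    "label_request": {"destination": "C"},
    "explicit_route": ["A", "D", "E", "F", "C"],
}

def next_hop(msg, current_node):
    # each hop reads its successor from the carried route, not from OSPF
    route = msg["explicit_route"]
    return route[route.index(current_node) + 1]

assert next_hop(path_msg, "A") == "D"  # not B, the OSPF shortest-path next hop
assert next_hop(path_msg, "F") == "C"
```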

13 MPLS: VPN example MPLS Virtual Private Network between three company locations LSP Public Internet (MPLS capable) easy end-to-end encryption for security. An interesting application of these MPLS "tunnels" (LSPs) is the set-up of Virtual Private Networks. These can be used to interconnect several company locations using Label Switched Paths. MPLS may be combined with DiffServ in order to provide QoS: because the IP header (and thus the DSCP) is not analysed in the Label Switched Routers (LSRs), the PHB is indicated in the MPLS header using the 3 Exp bits. Note that these tunnels can also be secured in a straightforward manner. MPLS can be combined with DiffServ to provide QoS (the 3 Exp bits are used to indicate the PHB).

14 Content Traffic Engineering Failure Recovery Multicast Ethernet IPv6

15 Failure Recovery: OSPF based
(Figure: normal operation: routers flood incoming link-state packets; router C builds a link-state database with entries [AB,BD,BC], [BD,CD,DE], [AE,DE], [AB,AE] and runs Dijkstra to compute its routing table.) Another important issue is the resilience of the network against failures. This may be resolved using the classical routing protocols (e.g. OSPF), but that is in general fairly slow. Note that the whole Internet concept was developed to make the network reliable against failures. Other (protection) techniques may be much faster and may result in better service for the customers: a service interruption of a few seconds is much better than a service interruption of hundreds of seconds. As a reminder, link-state routing (OSPF, Open Shortest Path First) is illustrated in the figure.
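The figure's computation can be reproduced with a few lines of Dijkstra over the link-state database (all links with cost 1, i.e. hop count):

```python
import heapq

EDGES = ["AB", "AE", "BC", "BD", "CD", "DE"]  # link-state database from the figure

def shortest_paths(source):
    graph = {}
    for a, b in EDGES:                 # each two-letter entry names the endpoints
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                   # stale heap entry
        for nbr in graph[node]:
            if d + 1 < dist.get(nbr, float("inf")):
                dist[nbr] = d + 1
                heapq.heappush(heap, (d + 1, nbr))
    return dist

assert shortest_paths("C") == {"C": 0, "B": 1, "D": 1, "A": 2, "E": 2}
```

Router C's routing table follows directly from these distances; after a link failure the database changes and the same computation yields the new routes.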

16 Failure Recovery: OSPF based
(Figure: after link ED fails it is no longer advertised; router C's link-state database now contains [AB,BD,BC], [BD,CD], [AE], [AB,AE], and Dijkstra recomputes the routing table. This may take 50 to 100 seconds.) This figure illustrates how OSPF reacts to a link failure. The routing tables are updated correctly, but this may take a long time (on the order of 100 seconds, although tuning the parameters may reduce this by an order of magnitude). For a number of emerging multimedia services this is unacceptable. From experience in the telephone network (PSTN) it is clear that voice traffic is not affected by short interruptions (on the order of a few 100 ms), which in general is easily obtained using the appropriate protection mechanisms.

17 Failure Recovery: MPLS based
Set up a back-up LSP between the edge routers. Copy incoming traffic onto both the primary and the back-up LSP (1+1 protection). Select traffic from the back-up LSP if the primary LSP is not available → VERY FAST (a single decision at the receiving end, the egress router). (Figure: primary and back-up LSPs between the ingress and egress edge routers; the ingress copies traffic onto the back-up LSP, and the egress takes traffic from the back-up LSP if the primary LSP fails.) The reaction time after a failure can be improved drastically by the use of MPLS back-up paths. A Label Switched Path (the primary LSP) is set up between the two edge routers. At the same time, a back-up LSP (using different links and Label Switched Routers) is set up between the same edge routers. In the ingress edge router (left side), the IP traffic is duplicated and sent on both the primary LSP and the back-up LSP. At the egress edge router (right side), the IP packets are taken from the primary LSP. If the primary LSP fails, the traffic at the egress edge router is taken from the back-up LSP instead. This switch-over can be very fast (on the order of 100 ms). Note: all traffic between the two edge routers may be protected with the same back-up LSP.
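The egress router's switch-over logic is the whole point of 1+1 protection: traffic is already present on both LSPs, so recovery is one local decision. A minimal sketch:

```python
def egress_select(primary_alive, primary_traffic, backup_traffic):
    # a single decision at the receiving end; no signalling is needed to recover
    return primary_traffic if primary_alive else backup_traffic

# normal operation: both LSPs carry the same duplicated packets
assert egress_select(True, ["pkt1", "pkt2"], ["pkt1", "pkt2"]) == ["pkt1", "pkt2"]
# primary LSP fails: the egress simply reads from the back-up LSP
assert egress_select(False, [], ["pkt1", "pkt2"]) == ["pkt1", "pkt2"]
```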

18 MPLS: failure recovery
Z 50150 150450 450100 200100 300100 100200 A 450 B 100 100 Y 150 100 200 X This figure illustrates the principle of the failure recovery support of MPLS. From terminal X one will send IP packets to terminal Y with MPLS label 300 following the lower route (label swapping sequence: 300 – 100 – 200 – 100). When a failure occurs, the IP packets will use a new label (50) and will follow the upper route (label swapping sequence: 50 – 150 – 450 – 100). This will be very fast when the upper path is already established. 50 300 MPLS Label IP header D W Technologies

19 Content Traffic Engineering Failure Recovery Multicast Ethernet IPv6

20 Multicast: multiple unicast
A number of emerging network applications require the delivery of packets from one or more senders to a group of receivers. These applications include bulk data transfer (for example, the transfer of a software upgrade from the software developer to users needing the upgrade), streaming continuous media (for example, the transfer of the audio, video, and text of a live lecture to a set of distributed lecture participants), shared data applications (for example, a whiteboard or teleconferencing application shared among many distributed participants), data feeds (for example, stock quotes), WWW cache updating, and interactive gaming (for example, distributed interactive virtual environments or multiplayer games such as World of Warcraft). For each of these applications, an extremely useful abstraction is the notion of a multicast: the sending of a packet from one sender to multiple receivers with a single send operation.

From a networking standpoint, the multicast abstraction (a single send operation that results in copies of the sent data being delivered to many receivers) can be implemented in many ways. One possibility is for the sender to use a separate unicast transport connection to each of the receivers. An application-level data unit passed to the transport layer is then duplicated at the sender and transmitted over each of the individual connections. This approach implements a one-sender-to-many-receivers multicast abstraction using an underlying unicast network layer: it requires no explicit multicast support from the network layer, since multicast is emulated using multiple point-to-point unicast connections. A source is sending the same information to a number of receivers (e.g. video distribution): multiple unicast flows or a single multicast flow.

21 Multicast: single multicast tree
Class D multicast address (multicast group). A second alternative is to provide explicit multicast support at the network layer. In this approach, a single datagram is transmitted from the sending host. This datagram (or a copy of it) is then replicated at a network router whenever it must be forwarded on multiple outgoing links in order to reach the receivers. This second approach enables more efficient use of network bandwidth, in that only a single copy of a datagram ever traverses a link. On the other hand, considerable network-layer support is needed to implement a multicast-aware network layer. Unlike the unicast case, Internet multicast is not a connectionless service: state information for a multicast connection must be established and maintained in the routers that handle multicast packets sent among hosts in a so-called multicast group. This, in turn, requires a combination of signaling and routing protocols to set up, maintain, and tear down connection state in the routers. This added complexity has hampered the success of multicast in the Internet. (A now historical effort was the MBONE, mainly used for videoconferences, which was a virtual network that used tunneling to cross unicast routers.)

In Internet multicast, a multicast datagram is addressed using address indirection: a single identifier is used for the group of receivers, and a copy of the datagram addressed to the group using this identifier is delivered to all the multicast receivers associated with that group. In the Internet, the single identifier that represents a group of receivers is a Class D multicast address. The group of receivers associated with a Class D address is referred to as a multicast group. This is illustrated in the figure: a number of hosts are associated with the multicast group address and will receive all datagrams addressed to that multicast address.

A problem to be addressed is the fact that each host has a unique IP unicast address that is completely independent of the address of the multicast group in which it is participating. Additional questions: How does a group get started and how does it terminate? How is the group address chosen? How are new hosts added to the group (either as senders or receivers)? Can anyone join a group (and send to, or receive from, that group), or is group membership restricted, and if so, by whom? Do group members know the identities of the other group members as part of the network-layer protocol? How do the network routers interoperate with each other to deliver a multicast datagram to all group members? (Note: for the multicast address range, a /4 block, see RFC 3171.) Who belongs to a multicast group? Connection oriented: requires state in the network, requires signaling, requires special routing protocols. How to become a member of the multicast group? How to set up the multicast tree? Multiple unicast flows or a single multicast flow.

22 Multicast architecture
Internet MULTICAST ROUTING: DVMRP (Distance Vector Multicast Routing Protocol) and PIM (Protocol Independent Multicast), used in a wide area (intradomain, also interdomain); IGMP (Internet Group Management Protocol), used in a single (sub)network. One typically distinguishes multicast routing protocols at three levels: (1) within a single (sub)network, (2) within a wide-area network (an ISP or Internet Service Provider, an Autonomous System or AS: intradomain), and (3) between wide-area networks: interdomain (between ISPs or ASs). The latter is not considered here. The Internet Group Management Protocol, IGMP (currently version 3 [RFC 3376]), operates between a host and its directly attached router (within a (sub)network). The example shows a local interface attached to a LAN; while each LAN has multiple attached hosts, at most a few of these hosts will typically belong to a given multicast group at any given time. IGMP provides the means for a host to inform its attached router (gateway) that an application running on the host wants to join a specific multicast group. The scope of IGMP is managing multicast group membership, and interaction is limited to a host and the router it is attached to. Hence another protocol is required to coordinate the multicast routers (including the attached routers) throughout the Internet, so that multicast datagrams are routed to their final destinations (at both the interdomain and intradomain level). This functionality is accomplished by network-layer multicast routing algorithms such as PIM (Protocol Independent Multicast), DVMRP (Distance Vector Multicast Routing Protocol) and MOSPF (Multicast Open Shortest Path First). Network-layer multicast in the Internet thus consists of two complementary components: IGMP and the multicast routing protocols (intradomain and interdomain).

23 Internet Group Management Protocol (IGMP)
IGMP messages: the leave message is optional (→ soft state). The edge router (ER) has to know the multicast groups to which local hosts are subscribed. (Figure: hosts exchange query and report messages with their edge router toward the Internet.)

IGMP has only three message types. A general membership_query message is sent by a router to all hosts on an attached interface (e.g. to all hosts on a LAN) to determine the set of all multicast groups that have been joined by the hosts on that interface. A router can also determine whether a specific multicast group has been joined by hosts on an attached interface using a specific membership_query; this query includes the multicast address of the group being queried in the multicast group address field of the IGMP membership_query message. Hosts respond to a membership_query message with an IGMP membership_report message. Membership_report messages can also be generated by a host when an application first joins a multicast group, without waiting for a membership_query message from the router. Membership_report messages are received by the router, as well as by all hosts on the attached interface (for example, in the case of a LAN). Each membership_report contains the multicast address of a single group that the responding host has joined. Note that an attached router doesn't really care which hosts have joined a given multicast group, or even how many hosts on the same LAN have joined the same group (the router only needs to know the multicast groups in use on the LAN). In either case, the router's work is the same: it must run a multicast routing protocol together with other routers to ensure that it receives from the Internet the multicast datagrams for the appropriate multicast groups.

The final type of IGMP message is the leave_group message. Interestingly, this message is optional! But if it is optional, how does a router detect that there are no longer any hosts on an attached interface that are joined to a given multicast group? The answer lies in the use of the IGMP membership_query message: the router infers that no hosts are joined to a given multicast group when no host responds to a membership_query message with the given group address. This is an example of what is sometimes called soft state in an Internet protocol. In a soft-state protocol, the state (in this case, the fact that there are hosts joined to a given multicast group) is removed via a timeout event (here, a periodic membership_query message from the router) if it is not explicitly refreshed (here, by a membership_report message from an attached host). Defined in RFC 3376 (IGMPv3).
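The soft-state behaviour described above can be sketched in a few lines; the timeout value is illustrative (real routers use query intervals on the order of minutes):

```python
import time

class MembershipState:
    """Soft state: group membership expires unless refreshed by reports."""
    TIMEOUT = 2.0  # seconds; illustrative only

    def __init__(self):
        self.last_report = {}            # group address -> time of last report

    def membership_report(self, group):  # a host answers a query (or joins)
        self.last_report[group] = time.monotonic()

    def active_groups(self):
        # state is removed purely by timeout; no leave_group message is required
        now = time.monotonic()
        return {g for g, t in self.last_report.items() if now - t < self.TIMEOUT}

state = MembershipState()
state.membership_report("group-A")
assert "group-A" in state.active_groups()
```

A group for which no host responds to the periodic queries simply times out of `active_groups`, which is exactly how the router infers an empty group without any explicit leave message.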

24 Service model of multicast
- local join of a multicast group using IGMP
- the access router takes care of receiving the multicast group's packets for its local hosts (using a multicast routing protocol)
- receiver-driven joining of a group
- senders do not know the receivers
- all group members can be senders
Remaining question: how to interconnect the edge routers? Use of multicast routing protocols.

Having examined the protocol for joining and leaving multicast groups (IGMP), the current Internet multicast service model should now be clearer. In this multicast service model, any host can join a multicast group at the network layer: a host simply issues a membership_report IGMP message to its attached router. That router, working in concert with other Internet routers, will soon begin delivering multicast datagrams to the host. Joining a multicast group is thus receiver-driven. A sender does not need to be concerned with explicitly adding receivers to the multicast group, but neither can it control who joins the group and therefore who receives datagrams sent to that group. Similarly, there is no control over who sends to the multicast group. Datagrams sent by different hosts can be arbitrarily interleaved at the various receivers (with different interleavings possible at different receivers). A malicious sender can inject datagrams into the multicast group's datagram flow. Even with benign senders, since there is no network-layer coordination of the use of multicast addresses, it is possible that two different multicast groups choose the same multicast address; from a multicast application viewpoint, this results in interleaved extraneous multicast traffic. The solution is Source-Specific Multicast (SSM), for which version 3 of IGMP offers support. Indeed, IGMPv3 allows "source filtering", that is, the ability for a system to report interest in receiving packets *only* from specific source addresses, or from *all but* specific source addresses, sent to a particular multicast address.
(Note: for SSM, see informational RFC 3569, "An Overview of Source-Specific Multicast (SSM)", and RFC 4607, "Source-Specific Multicast for IP".) Note: there is no coordination of the choice of a Class D address for a multicast group (→ multiple groups may eventually use the same Class D address!). Solution: "source filtering", as in IGMPv3.

25 Multicast routing: group shared tree
(Figure: a single shared tree interconnecting the edge routers; multicast packets flow only over the bold links.) A first approach for multicasting information between the different routers (if they need information from the same multicast group) is to use a group-shared tree. In the group-shared tree approach, only a single routing tree is constructed for the entire multicast group (see figure). Multicast packets flow only over the bold links. Note that the links are bidirectional, since packets can flow in either direction on a link. How to build the routing tree between edge routers? First approach: a multicast group-shared tree. Note: all group members use the same (bidirectional) tree.

26 Multicast routing: group shared tree
(Figure: edge routers unicast join messages toward a rendezvous point (RP) in the core; the paths the joins follow become the branches of the shared tree.) Building a group-shared tree may be based on the notion of a center node (also known as a rendezvous point, or core) for the single shared multicast routing tree. In this center-based approach, a center node is first identified for the multicast group. Routers with attached hosts belonging to the multicast group then unicast so-called join messages addressed to the center node. A join message is forwarded using unicast routing toward the center until it either arrives at a router that already belongs to the multicast tree or arrives at the center. In either case, the path the join message has followed defines the branch of the routing tree between the edge router that initiated the join and the center. One can think of this new path as being grafted onto the existing multicast tree for the group. A critical question for center-based tree multicast routing is the process used to select the center. It has been shown that centers can be chosen so that the resulting tree is within a constant factor of the optimum (the solution to the Steiner tree problem). How to build the group-shared tree? Use of a rendezvous point (center-based approach). Note: the choice of the rendezvous point is difficult.
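The grafting of a join path onto the shared tree can be sketched as follows; `unicast_next_hop` is an assumed stand-in for the normal unicast routing table, and the router names are invented:

```python
def graft(edge_router, rp, on_tree, unicast_next_hop):
    """Walk a join message toward the rendezvous point (RP); stop at the
    first node already on the tree, and add the traversed path as a branch."""
    node, branch = edge_router, [edge_router]
    while node != rp and node not in on_tree:
        node = unicast_next_hop(node, rp)
        branch.append(node)
    on_tree.update(branch)        # the join's path becomes a new tree branch
    return branch

# toy unicast routing for a two-hop example
hops = {("ER3", "RP"): "CR2", ("CR2", "RP"): "RP"}
tree = {"RP", "CR1"}                       # nodes already on the shared tree
assert graft("ER3", "RP", tree, lambda n, d: hops[(n, d)]) == ["ER3", "CR2", "RP"]
assert "CR2" in tree                       # the branch is now part of the tree
```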

27 Multicast routing: source based tree
(Figure: per-sender trees rooted at each edge router; different links may carry traffic for different sources.) Another approach is the use of source-based trees. In a source-based approach, an individual routing tree is constructed for each sender in the multicast group. In a multicast group with N hosts, N different routing trees are constructed for that single multicast group, and packets are routed to multicast group members in a source-specific manner. Note that different links may be used compared to the group-shared-tree case, and the source trees themselves will also differ from one another. Second approach: multiple source-based trees. Note: the trees will be different and, in general, unidirectional.

28 Multicast routing: source based tree
An incoming multicast packet is forwarded by a router on all of its outgoing links (except the one on which the packet was received) only if the packet arrived on the link that is on the router's own shortest path back to the sender. How to build up a source-based tree? Use of Reverse Path Forwarding (RPF). Note: prune messages are sent by edge routers that have no hosts belonging to the multicast group. A simple algorithm to build up a single source tree is the reverse path forwarding (RPF) algorithm. When a router receives a multicast packet with a given source address, it transmits the packet on all of its outgoing links (except the one on which it was received) only if the packet arrived on the link which is on its own shortest path back to the sender. Otherwise, the router simply discards the incoming packet without forwarding it on any of its outgoing links. Such a packet can be dropped because the router has already received, or will receive, a copy of this packet on the link that is on its own shortest path back to the sender. Note that reverse path forwarding does not require that a router know the complete shortest path from itself to the source; it only needs to know the next hop on its unicast shortest path to the sender.
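The RPF rule can be expressed as a small sketch; the interface names and the single-entry unicast table are hypothetical.

```python
# Minimal reverse path forwarding (RPF) check. The table maps a source to
# the interface on this router's unicast shortest path back to that source.
unicast_next_hop_if = {"S": "if0"}   # assumed unicast routing state

def rpf_forward(src, arrival_if, all_ifs):
    """Return the interfaces to forward on, or [] if the RPF check fails."""
    if arrival_if != unicast_next_hop_if[src]:
        return []                                   # not on shortest path back: drop
    return [i for i in all_ifs if i != arrival_if]  # flood on all other links

print(rpf_forward("S", "if0", ["if0", "if1", "if2"]))  # ['if1', 'if2']
print(rpf_forward("S", "if1", ["if0", "if1", "if2"]))  # [] (duplicate, dropped)
```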

29 Multicast routing: source based tree
The problem with RPF is that routers will receive multicast packets even when there are no hosts (connected to the router) interested in the multicast group. The solution to this problem of receiving unwanted multicast packets under RPF is known as pruning. A multicast router that receives multicast packets and has no attached hosts joined to that group will send a prune message to its upstream router. If a router receives prune messages from each of its downstream routers, then it can forward a prune message upstream. While pruning is conceptually straightforward, two subtle issues arise. First, pruning requires that a router know which downstream routers depend on it for receiving their multicast packets. This requires additional information beyond that required for RPF. A second complication is more fundamental: if a router sends a prune message upstream, what should happen if that router later needs to join the multicast group? Recall that, under RPF, multicast packets are "pushed" down the RPF tree to all routers. If a prune message removes a branch from that tree, then some mechanism is needed to restore that branch. One possibility is to add a graft message that allows a router to "unprune" a branch. Another option is to let pruned branches time out and be added again to the multicast RPF tree; a router can then re-prune the added branch if the multicast traffic is still not wanted. Prune messages are sent from edge routers that have no hosts belonging to the multicast group ("pruned" routers will not forward packets from the multicast group).
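The prune condition described above (no attached member hosts, and prunes already received from every downstream router) can be sketched as:

```python
def should_prune(has_member_hosts, downstream_pruned):
    """A router prunes upstream when no attached hosts have joined the
    group and every downstream router has already sent it a prune.
    downstream_pruned is one boolean per downstream router (empty for a
    leaf router with no downstream neighbors)."""
    return not has_member_hosts and all(downstream_pruned)

print(should_prune(False, [True, True]))   # True: whole subtree uninterested
print(should_prune(False, [True, False]))  # False: one downstream still wants traffic
print(should_prune(True, []))              # False: a local host is a member
```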

30 Examples of multicast routing protocols
Distance Vector Multicast Routing Protocol (DVMRP): source-based trees; reverse path forwarding, pruning and grafting. Protocol Independent Multicast (PIM): two different scenarios, dense mode and sparse mode: dense mode (DM): large number of users → RPF approach; sparse mode (SM): few users → central approach; bidirectional (BIDIR): variant of SM → central approach. The first multicast routing protocol used in the Internet and the most widely supported multicast routing algorithm is the Distance Vector Multicast Routing Protocol (DVMRP). DVMRP implements source-based trees with reverse path forwarding, pruning, and grafting (adding back a previously pruned branch). DVMRP uses a distance vector algorithm that allows each router to compute the outgoing link (next hop) that is on its shortest path back to each possible source. This information is then used in the RPF algorithm, as discussed above. The Protocol Independent Multicast (PIM) routing protocol explicitly envisions two different multicast distribution scenarios. In so-called dense mode, multicast group members are densely located, that is, many or most of the routers in the area need to be involved in routing multicast datagrams. In sparse mode, the number of routers with attached group members is small with respect to the total number of routers; group members are widely dispersed. In dense mode (RFC 3973), since most routers will be involved in multicast (for example, have attached group members), it is reasonable to assume that each and every router should be involved in multicast. Thus, an approach like RPF that floods datagrams to every multicast router (unless a router explicitly prunes itself), is well-suited to this scenario. On the other hand, in sparse mode (RFC 4601), the routers that need to be involved in multicast forwarding are few and far between.
In this case, a data-driven multicast technique like RPF, which forces a router to constantly do work (prune) simply to avoid receiving multicast traffic, is much less satisfactory. In sparse mode, the default assumption should be that a router is not involved in a multicast distribution; the router should not have to do any work unless it wants to join a multicast group. This argues for a center-based approach (rendezvous point approach), where routers send explicit join messages but are otherwise uninvolved in multicast forwarding. One can think of the sparse-mode approach as being receiver-driven (that is, nothing happens until a receiver explicitly joins a group) and the dense-mode approach as being data-driven (that is, datagrams are multicast everywhere, unless explicitly pruned). Other examples: Multicast Open Shortest Path First (MOSPF); Core Based Tree (CBT).

31 Content Traffic Engineering Failure Recovery Multicast Ethernet IPv6
(continued from previous slide) Bidirectional PIM (BIDIR-PIM, RFC 5015) is a variant of PIM Sparse Mode that builds bidirectional shared trees connecting multicast sources and receivers. Bidirectional trees are built using a fail-safe Designated Forwarder (DF) election mechanism operating on each link of a multicast topology. With the assistance of the DF, multicast data is natively forwarded from sources to the Rendezvous Point (RP) and hence along the shared tree to receivers without requiring source-specific state. The DF election takes place at RP discovery time and provides the route to the RP, thus eliminating the need for data-driven protocol events. PIM Source-Specific Multicast (PIM-SSM, RFC 3569) builds trees that are rooted in just one source, offering a more secure and scalable model for a limited set of applications (mostly broadcasting of content). In SSM, an IP datagram is transmitted by a source S to an SSM destination address G, and receivers can receive this datagram by subscribing to the channel (S,G). Other examples are MOSPF (Multicast Open Shortest Path First) and CBT (Core Based Tree; historical).

32 Ethernet forwarding How does a switch determine to which LAN segment to forward a frame?
A switch has a switch table. An entry in the switch table has the form <Node LAN Address, Switch Interface, Time Stamp>. Stale entries in the table are dropped (the TTL can be 60 min). Switches learn which hosts can be reached through which interfaces: when a frame is received, the switch "learns" the location of the sender (the incoming LAN segment) and records the sender/location pair in the switch table.

33 Ethernet: Self learning
Send frame from X to Y; send frame back from Y to X; fill in the switch tables. [Diagram: Ethernet switches A, B, C and D, each with numbered ports; as the frames travel, each switch records X and Y against the port on which their frames arrived.]

34 Ethernet: Forwarding When a switch receives a frame:
index the switch table using the MAC destination address
if an entry is found for the destination then {
    if the destination is on the segment from which the frame arrived
        then drop the frame
    else forward the frame on the interface indicated
}
else flood (forward on all interfaces but the one on which the frame arrived)
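The learning and forwarding rules above can be combined in one sketch; the MAC addresses, interface numbers, and table contents are illustrative, and the 60-minute aging of stale entries is omitted.

```python
# Sketch of a self-learning Ethernet switch: learn the sender's port on
# every frame, then filter, forward selectively, or flood.
switch_table = {}   # MAC address -> interface

def handle_frame(src, dst, arrival_if, all_ifs):
    """Return the list of interfaces on which to send the frame."""
    switch_table[src] = arrival_if                      # learn sender's location
    out = switch_table.get(dst)
    if out is None:
        return [i for i in all_ifs if i != arrival_if]  # unknown destination: flood
    if out == arrival_if:
        return []                                       # same segment: drop (filter)
    return [out]                                        # known: forward selectively

ifs = [1, 2, 3, 4]
print(handle_frame("X", "Y", 1, ifs))  # Y unknown: flood -> [2, 3, 4]
print(handle_frame("Y", "X", 3, ifs))  # X was learned on port 1 -> [1]
print(handle_frame("X", "Y", 1, ifs))  # Y now known on port 3 -> [3]
```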

35 Ethernet: switched loops
Send frame from X to Y. [Diagram: switches A, B and D connected in a cycle; copies of X's frame circulate between the switches and the tables keep relearning X on different ports.] Formation of loops; multiple copies received by the terminals.

36 Spanning Tree Protocol (STP)
for increased reliability, it is desirable to have redundant, alternative paths from source to destination
with multiple paths, cycles result: switches may multiply and forward frames forever
solution: organize the switches in a spanning tree by disabling a subset of interfaces

37 Spanning Tree Protocol (STP)
IEEE 802.1D: Spanning Tree Protocol (STP) STP forms a spanning tree in which interfaces are blocked to avoid loops in the network Switches communicate using 2 types of BPDUs (Bridge Protocol Data Units): - Configuration BPDUs (at start-up) - Topology Change Notification BPDUs and their acknowledgements (during operation) The spanning tree is built automatically STP also results in higher reliability

38 Spanning Tree Protocol (STP)
Configuration procedure: Step 1: all ports in blocking mode Step 2: choose a root switch Step 3: a spanning tree is calculated in a distributed way using the Port Path Costs (cf. minimum spanning tree algorithms such as Kruskal's) Step 4: ports change to forwarding mode based on the spanning tree How to choose the root switch? Based on the (lowest) Bridge ID Bridge ID format: Bridge priority (2 bytes) followed by the MAC address (6 bytes)
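Root election by lowest Bridge ID (priority first, then MAC address) can be illustrated as follows; the priorities and MAC addresses are invented for the example.

```python
# Bridge ID = 2-byte priority followed by 6-byte MAC address; comparing the
# (priority, MAC) tuple mirrors comparing the 8-byte ID numerically.
def bridge_id(priority, mac):
    return (priority, bytes.fromhex(mac.replace(":", "")))

bridges = [
    bridge_id(32768, "00:1b:2c:3d:4e:5f"),
    bridge_id(32768, "00:0a:0b:0c:0d:0e"),
    bridge_id(4096,  "00:ff:ee:dd:cc:bb"),   # lower priority number = "higher" priority
]
root = min(bridges)
print(root[0])   # 4096 -- the lowest priority wins regardless of MAC
```

With equal priorities (the usual default of 32768), the bridge with the lowest MAC address would win instead.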

39 Spanning Tree Protocol (STP)
STP timers: Hello time: BPDU send interval (= 2 sec) Forward delay: transition delay from blocking to forwarding mode Max Age: time the BPDUs remain valid Recovery times of STP with the standard timers (on the order of tens of seconds): acceptable in a small LAN environment, but not for large Ethernet networks STP extension for fast recovery after failures: IEEE 802.1w Rapid Spanning Tree Protocol (RSTP) (recovery in the order of seconds) The collection of bridges in a LAN can be considered a graph whose nodes are the bridges and whose edges are the cables connecting the bridges. To break loops in the LAN while maintaining access to all LAN segments, the bridges collectively compute a spanning tree. The spanning tree is not necessarily a minimum cost spanning tree. A network administrator can reduce the cost of a spanning tree, if necessary, by altering some of the configuration parameters in such a way as to affect the choice of the root of the spanning tree. The spanning tree that the bridges compute using the Spanning Tree Protocol can be determined using the following rules. Elect a root bridge. The root bridge of the spanning tree is the bridge with the smallest bridge ID. Each bridge has a unique identifier (ID) and a configurable priority number; the bridge ID contains both numbers. To compare two bridge IDs, the priority is compared first. If two bridges have equal priority, the MAC addresses are compared. A network administrator can make a bridge the root by configuring its priority to be higher (i.e., a lower priority number) than that of any other bridge on the LAN. Determine the least-cost paths to the root bridge. The computed spanning tree has the property that messages from any connected device to the root bridge traverse a least-cost path, i.e., a path from the device to the root that has minimum cost among all paths from the device to the root. The cost of traversing a single network segment is configurable; the cost of traversing a path is the sum of the costs of the segments on the path.
Different technologies have different default costs for network segments. Also, an administrator can configure the cost of traversing a particular network segment.

40 Spanning Tree Protocol: Example
RP: Root Port; DP: Designated Port; BP: Blocked Port. [Diagram: an example topology of switches, a router and a hub, with the root bridge marked and each port labeled with its RP/DP/BP role.] The property that messages always traverse least-cost paths to the root is guaranteed by the following two rules. Least cost path from each bridge: after the root bridge has been chosen, each bridge determines the cost of each possible path from itself to the root. From these, it picks the one with the smallest cost (the least-cost path). The port connecting to that path becomes the root port of the bridge. Least cost path from each network segment: the bridges on a network segment collectively determine which bridge has the least-cost path from the network segment to the root. The port connecting this bridge to the network segment is then the designated port for the segment. Disable all other root paths: any active port that is not a root port or a designated port is a blocked port. Modifications in case of ties: the above rules oversimplify the situation slightly, because ties are possible; for example, two or more ports on a single bridge may be attached to least-cost paths to the root, or two or more bridges on the same network segment may have equal least-cost paths to the root. To break such ties: Breaking ties for root ports: when multiple paths from a bridge are least-cost paths, the chosen path uses the neighbor bridge with the lower bridge ID. The root port is thus the one connecting to the bridge with the lowest bridge ID. Breaking ties for designated ports: when more than one bridge on a segment leads to a least-cost path to the root, the bridge with the lower bridge ID is used to forward messages to the root. The port attaching that bridge to the network segment is the designated port for the segment. The final tie-breaker: in some cases there may still be a tie, as when two bridges are connected by multiple cables; then multiple ports on a single bridge are candidates for root port or designated port.
In this case, the port with the lowest port number is used.
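The tie-breaking order for root-port selection described above (lowest path cost, then lowest neighbor bridge ID, then lowest port number) maps naturally onto tuple comparison; the candidate values below are hypothetical.

```python
# Each candidate root port is described by the criteria in tie-break order;
# Python's tuple comparison then applies them left to right.
candidates = [
    # (path_cost_to_root, neighbor_bridge_id, local_port_number)
    (19, 0x8000_00AA, 2),
    (19, 0x8000_00AA, 1),   # same cost and same neighbor: lower port number wins
    (19, 0x8000_00BB, 3),   # same cost, higher neighbor bridge ID: loses
]
root_port = min(candidates)[2]
print(root_port)   # 1
```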

41 Virtual LAN (VLAN) (Switched) LAN: a local area network where different hosts are interconnected via switches; they can communicate without limitation. Virtual LAN (VLAN): defines a subset of the hosts that are able to communicate. Hosts communicate within a single VLAN; there is no layer 2 communication between VLANs. VLANs allow more flexible management of the network. Different VLAN implementations: untagged (port based) and tagged (802.1Q).

42 Virtual LAN (VLAN): port based
A port is mapped onto a VLAN (VLAN ID), typically via manual configuration Ports will communicate only with other ports having the same VLAN ID Logically separate networks (different IP subnets) → traffic between VLANs goes via an external router No tags are used Example: VLAN 1: ports 1, 2, 5, 7 VLAN 2: ports 3, 4, 6
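A port-based VLAN is essentially a port-to-VLAN-ID mapping; a minimal sketch using the example mapping from the slide:

```python
# Port-based VLAN: a port may only talk (at layer 2) to ports that were
# configured with the same VLAN ID.
port_vlan = {1: 1, 2: 1, 5: 1, 7: 1,    # VLAN 1
             3: 2, 4: 2, 6: 2}          # VLAN 2

def can_communicate(p1, p2):
    return port_vlan[p1] == port_vlan[p2]

print(can_communicate(1, 5))  # True  (both in VLAN 1)
print(can_communicate(1, 3))  # False (crossing VLANs needs an external IP router)
```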

43 Virtual LAN (VLAN): port based
[Diagram: switches A, B, C and D carrying VLAN 1, VLAN 2 and VLAN 3; each inter-switch connection consists of 3 separate links, one per VLAN, and the VLANs are interconnected via an IP router.] Multiple VLANs require separate ports Interconnection via an IP router

44 Virtual LAN (VLAN): tagged
Untagged frame: a frame that does not contain a tag header (a tag is not necessary in port-based VLANs) Tagged frame: a frame that contains a tag header immediately following the Source MAC Address field of the frame. There are two types of tagged frames, VLAN-tagged frames and priority-tagged frames: • VLAN-tagged frame: a tagged frame whose tag header carries both VLAN identification and priority information • priority-tagged frame: a tagged frame whose tag header carries priority information, but no VLAN identification information (VID = 0) VLAN-aware: a property of switches or end stations that recognize and support VLAN-tagged frames

45 Virtual LAN (VLAN): tagged
Standard IEEE Ethernet frame format: preamble | SFD | DA | SA | T/L | data | FCS, with SFD the Start-of-Frame Delimiter. For a tagged frame, extra information is inserted: preamble | SFD | DA | SA | TPID | TAG | T/L | data | FCS, with TPID (Tag Protocol Identifier) = 0x8100. Parts of the basic Ethernet MAC frame for 10/100 Mbps Ethernet: Preamble (7 bytes) SFD: Start-of-Frame Delimiter (1 byte) DA: Destination Address (6 bytes) SA: Source Address (6 bytes) T/L: Ethertype/Length (2 bytes) FCS: Frame Check Sequence (4 bytes) Additional information for tagged VLAN (802.1Q): TPID: Tag Protocol Identifier (2 bytes, fixed value) User priority: 3 bits CFI: Canonical Format Indicator (1 bit); CFI = 0 for Ethernet and is used for compatibility between Ethernet and Token Ring networks VLAN identifier (VLAN ID): 12 bits
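The 802.1Q tag described above (TPID 0x8100 followed by a 16-bit field containing the 3 priority bits, the CFI bit, and the 12-bit VLAN ID) can be built and parsed as a sketch; the field values are illustrative.

```python
import struct

def build_tag(priority, cfi, vlan_id):
    """Pack TPID (0x8100) plus the 16-bit tag control field, big-endian."""
    tci = (priority << 13) | (cfi << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)

def parse_tag(tag):
    """Return (priority, cfi, vlan_id) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    assert tpid == 0x8100
    return tci >> 13, (tci >> 12) & 1, tci & 0x0FFF

tag = build_tag(priority=5, cfi=0, vlan_id=100)
print(tag.hex())        # 8100a064
print(parse_tag(tag))   # (5, 0, 100)
```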

46 Virtual LAN (VLAN): tag based
[Diagram: switches A, B, C and D interconnected over 1 single link where 3 separate links were needed in the port-based case.] Automatic configuration of tagged VLANs is possible through the GVRP protocol (e.g. define the VLANs on "source switch" D; set the VLANs to use at the clients' NICs; intermediate switches are configured automatically by enabling GVRP) GVRP = GARP VLAN Registration Protocol GARP = Generic Attribute Registration Protocol NIC = Network Interface Card Multiple VLANs can use a single port (due to tagging) Interconnection via an IP router Automatic configuration possible

47 Content Traffic Engineering Failure Recovery Multicast Ethernet IPv6
Note: what happened to IPv5? In the late 1970s, a protocol named ST — the Internet Stream Protocol — was created for the experimental transmission of voice, video, and distributed simulation. Two decades later, this protocol was revised to become ST2 (see RFC 1819: "[it] is an experimental connection-oriented internetworking protocol that operates at the same layer as connectionless IP. It has been developed to support the efficient delivery of data streams to single or multiple destinations in applications that require guaranteed quality of service."). It was assigned IP version number 5. (Version numbers 0 through 3 were development versions of IPv4 used between 1977 and 1979.)

48 IPv6 Why a new standard? - exhaustion of the IP address space
- learn from experience with IPv4 - increase the address space from 32 bits to 128 bits - introduce anycast addresses - use a streamlined 40-byte header - introduce the notion of a flow (e.g. audio and video flows) - support traffic classes (see e.g. DSCP in DiffServ) Example: send a request to any server of a certain type; the routing system will deliver it only to the nearest server. In the early 1990s, the Internet Engineering Task Force started an effort to develop a successor to the IPv4 protocol. A prime motivation for this effort was the realization that the 32-bit IP address space was beginning to be exhausted, with new networks and IP nodes being attached to the Internet (and being allocated unique IP addresses) at a breathtaking rate. To respond to this need for a larger IP address space, a new IP protocol, IPv6, was developed. The designers of IPv6 also took this opportunity to tweak and augment other aspects of IPv4, based on the accumulated operational experience with IPv4. Expanded addressing capabilities. IPv6 increases the size of the IP address from 32 to 128 bits. This ensures that the world won't run out of IP addresses: every grain of sand on the planet could be IP-addressable. In addition to unicast and multicast addresses, a new type of address, called an anycast address, has also been introduced, which allows a packet addressed to an anycast address to be delivered to any one of a group of hosts. (This feature could be used, for example, to send an HTTP GET to the nearest of a number of mirror sites that contain a given document.) A streamlined 40-byte header. A number of IPv4 fields have been dropped or made optional. The resulting 40-byte fixed-length header allows faster processing of the IP datagram. A new encoding of options allows more flexible options processing. Flow labeling and priority. IPv6 has an elusive definition of a "flow".
This allows labeling of packets belonging to particular flows for which the sender requests special handling, such as a non-default quality of service or real-time service. For example, audio and video transmission might likely be treated as a flow. On the other hand, more traditional applications, such as file transfer and electronic mail, might not be treated as flows. It is possible that the traffic carried by a high-priority user (for example, someone paying for better service for their traffic) might also be treated as a flow. What is clear, however, is that the designers of IPv6 foresaw the eventual need to be able to differentiate among "flows," even if the exact meaning of a flow had not yet been determined. The IPv6 header also has an eight-bit Traffic Class field. This field, like the TOS field in IPv4, can be used to give priority to certain packets within a flow, or it can be used to give priority to datagrams from certain applications (for example, ICMP packets) over datagrams from other applications (for example, network news).

49 IPv6 datagram format
[IPv6 header layout: version (4 bits) | traffic class (8 bits) | flow label (20 bits); payload length (16 bits) | next header (8 bits) | hop limit (8 bits); source address (128 bits); destination address (128 bits); payload] The IPv6 datagram format is shown in the figure. The following fields are defined in IPv6: Version: this four-bit field identifies the IP version number. Not surprisingly, IPv6 carries the value 6 in this field. Note that putting a 4 in this field does not create a valid IPv4 datagram. (If it did, life would be a lot simpler; see the discussion below regarding the transition from IPv4 to IPv6.) Traffic class: this eight-bit field is similar in spirit to the ToS field of IP version 4. Flow label: as discussed above, this 20-bit field is intended to identify a "flow" of datagrams. (Originally created for giving real-time applications special service, but currently unused.) Payload length: this 16-bit value is treated as an unsigned integer giving the number of bytes in the IPv6 datagram following the fixed-length, 40-byte packet header. Next header: this field identifies the protocol to which the contents (data field) of this datagram will be delivered (for example, to TCP or UDP). The field uses the same values as the Protocol field in the IPv4 header. Hop limit: the contents of this field are decremented by one by each router that forwards the datagram. If the hop limit count reaches zero, the datagram is discarded. (= the TTL of IPv4) Source and destination address: the various formats of the IPv6 128-bit address are described in the IPv6 addressing architecture RFC. The representation is by 8 16-bit numbers (in hex notation). Data: this is the payload portion of the IPv6 datagram. When the datagram reaches its destination, the payload will be removed from the IP datagram and passed on to the protocol specified in the next header field. IP address: 8 × 16-bit numbers in hex; example: 3FFE:80B0:0:1:A00:20FF:FEA2:8DBC
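The fixed 40-byte header layout can be checked with a small sketch using the example address from the slide; the payload length, next header, hop limit and destination address are illustrative assumptions.

```python
import struct
import ipaddress

# First 32 bits: version (4) | traffic class (8) | flow label (20).
first_word = (6 << 28) | (0 << 20) | 0   # version=6, class=0, label=0

src = ipaddress.IPv6Address("3FFE:80B0:0:1:A00:20FF:FEA2:8DBC")
dst = ipaddress.IPv6Address("2001:db8::1")   # assumed destination (documentation prefix)

# Then: payload length (16), next header (8), hop limit (8), and the two
# 128-bit addresses. Payload length 1024, next header 6 (TCP), hop limit 64.
header = struct.pack("!IHBB", first_word, 1024, 6, 64) + src.packed + dst.packed

print(len(header))   # 40 -- the fixed IPv6 header size
print(str(src))      # canonical lowercase form of the example address
```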

50 IPv6 datagram format (continued)
[IPv6 header layout as on the previous slide: version | traffic class | flow label; payload length | next header | hop limit; source address (128 bit); destination address (128 bit); payload] No fragmentation No checksum No options (but possible via the next header) Fixed length of 40 bytes Example chain: IPv6 header (next header = TCP) | TCP + data. A number of fields disappeared in the IPv6 header: Fragmentation/reassembly. IPv6 does not allow for fragmentation and reassembly at intermediate routers; these operations can be performed only by the source and destination. If an IPv6 datagram received by a router is too large to be forwarded over the outgoing link, the router simply drops the datagram and sends a "Packet Too Big" ICMP error message back to the sender. The sender can then resend the data using a smaller IP datagram size. Fragmentation and reassembly is a time-consuming operation; removing this functionality from the routers and placing it squarely in the end systems considerably speeds up IP forwarding within the network. Note that fragmentation may still be supported via a "next header" (with fields similar to IPv4 fragmentation). Checksum. Because the transport-layer (for example, TCP and UDP) and data-link (for example, Ethernet) protocols perform checksumming, the designers of IP probably felt that this functionality was sufficiently redundant in the network layer that it could be removed. Once again, fast processing of IP packets was a central concern. Recall from the chapter describing the Internet Protocol (IP) that since the IPv4 header contains a TTL field (similar to the hop limit field in IPv6), the IPv4 header checksum needed to be recomputed at every router. As with fragmentation and reassembly, this too was a costly operation in IPv4. Options. An options field is no longer a part of the standard IP header. However, it was not removed. Instead, the options field is one of the possible "next headers" pointed to from within the IPv6 header.
That is, just as TCP or UDP protocol headers can be the next header within an IP packet, so too can an options header. The removal of the options field results in a fixed-length, 40-byte IP header. Two examples of next headers are illustrated: a fragment header and a routing header. The routing header allows for the specification of a source route (similar to IPv4). Note: since December 2007, the Type 0 Routing header has been deprecated to avoid "ping-pong" forwarding, see RFC 5095. Example header chains: IPv6 header (next header = routing) | routing header (next header = TCP) | TCP + data; IPv6 header (next header = routing) | routing header (next header = fragment) | fragment header (next header = TCP) | TCP + data. Routing header: strict or loose source route (similar to IPv4). Fragment header: similar to IPv4 fragmentation.
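Walking such a chain of next headers can be sketched as follows; the fixed 8-byte extension-header sizes are a simplifying assumption for the sketch (real extension headers encode their own lengths), and the packet bytes are fabricated.

```python
# Sketch: follow the Next Header chain of an IPv6 packet until a
# non-extension (upper-layer) protocol number is reached.
ROUTING, FRAGMENT, TCP = 43, 44, 6
EXT_LEN = {ROUTING: 8, FRAGMENT: 8}   # assumed fixed sizes for this sketch

def upper_layer_protocol(first_nh, payload):
    """first_nh is the Next Header value of the fixed 40-byte header;
    payload is everything after it. Returns the final protocol number."""
    nh, off = first_nh, 0
    while nh in EXT_LEN:
        next_nh = payload[off]   # byte 0 of each extension header = its Next Header
        off += EXT_LEN[nh]
        nh = next_nh
    return nh

# A routing header whose Next Header byte says "fragment", followed by a
# fragment header whose Next Header byte says "TCP":
payload = bytes([FRAGMENT]) + b"\x00" * 7 + bytes([TCP]) + b"\x00" * 7
print(upper_layer_protocol(ROUTING, payload))   # 6 (TCP)
```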

