Chapter 4: Network Layer
- 4.1 Introduction
- 4.2 Virtual circuit and datagram networks
- 4.3 What’s inside a router
- 4.4 IP: Internet Protocol
  - Datagram format
  - IPv4 addressing
  - ICMP
  - IPv6
- 4.5 Routing algorithms
  - Link state
  - Distance Vector
  - Hierarchical routing
- 4.6 Routing in the Internet
  - RIP
  - OSPF
  - BGP
- 4.7 Broadcast and multicast routing
Recall: Subnets
IP addressing: CIDR
CIDR: Classless InterDomain Routing
- subnet portion of address of arbitrary length
- address format: a.b.c.d/x, where x is # bits in subnet portion of address
[Figure: a 32-bit address split into a subnet part (the CIDR block, /23 in the example) and a host part]
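The a.b.c.d/x notation can be explored with Python's standard `ipaddress` module. This is a minimal sketch; the block `200.23.16.0/23` and the test address are hypothetical values chosen for illustration, not the ones from the slide's figure.

```python
import ipaddress

# Hypothetical /23 block (assumption: not the address from the slide).
block = ipaddress.ip_network("200.23.16.0/23")

subnet_bits = block.prefixlen        # x in a.b.c.d/x
host_bits = 32 - subnet_bits         # remaining bits identify the host
num_addresses = block.num_addresses  # 2 ** host_bits addresses in the block
netmask = str(block.netmask)         # dotted-decimal form of the /23 mask

# An address belongs to the block when its first x bits equal the block's.
addr = ipaddress.ip_address("200.23.17.42")
in_block = addr in block

print(subnet_bits, host_bits, num_addresses, netmask, in_block)
```

With a /23, 9 bits are left for the host part, so the block spans two consecutive /24-sized ranges.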
IP addresses: how to get one?
Q: How does network get subnet part of IP addr?
A: gets allocated portion of its provider ISP’s address space
[Figure: the ISP's /20 block is divided into /23 blocks, one per organization]
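Carving an ISP's /20 into /23 blocks for its customers can be sketched with the `ipaddress` module. The ISP block `200.23.16.0/20` below is a hypothetical value used only for illustration.

```python
import ipaddress

# Hypothetical ISP block (assumption: not the address from the slide).
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Going from /20 to /23 frees 3 bits, so there are 2**3 = 8 customer blocks.
org_blocks = list(isp_block.subnets(new_prefix=23))

for i, org in enumerate(org_blocks):
    print(f"Organization {i}: {org}")
```

Every organization's block shares the ISP's first 20 bits, which is what makes the aggregation on the following slides possible.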
Hierarchical addressing: route aggregation
[Figure: Organization 0 through Organization 7, each with a /23 block, attach to ISP1. ISP1 tells the Internet's border router: “Send me anything with addresses beginning” with its /20 prefix. ISP2, which serves Organization 1, says: “Send me anything with addresses beginning” with its /16 prefix.]
Hierarchical addressing allows efficient advertisement of routing information. This way, the whole 32-bit address does not need to be examined.
Hierarchical addressing: more specific routes
ISP2 has a more specific route to Organization 1.
[Figure: ISP1 still tells the Internet's border router: “Send me anything with addresses beginning” with its /20 prefix. ISP2 now says: “Send me anything with addresses beginning” with its /16 prefix or with Organization 1's /23 prefix.]
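Longest prefix matching
[Table: border router forwarding table, with columns Prefix Match and Link Interface. The surviving rows include a /16 prefix on interface 1 and an "otherwise" entry on interface 2.]
If a packet whose destination address matches the entries for both interface 0 and interface 1 arrives at the border router, is it forwarded to interface 0 or 1?
Since interface 1 has the longer match, it goes to interface 1.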
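Longest prefix matching can be sketched as a linear scan that keeps the matching entry with the largest prefix length. The prefixes and interface numbers below are hypothetical, and real routers use tries or TCAMs rather than a scan; this is only meant to show the rule.

```python
import ipaddress

def longest_prefix_match(table, dest, default_interface):
    """Return the interface of the longest prefix in `table` that contains `dest`.

    `table` is a list of (prefix, interface) pairs; `default_interface`
    plays the role of the "otherwise" row.
    """
    dest_addr = ipaddress.ip_address(dest)
    best_len, best_if = -1, default_interface
    for prefix, interface in table:
        net = ipaddress.ip_network(prefix)
        if dest_addr in net and net.prefixlen > best_len:
            best_len, best_if = net.prefixlen, interface
    return best_if

# Hypothetical forwarding table (assumption: not the slide's prefixes).
table = [("200.0.0.0/8", 0), ("200.23.0.0/16", 1)]

both_match = longest_prefix_match(table, "200.23.18.5", 2)  # /8 and /16 match
short_only = longest_prefix_match(table, "200.99.1.1", 2)   # only /8 matches
no_match = longest_prefix_match(table, "10.0.0.1", 2)       # falls to "otherwise"
print(both_match, short_only, no_match)
```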
A Problem with Longest Match and subnetting
[Figure: the same topology as the route aggregation slide, with Organization 0 through Organization 7, ISP1, ISP2 and the Internet's border router. Each ISP advertises “Send me anything with addresses beginning ……”]
In order to improve reliability, Organization 7 has a backup link with ISP1. This way, if ISP1 has problems or ISP1’s provider has problems, then Organization 7 is still reachable.
Will this work?
Hierarchical Routing
Our routing study thus far is an idealization:
- all routers identical
- network “flat”
… not true in practice
scale: with 200 million destinations:
- can’t store all dests in routing tables!
  - Memory for the address table must be very fast. How fast? How long can an address lookup take on a 10 Gbit interface?
- routing table exchange would swamp links!
  - There are ~1 million links
  - If link state were exchanged every 10 seconds and each link state is 20 B, then each router receives and processes 160 Mbit of link announcements every 10 seconds, i.e. about 16 Mbps
  - But, perhaps, only changes in link state could be distributed.
administrative autonomy:
- internet = network of networks
- each network admin wants to control routing in its own network
  - ATT does not want Sprint to know what its topology is (a trade secret; hiding it also improves security)
  - ATT wants to select a routing protocol and parameters without getting Sprint's permission
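The two scale questions on this slide can be worked through numerically. The link-state figures come from the slide; the lookup budget additionally assumes worst-case minimum-size Ethernet frames (64 bytes plus 20 bytes of preamble and inter-frame gap), which is an assumption of this sketch rather than something the slide states.

```python
# Link-state flooding load, using the slide's numbers.
LINKS = 1_000_000        # ~1 million links
LINK_STATE_BYTES = 20    # size of one link-state entry
PERIOD_S = 10            # every link state re-sent every 10 seconds

bits_per_round = LINKS * LINK_STATE_BYTES * 8  # bits received per 10 s round
avg_mbps = bits_per_round / PERIOD_S / 1e6     # average rate in Mbps

# Per-packet lookup budget on a 10 Gbit/s interface.
LINE_RATE_BPS = 10e9
MIN_FRAME_BYTES = 64       # assumption: minimum Ethernet frame
WIRE_OVERHEAD_BYTES = 20   # assumption: preamble (8) + inter-frame gap (12)

bits_on_wire = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
lookup_budget_ns = bits_on_wire / LINE_RATE_BPS * 1e9

print(f"{bits_per_round / 1e6:.0f} Mbit per round, {avg_mbps:.0f} Mbps average")
print(f"lookup budget per packet: {lookup_budget_ns:.1f} ns")
```

Under these assumptions each round carries 160 Mbit, which averages to 16 Mbps, and a lookup must finish in well under 100 ns to keep up with back-to-back minimum-size packets.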
Hierarchical Routing
- aggregate routers into regions, “autonomous systems” (AS)
- Single administrative domain
- Routers in the same AS run the same routing protocol
  - “intra-AS” routing protocol
  - routers in different ASs can run different intra-AS routing protocols
- An ISP may be made of 1 or more ASs
  - ATT-USA = 1 AS and ATT-Europe is another
  - Some stub networks are an AS (UD is an AS); some companies have routers but are not ASs
- ASs have their own number, assigned by ICANN
- There are ~50K ASs
Gateway router
- Direct link to router in another AS
- Gateway routers run a common inter-networking routing protocol
- For inter-AS routing, the destinations are always ASs
  - Actually, destinations are always ASs. But for inter-AS routing, it does not make much sense for a destination to be a single end-host.
Simple example
[Figure: AS1 is a stub network (at the edge of the network) with routers A, B and C and a /22 and a /23 prefix. Each router has a forwarding table with columns Prefix and Next hop (entries: /23 via C, /22 via A). Router E is in AS2, the service provider of AS1 (e.g., AS1 = UD and AS2 = Cogent), which leads to the rest of the internet.]
- These tables are made with RIP, OSPF, IS-IS, etc.
- Connections to other ASs and the rest of the Internet: recall that ASs (ISPs) sometimes meet at NAPs (e.g., google: MAE-East). An AS could also meet its provider at a POP.
[Figure: the same topology as the simple example (AS1 with routers A, B, C; AS2 with router E; the rest of the internet), with extra forwarding-table rows including a /32 entry via E.]
Q: How can routers in AS1 know where to send pkts with destination not in AS1?
A: Easy, if a pkt is for an “unknown” address, send it to B.
- Specifically, B advertises a link to prefix /0
- This is called a default route, and it can be statically set (no need for any routing protocol besides OSPF)
[Figure: the simple-example topology extended with a third AS reached through router D, and three external /16 prefixes. The tables in AS1 are made with RIP, OSPF, IS-IS, etc.]
We need to put the three external /16 prefixes in the forwarding tables.
Specifically, B should announce to A that it can reach two of the /16 prefixes, and D should announce that it can reach the third.
How to get there?
1. B must learn from E that its two /16 prefixes are reachable through E
2. A must learn that the third /16 prefix is reachable through D
3. B and A must distribute this information throughout AS1
But 1 and 2 need an exterior inter-networking routing protocol, and 3 needs an interior inter-networking routing protocol.
EBGP and IBGP (border gateway routing protocol) can accomplish this.
Interconnected ASes
[Figure: AS3 (routers 3a, 3b, 3c), AS1 (routers 1a, 1b, 1c, 1d) and AS2 (routers 2a, 2b, 2c); inside a router, the intra-AS and inter-AS routing algorithms both feed the forwarding table]
- forwarding table configured by both intra- and inter-AS routing algorithms
  - intra-AS sets entries for internal dests
  - inter-AS & intra-AS set entries for external dests
Example: Setting forwarding table in router 1d
- suppose AS1 learns (via inter-AS protocol) that subnet x is reachable via AS3 (gateway 1c) but not via AS2.
- inter-AS protocol propagates reachability info to all internal routers.
- router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c.
  - installs forwarding table entry (x,I)
[Figure: AS3, AS1 and AS2 as on the previous slide, with subnet x reachable beyond AS3]
Example: Choosing among multiple ASes
- now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2.
- to configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x.
  - this is also the job of the inter-AS routing protocol!
  - If both gateways are equivalent, then the intra-AS routing protocol will route packets to the best gateway. This is called hot potato routing: send packet towards closest of two routers.
[Figure: AS3, AS1 and AS2, with subnet x reachable beyond both AS3 and AS2]
Hot Potato Routing
[Figure: AS1 and AS2, joined at gateways A and B, with a /16 prefix]
- A pkt arrives with dest in the /16.
- AS2 could send the pkt to gateway B: hot potato routing.
- But AS1 would prefer AS2 to carry its own traffic, so AS1 might require that AS2 give higher priority to gateway A.
- In which case, AS1 could inject traffic into AS2 with destination in the /16 at gateway B.
Example: Choosing among multiple ASes
Hot potato routing: send packet towards closest of two routers.
1. Learn from inter-AS protocol that subnet x is reachable via multiple gateways.
2. Use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways.
3. Hot potato routing: choose the gateway that has the smallest least cost.
4. Determine from forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table.
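The hot potato steps can be sketched as a small function run at router 1d. The cost values and interface names are hypothetical; in practice they would come from the intra-AS protocol (e.g. OSPF) and the router's existing forwarding table.

```python
def hot_potato_entry(subnet, gateway_costs, interface_to):
    """Pick the gateway with the smallest least-cost path and build (x, I).

    gateway_costs: intra-AS least-cost path cost to each candidate gateway.
    interface_to:  outgoing interface this router uses to reach each gateway.
    """
    best_gateway = min(gateway_costs, key=gateway_costs.get)
    return best_gateway, (subnet, interface_to[best_gateway])

# Hypothetical view from router 1d: x is reachable via gateways 1c and 1b.
gateway_costs = {"1c": 2, "1b": 3}
interface_to = {"1c": "if0", "1b": "if1"}

gateway, entry = hot_potato_entry("x", gateway_costs, interface_to)
forwarding_table = dict([entry])  # install (x, I)
print(gateway, forwarding_table)
```

Note that the choice depends only on the cost inside AS1; the cost of the rest of the path beyond the gateway is ignored.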
Internet inter-AS routing: BGP
- BGP (Border Gateway Protocol): the de facto standard
- BGP provides each AS a means to:
  1. Obtain subnet reachability information from neighboring ASs.
  2. Propagate reachability information to all AS-internal routers.
  3. Determine “good” routes to subnets based on reachability information and policy.
- allows subnet to advertise its existence to rest of Internet: “I am here”
BGP basics
- pairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections: BGP sessions
  - BGP sessions need not correspond to physical links.
- when AS2 advertises a prefix to AS1:
  - AS2 promises it will forward datagrams towards that prefix.
  - AS2 can aggregate prefixes in its advertisement
    - But this can cause problems when some prefixes have backup links
[Figure: AS3, AS1 and AS2, with eBGP sessions between ASs and iBGP sessions inside each AS]
Distributing reachability info
- using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
  - 1c can then use iBGP to distribute new prefix info to all routers in AS1
  - 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session
- when router learns of new prefix, it creates entry for prefix in its forwarding table.
[Figure: AS3, AS1 and AS2, with eBGP sessions between ASs and iBGP sessions inside each AS]
Path attributes & BGP routes
- advertised prefix includes BGP attributes.
  - prefix + attributes = “route”
- two important attributes:
  - AS-PATH: contains ASs through which prefix advertisement has passed: e.g., AS 67, AS 17
  - NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS)
- when gateway router receives route advertisement, uses import policy to accept/decline.
BGP route selection
- router may learn about more than 1 route to some prefix. Router must select route.
- elimination rules:
  1. local preference value attribute: policy decision
  2. shortest AS-PATH
  3. closest NEXT-HOP router: hot potato routing
  4. additional criteria
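The first three elimination rules can be sketched as a lexicographic comparison. The `Route` class and its field names are hypothetical, the sketch assumes the usual convention that a higher local preference wins, and the "additional criteria" of rule 4 are left out.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    local_pref: int      # policy value; assumption: higher is preferred
    as_path: tuple       # ASs the advertisement has passed through
    next_hop_cost: int   # intra-AS cost to the NEXT-HOP router

def select_route(routes):
    """Apply rules 1-3 in order: local pref, AS-PATH length, closest NEXT-HOP."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.next_hop_cost),
    )

# Hypothetical candidate routes to the same prefix.
candidates = [
    Route("x", local_pref=100, as_path=(67, 17), next_hop_cost=4),
    Route("x", local_pref=100, as_path=(34,), next_hop_cost=9),
    Route("x", local_pref=100, as_path=(52,), next_hop_cost=2),
]
best = select_route(candidates)
print(best)
```

Here all local preferences tie, the two one-hop AS-PATHs beat the two-hop one, and the closer NEXT-HOP breaks the remaining tie.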
BGP messages
- BGP messages exchanged using TCP.
- BGP messages:
  - OPEN: opens TCP connection to peer and authenticates sender
  - UPDATE: advertises new path (or withdraws old)
  - KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs OPEN request
  - NOTIFICATION: reports errors in previous msg; also used to close connection
BGP routing policy
- A, B, C are provider networks
- X, W, Y are customers (of provider networks)
- X is dual-homed: attached to two networks
  - X does not want to route from B via X to C
  - .. so X will not advertise to B a route to C
[Figure: provider networks A, B, C and customer networks W, X, Y; the legend distinguishes customer networks from provider networks]
BGP routing policy (2)
- A advertises path AW to B
- B advertises path BAW to X
- Should B advertise path BAW to C?
  - No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
  - B wants to force C to route to W via A
  - B wants to route only to/from its customers!
[Figure: provider networks A, B, C and customer networks W, X, Y; the legend distinguishes customer networks from provider networks]
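The export rule "route only to/from its customers" can be sketched as a predicate: a network advertises a route only when the route was learned from one of its customers or the neighbor receiving the advertisement is one. The function name and the customer sets are hypothetical, and real policies are configured per router rather than computed this way.

```python
def should_advertise(learned_from, advertise_to, customers):
    """Export a route only if it serves traffic to or from a customer."""
    return learned_from in customers or advertise_to in customers

# B's view: X is B's customer; A and C are other provider networks.
b_customers = {"X"}
baw_to_x = should_advertise("A", "X", b_customers)  # path BAW offered to X
baw_to_c = should_advertise("A", "C", b_customers)  # path BAW offered to C

# X's view: X is a dual-homed customer with no customers of its own.
x_customers = set()
c_route_to_b = should_advertise("C", "B", x_customers)  # X's route to C offered to B

print(baw_to_x, baw_to_c, c_route_to_b)
```

The same predicate reproduces both slides: B tells X about W but not C, and X never offers B a path to C, so X carries no transit traffic.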
BGP route processing
- BGP advertises and withdraws paths with the UPDATE message
- UPDATE has three fields
  - Routes to withdraw
  - Attributes of routes to prefixes in NLRI
  - NLRI
- The NLRI is a list of prefixes that the list of attributes applies to. If two prefixes have different attributes, then these two prefixes need to be announced with different UPDATE messages.
- In OSPF each path is a list of routes and a total cost (two attributes). In BGP, routes have many attributes; cost (in AS hops) is but one.
[Figure: from peers → input policy engine → routing decision → routing table → output policy engine → to peers, all driven by configuration]
RIBs
- Routing information base (RIB): a list of routes (attributes and all)
  - Adj-RIB-In: RIB learned from neighbor (many of these)
  - Adj-RIB-Out: RIB to be sent to neighbor (many of these)
  - Loc-RIB: RIB for local use (only one of these)
[Figure: one Adj-RIB-In per peer → input policy engine → BGP decision → Loc-RIB → output policy engine → one Adj-RIB-Out per peer]
Sample routing environment
[Figure: this AS (AS5) receives routes, including 0/0 and several /24 prefixes, from AS1 and AS2, and sends routes to AS3 and AS4]
input policy engine:
- deny 0/0 from AS1
- Give /24 from AS1 better preference
- Accept other routes
decision process routes:
- Use 0/0 from AS2
- Use /24 from AS1
- Use /24 from AS2
- Use /24 from AS5 (this AS)
output policy engine:
- Do not propagate 0/0
- Do not send /24 to AS4
- Give /24 with metric = 10 to AS
routes sent to peers:
- /24 path=(AS5, AS2)
- /24 path=(AS5, AS1) metric=
- /24 path=(AS5)
- /24 path=(AS5 AS1)
Fun with BGP
- Routeviews.org collects and archives BGP announcements
- One way to use routeviews is with dig
  - At the linux prompt: dig txt aspath.routeviews.org
  - Outputs various stuff and an Answer section: –4.128.aspath.routeviews.org 600 IN TXT “ ” “ ” “16”
  - Syntax = ASPath “Prefix” “prefix length”
- Now use whois -h whois.arin.net "a ASXX" to learn about ASs, where XX is an AS number. E.g., whois -h whois.arin.net "a AS34" gives information about AS34
- Try with some other AS
Check out a collection of path announcements
- Open bgp030408p39.Partial
  - An old (2003) partial list of BGP announcements received by several routers
- Check which ASs peer with UD (ASN 34)
Why different Intra- and Inter-AS routing?
Policy:
- Inter-AS: admin wants control over how its traffic is routed and who routes through its net.
- Intra-AS: single admin, so no policy decisions needed
Scale:
- hierarchical routing saves table size and reduces update traffic
Performance:
- Intra-AS: can focus on performance
- Inter-AS: policy may dominate over performance