Interdomain routing V. Arun College of Information and Computer Sciences University of Massachusetts Amherst
Roadmap Hierarchical routing and BGP overview Causes of delayed convergence in BGP Stable Paths Problem and Simple Path Vector Protocol Gao-Rexford conditions Causes of inconsistency in BGP
Hierarchical routing aggregate routers into regions, “autonomous systems” (AS) routers in same AS run same routing protocol “intra-AS” routing protocol routers in different AS can run different intra-AS routing protocol gateway router: at “edge” of its own AS has link to router in another AS Network Layer
Interconnected ASes. [Figure: AS1 (routers 1a-1d), AS2 (2a-2c), and AS3 (3a-3c) interconnected through gateway routers; a router's forwarding table is configured by both the intra-AS and the inter-AS routing algorithm.]
Inter-AS tasks. Suppose a router in AS1 receives a datagram destined outside of AS1: the router should forward the packet to a gateway router, but which one? AS1 must (1) learn which destinations are reachable through AS2 and which through AS3, and (2) propagate this reachability info to all routers in AS1. This is the job of inter-AS routing! [Figure: AS1, AS2, and AS3 interconnected; AS2 and AS3 each connect to other networks.]
Example: setting forwarding table in router 1d. Suppose AS1 learns (via the inter-AS protocol) that subnet x is reachable via AS3 (gateway 1c), but not via AS2. The inter-AS protocol propagates this reachability info to all internal routers. Router 1d determines from intra-AS routing info that its interface I is on the least-cost path to 1c, and installs forwarding table entry (x,I). [Figure: subnet x reachable beyond AS3.]
Example: choosing among multiple ASes. Now suppose AS1 learns from the inter-AS protocol that subnet x is reachable from AS3 and from AS2. To configure its forwarding table, router 1d must determine which gateway it should forward packets towards for destination x. This is also the job of the inter-AS routing protocol! [Figure: subnet x reachable via both AS2 and AS3.]
Internet inter-AS routing: BGP. BGP (Border Gateway Protocol): the de facto inter-domain routing protocol, the “glue that holds the Internet together”. BGP provides each AS a means to: (1) eBGP: obtain subnet reachability information from neighboring ASes; (2) iBGP: propagate reachability information to all AS-internal routers; (3) determine “good” routes to other networks based on reachability information and policy. It allows a subnet to advertise its existence to the rest of the Internet: “I am here”.
BGP basics. BGP session: two BGP routers (“peers”) exchange BGP messages, which advertise paths to destination network prefixes (“path vector” protocol) and are exchanged over semi-permanent TCP connections. When AS3 advertises a prefix to AS1, AS3 promises it will forward datagrams towards that prefix; AS3 can aggregate prefixes in its advertisement. [Figure: BGP message exchanged between AS3 and AS1.]
BGP basics: distributing path information. Using the eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1. 1c can then use iBGP to distribute the prefix info to all routers in AS1. 1b can then re-advertise the reachability info to AS2 over the 1b-to-2a eBGP session. When a router learns of a new prefix, it creates an entry for the prefix in its forwarding table. [Figure: route propagation over eBGP and iBGP sessions among AS1, AS2, and AS3.]
Path attributes and BGP routes. An advertised prefix includes BGP attributes: prefix + attributes = “route”. Two important attributes: AS-PATH contains the ASes through which the prefix advertisement has passed, e.g., [AS 67, AS 17, AS 24]; NEXT-HOP indicates the specific internal-AS router to the next-hop AS (multiple links may exist from self to the next-hop AS). Policy-based routing: a gateway router receiving a route advertisement uses import policy to select/reject the route and export policy to re-advertise it, e.g., select the cheaper route; or never route through AS x; or never advertise routes to AS y.
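The role of AS-PATH in loop avoidance can be illustrated with a minimal Python sketch. The `Route` type, the function names, and the prefix string are hypothetical, chosen only for illustration; the sketch assumes the AS-PATH lists the nearest AS first.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str                # illustrative prefix string
    as_path: Tuple[int, ...]   # assumed order: nearest AS first
    next_hop: str              # router to forward to


def import_route(local_as: int, route: Route) -> Optional[Route]:
    """Reject a route whose AS-PATH already contains our own AS (a loop)."""
    if local_as in route.as_path:
        return None
    return route


def export_route(local_as: int, route: Route, border_router: str) -> Route:
    """Prepend our AS number and set NEXT-HOP before re-advertising over eBGP."""
    return Route(route.prefix, (local_as,) + route.as_path, border_router)


r = Route("x", (67, 17, 24), "3a")
print(import_route(17, r))              # AS 17 sees itself in the path and rejects
print(export_route(1, r, "1b").as_path)
```

Because every AS prepends itself before re-advertising, a looping advertisement is detected the moment it returns to an AS already on the path.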
BGP route selection (import policy). A router may learn about more than one route to a destination AS; it selects a route based on, in order: (1) local preference value attribute (a policy decision); (2) shortest AS-PATH; (3) Multi-exit discriminator (MED); (4) eBGP-learned over iBGP-learned routes; (5) closest NEXT-HOP router (hot potato routing); (6) router ID.
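The selection steps above amount to a lexicographic comparison, which can be sketched as a sort key. This is a simplified sketch, not a faithful BGP decision process: the `Candidate` fields are hypothetical names, higher local preference and lower values of everything else are assumed to win, and MED is compared across all candidates (real routers normally compare MED only among routes from the same neighboring AS).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    local_pref: int              # higher is better (policy decision)
    as_path: Tuple[int, ...]     # shorter is better
    med: int                     # lower is better
    learned_via_ebgp: bool       # eBGP-learned beats iBGP-learned
    igp_cost_to_next_hop: int    # lower is better (hot potato)
    router_id: int               # lowest wins as the final tie-break


def preference_key(c: Candidate):
    """Tuple ordered so that the smallest key is the most preferred route."""
    return (
        -c.local_pref,
        len(c.as_path),
        c.med,
        0 if c.learned_via_ebgp else 1,
        c.igp_cost_to_next_hop,
        c.router_id,
    )


def select_best(candidates: List[Candidate]) -> Candidate:
    return min(candidates, key=preference_key)


a = Candidate(100, (3, 7), 0, True, 5, 1)
b = Candidate(200, (2, 4, 7), 0, False, 9, 2)
print(select_best([a, b]) is b)  # local preference outranks AS-PATH length
```

Putting local preference first is what lets policy override path length: route `b` wins despite its longer AS-PATH.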
BGP re-announce (export policy). Routers commonly use a “valley-free” routing export policy: never advertise peer or provider routes to another peer or provider.
BGP routing policy. A, B, C are provider networks; X, W, Y are customers (of provider networks). X is dual-homed: attached to two networks. X does not want to route from B via X to C, so X will not advertise to B a route to C. [Figure: provider networks A, B, C and customer networks W, X, Y.]
BGP routing policy (2). A advertises path AW to B. B advertises path BAW to X. Should B advertise path BAW to C? No way! B gets no “revenue” for routing CBAW since neither W nor C is B’s customer. B wants to force C to route to W via A. B wants to route only to/from its customers! [Figure: provider networks A, B, C and customer networks W, X, Y.]
BGP re-announce (export policy). Routers commonly use a “valley-free” routing export policy: never advertise peer or provider routes to another peer or provider. Examples (arrows indicate $ flow, or customer-provider relationship; else peering): [Figure: example routes.] Q: Which of the above routes are permitted by the “valley-free” export policy?
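The valley-free rule can be written as a one-line export check, plus a path check for exercises like the question above. This is a sketch under assumptions: the `Rel` enum and both function names are hypothetical, and a path is described by the link types met along the way ("up" for customer-to-provider, "peer", "down" for provider-to-customer).

```python
from enum import Enum
from typing import Iterable


class Rel(Enum):
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


def may_export(learned_from: Rel, advertise_to: Rel) -> bool:
    """Peer/provider routes go only to customers; customer routes go to anyone."""
    return learned_from is Rel.CUSTOMER or advertise_to is Rel.CUSTOMER


def is_valley_free(links: Iterable[str]) -> bool:
    """Valid shape: zero or more "up", at most one "peer", zero or more "down"."""
    descending = False  # becomes True after the peer link or the first "down"
    for link in links:
        if link == "up":
            if descending:
                return False        # climbing again forms a valley
        elif link == "peer":
            if descending:
                return False        # a second peer link, or a peer after "down"
            descending = True
        elif link == "down":
            descending = True
        else:
            raise ValueError(f"unknown link type: {link}")
    return True


# B learned path AW from A. Assuming A, B, C peer with one another:
print(may_export(Rel.PEER, Rel.CUSTOMER))  # B may advertise BAW to customer X
print(may_export(Rel.PEER, Rel.PEER))      # B may not advertise BAW to peer C
```

The path check does not depend on direction: reversing a path swaps "up" and "down" and reverses the order, which preserves the up-peer-down shape.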
Why different Intra-, Inter-AS routing? Policy: inter-AS: admin wants control over how its traffic is routed and who routes through its net; intra-AS: single admin, so no policy decisions needed. Scale: hierarchical routing saves table size and reduces update traffic. Performance: intra-AS: can focus on performance; inter-AS: policy may dominate over performance.
Delayed BGP convergence: Min Route Advertisement Interval (MRAI)
Delayed BGP convergence (1) Labovitz et al. fault injection measurement study: Tlong and Tdown take longer to converge, i.e., bad news propagates slower
Delayed BGP convergence (2) Tdown events across 5 ISPs from Labovitz et al. study: minutes of convergence delay
BGP convergence delay: Why? Q: What explains such high convergence delays, on the order of several minutes, in BGP? Count-to-infinity problems? No, because BGP is a path vector, not a distance vector, protocol, so loops are easy to detect. Policy routing? What is special about policy routing compared to shortest path routing?
BGP bouncing problem Assumptions: shorter paths preferred over longer paths. No particular preference between equal length paths.
Key problem: path exploration. In an n-node clique with no export policy restrictions, there exist Omega((n-1)!) possible paths: (n-1)! paths of length n-1, + C(n-1,n-2)·(n-2)! paths of length n-2, + … + C(n-1,2)·2! paths of length 2, + (n-1) paths of length 1; that is, the sum over k = 1, …, n-1 of C(n-1,k)·k!. Message ordering matters: not every sequence of events results in exploring all or most of the Omega((n-1)!) paths. Need a way to limit spurious path propagation.
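The path count for a clique can be checked numerically. The sketch below compares the closed-form sum with a brute-force enumeration of simple paths to one fixed destination (node 0) from all other nodes; the function names are hypothetical.

```python
from math import comb, factorial


def count_paths_formula(n: int) -> int:
    """Sum over path lengths k = 1..n-1 of C(n-1, k) * k!."""
    return sum(comb(n - 1, k) * factorial(k) for k in range(1, n))


def count_paths_dfs(n: int) -> int:
    """Enumerate simple paths to destination 0 in an n-node clique."""
    dest = 0

    def extend(node: int, visited: frozenset) -> int:
        if node == dest:
            return 1
        # In a clique every unvisited node is a neighbor.
        return sum(
            extend(nxt, visited | {nxt}) for nxt in range(n) if nxt not in visited
        )

    return sum(extend(src, frozenset({src})) for src in range(1, n))


for n in range(2, 8):
    print(n, count_paths_formula(n), count_paths_dfs(n))
```

The factorial growth is visible even for small n, which is why an unlucky message ordering can make path exploration so expensive.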
Min Route Advertisement Interval timer. Key idea: wait long enough to see the effect of a change from all neighbors before reacting again. Clique example: forces nodes to explore strictly increasing length paths; convergence in O(n) rounds. Message complexity?
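One way to picture the timer is as a per-neighbor rate limiter that coalesces intermediate route changes, so that only the latest choice is advertised once the interval expires. The sketch below assumes exactly that behavior; the class name, method names, and the interval of 30 time units are illustrative assumptions, and time is passed in explicitly rather than read from a clock.

```python
from typing import List, Optional, Tuple


class MraiSender:
    """Rate-limits advertisements to one neighbor (illustrative sketch)."""

    def __init__(self, interval: float):
        self.interval = interval
        self.last_sent: Optional[float] = None
        self.pending: Optional[str] = None
        self.sent: List[Tuple[float, str]] = []   # log of (time, route)

    def route_changed(self, now: float, route: str) -> None:
        """Called whenever the locally selected route changes."""
        if self.last_sent is None or now - self.last_sent >= self.interval:
            self._send(now, route)
        else:
            self.pending = route   # overwrite any older, now stale, change

    def tick(self, now: float) -> None:
        """Called periodically; flushes the pending route once the timer expires."""
        if self.pending is not None and now - self.last_sent >= self.interval:
            self._send(now, self.pending)

    def _send(self, now: float, route: str) -> None:
        self.sent.append((now, route))
        self.last_sent = now
        self.pending = None


s = MraiSender(interval=30.0)
s.route_changed(0.0, "p1")
s.route_changed(5.0, "p2")    # held back
s.route_changed(10.0, "p3")   # replaces p2
s.tick(30.0)
print(s.sent)                 # p2 was never advertised
```

Suppressing the transient route `p2` is the point: neighbors never react to a choice that was already obsolete.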
BGP bouncing with MRAI timers
Delayed BGP convergence: Route Flap Damping (RFD)
Route flap damping. MRAI timers can limit spurious path exploration in response to a common root cause, but are not designed to limit the number of root-cause events. Route flap damping (RFD) is intended to suppress persistent route flux due to, say, flaky routers. Key idea: limit the frequency of route changes to a prefix, no matter the reason.
RFD penalty function example
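A typical RFD penalty function adds a fixed penalty per flap, decays it exponentially over time, suppresses the route when the penalty exceeds a cutoff, and reuses it once the penalty decays below a lower threshold. The sketch below assumes that shape; the class name and all four parameter values are illustrative assumptions, not values taken from these slides.

```python
class FlapDamper:
    """Per-prefix flap penalty with exponential decay (illustrative sketch)."""

    def __init__(self, penalty_per_flap=1000.0, suppress_limit=2000.0,
                 reuse_limit=750.0, half_life=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.half_life = half_life      # seconds for the penalty to halve
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now: float) -> None:
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record one route change (withdrawal or re-advertisement)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty > self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now: float) -> bool:
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False
        return self.suppressed


d = FlapDamper()
for t in (0.0, 10.0, 20.0):
    d.flap(t)
print(d.is_suppressed(21.0), d.is_suppressed(4000.0))
```

The gap between the suppress and reuse thresholds gives hysteresis: a suppressed route stays suppressed for a while even after the flapping stops.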
RFD: Things get murkier! RFD can worsen convergence delay. Suppose node 1 in this 5-node clique flaps its route to d just twice, once withdrawing and then re-advertising.
Selective RFD scheme. Key idea: do not apply RFD to secondary flaps, i.e., when announced routes are increasingly more preferred or increasingly less preferred, but apply RFD when the preference direction changes, e.g., when a node announces a more preferred route compared to the previous route, followed by a less preferred route. Secondary flaps generally change monotonically (increasing or decreasing) in preference.
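The direction-change test can be sketched as follows. This is a simplification for illustration only: each announced route is reduced to a single numeric preference (higher means more preferred), withdrawals are not modeled, and the function name is hypothetical.

```python
from typing import Sequence


def count_penalized_flaps(preferences: Sequence[int]) -> int:
    """Count announcements that reverse the direction of preference change."""
    penalized = 0
    last_direction = 0  # +1: getting more preferred, -1: less preferred, 0: unknown
    for prev, cur in zip(preferences, preferences[1:]):
        direction = (cur > prev) - (cur < prev)
        if direction == 0:
            continue                      # same preference: nothing to judge
        if last_direction != 0 and direction != last_direction:
            penalized += 1                # direction reversed: apply RFD
        last_direction = direction
    return penalized


print(count_penalized_flaps([5, 4, 3, 2]))  # monotone: looks like path exploration
print(count_penalized_flaps([3, 5, 2]))     # more preferred, then less preferred
```

A monotone sequence is treated as path exploration after a single event and goes unpenalized, while each reversal counts as a genuine flap.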
Formal model of BGP: Stable Paths Problem & Simple Path Vector Protocol
Stable Paths Problem. A simple formal model specifying: a single-node AS graph (each AS is one node) and a single prefix; strict path ranking functions at nodes; adoption of the highest-ranked peer route. Stability: when no node wishes to change its route. Can we determine if a stable configuration exists? Do all ranking functions (or import policies) result in a stable configuration?
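For small instances, stability can be checked by brute force. The sketch below assumes an encoding that is not from the slides: node 0 is the destination, `permitted[u]` lists node u's permitted paths from most to least preferred, and the empty tuple means "no route". The two instances are hypothetical examples, with names borrowed from the SPP literature.

```python
from itertools import product

DEST = 0


def best_available(u, permitted, assignment):
    """Highest-ranked permitted path of u that extends its next hop's current path."""
    for path in permitted[u]:
        if assignment.get(path[1]) == path[1:]:
            return path
    return ()


def is_stable(permitted, assignment):
    """Stable when every node already uses its best available path."""
    full = {**assignment, DEST: (DEST,)}
    return all(best_available(u, permitted, full) == full.get(u, ())
               for u in permitted)


def stable_solutions(permitted):
    """Try every combination of one permitted path (or no route) per node."""
    nodes = sorted(permitted)
    choices = [list(permitted[u]) + [()] for u in nodes]
    return [dict(zip(nodes, combo)) for combo in product(*choices)
            if is_stable(permitted, dict(zip(nodes, combo)))]


# Each node prefers to route through the other: two stable solutions.
DISAGREE = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}
# Cyclic preferences among three nodes: no stable solution.
BAD_GADGET = {1: [(1, 2, 0), (1, 0)], 2: [(2, 3, 0), (2, 0)], 3: [(3, 1, 0), (3, 0)]}

print(len(stable_solutions(DISAGREE)), len(stable_solutions(BAD_GADGET)))
```

The exhaustive search is exponential in the number of nodes, so it is only a teaching aid for small examples.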
Examples: stable, shortest path
Examples: stable, non-shortest path
Examples: unsafe, unstable, non-shortest path. Unsafe: can persistently oscillate. Unstable: will persistently oscillate.
Examples: stable, non-unique solution
Theoretical results. Solvability: the problem of determining whether an instance of SPP has a solution is NP-complete. Sufficient condition: if an SPP instance has no dispute wheel, it is solvable and has a unique solution. The converse is not true, i.e., a dispute wheel does not preclude stability, safety, or uniqueness of solution.
Dispute wheel. Key idea: a cycle in route preferences amongst a set of nodes, e.g., A prefers to route through B, B prefers to route through C, and C prefers to route through A. Each u_i prefers the counter-clockwise neighbor route R_i Q_{i+1} over the direct route Q_i. Subscripts are mod k, so u_{k-1} prefers R_{k-1} Q_0.
Simple Path Vector Protocol (SPVP). A simple model of BGP that: receives into the RIB the most recent route announced by a neighbor; if the FIB entry is not the best route available in the RIB, applies import policy to adopt the best RIB route in the FIB and re-advertises the adopted route to all neighbors. Fair activation sequence: any sent and queued message will eventually be processed by its intended recipient. Convergence result: if an instance of SPP has no dispute wheel, then SPVP is guaranteed to converge.
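The protocol steps above can be simulated with a single FIFO message queue, which is one particular fair activation sequence. This is a sketch under assumptions not in the slides: node 0 is the destination, `permitted[u]` is ranked best first, an empty tuple acts as a withdrawal, and `max_messages` is an arbitrary cutoff used to give up on oscillating instances.

```python
from collections import deque

DEST = 0


def spvp(permitted, neighbors, max_messages=1000):
    """Return (adopted paths, converged?) after processing messages in FIFO order."""
    rib = {u: {} for u in permitted}         # rib[u][v]: last path announced by v
    current = {u: () for u in permitted}     # adopted (FIB) path per node
    queue = deque((DEST, u, (DEST,)) for u in neighbors[DEST])
    processed = 0
    while queue and processed < max_messages:
        sender, u, path = queue.popleft()
        processed += 1
        rib[u][sender] = path                # keep only the most recent route
        best = next((p for p in permitted[u] if rib[u].get(p[1]) == p[1:]), ())
        if best != current[u]:
            current[u] = best                # adopt the best RIB route
            for v in neighbors[u]:
                if v != DEST:
                    queue.append((u, v, best))   # re-advertise to neighbors
    return current, not queue


shortest = {1: [(1, 0), (1, 2, 0)], 2: [(2, 0), (2, 1, 0)]}
print(spvp(shortest, {0: [1, 2], 1: [0, 2], 2: [0, 1]}))

cyclic = {1: [(1, 2, 0), (1, 0)], 2: [(2, 3, 0), (2, 0)], 3: [(3, 1, 0), (3, 0)]}
clique = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(spvp(cyclic, clique)[1])   # gives up: routes keep changing
```

When the queue drains, every RIB matches what the neighbors currently use, so the result is a stable assignment; an instance with cyclic preferences and no stable assignment therefore never drains the queue.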
Gao-Rexford Conditions
Causes of inconsistency in BGP