Internet Routing (COS 598A) Today: Topology Size Jennifer Rexford Tuesdays/Thursdays 11:00am-12:20pm
Outline BGP distribution inside an AS Full-mesh iBGP Route reflectors Routing anomalies caused by route reflectors Pros and cons of proposed solutions IGP topology OSPF areas Summarization Multiple ASes
Routers Running eBGP, iBGP, and IGP [Figure: routers r1, r2, r3, r4 in AS A and AS B reaching a destination; legend: eBGP session, iBGP session, IGP link; IGP link weights 2, 1, 2, 4; routes via r1, r2, r4] r1 has closest egress point
Roles of eBGP, iBGP, and IGP eBGP: External BGP Learn routes from neighboring ASes Advertise routes to neighboring ASes iBGP: Internal BGP Disseminate BGP information within the AS IGP: Interior Gateway Protocol Compute shortest paths between routers in AS Identify closest egress point in BGP path selection
Full Mesh iBGP Configuration Internal BGP session Forward best BGP route to a neighbor Do not send from one iBGP neighbor to another Full-mesh configuration iBGP session between each pair of routers Ensures complete visibility of BGP routes
Why Do Point-to-Point Internal BGP? Reusing the BGP protocol iBGP is really just BGP … except you don’t add an AS to the AS path … or export routes between iBGP neighbors No need to create a second protocol Another protocol would add complexity And, full-mesh is workable for many networks Well, until they get too big…
Scalability Limits of Full Mesh on the Routers Number of iBGP sessions TCP connection to every other router Bandwidth for update messages Every BGP update sent to every other router Storage for the BGP routing table Storing many BGP routes per destination prefix Configuration changes when adding a router Configuring iBGP session on every other router
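The session count behind these limits can be worked out directly. The sketch below compares a full mesh with one simple route-reflector layout. The layout (k reflectors fully meshed with each other, every other router a client of exactly one reflector) and the function names are assumptions chosen for illustration, not a configuration taken from the slides.

```python
def full_mesh_sessions(n: int) -> int:
    """iBGP sessions when every pair of the n routers peers directly."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, k: int) -> int:
    """iBGP sessions under an assumed layout: k reflectors in a full mesh,
    and each of the remaining n - k routers peering with one reflector."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return k * (k - 1) // 2 + (n - k)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, full_mesh_sessions(n), route_reflector_sessions(n, 10))
```

The full mesh grows quadratically in the number of routers, while the assumed reflector layout grows linearly once k is fixed.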
Route Reflectors Relax the iBGP propagation rule Allow sending updates between iBGP neighbors Route reflector Receives iBGP updates from neighbors Sends a single BGP route to the clients Very much like provider, peer, and customer To client: send all BGP routes To peer route reflector: send client-learned routes To route reflector: send all client-learned routes
Example: Single Route Reflector 1 2 2 4 r2 Router only learns about r2
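A minimal sketch of why a client's choice can differ from the full-mesh choice. The IGP distances below are hypothetical (they are not read off the figure), and the tie-break on router name is a stand-in for the real BGP tie-breaking steps.

```python
def closest_egress(igp_distance: dict[str, int], candidates: list[str]) -> str:
    """Pick the egress with the smallest IGP distance from this viewpoint.
    Ties are broken by name, a placeholder for later BGP decision steps."""
    return min(candidates, key=lambda egress: (igp_distance[egress], egress))


# Hypothetical IGP distances from two viewpoints to the egress routers.
rr_distance = {"r1": 3, "r2": 1, "r4": 4}      # as seen by the route reflector
client_distance = {"r1": 1, "r2": 2, "r4": 4}  # as seen by the client router

egresses = ["r1", "r2", "r4"]

# Full mesh: the client sees every egress and picks its own closest one.
full_mesh_choice = closest_egress(client_distance, egresses)

# Single reflector: the reflector picks from its own viewpoint and sends
# only that one route, so the client has a single candidate to choose from.
reflected = closest_egress(rr_distance, egresses)
rr_client_choice = closest_egress(client_distance, [reflected])

print(full_mesh_choice, rr_client_choice)
```

With these assumed distances the client would pick r1 in a full mesh but ends up with r2 behind the reflector, because r2 is the only route it learns.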
The Problem With Route Reflectors Advantage: scalability Fewer iBGP sessions Lower bandwidth for update messages Smaller BGP routing tables Lower configuration overhead Disadvantage: changes the answers Clients only learn a subset of the BGP routes Does not result in the same choices as a full mesh ... especially if RR sees different IGP distances
Routing Anomaly: Forwarding Loop 1 r2 r1 1 1 Picks r2 Picks r1 Packet deflected toward other egress point, causing a loop
Routing Anomaly: Protocol Oscillation 1 2 3 1 1 1 5 5 5 r1 r2 r3 RR1 prefers r2 over r1 RR2 prefers r3 over r2 RR3 prefers r1 over r3
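One way to see that this configuration cannot settle is to enumerate every possible state. The sketch below models each reflector as either using its own client's route or its preferred neighbor's route. It assumes, following the route-reflector export rule, that a reflector advertises a route to a peer reflector only while it is selecting its own client-learned route. The state encoding and function names are illustrative.

```python
from itertools import product

# RR i prefers the client route of PREFERRED[i] over its own client's route
# (RR1 prefers r2, RR2 prefers r3, RR3 prefers r1).
PREFERRED = {1: 2, 2: 3, 3: 1}


def best_response(uses_own: dict[int, bool]) -> dict[int, bool]:
    """Given which reflectors currently select their own client route,
    return what each reflector would select next.  A reflector can take its
    preferred neighbor's route only if that neighbor is selecting (and hence
    advertising) its own client-learned route."""
    return {rr: not uses_own[nbr] for rr, nbr in PREFERRED.items()}


def stable_states() -> list[dict[int, bool]]:
    """All states in which no reflector wants to change its selection."""
    stable = []
    for bits in product([True, False], repeat=len(PREFERRED)):
        state = dict(zip(PREFERRED, bits))
        if best_response(state) == state:
            stable.append(state)
    return stable


print(stable_states())
```

Under this model none of the eight states is stable, so whatever order the updates arrive in, some reflector always has a reason to switch.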
Avoiding Routing Anomalies Reduce impact of route reflectors Ensure route reflector is close to its clients … so the RR makes consistent decisions Sufficient conditions for ensuring consistency RR preferring routes through clients over “peers” BGP messages should traverse same path as data Forces a high degree of replication Many route reflectors in the network E.g., a route reflector per PoP for correctness E.g., a second RR per PoP for reliability
Possible Solution: Disseminating More Routes Make route reflectors more verbose Send all BGP routes to clients, not just best route Send all equally-good BGP routes (up to IGP cost) Advantages Client routers have improved visibility Make the same decisions as in a full mesh Disadvantages Higher overhead for sending and storing routes Requires protocol changes to send multiple routes Not backwards compatible with legacy routers r1 r2 r3 r4 r1 r2 r4 1 2 2 4 r1, r2, r4
Possible Solution: Customized Dissemination Make route reflector more intelligent Send customized BGP route to each client Tell each client what he would pick himself Advantages Make the same decisions as in a full mesh Remain compatible with legacy routers Disadvantages Intelligent RR must make decisions per client … and select closest egress from each viewpoint r1 r2 r3 r4 r1 r2 r4 1 2 2 4 r1
Possible Solution: Multicast/Flooding Replace point-to-point distribution Apply a multicast protocol to distribute messages Or, flood the BGP messages to all routers Advantages Complete distribution without route reflectors Avoids configuration overhead of a full mesh Disadvantages Requires an additional, new protocol Not backwards-compatible with legacy routers Large BGP routing tables, like in a full mesh
Possible Solution: Tunnel Between Edge Routers Tunneling through the core Ingress router selects egress point Other routers blindly forward to the egress Advantages No risk of forwarding loops No BGP running on interior routers Disadvantages Overhead of tunneling protocol/technology Still has a risk of protocol oscillations [Figure: routers r1 and r2, IGP link weight 1, tunnel]
State-of-the-Art of BGP Distribution in an AS When full-mesh doesn’t scale Hierarchical route-reflector configuration One or two route reflectors per PoP Some networks use “confederations” (mini ASes) Recent ideas Sufficient conditions to avoid anomalies Enhanced RRs sending multiple or custom routes Flooding/multicast of BGP updates Tunneling to avoid packet deflections Open questions Are the sufficient conditions too restrictive? What is a good comparison of the various approaches?
IGP Topology
Interior Gateway Protocols (IGPs) Protocol overhead depends on the topology Bandwidth: flooding of link state advertisements Memory: storing the link-state database Processing: computing the shortest paths [Figure: example topology with weighted links]
Improving the Scaling Dijkstra’s shortest-path algorithm Simplest version: O(N^2), where N is # of nodes Better algorithms: O(L*log(N)), where L is # of links Incremental algorithms: great for small changes Timers to pace operations Minimum time between LSAs for the same link Minimum time between path computations More resources on the routers Routers with more CPU and memory
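For reference, a minimal sketch of the heap-based variant of Dijkstra's algorithm, which is one common way to get the O(L*log(N)) behavior. The graph representation (a dict of neighbor-to-weight dicts) and the example topology are assumptions for illustration.

```python
import heapq


def shortest_paths(graph: dict[str, dict[str, int]], source: str) -> dict[str, int]:
    """Dijkstra's algorithm with a binary heap and lazy deletion.
    graph[u][v] is the non-negative weight of the link from u to v."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, weight in graph.get(u, {}).items():
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical four-router topology with symmetric link weights.
example = {
    "a": {"b": 2, "c": 5},
    "b": {"a": 2, "c": 1, "d": 4},
    "c": {"a": 5, "b": 1, "d": 1},
    "d": {"b": 4, "c": 1},
}
print(shortest_paths(example, "a"))
```

Each link relaxation pushes at most one heap entry, so the heap never holds more than L entries per direction and every push or pop costs O(log N) up to a constant factor.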
Introducing Hierarchy: OSPF Areas Divide network into regions Backbone (area 0) and non-backbone areas Each area has its own link-state database Advertise only path distances at area boundaries Area 0 Area 1 Area 2 Area 3 Area 4 area border router 3 2 1 4 5 To area 0 cost 3 cost 8
Summarization at Area Boundaries Areas only help so much Advertising path costs to reach each component Single link failure may change multiple path costs Summarization: LSA for multiple components LSA for an IP prefix containing the addresses LSA carries the maximum path cost among the components [Figure: example area with weighted links; summary advertised to area 0 with cost 8]
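A small sketch of the summarization rule, assuming an area border router that knows a path cost per component prefix and advertises one summary whose cost is the maximum over the components it covers. The prefixes, costs, and the function name `summary_cost` are hypothetical.

```python
import ipaddress
from typing import Optional


def summary_cost(summary_prefix: str, component_costs: dict[str, int]) -> Optional[int]:
    """Cost to advertise for summary_prefix: the maximum path cost among the
    component prefixes it contains, or None if it contains none of them."""
    summary = ipaddress.ip_network(summary_prefix)
    covered = [
        cost
        for prefix, cost in component_costs.items()
        if ipaddress.ip_network(prefix).subnet_of(summary)
    ]
    return max(covered) if covered else None


# Hypothetical per-component path costs seen at an area border router.
costs = {"10.1.0.0/24": 3, "10.1.1.0/24": 8, "10.2.0.0/24": 5}
before = summary_cost("10.1.0.0/16", costs)

# A change inside the area that does not alter the maximum stays hidden.
costs["10.1.0.0/24"] = 4
after = summary_cost("10.1.0.0/16", costs)

print(before, after)
```

In this example the summary stays at cost 8 even though one component's cost changed, which is the isolation benefit. The price is that routers outside the area no longer see that some addresses are reachable at a lower cost.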
Assigning OSPF Areas Group related routers E.g., in a Point-of-Presence Assign to single OSPF area Put inter-PoP links in area 0 Enable summarization Select an address block for the equipment in the area Assign IP addresses in the block to router CPUs and interfaces [Figure: inter-PoP links, intra-PoP links, other networks]
Pros and Cons of Summarization Advantages: scalability Reduce the size of the link-state database One entry per summary prefix Isolate the rest of the network from changes Only advertise when max path cost changes Disadvantages Complexity Extra configuration details for areas & summarization Requires tight coupling with IP address assignment Inefficiency Summarization hides details that affect path selection Data packets may traverse a less-attractive path
Dividing into Multiple ASes Divide the network into regions Separate instance of IGP per region Interdomain routing between regions Loss of visibility into differences within region [Figure: North America, Europe, and Asia regions with link weights 20, 50, and 100]
Multi-AS Networks, Not Just for Scalability Administrative reasons Separate networks per geographic region Mergers/acquisitions that combine networks Why not merge to single AS? Using different intradomain protocols Managed by different people Fear of encountering scalability problems Fear of losing the benefits of isolation Why merge to a single AS? Simpler configuration More efficient routing Avoid having separate AS hop in BGP AS paths
Which Approach is Better? Ideal: flat IGP network Single AS Single IGP instance, no areas Hierarchical IGP Single IGP instance, using areas & summarization Multiple ASes Separate IGP instances Some other approach???
Comparison Metrics Scalability Protocol overhead Storing and flooding link-state advertisements Overhead of Dijkstra shortest-path computation Effects of topology changes Number of advertisements after a change Likelihood a change must be propagated Efficiency Stretch: comparing path lengths In ideal flat intradomain routing In alternative scheme How much longer do the paths get?
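Stretch can be made concrete as a ratio per source-destination pair: path cost under the alternative scheme divided by path cost under ideal flat routing. The sketch below assumes both sets of path costs have already been computed; the pairs and values are hypothetical.

```python
def stretch(flat_cost: dict[tuple[str, str], float],
            scheme_cost: dict[tuple[str, str], float]) -> dict[tuple[str, str], float]:
    """Per-pair stretch: cost in the alternative scheme over cost in the
    ideal flat design.  A value of 1.0 means the scheme loses nothing."""
    return {pair: scheme_cost[pair] / flat_cost[pair] for pair in flat_cost}


# Hypothetical path costs for two router pairs.
flat = {("a", "b"): 4, ("a", "c"): 10}
hierarchical = {("a", "b"): 6, ("a", "c"): 10}

per_pair = stretch(flat, hierarchical)
worst = max(per_pair.values())
average = sum(per_pair.values()) / len(per_pair)
print(per_pair, worst, average)
```

Reporting both the worst case and the average is a design choice: the worst case bounds how bad any single path can get, while the average reflects the typical cost of hiding information.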
Interesting Research Questions Routing protocols that achieve small stretch Theory work on algorithms to minimize stretch Protocol work on hierarchy and aggregation Any new distributed protocols with low stretch? Avoid sharp boundaries between areas/ASes? Identifying good places to hide information Given a network graph with link weights Decide where to put area and AS boundaries … with the goal of minimizing stretch … within some max size of each area or AS
Conclusion Networks are getting bigger Growth of a network topology Merger/acquisition of other networks Techniques for scaling the routing design BGP route reflection OSPF areas Multiple BGP ASes Relatively open research area Rich theoretical tradition on compact routing Common operational practices for protocol scaling Not much work has been done in between
Next Time: Router Configuration Two papers “Automated provisioning of BGP customers” (just sections 1-3) “Detecting BGP faults with static analysis” Review only of second paper Summary Why accept Why reject Future work Optional Short survey on BGP routing policies for ISPs NANOG video covering material in second paper