Stable Internet Routing Without Global Coordination Jennifer Rexford AT&T Labs--Research Joint work with Lixin Gao
Internet Architecture Divided into Autonomous Systems –Distinct regions of administrative control –Set of routers and links managed by a single institution –Service provider, company, university, … Hierarchy of Autonomous Systems –Large, tier-1 provider with a nationwide backbone –Medium-sized regional provider with smaller backbone –Small network run by a single company or university Interaction between Autonomous Systems –Internal topology is not shared between ASes –… but, neighboring ASes interact to coordinate routing
Autonomous Systems (ASes) Client Web server Path: 6, 5, 4, 3, 2, 1
Interdomain Routing Convergence Challenges Must scale –Destination address blocks: 150,000 and growing –Autonomous Systems: 20,000 visible ones, and growing –AS paths and routers: at least in the millions… Must support flexible policy –Path selection: selecting which path your AS wants to use –Path export: controlling who can send packets through your AS Must converge, and quickly –VoIP and video games need convergence in tens of milliseconds –Routing protocol convergence can take several (tens of) minutes –… and the routing system doesn’t necessarily converge at all! Goal: Guaranteed convergence of the global routing system with purely local control.
Interdomain Routing: Border Gateway Protocol ASes exchange info about who they can reach –IP prefix: block of destination IP addresses –AS path: sequence of ASes along the path Policies configured by the AS’s network operator –Path selection: which of the paths to use? –Path export: which neighbors to tell? “I can reach /24” “I can reach /24 via AS 1” data traffic
Conflicting Policies Cause Convergence Problems Pick the highest-ranked path consistent with your neighbors’ choices. Only choice! Top choice! Only choice! Better choice! Only choice! Better choice!
Global Control is Not Workable Create a global Internet routing registry –Keeping the registry up-to-date would be difficult Require each AS to publish its routing policies –ASes may be unwilling to reveal BGP policies Check for conflicting policies, and resolve conflicts –Checking for convergence problems is NP-complete –Link/router failure may result in an unstable system Need a solution that does not require global coordination.
Think Globally, Act Locally Key features of a good solution –Flexibility: allow diverse local policies for each AS –Privacy: do not force ASes to divulge their policies –Backwards-compatibility: no changes to BGP –Guarantees: convergence even when system changes Restrictions based on AS relationships –Path selection rules: which route you prefer –Export policies: who you tell about your route –AS graph structure: who is connected to who
Customer-Provider Relationship Customer pays provider for access to the Internet –Provider exports its customer’s routes to everybody –Customer exports provider’s routes only to downstream customers d d provider customer provider Traffic to the customerTraffic from the customer advertisements traffic
Peer-Peer Relationship Peers exchange traffic between their customers –AS exports only customer routes to a peer –AS exports a peer’s routes only to its customers peer Traffic to/from the peer and its customers d advertisements traffic
Hierarchical AS Relationships Provider-customer graph is a directed, acyclic graph –If u is a customer of v and v is a customer of w –… then w is not a customer of u u v w
Our Local Path Selection Rules Classify routes based on next-hop AS –Customer routes, peer routes, and provider routes Rank routes based on classification –Prefer customer routes over peer and provider routes Allow any ranking of routes within a class –E.g., can rank one customer route higher than another –Gives network operators the flexibility they need Consistent with traffic engineering practices –Customers pay for service, and providers are paid –Peer relationship contingent on balanced traffic load
Solving the Convergence Problem Restrictions –Export policies based on AS relationships –Path selection rule that favors customer routes –Acyclic provider-customer graph Result –Safety: guaranteed convergence to a unique stable solution –Inherent safety: holds under failures and policy changes Sketch of (constructive) proof –System state: the current best route at each AS, for one prefix –Activating an AS: revisiting decision based on neighbors’ choices –Stable state: find an activation sequence that leads to a stable state –Convergence: any “fair” sequence includes this sequence
Proof, Phase 1: Selecting Customer Routes Activate ASes in customer-provider order –AS picks a customer route if one exists –Decision of one AS cannot cause an earlier AS to change its mind d An AS picks a customer route when one exists
Proof, Phase 2: Selecting Peer and Provider Routes Activate rest of ASes in provider-customer order –Decision of one phase-2 AS cannot cause an earlier phase-2 AS to change its mind –Decision of phase-2 AS cannot affect a phase 1 AS AS picks a peer or provider route when no customer route is available d
Economic Incentives Affect Protocol Behavior ASes already follow our rules, so system is stable –High-level argument »Export and topology assumptions are reasonable »Path selection rule matches with financial incentives –Empirical results [IMW’02] »BGP routes for popular destinations are stable for ~10 days »Most instability from failure/recovery of a few destinations ASes should follow our rules to make system stable –Need to encourage operators to obey these guidelines –… and provide ways to verify the network configuration –Need to consider more complex relationships and graphs
Playing One Condition Off Against Another All three conditions are important –Path ranking, export policy, and graph structure Allowing more flexibility in ranking routes –Allow same preference for peer and customer routes –Never choose a peer route over a shorter customer route … at the expense of stricter AS graph assumptions –Hierarchical provider-customer relationship (as before) –No private peering with (direct or indirect) providers Peer-peer
Extension to Backup Relationships [INFOCOM’01] Backups: more liberal export policies, and different ranking –The motivation is increased reliability –…but ironically it may cause routing instability! Generalize rule: prefer routes with fewest backup links –Need to maintain a count of the # of backup links in the path backup path primary provider backup provider failure Backup Provider backup path failure peer provider Peer-Peer Backup [RFC 1998]
Results Hold Under More Complex Scenarios Complex AS relationships –AS pair with different relationship for different prefixes –AS pair with both a backup and a peer relationships –AS providing transit service between two peer ASes Stability under changing AS relationships –Customer-provider to/from peer-peer –Customer-provider to/from provider-customer
Conclusions Avoiding convergence problems –Hierarchical AS relationships –Export policies based on commercial relationships –Path ranking based on AS relationships Salient features –No global coordination (locally implementable) –No changes to BGP protocol or decision process –Guaranteed convergence, even under failures –Guidelines consistent with financial incentives
Broader Influence of the Work Influence of AS relationships on BGP convergence –Algebraic framework and design principles for policy languages –Fundamental limits on relaxing the assumptions Application of the idea to internal BGP inside an AS –Sufficient conditions for iBGP convergence inside an AS –“What-if” tool for traffic engineering inside an AS AS-level analysis of the Internet topology –Inference of AS relationships and policies from routing data –Characterization of AS-level topology and growth Practical applications of knowing AS relationships –Analyzing your competitors’ business relationships –Identifying BGP routes that violate export conditions