Internet Routing (COS 598A) Today: Router Software Jennifer Rexford Tuesdays/Thursdays 11:00am-12:20pm
Outline Continuing discussion from last class –Proposals for removing routing from routers –Feasibility, collecting data, computing paths, etc. BGP implementation –Storage overhead –CPU overhead Recent proposals –Graceful restart to limit effects of resets –Tunneling to limit hot-potato changes –Computing routes for groups of routers
Proposal #1: Routing As a Service Goal: third parties pick end-to-end paths for clients to satisfy diverse user objectives Forwarding infrastructure –Basic routing (e.g., default routing) –Primitives for inserting routes Route selector –Aggregates network information –Selects routes on behalf of clients –Competes with other selectors for customers End host –Queries route selector to set up paths
Proposal #2: Routing Control Platform Goal: Move beyond today’s artifacts, while remaining compatible with the legacy routers Incentive compatibility: phased evolution –Intelligent route reflector in a single AS –Learning eBGP routes directly from neighbor ASes –Interdomain routing between RCPs Backwards compatibility: internal BGP –Using iBGP to “push” answers to the routers –No need to change the legacy routers at all –Keep message format and change decision rules [Figure: an RCP in each of AS 1, AS 2, and AS 3; labels: iBGP, eBGP, physical peering, inter-AS protocol between RCPs]
Proposal #3: Wafer-Thin Control Plane Goal: Refactor the data, control, and management planes from scratch Management plane → Decision plane –Operates on network-wide view and objectives –Directly controls the data plane in real time Control plane → Discovery plane –Responsible for providing the network-wide view –Topology discovery, traffic measurement, etc. Data plane –Queues, filters, and forwards data packets –Accepts direct instruction from the decision plane Simple routers that have no control-plane configuration
How Do These Differ From Overlays? Overlays: circumventing the underlay –Host nodes throughout the network –Logical links between the host nodes –Active probes to observe the performance –Direct packets through good intermediate nodes Routing services: controlling the underlay –Servers collect data directly from the routers –Servers compute forwarding tables for the routers –Data packets do not go through the servers –Like an overlay for managing the underlay Maybe some combination of the two makes sense?
Practical Issues: Feasibility Fast reaction to failures –Routers are closer to the failures –Can a service react quickly enough? Scalability with network size –State and computation grow with the topology –Can a service manage a large network? Reliability? –Service is now a point of failure –Is simple replication enough? Security? –Service is now a natural point of attack –Easier (or harder) to protect than the routers?
Practical Issues: Collecting Measurement Data All three proposals make measurement a first-order part of running the network Routers have only two jobs –Forward packets –Collect measurement data What measurements? –Topology discovery –Traffic demands –Performance statistics –…?
Practical Issues: Path-Computation Algorithms Selecting routes should be easier –Complete view of network topology and traffic –Possibility of using centralized algorithms –Direct control over forwarding tables …but what algorithms to use? –Still need a separation of timescale, but how? Fast reaction to topological changes Semi-offline optimization of routing … and how to compute end-to-end paths? –Policy-based path vector protocol? –Publish/subscribe system? –Something else?
Practical Issues: Solving Real Problems? Customer load-balancing –Trading off load, performance, and cost –Controlling inbound and outbound traffic –Avoiding small subnets and BGP tweaks Preventing overloading router resources –Minimum-sized forwarding table per router –Minimum stretch while obeying memory limits Flexible end-to-end path selection –Satisfy the goals of end users and providers –Handle pricing/economics in the right way
Other Thoughts?
Router Software
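Basic BGP Implementation [Figure: per-neighbor RIB-in-1, RIB-in-2, …, RIB-in-n pass through the import policy into the main RIB; the decision process selects routes; the export policy fills per-neighbor RIB-out-1, RIB-out-2, …, RIB-out-n]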
Storage Overhead: RIB-In Storing routes learned from each neighbor –Before applying the import policy Advantages of keeping a RIB-In –Verify receipt of routes that have been filtered –Use as input to simulate import-policy changes –Apply new policies directly on local RIB-In Alternatives to keeping a RIB-In –Reset the session after any policy change Undesirable, unless policy changes are infrequent –Route-refresh option to signal neighbor to resend Relatively new feature, so not universally supported
Storage Overhead: Main RIB Storing all candidate routes –All routes after import processing –Keep track of the best route for each prefix Advantages –Necessary to store at least one copy of each route –… since BGP is an incremental protocol Alternatives –Store only the RIB-In for each neighbor Requires rerunning the import policies for each decision
Storage Overhead: RIB-Out Storing routes sent to each neighbor –After applying the export policy Advantages of keeping a RIB-Out –Verify sending of route to the neighbor –Compare routes to suppress unnecessary updates No update message if all attributes are the same No withdrawal message if there was no advertisement Alternative to keeping a RIB-Out –Reapply export policy to recompute the route … or send some unnecessary update messages –Single RIB-Out per export policy (peer groups)
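The update-suppression role of the RIB-Out can be sketched as follows. This is a minimal illustration, not code from any router implementation: the `Route` fields and the `RibOut.advertise` method are hypothetical names, and a real RIB-Out would compare the full BGP attribute set.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified route: a few attributes only."""
    prefix: str
    as_path: Tuple[int, ...]
    next_hop: str
    med: int = 0


class RibOut:
    """Routes most recently advertised to one neighbor (after export policy)."""

    def __init__(self) -> None:
        self.sent: Dict[str, Route] = {}

    def advertise(self, prefix: str, route: Optional[Route]) -> Optional[str]:
        """Return the message to send, or None if it can be suppressed.

        route=None means the export policy now yields no route for the prefix.
        """
        previous = self.sent.get(prefix)
        if route is None:
            if previous is None:
                return None          # nothing was advertised: no withdrawal
            del self.sent[prefix]
            return "withdraw"
        if previous == route:
            return None              # all attributes are the same: no update
        self.sent[prefix] = route
        return "update"
```

Without the stored copy in `sent`, the router would have to either reapply the export policy to the previous best route or send the message unconditionally, which is the trade-off the slide names.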
BGP Peer Groups Group of BGP neighbors with same policies –Avoid repetitive configuration –Avoid reapplying the same policy –Avoid duplicating the storage Example iBGP peer groups –Route-reflector clients –Route-reflector peers Example eBGP peer groups –Customers –Peers
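A rough sketch of why peer groups save work: the export policy runs once per group rather than once per neighbor, and the group shares one RIB-Out. The function name, the group names, and the toy policies are assumptions made for illustration only.

```python
from typing import Callable, Dict, Tuple

# Hypothetical route record: just remembers which kind of neighbor it came from.
Route = Dict[str, str]
Policy = Callable[[str, Route], bool]


def export_per_group(
    neighbor_group: Dict[str, str],
    export_policy: Dict[str, Policy],
    best_routes: Dict[str, Route],
) -> Tuple[Dict[str, Dict[str, Route]], int]:
    """Build one shared RIB-Out per peer group; count policy evaluations."""
    rib_out: Dict[str, Dict[str, Route]] = {}
    evaluations = 0
    for group in sorted(set(neighbor_group.values())):
        policy = export_policy[group]
        exported: Dict[str, Route] = {}
        for prefix, route in best_routes.items():
            evaluations += 1
            if policy(prefix, route):
                exported[prefix] = route
        rib_out[group] = exported      # shared by every neighbor in the group
    return rib_out, evaluations


# Toy policies: customers receive everything, peers only customer routes.
policies: Dict[str, Policy] = {
    "customers": lambda prefix, route: True,
    "peers": lambda prefix, route: route["learned_from"] == "customer",
}
```

With four neighbors in two groups and three best routes, this evaluates the policy 6 times instead of the 12 that a per-neighbor loop would need.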
CPU Overhead: New BGP Update Message When receiving a new BGP update –Apply import policy and update the RIB –Re-run the BGP decision process for this prefix –If best route changes, apply export policies and send update message to affected neighbors Running decision process –Ideally, just compare with the best route Withdraw non-best route: no change Update non-best route: compare to current best –But, BGP does not always form a total ordering MED attribute compared only for same next-hop AS Re-run decision process for deterministic outcome
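The MED point can be made concrete with three routes whose pairwise preferences form a cycle. The sketch below assumes all earlier steps of the decision process (local preference, AS-path length, and so on) are tied, and reduces the rest to two steps: MED within the same neighbor AS, then IGP cost. The `Route` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Route:
    name: str
    neighbor_as: int
    med: int
    igp_cost: int


def prefer(a: Route, b: Route) -> Route:
    """Pairwise shortcut: MED only if both routes share a neighbor AS."""
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if a.igp_cost <= b.igp_cost else b


def full_decision(routes: List[Route]) -> Route:
    """Re-run both steps over all candidates for a deterministic outcome."""
    lowest_med: Dict[int, int] = {}
    for r in routes:
        lowest_med[r.neighbor_as] = min(lowest_med.get(r.neighbor_as, r.med), r.med)
    # Step 1: within each neighbor AS, drop routes that lose on MED.
    survivors = [r for r in routes if r.med == lowest_med[r.neighbor_as]]
    # Step 2: among the survivors, pick the lowest IGP cost.
    return min(survivors, key=lambda r: r.igp_cost)


A = Route("A", neighbor_as=1, med=20, igp_cost=1)
B = Route("B", neighbor_as=2, med=0, igp_cost=2)
C = Route("C", neighbor_as=1, med=10, igp_cost=3)
```

Here A beats B and B beats C on IGP cost, yet C beats A on MED, so there is no total order. If B is the current best and A arrives, comparing A only against B would select A, whereas the full process removes A (it loses to C on MED) and keeps B.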
CPU Overhead: Events that Amplify Work BGP session failure –Must discard all routes learned from this neighbor –… and run decision process for affected prefixes Policy change –Must apply the new routing policy to all routes learned from (or sent to) this neighbor –… and run decision process for affected prefixes Intradomain change –Must revisit BGP decision for affected prefixes Exclude routes with unreachable next-hop Prefer the route with the closest egress point
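As an illustration of how a single session failure fans out into per-prefix work, the sketch below keeps a toy RIB and reruns the decision only for prefixes that had a route from the failed neighbor. The `Rib` class is hypothetical, and the decision process is reduced to "shortest AS path, then lowest neighbor name".

```python
from collections import defaultdict
from typing import Dict, Set


class Rib:
    def __init__(self) -> None:
        # prefix -> {neighbor -> AS-path length}
        self.routes: Dict[str, Dict[str, int]] = defaultdict(dict)
        # prefix -> neighbor whose route is currently best
        self.best: Dict[str, str] = {}

    def _decide(self, prefix: str) -> None:
        candidates = self.routes.get(prefix)
        if not candidates:
            self.routes.pop(prefix, None)   # no route left for this prefix
            self.best.pop(prefix, None)
            return
        self.best[prefix] = min(candidates, key=lambda n: (candidates[n], n))

    def learn(self, neighbor: str, prefix: str, path_len: int) -> None:
        self.routes[prefix][neighbor] = path_len
        self._decide(prefix)

    def session_down(self, neighbor: str) -> Set[str]:
        """Discard every route from the neighbor; return the affected prefixes."""
        affected = {p for p, c in self.routes.items() if neighbor in c}
        for prefix in affected:
            del self.routes[prefix][neighbor]
            self._decide(prefix)            # one decision run per affected prefix
        return affected
```

The size of `affected` is what makes this event expensive: a neighbor that announced a full routing table triggers a decision run for every prefix in it.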
CPU Overhead: Deferring Heavy Jobs Event-driven approach –Process most events as they occur –Defer heavy-load items to background task –Make sure these tasks can run soon –Example: XORP handling session failures Timer-driven approach –Periodic timer driving the operation –Scan the data structures when the timer expires –… and identify and perform any needed work –Example: Cisco scan timer for IGP changes
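The event-driven approach can be sketched as a loop that handles light events immediately and pushes heavy work onto a background queue drained in bounded slices. This is an assumed structure for illustration only; it is not a description of how XORP or any other implementation is actually written, and all names are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class EventLoop:
    def __init__(self, batch_size: int = 2) -> None:
        self.background: Deque[Tuple[str, str]] = deque()
        self.batch_size = batch_size
        self.log: List[str] = []

    def on_update(self, prefix: str) -> None:
        # Light event: process as it occurs.
        self.log.append(f"update {prefix}")

    def on_session_failure(self, neighbor: str, prefixes: List[str]) -> None:
        # Heavy event: split into small jobs and defer them.
        for prefix in prefixes:
            self.background.append((neighbor, prefix))

    def run_background_slice(self) -> int:
        """Do at most batch_size deferred jobs; return how many were done."""
        done = min(self.batch_size, len(self.background))
        for _ in range(done):
            neighbor, prefix = self.background.popleft()
            self.log.append(f"withdraw {prefix} via {neighbor}")
        return done
```

Bounding each slice keeps new updates from waiting behind a long cleanup, at the cost of the cleanup finishing later. A timer-driven design would instead call a scan like `run_background_slice` from a periodic timer rather than between events.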
Reducing Overhead: Operational Practices Avoiding RIB-In storage –Configuring router not to store RIB-In –Convincing neighbors to support route-refresh Configuring peer groups –Limiting the number of unique export policies … or limiting the number of these per router –Putting all possible sessions in same peer group Selecting good timer settings –Allow grouping of update messages –Avoid false detection of session failures
Reducing the Effects of Session Failures Separating control from data –Suppose a router’s BGP process fails –… but the data plane is just fine When the neighbor’s BGP process fails –Do not delete routes learned from neighbor –Continue to forward data packets When the neighbor’s process restarts –Refresh the neighbor by re-sending BGP routes –Neighbor re-builds its RIB and goes back to normal BGP “Graceful Restart” mechanism –New BGP capability for neighbors to negotiate –Mark routes from the neighbor as “stale” –Refresh by resending RIB-Out with End-of-RIB marker [Figure: RIB, FIB, and data packets]
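The stale-marking behavior can be sketched from the point of view of the router that keeps forwarding while its neighbor restarts. This is a simplified model under stated assumptions: the `NeighborRoutes` class and its method names are hypothetical, and capability negotiation and the restart timer are omitted.

```python
from typing import Dict, List, Tuple


class NeighborRoutes:
    """Routes learned from one neighbor, each with a stale flag."""

    def __init__(self) -> None:
        # prefix -> (next_hop, stale)
        self.routes: Dict[str, Tuple[str, bool]] = {}

    def learn(self, prefix: str, next_hop: str) -> None:
        # A (re-)advertised route is fresh and replaces any stale copy.
        self.routes[prefix] = (next_hop, False)

    def session_lost(self) -> None:
        # Keep the routes, but mark them all as stale.
        self.routes = {p: (nh, True) for p, (nh, _) in self.routes.items()}

    def end_of_rib(self) -> List[str]:
        """On the End-of-RIB marker, drop routes that were never refreshed."""
        removed = sorted(p for p, (_, stale) in self.routes.items() if stale)
        for prefix in removed:
            del self.routes[prefix]
        return removed

    def forwarding_table(self) -> Dict[str, str]:
        # Stale routes still forward packets until they are removed.
        return {p: nh for p, (nh, _) in self.routes.items()}
```

Between `session_lost` and `end_of_rib`, the forwarding table is unchanged, which is the point of separating the control-plane failure from the data plane.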
Reducing the Effects of IGP Changes Circumvent hot-potato routing –Avoid small IGP changes leading to BGP changes –… and avoid the software overhead on BGP Tunneling between edge routers –Create tunnel from ingress to egress router –Assign a weight to the tunnel (e.g., air miles) –Tunnel weight does not depend on IGP path [Figure: topology with routers A through G and destination dst]
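A small worked example of the difference, with made-up numbers: under hot-potato routing the egress choice follows the IGP distance, so a small IGP change can flip it, whereas a static tunnel weight leaves the choice untouched. The egress names, distances, and weights below are illustrative assumptions, not values from the slide's figure.

```python
from typing import Dict


def pick_egress(cost: Dict[str, int]) -> str:
    """Pick the egress router with the lowest cost; ties go to the lowest name."""
    return min(sorted(cost), key=lambda egress: cost[egress])


# Hot-potato routing: the cost is the IGP distance from the ingress router.
igp_before = {"C": 9, "D": 10}
igp_after = {"C": 11, "D": 10}    # the path to C became 2 units longer

# Tunneling: the cost is a static weight assigned to each tunnel
# (for example, air miles), independent of the IGP path underneath.
tunnel_weight = {"C": 100, "D": 120}
```

With these numbers, the hot-potato choice moves from C to D after the IGP change and BGP must revisit every prefix reachable through both egresses; the tunnel-based choice stays at C both before and after.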
Reducing Overhead for Groups of Routers Additional overhead in RCP-like approaches –Computing routes on behalf of many routers –Could lead to a linear increase in overhead Store a single copy of each BGP route –One big global RIB for the network –Plus, avoid repeating some of the decision process Compute for groups of routers (e.g., PoP) –One shared RIB-out for each group of routers –Plus, avoid repeating the decision process Reduce the overhead of IGP changes –E.g., by using tunnels, as on the previous slide
Conclusion Router software –Very challenging systems problem –New open-source software (Quagga, OpenBGPd) Improving scalability –Scaling with # of routers, sessions, and prefixes –Trading off memory and CPU resources –Avoiding events that create excessive work Newly active research area –Importance of control plane in network performance, reliability, and security –Creation of new platforms for router software
Next Time: BGP Security Two papers –“Beware of BGP Attacks” –“Secure Border Gateway Protocol (Secure-BGP)” Review only the second paper –Summary –Why accept –Why reject –Future work Optional NANOG video –See the Web site later today…