Traffic Engineering in IP Networks Jennifer Rexford Computer Science Department Princeton University; Princeton, NJ
A Little About Me… Technical interests: data networking –Network measurement –Network operations –Internet routing and infrastructure Job history –2005-onward: Professor in Computer Science at Princeton – : AT&T Labs—Research »Technology transfer of tools for measurement, configuration, troubleshooting, and traffic engineering to network operators DARPA involvement –Member of the ISAT study group –Knowledge Plane seedling project –NetVision2012 workshops
Outline Internet routing protocols Traffic engineering using traditional protocols –Optimizing configuration to the traffic –Needs topology, routing, and traffic data Traffic demands –Volume of load between edges of the network –Measuring the traffic demands Route optimization –Tuning the link weights to the traffic –Satisfying the operational constraints Conclusions
Internet Architecture Divided into Autonomous Systems –Regions of administrative control –Routers and links managed by an institution –Service provider, company, university, … Hierarchy of Autonomous Systems –Tier-1 provider with nationwide backbone –Medium-sized regional provider –Campus or corporate network Interaction between Autonomous Systems –Internal topology is not shared –… but, ASes interact to coordinate routing
Path Traversing Multiple ASes Client Web server Path: 6, 5, 4, 3, 2, 1
Interdomain Routing: Border Gateway Protocol ASes exchange info about who they can reach –IP prefix: block of destination IP addresses –AS path: sequence of ASes along the path Policies configured by the AS’s network operator –Path selection: which of the paths to use? –Path export: which neighbors to tell? Client ( ) “I can reach /24” “I can reach /24 via AS 1”
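The two policy questions on this slide (which path to use, which neighbors to tell) can be illustrated with a minimal sketch. It assumes two standard BGP decision steps, highest local preference and then shortest AS path, and ignores the remaining tie-breakers. The route dictionaries, the AS numbers, and the prefix 192.0.2.0/24 are hypothetical illustrations, not values from the talk.

```python
def select_route(routes, my_as):
    """Pick one route: highest local_pref first, then shortest AS path.

    Routes whose AS path already contains my_as are discarded (loop avoidance).
    """
    usable = [r for r in routes if my_as not in r["as_path"]]
    if not usable:
        return None
    return min(usable, key=lambda r: (-r["local_pref"], len(r["as_path"])))


def export_route(route, my_as):
    """Advertise a route to a neighbor with our own AS number prepended."""
    return {"prefix": route["prefix"], "as_path": [my_as] + route["as_path"]}


# Hypothetical routes learned by AS 7 for one prefix.
ROUTES = [
    {"prefix": "192.0.2.0/24", "as_path": [3, 4, 1], "local_pref": 100},
    {"prefix": "192.0.2.0/24", "as_path": [2, 1], "local_pref": 100},
    {"prefix": "192.0.2.0/24", "as_path": [7, 1], "local_pref": 300},
]

best = select_route(ROUTES, my_as=7)
advertised = export_route(best, my_as=7)
```

Here the third route is dropped because it already contains AS 7, and the shorter of the two remaining paths is selected and re-advertised as "via AS 7".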
Intradomain Routing: OSPF or IS-IS Shortest path routing based on link weights –Routers flood the link-state information to each other –Routers compute the “next hop” to reach other routers Weights configured by the AS’s network operator –Simple heuristics: link capacity or physical distance –Traffic engineering: tuning the link weights to the traffic
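How a router turns link weights into "next hop" decisions can be sketched with Dijkstra's algorithm. This is a minimal illustration, not OSPF or IS-IS itself: it assumes the flooded link-state database is already available as a dictionary, and the three-router topology and its weights are made up.

```python
import heapq


def next_hops(graph, source):
    """Return {destination: first hop from source} along a shortest path.

    graph maps each router to {neighbor: link weight}; weights are positive.
    """
    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry, a shorter path was already found
        for nbr, w in graph[node].items():
            if d + w < dist.get(nbr, float("inf")):
                dist[nbr] = d + w
                # Neighbors of the source are their own first hop;
                # everyone else inherits the first hop of its predecessor.
                first_hop[nbr] = nbr if node == source else first_hop[node]
                heapq.heappush(heap, (d + w, nbr))
    return first_hop


# Hypothetical topology: the direct A-C link is "expensive" (weight 3).
GRAPH = {
    "A": {"B": 1, "C": 3},
    "B": {"A": 1, "C": 1},
    "C": {"A": 3, "B": 1},
}

table = next_hops(GRAPH, "A")
```

With these weights router A reaches C through B; lowering the A-C weight below 2 would move that traffic onto the direct link, which is the lever traffic engineering pulls.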
Motivating Problem: Congested Link Detecting that a link is congested –Utilization statistics suggest overloaded link –Probe traffic suffers degraded performance –Customers complain (via the phone network?) Reasons why the link might be congested –Increase in the offered traffic –Routing change due to equipment failure –Routing change due to a change in another AS Challenges –Know the cause, not just the manifestations –Predict the effects of possible changes to link weights
Traffic Engineering in an ISP Backbone Network topology –Connectivity and capacity of routers and links Traffic demands –Offered load between points in the network Routing configuration –Link weights for selecting paths Performance objective –Balanced load, low latency, … Question: Given the topology and traffic demands in an IP network, what link weights should be used?
Modeling Traffic Demands Volume of traffic V(s,d,t) –From a source s –To a destination d –Over a time period t Time period –Performance debugging – minutes –Traffic engineering – hours or days –Network design – days to weeks Sources and destinations –Hosts – interesting, but huge, and hard to measure –IP prefixes – still big, and not seen by any one AS –Edge routers – hmmm….
Traffic Matrix Traffic matrix: V(in,out,t) for all pairs (in,out)
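A point-to-point traffic matrix can be built by summing measured volume per (in, out, t). The sketch below assumes each measurement record already carries its ingress router, egress router, a timestamp in seconds, and a byte count; that record layout and the router names are assumptions for illustration.

```python
from collections import defaultdict


def traffic_matrix(records, period_seconds):
    """Sum bytes per (ingress, egress, time bin) to obtain V(in, out, t)."""
    matrix = defaultdict(int)
    for ingress, egress, timestamp, nbytes in records:
        t = timestamp // period_seconds  # index of the time period
        matrix[(ingress, egress, t)] += nbytes
    return dict(matrix)


# Hypothetical records: (ingress, egress, timestamp in seconds, bytes).
RECORDS = [
    ("sf", "ny", 10, 500),
    ("sf", "ny", 3000, 700),
    ("sf", "ny", 4000, 100),
    ("ny", "sf", 20, 50),
]

hourly = traffic_matrix(RECORDS, period_seconds=3600)
```

Choosing `period_seconds` reflects the time scales on the earlier slide: minutes for performance debugging, hours or days for traffic engineering.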
Problem: Hot Potato Routing AS is in the middle of the Internet –Multiple connections to multiple other ASes –Egress point depends on intradomain routing Problem with point-to-point models –Want to predict impact of changing intradomain routing –But, a change in weights may change the egress point!
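The dependence of the egress point on intradomain weights can be shown with a small sketch. It assumes the ingress router sends traffic to whichever candidate egress is closest by shortest-path distance, and that every candidate is reachable; the topology and router names are hypothetical.

```python
import heapq


def distances(graph, source):
    """Shortest-path distance from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, w in graph[node].items():
            if d + w < dist.get(nbr, float("inf")):
                dist[nbr] = d + w
                heapq.heappush(heap, (d + w, nbr))
    return dist


def closest_egress(graph, ingress, egress_set):
    """Hot-potato choice: the egress with the smallest intradomain distance.

    Ties are broken by name so the result is deterministic.
    """
    dist = distances(graph, ingress)
    return min(sorted(egress_set), key=lambda e: dist[e])


BEFORE = {"in": {"x": 1, "y": 2}, "x": {"in": 1}, "y": {"in": 2}}
# Same topology after the operator raises the weight of the in-x link.
AFTER = {"in": {"x": 5, "y": 2}, "x": {"in": 5}, "y": {"in": 2}}

egress_before = closest_egress(BEFORE, "in", {"x", "y"})
egress_after = closest_egress(AFTER, "in", {"x", "y"})
```

A single weight change moves the traffic from egress x to egress y, so a matrix keyed on one fixed egress point would mispredict the effect of the change.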
Traffic Demand: Multiple Egress Points Definition: V(in, {out}, t) –Entry link (in) –Set of possible egress links ({out}) –Time period (t) –Volume of traffic (V(in,{out},t)) Computing the traffic demands –Measure the traffic where it enters the ISP backbone –Identify the set of egress links where traffic could leave –Sum over all traffic with same in, {out}, and t
Traffic Mapping: Ingress Measurement Packet measurement (e.g., NetFlow, sampling) –Ingress point i –Destination prefix d –Traffic volume V_{id}
Traffic Mapping: Egress Point(s) Routing data (e.g., forwarding tables) –Destination prefix d –Set of egress points e_d
Traffic Mapping: Combining the Data Combining multiple types of data –Traffic: V_{id} (ingress i, destination prefix d) –Routing: e_d (set of egress links toward d) –Combining: sum V_{id} over all prefixes d that share the same ingress i and egress set e_d
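The combining step can be sketched as a join between the per-prefix ingress volumes and the per-prefix egress sets, followed by a sum over prefixes that share the same ingress and egress set. The dictionary layouts, link names, and prefixes below are assumptions for illustration; a single time period is assumed, and the longest-prefix matching a real forwarding table performs is omitted.

```python
from collections import defaultdict


def traffic_demands(volumes, egress_sets):
    """Aggregate V_id into demands keyed by (ingress, egress set).

    volumes:     {(ingress, prefix): bytes}
    egress_sets: {prefix: iterable of egress links toward that prefix}
    """
    demands = defaultdict(int)
    for (ingress, prefix), volume in volumes.items():
        egress = egress_sets.get(prefix)
        if egress is None:
            continue  # no routing entry for this prefix in the table dump
        demands[(ingress, frozenset(egress))] += volume
    return dict(demands)


# Hypothetical measurements for one time period.
VOLUMES = {
    ("sf", "198.51.100.0/24"): 100,
    ("sf", "192.0.2.0/24"): 50,
    ("ny", "198.51.100.0/24"): 7,
    ("ny", "203.0.113.0/24"): 9,  # prefix missing from the routing data
}
EGRESS_SETS = {
    "198.51.100.0/24": ["e1", "e2"],
    "192.0.2.0/24": ["e2", "e1"],
}

demands = traffic_demands(VOLUMES, EGRESS_SETS)
```

Using a `frozenset` as part of the key makes two prefixes with the same egress links collapse into one demand regardless of the order in which the links were listed.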
Application on the AT&T Backbone Measurement data –NetFlow data (ingress traffic) –Forwarding tables (sets of egress points) –Configuration files (topology and link weights) Effectiveness –Ingress traffic could be matched with egress sets –Simulated flow of traffic consistent with link loads Challenges –Loss of NetFlow records during delivery (can correct for it!) –Egress set changes between table dumps (not very many) –Topology changes between configuration dumps (just one!)
Three Traffic Demands in San Francisco
Underpinnings of the Optimization Route prediction engine (“what-if” tool) –Model the influence of link weights on traffic flow »Select a closest exit point based on link weights »Compute shortest path(s) based on link weights »Capture splitting over multiple shortest paths –Sum the traffic volume traversing each link Objective function –Rate the “goodness” of a setting of the link weights –E.g., “max link utilization” or “sum of exp(utilization)”
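A "what-if" computation of this kind can be sketched as follows. The code is a simplified illustration rather than the tool described in the talk: it assumes demands are already resolved to a single destination router (the closest-exit step is left out), positive integer link weights, an even split over equal-cost next hops, and "max link utilization" as the objective. All names and the diamond topology are hypothetical.

```python
import heapq
from collections import defaultdict


def dist_to(graph, dest):
    """Shortest distance from every router to dest (Dijkstra on reversed links)."""
    reverse = defaultdict(dict)
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            reverse[v][u] = w
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, w in reverse[node].items():
            if d + w < dist.get(nbr, float("inf")):
                dist[nbr] = d + w
                heapq.heappush(heap, (d + w, nbr))
    return dist


def link_loads(graph, demands):
    """demands: {(src, dst): volume}. Returns {(u, v): load} per directed link."""
    loads = defaultdict(float)
    by_dest = defaultdict(dict)
    for (src, dst), volume in demands.items():
        by_dest[dst][src] = volume
    for dst, sources in by_dest.items():
        dist = dist_to(graph, dst)
        inflow = defaultdict(float, sources)
        # Visit routers farthest from dst first, so each router's inflow
        # is complete before it is split over its shortest-path next hops.
        for u in sorted(dist, key=dist.get, reverse=True):
            if u == dst or inflow[u] == 0:
                continue
            hops = [v for v, w in graph[u].items()
                    if v in dist and dist[v] + w == dist[u]]
            for v in hops:
                loads[(u, v)] += inflow[u] / len(hops)
                inflow[v] += inflow[u] / len(hops)
    return dict(loads)


def max_utilization(loads, capacity):
    """Objective: the largest load-to-capacity ratio over all loaded links."""
    return max(load / capacity[link] for link, load in loads.items())


# Hypothetical diamond: two equal-cost paths from A to D.
GRAPH = {"A": {"B": 1, "C": 1}, "B": {"D": 1}, "C": {"D": 1}, "D": {}}
CAPACITY = {("A", "B"): 10, ("A", "C"): 10, ("B", "D"): 10, ("C", "D"): 10}

loads = link_loads(GRAPH, {("A", "D"): 10})
objective = max_utilization(loads, CAPACITY)
```

Swapping `max_utilization` for a sum of `exp(utilization)` over links would give the other objective named on the slide without changing the load computation.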
Weight Optimization Local search –Generate a candidate setting of the weights –Predict the resulting load on the network links –Compute the value of the objective function –Repeat, keeping solution with lowest objective function Efficient computation –Explore the “neighborhood” around good solutions –Exploit efficient incremental graph algorithms
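The search loop can be sketched generically. This is a plain hill-climbing variant under stated assumptions, not the algorithm used in the work described here: the "neighborhood" is a change to one link weight, the objective is passed in as a function, and the incremental graph algorithms mentioned on the slide are not attempted. The toy cost function stands in for a real route-prediction model and is purely hypothetical.

```python
import random


def local_search(weights, evaluate, values=range(1, 21), iterations=500, seed=0):
    """Repeatedly change one link weight and keep the change if it helps.

    weights:  {link: weight}, the starting configuration
    evaluate: function mapping a weight setting to an objective value
    """
    rng = random.Random(seed)
    best = dict(weights)
    best_cost = evaluate(best)
    links = sorted(best)
    for _ in range(iterations):
        candidate = dict(best)
        candidate[rng.choice(links)] = rng.choice(values)  # one-weight move
        cost = evaluate(candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


def toy_cost(weights):
    """Toy model: two parallel links of capacity 10 carrying a demand of 12.

    Equal weights split the demand evenly; otherwise one link carries it all.
    """
    demand, capacity = 12.0, 10.0
    if weights["top"] == weights["bottom"]:
        return (demand / 2) / capacity
    return demand / capacity


best_weights, best_cost = local_search({"top": 1, "bottom": 5}, toy_cost)
```

Restricting `values` to a small range mirrors the observation that a small number of integer weight values is sufficient in practice.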
Incorporating Operational Realities Minimize configuration changes –Changing just one or two link weights is often enough Tolerate equipment failures –Weight settings usually remain good after a failure –… or can be fixed by changing one or two weights Limit the number of weight values –A small number of integer values is sufficient Tolerate inaccuracy in the traffic demands –Good weights remain good after introducing random noise Limit frequency of link-weight changes –Joint optimization for day and night traffic matrices
Application to AT&T’s Backbone Network Performance of the optimized weights –Good solution within a few minutes –Much better than traditional heuristics –Competitive with multi-commodity flow solution How AT&T changes the link weights –Maintenance done from midnight to 6am –Predict effects of removing link(s) –Reoptimize the link weights to avoid congestion –Configure new weights before disabling equipment
Conclusions Our approach –Measure: network-wide view of traffic and routing –Model: data representations and “what-if” tools –Control: intelligent changes to operational network Application in AT&T’s network –Capacity planning –Customer acquisition –Preparing for maintenance activities –Comparing different routing protocols
Stepping Back: IP Network Management Lessons learned –Good: network-wide views, control and objectives –Bad: indirect control and non-real-time control Next steps: Routing Control Platform –Direct, real-time control over the routing –Network control entirely in the management system –Routers just forward packets and provide measurements –Initial prototype and results are very promising –Platform for incremental deployment of secure protocols
To Learn More… Traffic engineering overview –“Traffic engineering for IP networks” –“Traffic engineering with traditional IP routing protocols” Traffic measurement –“Measurement and analysis of IP network usage and behavior” –“Deriving traffic demands for operational IP networks” Route optimization –“Internet traffic engineering by optimizing OSPF weights” –“Optimizing OSPF/IS-IS weights in a changing world” Routing Control Platform –“The case for separating routing from routers” –“Design and implementation of a Routing Control Platform”