Rethinking Internet Traffic Management Using Optimization Theory

1 Rethinking Internet Traffic Management Using Optimization Theory
Jennifer Rexford, Princeton University
Joint work with Jiayue He, Martin Suchara, Ma'ayan Bresler, and Mung Chiang

2 Clean-Slate Network Architecture
More than designing a single protocol: the definition and placement of function.
Clean-slate design: without the constraints of today's artifacts, to build a stronger intellectual foundation and move beyond incremental fixes.
But how do we do clean-slate design?

3 Two Ways to View This Talk
A design process: based on optimization decomposition.
A new design: for traffic management.

4 Why Traffic Management?
Traffic management is important: it determines the traffic rate along each path, and it raises major resource-allocation issues (routing, congestion control, traffic engineering, ...).
There is some traction studying it mathematically: reverse engineering of TCP, redesigning protocols (e.g., TCP variants), and mathematical tools for tuning protocols.
But there is still no holistic view...

5 Traffic Management Today
Evolved organically without conscious design:
Operators: traffic engineering
Routers: routing protocols
Users: congestion control

If we consider traffic management to be the process of controlling both how much traffic enters the network and where that traffic goes, then today's traffic management system contains three components: routers, users, and operators. Routers forward the traffic that enters the network; users send traffic according to congestion-control algorithms; and operators monitor the network for congestion and shift traffic patterns if necessary. This whole process evolved organically. Routing has been there since day one of the Internet; congestion control was developed in the late 80s in response to congestion collapse; and traffic engineering only took off in the last ten years, when there was concern about a bandwidth shortage in the core. With the popularity of file sharing, video, gaming, and voice over IP, the network is getting congested again.

6 Shortcoming of Today’s Traffic Management
Protocol interactions are ignored: congestion control assumes routing is fixed, and TE assumes the traffic is inelastic.
Traffic engineering is inefficient: the link-weight tuning problem is NP-hard, and TE operates at the timescale of hours or days.
Only limited use of multiple paths.

The ad-hoc manner in which traffic management evolved means that congestion control assumes routing is fixed and traffic engineering assumes traffic is inelastic. In addition, the traffic-engineering problem is NP-hard and adapts much more slowly than traffic shifts. The key question is whether we would have done things differently if we had designed traffic management from scratch. The natural follow-up question is how. Optimization theory has shown a lot of promise for the analysis and design of network protocols. What would a clean-slate redesign look like?

7 Distributed Solutions
Top-down redesign: Problem Formulation → Optimization decomposition → Distributed Solutions → Compare using simulations → TRUMP algorithm → Translate into packet version → TRUMP Protocol

In this talk, we take a top-down approach and use optimization theory to guide protocol design. We start by selecting an intuitive objective function, then proceed to derive four distributed algorithms using known optimization techniques. Optimization theory guarantees that these algorithms converge to a stable and optimal point, while simulations allow us to compare their rates of convergence and their sensitivity to tunable parameters. Next, we combine the best features of each algorithm to construct a simple yet effective traffic-management algorithm. Finally, we flesh out details not defined by the mathematics to obtain a traffic-management protocol. While the mathematics guides certain protocol details and choices, it leaves many architectural choices open, which allows us to adapt the protocol to constraints imposed by other aspects of the overall architecture.

8 Congestion Control Implicitly Maximizes Aggregate User Utility
Source-destination pairs are indexed by i; R is the routing matrix; x_i is the source rate.

max  ∑_i U_i(x_i)        (aggregate utility)
s.t. ∑_i R_li x_i ≤ c_l   for every link l
var. x

Fair rate allocation amongst greedy users. The utility U_i(x_i) represents user satisfaction and the elasticity of the traffic.

We want to look in more detail at congestion control and traffic engineering today, to help guide the formulation of the problem we think the combined system should be solving. In the last ten years, many TCP variants have been reverse-engineered to show they implicitly solve the following problem: maximize the aggregate utility as a function of throughput x_i, subject to the capacity constraints. Here R_li is the routing matrix, where each entry represents the fraction of source i's traffic on link l; R_li is 0 if there is no traffic on link l from source i. The users do not control how the traffic is routed through the network, but congestion control regulates their throughput. In this talk, "source" really means a source-destination pair. The utility function reflects the happiness of the user: it is a concave function of throughput, so the user is quite unhappy when it gets no bandwidth, becomes much happier as it receives some, and past a certain point additional bandwidth does not make it any happier, since what it already has supports the applications it is running. Different shapes of the utility curve represent how elastic the traffic is, in other words how much it reacts to congestion in the network. Network-wide, congestion control is a way to provide fair rate allocation amongst users.
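For concreteness, one standard family of concave utilities used when reverse-engineering TCP-like protocols is the alpha-fair family below; this is an illustrative choice, not a specific assumption of the talk.

```latex
U_i(x_i) =
\begin{cases}
\log x_i, & \alpha = 1 \quad (\text{proportional fairness}) \\[4pt]
\dfrac{x_i^{\,1-\alpha}}{1-\alpha}, & \alpha > 0,\ \alpha \neq 1
\end{cases}
```

Larger alpha puts more weight on the worst-off flows; the curvature of U_i encodes how elastic the traffic is.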

9 Traffic Engineering Explicitly Minimizes Network Congestion
Links are indexed by l; u_l is the link utilization.

min  ∑_l f(u_l)             (aggregate congestion cost)
s.t. u_l = ∑_i R_li x_i / c_l
var. R

Avoids bottlenecks in the network. The cost function f(u_l) is a penalty for approaching capacity.

Now I will introduce a commonly used traffic-engineering model, which is an offline optimization imposed on the network. Note that this is a simplified abstraction of the real traffic-engineering problem, whose variables are link weights. The network is trying to minimize the total congestion cost as a function of link utilization. Here R is the variable and x is fixed. The cost function f is chosen by the operators: it is a convex function that gives an increasingly heavy penalty as link load approaches link capacity. The intuition behind this choice is twofold. First, it roughly models queuing delay; second, network operators do not care much when link utilization is low (anything below 50% doesn't make much difference), but they want to penalize solutions with many links at or near capacity. Network-wide, this allows traffic engineering to avoid bottlenecks and therefore be robust to high-volume traffic bursts.
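As an illustration of a convex cost that penalizes approaching capacity, one common choice in the TE literature is the M/M/1-style delay cost below (or a piecewise-linear approximation of it); the talk does not commit to a specific f.

```latex
f(u_l) \;=\; \frac{u_l}{1 - u_l}, \qquad 0 \le u_l < 1
```

This is nearly flat at low utilization and blows up as u_l approaches 1, matching both intuitions on the slide.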

10 A Balanced Objective:  max ∑_i U_i(x_i) − w ∑_l f(u_l),  with penalty weight w
Happy users: maximize throughput, but generate bottlenecks.
Happy operators: minimize delay, avoid bottlenecks.

For traffic management as a whole, we propose a balanced objective that captures the needs of users and operators alike. The optimization here is the design objective, driving our design of new protocols, similar to how TCP implicitly solves an optimization problem today. Operators also want to keep users happy, or they'll lose their clients! By including a penalty term for congestion cost, we avoid pushing the network to bottleneck solutions where, if a new flow arrives, a link is overloaded and the network immediately becomes congested. As mentioned earlier, f can also be interpreted as roughly modeling queuing delay, so the objective can also be seen as balancing throughput against queuing delay. Notice that we have inserted an additional penalty weight w. Depending on the network properties, an operator can adjust the value of w: if the traffic is fairly static, you might pick a smaller w and push the network closer to bottleneck solutions; if the traffic is very dynamic, you may choose to be more conservative in order to handle traffic bursts smoothly.

11 Distributed Solutions
Top-down redesign: Problem Formulation → Optimization decomposition → Distributed Solutions → Compare using simulations → TRUMP algorithm → Translate into packet version → TRUMP Protocol

Now that we have established an optimization problem, our next step is to solve it using optimization techniques. Here we will see how optimization theory can have practical implications for protocol properties. Optimization decomposition requires convexity.

12 Convex Problems are Easier to Solve
Non-convex versus convex functions.

Here we see a convex and a non-convex function. Convex functions have a global minimum, which makes them easy to solve: starting at any point on the curve and following the direction of descent, you end up at the global minimum. With a non-convex function, you can end up at a local minimum. We also require the constraint set to be convex so that the global minimum is actually included in the solution space. If the problem is convex, i.e. both the objective function and the constraint set are convex, then many optimization techniques can be applied. One such technique is decomposition, which is really just the process of breaking up one big problem into several smaller problems that can be solved locally. Optimization theory guarantees that, under certain conditions, the distributed solution converges to the global optimal solution.

Convex problems have a global minimum. Distributed solutions that converge to the global minimum can be derived using decomposition.
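For reference, the property being relied on is the textbook definition of convexity:

```latex
f\big(\theta x + (1-\theta)\,y\big) \;\le\; \theta f(x) + (1-\theta) f(y)
\qquad \forall\, x, y,\ \ \theta \in [0,1]
```

With a convex objective minimized over a convex constraint set, every local minimum is a global minimum, which is what makes gradient-style distributed updates safe to use.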

13 How to make our problem convex?
Single-path routing is non-convex; multipath routing with flexible splitting is convex.

max  ∑_i U_i(∑_j z_j^i) − w ∑_l f(u_l)
s.t. link load ≤ c_l on every link
var. path rates z        (i indexes the source-destination pair, j the path)

The path rate captures both the source rates and the routing.

Single-path routing, where all the traffic goes on one path, is not convex and is thus a difficult optimization problem; but if we allow multiple paths and flexible splitting between them, the optimization problem is convex. The problem is then formulated in terms of path rates z, where i indexes the source-destination pair and j indexes the paths. In the Abilene topology, for example, the source-destination pair 9-6, indexed as pair 1, has three paths, indexed 1, 2, and 3, with path rates z_1^1, z_2^1, and z_3^1. The path rates simultaneously capture the amount of traffic flowing between each source-destination pair and the split of that traffic across the multiple paths. By controlling path rates directly, the system also shifts from hop-by-hop routing to per-path routing.
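Spelling out the "link load ≤ c_l" constraint in full; the path-incidence notation H below is shorthand introduced here for illustration (H^i_jl = 1 if path j of pair i uses link l), not notation from the talk.

```latex
\begin{aligned}
\max_{z \ge 0}\quad & \sum_i U_i\Big(\sum_j z_j^i\Big) \;-\; w \sum_l f(u_l) \\
\text{s.t.}\quad & \sum_i \sum_j H^i_{jl}\, z_j^i \;\le\; c_l \quad \forall l,
\qquad u_l \;=\; \frac{1}{c_l}\sum_i \sum_j H^i_{jl}\, z_j^i
\end{aligned}
\end{aligned}
```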

14 Overview of Distributed Solutions
Operators: tune w, U, f, and other parameters.
Routers: set up multiple paths, measure link load, update link prices s.
Edge nodes: update path rates z, rate-limit incoming traffic.

Decomposition is the process of breaking up the original problem into subproblems that can be solved locally at links and sources. By sources, we can mean end hosts or edge routers; here we show them abstractly as edge nodes and will discuss possible architectural choices later in the talk. I will present four different decompositions, but at a high level they are more similar than they are different. In all solutions, link prices are computed locally to reflect the level of congestion, based on locally measured link loads. What are link prices? They are not monetary values. In today's congestion control, the link price update is analogous to active queue management: packets are dropped or marked to signal congestion, and link prices are just another, explicit way to mark packets. The link prices are then fed back to the edge nodes, which use them to compute their path rates. The feedback can be carried by explicit probing or piggybacked on acknowledgement packets. I will get into the details of how s and z are computed in a few slides. Finally, operators now just tune w, at a much longer timescale than they tune link weights today.

15 Optimization Decomposition
Deriving prices and path rates:
Prices: penalties for violating a constraint.
Path rates: updates driven by the penalties.
Example: TCP congestion control, where link prices are packet loss or delay and source rates follow AIMD based on those prices (a minimal sketch follows).
Our problem is more complicated: a more complex objective, and multiple paths.
("Primal" means: contained in the original problem.)
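As a concrete reminder of the TCP example, here is a minimal sketch of an AIMD-style source update driven by a binary congestion signal (loss or a marked packet standing in for the "price"); the constants are illustrative and do not correspond to any particular TCP variant.

```python
def aimd_update(rate, congestion_signal, alpha=1.0, beta=0.5):
    """One AIMD step: additive increase of the sending rate while the
    feedback 'price' (loss or delay) signals no congestion, multiplicative
    decrease when it does. alpha and beta are illustrative values."""
    if congestion_signal:
        return rate * beta      # multiplicative decrease
    return rate + alpha         # additive increase (per RTT)

# Toy trace: the rate follows the familiar TCP-like sawtooth.
rate = 10.0
for signal in [False, False, False, True, False, False]:
    rate = aimd_update(rate, signal)
    print(rate)
```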

16 Effective Capacity (Links)
Rewrite the capacity constraint:   link load = y_l and y_l ≤ c_l   (instead of link load ≤ c_l)
Subgradient feedback price update:  s_l(t+1) = [ s_l(t) − stepsize · (y_l − link load) ]+
The stepsize controls the granularity of reaction and is a tunable parameter.
Effective capacity keeps the system robust.

By rewriting the capacity constraint, we introduce a new variable, the effective capacity y. Effective capacity is an important concept for the first three decompositions. In all three cases, the feedback price is updated as a subgradient update: it depends on its old value at time t and on the difference between the current link load and the effective capacity. By having the feedback depend on effective capacity rather than actual capacity, this behaves like ECN, giving advance warning that a link is filling up. The stepsize controls how much the new feedback price depends on changes in link load. If the stepsize is too big, the system may react too aggressively and oscillate; if it is too small, the system may adapt too slowly to changes in the network and take a very long time to converge. The stepsize is another parameter an operator can set, and it is a feature of any subgradient update.
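A minimal sketch of the subgradient price update from this slide, assuming each link can measure its own load once per iteration; the variable names are illustrative.

```python
def price_update(price, effective_capacity, link_load, stepsize):
    """Subgradient update of the feedback price s_l from the slide:
    s_l(t+1) = [ s_l(t) - stepsize * (y_l - link_load) ]+ .
    The price rises while load exceeds the effective capacity y_l and
    decays (down to zero) otherwise."""
    return max(0.0, price - stepsize * (effective_capacity - link_load))

# Toy trace: a link loaded above y_l = 1.0 drives the price up,
# and an underloaded link lets it decay back toward zero.
s = 0.0
for load in [1.2, 1.2, 0.8, 0.8, 0.8]:
    s = price_update(s, effective_capacity=1.0, link_load=load, stepsize=0.5)
    print(round(s, 2))
```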

17 Key Architectural Principles
Effective capacity: advance warning of impending congestion; simulates the link running at lower capacity and gives feedback based on that; dynamically updated.
Consistency price: allows some packet loss; allows some overshooting in exchange for faster convergence.

18 Four Decompositions - Differences
The four decompositions differ in how the link and source variables are updated:

Algorithm      | Features                                | Tunable parameters
Partial-dual   | Effective capacity                      | 1
Primal-dual    |                                         | 3
Full-dual      | Effective capacity, allows packet loss  | 2
Primal-driven  | Direct s update                         |

To summarize, the four algorithms have different numbers of variables and tunable parameters. Generally, the fewer the tunable parameters, the less work there is for operators. The iterative updates contain parameters, and these affect the dynamics of the distributed algorithms.

19 Distributed Solutions
Top-down redesign: Problem Formulation → Optimization decomposition → Distributed Solutions → Compare using simulations → TRUMP algorithm → Translate into packet version → Final Protocol

Optimization theory guarantees that, for a diminishing stepsize, these algorithms converge to the global optimum. However, a diminishing stepsize is unrealistic in practice, as it becomes unclear how new flows are handled. Constant stepsizes are much more practical, but the known theoretical bounds on convergence rate are very loose. Thus we compare our four distributed solutions using numerical experiments. The rest of the talk moves further and further away from the math and relies more on human intuition. Optimization doesn't answer all the questions.

20 Evaluating Four Decompositions
Theoretical results and limitations:
All four are proven to converge to the global optimum for well-chosen parameters.
There is no guidance for choosing those parameters.
Only loose bounds exist for the rate of convergence.

So we sweep a large parameter space in MATLAB (a schematic of the sweep follows):
Effect of w on convergence.
Compare rates of convergence.
Compare sensitivity to parameters.
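Schematically, the MATLAB experiments are a parameter sweep of the following shape; the grids are illustrative, and `iterations_to_convergence` is a placeholder standing in for actually running one of the four decompositions.

```python
import itertools

# Illustrative grids; the talk sweeps a large space of w and stepsize values.
w_grid        = [0.01, 0.1, 0.3, 1.0, 3.0]
stepsize_grid = [1e-3, 1e-2, 1e-1, 1.0]

def iterations_to_convergence(algorithm, w, stepsize):
    """Placeholder for one experiment: run the chosen decomposition with
    this (w, stepsize) on a topology and return the iteration count at
    which the path rates stop changing (or None if it never converges)."""
    ...

results = {
    (w, step): iterations_to_convergence("partial-dual", w, step)
    for w, step in itertools.product(w_grid, stepsize_grid)
}
# From such sweeps one reads off the fastest achievable convergence and the
# width of the range of "good" stepsizes (parameter sensitivity).
```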

21 Effect of Penalty Weight w
[Plot: percentage of the maximum achievable utility versus the penalty weight w, with user utility dominating at small w and the operator penalty at large w; the exact values are topology dependent.]

All four algorithms converge with well-chosen tunable parameters. Here we plot the percentage of the maximum achievable utility (ignoring any penalty on the links) against w. When w is small, the converged solution achieves very high aggregate utility, since the system is pushed close to bottleneck solutions. When w is big, the converged solution achieves lower aggregate utility because the operator's penalty function kicks in. Note that from w = 0.01 to w = 0.3 the overall utility achieved is quite high. While the specific values depend somewhat on the topology, the overall trend is the same. The takeaway: high aggregate utility can be achieved for a range of w.

22 Convergence Properties: Partial Dual
[Plot: iterations to convergence versus stepsize for the partial-dual algorithm; crosses are actual values for different capacity distributions, circles are averages; the best achievable rate and the width of the curve (parameter sensitivity) are marked.]

Here I am showing one of many plots of this nature. The y-axis is the number of iterations to convergence, where each iteration can be thought of as on the order of an RTT, plotted against the stepsize on the x-axis. The first thing to note is that the tunable parameters do affect convergence; otherwise you would see a flat line. The crosses are different capacity distributions and the circles are the average values. We quantify two things: the fastest convergence rate possible, and the sensitivity of the tunable parameter, measured by the width of the curve. Takeaway: tunable parameters impact convergence time.

23 Convergence Properties (MATLAB)
Parameter sensitivity is correlated with the rate of convergence.

Algorithm                      | Convergence properties
All                            | Converge more slowly for small w
Partial-dual vs. primal-dual   | Extra parameters do not improve convergence
Full-dual                      | Allowing some packet loss may improve convergence
Primal-driven                  | Direct updates converge faster than iterative updates

We generated many plots like the one on the previous slide; rather than showing them all, here is a summary of our findings. Extra tunable parameters expand the search space and make finding the best parameter setting more difficult, but do not actually improve the convergence rates. The consistency price can help by relaxing a secondary constraint and allowing more freedom. The local optimization is usually faster because it can take bigger steps at the beginning and smaller steps as it approaches the solution, unlike the subgradient update, whose step is similar each time, forcing the choice of a small step even at the beginning. Finally, all algorithms converge more slowly for smaller w: as the network is pushed toward bottleneck solutions, it becomes easier to overshoot and exceed capacity, forcing smaller stepsizes for the algorithm to converge at all, and thus longer convergence times.

24 TRUMP: TRaffic-management Using Multipath Protocol
Insights from the simulations:
Have as few tunable parameters as possible.
Use direct updates where possible.
Allow some packet loss.

We cherry-pick the best parts of the previous algorithms to construct TRUMP, which has a single tunable parameter.

25 TRUMP Algorithm
Link l:
  loss price           p_l(t+1) = [ p_l(t) − stepsize · (c_l − link load) ]+
  queuing-delay price  q_l(t+1) = w · f'(u_l)
  price for path j:    ∑_{l on path j} (p_l + q_l)
Source i:
  path rate  z_j^i(t+1) = argmax { U_i(∑_k z_k^i) − z_j^i · (price for path j) }
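A minimal sketch of one TRUMP iteration. Two assumptions are made purely for illustration: the cost derivative uses f(u) = u/(1−u), and the source utility is logarithmic so the local optimization has a closed form; neither choice is dictated by the protocol.

```python
def trump_link_update(p, link_load, capacity, utilization, w, stepsize):
    """One TRUMP link iteration: the 'loss' price p_l is a subgradient step
    on the capacity constraint; the 'queuing delay' price q_l = w * f'(u_l).
    The derivative below assumes f(u) = u / (1 - u) (illustrative, u < 1)."""
    p_new = max(0.0, p - stepsize * (capacity - link_load))
    q_new = w * (1.0 / (1.0 - utilization) ** 2)
    return p_new, q_new

def trump_path_rate(prices_on_path, rate_on_other_paths):
    """Source-side update for one path of pair i, assuming U_i = log
    (an illustrative concave utility): z_j maximizes
    log(z_j + rest) - z_j * path_price, which has the closed form below."""
    path_price = max(sum(p + q for p, q in prices_on_path), 1e-9)
    return max(0.0, 1.0 / path_price - rate_on_other_paths)
```

With the log utility the source update really is just a function evaluation, which is the property the later slides rely on.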

26 TRUMP versus Other Algorithms
TRUMP is not another decomposition. We can prove convergence, but only under more restrictive conditions.
From MATLAB: a faster rate of convergence, and an easy-to-tune parameter.

27 Distributed Solutions
Top-down redesign: Problem Formulation → Optimization decomposition → Distributed Solutions → Compare using simulations → TRUMP algorithm → Translate into packet version → TRUMP Protocol

Because of the original simplifying assumptions (so far we assumed a fluid model and constant feedback delay), the algorithm itself does not specify what happens when new flows join, how to deal with real packets, and so on. Using past protocol designs as guidance, we add the necessary details to go from TRUMP the algorithm to TRUMP the protocol.

28 TRUMP: Packet-Based Version
Link l: link load = (bytes seen in period T) / T; link prices are updated every T. Arrivals and departures of flows are signalled implicitly through price changes.
Source i: path rates are updated every max_j {RTT_j^i}.

In the TRUMP algorithm there are two link prices, one taken from the full-dual and the other from the primal-driven algorithm. They are summed and fed back to the edge node, which updates the path rates in the same manner as the dual algorithms. The local optimization at the sources is just a function evaluation.
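In packet terms, the link side reduces to roughly the sketch below (byte counting over the period T as described on the slide; the class and method names are illustrative).

```python
class TrumpLink:
    """Schematic link side of the packet-based protocol: count bytes over a
    period T, convert them into a load estimate, and refresh the loss price."""

    def __init__(self, capacity_bps, period_T):
        self.capacity = capacity_bps
        self.T = period_T
        self.bytes_seen = 0
        self.p = 0.0                 # loss price, carried across periods

    def on_packet(self, size_bytes):
        self.bytes_seen += size_bytes

    def on_period_end(self, stepsize):
        load = 8 * self.bytes_seen / self.T   # bits per second this period
        self.p = max(0.0, self.p - stepsize * (self.capacity - load))
        self.bytes_seen = 0
        return self.p   # fed back to edge nodes; flow arrivals and departures
                        # show up only as changes in this price
```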

29 Packet-level Experiments (NS-2)
Set-up: realistic topologies and delays of large ISPs; selected flows and paths; a realistic ON-OFF traffic model.
Questions: Do the MATLAB results still hold? Does TRUMP react quickly to link dynamics? Can TRUMP handle ON-OFF flows?

30 TRUMP versus Partial dual
TRUMP trumps partial dual for w=1

31 TRUMP versus Partial dual
TRUMP trumps partial dual for w=1/3

32 TRUMP Link Dynamics
Under a link failure or recovery, TRUMP reacts quickly to link dynamics.

33 TRUMP versus File Size
Worse for smaller files, but still faster than TCP. TRUMP's performance is independent of the variance of file sizes.

34 Summary of TRUMP Properties
Property                      | TRUMP
Tuning parameters             | One easy-to-tune parameter; only needs tuning for small w
Robustness to link dynamics   | Reacts quickly to link failures and recoveries
Robustness to flow dynamics   | Independent of the variance of file sizes; more efficient for larger files
General                       | Trumps the other algorithms
Feedback                      | Possible with implicit feedback

35 Division of Functionality
            | Today                   | TRUMP
Operators   | Tune link weights       | Set penalty function, (set up multipath), tune w & stepsize
Sources     | Adapt source rates      | Adapt path rates
Routers     | Shortest-path routing   | (Compute prices)

Open questions: Are sources end hosts or edge routers? Is feedback implicit or explicit? Is computation centralized or distributed?

Taking a big step back, we see that compared to today, TRUMP changes the division of labor. The operators do much less: link weights have disappeared entirely, and they just tune w and possibly set up multiple paths. There are many ways to set up multiple paths, summarized in a survey paper we wrote, so I won't go into them here. The sources now adapt path rates rather than source rates, and routers may compute and send explicit feedback along the paths. As hinted earlier, mathematics can only answer the questions that have been placed into the model. While certain protocol properties, like multipath, fall right out of the math, many open questions remain for the rest of the architecture, which leaves leeway to go one way or the other depending on the constraints imposed elsewhere. For example, the easiest scenario to imagine is to run TRUMP on edge routers and rate-limit the incoming traffic; this scales nicely by aggregating user traffic and has the advantage that the explicit feedback is trustworthy. However, one can also consider an implicit version of TRUMP, where the prices are interpreted loosely as packet loss and delay; in that case it is more feasible to think of the sources as end hosts. Mathematics leaves the architecture questions open.

36 Conclusions
Contributions: design with multiple decompositions; the new TRUMP traffic-management protocol.
Extensions to TRUMP: implicit feedback based on loss and delay; an interdomain realization of the protocol.

Our two contributions are: first, designing with multiple decompositions and bridging the gaps not filled by optimization theory; second, TRUMP itself. Optimization leaves us some flexibility, and that's a good thing.

37 Ongoing Work: Multiple Traffic Classes
Different applications have different requirements: throughput-sensitive traffic (file transfers) and delay-sensitive traffic (VoIP and gaming).
Optimization formulation: a weighted sum of the two objectives, with per-class variables for routes and rates (a notational sketch follows).
Decompose into two subproblems: two virtual networks with custom protocols, and a simple dynamic update to the bandwidth shares.
A theoretical foundation for adaptive network virtualization.
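One possible way to write the "weighted sum of the two objectives" down; the class superscripts, the weights, and the per-class penalty functions here are illustrative shorthand, not the formulation from the ongoing work itself.

```latex
\max_{z^{(1)},\,z^{(2)} \ge 0}\;
\alpha\Big(\sum_i U^{(1)}_i\big(x^{(1)}_i\big) - w_1\sum_l f_1\big(u^{(1)}_l\big)\Big)
+(1-\alpha)\Big(\sum_i U^{(2)}_i\big(x^{(2)}_i\big) - w_2\sum_l f_2\big(u^{(2)}_l\big)\Big)
```

with the two classes sharing each link's capacity; decomposing over the classes then yields one subproblem (one virtual network) per class, coupled only through the bandwidth shares.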

