
Traffic Matrix Estimation: Existing Techniques and New Directions A. Medina (Sprint Labs, Boston University), N. Taft (Sprint Labs), K. Salamatian (University.


1 Traffic Matrix Estimation: Existing Techniques and New Directions A. Medina (Sprint Labs, Boston University), N. Taft (Sprint Labs), K. Salamatian (University of Paris VI), S. Bhattacharyya, C. Diot (Sprint Labs) Presented by Matthew Caesar

2 Problem scope
Environment:
– Single ISP, provides SLAs to customers
Goal: Estimate traffic matrix
– Amount of traffic flowing between each (origin, destination) pair
– Hard to measure exactly (requires extensive logging and/or offline parsing)
Why would we want to know the traffic matrix?
– Helps determine load balancing, routing protocol configuration, dimensioning, provisioning, failover strategies
– Allows quantification of the cost of providing QoS vs. overprovisioning

3 Solution idea
Main idea:
– Measure utilization (“link count”) on each network link
    Can be easily done in router fast path
    Done via SNMP query
– Find a set of OD flows that would produce the measured link counts
Sticky issue: how to find the set of OD flows?
– Three techniques:
    Linear Programming (LP)
    Bayesian estimation
    Expectation Maximization (EM)

4 Traffic Estimation
Assumptions: can be operator’s knowledge (e.g. maybe some pairs are always zero)
Prior TM: sometimes need a seed TM to start with
Routing Matrix
Link counts (link utilizations)

5 Problem setup See whiteboard
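
The whiteboard material is not part of this transcript. A common way to state the problem, assumed here rather than taken from the talk, is y = A x: y holds the measured link counts, x the unknown OD flows, and A is the routing matrix with A[l][p] = 1 when OD pair p crosses link l. The sketch below uses a hypothetical 3-node path topology (A - B - C) to show why the problem is hard: two different traffic matrices produce identical link counts.

```python
# Hypothetical 3-node path topology A - B - C with two links.
# The topology and the flow values are assumptions for illustration only.
LINKS = ["A-B", "B-C"]
OD_PAIRS = [("A", "B"), ("B", "C"), ("A", "C")]

# Routing matrix: one row per link, one column per OD pair.
ROUTING = [
    [1, 0, 1],  # link A-B carries A->B and A->C
    [0, 1, 1],  # link B-C carries B->C and A->C
]


def link_counts(routing, od_flows):
    """Compute y = A x: the load each link sees for the given OD flows."""
    return [sum(a * x for a, x in zip(row, od_flows)) for row in routing]


tm_one = [10, 8, 0]  # no end-to-end A->C traffic
tm_two = [6, 4, 4]   # some A->C traffic, less local traffic

print(link_counts(ROUTING, tm_one))  # [10, 8]
print(link_counts(ROUTING, tm_two))  # [10, 8]
```

With more OD pairs than links, the link counts alone cannot single out one traffic matrix, which is why each scheme needs extra assumptions or a prior.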

6 Scheme #1: Linear Programming (LP)
Linear program:
– Objective function + constraints
Main idea:
– Try to maximize the total amount of traffic routed through the network
– Given constraints:
    Total traffic on a link must not exceed the measured link count
    Flow conservation
Observations:
– Leads to solutions where OD pairs with few intermediate hops are assigned large amounts of bandwidth, while more distant pairs get much less
– Solution: put more weight on pairs separated by greater distances
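
The observation about short versus distant pairs can be reproduced on a toy instance. The sketch below is not an LP solver: it brute-forces integer flows on a hypothetical path topology A - B - C and keeps the best value of a weighted objective, subject to the link-count constraints. The topology, the link counts, and the weights are all assumptions made for illustration.

```python
from itertools import product


def best_flows(link_counts, weights):
    """Exhaustively search integer OD flows on the path A - B - C.

    OD pair A->B uses link A-B, B->C uses link B-C, A->C uses both.
    Returns (objective value, (x_ab, x_bc, x_ac)) for the best feasible point.
    """
    y_ab, y_bc = link_counts
    w_ab, w_bc, w_ac = weights
    best = None
    candidates = product(range(y_ab + 1), range(y_bc + 1),
                         range(min(y_ab, y_bc) + 1))
    for x_ab, x_bc, x_ac in candidates:
        # Constraint: traffic on each link must not exceed its link count.
        if x_ab + x_ac > y_ab or x_bc + x_ac > y_bc:
            continue
        value = w_ab * x_ab + w_bc * x_bc + w_ac * x_ac
        if best is None or value > best[0]:
            best = (value, (x_ab, x_bc, x_ac))
    return best


# Equal weights: the two-hop pair A->C is starved.
print(best_flows((10, 8), (1, 1, 1)))  # (18, (10, 8, 0))

# Heavier weight on the distant pair: A->C now gets traffic.
print(best_flows((10, 8), (1, 1, 3)))  # (26, (2, 0, 8))
```

With equal weights, a unit of A->C traffic consumes capacity on two links but adds only one unit to the objective, so the maximum assigns it zero. Raising its weight flips the result, which matches the fix suggested on the slide.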

7 Scheme #2: Bayesian Inference See whiteboard

8 Scheme #3: Expectation Maximization (EM) See whiteboard
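
The whiteboard derivation is not in the transcript, so the exact EM scheme evaluated in the talk is unknown here. As a stand-in, the sketch below uses the classical multiplicative EM update for a Poisson model of a linear system y = A x (the Shepp-Vardi style iteration), applied to a hypothetical 3-node path topology. The topology, link counts, and priors are assumptions. It also illustrates how the starting (prior) matrix shapes the answer.

```python
# Hypothetical path topology A - B - C; columns are OD pairs A->B, B->C, A->C.
ROUTING = [
    [1, 0, 1],  # link A-B
    [0, 1, 1],  # link B-C
]


def em_poisson(routing, counts, prior, iterations=500):
    """Multiplicative EM update for a Poisson model of y = A x.

    Assumes a strictly positive prior, so predicted link loads stay nonzero.
    """
    n_links, n_pairs = len(routing), len(prior)
    col_sums = [sum(routing[l][p] for l in range(n_links))
                for p in range(n_pairs)]
    x = list(prior)
    for _ in range(iterations):
        # Link loads predicted by the current estimate.
        predicted = [sum(a * v for a, v in zip(row, x)) for row in routing]
        # Ratio of measured to predicted load on each link.
        ratios = [c / p for c, p in zip(counts, predicted)]
        # Scale each OD flow by the average ratio over the links it uses.
        x = [x[p] / col_sums[p]
             * sum(routing[l][p] * ratios[l] for l in range(n_links))
             for p in range(n_pairs)]
    return x


counts = [10, 8]
flat_prior = em_poisson(ROUTING, counts, [1.0, 1.0, 1.0])
skewed_prior = em_poisson(ROUTING, counts, [1.0, 1.0, 10.0])
print(flat_prior)
print(skewed_prior)
```

Both runs reproduce the link counts, yet they settle on different OD flows: the prior that favors A->C ends up with far more A->C traffic. The link counts constrain the estimate, while the prior decides which of the consistent solutions is reached.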

9 Evaluation Method
Impossible to obtain “real” traffic matrix via direct measurement.
– Therefore, use simulations
How to characterize flow between OD pairs?
– Tried Constant, Poisson, Gaussian, Uniform and Bimodal (flash crowd) TMs
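
A minimal sketch of how such synthetic traffic matrices might be generated is shown below. The talk does not give the distribution parameters, so the mean, the spreads, the flash-crowd probability, and the function names are all hypothetical choices.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw one Poisson variate with Knuth's method (fine for small means)."""
    limit = math.exp(-mean)
    count, running = 0, rng.random()
    while running > limit:
        count += 1
        running *= rng.random()
    return count


def synthetic_tm(n_nodes, model, mean=20.0, seed=0):
    """Build an n x n traffic matrix with a zero diagonal.

    All parameters below are illustrative assumptions, not values from the talk.
    """
    rng = random.Random(seed)

    def draw():
        if model == "constant":
            return mean
        if model == "poisson":
            return float(poisson_sample(rng, mean))
        if model == "gaussian":
            return max(0.0, rng.gauss(mean, mean / 4))
        if model == "uniform":
            return rng.uniform(0.0, 2 * mean)
        if model == "bimodal":
            # Most pairs are ordinary; a few see a flash-crowd level of demand.
            center = 5 * mean if rng.random() < 0.2 else mean
            return max(0.0, rng.gauss(center, mean / 4))
        raise ValueError(f"unknown model: {model}")

    return [[0.0 if i == j else draw() for j in range(n_nodes)]
            for i in range(n_nodes)]


for name in ("constant", "poisson", "gaussian", "uniform", "bimodal"):
    print(name, synthetic_tm(3, name)[0])
```

Feeding each generated matrix through the routing matrix gives the link counts an estimator would see, and the generated matrix itself serves as ground truth for measuring error.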

10 Results: Linear programming vs. Statistical methods
Linear programming method performs poorly
– Assigns zero to many OD pairs, increasing error
– Problem: tries to match OD pairs to link counts
– Different objective functions give similar results
– Error too high for use in practical networks
Bayesian and EM:
– EM beats Bayesian in terms of average error and worst-case error
– Estimation errors are correlated with heavily shared links (links with many OD flows are more likely to be mis-estimated)

11 Results: Goodness of prior
Goodness of prior matrix (seed values)
– Bayesian is much more sensitive to the prior matrix than EM
    However, EM is also quite sensitive
    Perhaps because: EM method has deterministic convergence behavior (can be analyzed) while Bayesian has stochastic convergence (it oscillates)
– After a certain point, additional measurements don’t provide additional gain
    Measuring over long periods of time only gives small additional improvement

12 Results: Marginal gains
What improvement could be gained if we could measure some components of the traffic matrix directly?
– Carrier may have the option to deploy a certain amount of monitoring equipment
3 ways to add rows:
– Randomly, row-sum (by traffic volume), and error magnitude
Results:
– Error rate drops off roughly linearly with each additional row added
– Bayesian is not sensitive to the order in which rows are added
– EM does better when rows are added largest-error first
– Reduction from adding a row is 2% for 13 OD pairs

13 Other results
Which OD pairs are most difficult to estimate?
– Error increases as the link-sharing factor increases, and also as path length increases
How to characterize OD flows?
– Poisson and Gaussian assumptions hold well, but only for certain hours during the day.

14 Recommendations
Network operators know a lot about their network. We need to devise methods to allow incorporation of network-specific information into the estimation scheme.
We need a better model of OD flows through an ISP.
– Possible solution: “gravity models” based on utility factor (see whiteboard)
We need a good way to generate good prior TMs.
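
The "utility factor" variant mentioned on the slide was on the whiteboard and is not in the transcript. For orientation, the sketch below shows only the simple gravity model, in which traffic from i to j is proportional to the total leaving i times the total entering j. The node totals are made-up numbers, and the function name is hypothetical.

```python
def gravity_tm(out_totals, in_totals):
    """Simple gravity model: x[i][j] = out[i] * in[j] / total traffic.

    Assumes sum(out_totals) == sum(in_totals). This basic form does not
    zero the diagonal, so it also assigns some traffic from a node to itself.
    """
    total = sum(in_totals)
    return [[out * inn / total for inn in in_totals] for out in out_totals]


# Hypothetical per-node totals (traffic units sent and received).
out_totals = [30, 50, 20]
in_totals = [40, 40, 20]

prior = gravity_tm(out_totals, in_totals)
for row in prior:
    print(row)
```

Each row of the result sums to the node's outgoing total and each column to its incoming total, so a matrix like this could serve as a seed TM built from per-node measurements alone.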


17 References:
Statistical Inference:
http://ic.arc.nasa.gov/ic/projects/bayes-group/html/bayes-theorem-long.html
http://www.math.uah.edu/stat/prob/prob5.html
http://www.statisticalengineering.com/bayes_thinking.htm
http://www.stat.psu.edu/~jls/stat544/2001/lec22.pdf
http://www-eksl.cs.umass.edu/library/Statistics/Expectation-Maximization/
http://www.owlnet.rice.edu/~msmiley/elec431/em.htm
Traffic Matrix Estimation:
http://dimacs.rutgers.edu/Workshops/MiningTutorial/grossglauser-slides.ppt

