Trajectory Sampling for Direct Traffic Observation Matthias Grossglauser joint work with Nick Duffield AT&T Labs – Research.

Traffic Engineering [figure: overload; two large flows]

Traffic Engineering [figure: overload; new egress point for first flow; multi-homed customer]

Traffic Engineering [figure: overload; OSPF shortest path splitting]

Traffic Engineering
Goal: domain-wide control & management to
  – Satisfy performance goals
  – Use resources efficiently
Knobs:
  – Configuration & topology: provisioning, capacity planning
  – Routing: OSPF weights, MPLS tunnels, BGP policies, …
  – Traffic classification (diffserv), admission control, …
Measurements are key: closed control loop
  – Characterize demand: what’s coming in?
  – Observe network state: how is the network reacting? (low-level adaptivity!)
  – Check performance: what’s the customer’s QoS?

Traffic Matrix vs. Path Matrix
Traffic matrix
  – # bytes from ingress i to egress j
Path matrix
  – Spatial flow of traffic through domain
  – # bytes for every path from i to j
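
To make the distinction concrete, here is a minimal sketch that builds both matrices from the same observations. The input format (a path as a tuple of node names plus a byte count) and the function names are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict

def path_matrix(observations):
    """Sum bytes per full path; a path is a tuple of node names."""
    totals = defaultdict(int)
    for path, nbytes in observations:
        totals[tuple(path)] += nbytes
    return dict(totals)

def traffic_matrix(observations):
    """Sum bytes per (ingress, egress) pair, discarding the interior of each path."""
    totals = defaultdict(int)
    for path, nbytes in observations:
        totals[(path[0], path[-1])] += nbytes
    return dict(totals)

# Hypothetical observations: two different paths between the same ingress and egress.
observations = [
    (("A", "B", "D"), 1000),
    (("A", "C", "D"), 500),
    (("A", "B", "D"), 200),
]
pm = path_matrix(observations)
tm = traffic_matrix(observations)
print(pm)
print(tm)
```

The traffic matrix collapses both paths into a single A-to-D entry, while the path matrix keeps them apart; that lost detail is what has to be inferred when only the traffic matrix is measured.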

Flow Measurement
[figure labels: flow 1, flow 2, flow 3, flow 4]
IP flow abstraction
  – Set of packets with “same” src and dest IP addresses
  – Packets that are “close” together in time (a few seconds)
Cisco NetFlow
  – Router maintains a cache of statistics about active flows
  – Router exports a measurement record for each flow
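
The flow abstraction can be sketched as a small cache keyed by source and destination address. This is a simplified illustration, not NetFlow's actual implementation: the record fields, the 5-second timeout, and the function name `aggregate_flows` are all assumptions.

```python
def aggregate_flows(packets, timeout=5.0):
    """Group time-ordered packets into flow records.

    packets: iterable of (timestamp, src, dst, nbytes), sorted by timestamp.
    A packet joins the active flow for its (src, dst) key if it arrives
    within `timeout` seconds of that flow's last packet; otherwise the old
    record is exported and a new flow starts.
    """
    active = {}    # (src, dst) -> mutable flow record
    exported = []
    for ts, src, dst, nbytes in packets:
        key = (src, dst)
        rec = active.get(key)
        if rec is not None and ts - rec["last"] > timeout:
            exported.append(rec)   # flow went idle: export its record
            rec = None
        if rec is None:
            rec = {"src": src, "dst": dst, "first": ts, "last": ts,
                   "packets": 0, "bytes": 0}
            active[key] = rec
        rec["last"] = ts
        rec["packets"] += 1
        rec["bytes"] += nbytes
    exported.extend(active.values())  # flush flows still in the cache
    return exported

# Hypothetical packet trace: the last packet arrives after a long gap.
packets = [
    (0.0, "10.0.0.1", "10.0.1.1", 100),
    (1.0, "10.0.0.1", "10.0.1.1", 200),
    (2.0, "10.0.0.2", "10.0.1.1", 50),
    (20.0, "10.0.0.1", "10.0.1.1", 300),
]
flows = aggregate_flows(packets)
for rec in flows:
    print(rec)
```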

Inferring the Path Matrix from the Traffic Matrix

Network State Uncertainty
Hard to get an up-to-date snapshot of…
…routing
  – Large state space
  – Vendor-specific implementation
  – Deliberate randomness
  – Multicast
…element states
  – Links, cards, protocols, …
…element performance
  – Packet loss, delay at links

[figure labels: missing “down” alarms; spurious down; noise; missing alarms]

Direct Traffic Observation
Goal: direct observation
  – No network model & state estimation
Basic idea:
  – Sample packets at each link
  – Sampling decision based on hash over packet content
  – Consistent sampling → trajectories
  – Labels based on second hash function
Exploit entropy in packet content to obtain statistically representative set of trajectories

Sampling and Labeling
Fields of interest collected only once
Multicast: trajectory is a tree
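
Because every link makes the same sampling decision and computes the same label for a given packet, a collector can rebuild a trajectory by grouping reports on the label. The sketch below assumes each report is a hypothetical `(link, label)` pair and ignores label collisions and timing.

```python
from collections import defaultdict

def group_by_label(reports):
    """Collect, per label, the set of links that reported a sampled packet."""
    links_by_label = defaultdict(set)
    for link, label in reports:
        links_by_label[label].add(link)
    return dict(links_by_label)

# Hypothetical reports: label 17 follows a unicast path A-B-D,
# label 42 branches at C, as a multicast packet would.
reports = [
    (("A", "B"), 17), (("B", "D"), 17),
    (("A", "C"), 42), (("C", "D"), 42), (("C", "E"), 42),
]
trajectories = group_by_label(reports)
for label, links in sorted(trajectories.items()):
    print(label, sorted(links))
```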

Fields Included in Hashes

Collisions: Identical Packets

Sampling and Labeling Hashes
x: subset of packet bits, represented as binary number
Sampling hash
  – h(x) = x mod A
  – Sample if h(x) < r
  – r/A: thinning factor
Labeling hash
  – g(x) = x mod M
Make appropriate choice of A, M
  – predictable patterns should “mix” well
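
The two hashes translate almost directly into code. In this sketch the values of A, R (the slide's r) and M are hypothetical and chosen only for illustration, and `observe` is an assumed helper name; the slides do not say which packet bits make up x, so the example simply treats a byte string as x.

```python
A = 65521      # hypothetical sampling modulus
R = 655        # sample if h(x) < R, so the thinning factor R / A is about 1%
M = 1048573    # hypothetical label alphabet size

def h(x):
    """Sampling hash from the slide: h(x) = x mod A."""
    return x % A

def g(x):
    """Labeling hash from the slide: g(x) = x mod M."""
    return x % M

def observe(invariant_bytes):
    """Return the label if the packet is sampled, otherwise None.

    invariant_bytes stands for the subset of packet bits that does not
    change from hop to hop, read as one big-endian binary number.
    """
    x = int.from_bytes(invariant_bytes, "big")
    if h(x) < R:
        return g(x)
    return None

# Every link runs the same function on the same bits, so a packet is
# either sampled at all links it traverses or at none of them.
packet_bits = bytes.fromhex("c0a800010a0000011f900050")
print(observe(packet_bits))
```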

Pseudo-Random Sampling
Goal: infer metrics of interest from trajectory samples
  – E.g., what fraction of traffic of customer x on a link y?
Question: is sample set statistically representative?
  – Obvious for “really random” sampling
  – Distribution of a field in the sampled subset = real distribution?
  – In other words: does the complement of the field provide enough entropy?

Quality of Deterministic Sampling
Experiment: statistical test to check if sampled and full distributions are close
  – Chi-square statistic to verify independence hypothesis
  – Hypothesis: sampled distribution consistent with full distribution
  – Confidence level C(T) for hypothesis, where C is the cdf of the chi-square distribution with I-1 degrees of freedom
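
One way to compute such a statistic is a Pearson chi-square over a 2 x I table, with one row for sampled packets, one for unsampled packets, and I columns for the bins of the field under test. The slides do not show the exact construction, so the table layout and the function name below are assumptions.

```python
def chi_square_2xI(sampled_counts, total_counts):
    """Pearson chi-square statistic T for a 2 x I table.

    Row 1: packets per bin that were sampled.
    Row 2: packets per bin that were not sampled (total minus sampled).
    Under independence, T is approximately chi-square distributed
    with I-1 degrees of freedom.
    """
    n_sampled = sum(sampled_counts)
    n_total = sum(total_counts)
    n_rest = n_total - n_sampled
    t = 0.0
    for s, tot in zip(sampled_counts, total_counts):
        for observed, row_total in ((s, n_sampled), (tot - s, n_rest)):
            expected = row_total * tot / n_total
            if expected > 0:   # skip empty rows or bins
                t += (observed - expected) ** 2 / expected
    return t

# Hypothetical per-bin packet counts for a field with I = 4 bins.
total_counts = [400, 300, 200, 100]
t_fair = chi_square_2xI([40, 30, 20, 10], total_counts)    # sample mirrors the full distribution
t_biased = chi_square_2xI([70, 10, 10, 10], total_counts)  # sample over-represents bin 1
print(t_fair, t_biased)
```

A small T supports the hypothesis that the sampling decision is independent of the field; a large T indicates that the sample is skewed.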

Chi-square Test on Source Address
If [condition missing in source], then accept hypothesis

Bitwise Independence
2x2 contingency table formed by
  – sampling decision
  – l-th bit of packet
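
The 2x2 case can be sketched with the standard library alone, because for one degree of freedom the chi-square cdf reduces to erf(sqrt(T/2)). The deliberately poor sampling rule used in the example (sample when x mod 8 = 0) is a made-up toy to show a dependent bit; it is not the hash from the slides.

```python
import math

def bit_independence(xs, sampled, bit):
    """Chi-square test of sampling decision vs. one bit, over a 2x2 table.

    Returns (T, C) where C is the chi-square cdf at T for 1 degree of freedom.
    """
    # table[s][b] counts packets with sampling decision s and bit value b
    table = [[0, 0], [0, 0]]
    for x in xs:
        table[1 if sampled(x) else 0][(x >> bit) & 1] += 1
    n = len(xs)
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    t = 0.0
    for s in (0, 1):
        for b in (0, 1):
            expected = row[s] * col[b] / n
            if expected > 0:
                t += (table[s][b] - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square cdf is erf(sqrt(t / 2)).
    return t, math.erf(math.sqrt(t / 2))

xs = list(range(1024))
toy_rule = lambda x: x % 8 == 0   # toy sampling rule that depends on the low bits only

t_dep, c_dep = bit_independence(xs, toy_rule, bit=0)   # bit 0 drives the decision
t_ind, c_ind = bit_independence(xs, toy_rule, bit=5)   # bit 5 does not
print(t_dep, c_dep)
print(t_ind, c_ind)
```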

Optimal Sampling
Fix amount of measurement traffic c per time period
Problem:
  – n: number of samples in sampling period
  – M: alphabet size, m=log2(M) bits/label
  – nm: total amount of measurement traffic [bits]
  – Goal: maximize # unique labels, subject to nm<c
Result:
  – optimal alphabet size M*=c log(2)
  – optimal number of samples n*=M*/log(M*)
  – example: c=1Mb/period
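
The result can be checked numerically. The sketch below rests on three assumptions that the slide does not state: labels are independent and uniform over M values, "unique labels" means samples whose label is shared with no other sample, and log is the natural logarithm. It also reads 1 Mb as 10^6 bits. A brute-force search over integer label widths m is compared with the closed form.

```python
import math

def expected_unique(n, M):
    """Expected number of samples whose label is shared with no other sample,
    assuming labels are independent and uniform over M values."""
    return n * (1 - 1 / M) ** (n - 1)

def closed_form(c):
    """Slide result, reading log as the natural logarithm."""
    M_star = c * math.log(2)
    n_star = M_star / math.log(M_star)
    return M_star, n_star

def brute_force(c, max_bits=40):
    """Best integer label width m under the budget n * m <= c."""
    best = None
    for m in range(1, max_bits + 1):
        n = c // m
        value = expected_unique(n, 2 ** m)
        if best is None or value > best[0]:
            best = (value, m, n)
    return best

c = 10 ** 6   # assumed reading of "1Mb/period"
M_star, n_star = closed_form(c)
value, m_best, n_best = brute_force(c)
print(f"closed form: log2(M*) = {math.log2(M_star):.2f} bits, n* = {n_star:.0f}")
print(f"brute force: m = {m_best} bits, n = {n_best}, unique = {value:.0f}")
```

Under these assumptions the closed form gives a label width between 19 and 20 bits, and the integer search lands on one of those two neighbouring widths.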

Label Collisions and Trajectory Ambiguity

Ambiguity cont.
Rule for acyclic subgraphs + unicast packets:
  – unambiguous if each connected component of the subgraph is (a) a source tree, or (b) a sink tree without loss

Inference Experiment
Experiment: infer from trajectory samples
  – Estimate fraction of traffic from customer
  – Source address → customer
  – Source address → sampling + label
Fraction of customer traffic on backbone link:
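
A minimal sketch of such an estimate, on synthetic data, is shown below. It is not the experiment from the slides: the packet layout, the /8 customer prefix, the sampling parameters, and the choice to count packets rather than bytes are all assumptions. As in the experiment, the source address both identifies the customer and feeds the sampling hash.

```python
import random

A, R = 65521, 6552   # hypothetical sampling parameters, thinning factor about 10%

def is_sampled(src, other_bits):
    # x combines the source address with other packet bits (hypothetical layout)
    x = (src << 32) | other_bits
    return x % A < R

def estimate_fraction(packets, customer_prefix):
    """Fraction of sampled packets whose source address lies in the customer's /8."""
    sampled = [src for src, bits in packets if is_sampled(src, bits)]
    if not sampled:
        return None
    hits = sum(1 for src in sampled if src >> 24 == customer_prefix)
    return hits / len(sampled)

# Synthetic link traffic: roughly a quarter of the packets come from prefix 10.
rng = random.Random(0)
packets = []
for _ in range(20000):
    prefix = 10 if rng.random() < 0.25 else 11
    src = (prefix << 24) | rng.getrandbits(24)
    packets.append((src, rng.getrandbits(32)))

true_fraction = sum(1 for src, _ in packets if src >> 24 == 10) / len(packets)
estimate = estimate_fraction(packets, 10)
print(f"true fraction {true_fraction:.3f}, estimate from samples {estimate:.3f}")
```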

Estimated Fraction (c=1000bit)

Estimated Fraction (c=10kbit)

Sampling Device
MPLS: simple additional logic to look “behind” label stack

Sampling Device Implementation
Interface vs. processing speed
  – OC-192: 10 Gbps
  – State of the art DSP:
      Proc: 600M MACs x 32 bit: 20 Gbps
      I/O: 300MHz x 256 bit: 70 Gbps
  – Moore’s law vs. interface speed growth
Vendor interest: Cisco, Juniper, Avici

Summary
Advantages
  – Trajectory sampling estimates path matrix …and other metrics: loss, link delay
  – Direct observation: no routing model + network state estimation
  – No router state
  – Multicast (source tree), DDoS (sink tree)
  – Control over measurement overhead
  – Small measurement delay
Disadvantages
  – Requires support on linecards
Open questions & research problems
  – Collection, storage, querying (in progress)
  – Management interface
