Published by Hubert Shelton. Modified over 8 years ago.
1
Network Anomography
Yin Zhang, University of Texas at Austin
Zihui Ge and Albert Greenberg, AT&T Labs
Matthew Roughan, University of Adelaide
IMC 2005
2
Outline
Introduction
Background
Network Anomography
◦ Anomaly detection method
◦ Inference algorithm
Dynamic Network Anomography
Evaluation Methodology
Results
Conclusions & comments
3
Introduction
In IP networks, anomaly detection is an important first step needed to
◦ Respond to unexpected problems
◦ Assure high performance and security
Many types of network problems cause abnormal patterns to appear in network traffic
◦ DDoS attacks, network worms, vendor implementation bugs, network misconfigurations
4
Introduction
Network anomography
◦ The name combines “anomalous” with “tomography”
◦ Infers network-level anomalies from widely available data aggregates
Network tomography
◦ Infers traffic matrices from individual link load measurements
◦ Simple Network Management Protocol (SNMP) [34]
5
Introduction
Link loads and traffic matrices are related by the linear equation b = Ax
◦ b: link load measurements
◦ A: routing matrix
◦ x: traffic matrix
Anomography is more complex
◦ Anomaly detection is performed on a series of measurements over a period of time
◦ Anomalies have dramatically different properties from normal traffic
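The relation b = Ax can be illustrated with a few lines of Python. This is a minimal sketch on a made-up toy network: the function name `link_loads`, the 3-link, 2-flow routing matrix, and the flow volumes are all hypothetical and do not come from the slides.

```python
def link_loads(A, x):
    """Return b = A x for a routing matrix A (links x flows) and flow volumes x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Hypothetical toy network: 3 links, 2 OD flows (illustrative values only).
A = [
    [1.0, 0.0],  # link 0 carries all of flow 0
    [1.0, 1.0],  # link 1 carries both flows
    [0.0, 0.5],  # link 2 carries half of flow 1 (e.g. load balancing)
]
x = [10.0, 4.0]  # traffic volume of each OD flow

b = link_loads(A, x)
print(b)  # [10.0, 14.0, 2.0]
```

Tomography runs this relation backwards: only b and A are observed, and x has to be inferred.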
6
Introduction
Major contributions
◦ A powerful framework that encompasses a wide class of methods for network anomography and clearly decouples the inference and anomaly detection steps
◦ A new algorithm for dynamic anomography that isolates routing changes from traffic anomalies and is robust to missing link load measurements
◦ An extensive and thorough evaluation using data sets from a Tier-1 ISP and Internet2’s Abilene network
7
Background
Network tomography
◦ Infers Origin-Destination (OD) traffic from link load measurements, b = Ax
◦ For a network with n links and m OD flows, the routing matrix is the n × m matrix A = [a_ij], where a_ij is the fraction of traffic from flow j that appears on link i
[5, 14, 22, 23, 28, 31, 34, 35]
8
Network Anomography
Assume that the routing matrix A is time-invariant, so that B = AX
◦ B = [b_1 b_2 … b_t]
◦ X = [x_1 x_2 … x_t]
Two basic solution strategies for network anomography
◦ Early inverse
◦ Late inverse
9
Network Anomography
Early inverse
◦ Network tomography first, then anomaly detection
◦ Drawbacks: errors in the inference problem can contaminate the anomaly detection step, and the approach is computationally expensive
Late inverse
◦ “Lossy” inference
◦ Extract the anomalous traffic from the link load observations, then form and solve a new set of inference problems
10
Network Anomography
The anomalous link traffic B̃ is formed by multiplying B with a transformation matrix T
◦ Spatial anomography
◦ Temporal anomography
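A small sketch of the temporal case, taking it as right multiplication, B̃ = BT (the slide's own formula was not preserved, so this orientation is an assumption). The toy matrices and the helper `matmul` are hypothetical. With a time-invariant A, BT = (AX)T = A(XT), so the extracted link anomalies relate to the unknown OD-flow anomalies through the same routing matrix.

```python
def matmul(P, Q):
    """Plain-Python matrix product of P (r x k) and Q (k x c)."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

# Hypothetical toy data: 2 links, 2 OD flows, 3 time intervals.
A = [[1, 0],
     [1, 1]]        # flow 0 crosses both links, flow 1 only link 1
X = [[5, 5, 9],     # flow 0 jumps by 4 in the last interval
     [3, 3, 3]]     # flow 1 is steady
B = matmul(A, X)    # link loads, one column per interval

# Differencing matrix: column k of (B T) is b_k - b_(k-1); column 0 is left at zero.
T = [[0, -1,  0],
     [0,  1, -1],
     [0,  0,  1]]

B_anom = matmul(B, T)   # anomalous link traffic
X_anom = matmul(X, T)   # anomalous OD traffic (unknown in practice)

print(B_anom)                       # [[0, 0, 4], [0, 0, 4]]
print(matmul(A, X_anom) == B_anom)  # True
```

The jump in flow 0 shows up on both links it crosses, which is what the later inference step exploits.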
11
Spatial Anomography
Spatial PCA
◦ PCA (Principal Component Analysis) finds the dominant patterns in the data
◦ In [19], Lakhina et al. proposed a subspace analysis of link traffic for anomaly detection
◦ Identify a coordinate transformation of B such that the link traffic data under the new coordinate system have the greatest variance along the first axis, the next greatest along the second axis, and so forth
◦ These axes are called the principal components
12
Spatial Anomography
Principal component matrix
◦ P = [v_1 v_2 … v_m]^T
Divide the link traffic space into a normal and an anomalous subspace
◦ Lakhina et al. [19] developed a threshold-based separation method
◦ The projection of the link load data on each axis is examined in order
◦ As soon as a projection is found that contains a 3σ deviation from the mean, that principal component and all subsequent components are assigned to the anomalous subspace
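The threshold test can be sketched as follows. This is only an illustration of the 3σ rule: the function name `first_anomalous_axis` and the sample projections are hypothetical, and the projections are assumed to be precomputed (one list per principal axis, in order).

```python
from statistics import mean, pstdev

def first_anomalous_axis(projections, k=3.0):
    """Return the index of the first axis whose projection contains a value
    more than k standard deviations from its mean; that axis and all later
    ones would form the anomalous subspace. Returns len(projections) if
    every axis passes the test."""
    for r, proj in enumerate(projections):
        mu, sigma = mean(proj), pstdev(proj)
        if sigma > 0 and any(abs(p - mu) > k * sigma for p in proj):
            return r
    return len(projections)

# Hypothetical projections of 20 time bins onto three principal axes.
projections = [
    [1.0, -1.0] * 10,        # regular pattern, no outlier
    [0.0] * 19 + [10.0],     # one large spike
    [0.1, -0.1] * 10,        # small regular pattern
]
r = first_anomalous_axis(projections)
print(r)  # 1
```

Here axes 1 and 2 would be assigned to the anomalous subspace and axis 0 to the normal subspace.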
13
Spatial Anomography
Anomalous subspace
◦ P_a = [v_r v_{r+1} … v_m]^T
◦ v_r is the first component that fails to pass the threshold test
Anomalous traffic can be extracted from the link load observations by
◦ First projecting the data onto the anomalous subspace and then transforming it back
Transformation matrix
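A sketch of "project onto the anomalous subspace, then transform back" for a single link-load vector b, written as the sum of (v_k · b) v_k over the anomalous axes. The matrix form on the original slide was not preserved, so this is one standard way to write the projection rather than a reproduction of it. The axis and the load values are hypothetical, and the axes are assumed to be orthonormal.

```python
import math

def anomalous_part(b, anomalous_axes):
    """Project link-load vector b onto the span of the (orthonormal)
    anomalous axes and map the result back to link space."""
    out = [0.0] * len(b)
    for v in anomalous_axes:
        coeff = sum(vi * bi for vi, bi in zip(v, b))      # coordinate along v
        out = [o + coeff * vi for o, vi in zip(out, v)]   # back to link space
    return out

s = 1 / math.sqrt(2)
axes = [[s, -s, 0.0]]     # hypothetical anomalous axis over 3 links
b = [4.0, 2.0, 7.0]       # hypothetical link loads at one time bin

b_anom = anomalous_part(b, axes)
print([round(v, 6) for v in b_anom])  # [1.0, -1.0, 0.0]
```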
14
Temporal Anomography
ARIMA (AutoRegressive Integrated Moving Average)
◦ Linear time-series forecasting technique
◦ Captures the linear dependency of future values on past values
◦ The forecast errors are identified as the anomalous link traffic
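The idea of using forecast errors as the anomaly signal can be shown with the simplest possible forecaster. The sketch below assumes an ARIMA(0,1,0) model, in which the forecast for each interval is just the previous value. A real deployment would fit a richer model; the function name and the series are hypothetical.

```python
def forecast_errors(series):
    """Forecast errors under an ARIMA(0,1,0) model: each value is predicted
    by its predecessor, so the error is the first difference. The first
    interval has no forecast and is assigned an error of 0."""
    return [0.0] + [cur - prev for prev, cur in zip(series, series[1:])]

# Hypothetical load on one link over five intervals, with a spike at t = 3.
link_load = [100.0, 102.0, 101.0, 180.0, 103.0]

errors = forecast_errors(link_load)
print(errors)  # [0.0, 2.0, -1.0, 79.0, -77.0]
```

A single spike produces two large errors under this model, one when the load rises and one when it returns to normal.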
15
Temporal Anomography
Fourier analysis
◦ Decomposes a complex periodic waveform into a set of sinusoids with different amplitudes, frequencies, and phases
◦ Low-frequency components capture the daily and weekly traffic patterns, while high-frequency components represent sudden changes in traffic behavior
◦ The high-frequency components of the traffic data are used as the anomalous link traffic
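A minimal high-pass filter built on a direct discrete Fourier transform illustrates the idea. This is a sketch, not the authors' filter: the cutoff, the function names, and the synthetic series (a slow sinusoid plus one spike) are assumptions made for the example.

```python
import cmath
import math

def dft(x):
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def high_pass(x, cutoff):
    """Zero every frequency bin below `cutoff`, including mirrored bins."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        if min(k, n - k) < cutoff:   # min() handles the conjugate-symmetric half
            X[k] = 0
    return idft(X)

# Hypothetical series: constant level + one slow cycle + a spike at t = 10.
n = 16
series = [10 + 5 * math.sin(2 * math.pi * t / n) for t in range(n)]
series[10] += 8.0

residual = high_pass(series, cutoff=2)
spike_at = max(range(n), key=lambda t: abs(residual[t]))
print(spike_at)  # 10
```

The slow cycle and the constant level are removed entirely, so the largest residual sits at the injected spike.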
16
Temporal Anomography
Wavelet analysis
◦ Wavelets are mathematical functions that cut up data into different frequency components
◦ Often considered superior to traditional Fourier methods for signals with transients such as sharp spikes
◦ The anomalous traffic is obtained by filtering out the low-frequency components
Temporal PCA
◦ Apply PCA to B^T, as opposed to B as in spatial PCA
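Filtering out the low-frequency components can be sketched with a single level of the Haar wavelet, chosen here only because it is the simplest wavelet (the slides do not say which wavelet was used). Zeroing the level-1 Haar approximation coefficients and inverting the transform is equivalent to subtracting each pair's average, which is what the hypothetical function below does.

```python
def haar_high_frequency(x):
    """Return the part of x left after removing the level-1 Haar
    approximation, i.e. each sample minus the average of its pair."""
    if len(x) % 2:
        raise ValueError("series length must be even")
    out = []
    for i in range(0, len(x), 2):
        pair_mean = (x[i] + x[i + 1]) / 2   # low-frequency content of the pair
        out.extend([x[i] - pair_mean, x[i + 1] - pair_mean])
    return out

# Hypothetical link load with a sudden jump at t = 5.
series = [10.0, 10.0, 11.0, 11.0, 12.0, 30.0, 13.0, 13.0]

detail = haar_high_frequency(series)
print(detail)  # [0.0, 0.0, 0.0, 0.0, -9.0, 9.0, 0.0, 0.0]
```

The slow upward trend vanishes and only the pair containing the jump is non-zero. A fuller analysis would apply several decomposition levels.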
17
Inference Algorithms
Three common inference algorithms for solving the linear inverse problem
◦ Each deals with the underconstrained linear system by searching for a solution that minimizes some notion of vector norm
Pseudoinverse solution
Sparsity maximization
◦ Greedy algorithm
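One way to picture a greedy search for a sparse solution is a matching-pursuit-style loop: repeatedly pick the OD flow whose routing column best explains the remaining anomalous link traffic. This is a sketch of that general heuristic and is not claimed to be the exact greedy algorithm from the paper. The function name, parameters, and toy routing matrix are hypothetical.

```python
def greedy_sparse(A, b, max_nonzero=1, tol=1e-9):
    """Matching-pursuit-style sketch: add one non-zero flow at a time,
    each time choosing the column of A that most reduces the squared residual."""
    n_links, n_flows = len(A), len(A[0])
    x = [0.0] * n_flows
    residual = list(b)
    for _ in range(max_nonzero):
        best_j, best_gain, best_coef = None, tol, 0.0
        for j in range(n_flows):
            col = [A[i][j] for i in range(n_links)]
            norm2 = sum(c * c for c in col)
            if norm2 == 0:
                continue                      # flow uses no measured link
            dot = sum(c * r for c, r in zip(col, residual))
            gain = dot * dot / norm2          # reduction in squared residual
            if gain > best_gain:
                best_j, best_gain, best_coef = j, gain, dot / norm2
        if best_j is None:
            break                             # nothing left to explain
        x[best_j] += best_coef
        residual = [r - best_coef * A[i][best_j] for i, r in enumerate(residual)]
    return x

# Hypothetical routing matrix: 3 links, 4 OD flows.
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
b_anom = [0.0, 5.0, 5.0]   # anomalous traffic seen on links 1 and 2

x_anom = greedy_sparse(A, b_anom)
print(x_anom)  # [0.0, 0.0, 5.0, 0.0]
```

The anomaly is attributed to flow 2, the only single flow that crosses exactly links 1 and 2.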
18
Dynamic Network Anomography
Goal
◦ Allow for dynamic routing changes, which are part of the normal “self-healing” behavior of the network
◦ If some measurements are missing (at time j), a consistent problem can still be formed by setting the appropriate rows of A_j to zero
◦ Seek a solution that is consistent with the equations but also minimizes the norm
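Handling a missing measurement by zeroing rows of A_j can be sketched as below. The function name and the data are hypothetical. Zeroing the matching entries of b_j as well, so that each masked row reads 0 = 0, is an assumption of this sketch rather than something stated on the slide.

```python
def mask_missing(A_j, b_j, missing_links):
    """Zero the rows of A_j (and the matching entries of b_j) for links
    whose measurement is missing at time j, so those rows impose no constraint."""
    A_masked = [[0.0] * len(row) if i in missing_links else list(row)
                for i, row in enumerate(A_j)]
    b_masked = [0.0 if i in missing_links else v for i, v in enumerate(b_j)]
    return A_masked, b_masked

# Hypothetical routing matrix and link loads at time j; link 1 was not polled.
A_j = [[1.0, 0.0],
       [1.0, 1.0],
       [0.0, 1.0]]
b_j = [10.0, 0.0, 4.0]     # the 0.0 for link 1 is a placeholder, not a reading

A_masked, b_masked = mask_missing(A_j, b_j, missing_links={1})
print(A_masked)  # [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
print(b_masked)  # [10.0, 0.0, 4.0]
```

The remaining rows still form a consistent system, just with fewer constraints.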
19
Dynamic Network Anomography
Use the ARIMA model because
◦ It can be written in a form such that the set of constraints does not grow with t
◦ Two techniques were also developed [33] to reduce the size of the minimization problems
Routing changes are infrequent (i.e., not in every time interval) and local (i.e., affecting only a small subset of rows)
20
Evaluation Methodology
Two large backbone networks (USA)
◦ Internet2’s Abilene network: 12 routers, 15 backbone links, 144 OD flows
◦ Tier-1 ISP: hundreds of routers and thousands of links; the total number of OD flows is reduced to about 6000
21
Evaluation Methodology
Ideally, compare the set of anomalies identified by each method to the set of “true” network anomalies
◦ Very difficult task
Instead, perform pairwise comparisons based on the top-ranked anomalies identified by each anomography method
◦ Set B_M(j): the M top-ranked anomalies found by applying anomaly detection method j directly to the OD flow data
◦ Set A_N(i): the N largest anomalies inferred from link load data by anomography method i
22
Evaluation Methodology
A_N(i): anomography method i
B_M(j): benchmark j
False positives
◦ N < M (e.g., N = 30, M = 50)
◦ Top-30 anomalies of A that are not in the top 50 of B
False negatives
◦ N > M (e.g., N = 50, M = 30)
◦ Top-30 anomalies of B that are not in the top 50 of A
Detection rate
◦ The ratio of the overlap between the two sets
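These comparisons reduce to set operations on ranked lists. The sketch below makes two assumptions that the slides do not settle: false negatives are taken as benchmark anomalies missing from the inferred set, and the detection rate divides the overlap by the size of the benchmark set. The anomaly identifiers (flow, time bin) and both rankings are hypothetical.

```python
def false_positives(a_top, b_top):
    """Inferred anomalies that the benchmark set does not contain."""
    return set(a_top) - set(b_top)

def false_negatives(a_top, b_top):
    """Benchmark anomalies that the inferred set misses (assumed definition)."""
    return set(b_top) - set(a_top)

def detection_rate(a_top, b_top):
    """Overlap of the two sets relative to the benchmark size (assumed denominator)."""
    return len(set(a_top) & set(b_top)) / len(set(b_top))

# Hypothetical rankings, largest anomaly first; each item is (flow, time bin).
a_rank = [("f1", 3), ("f2", 7), ("f3", 1), ("f4", 9)]   # anomography method i
b_rank = [("f1", 3), ("f3", 1), ("f5", 2), ("f2", 7)]   # benchmark j

fp = false_positives(a_rank[:2], b_rank[:3])   # N = 2 < M = 3
fn = false_negatives(a_rank[:3], b_rank[:2])   # N = 3 > M = 2
rate = detection_rate(a_rank[:3], b_rank[:3])  # N = M = 3

print(fp)    # {('f2', 7)}
print(fn)    # set()
print(rate)  # 0.6666666666666666
```

Using a larger benchmark set for false positives and a larger inferred set for false negatives gives each test some slack for near-ties in the rankings.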
23
Results – Inference Techniques
24
Results – Inference Techniques
25
Results – Robustness
26
Results – Impact of Route Change
27
Results – Anomography Methods
28
Results – Cross Validation
29
Conclusions
A powerful framework for anomography
◦ Separates the anomaly detection component from the inference component
A number of novel algorithms
◦ Anomaly detection and inference components
◦ Spatial versus temporal approaches
A new dynamic anomography algorithm
◦ Handles routing changes
◦ Robust to missing data
Evaluation of anomography methods
◦ ARIMA-based methods and l_1 norm minimization show high fidelity and robustness
30
Comments
Traffic pattern anomaly detection
◦ Only volume and number of flows
An idea about finding deviations
◦ PCA, 3σ deviation from the mean
Mathematics is too hard