Network Anomography Yin Zhang – University of Texas at Austin Zihui Ge and Albert Greenberg – AT&T Labs Matthew Roughan – University of Adelaide IMC 2005.

1 Network Anomography
Yin Zhang – University of Texas at Austin
Zihui Ge and Albert Greenberg – AT&T Labs
Matthew Roughan – University of Adelaide
IMC 2005

2 Outline
Introduction
Background
Network Anomography
  ◦ Anomaly detection method
  ◦ Inference algorithm
Dynamic Network Anomography
Evaluation Methodology
Results
Conclusions & comments

3 Introduction
In IP networks, anomaly detection is an important first step, needed to
  ◦ Respond to unexpected problems
  ◦ Assure high performance and security
Many types of network problems cause abnormal patterns to appear in the network traffic
  ◦ DDoS attacks, network worms, vendor implementation bugs, network misconfigurations

4 Introduction
Network Anomography
  ◦ Combines “anomalous” with “tomography”
  ◦ Inferring network-level anomalies from widely available data aggregates
Network Tomography
  ◦ Inference of traffic matrices from individual link load measurements
  ◦ Simple Network Management Protocol (SNMP) [34]

5 Introduction
Link loads and traffic matrices are related by a simple linear equation, b = Ax
  ◦ b: link load measurements
  ◦ A: routing matrix
  ◦ x: traffic matrix
Anomography is more complex
  ◦ Anomaly detection is performed on a series of measurements over a period of time
  ◦ Anomalies have dramatically different properties from normal traffic

6 Introduction
Major contributions
  ◦ A powerful framework that encompasses a wide class of methods for network anomography
    ▪ Clearly decouples the inference and anomaly detection steps
  ◦ A new algorithm for dynamic anomography
    ▪ Isolates routing changes from traffic anomalies
    ▪ Robust to missing link load measurements
  ◦ An extensive and thorough evaluation using data sets from a Tier-1 ISP and Internet2’s Abilene network

7 Background
Network Tomography
  ◦ Inferring Origin-Destination (OD) traffic from link load measurements, b = Ax
  ◦ For a network with n links and m OD flows
    ▪ The routing matrix is the n × m matrix A
    ▪ A = [a_ij], where a_ij is the fraction of traffic from flow j that appears on link i
[5, 14, 22, 23, 28, 31, 34, 35]
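As a small illustration of the relation b = Ax, the sketch below builds a routing matrix for a toy network. The topology, the split of flow 2 across two links, and the traffic volumes are all invented for illustration and do not come from the talk.

```python
import numpy as np

# Hypothetical toy network: 3 links (rows) and 3 OD flows (columns).
# A[i, j] is the fraction of flow j that appears on link i.
A = np.array([
    [1.0, 0.0, 0.5],   # link 0 carries flow 0 and half of flow 2
    [0.0, 1.0, 1.0],   # link 1 carries flow 1 and all of flow 2
    [0.0, 0.0, 0.5],   # link 2 carries the other half of flow 2
])

# Made-up OD traffic volumes for a single time interval.
x = np.array([10.0, 20.0, 4.0])

# Link loads follow from the linear relation b = A x.
b = A @ x
print(b)
```

Tomography runs this relation in reverse: given b and A, estimate x. Real networks have far more OD flows than links, so that inverse problem is underconstrained.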

8 Network Anomography
Assume that the routing matrix A is time-invariant, so that B = AX
  ◦ B = [b_1 b_2 … b_t]
  ◦ X = [x_1 x_2 … x_t]
Two basic solution strategies for network anomography
  ◦ Early inverse
  ◦ Late inverse

9 Network Anomography
Early inverse
  ◦ Network tomography → anomaly detection
  ◦ Drawbacks
    ▪ Errors in the inference step can contaminate the anomaly detection step
    ▪ Computationally expensive
Late inverse
  ◦ Defers the “lossy” inference to the last step
  ◦ Extract the anomalous traffic from the link load observations, then form and solve a new set of inference problems
    ▪ B̃ = AX̃, where B̃ is the anomalous link traffic and X̃ is the anomalous OD flow traffic

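10 Network Anomography
The anomalous link traffic B̃ is formed by multiplying B with a transformation matrix T
  ◦ Spatial anomography
    ▪ B̃ = TB (T multiplies from the left)
  ◦ Temporal anomography
    ▪ B̃ = BT (T multiplies from the right)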
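The difference between the two variants is which side of B the transformation matrix sits on. The sketch below assumes that spatial anomography left-multiplies B (mixing rows, i.e. links) and temporal anomography right-multiplies it (mixing columns, i.e. time intervals). The two mean-removal matrices are stand-ins chosen for brevity, not transformations from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n_links, n_intervals = 4, 6
B = rng.random((n_links, n_intervals))   # synthetic link loads, columns are time

# Spatial: T_s acts across links. This stand-in subtracts, in every interval,
# the mean load over all links.
T_s = np.eye(n_links) - np.ones((n_links, n_links)) / n_links
B_spatial = T_s @ B

# Temporal: T_t acts across time. This stand-in subtracts each link's
# time-averaged load from its own series.
T_t = np.eye(n_intervals) - np.ones((n_intervals, n_intervals)) / n_intervals
B_temporal = B @ T_t

print(B_spatial.shape, B_temporal.shape)
```

Either way the result has the same shape as B, so the same inference step can be applied to it afterwards.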

11 Spatial Anomography
Spatial PCA
  ◦ PCA (Principal Component Analysis)
    ▪ Finds dominant patterns
    ▪ In [19], Lakhina et al. proposed a subspace analysis of link traffic for anomaly detection
  ◦ Identify a coordinate transformation of B
    ▪ Under the new coordinate system, the link traffic data have the greatest variance along the first axis, the next greatest along the second axis, and so forth
    ▪ These axes are called the principal components

12 Spatial Anomography
Principal component matrix
  ◦ P = [v_1 v_2 … v_m]^T
Divide the link traffic space into normal and anomalous subspaces
  ◦ Lakhina et al. [19] developed a threshold-based separation method
    ▪ Examine the projection of the link load data on each axis in order
    ▪ As soon as a projection is found that contains a 3σ deviation from the mean, that principal component and all subsequent components are assigned to the anomalous subspace

13 Spatial Anomography
Anomalous subspace
  ◦ P_a = [v_r v_{r+1} … v_m]^T
  ◦ v_r is the first component that fails to pass the threshold test
Anomalous traffic can be extracted from the link load observations by
  ◦ First projecting the data into the anomalous subspace and then transforming it back
  ◦ This projection defines the transformation matrix T
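The projection step described on this slide can be sketched with NumPy. Several details here are assumptions rather than the authors' method: the number of normal components `n_normal` is passed in directly instead of being chosen by the 3σ test, each link's series is mean-centred before PCA, and the transformation matrix is taken to be P_a^T P_a (project onto the anomalous axes, then map back to link space).

```python
import numpy as np

def spatial_pca_anomalies(B, n_normal):
    """Project link loads onto the anomalous subspace (illustrative sketch).

    B        : link-load matrix, one row per link, one column per interval.
    n_normal : number of leading principal components treated as normal
               (hypothetical parameter; the slides pick it with a 3-sigma test).
    """
    # Centre each link's series so PCA captures variance rather than the mean.
    Bc = B - B.mean(axis=1, keepdims=True)
    # Left singular vectors of Bc are the principal axes in link space,
    # ordered by decreasing variance.
    U, _, _ = np.linalg.svd(Bc, full_matrices=True)
    P_a = U[:, n_normal:].T      # rows span the anomalous subspace
    T = P_a.T @ P_a              # project, then transform back
    return T @ Bc, T

rng = np.random.default_rng(0)
B = rng.random((5, 40))          # synthetic loads: 5 links, 40 intervals
B_anom, T = spatial_pca_anomalies(B, n_normal=2)
print(B_anom.shape)
```

Because the rows of `P_a` are orthonormal, `T` is a symmetric projection matrix, so applying it twice gives the same result as applying it once.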

14 Temporal Anomography
ARIMA (AutoRegressive Integrated Moving Average)
  ◦ Linear time-series forecasting technique
  ◦ Captures the linear dependency of future values on past values
  ◦ The forecast errors are simply identified as anomalous link traffic
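To make "forecast errors as anomalous traffic" concrete, the sketch below uses the simplest possible special case, ARIMA(0,1,0), in which the forecast for each interval is the previous interval's value. This model order is chosen only for brevity and is not necessarily one used in the talk. The forecast error is written as a right-multiplication of B by a matrix T, in the style of temporal anomography.

```python
import numpy as np

def diff_forecast_errors(B):
    """Forecast errors of a 'predict the previous value' model, as B @ T."""
    t = B.shape[1]
    # Column j of T has +1 at row j and -1 at row j-1, so (B @ T)[:, j]
    # equals B[:, j] - B[:, j-1].
    T = np.eye(t) - np.eye(t, k=1)
    T[0, 0] = 0.0    # no forecast exists for the first interval
    return B @ T

# One link whose load is flat except for a single spike (made-up numbers).
B = np.array([[5.0, 5.0, 5.0, 9.0, 5.0]])
errors = diff_forecast_errors(B)
print(errors)
```

The spike shows up as a large positive error when it arrives and a large negative error when the load returns to normal.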

15 Temporal Anomography
Fourier Analysis
  ◦ Decomposes a complex periodic waveform into a set of sinusoids with different amplitudes, frequencies and phases
    ▪ Low-frequency components capture the daily and weekly traffic patterns
    ▪ High-frequency components represent sudden changes in traffic behavior
  ◦ The high-frequency components of the traffic data are used as the anomalous link traffic
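A minimal sketch of the high-pass idea, using NumPy's real FFT. The cutoff of 10 frequency bins, the 5-minute sampling, and the synthetic diurnal signal are all assumptions made for illustration; the talk does not give a cutoff.

```python
import numpy as np

def high_frequency_part(series, cutoff):
    """Zero every Fourier bin below `cutoff` and transform back."""
    spectrum = np.fft.rfft(series)
    spectrum[:cutoff] = 0.0      # removes the mean and the slow periodic components
    return np.fft.irfft(spectrum, n=len(series))

t = np.arange(288)                                   # e.g. one day of 5-minute bins
traffic = 100 + 20 * np.sin(2 * np.pi * t / 288)     # slow daily pattern
traffic[150] += 50                                   # injected sudden spike
residual = high_frequency_part(traffic, cutoff=10)

spike_index = int(np.argmax(np.abs(residual)))
print(spike_index)
```

The daily sinusoid lives entirely in the low-frequency bins and is removed, while most of the spike's energy is spread over the high-frequency bins and survives the filter.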

16 Temporal Anomography
Wavelet Analysis
  ◦ Mathematical functions that cut up data into different frequency components
  ◦ Often considered superior to traditional Fourier methods when the signal contains transients such as discontinuities and sharp spikes
  ◦ Filter out the low-frequency components
Temporal PCA
  ◦ Apply PCA to B^T, as opposed to B as in spatial PCA
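For the wavelet variant, the smallest possible illustration is a single level of the Haar transform, whose detail coefficients hold the highest-frequency band of the signal. The choice of the Haar wavelet and of a single decomposition level is an assumption made to keep the sketch short; the talk does not say which wavelet or how many levels were used.

```python
import numpy as np

def haar_detail(series):
    """One-level Haar detail coefficients (series length must be even)."""
    x = np.asarray(series, dtype=float)
    # Each coefficient is the scaled difference within one adjacent pair.
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

# Flat load with one spike (made-up numbers).
detail = haar_detail([4, 4, 4, 10, 4, 4])
print(detail)
```

Only the pair containing the spike produces a non-zero coefficient. A practical detector would combine several levels, since a change that lines up with a pair boundary is invisible at this single level.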

17 Inference Algorithms
Three common inference algorithms for solving the linear inverse problem
  ◦ Each deals with the underconstrained linear system by searching for a solution that minimizes some notion of vector norm
Pseudoinverse Solution
Sparsity Maximization
  ◦ Greedy algorithm
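Two of the strategies named on this slide can be sketched as follows. The pseudoinverse solution is the standard minimum l2-norm solution. The greedy routine is a matching-pursuit-style heuristic written for illustration; the slide does not specify which greedy algorithm was used, and `max_nonzeros` and `tol` are hypothetical parameters.

```python
import numpy as np

def pseudoinverse_solution(A, b):
    """Minimum l2-norm solution of the underconstrained system A x = b."""
    return np.linalg.pinv(A) @ b

def greedy_sparse_solution(A, b, max_nonzeros, tol=1e-9):
    """Greedily add one flow at a time until A x explains b (illustrative)."""
    b = np.asarray(b, dtype=float)
    n_flows = A.shape[1]
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0              # avoid dividing by zero for unused flows
    support = []
    x = np.zeros(n_flows)
    residual = b.copy()
    for _ in range(max_nonzeros):
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the flow whose column best matches what is still unexplained.
        scores = np.abs(A.T @ residual) / norms
        if support:
            scores[support] = -1.0       # never pick the same flow twice
        support.append(int(np.argmax(scores)))
        # Refit all selected flows jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        x = np.zeros(n_flows)
        x[support] = coef
        residual = b - A @ x
    return x

# Two links, three flows; only flow 2 is anomalous (made-up numbers).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = A @ np.array([0.0, 0.0, 5.0])
x_pinv = pseudoinverse_solution(A, b)
x_greedy = greedy_sparse_solution(A, b, max_nonzeros=2)
print(x_pinv, x_greedy)
```

In this toy case the pseudoinverse spreads the anomaly over all three flows, while the greedy search attributes it to the single flow that explains both link loads.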

18 Dynamic Network Anomography
Goal
  ◦ Allow for dynamic routing changes
    ▪ The normal “self-healing” behavior of the network
  ◦ If some measurements are missing (at time j), a consistent problem can still be formed
    ▪ By setting the appropriate rows of A_j to zero
  ◦ Seek a solution that is consistent with the equations but also minimizes the norm
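The missing-measurement trick can be sketched for a single time interval. Zeroing the rows of A_j follows the slide. Zeroing the matching entries of b_j as well, and using the minimum l2-norm solution as "the norm", are assumptions made here so that the example stays short and consistent.

```python
import numpy as np

def solve_with_missing(A_j, b_j, missing_rows):
    """Minimum l2-norm solution that ignores links with missing loads."""
    A = np.array(A_j, dtype=float)   # copies, so the inputs are left untouched
    b = np.array(b_j, dtype=float)
    A[missing_rows, :] = 0.0         # drop the constraints for missing links
    b[missing_rows] = 0.0            # assumed: matching loads are zeroed too
    return np.linalg.pinv(A) @ b

# Three links, three flows; the load on link 2 was not collected (made-up data).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
b_obs = np.array([4.0, 5.0, np.nan])
x_hat = solve_with_missing(A, b_obs, missing_rows=[2])
print(x_hat)
```

The zeroed row reads 0 = 0, so it constrains nothing, and the remaining rows are still satisfied exactly.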

19 Dynamic Network Anomography
Use the ARIMA model because
  ◦ It can be written in a form such that the set of constraints does not grow with t
  ◦ Two techniques were also developed [33] to reduce the size of the minimization problems
    ▪ Routing changes are infrequent (i.e., not in every time interval) and local (i.e., only in a small subset of rows)

20 Evaluation Methodology
Two large backbone networks (USA)
  ◦ Internet2’s Abilene network
    ▪ 12 routers, 15 backbone links, 144 OD flows
  ◦ Tier-1 ISP
    ▪ Hundreds of routers and thousands of links; the total number of OD flows is reduced to about 6000

21 Evaluation Methodology
Ideally, compare the set of anomalies identified by each method to the set of “true” network anomalies
  ◦ Very difficult task
Instead, perform pairwise comparisons based on the top-ranked anomalies identified by each of the anomography methods
  ◦ Set B_M^(j): apply anomaly detection method j directly to the OD flow data and take the top-ranked M anomalies
  ◦ Set A_N^(i): apply anomography method i to the link load data and take the N largest inferred anomalies

22 Evaluation Methodology
A_N^(i): anomography method i; B_M^(j): benchmark j
False positives
  ◦ N < M (e.g., N = 30, M = 50)
  ◦ Top-30 anomalies of A that are not in the top 50 of B
False negatives
  ◦ N > M (e.g., N = 50, M = 30)
  ◦ Top-30 anomalies of B that are not in the top 50 of A
Detection rate
  ◦ The ratio of the overlap between the two sets
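The three comparison metrics reduce to set operations on ranked lists. In this sketch, false negatives are taken to be top-ranked benchmark anomalies that the anomography method missed, and the detection rate is normalised by a common list length k; both are assumptions about details the slide leaves open. The event identifiers are invented.

```python
def false_positives(a_ranked, b_ranked, n, m):
    """Top-n anomalies of method A that are absent from the top m of benchmark B (n < m)."""
    return set(a_ranked[:n]) - set(b_ranked[:m])

def false_negatives(a_ranked, b_ranked, n, m):
    """Top-m anomalies of benchmark B that are absent from the top n of method A (n > m)."""
    return set(b_ranked[:m]) - set(a_ranked[:n])

def detection_rate(a_ranked, b_ranked, k):
    """Fraction of the top-k benchmark anomalies also found in the top k of A."""
    overlap = set(a_ranked[:k]) & set(b_ranked[:k])
    return len(overlap) / k

# Hypothetical anomaly identifiers, ranked from largest to smallest.
a_ranked = ["e1", "e9", "e3", "e2", "e5"]   # anomography method
b_ranked = ["e1", "e3", "e6", "e2", "e7"]   # benchmark on OD flow data

fp = false_positives(a_ranked, b_ranked, n=2, m=4)
fn = false_negatives(a_ranked, b_ranked, n=4, m=3)
rate = detection_rate(a_ranked, b_ranked, k=3)
print(fp, fn, rate)
```

Using a longer list on the side being checked against (m > n for false positives, n > m for false negatives) gives some slack for anomalies that the two methods rank slightly differently.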

23 Results – Inference Techniques

24 Results – Inference Techniques

25 Results – Robustness

26 Results – Impact of Route Change

27 Results – Anomography Methods

28 Results – Cross Validation

29 Conclusions
A powerful framework for anomography
  ◦ Separates the anomaly detection component from the inference component
A number of novel algorithms
  ◦ Anomaly detection and inference components
  ◦ Spatial versus temporal approaches
New dynamic anomography algorithm
  ◦ Handles routing changes
  ◦ Robust to missing data
Evaluation of anomography methods
  ◦ ARIMA-based methods and l1 norm minimization show high fidelity and robustness

30 Comments
Traffic pattern anomaly detection
  ◦ Only volume and number of flows
An idea for finding deviations
  ◦ PCA, 3σ deviation from the mean
Mathematics is too hard

