1
Multi-Scale Analysis for Network Traffic Prediction and Anomaly Detection
Ling Huang
Joint work with Anthony Joseph and Nina Taft
January 2005
2
Problem Statement
- Accurate network traffic modeling and prediction are important for:
  - Network provisioning
  - Problem diagnosis
- But network traffic is highly dynamic:
  - It exhibits multi-timescale properties (temporal domain)
  - Anomalies (failures, attacks) interfere with analysis
- We need to isolate anomalies from normal traffic variation to achieve better modeling and prediction
3
Our Work
- We decompose traffic signals into two parts:
  - Normal variations: follow certain laws, can be modeled, and are predictable
  - Anomalies: consist of sudden changes and are not predictable
- Method: multi-scale signal analysis and modeling for
  - Network traffic prediction [almost done]
  - Volume anomaly detection and identification [in progress]
4
Properties of Real-Life Data Streams
[Figure: three real-world data traces and a random trace]
5
Observation: Multi-Scale Property
- Data source: bandwidth measurements for the CUDI network interface on an Abilene router, averaged over 5-minute intervals
- Multi-scale properties of traffic data over one week of measurements:
  - Long-term trend
  - Transient variation
  - Sudden changes (anomalies)
6
Multi-Scale Prediction of Future Data
- Traditional approaches
  - Last-seen value: use the most recent sample as the estimate of the current value
  - Linear prediction: exploit the statistical properties of the data stream in the temporal domain over a window of size T
- Exploit temporal correlation at different time scales
  - Long-term trend: B-spline estimation
  - High-frequency residual: ARMA modeling
- ARMA stands for AutoRegressive Moving Average, a standard time-series technique for modeling noisy, irregular data streams
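The two traditional baselines named on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `last_value_predict` and `linear_predict` are hypothetical, and the linear predictor is assumed to be an order-m autoregressive fit by ordinary least squares over the history window.

```python
import numpy as np

def last_value_predict(history):
    """Baseline: use the most recent sample as the forecast."""
    return float(history[-1])

def linear_predict(history, m):
    """Order-m linear predictor fitted by least squares on the window.

    Assumed model: x[t] ~ a_1*x[t-1] + ... + a_m*x[t-m].
    `history` plays the role of the size-T window from the slide.
    """
    x = np.asarray(history, dtype=float)
    # Each row holds the m samples preceding one target, newest first.
    rows = np.array([x[t - m:t][::-1] for t in range(m, len(x))])
    targets = x[m:]
    coeffs, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    # Apply the fitted coefficients to the m most recent samples.
    return float(coeffs @ x[-m:][::-1])

# Usage: forecast the next sample of a synthetic periodic signal.
window = np.sin(0.3 * np.arange(48))
print(last_value_predict(window), linear_predict(window, m=2))
```

On a signal with strong short-range structure the linear predictor tracks the next value more closely than the last-seen baseline, at the cost of a least-squares solve per window.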
7
Two-Level Modeling and Prediction
- B-spline modeling for the long-term trend
  - A piecewise-continuous, low-degree B-spline can represent complex shapes
  - Least-squares B-spline regression for the two-level decomposition
  - B-spline extension for future forecasting
- ARMA forecasting for the transient oscillation
  - System identification to determine the order of the model
  - Parameter estimation by an optimization algorithm
  - Low-complexity recursive equation for future forecasting
  - Statistical properties for calibrating the prediction results
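A rough sketch of the two-level idea follows, under several assumptions that are not in the slides: the knots are uniform and clamped, the residual model is a pure AR(p) fit by least squares (standing in for the full ARMA model), and the trend is extended one step by holding its final slope (standing in for the B-spline extension). All function names are hypothetical.

```python
import numpy as np

def bspline_design(x, knots, degree):
    """B-spline basis matrix via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    n_int = len(knots) - 1
    B = np.zeros((len(x), n_int))
    for i in range(n_int):
        B[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Close the last non-empty interval so the right endpoint is covered.
    last = max(i for i in range(n_int) if knots[i] < knots[i + 1])
    B[x == knots[-1], last] = 1.0
    for d in range(1, degree + 1):
        nxt = np.zeros((len(x), n_int - d))
        for i in range(n_int - d):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                nxt[:, i] += (x - knots[i]) / left * B[:, i]
            if right > 0:
                nxt[:, i] += (knots[i + d + 1] - x) / right * B[:, i + 1]
        B = nxt
    return B

def fit_trend(y, n_segments=8, degree=3):
    """Least-squares B-spline regression: returns the fitted trend."""
    t = np.arange(len(y), dtype=float)
    knots = np.concatenate([np.full(degree, t[0]),
                            np.linspace(t[0], t[-1], n_segments + 1),
                            np.full(degree, t[-1])])
    A = bspline_design(t, knots, degree)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

def fit_ar(r, p):
    """AR(p) coefficients by least squares (simplified stand-in for ARMA)."""
    rows = np.array([r[t - p:t][::-1] for t in range(p, len(r))])
    a, *_ = np.linalg.lstsq(rows, r[p:], rcond=None)
    return a

def forecast_next(y, n_segments=8, degree=3, p=4):
    """One-step forecast = extended trend + AR forecast of the residual."""
    y = np.asarray(y, dtype=float)
    trend = fit_trend(y, n_segments, degree)
    resid = y - trend
    a = fit_ar(resid, p)
    # Assumption: extend the trend by holding its final slope.
    trend_next = trend[-1] + (trend[-1] - trend[-2])
    resid_next = float(a @ resid[-p:][::-1])
    return trend_next + resid_next, trend, resid
```

The number of spline segments controls how much variation is assigned to the trend versus the residual; in practice it would be tuned to the timescale of interest.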
8
Complexity of Prediction Algorithms
Legend:
- T: the window size of the history data
- m: the order of the linear predictor
- K: the order of the ARMA model
- d: the degree of the B-spline curve
- c: the increase in storage due to the multi-level data representation
9
Performance of Prediction Algorithms
[Figure: performance of prediction algorithms on network traffic; axis label: Mean Relative Increment]
10
Unpredictability: Anomalies in Data Signal
- The data stream can be decomposed into two layers:
  - The long-term trend, which is the modeled pattern
  - The residual, which is high-frequency and contains the anomalies
[Figure panels: Monday data; long-term trend (modeled); residual with anomalies]
11
Anomaly Detection and Identification
- Volume anomaly: a sudden change in a link's or flow's traffic count
  - Network failures, attacks, flash crowds
  - Measurement anomalies
- Anomalies are not normal variations of network traffic and are not predictable
- Worse yet, anomalies skew the prediction models!
- For better modeling and prediction, we need to detect anomalies and isolate them from the data
- The rest of the talk focuses on anomaly detection algorithms
  - Existing algorithms: single-link vs. network-wide analysis
  - New directions
12
Single-Link Anomaly Detection
- Multi-scale analysis to capture temporal correlation
  - Use wavelets for multi-scale data decomposition
  - Isolate characteristics of the traffic signal at different timescales
  - Expose the details of both ambient (modeled) and anomalous traffic
  - Detect sharp increases in the local variance within a moving time window at different timescales
- Disadvantages
  - Diagnosis is performed on a single link or at a single router, which makes analysis of a large network impractical
  - Many anomalies span multiple links and are not obvious on any single link
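The wavelet-plus-local-variance idea can be illustrated with a small sketch. The choices here are assumptions made for illustration only: a decimated Haar wavelet, a fixed window length, and a threshold set to a multiple of the median window variance. The names `haar_details`, `local_variance`, and `detect_variance_bursts` are hypothetical.

```python
import numpy as np

def haar_details(x, levels):
    """Haar detail coefficients for each scale, finest scale first."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx = approx[: len(approx) // 2 * 2]  # drop an odd trailing sample
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return details

def local_variance(c, window):
    """Variance of c inside each sliding window of the given length."""
    return np.array([c[i:i + window].var()
                     for i in range(len(c) - window + 1)])

def detect_variance_bursts(x, levels=3, window=8, factor=10.0):
    """Flag sample indices whose local detail variance is unusually high."""
    flagged = set()
    for j, d in enumerate(haar_details(x, levels)):
        var = local_variance(d, window)
        baseline = np.median(var)  # assumed notion of "typical" variance
        span = 2 ** (j + 1)        # samples covered by one coefficient
        for i in np.flatnonzero(var > factor * baseline):
            start = int(i) * span
            stop = min((int(i) + window) * span, len(x))
            flagged.update(range(start, stop))
    return sorted(flagged)
```

Because each window covers several coefficients, the flagged range is wider than the anomaly itself, and it widens further at coarser scales.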
13
Network-Wide Anomaly Detection
- Diagnose traffic anomalies spanning multiple links
  - Capture the spatial correlation across links
  - Analyze origin-destination (OD) flows with a known traffic matrix
- Principal Component Analysis (PCA) for dimension reduction and signal decomposition
  - Separate the traffic signal into a modeled space and a residual space
  - Scoring/prediction method to detect abnormal changes in the residual space
- Disadvantages
  - Needs ISP support
  - Single-timescale analysis
  - Centralized algorithm
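A minimal sketch of the PCA decomposition described above, assuming the measurements form a matrix with one row per time step and one column per link or flow. The squared residual norm used as the score and the median-based threshold are illustrative choices rather than the scoring method from the talk, and the function names are hypothetical.

```python
import numpy as np

def spe_scores(X, k):
    """Squared residual norm of each time step after removing the top-k PCs."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                 # basis of the modeled space
    modeled = Xc @ P @ P.T       # projection onto the modeled space
    residual = Xc - modeled      # what the top-k components cannot explain
    return np.sum(residual ** 2, axis=1)

def flag_anomalies(scores, factor=20.0):
    """Assumed rule: flag time steps scoring far above the median."""
    return scores > factor * np.median(scores)

# Usage on synthetic data: 2 latent traffic patterns across 10 links.
rng = np.random.default_rng(0)
X = 10 * rng.standard_normal((500, 2)) @ rng.standard_normal((2, 10))
X += 0.1 * rng.standard_normal(X.shape)
X[250, 3] += 5.0                 # injected volume anomaly on one link
print(np.flatnonzero(flag_anomalies(spe_scores(X, k=2))))
```

The number of retained components k decides what counts as normal traffic; too large a k absorbs anomalies into the modeled space.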
14
New Direction (1): Multi-Scale PCA
- PCA on the wavelet representation of traffic data
  - Wavelets capture correlation within a single link
  - PCA captures the correlation across links and transforms the multivariate space into a subspace that preserves the maximum variance of the original space
  - PCA alone analyzes data on a single timescale and cannot use frequency information
  - Multi-scale PCA combines the two extremes of wavelet-based and PCA-based analysis of multivariate data
- Benefit: detection of multi-timescale anomalies
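One way to picture the combination is to run a wavelet transform along time for every link and then apply PCA separately to the coefficients at each scale. The sketch below assumes a Haar wavelet and a residual-energy score per scale; it is an illustration of the general idea, not the algorithm under development in the talk, and all names are hypothetical.

```python
import numpy as np

def haar_step(A):
    """One Haar analysis step down the time axis of every column."""
    A = A[: A.shape[0] // 2 * 2]
    even, odd = A[0::2], A[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)

def residual_energy(C, k):
    """Per-row energy left after projecting out the top-k principal directions."""
    Cc = C - C.mean(axis=0)
    _, _, Vt = np.linalg.svd(Cc, full_matrices=False)
    P = Vt[:k].T
    R = Cc - Cc @ P @ P.T
    return np.sum(R ** 2, axis=1)

def multiscale_pca_scores(X, levels, k):
    """Residual scores per scale: one array per detail level (finest first),
    followed by one array for the coarsest approximation."""
    scores = []
    approx = np.asarray(X, dtype=float)   # rows: time steps, columns: links
    for _ in range(levels):
        approx, detail = haar_step(approx)
        scores.append(residual_energy(detail, k))
    scores.append(residual_energy(approx, k))
    return scores
```

A slow, low-amplitude shift that is buried in per-sample noise accumulates in the coarse-scale coefficients, so it can stand out in the last score array even when the fine scales look normal.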
15
New Direction (2): Distributed Algorithm
- Distributed anomaly detection based on partial flow information
  - No ISP support; no information about OD flows or the traffic matrix
  - Diagnose volume anomalies from network monitoring data: flow and link information from a subset of locations in the network
- Network-wide traffic modeling, inference, and prediction to improve the measured data
- Distributed algorithms for network-wide data reduction, decomposition, and anomaly detection
  - Collective PCA
- Benefit: local analysis and low cost
16
Conclusion and Future Work
- Summary
  - Apply statistical algorithms to network traffic analysis
  - Multi-scale analysis is effective in traffic modeling and prediction
  - Contribution: multi-scale, network-wide analysis that captures both temporal correlation within a single link and spatial correlation across links
- Future work
  - Develop distributed algorithms based on multi-scale PCA
  - Explore the tradeoff among detection accuracy, false-positive rate, and computation cost
  - Build a real system for applications in traffic analysis and network health monitoring