Download presentation
Presentation is loading. Please wait.
1
Traffic Analysis: Tools for Mining Time Series
C. Faloutsos 18-731 Traffic Analysis: Tools for Mining Time Series Christos Faloutsos CMU
2
Copyright: C. Faloutsos, 2006
18-731 Thanks Prof. Dimitris Gunopulos (UCR) Mengzhi Wang (CMU) Deepay Chakrabarti (CMU) Spiros Papadimitriou (CMU) Prof. Byoung-Kee Yi (Pohang U.) 18-731 Copyright: C. Faloutsos, 2006
3
Copyright: C. Faloutsos, 2006
Outline Motivation DSP (Digital Signal Processing) Linear Forecasting (Bursty traffic - fractals and multifractals) Conclusions 18-731 Copyright: C. Faloutsos, 2006
4
Copyright: C. Faloutsos, 2006
Problem definition Given: one or more sequences x1 , x2 , … , xt , … (y1, y2, … , yt, … … ) Find similar sequences; forecasts patterns; clusters; outliers 18-731 Copyright: C. Faloutsos, 2006
5
Motivation - Applications
Financial, sales, economic series Medical ECGs +; blood pressure etc monitoring reactions to new drugs elderly care 18-731 Copyright: C. Faloutsos, 2006
6
Motivation - Applications (cont’d)
‘Smart house’ sensors monitor temperature, humidity, air quality video surveillance 18-731 Copyright: C. Faloutsos, 2006
7
Motivation - Applications (cont’d)
civil/automobile infrastructure bridge vibrations [Oppenheim+02] road conditions / traffic monitoring 18-731 Copyright: C. Faloutsos, 2006
8
Motivation - Applications (cont’d)
Weather, environment/anti-pollution volcano monitoring air/water pollutant monitoring 18-731 Copyright: C. Faloutsos, 2006
9
Motivation - Applications (cont’d)
Computer systems ‘Active Disks’ (buffering, prefetching) web servers (ditto) network traffic monitoring ... 18-731 Copyright: C. Faloutsos, 2006
10
Stream Data: Disk accesses
#bytes time 18-731 Copyright: C. Faloutsos, 2006
11
Copyright: C. Faloutsos, 2006
Problem #1: Goal: given a signal (eg., #packets over time) Find: patterns, periodicities, and/or compress count lynx caught per year (packets per day; temperature per day) year 18-731 Copyright: C. Faloutsos, 2006
12
Copyright: C. Faloutsos, 2006
18-731 Problem#2: Forecast Given xt, xt-1, …, forecast xt+1 90 80 70 60 Number of packets sent ?? 50 40 30 20 What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. 10 1 3 5 7 9 11 Time Tick 18-731 Copyright: C. Faloutsos, 2006
13
Copyright: C. Faloutsos, 2006
18-731 Problem #3: Given: A set of correlated time sequences Forecast ‘Sent(t)’ What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. 18-731 Copyright: C. Faloutsos, 2006
14
Differences from DSP/Stat
C. Faloutsos 18-731 Differences from DSP/Stat Semi-infinite streams we need on-line, ‘any-time’ algorithms Can not afford human intervention need automatic methods sensors have limited memory / processing / transmitting power need for (lossy) compression 18-731 Copyright: C. Faloutsos, 2006
15
Important observations
C. Faloutsos 18-731 Important observations Patterns, rules, forecasting and similarity indexing are closely related: To do forecasting, we need to find patterns/rules to find similar settings in the past to find outliers, we need to have forecasts (outlier = too far away from our forecast) 18-731 Copyright: C. Faloutsos, 2006
16
Important topics NOT in this tutorial:
Continuous queries [Babu+Widom ] [Gehrke+] [Madden+] Categorical data streams [Hatonen+96] Outlier detection (discontinuities) [Breunig+00] 18-731 Copyright: C. Faloutsos, 2006
17
Copyright: C. Faloutsos, 2006
DSP (Digital Signal Processing) 18-731 Copyright: C. Faloutsos, 2006
18
Copyright: C. Faloutsos, 2006
Outline Motivation DSP (DFT, DWT) Linear Forecasting Bursty traffic - fractals and multifractals Conclusions 18-731 Copyright: C. Faloutsos, 2006
19
Copyright: C. Faloutsos, 2006
Outline DFT Definition of DFT and properties how to read the DFT spectrum DWT Definition of DWT and properties how to read the DWT scalogram 18-731 Copyright: C. Faloutsos, 2006
20
Introduction - Problem#1
Goal: given a signal (eg., packets over time) Find: patterns and/or compress count lynx caught per year (packets per day; automobiles per hour) year 18-731 Copyright: C. Faloutsos, 2006
21
Copyright: C. Faloutsos, 2006
What does DFT do? A: highlights the periodicities 18-731 Copyright: C. Faloutsos, 2006
22
Copyright: C. Faloutsos, 2006
Skip DFT: definition For a sequence x0, x1, … xn-1 the (n-point) Discrete Fourier Transform is X0, X1, … Xn-1 : inverse DFT 18-731 Copyright: C. Faloutsos, 2006
23
Copyright: C. Faloutsos, 2006
Skip DFT: definition Good news: Available in all symbolic math packages, eg., in ‘mathematica’ x = [1,2,1,2]; X = Fourier[x]; Plot[ Abs[X] ]; 18-731 Copyright: C. Faloutsos, 2006
24
DFT: Amplitude spectrum
Skip DFT: Amplitude spectrum Amplitude: count Ampl. freq=0 freq=12 year Freq. 18-731 Copyright: C. Faloutsos, 2006
25
Copyright: C. Faloutsos, 2006
Skip DFT: examples flat Amplitude time freq 18-731 Copyright: C. Faloutsos, 2006
26
Copyright: C. Faloutsos, 2006
Skip DFT: examples Low frequency sinusoid time freq 18-731 Copyright: C. Faloutsos, 2006
27
Copyright: C. Faloutsos, 2006
Skip DFT: examples Sinusoid - symmetry property: Xf = X*n-f time freq 18-731 Copyright: C. Faloutsos, 2006
28
Copyright: C. Faloutsos, 2006
Skip DFT: examples Higher freq. sinusoid time freq 18-731 Copyright: C. Faloutsos, 2006
29
Copyright: C. Faloutsos, 2006
Skip DFT: examples examples + = + 18-731 Copyright: C. Faloutsos, 2006
30
Copyright: C. Faloutsos, 2006
Skip DFT: examples examples Ampl. Freq. 18-731 Copyright: C. Faloutsos, 2006
31
DFT: Amplitude spectrum
count Ampl. freq=0 freq=12 year Freq. 18-731 Copyright: C. Faloutsos, 2006
32
DFT: Amplitude spectrum
count Ampl. freq=0 freq=12 year Freq. 18-731 Copyright: C. Faloutsos, 2006
33
DFT: Amplitude spectrum
count Ampl. freq=0 freq=12 year Freq. 18-731 Copyright: C. Faloutsos, 2006
34
DFT: Amplitude spectrum
excellent approximation, with only 2 frequencies! so what? Freq. 18-731 Copyright: C. Faloutsos, 2006
35
DFT: Amplitude spectrum
excellent approximation, with only 2 frequencies! so what? A1: (lossy) compression A2: pattern discovery 18-731 Copyright: C. Faloutsos, 2006
36
DFT: Amplitude spectrum
excellent approximation, with only 2 frequencies! so what? A1: (lossy) compression A2: pattern discovery 18-731 Copyright: C. Faloutsos, 2006
37
Copyright: C. Faloutsos, 2006
DFT - Conclusions It spots periodicities (with the ‘amplitude spectrum’) can be quickly computed (O( n log n)), thanks to the FFT algorithm. standard tool in signal processing (speech, image etc signals) (closely related to DCT and JPEG) 18-731 Copyright: C. Faloutsos, 2006
38
Copyright: C. Faloutsos, 2006
Outline Motivation DSP DFT DWT Definition of DWT and properties how to read the DWT scalogram 18-731 Copyright: C. Faloutsos, 2006
39
Copyright: C. Faloutsos, 2006
Problem #1: Goal: given a signal (eg., #packets over time) Find: patterns, periodicities, and/or compress count lynx caught per year (packets per day; virus infections per month) year 18-731 Copyright: C. Faloutsos, 2006
40
Copyright: C. Faloutsos, 2006
Wavelets - DWT DFT is great - but, how about compressing a spike? value time 18-731 Copyright: C. Faloutsos, 2006
41
Copyright: C. Faloutsos, 2006
Wavelets - DWT DFT is great - but, how about compressing a spike? A: Terrible - all DFT coefficients needed! value Ampl. time Freq. 18-731 Copyright: C. Faloutsos, 2006
42
Copyright: C. Faloutsos, 2006
Wavelets - DWT DFT is great - but, how about compressing a spike? A: Terrible - all DFT coefficients needed! value time 18-731 Copyright: C. Faloutsos, 2006
43
Copyright: C. Faloutsos, 2006
Wavelets - DWT Similarly, DFT suffers on short-duration waves (eg., baritone, silence, soprano) time value 18-731 Copyright: C. Faloutsos, 2006
44
Copyright: C. Faloutsos, 2006
Wavelets - DWT Solution#1: Short window Fourier transform (SWFT) But: how short should be the window? freq time value time 18-731 Copyright: C. Faloutsos, 2006
45
Copyright: C. Faloutsos, 2006
Wavelets - DWT Answer: multiple window sizes! -> DWT Time domain DFT SWFT DWT freq time 18-731 Copyright: C. Faloutsos, 2006
46
Copyright: C. Faloutsos, 2006
Haar Wavelets subtract sum of left half from right half repeat recursively for quarters, eight-ths, ... 18-731 Copyright: C. Faloutsos, 2006
47
Wavelets - construction
x0 x1 x2 x3 x4 x5 x6 x7 18-731 Copyright: C. Faloutsos, 2006
48
Wavelets - construction
level 1 d1,0 d1,1 s1,1 + - x0 x1 x2 x3 x4 x5 x6 x7 18-731 Copyright: C. Faloutsos, 2006
49
Wavelets - construction
level 2 d2,0 s1,0 d1,0 d1,1 s1,1 + - x0 x1 x2 x3 x4 x5 x6 x7 18-731 Copyright: C. Faloutsos, 2006
50
Wavelets - construction
etc ... s2,0 d2,0 s1,0 d1,0 d1,1 s1,1 + - x0 x1 x2 x3 x4 x5 x6 x7 18-731 Copyright: C. Faloutsos, 2006
51
Wavelets - construction
Q: map each coefficient on the time-freq. plane f s2,0 d2,0 t s1,0 d1,0 d1,1 s1,1 + - x0 x1 x2 x3 x4 x5 x6 x7 18-731 Copyright: C. Faloutsos, 2006
52
Wavelets - construction
Q: map each coefficient on the time-freq. plane f s2,0 d2,0 t s1,0 d1,0 d1,1 s1,1 + - x0 x1 x2 x3 x4 x5 x6 x7 18-731 Copyright: C. Faloutsos, 2006
53
Copyright: C. Faloutsos, 2006
Haar wavelets - code #!/usr/bin/perl5 # expects a file with numbers # and prints the dwt transform # The number of time-ticks should be a power of 2 # USAGE # haar.pl <fname> # the smooth component of the signal # the high-freq. component # collect the values into the while(<>){ @vals = , split ); } my $len = my $half = int($len/2); while($half >= 1 ){ for(my $i=0; $i< $half; $i++){ $diff [$i] = ($vals[2*$i] - $vals[2*$i + 1] )/ sqrt(2); print "\t", $diff[$i]; $smooth [$i] = ($vals[2*$i] + $vals[2*$i + 1] )/ sqrt(2); } print "\n"; @vals $half = int($half/2); print "\t", $vals[0], "\n" ; # the final, smooth component 18-731 Copyright: C. Faloutsos, 2006
54
Wavelets - construction
Observation1: ‘+’ can be some weighted addition ‘-’ is the corresponding weighted difference (‘Quadrature mirror filters’) Observation2: unlike DFT/DCT, there are *many* wavelet bases: Haar, Daubechies-4, Daubechies-6, Coifman, Morlet, Gabor, ... 18-731 Copyright: C. Faloutsos, 2006
55
Wavelets - how do they look like?
E.g., Daubechies-4 18-731 Copyright: C. Faloutsos, 2006
56
Wavelets - how do they look like?
E.g., Daubechies-4 ? ? 18-731 Copyright: C. Faloutsos, 2006
57
Wavelets - how do they look like?
E.g., Daubechies-4 18-731 Copyright: C. Faloutsos, 2006
58
Copyright: C. Faloutsos, 2006
Outline Motivation DSP DFT DWT Definition of DWT and properties how to read the DWT scalogram 18-731 Copyright: C. Faloutsos, 2006
59
Copyright: C. Faloutsos, 2006
Wavelets - Drill#1: Q: baritone/silence/soprano - DWT? t f time value 18-731 Copyright: C. Faloutsos, 2006
60
Copyright: C. Faloutsos, 2006
Wavelets - Drill#1: Q: baritone/soprano - DWT? f t time value 18-731 Copyright: C. Faloutsos, 2006
61
Copyright: C. Faloutsos, 2006
Wavelets - Drill#2: Q: spike - DWT? t f 18-731 Copyright: C. Faloutsos, 2006
62
Copyright: C. Faloutsos, 2006
Wavelets - Drill#2: Q: spike - DWT? -0.35 0.35 f t 18-731 Copyright: C. Faloutsos, 2006
63
Copyright: C. Faloutsos, 2006
Wavelets - Drill#3: Q: weekly + daily periodicity, + spike - DWT? f t 18-731 Copyright: C. Faloutsos, 2006
64
Copyright: C. Faloutsos, 2006
Wavelets - Drill#3: Q: weekly + daily periodicity, + spike - DWT? f t 18-731 Copyright: C. Faloutsos, 2006
65
Copyright: C. Faloutsos, 2006
Wavelets - Drill#3: Q: weekly + daily periodicity, + spike - DWT? f t 18-731 Copyright: C. Faloutsos, 2006
66
Copyright: C. Faloutsos, 2006
Wavelets - Drill#3: Q: weekly + daily periodicity, + spike - DWT? f t 18-731 Copyright: C. Faloutsos, 2006
67
Copyright: C. Faloutsos, 2006
Wavelets - Drill#3: Q: weekly + daily periodicity, + spike - DWT? f t 18-731 Copyright: C. Faloutsos, 2006
68
Copyright: C. Faloutsos, 2006
Wavelets - Drill#3: Q: DFT? DWT DFT t f f t 18-731 Copyright: C. Faloutsos, 2006
69
Advantages of Wavelets
Better compression (better RMSE with same number of coefficients - used in JPEG-2000) fast to compute (usually: O(n)!) very good for ‘spikes’ mammalian eye and ear: Gabor wavelets 18-731 Copyright: C. Faloutsos, 2006
70
Copyright: C. Faloutsos, 2006
Overall Conclusions DFT, DCT spot periodicities DWT : multi-resolution - matches processing of mammalian ear/eye better All three: powerful tools for compression, pattern detection in real signals All three: included in math packages (matlab, ‘R’, mathematica, … - often in spreadsheets!) 18-731 Copyright: C. Faloutsos, 2006
71
Copyright: C. Faloutsos, 2006
Overall Conclusions DWT : very suitable for self-similar traffic DWT: used for summarization of streams [Gilbert+01], db histograms etc 18-731 Copyright: C. Faloutsos, 2006
72
Resources - software and urls
: Nice java applets for FFT voice frequency analyzer (needs microphone) 18-731 Copyright: C. Faloutsos, 2006
73
Resources: software and urls
xwpl: open source wavelet package from Yale, with excellent GUI : wavelets and scalograms 18-731 Copyright: C. Faloutsos, 2006
74
Copyright: C. Faloutsos, 2006
Books William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery: Numerical Recipes in C, Cambridge University Press, 1992, 2nd Edition. (Great description, intuition and code for DFT, DWT) C. Faloutsos: Searching Multimedia Databases by Content, Kluwer Academic Press, 1996 (introduction to DFT, DWT) 18-731 Copyright: C. Faloutsos, 2006
75
Copyright: C. Faloutsos, 2006
Additional Reading [Gilbert+01] Anna C. Gilbert, Yannis Kotidis and S. Muthukrishnan and Martin Strauss, Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries, VLDB 2001 [HHP03] A. Hussain, J. Heidemann, C. Papadopoulos, A Framework for Classifying Denial of Service Attacks Proc. SIGCOMM 2003 18-731 Copyright: C. Faloutsos, 2006
76
Copyright: C. Faloutsos, 2006
Linear forecasting 18-731 Copyright: C. Faloutsos, 2006
77
Copyright: C. Faloutsos, 2006
Outline Motivation DSP Linear Forecasting Bursty traffic - fractals and multifractals Conclusions 18-731 Copyright: C. Faloutsos, 2006
78
Copyright: C. Faloutsos, 2006
Forecasting "Prediction is very difficult, especially about the future." - Nils Bohr 18-731 Copyright: C. Faloutsos, 2006
79
Copyright: C. Faloutsos, 2006
Outline Motivation ... Linear Forecasting Auto-regression: Least Squares; RLS Co-evolving time sequences Examples Conclusions 18-731 Copyright: C. Faloutsos, 2006
80
Copyright: C. Faloutsos, 2006
18-731 Problem#2: Forecast Example: give xt-1, xt-2, …, forecast xt 90 80 70 60 Number of packets sent ?? 50 40 30 20 What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. 10 1 3 5 7 9 11 Time Tick 18-731 Copyright: C. Faloutsos, 2006
81
Forecasting: Preprocessing
C. Faloutsos 18-731 Forecasting: Preprocessing MANUALLY: remove trends spot periodicities 7 days What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. time time 18-731 Copyright: C. Faloutsos, 2006
82
Copyright: C. Faloutsos, 2006
18-731 Problem#2: Forecast Solution: try to express xt as a linear function of the past: xt-2, xt-2, …, (up to a window of w) Formally: 10 20 30 40 50 60 70 80 90 1 3 5 7 9 11 Time Tick ?? What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. 18-731 Copyright: C. Faloutsos, 2006
83
(Problem: Back-cast; interpolate)
C. Faloutsos 18-731 (Problem: Back-cast; interpolate) Solution - interpolate: try to express xt as a linear function of the past AND the future: xt+1, xt+2, … xt+wfuture; xt-1, … xt-wpast (up to windows of wpast , wfuture) EXACTLY the same algo’s ?? 90 80 70 60 50 What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. 40 30 20 10 1 3 5 7 9 11 Time Tick 18-731 Copyright: C. Faloutsos, 2006
84
Linear Regression: idea
C. Faloutsos 18-731 Linear Regression: idea 85 Body height 80 75 70 65 60 55 50 45 40 15 25 35 45 Body weight express what we don’t know (= ‘dependent variable’) as a linear function of what we know (= ‘indep. variable(s)’) For those in the audience who are not familiar with linear regression, let me give you an example to show how it works. Here we have a scatter plot of sample points we want to regress. The X-axis is for the independent variable and the Y-axis for the dependent variable. The goal is to find a line which best approximated the trend of the samples. More specifically we seek a line which minimize the sum of squared estimation errors. This is a uni-variate regression. In the multi-variate case, it is done in a higher dimensional space and the line becomes a hyper-plane in that case. 18-731 Copyright: C. Faloutsos, 2006
85
Linear Auto Regression:
C. Faloutsos 18-731 Linear Auto Regression: For those in the audience who are not familiar with linear regression, let me give you an example to show how it works. Here we have a scatter plot of sample points we want to regress. The X-axis is for the independent variable and the Y-axis for the dependent variable. The goal is to find a line which best approximated the trend of the samples. More specifically we seek a line which minimize the sum of squared estimation errors. This is a uni-variate regression. In the multi-variate case, it is done in a higher dimensional space and the line becomes a hyper-plane in that case. 18-731 Copyright: C. Faloutsos, 2006
86
Linear Auto Regression:
C. Faloutsos 18-731 Linear Auto Regression: 85 ‘lag-plot’ 80 75 70 Number of packets sent (t) 65 60 55 50 45 40 15 25 35 45 Number of packets sent (t-1) lag w=1 Dependent variable = # of packets sent (S [t]) Independent variable = # of packets sent (S[t-1]) For those in the audience who are not familiar with linear regression, let me give you an example to show how it works. Here we have a scatter plot of sample points we want to regress. The X-axis is for the independent variable and the Y-axis for the dependent variable. The goal is to find a line which best approximated the trend of the samples. More specifically we seek a line which minimize the sum of squared estimation errors. This is a uni-variate regression. In the multi-variate case, it is done in a higher dimensional space and the line becomes a hyper-plane in that case. 18-731 Copyright: C. Faloutsos, 2006
87
Copyright: C. Faloutsos, 2006
Outline Motivation ... Linear Forecasting Auto-regression: Least Squares; RLS Co-evolving time sequences Examples Conclusions 18-731 Copyright: C. Faloutsos, 2006
88
Copyright: C. Faloutsos, 2006
18-731 More details: Q1: Can it work with window w>1? A1: YES! xt This presentation is organized as follows: In the introduction, let me describe what motivated this research. I also describe the kinds of problems we are trying to solve. Nest, I’ll talk about our main contributions, namely, MUSCLES and Selective MUSCLES. Then, we report some of experimental results followed by concluding remarks and discussion of future research direction. xt-1 xt-2 18-731 Copyright: C. Faloutsos, 2006
89
Copyright: C. Faloutsos, 2006
18-731 More details: Q1: Can it work with window w>1? A1: YES! (we’ll fit a hyper-plane, then!) xt This presentation is organized as follows: In the introduction, let me describe what motivated this research. I also describe the kinds of problems we are trying to solve. Nest, I’ll talk about our main contributions, namely, MUSCLES and Selective MUSCLES. Then, we report some of experimental results followed by concluding remarks and discussion of future research direction. xt-1 xt-2 18-731 Copyright: C. Faloutsos, 2006
90
Copyright: C. Faloutsos, 2006
18-731 More details: Q1: Can it work with window w>1? A1: YES! (we’ll fit a hyper-plane, then!) xt This presentation is organized as follows: In the introduction, let me describe what motivated this research. I also describe the kinds of problems we are trying to solve. Nest, I’ll talk about our main contributions, namely, MUSCLES and Selective MUSCLES. Then, we report some of experimental results followed by concluding remarks and discussion of future research direction. xt-1 xt-2 18-731 Copyright: C. Faloutsos, 2006
91
Copyright: C. Faloutsos, 2006
18-731 Skip More details: Q1: Can it work with window w>1? A1: YES! The problem becomes: X[N w] a[w 1] = y[N 1] OVER-CONSTRAINED a is the vector of the regression coefficients X has the N values of the w indep. variables y has the N values of the dependent variable This presentation is organized as follows: In the introduction, let me describe what motivated this research. I also describe the kinds of problems we are trying to solve. Nest, I’ll talk about our main contributions, namely, MUSCLES and Selective MUSCLES. Then, we report some of experimental results followed by concluding remarks and discussion of future research direction. 18-731 Copyright: C. Faloutsos, 2006
92
Copyright: C. Faloutsos, 2006
18-731 Skip More details: X[N w] a[w 1] = y[N 1] Ind-var1 Ind-var-w time This presentation is organized as follows: In the introduction, let me describe what motivated this research. I also describe the kinds of problems we are trying to solve. Nest, I’ll talk about our main contributions, namely, MUSCLES and Selective MUSCLES. Then, we report some of experimental results followed by concluding remarks and discussion of future research direction. 18-731 Copyright: C. Faloutsos, 2006
93
Copyright: C. Faloutsos, 2006
18-731 Skip More details: X[N w] a[w 1] = y[N 1] Ind-var1 Ind-var-w time This presentation is organized as follows: In the introduction, let me describe what motivated this research. I also describe the kinds of problems we are trying to solve. Nest, I’ll talk about our main contributions, namely, MUSCLES and Selective MUSCLES. Then, we report some of experimental results followed by concluding remarks and discussion of future research direction. 18-731 Copyright: C. Faloutsos, 2006
94
Copyright: C. Faloutsos, 2006
Skip More details Q2: How to estimate a1, a2, … aw = a? A2: with Least Squares fit (Moore-Penrose pseudo-inverse) a is the vector that minimizes the RMSE from y a = ( XT X )-1 (XT y) 18-731 Copyright: C. Faloutsos, 2006
95
Copyright: C. Faloutsos, 2006
Skip Even more details Q3: Can we estimate a incrementally? A3: Yes, with the brilliant, classic method of ‘Recursive Least Squares’ (RLS) (see, e.g., [Yi+00], for details) - pictorially: 18-731 Copyright: C. Faloutsos, 2006
96
Copyright: C. Faloutsos, 2006
Skip Even more details Given: Independent Variable Dependent Variable 18-731 Copyright: C. Faloutsos, 2006
97
Copyright: C. Faloutsos, 2006
Skip Even more details . new point Dependent Variable Independent Variable 18-731 Copyright: C. Faloutsos, 2006
98
Copyright: C. Faloutsos, 2006
Skip Even more details RLS: quickly compute new best fit new point Dependent Variable Independent Variable 18-731 Copyright: C. Faloutsos, 2006
99
Copyright: C. Faloutsos, 2006
Skip Even more details Straightforward Least Squares Needs huge matrix (growing in size) O(N×w) Costly matrix operation O(N×w2) Recursive LS Need much smaller, fixed size matrix O(w×w) Fast, incremental computation O(1×w2) N = 106, w = 1-100 18-731 Copyright: C. Faloutsos, 2006
100
Copyright: C. Faloutsos, 2006
Skip Even more details Q4: can we ‘forget’ the older samples? A4: Yes - RLS can easily handle that [Yi+00]: 18-731 Copyright: C. Faloutsos, 2006
101
Adaptability - ‘forgetting’
C. Faloutsos 18-731 Skip Adaptability - ‘forgetting’ Independent Variable eg., #packets sent Dependent Variable eg., #bytes sent Now, we discuss about the issue of adaptability. In the scatter plot presented here, we have <click> some older samples and <click> some newer samples. <click> As we can easily see, there was a sudden change of trend in the middle. What we wish to have after we saw the new samples is <click> the green regression line, but direct application of linear regression gives <click> the red line since it considers old and new samples equally important. To address the issue, we introduce lambda, a forgetting factor, into the framework. Then older data get forgotten smoothly as time goes by. 18-731 Copyright: C. Faloutsos, 2006
102
Adaptability - ‘forgetting’
C. Faloutsos 18-731 Skip Adaptability - ‘forgetting’ Trend change (R)LS with no forgetting Dependent Variable eg., #bytes sent Independent Variable eg. #packets sent Now, we discuss about the issue of adaptability. In the scatter plot presented here, we have <click> some older samples and <click> some newer samples. <click> As we can easily see, there was a sudden change of trend in the middle. What we wish to have after we saw the new samples is <click> the green regression line, but direct application of linear regression gives <click> the red line since it considers old and new samples equally important. To address the issue, we introduce lambda, a forgetting factor, into the framework. Then older data get forgotten smoothly as time goes by. 18-731 Copyright: C. Faloutsos, 2006
103
Adaptability - ‘forgetting’
C. Faloutsos 18-731 Skip Adaptability - ‘forgetting’ Trend change (R)LS with no forgetting Dependent Variable (R)LS with forgetting Independent Variable Now, we discuss about the issue of adaptability. In the scatter plot presented here, we have <click> some older samples and <click> some newer samples. <click> As we can easily see, there was a sudden change of trend in the middle. What we wish to have after we saw the new samples is <click> the green regression line, but direct application of linear regression gives <click> the red line since it considers old and new samples equally important. To address the issue, we introduce lambda, a forgetting factor, into the framework. Then older data get forgotten smoothly as time goes by. RLS: can *trivially* handle ‘forgetting’ 18-731 Copyright: C. Faloutsos, 2006
104
Copyright: C. Faloutsos, 2006
How to choose ‘w’? goal: capture arbitrary periodicities with NO human intervention on a semi-infinite stream 18-731 Copyright: C. Faloutsos, 2006
105
Copyright: C. Faloutsos, 2006
Answer: ‘AWSOM’ (Arbitrary Window Stream fOrecasting Method) [Papadimitriou+, vldb2003] idea: do AR on each wavelet level in detail: 18-731 Copyright: C. Faloutsos, 2006
106
Copyright: C. Faloutsos, 2006
AWSOM xt t W1,1 W1,2 W1,3 W1,4 W2,1 W2,2 W3,1 V4,1 time frequency = t 18-731 Copyright: C. Faloutsos, 2006
107
Copyright: C. Faloutsos, 2006
AWSOM xt t W1,1 W1,2 W1,3 W1,4 W2,1 W2,2 W3,1 V4,1 time frequency t 18-731 Copyright: C. Faloutsos, 2006
108
Copyright: C. Faloutsos, 2006
AWSOM - idea Wl,t Wl,t-1 Wl,t-2 Wl,t l,1Wl,t-1 l,2Wl,t-2 … Wl’,t’-1 Wl’,t’-2 Wl’,t’ l’,1Wl’,t’-1 l’,2Wl’,t’-2 … Wl’,t’ 18-731 Copyright: C. Faloutsos, 2006
109
Copyright: C. Faloutsos, 2006
More details… Update of wavelet coefficients Update of linear models Feature selection Not all correlations are significant Throw away the insignificant ones (“noise”) (incremental) (incremental; RLS) (single-pass) 18-731 Copyright: C. Faloutsos, 2006
110
Results - Synthetic data
AWSOM AR Seasonal AR Triangle pulse Mix (sine + square) AR captures wrong trend (or none) Seasonal AR estimation fails 18-731 Copyright: C. Faloutsos, 2006
111
Copyright: C. Faloutsos, 2006
Results - Real data Automobile traffic Daily periodicity Bursty “noise” at smaller scales AR fails to capture any trend Seasonal AR estimation fails 18-731 Copyright: C. Faloutsos, 2006
112
Copyright: C. Faloutsos, 2006
Results - real data Sunspot intensity Slightly time-varying “period” AR captures wrong trend Seasonal ARIMA wrong downward trend, despite help by human! 18-731 Copyright: C. Faloutsos, 2006
113
Copyright: C. Faloutsos, 2006
Skip Complexity Model update Space: OlgN + mk2 OlgN Time: Ok2 O1 Where N: number of points (so far) k: number of regression coefficients; fixed m: number of linear models; OlgN 18-731 Copyright: C. Faloutsos, 2006
114
Copyright: C. Faloutsos, 2006
Outline Motivation ... Linear Forecasting Auto-regression: Least Squares; RLS Co-evolving time sequences Examples Conclusions 18-731 Copyright: C. Faloutsos, 2006
115
Co-Evolving Time Sequences
C. Faloutsos 18-731 Co-Evolving Time Sequences Given: A set of correlated time sequences Forecast ‘Repeated(t)’ ?? What are co-evolving time sequences? We define them as a set of correlated time sequences possibly with some time lags. Let us take an example. Here we have three sequences. Taking a closer look, we can see that the blue sequence is the same as the red one except that it is a shifted version. Also, the green sequence is somehow correlated with the red one. What we propose is to study them jointly, to take advantage of the existence of such correlations. 18-731 Copyright: C. Faloutsos, 2006
116
Copyright: C. Faloutsos, 2006
Solution: Q: what should we do? 18-731 Copyright: C. Faloutsos, 2006
117
Copyright: C. Faloutsos, 2006
Solution: Least Squares, with Dep. Variable: Repeated(t) Indep. Variables: Sent(t-1) … Sent(t-w); Lost(t-1) …Lost(t-w); Repeated(t-1), ... (named: ‘MUSCLES’ [Yi+00]) 18-731 Copyright: C. Faloutsos, 2006
118
Examples - Experiments
C. Faloutsos 18-731 Skip Examples - Experiments Datasets Modem pool traffic (14 modems, 1500 time-ticks; #packets per time unit) AT&T WorldNet internet usage (several data streams; 980 time-ticks) Measures of success Accuracy : Root Mean Square Error (RMSE) We performed experiments on three real datasets. Currency exchange rates, modem pool traffic data, and AT&T WorldNet internet usage data. Detailed description of the datasets are in the proceeding. We compared our proposed method with two competitors. Auto-Regressive, AR for short, is a method that consider only past values of a sequence. “yesterday” method is the simplest one which, as the name suggests, uses the previous value. It, however, has been known to be very effective for financial time sequences and used as a yard-stick method. For comparisons, we measured accuracy and response time. 18-731 Copyright: C. Faloutsos, 2006
119
Copyright: C. Faloutsos, 2006
18-731 Skip Accuracy - “Modem” MUSCLES outperforms AR & “yesterday” 18-731 Copyright: C. Faloutsos, 2006
120
Copyright: C. Faloutsos, 2006
18-731 Skip Accuracy - “Internet” MUSCLES consistently outperforms AR & “yesterday” 18-731 Copyright: C. Faloutsos, 2006
121
B.II - Time Series Analysis - Outline
Skip B.II - Time Series Analysis - Outline Auto-regression Least Squares; recursive least squares Co-evolving time sequences Examples Conclusions 18-731 Copyright: C. Faloutsos, 2006
122
Conclusions - Practitioner’s guide
AR(IMA) methodology: prevailing method for linear forecasting Brilliant method of Recursive Least Squares for fast, incremental estimation. See [Box-Jenkins] very recently: AWSOM (no human intervention) 18-731 Copyright: C. Faloutsos, 2006
123
Resources: software and urls
MUSCLES: Prof. Byoung-Kee Yi: or free-ware: ‘R’ for stat. analysis (clone of Splus) 18-731 Copyright: C. Faloutsos, 2006
124
Copyright: C. Faloutsos, 2006
Books George E.P. Box and Gwilym M. Jenkins and Gregory C. Reinsel, Time Series Analysis: Forecasting and Control, Prentice Hall, 1994 (the classic book on ARIMA, 3rd ed.) Brockwell, P. J. and R. A. Davis (1987). Time Series: Theory and Methods. New York, Springer Verlag. 18-731 Copyright: C. Faloutsos, 2006
125
Copyright: C. Faloutsos, 2006
Additional Reading [Papadimitriou+ vldb2003] Spiros Papadimitriou, Anthony Brockwell and Christos Faloutsos Adaptive, Hands-Off Stream Mining VLDB 2003, Berlin, Germany, Sept. 2003 [Yi+00] Byoung-Kee Yi et al.: Online Data Mining for Co-Evolving Time Sequences, ICDE (Describes MUSCLES and Recursive Least Squares) 18-731 Copyright: C. Faloutsos, 2006
126
Copyright: C. Faloutsos, 2006
Bursty traffic and multifractals 18-731 Copyright: C. Faloutsos, 2006
127
Copyright: C. Faloutsos, 2006
Outline Motivation DSP Linear Forecasting Bursty traffic - fractals and multifractals Conclusions 18-731 Copyright: C. Faloutsos, 2006
128
Copyright: C. Faloutsos, 2006
Outline Motivation ... Linear Forecasting Bursty traffic - fractals and multifractals Problem Main idea (80/20, Hurst exponent) Results 18-731 Copyright: C. Faloutsos, 2006
129
Copyright: C. Faloutsos, 2006
Recall: Problem #1: Goal: given a signal (eg., #bytes over time) Find: patterns, periodicities, and/or compress #bytes Bytes per 30’ (packets per day; earthquakes per year) time 18-731 Copyright: C. Faloutsos, 2006
130
Copyright: C. Faloutsos, 2006
Problem #1 model bursty traffic generate realistic traces (Poisson does not work) # bytes Poisson time 18-731 Copyright: C. Faloutsos, 2006
131
Copyright: C. Faloutsos, 2006
Motivation predict queue length distributions (e.g., to give probabilistic guarantees) “learn” traffic, for buffering, prefetching, ‘active disks’, web servers 18-731 Copyright: C. Faloutsos, 2006
132
Copyright: C. Faloutsos, 2006
Q: any ‘pattern’? Not Poisson spike; silence; more spikes; more silence… any rules? # bytes time 18-731 Copyright: C. Faloutsos, 2006
133
solution: self-similarity
# bytes # bytes time time 18-731 Copyright: C. Faloutsos, 2006
134
Copyright: C. Faloutsos, 2006
But: Q1: How to generate realistic traces; extrapolate; give guarantees? Q2: How to estimate the model parameters? 18-731 Copyright: C. Faloutsos, 2006
135
Copyright: C. Faloutsos, 2006
Outline Motivation ... Linear Forecasting Bursty traffic - fractals and multifractals Problem Main idea (80/20, Hurst exponent) Results 18-731 Copyright: C. Faloutsos, 2006
136
Copyright: C. Faloutsos, 2006
Approach Q1: How to generate a sequence, that is bursty self-similar and has similar queue length distributions 18-731 Copyright: C. Faloutsos, 2006
137
Copyright: C. Faloutsos, 2006
Approach A: ‘binomial multifractal’ [Wang+02] ~ ‘law’: 80% of bytes/queries etc on first half repeat recursively b: bias factor (eg., 80%) 18-731 Copyright: C. Faloutsos, 2006
138
Copyright: C. Faloutsos, 2006
binary multifractals 20 80 18-731 Copyright: C. Faloutsos, 2006
139
Copyright: C. Faloutsos, 2006
binary multifractals 20 80 18-731 Copyright: C. Faloutsos, 2006
140
Copyright: C. Faloutsos, 2006
Parameter estimation Q2: How to estimate the bias factor b? 18-731 Copyright: C. Faloutsos, 2006
141
Copyright: C. Faloutsos, 2006
Parameter estimation Q2: How to estimate the bias factor b? A: MANY ways [Crovella+96] Hurst exponent variance plot even DFT amplitude spectrum! (‘periodogram’) More robust: ‘entropy plot’ [Wang+02] 18-731 Copyright: C. Faloutsos, 2006
142
Some more entropy plots:
Poisson vs real 1 0.73 Poisson: slope = ~1 -> uniformly distributed 18-731 Copyright: C. Faloutsos, 2006
143
Copyright: C. Faloutsos, 2006
Model validation Linear entropy plots Bias factors b: smallest b / smoothest: nntp traffic 18-731 Copyright: C. Faloutsos, 2006
144
Copyright: C. Faloutsos, 2006
Web traffic - results LBL, NCDF of queue lengths (log-log scales) Prob( >l) (queue length l) 18-731 Copyright: C. Faloutsos, 2006
145
Copyright: C. Faloutsos, 2006
Conclusions Multifractals (80/20, ‘b-model’, Multiplicative Wavelet Model (MWM)) for analysis and synthesis of bursty traffic 18-731 Copyright: C. Faloutsos, 2006
146
Copyright: C. Faloutsos, 2006
Books Fractals: Manfred Schroeder: Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise W.H. Freeman and Company, 1991 (Probably the BEST book on fractals!) 18-731 Copyright: C. Faloutsos, 2006
147
Copyright: C. Faloutsos, 2006
Further reading: Crovella, M. and A. Bestavros (1996). Self-Similarity in World Wide Web Traffic, Evidence and Possible Causes. Sigmetrics. [ieeeTN94] W. E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, On the Self-Similar Nature of Ethernet Traffic, IEEE Transactions on Networking, 2, 1, pp 1-15, Feb 18-731 Copyright: C. Faloutsos, 2006
148
Copyright: C. Faloutsos, 2006
Further reading [Riedi+99] R. H. Riedi, M. S. Crouse, V. J. Ribeiro, and R. G. Baraniuk, A Multifractal Wavelet Model with Application to Network Traffic, IEEE Special Issue on Information Theory, 45. (April 1999), [Wang+02] Mengzhi Wang, Tara Madhyastha, Ngai Hang Chang, Spiros Papadimitriou and Christos Faloutsos, Data Mining Meets Performance Evaluation: Fast Algorithms for Modeling Bursty Traffic, ICDE 2002, San Jose, CA, 2/26/ /1/2002. 18-731 Copyright: C. Faloutsos, 2006
149
Copyright: C. Faloutsos, 2006
Overall conclusions Signal processing: DWT is a powerful tool Linear Forecasting: AR (Box-Jenkins) methodology; AWSOM Bursty traffic: multifractals (80-20 ‘law’) 18-731 Copyright: C. Faloutsos, 2006
150
Copyright: C. Faloutsos, 2006
THANK YOU! 18-731 Copyright: C. Faloutsos, 2006
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.