Published by Teresa Cunningham. Modified over 8 years ago.
1
1 Interesting Links
2
On the Self-Similar Nature of Ethernet Traffic
Will E. Leland, Walter Willinger, and Daniel V. Wilson (Bellcore); Murad S. Taqqu (Boston University)
Analysis and Prediction of the Dynamic Behavior of Applications, Hosts, and Networks
3
3 Overview: What is self-similarity? Ethernet traffic is self-similar. Source of self-similarity. Implications of self-similarity.
4
4 Background. Network traffic did not obey the Poisson assumptions used in queuing analysis. This paper, for the first time, provided an explanation and a systematic approach to modeling realistic data traffic patterns. It sparked off research around the globe: results show self-similarity in ATM traffic, compressed digital video streams, and Web traffic.
5
5 Why is Self-Similarity Important? In this paper, Ethernet traffic has been identified as being self-similar. Models like Poisson are not able to capture the self-similarity property. This leads to inaccurate performance evaluation
6
Section 1: What is Self-Similarity?
7
7 What is Self-Similarity? Self-similarity describes the phenomenon where a certain property of an object is preserved with respect to scaling in space and/or time. If an object is self-similar, its parts, when magnified, resemble the shape of the whole. In case of stochastic objects like time-series, self-similarity is used in the distributional sense
8
8 Intuition of Self-Similarity Something “feels the same” regardless of scale (also called fractals)
12
12 Self-Similarity in Traffic Measurement ( Ⅰ ) Traffic Measurement
13
13 Pictorial View of Self-Similarity
14
14 The Famous Data. Leland and Wilson collected hundreds of millions of Ethernet packets without loss, with recorded timestamps accurate to within 100 µs. The data were collected from several Ethernet LANs at the Bellcore Morristown Research and Engineering Center at different times over the course of approximately 4 years.
16
16 Why is Self-Similarity Important? Recently, network packet traffic has been identified as being self-similar. Current network traffic models based on Poisson arrivals (and similar processes) do not take the self-similar nature of traffic into account. This leads to inaccurate modeling which, when applied to a huge network like the Internet, can lead to huge financial losses.
17
17 Problems with Current Models. Current models predict that as the number of sources (Ethernet users) increases, the traffic becomes smoother and smoother. Analysis of the measured traffic shows the opposite: it tends to become less smooth and more bursty as the number of active sources increases.
18
18 Consequences of Self-Similarity. Traffic has similar statistical properties at a range of timescales: milliseconds, seconds, minutes, hours, days. Merging of traffic (as in a statistical multiplexer) does not result in smoothing of traffic. Diagram: bursty data streams -> aggregation -> bursty aggregate streams.
19
19 Problems with Current Models Cont.’d Were traffic to follow a Poisson or Markovian arrival process, it would have a characteristic burst length which would tend to be smoothed by averaging over a long enough time scale. Rather, measurements of real traffic indicate that significant traffic variance (burstiness) is present on a wide range of time scales
20
20 Pictorial View of Current Modeling
21
21 Side-by-side View
22
22 Definitions and Properties. Long-range dependence: the autocorrelation decays slowly. Hurst parameter: developed by Harold Hurst (1965). H is a measure of “burstiness”, also considered a measure of self-similarity. 0 < H < 1. H increases as traffic increases.
23
23 Definitions and Properties Cont.’d. Figure: low, medium, and high traffic hours. As traffic increases, the Hurst parameter increases, i.e., traffic becomes more self-similar.
24
24 Self-Similarity in Traffic Measurement ( Ⅱ ) Network Traffic
25
25 Properties of Self Similarity. $X = (X_t : t = 0, 1, 2, \ldots)$ is a covariance-stationary random process (i.e. $\mathrm{Cov}(X_t, X_{t+k})$ does not depend on $t$ for all $k$). Let $X^{(m)} = \{X_k^{(m)}\}$ denote the new process obtained by averaging the original series $X$ in non-overlapping sub-blocks of size $m$. E.g. $X^{(1)} = 4, 12, 34, 2, -6, 18, 21, 35$. Then $X^{(2)} = 8, 18, 6, 28$ and $X^{(4)} = 13, 17$. Mean $\mu$, variance $\sigma^2$. Suppose that the autocorrelation function satisfies $r(k) \sim k^{-\beta}$ as $k \to \infty$, where $0 < \beta < 1$.
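The block averaging that defines the aggregated process can be sketched in a few lines of Python. The function name `aggregate` is only an illustrative choice, and dropping a trailing partial block is an assumption the slide does not address.

```python
def aggregate(series, m):
    """Average `series` over non-overlapping blocks of size m.

    Assumption: a trailing block shorter than m is dropped.
    """
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


# The series used on the slide.
x1 = [4, 12, 34, 2, -6, 18, 21, 35]
x2 = aggregate(x1, 2)
x4 = aggregate(x1, 4)
print(x2)  # [8.0, 18.0, 6.0, 28.0]
print(x4)  # [13.0, 17.0]
```

Aggregating the already aggregated series `x2` with block size 2 gives the same result as aggregating `x1` with block size 4.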
26
26 Auto-correlation Definition. X is exactly second-order self-similar if the aggregated processes have the same autocorrelation structure as X, i.e. $r^{(m)}(k) = r(k)$, $k \ge 0$, for all $m = 1, 2, \ldots$ X is asymptotically second-order self-similar if the above holds in the limit, i.e. $r^{(m)}(k) \to r(k)$ as $m \to \infty$. Most striking feature of self-similarity: the correlation structures of the aggregated processes do not degenerate as $m \to \infty$.
27
27 Traditional Models. This is in contrast to traditional models, where the correlation structures of the aggregated processes degenerate as $m \to \infty$, i.e. $r^{(m)}(k) \to 0$ as $m \to \infty$ for $k = 1, 2, 3, \ldots$ Example (figure panels): Poisson distribution vs. self-similar distribution.
29
29 Long Range Dependence. Processes with long-range dependence are characterized by an autocorrelation function that decays hyperbolically as $k$ increases. Important property: $\sum_k r(k) = \infty$. This is also called non-summability of correlation. The intuition behind long-range dependence: while high-lag correlations are all individually small, their cumulative effect is important. This gives rise to features drastically different from conventional short-range dependent processes.
30
30 Intuition. Short-range processes: exponential decay of autocorrelations, i.e. $r(k) \sim \rho^k$ as $k \to \infty$, $0 < \rho < 1$, so the summation $\sum_k r(k)$ is finite. Non-summability is an important property: it guarantees a non-degenerate correlation structure of the aggregated processes $X^{(m)}$ as $m \to \infty$.
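To make the contrast between summable and non-summable correlations concrete, the sketch below compares partial sums of a geometrically decaying autocorrelation with those of a hyperbolically decaying one. The decay base `rho = 0.9` and exponent `beta = 0.5` are arbitrary illustrative values, not figures from the paper.

```python
def partial_sum(r, n_lags):
    """Sum the autocorrelation function r(k) over lags k = 1..n_lags."""
    return sum(r(k) for k in range(1, n_lags + 1))


rho, beta = 0.9, 0.5  # illustrative values only


def short_range(k):
    """Exponentially (geometrically) decaying autocorrelation."""
    return rho ** k


def long_range(k):
    """Hyperbolically decaying autocorrelation, r(k) = k^(-beta)."""
    return k ** -beta


for n in (10, 100, 1000, 10000):
    print(n, round(partial_sum(short_range, n), 3), round(partial_sum(long_range, n), 3))
```

The first column of sums levels off near rho / (1 - rho) = 9, while the second keeps growing roughly like the square root of the number of lags.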
31
31 The Measure of Self-Similarity. Hurst parameter H, 0.5 < H < 1. Three approaches to estimate H (based on properties of self-similar processes): (1) variance analysis of the aggregated processes; (2) analysis of the rescaled range (R/S) statistic for different block sizes; (3) a Whittle estimator.
32
32 Variance Analysis. The variance of the aggregated processes decays as $\mathrm{Var}(X^{(m)}) \sim a\,m^{-\beta}$ as $m \to \infty$, with $0 < \beta < 1$. For short-range dependent processes (e.g. a Poisson process), $\mathrm{Var}(X^{(m)}) \sim a\,m^{-1}$ as $m \to \infty$. Plot $\mathrm{Var}(X^{(m)})$ against $m$ on a log-log plot: a slope greater than $-1$ is indicative of self-similarity.
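A minimal sketch of the variance-time analysis follows, assuming the relation H = 1 - β/2 that appears later in the deck, so that H = 1 + slope/2. The helper names, the choice of block sizes, and the ordinary least-squares fit are assumptions for illustration, and the test series is synthetic independent Gaussian noise rather than the Bellcore traces.

```python
import math
import random


def aggregate(series, m):
    """Average over non-overlapping blocks of size m (partial block dropped)."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def hurst_variance_time(series, block_sizes):
    """Fit log Var(X^(m)) against log m; return (H estimate, fitted slope)."""
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(variance(aggregate(series, m))) for m in block_sizes]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    slope = num / den
    return 1 + slope / 2, slope


random.seed(0)
iid = [random.gauss(0, 1) for _ in range(8192)]  # short-range reference data
h, slope = hurst_variance_time(iid, [1, 2, 4, 8, 16, 32, 64])
print(round(h, 2), round(slope, 2))
```

For independent samples the fitted slope should be close to -1 and the estimate close to H = 0.5; a self-similar trace would give a slope noticeably above -1.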
34
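34 The R/S statistic. For a given set of observations $X_1, \ldots, X_n$ with sample mean $\bar{X}(n)$ and sample standard deviation $S(n)$, the rescaled adjusted range or R/S statistic is given by $R(n)/S(n) = \frac{1}{S(n)}\left[\max(0, W_1, \ldots, W_n) - \min(0, W_1, \ldots, W_n)\right]$, where $W_k = (X_1 + X_2 + \cdots + X_k) - k\,\bar{X}(n)$, $k = 1, \ldots, n$.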
35
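35 Example. $X_k = 14, 1, 3, 5, 10, 3$. Mean $= 36/6 = 6$. $W_1 = 14 - (1 \cdot 6) = 8$, $W_2 = 15 - (2 \cdot 6) = 3$, $W_3 = 18 - (3 \cdot 6) = 0$, $W_4 = 23 - (4 \cdot 6) = -1$, $W_5 = 33 - (5 \cdot 6) = 3$, $W_6 = 36 - (6 \cdot 6) = 0$. $R/S = \frac{1}{S}\,[8 - (-1)] = 9/S$.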
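The worked example can be reproduced with a short sketch. The function names are illustrative, and using the population standard deviation (`statistics.pstdev`) for S is an assumption, since the slide leaves S unevaluated.

```python
from itertools import accumulate
from statistics import pstdev


def adjusted_range(xs):
    """Return (R, W), where W_k = (X_1 + ... + X_k) - k * mean and
    R = max(0, W_1..W_n) - min(0, W_1..W_n)."""
    mean = sum(xs) / len(xs)
    w = [s - (k + 1) * mean for k, s in enumerate(accumulate(xs))]
    return max(0, *w) - min(0, *w), w


def rescaled_range(xs):
    """R/S statistic; S is taken as the population standard deviation (assumption)."""
    r, _ = adjusted_range(xs)
    return r / pstdev(xs)


x = [14, 1, 3, 5, 10, 3]
r, w = adjusted_range(x)
print(w)  # [8.0, 3.0, 0.0, -1.0, 3.0, 0.0]
print(r)  # 9.0
print(rescaled_range(x))
```

The adjusted range of 9 matches the numerator on the slide; the final ratio depends on which definition of S is used.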
36
36 The Hurst Effect. For self-similar data, the rescaled range or R/S statistic grows according to $c\,n^{H}$, where $H$ is the Hurst parameter, $H > 0.5$. For short-range processes, the R/S statistic grows as $d\,n^{0.5}$. History: the Nile river. In the 1940s-50s, Harold Edwin Hurst studied the 800-year record of flooding along the Nile river (yearly minimum water level) and found long-range dependence.
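Because R/S grows like a power of n, H can be read off as the slope of log(R/S) against log(n). The sketch below is one simple way to do this, assuming the series is split into non-overlapping blocks, R/S is averaged per block size, and a least-squares line is fitted; these choices and the helper names are illustrative rather than the paper's exact procedure. For short blocks the estimate on independent data can sit somewhat above 0.5.

```python
import math
import random
from itertools import accumulate
from statistics import pstdev


def rescaled_range(xs):
    """R/S statistic of one block (S = population standard deviation, an assumption)."""
    mean = sum(xs) / len(xs)
    w = [s - (k + 1) * mean for k, s in enumerate(accumulate(xs))]
    return (max(0.0, *w) - min(0.0, *w)) / pstdev(xs)


def hurst_rs(series, block_sizes):
    """Slope of log(mean R/S) versus log(block size)."""
    log_n, log_rs = [], []
    for n in block_sizes:
        blocks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        mean_rs = sum(rescaled_range(b) for b in blocks) / len(blocks)
        log_n.append(math.log(n))
        log_rs.append(math.log(mean_rs))
    x_bar = sum(log_n) / len(log_n)
    y_bar = sum(log_rs) / len(log_rs)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(log_n, log_rs))
    den = sum((x - x_bar) ** 2 for x in log_n)
    return num / den


random.seed(1)
iid = [random.gauss(0, 1) for _ in range(4096)]  # synthetic short-range data
h_est = hurst_rs(iid, [16, 32, 64, 128, 256, 512])
print(round(h_est, 2))
```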
38
38 Whittle Estimator. Provides a confidence interval. Property: any long-range dependent process approaches fractional Gaussian noise (FGN) when aggregated to a certain level. Test the aggregated observations to ensure that they have converged to the normal distribution.
39
39 Self Similarity. X is exactly second-order self-similar with Hurst parameter $H = 1 - \beta/2$ if, for all $m$, $\mathrm{Var}(X^{(m)}) = \sigma^2 m^{-\beta}$ and $r^{(m)}(k) = r(k)$, $k \ge 0$. X is asymptotically second-order self-similar if the above holds in the limit, i.e. $r^{(m)}(k) \to r(k)$ as $m \to \infty$.
40
40 Modeling Self-Similarity. Fractional Gaussian noise (FGN): Gaussian process with mean $\mu$, variance $\sigma^2$, and autocorrelation function $r(k) = \frac{1}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$, $k > 0$; exactly second-order self-similar with $0.5 < H < 1$. Fractional ARIMA(p,d,q): asymptotically second-order self-similar with $H = d + 0.5$, where $0 < d < 0.5$. Discrete-time M/G/$\infty$ input model: service time $X$ given by a heavy-tailed distribution (i.e. $E[X] < \infty$, $\sigma^2 = \infty$). Example: Pareto distribution, $P(X > k) \sim k^{-\alpha}$, $1 < \alpha < 2$. $N = \{N_t, t = 1, 2, \ldots\}$ is self-similar with $H = (3 - \alpha)/2$, where $N_t$ denotes the number of members being serviced at time $t$.
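As a numerical sanity check on the FGN model, the sketch below uses the standard FGN autocorrelation, r(k) = (1/2)(|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)), and confirms that the variance of a sum of m unit-variance samples equals m^(2H), which is the same as Var(X^(m)) = σ² m^(-β) with β = 2 - 2H. The function names are illustrative, and `hurst_from_pareto` simply encodes the H = (3 - α)/2 relation from the slide.

```python
def fgn_autocorr(k, h):
    """Autocorrelation of fractional Gaussian noise at lag k for Hurst parameter h."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * h) - 2 * k ** (2 * h) + abs(k - 1) ** (2 * h))


def block_sum_variance(m, h):
    """Variance of the sum of m consecutive unit-variance FGN samples,
    computed from the autocorrelations: m + 2 * sum_{k=1}^{m-1} (m - k) r(k)."""
    return m + 2 * sum((m - k) * fgn_autocorr(k, h) for k in range(1, m))


def hurst_from_pareto(alpha):
    """H = (3 - alpha) / 2 for the M/G/infinity model with Pareto service times."""
    return (3 - alpha) / 2


print(fgn_autocorr(1, 0.5))                  # 0.0: H = 0.5 gives uncorrelated noise
print(block_sum_variance(10, 0.8), 10 ** 1.6)
print(hurst_from_pareto(1.4))
```

With H = 0.5 every lag beyond zero has zero correlation, while for H = 0.8 the block-sum variance grows as m^1.6, faster than the linear growth of an uncorrelated process.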
41
Section 2: Ethernet Traffic is Self-Similar
42
42 Plots Showing Self-Similarity (I). Reference lines: H = 0.5 and H = 1. Estimated H ≈ 0.8.
43
43 Plots Showing Self-Similarity (II). Higher traffic, higher H. Panels: high traffic, mid traffic, low traffic. Utilization ranges shown: 1.3%-10.4%, 3.4%-18.4%, 5.0%-30.7%.
44
44 H: A Function of Network Utilization. Observation shows a trend “contrary to Poisson”: as network utilization increases, H increases. As we shall see shortly, H measures traffic burstiness. As the number of Ethernet users increases, the resulting aggregate traffic becomes burstier instead of smoother.
45
45 Difference in low traffic H values. Pre-1990: host-to-host workgroup traffic. Post-1990: router-to-router traffic. Low-period router-to-router traffic consists mostly of machine-generated packets, which tend to form a smoother arrival stream than low-period host-to-host traffic.
46
46 H: Measuring “Burstiness”. Intuitive explanation using the M/G/$\infty$ model: as $\alpha \to 1$, the service time is more variable and bursts are easier to generate, so H increases. Wrong ways to measure the “burstiness” of a self-similar process: peak-to-mean ratio; coefficient of variation (for interarrival times).
47
47 Summary. Ethernet LAN traffic is statistically self-similar. H: the degree of self-similarity. H: a function of utilization. H: a measure of “burstiness”. Models like Poisson are not able to capture self-similarity.
48
48 Discussions. How to explain self-similarity? Heavy-tailed file sizes. How would this impact existing performance? Limited effectiveness of buffering; effectiveness of FEC. How to adapt to self-similarity? Prediction; adaptive FEC.
51
Thanks !