Network Traffic Modeling Punit Shah CSE581 Internet Technologies OGI, OHSU 2002, March 6.


2 Network Traffic Modeling Punit Shah (pshah@cse.ogi.edu) CSE581 Internet Technologies OGI, OHSU 2002, March 6

3 Papers
03/06/2002 CSE581, Winter 2002 | Punit Shah (pshah@cse.ogi.edu)
Generating Representative Web Workloads for Network and Server Performance Evaluation
–Paul Barford, Mark Crovella. Computer Science Department, Boston University.
Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes
–Mark Crovella, Azer Bestavros. Computer Science Department, Boston University.
On the Self-Similar Nature of Ethernet Traffic
–Will Leland et al. IEEE members. Funded by Boston University.

4 Traffic modeling
Understand the nature of network traffic
–Establish a traffic pattern
–Characteristics and metrics vary by network stack layer

5 Why model traffic?
Understand the behavior of servers, the network, etc. under workload conditions.
–Capacity management
–Infrastructure planning
–Performance improvement
–Design of software and services
–Testing and validation
Develop simulators (workload generators), e.g. ns (CMU), SURGE, SpecWeb96 and many commercially available simulators.

6 Model Parameters
Application layer (HTTP)
–server file size distribution
–request size distribution (file size + protocol headers)
–temporal locality (caching) etc.
Data link layer (Ethernet)
–packets per second
–mean time between two consecutive packets
–bandwidth utilization
–effect of number of hosts etc.

7 Time Series Analysis Primer
Correlation
–If two events exhibit an identical (opposite) pattern under similar circumstances, they are called positively (negatively) correlated.
–The degree of correlation ranges over [-1, 1].
–Correlation models.
Long-range dependence
–The current event is positively correlated with events far in the future.
Heavy tail
–Non-negligible probability mass in the tail of the distribution, e.g. a hyperbolic CDF plot. The simplest such distribution is the Pareto.
$P[X > x] \sim x^{-\alpha}$ as $x \to \infty$, $0 < \alpha < 2$
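To make the heavy-tail relation concrete, here is a minimal sketch that draws Pareto values by inverse-transform sampling and compares the empirical tail with $x^{-\alpha}$. It reads the slide's relation as the tail probability $P[X > x]$; the function names, the scale parameter `k = 1`, the value `alpha = 1.2` and the sample size are illustrative assumptions, not values from the papers.

```python
import random

def pareto_sample(alpha, k=1.0, rng=random):
    """Draw one Pareto(alpha, k) value by inverse-transform sampling."""
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids division by zero
    return k / u ** (1.0 / alpha)   # so that P[X > x] = (k / x) ** alpha for x >= k

def empirical_tail(samples, x):
    """Fraction of samples strictly greater than x, an estimate of P[X > x]."""
    return sum(1 for s in samples if s > x) / len(samples)

# Illustrative parameters (assumptions, not taken from the papers).
rng = random.Random(1)
alpha = 1.2
samples = [pareto_sample(alpha, rng=rng) for _ in range(100_000)]

for x in (10, 100):
    print(x, empirical_tail(samples, x), x ** -alpha)
```

With `k = 1` the theoretical tail is exactly `x ** -alpha`, so the two printed columns should be close for each `x`.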

8 Self-Similarity
Term introduced by Mandelbrot in 1965.
Let $X = (X_t : t = 0, 1, 2, \ldots)$ be a time series with mean $\mu$, variance $\sigma^2$ and autocorrelation function $r(k)$, where
$r(k) \sim k^{-\beta}$ as $k \to \infty$, $0 < \beta < 1$.
For each $m = 1, 2, 3, \ldots$, $X^{(m)} = (X_k^{(m)} : k = 1, 2, \ldots)$ is a new time series obtained by averaging the original series over non-overlapping blocks of size $m$:
$X_k^{(m)} = (X_{km-m+1} + \cdots + X_{km})/m$
Its autocorrelation function is $r^{(m)}(k)$.
If $r^{(m)}(k) = r(k)$ for all $m$ (or $r^{(m)}(k) \to r(k)$ as $m \to \infty$), then $X$ is called (asymptotically) second-order self-similar with degree $H = 1 - \beta/2$.
Since $\sum_k r(k) = \infty$, self-similarity implies long-range dependence.
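The autocorrelation function r(k) used in this definition can be estimated from a finite trace. The sketch below uses one common sample estimator (lagged products divided by the total sum of squares); the choice of estimator and the toy alternating series are assumptions made for illustration.

```python
def autocorrelation(series, k):
    """Sample autocorrelation r(k) of a series at lag k, with 0 <= k < len(series)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k))
    return numer / denom

# Toy series (an assumption): strict alternation is strongly negatively correlated at lag 1.
alternating = [1.0, -1.0] * 50
print(autocorrelation(alternating, 0))
print(autocorrelation(alternating, 1))
```

For a long-range dependent trace, r(k) computed this way would decay slowly (roughly like a power of k) instead of dropping to zero after a few lags.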

9 Self-similar
81 17 6 99 25 21 45 4 20 18 56 7 21 82 11 8 65 34 9 20

10 Self-similar
81 17 6 99 25 | 21 45 4 20 18 | 56 7 21 82 11 | 8 65 34 9 20
Block sums $\sum_{i=1}^{m} X_i$ with $m = 5$: 228 108 177 136
‘Self-Similarity’ == Burstiness
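The aggregation on this slide can be sketched in a few lines. The code assumes the series is the 20 values listed below, which is the reading that reproduces the slide's block sums for m = 5; the function name `aggregate` and its `average` flag are hypothetical.

```python
def aggregate(series, m, average=False):
    """Split the series into non-overlapping blocks of size m and sum (or average) each block."""
    blocks = [series[i:i + m] for i in range(0, len(series) - m + 1, m)]
    totals = [sum(block) for block in blocks]
    return [total / m for total in totals] if average else totals

# The 20 values from the slide, read as separate numbers (an assumption about the layout).
x = [81, 17, 6, 99, 25, 21, 45, 4, 20, 18, 56, 7, 21, 82, 11, 8, 65, 34, 9, 20]

print(aggregate(x, 5))                 # [228, 108, 177, 136]
print(aggregate(x, 5, average=True))   # block means, i.e. the series X^(5)
```

Even after aggregation the block sums still vary widely (108 to 228), which is the burstiness the slide points to.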

11 Ethernet Traffic Data Collection
Data collected between Aug 1989 and Feb 1992 (about two and a half years) to account for various network topologies.
Main traffic at the time (1994): rlogin, e-mail, NFS, local radio station audio.
Hosts: 140 to 1200. ~27M packets.
Each data collection run spanned low, medium and busy hours.
Timestamps with 20 μs accuracy.

12 Packets/unit time (empirical)

13 Packets/unit time (synthetic)

14 Statistical tests for self-similarity
Variance-time plot
–the log of the variance of $X^{(m)}$ is plotted against $\log(m)$; a straight line with slope $-\beta > -1$ indicates self-similarity; $H = 1 - \beta/2$
R/S plot (rescaled adjusted range statistic)
–the R/S statistic grows as a power law in the sample size $n$ with exponent $H$, i.e. like $n^H$
Periodogram
–slope of the power spectrum of the series near the origin
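As a rough illustration of the variance-time test, the sketch below averages a series over blocks of size m, fits a least-squares line to log variance against log m, and converts the slope into H = 1 - β/2. The helper names, the block sizes and the synthetic uncorrelated input are assumptions; a real analysis would use a measured packet-count trace.

```python
import math
import random

def block_means(series, m):
    """Average the series over non-overlapping blocks of size m, giving X^(m)."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_time_hurst(series, block_sizes):
    """Fit log10(var(X^(m))) against log10(m) and return H = 1 - beta/2."""
    xs = [math.log10(m) for m in block_sizes]
    ys = [math.log10(variance(block_means(series, m))) for m in block_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    beta = -slope                    # the fitted slope is -beta
    return 1.0 - beta / 2.0

# Synthetic uncorrelated input (an assumption): its variance falls like 1/m,
# so the slope should be near -1 and H near 0.5.
rng = random.Random(0)
iid_series = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
hurst = variance_time_hurst(iid_series, [1, 2, 4, 8, 16, 32, 64, 128])
print(f"estimated H = {hurst:.2f}")
```

A self-similar trace would instead give a slope shallower than -1 and therefore an estimate of H between 0.5 and 1.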

15 Ethernet Variance Time plot
As m increases, the variance decreases slowly.
The curve would cross the threshold line if the traffic were not self-similar.

16 Ethernet Traffic Analysis
Ethernet traffic is self-similar.
Contrary to common belief, the degree of self-similarity (burstiness) increases during busy times.
Well over 50% of the traffic is TCP packets, but the non-TCP packets have no apparent effect.

17 Web Traffic Data collection
Traces collected from real users accessing web documents (Nov 94 - May 95) using HTTP v0.9 and 1.0 (no parallel connections)
–4700 sessions
–591 users
–575,775 URL requests (46,830 unique per session)
–130,140 files transferred
Each file request is logged
–URL
–session, user, workstation ID
–timestamp
–size of document, file transfer time

18 Trace Analysis
Web traffic is self-similar

19 Reasons for the self-similarity
Web transmission times
–The distribution is highly variable. Sizes of available files are heavy-tailed.
–Multimedia files are to blame (image, audio, video)
Quiet times
–Active OFF and inactive OFF

20 Quiet Times

21 Quiet Time Distribution

22 Generating Web Workload
SURGE
User Equivalent (UE)
–Synthesized behavior should emulate the users
–Multi-threaded program. HTTP v1.0. No parallel connections
Distribution models
–File sizes
–Request sizes: file size + protocol headers; zero if already cached
Popularity
–Zipf’s law: if files are ordered by decreasing popularity, the number of references to a file is inversely proportional to its rank r: $P \propto 1/r$
–Empirical data shows that popular web documents are extremely popular and the others receive only a few hits
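Zipf's law as stated on this slide can be turned into a request generator. The following is a minimal sketch, not SURGE's actual implementation: it normalizes weights 1/r into probabilities and samples file ranks with `random.choices`. The catalogue size of 1000 files, the request count and the function name are assumptions.

```python
import random
from collections import Counter

def zipf_probabilities(n_files):
    """Request probability for ranks 1..n_files, proportional to 1/rank."""
    weights = [1.0 / rank for rank in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative sizes (assumptions, not taken from the SURGE paper).
n_files = 1000
probs = zipf_probabilities(n_files)

rng = random.Random(7)
requests = rng.choices(range(1, n_files + 1), weights=probs, k=50_000)
counts = Counter(requests)

# The most popular ranks dominate the request stream.
print(counts.most_common(3))
```

Because the probabilities fall off as 1/r, the rank-1 file is requested about ten times as often as the rank-10 file, which matches the slide's remark that a few documents are extremely popular.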

23 Model Parameters (contd.)
Embedded object count
–Determines a quiet time, specifically ‘active OFF’
Temporal locality (caching)
–Probability that the same object will be requested again
–Effect on network access
–Stack distance
OFF times
–Important parameter; self-similarity is lost if OFF times are ignored
Matching problem: assign a popularity to each file, given the file size distribution and the empirical request size (count?) distribution
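Stack distance, the temporal-locality measure named above, can be computed by keeping requested objects in a most-recently-used stack. This is a small sketch under assumed conventions (depth counted from 1 at the top of the stack, `None` for a first reference); the function name and the toy request sequence are hypothetical.

```python
def stack_distances(requests):
    """Return, for each request, the depth at which the object was found in an LRU stack."""
    stack = []       # most recently requested object is at index 0
    distances = []
    for name in requests:
        if name in stack:
            depth = stack.index(name) + 1   # 1 means "requested again immediately"
            stack.remove(name)
        else:
            depth = None                    # first reference: the object is not on the stack yet
        stack.insert(0, name)               # move (or push) the object to the top
        distances.append(depth)
    return distances

# Toy request sequence (an assumption for illustration).
print(stack_distances(["a", "b", "a", "c", "b", "b"]))   # [None, None, 2, None, 3, 1]
```

Small stack distances mean that recently requested objects are requested again soon, which is the property that makes caching effective.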

24 SURGE Approach
Use a different (well-known) model for each model parameter

25 SURGE Validation
Compared with SpecWeb96 (specbench.org)
–Number of HTTP requests per second (h)
–Number of threads (t); h/t requests per thread
Packets/sec - baseline
–tests for 70, 300, 500 packets/sec

26 Results
Roughly similar numbers of TCP packets and requests in a 30 min run
Mean active TCP connections: 0.028 (SpecWeb96) vs. 13.9 (SURGE), with very high variance of 3.92 (0.18) indicating self-similarity
Server CPU utilization and active TCP connections are much higher than with SpecWeb96

27 Active TCP Connections (figure: SpecWeb96 vs. SURGE at 70, 300 and 500 PPS)

28 CPU Utilization (figure: SpecWeb96 vs. SURGE at 70, 300 and 500 PPS)

29 Self-Similarity (figure: SpecWeb96 vs. SURGE at 70, 300 and 500 PPS)

30 Conclusion
Self-similarity (burstiness) is an integral part of network traffic behavior.
The degree of self-similarity increases with the load.
Server and network load is radically different from what non-self-similar models predict.
The nature of the congestion produced by self-similar traffic is drastically different from that produced by non-self-similar traffic.

