Variance of Aggregated Web Traffic
Robert Morris, MIT Laboratory for Computer Science
IEEE INFOCOM 2000


Agenda
- Introduction
- Preliminaries
- Aggregating bandwidth
- Correlation
- ON/OFF
- Comment

Introduction
- Internet traffic model: Poisson
- When multiple Poisson sources are aggregated, the size of the variations in bandwidth (the standard deviation) increases with the square root of the total bandwidth
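A minimal sketch of the square-root behaviour stated above, assuming each source is an independent Poisson process. The function names and the per-source rate are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math
import random
import statistics


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate by counting exponential arrivals in unit time."""
    count = 0
    t = rng.expovariate(lam)
    while t < 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count


def aggregate_stats(n_sources, rate_per_source, n_intervals, seed=0):
    """Return (mean, s.d.) of the per-interval packet count summed over all sources."""
    rng = random.Random(seed)
    # The sum of independent Poisson sources is Poisson with the summed rate.
    lam = n_sources * rate_per_source
    counts = [poisson_sample(lam, rng) for _ in range(n_intervals)]
    return statistics.fmean(counts), statistics.pstdev(counts)


if __name__ == "__main__":
    # Quadrupling the number of sources should roughly double the s.d.
    for n in (10, 40, 160):
        mean, sd = aggregate_stats(n, rate_per_source=0.5, n_intervals=4000)
        print(n, round(mean, 2), round(sd, 2), round(sd / math.sqrt(mean), 2))
```

For Poisson traffic the last printed column, s.d. divided by the square root of the mean, should stay close to 1 as the number of sources grows.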

Introduction
- Arguments against Poisson:
  - Strong 24-hour cycle
  - Shared bottleneck problem
  - Global TCP window synchronization

Preliminaries
- Port 80 data (nearly half) from two 24-hour traces of Internet traffic
1. Link between Harvard's main campus and its Internet connection
  - Point-to-point 100 Mb Ethernet between two routers
  - Internet link is a 45 Mb T3 line
  - 3 pm, 16/4/1998

Preliminaries
- Port 80 data (nearly half) from two 24-hour traces of Internet traffic
2. Ethernet with two Lucent T1 Internet connections
  - Serves 900 Lucent Bell Labs employees
  - 7 am EST, 10/12/1998

Preliminaries
- Treat local hosts as users
- Count the active users in an interval as the number of distinct local IP addresses that appear in the source or destination field of an IP header in that interval
- Use 0.1 s intervals
  - Router buffering is roughly 0.1 s
  - Variations in bandwidth at smaller time scales can be smoothed out by buffering
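The counting rule above can be sketched as follows. This is an illustration only: the packet tuple layout, the `is_local` predicate, and the function name are assumptions, not the trace format or tooling used for the paper.

```python
from collections import defaultdict

INTERVAL = 0.1  # seconds, matching the slides' sampling interval


def per_interval_stats(packets, is_local):
    """Bin packets into 0.1 s intervals.

    packets: iterable of (timestamp_seconds, src_ip, dst_ip, size_bytes).
    is_local: predicate telling whether an IP address is a local host.
    Returns {interval_index: (total_bytes, active_user_count)}.
    """
    bytes_in = defaultdict(int)
    users_in = defaultdict(set)
    for ts, src, dst, size in packets:
        idx = int(ts // INTERVAL)
        bytes_in[idx] += size
        # A local host counts as an active user if it appears as source or destination.
        for ip in (src, dst):
            if is_local(ip):
                users_in[idx].add(ip)
    return {i: (bytes_in[i], len(users_in[i])) for i in bytes_in}
```

A caller would pass a predicate such as `lambda ip: ip.startswith("10.")` for a hypothetical local prefix. Timestamps that fall exactly on an interval boundary are subject to floating-point rounding, so a real tool would likely work in integer microseconds.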

Preliminaries
1. Smooth = s.d. increases with the square root of the average
2. Perfectly bursty = s.d. increases linearly with the average
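One way to tell the two cases apart on measured data is to fit the slope of log(s.d.) against log(average): a slope near 0.5 corresponds to "smooth" and a slope near 1 to "perfectly bursty". The sketch below is an assumed illustration with a hypothetical function name; the slides do not say this fit was used.

```python
import math


def scaling_exponent(means, sds):
    """Least-squares slope of log(s.d.) against log(mean)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(s) for s in sds]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx
```

All inputs must be positive, and the means must not all be equal, otherwise the fit is undefined.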

Aggregating bandwidth
- Average bandwidth for each minute
- s.d. for each minute, computed from 0.1-second samples
- In both cases, s.d. rises along with the bandwidth at all times
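The per-minute statistics described above can be sketched like this, assuming the input is a flat list of bandwidth samples taken every 0.1 s. The use of the population standard deviation and the function name are assumptions; the slides do not state which estimator was used.

```python
import statistics

SAMPLES_PER_MINUTE = 600  # 60 s / 0.1 s


def per_minute_stats(samples):
    """Split 0.1 s bandwidth samples into whole minutes; return [(mean, s.d.), ...]."""
    stats = []
    # Any trailing partial minute is ignored.
    for start in range(0, len(samples) - SAMPLES_PER_MINUTE + 1, SAMPLES_PER_MINUTE):
        minute = samples[start:start + SAMPLES_PER_MINUTE]
        stats.append((statistics.fmean(minute), statistics.pstdev(minute)))
    return stats
```

Each returned pair is one point for a plot of s.d. (or its square, the variance) against average bandwidth.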

Aggregate bandwidth
- Each point: (x) a minute's average bandwidth, (y) the variance measured over 0.1-second intervals
- Variance increases almost linearly with the bandwidth, that is, s.d. increases with the square root of the bandwidth
1. Smooth = s.d. increases with the square root of the average
2. Perfectly bursty = s.d. increases linearly with the average

Changes in number of users
- Correlation coefficient: ~0.88 (Harvard), ~0.84 (Lucent)
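A minimal sketch of a correlation coefficient computation, assuming the figures above are Pearson's r between two per-interval series (for example user counts and bandwidth; the slide does not spell out which pair). The function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

The function divides by zero if either series is constant, so such inputs need to be filtered out beforehand.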

Correlation
- X_i: random variable describing the amount of bandwidth produced by the i-th user in each 0.1 s interval
- X: total amount of bandwidth from all N users in each 0.1 s interval
- Synthetic traffic: 500 on/off sources, ON time 4.5 s, OFF time 360 s, inter-packet spacing 0.06 s during ON time
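The synthetic on/off traffic can be approximated with the sketch below. The slide gives only the ON time, OFF time, and packet spacing; treating 4.5 s and 360 s as means of exponential distributions, the random start offset, and the 600 s duration are all assumptions made for illustration.

```python
import random

INTERVAL = 0.1  # seconds per sample


def simulate_on_off(n_sources=500, mean_on=4.5, mean_off=360.0,
                    spacing=0.06, duration=600.0, seed=0):
    """Return packets per 0.1 s interval, summed over all ON/OFF sources."""
    rng = random.Random(seed)
    n_bins = int(round(duration / INTERVAL))
    counts = [0] * n_bins
    for _ in range(n_sources):
        # Stagger the first ON period by a random offset (assumption).
        t = rng.uniform(0.0, mean_off)
        while t < duration:
            on_end = t + rng.expovariate(1.0 / mean_on)
            # Emit one packet every `spacing` seconds while the source is ON.
            while t < on_end and t < duration:
                counts[min(int(t / INTERVAL), n_bins - 1)] += 1
                t += spacing
            # Sleep through an OFF period before the next ON period.
            t = on_end + rng.expovariate(1.0 / mean_off)
    return counts
```

The resulting list plays the role of X, the total over all sources in each 0.1 s interval, and can be fed into per-minute mean and variance calculations.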

Correlation
- Samples over 0.1 s, 1 s, and 10 s intervals are nearly normally distributed

Correlations
- Because the bandwidths from individual users are not significantly correlated
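The consequence of weak correlation between users can be written out as a short derivation. The first three lines are the standard identity for the variance of a sum; the last line additionally assumes, purely for illustration, that every user has the same mean \mu and variance \sigma^2, which the slides do not claim.

```latex
\begin{align}
X &= \sum_{i=1}^{N} X_i \\
\operatorname{Var}(X) &= \sum_{i=1}^{N} \operatorname{Var}(X_i)
  + 2 \sum_{i<j} \operatorname{Cov}(X_i, X_j) \\
\operatorname{Var}(X) &\approx \sum_{i=1}^{N} \operatorname{Var}(X_i)
  \quad \text{when } \operatorname{Cov}(X_i, X_j) \approx 0 \\
\operatorname{Var}(X) &\approx N\sigma^2 = \frac{\sigma^2}{\mu}\,\mathrm{E}[X]
  \quad \text{if every user has mean } \mu \text{ and variance } \sigma^2
\end{align}
```

Under that assumption the variance of the aggregate grows linearly with its mean, which is the "smooth" case.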

Per-user Variance

- Variation in user bandwidth is mainly caused by differences in the OFF period
- Average ON period and transfer size have less effect on the bandwidth for most users
- 80% of OFF periods are at least 10 times longer than ON periods
- Similar results in both traces

ON/OFF
- B: user's average bandwidth in a cycle
- T_on: ON period
- T_off: OFF period
- X: transfer size
- N: number of samples in a cycle
- B = f(X, T_on, T_off)

ON/OFF
- Assume the ON period is fixed
- ON value = c for all users
- Transfer size is fixed
- Samples in a cycle: (c c .. c c), K "c"s out of N samples
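Reading the slide as "a cycle has N samples, K of which equal c and the rest equal 0" (an interpretation of the garbled diagram, not something the transcript states outright), the per-cycle mean and variance can be checked numerically against their closed forms. Function names are hypothetical.

```python
import statistics


def cycle_stats(k, n, c):
    """Mean and population variance of one cycle: k samples equal to c, n - k equal to 0."""
    samples = [c] * k + [0.0] * (n - k)
    return statistics.fmean(samples), statistics.pvariance(samples)


def closed_form(k, n, c):
    """Closed-form mean p*c and variance p*(1-p)*c^2, where p = k/n is the ON fraction."""
    p = k / n
    return p * c, p * (1.0 - p) * c * c
```

When K/N is small, as it is when OFF periods are much longer than ON periods, the variance is approximately c times the mean, so under this reading a user's variance grows roughly linearly with that user's average bandwidth.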

Comment
- Use: normal distribution confirmed
- But: at the link (router) level, over thousands of connections
- Our estimation: makes use of the normal distribution at the end-to-end user level
- Latest Internet result! (2004->1998~)
- Corporate?
- Our estimation is poor at determining variance. If a linear relationship holds between the variance and the aggregate bandwidth, we can use it to increase the estimation accuracy.