The Joint Distribution of Internet Flow Sizes and Durations. Cheolwoo Park, J. Stephen Marron. The University of North Carolina at Chapel Hill.

Presentation transcript:

The Joint Distribution of Internet Flow Sizes and Durations
Cheolwoo Park, J. Stephen Marron
The University of North Carolina at Chapel Hill

Outline:
- Motivation of the study
- Data description, scatter plots and density estimation
- Correlation plots
- Conclusions and future plans

Started from a conflict between two papers:
- Extremal Dependence: Internet Traffic Applications (2002), by Felix Hernandez-Campos, J. S. Marron, Sidney I. Resnick and Kevin Jeffay
- On the Characteristics and Origins of Internet Flow Rates (2002), by Yin Zhang, Lee Breslau, Vern Paxson and Scott Shenker, SIGCOMM '02

Why are we interested in this topic?
- Size and rate are naturally assumed to be independent
- Or do users choose the sizes of the files they transfer depending on their available bandwidth?
- Modeling of Internet traffic

Different earlier analyses of Internet flow sizes and durations: nearly contradictory answers!

            Hernandez-Campos et al. (EDA)   Zhang et al. (log-log correlations)
S vs. D     Inconclusive (0.50 ~ 0.59)      Inconclusive (0.10 ~ 0.30)
S vs. R     Independent (0.23 ~ 0.31)       Dependent (0.84 ~ 0.89)
D vs. IR    Dependent (0.69 ~ 0.71)         Inconclusive (0.18 ~ 0.45)

S: Size, D: Duration, R (= S/D): Rate, IR: Inverse Rate

Why? Possibilities:
- Data from different sources?
- Different types of data? (HTTP responses vs. all web traces)
- Different correlation measure?
- Different threshold values?

            Hernandez-Campos et al. (log-log correlations)   Zhang et al. (log-log correlations)
S vs. D     Inconclusive (0.65)                              Inconclusive (0.10 ~ 0.30)
S vs. R     Independent (-0.06)                              Dependent (0.84 ~ 0.89)
D vs. IR    Dependent (0.80)                                 Inconclusive (0.18 ~ 0.45)

Threshold values: the two studies
- applied thresholding to different variables
- used different threshold values

            Hernandez-Campos et al.   Zhang et al.
Size        > 100 Kbytes              > 0 bytes
Duration    > 0 sec                   > 5 sec
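To make the role of the two thresholding schemes concrete, here is a minimal Python sketch that computes the three log-log Pearson correlations after keeping only the flows above a size threshold and a duration threshold. It is not the authors' code. The function name `loglog_correlations`, the synthetic flows, and the reading of "100 Kbytes" as 100,000 bytes are all assumptions made for illustration; the UNC trace is not used here.

```python
import numpy as np


def loglog_correlations(size, duration, min_size=0.0, min_duration=0.0):
    """Pearson correlations of log10 variables for flows above both thresholds.

    Thresholds are strict lower bounds and are assumed to be >= 0, so every
    retained flow has positive size and duration and the logs are defined.
    """
    size = np.asarray(size, dtype=float)
    duration = np.asarray(duration, dtype=float)
    keep = (size > min_size) & (duration > min_duration)

    log_s = np.log10(size[keep])
    log_d = np.log10(duration[keep])
    log_r = log_s - log_d    # R = S / D
    log_ir = log_d - log_s   # IR = D / S

    def corr(x, y):
        return float(np.corrcoef(x, y)[0, 1])

    return {
        "n": int(keep.sum()),
        "S_vs_D": corr(log_s, log_d),
        "S_vs_R": corr(log_s, log_r),
        "D_vs_IR": corr(log_d, log_ir),
    }


# Synthetic flows (hypothetical parameters, for illustration only).
rng = np.random.default_rng(0)
log_s = rng.normal(3.5, 1.0, 20_000)
log_d = 0.5 * (log_s - 3.5) + rng.normal(0.0, 0.8, 20_000)
size, duration = 10.0 ** log_s, 10.0 ** log_d

schemes = [
    ("size > 100 Kbytes, duration > 0 sec", {"min_size": 100_000.0}),
    ("size > 0 bytes, duration > 5 sec", {"min_duration": 5.0}),
]
for label, thresholds in schemes:
    print(label, loglog_correlations(size, duration, **thresholds))
```

Running both schemes on the same flows shows how much of the disagreement between two sets of correlations can come from the choice of threshold alone.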

Outline:
- Motivation of the study
- Data description, scatter plots and density estimation
- Correlation plots
- Conclusions and future plans

Data: HTTP responses
- Sunday morning (8:00 AM – 12:00 PM) in April 2001
- From UNC main link

Variables of interest:
- S: Size (bytes)
- D: Duration (time in seconds)
- R: Rate (throughput, bytes/sec)
- IR: Inverse Rate (sec/byte)
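Because R = S/D and IR = D/S, the three log-log correlations studied here are tied together algebraically. The short derivation below is a supplementary sketch, not taken from the slides. The symbols X, Y, sigma_X, sigma_Y and rho are notation introduced only for this illustration.

```latex
% Notation (assumed for this sketch): X = \log_{10} S, \; Y = \log_{10} D,
% with standard deviations \sigma_X, \sigma_Y and correlation \rho = \mathrm{corr}(X, Y).
\begin{align*}
\log_{10} R &= X - Y, \qquad \log_{10} IR = Y - X, \\
\mathrm{Var}(X - Y) &= \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y, \\
\mathrm{corr}(\log_{10} S, \log_{10} R)
  &= \frac{\sigma_X - \rho\,\sigma_Y}
          {\sqrt{\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y}}, \\
\mathrm{corr}(\log_{10} D, \log_{10} IR)
  &= \frac{\sigma_Y - \rho\,\sigma_X}
          {\sqrt{\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y}}.
\end{align*}
```

All three correlations are therefore functions of the same three quantities. Any step that changes the spread of log size or log duration in the retained sample, such as discarding small or short flows, changes all of them at once.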

Scatterplot: log10(Size) vs. log10(Duration)

Scatterplot: log10(Size) vs. log10(Rate)

Scatterplot: log10(Duration) vs. log10(Inv. Rate)

Outline:
- Motivation of the study
- Data description and scatter plots
- Log-log correlation plots with global thresholdings
- Conclusions and future plans

log10(Size) vs. log10(Duration)

log10(Size) vs. log10(Rate)

log10(Duration) vs. log10(Inv. Rate)

log10(Size) vs. log10(Rate), simulated bivariate normal
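A comparison against simulated bivariate normal data can be sketched along the following lines. This is an illustration under stated assumptions, not a reconstruction of the authors' simulation. The slides do not say which pair of variables was simulated or with what parameters, so the sketch assumes (log10 Size, log10 Duration) is bivariate normal with a hypothetical mean and covariance. The function names are also hypothetical.

```python
import numpy as np


def simulate_log_flows(n, mean, cov, seed=0):
    """Draw n pairs (log10 size, log10 duration) from a bivariate normal."""
    rng = np.random.default_rng(seed)
    sample = rng.multivariate_normal(mean, cov, size=n)
    return sample[:, 0], sample[:, 1]


def size_rate_corr_by_threshold(log_s, log_d, log_size_thresholds):
    """corr(log10 S, log10 R) among flows with log10 S above each threshold."""
    log_r = log_s - log_d  # R = S / D on the log scale
    curve = []
    for t in log_size_thresholds:
        keep = log_s > t
        if keep.sum() < 3:
            curve.append(np.nan)  # too few flows for a meaningful estimate
        else:
            curve.append(float(np.corrcoef(log_s[keep], log_r[keep])[0, 1]))
    return np.array(curve)


# Hypothetical parameters: unit variances and correlation 0.5 between
# log10 size and log10 duration.
log_s, log_d = simulate_log_flows(
    50_000, mean=[4.0, 0.0], cov=[[1.0, 0.5], [0.5, 1.0]]
)
thresholds = [-np.inf, 3.0, 4.0, 5.0]
for t, c in zip(thresholds, size_rate_corr_by_threshold(log_s, log_d, thresholds)):
    print(f"log10(size) > {t}: corr(log S, log R) = {c:.3f}")
```

Even for data that is exactly bivariate normal on the log scale, the correlation between log size and log rate shifts as the size threshold moves, so a change in correlation under thresholding is not by itself evidence of unusual dependence structure.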

Outline:
- Motivation of the study
- Data description and scatter plots
- Log-log correlation plots with global thresholdings
- Conclusions and future plans

Conclusions:
- The blind men and the elephant
- Thresholding is CRITICAL

Deeper investigation:
- What threshold values should we use? On size? On duration? On both?
- How to handle 0 durations?
- Which methods are robust to thresholding?