Characterizing and Predicting TCP Throughput on the Wide Area Network Dong Lu, Yi Qiao, Peter Dinda, Fabian Bustamante Department of Computer Science Northwestern.

Characterizing and Predicting TCP Throughput on the Wide Area Network Dong Lu, Yi Qiao, Peter Dinda, Fabian Bustamante Department of Computer Science Northwestern University

2 Overview Algorithm for predicting the TCP throughput as function of flow size Minimal active probing Dynamic probe rate adjustment Explaining flow size / throughput correlation Explaining why simple active probing fails Large scale empirical study

3 Outline Why TCP throughput prediction? Particulars of study Flow size / TCP throughput correlation Issues with simple benchmarking DualPats algorithm Stability and dynamic rate adjustment

4 Goal A library call BW = PredictTransfer(src,dst,numbytes); Expected Time = numbytes/BW; Ideally, we want a confidence interval: (BWLow,BWHigh) = PredictTransfer(src,dst,numbytes,p);
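A minimal sketch of how a caller might turn such a prediction into a time estimate. The helper names `expected_time` and `expected_time_interval` are hypothetical and not part of any library named in the talk. Throughput is assumed to be in bytes per second.

```python
from typing import Tuple


def expected_time(numbytes: int, bw: float) -> float:
    """Expected transfer time in seconds, given predicted throughput bw (bytes/s)."""
    if bw <= 0:
        raise ValueError("predicted throughput must be positive")
    return numbytes / bw


def expected_time_interval(numbytes: int, bw_low: float,
                           bw_high: float) -> Tuple[float, float]:
    """Map a throughput interval (bw_low, bw_high) to (fastest, slowest) time."""
    if not 0 < bw_low <= bw_high:
        raise ValueError("need 0 < bw_low <= bw_high")
    # Higher throughput gives the shorter time, so the bounds swap.
    return numbytes / bw_high, numbytes / bw_low
```

Note that the throughput bounds swap roles when converted to time: `BWHigh` yields the optimistic estimate.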

5 Available Bandwidth Maximum rate a path can offer a flow without slowing other flows –pathchar, cprobe, nettimer, delphi, IGI, pathchirp, pathload … Available bandwidth can differ significantly from TCP throughput Not real time, takes at least tens of seconds to run

6 Simple TCP Benchmarking Benchmark paths with a single small probe –BW = ProbeSize/Time –Widely used: Network Weather Service (NWS) and others (Remos benchmarking collector) Not accurate for large transfers on the current high-speed Internet –Numerous papers show this and attempt to fix it
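A toy illustration of why BW = ProbeSize/Time can mislead. It assumes a path with a fixed startup cost followed by a constant steady-state rate. The 0.2 s overhead and 10 MB/s rate are made-up values for illustration, not measurements from the study.

```python
def measured_bw(size_bytes: float, startup_s: float, steady_bps: float) -> float:
    """Throughput a transfer of size_bytes observes: size / total time."""
    return size_bytes / (startup_s + size_bytes / steady_bps)


STARTUP_S = 0.2     # hypothetical fixed overhead (handshake, slow start)
STEADY_BPS = 10e6   # hypothetical steady-state rate in bytes/s

probe_bw = measured_bw(64 * 1024, STARTUP_S, STEADY_BPS)    # small probe
bulk_bw = measured_bw(100 * 10**6, STARTUP_S, STEADY_BPS)   # large transfer

print(f"probe: {probe_bw:.0f} B/s, bulk: {bulk_bw:.0f} B/s")
```

Under these assumptions the small probe is dominated by the startup cost, so it reports a small fraction of what the large transfer achieves.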

7 Fixing Simple TCP Benchmarking Logs [Sundharshan]: correlate real transfer measurements with benchmarking measurements. Recent transfers needed; similar-size transfers needed; measurements at application-chosen times. CDF-matching [Swany]: correlate CDF of real transfer measurements with CDF of benchmarking measurements. Recent transfers still needed; measurements at application-chosen times.

8 Analysis of TCP Extensive research on TCP throughput modeling in the networking community Really intended to build better TCPs Difficult to use models online because of hard-to-measure parameters: future loss rate and RTT Note: we measure goodput

9 Our Measurement Study PlanetLab and additional machines –Located all over the world Measurements of throughput –Wide open socket buffers (1-3 MB) –Simple ttcp-like client/server –scp –GridFTP Four separate sets of measurements

10 Distribution Set For analysis of TCP throughput stability and distributions 60 randomly chosen paths among PlanetLab machines 1.6 million transfers (client/server) –100 KB, 200 KB, 400 KB, … 10 MB flows –3000 consecutive transfers per path+flow size

11 Correlation Set For studying correlation between throughput and flow size, initial testing of algorithm 60 randomly chosen paths among PlanetLab machines 2.4 million transfers, 270 thousand runs, client/server –100 KB, 200 KB, 400 KB, … 10 MB flows –Run = sweep flow size for path

12 Verification Set Test algorithm 30 randomly chosen paths among PlanetLab machines and others 4800 transfers, 300 runs, scp and GridFTP –5 KB to 1 GB flows –Run = sweep flow size for path

13 Online Evaluation Set Test online algorithm 50 randomly chosen paths among PlanetLab machines and others transfers, scp and GridFTP –40 MB or 160 MB file, randomly chosen size –10 days

14 Strong Correlation Between TCP Throughput and Flow Size Correlation and Verification Sets

15 Why Does The Correlation Exist? Slow start and user effects [Zhang] Extant flows Non-negligible startup overheads –Control messages in scp and GridFTP Residual slow start effect –SACK results in slow convergence to equilibrium

16 Why Simple Benchmarking Fails Probes are too small Need more than one probe to capture correlation

17 Our Approach Two consecutive probes, both larger than the noise region

18 Our Approach Two consecutive probes are integrated into a single probe –400 KB, 800 KB in single 800 KB probe [Figure: timeline marked 0, T1, T2, labeled "Probe one" and "Probe two"]

19 Our Approach [Figure: transfer time versus flow size] Solve for A and B; predict throughput for some other transfer
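A minimal sketch of the two-probe fit, assuming transfer time is linear in flow size, T = A*S + B. The slide names A and B but does not spell out the model, so the linear form and the function names here are assumptions. The two points are (S1, T1) and (S2, T2), for example the 400 KB and 800 KB marks of one combined probe.

```python
from typing import Tuple


def fit_two_probes(s1: float, t1: float, s2: float, t2: float) -> Tuple[float, float]:
    """Solve T = A*S + B from two (size, time) points; returns (A, B)."""
    if s2 <= s1 or t2 <= t1:
        raise ValueError("second probe must be larger and take longer")
    a = (t2 - t1) / (s2 - s1)   # seconds per byte in steady state
    b = t1 - a * s1             # fixed startup overhead in seconds
    return a, b


def predict_throughput(a: float, b: float, size: float) -> float:
    """Predicted throughput (bytes/s) for a transfer of `size` bytes."""
    t = a * size + b
    if t <= 0:
        raise ValueError("model predicts non-positive transfer time")
    return size / t


# Example: 400 KB mark reached at 0.5 s, 800 KB mark at 0.9 s (made-up timings).
a, b = fit_two_probes(400_000, 0.5, 800_000, 0.9)
print(predict_throughput(a, b, 10_000_000))
```

With a positive B, predicted throughput rises with flow size and approaches 1/A, which matches the flow size / throughput correlation the talk describes.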

20 Model Fit is Excellent Correlation Set Low and Normally Distributed Relative Errors At All Flow Sizes

21 Stability How long does the TCP throughput function remain stable? –How frequently should we probe the path? What’s the distribution of throughput around the function (i.e., the error)?

22 Throughput is Stable For Long Periods [Figure, Correlation Set; annotation: increasing max/min throughput in interval]

23 Throughput Is Normally Distributed In An Interval Distribution Set

24 Online DualPats Algorithm Fetch probe sequence for destination –Start probing process if no data exists Project probe sequence ahead –20 point moving average over values with current sampling interval Apply model using projected data Return result –confidence interval computed using normality assumptions
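The projection and interval steps could be sketched as follows. This is a simplification under stated assumptions: it averages a plain list of past values, and it builds the interval as mean plus or minus z standard deviations of the window, which is one way to apply a normality assumption. The function names and the 95% default are hypothetical.

```python
from statistics import NormalDist, mean, stdev
from typing import List, Tuple


def project(history: List[float], window: int = 20) -> float:
    """Project the next value as the moving average of the last `window` points."""
    if not history:
        raise ValueError("no probe data yet")
    return mean(history[-window:])


def confidence_interval(history: List[float], p: float = 0.95,
                        window: int = 20) -> Tuple[float, float]:
    """Two-sided interval around the projection, assuming normal errors."""
    recent = history[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two points to estimate spread")
    center = mean(recent)
    z = NormalDist().inv_cdf(0.5 + p / 2)   # e.g. about 1.96 for p = 0.95
    half_width = z * stdev(recent)
    return center - half_width, center + half_width
```

In the online algorithm the projected probe values would then be fed to the fitted model. Here the projection is applied to a generic series to keep the sketch short.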

25 Dynamic Sampling Rate Adjust sampling interval to correspond to the path’s stable intervals Limit rate (20 to 1200 seconds) Additive increase / additive decrease of the interval, based on the difference between the last two probes: … => increase interval; > 15% => decrease interval
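A sketch of the interval adjustment. The 20 to 1200 second limits and the 15% decrease threshold come from the slide. The increase threshold is not legible in the transcript, so `low` (5%) is a hypothetical placeholder, as is the 20 second step.

```python
MIN_INTERVAL_S = 20.0     # from the slide
MAX_INTERVAL_S = 1200.0   # from the slide


def adjust_interval(interval_s: float, prev_bw: float, last_bw: float,
                    step_s: float = 20.0,   # hypothetical step size
                    low: float = 0.05,      # hypothetical increase threshold
                    high: float = 0.15) -> float:
    """Additive increase / additive decrease of the sampling interval."""
    change = abs(last_bw - prev_bw) / prev_bw   # relative change between probes
    if change > high:
        interval_s -= step_s    # path is changing: sample more often
    elif change < low:
        interval_s += step_s    # path is stable: sample less often
    # Clamp to the allowed range.
    return min(MAX_INTERVAL_S, max(MIN_INTERVAL_S, interval_s))
```

Changes between the two thresholds leave the interval untouched, which avoids oscillating on small fluctuations.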

26 Finding Sufficiently Large Probe Size Default values: 400 KB / 800 KB Upper bound Additive increase until prediction errors are less than threshold, all with same sign.
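One possible reading of this rule, sketched below: grow the probe additively while recent prediction errors all exceed the threshold and share a sign, since a consistent bias suggests the probe is still too small. The threshold, step, and upper bound values are hypothetical. Only the 400 KB default comes from the slide.

```python
from typing import Sequence


def next_probe_size(size_kb: int, recent_errors: Sequence[float],
                    threshold: float = 0.2,   # hypothetical error threshold
                    step_kb: int = 100,       # hypothetical additive step
                    max_kb: int = 2000) -> int:   # hypothetical upper bound
    """Return the probe size (KB) to use next, given recent relative errors."""
    if not recent_errors:
        return size_kb
    too_large = all(abs(e) > threshold for e in recent_errors)
    same_sign = (all(e > 0 for e in recent_errors)
                 or all(e < 0 for e in recent_errors))
    if too_large and same_sign:
        return min(size_kb + step_kb, max_kb)   # additive increase, capped
    return size_kb
```

Starting from `size_kb = 400`, the second probe point would follow at twice the size if the 400 KB / 800 KB ratio is kept, which is also an assumption.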

27 Evaluation [Figure, Online Evaluation Set: P[mean error < X] versus relative error, with curves for mean relative error and mean abs(relative error)] Slight conservative bias; >90% of predictions have < 35% error
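The metrics named on this slide could be computed as below. The sign convention (predicted minus actual, divided by actual) is an assumption. Under it, a conservative bias shows up as a negative mean relative error. The function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def relative_errors(predicted: Sequence[float],
                    actual: Sequence[float]) -> List[float]:
    """Per-transfer relative error: (predicted - actual) / actual."""
    return [(p - a) / a for p, a in zip(predicted, actual)]


def mean_relative_error(errors: Sequence[float]) -> float:
    """Signed mean; reveals systematic over- or under-prediction."""
    return mean(errors)


def mean_abs_relative_error(errors: Sequence[float]) -> float:
    """Mean magnitude of the error, ignoring sign."""
    return mean(abs(e) for e in errors)


def fraction_within(errors: Sequence[float], x: float) -> float:
    """Empirical P[|error| < x], e.g. x = 0.35 for the 35% figure."""
    return sum(1 for e in errors if abs(e) < x) / len(errors)
```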

28 Conclusions Algorithm for predicting the TCP throughput as function of flow size Minimal active probing Dynamic probe rate adjustment Explaining flow size / throughput correlation Explaining why simple active probing fails Large scale empirical study

29 For More Info Prescience Lab – Aqua Lab – D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Modeling and Taming Parallel TCP on the Wide Area Network, IPDPS Y. Qiao, J. Skicewicz, P. Dinda, An Empirical Study of the Multiscale Predictability of Network Traffic, HPDC 2004.