The Effects of Systemic Packet Loss on Aggregate TCP Flows. Thomas J. Hacker, May 8, 2002, Internet 2 Member Meeting.


1 The Effects of Systemic Packet Loss on Aggregate TCP Flows
Thomas J. Hacker
May 8, 2002
Internet 2 Member Meeting

2 Problem
• The high performance computing community is making use of parallel TCP sockets to increase end-to-end throughput
• There are concerns about the effectiveness, fairness and efficiency of parallel flows
• This research uses simulation to investigate the effectiveness, fairness and efficiency questions
• Based on simulations with an empirically based loss model, parallel TCP is effective and efficient, but not always fair
• It may be possible to improve fairness

3 Outline
• Introduction
• Motivation
• Background
• Simulation
• Evaluation
• Conclusion

4 Introduction
• HPC community needs high speed bulk throughput
• Using parallel TCP flows to increase throughput
• Examples
  • Bbcp - Stanford Linear Accelerator (SLAC)
  • Globus - Argonne National Lab
  • GridFTP - Grid Forum and ANL
  • Storage Resource Broker - San Diego Supercomputer Center
  • PSockets Library - University of Illinois at Chicago
• SLAC has extensive measurements that demonstrate successful use

5 Introduction
• Actual end-to-end network throughput is much less than expected
• Host and network tuning helps a little
• Infrastructure upgrades help a little
• But after tuning, throughput is still much less than expected
• Network measurements gathered from infrastructure show available unused bandwidth ("head room")
• Observed packet loss rates from transfers are too high to support high throughput bulk data transfers

6 Introduction
• Networking community discourages use of parallel TCP flows
  • May cause congestion collapse at worst
  • Unfair to single stream flows at best
• This is based on the belief that packet losses are due exclusively to network overload

7 Motivation
• This research examines the use of parallel TCP flows on shared networks
• Goals of the research are to determine if parallel TCP is
  • Effective
  • Fair
  • Efficient

8 Motivation
• Effective
  • Does the use of parallel TCP flows increase aggregate throughput?
• Fair
  • Does the use of parallel TCP flows steal bandwidth from competing TCP flows?
• Efficient
  • Does the use of parallel TCP flows improve the overall efficiency of the network bottleneck?

9 Outline
• Introduction
• Motivation
• Background
• Simulation
• Evaluation
• Conclusion

10 Background
• Factors that affect TCP throughput
  • Maximum Segment Size (MSS)
    • Maximum TCP segment size
    • Limited by maximum frame size supported by network
  • Round Trip Time (RTT)
    • Depends on
      • Length of network
      • Load on network (queueing delays)
  • Packet Loss Rate
    • Number of packets dropped / Number of packets transmitted
    • Packet losses considered a sign of overload
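The slide lists the three factors without a formula. As a rough illustration, the widely cited Mathis et al. square-root approximation ties them together: throughput is bounded by about (MSS / RTT) * C / sqrt(p). The sketch below assumes that approximation with C = sqrt(3/2); the function name and the example numbers are illustrative and do not come from the presentation.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate upper bound on steady-state TCP throughput in bits/s.

    Uses the square-root approximation BW <= (MSS / RTT) * C / sqrt(p).
    The constant C = sqrt(3/2) is an assumption of this sketch (periodic
    loss, no delayed ACKs); other derivations use slightly different values.
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("mss_bytes and rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


if __name__ == "__main__":
    # Hypothetical path: 1460-byte MSS, 60 ms RTT, loss rate of 1 in 10,000.
    bw = mathis_throughput_bps(1460, 0.060, 1e-4)
    print(f"approx. bound: {bw / 1e6:.2f} Mb/s")
```

Under this approximation the loss rate enters through a square root, so quadrupling the loss rate halves the bound, while halving the RTT or doubling the MSS doubles it.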

11 Background
• Packet Loss
  • Most dynamic factor of the three
  • High rates of packet loss limit throughput
  • Cause assumed to be exclusively overload
  • Statistical distribution of packet loss is important

12 Background
• Sources of Packet Loss
  • Network bottleneck overload
  • Other sources
    • Hardware and software bugs
    • Faulty hardware
    • Others…

13 Background
• Implication
  • When there is no congestion, packet loss from other sources limits throughput
• Evidence of non-congestion packet loss
  • Lack of recorded drops in routers
  • Underutilized network links
  • Packet drops present in TCP sessions that are not due to overload

14 Background
• Parallel TCP flows
  • Overcome effects of packet loss on throughput
  • Recover from loss faster than a single stream
  • Average out effects of non-congestion related packet losses
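One way to see why aggregation helps is sketched below. It assumes each of the n flows independently follows the Mathis et al. square-root approximation and sees the same MSS, RTT, and loss rate p. These are assumptions of this sketch, not formulas shown on the slide, and C is the approximation's constant.

```latex
BW_{\text{single}} \lesssim \frac{MSS}{RTT}\,\frac{C}{\sqrt{p}}
\qquad\Longrightarrow\qquad
BW_{n} \lesssim \sum_{i=1}^{n} \frac{MSS}{RTT}\,\frac{C}{\sqrt{p}}
       = \frac{n \cdot MSS}{RTT}\,\frac{C}{\sqrt{p}}
```

Read this way, n parallel flows behave roughly like one flow with an n-times larger segment size, for as long as the aggregate stays below the bottleneck capacity.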

15 Outline
• Introduction
• Motivation
• Background
• Simulation
• Evaluation
• Conclusion

16 Simulation
• NS2 simulation built to investigate the effectiveness, fairness, and efficiency of parallel TCP flows

17 Simulation
• Loss model in simulator is critical
• Measurements from real transfers used to build loss model
  • 153 data transfers from U-M to Caltech
  • Performed over 3 days
  • Packet traces from experiments analyzed to extract losses
• Source of Loss
  • Network operations centers certified no router drops during test
  • Bandwidth graph for network bottleneck showed underutilization

18 Simulation
• Observed Loss Characteristics

19 Simulation
• Right Hand Side of Histogram

20 Simulation
• Left Hand Side of Histogram
  • Intraburst Losses
  • Collection of exponential distributions
  • Between 61% and 78% of analyzed intrabursts fit an exponential distribution
• Right Hand Side of Histogram
  • Interburst Losses
  • Fits a normal distribution

21 Simulation
• Loss Models Considered
  • Constant Loss Probability
  • Random I.I.D.
  • Poisson Loss Arrival
  • Unconditional and Conditional Loss (a.k.a. 2-state Markov or Gilbert)
  • Kth Order Markov Loss Model
  • Extended Gilbert Model
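To make the simplest bursty candidate concrete, here is a minimal sketch of a 2-state (Gilbert) loss model in its most basic form, where every packet sent while the chain is in the bad state is lost. The function names and the transition probabilities in the example are hypothetical and are not taken from the measurements described in the talk.

```python
import random


def gilbert_losses(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where a packet is lost.

    Simple Gilbert model: the chain starts in the good state, every packet
    sent in the bad state is lost, and the state may change after each packet.
    """
    rng = random.Random(seed)
    in_bad = False
    losses = []
    for _ in range(n_packets):
        losses.append(in_bad)
        if in_bad:
            if rng.random() < p_bad_to_good:
                in_bad = False
        elif rng.random() < p_good_to_bad:
            in_bad = True
    return losses


def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of packets lost under the simple Gilbert model."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)


if __name__ == "__main__":
    # Hypothetical parameters chosen only to produce short loss bursts.
    trace = gilbert_losses(200_000, 0.01, 0.5, seed=1)
    print("empirical loss rate:", sum(trace) / len(trace))
    print("stationary loss rate:", stationary_loss_rate(0.01, 0.5))
```

Unlike a constant loss probability, this model produces runs of consecutive losses whose mean length is 1 / p_bad_to_good packets.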

22 Simulation
• 6-state Markov Model selected
  • 6 states were enough to simulate throughput equivalent to observed
  • Markov chain used to drive a Markov Modulated Poisson Process (MMPP)
  • 1 state is the loss state, 5 states are no-loss
  • Sojourn times and transition probabilities taken from observed data
  • Poisson Loss Model used for the loss state
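The structure described above can be sketched as a small generator of loss timestamps: a Markov chain moves between states, and only the loss state emits losses, as a Poisson process. This is a minimal sketch, not the author's ns2 implementation. The exponential sojourn times, the function name, and every numeric parameter in the example are assumptions made for illustration; the talk's measured sojourn times and transition probabilities are not reproduced in the transcript.

```python
import random


def mmpp_loss_times(horizon_s, mean_sojourn_s, transitions, loss_state,
                    loss_rate_per_s, start_state=0, seed=0):
    """Return sorted loss timestamps in [0, horizon_s).

    mean_sojourn_s[i] : mean time spent in state i (exponential, assumed)
    transitions[i][j] : probability of moving from state i to state j
    loss_state        : index of the single state that generates losses
    loss_rate_per_s   : Poisson loss arrival rate while in the loss state
    """
    rng = random.Random(seed)
    n_states = len(mean_sojourn_s)
    state, t, losses = start_state, 0.0, []
    while t < horizon_s:
        stay = rng.expovariate(1.0 / mean_sojourn_s[state])
        end = min(t + stay, horizon_s)
        if state == loss_state:
            # Poisson arrivals: exponential gaps between consecutive losses.
            nxt = t + rng.expovariate(loss_rate_per_s)
            while nxt < end:
                losses.append(nxt)
                nxt += rng.expovariate(loss_rate_per_s)
        t = end
        state = rng.choices(range(n_states), weights=transitions[state])[0]
    return losses


if __name__ == "__main__":
    # Hypothetical 6-state chain: state 0 is the loss state, states 1-5 are
    # no-loss states with different dwell times; each row sums to 1.
    sojourn = [0.05, 0.5, 1.0, 2.0, 4.0, 8.0]
    trans = [
        [0.0, 0.2, 0.2, 0.2, 0.2, 0.2],
        [0.5, 0.0, 0.2, 0.1, 0.1, 0.1],
        [0.5, 0.2, 0.0, 0.1, 0.1, 0.1],
        [0.5, 0.1, 0.2, 0.0, 0.1, 0.1],
        [0.5, 0.1, 0.1, 0.2, 0.0, 0.1],
        [0.5, 0.1, 0.1, 0.1, 0.2, 0.0],
    ]
    times = mmpp_loss_times(600.0, sojourn, trans, loss_state=0,
                            loss_rate_per_s=100.0, start_state=1, seed=42)
    print(len(times), "losses in 600 simulated seconds")
```

Changing the seed gives an independent loss trace from the same model, which is the role the different random seeds play when a simulation is repeated.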

23 Simulation
• MultiState Loss Model in ns2 used to implement MMPP loss model
• Extension made to ns2 to support MultiState Loss Model on multiple links in the simulator
• Each simulation instance was run 10 times with different random seeds for the Loss Model
• Total number of simulations was over 3000

24 Outline
• Introduction
• Motivation
• Background
• Simulation
• Evaluation
• Conclusion

25 Evaluation
• Effectiveness
• Fairness
• Efficiency

26 Evaluation
• Effectiveness Question
  • Does the use of parallel TCP flows increase aggregate throughput?
• Addressing the Question
  • Between 1 and 6 parallel flows simulated
  • No cross traffic

27 Evaluation
• Effectiveness Results

28 Evaluation
• Effectiveness Conclusion
  • Parallel flows improve aggregate throughput in the presence of systemic non-congestion related packet loss
  • Corroboration of simulation results with observed results

29 Evaluation
• Effectiveness
• Fairness
• Efficiency

30 Evaluation
• Fairness Question
  • Does the use of parallel TCP flows steal bandwidth from competing TCP flows?
• Addressing the Question
  • Between 1 and 12 parallel flows
  • Between 1 and 5 cross streams of competing single stream traffic

31 Evaluation
• Reading the Graphs
[Figure: annotated example graph. Labels: Total Parallel Flow Throughput; Total Single Stream Flow Throughput; Network Bottleneck is 100 Mb/sec]

32 Evaluation






38 Fairness Conclusions
• Fair when more than approximately 10% of the bandwidth is unused
• Unfair when there is no available bandwidth
• Parallel TCP flows steal bandwidth from competing single stream flows to increase throughput when there is no unused bandwidth

39 Evaluation
• Improving Fairness
• Parallel flow aggressiveness is due to
  • Increased recovery rate over single stream
  • Fractional response to packet drops
• If we could make parallel flows only as aggressive as a single stream, can we preserve effectiveness and efficiency while improving fairness?
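Both sources of aggressiveness can be written out with a short calculation. It assumes standard TCP behaviour (halve the window on a loss, then grow by one packet per round trip), an aggregate window W split evenly across n flows, and a single loss that hits only one of the flows. The symbols W and n are introduced here for illustration.

```latex
\text{single flow:}\quad W \;\to\; \frac{W}{2},
\qquad \text{recovery time} \approx \frac{W}{2}\ \text{RTTs}

\text{$n$ parallel flows:}\quad W \;\to\; W - \frac{1}{2}\cdot\frac{W}{n}
  = W\left(1 - \frac{1}{2n}\right),
\qquad \text{recovery time} \approx \frac{W}{2n}\ \text{RTTs}
```

With n = 5, for example, one loss costs the aggregate 10% of its window instead of 50%, and the affected flow regains its share five times sooner.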

40 Evaluation
• Slight modification to the TCP congestion avoidance algorithm
• If n parallel flows are used, increase congestion window one packet for every n packets successfully transmitted, rather than one packet for every one packet successfully transmitted
• Overall aggressiveness of n parallel flows is then the same as one single TCP flow
• Simulations for 1 and 5 cross streams run with 1 to 20 parallel streams to investigate boundaries
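The sketch below is one possible reading of this modification, not the author's implementation. It assumes the usual congestion-avoidance bookkeeping, in which each ACK grows the window by 1/cwnd packets (about one packet per round trip), and scales that increment by 1/n so that n flows together grow by about one packet per round trip. The function names and the window size in the example are hypothetical.

```python
def ca_increment(cwnd_pkts, n_flows=1):
    """Per-ACK congestion-avoidance increase, in packets.

    n_flows=1 gives the conventional 1/cwnd increment; n_flows>1 gives the
    scaled increment assumed in this sketch.
    """
    if cwnd_pkts <= 0 or n_flows < 1:
        raise ValueError("cwnd_pkts must be positive and n_flows >= 1")
    return 1.0 / (n_flows * cwnd_pkts)


def growth_over_one_rtt(cwnd_pkts, n_flows=1):
    """Window growth after roughly one round trip without loss.

    Approximates one round trip as one ACK per packet in the starting window.
    """
    start = cwnd_pkts
    for _ in range(int(start)):
        cwnd_pkts += ca_increment(cwnd_pkts, n_flows)
    return cwnd_pkts - start


if __name__ == "__main__":
    single = growth_over_one_rtt(10.0)
    per_flow = growth_over_one_rtt(10.0, n_flows=5)
    print(f"single flow growth per RTT:     {single:.3f} packets")
    print(f"each of 5 modified flows:       {per_flow:.3f} packets")
    print(f"aggregate of 5 modified flows:  {5 * per_flow:.3f} packets")
```

Only the additive increase changes in this sketch; each flow still halves its own window on a loss, so the fractional response to drops remains.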

41 Evaluation







48
• Parallel flows with modification are about ½ as aggressive as parallel flows with no modification
• Also found some asymptotic behavior as the number of parallel flows increased

49 Evaluation
• Asymptotic behavior
• Derived aggregate throughput of parallel flow with modified TCP

50 Evaluation


52 Fairness Conclusions
• Fair when there is more than 10% available bandwidth in bottleneck
• Parallel flows steal from single stream flows when bottleneck is over 90% utilized
• TCP modification
  • Reduces aggressiveness
  • Curbs ability of parallel flow to steal bandwidth as number of flows increases

53 Evaluation
• Effectiveness
• Fairness
• Efficiency

54 Evaluation
• Efficiency Results
  • Efficiency is increased when parallel flows are used if there is unused bandwidth in bottleneck
  • When all nodes use the same number of parallel flows
    • Efficiency maintained
    • Fairness maintained

55 Outline
• Introduction
• Motivation
• Background
• Simulation
• Evaluation
• Conclusion

56 Conclusions
• Parallel flows are
  • Effective
  • Fair when bottleneck is utilized less than 90%
  • Unfair when bottleneck is near saturation
  • Efficient
• TCP congestion avoidance algorithm can be modified to
  • Reduce aggressiveness by approximately 1/2
  • Maintain effectiveness and efficiency

57 Future Work
• Implement modified algorithm for assessment
• Further investigate loss models
  • Parameterization of loss models
  • Assessment of end-to-end network loss characteristics
• Investigate optimal TCP response to observed loss characteristics
• Investigate stochastic analysis of parallel TCP over wide area networks
