Testing TCP Westwood+ over Transatlantic Links at 10 Gigabit/Second Rate
Saverio Mascolo, Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Orabona 4, Bari, Italy
Giuseppe Racanelli, Summer student at CERN IT Division
PFLDNET 05, Feb. 3, 2005, Lyon
Saverio Mascolo – PFLDNET’05
Motivation: The recent introduction of 10 Gigabit routers and 10 Gigabit Ethernet cards makes it important to design and test new protocols capable of efficiently utilizing 10 gigabit Internet paths.
Outline: a brief summary of the problems of TCP over gigabit networks; a brief description of Westwood+ TCP; a performance evaluation of Westwood+ over the DataTAG at the CERN IT division.
Standard TCP throughput: The long-term throughput T of standard TCP can be approximated as [equation not preserved in the source], which sets a fundamental limitation for TCP.
In other terms: to fill a high-speed path with bandwidth B, it is necessary to open a congestion window equal to the bandwidth-delay product B*RTT,
Required packet loss: which requires a packet loss probability [equation not preserved in the source], i.e., to obtain full link utilization, a lower and lower p is required with increasing B.
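The chain of reasoning on the three preceding slides can be sketched with the widely used square-root approximation for standard TCP. This is a hedged reconstruction, not the slides' own notation: the constant $\sqrt{3/2} \approx 1.22$ and the symbols MSS (segment size in bits) and W (congestion window in segments) are assumptions.

```latex
\begin{align}
T &\approx \frac{\mathrm{MSS}}{\mathrm{RTT}} \sqrt{\frac{3}{2p}}
   \approx \frac{1.22\,\mathrm{MSS}}{\mathrm{RTT}\,\sqrt{p}} \\
W &= \frac{B \cdot \mathrm{RTT}}{\mathrm{MSS}}
   \qquad \text{(window needed so that } T = B\text{)} \\
p &\approx \frac{3}{2\,W^{2}}
   = \frac{3}{2}\left(\frac{\mathrm{MSS}}{B \cdot \mathrm{RTT}}\right)^{2}
\end{align}
```

Under these assumptions the tolerable loss probability falls with the square of the bandwidth: a tenfold increase in B requires a hundredfold lower p.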
From S. Floyd's draft on HS-TCP: A Standard TCP connection with 1500-byte packets and a 100 ms round-trip time would require an average congestion window of 83,333 segments to achieve a steady-state throughput of 10 Gbps in the presence of a packet drop rate of at most one loss event every 5,000,000,000 packets. The average packet drop rate of at most 2*10^-10, which is needed for full link utilization in this scenario, corresponds to a bit error rate of at most 2*10^-14, which is unrealistic for current networks.
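The arithmetic behind the quoted figures can be checked with a few lines of Python. This is a minimal sketch: the response function w = 1.2/sqrt(p) and all function and constant names are assumptions chosen for illustration, while the link rate, RTT and packet size come from the quoted text.

```python
LINK_RATE_BPS = 10e9       # 10 Gbps target throughput (from the quoted text)
RTT_S = 0.100              # 100 ms round-trip time (from the quoted text)
PACKET_BYTES = 1500        # 1500-byte packets (from the quoted text)
RESPONSE_CONSTANT = 1.2    # assumption: steady-state window w = 1.2 / sqrt(p)


def required_window_segments(rate_bps, rtt_s, packet_bytes):
    """Bandwidth-delay product expressed in segments."""
    return rate_bps * rtt_s / (packet_bytes * 8)


def max_loss_probability(window_segments, c=RESPONSE_CONSTANT):
    """Invert w = c / sqrt(p) to get the loss rate that sustains the window."""
    return (c / window_segments) ** 2


window = required_window_segments(LINK_RATE_BPS, RTT_S, PACKET_BYTES)
loss_probability = max_loss_probability(window)
packets_per_loss = 1.0 / loss_probability
bit_error_rate = loss_probability / (PACKET_BYTES * 8)

print(f"window            = {window:,.0f} segments")
print(f"loss probability  = {loss_probability:.2e}")
print(f"packets per loss  = {packets_per_loss:.2e}")
print(f"bit error rate    = {bit_error_rate:.2e}")
```

With these inputs the window comes out at about 83,333 segments and the loss probability at about 2*10^-10, i.e. on the order of one loss every few billion packets.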
Reasons to investigate Westwood+: Given these considerations, the main idea of Westwood+, which consists of shrinking the control windows after congestion by taking into account an estimate of the available bandwidth, is worth investigating in the context of very high speed networks.
Westwood+ TCP: The key idea of Westwood+ is to use the stream of ACK packets to obtain an end-to-end estimate of the available bandwidth, which is then used to set cwnd and ssthresh after congestion (whereas standard TCP implements a “blind” halving of the window).
TCP Westwood+: [Figure: cwnd versus time, showing slow start, congestion avoidance, a timeout, and ssthresh = BWE*RTTmin.] Adaptive decrease: cwnd = ssthresh = BWE*RTTmin. Westwood adaptive decrease vs. (New) Reno blind halving of the window.
E2E bandwidth estimation: The rate of returning ACKs is exploited to estimate the “best-effort” available bandwidth. [Figure: the sender transmits packets through the network to the receiver; the returning ACKs feed a filter at the sender, which outputs the bandwidth estimate.]
Warning: ACKs reach the TCP sender compressed. Bandwidth samples contain high-frequency components that cannot be filtered out by a discrete-time filter because of aliasing.
An anti-aliasing filter in packet networks. [Slide graphic not preserved; surviving label: antialiased samples.]
We are currently using the standard exponential filter.
Summary on bandwidth estimate. Westwood TCP: one bandwidth sample computed for each ACK (Mobicom 01), which leads to bandwidth overestimation in the presence of ACK compression. Westwood+ TCP: one bandwidth sample for each RTT (see ACM CCR, April 04).
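The difference between per-ACK and per-RTT sampling can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name, the filter weight `alpha = 0.9` and the burst timings in `demo()` are assumptions chosen only to show the effect of ACK compression.

```python
class PerRttBandwidthEstimator:
    """One bandwidth sample per RTT, smoothed by an exponential filter."""

    def __init__(self, alpha=0.9):
        self.alpha = alpha      # assumption: weight given to the previous estimate
        self.bwe = None         # filtered estimate, bytes per second
        self._bytes = 0         # bytes acknowledged in the current interval
        self._start = None      # start time of the current interval, seconds

    def on_ack(self, now, acked_bytes, rtt):
        if self._start is None:
            self._start = now   # the first ACK only starts the clock
            return self.bwe
        self._bytes += acked_bytes
        elapsed = now - self._start
        if elapsed >= rtt and elapsed > 0:
            sample = self._bytes / elapsed          # one sample per RTT
            if self.bwe is None:
                self.bwe = sample
            else:
                self.bwe = self.alpha * self.bwe + (1 - self.alpha) * sample
            self._bytes = 0
            self._start = now
        return self.bwe


def demo():
    """Ten 1500-byte ACKs arrive compressed into the last millisecond of an RTT."""
    rtt = 0.1
    burst = [0.0991, 0.0992, 0.0993, 0.0994, 0.0995,
             0.0996, 0.0997, 0.0998, 0.0999, 0.1]
    estimator = PerRttBandwidthEstimator()
    estimator.on_ack(0.0, 0, rtt)
    estimate = None
    for t in burst:
        estimate = estimator.on_ack(t, 1500, rtt)
    # A naive per-ACK sample divides one ACK's bytes by the inter-ACK gap.
    per_ack_sample = 1500 / (burst[1] - burst[0])
    return per_ack_sample, estimate


if __name__ == "__main__":
    per_ack, per_rtt = demo()
    print(f"per-ACK sample : {per_ack:,.0f} bytes/s")
    print(f"per-RTT sample : {per_rtt:,.0f} bytes/s")
```

In this toy trace the per-ACK sample is about two orders of magnitude larger than the per-RTT sample, which is the overestimation the slide attributes to ACK compression.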
Known advantages of Westwood+ TCP: higher throughput over wireless links, because losses due to unreliable links do not provoke overshrinking of the congestion window; improved fairness with respect to Reno (Reno throughput is proportional to 1/RTT, whereas Westwood throughput is proportional to 1/sqrt(RTT)).
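The RTT-fairness claim can be made concrete with a small worked example. The two RTT values (50 ms and 200 ms) and the function name are assumptions chosen for illustration; only the proportionality laws come from the slide.

```python
def throughput_ratio(rtt_short_ms, rtt_long_ms, exponent):
    """Throughput of the short-RTT flow divided by that of the long-RTT flow,
    for a protocol whose throughput scales as 1 / RTT**exponent."""
    return (rtt_long_ms / rtt_short_ms) ** exponent


reno_ratio = throughput_ratio(50, 200, 1.0)       # throughput ~ 1/RTT
westwood_ratio = throughput_ratio(50, 200, 0.5)   # throughput ~ 1/sqrt(RTT)

print(f"Reno     short/long throughput ratio: {reno_ratio:.1f}")
print(f"Westwood short/long throughput ratio: {westwood_ratio:.1f}")
```

Under these scaling laws a fourfold RTT difference gives the short-RTT flow four times the throughput with Reno but only twice the throughput with Westwood.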
Pseudo code of Westwood+
a) On ACK reception:
   - cwnd is increased according to the Reno algorithm;
   - an estimate BWE of the available bandwidth is computed.
b) When 3 DUPACKs are received:
   ssthresh = max(2, (BWE * RTTmin) / seg_size);
   cwnd = ssthresh;
c) When coarse timeout expires:
   ssthresh = max(2, (BWE * RTTmin) / seg_size);
   cwnd = 1;
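The three cases of the pseudo code can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the state class, the default values, and the Reno comparison function are hypothetical, and BWE is assumed to be supplied by a separate estimator.

```python
from dataclasses import dataclass


@dataclass
class WestwoodPlusState:
    cwnd: float = 1.0        # congestion window, segments
    ssthresh: float = 64.0   # slow start threshold, segments (initial value assumed)
    bwe: float = 0.0         # bandwidth estimate, bytes per second (set externally)
    rtt_min: float = 0.1     # minimum observed RTT, seconds
    seg_size: int = 1500     # segment size, bytes


def adaptive_threshold(state):
    """ssthresh = max(2, BWE * RTTmin / seg_size), in segments."""
    return max(2.0, state.bwe * state.rtt_min / state.seg_size)


def on_ack(state):
    """Case a): Reno increase (slow start below ssthresh, else +1/cwnd)."""
    if state.cwnd < state.ssthresh:
        state.cwnd += 1.0
    else:
        state.cwnd += 1.0 / state.cwnd


def on_three_dupacks(state):
    """Case b): adaptive decrease instead of halving."""
    state.ssthresh = adaptive_threshold(state)
    state.cwnd = state.ssthresh


def on_timeout(state):
    """Case c): adaptive ssthresh, window restarts from one segment."""
    state.ssthresh = adaptive_threshold(state)
    state.cwnd = 1.0


def reno_on_three_dupacks(state):
    """For comparison only: the blind halving used by (New) Reno."""
    state.ssthresh = max(2.0, state.cwnd / 2.0)
    state.cwnd = state.ssthresh
```

For example, with `bwe = 12_000_000` bytes/s, `rtt_min = 0.1` s and 1500-byte segments, `on_three_dupacks` sets the window to about 800 segments regardless of its value before the loss, whereas `reno_on_three_dupacks` would halve whatever the window happened to be.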
[Figure: Experimental testbed]
Single stream tests: [Figure: congestion window and slow start threshold of a single TCP NewReno stream over a 10 Gbps link.] At t = 180 s, because of a loss, cwnd drops from 2.5*10^8 bytes to 2.7*10^7 bytes and TCP enters the congestion avoidance phase.
Instantaneous and mean throughput of NewReno TCP: the throughput is around 1.8 Gbps, which is less than one fifth of the channel capacity.
cwnd and ssthresh dynamics obtained in the same scenario using Westwood+ TCP: after congestion, cwnd drops from 2.5*10^8 bytes to 2.3*10^8 bytes, which is remarkably larger than the corresponding value obtained using NewReno.
Instantaneous and mean throughput of Westwood+ TCP: the achieved throughput is now around 7 Gbps.
Cwnd and ssthresh of Westwood+ TCP: a UDP stream at 5 Gbps is injected for a few seconds; the slow start threshold is set to 3.5*10^7 bytes after congestion and, again, it takes a long time for TCP in the congestion avoidance phase to grab all the bandwidth available after the UDP stream is turned off.
Throughput of Westwood+ with UDP active for a while: around one tenth of the available bandwidth (i.e., 1.2 Gbps) is achieved.
TCP Westwood+ with a modified probing phase à la Scalable TCP
On ACK reception:
   if ssthresh <= cwnd < window_threshold
      cwnd = cwnd + 1/cwnd;
   if cwnd >= window_threshold
      cwnd = cwnd + 0.04;
By increasing cwnd by 0.04 on every ACK reception, cwnd grows by one twenty-fifth of its value (4%) per RTT, i.e., the growth is greater with larger windows.
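The effect of the per-ACK increment can be checked with a short simulation. This is a sketch under stated assumptions: one ACK is assumed per segment in flight per RTT, the slow-start branch is assumed to be unchanged Reno, a window exactly equal to `window_threshold` is treated as being in the scalable regime, and the function names are hypothetical.

```python
def on_ack_modified(cwnd, ssthresh, window_threshold, gain=0.04):
    """Per-ACK window update with the modified probing phase (cwnd in segments)."""
    if cwnd >= window_threshold:
        return cwnd + gain            # Scalable-TCP-like increase
    if cwnd >= ssthresh:
        return cwnd + 1.0 / cwnd      # Reno congestion avoidance
    return cwnd + 1.0                 # assumption: standard slow start


def grow_one_rtt(cwnd, ssthresh, window_threshold):
    """Apply one RTT worth of ACKs, assuming one ACK per segment in flight."""
    for _ in range(int(cwnd)):
        cwnd = on_ack_modified(cwnd, ssthresh, window_threshold)
    return cwnd


small = grow_one_rtt(1_000.0, ssthresh=50.0, window_threshold=100.0)
large = grow_one_rtt(10_000.0, ssthresh=50.0, window_threshold=100.0)
reno = grow_one_rtt(60.0, ssthresh=50.0, window_threshold=100.0)

print(f"1,000  segments grow to {small:,.1f} in one RTT")
print(f"10,000 segments grow to {large:,.1f} in one RTT")
print(f"60 segments (Reno regime) grow to {reno:.2f} in one RTT")
```

Above the threshold the window gains 4% per RTT whatever its size, so the absolute increase scales with the window, while in the Reno regime it gains at most one segment per RTT.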
[Figure: cwnd and ssthresh of Westwood+ TCP using the modified congestion avoidance, with UDP active for a while.]
Westwood+ TCP using a modified probing phase: in this case, even though the threshold is set below the network capacity, the congestion window increases quickly and gives good results in terms of average throughput, which rises to 6.2 Gbps.
Multiple stream tests: The testbed is the 10 Gbps connection from Geneva to Chicago, where the link between the Cisco 7606 router at Geneva and the Extreme router s01gva runs at 1 Gbps. To investigate fairness in bandwidth utilization, we consider 3 flows sharing the bottleneck.
Cwnd of 3 NewReno flows: NewReno flows exhibit the classic “sawtooth” oscillatory behaviour of cwnd caused by the halving of the window.
Cwnd of 3 Westwood+ flows. Remark: oscillation-free behavior (the congestion window stays around 5*10^6 bytes throughout the test).
Throughput of 3 NewReno flows: the average per-connection throughput in the case of NewReno is 270 Mbps.
Throughput of 3 Westwood+ streams: the average per-connection throughput in the case of Westwood+ is 320 Mbps.
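As a rough worked example, the per-connection averages can be turned into aggregate utilization of the shared link. This sketch assumes that the 1 Gbps link is the bottleneck and that the three per-connection averages can simply be summed; the function and constant names are hypothetical.

```python
BOTTLENECK_MBPS = 1000   # assumption: the 1 Gbps link is the shared bottleneck
FLOWS = 3                # three flows share the bottleneck


def aggregate_utilization(per_flow_mbps, flows=FLOWS, capacity_mbps=BOTTLENECK_MBPS):
    """Fraction of the bottleneck used when every flow gets the given average."""
    return per_flow_mbps * flows / capacity_mbps


newreno_utilization = aggregate_utilization(270)
westwood_utilization = aggregate_utilization(320)

print(f"NewReno   aggregate utilization: {newreno_utilization:.0%}")
print(f"Westwood+ aggregate utilization: {westwood_utilization:.0%}")
```

Under these assumptions the three NewReno flows together use about 81% of the bottleneck and the three Westwood+ flows about 96%.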
Fairness: To provide a mathematical evaluation of fairness, we plot the dynamics of the Jain fairness index, defined as $J(t) = \frac{\left(\sum_{i=1}^{M} b_i(t)\right)^2}{M \sum_{i=1}^{M} b_i(t)^2}$, where $b_i(t)$ is the instantaneous throughput of the i-th connection and M is the number of connections sharing the bottleneck. The Jain fairness index belongs to the interval [0,1] and increases with fairness up to the value of one.
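The Jain fairness index is the squared sum of the throughputs divided by M times the sum of their squares. A minimal Python sketch follows; the function name and the sample throughput values are illustrative assumptions, not measurements from the tests.

```python
def jain_fairness_index(throughputs):
    """Jain index: (sum b_i)^2 / (M * sum b_i^2) for M throughput samples."""
    m = len(throughputs)
    sum_of_squares = sum(b * b for b in throughputs)
    if m == 0 or sum_of_squares == 0:
        raise ValueError("need at least one non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (m * sum_of_squares)


# Illustrative values only (Mbps).
equal_share = jain_fairness_index([320, 320, 320])
uneven_share = jain_fairness_index([600, 200, 100])
single_winner = jain_fairness_index([900, 0, 0])

print(f"equal shares  : {equal_share:.3f}")
print(f"uneven shares : {uneven_share:.3f}")
print(f"one flow only : {single_winner:.3f}")
```

Equal shares give an index of one, and the index falls toward 1/M as a single connection takes the whole bottleneck.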
[Figure: Jain fairness index of 3 NewReno flows]
[Figure: Jain fairness index of 3 Westwood+ flows]
Conclusions: Setting cwnd and ssthresh à la Westwood improves throughput and fairness with respect to NewReno TCP also in the context of gigabit networks. We plan to run many more experiments. We plan to blend the Westwood+ features with more aggressive probing phases, such as those of Scalable TCP or HS-TCP.
Acknowledgments: We thank Olivier Martin at the IT division of CERN and all the CS group, namely Sylvain Ravot, Paolo Moroni, Edoardo Martelli and Dan Nae (from Caltech), for their great support and for allowing us to collect the measurements reported in this paper.
Thanks for your attention. Questions?