The University of Alberta, June 17th, 2004
Wireless Random Packet Networking, Part II: TCP/IP Over Lossy Links - TCP SACK without Congestion Control
Roland Kempter
Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, UT, USA
Email: rolke@gmx.net
Organization
1. The History of TCP
2. Current TCP Congestion Control
3. Design Ideas: no congestion control at all
4. Measurement Results
5. Future TCP Congestion Control?
6. Conclusion
1. The History of TCP (incomplete)
- Old Tahoe: slow start and congestion avoidance. After packet loss, a timeout is followed by slow start.
- Tahoe: added fast retransmit. Duplicate ACKs initiate fast retransmit, then slow start [Jaco88].
- Reno: fast retransmit extended by fast recovery [Jaco90].
- New Reno: small optimization of TCP Reno; immediately retransmits the packet following a partial ACK without leaving fast recovery [Hoe96].
- TCP SACK: the receiver specifies the ranges of packets that were received out of order. More than one packet per RTT can be sent during fast recovery [MatMahFlRo96].
[Figure: timeline of TCP variants (Old Tahoe, New Reno, TCP FACK, Vegas, SACK) spanning '88 to '96]
2. Congestion Control: slow start
- Slow start: every received ACK increases the congestion window by one packet, so the number of packets that are sent doubles every round-trip time.
- Slow start adds a window to the sender's TCP, the congestion window (cwnd), as well as a threshold variable called ssthresh.
- The congestion window grows exponentially up to ssthresh, then linearly.
[Figure taken from [Jaco88]]
- The congestion window is flow control imposed by the sender. It is based on the sender's educated guess of perceived network congestion.
- Congestion control assumes that packets are only lost due to overfull queues.
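The window growth on this slide can be sketched as follows. This is a minimal illustration, assuming an idealized sender that counts in whole packets, receives one ACK per packet sent, and never sees a loss. The function name `simulate_cwnd`, the counter `acked_in_ca`, and the default values are hypothetical and are not taken from [Jaco88].

```python
def simulate_cwnd(rtts, ssthresh=16, initial_cwnd=1):
    """Return the congestion window (in packets) at the start of each RTT.

    Assumptions (illustrative only): no loss, one ACK per packet sent,
    and the window is counted in whole packets.
    """
    cwnd = initial_cwnd
    acked_in_ca = 0  # ACKs counted since the last linear increase
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        # One ACK arrives for every packet sent during this RTT.
        for _ in range(cwnd):
            if cwnd < ssthresh:
                # Exponential phase: +1 packet per ACK doubles cwnd per RTT.
                cwnd += 1
            else:
                # Linear phase: +1 packet once a full window has been ACKed.
                acked_in_ca += 1
                if acked_in_ca >= cwnd:
                    cwnd += 1
                    acked_in_ca = 0
    return trace


if __name__ == "__main__":
    print(simulate_cwnd(8))
```

With the default threshold of 16 packets, the window doubles each round trip until it reaches the threshold and then grows by one packet per round trip.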
2. Congestion Control: congestion avoidance in TCP Reno
[Figure: congestion window over time, showing the SS (Slow Start) and CA (Congestion Avoidance) phases and the fast retransmission/fast recovery events]
[Unfortunately, it slipped my mind where I found this animation; I hope you don't mind!]
2. TCP Congestion Control
TCP send rate is determined by three windows: win=min(snd_cwnd,snd_wnd,snd_bwnd)
- Congestion window (snd_cwnd), assumed bottleneck: queue sizes in the network
- Advertised window (snd_wnd), assumed bottleneck: receiver's buffer
- Bandwidth window (snd_bwnd), "ACK clock", assumed bottleneck: link capacity
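As a small sketch of the rule win=min(snd_cwnd,snd_wnd,snd_bwnd), the snippet below computes the effective window and reports which assumed bottleneck is currently limiting the sender. The helper names and the example values (in bytes) are hypothetical; only the three window names come from the slide.

```python
def effective_window(snd_cwnd, snd_wnd, snd_bwnd):
    """Effective send window: the smallest of the three windows."""
    return min(snd_cwnd, snd_wnd, snd_bwnd)


def limiting_window(snd_cwnd, snd_wnd, snd_bwnd):
    """Name of the window that currently limits the sender."""
    windows = {
        "snd_cwnd": snd_cwnd,  # assumed bottleneck: queues in the network
        "snd_wnd": snd_wnd,    # assumed bottleneck: receiver's buffer
        "snd_bwnd": snd_bwnd,  # assumed bottleneck: link capacity
    }
    return min(windows, key=windows.get)


if __name__ == "__main__":
    # Hypothetical values in bytes, chosen only for illustration.
    print(effective_window(8000, 65535, 25000))
    print(limiting_window(8000, 65535, 25000))
```

In this made-up example the congestion window is the smallest of the three, so the sender is limited by its own guess about network queues rather than by the receiver or the link.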
2. Congestion Control: congestion avoidance
- Congestion control assumes that packets are only lost due to overfull queues.
- Again: when do we need snd_cwnd? Only if we assume that the queues in the network are the bottleneck.
- Also: in the real world, is there more to infer from a lost packet than that it has to be retransmitted?
- SACK optimizes the "retransmission business".
3. Design Idea
TCP offers:
- Flow Control
- Bandwidth Control
- Congestion Control
- In-order delivery
- Error Control (retransmissions)
- ...and a lot more
TCP does not:
- offer timely delivery
- avoid unnecessary overhead under certain conditions (e.g. short connections)
- perform well in lossy (wireless) environments
Why? Because of the way TCP handles congestion control.
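One way to see why loss-based congestion control hurts on lossy links is the well-known square-root approximation for steady-state TCP throughput, roughly (MSS/RTT) * sqrt(3/2) / sqrt(p). This model is not part of these slides; it is brought in here only as an illustration, it ignores timeouts, and it becomes unreliable at high loss rates. The function name and the example values for MSS and RTT are assumptions.

```python
import math


def sqrt_model_throughput(mss_bytes, rtt_s, loss_prob):
    """Approximate throughput ceiling in bytes/s under the square-root model.

    Assumption: every loss is interpreted as congestion and halves the
    window; timeouts are ignored, so this is optimistic for large loss_prob.
    """
    if not 0 < loss_prob <= 1:
        raise ValueError("loss_prob must be in (0, 1]")
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / loss_prob)


if __name__ == "__main__":
    # Hypothetical MSS and RTT; loss probabilities chosen for illustration.
    for p in (0.001, 0.01, 0.1):
        rate = sqrt_model_throughput(1460, 0.04, p)
        print(f"p={p}: about {rate / 1e3:.0f} kB/s")
```

Under this model, raising the loss probability by a factor of 100 cuts the achievable rate by a factor of 10, regardless of how much link capacity is actually available.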
3. Design Idea: no congestion control at all
Recall that the sending rate is given by: win=min(snd_cwnd,snd_wnd,snd_bwnd)
Now: win=min(snd_bwnd,snd_wnd)
- Without SACK, this flavor of TCP will perform poorly (bandwidth is wasted on duplicate ACKs, which can lead to timeouts).
- SACK gives us control over the now "static" window.
- UDP? In contrast to UDP, the protocol still guarantees in-order delivery and adapts to the link capacity.
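To illustrate how SACK information lets the sender manage a fixed window, the sketch below derives the sequence ranges that are still unacknowledged from a cumulative ACK and a set of SACK blocks. It follows the RFC 2018 convention that the right edge of a block is the sequence number just past the last byte received. The function name `unacked_ranges` is hypothetical, sequence-number wraparound is ignored, and this is not the SACKEXP implementation.

```python
def unacked_ranges(cum_ack, sack_blocks, highest_sent):
    """Return [start, end) ranges that the receiver has not acknowledged.

    cum_ack:      next sequence number the receiver expects
    sack_blocks:  list of (left_edge, right_edge), right edge exclusive
    highest_sent: one past the highest sequence number sent so far
    Assumption: no sequence-number wraparound.
    """
    holes = []
    cursor = cum_ack
    for left, right in sorted(sack_blocks):
        if right <= cursor:
            continue  # block is already covered by the cumulative ACK
        if left > cursor:
            holes.append((cursor, min(left, highest_sent)))
        cursor = max(cursor, right)
    if cursor < highest_sent:
        # Data sent after the last SACK block, not yet reported either way.
        holes.append((cursor, highest_sent))
    return holes


if __name__ == "__main__":
    print(unacked_ranges(1000, [(4000, 5000), (2000, 3000)], 6000))
```

The ranges that lie below the highest SACKed block are the candidates for retransmission; the final range is merely outstanding and may still be in flight.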
4. Measurements: the emulation environment
[Figure: emulated topology with Node 0 ("base"), Node 1, Node 2, and Node 3 ("base")]
- On all links, delay=10ms.
- Loss rates were varied over p=0, p=0.001, p=0.01, p=0.1, and p=0.2.
- Packet loss events are uniformly distributed.
- The emulation was set up in the emulab environment [emu].
[emu] www.emulab.net
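The loss model can be approximated in a few lines. This sketch assumes that "uniformly distributed" loss means each packet is dropped independently with probability p on a single link; that reading, the function names, and the fixed seed are assumptions, and the code is unrelated to how emulab itself implements loss.

```python
import random


def emulate_lossy_link(num_packets, loss_prob, seed=0):
    """Return a list of booleans: True if the packet was delivered.

    Assumption: every packet is dropped independently with probability
    loss_prob (a Bernoulli loss model on one link).
    """
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    return [rng.random() >= loss_prob for _ in range(num_packets)]


def observed_loss_rate(delivered):
    """Fraction of packets that were dropped."""
    return delivered.count(False) / len(delivered)


if __name__ == "__main__":
    for p in (0, 0.001, 0.01, 0.1, 0.2):
        rate = observed_loss_rate(emulate_lossy_link(100_000, p))
        print(f"configured p={p}, observed {rate:.4f}")
```

Running it over the loss probabilities listed on the slide shows how closely the observed drop rate tracks the configured one for a trace of a given length.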
4. Measurements: collection of data
1. Initialize tcpdump on the to-be-observed node: sudo tcpdump -c num -w file -i if &
2. Start ttcp on the receiving node: ttcp -r -s src
3. Start ttcp on the sending node: ttcp -t -s -n num dst
where:
num - number of packets to be captured
file - name of the dump file
if - interface to be listened to
src - IP address of the sending node
dst - IP address of the receiving node
Traces were analyzed off-line with ethereal [eth].
[eth] www.ethereal.com, packet sniffer and analyzer
4. Measurements: time-sequence graph of SACKBASE, lossless link
tcpdump started on the sender, zoomed into the connection "set up" phase
[Figure: time-sequence graph (seq # over time) showing the ACKs received and the advertised receiver window. Annotation: the optimum size of the send window in case of a link bottleneck is the bandwidth-delay product.]
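The bandwidth-delay product mentioned on this slide is simply link rate times round-trip time. The sketch below works one example; the link rate, round-trip time, and segment size are hypothetical values chosen for illustration, since the slides do not state the link bandwidth used in the emulation.

```python
import math


def bandwidth_delay_product(bandwidth_bps, rtt_s):
    """Bandwidth-delay product in bytes for a link rate in bits per second."""
    return bandwidth_bps * rtt_s / 8


def window_in_segments(bdp_bytes, mss_bytes):
    """Number of full-size segments needed to cover the BDP."""
    return math.ceil(bdp_bytes / mss_bytes)


if __name__ == "__main__":
    # Hypothetical path: 10 Mbit/s bottleneck, 40 ms round-trip time.
    bdp = bandwidth_delay_product(10_000_000, 0.040)
    print(f"BDP: {bdp:.0f} bytes")
    print(f"Segments of 1460 bytes: {window_in_segments(bdp, 1460)}")
```

A send window smaller than this value leaves the link idle part of the time, while a larger one only adds queueing.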
4. Measurements: time-sequence graph of SACKEXP, lossless link
tcpdump started on the sender, zoomed into the connection "set up" phase
4. Measurements: summary of results, competing flows
[KemXinKas04] R. Kempter, B. Xin, S. Kumar Kasera, “Towards a Composable Transport Protocol: TCP without Congestion Control”, submitted to SIGCOMM 2004
5. Future TCP Congestion Control?
- Another way to do congestion control: the ECN bit
- Instead of dropping packets, a router sends TCP an explicit notification stating that the network is becoming congested.
- The network determines an explicit rate for a sender [RamFloyd99].
- Hop-by-hop vs. end-to-end congestion control
[RamFloyd99] K. K. Ramakrishnan and S. Floyd, “A Proposal to add Explicit Congestion Notification (ECN) to IP”, RFC 2481, January 1999
6. Conclusion
- Due to the ambiguity of packet loss, current TCP congestion control leads to low throughput over lossy links.
- TCP without any congestion control and without SACK is very inefficient.
- At p=0.1%, SACKEXP achieves 91% of the goodput of a lossless link, whereas TCP SACK only achieves 65% (at identical efficiencies of 91%).
- As the loss rate increases to 20%, SACKEXP achieves goodput on the order of 700% of that of TCP SACK, at similar efficiency.
- In the future, we plan to investigate the performance of SACKEXP with congestion control based on the ECN bit/ICMP Source Quench.
- The performance of SACKEXP has to be compared to a TCP that can resort to a link-layer retransmission scheme.
Questions are welcome!
THE END
[Jaco88] Van Jacobson, “Congestion Avoidance and Control”, ACM SIGCOMM '88
[Jaco90] Van Jacobson, “Modified TCP Congestion Avoidance Algorithm”, email to end2end-interest@ISI.EDU, April 1990
[BraMalPet94] Lawrence S. Brakmo, Sean W. O'Malley, Larry L. Peterson, “TCP Vegas: New Techniques for Congestion Detection and Avoidance”, SIGCOMM 1994
[MatMahFlRo96] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, “TCP Selective Acknowledgement Options”, RFC 2018, April 1996
[Hoe96] Janey C. Hoe, “Improving the start-up behavior of a Congestion Control Scheme for TCP”, SIGCOMM 1996
[MatMah96] M. Mathis and J. Mahdavi, “Forward acknowledgement: Refining TCP congestion control”, ACM Computer Communication Review, Oct 1996