High-speed TCP
- FAST TCP: motivation, architecture, algorithms, performance (by Cheng Jin, David X. Wei and Steven H. Low)
- Modifying TCP's Congestion Control for High Speeds (by S. Floyd, S. Ratnasamy, and S. Shenker)
- Scalable TCP: Improving Performance in High-Speed WAN (by Tom Kelly)
Problem with TCP
- Sending rate: $T = 1.2 / \sqrt{p}$ packets per RTT, where $p$ is the packet loss rate
- Example: 1500-byte packets, 100 ms RTT, 10 Gbps pipe
  - requires window size $W$ = 83,333 packets
  - at most 1 drop every 5,000,000,000 packets
  - at most 1 drop every 6000 seconds
- $W = \sqrt{1.5} / \sqrt{p}$; $N = W^2 / 1.5$ (packets per drop)
- Real drop rates make TCP a bottleneck, which leads to poor network utilization
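The arithmetic behind the 10 Gbps example can be checked with a short sketch. The function names below are illustrative only, and the constant 1.2 is the one from the sending-rate formula on this slide; the results come out close to, but not exactly at, the rounded figures quoted above.

```python
def required_window(link_bps: float, rtt_s: float, packet_bytes: int) -> float:
    """Packets that must be in flight per RTT to fill the link."""
    return link_bps * rtt_s / (packet_bytes * 8)


def max_loss_rate(window_pkts: float, c: float = 1.2) -> float:
    """Invert w = c / sqrt(p) to get the largest loss rate that sustains w."""
    return (c / window_pkts) ** 2


rtt = 0.100                                    # seconds
w = required_window(10e9, rtt, 1500)           # roughly 83,333 packets
p = max_loss_rate(w)                           # roughly 2e-10
packets_between_drops = 1 / p                  # a little under 5e9 packets
packets_per_second = w / rtt
seconds_between_drops = packets_between_drops / packets_per_second

print(round(w), p, packets_between_drops, seconds_between_drops)
```

With these inputs the interval between drops comes out a little under the 6000 seconds quoted on the slide, which uses the rounded 5,000,000,000 packets.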
Problem with TCP
- AIMD
  - ACK: $w = w + a / w$ (adds up to about $a$ each RTT)
  - Drop: $w = w - b \cdot w$
- Slow start
  - ACK: $w = w + c$
- where $a = 1$, $b = 0.5$, $c = 1$
- TCP steady state response function
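A minimal sketch of the three update rules, using the slide's values a = 1, b = 0.5, c = 1. The function names and the helper `one_rtt_of_acks` are made up for illustration; the helper assumes one ACK per packet in the window, and shows that a full RTT of ACKs grows the window by roughly one packet.

```python
def on_ack(w: float, a: float = 1.0) -> float:
    """Congestion avoidance: each ACK adds a / w."""
    return w + a / w


def on_drop(w: float, b: float = 0.5) -> float:
    """Multiplicative decrease on a packet drop."""
    return w - b * w


def on_slow_start_ack(w: float, c: float = 1.0) -> float:
    """Slow start: each ACK adds c, so the window doubles per RTT."""
    return w + c


def one_rtt_of_acks(w: float) -> float:
    """Apply one ACK per packet of the starting window (an assumption)."""
    for _ in range(int(w)):
        w = on_ack(w)
    return w


print(one_rtt_of_acks(10.0))   # slightly below 11
print(on_drop(100.0))          # half the window is given up
```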
HSTCP: Goals
- Performance
  - Sustain high speeds without requiring unrealistically low loss rates
  - Reach high speeds reasonably quickly when in slow start
  - Recover from congestion without huge delays
HSTCP: Goals
- Compatibility
  - Deployment without router involvement
  - Fair treatment of unmodified TCP (unrealistic)
  - Fair treatment of unmodified TCP: original TCP gets as much bandwidth as if the packet loss rate is very small
HSTCP: Approach
- Leave the slow start phase as it is
  - Needs only 17 RTTs to make W = packets
- Change the response function by tweaking parameters a and b
  - Same as TCP for p > P = (W = 31)
  - For smaller p, treat a and b as functions of the current window size
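The 17-RTT figure is consistent with a window that doubles every RTT in slow start. The sketch below assumes the target is the 83,333-packet window from the earlier 10 Gbps example and an initial window of one packet; both are assumptions, since the slide does not state the target value.

```python
import math


def slow_start_rtts(target_window: float, initial_window: float = 1.0) -> int:
    """RTTs needed for a window that doubles each RTT to reach the target."""
    return math.ceil(math.log2(target_window / initial_window))


print(slow_start_rtts(83_333))
```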
HSTCP: Response function
- Suggested new response function to reach high speed: $w = 10^{S(\log p - \log P) + \log W}$
- for $S = (\log W_1 - \log W) / (\log P_1 - \log P)$
- gives $w = p^S \cdot (1/P)^S \cdot W$
- For two points $(P, W)$ … $(P_1, W_1)$: P = , W = 31, $P_1 = 10^{-7}$, $W_1$ =
- $w = 0.15 / p^{0.82}$
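The closed form 0.15 / p^0.82 can be reproduced from the two-point construction. Two values are not given on the slide and are assumed here: P is taken as the loss rate at which standard TCP (w = 1.2 / sqrt(p)) reaches W = 31, and W_1 is taken as the 83,333-packet window from the earlier 10 Gbps example. The constant names are illustrative.

```python
import math

W_LOW = 31.0                       # from the slide
P_LOW = (1.2 / W_LOW) ** 2         # assumption: where standard TCP gives W = 31
P_HIGH = 1e-7                      # from the slide
W_HIGH = 83_333.0                  # assumption: window from the 10 Gbps example

# Slope of the response function on a log-log scale.
S = (math.log10(W_HIGH) - math.log10(W_LOW)) / (
    math.log10(P_HIGH) - math.log10(P_LOW)
)


def hstcp_window(p: float) -> float:
    """w = p^S * (1/P)^S * W, the line through (P, W) and (P_1, W_1)."""
    return W_LOW * (p / P_LOW) ** S


# Rewrite as w = coefficient * p^S to compare with the slide's closed form.
coefficient = W_LOW * P_LOW ** (-S)
print(round(S, 2), round(coefficient, 2))
```

Under these assumptions the slope and coefficient land close to the exponent 0.82 and constant 0.15 quoted on the slide.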
HSTCP: Response function
HSTCP: Fairness
HSTCP: Tweaking of a and b
- For $w \le W$: $a(w) = 1$; $b(w) = 0.5$
- For $w > W$: need $a(w)$ and $b(w)$ such that the response function $w(p)$ gives $p(W) = P$, $p(W_1) = P_1$
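One way to pick such a(w) and b(w) is the construction given in RFC 3649, sketched below; the slide itself does not say which construction was intended. The decrease factor is interpolated between 0.5 at W and a chosen value at W_1 (0.1 here, an assumed parameter), and a(w) then follows from the AIMD steady-state relation w = sqrt(a(2 - b) / (2 b p)). The values of P and W_1 are assumptions as well: P = 1.5 / W^2 so that a(w) is continuous at W, and W_1 = 83,333.

```python
import math

LOW_W, HIGH_W = 31.0, 83_333.0     # HIGH_W is an assumption
LOW_P = 1.5 / LOW_W ** 2           # assumption: standard TCP's loss rate at W
HIGH_P = 1e-7
HIGH_DECREASE = 0.1                # assumption: b(W_1)

S = (math.log(HIGH_W) - math.log(LOW_W)) / (math.log(HIGH_P) - math.log(LOW_P))


def p_of_w(w: float) -> float:
    """Loss rate at which the log-linear response function yields window w."""
    return LOW_P * (w / LOW_W) ** (1.0 / S)


def b_of_w(w: float) -> float:
    """Decrease factor: 0.5 up to W, then linear in log(w) down to HIGH_DECREASE."""
    if w <= LOW_W:
        return 0.5
    frac = (math.log(w) - math.log(LOW_W)) / (math.log(HIGH_W) - math.log(LOW_W))
    return (HIGH_DECREASE - 0.5) * frac + 0.5


def a_of_w(w: float) -> float:
    """Per-RTT increase that makes AIMD(a, b) settle at window w for loss p(w)."""
    if w <= LOW_W:
        return 1.0
    b = b_of_w(w)
    return w ** 2 * p_of_w(w) * 2.0 * b / (2.0 - b)


for w in (31.0, 1_000.0, 83_333.0):
    print(w, round(a_of_w(w), 2), round(b_of_w(w), 3))
```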
HSTCP: Testing Not available
STCP: Scalable TCP
- Same goals
  - More aggressive increase
  - Less aggressive decrease
  - Fair treatment of unmodified TCP
- Approach
  - ACK: W = W (each ACK)
  - Drop: W = W - [0.125 * W]
  - Doubles sending rate in about 70 RTTs
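The doubling time follows from a fixed per-ACK increment: with one ACK per packet, a window of W packets grows by the increment times W each RTT, which is multiplicative growth. The slide does not show the increment; the sketch below assumes 0.01 per ACK, the value used in Kelly's Scalable TCP paper, which is consistent with doubling in about 70 RTTs. Function names are illustrative.

```python
import math

A = 0.01     # assumption: per-ACK increment (not shown on the slide)
B = 0.125    # decrease factor from the slide


def on_ack(w: float) -> float:
    """Fixed increment per ACK, so growth per RTT is proportional to w."""
    return w + A


def on_drop(w: float) -> float:
    """Give up one eighth of the window on a drop."""
    return w - B * w


# One RTT of ACKs multiplies the window by (1 + A).
rtts_to_double = math.log(2) / math.log(1 + A)
print(rtts_to_double)
```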
STCP: Scaling properties In original TCP scaling depends on sending rate Sending rate = c < Sending rate = C
STCP: Scaling properties In Scalable TCP there is no such dependence Sending rate = c < Sending rate = C
STCP: Response function For P > (W=15) native TCP function is used
STCP: Scaling properties
Rate       TCP recovery time   STCP recovery time
1 Mbps     1.7 s               2.7 s
10 Mbps    17 s                2.7 s
100 Mbps   2 mins              2.7 s
1 Gbps     28 mins             2.7 s
10 Gbps    4 hrs 43 mins       2.7 s
Environment: 1500-byte packets, 200 ms RTT
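The recovery times above can be approximated with a short calculation. The sketch assumes that standard TCP regains the lost half of its window at one packet per RTT, and that Scalable TCP regains its lost eighth by growing 1% per RTT (the 0.01 per-ACK increment is an assumption taken from Kelly's paper, not from the slide). The results are close to the table but do not match every row exactly.

```python
import math

RTT = 0.2            # seconds, from the slide
PACKET_BITS = 1500 * 8


def window(rate_bps: float) -> float:
    """Window in packets needed to sustain rate_bps at the given RTT."""
    return rate_bps * RTT / PACKET_BITS


def tcp_recovery_s(rate_bps: float) -> float:
    """Standard TCP: halve the window, then add one packet per RTT."""
    return window(rate_bps) / 2 * RTT


def stcp_recovery_s(a: float = 0.01, b: float = 0.125) -> float:
    """Scalable TCP: recover from w(1 - b) to w at a factor (1 + a) per RTT."""
    return math.log(1 / (1 - b)) / math.log(1 + a) * RTT


for rate in (1e6, 10e6, 100e6, 1e9, 10e9):
    print(f"{rate:>14,.0f} bps  TCP {tcp_recovery_s(rate):>10.1f} s  "
          f"STCP {stcp_recovery_s():.1f} s")
```

The Scalable TCP time does not depend on the rate, which is the scaling property the table illustrates.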
STCP: Experiments
- Implemented in Linux kernel version
- Competitors: TCP, TCP GB modifications, STCP
- Topology and environment
  - 2.4 GHz Xeon, 2 GB RAM, Gigabit Ethernet card, x 12
STCP: Experiments
- Experiment #1
  - 4 pairs of 2Gb file exchangers
  - Number of 2Gb transfers completed in 1200 sec
STCP: Experiments
- Experiment #2
  - 3 pairs of Web-Traffic emulators (1400 users each)
  - 2 pairs of 2Gb file exchangers
  - Concurrent run of 4200 web users and 8 bulk transfers within 1200 sec
Problems with TCP
1. Packet level: AIMD provides slow increase and drastic decrease
2. Flow level: Maintaining large congestion windows requires a small equilibrium loss probability
3. Packet level: Binary congestion measure leads to oscillation
4. Flow level: Dynamics are unstable. The resulting oscillations can be reduced only by accurate estimation of packet loss probability and stable design of flow dynamics
HSTCP and STCP vs TCP Reno
- HSTCP and STCP increase more aggressively and decrease less drastically, so they can tolerate larger loss probabilities than TCP Reno
- They therefore achieve larger equilibrium windows and solve problems 1 and 2
TCP Oscillations
- Loss based approach
  - Full utilization: large delays and oscillations
- Delay based approach
  - Full utilization: stabilized window, predictable delays and no oscillations
FAST TCP: Strategy
- Window adjustment depends on distance from equilibrium
- Use queueing delay as congestion measure
- Multi-bit measure eliminates packet level oscillations
- Stabilize window near the point where buffer is large and delay is small
- Stabilizes flow dynamics, since queueing delay dynamics scale with respect to network capacity
FAST TCP: Window adjustment Window adjustment is independent of where equilibrium is
FAST TCP: Design
- Feedback model:
- Flow level: design $u(w_i, T_i)$ and $k(w_i, T_i)$ such that the feedback model above has an equilibrium that is fair, efficient, and stable in the presence of feedback delay
- Packet level: take care of issues ignored at the flow level, such as burstiness control, loss recovery and parameter estimation
FAST TCP: Architecture
- Data control: which packets to transmit
- Window control: how many
- Burstiness control: when
- Estimation: provides information to the above components
FAST TCP: Window update
- where:
  - gamma is in (0, 1]
  - baseRTT is the minimum RTT observed so far
  - qdelay is the average end-to-end queueing delay
  - alpha is a constant reflecting the number of packets each flow attempts to maintain in the network buffer at equilibrium. It provides linear window increase. However, it can be constant when qdelay is nonzero; when qdelay is zero the increase is exponential
- Window is updated every 2 RTTs
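The update rule itself is not reproduced on this slide. The sketch below assumes the form published in the FAST TCP paper, w <- min(2w, (1 - gamma) w + gamma (baseRTT / RTT * w + alpha)), with alpha held constant; the function name, parameter names, and the numbers in the example are illustrative only.

```python
def fast_window_update(w: float, base_rtt: float, avg_rtt: float,
                       alpha: float, gamma: float = 0.5) -> float:
    """One periodic FAST window update (assumed form, constant alpha).

    avg_rtt - base_rtt plays the role of qdelay. The window moves a
    fraction gamma of the way toward the target and never more than doubles.
    """
    target = (base_rtt / avg_rtt) * w + alpha
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)


# Fixed point: w * qdelay / RTT = alpha, i.e. alpha packets sit in the buffer.
base_rtt, avg_rtt, alpha = 0.100, 0.110, 100.0
w_eq = alpha * avg_rtt / (avg_rtt - base_rtt)      # about 1100 packets
print(w_eq, fast_window_update(w_eq, base_rtt, avg_rtt, alpha))

# Far below equilibrium the doubling cap limits the step.
print(fast_window_update(10.0, base_rtt, base_rtt, alpha, gamma=1.0))
```

At the fixed point the update leaves the window unchanged, which matches the description of alpha as the number of packets each flow keeps queued at equilibrium.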
FAST TCP: Events and computations
- Acknowledgement: qdelay; decision about packet injection into the network
- After packet transmission: time stamp for each packet; new window size
- End of RTT: target throughput
- Packet loss: when to retransmit dropped packets
FAST TCP: Performance
- Testbed and instrumentation
  - 2.6 GHz Xeon, 2 GB RAM
  - Dual onboard gigabit Ethernet interface
  - Network bottleneck capacity 800 Mbps and 2000-packet buffer
- Environment
  - Static and 2 types of dynamic
FAST TCP: Static test X - # of flows, Y - propagation delay, Z – aggregate throughput
FAST TCP: Dynamic test #1 Throughput and window trajectory Queue size, packet losses, link utilization
FAST TCP: Dynamic test #2 Throughput and window trajectory Queue size, packet losses, link utilization
FAST TCP: Overall evaluation Throughput, Fairness
FAST TCP: Overall evaluation Stability, Responsiveness