1 Experiment and Analysis of Dynamic TCP Acknowledgement. Daeseob Lim, Sam Lai, Wing-Ho Gordon Wong
2 What is the main problem of dynamic TCP acknowledgement? To ACK, or not to ACK: that is the question. When a packet arrives at the receiver, there are two choices: (1) ACK immediately; (2) wait and ACK later, so that a single ACK may acknowledge multiple packets.
3 What's the difference? ACK immediately: low latency (the time that elapses between a packet's arrival and the sending of its ACK), but the number of ACK packets increases. Wait and ACK later: high latency, but fewer ACK packets are generated.
4 What is considered the best solution? A low number of acknowledgements and a low aggregate acknowledgement latency over all packets.
5 Aggregate latency example: Packet A arrives; packet B arrives 10 ms later; a single ACK for both A and B is sent 10 ms after that. The ACK latency for B is 10 ms and the ACK latency for A is 20 ms, so the total ACK latency is 30 ms in this case.
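The arithmetic on this slide can be sketched in C. The function name `aggregate_latency_ms` and the use of integer millisecond timestamps are assumptions made for illustration; the slide itself only gives the two latencies.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of (ack_time - arrival) over all packets covered by one ACK.
 * Times are integer milliseconds; every arrival must be <= ack_time_ms. */
long aggregate_latency_ms(const long *arrival_ms, size_t n, long ack_time_ms)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        assert(arrival_ms[i] <= ack_time_ms);
        total += ack_time_ms - arrival_ms[i];
    }
    return total;
}
```

With A arriving at 0 ms, B at 10 ms, and the ACK sent at 20 ms, the function returns 20 + 10 = 30.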
6 Naïve Solutions: Send an ACK immediately for each packet? Low or even zero ACK latency, but this generates too many ACK packets. Send one ACK at the end of all transmissions? Only one ACK is needed. However, the latency is high, the receiver does not know which packet is the last one, and what if the link is unreliable?
7 An Online Randomized Algorithm: "Dynamic TCP acknowledgement and other stories about e/(e-1)" by Anna R. Karlin, Claire Kenyon, and Dana Randall. Achieves a competitive ratio of e/(e-1) ≈ 1.58 relative to the optimal solution. Competitive ratio = cost of the algorithm / cost of the optimal solution.
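For reference, the 1.58 figure is the numeric value of the bound in the paper's title. The symbols $C_{\mathrm{ALG}}$ and $C_{\mathrm{OPT}}$ below are notation assumed here for the two costs; they do not appear on the slides.

```latex
\[
\text{competitive ratio} \;=\; \frac{C_{\mathrm{ALG}}}{C_{\mathrm{OPT}}},
\qquad
\frac{e}{e-1} \;=\; \frac{2.71828\ldots}{1.71828\ldots} \;\approx\; 1.582
\]
```

A ratio of 1 would mean the online algorithm matches the optimal offline schedule exactly.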
8 Detail about the algorithm: Let P(t, t') be the set of packets that arrive between time t and time t'. Choose z between 0 and 1 inclusive, drawn from a probability distribution (the randomized factor). Suppose that the i-th acknowledgement happens at time $t_i$ and the next one happens at time $t_{i+1}$. (cont'd)
9 Detail about the algorithm (cont'd): The algorithm locates $T_{i+1}$ such that $t_i \le T_{i+1} \le t_{i+1}$ and $|P(t_i, T_{i+1})| \, (t_{i+1} - T_{i+1}) = z$. In that case, z units of latency cost would be saved by sending a single additional ACK at $T_{i+1}$.
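One way to read this condition is as a rectangle under the packet-arrival staircase: if the j-th packet since the last ACK arrived at time a_j, the rectangle with its left edge at a_j has height j and reaches area z at time a_j + z/j. The sketch below takes the earliest such time as the next ACK time. That "earliest" reading, the function name, and the integer units (z already scaled to packet-milliseconds) are assumptions for illustration, not something the slides state.

```c
#include <assert.h>
#include <stddef.h>

/* Earliest time at which some rectangle |P(t_i, T)| * (t - T) reaches z.
 * arrival_ms: arrival times since the last ACK, sorted ascending (assumed).
 * z_scaled:   threshold z pre-scaled to packet-milliseconds (assumed unit). */
long next_ack_time_ms(const long *arrival_ms, size_t n, long z_scaled)
{
    assert(n > 0 && z_scaled >= 0);
    long best = arrival_ms[0] + z_scaled;            /* rectangle of height 1 */
    for (size_t j = 1; j < n; j++) {
        long height = (long)(j + 1);                 /* packets arrived so far */
        long width = (z_scaled + height - 1) / height; /* ceil(z / height) */
        long t = arrival_ms[j] + width;
        if (t < best)
            best = t;
    }
    return best;
}
```

For arrivals at 0 ms and 10 ms with a scaled threshold of 30, the height-2 rectangle starting at 10 ms reaches area 30 at 25 ms, before the height-1 rectangle does at 30 ms, so the result is 25.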
10 Before applying the algorithm
11 After the algorithm
12 Why does this work? The rectangle is guaranteed to have an area of at least 1. By sending one additional ACK, the acknowledgement cost increases by 1, but the latency cost decreases by at least 1, so the new sequence is at least as good as the original one. A more detailed proof is in the paper, "Dynamic TCP acknowledgement and other stories about e/(e-1)".
13 Contribution of this research: Implement a randomized online algorithm for delayed ACK in the Linux kernel. Compare the real performance of the randomized algorithm with that of the current TCP implementation. Observe its superiority in terms of cost. Analyze its limitations in terms of throughput.
14 ACK for data packet: Immediate ACK: a data packet reaches the receiver, which sends an ACK packet at once. Delayed ACK: a data packet reaches the receiver, which schedules a timer and sends the ACK packet when the timer expires (≈ 40 ms).
15 Interval of the Delayed-ACK Timer: Determined by several factors: the minimum/maximum interval set by kernel constants; the estimated RTT; and the restrictions in RFC 2581 (the maximum is 500 ms; acknowledge at least every second segment; acknowledge out-of-order data immediately). In most cases, ≈ 40 ms (~ 200 ms).
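The rules on this slide can be sketched as two small helpers. This is an illustration under stated assumptions, not the kernel's actual logic: the names are hypothetical, the 40 ms and 200 ms bounds are taken from the typical values quoted on the slide, and how the kernel derives the candidate delay from the RTT estimate is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bounds (assumptions): 40 ms and 200 ms are the typical values
 * quoted on the slide; both stay below the 500 ms RFC 2581 limit. */
enum { DELACK_MIN_MS = 40, DELACK_MAX_MS = 200 };

/* Clamp a candidate delay (e.g. derived from the RTT estimate) into range. */
long delack_interval_ms(long candidate_ms)
{
    assert(candidate_ms >= 0);
    if (candidate_ms < DELACK_MIN_MS)
        return DELACK_MIN_MS;
    if (candidate_ms > DELACK_MAX_MS)
        return DELACK_MAX_MS;
    return candidate_ms;
}

/* RFC 2581 rules from the slide: ACK at once for out-of-order data, or when
 * two or more full-sized segments are waiting to be acknowledged. */
bool must_ack_immediately(int unacked_full_segments, bool out_of_order)
{
    return out_of_order || unacked_full_segments >= 2;
}
```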
16 Implementation of TCP on Linux: A data packet is received from the IP layer. tcp_rcv_established() leads to __tcp_ack_snd_check(), which asks: need to send an immediate ACK? Yes: tcp_send_ack(). No: tcp_send_delayed_ack(). Then why not 'Delayed ACK'? This is the point at which to hack the kernel code!
17 Hacking the protocol stack (tcp_send_delayed_ack()): Choose a random value and scale it to a threshold value. Compute Cost = unacked data size × elapsed time since the last ACK. Cost > random value? Yes: send an additional ACK! No: no additional ACK.
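A minimal sketch of this check, using integer arithmetic as kernel code usually does. The function names, the thousandths representation of the random value, and the `full_scale` parameter are hypothetical; the slide does not say how the scaling is done or which units the threshold uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scale a random value given in thousandths (0..1000) to the cost unit.
 * full_scale is the cost that corresponds to a random value of 1.0 (assumed). */
uint64_t scale_threshold(uint32_t rand_permille, uint64_t full_scale)
{
    assert(rand_permille <= 1000);
    return full_scale * rand_permille / 1000;
}

/* cost = unacked bytes * elapsed time since the last ACK, as on the slide.
 * The product is computed in 64 bits so it cannot overflow. */
bool should_send_extra_ack(uint32_t unacked_bytes, uint32_t elapsed_ms,
                           uint64_t threshold)
{
    uint64_t cost = (uint64_t)unacked_bytes * elapsed_ms;
    return cost > threshold;
}
```

For example, with a random value of 0.599 and a full scale of 100000 byte-ms, the threshold is 59900; 1460 unacked bytes cross it after 50 ms (cost 73000) but not after 40 ms (cost 58400).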
18 Generating random numbers: Generate the random numbers in advance with an off-line program, store them in the kernel code, and select them sequentially. Plot: $y = e^x/(e-1)$ (axes X and Y). Numbers generated by the off-line program: 0.599, 0.761, 0.232, 0.378, 0.619, 0.997, …. Stored in an array: unsigned rand_numbers[1000] = { 0.599, 0.761, 0.232, 0.378, 0.619, 0.997, …. …. }; Selected from the array: …. number = rand_numbers[index++];
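The table lookup can be sketched as follows. An `unsigned` array cannot hold fractional values and kernel code generally avoids floating point, so this sketch stores each number in thousandths; that representation, the names, and the wrap-around at the end of the table are assumptions, since the slide's `index++` shows no bound check. Only the six values visible on the slide are included.

```c
#include <assert.h>
#include <stddef.h>

/* The slide's six sample values in thousandths; a real table would hold all
 * 1000 numbers produced by the off-line generator. */
static const unsigned rand_permille[] = { 599, 761, 232, 378, 619, 997 };
#define RAND_COUNT (sizeof rand_permille / sizeof rand_permille[0])

static size_t rand_index;

/* Return the next pre-generated value, wrapping at the end of the table. */
unsigned next_rand_permille(void)
{
    unsigned value = rand_permille[rand_index];
    rand_index = (rand_index + 1) % RAND_COUNT;
    assert(value <= 1000);
    return value;
}
```

Reusing the sequence after the table is exhausted keeps the lookup cheap, at the price of a repeating pattern.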
19 Test Environment: Client, Router, Server. Components: modified kernel + network sniffer (Ethereal); network emulator (ns2).
20 Competitive Ratio Experiment: The server sends 100 packets to the client at random intervals of at most 70 ms. The competitive ratio is calculated for each cost ratio from 0.05 to 0.95 in steps of 0.05, then to … in steps of …. Each run uses simulated networks with a bandwidth of 100 Mbps and RTTs of 2 ms and 100 ms, for both versions of TCP.
21 Overall Competitive Ratio on 2ms Network
22 Overall Competitive Ratio on 100ms Network
23 Blowup Competitive Ratio on 2ms Network
24 Blowup Competitive Ratio on 100ms Network
25 Analysis: For the new TCP, the competitive ratio is within 1.58 overall, except in borderline cases. Small cost ratio: expensive latency cost; overhead from the network sniffer; overhead from the new TCP. Large cost ratio: expensive acknowledgement cost; the original TCP's acknowledgements; the possibility of an additional acknowledgement.
26 Analysis: For the original TCP, the competitive ratio starts out extremely high, then converges rapidly toward that of the new TCP. Eventually it starts to increase, but at a slower rate. The original TCP favors delayed acknowledgement even when the latency cost is high, and it always acknowledges within 200 ms or after every two full-sized packets of data, even when the acknowledgement cost is high.
27 Streaming Data Experiment: The client sends a request to the server asking for data of a certain size. The server replies with the data. The client measures the total duration to determine the throughput. Each run uses simulated networks with a bandwidth of 100 Mbps and RTTs of 2 ms and 100 ms, for both versions of TCP.
28 Streaming Data Result:
RTT     Original TCP   New TCP      Throughput speedup
2 ms    6.926 Mbps     6.873 Mbps   …
… ms    0.577 Mbps     …            1
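The quantities in this table can be computed as in the sketch below. The function names are hypothetical, and defining speedup as new throughput divided by original throughput is an assumption; the slides do not give the formula.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in kbit/s from bytes transferred and elapsed milliseconds
 * (bits per millisecond equals kbit/s). */
uint64_t throughput_kbps(uint64_t bytes, uint64_t duration_ms)
{
    assert(duration_ms > 0);
    return bytes * 8 / duration_ms;
}

/* Speedup of the new TCP over the original, in thousandths
 * (1000 means no change). Assumed definition: new / original. */
uint64_t speedup_permille(uint64_t new_kbps, uint64_t orig_kbps)
{
    assert(orig_kbps > 0);
    return new_kbps * 1000 / orig_kbps;
}
```

Under that assumed definition, the 2 ms figures (6873 kbit/s against 6926 kbit/s) give 992 thousandths, that is, the new TCP reaches roughly 99% of the original's throughput.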
29 Analysis: The new TCP cannot outperform the original TCP in terms of throughput. Intuitively, if the incoming traffic is regular and data keeps pouring in, you would want to delay ACKs as long as possible to optimize throughput. The new TCP cannot delay ACKs longer than the original TCP. The threshold for sending an additional acknowledgement is randomly scaled down. Our implementation induces little overhead.
30 Conclusion: Showed that the randomized algorithm can achieve a competitive ratio of 1.58 in most cases. Our implementation achieves a better competitive ratio than the original TCP in most cases. The implementation has low overhead. It cannot improve network performance in terms of throughput.