Reliable Transport Protocols Should Forbid Reneging
Nasif Ekiz; Paul D. Amer, Professor; Preethi Natarajan; Ertugrul Yilmaz; Jon Leighton; Abu Rahman
Sponsored by U.S. Army Research Lab
Outline
transport layer selective acknowledgments (SACKs)
– what are SACKs?
– how are SACKs renegable?
what is the gain in making selective acks non-renegable?
what is the penalty in making selective acks non-renegable?
Conclusion: selective acks for reliable transport protocols (e.g., TCP, SCTP) should be non-renegable
Transport layer receive buffer
[Figure: the data sender transmits across the Internet to the receive buffer, which holds ordered data (ACKed), out-of-order data (SACKed), and available space; the receiving application reads from the receive buffer.]
Types of acknowledgments
For ordered data: cumulative ACK n
– bytes [... to n-1] (TCP) [RFC 793]
– segments [... to n] (SCTP) [RFC 2960]
For out-of-order data: selective ACK (SACK) m-n
– bytes [m to n-1] (TCP) [RFC 2018]
– segments [m to n] (SCTP) [RFC 2960]
SACKs
– prevent unnecessary retransmissions during loss recovery
– improve throughput when multiple losses occur in the same window
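The two acknowledgment types can be illustrated with a minimal sketch. It assumes SCTP-style semantics (whole segments, inclusive ranges) and segment numbers starting at 1. The function name `build_ack` is hypothetical and is not taken from any TCP or SCTP implementation.

```python
def build_ack(received, first=1):
    """Return (cum_ack, sack_blocks) for a set of received segment numbers.

    Assumes SCTP-style numbering: cum_ack n means segments [... to n] arrived,
    and a SACK block (m, n) means segments [m to n] arrived out of order.
    """
    # The cumulative ACK advances while the next expected segment is present.
    cum_ack = first - 1
    while cum_ack + 1 in received:
        cum_ack += 1

    # Everything above the cumulative ACK is out-of-order data; merge it
    # into contiguous inclusive blocks.
    blocks = []
    for seg in sorted(s for s in received if s > cum_ack):
        if blocks and seg == blocks[-1][1] + 1:
            blocks[-1][1] = seg
        else:
            blocks.append([seg, seg])
    return cum_ack, [tuple(b) for b in blocks]


# Segment 3 is missing: the receiver reports ACK 2, SACK 4-6.
print(build_ack({1, 2, 4, 5, 6}))
```

Once segment 3 arrives, the same call returns a cumulative ACK of 6 and no SACK blocks.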
Types of ACKs
[Figure: timeline between data sender and data receiver, showing the receive buffer. ACKs shown include ACK 1, ACK 2, and ACK 2 with SACK blocks up to SACK 4-6.]
What is reneging?
[RFC 2018]: “The SACK option is advisory, in that, while it notifies the data sender that the data receiver has received the indicated segments, the data receiver is permitted to later discard data which have been reported in a SACK option.”
– discarding SACKed data before delivery to the receiving application (or socket) is “reneging”
– TCP and SCTP allow reneging, so the data sender retains copies of all SACKed data until cumulatively ACKed
Reneging example
[Figure: timeline between data sender and data receiver, showing the receive buffer. The receiver returns ACK 1, then ACK 2, SACK 4-4 and ACK 2, SACK 4-5. The OS needs memory and reneges; later acknowledgments shown include ACK 3.]
Summary of reneging
– Reneging: the data receiver SACKs data and later discards it (i.e., SACK information is “advisory”, not a delivery guarantee)
– Reneging is discouraged but permitted
– Data sender keeps data in a send buffer until cumulatively ACKed (i.e., the cum ack is a guarantee)
Special case for SCTP
– out-of-order data already delivered to the application is non-renegable by definition
Special case for SCTP: unordered data
[Figure, two slides: timeline between data sender and data receiver, showing the receive buffer. The receiver returns ACK 1, then ACK 1, SACK 3-4 and ACK 1, SACK 3-6, where the SACKed data is unordered, followed by ACK 6 and ACK 7.]
We argue that tolerating reneging is wrong
Suppose SACKs were a guarantee of delivery, not advisory
– All SACKs are non-renegable (NR-SACKs)
– Data receiver takes responsibility for all selectively acked data
– Data sender can remove NR-SACKed data from send buffer
Part I: What is the gain in making selective acks non-renegable?
– always improved send buffer utilization (TCP and SCTP)
  “Non-renegable selective acks for SCTP”, Int'l Conf on Network Protocols (ICNP), Orlando, 10/08
– sometimes improved throughput (SCTP)
  “Throughput analysis of Non-Renegable Selective Acks for SCTP”, Computer Communications, 33(16), 10/10
Send buffer utilization
Send buffer consists of two types of data:
– Necessary (renegable): N
– Unnecessary (non-renegable): U
Send buffer utilization = N / (N+U)
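The ratio N / (N+U) can be worked through with a small sketch. It assumes segment-granularity buffers and treats every selectively acked copy in the send buffer as unnecessary (U), which holds only if the receiver never reneges. The function names are hypothetical.

```python
def send_buffer_utilization(buffered, sacked):
    """Return N / (N + U) for a send buffer holding whole segments.

    Assumption: a buffered segment that has been selectively acked counts as
    unnecessary (U); every other buffered segment counts as necessary (N).
    """
    if not buffered:
        return None  # utilization is undefined for an empty buffer
    unnecessary = len(buffered & sacked)      # U
    necessary = len(buffered) - unnecessary   # N
    return necessary / (necessary + unnecessary)


def apply_nr_sack(buffered, nr_sacked):
    """With NR-SACKs the sender may drop selectively acked segments at once."""
    return buffered - nr_sacked


# A 4-segment send buffer holding segments 2-5; segment 2 was lost.
buffer = {2, 3, 4, 5}
print(send_buffer_utilization(buffer, {3}))        # one SACKed copy retained
print(send_buffer_utilization(buffer, {3, 4, 5}))  # three SACKed copies retained
print(apply_nr_sack(buffer, {3, 4, 5}))            # only the lost segment remains
```

With plain SACKs the retained copies push utilization down as more data is selectively acked; with NR-SACKs the freed space can hold new data instead.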
Send buffer utilization (SACK)
[Figure: timeline between data sender and data receiver, showing the send buffer and receive buffer with a utilization percentage at each step. The receiver returns ACK 1 followed by a series of ACK 1, SACK acknowledgments; SACKed data stays in the send buffer, which leads to send buffer blocking.]
Send buffer utilization (NR-SACK)
[Figure: timeline between data sender and data receiver, showing the send buffer and receive buffer with a utilization percentage at each step. The receiver returns ACK 1 followed by a series of ACK 1, NR-SACK acknowledgments; there is no send buffer blocking.]
NR-SACK ns-2 simulation
NR-SACK FreeBSD implementation
Send buffer utilization (ns-2)
[Figure: send buffer utilization for NR-SACK and SACK; legend entries include 32K, 64K, and ∞.]
As traffic load increases, NR-SACKs better utilize the send buffer
Send buffer utilization (FreeBSD)
[Figure: send buffer utilization plot.]
Throughput gains (ns-2) (only for SCTP, not TCP)
NR-SACKs never do worse than SACKs
We argue that tolerating reneging is wrong
Part II: What is the penalty in making selective acks non-renegable?
Let’s assume transport protocols are designed to forbid data reneging
Changing TCP or SCTP to a non-reneging protocol is easy:
– SACK semantics change from advisory to permanent
– if the data receiver needs to renege, the data receiver MUST RESET the connection (this is the penalty)
How big is the penalty? Answer: it depends on how often reneging happens
Suppose reneging occurs in 1 in 100,000 TCP (or SCTP) flows
Case A (current practice): reneging allowed
– 99,999 non-reneging connections underutilize the send buffer (and for SCTP may achieve lower throughput)
– 1 reneging connection continues (maybe...)
Case B (proposed change): reneging forbidden
– 99,999 connections have equal or better send buffer utilization (and for SCTP potentially greater throughput)
– 1 reneging connection is RESET
Hypothesis: “data reneging rarely if ever occurs in practice”
Data reneging has never been studied
– Does data reneging happen or not?
– If reneging happens, how often?
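Detecting TCP reneging at a router
[Figure: a router between the Data Sender and Data Receiver observes the acknowledgments and tracks the state of the receive buffer. Acknowledgments shown include ACK 1, SACK; ACK 2, SACK 3-6; and ACK 2, SACK 7-7. After the OS reneges, reneging is detected at the router.]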
Model to detect reneging
Current state (C) and new SACK (N) are compared; there are 4 possibilities.
[Figure: the four cases for how a new SACK can relate to the current SACK state, for example SACK 15-20. Labels: Current state (C), New SACK (N), Reneging (R).]
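The comparison of current state (C) against a new SACK (N) can be sketched as below. This is not RenegDetect itself. It assumes segment-granularity inclusive blocks and that each new acknowledgment reports the receiver's complete out-of-order state. Real TCP SACK options carry only a few blocks per segment, so a missing block in a real trace is not proof of reneging on its own. All names are hypothetical.

```python
def expand(blocks):
    """Turn inclusive (start, end) blocks into a set of segment numbers."""
    return {seg for start, end in blocks for seg in range(start, end + 1)}


def to_blocks(segments):
    """Merge a set of segment numbers into sorted inclusive blocks."""
    blocks = []
    for seg in sorted(segments):
        if blocks and seg == blocks[-1][1] + 1:
            blocks[-1][1] = seg
        else:
            blocks.append([seg, seg])
    return [tuple(b) for b in blocks]


def detect_reneging(current_blocks, new_cum_ack, new_blocks):
    """Return R: data in C that the new acknowledgment no longer covers.

    Data at or below the new cumulative ACK was delivered in order, so it is
    not reneged. Anything else in C that is absent from N is reported.
    Assumption: N lists the receiver's complete out-of-order state.
    """
    new = expand(new_blocks)
    reneged = {seg for seg in expand(current_blocks)
               if seg > new_cum_ack and seg not in new}
    return to_blocks(reneged)


# C = SACK 4-5; the next acknowledgment is ACK 3 with no SACK blocks.
print(detect_reneging([(4, 5)], 3, []))
```

An empty result means the new acknowledgment is consistent with the tracked state.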
Model to detect reneging
[Figure: processing pipeline. Elements shown: CAIDA* trace (.pcap); TCP flow filter (tshark, editcap, mergecap); TCP flows with SACKs; ACK reordering check; RenegDetect (~4600 lines of C code); output: reneging? yes or no.]
*Cooperative Association for Internet Data Analysis
Model verification
RenegDetect was tested with synthetic TCP flows
– created reneging flows with text2pcap
– all reneging flows were identified correctly
RenegDetect was tested with real TCP flows from CAIDA Internet traces
– at first, reneging seemed to occur frequently
– on closer inspection, we found that many SACK implementations are incorrect!
“Misbehaviors in TCP SACK Generation” (Ekiz, Rahman, Amer), ACM SIGCOMM Computer Communication Review, April 2011
Misbehaviors in SACK generation
– 7 misbehaviors are observed in CAIDA traces
– We designed TBIT tests to verify SACK generation
– 27 OS's tested
– RenegDetect updated to identify misbehaviors
Example TBIT test
Incorrect SACK implementations
Operating System | Misbehavior marks (columns A-G)
FreeBSD 5.3, 5.4 | Y Y
Linux (Debian 3) | Y
Linux (Red Hat 8) | Y
Linux (Fedora 1) | Y
Linux (Ubuntu 5.10) | Y
Linux (Ubuntu 6.06) | Y
Linux (Debian 4) | Y
OpenBSD 4.2, 4.5, 4.6, 4.7 | Y Y
OpenSolaris | Y Y
OpenSolaris | Y Y
Solaris 10 | Y
Windows 2000 | Y Y Y Y Y
Windows XP | Y Y Y Y Y
Windows Server 2003 | Y Y Y Y Y
Windows Vista | Y Y
Windows Server 2008 | Y Y
Windows 7 | Y Y
Experiment design: how to “prove” reneging does not happen?
Event A: a TCP flow reneges
Hypothesis: we want to design an experiment that rejects H0 with 95% confidence
Our experiment will observe n TCP flows, hoping NOT to find even a single instance of reneging
Using MAPLE, n ≥ 299,572
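The sample size can be reproduced with a short calculation. The slide's hypothesis formula did not survive extraction, so this sketch assumes H0: P(A) ≥ p0 with p0 = 1/100,000 (the rate used in the earlier penalty comparison) and independent flows. Under H0, the chance of seeing zero reneging flows among n is at most (1 - p0)^n, and H0 is rejected at 95% confidence once that bound is at most 0.05. The function name is hypothetical.

```python
import math


def min_flows(p0, confidence):
    """Smallest n with (1 - p0)**n <= 1 - confidence.

    Assumptions: flows renege independently, and the null hypothesis is
    P(A) >= p0. Observing n flows with no reneging then rejects H0 at the
    given confidence level.
    """
    alpha = 1.0 - confidence
    # Solve (1 - p0)**n <= alpha for n; log1p keeps precision for tiny p0.
    return math.ceil(math.log(alpha) / math.log1p(-p0))


print(min_flows(1e-5, 0.95))
```

Under these assumptions the result matches the bound n ≥ 299,572 quoted above.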
Questions? (thank you)
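TCP Send buffer utilization (SACK)
[Figure: timeline between data sender and data receiver, showing the send buffer and receive buffer. The receiver returns ACK 1; ACK 1, SACK 3-3; ACK 1, SACK 3-4; ACK 1, SACK 3-5; then ACK 5 and ACK 6. Send buffer utilization values shown are 100%, 75%, 50%, and 25%; retained SACKed data causes send buffer blocking.]
TCP Send buffer utilization (NR-SACK)
NO send buffer blocking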
Data reneging in OSes
Reneging in Linux (version )
– tcp_prune_ofo_queue() deletes out-of-order data
Reneging in FreeBSD, Mac OS
– net.inet.tcp.do_tcpdrain sysctl turns reneging on/off
– tcp_drain() deletes out-of-order data
Data reneging in Linux
3. Inferring the state of receive buffer
TCP segments with n SACK options | Enough space for another SACK option | Not enough space for another SACK option
n=1 | ~88% | 0%
n=2 | ~11% | 0%
n=3 | 0.7% | 0.20%
n=4 | n/a | 0.15%
Total number of TCP segments: 780,798 (100%)
NR-SACK Negotiation
SCTP Association Startup
– INIT (NR-SACKs Supported)
– INIT-ACK (NR-SACKs Supported)
– COOKIE-ECHO
– COOKIE-ACK
SCTP Data Transfer
– DATA
– NR-SACKs (cum-ack, gap-ack, nr-gap-ack, dup-TSN)
NR-SACK Chunk
– Type = 0x10, Chunk Flags, Chunk Length
– Cumulative TSN Ack (ACK)
– Advertised Receiver Window Credit
– Number of Gap Ack Blocks = N, Number of NR-Gap-Ack Blocks = M
– Number of Duplicate TSNs = X, Reserved
– Gap Ack Block #1 Start, Gap Ack Block #1 End ... Gap Ack Block #N Start, Gap Ack Block #N End (SACK)
– NR-Gap Ack Block #1 Start, NR-Gap Ack Block #1 End ... NR-Gap Ack Block #M Start, NR-Gap Ack Block #M End (NR-SACK)
– Duplicate TSN 1 ... Duplicate TSN X (D-SACK)
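The chunk layout can be sketched as a byte-packing routine. The slide does not give field widths or the exact field order. The sketch assumes the widths of the standard SCTP SACK chunk (8-bit type and flags, 16-bit length and counts, 32-bit TSNs and window, 16-bit block offsets) and places the two block counts together ahead of the duplicate count and the Reserved field. Treat both as assumptions rather than the specified wire format. The function name is hypothetical.

```python
import struct

NR_SACK_TYPE = 0x10  # chunk type value shown on the slide


def pack_nr_sack(cum_tsn, a_rwnd, gap_blocks, nr_gap_blocks, dup_tsns, flags=0):
    """Pack an NR-SACK chunk into bytes (illustrative layout, see assumptions).

    gap_blocks and nr_gap_blocks are lists of (start, end) 16-bit values;
    dup_tsns is a list of 32-bit TSNs. Field widths and ordering are assumed,
    not taken from the slide.
    """
    body = struct.pack(
        "!IIHHHH",
        cum_tsn,             # Cumulative TSN Ack
        a_rwnd,              # Advertised Receiver Window Credit
        len(gap_blocks),     # Number of Gap Ack Blocks = N
        len(nr_gap_blocks),  # Number of NR-Gap-Ack Blocks = M
        len(dup_tsns),       # Number of Duplicate TSNs = X
        0,                   # Reserved
    )
    for start, end in list(gap_blocks) + list(nr_gap_blocks):
        body += struct.pack("!HH", start, end)
    for tsn in dup_tsns:
        body += struct.pack("!I", tsn)
    # Chunk Length covers the 4-byte header plus the body.
    header = struct.pack("!BBH", NR_SACK_TYPE, flags, 4 + len(body))
    return header + body


chunk = pack_nr_sack(cum_tsn=1, a_rwnd=65535,
                     gap_blocks=[(3, 3)], nr_gap_blocks=[(2, 2)], dup_tsns=[])
print(len(chunk), hex(chunk[0]))
```

Keeping renegable gap-ack blocks and non-renegable NR-gap-ack blocks in separate lists lets a sender free only the data the receiver has taken responsibility for.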
When is data non-renegable? Case 2: multistreaming
SID: Stream Identifier, SSN: Stream Sequence Number
[Figure, animated over ten slides: timeline between data sender and data receiver, showing the receive buffer. Segments are labeled (SID: 1, SSN: 1), (SID: 2, SSN: 1), (SID: 1, SSN: 2), (SID: 2, SSN: 2), (SID: 1, SSN: 3), (SID: 2, SSN: 3), followed by a resent (SID: 2, SSN: 1) and (SID: 1, SSN: 4). The receiver returns ACK 1; ACK 1, SACK 3-3; ACK 1, SACK 3-4; ACK 1, SACK 3-5; ACK 1, SACK 3-6; then ACK 6 and ACK 7.]
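The multistreaming case can be sketched by tracking which out-of-order segments have already been delivered to the application. It assumes TSNs and SSNs start at 1, that a segment is delivered as soon as it is next in order within its own stream, and that delivered data is non-renegable while data still held for stream ordering is renegable. The function name and tuple layout are hypothetical.

```python
def classify_out_of_order(arrivals, first_ssn=1):
    """Return (cum_ack, non_renegable_tsns, renegable_tsns).

    arrivals is a list of (tsn, sid, ssn) tuples in arrival order.
    """
    next_ssn = {}      # sid -> next SSN the application expects
    held = {}          # sid -> {ssn: tsn} waiting for an earlier SSN
    delivered = set()  # TSNs already handed to the application

    for tsn, sid, ssn in arrivals:
        held.setdefault(sid, {})[ssn] = tsn
        expected = next_ssn.get(sid, first_ssn)
        # Deliver every segment that is now in order within its stream.
        while expected in held[sid]:
            delivered.add(held[sid].pop(expected))
            expected += 1
        next_ssn[sid] = expected

    waiting = {tsn for queue in held.values() for tsn in queue.values()}
    received = delivered | waiting
    cum_ack = 0
    while cum_ack + 1 in received:
        cum_ack += 1

    non_renegable = sorted(t for t in delivered if t > cum_ack)
    renegable = sorted(t for t in waiting if t > cum_ack)
    return cum_ack, non_renegable, renegable


# TSN 2 (SID 2, SSN 1) is lost; streams alternate as in the figure.
arrivals = [(1, 1, 1), (3, 1, 2), (4, 2, 2), (5, 1, 3), (6, 2, 3)]
print(classify_out_of_order(arrivals))
```

In this run stream 1's segments are delivered despite the gap at TSN 2, so they could be reported as non-renegable, while stream 2's segments wait in the receive buffer and remain renegable.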
When is data non-renegable? Case 3: OS guarantee
[Figure, animated over ten slides: timeline between Data Sender and Data Receiver, showing the Receiver Buffer. The receiver returns ACK 1; ACK 1, SACK 3-3; ACK 1, SACK 3-4; ACK 1, SACK 3-5; a further ACK 1, SACK; then ACK 6 and ACK 7.]
* OS guarantees not to renege