Transmission Control Protocol Chapter 12 Transmission Control Protocol Objectives Upon completion you will be able to: Be able to name and understand the services offered by TCP Understand TCP’s flow and error control and congestion control Be familiar with the fields in a TCP segment Understand the phases in a connection-oriented connection Understand the TCP transition state diagram Be able to name and understand the timers used in TCP Be familiar with the TCP options TCP/IP Protocol Suite
Figure 12.1 TCP/IP protocol suite
12.1 TCP SERVICES We explain the services offered by TCP to the processes at the application layer. The topics discussed in this section include: Process-to-Process Communication Stream Delivery Service Full-Duplex Communication Connection-Oriented Service Reliable Service TCP/IP Protocol Suite
Table 12.1 Well-known ports used by TCP TCP/IP Protocol Suite
Windows : system_path\windows\system32\drivers\etc\services Example 1 As we said in Chapter 11, in UNIX, the well-known ports are stored in a file called /etc/services. Each line in this file gives the name of the server and the well-known port number. We can use the grep utility to extract the line corresponding to the desired application. The following shows the ports for FTP. $ grep ftp /etc/services ftp-data 20/tcp ftp-control 21/tcp Windows : system_path\windows\system32\drivers\etc\services TCP/IP Protocol Suite
To the lower layers, TCP handles data in blocks, the segments. Figure 12.2 Stream delivery To the lower layers, TCP handles data in blocks, the segments. To the higher layers TCP handles data as a sequence of bytes and does not identify boundaries between bytes Two processes seem to be connected by an imaginary “tube”. TCP/IP Protocol Suite
Buffers handle difference between sending and receiving speed. Figure 12.3 Sending and receiving buffers Buffers handle difference between sending and receiving speed. Each host have sending buffers and receiving buffers. TCP/IP Protocol Suite
Data send in segments, due to IP layer needs to sent in packet. Figure 12.4 TCP segments Data send in segments, due to IP layer needs to sent in packet. All operation is transparent to process level. TCP/IP Protocol Suite
TCP characteristics Full Duplex Communication Data flow bi-direction in same connection. Connection-Oriented Service Before any data transfer, TCP establishes a connection: One TCP entity is waiting for a connection (“server”) The other TCP entity (“client”) contacts the server The actual procedure for setting up connections is more complex. Reliable Service TCP is a reliable protocol. Use acknowledgement mechanism to check the safe and sound arrival of data TCP has checksums for header and data. Segments with invalid checksums are discarded TCP/IP Protocol Suite
12.2 TCP FEATURES To provide the services mentioned in the previous section, TCP has several features that are briefly summarized in this section. The topics discussed in this section include: Numbering System Flow Control Error Control Congestion Control TCP/IP Protocol Suite
Numbering Systems Byte Number TCP numbers all data bytes that are transmitted in a connection. Numbering is independent in each direction. The Numbering not necessarily start from 0, TCP generates a random number between 0 – 232 -1 Sequence Number TCP assigns a sequence number to each segment that being sent. The sequence number for each segment is the number of the first byte carried in that segment. The bytes of data being transferred in each connection are numbered by TCP. The numbering starts with a randomly generated number. TCP/IP Protocol Suite
TCP/IP Protocol Suite
The acknowledgment number is cumulative. Numbering Systems Sequence Number (Cont.) When a segment carried both data and control information, it uses a sequence numbers. When a segment carried only control, it does not uses a sequence numbers. (has some exception establishment, termination, abortion). When a segment carried only data information, it uses a sequence numbers. Acknowledgement Number When a connection is established, both parties can send and receive data at the same time. Acknowledgement number defines the number of next byte that the party expects to receive. The acknowledgment number is cumulative. TCP/IP Protocol Suite
TCP Features Flow Control: Congestion Control: Error Control: Algorithms to prevent that the sender overruns the receiver with information? Congestion Control: Algorithms to prevent that the sender overloads the network Error Control: Algorithms to recover or conceal the effects from packet losses TCP/IP Protocol Suite
12.3 SEGMENT A packet in TCP is called a segment The topics discussed in this section include: Format Encapsulation TCP/IP Protocol Suite
Figure 12.5 TCP segment format TCP/IP Protocol Suite
TCP header fields Port Number: A port number identifies the endpoint of a connection. A pair <IP address, port number> identifies one endpoint of a connection. Two pairs <client IP address, server port number> and <server IP address, server port number> identify a TCP connection. TCP/IP Protocol Suite
TCP header fields Sequence Number (SeqNo): Sequence number is 32 bits long. So the range of SeqNo is 0 <= SeqNo <= 232 -1 4.3 Gbyte Each sequence number identifies a byte in the byte stream Initial Sequence Number (ISN) of a connection is set during connection establishment TCP/IP Protocol Suite
TCP header fields Acknowledgement Number (AckNo): Acknowledgements are piggybacked, I.e a segment from A -> B can contain an acknowledgement for a data sent in the B -> A direction A hosts uses the AckNo field to send acknowledgements. (If a host sends an AckNo in a segment it sets the “ACK flag”) The AckNo contains the next SeqNo that a hosts wants to receive Example: The acknowledgement for a segment with sequence numbers 0-1500 is AckNo=1501 TCP/IP Protocol Suite
TCP header fields Header Length ( 4bits): Length of header in 32-bit words Note that TCP header has variable length (with minimum 20 bytes) TCP/IP Protocol Suite
Figure 12.6 Control field TCP/IP Protocol Suite
TCP header fields Flag bits: URG: Urgent pointer is valid If the bit is set, the following bytes contain an urgent message in the range: SeqNo <= urgent message <= SeqNo+urgent pointer ACK: Acknowledgement Number is valid PSH: PUSH Flag Notification from sender to the receiver that the receiver should pass all data that it has to the application. Normally set by sender when the sender’s buffer is empty TCP/IP Protocol Suite
TCP header fields Flag bits: RST: Reset the connection The flag causes the receiver to reset the connection Receiver of a RST terminates the connection and indicates higher layer application about the reset SYN: Synchronize sequence numbers Sent in the first packet when initiating a connection FIN: Sender is finished with sending Used for closing a connection Both sides of a connection must send a FIN TCP/IP Protocol Suite
I Table 12.2 Description of flags in the control field TCP/IP Protocol Suite
TCP header fields Window Size: TCP Checksum: Urgent Pointer: Each side of the connection advertises the window size Window size is the maximum number of bytes that a receiver can accept. Maximum window size is 216-1= 65535 bytes TCP Checksum: TCP checksum covers over both TCP header and TCP data (also covers some parts of the IP header) Urgent Pointer: Only valid if URG flag is set TCP/IP Protocol Suite
The inclusion of the checksum in TCP is mandatory. Figure 12.7 Pseudoheader added to the TCP datagram Pseudoheader added to calculate checksum The inclusion of the checksum in TCP is mandatory. TCP/IP Protocol Suite
TCP header fields Options: (discuss in detail later) TCP/IP Protocol Suite
Figure 12.8 Encapsulation and decapsulation TCP/IP Protocol Suite
12.4 A TCP CONNECTION TCP is connection-oriented. A connection-oriented transport protocol establishes a virtual path between the source and destination. All of the segments belonging to a message are then sent over this virtual path. A connection-oriented transmission requires three phases: connection establishment, data transfer, and connection termination. The topics discussed in this section include: Connection Establishment Data Transfer Connection Termination Connection Reset TCP/IP Protocol Suite
TCP Connection Establishment TCP uses a three-way handshake to open a connection before any data is transferred: (1) ACTIVE OPEN: Client sends a segment with SYN bit set * port number of client initial sequence number (ISN) of client (2) PASSIVE OPEN: Server responds with a segment with initial sequence number of server ACK for ISN of client (3) Client acknowledges by sending a segment with: ACK ISN of server (* counts as one byte) TCP/IP Protocol Suite
Three-Way Handshake TCP/IP Protocol Suite
(correction sequence number = 8001) Figure 12.9 Connection establishment using three-way handshaking (correction sequence number = 8001) TCP/IP Protocol Suite
A Closer Look with tcpdump 1 aida.poly.edu.1121 > mng.poly.edu.telnet: S 1031880193:1031880193(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp> 2 mng.poly.edu.telnet> aida.poly.edu.1121: S 172488586:172488586(0) ack 1031880194 win 8760 <mss 1460> 3 aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488587 win 17520 4 aida.poly.edu.1121 > mng.poly.edu.telnet: P 1031880194:1031880218(24) ack 172488587 win 17520 5 mng.poly.edu.telnet> aida.poly.edu.1121: P 172488587:172488590(3) ack 1031880218 win 8736 6 aida.poly.edu.1121 > mng.poly.edu.telnet: P 1031880218:1031880221(3) ack 172488590 win 17520 TCP/IP Protocol Suite
Three-Way Handshake TCP/IP Protocol Suite
Why is a Two-Way Handshake not enough? When aida initiates the data transfer (starting with SeqNo=15322112355), mng will reject all data. TCP/IP Protocol Suite
Note: A SYN segment cannot carry data, but it consumes one sequence number. A SYN + ACK segment cannot carry data, but does consume one sequence number. An ACK segment, if carrying no data, consumes no sequence number. TCP/IP Protocol Suite
TCP/IP: 3-way handshake (Normal) A: valid sender B: valid receiver SYN SYN + ACK SYN Cache ACK TCP/IP Protocol Suite
TCP/IP: SYN Flood Attack X: attacker A: valid sender B: valid receiver SYN SYN SYN Cache SYN Cache Full Packet Dropped TCP/IP Protocol Suite
TCP Data Transfer After connection is established, bidirectional data transfer can take place. PUSH flag use, server TCP must deliver data to the process as soon as possible. (bypass receive window) Urgent Data (Urgent Pointer) use to point to urgent bytes. Receive process can read data out of order. Example : In case of control+C URG flag is set TCP/IP Protocol Suite
Figure 12.10 Data transfer TCP/IP Protocol Suite
TCP Connection Termination Each end of the data flow must be shut down independently (“half-close”) If one end is done it sends a FIN segment. This means that no more data will be sent Four steps involved: (1) X sends a FIN to Y (active close) (2) Y ACKs the FIN, (at this time: Y can still send data to X) (if Y don’t have data to sent, (2) and (3) can come in same packet) (3) and Y sends a FIN to X (passive close) (4) X ACKs the FIN. TCP/IP Protocol Suite
Figure 12.11 Connection termination using three-way handshaking TCP/IP Protocol Suite
TCP Connection Termination 1 mng.poly.edu.telnet > aida.poly.edu.1121: F 172488734:172488734(0) ack 1031880221 win 8733 2 aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488735 win 17484 3 aida.poly.edu.1121 > mng.poly.edu.telnet: F 1031880221:1031880221(0) ack 172488735 win 17520 4 mng.poly.edu.telnet > aida.poly.edu.1121: . ack 1031880222 win 8733 TCP/IP Protocol Suite
TCP Connection Termination TCP/IP Protocol Suite
Figure 12.12 Half-close TCP/IP Protocol Suite
Note: The FIN segment consumes one sequence number if it does not carry data. The FIN + ACK segment consumes one sequence number if it does not carry data. TCP/IP Protocol Suite
TCP Connection Reset Resetting connections is done by setting the RST flag When is the RST flag set? Connection request arrives and no server process is waiting on the destination port (denying a connection) Abort (Terminate) a connection Causes the receiver to throw away buffered data. Receiver does not acknowledge the RST segment Terminating an Idle Connection TCP/IP Protocol Suite
12.5 STATE TRANSITION DIAGRAM To keep track of all the different events happening during connection establishment, connection termination, and data transfer, the TCP software is implemented as a finite state machine. . The topics discussed in this section include: Scenarios TCP/IP Protocol Suite
Table 12.3 States for TCP TCP/IP Protocol Suite
Figure 12.13 State transition diagram TCP/IP Protocol Suite
TCP State Transition Diagram Opening A Connection TCP/IP Protocol Suite
TCP State Transition Diagram closing A Connection TCP/IP Protocol Suite
Figure 12.14 Common scenario TCP/IP Protocol Suite
Another FIN to Client reset 2MSL Timer Figure 12.15 Three-way handshake Another FIN to Client reset 2MSL Timer TCP/IP Protocol Suite
2MSL Wait State 2MSL Wait State = TIME_WAIT When TCP does an active close, and sends the final ACK, the connection must stay in in the TIME_WAIT state for twice the maximum segment lifetime. 2MSL= 2 * Maximum Segment Lifetime Why? TCP is given a chance to resent the final ACK. (Server will timeout after sending the FIN segment and resend the FIN) The MSL is set to 2 minutes or 1 minute or 30 seconds. TCP/IP Protocol Suite
Note: The common value for MSL is between 30 seconds and 1 minute. MSL = Maximum Segment Lifetime TCP/IP Protocol Suite
Figure 12.16 Simultaneous open TCP/IP Protocol Suite
Figure 12.17 Simultaneous close TCP/IP Protocol Suite
Figure 12.18 Denying a connection TCP/IP Protocol Suite
Figure 12.19 Aborting a connection TCP/IP Protocol Suite
12.6 FLOW CONTROL Flow control regulates the amount of data a source can send before receiving an acknowledgment from the destination. TCP defines a window that is imposed on the buffer of data delivered from the application program. The topics discussed in this section include: Sliding Window Protocol Silly Window Syndrome TCP/IP Protocol Suite
Figure 12.20 Sliding window TCP/IP Protocol Suite
TCP’s sliding windows are byte oriented. Note: A sliding window is used to make transmission more efficient as well as to control the flow of data so that the destination does not become overwhelmed with data. TCP’s sliding windows are byte oriented. TCP/IP Protocol Suite
Example 3 What is the value of the receiver window (rwnd) for host A if the receiver, host B, has a buffer size of 5,000 bytes and 1,000 bytes of received and unprocessed data? Solution The value of rwnd = 5,000 − 1,000 = 4,000. Host B can receive only 4,000 bytes of data before overflowing its buffer. Host B advertises this value in its next segment to A. TCP/IP Protocol Suite
Example 4 What is the size of the window for host A if the value of rwnd is 3,000 bytes and the value of cwnd is 3,500 bytes? Solution The size of the window is the smaller of rwnd and cwnd, which is 3,000 bytes. TCP/IP Protocol Suite
Sliding Window Flow Control Sliding Window Protocol is performed at the byte level: Here: Sender can transmit sequence numbers 6,7,8. TCP/IP Protocol Suite
Sliding Window: “Window Closes” Transmission of a single byte (with SeqNo = 6) and acknowledgement is received (AckNo = 5): TCP/IP Protocol Suite
Sliding Window: “Window Opens” Acknowledgement is received that enlarges the window to the right (AckNo = 5, Win=6): TCP/IP Protocol Suite
Sliding Window: “Window Shrinks” Acknowledgement is received that reduces the window from the right (AckNo = 5, Win=3): (strongly discouraged) TCP/IP Protocol Suite
Window Management in TCP The receiver is returning two parameters to the sender The interpretation is: I am ready to receive new data with SeqNo= AckNo, AckNo+1, …., AckNo+Win-1 Receiver can acknowledge data without opening the window Receiver can change the window size without acknowledging data TCP/IP Protocol Suite
Sliding Window: Example TCP/IP Protocol Suite
Example 5 Figure 12.21 shows an unrealistic example of a sliding window. The sender has sent bytes up to 202. We assume that cwnd is 20 (in reality this value is thousands of bytes). The receiver has sent an acknowledgment number of 200 with an rwnd of 9 bytes (in reality this value is thousands of bytes). The size of the sender window is the minimum of rwnd and cwnd or 9 bytes. Bytes 200 to 202 are sent, but not acknowledged. Bytes 203 to 208 can be sent without worrying about acknowledgment. Bytes 209 and above cannot be sent. TCP/IP Protocol Suite
Example 6 In Figure 12.21 the server receives a packet with an acknowledgment value of 202 and an rwnd of 9. The host has already sent bytes 203, 204, and 205. The value of cwnd is still 20. Show the new window. Solution Figure 12.22 shows the new window. Note that this is a case in which the window closes from the left and opens from the right by an equal number of bytes; the size of the window has not been changed. The acknowledgment value, 202, declares that bytes 200 and 201 have been received and the sender needs not worry about them; the window can slide over them. TCP/IP Protocol Suite
Example 7 In Figure 12.22 the sender receives a packet with an acknowledgment value of 206 and an rwnd of 12. The host has not sent any new bytes. The value of cwnd is still 20. Show the new window. Solution The value of rwnd is less than cwnd, so the size of the window is 12. Figure 12.23 shows the new window. Note that the window has been opened from the right by 7 and closed from the left by 4; the size of the window has increased. TCP/IP Protocol Suite
Example 8 In Figure 12.23 the host receives a packet with an acknowledgment value of 210 and an rwnd of 5. The host has sent bytes 206, 207, 208, and 209. The value of cwnd is still 20. Show the new window. Solution The value of rwnd is less than cwnd, so the size of the window is 5. Figure 12.24 shows the situation. Note that this is a case not allowed by most implementations. Although the sender has not sent bytes 215 to 217, the receiver does not know this. TCP/IP Protocol Suite
Example 9 How can the receiver avoid shrinking the window in the previous example? Solution The receiver needs to keep track of the last acknowledgment number and the last rwnd. If we add the acknowledgment number to rwnd we get the byte number following the right wall. If we want to prevent the right wall from moving to the left (shrinking), we must always have the following relationship. new ack + new rwnd ≥ last ack + last rwnd or new rwnd ≥ (last ack + last rwnd) − new ack TCP/IP Protocol Suite
Note: To avoid shrinking the sender window, the receiver must wait until more space is available in its buffer. TCP/IP Protocol Suite
TCP/IP Protocol Suite
Sliding Window : Window Shutdown Receiver can temporarily shutdown the window by sending an rwnd of 0. Sender can always send segment with one byte of data. (called probing, use to prevent dead lock) TCP/IP Protocol Suite
Sliding Window : Silly Window Syndrome Sending application creates data slowly or receiving application consumes data slowly. Result : sending of data in very small segments. Example : 41 bytes with 1 byte of data SWS : When AdvertisedWindow < MSS How to avoid it? not to introduce a small segment receiver waits till MSS space is available before advertizing a window open from zero TCP/IP Protocol Suite
Sliding Window : Silly Window Syndrome Nagle’s Algorithm (solve in case of sending) Timer -- clock based If both available data and Window ≥ MSS, send full segment. Else, buffer new data until ACK returns. Else, send new data now. Note -- Socket interface allows some applications to turn off Nagle’s algorithm by setting the TCP-NODELAY option. Clark’s Solution (solve in case of receiving) Send ACK as soon as the data arrives, with zero window Delayed ACK Delay sending ACK TCP/IP Protocol Suite
12.7 ERROR CONTROL TCP provides reliability using error control, which detects corrupted, lost, out-of-order, and duplicated segments. Error control in TCP is achieved through the use of the checksum, acknowledgment, and time-out. The topics discussed in this section include: Checksum Acknowledgment Acknowledgment Type Retransmission Out-of-Order Segments Some Scenarios TCP/IP Protocol Suite
Acknowledgement ACK segments don’t consume seq no and are not acknowledged. ACK rules Rule 1: piggybacking Rule 2: When the receiver has no data to send and it receives an in-order segment, the receiver delays sending an ACK until another segment arrived or until a period of time (normally 500 ms). Rule 3: no more than 2 in-order unacknowledged segments at any time Rule 4: When a segment arrives with an out-of-order seg. number, the receiver immediately sends an ACK announcing the next expected seg. number. Rule 5: When a missing segment arrives, the receiver immediately sends an ACK. Rule 6: If a duplicate segment arrives, the receiver immediately sends an ACK. Acknowledgement types Accumulate Acknowledgement (ACK) Selective Acknowledgement (SACK) TCP/IP Protocol Suite
ACK segments do not consume sequence numbers and are not acknowledged. Note: ACK segments do not consume sequence numbers and are not acknowledged. TCP/IP Protocol Suite
Retransmission In modern implementations, a retransmission occurs if the retransmission timer expires or three duplicate ACK segments have arrived. No retransmission timer is set for an ACK segment. Retransmission after RTO (retransmission timeout) Retransmission after Three Duplicate ACK segments Out-of-Order segment Data may arrive out of order and be temporarily stored by the receiving TCP, but TCP guarantees that no out-of-order segment is delivered to the process. TCP/IP Protocol Suite
No retransmission timer is set for an ACK segment. Note: In modern implementations, a retransmission occurs if the retransmission timer expires or three duplicate ACK segments have arrived. No retransmission timer is set for an ACK segment. Data may arrive out of order and be temporarily stored by the receiving TCP, but TCP guarantees that no out- of-order segment is delivered to the process. TCP/IP Protocol Suite
Figure 12.25 Normal operation TCP/IP Protocol Suite
The receiver TCP delivers only ordered data to the process. Figure 12.26 Lost segment The receiver TCP delivers only ordered data to the process. TCP/IP Protocol Suite
Figure 12.27 Fast retransmission TCP/IP Protocol Suite
Delayed segment Duplicate segment Delayed TCP segment are treated the same way as lost or corrupted segments by the receiver. The delayed segment may arrive after it has been resent (a duplicate segment) Duplicate segment When a segment arrives that contains a sequence number less than the previously acknowledged bytes, it is discarded. TCP/IP Protocol Suite
Figure 12.28 Lost acknowledgment TCP/IP Protocol Suite
Figure 12.29 Lost acknowledgment corrected by resending a segment Lost acknowledgments may create deadlock if they are not properly handled. TCP/IP Protocol Suite
12.8 CONGESTION CONTROL Congestion control refers to the mechanisms and techniques to keep the load below the capacity. The topics discussed in this section include: Network Performance Congestion Control Mechanisms Congestion Control in TCP TCP/IP Protocol Suite
Congestion Control Congestion in a network may occur if the load on the network is greater than the capacity of the network Congestion control refers to the mechanism and techniques to control the congestion and keep the load below the capacity Congestion in a network or internetwork occurs because routers and switches have queues TCP/IP Protocol Suite
Figure 12.30 Router queues TCP/IP Protocol Suite
Network performance Delay versus Load Figure 12.31 Packet delay and network load Network performance Delay versus Load TCP/IP Protocol Suite
Figure 12.32 Throughput versus network load Throughput versus Load TCP/IP Protocol Suite
Congestion Control Congestion control mechanisms refers to techniques and mechanisms that can either prevent congestion, before it happens, or remove congestion, after it has happened open-loop congestion control (prevention) and closed loop congestion control (removal) TCP/IP Protocol Suite
Congestion Control Open-loop congestion control Retransmission policy The retransmission policy and the retransmission timers must be designed to optimize efficiency and at the same time prevent congestion Acknowledgment policy If the receiver does not acknowledge every packet it receives, it may slow down the sender and help prevent congestion Discard policy In audio transmission, if the policy is to discard less sensitive packets when congestion is likely, the quality of sound is still preserved and congestion is prevented TCP/IP Protocol Suite
Congestion Control Closed-loop congestion control Back pressure informing the previous upstream router to reduce the rate of outgoing packets Choke point is a packet sent by a router to the source to inform it of congestion is similar to ICMP’s source quench packet Implicit signaling Detecting an implicit signal warning of congestion and slow down its sending rate. Ex) receiving delayed ACK Explicit signaling Router experiencing congestion can send an explicit signal by setting a bit in a packet to the sender or the receiver TCP/IP Protocol Suite
Congestion Control Congestion window Today, TCP protocols include that the sender’s window size is not only determined by the receiver but also by congestion in the network Actual window size = minimum (rwnd, cwnd) TCP/IP Protocol Suite
Figure 12.33 Slow start, exponential increase In the slow start algorithm, the size of the congestion window increases exponentially until it reaches a threshold. TCP/IP Protocol Suite
Congestion Control In the slow start algorithm, the size of the congestion window increases exponentially until it reaches a threshold Start cwnd = 1 After 1 RTT cwnd = 1 x 2 = 2 21 After 2 RTT cwnd = 2 x 2 = 4 22 After 3 RTT cwnd = 4 x 2 = 8 23 TCP/IP Protocol Suite
Figure 12.34 Congestion avoidance, additive increase When the size of the congestion window reaches the slow start threshold, in the congestion avoidance algorithm, the size of the congestion window increases additively until congestion is detected TCP/IP Protocol Suite
TCP/IP Protocol Suite
Congestion Control Congestion detection: Multiplicative Decrease Most implementations react differently to congestion detection: If detection is by time-out, a new slow start phase starts If detection is by three ACKs, a new congestion avoidance phase starts TCP/IP Protocol Suite
Figure 12.35 TCP congestion policy summary TCP/IP Protocol Suite
Figure 12.36 Congestion example TCP/IP Protocol Suite
12.9 TCP TIMERS To perform its operation smoothly, most TCP implementations use at least four timers. The topics discussed in this section include: Retransmission Timer Persistence Timer Keepalive Timer TIME-WAIT Timer TCP/IP Protocol Suite
Figure 12.37 TCP timers To perform its operation smoothly, most TCP implementations use at least four timers TCP/IP Protocol Suite
TCP Timers Round Trip Time (RTT) To calculate the retransmission (RTO), we first need to calculate the round-trip time (RTT) In TCP, there can be only one RTT measurement in progress at any time Measured RTT (RTTM) : how long it takes to send a segment and receive an acknowledgment of it TCP/IP Protocol Suite
TCP Timers Smoothed RTT (RTTS) : Weighed average of RTTM and previous RTTS Original No Value After first measurement RTTS = RTTM After any other measurement RTTS = (1- ) RTTS + · RTTM The value of is implementation-dependent, but it is normally set to 1/8 In TCP, there can be only be one RTT measurement in progress at any time. TCP/IP Protocol Suite
TCP Timers RTT Deviation (RTTD) Original No Value After first measurement RTTD = RTTM/2 After any other measurement RTTD = (1- ) RTTD + · l RTTS – RTTM I The value of is also implementation dependent, but is it is usually is sent to ¼. TCP/IP Protocol Suite
TCP Timers Retransmission Timeout (RTO) Original Initial Value After any measurement RTO = RTTS + 4 RTTD TCP/IP Protocol Suite
Example 10 Let us give a hypothetical example. Figure 12.38 shows part of a connection. The figure shows the connection establishment and part of the data transfer phases. 1. When the SYN segment is sent, there is no value for RTTM , RTTS , or RTTD . The value of RTO is set to 6.00 seconds. The following shows the value of these variables at this moment: RTTM = 1.5 RTTS = 1.5 RTTD = 1.5 / 2 = 0.75 RTO = 1.5 + 4 . 0.75 = 4.5 2. When the SYN+ACK segment arrives, RTTM is measured and is equal to 1.5 seconds. The next slide shows the values of these variables: TCP/IP Protocol Suite
Example 10 (continued) RTTM = 1.5 RTTS = 1.5 RTTD = 1.5 / 2 = 0.75 RTO = 1.5 + 4 . 0.75 = 4.5 3.When the first data segment is sent, a new RTT measurement starts. Note that the sender does not start an RTT measurement when it sends the ACK segment, because it does not consume a sequence number and there is no time-out. No RTT measurement starts for the second data segment because a measurement is already in progress. RTTM = 2.5 RTTS = 7/8 (1.5) + 1/8 (2.5) = 1.625 RTTD = 3/4 (7.5) + 1/4 |1.625 − 2.5| = 0.78 RTO = 1.625 + 4 (0.78) = 4.74 TCP/IP Protocol Suite
Figure 12.38 Example 10 TCP/IP Protocol Suite
TCP Timers Persistence Timer When acknowledgment with non-zero window size after zero window size is lost, to correct deadlock, TCP uses a persistence timer for each connection When the sending TCP receives an acknowledgment with a window size of zero, the persistence timer is started When persistence timer goes off, the sending TCP sends a special segment called a probe The probe alerts the receiving TCP that the acknowledgment was lost and should be resent If a response is not received, the sender continues sending the probe segments and doubling, and resetting the value of the persistence timer until the value reaches a threshold (usually 60 seconds) After that sender sends one probe segment every 60s until the window is reopened TCP/IP Protocol Suite
TCP Timers Keepalive Timer TIME-WAIT Timer Used to prevent a long idle connection between two TCPs. Each time the server hears from a client, it resets this timer Time-out is usually 2 hours After 2 hours, sending 10 probes to client (each 75 secs), then terminates connection TIME-WAIT Timer The time-wait timer is used during connection termination TCP/IP Protocol Suite
Note: TCP does not consider the RTT of a retransmitted segment in its calculation of a new RTO. TCP/IP Protocol Suite
Example 11 Figure 12.39 is a continuation of the previous example. There is retransmission and Karn’s algorithm is applied. The first segment in the figure is sent, but lost. The RTO timer expires after 4.74 seconds. The segment is retransmitted and the timer is set to 9.48, twice the previous value of RTO. This time an ACK is received before the time-out. We wait until we send a new segment and receive the ACK for it before recalculating the RTO (Karn’s algorithm). TCP/IP Protocol Suite
Figure 12.39 Example 11 TCP/IP Protocol Suite
12.10 OPTIONS The TCP header can have up to 40 bytes of optional information. Options convey additional information to the destination or align other options. TCP/IP Protocol Suite
Figure 12.40 Options TCP/IP Protocol Suite
EOP can be used only once. Figure 12.41 End-of-option option EOP can be used only once. TCP/IP Protocol Suite
NOP can be used more than once. Figure 12.42 No-operation option NOP can be used more than once. TCP/IP Protocol Suite
Figure 12.43 Maximum-segment-size option The value of MSS is determined during connection establishment and does not change during the connection. TCP/IP Protocol Suite
Figure 12.44 Window-scale-factor option The value of the window scale factor can be determined only during connection establishment; it does not change during the connection. TCP/IP Protocol Suite
Figure 12.45 Timestamp option One application of the timestamp option is the calculation of round trip time (RTT). TCP/IP Protocol Suite
Example 12 Figure 12.46 shows an example that calculates the round-trip time for one end. Everything must be flipped if we want to calculate the RTT for the other end. The sender simply inserts the value of the clock (for example, the number of seconds past from midnight) in the timestamp field for the first and second segment. When an acknowledgment comes (the third segment), the value of the clock is checked and the value of the echo reply field is subtracted from the current time. RTT is 12 s in this scenario. TCP/IP Protocol Suite
Example 12 (Continued) The receiver’s function is more involved. It keeps track of the last acknowledgment sent (12000). When the first segment arrives, it contains the bytes 12000 to 12099. The first byte is the same as the value of lastack. It then copies the timestamp value (4720) into the tsrecent variable. The value of lastack is still 12000 (no new acknowledgment has been sent). When the second segment arrives, since none of the byte numbers in this segment include the value of lastack, the value of the timestamp field is ignored. When the receiver decides to send an accumulative acknowledgment with acknowledgment 12200, it changes the value of lastack to 12200 and inserts the value of tsrecent in the echo reply field. The value of tsrecent will not change until it isreplaced by a new segment that carries byte 12200 (next segment). TCP/IP Protocol Suite
Example 12 (Continued) Note that as the example shows, the RTT calculated is the time difference between sending the first segment and receiving the third segment. This is actually the meaning of RTT: the time difference between a packet sent and the acknowledgment received. The third segment carries the acknowledgment for the first and second segments. TCP/IP Protocol Suite
The timestamp option can also be used for PAWS. Figure 12.46 Example 12 The timestamp option can also be used for PAWS. TCP/IP Protocol Suite
Figure 12.47 SACK TCP/IP Protocol Suite
Example 13 Let us see how the SACK option is used to list out-of-order blocks. In Figure 12.48 an end has received five segments of data. The first and second segments are in consecutive order. An accumulative acknowledgment can be sent to report the reception of these two segments. Segments 3, 4, and 5, however, are out of order with a gap between the second and third and a gap between the fourth and the fifth. An ACK and a SACK together can easily clear the situation for the sender. The value of ACK is2001, which means that the sender need not worry about bytes 1 to 2000. The SACK has two blocks. The first block announces that bytes 4001 to 6000 have arrived out of order. The second block shows that bytes 8001 to 9000 have also arrived out of order. This means that bytes 2001 to 4000 and bytes 6001 to 8000 are lost or discarded. The sender can resend only these bytes. TCP/IP Protocol Suite
Figure 12.48 Example 13 TCP/IP Protocol Suite
Example 14 The example in Figure 12.49 shows how a duplicate segment can be detected with a combination of ACK and SACK. In this case, we have some out-of-order segments (in one block) and one duplicate segment. To show both out-of-order and duplicate data, SACK uses the first block, in this case, to show the duplicate data and other blocks to show out-of-order data. Note that only the first block can be used for duplicate data. The natural question is how the sender, when it receives these ACK and SACK values knows that the first block is for duplicate data (compare this example with the previous example). The answer is that the bytes in the first block are already acknowledged in the ACK field; therefore, this block must be a duplicate. TCP/IP Protocol Suite
Figure 12.49 Example 14 TCP/IP Protocol Suite
Example 15 The example in Figure 12.50 shows what happens if one of the segments in the out-of-order section is also duplicated. In this example, one of the segments (4001:5000) is duplicated. The SACK option announces this duplicate data first and then the out-of-order block. This time, however, the duplicated block is not yet acknowledged by ACK, but because it is part of the out-of-order block (4001:5000 is part of 4001:6000), it is understood by the sender that it defines the duplicate data. TCP/IP Protocol Suite
Figure 12.50 Example 15 TCP/IP Protocol Suite
12.11 TCP PACKAGE We present a simplified, bare-bones TCP package to simulate the heart of TCP. The package involves tables called transmission control blocks, a set of timers, and three software modules. The topics discussed in this section include: Transmission Control Blocks (TCBs) Timers Main Module Input Processing Module Output Processing Module TCP/IP Protocol Suite
Figure 12.51 TCP package A TCP package involving a table called Transmission Control Blocks, a set of timers, and three software modules: main module, input processing module, output processing module. TCP/IP Protocol Suite
Figure 12.52 TCBs Transmission Control Block (TCBs) To control the connection, TCP uses a structure to hold information about each connection TCP keeps an array of TCBs in the form of a table TCP/IP Protocol Suite
TCP Package State : defining the state of the connection according to the state transition diagram Process : defining the process using this connection at this machine as a client or a server Local IP address : defining the IP address of the local machine used by this connection Local port number : defining the local port number used by this connection Remote IP address Remote port address Interface : defining the local interface Local window : holding information about the window at the local TCP Remote window TCP/IP Protocol Suite
TCP Package Sending sequence number Receiving sequence number Sending ACK number Time-out values : retransmission time-out, persistence time-out, keepalive time-out, and so on Buffer size : defining the size of the buffer at the local TCP Buffer pointer : pointer to buffer where the receiving data is kept until is read by the application TCP/IP Protocol Suite
TCP Package Main Module : The main module is invoked by an arrived TCP segment, a time-out, or a message from an application program TCP/IP Protocol Suite
TCP Package Input processing module Output processing module handles all the details needed to process data or acknowledgment received when TCP is in the ESTABLISHED state sends an ACK if needed, takes care of the window size, does error checking, and so on Output processing module handles all the details needed to send out data received from application program when TCP is in the ESTABLISHED state handles retransmission time-outs, persistent time-outs, and so on TCP/IP Protocol Suite