Published by Ἱεριχώ Στεφανόπουλος. Modified over 6 years ago.
1
CSS432 End-to-End Protocols Textbook Ch5.1 – 5.2
Prof. Athirai Irissappane CSS432: End-to-End Protocols
2
Outline
Transport layer protocols
- Provide communication between applications running on the end nodes, hence "end-to-end" protocols
- Convert the host-to-host packet delivery service into a process-to-process communication channel
Simple Demultiplexer (UDP)
Reliable Byte Stream (TCP)
CSS432: Internetworking
3
End-to-End Protocols
Common properties that a transport protocol can be expected to provide:
- Guarantees message delivery
- Delivers messages in the same order they were sent
- Delivers at most one copy of each message
- Supports arbitrarily large messages
- Supports synchronization between the sender and the receiver
- Allows the receiver to apply flow control to the sender
- Supports multiple application processes on each host
4
End-to-End Protocols
Typical limitations of the network on which a transport protocol operates:
- Drops messages
- Reorders messages
- Delivers duplicate copies of a given message
- Limits messages to some finite size
- Delivers messages after an arbitrarily long delay
Challenge for transport protocols: develop algorithms that turn these limitations into the high level of service required by the application.
[Figure: messages M1 to M5 cross the network and arrive dropped, reordered, or duplicated; the receiver requests a retransmission of M3 and M5.]
5
Simple Demultiplexer (UDP)
- Extends the host-to-host delivery service of the underlying network into a process-to-process communication service
- Adds a level of demultiplexing, which allows multiple application processes on each host to share the network
- Unreliable and unordered datagram service
- No flow control: nothing keeps a sender from overrunning the capacity of the receiver (messages are discarded if the receive buffer is full)
- UDP is mostly used for unidirectional communication such as broadcasting
6
Simple Demultiplexer (UDP)
- Each process is identified by the port number associated with the sender and the receiver
- Endpoints are identified by ports
  - Servers have well-known ports
  - See the /etc/services file on Unix
- Header format (two 32-bit words followed by the data):
    bits 0-15: SrcPort    bits 16-31: DstPort
    bits 0-15: Length     bits 16-31: Checksum
    Data
- Optional checksum
  - Calculated over the pseudoheader + UDP header + data
  - Pseudoheader: three fields from the IP header (protocol number, source IP address, destination IP address) plus the UDP length field
Notes: A TCP connection consists of two endpoints, and each endpoint consists of an IP address and a port number. An established connection can therefore be thought of as the 4-tuple (server IP, server port, client IP, client port). Usually three of the four are readily known: the client machine uses its own IP address, and connecting to a remote service requires the server machine's IP address and service port number. What is not immediately evident is that the client side of the connection also uses a port number. Unless a client program explicitly requests a specific port number, the port used is an ephemeral port. Ephemeral ports are temporary ports assigned by a machine's IP stack from a range designated for this purpose. When the connection terminates, the ephemeral port becomes available for reuse.
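The checksum rule on this slide can be sketched in C. This is a minimal illustration, assuming IPv4 and the standard 16-bit one's-complement Internet checksum; the function names `sum_words` and `udp_checksum` are hypothetical, and the addresses are passed as host-order 32-bit integers purely for convenience.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add the buffer to a running sum as 16-bit big-endian words. */
static uint32_t sum_words(const uint8_t *data, size_t len, uint32_t sum)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)                        /* odd trailing byte, padded with zero */
        sum += (uint32_t)(data[i] << 8);
    return sum;
}

/* Checksum over pseudoheader + UDP header + data.
 * segment holds the UDP header (checksum field set to 0) followed by the data. */
uint16_t udp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *segment, size_t seg_len)
{
    assert(segment != NULL && seg_len >= 8 && seg_len <= 0xFFFF);
    uint32_t sum = 0;
    /* Pseudoheader: source IP, destination IP, protocol number, UDP length. */
    sum += (src_ip >> 16) + (src_ip & 0xFFFF);
    sum += (dst_ip >> 16) + (dst_ip & 0xFFFF);
    sum += 17;                          /* IP protocol number of UDP */
    sum += (uint32_t)seg_len;
    sum = sum_words(segment, seg_len, sum);
    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    uint16_t result = (uint16_t)~sum;
    return result == 0 ? 0xFFFF : result;   /* 0 is reserved for "no checksum" */
}
```

A receiver can run the same function over a segment that already carries its checksum; an undamaged segment then yields 0xFFFF in this sketch.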
7
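UDP: Simple Demultiplexer
[Figure: two receiving processes each call socket(), bind(), recv() on their own port (13579 and 12345); a sender calls socket(), sendto(). UDP demultiplexes the incoming packets M1 to M5 to the process bound to the destination port.]
Receiver (needs <sys/socket.h>, <netinet/in.h>):
    int sd;
    sd = socket( AF_INET, SOCK_DGRAM, 0 );            // UDP socket
    struct sockaddr_in server;
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );     // accept on any local interface
    server.sin_port = htons( /* port, left blank on the slide */ );
    bind( sd, (struct sockaddr *)&server, sizeof( server ) );
    recv( sd, buf, sizeof( buf ), 0 );
Sender (also needs <netdb.h>, <strings.h>):
    int sd = socket( AF_INET, SOCK_DGRAM, 0 );
    struct hostent *hp;
    struct sockaddr_in server;
    server.sin_family = AF_INET;
    hp = gethostbyname( /* server host name, left blank on the slide */ );
    bcopy( hp->h_addr, &( server.sin_addr.s_addr ), hp->h_length );   // copy h_length bytes, not sizeof( h_length )
    server.sin_port = htons( /* the receiver's port */ );
    sendto( sd, buf, sizeof( buf ), 0, (struct sockaddr *)&server, sizeof( server ) );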
8
Common end-to-end services
- Guarantee message delivery
- Deliver messages in FIFO order
- Deliver at most one copy of each message
- Support arbitrarily large messages
- Support synchronization
- Allow the receiver to flow control the sender
- Support multiple application processes on each host
[Figure: application processes P1 to P4 exchange messages M1 to M5; the network underneath carries them as m1 to m5.]
9
TCP Overview
- Connection oriented
- Guarantees that all sent packets reach the destination in the correct order
  - Uses ACK packets, retransmission, and timeouts
- Full duplex: bidirectional, each end can send and receive
- Flow control: keeps the sender from overrunning the receiver
- Congestion control: keeps the sender from overrunning the network
10
TCP (Reliable Byte Stream)
- Byte-oriented protocol: the sender writes bytes into a TCP connection and the receiver reads bytes out of the TCP connection
- Byte stream, but TCP does not transmit individual bytes
  - TCP on the source host buffers enough bytes from the sending process to fill a reasonably sized packet (a segment) and sends it to the destination
  - TCP on the destination host empties the contents of the segment into a receive buffer, and the receiving process reads from this buffer at its leisure
[Figure: the sending application process writes bytes into the TCP send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the application process reads bytes.]
11
Sockets (Code Example)
[Figure: the client calls socket(), connect(), write(); the server calls socket(), bind(), listen(), accept(), read(). buf1 and buf2 travel from client to server.]
Client:
    int sd = socket( AF_INET, SOCK_STREAM, 0 );       // TCP socket
    struct hostent *host = gethostbyname( arg[0] );
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = inet_addr( inet_ntoa( *(struct in_addr*)*host->h_addr_list ) );
    server.sin_port = htons( /* port, left blank on the slide */ );
    connect( sd, (sockaddr *)&server, sizeof( server ) );
    write( sd, buf1, sizeof( buf1 ) );
    write( sd, buf2, sizeof( buf2 ) );
Server:
    int sd, newSd;
    sd = socket( AF_INET, SOCK_STREAM, 0 );
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );
    server.sin_port = htons( /* port, left blank on the slide */ );
    bind( sd, (sockaddr *)&server, sizeof( server ) );
    listen( sd, 5 );                                  // queue up to 5 pending connections
    sockaddr_in client;
    socklen_t len = sizeof( client );
    while ( true ) {
        newSd = accept( sd, (sockaddr *)&client, &len );
        if ( fork( ) == 0 ) {                         // child: serve this connection
            close( sd );
            read( newSd, buf1, sizeof( buf1 ) );
            read( newSd, buf2, sizeof( buf2 ) );
            close( newSd );
            exit( 0 );
        }
        close( newSd );                               // parent: go back to accept( )
    }
12
Data Link Versus Transport
- The data link layer transfers data between two adjacent nodes (a single point-to-point physical link), whereas the transport layer provides communication between processes running on different hosts
  => need explicit connection establishment and termination
- A single physical point-to-point link has a fixed RTT (round-trip time). TCP connections have potentially different RTTs, since they connect hosts anywhere on the Internet, whether in the same room or across the network
  => need an adaptive timeout mechanism for retransmissions
- On a point-to-point link, packets are received in FIFO order. In TCP, they can be reordered as they cross the Internet, e.g., because of long delays in the network or retransmissions
  => need to be prepared for the arrival of very old packets
  - Packets slightly out of order can be corrected using the SeqNum of the sliding window protocol
  - How late can a packet be? Need to set an MSL (Maximum Segment Lifetime)
13
Data Link Versus Transport
- Hosts connected to a point-to-point link are engineered to support the link, and the hosts at both ends have similar resources. If window size = bandwidth * RTT, the sender and the receiver are likely to have a window-sized buffer. For a TCP connection, the resources dedicated to the connection (buffer space, etc.) can vary, especially if one of the hosts supports multiple TCP connections
  => need to accommodate different node capacity (flow control)
- Potentially different network capacity. On a directly connected point-to-point link, the bandwidth of the link is known, the transmitter cannot send faster than that bandwidth, and it is not possible to congest the network. In TCP, the links that will be traversed are not known beforehand, and multiple sources can send over the same link
  => need to be prepared for network congestion
14
Segment Format (TCP Header)
- Each TCP connection is identified by the 4-tuple (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
- Sliding window + flow control: Acknowledgment, SequenceNum, AdvertisedWindow
- Flags: SYN, FIN, RESET, PUSH, URG, ACK
  - SYN (synchronize): establishing a connection
  - FIN (finish): terminating a connection
  - RESET: the receiver is confused and is terminating the connection
  - PUSH: see Section 5.2.7
  - URG: sending urgent data
  - ACK: the Acknowledgment field is valid
- SYN and FIN each consume one sequence number, just as every data byte does; a segment that carries only an ACK does not advance SequenceNum
[Figure: the sender sends Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow.]
15
Segment Format (TCP Header)
The SrcPort and DstPort fields identify the source and destination ports, respectively. The Acknowledgment, SequenceNum, and AdvertisedWindow fields are all involved in TCP’s sliding window algorithm. Because TCP is a byte-oriented protocol, each byte of data has a sequence number; the SequenceNum field contains the sequence number for the first byte of data carried in that segment. The Acknowledgment and AdvertisedWindow fields carry information about the flow of data going in the other direction.
16
Segment Format (TCP Header)
The 6-bit Flags field is used to relay control information between TCP peers. The possible flags include SYN, FIN, RESET, PUSH, URG, and ACK. The SYN and FIN flags are used when establishing and terminating a TCP connection, respectively. The ACK flag is set any time the Acknowledgment field is valid, implying that the receiver should pay attention to it. The URG flag signifies that this segment contains urgent data. When this flag is set, the UrgPtr field indicates where the nonurgent data contained in this segment begins. The urgent data is contained at the front of the segment body, up to and including a value of UrgPtr bytes into the segment. The PUSH flag lets the sender tell TCP to flush (send) all the bytes it has collected to its peer, and it also notifies the receiving side of this fact. The RESET flag signifies that the receiver has become confused (for example, it received a segment it did not expect to receive) and so wants to abort the connection. Finally, the Checksum field is used in exactly the same way as for UDP: it is computed over the TCP header, the TCP data, and the pseudoheader, which is made up of the source address, destination address, and length fields from the IP header.
17
TCP Connection Establishment and Termination
Three-Way Handshake
- Client: initiates a connection to the server by sending a SYN segment with SequenceNum = x; sets a timer and retransmits the request if the timer expires
- Server: acknowledges the client's request with Acknowledgment = x + 1 and initiates the reverse direction with its own starting sequence number, SequenceNum = y
- Client: acknowledges the server's request with Acknowledgment = y + 1 (the next sequence number expected)
- x and y are chosen at random; otherwise a segment from an earlier incarnation of the same connection could interfere with a later incarnation
[Figure: timeline between the active participant (client) and the passive participant (server)]
  1. Client to server: SYN, SequenceNum = x
  2. Server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
  3. Client to server: ACK, Acknowledgment = y + 1
18
State Transition Diagram
[Figure: TCP state transition diagram with the states CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, and LAST_ACK. Arcs are labeled event/action: Passive open, Active open/SYN, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, ACK, Close/FIN, FIN/ACK, ACK + FIN/ACK, Close, and a timeout after two segment lifetimes (2 * MSL).]
- Active open: the client calls connect( )
- Passive open: the server calls listen( )
- Active close: the first close( ), by the client or the server; both sides can be active closers
- Passive close: close( ) in response to the first close( )
19
State Transition Diagram
- Shows only the states involved in opening and closing a TCP connection; everything in between is hidden in the ESTABLISHED state
- Each box represents the state of one end of a TCP connection
- All connections start in the CLOSED state; as the connection progresses, it moves from state to state
- Each arc is labeled event/action: when the event happens in a given state, TCP moves to the next state and takes the action
- Two kinds of events: (1) a segment arrives from the peer, (2) the local application invokes an operation on TCP
- One or both sides can close the connection. A side that has closed can no longer send data but can still receive it
20
State Transition Diagram
Opening a connection:
- CLOSED to LISTEN: the server invokes a passive open on TCP and waits for a connection request
- CLOSED to SYN_SENT: the client invokes an active open, moves to SYN_SENT, and sends a SYN segment to the server
- LISTEN to SYN_RCVD: the server receives the SYN segment from the client, moves to SYN_RCVD, and sends SYN + ACK to the client
- SYN_SENT to ESTABLISHED: the client receives the SYN + ACK from the server, moves to ESTABLISHED, and sends an ACK back to the server
- SYN_RCVD to ESTABLISHED: the server receives the ACK from the client and moves to ESTABLISHED
[Figure: TCP state transition diagram, with the opening transitions highlighted.]
21
State Transition Diagram
Three cases: this side closes the connection first, the other side closes first, or both sides close at the same time.
Closing a connection (here the server closes first):
- ESTABLISHED to FIN_WAIT_1: the server sends a termination request (FIN) and waits for an ACK from the client
- ESTABLISHED to CLOSE_WAIT: the client receives the FIN, sends an ACK to the server, and waits for its own local close
- CLOSE_WAIT to LAST_ACK: the client sends its own FIN to the server and waits for an ACK
- FIN_WAIT_1 to FIN_WAIT_2: the server receives the ACK from the client and waits for the client's FIN
- FIN_WAIT_2 to TIME_WAIT: the server receives the FIN from the client, sends an ACK, and waits long enough for the client to receive that ACK
- LAST_ACK to CLOSED: the client receives the ACK from the server and moves to CLOSED
- TIME_WAIT to CLOSED: the server waits 2 * MSL and moves to CLOSED
MSL (Maximum Segment Lifetime): the maximum amount of time a segment can exist in the Internet (e.g., 120 s); after that the segment is discarded, even if it arrives at the destination.
[Figure: TCP state transition diagram, with the closing transitions highlighted.]
22
State Transition Diagram
Closing a connection (both sides close at the same time):
- ESTABLISHED to FIN_WAIT_1: the server and the client both send termination requests (FIN)
- FIN_WAIT_1 to CLOSING: each side receives the other's FIN, sends an ACK, and waits for the ACK of its own FIN
- CLOSING to TIME_WAIT: each side receives the ACK for the FIN it sent and waits long enough for its peer to receive the ACK it sent
- TIME_WAIT to CLOSED: each side waits 2 * MSL and moves to CLOSED
- FIN_WAIT_1 to TIME_WAIT: a side receives the peer's FIN together with the ACK for the FIN it sent
[Figure: TCP state transition diagram, with the simultaneous-close transitions highlighted.]
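The transitions listed on the state-diagram slides can be written down as a small transition function. This is only a sketch of the diagram, not a TCP implementation: the type and event names (`tcp_state`, `tcp_event`, `tcp_next_state`) are hypothetical, the actions (sending SYN, ACK, FIN) appear only as comments, and any event that the diagram does not list for a state leaves the state unchanged.

```c
#include <assert.h>

typedef enum { CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED,
               FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT,
               CLOSE_WAIT, LAST_ACK, NUM_STATES } tcp_state;

typedef enum { EV_PASSIVE_OPEN, EV_ACTIVE_OPEN, EV_CLOSE,
               EV_RCV_SYN, EV_RCV_SYN_ACK, EV_RCV_ACK,
               EV_RCV_FIN, EV_RCV_FIN_ACK, EV_TIMEOUT_2MSL,
               NUM_EVENTS } tcp_event;

/* Return the state reached from s when event e occurs. */
tcp_state tcp_next_state(tcp_state s, tcp_event e)
{
    assert(s < NUM_STATES && e < NUM_EVENTS);
    switch (s) {
    case CLOSED:
        if (e == EV_PASSIVE_OPEN) return LISTEN;
        if (e == EV_ACTIVE_OPEN)  return SYN_SENT;    /* action: send SYN */
        break;
    case LISTEN:
        if (e == EV_RCV_SYN)      return SYN_RCVD;    /* action: send SYN + ACK */
        if (e == EV_CLOSE)        return CLOSED;
        break;
    case SYN_SENT:
        if (e == EV_RCV_SYN_ACK)  return ESTABLISHED; /* action: send ACK */
        if (e == EV_RCV_SYN)      return SYN_RCVD;    /* action: send SYN + ACK */
        if (e == EV_CLOSE)        return CLOSED;
        break;
    case SYN_RCVD:
        if (e == EV_RCV_ACK)      return ESTABLISHED;
        if (e == EV_CLOSE)        return FIN_WAIT_1;  /* action: send FIN */
        break;
    case ESTABLISHED:
        if (e == EV_CLOSE)        return FIN_WAIT_1;  /* active close: send FIN */
        if (e == EV_RCV_FIN)      return CLOSE_WAIT;  /* passive close: send ACK */
        break;
    case FIN_WAIT_1:
        if (e == EV_RCV_ACK)      return FIN_WAIT_2;
        if (e == EV_RCV_FIN)      return CLOSING;     /* simultaneous close: send ACK */
        if (e == EV_RCV_FIN_ACK)  return TIME_WAIT;   /* ACK + FIN together: send ACK */
        break;
    case FIN_WAIT_2:
        if (e == EV_RCV_FIN)      return TIME_WAIT;   /* action: send ACK */
        break;
    case CLOSING:
        if (e == EV_RCV_ACK)      return TIME_WAIT;
        break;
    case TIME_WAIT:
        if (e == EV_TIMEOUT_2MSL) return CLOSED;
        break;
    case CLOSE_WAIT:
        if (e == EV_CLOSE)        return LAST_ACK;    /* action: send FIN */
        break;
    case LAST_ACK:
        if (e == EV_RCV_ACK)      return CLOSED;
        break;
    default:
        break;
    }
    return s;   /* event not listed for this state: stay put */
}
```

Feeding it the events of an active opener that later closes first walks through CLOSED, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, and back to CLOSED.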
23
State Transition Diagram
Question 1: Under what condition can the state move from FIN_WAIT_1 to TIME_WAIT?
- Host A has sent a FIN segment to host B and has moved from ESTABLISHED to FIN_WAIT_1. Host A then receives a segment from B that contains both the ACK of this FIN and B’s own FIN. This could happen if the application on host B closed its end of the connection immediately when host A’s FIN segment arrived, and was thus able to send its own FIN along with the ACK.
- Normally, because the host B application must be scheduled to run before it can close the connection and thus have the FIN sent, the ACK is sent before the FIN. While “delayed ACKs” are a standard part of TCP, traditionally only ACKs of DATA, not FIN, are delayed. See RFC 813 for further details.
Question 2: What is the purpose of the TIME_WAIT state?
- TCP is given a chance to resend the final ACK
  - The client sends FIN and the server receives it
  - The ACK sent by the server can be delayed, so the client times out (1 MSL)
  - The client resends the FIN, which can also be delayed (1 MSL)
  - Without TIME_WAIT, a new TCP connection could receive the delayed FIN and close
- In more detail: the server received FIN + ACK from the client and sent an ACK back. That ACK may or may not have reached the client. The client can retransmit its FIN to the server, and the retransmission can be delayed in the network. This takes up to 2 * MSL (timeout for the ACK + timeout for the retransmitted FIN). Without TIME_WAIT, if another connection is opened between the same IP addresses and ports and the retransmitted FIN reaches the server of the new connection, an unwanted termination occurs.
24
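Timing Chart
Establishment
- The client calls connect( ), moves to SYN_SENT, and sends SYN seq=x; the server, in LISTEN after listen( ), moves to SYN_RCVD
- The server sends SYN seq=y, ACK=x + 1; the client moves to ESTABLISHED
- The client sends ACK=y + 1; the server moves to ESTABLISHED
Data transfer
- The client calls write( ): seq=x + 1, ACK=y + 1; the server calls read( )
- The server sends ACK x + 2
Termination
- The client calls close( ), moves to FIN_WAIT_1, and sends FIN seq=x + 2, ACK=y + 1; the server moves to CLOSE_WAIT
- The server sends ACK x + 3; the client moves to FIN_WAIT_2
- The server calls close( ), sends FIN seq=y + 1, and moves to LAST_ACK
- The client moves to TIME_WAIT and sends ACK=y + 2
Peek at such a flow with tcpdump in assignment 3.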
25
Sliding Window Revisited
TCP uses a variant of the sliding window algorithm that serves several purposes: (1) it guarantees the reliable delivery of data, (2) it ensures that data is delivered in order, and (3) it enforces flow control between the sender and the receiver.
26
Sliding Window (Reliable & Ordered Delivery)
[Figure: the sending application writes into the TCP send buffer, marked by LastByteAcked, LastByteSent, and LastByteWritten; the receiving application reads from the TCP receive buffer, marked by LastByteRead, NextByteExpected, and LastByteRcvd.]
Sending side
- LastByteAcked ≤ LastByteSent: the receiver cannot acknowledge a byte that has not been sent
- LastByteSent ≤ LastByteWritten: TCP cannot send a byte that has not been written to the send buffer
- Buffer the bytes between LastByteAcked and LastByteWritten
Receiving side
- LastByteRead < NextByteExpected: a byte cannot be read by the receiver until it has been received
- NextByteExpected ≤ LastByteRcvd + 1: NextByteExpected points to the start of the first gap when data arrive out of order
- Buffer the bytes between LastByteRead and LastByteRcvd
Definitions
- LastByteWritten: last byte written into the send buffer, possibly not yet transmitted
- LastByteRcvd: last byte received from the sender
- LastByteRead: last byte read by the receiving application after it arrived from the sender
27
Sliding Window (Reliable & Ordered Delivery)
Receiving side
- LastByteRead: data that has been received and that the application has already read from the TCP buffer
- NextByteExpected: the next byte that is expected and has not been received yet
- LastByteRcvd: data that has been received and sits in the receiver's TCP buffer
- LastByteRead < NextByteExpected: bytes that are still expected cannot be read, because they have not yet reached the receiver
- NextByteExpected ≤ LastByteRcvd + 1
  - If data is received in order, the next expected byte is the byte right after the last byte received, so NextByteExpected = LastByteRcvd + 1
  - If data has arrived out of order, NextByteExpected points to the first gap in the data
- Buffer the bytes between LastByteRead and LastByteRcvd
28
Flow Control
- Keep the sender from overrunning the receiver
- MaxSendBuffer and MaxRcvBuffer are the buffer sizes of the sender and the receiver
- The window is the amount of data that can be sent without waiting for an ACK
- The receiver advertises a window ≤ MaxRcvBuffer to the sender
- To avoid overflowing the receive buffer, TCP on the receiver side must keep
    LastByteRcvd - LastByteRead ≤ MaxRcvBuffer                                   (1)
- AdvertisedWindow of the receiver: the amount of free space in the receive buffer
    AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
- If the rate of reading equals the rate of receiving, AdvertisedWindow stays at MaxRcvBuffer
- If reading is slower, LastByteRcvd advances faster than LastByteRead and AdvertisedWindow shrinks, eventually to 0
Why AdvertisedWindow ≥ 0:
    From (1): MaxRcvBuffer - (LastByteRcvd - LastByteRead) ≥ 0
    NextByteExpected - 1 ≤ LastByteRcvd
    Replacing LastByteRcvd by NextByteExpected - 1 in (1) subtracts a number that is no larger, so the left-hand side can only grow:
    AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead) ≥ 0
29
Flow Control
- The sender must adhere to the receiver’s advertised window
    LastByteSent - LastByteAcked ≤ AdvertisedWindow
- The sender computes an effective window that limits how much data it can send to the receiver
    EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
- The local application process must not overflow the send buffer
    LastByteWritten - LastByteAcked ≤ MaxSendBuffer
- If the sending process tries to write y bytes to TCP, but
    (LastByteWritten - LastByteAcked) + y > MaxSendBuffer
  then TCP blocks the sending process and does not allow it to generate more data
- If no data has arrived out of order, NextByteExpected = LastByteRcvd + 1
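The window formulas on the flow-control slides can be checked with a few lines of C. This is a sketch under simplifying assumptions: the struct and function names are hypothetical, byte counters are plain 32-bit integers, and sequence-number wraparound is ignored. The effective window is clamped at 0, which the slide's formula leaves implicit.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t last_byte_read;
    uint32_t next_byte_expected;
    uint32_t last_byte_rcvd;
    uint32_t max_rcv_buffer;
} tcp_receiver;

typedef struct {
    uint32_t last_byte_acked;
    uint32_t last_byte_sent;
    uint32_t last_byte_written;
    uint32_t max_send_buffer;
} tcp_sender;

/* AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead) */
uint32_t advertised_window(const tcp_receiver *r)
{
    assert(r->last_byte_read < r->next_byte_expected);
    assert(r->next_byte_expected <= r->last_byte_rcvd + 1);
    uint32_t buffered = (r->next_byte_expected - 1) - r->last_byte_read;
    assert(buffered <= r->max_rcv_buffer);
    return r->max_rcv_buffer - buffered;
}

/* EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked), never below 0 */
uint32_t effective_window(const tcp_sender *s, uint32_t advertised)
{
    assert(s->last_byte_acked <= s->last_byte_sent);
    uint32_t in_flight = s->last_byte_sent - s->last_byte_acked;
    return in_flight >= advertised ? 0 : advertised - in_flight;
}

/* Nonzero if writing y more bytes would overflow the send buffer,
 * i.e. (LastByteWritten - LastByteAcked) + y > MaxSendBuffer. */
int write_would_block(const tcp_sender *s, uint32_t y)
{
    assert(s->last_byte_acked <= s->last_byte_written);
    return (s->last_byte_written - s->last_byte_acked) + y > s->max_send_buffer;
}
```

For example, with a 4096-byte receive buffer, LastByteRead = 1000, and NextByteExpected = 2001, 1000 bytes are waiting to be read, so the receiver advertises 3096 bytes.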
30
Flow Control
[Figure: the sending application writes y bytes into the TCP send buffer (LastByteAcked, LastByteSent, LastByteWritten); the receiving application reads from the TCP receive buffer (LastByteRead, NextByteExpected, LastByteRcvd).]
Receiver
- LastByteRcvd - LastByteRead ≤ MaxRcvBuffer
- AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
- The receiver sends an ACK carrying the advertised window in response to each arriving data segment, as long as all the preceding bytes have also arrived, until the advertised window reaches 0 (an ACK is still returned the first time it reaches 0)
Sender
- LastByteSent - LastByteAcked ≤ AdvertisedWindow
- EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
- LastByteWritten - LastByteAcked ≤ MaxSendBuffer
- Block the sending process if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer
31
Flow Control with a Slower Receiver
[Figure: same buffers as on the previous slide; the sending application still tries to write y bytes while the receiving application reads slowly.]
Receiver
- LastByteRcvd - LastByteRead ≤ MaxRcvBuffer
- AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
- Because reading is slow, LastByteRcvd increases until the receive buffer (MaxRcvBuffer) is full; the receiver then cannot accept any more data and LastByteRcvd stops increasing
- NextByteExpected: with no more data sent and no more ACKs, it stays at the same value
Sender
- LastByteSent - LastByteAcked ≤ AdvertisedWindow, which drops to 0
- EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
- LastByteWritten - LastByteAcked ≤ MaxSendBuffer
- The sending process is blocked, since (LastByteWritten - LastByteAcked) + y > MaxSendBuffer
32
Flow Control
- The sender won’t send any more data
- The receiver won’t send an advertised window on its own initiative
- Then how can the sender find out when the receiver can receive more data?
33
Protection Against Wrap Around
- 32-bit SequenceNum: 2^32 numbers, 0 to 2^32 - 1
- After 2^32 - 1? Start again from 0 (wrap around)
- MSL (Maximum Segment Lifetime) = 120 sec must be less than the wraparound time (the time taken to exhaust all sequence numbers)
Bandwidth              Time until wraparound
T1 (1.5 Mbps)          6.4 hours
Ethernet (10 Mbps)     57 minutes
T3 (45 Mbps)           13 minutes
FDDI (100 Mbps)        6 minutes
STS-3 (155 Mbps)       4 minutes
STS-12 (622 Mbps)      55 seconds
STS-24 (1.2 Gbps)      28 seconds
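The wraparound times in the table follow from one division: 2^32 bytes of sequence space, 8 bits per byte, divided by the link bandwidth. A small sketch, with hypothetical function names and the 120-second MSL taken from the slide:

```c
#include <assert.h>

/* Seconds needed to send 2^32 bytes at the given bandwidth (bits per second). */
double seconds_until_wraparound(double bandwidth_bps)
{
    assert(bandwidth_bps > 0.0);
    return 4294967296.0 * 8.0 / bandwidth_bps;
}

/* Nonzero if the sequence space can wrap within one MSL (120 s on the slide). */
int wraps_within_msl(double bandwidth_bps)
{
    const double msl_seconds = 120.0;
    return seconds_until_wraparound(bandwidth_bps) < msl_seconds;
}
```

At 1.5 Mbps this gives about 22,900 seconds (roughly 6.4 hours), while at 622 Mbps it gives about 55 seconds, which is already shorter than the MSL.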
34
Keeping the Pipe Full
- To utilize the full bandwidth, the sender must be able to transmit RTT * Bandwidth worth of data
- The sender's transmission is restricted by the receiver's advertised window
- The advertised window should therefore be large enough to accommodate RTT * Bandwidth worth of data
- But the 16-bit AdvertisedWindow field allows only 64 KB (2^16 bytes)
- Every byte is given a sequence number
Bandwidth              RTT (100 msec) x Bandwidth product
T1 (1.5 Mbps)          18 KB
Ethernet (10 Mbps)     122 KB
T3 (45 Mbps)           549 KB
FDDI (100 Mbps)        1.2 MB
STS-3 (155 Mbps)       1.8 MB
STS-12 (622 Mbps)      7.4 MB
STS-24 (1.2 Gbps)      14.8 MB
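The products in the table can be reproduced the same way. The sketch below uses hypothetical function names and treats the 16-bit AdvertisedWindow limit as 65,535 bytes, the largest value the field can hold.

```c
#include <assert.h>

/* Bytes in flight needed to keep the pipe full: bandwidth (bits/s) x RTT (s) / 8. */
double delay_bandwidth_bytes(double bandwidth_bps, double rtt_seconds)
{
    assert(bandwidth_bps > 0.0 && rtt_seconds > 0.0);
    return bandwidth_bps * rtt_seconds / 8.0;
}

/* Nonzero if a 16-bit advertised window can cover that many bytes. */
int fits_in_16bit_window(double bytes)
{
    return bytes <= 65535.0;
}
```

With a 100 ms RTT, a T1 link needs 18,750 bytes (about 18 KB), which fits, whereas 10 Mbps Ethernet needs 125,000 bytes (about 122 KB), which does not.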
35
Segment Transmission
A segment is transmitted:
- When the data to send reaches the maximum segment size (MSS), which is usually derived from the maximum transmission unit (MTU) of the directly connected network minus the TCP and IP headers
- When TCP receives a push operation, which flushes the unsent data: data is pushed as it is written instead of waiting for a segment to fill (peek with tcpdump in programming assignment 3)
- When a timer fires
36
Silly Window Syndrome
If you think of a TCP stream as a conveyor belt with “full” containers (data segments) going in one direction and empty containers (ACKs) going in the reverse direction, then MSS-sized segments correspond to large containers and 1-byte segments correspond to very small containers. If the sender aggressively fills an empty container as soon as it arrives, then any small container introduced into the system remains in the system indefinitely. That is, it is immediately filled and emptied at each end, and never coalesced with adjacent containers to create larger containers.
37
Silly Window Syndrome
38
Silly Window Syndrome
[Figure: a sender and a receiver exchange segments (MSS 1, MSS 2, and a small segment) and advertised windows.]
- If the sender aggressively takes advantage of any available window and the receiver empties every window regardless of its size, small windows never disappear
- The problem occurs only when either the sender transmits a small segment or the receiver opens the window by a small amount
- The receiver can delay ACKs to open a larger window. But how long should it wait?
- The sender should make the decision: Nagle’s Algorithm (programming assignment 3)
39
Nagle’s Algorithm
- If there is data to send but the window is open less than MSS, then we may want to wait some amount of time before sending the available data
- But how long?
  - If we wait too long, then we hurt interactive applications like Telnet
  - If we don’t wait long enough, then we risk sending a bunch of tiny packets and falling into the silly window syndrome
- The solution is to introduce a timer and to transmit when the timer expires
40
Nagle’s Algorithm
- We could use a clock-based timer, for example one that fires every 100 ms
- Nagle introduced an elegant self-clocking solution
- Key idea
  - As long as TCP has any data in flight, the sender will eventually receive an ACK
  - This ACK can be treated like a timer firing, triggering the transmission of more data
41
Nagle’s Algorithm
When the application produces data to send:
    if both the available data and the window ≥ MSS
        send a full segment
    else if there is unACKed data at the sender     // ACK not yet received for previously sent data
        buffer the new data until an ACK arrives
    else                                            // no unACKed data
        send all the new data now
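The decision rule above maps directly onto a small C function. This is a sketch of the rule only, with hypothetical names (`nagle_action`, `nagle_decide`); a real TCP would also track the buffered bytes and re-run the decision whenever an ACK arrives.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SEND_FULL_SEGMENT, BUFFER_UNTIL_ACK, SEND_ALL_NOW } nagle_action;

/* Decide what to do when the application produces data to send.
 * available: bytes waiting to be sent; window: bytes the window allows;
 * mss: maximum segment size; unacked: bytes sent but not yet acknowledged. */
nagle_action nagle_decide(size_t available, size_t window, size_t mss, size_t unacked)
{
    assert(mss > 0);
    if (available >= mss && window >= mss)
        return SEND_FULL_SEGMENT;
    if (unacked > 0)
        return BUFFER_UNTIL_ACK;    /* the pending ACK will act as the timer */
    return SEND_ALL_NOW;            /* nothing in flight: no ACK would ever arrive */
}
```

The third branch is what keeps interactive traffic responsive: a single keystroke on an idle connection goes out at once, and only later keystrokes wait for the ACK.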
42
Nagle’s Algorithm
- The ACK works as a timer that fires a new segment transmission
- The algorithm intentionally delays packets
- Time-sensitive or real-time applications cannot afford such a delay
- TCP_NODELAY option in the socket interface: transmit data as soon as possible
    setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &intFlag, sizeof(intFlag))
  (the slide writes the level as SOL_TCP, a Linux synonym for IPPROTO_TCP)
43
Adaptive Retransmission
- TCP retransmits a segment if an ACK is not received within a timeout
- The timeout is determined from the RTT
- The RTT differs between different pairs of hosts in the Internet
- How to choose the timeout? Adaptive retransmission
44
Adaptive Retransmission: Original Algorithm
Keep a running average of the RTT
- Measure SampleRTT for each segment/ACK pair
  - Record the time when the segment is sent
  - Record the time when the ACK is received
  - Take the difference
- Compute a weighted average of the RTT
    EstRTT = a x EstRTT + b x SampleRTT
  where a + b = 1, a is between 0.8 and 0.9, and b is between 0.1 and 0.2
- Set the timeout based on EstRTT
    TimeOut = 2 x EstRTT
- Why double? EstRTT cannot respond quickly to a SampleRTT that deviates from the average
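The original estimator fits in a few lines. A minimal sketch, assuming times in milliseconds and b = 1 - a as on the slide; the struct and function names are hypothetical.

```c
#include <assert.h>

typedef struct {
    double est_rtt;     /* running average of the RTT, in ms */
} rtt_estimator;

/* Fold one SampleRTT into the average and return the new timeout.
 * a is the weight of the old estimate (0.8 to 0.9 on the slide). */
double rtt_update(rtt_estimator *e, double sample_rtt, double a)
{
    assert(a > 0.0 && a < 1.0);
    e->est_rtt = a * e->est_rtt + (1.0 - a) * sample_rtt;
    return 2.0 * e->est_rtt;    /* TimeOut = 2 x EstRTT */
}
```

With EstRTT = 100 ms, a = 0.875, and a sample of 200 ms, the estimate only moves to 112.5 ms, which illustrates why the timeout is padded by a factor of two.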
45
Original Algorithm Problem
ACK does not really acknowledge a transmission It actually acknowledges the receipt of data When a segment is retransmitted and then an ACK arrives at the sender It is impossible to decide if this ACK should be associated with the first or the second transmission for calculating RTTs
46
Karn/Partridge Algorithm
[Figure: two sender/receiver timelines.
 (a) Original transmission, retransmission, then ACK: if the ACK is assumed to be for the original transmission but is actually for the retransmission, SampleRTT is too large.
 (b) Original transmission, ACK, and retransmission: if the ACK is assumed to be for the retransmission but is actually for the original transmission, SampleRTT is too small.]
47
Karn/Partridge Algorithm
- Do not sample the RTT when retransmitting
  - TCP cannot figure out which transmission the latest ACK corresponds to
- Whenever TCP retransmits, set the next timeout to double the previous value (similar to exponential backoff)
  - Congestion is the likely cause of the retransmission
  - Do not react aggressively; be more cautious as more timeouts happen
  - Retransmit segments modestly
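The two rules above can be sketched as two event handlers. The names are hypothetical, and the estimator used for unambiguous samples is the original weighted average with TimeOut = 2 x EstRTT, an assumption carried over from the earlier slide.

```c
#include <assert.h>

typedef struct {
    double est_rtt;     /* running average of the RTT, in ms */
    double timeout;     /* current retransmission timeout, in ms */
} kp_state;

/* Called when an ACK arrives. was_retransmitted is nonzero if the
 * acknowledged segment was sent more than once. */
void kp_on_ack(kp_state *s, double sample_rtt, int was_retransmitted, double a)
{
    assert(a > 0.0 && a < 1.0);
    if (was_retransmitted)
        return;                 /* ambiguous sample: do not update the estimate */
    s->est_rtt = a * s->est_rtt + (1.0 - a) * sample_rtt;
    s->timeout = 2.0 * s->est_rtt;
}

/* Called when the retransmission timer expires. */
void kp_on_timeout(kp_state *s)
{
    s->timeout *= 2.0;          /* exponential backoff */
}
```

Two consecutive timeouts starting from 200 ms therefore back off to 400 ms and then 800 ms, and the estimate is refreshed only by the next ACK of a segment that was sent exactly once.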
48
Karn/Partridge Algorithm
The Karn/Partridge algorithm was an improvement over the original approach, but it does not eliminate congestion. We need to understand how the timeout is related to congestion: if you time out too soon, you may unnecessarily retransmit a segment, which adds load to the network.
49
Karn/Partridge Algorithm
The main problem with the original computation is that it does not take the variance of the SampleRTTs into consideration. If the variance among the SampleRTTs is small, then the estimated RTT can be better trusted, and there is no need to multiply it by 2 to compute the timeout.
50
Karn/Partridge Algorithm
On the other hand, a large variance in the samples suggests that the timeout value should not be tightly coupled to the estimated RTT. Jacobson and Karels proposed a new scheme for TCP retransmission.
51
Jacobson/Karels Algorithm
Original algorithm
    EstRTT = a x EstRTT + b x SampleRTT     (a between 0.8 and 0.9, b between 0.1 and 0.2)
    TimeOut = 2 x EstRTT
New algorithm, which takes into consideration whether the variation among the SampleRTTs is large
    Diff = SampleRTT - EstRTT
    EstRTT = EstRTT + d x Diff
    Dev = Dev + d x (|Diff| - Dev)
    TimeOut = m x EstRTT + f x Dev
  where d is a fraction between 0 and 1; typically m = 1 and f = 4
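A floating-point sketch of the new update rule, with hypothetical names and times in milliseconds. It assumes m = 1 and f = 4, which matches TimeOut = EstRTT + 4 Dev in the supplementary slide, and it computes the deviation from the difference taken before EstRTT is updated.

```c
#include <assert.h>

typedef struct {
    double est_rtt;     /* estimated RTT, in ms */
    double dev;         /* estimated mean deviation of the RTT, in ms */
} jk_state;

/* Fold one SampleRTT into the estimates and return the new timeout.
 * d is the gain (1/8 in the integer version of the algorithm). */
double jk_update(jk_state *s, double sample_rtt, double d)
{
    assert(d > 0.0 && d <= 1.0);
    double diff = sample_rtt - s->est_rtt;
    double abs_diff = diff < 0.0 ? -diff : diff;
    s->est_rtt += d * diff;
    s->dev += d * (abs_diff - s->dev);
    return 1.0 * s->est_rtt + 4.0 * s->dev;     /* m = 1, f = 4 */
}
```

Starting from EstRTT = 100 ms and Dev = 10 ms, a 180 ms sample with d = 1/8 moves EstRTT to 110 ms and Dev to 18.75 ms, giving a timeout of 185 ms. The deviation term, not a fixed factor of two, now provides the safety margin.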
52
TCP Extensions
- RTT measurement
  - Store a 32-bit timestamp in a header option of each outgoing segment
  - Receive an ACK that echoes the original timestamp
  - Sample the RTT by subtracting the timestamp from the current time
- Resolving the quick wraparound of the sequence number
  - The 32-bit timestamp and the 32-bit sequence number together give a 64-bit sequence space
  - The timestamp differentiates two different incarnations of the same sequence number
- Extending the advertised window
  - Scale up the advertised window
  - The window field then counts larger units instead of bytes: "how many bytes to send" becomes, for example, "how many 16-byte units to send"
53
Reviews
- UDP
- TCP: three-way handshake and state transition
- Sliding window and flow control
- Segment transmission: silly window syndrome and Nagle’s algorithm
- Adaptive retransmission: original, Karn/Partridge, and Jacobson/Karels
Exercises in Chapter 5
- Ex. 5, 14, 22, and 39 (TCP state transition)
- Ex. 9(a) (Sliding window)
- Ex. 20 (Nagle’s algorithm)
54
Supplementary CSS432: Internetworking
55
Jacobson/Karels Algorithm
Integer implementation, with EstimatedRTT = 8 x EstRTT and Deviation = 8 x Dev kept in scaled form:
    SampleRTT -= (EstimatedRTT >> 3);     // SampleRTT - 8 EstRTT / 8 = SampleRTT - EstRTT = Diff
    EstimatedRTT += SampleRTT;            // 8 EstRTT = 8 EstRTT + Diff, i.e. EstRTT = EstRTT + 1/8 (SampleRTT - EstRTT)
    if (SampleRTT < 0)
        SampleRTT = -SampleRTT;           // |Diff|
    SampleRTT -= (Deviation >> 3);        // |Diff| - 8 Dev / 8 = |Diff| - Dev
    Deviation += SampleRTT;               // 8 Dev = 8 Dev + (|Diff| - Dev), i.e. Dev = Dev + 1/8 (|SampleRTT - EstRTT| - Dev)
    TimeOut = (EstimatedRTT >> 3) + (Deviation >> 1);     // 8 EstRTT / 8 + 8 Dev / 2 = EstRTT + 4 Dev
Notes
- The calculation is scaled by 2^n with n = 3, so d = 1/2^n = 1/8; this avoids floating point
- EstimatedRTT and Deviation are kept in scaled form; SampleRTT and TimeOut are real (unscaled) values
- << n multiplies by 2^n; >> n divides by 2^n
- The algorithm does not work accurately with a coarse clock granularity (500 ms on Unix)
- An accurate timeout mechanism is important to congestion control