CSS432 End-to-End Protocols Textbook Ch5.1 – 5.2


CSS432 End-to-End Protocols Textbook Ch5.1 – 5.2 Prof. Athirai Irissappane http://courses.washington.edu/css432/athirai/ athirai@uw.edu CSS432: End-to-End Protocols

Outline
- Transport layer protocols
  - Communication between applications running on the end nodes, hence "end-to-end" protocols
  - How to convert a host-to-host packet delivery service into a process-to-process communication channel
- Simple Demultiplexer (UDP)
- Reliable Byte Stream (TCP)
CSS432: Internetworking

End-to-End Protocols
Common properties that a transport protocol can be expected to provide:
- Guarantees message delivery
- Delivers messages in the same order they were sent
- Delivers at most one copy of each message
- Supports arbitrarily large messages
- Supports synchronization between the sender and the receiver
- Allows the receiver to apply flow control to the sender
- Supports multiple application processes on each host

End-to-End Protocols
Typical limitations of the network on which a transport protocol operates:
- Drops messages
- Reorders messages
- Delivers duplicate copies of a given message
- Limits messages to some finite size
- Delivers messages after an arbitrarily long delay
Challenge for transport protocols: develop algorithms that turn these limitations into the high level of service required by the application.
[Figure: messages M1 to M5 are dropped, reordered, and duplicated in transit; the receiver requests a retransmission of M3 and M5.]

Simple Demultiplexor (UDP)
- Extends the host-to-host delivery service of the underlying network into a process-to-process communication service
- Adds a level of demultiplexing, which allows multiple application processes on each host to share the network
- Unreliable and unordered datagram service
- No flow control (the mechanism that keeps senders from overrunning the capacity of receivers): messages are discarded if the receive buffers are full
- UDP is mostly used for unidirectional communication such as broadcasting

Simple Demultiplexor (UDP)
- Each process is identified by a port number associated with the sender and the receiver
- Endpoints are identified by ports; servers have well-known ports (see the /etc/services file on Unix)
- Header format: SrcPort, DstPort, Length, Checksum, followed by Data
- Optional checksum, calculated over the pseudoheader + UDP header + data
- Pseudoheader: the protocol number, source IP address, and destination IP address from the IP header, plus the UDP length field
[Figure: UDP header, 32 bits wide (bit positions 0, 16, 31). First word: SrcPort, DstPort. Second word: the slide lists Checksum, Length; on the wire (RFC 768) Length precedes Checksum. Data follows.]
Note on ports: A TCP connection consists of two endpoints, and each endpoint consists of an IP address and a port number. Therefore, when a client connects to a server, the established connection can be thought of as the 4-tuple (server IP, server port, client IP, client port). Usually three of the four are readily known: the client machine uses its own IP address, and connecting to a remote service requires the server machine's IP address and service port number. What is not immediately evident is that the client side of an established connection also uses a port number. Unless a client program explicitly requests a specific port number, the port used is an ephemeral port. Ephemeral ports are temporary ports assigned by a machine's IP stack from a range of ports designated for this purpose. When the connection terminates, the ephemeral port is available for reuse.
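
The checksum rule on this slide (pseudoheader + UDP header + data) can be sketched in C as follows. This is a minimal illustration for IPv4 only, not code from the course: the function names csum_add, csum_finish, and udp_checksum are made up here, and the caller is assumed to pass the UDP header with its Checksum field set to zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add len bytes to a running sum as big-endian 16-bit words.
   An odd trailing byte is padded with a zero byte. */
uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len) {
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold the carries back in (ones' complement addition) and invert. */
uint16_t csum_finish(uint32_t sum) {
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)(~sum & 0xffffu);
}

/* Checksum of a UDP datagram of udp_len bytes (header with the Checksum
   field set to 0, followed by the data), sent from IPv4 address src to dst. */
uint16_t udp_checksum(const uint8_t src[4], const uint8_t dst[4],
                      const uint8_t *udp, uint16_t udp_len) {
    uint8_t pseudo[12];
    memcpy(pseudo, src, 4);                  /* source IP address */
    memcpy(pseudo + 4, dst, 4);              /* destination IP address */
    pseudo[8] = 0;                           /* zero padding */
    pseudo[9] = 17;                          /* IP protocol number of UDP */
    pseudo[10] = (uint8_t)(udp_len >> 8);    /* UDP length, high byte */
    pseudo[11] = (uint8_t)(udp_len & 0xff);  /* UDP length, low byte */

    uint32_t sum = csum_add(0, pseudo, sizeof pseudo);
    sum = csum_add(sum, udp, udp_len);
    uint16_t c = csum_finish(sum);
    return c == 0 ? 0xffffu : c;  /* 0 on the wire means "no checksum" */
}
```

Including the pseudoheader lets the receiver detect a datagram that was delivered to the wrong host or protocol, even though those fields are not part of the UDP header itself.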

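UDP: Simple Demultiplexer
[Figure: two receivers, each calling socket(), bind(), recv(), are bound to ports 13579 and 12345; a sender calls socket(), sendto(). UDP demultiplexes the incoming packets (M1 to M5) to the receivers by port number.]
Headers needed by both sides:
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>
    #include <strings.h>
Receiver (port 12345):
    int sd;
    char buf[1024];                               /* buffer size is an assumption */
    sd = socket( AF_INET, SOCK_DGRAM, 0 );        /* UDP socket */
    struct sockaddr_in server;
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY ); /* accept on any local address */
    server.sin_port = htons( 12345 );
    bind( sd, (struct sockaddr *)&server, sizeof( server ) );
    recv( sd, buf, sizeof( buf ), 0 );
Sender:
    int sd = socket( AF_INET, SOCK_DGRAM, 0 );
    char buf[1024];                               /* buffer size is an assumption */
    struct sockaddr_in server;
    struct hostent *hp;
    server.sin_family = AF_INET;
    hp = gethostbyname( hostname );               /* hostname: the receiver's host name; the argument is missing on the slide */
    bcopy( hp->h_addr, &( server.sin_addr.s_addr ), hp->h_length );
    server.sin_port = htons( 12345 );             /* receiver's port; not shown on the slide */
    sendto( sd, buf, sizeof( buf ), 0, (struct sockaddr *)&server, sizeof( server ) );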

Common end-to-end services
- Guarantee message delivery
- Deliver messages in FIFO order
- Deliver at most one copy of each message
- Support arbitrarily large messages
- Support synchronization
- Allow the receiver to flow control the sender
- Support multiple application processes on each host
[Figure: application processes P1, P2, P3, and P4 exchange messages M1 to M5 over the network.]

TCP Overview
- Connection oriented
- Reliable, in-order delivery: guarantees that all data sent reaches the destination in the correct order, using ACK packets, retransmission, and timeouts
- Full duplex: bidirectional, each end can both send and receive
- Flow control: keeps the sender from overrunning the receiver
- Congestion control: keeps the sender from overrunning the network

TCP (Reliable Byte Stream)
- Byte-oriented protocol: the sender writes bytes into a TCP connection and the receiver reads bytes out of it
- Byte stream, but TCP does not transmit individual bytes: TCP on the source host buffers enough bytes from the sending process to fill a reasonably sized packet (segment) and then sends it to the destination
- TCP on the destination host empties the contents of the segment into a receive buffer, and the receiving process reads from this buffer at its leisure
[Figure: the sending application process writes bytes into TCP's send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the receiving application process reads bytes.]

Sockets (Code Example)
Call sequence: the client calls socket(), connect(), write(); the server calls socket(), bind(), listen(), accept(), read(). The contents of buf1 and buf2 reach the server in the order they were written.
Client:
    int sd = socket( AF_INET, SOCK_STREAM, 0 );       // TCP socket
    struct hostent *host = gethostbyname( arg[0] );   // arg[0]: server host name, as on the slide
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = inet_addr( inet_ntoa( *(struct in_addr*)*host->h_addr_list ) );
    server.sin_port = htons( 12345 );
    connect( sd, (sockaddr *)&server, sizeof( server ) );
    write( sd, buf1, sizeof( buf1 ) );
    write( sd, buf2, sizeof( buf2 ) );
Server:
    int sd, newSd;
    sd = socket( AF_INET, SOCK_STREAM, 0 );
    sockaddr_in server;
    bzero( (char *)&server, sizeof( server ) );
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl( INADDR_ANY );     // accept on any local address
    server.sin_port = htons( 12345 );
    bind( sd, (sockaddr *)&server, sizeof( server ) );
    listen( sd, 5 );                                  // queue up to 5 pending connections
    sockaddr_in client;
    socklen_t len = sizeof( client );
    while ( true ) {
        newSd = accept( sd, (sockaddr *)&client, &len );
        if ( fork( ) == 0 ) {                         // child: serve this connection
            close( sd );
            read( newSd, buf1, sizeof( buf1 ) );
            read( newSd, buf2, sizeof( buf2 ) );
            close( newSd );
            exit( 0 );
        }
        close( newSd );                               // parent: go back to accept( )
    }

Data Link Versus Transport
- The data link layer transfers data between two adjacent nodes (a single point-to-point physical link), whereas the transport layer provides communication between processes running on different hosts
  => need explicit connection establishment and termination
- A single physical point-to-point link has a fixed RTT (round-trip time); TCP connections have potentially different RTTs, since they connect hosts anywhere on the Internet (in the same room or across the network)
  => need an adaptive timeout mechanism for retransmissions
- On a point-to-point link, packets are received in FIFO order; in TCP, they can be reordered as they cross the Internet (long delays in the network, retransmissions, etc.)
  => need to be prepared for the arrival of very old packets
  - Packets slightly out of order can be put back in order using the SeqNum of the sliding window protocol
  - How late can a packet be? => need to set an MSL (Maximum Segment Lifetime)

Data Link Versus Transport
- Hosts connected by a point-to-point link are engineered to support the link, and the hosts at both ends have similar resources. If the window size is bandwidth × RTT, both sender and receiver are likely to have a buffer of that size. For a TCP connection, the resources dedicated to the connection (buffer space, etc.) can vary, especially if one of the hosts supports multiple TCP connections
  => need to accommodate different node capacities (flow control)
- Network capacity is potentially different. On a directly connected point-to-point link, the bandwidth is known and the transmitter cannot send faster than it, so the link cannot be congested. In TCP, the links to be traversed are not known beforehand, and multiple sources can send traffic over the same link
  => need to be prepared for network congestion

Segment Format (TCP Header)
- Each TCP connection is identified by the 4-tuple (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
- Sliding window + flow control: Acknowledgment, SequenceNum, AdvertisedWindow
- Flags: SYN, FIN, RESET, PUSH, URG, ACK
  - SYN (Synchronize): establishing a connection
  - FIN (Finish): terminating a connection
  - RESET: the receiver is confused and aborts the connection
  - PUSH: see Section 5.2.7 of the textbook
  - URG: sending urgent data
  - ACK: the Acknowledgment field is valid
- SequenceNum advances for data bytes, SYN, and FIN; a pure ACK does not consume a sequence number
[Figure: the sender sends Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow.]

Segment Format (TCP Header)
- The SrcPort and DstPort fields identify the source and destination ports, respectively.
- The Acknowledgment, SequenceNum, and AdvertisedWindow fields are all involved in TCP’s sliding window algorithm.
- Because TCP is a byte-oriented protocol, each byte of data has a sequence number; the SequenceNum field contains the sequence number for the first byte of data carried in that segment.
- The Acknowledgment and AdvertisedWindow fields carry information about the flow of data going in the other direction.

Segment Format (TCP Header)
- The 6-bit Flags field is used to relay control information between TCP peers. The possible flags include SYN, FIN, RESET, PUSH, URG, and ACK.
- The SYN and FIN flags are used when establishing and terminating a TCP connection, respectively.
- The ACK flag is set any time the Acknowledgment field is valid, implying that the receiver should pay attention to it.
- The URG flag signifies that this segment contains urgent data. When this flag is set, the UrgPtr field indicates where the nonurgent data contained in this segment begins. The urgent data is contained at the front of the segment body, up to and including a value of UrgPtr bytes into the segment.
- The PUSH flag signifies that the sender invoked the push operation: the sending TCP should flush all the bytes it has collected to its peer, and the receiving TCP should notify the receiving process.
- The RESET flag signifies that the receiver has become confused (for example, because it received a segment it did not expect to receive) and so wants to abort the connection.
- Finally, the Checksum field is used in exactly the same way as for UDP: it is computed over the TCP header, the TCP data, and the pseudoheader, which is made up of the source address, destination address, and length fields from the IP header.

TCP Connection Establishment and Termination
Three-Way Handshake
- Client (active participant)
  - Initiates a connection to the server by sending a SYN segment with seq = x
  - Sets a timer and retransmits the request when the timer expires
- Server (passive participant)
  - Acknowledges the client's request with ack = x + 1
  - Initiates the reverse direction of the connection with its own starting sequence number, seq = y
- Client
  - Acknowledges the server's request with ack = y + 1 (the next sequence number expected)
- x and y are chosen at random, so that a segment from an earlier incarnation of the same connection is unlikely to interfere with a later incarnation
Segments exchanged:
  1. Client -> Server: Flags = SYN, SequenceNum = x
  2. Server -> Client: Flags = SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
  3. Client -> Server: Flags = ACK, Acknowledgment = y + 1
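
The acknowledgment arithmetic of the handshake can be sketched in a few lines of C. The helper names below are hypothetical and the sketch only checks the numbers; it does not open a connection. One detail it makes explicit is that sequence numbers are 32 bits wide, so "x + 1" wraps around modulo 2^32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Acknowledgment that answers a SYN carrying sequence number seq.
   The SYN consumes one sequence number; the sum wraps modulo 2^32. */
uint32_t ack_for_syn(uint32_t seq) {
    return seq + 1u;
}

/* Client side: does the server's SYN+ACK acknowledge our SYN (seq = x)? */
bool synack_is_valid(uint32_t x, uint32_t ack_from_server) {
    return ack_from_server == ack_for_syn(x);
}

/* Server side: does the client's final ACK acknowledge our SYN (seq = y)? */
bool final_ack_is_valid(uint32_t y, uint32_t ack_from_client) {
    return ack_from_client == ack_for_syn(y);
}
```

For example, a client that sent x = 100 accepts a SYN+ACK carrying Acknowledgment = 101 and rejects any other value.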

State Transition Diagram
[Figure: TCP state transition diagram. States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Arc labels (event/action): Passive open; Active open/SYN; Close; Send/SYN; SYN/SYN + ACK; SYN + ACK/ACK; ACK; Close/FIN; FIN/ACK; ACK + FIN/ACK; Timeout after two segment lifetimes (2 * MSL).]
- Open
  - Active open: the client calls connect( )
  - Passive open: the server calls listen( )
- Close
  - Active close: the client or the server issues the first close( ); both sides can be active
  - Passive close: close( ) in response to the peer's first close( )

State Transition Diagram
- Shows the states involved in opening and closing a TCP connection; everything in between is hidden in the ESTABLISHED state
- Each box represents the state of one end of a TCP connection
- All connections start in the CLOSED state
- As the connection progresses, it moves from state to state
- Each arc is labeled event/action: when the event happens in a given state, TCP takes the action and moves to the next state
- Two kinds of events: (1) a segment arrives from the peer, (2) the local application invokes an operation on TCP
- One or both sides can close the connection. If only one side closes, it can no longer send data but can still receive it

State Transition Diagram
Opening a connection:
- CLOSED to LISTEN: the server invokes a passive open on TCP and waits for a connection request
- CLOSED to SYN_SENT: the client invokes an active open, moves to SYN_SENT, and sends a SYN segment to the server
- LISTEN to SYN_RCVD: the server receives the SYN segment from the client, moves to SYN_RCVD, and sends SYN + ACK to the client
- SYN_SENT to ESTABLISHED: the client receives the SYN + ACK from the server, moves to ESTABLISHED, and sends an ACK back to the server
- SYN_RCVD to ESTABLISHED: the server receives the ACK from the client and moves to ESTABLISHED
[Figure: the TCP state transition diagram, repeated.]

State Transition Diagram
Three possibilities: this side closes the connection first, the other side closes first, or both sides close at the same time.
Closing a connection (here the server closes first):
- ESTABLISHED to FIN_WAIT_1: the server sends the termination request FIN and waits for an ACK from the client
- ESTABLISHED to CLOSE_WAIT: the client receives the FIN, sends an ACK to the server, and waits for its own application to close
- CLOSE_WAIT to LAST_ACK: the client sends its own FIN to the server and waits for the ACK
- FIN_WAIT_1 to FIN_WAIT_2: the server receives the ACK from the client and waits for the client's FIN
- FIN_WAIT_2 to TIME_WAIT: the server receives the FIN from the client, sends an ACK, and waits long enough for the client to receive that ACK
- LAST_ACK to CLOSED: the client receives the ACK from the server and moves to CLOSED
- TIME_WAIT to CLOSED: the server waits 2 * MSL and moves to CLOSED
MSL (Maximum Segment Lifetime): the maximum amount of time a segment can exist in the Internet (e.g. 120 s); a segment older than this is discarded even if it arrives at the destination.
[Figure: the TCP state transition diagram, repeated.]

State Transition Diagram
Closing a connection (both sides close at the same time):
- ESTABLISHED to FIN_WAIT_1: both the server and the client send the termination request FIN
- FIN_WAIT_1 to CLOSING: each side receives the other's FIN, sends an ACK, and waits for the ACK of its own FIN
- CLOSING to TIME_WAIT: each side receives the ACK for the FIN it sent and waits long enough for the peer to receive the ACK it sent
- TIME_WAIT to CLOSED: each side waits 2 * MSL and moves to CLOSED
- FIN_WAIT_1 to TIME_WAIT: a side receives, in a single segment, both the peer's FIN and the ACK for the FIN it sent
[Figure: the TCP state transition diagram, repeated.]
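
The transitions walked through on the preceding slides can be collected into one lookup function. The sketch below is an illustration only: the event names (EV_...) and the function tcp_next_state are invented for it, the actions (segments sent) appear only as comments, and it covers just the arcs described in the text. Any other event leaves the state unchanged.

```c
typedef enum {
    CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED,
    FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK
} tcp_state;

typedef enum {
    EV_PASSIVE_OPEN, EV_ACTIVE_OPEN, EV_CLOSE,    /* local operations   */
    EV_RCV_SYN, EV_RCV_SYN_ACK, EV_RCV_ACK,       /* segments from peer */
    EV_RCV_FIN, EV_RCV_FIN_ACK,
    EV_TIMEOUT_2MSL                               /* TIME_WAIT timer    */
} tcp_event;

/* Return the state that follows state s when event e occurs. */
tcp_state tcp_next_state(tcp_state s, tcp_event e) {
    switch (s) {
    case CLOSED:
        if (e == EV_PASSIVE_OPEN) return LISTEN;
        if (e == EV_ACTIVE_OPEN)  return SYN_SENT;     /* send SYN */
        break;
    case LISTEN:
        if (e == EV_RCV_SYN)      return SYN_RCVD;     /* send SYN + ACK */
        break;
    case SYN_SENT:
        if (e == EV_RCV_SYN_ACK)  return ESTABLISHED;  /* send ACK */
        break;
    case SYN_RCVD:
        if (e == EV_RCV_ACK)      return ESTABLISHED;
        break;
    case ESTABLISHED:
        if (e == EV_CLOSE)        return FIN_WAIT_1;   /* send FIN */
        if (e == EV_RCV_FIN)      return CLOSE_WAIT;   /* send ACK */
        break;
    case FIN_WAIT_1:
        if (e == EV_RCV_ACK)      return FIN_WAIT_2;
        if (e == EV_RCV_FIN)      return CLOSING;      /* send ACK */
        if (e == EV_RCV_FIN_ACK)  return TIME_WAIT;    /* send ACK */
        break;
    case FIN_WAIT_2:
        if (e == EV_RCV_FIN)      return TIME_WAIT;    /* send ACK */
        break;
    case CLOSING:
        if (e == EV_RCV_ACK)      return TIME_WAIT;
        break;
    case CLOSE_WAIT:
        if (e == EV_CLOSE)        return LAST_ACK;     /* send FIN */
        break;
    case LAST_ACK:
        if (e == EV_RCV_ACK)      return CLOSED;
        break;
    case TIME_WAIT:
        if (e == EV_TIMEOUT_2MSL) return CLOSED;
        break;
    }
    return s;  /* event not handled in this sketch: stay in the same state */
}
```

Feeding the function the events of an active close (EV_CLOSE, EV_RCV_ACK, EV_RCV_FIN, EV_TIMEOUT_2MSL) starting from ESTABLISHED visits FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, and CLOSED in turn.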

State Transition Diagram
Q: What is the purpose of the TIME_WAIT state?
- TCP is given a chance to resend the final ACK.
- Scenario: the client sends a FIN and the server receives it. The ACK sent by the server can be delayed, so the client times out (up to 1 MSL) and resends the FIN, which can also be delayed in the network (up to 1 MSL).
- The side that sent the final ACK cannot know whether the ACK reached its peer, so it waits 2 * MSL (timeout for the ACK + lifetime of the retransmitted FIN).
- Without TIME_WAIT, if another connection is opened between the same IP addresses and ports, the delayed retransmitted FIN could reach the new connection and terminate it unintentionally.
Q: In what condition can the state change from FIN_WAIT_1 to TIME_WAIT?
- Host A has sent a FIN segment to host B and has moved from ESTABLISHED to FIN_WAIT_1. Host A then receives a segment from B that contains both the ACK of this FIN and B’s own FIN.
- This could happen if the application on host B closed its end of the connection immediately when host A’s FIN segment arrived, and was thus able to send its own FIN along with the ACK.
- Normally, because the host B application must be scheduled to run before it can close the connection and thus have the FIN sent, the ACK is sent before the FIN. While “delayed ACKs” are a standard part of TCP, traditionally only ACKs of DATA, not FIN, are delayed. See RFC 813 for further details.

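Timing Chart
Establishment
  1. Server calls listen( ): LISTEN
  2. Client calls connect( ): sends SYN seq=x, enters SYN_SENT
  3. Server receives the SYN: sends SYN seq=y, ACK=x + 1, enters SYN_RCVD
  4. Client receives the SYN + ACK: sends ACK=y + 1, enters ESTABLISHED; the server enters ESTABLISHED when this ACK arrives
Data transfer
  5. Client calls write( ): sends seq=x+1, ACK=y + 1
  6. Server calls read( ): replies ACK x + 2 (one byte of data in this example)
Termination
  7. Client calls close( ): sends FIN seq=x+2, ACK=y + 1, enters FIN_WAIT_1
  8. Server replies ACK x + 3 and enters CLOSE_WAIT; the client enters FIN_WAIT_2
  9. Server calls close( ): sends FIN seq = y + 1, enters LAST_ACK
  10. Client replies ACK=y + 2 and enters TIME_WAIT
Peek at such a flow with tcpdump in assignment 3.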

Sliding Window Revisited
TCP uses a variant of the sliding window algorithm, which serves several purposes: (1) it guarantees the reliable delivery of data, (2) it ensures that data is delivered in order, and (3) it enforces flow control between the sender and the receiver.

Sliding Window (Reliable & Ordered Delivery)
[Figure: the sending application writes into TCP's send buffer (pointers LastByteAcked, LastByteSent, LastByteWritten); the receiving application reads from TCP's receive buffer (pointers LastByteRead, NextByteExpected, LastByteRcvd).]
Sending side
- LastByteAcked ≤ LastByteSent: the receiver cannot acknowledge a byte that has not been sent
- LastByteSent ≤ LastByteWritten: TCP cannot send a byte that has not been written to the send buffer
- Buffer the bytes between LastByteAcked and LastByteWritten
Receiving side
- LastByteRead < NextByteExpected: a byte cannot be read by the application until it and all preceding bytes have been received
- NextByteExpected ≤ LastByteRcvd + 1: when data arrive out of order, NextByteExpected points to the start of the first gap
- Buffer the bytes between LastByteRead and LastByteRcvd
Pointers
- LastByteWritten: last byte written into the send buffer by the application (possibly not yet transmitted)
- LastByteRcvd: last byte received from the sender
- LastByteRead: last byte read by the receiving application

Sliding Window (Reliable & Ordered Delivery)
Receiving side
- LastByteRead: the last byte that has been received and that the application has already read from the TCP buffer
- NextByteExpected: the first byte that has not yet been received, i.e. the next byte expected
- LastByteRcvd: the last byte that has been received and placed in the receiver's TCP buffer
- LastByteRead < NextByteExpected: bytes that are still expected cannot be read, as they have not yet reached the receiver
- If data is received in order, the next expected byte is the one right after the last byte received, so NextByteExpected = LastByteRcvd + 1
- If data has arrived out of order, NextByteExpected points to the first gap in the data, so in general NextByteExpected ≤ LastByteRcvd + 1
- Buffer the bytes between LastByteRead and LastByteRcvd

Flow Control
- Keep the sender from overrunning the receiver
- Buffer sizes: MaxSendBuffer for the sender, MaxRcvBuffer for the receiver
- The window is the amount of data that can be sent without waiting for an ACK
- The receiver advertises to the sender a window ≤ MaxRcvBuffer
- To avoid overflowing the receive buffer, TCP on the receiver side must keep
    LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
- AdvertisedWindow of the receiver = amount of free space in the receive buffer
    AdvertisedWindow = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead)
- If the rate of reading equals the rate of receiving, AdvertisedWindow = MaxRcvBuffer
- If reading is slower, LastByteRcvd increases and AdvertisedWindow shrinks, eventually to 0
Why AdvertisedWindow ≥ 0:
    LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
    MaxRcvBuffer − (LastByteRcvd − LastByteRead) ≥ 0        (1)
    NextByteExpected − 1 ≤ LastByteRcvd
  Replacing LastByteRcvd by NextByteExpected − 1 in (1) subtracts a number that is no larger, so the left-hand side can only grow and the inequality still holds:
    AdvertisedWindow = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead) ≥ 0
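
As a worked example of the receiver-side formulas, the following C sketch computes AdvertisedWindow from the three receive-buffer pointers. The struct and function names are illustrative, the pointers are treated as plain byte offsets, and sequence-number wrap-around is deliberately ignored.

```c
#include <stdint.h>

/* Receiver-side bookkeeping, using the pointer names from the slides. */
typedef struct {
    uint32_t max_rcv_buffer;      /* MaxRcvBuffer     */
    uint32_t last_byte_read;      /* LastByteRead     */
    uint32_t next_byte_expected;  /* NextByteExpected */
    uint32_t last_byte_rcvd;      /* LastByteRcvd     */
} rcv_window;

/* AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead) */
uint32_t advertised_window(const rcv_window *w) {
    uint32_t buffered = (w->next_byte_expected - 1u) - w->last_byte_read;
    return w->max_rcv_buffer - buffered;
}

/* Invariant the receiver must keep: LastByteRcvd - LastByteRead <= MaxRcvBuffer */
int rcv_buffer_ok(const rcv_window *w) {
    return w->last_byte_rcvd - w->last_byte_read <= w->max_rcv_buffer;
}
```

With a 4096-byte buffer, LastByteRead = 1000, and NextByteExpected = 2001, the receiver holds 1000 unread bytes and advertises 3096. Once the application catches up (LastByteRead = 2000), the window is back to the full 4096.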

Flow Control
- The sender must adhere to the receiver’s advertised window
    LastByteSent − LastByteAcked ≤ AdvertisedWindow
- The sender computes an effective window that limits how much data it can send to the receiver
    EffectiveWindow = AdvertisedWindow − (LastByteSent − LastByteAcked)
- The local application process must not overflow the send buffer
    LastByteWritten − LastByteAcked ≤ MaxSendBuffer
- If the sending process tries to write y bytes to TCP, but
    (LastByteWritten − LastByteAcked) + y > MaxSendBuffer
  then TCP blocks the sending process and does not allow it to generate more data
- If no data has arrived out of order, NextByteExpected = LastByteRcvd + 1
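
The sender-side rules can be sketched the same way. Again the names are illustrative and wrap-around is ignored. One assumption goes beyond the slide: if the advertised window is smaller than the data already in flight, the sketch reports an effective window of 0 rather than a negative number.

```c
#include <stdint.h>

/* Sender-side bookkeeping, using the pointer names from the slides. */
typedef struct {
    uint32_t max_send_buffer;    /* MaxSendBuffer   */
    uint32_t last_byte_acked;    /* LastByteAcked   */
    uint32_t last_byte_sent;     /* LastByteSent    */
    uint32_t last_byte_written;  /* LastByteWritten */
} send_window;

/* EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked),
   clamped at 0 (the clamp is an assumption of this sketch). */
uint32_t effective_window(const send_window *s, uint32_t advertised) {
    uint32_t in_flight = s->last_byte_sent - s->last_byte_acked;
    return advertised > in_flight ? advertised - in_flight : 0u;
}

/* Would writing y more bytes overflow the send buffer, so that TCP
   has to block the sending process? */
int write_would_block(const send_window *s, uint32_t y) {
    return (s->last_byte_written - s->last_byte_acked) + y > s->max_send_buffer;
}
```

For example, with 2000 bytes in flight and an advertised window of 4096, the sender may transmit another 2096 bytes before it has to wait for an ACK.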

Flow Control
[Figure: the sending application writes y bytes into TCP's send buffer (LastByteAcked, LastByteSent, LastByteWritten); the receiving application reads from TCP's receive buffer (LastByteRead, NextByteExpected, LastByteRcvd).]
- Receiver: LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
    AdvertisedWindow = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead)
- The receiver sends an ACK with an advertised window in response to arriving data segments, as long as all the preceding bytes have also arrived, until the advertised window reaches 0 (an ACK is still returned the first time it reaches 0)
- Sender: LastByteSent − LastByteAcked ≤ AdvertisedWindow
    EffectiveWindow = AdvertisedWindow − (LastByteSent − LastByteAcked)
- Sending application: LastByteWritten − LastByteAcked ≤ MaxSendBuffer; TCP blocks the sender if (LastByteWritten − LastByteAcked) + y > MaxSendBuffer

Flow Control with a Slower Receiver
[Figure: the same send and receive buffers as on the previous slide; the receiving application reads slowly while the sending application keeps writing y bytes.]
- Receiver: LastByteRcvd − LastByteRead ≤ MaxRcvBuffer
    AdvertisedWindow = MaxRcvBuffer − ((NextByteExpected − 1) − LastByteRead)
  The application reads slowly, so LastByteRcvd increases while LastByteRead lags behind. The receive buffer fills up until the receiver can accept no more data; LastByteRcvd stops increasing and AdvertisedWindow reaches 0.
- Sender: LastByteSent − LastByteAcked ≤ AdvertisedWindow (now 0)
    EffectiveWindow = AdvertisedWindow − (LastByteSent − LastByteAcked)
  Nothing more is sent, so nothing more is acknowledged, and NextByteExpected stays at the same value.
- Sending application: LastByteWritten − LastByteAcked ≤ MaxSendBuffer; TCP blocks the sender since (LastByteWritten − LastByteAcked) + y > MaxSendBuffer

Flow Control
- Once the advertised window is 0, the sender won’t send any more data.
- The receiver does not send an advertised window on its own initiative.
- How, then, can the sender find out when the receiver can receive more data?

Protection Against Wrap Around
- 32-bit SequenceNum: 2^32 numbers, 0 to 2^32 − 1
- After 2^32 − 1, numbering starts again from 0 (wrap around)
- Requirement: MSL (Maximum Segment Lifetime) = 120 sec < wrap-around time (the time taken to exhaust all sequence numbers)

    Bandwidth             Time Until Wrap Around
    T1 (1.5 Mbps)         6.4 hours
    Ethernet (10 Mbps)    57 minutes
    T3 (45 Mbps)          13 minutes
    FDDI (100 Mbps)       6 minutes
    STS-3 (155 Mbps)      4 minutes
    STS-12 (622 Mbps)     55 seconds
    STS-24 (1.2 Gbps)     28 seconds

- On the fastest links in the table, the sequence space wraps in less than the 120-second MSL
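
The table entries follow from a one-line calculation: the 2^32-byte sequence space divided by the link rate in bytes per second. The sketch below reproduces it; the function names are illustrative, and it assumes the link is kept fully busy with TCP payload.

```c
/* Seconds needed to consume the whole 32-bit sequence space
   (2^32 bytes) when sending at bandwidth_bps bits per second. */
double wraparound_seconds(double bandwidth_bps) {
    const double seq_space_bytes = 4294967296.0;  /* 2^32 */
    return seq_space_bytes * 8.0 / bandwidth_bps;
}

/* True if the sequence numbers can wrap around within one MSL,
   i.e. an old segment could be mistaken for a new one. */
int wraps_within_msl(double bandwidth_bps, double msl_seconds) {
    return wraparound_seconds(bandwidth_bps) < msl_seconds;
}
```

For a T1 link, wraparound_seconds(1.5e6) is about 22,900 seconds, which is the 6.4 hours shown in the table; for STS-12 (622e6) it is about 55 seconds, below the 120-second MSL.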

Keeping the Pipe Full
- To utilize the full bandwidth, the sender must be able to transmit RTT × Bandwidth bytes before waiting for an ACK
- The sender's transmission is restricted by the receiver's AdvertisedWindow
- The AdvertisedWindow should therefore be large enough to accommodate RTT × Bandwidth bytes of data
- But the 16-bit AdvertisedWindow field allows at most 64 KB (2^16 − 1 bytes)
- Every byte is given a sequence number

    Bandwidth             RTT (100 msec) × Bandwidth Product
    T1 (1.5 Mbps)         18 KB
    Ethernet (10 Mbps)    122 KB
    T3 (45 Mbps)          549 KB
    FDDI (100 Mbps)       1.2 MB
    STS-3 (155 Mbps)      1.8 MB
    STS-12 (622 Mbps)     7.4 MB
    STS-24 (1.2 Gbps)     14.8 MB
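
The delay × bandwidth column can be recomputed with the short sketch below, which also checks the result against the largest value a 16-bit AdvertisedWindow can carry. The function names are illustrative; the 100 ms RTT is the one used in the table.

```c
/* Bytes that must be in flight to keep a link of bandwidth_bps bits per
   second busy for one round-trip time of rtt_seconds. */
double delay_bandwidth_bytes(double bandwidth_bps, double rtt_seconds) {
    return bandwidth_bps * rtt_seconds / 8.0;
}

/* Can a 16-bit AdvertisedWindow (at most 65535 bytes) cover that amount? */
int fits_in_advertised_window(double bytes) {
    return bytes <= 65535.0;
}
```

A T1 link with a 100 ms RTT needs 18,750 bytes in flight (about 18 KB), which fits in the 16-bit window. 10 Mbps Ethernet already needs 125,000 bytes (about 122 KB), which does not.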

Segment Transmission
A segment is transmitted:
- When the data to send reaches the Maximum Segment Size (MSS), which is usually set to the Maximum Transfer Unit (MTU) of the directly connected network minus the size of the TCP and IP headers
- When TCP receives a push operation that flushes the unsent data: data is pushed as and when it is written instead of waiting for the segment to be filled (peek with tcpdump in programming assignment 3)
- When a timer fires

Silly Window Syndrome
If you think of a TCP stream as a conveyor belt with “full” containers (data segments) going in one direction and empty containers (ACKs) going in the reverse direction, then MSS-sized segments correspond to large containers and 1-byte segments correspond to very small containers. If the sender aggressively fills an empty container as soon as it arrives, then any small container introduced into the system remains in the system indefinitely. That is, it is immediately filled and emptied at each end, and never coalesced with adjacent containers to create larger containers.

Silly Window Syndrome
[Figure: the sender transmits MSS-sized segments (MSS 1, MSS 2) and then a small one; the receiver returns advertised windows.]
- If the sender aggressively takes advantage of any available window, and the receiver empties every window regardless of its size, then small windows never disappear
- The problem occurs only when either the sender transmits a small segment or the receiver opens the window by a small amount
- The receiver can delay ACKs to open a larger window, but how long should it wait?
- The sender should make the decision: Nagle’s Algorithm (programming assignment 3)

Nagle’s Algorithm
- If there is data to send but the window is open less than MSS, then we may want to wait some amount of time before sending the available data
- But how long?
  - If we wait too long, then we hurt interactive applications like Telnet
  - If we don’t wait long enough, then we risk sending a bunch of tiny packets and falling into the silly window syndrome
- The solution is to introduce a timer and to transmit when the timer expires

Nagle’s Algorithm
We could use a clock-based timer, for example one that fires every 100 ms.
Nagle introduced an elegant self-clocking solution.
Key idea: as long as TCP has any data in flight, the sender will eventually receive an ACK. This ACK can be treated like a timer firing, triggering the transmission of more data.

Nagle’s Algorithm
When the application produces data to send:
    if both the available data and the window ≥ MSS
        send a full segment
    else
        if there is unACKed data at the sender    // ACK not received for previous data sent
            buffer the new data until an ACK arrives
        else                                       // no unACKed data
            send all the new data now
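
The decision rule above can be sketched as a small C function. This is only an illustration under assumptions: the names `nagle_action` and `nagle_decide` are hypothetical and are not taken from any real TCP implementation, and buffer management and the actual transmission are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for this sketch; not from any real TCP stack. */
typedef enum {
    NAGLE_SEND_FULL_SEGMENT, /* enough data and window for one MSS */
    NAGLE_BUFFER,            /* small data, ACK outstanding: wait */
    NAGLE_SEND_NOW           /* small data, nothing in flight: send */
} nagle_action;

/* available: bytes the application has handed to TCP but not yet sent
 * window:    bytes the receiver currently allows
 * mss:       maximum segment size
 * unacked:   bytes sent but not yet acknowledged */
nagle_action nagle_decide(size_t available, size_t window,
                          size_t mss, size_t unacked)
{
    assert(mss > 0);

    if (available >= mss && window >= mss)
        return NAGLE_SEND_FULL_SEGMENT;

    if (unacked > 0)
        return NAGLE_BUFFER;   /* the next ACK acts as the timer */

    return NAGLE_SEND_NOW;
}
```

With an MSS of 1460 bytes, a 10-byte write is buffered while 500 bytes are still unacknowledged, and sent immediately once nothing is in flight.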

Nagle’s Algorithm
An ACK works as a timer that fires a new segment transmission.
The algorithm intentionally delays packets, and time-sensitive or real-time applications cannot afford such a delay.
The TCP_NODELAY option in the socket interface disables the algorithm so that data is transmitted as soon as possible:
    setsockopt(sockfd, SOL_TCP, TCP_NODELAY, &intFlag, sizeof(intFlag))
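
A slightly fuller version of that call might look like the sketch below. The helper names `set_tcp_nodelay` and `get_tcp_nodelay` are hypothetical, and the sketch uses `IPPROTO_TCP` as the option level, which is the portable POSIX spelling; `SOL_TCP` is a Linux-specific name for the same level.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */
#include <sys/socket.h>   /* socket, setsockopt, getsockopt */

/* Hypothetical helper: enable (1) or disable (0) TCP_NODELAY.
 * Returns 0 on success and -1 on error, like setsockopt itself. */
int set_tcp_nodelay(int sockfd, int enable)
{
    assert(sockfd >= 0);
    int flag = enable ? 1 : 0;
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY,
                      &flag, sizeof(flag));
}

/* Hypothetical helper: returns 1 if TCP_NODELAY is set,
 * 0 if it is not, and -1 if the query fails. */
int get_tcp_nodelay(int sockfd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);
    if (getsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) != 0)
        return -1;
    return flag != 0;
}
```

The option can be set on a TCP socket before or after it is connected; checking the return value matters because the call fails on sockets that are not TCP.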

Adaptive Retransmission
TCP retransmits a segment if an ACK is not received within a timeout.
The timeout is determined from the RTT, but the RTT differs between different pairs of hosts in the Internet.
How to choose the timeout? Adaptive retransmission.

Adaptive Retransmission
Original algorithm (keep a running average of the RTT):
Measure SampleRTT for each segment/ACK pair: record the time when the segment is sent, record the time when the ACK is received, and take the difference.
Compute a weighted average of the RTT:
    EstRTT = α × EstRTT + β × SampleRTT
    where α + β = 1, α is between 0.8 and 0.9, and β is between 0.1 and 0.2
Set the timeout based on EstRTT:
    TimeOut = 2 × EstRTT
Why double? EstRTT cannot respond quickly to a SampleRTT that deviates from it, so the factor of 2 leaves a safety margin.
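
As a rough illustration of the original algorithm, the sketch below applies one update of the weighted average and derives the timeout. The function names are hypothetical, times are in milliseconds by assumption, and the weight on the new sample is taken to be one minus the weight on the old estimate, as the slide requires.

```c
#include <assert.h>

/* Hypothetical helper: one step of the running average.
 * alpha is the weight on the previous estimate (0.8 to 0.9 on the slide);
 * the new sample gets weight 1 - alpha. */
double update_est_rtt(double est_rtt, double sample_rtt, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return alpha * est_rtt + (1.0 - alpha) * sample_rtt;
}

/* Original timeout rule: twice the estimated RTT. */
double original_timeout(double est_rtt)
{
    return 2.0 * est_rtt;
}
```

With an estimate of 100 ms, a sample of 200 ms, and a weight of 0.875, the new estimate is 112.5 ms and the timeout is 225 ms. A single sample twice as large as the estimate moves the estimate by only 12.5%.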

Original Algorithm
Problem: an ACK does not really acknowledge a transmission; it actually acknowledges the receipt of data.
When a segment is retransmitted and then an ACK arrives at the sender, it is impossible to decide whether this ACK should be associated with the first or the second transmission when calculating the RTT.

Karn/Partridge Algorithm
[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, an ACK, and the resulting SampleRTT]
Case 1: the ACK is assumed to be for the original transmission but is actually for the retransmission, so SampleRTT is too large.
Case 2: the ACK is assumed to be for the retransmission but is actually for the original transmission, so SampleRTT is too small.

Karn/Partridge Algorithm
Do not sample the RTT when retransmitting, because TCP cannot tell which transmission the latest ACK corresponds to.
Whenever TCP retransmits, set the next timeout to double the previous value (similar to exponential backoff).
Congestion is the likely cause of the retransmission, so do not react aggressively: be more cautious as more timeouts happen and retransmit segments modestly.
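
The two rules can be sketched in C as follows. The `rtt_state` structure and both function names are hypothetical, and the weights 0.875 and 0.125 are one assumed choice within the ranges given for the original algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection state for this sketch. */
typedef struct {
    double est_rtt;  /* running RTT estimate */
    double timeout;  /* current retransmission timeout */
} rtt_state;

/* Rule 2: on every retransmission, double the timeout. */
void on_retransmit(rtt_state *s)
{
    assert(s != NULL);
    s->timeout *= 2.0;
}

/* Rule 1: only sample the RTT for segments sent exactly once.
 * was_retransmitted is true if the acknowledged segment was
 * sent more than once, which makes the sample ambiguous. */
void on_ack(rtt_state *s, double sample_rtt, bool was_retransmitted)
{
    assert(s != NULL);
    if (was_retransmitted)
        return;  /* keep the estimate and the backed-off timeout */

    s->est_rtt = 0.875 * s->est_rtt + 0.125 * sample_rtt;
    s->timeout = 2.0 * s->est_rtt;
}
```

The backed-off timeout stays in force until an unambiguous sample arrives, at which point the timeout is recomputed from the updated estimate.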

Karn/Partridge Algorithm
The Karn/Partridge algorithm was an improvement over the original approach, but it does not eliminate congestion.
We need to understand how the timeout is related to congestion: if you time out too soon, you may unnecessarily retransmit a segment, which adds load to the network.

Karn/Partridge Algorithm
The main problem with the original computation is that it does not take the variance of the SampleRTTs into consideration.
If the variance among the SampleRTTs is small, then the estimated RTT can be better trusted, and there is no need to multiply it by 2 to compute the timeout.

Karn/Partridge Algorithm
On the other hand, a large variance in the samples suggests that the timeout value should not be tightly coupled to the estimated RTT.
Jacobson/Karels proposed a new scheme for TCP retransmission.

Jacobson/Karels Algorithm
Original algorithm:
    EstRTT = α × EstRTT + β × SampleRTT    (α between 0.8 and 0.9, β between 0.1 and 0.2)
    TimeOut = 2 × EstRTT
New algorithm, which takes into consideration whether the variation among the SampleRTTs is large:
    Diff = SampleRTT - EstRTT
    EstRTT = EstRTT + δ × Diff    (δ = 0.125)
    Dev = Dev + δ × (|Diff| - Dev)
    TimeOut = μ × EstRTT + φ × Dev    (μ = 1, φ = 4)
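
A floating-point sketch of the new algorithm, using the constants from the slide (0.125 for the gain, 1 and 4 for the timeout weights), might look like this. The `jk_state` structure and `jk_update` function are hypothetical names, and the difference is computed against the estimate before it is updated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for this sketch. */
typedef struct {
    double est_rtt;  /* estimated RTT */
    double dev;      /* estimated mean deviation of the RTT */
} jk_state;

/* Folds one RTT sample into the state and returns the new timeout. */
double jk_update(jk_state *s, double sample_rtt)
{
    const double delta = 0.125;  /* gain for both running averages */
    const double mu = 1.0;       /* weight of the estimate */
    const double phi = 4.0;      /* weight of the deviation */

    assert(s != NULL);

    double diff = sample_rtt - s->est_rtt;
    double abs_diff = diff < 0.0 ? -diff : diff;

    s->est_rtt += delta * diff;
    s->dev += delta * (abs_diff - s->dev);

    return mu * s->est_rtt + phi * s->dev;
}
```

Starting from an estimate of 100 ms and a deviation of 10 ms, a 140 ms sample gives an estimate of 105 ms, a deviation of 13.75 ms, and a timeout of 160 ms. When samples are steady the deviation shrinks, and the timeout tracks the estimate closely instead of being fixed at twice its value.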

TCP Extensions
RTT measurement: store a 32-bit timestamp in a header option of each outgoing segment, receive an ACK that echoes the original timestamp, and sample the RTT by subtracting that timestamp from the current time.
Resolving the quick wrap-around of sequence numbers: the 32-bit timestamp and the 32-bit sequence number together give a 64-bit sequence space, so the timestamp differentiates two different incarnations of the same sequence number.
Extending the advertised window: scale up the advertised window so that it counts larger units, for example how many 16-byte units to send rather than how many bytes to send.
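
The window-scaling idea can be illustrated with a short C sketch. The function name `scaled_window` is hypothetical. The sketch assumes the scale is expressed as a left-shift count, as in the TCP window scale option, where a shift of 4 corresponds to the 16-byte units mentioned above and the shift count is capped at 14.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interpret the 16-bit advertised window field
 * as a count of 2^shift-byte units and return the window in bytes. */
uint32_t scaled_window(uint16_t advertised, unsigned shift)
{
    assert(shift <= 14);  /* largest shift the option allows */
    return (uint32_t)advertised << shift;
}
```

Without scaling, the 16-bit field limits the window to 65,535 bytes. With 16-byte units the same field value describes a window of 1,048,560 bytes.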

Reviews
UDP
TCP: three-way handshake and state transition
Sliding window and flow control
Segment transmission: silly window syndrome and Nagle’s algorithm
Adaptive retransmission: original, Karn/Partridge, and Jacobson/Karels
Exercises in Chapter 5:
Ex. 5, 14, 22, and 39 (TCP state transition)
Ex. 9(a) (sliding window)
Ex. 20 (Nagle’s algorithm)

Supplementary CSS432: Internetworking

Jacobson/Karels Algorithm
Fixed-point implementation, where EstimatedRTT = 8 × EstRTT and Deviation = 8 × Dev:
    SampleRTT -= (EstimatedRTT >> 3);    /* SampleRTT - 8 EstRTT / 8 = SampleRTT - EstRTT = Diff */
    EstimatedRTT += SampleRTT;           /* 8 EstRTT = 8 EstRTT + (SampleRTT - EstRTT), so EstRTT = EstRTT + 1/8 (SampleRTT - EstRTT) */
    if (SampleRTT < 0)
        SampleRTT = -SampleRTT;          /* |Diff| */
    SampleRTT -= (Deviation >> 3);       /* |Diff| - 8 Dev / 8 = |SampleRTT - EstRTT| - Dev */
    Deviation += SampleRTT;              /* 8 Dev = 8 Dev + |SampleRTT - EstRTT| - Dev, so Dev = Dev + 1/8 (|SampleRTT - EstRTT| - Dev) */
    TimeOut = (EstimatedRTT >> 3) + (Deviation >> 1);    /* 8 EstRTT / 8 + 8 Dev / 2 = EstRTT + 4 Dev */
Notes:
The calculation is scaled by 2^n with δ = 1/2^n and n = 3, which avoids floating-point arithmetic.
EstimatedRTT and Deviation are kept in scaled form; SampleRTT and TimeOut are real (unscaled) values.
<< multiplies by 2^n; >> divides by 2^n.
The algorithm does not work accurately with a coarse clock granularity (500 ms on Unix).
An accurate timeout mechanism is important to congestion control.
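
The fixed-point update can be wrapped in a compilable C function for experimentation. The `jk_fixed` structure and `jk_fixed_update` name are hypothetical. The sketch assumes integer times in clock ticks and non-negative samples, so the right shifts are only ever applied to non-negative values.

```c
#include <assert.h>

/* Hypothetical state: both fields are stored scaled by 8. */
typedef struct {
    int est_rtt_x8;  /* 8 * EstRTT */
    int dev_x8;      /* 8 * Dev */
} jk_fixed;

/* Folds one unscaled RTT sample into the scaled state and
 * returns the unscaled timeout, EstRTT + 4 * Dev. */
int jk_fixed_update(jk_fixed *s, int sample_rtt)
{
    assert(s->est_rtt_x8 >= 0 && s->dev_x8 >= 0 && sample_rtt >= 0);

    int diff = sample_rtt - (s->est_rtt_x8 >> 3);  /* Sample - EstRTT */
    s->est_rtt_x8 += diff;                         /* EstRTT += diff / 8 */

    if (diff < 0)
        diff = -diff;                              /* |diff| */

    diff -= (s->dev_x8 >> 3);                      /* |diff| - Dev */
    s->dev_x8 += diff;                             /* Dev += (|diff| - Dev) / 8 */

    return (s->est_rtt_x8 >> 3) + (s->dev_x8 >> 1);
}
```

Starting from a scaled estimate of 800 (EstRTT = 100) and a scaled deviation of 80 (Dev = 10), a sample of 140 leaves the scaled values at 840 and 110 and yields a timeout of 160, which matches EstRTT + 4 Dev up to integer truncation.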