Transport Layer
Ying-Dar Lin@CIS.NCTU

- Problems with network layer services
- Transport layer services
- Addressing
- Connection establishment and release
- Flow control and buffering
- Multiplexing
- TCP and UDP
- Some performance guidelines

Problems with Network Layer Services

- Data corruption (while stored in node memory)
- Packet loss (due to node failure)
- Duplicate packets (if an ACK is lost)
- Virtual channel loss
- Out-of-sequence delivery
- Congestion

Note: Some of these problems also exist at the link level. Here they are end-to-end and harder to resolve because of the large delay.

Transport Layer Services

- Error, loss, and duplicate detection and recovery (by software checksum and message ID)
- Sequencing and flow control (by sliding window protocol)
- Connection establishment and release (by three-way handshake)
- Multiplexing (upward or downward)
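
As a minimal sketch of "software checksum and message ID", the following computes the 16-bit one's-complement Internet checksum and uses a set of message IDs to drop duplicates. The `Receiver` class, its method names, and the return strings are hypothetical names chosen for illustration; they are not from the slides.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


class Receiver:
    """Hypothetical receiver: rejects corrupt messages, discards duplicates."""

    def __init__(self) -> None:
        self.seen_ids = set()
        self.delivered = []

    def receive(self, msg_id: int, payload: bytes, checksum: int) -> str:
        if internet_checksum(payload) != checksum:
            return "corrupt"      # sender will retransmit after a timeout
        if msg_id in self.seen_ids:
            return "duplicate"    # e.g. retransmission after a lost ACK
        self.seen_ids.add(msg_id)
        self.delivered.append(payload)
        return "delivered"
```

A retransmitted message carries the same ID as the original, so the receiver can acknowledge it again without delivering it twice.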

Primitives of Transport Services

- Socket primitives for TCP
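
To illustrate the socket primitives, here is a small sketch using Python's standard `socket` module over the loopback interface. The comments map each call to the usual Berkeley primitive names (SOCKET, BIND, LISTEN, ACCEPT, CONNECT, SEND, RECEIVE, CLOSE); the helper names `echo_once` and `demo` and the upper-casing "service" are invented for the example.

```python
import socket
import threading


def echo_once(server: socket.socket) -> None:
    """Serve a single client: read one message and send it back upper-cased."""
    conn, _addr = server.accept()        # ACCEPT: block until a client connects
    with conn:
        data = conn.recv(1024)           # RECEIVE
        conn.sendall(data.upper())       # SEND


def demo() -> bytes:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCKET
    server.bind(("127.0.0.1", 0))        # BIND: port 0 lets the OS pick a free port
    server.listen(1)                     # LISTEN
    port = server.getsockname()[1]

    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    # create_connection performs SOCKET followed by CONNECT
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")         # SEND
        reply = client.recv(1024)        # RECEIVE
    # leaving the with-block performs CLOSE on the client side

    worker.join()
    server.close()                       # CLOSE
    return reply
```

The server side uses SOCKET, BIND, LISTEN, and ACCEPT; the client side needs only SOCKET and CONNECT before both ends exchange data.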

Addressing

- TSAP (Transport Service Access Point): e.g. (IP address, local port) in the Internet
- NSAP (Network Service Access Point): e.g. IP address in the Internet

Three alternative schemes for addressing a TSAP:
1. Stable, well-known TSAP addresses for common services (server processes attach themselves to well-known ports).
2. A process server that acts as a proxy for less heavily used servers: it listens on a set of ports for connection requests, spawns a process to serve the client, then goes back to listening for new requests.
3. A name server with which new services register: it listens on a well-known TSAP and responds with the TSAP of the requested service.

Connection Establishment and Release

- Problem: delayed duplicates
- Three-way handshake: the two sides do not have to start with the same sequence number

[Figure: message sequences between host 1 and host 2.
 Establishment, normal operation: CR (seq=x); ACK (seq=y, ACK=x); DATA (seq=x, ACK=y).
 Establishment, delayed duplicate CR: CR (seq=x); ACK (seq=y, ACK=x); REJECT (ACK=y).
 Release, normal operation: DR; DR; ACK.
 Release, final ACK lost: DR; DR; ACK (lost); timeout, release connection.]
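
The way the three-way handshake defeats a delayed duplicate CR can be sketched as follows. This is a toy model under stated assumptions: the `Host1` and `Host2` classes, their method names, and the starting sequence number 1000 are hypothetical, and messages are plain tuples rather than real TPDUs.

```python
from dataclasses import dataclass, field


@dataclass
class Host1:
    """Initiator: remembers which connection requests it really sent."""
    pending: set = field(default_factory=set)

    def connect(self, x: int) -> tuple:
        self.pending.add(x)
        return ("CR", x)

    def on_ack(self, y: int, acked_x: int) -> tuple:
        if acked_x in self.pending:
            self.pending.discard(acked_x)
            return ("DATA", acked_x, y)   # third step: confirms host 2's seq y
        return ("REJECT", y)              # we never asked for this connection


@dataclass
class Host2:
    """Responder: a connection counts only after its own seq y is acknowledged."""
    next_seq: int = 1000
    half_open: dict = field(default_factory=dict)   # y -> x
    established: list = field(default_factory=list)

    def on_cr(self, x: int) -> tuple:
        y = self.next_seq
        self.next_seq += 1
        self.half_open[y] = x
        return ("ACK", y, x)

    def on_data(self, x: int, y: int) -> bool:
        if self.half_open.get(y) == x:
            del self.half_open[y]
            self.established.append((x, y))
            return True
        return False

    def on_reject(self, y: int) -> None:
        self.half_open.pop(y, None)       # abandon the bogus half-open connection
```

Because host 2 waits for host 1 to acknowledge `y`, an old CR that arrives on its own ends in a REJECT instead of an accidental connection.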

Flow Control and Buffering

- Source buffering (for low-bandwidth bursty traffic) vs. destination buffering (for high-bandwidth smooth traffic)
- Two potential bottlenecks:
    - receive buffer space
    - subnet capacity
- Dynamic sliding window flow control:
    - measure the capacity c (TPDUs/sec) and the cycle time r
    - compute the window size as c × r
    - adjust frequently to track changes in the carrying capacity and cycle time
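
The window computation can be sketched in a few lines. The function name and parameters are illustrative; taking the minimum with the receiver's free buffers reflects the two bottlenecks listed above and is an assumption about how the two limits combine.

```python
import math


def window_size(capacity_tpdus_per_sec: float,
                cycle_time_sec: float,
                receiver_buffers: int) -> int:
    """Window in TPDUs: c * r for the subnet, capped by the receiver's buffers."""
    subnet_limit = math.floor(capacity_tpdus_per_sec * cycle_time_sec)
    # Keep at least one TPDU in flight so the connection can make progress.
    return max(1, min(subnet_limit, receiver_buffers))
```

For example, with c = 200 TPDUs/sec and r = 0.25 sec the subnet can hold 50 TPDUs in flight, so a receiver offering 64 buffers is limited by the subnet, while one offering 8 buffers is limited by its own buffer space.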
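
Multiplexing

- Upward multiplexing: several transport connections (TSAPs) share one network connection (NSAP)
- Downward multiplexing: one transport connection is spread over several network connections (NSAPs)

[Figure: layers 1 to 4 with TSAPs at layer 4 and NSAPs at layer 3, lines to the router; left panel: upward multiplexing, right panel: downward multiplexing.]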

TCP (Transmission Control Protocol)

A reliable end-to-end byte stream protocol over an unreliable internetwork.

- Socket, connection, segment
- TCP header
- TCP connection management
- TCP transmission management and silly window syndrome
- TCP dynamic window congestion control
- TCP dynamic timer management
- UDP: encapsulating raw IP

TCP Socket

- Socket: IP address + port (a TSAP)
- Connection: (socket_source, socket_dest); full-duplex, point-to-point, byte stream
- Segment: its size is bounded by two limits:
    - 65,535 bytes (the maximum IP datagram size)
    - the network's MTU (maximum transfer unit)

TCP Header

Header layout (each row is 32 bits wide):
    Source port | Destination port
    Sequence number
    Acknowledgement number
    TCP header length | URG ACK PSH RST SYN FIN | Window size
    Checksum | Urgent pointer
    Options (0 or more 32-bit words)
    Data (optional)

- URG: urgent data; Urgent pointer: byte offset of the urgent data
- PSH: PUSHed data (the receiver should not buffer it)
- RST: reset a connection
- SYN, ACK: (1,0) = connection request; (1,1) = connection reply
- FIN: release a connection
- Window size: receiver window size
- Checksum: computed over the header, the data, and a pseudo header
- Options:
    - maximum TCP payload (default 536 bytes)
    - window scale factor (up to 2^16)
    - NAK for selective repeat
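
As a rough illustration of the fixed 20-byte part of the header, the sketch below packs and parses it with Python's `struct` module. The `TcpHeader` tuple and both function names are hypothetical; options are not handled, and the checksum and urgent pointer are simply left at zero when building.

```python
import struct
from typing import NamedTuple

# Bit positions of the six flags within the flags byte.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}
FIXED_HEADER = "!HHIIBBHHH"   # network byte order, 20 bytes in total


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len: int          # in bytes
    flags: frozenset
    window: int
    checksum: int
    urgent: int


def build_tcp_header(src_port, dst_port, seq, ack, flags, window) -> bytes:
    flag_byte = 0
    for name in flags:
        flag_byte |= FLAG_BITS[name]
    offset_byte = 5 << 4     # header length of 5 32-bit words, no options
    return struct.pack(FIXED_HEADER, src_port, dst_port, seq, ack,
                       offset_byte, flag_byte, window, 0, 0)


def parse_tcp_header(segment: bytes) -> TcpHeader:
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack(FIXED_HEADER, segment[:20])
    flags = frozenset(n for n, bit in FLAG_BITS.items() if flag_byte & bit)
    return TcpHeader(src, dst, seq, ack, (offset_byte >> 4) * 4,
                     flags, window, checksum, urgent)
```

A connection request is then a header whose flag set is just SYN, and the reply carries both SYN and ACK.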
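
TCP Connection Management

Connection establishment (three-way handshake):
1. host 1 → host 2: SYN (SEQ=x)
2. host 2 → host 1: SYN (SEQ=y, ACK=x+1)
3. host 1 → host 2: ACK (SEQ=x+1, ACK=y+1)

State transitions (event/action; "-" means no segment is sent):
- CLOSED (start): CONNECT/SYN → SYN SENT; LISTEN/- → LISTEN
- LISTEN: SYN/SYN+ACK → SYN RCVD; SEND/SYN → SYN SENT; CLOSE/- → CLOSED
- SYN RCVD: ACK/- → ESTABLISHED; RST/- → LISTEN; CLOSE/FIN → FIN WAIT 1
- SYN SENT: SYN+ACK/ACK → ESTABLISHED (step 3 of the three-way handshake); SYN/SYN+ACK → SYN RCVD (simultaneous open); CLOSE/- → CLOSED
- ESTABLISHED (data transfer stage): CLOSE/FIN → FIN WAIT 1 (active close); FIN/ACK → CLOSE WAIT (passive close)
- FIN WAIT 1: ACK/- → FIN WAIT 2; FIN/ACK → CLOSING; FIN+ACK/ACK → TIMED WAIT
- FIN WAIT 2: FIN/ACK → TIMED WAIT
- CLOSING: ACK/- → TIMED WAIT
- TIMED WAIT: (Timeout/) → CLOSED
- CLOSE WAIT: CLOSE/FIN → LAST ACK
- LAST ACK: ACK/- → CLOSED (go back to start)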
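
The connection management state machine can be sketched as a lookup table keyed by (state, event). This uses the standard TCP state names; the `TRANSITIONS` table layout and the `run` helper are illustrative choices, and events are plain strings rather than real segments.

```python
# (state, event) -> (segment sent in response or None, next state)
TRANSITIONS = {
    ("CLOSED", "CONNECT"): ("SYN", "SYN SENT"),
    ("CLOSED", "LISTEN"): (None, "LISTEN"),
    ("LISTEN", "SYN"): ("SYN+ACK", "SYN RCVD"),
    ("LISTEN", "SEND"): ("SYN", "SYN SENT"),
    ("LISTEN", "CLOSE"): (None, "CLOSED"),
    ("SYN RCVD", "ACK"): (None, "ESTABLISHED"),
    ("SYN RCVD", "RST"): (None, "LISTEN"),
    ("SYN RCVD", "CLOSE"): ("FIN", "FIN WAIT 1"),
    ("SYN SENT", "SYN+ACK"): ("ACK", "ESTABLISHED"),
    ("SYN SENT", "SYN"): ("SYN+ACK", "SYN RCVD"),      # simultaneous open
    ("SYN SENT", "CLOSE"): (None, "CLOSED"),
    ("ESTABLISHED", "CLOSE"): ("FIN", "FIN WAIT 1"),    # active close
    ("ESTABLISHED", "FIN"): ("ACK", "CLOSE WAIT"),      # passive close
    ("FIN WAIT 1", "ACK"): (None, "FIN WAIT 2"),
    ("FIN WAIT 1", "FIN"): ("ACK", "CLOSING"),
    ("FIN WAIT 1", "FIN+ACK"): ("ACK", "TIMED WAIT"),
    ("FIN WAIT 2", "FIN"): ("ACK", "TIMED WAIT"),
    ("CLOSING", "ACK"): (None, "TIMED WAIT"),
    ("TIMED WAIT", "TIMEOUT"): (None, "CLOSED"),
    ("CLOSE WAIT", "CLOSE"): ("FIN", "LAST ACK"),
    ("LAST ACK", "ACK"): (None, "CLOSED"),
}


def run(events, state="CLOSED"):
    """Feed a sequence of events through the table; return final state and segments sent."""
    sent = []
    for event in events:
        action, state = TRANSITIONS[(state, event)]   # KeyError means an illegal event
        if action is not None:
            sent.append(action)
    return state, sent
```

Running a client through CONNECT, SYN+ACK, CLOSE, ACK, FIN, TIMEOUT walks the active-open and active-close path and ends in CLOSED; a server follows LISTEN, SYN, ACK, FIN, CLOSE, ACK.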

TCP Transmission Management and Silly Window Syndrome

Example: a TELNET TCP connection to an interactive editor
1. Source → dest: 21-byte TCP data segment (41-byte IP datagram)
2. Dest → source: acknowledgement segment (40 bytes)
3. Dest → source: window update segment (40 bytes), after the editor reads the byte
4. Dest → source: echo segment (41 bytes), after the editor processes the byte
5. Repeat from step 1.
Result: 162 bytes in 4 segments for each character typed!

Silly window syndrome: frequent but small window updates, caused by
1. the sending application passing data to TCP one byte at a time
2. the receiving application reading data from TCP one byte at a time

Nagle's algorithm (solves 1): when data arrive at the sender one byte at a time, send only the first byte and buffer the rest until the outstanding byte is acknowledged.

Clark's algorithm (solves 2): the receiver should not send a window update until it can handle the maximum segment size it advertised when the connection was established, or until its buffer is half empty, whichever is smaller.
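
Both remedies can be sketched in a few lines. This is a simplified model following the slide's wording only: the `NagleSender` class and `clark_should_update` function are hypothetical names, segments are collected in a list instead of being transmitted, and refinements found in real TCP stacks are left out.

```python
class NagleSender:
    """Send the first small write at once; buffer the rest until it is acked."""

    def __init__(self, mss: int) -> None:
        self.mss = mss
        self.buffer = bytearray()
        self.unacked = False      # is a segment currently outstanding?
        self.sent = []            # segments handed to the network, in order

    def write(self, data: bytes) -> None:
        self.buffer += data
        self._try_send()

    def on_ack(self) -> None:
        self.unacked = False
        self._try_send()

    def _try_send(self) -> None:
        # Only one outstanding segment at a time; everything else waits.
        if self.buffer and not self.unacked:
            segment = bytes(self.buffer[:self.mss])
            del self.buffer[:self.mss]
            self.sent.append(segment)
            self.unacked = True


def clark_should_update(free_space: int, buffer_size: int, mss: int) -> bool:
    """Advertise a window only once min(MSS, half the buffer) is free."""
    return free_space >= min(mss, buffer_size // 2)
```

Typing "hello" one byte at a time produces one 1-byte segment followed, after the acknowledgement, by a single 4-byte segment instead of four more tiny ones.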

TCP Dynamic Window Congestion Control

Two limits:
- network capacity → congestion window
- receiver capacity → receiver window

Number of bytes that may be sent: min(congestion window, receiver window)

Dynamic window control:
1. Initialization: congestion window ← one maximum segment; threshold ← 64 KB
2. Slow start: exponential growth up to the threshold
   congestion window ← congestion window + one maximum segment size for each acked segment
   (each successfully acked burst doubles the congestion window)
3. Congestion avoidance: linear growth up to the receiver window
   congestion window ← congestion window + one maximum segment size for each acked burst
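
The three steps can be traced per burst with a short simulation. Function names and the per-burst granularity are illustrative assumptions. The `on_timeout` helper goes slightly beyond the list above: it applies the usual rule that a timeout halves the threshold and restarts slow start from one segment.

```python
def next_cwnd(cwnd: int, threshold: int, mss: int, receiver_window: int) -> int:
    """Congestion window after one fully acknowledged burst."""
    if cwnd < threshold:
        cwnd = min(cwnd * 2, threshold)   # slow start: double, but stop at threshold
    else:
        cwnd = cwnd + mss                 # congestion avoidance: one segment per burst
    return min(cwnd, receiver_window)


def on_timeout(cwnd: int, mss: int) -> tuple:
    """Return (new congestion window, new threshold) after a timeout."""
    return mss, max(cwnd // 2, mss)


def simulate(bursts: int, mss: int = 1024, threshold: int = 64 * 1024,
             receiver_window: int = 64 * 1024) -> list:
    """Congestion window used for each of the first `bursts` transmissions."""
    cwnd = mss                            # initialization: one maximum segment
    trace = []
    for _ in range(bursts):
        trace.append(cwnd)
        cwnd = next_cwnd(cwnd, threshold, mss, receiver_window)
    return trace
```

With sizes expressed in kilobytes, `simulate(8, mss=1, threshold=32, receiver_window=64)` grows 1, 2, 4, 8, 16, 32 and then 33, 34, switching from exponential to linear growth at the threshold.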

TCP Congestion Window

[Figure: congestion window (kilobytes, 0 to 44) versus transmission number (0 to 24), showing the threshold before and after a timeout.]

TCP Dynamic Timer Management

- Retransmission timer:
    $RTT = \alpha\,RTT + (1-\alpha)\,M$   (M: measured round-trip delay, $\alpha \approx 7/8$)
    $D = \alpha\,D + (1-\alpha)\,|RTT - M|$
    $Timeout = RTT + 4D$
- Persistence timer: prevents deadlock due to a lost window update
- Keepalive timer: for idle connections; controversial!
- Close timer: for the timed wait state while closing a connection; set to double the maximum packet lifetime (2 × 120 sec)
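
A minimal sketch of the retransmission timer arithmetic follows. The class name is hypothetical, and two details are assumptions not stated on the slide: the deviation D starts at zero, and |RTT - M| is computed with the estimate held before the new measurement M is folded in.

```python
class RttEstimator:
    """Smoothed round-trip time and deviation, with Timeout = RTT + 4 * D."""

    def __init__(self, first_sample: float, alpha: float = 7 / 8) -> None:
        self.alpha = alpha
        self.rtt = first_sample   # smoothed round-trip estimate
        self.dev = 0.0            # assumption: smoothed deviation starts at zero

    def update(self, m: float) -> float:
        """Fold in a new measurement M and return the new timeout."""
        a = self.alpha
        # Assumption: the deviation uses the RTT estimate from before this update.
        self.dev = a * self.dev + (1 - a) * abs(self.rtt - m)
        self.rtt = a * self.rtt + (1 - a) * m
        return self.timeout()

    def timeout(self) -> float:
        return self.rtt + 4 * self.dev
```

Starting from a first sample of 100 ms, a measurement of 200 ms moves RTT to 112.5 ms and D to 12.5 ms, so the timeout becomes 162.5 ms: a single outlier raises the timeout without letting it jump all the way to the outlier's value.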

Design Guidelines at Transport Layer

- CPU speed is more important than network speed
- Reduce packet count to reduce software overhead
- Minimize context switching
- Minimize copying
- You can buy more bandwidth but not lower delay
- Avoiding congestion is better than recovering from it
- Avoid timeouts
- Speed up TPDU processing