Spring 2010 CS 332: Reliable Byte-Stream (TCP). Outline: Connection Establishment/Termination, Sliding Window Revisited, Flow Control, Adaptive Timeout.

Presentation transcript:


End-to-End Protocols
Underlying best-effort network
– drops messages
– re-orders messages
– delivers duplicate copies of a given message
– limits messages to some finite size
– delivers messages after an arbitrarily long delay

End-to-End Protocols
Common end-to-end services
– guarantee message delivery
– deliver messages in the same order they are sent
– deliver at most one copy of each message
– support arbitrarily large messages
– support synchronization (between sender and receiver)
– allow the receiver to flow control the sender
– support multiple application processes on each host

Simple Demultiplexer (UDP)
Extends host-to-host service into process-to-process service
Unreliable and unordered datagram service
Adds multiplexing
No flow control
Endpoints identified by ports (why not PID?)
– servers have well-known ports (clients don’t need this)
– the well-known port is often just the starting point
– see /etc/services on Unix/Linux
– a port is implemented as a message queue

Simple Demultiplexer (UDP)
Header format
– Note 16-bit port number (so only 64K ports)
– Process really identified via (port, host) pair
Checksum (optional in IPv4, mandatory in IPv6)
– computed over pseudo header + UDP header + data
– Pseudo header: protocol number, source IP, dest IP, UDP length field. Why?
[Figure: UDP header, 32 bits wide (bit positions 0, 16, 31): SrcPort, DstPort, Length, Checksum, followed by Data]
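
The checksum coverage listed above can be sketched in Python. This is a minimal illustration for IPv4 only, not a production implementation: the function names `internet_checksum` and `udp_checksum` are hypothetical, the field order follows RFC 768 (source port, destination port, length, checksum), and IP addresses are assumed to be passed as 4 raw bytes.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp_checksum(src_ip: bytes, dst_ip: bytes,
                 src_port: int, dst_port: int, payload: bytes) -> int:
    """Checksum over pseudo header + UDP header + data (IPv4, protocol 17)."""
    length = 8 + len(payload)                # UDP header is 8 bytes
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo + header + payload)
    return csum or 0xFFFF                    # 0 on the wire means "no checksum" in IPv4
```

Because the pseudo header includes the source and destination IP addresses, a datagram delivered to the wrong host fails verification even if the UDP header and data arrived intact.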

TCP Overview
Connection-oriented
Byte-stream
– app writes bytes
– TCP sends segments
– app reads bytes
Full duplex
Flow control: keep sender from overrunning receiver
Congestion control: keep sender from overrunning network
[Figure: sending application process writes bytes into the TCP send buffer; TCP transmits segments; receiving TCP places them in the receive buffer; receiving application process reads bytes]

Flow Control vs Congestion Control
Flow Control
– Prevent sender from overloading receiver
– End-to-end issue
Congestion Control
– Prevent too much data from being injected into network
– Concerned with how hosts and network interact

Data Link Reliability (text 2.5)
Wherein we look at reliability issues on a point-to-point link!
Error correcting codes can’t handle all possible errors without introducing a lot of overhead (and adding that overhead would mean designing for the abnormal case rather than the normal one), so badly garbled frames are dropped.
We need a way to recover from these lost frames.

Acks and Timeouts
Acknowledgement (ACK)
– Small frame sent to peer indicating receipt of frame
– No data
– Piggybacking
Timeout
– If ACK not received within reasonable time, original frame is retransmitted
Automatic Repeat Request (ARQ)
– General strategy of using ACKs and timeouts to implement reliable delivery

Acknowledgements & Timeouts

Acknowledgements & Timeouts (continued)

A Subtlety…
Consider scenarios (c) and (d) in previous slide.
– Receiver receives two good frames (duplicates)
– It may deliver both to higher layer protocol (not good!)
– Solution: 1-bit sequence number in frame header

Stop-and-Wait
Problem: keeping the pipe full
Example
– 1.5 Mbps link x 45 ms RTT = 67.5 Kb (8 KB)
– 1 KB frames imply 1/8th link utilization (next slide)
[Figure: sender/receiver timeline]

Bandwidth x Delay Product
Sending a 1 KB packet in 45 ms implies sending at a rate of (1024 x 8)/0.045 = 182 Kbps, or 1/8 of the bandwidth.
Bandwidth-delay: the number of bits that fit in the pipe in a single round trip (i.e., the amount of data that could be “in transit” at any given time).
Goal: want to be able to send this much data before getting the first ACK (keeping the pipe full).
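
The arithmetic above can be checked with a short Python sketch. The function names are hypothetical; the link parameters (1.5 Mbps, 45 ms RTT, 1 KB frames) are the ones from the example.

```python
def bandwidth_delay_bits(bandwidth_bps: float, rtt_s: float) -> float:
    """Bits that fit in the pipe during one round trip."""
    return bandwidth_bps * rtt_s

def stop_and_wait_rate_bps(frame_bytes: int, rtt_s: float) -> float:
    """Effective rate when only one frame is sent per round trip."""
    return frame_bytes * 8 / rtt_s

link_bps = 1.5e6          # 1.5 Mbps link
rtt = 0.045               # 45 ms round-trip time

bdp = bandwidth_delay_bits(link_bps, rtt)       # about 67,500 bits (roughly 8 KB)
rate = stop_and_wait_rate_bps(1024, rtt)        # about 182 Kbps
utilization = rate / link_bps                   # about 0.12, i.e. roughly 1/8
```

Sending one frame per round trip leaves most of the pipe empty, which is the motivation for allowing several frames to be outstanding at once.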

Sliding Window
Allow multiple outstanding (un-ACKed) frames
Upper bound on un-ACKed frames, called window
[Figure: sender/receiver timeline with several frames in flight]

Sliding Window: Sender
Assign sequence number to each frame (SeqNum)
Maintain three state variables:
– send window size (SWS)
– last acknowledgment received (LAR)
– last frame sent (LFS)
Maintain invariant: LFS - LAR ≤ SWS
Advance LAR when ACK arrives
Buffer up to SWS frames (must be prepared to retransmit frames until they are ACKed)
[Figure: sender's frame sequence with LAR and LFS marked, at most SWS apart]

Sliding Window: Receiver
Maintain three state variables:
– receive window size (RWS) (upper bound on number of out-of-order frames)
– largest frame acceptable (LFA) (sequence number of)
– last frame received (LFR)
Maintain invariant: LFA - LFR ≤ RWS
Frame SeqNum arrives:
– if LFR < SeqNum ≤ LFA, accept
– if SeqNum ≤ LFR or SeqNum > LFA, discard
Send cumulative ACKs
[Figure: receiver's frame sequence with LFR and LFA marked, at most RWS apart]
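
The receiver rules above can be sketched as a small Python class. This is an illustration under simplifying assumptions, not the course's implementation: sequence numbers are unbounded integers (no wraparound), LFR is treated as the last frame received in order, and the class and attribute names are hypothetical.

```python
class SlidingWindowReceiver:
    def __init__(self, rws: int):
        self.rws = rws
        self.lfr = -1            # last frame received in order; -1 means none yet
        self.buffer = {}         # out-of-order frames waiting for a gap to fill
        self.delivered = []      # payloads handed to the higher layer, in order

    @property
    def lfa(self) -> int:
        """Largest frame acceptable, keeping LFA - LFR <= RWS."""
        return self.lfr + self.rws

    def receive(self, seq_num: int, payload) -> int:
        """Process one frame and return the cumulative ACK (the new LFR)."""
        if self.lfr < seq_num <= self.lfa:
            self.buffer[seq_num] = payload
            # Deliver every frame that is now contiguous with LFR.
            while self.lfr + 1 in self.buffer:
                self.lfr += 1
                self.delivered.append(self.buffer.pop(self.lfr))
        # Frames with seq_num <= LFR or seq_num > LFA are silently discarded.
        return self.lfr
```

For example, with `rws=4`, frames arriving in the order 0, 2, 1 produce ACKs 0, 0, 2: the ACK does not advance past the gap until frame 1 arrives.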

Note: when packet loss occurs, the pipe is no longer kept full! The longer it takes to notice a lost packet, the worse the condition becomes.
Possible solutions:
– Send NACKs
– Selective acknowledgements (ACK exactly those frames received, not just the highest frame received)
– Not used: too much added complexity

Sequence Number Space
SeqNum field is finite; sequence numbers wrap around
Sequence number space must be larger than the number of outstanding frames (e.g., stop-and-wait had a sequence number space of size 2)
– E.g., if the sequence number space is of size 8 (say 0..7) and the number of outstanding frames is allowed to be 10, then the sender can send sequence numbers 0,1,2,3,4,5,6,7,0,1 all at once. Now if the receiver sends back an ACK with sequence number 1, which packet 1 is it ACKing?

Sequence Number Space
Even SWS < SequenceSpaceSize is not sufficient
– suppose 3-bit SeqNum field (0..7) (so SequenceSpaceSize = 8)
– let SWS = RWS = 7
– sender transmits frames 0..6
– frames arrive successfully, but ACKs are lost
– sender retransmits 0..6
– receiver is expecting 7, 0..5, so it believes it is receiving the second incarnation of 0..5 (because the receiver has at this point updated its various pointers)
SWS < (SequenceSpaceSize+1)/2 is the rule (if SWS = RWS)
Intuitively, SeqNum “slides” between two halves of sequence number space
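
The lost-ACK scenario above can be checked mechanically. The sketch below is a hypothetical helper, not part of the slides: it models only the worst case described (a full window delivered, every ACK lost, the full window retransmitted) and assumes frames are not reordered in transit.

```python
def window_is_safe(sws: int, rws: int, space_size: int) -> bool:
    """True if retransmitted old frames cannot be mistaken for new ones."""
    # Sequence numbers of the first window, which the sender may retransmit.
    old_frames = {seq % space_size for seq in range(sws)}
    # Sequence numbers the receiver accepts after advancing past that window.
    new_window = {seq % space_size for seq in range(sws, sws + rws)}
    return old_frames.isdisjoint(new_window)

def max_safe_window(space_size: int) -> int:
    """Largest SWS = RWS that stays unambiguous for this sequence space."""
    return max(w for w in range(1, space_size + 1)
               if window_is_safe(w, w, space_size))
```

With a sequence space of size 8, `window_is_safe(7, 7, 8)` is false (the case on the slide) and `max_safe_window(8)` is 4, half of the sequence space.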

Easy to overlook…
The relationship between window size and sequence number space depends on the assumption that frames are not reordered in transit (easy to assume on a point-to-point link).

Back to Chapter 5…

Data Link Versus Transport
Transport potentially connects many different hosts
– need explicit connection establishment and termination
Transport has potentially different RTT (over different routes and at different times, even on a scale of minutes)
– need adaptive timeout mechanism
Transport has potentially long delay in network
– need to be prepared for arrival of very old packets
Transport has potentially different capacity at destination
– need to accommodate different node capacity
Transport has potentially different network capacity
– need to be prepared for network congestion

The “End-to-End” Argument
Consider TCP vs X.25
TCP: consider the underlying IP network unreliable and use sliding window to provide end-to-end in-order reliable delivery
X.25: use sliding window within the network on a hop-by-hop basis (which should guarantee end-to-end). Several problems with this:
– No guarantee that an added hop preserves service
– In a link from A to B to C, no guarantee that B behaves perfectly (nodes are known to introduce errors and mix packet order)

End-to-End
“A function should not be provided in the lower levels of the system unless it can be completely and correctly implemented at that level”
Does allow for functions to be incompletely provided at lower levels for optimization
– E.g., detecting and retransmitting a single corrupt packet across one hop is preferable to retransmitting an entire file end-to-end.
See reading assignment on class homework page

Segment Format

Segment Format (cont)
Each connection identified with 4-tuple:
– (SrcPort, SrcIPAddr, DestPort, DestIPAddr)
Sliding window and flow control
– Acknowledgment, SequenceNum, AdvertisedWindow
Flags
– SYN, FIN, RESET, PUSH, URG, ACK
Checksum
– pseudo header + TCP header + data
[Figure: sender sends Data (SequenceNum) to receiver; receiver returns Acknowledgment + AdvertisedWindow]

Connection Establishment and Termination
Three-way handshake between active participant (client) and passive participant (server):
1. Client to server: SYN, SequenceNum = x
2. Server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
3. Client to server: ACK, Acknowledgment = y + 1
Note: SequenceNum contains the sequence number of the first data byte contained in the segment. The ACK field always gives the sequence number of the next data byte expected (except for the SYN segments).
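
The three-segment exchange can be written out as data, which makes the x + 1 and y + 1 relationships explicit. This is only an illustrative sketch: the `Segment` class and `three_way_handshake` function are hypothetical names, no packets are sent, and only the fields shown on the slide are filled in.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

SEQ_MODULUS = 2 ** 32     # SequenceNum is a 32-bit field, so arithmetic wraps

@dataclass(frozen=True)
class Segment:
    flags: FrozenSet[str]
    sequence_num: Optional[int] = None
    acknowledgment: Optional[int] = None

def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Segments exchanged when the client picks ISN x and the server picks ISN y."""
    syn = Segment(frozenset({"SYN"}), sequence_num=x)
    syn_ack = Segment(frozenset({"SYN", "ACK"}), sequence_num=y,
                      acknowledgment=(x + 1) % SEQ_MODULUS)
    # Only the Acknowledgment field of the final segment is shown on the slide.
    ack = Segment(frozenset({"ACK"}), acknowledgment=(y + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]
```

Each side acknowledges the other's initial sequence number plus one, because the SYN itself consumes one sequence number.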

State Transition Diagram
[Figure: TCP state transition diagram, edges labeled event/action. Opening connection: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED. Closing connection: FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK, CLOSED. TIME_WAIT returns to CLOSED on timeout after two segment lifetimes.]

Sliding Window Revisited
Sending side
– LastByteAcked ≤ LastByteSent
– LastByteSent ≤ LastByteWritten
– buffer bytes between LastByteAcked and LastByteWritten
Receiving side
– LastByteRead < NextByteExpected
– NextByteExpected ≤ LastByteRcvd + 1
– buffer bytes between LastByteRead and LastByteRcvd
[Figure: sending application writes up to LastByteWritten while TCP tracks LastByteAcked and LastByteSent; receiving application reads up to LastByteRead while TCP tracks NextByteExpected and LastByteRcvd]

Flow Control
Send buffer size: MaxSendBuffer
Receive buffer size: MaxRcvBuffer
Receiving side
– LastByteRcvd - LastByteRead ≤ MaxRcvBuffer
– AdvertisedWindow = MaxRcvBuffer - (LastByteRcvd - LastByteRead)
Sending side
– LastByteSent - LastByteAcked ≤ AdvertisedWindow
– EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
– LastByteWritten - LastByteAcked ≤ MaxSendBuffer
– block the sending process when it tries to write y bytes and (LastByteWritten - LastByteAcked) + y > MaxSendBuffer
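
The window formulas above translate directly into code. The sketch below uses hypothetical function names and made-up byte counts purely to show the arithmetic; it is not a TCP implementation.

```python
def advertised_window(max_rcv_buffer: int,
                      last_byte_rcvd: int, last_byte_read: int) -> int:
    """Free space left in the receive buffer."""
    return max_rcv_buffer - (last_byte_rcvd - last_byte_read)

def effective_window(advertised: int,
                     last_byte_sent: int, last_byte_acked: int) -> int:
    """How much more data the sender may put in flight right now."""
    return advertised - (last_byte_sent - last_byte_acked)

def write_would_block(y: int, last_byte_written: int,
                      last_byte_acked: int, max_send_buffer: int) -> bool:
    """True if writing y more bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer

# Illustrative numbers only.
adv = advertised_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
eff = effective_window(adv, last_byte_sent=2500, last_byte_acked=1500)
```

Here the receiver holds 2000 unread bytes, so it advertises 2096; the sender already has 1000 bytes in flight, so it may send 1096 more.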

Flow Control
Always send ACK in response to arriving data segment
– This response contains the latest Acknowledgment and AdvertisedWindow fields, even if they haven’t changed
Slow receiving process may cause AdvertisedWindow = 0.
Problem: how does the sending side know when the advertised window is no longer 0?
– It can’t get this info on its own, since the receiver only sends window advertisements in response to received packets, and the sender can’t send anything because it believes the window size is zero.
Solution: persist when AdvertisedWindow = 0
– Periodically send a probe segment with one byte of data. Although most probes won’t be accepted, they trigger responses, and eventually one will come back with a nonzero advertised window.

Protection Against Wrap Around
32-bit SequenceNum

Bandwidth                     Time Until Wrap Around
T1 (1.5 Mbps)                 6.4 hours
Ethernet (10 Mbps)            57 minutes
T3 (45 Mbps)                  13 minutes
FDDI, FastEther (100 Mbps)    6 minutes
OC-3 (155 Mbps)               4 minutes
OC-12 (622 Mbps)              55 seconds
OC-48 (2.5 Gbps)              14 seconds
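
The table entries follow from dividing the 32-bit sequence space (2^32 bytes) by the link rate. A small sketch, with a hypothetical function name, reproduces them:

```python
SEQUENCE_SPACE_BYTES = 2 ** 32    # one sequence number per byte

def wrap_time_seconds(bandwidth_bps: float) -> float:
    """Time to send 2^32 bytes at the given rate, i.e. until SequenceNum wraps."""
    return SEQUENCE_SPACE_BYTES * 8 / bandwidth_bps

t1_hours = wrap_time_seconds(1.5e6) / 3600        # about 6.4 hours
ethernet_minutes = wrap_time_seconds(10e6) / 60   # about 57 minutes
oc48_seconds = wrap_time_seconds(2.5e9)           # about 14 seconds
```

At OC-48 speeds the sequence space wraps within seconds, so a delayed old segment could plausibly still be in the network when its sequence number is reused.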

Keeping the Pipe Full
16-bit AdvertisedWindow
Results assume RTT of 100 ms, typical for a cross-country link

Bandwidth                     Delay x Bandwidth Product
T1 (1.5 Mbps)                 18 KB
Ethernet (10 Mbps)            122 KB
T3 (45 Mbps)                  549 KB
FDDI, FastEther (100 Mbps)    1.2 MB
OC-3 (155 Mbps)               1.8 MB
OC-12 (622 Mbps)              7.4 MB
OC-48 (2.5 Gbps)              29.6 MB

TCP Extensions
Implemented as header options
Store timestamp in outgoing segments
Extend sequence space with 32-bit timestamp: PAWS (Protection Against Wrapped Sequence Numbers)
Shift (scale) advertised window

Nagle Algorithm
A 1-byte data segment generates a 41-byte packet (20 bytes of IP header + 20 bytes of TCP header + 1 byte of data). Small packets are called tinygrams.
– On LANs this is usually not an issue, but on WANs it can be a problem (it adds congestion)
Solution: Nagle Algorithm (RFC 896, Nagle, 1984): when a TCP connection has outstanding data that has not yet been ACKed, small segments cannot be sent until the outstanding data is acknowledged.

Nagle Algorithm (continued)
Nagle is self-clocking: the faster the ACKs come back, the faster the data is sent. But on a slow WAN, where tinygrams can be a problem, fewer segments are sent.
– Ex.: on a LAN, the time for a single byte to be sent, ACKed, and echoed is around 16 ms. To generate data at this rate, you need to be typing around 60 characters per second (so on a LAN you don’t kick in Nagle)
– On a WAN, you’ll often kick in Nagle
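
The send decision described above can be sketched as a single predicate. This is a simplified illustration, not RFC 896 verbatim: the name `nagle_allows_send` and the `mss` parameter (maximum segment size) are assumptions, a "small" segment is taken to mean one shorter than `mss`, and the advertised window is ignored.

```python
def nagle_allows_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Decide whether buffered data may be sent now under Nagle's rule."""
    if pending_bytes <= 0:
        return False                 # nothing to send
    if pending_bytes >= mss:
        return True                  # a full-sized segment is not a tinygram
    # Small segment: send only if no earlier data is still awaiting an ACK.
    return unacked_bytes == 0
```

While a small segment is held back, further bytes written by the application accumulate in the send buffer, so the next ACK releases them together in one segment. That is the self-clocking behavior: a fast ACK stream releases data quickly, a slow one causes more coalescing.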

Disabling the Nagle Algorithm
Why would you want to?
– X Window system: small messages (mouse movements) need to be delivered without delay
– Typing one of the terminal’s special function keys during an interactive login
Function keys normally generate multiple bytes of data, beginning with the ASCII escape character. If TCP gets the data a byte at a time, it can potentially send the first byte and then hold the rest of the characters. The server wouldn’t generate the ACK until it received the rest of the command, so Nagle would kick in, meaning the rest of the bytes are not sent for 200 ms, which can be a noticeable delay.
With the sockets API, the TCP_NODELAY option disables Nagle.
Host Requirements RFCs (1122, 1123) specify that there must be a way for an app to disable Nagle on an individual TCP connection.
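
As a concrete illustration of the TCP_NODELAY option, here is how it can be set through Python's standard `socket` module. The helper name `make_nodelay_socket` is hypothetical; the socket is created but never connected, so nothing is sent on the network.

```python
import socket

def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A nonzero value turns TCP_NODELAY on for this connection only.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_nodelay_socket()
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

The option is per socket, which matches the Host Requirements wording that an application must be able to disable Nagle on an individual connection.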

Adaptive Retransmission (Original Algorithm)
Measure SampleRTT for each segment/ACK pair
Compute weighted average of RTT
– EstRTT = α x EstRTT + (1 - α) x SampleRTT
– α between 0.8 and 0.9 (recommended value 0.9)
– Note: α in this range has a strong smoothing effect
Set timeout based on EstRTT
– TimeOut = 2 x EstRTT (rather conservative)
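
A short sketch of the original algorithm, assuming the usual weighted-average form EstRTT = α x EstRTT + (1 - α) x SampleRTT with α = 0.9. The function names are hypothetical and the RTT values are invented for illustration.

```python
def update_est_rtt(est_rtt: float, sample_rtt: float, alpha: float = 0.9) -> float:
    """Blend a new sample into the running estimate; large alpha means heavy smoothing."""
    return alpha * est_rtt + (1 - alpha) * sample_rtt

def original_timeout(est_rtt: float) -> float:
    """Timeout is twice the smoothed RTT."""
    return 2 * est_rtt

est = 100.0                                   # milliseconds
for sample in (200.0, 200.0, 200.0):          # RTT suddenly doubles
    est = update_est_rtt(est, sample)
```

After three samples at 200 ms the estimate has only reached about 127 ms, which shows the strong smoothing effect: the estimator reacts slowly when the RTT changes sharply.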

Karn/Partridge Algorithm
Problem: an ACK doesn’t acknowledge a transmission (it acks a receive)
Do not sample RTT when retransmitting
Double timeout after each retransmission (exponential backoff). Why?
[Figure: two sender/receiver timelines, each with an original transmission, a retransmission, and an ACK, showing SampleRTT measured from the original transmission in one and from the retransmission in the other]

A Problem
Problem with both these approaches: they can’t keep up with wide RTT fluctuations, thus causing unnecessary retransmissions
When the network is already loaded, unnecessary retransmissions add to the network load (as Stevens notes, “It is the network equivalent of pouring gasoline on a fire”)
What’s needed: keep track of the variance in RTT measurements AND use a smoothed RTT estimator.

Jacobson/Karels Algorithm
New calculations for average RTT
Diff = SampleRTT - EstRTT
EstRTT = EstRTT + (g x Diff)
– Recommended value for g is 0.125
– EstRTT is just the smoothed RTT as before
Dev = Dev + h x (|Diff| - Dev)
– Recommended value for h is 0.25
– Dev is the smoothed mean deviation (easier to compute the mean deviation than the standard deviation, which requires a square root)
TimeOut = EstRTT + 4 x Dev
– Larger gain for the deviation makes the TimeOut value increase faster when the RTT changes.
Notes
– algorithm is only as good as the granularity of the clock (500 ms on Unix)
– accurate timeout mechanism is important to congestion control (later)
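
The update rules can be combined with the Karn/Partridge rules (skip samples for retransmitted segments, double the timeout after each retransmission) in one small estimator. This is a sketch under stated assumptions: g = 0.125 and a deviation multiplier of 4 are the commonly cited values and are assumed here, h = 0.25 is from the slide, the initial deviation of 0 is an arbitrary choice, and the class and method names are hypothetical.

```python
class RttEstimator:
    def __init__(self, first_sample: float,
                 g: float = 0.125, h: float = 0.25, k: float = 4.0):
        self.est_rtt = first_sample
        self.dev = 0.0            # assumed starting deviation
        self.g, self.h, self.k = g, h, k
        self.backoff = 1          # multiplier applied after retransmissions

    def on_ack(self, sample_rtt: float, was_retransmitted: bool = False) -> None:
        if was_retransmitted:
            return                # Karn/Partridge: the sample is ambiguous, skip it
        diff = sample_rtt - self.est_rtt
        self.est_rtt += self.g * diff
        self.dev += self.h * (abs(diff) - self.dev)
        self.backoff = 1          # a clean sample ends the backoff

    def on_timeout(self) -> None:
        self.backoff *= 2         # exponential backoff after each retransmission

    @property
    def timeout(self) -> float:
        return (self.est_rtt + self.k * self.dev) * self.backoff
```

Starting from a 100 ms estimate, a single 140 ms sample moves EstRTT only to 105 ms but raises Dev to 10 ms, so the timeout becomes 145 ms: the deviation term lets the timeout respond quickly to a jump in RTT while the average itself stays smooth.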

TCP Interactive Data Flow
Material here is from TCP/IP Illustrated, Vol. 1
Study by Caceres et al. (1991):
– On a packet count basis, about half of all TCP segments contain bulk data (ftp, email, Usenet news)
– Half contain interactive data (telnet, rlogin)
– On a byte count basis, the ratio is around 90% bulk transfer, 10% interactive.
– Bulk data segments tend to be full size (normally 512 bytes of data); interactive segments are much smaller (90% of telnet and rlogin packets carry less than 10 bytes of data).

More recent data
1997 MCI study
– Web: 75% of bytes, 70% of packets
– FTP: 5% of bytes, 3% of packets
– SMTP (email): 5% of bytes, 5% of packets
– DNS, NNTP, Telnet/rlogin: < 2% each
2000 CAIDA study
– Napster, streaming audio, online games show up
2004 top application?
– P2P: 62% of Internet traffic. BitTorrent: approx. 35% (CacheLogic)
– Update: in 2007, Cringely says 50% BitTorrent (5% of users)

Rlogin and Telnet
Surprisingly, each interactive keystroke typically generates a packet (as opposed to a line generating a packet).
Moreover, a single rlogin keystroke can generate 4 segments (though usually 3):
i. Interactive keystroke from client
ii. ACK of keystroke from server (typically piggybacked on the echo of the data byte; see next slide)
iii. Echo of data byte from server
iv. ACK of echoed byte from client

Delayed ACKs
Normally, TCP does not send an ACK the instant it receives data. Instead, it delays the ACK, hoping to have data going in the other direction on which it can piggyback the ACK.
Most implementations use a 200 ms delay (the ACK is delayed up to 200 ms before being sent by itself).
This is why, in the previous slide, the ACK would normally be piggybacked on the echoed character.