TRANSPORT LAYER
Dr. Nawaporn Wisitpongphan
Credit: Prof. Nick McKeown


1 TRANSPORT LAYER
Dr. Nawaporn Wisitpongphan
Credit: Prof. Nick McKeown, http://www.stanford.edu/~nickm

2 OUTLINE
- The Transport Layer
- The UDP Protocol
- The TCP Protocol
- TCP Characteristics
- TCP Connection setup
- TCP Segments
- TCP Sequence Numbers
- TCP Sliding Window
- Timeouts and Retransmission
- Congestion Control and Avoidance

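3 REVIEW OF THE TRANSPORT LAYER
[Figure: users Nick and Dave on hosts Leland.Stanford.edu and Athena.MIT.edu. Each host has Application, Transport, Network and Link layers (the lower ones inside the O.S.); at each layer the packet is shown as a Header (H) followed by Data (D).]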

4 LAYERING: THE OSI MODEL
[Figure: two end hosts, each with seven layers (1 Physical, 2 Link, 3 Network, 4 Transport, 5 Session, 6 Presentation, 7 Application), connected through a router that has only the Network, Link and Physical layers. Labels: peer-layer communication (between the same layer on different machines) and layer-to-layer communication (between adjacent layers on one machine).]

5 USER DATAGRAM PROTOCOL (UDP) CHARACTERISTICS
- UDP is a connectionless datagram service. There is no connection establishment: packets may show up at any time.
- UDP is unreliable:
  - No acknowledgements to indicate delivery of data.
  - The checksum covers the header and the data (plus a pseudo-header of IP fields), but over IPv4 the sender may leave it uncomputed.
  - Contains no mechanism to detect missing or mis-sequenced packets.
  - No mechanism for automatic retransmission.
  - No mechanism for flow control, so the sender can overrun the receiver.

6 USER DATAGRAM PROTOCOL (UDP)
[Figure: applications A1, A2, B1 and B2 sit above UDP, which sits above IP inside the OS.]
UDP uses port numbers to demultiplex packets.

Port     Description
123      Network Time Protocol (NTP)
67, 68   Dynamic Host Configuration Protocol (DHCP)
500      Internet Security Association and Key Management Protocol (ISAKMP)
520      Routing Information Protocol (RIP)

7 USER DATAGRAM PROTOCOL (UDP) PACKET FORMAT
[Figure: UDP header fields, each 16 bits: SRC port, DST port, length, checksum, followed by DATA. The checksum is optional over IPv4; when it is computed it covers the header and the data.]
- Why do we have UDP?
  - It is used by applications that don't need reliable delivery, or
  - by applications that have their own special needs, such as streaming of real-time audio/video.
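
To make the header layout and port-based demultiplexing concrete, here is a minimal Python sketch. It assumes the standard 8-byte UDP header (four 16-bit fields in network byte order). The function names and the handler table are hypothetical, chosen only for illustration, and the checksum is left at 0 ("not computed") rather than calculated.

```python
import struct

# UDP header: four 16-bit fields in network byte order.
UDP_HEADER = struct.Struct("!HHHH")  # src port, dst port, length, checksum


def build_datagram(src_port, dst_port, payload, checksum=0):
    """Prepend a UDP header; checksum 0 means "not computed" over IPv4."""
    length = UDP_HEADER.size + len(payload)  # length counts header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_datagram(datagram):
    """Split a datagram into its header fields and its data."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": datagram[UDP_HEADER.size:length],
    }


def demultiplex(datagram, handlers):
    """Hand the data to whichever handler is registered for the DST port."""
    fields = parse_datagram(datagram)
    handler = handlers.get(fields["dst_port"])
    return handler(fields["data"]) if handler else None


# Hypothetical handler table keyed by well-known port number.
handlers = {
    123: lambda data: ("NTP", data),
    520: lambda data: ("RIP", data),
}
datagram = build_datagram(5000, 123, b"time?")
print(demultiplex(datagram, handlers))
```

A datagram addressed to a port with no registered handler simply returns None here; a real stack would typically answer with an ICMP "port unreachable" message.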

8 TCP CHARACTERISTICS
- TCP is connection-oriented: a 3-way handshake is used for connection setup.
- TCP provides a stream-of-bytes service.
- TCP is reliable:
  - Acknowledgements indicate delivery of data.
  - Checksums are used to detect corrupted data.
  - Sequence numbers detect missing or mis-sequenced data.
  - Corrupted data is retransmitted after a timeout.
  - Mis-sequenced data is re-sequenced.
  - (Window-based) flow control prevents overrun of the receiver.
- TCP uses congestion control to share network capacity among users.

9 HTTP AND TCP

Port     Description
80       HTTP
23       Telnet
20/21    FTP (data/control)
25       Simple Mail Transfer Protocol (SMTP)

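10 TCP IS CONNECTION-ORIENTED
Connection setup: 3-way handshake
  1. (Active) Client -> (Passive) Server: Syn
  2. (Passive) Server -> (Active) Client: Syn + Ack
  3. (Active) Client -> (Passive) Server: Ack
Connection close/teardown: 2 x 2-way handshake
  1. (Active) Client -> (Passive) Server: Fin
  2. (Passive) Server -> (Active) Client: (Data +) Ack
  3. (Passive) Server -> (Active) Client: Fin
  4. (Active) Client -> (Passive) Server: Ack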

11 THE TCP DIAGRAM
Which path does the Active Client or Passive Server follow?
[Figure: the TCP state diagram, shown next to the 3-way handshake (Syn, Syn + Ack, Ack) between the (Active) Client and the (Passive) Server.]

12 TCP CLIENT

13 TCP SERVER
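
The client and server paths through the state diagram during connection setup can be sketched as two small transition tables. This is only an illustrative subset: it uses the standard TCP state names (CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED), while the event strings and the `run` helper are assumptions made for this example. Teardown, resets and simultaneous open are left out.

```python
# Transition tables: (state, event) -> (next_state, segment_to_send).
CLIENT = {
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("SYN_SENT", "recv SYN+ACK"): ("ESTABLISHED", "ACK"),
}

SERVER = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("LISTEN", "recv SYN"): ("SYN_RCVD", "SYN+ACK"),
    ("SYN_RCVD", "recv ACK"): ("ESTABLISHED", None),
}


def run(table, state, events):
    """Feed events through a table; return the final state and a trace."""
    trace = []
    for event in events:
        state, segment = table[(state, event)]
        trace.append((event, state, segment))
    return state, trace


client_state, client_trace = run(CLIENT, "CLOSED", ["active_open", "recv SYN+ACK"])
server_state, server_trace = run(
    SERVER, "CLOSED", ["passive_open", "recv SYN", "recv ACK"]
)
for step in client_trace + server_trace:
    print(step)
```

Reading the two traces side by side reproduces the Syn, Syn + Ack, Ack exchange: each segment one side sends is the event the other side receives next.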

14 TCP SUPPORTS A “STREAM OF BYTES” SERVICE
[Figure: Host A puts Byte 0, Byte 1, Byte 2, Byte 3, ..., Byte 80 into the connection; the same bytes come out in the same order at Host B.]
- TCP accepts data as a constant stream from the applications.
- There are no record markers automatically inserted by TCP.
- Example: if the application on one end writes 10 bytes, followed by a write of 20 bytes, followed by a write of 50 bytes, the application at the other end of the connection cannot tell what size the individual writes were. The other end may read the 80 bytes in four reads of 20 bytes at a time.
- One end puts a stream of bytes into TCP and the same, identical stream of bytes appears at the other end.
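
The 10/20/50-byte example can be reproduced with a toy in-memory byte stream. This is a sketch of the service model only, not of TCP itself: the `ByteStream` class is hypothetical and has no segments, loss or network in it.

```python
class ByteStream:
    """Toy byte-stream channel: writes are appended, reads take a prefix."""

    def __init__(self):
        self._buffer = bytearray()

    def write(self, data):
        # No record marker is stored, so the write boundary is forgotten.
        self._buffer += data

    def read(self, max_bytes):
        chunk = bytes(self._buffer[:max_bytes])
        del self._buffer[:max_bytes]
        return chunk


message = bytes(range(80))
stream = ByteStream()

# The sender writes 10 bytes, then 20 bytes, then 50 bytes.
stream.write(message[:10])
stream.write(message[10:30])
stream.write(message[30:80])

# The receiver reads the same 80 bytes in four reads of 20 bytes each.
reads = [stream.read(20) for _ in range(4)]
print([len(chunk) for chunk in reads])
print(b"".join(reads) == message)
```

An application that needs message boundaries has to add its own framing on top, for example a length prefix before each record.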

15 ... WHICH IS EMULATED USING TCP “SEGMENTS”
[Figure: the byte stream from Host A (Byte 0 ... Byte 80) is carried to Host B as the TCP Data of a segment.]
A segment is sent when:
1. The segment is full (MSS bytes),
2. It is not full, but a timer expires, or
3. It is “pushed” by the application.

16 THE TCP SEGMENT FORMAT
[Figure: an IP packet (IP Hdr, IP Data) whose data is a TCP segment (TCP Hdr, TCP Data). The TCP header is drawn 32 bits wide, with bit positions 0, 15 and 31 marked:]
  Src port | Dst port
  Sequence #
  Ack Sequence #
  HLEN (4 bits) | RSVD (6 bits) | Flags: URG, ACK, PSH, RST, SYN, FIN | Window Size
  Checksum | Urgent Pointer
  (TCP Options)
  TCP Data
- The checksum covers the TCP header and data plus the IP addresses.
- Src/dst port numbers and IP addresses uniquely identify the socket.
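
As an illustration of the layout above, the following Python sketch unpacks the 20-byte fixed part of a TCP header. It assumes the classic layout shown on the slide (4-bit HLEN, 6 reserved bits, 6 flag bits). The function and dictionary key names are hypothetical, and options and checksum verification are not handled.

```python
import struct

# Fixed 20-byte TCP header, network byte order:
# src port, dst port, sequence #, ack sequence #,
# HLEN/RSVD/flags (16 bits), window size, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {
    "FIN": 0x01, "SYN": 0x02, "RST": 0x04,
    "PSH": 0x08, "ACK": 0x10, "URG": 0x20,
}


def parse_tcp_header(segment):
    """Decode the fixed header fields from the start of a segment."""
    (src_port, dst_port, seq, ack,
     hlen_flags, window, checksum, urgent) = TCP_HEADER.unpack_from(segment)
    header_bytes = (hlen_flags >> 12) * 4  # HLEN is counted in 32-bit words
    flags = {name for name, bit in FLAG_BITS.items() if hlen_flags & bit}
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_bytes": header_bytes, "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
        "data": segment[header_bytes:],
    }


# A hand-built SYN segment: HLEN = 5 words, only the SYN flag set.
syn = TCP_HEADER.pack(49152, 80, 1000, 0, (5 << 12) | FLAG_BITS["SYN"], 8192, 0, 0)
print(parse_tcp_header(syn))
```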

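17 SEQUENCE NUMBERS
[Figure: Host A and Host B exchange segments, each a TCP HDR followed by TCP Data; numbering starts from the ISN (initial sequence number).]
- Sequence number = number of the 1st byte of data in the segment.
- Ack sequence number = next expected byte.
- How does the ISN get chosen?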

18 INITIAL SEQUENCE NUMBERS
Connection setup: 3-way handshake
  1. (Active) Client -> (Passive) Server: Syn + ISN_A
  2. (Passive) Server -> (Active) Client: Syn + Ack + ISN_B
  3. (Active) Client -> (Passive) Server: Ack
- Sequence number = 32 bits.
- What if a message has more than $2^{32}$ bytes? The sequence number wraps around.
- Solution: the Timestamp option.
  - The sender places a timestamp in every segment.
  - The receiver copies the timestamp into the ACK it sends for a segment.
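
Because sequence numbers are 32 bits wide, all arithmetic on them is done modulo 2^32. The sketch below shows the ack computation and a wrap-aware comparison. The comparison rule (treat a as earlier than b when b is less than half the number space ahead of a) is the usual serial-number convention and is an assumption here, not something stated on the slide; the function names are hypothetical.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32 bits wide


def next_expected(seq, payload_len):
    """Ack sequence number for a segment starting at seq (next expected byte)."""
    return (seq + payload_len) % SEQ_MOD


def seq_before(a, b):
    """True if a comes before b, allowing for wrap-around."""
    return a != b and (b - a) % SEQ_MOD < SEQ_MOD // 2


# A 20-byte segment that starts 10 bytes before the wrap point.
seq = SEQ_MOD - 10
ack = next_expected(seq, 20)
print(ack)                   # the ack has wrapped around to a small number
print(seq_before(seq, ack))  # yet seq still counts as "before" ack
```

This comparison only works while the two numbers are less than 2^31 bytes apart, which is one reason fast links need extra information such as the timestamp to tell old segments from new ones.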

19 TCP SLIDING WINDOW
- How much data can a TCP sender have outstanding in the network?
- How much data should TCP retransmit when an error occurs? Just selectively repeat the missing data?
- How does the TCP sender avoid overrunning the receiver's buffers?

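20 TCP SLIDING WINDOW
[Figure: the sender's byte stream divided, in order, into: data ACK'd | outstanding un-ack'd data | data OK to send | data not OK to send yet. The window (Window Size) spans the outstanding and OK-to-send regions.]
- The window is meaningful to the sender.
- The current window size is “advertised” by the receiver (usually 4 KB to 8 KB at connection setup).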
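
The regions in the sliding-window picture translate into a one-line calculation on the sender. In this sketch the names `last_byte_acked` and `last_byte_sent` are assumptions introduced for illustration; they do not appear on the slide.

```python
def usable_window(last_byte_acked, last_byte_sent, advertised_window):
    """Bytes the sender may still transmit without exceeding the window."""
    outstanding = last_byte_sent - last_byte_acked  # sent but not yet ACK'd
    return max(0, advertised_window - outstanding)


# 3000 bytes are outstanding against an 8192-byte advertised window.
print(usable_window(last_byte_acked=1000, last_byte_sent=4000, advertised_window=8192))

# A full window is outstanding: the sender must wait for an ACK.
print(usable_window(last_byte_acked=0, last_byte_sent=8192, advertised_window=8192))
```

Each arriving ACK advances `last_byte_acked`, which slides the window forward and makes more data OK to send.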

21 TCP SLIDING WINDOW
[Figure: timelines between Host A and Host B showing a window's worth of data being sent and the ACK returning one round-trip time later. Case (1): RTT > window size. Case (2): RTT = window size. Third case: ???]

22 TCP: RETRANSMISSION AND TIMEOUTS
[Figure: Host A sends Data1 and Data2 to Host B; each ACK returns after one round-trip time (RTT). The Retransmission TimeOut (RTO) is drawn as the estimated RTT plus a guard band.]
TCP uses an adaptive retransmission timeout value, because the RTT changes frequently with:
- Congestion
- Changes in routing

23 TCP: RETRANSMISSION AND TIMEOUTS
Picking the RTO is important:
- Pick a value that's too big and it will wait too long to retransmit a packet.
- Pick a value that's too small and it will unnecessarily retransmit packets.
The original algorithm for picking RTO:
1. $\text{EstimatedRTT}_k = \alpha \cdot \text{EstimatedRTT}_{k-1} + (1 - \alpha) \cdot \text{SampleRTT}$
2. $\text{RTO} = 2 \cdot \text{EstimatedRTT}$
(Slide annotation: determined empirically.)
Characteristics of the original algorithm:
- Variance is assumed to be fixed.
- But in practice, variance increases as congestion increases.
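
A minimal sketch of the original estimator follows. The slide gives no value for the smoothing weight, so the default of 0.875 used here is an assumption (values between 0.8 and 0.9 are commonly quoted); the function names are also illustrative.

```python
def update_estimate(estimated_rtt, sample_rtt, alpha=0.875):
    """Exponentially weighted moving average of the RTT samples."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt


def original_rto(estimated_rtt):
    """Original rule: the timeout is twice the estimated RTT."""
    return 2 * estimated_rtt


# Starting from a 100 ms estimate, feed in a few samples (milliseconds).
estimate = 100.0
for sample in (200.0, 200.0, 200.0):
    estimate = update_estimate(estimate, sample)
    print(round(estimate, 2), round(original_rto(estimate), 2))
```

The estimate moves only one eighth of the way toward each new sample, so a sudden jump in RTT takes many samples to show up in the RTO.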

24 TCP: RETRANSMISSION AND TIMEOUTS
- There will be some (unknown) distribution of RTTs.
- We are trying to estimate an RTO to minimize the probability of a false timeout.
[Figure: probability distribution of the RTT, with its mean and variance marked.]
[Figure: average queueing delay versus load (amount of traffic arriving at the router); the variance grows rapidly with load.]
- Router queues grow when there is more traffic, until they become unstable.
- As load grows, variance of delay grows rapidly.

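25 TCP: RETRANSMISSION AND TIMEOUTS
The newer algorithm includes an estimate of the variance in RTT:
- $\text{Difference} = \text{SampleRTT} - \text{EstimatedRTT}_{k-1}$
- $\text{EstimatedRTT}_k = \text{EstimatedRTT}_{k-1} + \delta \cdot \text{Difference}$ (same as before, with $\delta = 1 - \alpha$)
- $\text{Deviation} = \text{Deviation} + \delta \cdot (|\text{Difference}| - \text{Deviation})$
- $\text{RTO} = \mu \cdot \text{EstimatedRTT} + \phi \cdot \text{Deviation}$, with $\mu \approx 1$ and $\phi \approx 4$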
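
The newer estimator can be sketched in a few lines. The weights 1 and 4 on the estimate and the deviation come from the slide; the gain of 0.125 and the parameter names `delta`, `mu` and `phi` are assumptions made for this example.

```python
def update_rto(estimated_rtt, deviation, sample_rtt, delta=0.125, mu=1.0, phi=4.0):
    """One update step; returns (estimated_rtt, deviation, rto)."""
    difference = sample_rtt - estimated_rtt
    estimated_rtt = estimated_rtt + delta * difference
    deviation = deviation + delta * (abs(difference) - deviation)
    rto = mu * estimated_rtt + phi * deviation
    return estimated_rtt, deviation, rto


# Steady samples keep the deviation, and so the guard band, small.
state = (100.0, 0.0)
for sample in (100.0, 100.0, 100.0):
    est, dev, rto = update_rto(state[0], state[1], sample)
    state = (est, dev)
print(state, rto)

# One late sample raises the deviation and widens the timeout at once.
est, dev, rto = update_rto(100.0, 0.0, 200.0)
print(est, dev, rto)
```

Unlike the fixed factor of 2, the guard band here grows when samples become erratic and shrinks again when they settle.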

26 TCP: RETRANSMISSION AND TIMEOUTS: KARN'S ALGORITHM
[Figure: two Host A / Host B timelines in which a retransmission makes it unclear which transmission an ACK belongs to, producing a wrong RTT sample in each case.]
- Problem: how can we estimate the RTT when packets are retransmitted?
- Solution: on retransmission, don't update the estimated RTT (and double the RTO).
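
Karn's rule can be layered on top of any RTT estimator. The sketch below uses the original estimator (RTO = 2 x EstimatedRTT) for brevity; the class name, method names and the 0.875 smoothing weight are assumptions for illustration only.

```python
class KarnTimer:
    """RTT estimator that ignores samples from retransmitted segments."""

    def __init__(self, estimated_rtt, alpha=0.875):
        self.estimated_rtt = estimated_rtt
        self.alpha = alpha
        self.rto = 2 * estimated_rtt

    def on_timeout(self):
        # The segment is retransmitted: back off by doubling the RTO.
        self.rto *= 2

    def on_ack(self, sample_rtt, was_retransmitted):
        if was_retransmitted:
            # Ambiguous sample: the ACK could belong to either copy.
            return
        self.estimated_rtt = (
            self.alpha * self.estimated_rtt + (1 - self.alpha) * sample_rtt
        )
        self.rto = 2 * self.estimated_rtt


timer = KarnTimer(estimated_rtt=100.0)
timer.on_timeout()                            # RTO doubles
timer.on_ack(500.0, was_retransmitted=True)   # sample is discarded
print(timer.estimated_rtt, timer.rto)
timer.on_ack(200.0, was_retransmitted=False)  # clean sample updates both
print(timer.estimated_rtt, timer.rto)
```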

27 CONGESTION CONTROL: MAIN POINTS
- Congestion is inevitable.
- Congestion happens at different scales, from two individual packets colliding to too many users.
- TCP senders can detect congestion and reduce their sending rate by reducing the window size.
- TCP modifies the rate according to “Additive Increase, Multiplicative Decrease (AIMD)”.
- To probe and find the initial rate, TCP uses a restart mechanism called “slow start”.
- Routers slow down TCP senders by buffering packets and thus increasing delay.

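28 CONGESTION
[Figure: hosts H1 and H2 send through router R1 to host H3. Arrivals A1(t) from H1 come in at 10 Mb/s and arrivals A2(t) from H2 at 100 Mb/s; departures D(t) leave toward H3 at 1.5 Mb/s; X(t) is marked at the router. A second plot shows cumulative bytes versus time t for A1(t), A2(t), D(t) and X(t).]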

29 TIME SCALES OF CONGESTION
- Too many users using a link during a peak hour (time axis: 7:00, 8:00, 9:00).
- TCP flows filling up all available bandwidth (time axis: 1 s, 2 s, 3 s).
- Two packets colliding at a router (time axis: 100 µs, 200 µs, 300 µs).

30 DEALING WITH CONGESTION. EXAMPLE: TWO FLOWS ARRIVING AT A ROUTER
[Figure: flows A1(t) and A2(t) arrive at router R1 at the same time.]
Strategy:
- Drop one of the flows.
- Buffer one flow until the other has departed, then send it.
- Re-schedule one of the two flows for a later time.
- Ask both flows to reduce their rates.

31 CONGESTION IS UNAVOIDABLE. ARGUABLY IT'S GOOD!
- We use packet switching because it makes efficient use of the links. Therefore, buffers in the routers are frequently occupied.
- If buffers are always empty, delay is low, but our usage of the network is low.
- If buffers are always occupied, delay is high, but we are using the network more efficiently.
- So how much congestion is too much?

32 LOAD, DELAY AND POWER
Typical behavior of queueing systems with random arrivals:
[Figure: average packet delay versus load. Burstiness tends to move the asymptote to the left.]
A simple metric of how well the network is performing:
[Figure: power versus load, with the peak marked as the “optimal load”.]

33 OPTIONS FOR CONGESTION CONTROL
1. Implemented by host versus network.
2. Reservation-based versus feedback-based.
3. Window-based versus rate-based.

34 TCP CONGESTION CONTROL
- TCP implements host-based, feedback-based, window-based congestion control.
- TCP sources attempt to determine how much capacity is available.
- TCP sends packets, then reacts to observable events (loss).

35 TCP CONGESTION CONTROL
- TCP sources change the sending rate by modifying the window size: Window = min{Advertised window, Congestion window}. The advertised window comes from the receiver; the congestion window (“cwnd”) is kept by the transmitter.
- In other words, send at the rate of the slowest component: network or receiver.
- “cwnd” follows additive increase/multiplicative decrease:
  - On receipt of Acks for a full window: cwnd += 1 (packet)
  - On packet loss (timeout): cwnd *= 0.5

36 ADDITIVE INCREASE / MULTIPLICATIVE DECREASE
[Figure: Src and Dest exchange data (D) and ACK (A) packets; the number of packets sent per round trip grows by one each time.]
Additive increase: every time the source successfully sends a cwnd's worth of packets (each packet sent out during the last RTT has been ACKed), add the equivalent of 1 packet to the cwnd. For each ACK:
  Increment = MSS × (MSS / CWND), with CWND ≥ MSS
  CWND += Increment

37 LEADS TO THE TCP “SAWTOOTH”
[Figure: window versus time t. The window climbs linearly and is halved at each timeout, giving a sawtooth. Note on the plot: could take a long time to get started!]
Multiplicative decrease: for each timeout, the source sets CWND to half of its previous value.
CWND is large → all the packets dropped will be retransmitted → congestion gets worse → need to get out of this state quickly.
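
The per-ACK increment and the halving on timeout can be sketched as two small functions. The MSS value of 1460 bytes is an assumption, and the floor of one MSS on the decrease follows the CWND ≥ MSS condition given with the increment rule.

```python
MSS = 1460  # bytes; an assumed, typical maximum segment size


def on_ack(cwnd):
    """Additive increase: each ACK adds MSS * (MSS / cwnd) bytes."""
    return cwnd + MSS * (MSS / cwnd)


def on_timeout(cwnd):
    """Multiplicative decrease: halve cwnd, but never go below one MSS."""
    return max(MSS, cwnd / 2)


# One round trip with a 10-segment window: 10 ACKs arrive.
cwnd = 10.0 * MSS
for _ in range(10):
    cwnd = on_ack(cwnd)
print(cwnd / MSS)  # close to 11 segments: about one MSS gained per RTT

cwnd = on_timeout(cwnd)
print(cwnd / MSS)  # roughly half of that after a timeout
```

Repeating these two steps over many round trips produces the sawtooth: slow linear growth interrupted by sudden halving.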

38 “SLOW START”
Designed to find the fair-share rate quickly at startup. How does it work?
1. Increase cwnd by one packet for each ACK received, so that it grows exponentially (1, 2, 4, 8, ... packets per RTT), until it reaches SSThreshold.
2. If cwnd < SSThreshold → {do slow start}, else {do congestion avoidance}.
3. Initial SSThreshold = large value. After a packet is lost, SSThreshold = cwnd/2.
4. Congestion avoidance → increase cwnd linearly.
[Figure: Src and Dest exchange data (D) and ACK (A) packets; the number of packets sent per round trip grows 1, 2, 4, 8.]
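
The four steps above can be sketched at the granularity of one round trip, with cwnd counted in packets. Two details here are assumptions rather than statements from the slide: cwnd restarts from 1 packet after a loss (as in TCP Tahoe), and the doubling is capped at the threshold so that the switch to linear growth happens exactly there.

```python
def next_cwnd(cwnd, ssthresh):
    """cwnd (in packets) after one loss-free round trip."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)  # slow start: doubles every RTT
    return cwnd + 1                     # congestion avoidance: +1 per RTT


def on_loss(cwnd):
    """Return (new_cwnd, new_ssthresh) after a packet loss."""
    return 1, max(cwnd // 2, 2)  # assumed restart from 1 packet


def trace(rtts, ssthresh, cwnd=1):
    """cwnd at the start of each round trip, with no losses."""
    history = [cwnd]
    for _ in range(rtts):
        cwnd = next_cwnd(cwnd, ssthresh)
        history.append(cwnd)
    return history


print(trace(6, ssthresh=16))  # exponential growth, then linear
cwnd, ssthresh = on_loss(18)
print(cwnd, ssthresh)
print(trace(5, ssthresh, cwnd))
```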

39 SLOW START
Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole advertised window's worth of data.

40 FAST RETRANSMIT AND FAST RECOVERY?
Homework!!

41 TCP SENDING RATE
- What is the sending rate of TCP?
- The acknowledgement for a sent packet is received after one RTT.
- The amount of data sent until the ACK is received is the current window size W.
- Therefore the sending rate is R = W/RTT.
- Is the TCP sending rate sawtooth-shaped as well?
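
As a quick worked example of R = W/RTT, the sketch below converts a window in bytes and an RTT in seconds into a rate in bits per second. The window and RTT values are made up for illustration.

```python
def sending_rate_bps(window_bytes, rtt_seconds):
    """R = W / RTT, expressed in bits per second."""
    return window_bytes * 8 / rtt_seconds


# An 8 KB window over a 100 ms round trip.
print(sending_rate_bps(8192, 0.1) / 1e6, "Mb/s")

# The same window over a 10 ms round trip sends ten times faster.
print(sending_rate_bps(8192, 0.01) / 1e6, "Mb/s")
```

With a small fixed window, the achievable rate is set by the round-trip time rather than by the speed of the link.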

42 TCP AND BUFFERS

43
- For TCP with a single flow over a network link with enough buffers, RTT and W are proportional to each other.
- Therefore the sending rate R = W/RTT is constant (and not a sawtooth).
- But experiments and theory suggest that with many flows: [equation not preserved in the transcript], where p is the drop probability.
- TCP rate can be controlled in two ways:
  1. Buffering packets and increasing the RTT.
  2. Dropping packets to decrease TCP's window size.

44 CONGESTION CONTROL IN THE INTERNET
- Maximum window sizes of most TCP implementations are very small by default:
  - Windows XP: 12 packets
  - Linux/Mac: 40 packets
- Often the buffer of a link is larger than the maximum window size of TCP:
  - A typical DSL line has 200 packets' worth of buffer.
  - For a TCP session, the maximum number of packets outstanding is 40.
  - The buffer can never fill up.
  - The router will never drop a packet.

45 CONGESTION AVOIDANCE
- TCP reacts to congestion after it takes place. The data rate changes rapidly and the system is barely stable (or is even unstable).
- Can we predict when congestion is about to happen and avoid it? E.g. by detecting the knee of the curve.
[Figure: average packet delay versus load.]

46 CONGESTION AVOIDANCE SCHEMES
- Router-based congestion avoidance:
  - DECbit: routers explicitly notify sources about congestion.
  - Random Early Detection (RED): routers implicitly notify sources by dropping packets. RED drops packets at random, and as a function of the level of congestion.
- Host-based congestion avoidance:
  - The source monitors changes in RTT to detect the onset of congestion.

47 DECBIT
- Each packet has a “Congestion Notification” bit called the DECbit in its header.
- If any router on the path is congested, it sets the DECbit.
  - Set if average queue length >= 1 packet, averaged since the start of the previous busy cycle.
- To notify the source, the destination copies the DECbit into ACK packets.
- The source adjusts its rate to avoid congestion.
  - Counts the fraction of DECbits set in each window.
  - If <50% set, increase rate additively.
  - If >=50% set, decrease rate multiplicatively.
[Figure: queue length at the router versus time, with the averaging period marked.]
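
The source-side rule can be sketched as follows. The 50% threshold is from the slide; the step sizes (add 1 packet, multiply by 0.875) are the values usually quoted for the DECbit scheme and are an assumption here, as is the function name.

```python
def adjust_window(window, decbits, increase=1, decrease_factor=0.875):
    """New window size given the DECbits echoed in the last window of ACKs."""
    fraction_set = sum(decbits) / len(decbits)
    if fraction_set < 0.5:
        return window + increase      # additive increase
    return window * decrease_factor   # multiplicative decrease


# One ACK in four carried the bit: keep probing for more capacity.
print(adjust_window(8, [0, 0, 1, 0]))

# Half of the ACKs carried the bit: back off.
print(adjust_window(8, [1, 1, 0, 0]))
```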

48 RANDOM EARLY DETECTION (RED)
- RED is based on DECbit, and was designed to work well with TCP.
- RED implicitly notifies the sender by dropping packets.
- Drop probability is increased as the average queue length increases.
- A (geometric) moving average of the queue length is used so as to detect long-term congestion, yet allow short-term bursts to arrive.

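49 RED DROP PROBABILITIES
[Figure: a queue with arrivals A(t) and departures D(t), and a plot of drop probability versus average queue length (AvgLen). The probability axis is marked at maxP and 1; the AvgLen axis is marked at minTh and maxTh.]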
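
The averaging and the drop curve can be sketched as below. The piecewise shape (no drops below minTh, a linear rise to maxP at maxTh, and certain drop beyond maxTh) is the standard RED profile and is assumed here, since the plot itself did not survive in the transcript. The averaging weight of 0.002 and all parameter values in the example are also assumptions.

```python
def update_average(avg_len, queue_len, weight=0.002):
    """Geometric (exponentially weighted) moving average of the queue length."""
    return (1 - weight) * avg_len + weight * queue_len


def drop_probability(avg_len, min_th, max_th, max_p):
    """Drop probability as a function of the average queue length."""
    if avg_len < min_th:
        return 0.0
    if avg_len >= max_th:
        return 1.0
    return max_p * (avg_len - min_th) / (max_th - min_th)


# Assumed thresholds: minTh = 5, maxTh = 15 packets, maxP = 0.1.
for avg in (4, 5, 10, 14, 15, 20):
    print(avg, drop_probability(avg, min_th=5, max_th=15, max_p=0.1))

# A short burst barely moves the average, so it is tolerated.
avg = 2.0
for queue_len in (30, 30, 30):
    avg = update_average(avg, queue_len)
print(round(avg, 3))
```

A router would compare a random number against this probability for each arriving packet, which is what spreads the drops out over time and across flows.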

50 PROPERTIES OF RED
- Drops packets before the queue is full, in the hope of reducing the rates of some flows.
- Drops packets for each flow roughly in proportion to its rate.
- Drops are spaced out in time.
- Because it uses the average queue length, RED is tolerant of bursts.
- Random drops hopefully desynchronize TCP sources.

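51 SYNCHRONIZATION OF SOURCES
[Figure: behavior of sources A, B, C and D over time, with the time axis marked in RTTs.]

52 SYNCHRONIZATION OF SOURCES
[Figure: aggregate flow of sources A, B, C and D over time, with the average (Avg) marked and the curve labeled f(RTT).]

53 DESYNCHRONIZED SOURCES
[Figure: behavior of sources A, B, C and D over time, with the time axis marked in RTTs.]

54 DESYNCHRONIZED SOURCES
[Figure: aggregate flow of sources A, B, C and D over time, with the average (Avg) marked.]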

