
1 TCP and UDP

2 The Internet Transport Layer
Two transport layer protocols are supported by the Internet:
- Reliable: the Transmission Control Protocol (TCP)
- Unreliable: the User Datagram Protocol (UDP)

3 UDP
UDP is an unreliable transport protocol that can be used in the Internet.
UDP does not provide:
- connection management
- flow or error control
- guaranteed in-order packet delivery
UDP is almost a “null” transport layer.

4 Why UDP?
- No connection needs to be set up
- Throughput may be higher because UDP packets are easier to process, especially at the source
- The user doesn’t care if the data is transmitted reliably
- The user wants to implement his or her own transport protocol

5 UDP Frame Format
Each header row is 32 bits wide:
  Source Port | Destination Port
  UDP length  | UDP checksum (optional)
  Data

6 UDP checksum
Goal: detect “errors” (e.g., flipped bits) in the transmitted segment.
Sender:
- treats the segment contents as a sequence of 16-bit integers
- checksum: 1’s complement of the 1’s complement sum of the segment contents
- puts the checksum value into the UDP checksum field
Receiver:
- computes the checksum of the received segment
- checks whether the computed checksum equals the checksum field value:
  - NO: error detected
  - YES: no error detected. But maybe errors nonetheless? More later…

7 Internet Checksum Example
Note: when adding numbers, a carryout from the most significant bit needs to be added to the result.
Example: add two 16-bit integers
                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
  wraparound  1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
  sum           1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
  checksum      0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
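The checksum procedure can be sketched in a few lines of Python. This is a minimal illustration, not a full UDP implementation: it works on a list of 16-bit words and ignores the pseudo-header and odd-length padding that a real stack would handle. The function names are made up for this example. The two input words are the 16-bit integers 1110011001100110 and 1101010101010101.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into bit 0."""
    total = 0
    for word in words:
        total += word
        # Fold the carry (if any) back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(words):
    """Sender side: 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


def verify(words, checksum_field):
    """Receiver side: recompute the checksum and compare with the field."""
    return internet_checksum(words) == checksum_field


words = [0b1110011001100110, 0b1101010101010101]
print(format(ones_complement_sum16(words), "016b"))  # sum after wraparound
print(format(internet_checksum(words), "016b"))      # checksum (complement)
```

Flipping any single bit in `words` makes `verify` return False, although some multi-bit errors can cancel out and go undetected.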

8 TCP
TCP provides the end-to-end reliable connection that IP alone cannot support.
The protocol:
- Frame format
- Connection management
- Retransmission
- Flow control
- Congestion control

9 TCP Frame Format
Each header row is 32 bits wide:
  Source Port | Destination Port
  Sequence Number
  Acknowledgement Number
  HL | URG ACK PSH RST SYN FIN | Window Size
  Checksum | Urgent Pointer
  Options (0 or more 32-bit words)
  Data

10 TCP Frame Fields
- Source & Destination Ports: 16-bit port identifiers for each packet
- Sequence number: the sequence number of the first data byte in the packet
- Acknowledgement number: the sequence number of the next byte expected by the receiver

11 TCP Frame Fields (cont’d)
- Window size: specifies how many bytes may be sent beyond the last acknowledged byte
- Checksum: covers the TCP header, the data, and the IP address fields (taken from a pseudo-header)
- Urgent Pointer: points to urgent data in the TCP data field

12 TCP Frame Fields (cont’d)
Header bits:
- URG = Urgent pointer field in use
- ACK = Indicates whether the frame contains an acknowledgement
- PSH = Data has been “pushed”. It should be delivered to higher layers right away.
- RST = Indicates that the connection should be reset
- SYN = Used to establish connections
- FIN = Used to release a connection

13 TCP Connection Establishment
Three-way handshake:
  Host A -> Host B: SYN (seq=x)
  Host B -> Host A: SYN (seq=y, ACK=x+1)
  Host A -> Host B: ACK (seq=x+1, ACK=y+1)

14 TCP Connection Tear-down
Two double handshakes:
  Host A -> Host B: FIN (seq=x)
  Host B -> Host A: ACK (ACK=x+1)    A->B torn down
  Host B -> Host A: FIN (seq=y)
  Host A -> Host B: ACK (ACK=y+1)    B->A torn down

15 TCP Retransmission
- When a packet remains unacknowledged for a period of time, TCP assumes it is lost and retransmits it
- TCP tries to calculate the round trip time (RTT) for a packet and its acknowledgement
- From the RTT, TCP can guess how long it should wait before timing out

16 Round Trip Time (RTT)
RTT = time for packet to arrive at destination + time for ACK to return from destination

17 RTT Calculation
- At 0.9 sec the sender transmits 2K of data (SEQ=0)
- At 2.2 sec the sender receives ACK = 2048 from the receiver
RTT = 2.2 sec - 0.9 sec = 1.3 sec

18 Smoothing the RTT measurement
First, we must smooth the round trip time to account for variations in delay within the network:
  SRTT = α × SRTT + (1 - α) × RTT of the arriving ACK
The smoothed round trip time (SRTT) weights previously received RTTs by the α parameter.
α is typically equal to 0.875.

19 Retransmission Timeout Interval (RTO)
The timeout value is then calculated by multiplying the smoothed RTT by some factor (greater than 1) called β:
  Timeout = β × SRTT
The coefficient β is included to allow for some variation in the round trip times.

20 Example
Initial SRTT = 1.50, α = 0.875, β = 4.0

  RTT Meas.   SRTT   Timeout (β × SRTT)
  1.5 s       1.50   6.00
  1.0 s       1.44   5.76
  2.2 s       1.54   6.16
  1.0 s       1.47   5.88
  0.8 s       1.39   5.56
  3.1 s
  2.0 s
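The smoothing and timeout rules can be traced with a short Python sketch, assuming the typical values α = 0.875 and β = 4.0 and an initial SRTT of 1.50 s. The function names are illustrative. The code keeps full floating-point precision, so its results can differ in the second decimal place from a hand calculation that rounds SRTT at every step.

```python
def smooth_rtt(srtt, rtt, alpha=0.875):
    """SRTT = alpha * SRTT + (1 - alpha) * RTT."""
    return alpha * srtt + (1 - alpha) * rtt


def timeout(srtt, beta=4.0):
    """Timeout = beta * SRTT."""
    return beta * srtt


def trace(initial_srtt, samples, alpha=0.875, beta=4.0):
    """Return one (rtt, srtt, timeout) row per RTT measurement."""
    srtt = initial_srtt
    rows = []
    for rtt in samples:
        srtt = smooth_rtt(srtt, rtt, alpha)
        rows.append((rtt, srtt, timeout(srtt, beta)))
    return rows


for rtt, srtt, rto in trace(1.50, [1.5, 1.0, 2.2, 1.0, 0.8]):
    print(f"RTT={rtt:.1f}s  SRTT={srtt:.2f}  Timeout={rto:.2f}")
```

With α this close to 1, a single outlier such as 2.2 s moves SRTT by less than 0.1 s, which is the intended damping effect.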

21 Problem with RTT Calculation
- Sender sends 2K, SEQ=0
- Sender times out and retransmits 2K, SEQ=0
- ACK = 2048 arrives
RTT? It is unclear which of the two transmissions the ACK belongs to.

22 Karn’s Algorithm
Retransmission ambiguity:
- Measure RTT from the original data segment
- Measure RTT from the most recent segment
- Either way there is a problem in the RTT estimate
One solution:
- Never update RTT measurements based on acknowledgements from retransmitted packets
Problem: a sudden change in RTT can cause the system never to update the RTT
- e.g. a primary path failure leads to a slower secondary path

23 Karn’s algorithm
Use back-off as part of the RTT computation:
- Whenever a packet is lost, the RTO is increased by a factor
- Use this increased RTO as the RTO estimate for the next segment (not the value from SRTT)
- Only after an acknowledgment is received for a successful transmission is the timer set to the new RTO obtained from SRTT
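A possible way to express Karn's two rules in code is sketched below. The class name `KarnTimer` is hypothetical, and the back-off factor of 2 is an assumption (the slide only says "a factor"). The sketch reuses the α/β smoothing from the earlier slides.

```python
class KarnTimer:
    """Sketch of Karn's rules: ignore ambiguous samples, back off on loss."""

    def __init__(self, srtt, alpha=0.875, beta=4.0, backoff=2.0):
        self.srtt = srtt
        self.alpha = alpha
        self.beta = beta
        self.backoff = backoff      # assumed factor; not given in the slides
        self.rto = beta * srtt

    def on_timeout(self):
        """Packet presumed lost: enlarge the RTO, leave SRTT untouched."""
        self.rto *= self.backoff

    def on_ack(self, rtt_sample, was_retransmitted):
        """Update SRTT only from segments that were sent exactly once."""
        if was_retransmitted:
            return  # ambiguous sample: keep the backed-off RTO
        self.srtt = self.alpha * self.srtt + (1 - self.alpha) * rtt_sample
        self.rto = self.beta * self.srtt


timer = KarnTimer(srtt=1.5)
timer.on_timeout()                              # RTO doubles
timer.on_ack(3.0, was_retransmitted=True)       # sample ignored
timer.on_ack(1.5, was_retransmitted=False)      # RTO recomputed from SRTT
print(timer.srtt, timer.rto)
```

Keeping the backed-off RTO until a clean sample arrives is what lets the timer adapt when the path suddenly becomes slower, even though ambiguous samples are discarded.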

24 Another Problem with RTT Calculation
RTT measurements can sometimes fluctuate severely; the smoothed RTT (SRTT) is not a good reflection of the round-trip time in these cases.
Solution: use the Jacobson/Karels algorithm:
  Error = RTT - SRTT
  SRTT = SRTT + g × Error
  Dev = Dev + h × (|Error| - Dev)
  Timeout = SRTT + k × Dev
Here g and h are gain constants and k is a constant multiplier on the deviation.

25 Jacobson/Karels Algorithm Example
  Error = RTT - SRTT
  SRTT = SRTT + (g × Error)
  Dev = Dev + [h × (|Error| - Dev)]
  Timeout = SRTT + (k × Dev)
RTT measurements: 1.5 s, 1.0 s, 2.2 s, 1.0 s, 0.8 s, 3.1 s, 2.0 s
Table columns: RTT Meas., SRTT, Error, Dev., Timeout
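The four update rules can be run on the first measurements with the sketch below. All numeric parameters here are assumptions for illustration only: gains of 1/8 and 1/4 and a deviation multiplier of 4 are the values commonly associated with this algorithm, and the initial SRTT of 1.5 s and Dev of 0 are chosen arbitrarily. The names `g`, `h` and `k` are labels picked for this example.

```python
def jacobson_karels(srtt, dev, samples, g=0.125, h=0.25, k=4.0):
    """Return one (rtt, error, srtt, dev, timeout) row per RTT measurement.

    g, h, k and the starting srtt/dev are assumed values, not taken
    from the slides.
    """
    rows = []
    for rtt in samples:
        error = rtt - srtt                   # Error = RTT - SRTT
        srtt = srtt + g * error              # SRTT = SRTT + g * Error
        dev = dev + h * (abs(error) - dev)   # Dev = Dev + h * (|Error| - Dev)
        rows.append((rtt, error, srtt, dev, srtt + k * dev))
    return rows


for row in jacobson_karels(1.5, 0.0, [1.5, 1.0, 2.2, 1.0, 0.8, 3.1, 2.0]):
    print("RTT=%.1f  Error=%+.3f  SRTT=%.3f  Dev=%.3f  Timeout=%.3f" % row)
```

Because the timeout tracks the deviation as well as the mean, it widens quickly when samples start to scatter and tightens again when they settle.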

26 Example RTT computation

27 TCP Flow Control
- TCP uses a modified version of the sliding window
- In acknowledgements, TCP uses the “Window size” field to tell the sender how many bytes it may transmit
- TCP sequence numbers count bytes, not packets

28 TCP Flow Control (cont’d)
Important information in TCP/IP packet headers:
Send:
- N: number of bytes in packet (contained in IP header)
- SEQ: sequence number of first data byte in packet (contained in TCP header)
Recv:
- ACK: sequence number of next expected byte, with the ACK bit set (contained in TCP header)
- WIN: window size at the receiver (contained in TCP header)

29 Example TCP session (1)
remus:$ tcpdump -S host scully
Kernel filter, protocol ALL, datagram packet socket
tcpdump: listening on all devices
15:15:22.152339 eth0 > remus.4706 > scully.echo: S 1264296504:1264296504(0) win 32120
15:15:22.153865 eth0 < scully.echo > remus.4706: S 875676030:875676030(0) ack 1264296505 win 8760
15:15:22.153912 eth0 > remus.4706 > scully.echo: . 1264296505:1264296505(0) ack 875676031 win 32120
remus: telnet scully 7
A

30 Example TCP session
Packet 1: 15:15:22.152339 eth0 > remus.4706 > scully.echo: S 1264296504:1264296504(0) win 32120 (DF)
Packet 2: 15:15:22.153865 eth0 < scully.echo > remus.4706: S 875676030:875676030(0) ack 1264296505 win 8760 <mss 1460>
Packet 3: 15:15:22.153912 eth0 > remus.4706 > scully.echo: . 1264296505:1264296505(0) ack 875676031 win 32120
Fields, left to right: Timestamp, Source IP/port, Dest IP/port, Flags, Start Sequence Number, End Sequence Number, Acknowledgement Number, Window, Options

31 TCP data transfer
Packet 4: 15:15:28.591716 eth0 > remus.4706 > scully.echo: P 1264296505:1264296508(3) ack 875676031 win 32120
Packet 5: 15:15:28.593255 eth0 < scully.echo > remus.4706: P 875676031:875676034(3) ack 1264296508 win 8760
The number in parentheses, (3), is the number of data bytes.

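32 TCP Flow Control (cont’d)
Example with a 4K receiver buffer (initially empty):
1. Application does a 2K write. Sender sends 2K, SEQ=0. Receiver’s buffer now holds 2K.
2. Receiver replies ACK = 2048, WIN = 2048.
3. Application does a 3K write. Sender sends 2K, SEQ=2048. Receiver’s buffer is full.
4. Receiver replies ACK = 4096, WIN = 0. Sender is blocked.
5. Application reads 2K. Receiver replies ACK = 4096, WIN = 2048. Sender may send up to 2K.
6. Sender sends the remaining 1K, SEQ=4096. Receiver’s buffer now holds the unread 2K plus the new 1K.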
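The exchange in this example can be reproduced with a small receiver-side model. This is a simplified sketch: the class `ReceiverBuffer` and its methods are invented for illustration, segments are assumed to arrive in order, and a segment that does not fit is simply refused.

```python
class ReceiverBuffer:
    """Toy model of a receiver that advertises its free buffer space."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0              # bytes buffered but not yet read
        self.next_expected = 0     # sequence number of next expected byte

    def _reply(self):
        """Return the (ACK, WIN) pair the receiver would send."""
        return self.next_expected, self.capacity - self.used

    def receive(self, seq, nbytes):
        """Accept an in-order segment that fits, then reply with (ACK, WIN)."""
        fits = nbytes <= self.capacity - self.used
        if seq == self.next_expected and fits:
            self.used += nbytes
            self.next_expected += nbytes
        return self._reply()

    def application_read(self, nbytes):
        """The application drains data, which reopens the window."""
        self.used -= min(nbytes, self.used)
        return self._reply()


buf = ReceiverBuffer(capacity=4096)
print(buf.receive(seq=0, nbytes=2048))      # ACK and WIN after first 2K
print(buf.receive(seq=2048, nbytes=2048))   # buffer full, WIN = 0
print(buf.application_read(2048))           # window reopens
print(buf.receive(seq=4096, nbytes=1024))   # last 1K of the 3K write
```

The sender side is then straightforward: it may never have more unacknowledged bytes outstanding than the most recent WIN value allows.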

33 TCP Flow Control (cont’d)
Piggybacking: allows more efficient bidirectional communication
- A packet from A to B carries N, SEQ (data from A to B) and ACK, WIN (ACK for data from B to A)
- A packet from B to A carries N, SEQ (data from B to A) and ACK, WIN (ACK for data from A to B)

34 TCP Congestion Control
- Recall: the network layer is responsible for congestion control
- However, TCP/IP blurs the distinction
- In TCP/IP:
  - the network layer (IP) simply handles routing and packet forwarding
  - congestion control is done end-to-end by TCP

35 Self-Clocking Model
Figure: a sender and a receiver connected through a fast link and a bottleneck link, with data flowing one way and ACKs the other; packet spacings Pb, Pr and ACK spacings Ar, Ab are marked.
1. Send burst
2. Receive data packet
3. Send acknowledgement
4. Receive acknowledgement
5. Send a data packet
Given: Pb = Pr = Ar = Ab = Ar (in units of time)
Sending a packet on each ACK keeps the bottleneck link busy.

36 Changing bottleneck bandwidth
- one router, finite buffers
- sender retransmission of lost packet
Figure: Host A and Host B with finite shared output link buffers; λin: original data; λ'in: original data, plus retransmitted data; λout.

37 TCP Congestion Control
Goal: achieve the self-clocking state
- Even if we don’t know the bandwidth of the bottleneck
- The bottleneck may change over time
Two phases to keep the bottleneck busy:
- Slow-start ramps up to the bottleneck limit
  - Packet loss signals that we passed the bandwidth of the bottleneck
- Congestion Avoidance tries to maintain self-clocking mode once established

38 TCP Congestion Window
- TCP introduces a second window, called the “congestion window”
- This window maintains TCP’s best estimate of the amount of outstanding data to allow in the network to achieve self-clocking

39 TCP Congestion Window
To determine how many bytes it may send, the sender takes the minimum of the receiver window and the congestion window.
Example:
- If the receiver window says the sender can transmit 8K, but the congestion window is only 4K, then the sender may only transmit 4K
- If the congestion window is 8K but the receiver window says the sender can transmit 4K, then the sender may only transmit 4K

40 TCP Slow Start Phase
TCP defines the “maximum segment size” (MSS) as the maximum amount of data a TCP packet can carry (not including the header).
TCP Slow Start:
- Congestion window starts small, at 1 segment size
- Each time a transmitted segment is acknowledged, the congestion window is increased by one maximum segment size
- On each ack, cwnd = cwnd + 1 (in segments)

41 TCP Slow Start (cont’d)
  Congestion Window Size   Event
  1K                       A sends 1 segment to B; B ACKs the segment
  2K                       A sends 2 segments to B; B ACKs both segments
  4K                       A sends 4 segments to B; B ACKs all four segments
  8K                       A sends 8 segments to B; B ACKs all eight segments
  16K                      … and so on
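The doubling follows directly from the per-ACK rule, as the sketch below shows. It assumes a 1K segment size, no losses, and that every segment in a window is acknowledged before the next window is sent; the function name is illustrative.

```python
def slow_start_rounds(rounds, mss=1024):
    """Return the congestion window (in bytes) at the start of each round."""
    cwnd = mss
    sizes = []
    for _ in range(rounds):
        sizes.append(cwnd)
        segments_sent = cwnd // mss
        # One ACK comes back per segment, and each ACK adds one MSS.
        for _ in range(segments_sent):
            cwnd += mss
    return sizes


print([size // 1024 for size in slow_start_rounds(5)])  # window sizes in K
```

Adding one MSS per ACK looks like a linear rule, but because the number of ACKs per round equals the window size in segments, the window doubles every round trip.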

42 TCP Slow Start (cont’d)
- Congestion window size grows exponentially (i.e. it keeps on doubling)
- Packet losses indicate congestion
- Packet losses are determined by using timers at the sender
- When a timeout occurs, the congestion window is reduced to one maximum segment size and everything starts over

43 TCP Slow Start
When the connection begins, increase the rate exponentially until the first loss event:
- double CongWin every RTT
- done by incrementing CongWin for every ACK received
Summary: the initial rate is slow but ramps up exponentially fast.
Figure: Host A sends one segment to Host B, then two segments one RTT later, then four segments.

44 TCP Slow Start (cont’d)
Figure: congestion window versus transmission number, marking the timed-out transmissions and the level of 1 maximum segment size.

45 TCP Slow Start (cont’d)
- TCP Slow Start by itself is inefficient
- Although the congestion window builds exponentially, it drops to 1 segment size every time a packet times out
- This leads to low throughput

46 TCP Linear Increase Threshold
Establish a threshold at which the rate increase becomes linear instead of exponential, to improve efficiency.
Algorithm:
- Start the threshold (ssthresh) at 64K
- Slow start
- Once the threshold is passed, only increase the congestion window size by 1 segment size for each congestion window of data transmitted
  - For each ack received, cwnd = cwnd + (mss*mss)/cwnd
- If a timeout occurs, reset the congestion window size to 1 segment and set the threshold to max(2*mss, 1/2 of MIN(sliding window, congestion window))
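A minimal sketch of these rules, assuming a 1K segment size. The function names are illustrative, and treating a window exactly equal to the threshold as "passed" is a choice made here, since the slide does not say which side the boundary belongs to.

```python
MSS = 1024  # assumed segment size for this sketch


def on_ack(cwnd, ssthresh, mss=MSS):
    """Grow the congestion window on each ACK."""
    if cwnd < ssthresh:
        return cwnd + mss                 # slow start: one MSS per ACK
    return cwnd + (mss * mss) / cwnd      # linear mode: about one MSS per window


def on_timeout(cwnd, sliding_window, mss=MSS):
    """Return (new cwnd, new threshold) after a timeout."""
    threshold = max(2 * mss, min(sliding_window, cwnd) / 2)
    return mss, threshold


# Timeout when MIN(sliding window, congestion window) = 40K.
cwnd, ssthresh = on_timeout(cwnd=40 * 1024, sliding_window=64 * 1024)
print(cwnd // 1024, ssthresh / 1024)      # window back to 1K, new threshold in K

# One window's worth of ACKs in linear mode adds roughly one MSS.
cwnd = 32 * 1024
for _ in range(32):
    cwnd = on_ack(cwnd, ssthresh=32 * 1024)
print(cwnd / 1024)
```

The per-ACK increment shrinks slightly as the window grows, so a full window of ACKs adds a little less than one MSS.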

47 TCP Linear Increase Threshold Phase
Example: maximum segment size = 1K; assume SSthresh = 32K.
Figure: congestion window versus transmission number, with levels at 1K, 20K, 32K and 40K and the thresholds marked. A timeout occurs when MIN(sliding window, congestion window) = 40K.

48 TCP Fast Retransmit
Another enhancement to TCP congestion control.
Idea: when the sender sees 3 duplicate ACKs, it assumes something went wrong. The packet is immediately retransmitted instead of waiting for it to time out.
Why?
- ACKs are sent by the receiver when it receives a packet
- A duplicate ACK implies something is getting through
- Better than a timeout

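49 TCP Fast Retransmit Example
MSS = 1K
  Sender -> Receiver: 1K, SEQ=2048 (lost)
  Receiver -> Sender: ACK = 2048, WIN = 31K
  Sender -> Receiver: 1K, SEQ=3072
  Receiver -> Sender: ACK = 2048, WIN = 30K    Duplicate ACK #1
  Sender -> Receiver: 1K, SEQ=4096
  Receiver -> Sender: ACK = 2048, WIN = 29K    Duplicate ACK #2
  Sender -> Receiver: 1K, SEQ=5120
  Receiver -> Sender: ACK = 2048, WIN = 28K    Duplicate ACK #3
  Fast Retransmit occurs (2nd packet is now retransmitted w/o waiting for it to time out)
  Sender -> Receiver: 1K, SEQ=2048
  Sender -> Receiver: 1K, SEQ=6144
  Receiver -> Sender: ACK = 2048, WIN = 27K
  Receiver -> Sender: ACK = 7168, WIN = 26K    ACK of new data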

50 TCP Fast Recovery
Yet another enhancement to TCP congestion control.
Idea: don’t do a slow start after a fast retransmit. Instead, use this algorithm:
- Drop the threshold to max(2*mss, 1/2 of MIN(sliding window, congestion window))
- Set the congestion window to threshold + 3 * MSS
- For each duplicate ACK (after the fast retransmit), increment the congestion window by MSS
- When the next non-duplicate ACK arrives, set the congestion window equal to the threshold
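The four steps map onto three small functions. This is a sketch with illustrative names; the sample numbers assume MSS = 1K, a 28K sliding window and a 20K congestion window at the moment the third duplicate ACK arrives.

```python
K = 1024  # one kilobyte, used only to keep the sample numbers readable


def enter_fast_recovery(sliding_window, cwnd, mss):
    """Third duplicate ACK: return (new threshold, inflated cwnd)."""
    threshold = max(2 * mss, min(sliding_window, cwnd) // 2)
    return threshold, threshold + 3 * mss


def on_duplicate_ack(cwnd, mss):
    """Each further duplicate ACK inflates the window by one MSS."""
    return cwnd + mss


def on_new_ack(threshold):
    """First non-duplicate ACK: deflate the window to the threshold."""
    return threshold


threshold, cwnd = enter_fast_recovery(28 * K, 20 * K, mss=K)
print(threshold // K, cwnd // K)          # threshold and window after step 2
cwnd = on_duplicate_ack(cwnd, mss=K)
print(cwnd // K)                          # after one more duplicate ACK
cwnd = on_new_ack(threshold)
print(cwnd // K)                          # after the ACK of new data
```

The temporary inflation by 3 MSS accounts for the three segments that the duplicate ACKs show have already left the network.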

51 TCP Fast Recovery Example
Continuing with the Fast Retransmit example (MSS = 1K).
SW = Sliding Window, TH = Congestion Threshold, CW = Congestion Window

  Event at sender                                     SW    TH    CW
  Before the third duplicate ACK                      29K   15K   20K
  ACK = 2048, WIN = 28K arrives                       28K   15K   20K
  Fast Retransmit occurs (1K, SEQ=2048 resent)        28K   10K   13K
  1K, SEQ=6144 sent; ACK = 2048, WIN = 27K arrives    27K   10K   14K
  ACK = 7168, WIN = 26K arrives                       26K   10K   10K

52 Resulting TCP Sawtooth
Figure: congestion window versus transmission number (levels at 1K, 20K, 32K, 40K), showing slow start, linear mode, the bottleneck capacity and the resulting sawtooth.
In steady state, the window oscillates around the bottleneck’s capacity (i.e. the number of outstanding bytes in transit).

53 TCP Recap
Timeout computation:
- Timeout is a function of 2 values:
  - the weighted average of sampled RTTs
  - the sampled variance of each RTT
Congestion control:
- Goal: keep the self-clocking pipe full in spite of changing network conditions
- 3 key variables:
  - Sliding window (receiver flow control)
  - Congestion window (sender flow control)
  - Threshold (sender’s slow start vs. linear mode line)

54 TCP Recap (cont)
Slow start:
- Add 1 segment to the congestion window for each ACK
- Doubles the congestion window’s volume each RTT
Linear mode (Congestion Avoidance):
- Add 1 segment’s worth of data per congestion window
- Adds 1 segment per RTT

55 Algorithm Summary: TCP Congestion Control
- When CongWin is below Threshold, the sender is in the slow-start phase and the window grows exponentially.
- When CongWin is above Threshold, the sender is in the congestion-avoidance phase and the window grows linearly.
- When a triple duplicate ACK occurs, Threshold is set to max(FlightSize/2, 2*mss) and CongWin is set to Threshold + 3*mss. (Fast retransmit, Fast recovery)
- When a timeout occurs, Threshold is set to max(FlightSize/2, 2*mss) and CongWin is set to 1 MSS.
FlightSize: the amount of data that has been sent but not yet acknowledged.

56 TCP sender congestion control
Event: ACK receipt for previously unacked data
  State: Slow Start (SS)
  TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to “Congestion Avoidance”
  Commentary: Resulting in a doubling of CongWin every RTT
Event: ACK receipt for previously unacked data
  State: Congestion Avoidance (CA)
  TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin)
  Commentary: Additive increase, resulting in an increase of CongWin by 1 MSS every RTT
Event: Loss event detected by triple duplicate ACK
  State: SS or CA
  TCP Sender Action: Threshold = max(FlightSize/2, 2*mss); CongWin = Threshold + 3*mss; set state to “Congestion Avoidance”
  Commentary: Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Event: Timeout
  State: SS or CA
  TCP Sender Action: Threshold = max(FlightSize/2, 2*mss); CongWin = 1 MSS; set state to “Slow Start”
  Commentary: Enter slow start
Event: Duplicate ACK
  State: SS or CA
  TCP Sender Action: Increment duplicate ACK count for segment being acked
  Commentary: CongWin and Threshold not changed
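The event table can be read as a small state machine. The sketch below is one way to encode it, with a hypothetical class name and the simplification that `flight_size` is supplied by the caller instead of being tracked from sent and acknowledged bytes.

```python
class TcpSender:
    """Sketch of the sender-side congestion control event table."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.threshold = threshold
        self.cong_win = mss
        self.state = "SS"          # "SS" = slow start, "CA" = congestion avoidance
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += self.mss
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            self.cong_win += self.mss * self.mss / self.cong_win

    def on_duplicate_ack(self, flight_size):
        """Count duplicates; the third one triggers fast recovery."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = max(flight_size / 2, 2 * self.mss)
            self.cong_win = self.threshold + 3 * self.mss
            self.state = "CA"

    def on_timeout(self, flight_size):
        """Timeout: shrink to one MSS and re-enter slow start."""
        self.dup_acks = 0
        self.threshold = max(flight_size / 2, 2 * self.mss)
        self.cong_win = self.mss
        self.state = "SS"


sender = TcpSender(mss=1000, threshold=4000)
for _ in range(4):
    sender.on_new_ack()
print(sender.state, sender.cong_win)
```

Only the first and second duplicate ACKs leave CongWin and Threshold untouched; the window changes when the count reaches three.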

57 TCP Fairness
Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.
Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.

58 Why is TCP fair?
Two competing sessions:
- Additive increase gives a slope of 1 as throughput increases
- Multiplicative decrease decreases throughput proportionally
Figure: Connection 2 throughput plotted against Connection 1 throughput (each axis up to R), with the equal bandwidth share line; the trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by a factor of 2).
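The convergence argument can be checked numerically with a toy model. The sketch assumes both sessions see every loss at the same time and have the same RTT, which real connections do not guarantee; rates are in arbitrary units and the function name is illustrative.

```python
def aimd(rate1, rate2, capacity, steps, increase=1.0):
    """Simulate two flows doing additive increase, multiplicative decrease."""
    history = [(rate1, rate2)]
    for _ in range(steps):
        if rate1 + rate2 > capacity:
            # Loss: both flows halve, which also halves their difference.
            rate1, rate2 = rate1 / 2, rate2 / 2
        else:
            # No loss: both flows add the same amount, difference unchanged.
            rate1, rate2 = rate1 + increase, rate2 + increase
        history.append((rate1, rate2))
    return history


history = aimd(rate1=80.0, rate2=10.0, capacity=100.0, steps=500)
print(history[0], history[-1])
```

Additive increase moves the operating point parallel to the equal-share line, while each halving pulls it toward the origin and so closer to that line; repeated cycles shrink the gap between the two rates.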

59 Fairness (more)
Fairness and UDP:
- Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
- Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
- Research area: TCP friendly
Fairness and parallel TCP connections:
- Nothing prevents an app from opening parallel connections between 2 hosts
- Web browsers do this
- Example: link of rate R supporting 9 connections
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets about R/2 (11R/20)
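The parallel-connection arithmetic is a one-line calculation, assuming every TCP connection ends up with an equal share of the link. The helper name is made up for this example, and the result is expressed as a fraction of R.

```python
from fractions import Fraction


def app_share(existing_connections, new_connections):
    """Fraction of the link rate R obtained by the app opening new connections."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


print(app_share(9, 1))    # one new connection among ten
print(app_share(9, 11))   # eleven new connections among twenty
```

Per-connection fairness therefore says nothing about per-application fairness: an app can raise its share simply by opening more connections.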

60 Delay modeling
Q: How long does it take to receive an object from a Web server after sending a request?
Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start
Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)
Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start

61 Fixed congestion window (1)
First case: WS/R > RTT + S/R: the ACK for the first segment in the window returns before a window’s worth of data has been sent.
delay = 2RTT + O/R

62 Fixed congestion window (2)
Second case: WS/R < RTT + S/R: the server waits for an ACK after sending a window’s worth of data.
delay = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
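Both fixed-window cases can be combined into one function. This sketch assumes K is the number of windows needed to cover the object, rounded up, which this slide does not state explicitly; the sample numbers are chosen only to make the arithmetic easy to follow.

```python
import math


def fixed_window_delay(O, S, R, RTT, W):
    """Delay to fetch an object of O bits with a fixed window of W segments.

    S is the segment size in bits and R the link rate in bits per second.
    K (windows covering the object) is assumed to be ceil(O / (W * S)).
    """
    transmit = O / R
    stall = S / R + RTT - W * S / R   # idle time after each window, if positive
    if stall <= 0:
        # First case: ACKs return before the window is exhausted.
        return 2 * RTT + transmit
    # Second case: the server stalls after every window except the last.
    K = math.ceil(O / (W * S))
    return 2 * RTT + transmit + (K - 1) * stall


# S/R = 1 s and RTT = 2 s, so the boundary between the cases is W = 3.
print(fixed_window_delay(O=8000, S=1000, R=1000, RTT=2, W=2))  # second case
print(fixed_window_delay(O=8000, S=1000, R=1000, RTT=2, W=4))  # first case
```

At the boundary WS/R = RTT + S/R the stall term is zero, so both formulas give the same delay.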

63 TCP Delay Modeling: Slow Start (1)
Now suppose the window grows according to slow start.
Will show that the delay for one object depends on P, the number of times TCP idles at the server, where:
- Q is the number of times the server would idle if the object were of infinite size
- K is the number of windows that cover the object

64 TCP Delay Modeling: Slow Start (2)
Delay components:
- 2 RTT for connection establishment and request
- O/R to transmit the object
- time the server idles due to slow start
Server idles: P = min{K-1,Q} times
Example:
- O/S = 15 segments
- K = 4 windows
- Q = 2
- P = min{K-1,Q} = 2
- Server idles P = 2 times

65 TCP Delay Modeling (3)

66 TCP Delay Modeling (4)
Recall K = number of windows that cover the object. How do we calculate K?
Calculation of Q, the number of idles for an infinite-size object, is similar (see HW).
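One way to answer the question is to count windows directly: under slow start the k-th window holds 2^(k-1) segments, so K is the smallest k whose windows together cover O/S segments. The sketch below does that and also adds up the server's idle periods from first principles. It assumes the window doubles without limit, and that after the k-th window the server idles for S/R + RTT - 2^(k-1) S/R when that quantity is positive. The commonly cited closed form 2RTT + O/R + P(RTT + S/R) - (2^P - 1)S/R is used in the comments only as a cross-check and is not taken from these slides. Function names are illustrative.

```python
import math


def windows_to_cover(segments):
    """K: smallest k such that 2^0 + 2^1 + ... + 2^(k-1) >= segments."""
    k, covered = 0, 0
    while covered < segments:
        covered += 2 ** k
        k += 1
    return k


def slow_start_delay(O, S, R, RTT):
    """Return (K, P, delay) for an object of O bits under slow start."""
    K = windows_to_cover(math.ceil(O / S))
    # Idle time after each of the first K-1 windows (never negative).
    stalls = [max(0.0, S / R + RTT - (2 ** (k - 1)) * S / R)
              for k in range(1, K)]
    P = sum(1 for stall in stalls if stall > 0)
    delay = 2 * RTT + O / R + sum(stalls)
    return K, P, delay


# O/S = 15 segments, with S/R = 1 s and RTT = 2 s chosen so that P = 2.
K, P, delay = slow_start_delay(O=15000, S=1000, R=1000, RTT=2)
print(K, P, delay)
# Cross-check against the closed form: 2*2 + 15 + 2*(2 + 1) - (2**2 - 1)*1
```

For O/S = 15 the windows of 1, 2, 4 and 8 segments cover the object exactly, giving K = 4, which agrees with the example on the earlier slide.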

