TCP/IP Performance COMT 429

© Hans Kruse, Ohio University

Protocol Overview
Applications: HTTP (WWW), Remote Login, File Transfer
Transport: TCP, UDP
Network: IP, with ICMP, ARP, RARP (Auxiliary Services)
Data link: Ethernet, X.25, HDLC, ATM, etc.

Connection Types in TCP/IP
Transport Layer: TCP is Connection Oriented; UDP is Connection-less
Network Layer: Connection-less
Data Link Layer and Physical Network: Depends on the network

Real Networks
Include many different types of circuits
  – Different speeds
  – Some LAN, some Wide-Area connections
Rely on routers to connect the different sub-networks
Routers are not expected to have detailed knowledge about the traffic flows they are handling

Network Knowledge and Lack Thereof
End Systems
  – Know the applications they are running
  – Often know the network capacity they would like to have
  – Do not know the actual network capacity available
  – Do not know the “competition”, i.e. other network users’ traffic

Network Knowledge and Lack Thereof
Routers
  – Know the capacity of the links they are attached to
  – Do not know much about the network farther away from them
  – Do not know the complete path taken by the packets handled in the router
  – Do not know (from the network traffic itself) what the applications’ needs are

Routers
Must cope with packet flows that may exceed the available capacity on their outbound route
  – Short-term this indicates randomness in the traffic and we need to deal with it
  – If the overload persists long-term we call it congestion, and we would like for it to go away
Routers use queues to handle the short-term variations
Long-term overload??

Applications
Should and often can adapt to the available capacity
Should be fair in their use of resources, or
Should identify themselves as high-capacity users (and compensate the network operator accordingly)
Need information about the network and the capacity it can deliver

In the ideal case
Use a control protocol to communicate this information between applications and “the network”
  – Standard procedure in circuit switched and virtual circuit networks
    – Telephone network
    – Frame Relay and ATM
  – Increases overall complexity
  – Can provide a wide range of services really well

The case of the Internet
Successful because a transparent network encourages application development and deployment
Because the network elements are simple
  – Reasonably low complexity
  – Great flexibility
  – Not much capability to communicate network information to applications

Performance Issues
Long term
  – Increase complexity and add QoS protocol layers
  – Throw capacity at the network faster than applications require it (good luck...)
Short term
  – Implicit communication of congestion in the TCP protocol
  – Network performs many different functions, some better than others

Application View
Network attachment over which I dispatch my packets -- known
Intermediate network
  – Contains many links and queues
Application sees an overall “latency”, or delay between packet dispatch and receipt
More precisely, applications can discover Round Trip Times

Sliding Window
[Figure: the sender transmits segments 1, 2, 3, … M back to back, then sits through Idle Time; each burst plus idle time is one “Cycle”]
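
The cycle in the sliding window figure can be turned into a rough throughput estimate: at most one window of data is delivered per round trip, and never more than the link rate. The sketch below assumes a fixed window, a constant RTT, and no loss; the function names are illustrative only.

```python
def window_throughput_bps(window_bytes, rtt_s, link_bps):
    """Approximate steady-state throughput of a fixed sliding window (bits/s)."""
    # One full window can be delivered per round trip ("cycle").
    window_limited_bps = window_bytes * 8 / rtt_s
    # The sender can never go faster than its own link.
    return min(window_limited_bps, link_bps)


def idle_fraction(window_bytes, rtt_s, link_bps):
    """Fraction of each cycle during which the link carries no data."""
    return 1 - window_throughput_bps(window_bytes, rtt_s, link_bps) / link_bps


# An 8 kbyte window on a 1.5 Mbps link with a 500 ms round trip
print(window_throughput_bps(8192, 0.5, 1_500_000))   # bits per second
print(idle_fraction(8192, 0.5, 1_500_000))           # share of the cycle spent idle
```

With these numbers the window, not the link, sets the throughput, which is the situation the Idle Time in the figure illustrates.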

In practice
How is the sliding window mechanism used in TCP?
What control do we have over performance parameters?
Starting with a quick TCP review...

UDP Header
Source Port | Destination Port
Length | Checksum
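
As an illustration of the UDP header layout, the four fields are 16 bits each in network byte order, so the header can be unpacked with Python's standard struct module. The function name and the returned dictionary keys are choices made for this sketch.

```python
import struct


def parse_udp_header(data: bytes) -> dict:
    """Unpack the 8-byte UDP header: four 16-bit big-endian fields."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", data[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,      # header plus payload, in bytes
        "checksum": checksum,
    }


# Build a sample header by hand and parse it back
sample = struct.pack("!HHHH", 5000, 53, 20, 0)
print(parse_udp_header(sample))
```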

TCP Header
Source Port | Destination Port
Sequence Number
Acknowledgement Number
misc | Flags | Window (flow cntrl)
Checksum | Urgent
Options
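
The fixed part of the TCP header is 20 bytes, and the fields listed above can be unpacked in the same way as the UDP header. This is a minimal sketch using the standard struct module; it reads only the six classic flag bits and ignores any options, and the function and key names are illustrative.

```python
import struct


def parse_tcp_header(data: bytes) -> dict:
    """Unpack the fixed 20-byte TCP header (options are not parsed)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # Top 4 bits give the header length in 32-bit words ("misc" on the slide).
        "header_len": (offset_flags >> 12) * 4,
        "fin": bool(offset_flags & 0x01),
        "syn": bool(offset_flags & 0x02),
        "rst": bool(offset_flags & 0x04),
        "psh": bool(offset_flags & 0x08),
        "ack_flag": bool(offset_flags & 0x10),
        "urg": bool(offset_flags & 0x20),
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


# A hand-built SYN/ACK header: data offset 5 words, SYN and ACK flags set
syn_ack = struct.pack("!HHIIHHHH", 80, 40000, 1000, 2001, (5 << 12) | 0x12, 3500, 0, 0)
print(parse_tcp_header(syn_ack))
```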

TCP Connection Setup
“Three-Way Handshake”
  – Send SYN packet
  – Wait for peer to return a SYN/ACK packet
  – Acknowledge the SYN/ACK packet

TCP Connection Termination
Send a FIN packet
Wait to receive acknowledgement of FIN

TCP Data Exchange
Sequence Numbers - Sliding Window
  – Arbitrary initial setting
  – Labels the first byte of the segment
Acknowledgements
  – Indicate the next byte the receiver is looking for; all previous bytes have been received
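
The cumulative acknowledgement rule can be sketched in a few lines: the ack number advances over contiguous data and stops at the first gap. The example assumes segments are described as (first byte, length) pairs and ignores sequence number wrap-around; the function name is illustrative.

```python
def cumulative_ack(rcv_nxt, segments):
    """Return the next byte the receiver is looking for.

    rcv_nxt  -- next expected byte before these segments arrived
    segments -- iterable of (seq, length) pairs held by the receiver
    """
    for seq, length in sorted(segments):
        if seq > rcv_nxt:
            break                      # gap: a segment is missing before this one
        rcv_nxt = max(rcv_nxt, seq + length)
    return rcv_nxt


# Two in-order 500-byte segments, then one that arrived after a gap
print(cumulative_ack(1001, [(1001, 500), (1501, 500), (2501, 100)]))
```

The segment starting at 2501 is held by the receiver but does not move the acknowledgement, because byte 2001 has not arrived.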

TCP Segment Size
Originally Unlimited
  – IP fragments segments that are too large
  – Turned out to be very inefficient
SYN packet can carry the MSS (Maximum Segment Size) option
  – Must be approved in the SYN/ACK
  – Default used if the option is not present

TCP Sliding Window Operation
Sender: window runs from snd.una to snd.una + snd.wnd, with snd.nxt marking the next byte to send
  – snd.wnd (local to the Sender)
Receiver: window runs from rcv.nxt to rcv.nxt + rcv.wnd
  – rcv.wnd (Must tell the sender this value)
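
Using the sender variables named above, the amount of new data the sender may still dispatch is the window minus what is already in flight. A minimal sketch, ignoring sequence number wrap-around (the function name is illustrative):

```python
def usable_window(snd_una, snd_nxt, snd_wnd):
    """Bytes the sender may still send without waiting for another ack."""
    in_flight = snd_nxt - snd_una          # sent but not yet acknowledged
    return max(0, snd_wnd - in_flight)     # never negative, even if the window shrank


# Window of 3500 bytes starting at 2001, with 1000 bytes already outstanding
print(usable_window(snd_una=2001, snd_nxt=3001, snd_wnd=3500))
```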

Slow Start Congestion Control
[Figure: bursts of segments separated by Idle Time]
Window doubles in each “cycle”
Note: recent TCP amendments permit more than 1 initial segment

The Congestion Collapse Problem
Original TCP specs used the window for flow control, and retransmission after 2 round trip times
Congestion of a link causes the timers to “go off” before an ack can be returned
The network goes into steady state congestion where every segment is transmitted about three times

Congestion Issues
Slow Start - New Connection
  – Set send window to n*MSS (n <= 4)
  – Increase the window by MSS for each ack received
  – Exponential increase in send window size
What is the limit?
  – Window size reached before full utilization
  – Path is overloaded and an intermediate router discards one or more packets
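
Why "increase by MSS for each ack" gives exponential growth can be seen by stepping through round trips: a window of k segments produces k acks, so the window doubles each cycle. The sketch below assumes one ack per full segment (no delayed acks) and no loss; `limit_bytes` stands in for whichever limit is reached first and is a hypothetical parameter.

```python
def slow_start_windows(mss, initial_segments, limit_bytes, max_rtts=64):
    """Send window at the start of each round trip during slow start."""
    cwnd = initial_segments * mss
    history = [cwnd]
    while cwnd < limit_bytes and len(history) <= max_rtts:
        acks = cwnd // mss                          # one ack per segment sent this cycle
        cwnd = min(cwnd + acks * mss, limit_bytes)  # +MSS per ack, capped at the limit
        history.append(cwnd)
    return history


# 1000-byte segments, one initial segment, 16000-byte limit
print(slow_start_windows(1000, 1, 16000))
```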

Congestion Issues cont...
Packet loss
  – May occur due to actual errors or congestion
  – TCP equates loss with congestion
Congestion Avoidance, Timer Back-Off
  – Reduce send window to 1/2 of previous size for each retransmit (exponential back-off)
  – After a segment is retransmitted, set the new RTO timer for that segment to 2*RTO, up to a hard upper bound (2*MSL, Maximum Segment Lifetime) (RTO = Retransmit Time-Out)
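
The timer back-off rule amounts to doubling the RTO on every retransmission of the same segment until the upper bound is reached. A small sketch, assuming an MSL of 120 seconds so that the bound 2*MSL is 240 seconds; the function name and default are illustrative.

```python
def backed_off_rto(base_rto_s, retransmissions, upper_bound_s=240.0):
    """RTO after a number of retransmissions of the same segment."""
    # Each retransmission doubles the timer; the bound (2*MSL here) caps it.
    return min(base_rto_s * 2 ** retransmissions, upper_bound_s)


for n in range(5):
    print(n, backed_off_rto(1.0, n))
```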

Congestion Issues cont...
Slow Start - After retransmission
  – Exponential slow-start up to 1/2 of the original window size
  – Above that point, increase the window by MSS for each send window ack’ed without loss
  – Linear increase in send window size
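
The two phases after a retransmission can be sketched per round trip: exponential growth up to half the window in use when the loss occurred, then one MSS per round trip. The sketch assumes the window restarts at one MSS after a timeout and that no further loss occurs; the floor of two segments on the threshold is an added assumption, and the names are illustrative.

```python
def recovery_windows(mss, window_at_loss, rtts):
    """Send window at the start of each round trip after a retransmission."""
    threshold = max(window_at_loss // 2, 2 * mss)   # half the original window
    cwnd = mss                                      # assumed restart point
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)         # slow start: doubles per round trip
        else:
            cwnd += mss                             # linear: +MSS per window ack'ed
    return history


# Loss occurred with a 16000-byte window and 1000-byte segments
print(recovery_windows(1000, 16000, 6))
```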

What can we control
Vendors
  – TCP implementation needs to follow most recent guidelines
  – TCP window size should be configurable
Users
  – Control the TCP window
    – For each application (rare)
    – For the entire workstation (more likely)

Tuning, cont.
Network Administrator
  – Router Queues
[Figure: router queues, each labeled In and Out]

Optional Slides on TCP window operation

Example...
Sender
  – Sent but no ack received yet
  – Next segment to send
  – Available window for further sends
Receiver
  – Received and acked; not yet picked up by client
  – Available receive window space

Segment Dispatch
Dispatch segment to IP
Set RTO (Retransmit Time Out) timer
  – Proportional to the Round Trip Time (RTT)
[Figure, Sender: Sent but no ack received yet; Next segment to send; Available window for further sends]

Segment Receipt with Pickup
Send Ack segment with Ack = 2001, Window = 4000
[Figure, Receiver: Received and picked up by client; Available receive window space]

Segment Receipt w/o Pickup
Send Ack packet with Ack = 2001, Window = 3500
[Figure, Receiver: Received and picked up by client; Received but not picked up by client; Available receive window space]

Acknowledgement Receipt
Seg received with Ack = 2001, Win = 3500
  – Left window edge to 2001
  – Right window edge to 5501
[Figure, Sender: window before and after the acknowledgement; right edge at 5501]
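
The window edge arithmetic on this slide is a one-line calculation: the left edge moves to the ack number and the right edge is the ack number plus the advertised window. The function name is illustrative.

```python
def apply_ack(ack, win):
    """Return (left_edge, right_edge) of the send window after an ack arrives."""
    left_edge = ack            # every byte below the ack number has been received
    right_edge = ack + win     # the receiver accepts bytes up to, not including, this one
    return left_edge, right_edge


print(apply_ack(2001, 3500))   # (2001, 5501)
```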

Segment Receipt After Segment Loss
Send a “duplicate” acknowledgement
  – Send Ack packet with Ack = 2001
  – Window = 3500
[Figure, Receiver: Received and picked up by client; Received but not picked up by client; Missing segment; Last segment received]

Retransmission
Highest Ack Number received is 2001
  – Duplicate Ack = 2001 may have been received
RTO timer for segment 2001 expires and 2001 is retransmitted
  – Trigger congestion avoidance algorithm
  – We really want to avoid this because RTO is large
[Figure: Sender]

Retransmit Timing and Window Size - Single Error
BDP (Bandwidth Delay Product)
  – Ethernet: 1 ms * 10 Mbps = 1250 bytes
  – Satcom T1: 500 ms * 1.5 Mbps = 94 kbytes
Assume window size = BDP
  – RTO > 2*RTT
  – “Recovery Ack” after retransmit needs 1 RTT
  – Channel idles for length of RTO (“drained pipe”)
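
The two BDP figures above can be checked directly. The sketch takes the round trip time in milliseconds and the rate in bits per second, and converts the product to bytes; the function name is illustrative.

```python
def bdp_bytes(rtt_ms, rate_bps):
    """Bandwidth delay product in bytes."""
    # (rtt_ms / 1000) seconds * rate_bps bits per second / 8 bits per byte
    return rtt_ms * rate_bps / 8000


print(bdp_bytes(1, 10_000_000))    # Ethernet example
print(bdp_bytes(500, 1_500_000))   # Satcom T1 example, about 94 kbytes
```

A window smaller than the BDP leaves the path partly idle even without errors, which is why the slide assumes window size = BDP.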

Retransmission Timer Implementation
Running estimate (based on Acks) of
  – Average RTT
  – RTT variance factor
Exclude retransmissions
Set RTO to RTT times RTT variance factor (with a hard upper bound)
  – Around 2 RTT for lightly loaded links
  – As high as 16 RTT for congested links
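
One common way to keep such running estimates is the smoothed RTT and RTT variation calculation standardized in RFC 6298. Note that this is a swapped-in formulation: it adds four times the variation to the average rather than multiplying by a factor as worded on the slide. The class name, the bounds, and the default values are assumptions of this sketch.

```python
class RtoEstimator:
    """Running RTT estimate in the style of RFC 6298 (illustrative sketch)."""

    def __init__(self, initial_rto=1.0, min_rto=0.0, max_rto=60.0):
        self.srtt = None          # smoothed (average) RTT
        self.rttvar = None        # smoothed RTT variation
        self.initial_rto = initial_rto
        self.min_rto = min_rto
        self.max_rto = max_rto    # hard upper bound

    def rto(self):
        if self.srtt is None:
            return self.initial_rto
        value = self.srtt + 4 * self.rttvar
        return max(self.min_rto, min(value, self.max_rto))

    def sample(self, rtt, retransmitted=False):
        """Feed one RTT measurement taken from an ack; return the new RTO."""
        if retransmitted:
            return self.rto()     # exclude retransmissions: the sample is ambiguous
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        return self.rto()


est = RtoEstimator()
for measured in (0.5, 0.5, 0.5, 0.9):
    print(est.sample(measured))
```

With steady samples the RTO settles toward the RTT itself; a jump in the measured RTT raises the variation term and pushes the RTO up, matching the larger multiples the slide quotes for congested links.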

Window Scaling
16 bit window field in the TCP header allows a maximum of 64 kbytes for the window.
RFC 1323 defines the window scaling option:
  – SYN segment suggests a “scaling” factor
  – SYN/ACK approves
  – All window advertisements are scaled by that factor prior to use in TCP
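
In RFC 1323 the scaling factor is carried as a shift count, so the effective window is the 16-bit header field shifted left by that count, with the count limited to 14. A minimal sketch; rejecting out-of-range values with an exception is a choice made for this example, not specified behavior.

```python
MAX_SHIFT = 14   # largest shift count RFC 1323 allows


def scaled_window(window_field, shift):
    """Effective window in bytes for a 16-bit window field and a shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


print(scaled_window(0xFFFF, 0))    # unscaled maximum, just under 64 kbytes
print(scaled_window(0xFFFF, 14))   # scaled maximum, just under 1 Gbyte
```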

Window Scaling cont...
Large windows cause an adjunct problem: sequence number reuse
  – RFC 1323 limits the window to about 1 Gbyte to fit within the sequence number space
  – OC-12 (622 Mbps) runs through 2^31 bytes of sequence numbers, the wrap-around limit used in RFC 1323, in about 28 sec
  – Segments can “live” in the network for 120 sec
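
The 28 sec figure can be reproduced by dividing sequence space by line rate. It corresponds to 2^31 bytes, half of the 32-bit sequence space, at the OC-12 rate of 622.08 Mbps; the full 2^32 bytes take about twice as long. Both are well under the 120 sec a segment can live. The function name and default are illustrative.

```python
def wrap_time_s(rate_bps, space_bytes=2**31):
    """Seconds needed to send `space_bytes` of sequence numbers at `rate_bps`."""
    return space_bytes * 8 / rate_bps


OC12_BPS = 622_080_000
print(round(wrap_time_s(OC12_BPS), 1))          # half the sequence space
print(round(wrap_time_s(OC12_BPS, 2**32), 1))   # the full sequence space
```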

Fast Retransmit
Duplicate Acks with Ack = 2001 have been received
Re-send segment 2001 before RTO expires
  – “Guess” that 2001 was lost
  – Wait for >= 3 dup acks (segments could just have arrived out-of-order)
  – Enter congestion avoidance with allowance for duplicate acks
[Figure: Sender]
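
The duplicate ack rule can be sketched as a counter over the stream of ack numbers the sender sees: the first ack for a value is not a duplicate, and the retransmission fires when the third duplicate arrives. The function name and the list-based interface are illustrative, and window adjustments are left out.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ack numbers for which a fast retransmit would be triggered."""
    last_ack = None
    duplicates = 0
    triggered = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                triggered.append(ack)      # re-send the segment starting at `ack`
        else:
            last_ack = ack                 # new data acknowledged: reset the count
            duplicates = 0
    return triggered


# One original ack for 2001 followed by three duplicates
print(fast_retransmit_points([1001, 2001, 2001, 2001, 2001, 4001]))
```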

Selective Acknowledgement
Enabled during SYN and SYN/ACK
Receiver sends segment with
  – Ack = 2001, Window = 3500
  – SACK option: block start = 2501, end = 2600
[Figure, Receiver: Missing Segment; Last segment received]
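
How a receiver might derive SACK blocks from the out-of-order data it holds can be sketched as a merge of byte ranges above the cumulative ack. This sketch follows the RFC 2018 convention in which the right edge is the sequence number just past the last byte of the block, so if the slide's end = 2600 is read as the last byte held, the same block appears here as (2501, 2601). It assumes `rcv_nxt` has already been advanced over all contiguous data; the names are illustrative.

```python
def sack_blocks(rcv_nxt, segments):
    """Merge out-of-order (seq, length) segments into (left, right) SACK blocks."""
    blocks = []
    for seq, length in sorted(segments):
        end = seq + length                     # one past the last byte held
        if end <= rcv_nxt:
            continue                           # already covered by the cumulative ack
        start = max(seq, rcv_nxt)
        if blocks and start <= blocks[-1][1]:
            # Adjacent or overlapping: extend the previous block
            blocks[-1] = (blocks[-1][0], max(blocks[-1][1], end))
        else:
            blocks.append((start, end))
    return blocks


# Segment 2001 is missing; a 100-byte segment starting at 2501 has arrived
print(sack_blocks(2001, [(2501, 100)]))
```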