CSE 561 – Reliability David Wetherall djw@cs.washington.edu Spring 2000
This Lecture
- Routers, continued
- End-to-end argument
- Retransmission timers
djw // CS 561, Spring 2000

E2E Paper
- Saltzer, Reed, Clark, 1984
- Captures folklore in a now-classic systems paper
- Design requires deciding what functions to implement and where to place them; E2E guides the latter
- E2E is a powerful design principle, but not a law

E2E Argument
- A function might be placed in the application, in the network, or in both.
- Place it where it can be done correctly and completely. Place it elsewhere as well only if the partial implementation there provides a significant performance gain.
- Rationale for moving functions out of the network and into end systems.
- A kind of Occam’s razor.

Example: Careful File Transfer
- File transfer is exposed to many kinds of errors other than network corruption (disk, software)
- An E2E check and retry is required for correctness
  - Reduces the complexity of handling low-probability failures
- Strong network checks are not sufficient
  - So avoid them as wasteful
- Q: How does file transfer work today?
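
The check-and-retry idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `careful_transfer`, `sha256_of`, the `copy` parameter, and the choice of SHA-256 are all hypothetical names and choices, not anything from the paper. In a real transfer the expected digest would be computed at the sender and shipped to the receiver; here both files are local so the sketch stays self-contained.

```python
import hashlib
import shutil


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def careful_transfer(src, dst, copy=shutil.copyfile, max_attempts=3):
    """Copy src to dst, verify end to end, and retry on a mismatch.

    `copy` stands in for whatever moves the bytes (assumed, possibly
    unreliable). Returns the number of attempts used.
    """
    expected = sha256_of(src)
    for attempt in range(1, max_attempts + 1):
        copy(src, dst)
        # Re-read the destination file itself, so the check also covers
        # disk and software errors, not only corruption in the network.
        if sha256_of(dst) == expected:
            return attempt
    raise IOError("transfer failed after %d attempts" % max_attempts)
```

Because the check compares what is finally stored against what was originally read, it works no matter how reliable `copy` is, which is why stronger checks inside the network do not remove the need for it.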

Simplicity vs. Complexity
- Downsides of low-level implementation
  - Duplication of effort can lower performance
  - Optimizes the network for one type of use
- Upsides of low-level implementation
  - Partial implementation can improve performance
  - Avoids repeated implementation by each app

Tensions
- Non-performance aspects of network implementation
  - Bandwidth enforcement, firewalls, acceptable use policies (AUPs)
  - Need to take administrative regions into account
- System evolution
  - Transparent caching, NAT boxes
  - Want to administer end systems in a scalable manner
- Generic vs. per-application network support
  - Multicast, content distribution, active networks
  - Value in using network location if not a cost to all

Retransmission Timers
- Timeouts (RTO) are used to decide that a packet has been lost and should be retransmitted.
- Based on an estimate of the RTT, and hence of the maximum likely RTT:
    SRTT = alpha x sample + (1 - alpha) x SRTT
    RTO = beta x SRTT
- RTO backs off exponentially on successive losses
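
The two formulas and the backoff rule can be written out as a small Python sketch. The class name `RttEstimator`, the default values `alpha=0.125` and `beta=2.0`, the initial RTO, and the cap `max_rto` are all assumptions chosen for illustration; the slide gives only the form of the update, with alpha weighting the new sample.

```python
class RttEstimator:
    """Smoothed RTT estimate with exponential RTO backoff (illustrative)."""

    def __init__(self, alpha=0.125, beta=2.0, initial_rto=3.0, max_rto=64.0):
        self.alpha = alpha      # weight given to the newest sample
        self.beta = beta        # safety factor applied to SRTT
        self.max_rto = max_rto  # assumed upper bound on the timeout
        self.srtt = None        # no estimate until the first sample
        self.rto = initial_rto

    def on_sample(self, sample):
        """Fold one RTT measurement (seconds) into SRTT and recompute RTO."""
        if self.srtt is None:
            self.srtt = sample
        else:
            # SRTT = alpha x sample + (1 - alpha) x SRTT
            self.srtt = self.alpha * sample + (1 - self.alpha) * self.srtt
        # RTO = beta x SRTT
        self.rto = min(self.beta * self.srtt, self.max_rto)
        return self.rto

    def on_timeout(self):
        """Double the RTO after a loss, up to the cap."""
        self.rto = min(2 * self.rto, self.max_rto)
        return self.rto
```

A larger alpha tracks changes in the path faster but is noisier; a larger beta makes spurious retransmissions rarer at the cost of slower loss detection.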

The Value of a Good Timer
- Q: Does any of this matter?
- A: Yes. Critical to performance (protocol and network)
- If too large:
  - Detection of losses is delayed, the window doesn’t advance, and the result is low throughput, especially on error-prone links
- If too small:
  - Early retransmissions (before the original ack arrives)
  - Seriously bad for the network

Karn and Partridge (SIGCOMM’87)
- Deals with retransmission ambiguity
  - Is an ack for the original or for the retransmitted packet?
- Problem: timers were failing, and this ambiguity was one piece of the cause.
- None of the simple fixes works: assuming acks are always for the new packet, assuming they are always for the old one, or simply ignoring samples from retransmitted packets.

Karn’s Algorithm
- Insight: use backoff as part of RTT estimation
  1. Don’t use the ack for a retransmitted packet to calculate RTT
  2. On loss, back off the RTO and keep using it for subsequent packets until step 3
  3. When an ack for a singly-transmitted packet arrives, use the sample to update RTT and reset RTO
- Step 1 avoids retransmission ambiguity, step 2 ensures good samples will arrive, and step 3 tells us when we get one.
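
The three steps can be sketched as a sender-side timer in Python. This is an illustration, not the paper's code: the class `KarnTimer`, its method names, the per-packet bookkeeping, and the default constants are all assumptions. Time is passed in explicitly so the logic is easy to follow and test.

```python
class KarnTimer:
    """Illustrative RTO timer that follows the three steps above."""

    def __init__(self, alpha=0.125, beta=2.0, initial_rto=3.0, max_rto=64.0):
        self.alpha = alpha
        self.beta = beta
        self.max_rto = max_rto
        self.srtt = None
        self.rto = initial_rto
        self.sent = {}  # seq -> (time of last send, number of transmissions)

    def on_send(self, seq, now):
        """Record a transmission (or retransmission) of packet `seq`."""
        _, count = self.sent.get(seq, (now, 0))
        self.sent[seq] = (now, count + 1)

    def on_timeout(self, seq, now):
        """Step 2: back off the RTO, keep it, and retransmit."""
        self.rto = min(2 * self.rto, self.max_rto)
        self.on_send(seq, now)

    def on_ack(self, seq, now):
        """Handle an ack; return True if it produced an RTT sample."""
        entry = self.sent.pop(seq, None)
        if entry is None:
            return False  # unknown or duplicate ack
        sent_at, count = entry
        if count > 1:
            # Step 1: ambiguous ack, so discard the sample. The backed-off
            # RTO stays in force for subsequent packets.
            return False
        # Step 3: unambiguous sample, so update SRTT and reset the RTO.
        sample = now - sent_at
        if self.srtt is None:
            self.srtt = sample
        else:
            self.srtt = self.alpha * sample + (1 - self.alpha) * self.srtt
        self.rto = min(self.beta * self.srtt, self.max_rto)
        return True
```

Keeping the backed-off RTO after an ambiguous ack is what lets a later packet get through without being retransmitted, which in turn yields the clean sample that step 3 waits for.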

TCP Timestamps
- Several TCP options were added in the early 1990s as part of the extensions for high performance
- Round-trip time measurement
  - Want more than one RTT sample per window
  - Send a timestamp with each packet; the receiver echoes it in the ack
  - Resolves retransmission ambiguity
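
A toy Python model shows why echoing a timestamp removes the ambiguity. The dictionary fields `tsval` and `tsecr` borrow the names used by the TCP timestamps option (RFC 1323); everything else, including the function names and the use of floating-point seconds, is a simplifying assumption. Real TCP carries 32-bit clock ticks and has extra rules for delayed acks that are not modeled here.

```python
def make_segment(seq, now):
    """Sender: stamp each transmission with the sender's current clock."""
    return {"seq": seq, "tsval": now}


def make_ack(segment):
    """Receiver: echo the timestamp of the segment being acknowledged."""
    return {"ack": segment["seq"], "tsecr": segment["tsval"]}


def rtt_from_ack(ack, now):
    """Sender: RTT is the current time minus the echoed timestamp."""
    return now - ack["tsecr"]


# A packet sent at t=0.0 and retransmitted at t=1.0 carries a different
# timestamp each time, so the ack identifies which copy it answers.
original = make_segment(7, 0.0)
retransmit = make_segment(7, 1.0)
```

Since every ack carries its own echoed timestamp, the sender can take a sample from each ack rather than one per window, and retransmitted packets no longer need to be excluded.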