
1 Department of Informatics Networks and Distributed Systems (ND) group
TCP "TEB" (Timer-based Exponential Backoff): Code and Rationale Michael Welzl Net Group, University of Rome Tor Vergata

2 From my last talk: what I envision
- Slow start (SS) -> congestion avoidance (CA): slow start only at the beginning!
- Simple rules for increase/decrease events (magnitude determined by the congestion control, as before):
  - Increase: upon ACK
  - Decrease: upon ECN or loss
- Loss determined via an (aggressive, not RTO!) per-packet timeout; reduce every time!
- Undo if we got it wrong (ACKs that should not have arrived: spurious loss detection), and adjust the timers
- Avoid over-reacting: look at the ACK rate + RTT
- No need for RTO with SS, because we back off exponentially (instead of: cwnd *= factor, then cwnd = 1); this is already done today with ECN!

3 Overview: goals
- Simplify, but perform no worse than TCP SACK
  - The scoreboard logic is unnecessarily complex (related to SACK processing)
- Using timers, implement the simplest thing that will "do the job"
  - Maybe don't "overdo" it: e.g., don't always pace
- Key aspects:
  - Ignore DupACKs (if SACK is enabled, SACKs are still parsed; this only avoids unnecessary retransmissions)
  - No RTO
  - All logic based on timers
  - Also need to pace (a little bit)

4 When sending a data packet...
SendDataPacket_HOOK:
- Remember the transmission time (for pacing)
- Remember the highest sent seqno ("highestSentSeqno"), to correctly end recovery
- Insert an entry into a "TimerEntry" timer data structure (push into a FIFO queue; actually a linked list, but only because of SACK). The entry holds:
  - seqno
  - retransCount (= global round + 1, initially 0 + 1 = 1)
  - scheduledTime: to be able to check whether this entry is even still relevant when we see it again (needed in the simulation; maybe unnecessary when associated with a real timer)
- Schedule a timeout at now + timeout (the timeout value is known from the SYN handshake RTT estimation)
(A sketch of this bookkeeping follows below.)
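A minimal standalone C++ sketch of this bookkeeping, not the actual ns-3 hook: TimerEntry, seqno, retransCount, scheduledTime, highestSentSeqno and the FIFO push come from the slide; the class name TebSender, the "now" parameter and the std::deque are assumptions made for illustration.

```cpp
#include <cstdint>
#include <deque>

// One pending per-packet timer, as described on the slide.
struct TimerEntry {
  uint32_t seqno;          // sequence number this timer guards
  uint32_t retransCount;   // global round + 1 at insertion time (initially 1)
  double   scheduledTime;  // when the timeout is due (simulation sanity check)
};

class TebSender {
 public:
  // Called from the data-sending path; 'now' is the current time in seconds.
  void SendDataPacketHook(uint32_t seqno, double now) {
    m_prevTransmissionTime = now;                       // remembered for pacing
    if (seqno > m_highestSentSeqno) {
      m_highestSentSeqno = seqno;                       // to correctly end recovery
    }
    // Push a per-packet timer entry (FIFO; a linked list in the real code, for SACK removal).
    m_timers.push_back({seqno, m_currentRetransRound + 1, now + m_timeout});
    // In the real code, a timeout event would be scheduled here at now + m_timeout,
    // with m_timeout known from the SYN handshake RTT estimation.
  }

 private:
  std::deque<TimerEntry> m_timers;
  uint32_t m_highestSentSeqno = 0;
  uint32_t m_currentRetransRound = 0;
  double m_prevTransmissionTime = 0.0;
  double m_timeout = 0.2;  // placeholder; updated to 2 * RTT estimate by the ACK path
};
```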

5 When any ACK arrives (also SYN/ACK)...
EstimateRtt_HOOK:
- Check: ackNo > highestAckNo (initialized to 0)? Else it's a DupACK and we want to ignore it => exit (we can't even estimate the RTT with it)
- Update highestAckNo
- Using the timestamps option, update the RTT estimate
  - For now: only let it grow (simple max)
  - Timeout value = 2 * RTT estimate. Rationale: at worst, in slow start, the RTT doubles (but not from one packet to the next, so this should be safe), and we need to allow some overhead based on the most recent RTT value (the max lags behind)
  - In the long run, the max becomes the correct value (until rerouting happens: the code needs to be updated for this)
- If SACK is enabled, parse the SACK blocks: remove SACKed entries from the TimerEntry data structure
- Check whether this ACK ends recovery: discussed later
(A sketch of this hook follows below.)
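A hedged sketch of the ACK-side hook, assuming the names from the slide (highestAckNo, the max-based RTT estimate, timeout = 2 * RTT); the rttSample parameter, the vector of SACKed seqnos and the class name are illustrative assumptions, and real SACK blocks cover ranges rather than single seqnos.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <vector>

struct TimerEntry { uint32_t seqno; uint32_t retransCount; double scheduledTime; };

class TebAckProcessing {
 public:
  // Returns false for DupACKs, which are ignored entirely.
  bool EstimateRttHook(uint32_t ackNo, double rttSample,
                       const std::vector<uint32_t>& sackedSeqnos) {
    if (ackNo <= m_highestAckNo) return false;           // DupACK: can't even estimate the RTT
    m_highestAckNo = ackNo;

    m_rttEstimate = std::max(m_rttEstimate, rttSample);  // simple max for now
    m_timeout = 2.0 * m_rttEstimate;                     // at worst, in slow start, the RTT doubles

    // If SACK is enabled, drop SACKed entries so they are never retransmitted.
    for (uint32_t s : sackedSeqnos) {
      m_timers.erase(std::remove_if(m_timers.begin(), m_timers.end(),
                                    [s](const TimerEntry& e) { return e.seqno == s; }),
                     m_timers.end());
    }
    // (Checking whether this ACK ends recovery is shown on a later slide.)
    return true;
  }

 private:
  std::deque<TimerEntry> m_timers;
  uint32_t m_highestAckNo = 0;   // "Init 0" on the slide
  double m_rttEstimate = 0.0;
  double m_timeout = 0.0;
};
```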

6 Side note: pacing
Goal: avoid bursts.
- Note: pacing is not so necessary when every ACK clocks out a new packet (that already gives a bottleneck-based gap). In the TEB code it is applied (only!) after slow start; there, it is the time gap between timeouts.
- Don't "think in slots"... that is unnecessarily complex. Don't wait for the "next free time slot": just ensure a minimum gap.
- TEB implementation (sketched below):
  - Calc: input: # of packets and a time; output: the gap ("pacingDelay")
  - Apply: when sending, ensure pacingDelay after the previous sending or scheduling time (a scheduled packet may not have been sent yet). Careful: don't re-schedule the scheduled transmission itself! Then, update the previous sending and scheduling times.
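A small sketch of the Calc/Apply split described above, under the assumption that "Calc" simply divides a time interval by a packet count and "Apply" enforces the resulting minimum gap; the struct and method names are illustrative, not the real code.

```cpp
#include <algorithm>

// Minimal pacing helper: times are in seconds.
struct Pacer {
  double pacingDelay = 0.0;          // minimum gap between transmissions
  double prevTransmissionTime = 0.0;
  double prevScheduledTime = 0.0;

  // Calc: spread 'packets' transmissions over 'interval' seconds.
  void Calc(unsigned packets, double interval) {
    pacingDelay = (packets > 0) ? interval / packets : 0.0;
  }

  // Apply: given the current time, return when the next packet may go out,
  // ensuring pacingDelay after the previous sending *or* scheduling time
  // (a scheduled packet may not have been sent yet).
  double Apply(double now) {
    double earliest = std::max(prevTransmissionTime, prevScheduledTime) + pacingDelay;
    double sendTime = std::max(now, earliest);
    if (sendTime <= now) {
      prevTransmissionTime = now;    // send immediately
    } else {
      prevScheduledTime = sendTime;  // schedule for later; don't re-schedule it again
    }
    return sendTime;
  }
};
```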

7 When a timeout happens... TebTimeout
- Check whether anything needs to be done (check the seqno stored in the front queue element; remove it if it was ACKed)... go on if there's something left and the timer is due. (All of this is maybe only simulation specific.)
- Enter a "recovery state": some new state name, to make sure we don't, e.g., increase cwnd upon ACKs
- If the front queue element's retransCount > the global currentRetransRound (i.e., only once per RTT):
  - reduce ssthresh (ask the congestion control), set prev_ssthresh = ssthresh and cwnd = ssthresh
  - update the global round ("currentRetransRound")
  - remember the current RTT for pacing (global "pacingTimePeriod") and packetsToPace (# of packets per RTT): ssthresh / segment size
  - retransmit (with pacing), and remember highestRetransSeqno (we'll need this later, to correctly end recovery)
- Else return! (An additional "&& currentRetransRound == 0" check makes sure we only react once.)
(A sketch of this handler follows below.)
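A rough sketch of the timeout handler, assuming the slide's names (currentRetransRound, pacingTimePeriod, packetsToPace, highestRetransSeqno); the halving rule in ReduceSsthresh, the segment size and the simplified RetransmitFront are placeholders, since the slide leaves the congestion-control details to the CC module.

```cpp
#include <cstdint>
#include <deque>

struct TimerEntry { uint32_t seqno; uint32_t retransCount; double scheduledTime; };

class TebTimeoutHandler {
 public:
  void TebTimeout(double now, uint32_t highestAckNo, double rttEstimate) {
    // Drop front entries that were already ACKed; bail out if nothing is due.
    while (!m_timers.empty() && m_timers.front().seqno < highestAckNo) m_timers.pop_front();
    if (m_timers.empty() || m_timers.front().scheduledTime > now) return;

    m_inRecovery = true;  // e.g., stop increasing cwnd upon ACKs

    // React only once per RTT: the front entry's retransCount says which round it belongs to.
    if (m_timers.front().retransCount > m_currentRetransRound) {
      m_prevSsthresh = m_ssthresh;
      m_ssthresh = ReduceSsthresh(m_ssthresh);   // ask the congestion control
      m_cwnd = m_ssthresh;
      m_currentRetransRound = m_timers.front().retransCount;
      m_pacingTimePeriod = rttEstimate;          // pace retransmissions over one RTT
      m_packetsToPace = m_ssthresh / kSegmentSize;
      RetransmitFront();                         // paced; updates highestRetransSeqno
    }
    // else: already reacted in this round; just return.
  }

 private:
  static constexpr uint32_t kSegmentSize = 1448;
  uint32_t ReduceSsthresh(uint32_t s) { return s / 2; }  // placeholder decrease rule
  void RetransmitFront() { m_highestRetransSeqno = m_timers.front().seqno; }

  std::deque<TimerEntry> m_timers;
  bool m_inRecovery = false;
  uint32_t m_currentRetransRound = 0, m_highestRetransSeqno = 0;
  uint32_t m_cwnd = 0, m_ssthresh = 0, m_prevSsthresh = 0, m_packetsToPace = 0;
  double m_pacingTimePeriod = 0.0;
};
```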

8 An example of pacing trouble
- It's t=1, and we decide to pace every second: remember the previous send time and the previous scheduled time; when called, check: re-schedule or transmit? (Is it a scheduled sending time? Is it before the next pacing time? After?)
- If the decision is "schedule":
  - next timeout: t=1.25 => we schedule this for t=2
  - next timeout: t=1.5 => we schedule this for t=3
  - next timeout: t=1.75 => we schedule this for t=4
  - next timeout: t=2.0 => but the packet's scheduledTime in the queue says that it isn't due for transmission yet...
- At least in simulations, it is easy to get into a re-scheduling loop this way.
- Fix here: when a packet "isn't due yet", remember its "delayedTransmissionTime"; later, check it before deciding: re-schedule or transmit? (A sketch of this fix follows below.)
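One possible reading of the "delayedTransmissionTime" fix, sketched in C++; the decision logic here is an interpretation of the slide, not the actual code.

```cpp
// When a scheduled packet is found "not due yet", remember the time we already
// delayed it to, so the next decision transmits instead of pushing the packet
// further and further into the future (the re-scheduling loop above).
struct PacingDecision {
  double pacingDelay = 1.0;               // e.g., pace every second
  double prevScheduledTime = 0.0;
  double delayedTransmissionTime = -1.0;  // < 0 means "nothing pending"

  // Returns the time at which to transmit the packet.
  double Decide(double now) {
    // A transmission we already delayed once and that is now due: send it.
    if (delayedTransmissionTime >= 0.0 && now >= delayedTransmissionTime) {
      delayedTransmissionTime = -1.0;
      prevScheduledTime = now;
      return now;                         // transmit
    }
    double nextAllowed = prevScheduledTime + pacingDelay;
    if (now >= nextAllowed) {             // far enough from the previous one
      prevScheduledTime = now;
      return now;                         // transmit
    }
    prevScheduledTime = nextAllowed;      // re-schedule for the next allowed time
    delayedTransmissionTime = nextAllowed;  // remember, so we don't re-schedule again
    return nextAllowed;
  }
};
```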

9 Checking if an ACK ends recovery
EstimateRtt_HOOK: end recovery upon
- Condition 1: ackNo > highestSentSeqno (this ACK acknowledges everything that was ever sent), OR
- Condition 2: highestRetransSeqno >= highestSentSeqno (we have retransmitted everything and are only waiting for an ACK now)
Then reset some things (currentRetransRound, highestRetransSeqno, ...) and tell the simulator we're done (state = "OPEN" (CA), etc.). A sketch of this check follows below.
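A short sketch of the two end-of-recovery conditions; the struct name and the MaybeEndRecovery helper are assumed, while the conditions themselves are taken from the slide.

```cpp
#include <cstdint>

struct RecoveryState {
  bool inRecovery = false;
  uint32_t highestSentSeqno = 0, highestRetransSeqno = 0, currentRetransRound = 0;

  // Called from the ACK-processing hook.
  void MaybeEndRecovery(uint32_t ackNo) {
    if (!inRecovery) return;
    bool everythingAcked = ackNo > highestSentSeqno;                        // condition 1
    bool everythingRetransmitted = highestRetransSeqno >= highestSentSeqno; // condition 2
    if (everythingAcked || everythingRetransmitted) {
      currentRetransRound = 0;   // reset the recovery bookkeeping
      highestRetransSeqno = 0;
      inRecovery = false;        // back to "OPEN" (congestion avoidance)
    }
  }
};
```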

10 What this gives us....

11

12 This is "correct": ACK tells us: no more packets in flight Available window is 9 packets
Being able to handle double drops doesn't help us: our recovery has ended, this happens afterwards, and then we enter recovery again Fix: either clock out via DupACKs (ensure packets are in flight) => we didn't want this... ... or: pace after recovery

13 The fix in action

14
- Almost parallel because, before the loss, the RTT was almost the same, and that rate was fine (so we keep it); only cwnd was too large
- The cwnd that we want, but not the rate that we want forever: this is cwnd-after-congestion / RTT (Reno: half the rate)
- Different angles here because the RTT changed very fast

15 In comparison: normal SACK TCP
Fantastic scoreboard magic

16 normal SACK TCP's cwnd

17 How? Pacing after recovery: init
EstimateRtt_HOOK, when ending recovery:
- packetsToPace = prev_cwnd / segment size
  - Important! Else we end up pacing forever (you'll see)
  - Logic: keep the previous (bottleneck-clocked) rate, despite reducing cwnd
- prevPacingTime = prevTransmissionTime
  - Correctly initializes the next sending time
- doPostRecoveryPacing = true: well yes, do turn it on
- postRecoveryPrevAvailWin = AvailableWindow() (special trick, explained later)
- ns-3's AvailableWindow (we use the case without SACK):
  unack = UnAckDataCount();  // number of outstanding bytes (highest transmitted seqno - highest ack no)
  win = min(cwnd, rwnd);
  return win - unack;
(A sketch of this initialization follows below.)
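A sketch of the initialization, assuming the slide's variable names; AvailableWindow() mirrors the quoted ns-3 logic for the no-SACK case, while the class name, segment size and member layout are illustrative.

```cpp
#include <algorithm>
#include <cstdint>

class PostRecoveryPacingInit {
 public:
  // Called from the ACK hook at the moment recovery ends.
  void OnRecoveryEnd(double prevTransmissionTime) {
    m_packetsToPace = m_prevCwnd / kSegmentSize;     // keep the previous, bottleneck-clocked rate
    m_prevPacingTime = prevTransmissionTime;         // correctly initializes the next sending time
    m_doPostRecoveryPacing = true;
    m_postRecoveryPrevAvailWin = AvailableWindow();  // special trick (see the next slide)
  }

  // ns-3-style available window, case without SACK.
  uint32_t AvailableWindow() const {
    uint32_t unack = m_highestTxSeqno - m_highestAckNo;  // outstanding bytes
    uint32_t win = std::min(m_cwnd, m_rwnd);
    return (win > unack) ? win - unack : 0;
  }

 private:
  static constexpr uint32_t kSegmentSize = 1448;
  uint32_t m_prevCwnd = 0, m_cwnd = 0, m_rwnd = 0;
  uint32_t m_highestTxSeqno = 0, m_highestAckNo = 0;
  uint32_t m_packetsToPace = 0, m_postRecoveryPrevAvailWin = 0;
  double m_prevPacingTime = 0.0;
  bool m_doPostRecoveryPacing = false;
};
```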

18 Pacing after recovery: SendPendingData_HOOK
- Only continue with our special code if AvailableWindow() > 0 and doPostRecoveryPacing (and: set doPostRecoveryPacing = false if AvailableWindow() = 0)
  - At first, we tried a fixed # of packets (one window) instead of AvailableWindow() > 0, but this gives us yet another burst, AFTER our post-recovery-pacing phase
- Check: if state = loss recovery, end this! (We may actually never really get here, which is good)
- Pace! ... and do a trick: if we're not fast enough, AvailableWindow() may never become 0, and we'd pace forever! So: if the available window has increased, increase the sending rate a little (packetsToPace++)
(A sketch of this hook follows below.)
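A sketch of the post-recovery pacing branch of the sending hook; the slide's flags and the packetsToPace++ trick are kept, while the method signature and return-value convention are assumptions.

```cpp
#include <cstdint>

class PostRecoveryPacing {
 public:
  // Returns true if the caller should pace the next transmission.
  bool SendPendingDataHook(uint32_t availableWindow, bool inLossRecovery) {
    if (availableWindow == 0) {
      m_doPostRecoveryPacing = false;  // done: the window is used up
      return false;
    }
    if (!m_doPostRecoveryPacing) return false;
    if (inLossRecovery) {              // a new loss event: normal recovery takes over
      m_doPostRecoveryPacing = false;
      return false;
    }
    // Trick: if the available window has grown since last time, we are not sending
    // fast enough and would pace forever; speed up a little.
    if (availableWindow > m_postRecoveryPrevAvailWin) m_packetsToPace++;
    m_postRecoveryPrevAvailWin = availableWindow;
    return true;                       // caller applies the pacing gap
  }

 private:
  bool m_doPostRecoveryPacing = true;
  uint32_t m_postRecoveryPrevAvailWin = 0;
  uint32_t m_packetsToPace = 0;
};
```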

19 Un-doing spurious loss recovery
Same simulation as before, but: timeout = 1 * RTT estimate

20 Spurious loss recovery: how?
- When retransmitting a data packet, remember the latest timestamp ("retransmitTS")
  - We're sending retransmits if currentRetransRound > 0
- When detecting congestion, remember the previous state ("prev_...") before reducing cwnd and ssthresh
- When getting an ACK (conservative: full ACK only!):
  - if (time_in_ack < retransmitTS): restore the cwnd and ssthresh values, and do currentRetransRound--
  - (we can only undo one spurious event anyway; otherwise we'd have to store a list of ssthresh/cwnd values)
(A sketch of this undo logic follows below.)
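A sketch of the undo logic, using the slide's retransmitTS / prev_ state / time_in_ack idea; the method names and the reference parameters are illustrative.

```cpp
#include <cstdint>

// If the timestamp echoed in a full ACK predates the first retransmission,
// the original packet (not the retransmission) was ACKed: the loss detection
// was spurious and we undo the reduction.
class SpuriousRecoveryUndo {
 public:
  void OnRetransmit(double now) { m_retransmitTS = now; }   // latest retransmission time

  void OnCongestionDetected(uint32_t cwnd, uint32_t ssthresh) {
    m_prevCwnd = cwnd;            // remember the state before reducing
    m_prevSsthresh = ssthresh;
  }

  // Called on a full (not partial) ACK carrying an echoed timestamp.
  void OnFullAck(double timeInAck, uint32_t& cwnd, uint32_t& ssthresh,
                 uint32_t& currentRetransRound) {
    if (currentRetransRound > 0 && timeInAck < m_retransmitTS) {
      cwnd = m_prevCwnd;          // the loss detection was spurious: undo it
      ssthresh = m_prevSsthresh;
      currentRetransRound--;      // we can only undo one spurious event anyway
    }
  }

 private:
  double m_retransmitTS = 0.0;
  uint32_t m_prevCwnd = 0, m_prevSsthresh = 0;
};
```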

21 Conclusion
- A little complex already... still, nothing compared to scoreboard magic
- (Possible?) future work: the RTT estimation is a bit too simple (it won't work well with re-routing)

