Aleksandar Kuzmanovic

Low-Rate TCP-Targeted Denial of Service Attacks (The Shrew vs. the Mice and Elephants) Aleksandar Kuzmanovic, Edward W. Knightly, Rice Networks Group

Hello everybody, my name is Aleksandar Kuzmanovic and my advisor is Ed Knightly; we come from Rice University in Houston, Texas. I will present a paper entitled Low-Rate TCP-Targeted Denial of Service Attacks, or alternatively The Shrew vs. the Mice and Elephants. You probably don't know what a shrew is, but I will explain shortly, and that is actually the first contribution of this paper.

Background Traditional view of DoS attacks
Attacker consumes resources and denies service to legitimate users. Ex.: traffic floods, DDoS. Result: TCP backs off. Observe: statistical anomalies that are relatively easily detectable, due to the attacker's high rate.

To motivate this work, we start off with the traditional view of denial of service attacks, defined as an activity in which an attacker maliciously monopolizes or consumes network resources in order to deny service to legitimate users. For example, an attacker can flood certain network links with high-rate traffic, or compromise other machines in the network to direct high data volumes to a particular link; the latter is called a distributed denial of service (DDoS) attack. Since most applications in the Internet use TCP, they will back off due to this congestion; this is TCP's well-known vulnerability to high-rate non-responsive flows. However, common to the above attacks is a so-called “sledge-hammer” approach of high-rate transmission of packets towards the attacked node. So, while potentially quite harmful, the high-rate nature of such attacks presents a statistical anomaly to network monitors and can be detected relatively easily. In this paper, in contrast, we study low-rate DoS attacks that send at a sufficiently low average rate to elude detection by counter-DoS mechanisms, yet are still able to severely degrade service to legitimate users. Thus, ...

Thesis: TCP is Vulnerable to Low-rate Attacks
Shrew: low-rate TCP-targeted attacks. Elude detection by counter-DoS mechanisms. Able to severely deny service to legitimate users. Goals: analyze TCP mechanisms that can be exploited by DoS attackers; explore TCP frequency response to Shrews; evaluate detection mechanisms; analyze effectiveness of randomization strategies. Methodology: modeling, simulations, Internet experiments.

Thus, our problem here is to explore TCP's vulnerability to low-rate attacks. A reasonable question is why we would want to do something like that: are we malicious? No, our goals are as follows. First, to analyze TCP mechanisms that can be exploited by DoS attackers. Second, to explore TCP's frequency response to Shrews. Next, to evaluate detection mechanisms, and finally, to analyze the effectiveness of randomization strategies. --- OUT: First, to detect and isolate fragile network/protocol mechanisms that can be used as tools of possible DoS attacks, and second, to reveal dangerous low-rate streams and to detect and isolate applications that are able to generate such streams.

Shrew: very small but aggressive mammal that ferociously attacks and kills much larger animals with a venomous bite. Reviewer 3: “only some shrews are venomous and the amount of venom in even the venomous species is very mild.”

Since the low-rate attacks that I will present in this talk can be quite harmful to both short- and long-lived TCP flows, also known as mice and elephants, we add another animal to the Internet jungle and call these attacks Shrew attacks, inspired by a very small but aggressive mammal that ferociously attacks and kills much larger animals with a venomous bite. However, one of the reviewers observed that only some shrews are venomous and that the amount of venom in even the venomous species is very mild. ---? This also makes sense, because we will show that some flows are immune to shrews.

TCP: a Dual Time-Scale Perspective
Two time-scales fundamentally required. RTT time-scales (~ ms): AIMD control. RTO time-scales (RTO = SRTT + 4*RTTVAR): avoid congestion collapse. RTO must be lower bounded to avoid spurious retransmissions [AllPax99], and RFC 2988 recommends minRTO = 1 sec.

The main target of our attack is TCP, which is known to be the most widely used protocol in today's Internet. Before explaining the details of low-rate attacks, let me first give a brief but necessary introduction to TCP, viewed from a time-scale perspective. There are two fundamentally required time-scales at which TCP operates. The first is the RTT time-scale, typically on the order of a few tens of milliseconds, at which TCP performs the well-known additive increase / multiplicative decrease control. In cases of severe network congestion, on the other hand, when TCP experiences multiple packet losses, it has to back off for longer time-scales to avoid congestion collapse. The back-off time equals the retransmission timeout, which is computed as SRTT + 4*RTTVAR, and the RTO time-scales are usually much longer than the RTT time-scales. Moreover, Allman and Paxson did an experimental study and found that, in order for TCP to avoid spurious retransmissions, it should lower bound the RTO value: whenever the RTO is below 1 sec, it should be rounded up to 1 sec. The intuition is that it pays off to use slightly conservative values for the RTO, thereby avoiding spurious retransmissions and consequently improving throughput. The same authors wrote RFC 2988, which recommends lower bounding the RTO parameter to 1 sec. In our opinion, slow RTO time-scale mechanisms are a key source of vulnerability to low-rate attacks.
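The lower-bounded timeout described above can be written out in a few lines. This is a minimal sketch, not a TCP stack: the function name `rto` and its parameter names are hypothetical, and only the formula RTO = SRTT + 4*RTTVAR and the 1 sec lower bound come from the talk.

```python
def rto(srtt: float, rttvar: float, min_rto: float = 1.0) -> float:
    """Retransmission timeout in seconds.

    Computes SRTT + 4*RTTVAR and rounds it up to min_rto when it is smaller,
    as described in the talk (minRTO = 1 sec per RFC 2988).
    """
    return max(min_rto, srtt + 4.0 * rttvar)


if __name__ == "__main__":
    # A short-RTT flow: the computed value is far below 1 sec, so minRTO applies.
    print(rto(srtt=0.07, rttvar=0.01))
    # A long-RTT flow: the computed value exceeds minRTO and is used as is.
    print(rto(srtt=1.0, rttvar=0.25))
```

The first call illustrates why minRTO dominates for typical flows: with RTTs of tens of milliseconds, the computed timeout almost always falls below the 1 sec floor.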

TCP Timeline Timeline of TCP congestion window AIMD control
I will explain this mechanism through a timeline of the TCP congestion window. As we all know, in congestion avoidance TCP performs additive increase / multiplicative decrease control. For example, we have shown two flows, blue and green, and they both initially increase their window sizes linearly. Next, they experience packet losses, independently of each other. These losses are marked with the corresponding green and blue dots. Upon a packet loss, a TCP flow cuts its window size by half.
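The sawtooth in such a timeline can be reproduced with a toy model. The sketch below is an illustration under simplifying assumptions (one window update per RTT, losses given as a set of RTT indices, window counted in segments); the function name `aimd_window` and its parameters are hypothetical.

```python
def aimd_window(rtts: int, loss_rtts: set, start: float = 1.0) -> list:
    """Trace of the congestion window (in segments), one value per RTT.

    Additive increase: +1 segment per loss-free RTT.
    Multiplicative decrease: the window is halved in an RTT with a loss.
    """
    cwnd = start
    trace = []
    for rtt in range(rtts):
        if rtt in loss_rtts:
            cwnd = max(1.0, cwnd / 2.0)  # halve, but never below one segment
        else:
            cwnd += 1.0                  # linear growth
        trace.append(cwnd)
    return trace


if __name__ == "__main__":
    # Two flows with independent losses, as in the blue/green example.
    print(aimd_window(6, {3}, start=4.0))
    print(aimd_window(6, {1, 4}, start=4.0))
```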

The Shrew Attack (1/3)
Pulse-induced outage: multiple losses force TCP to enter the RTO mechanism. Short outages (~RTT) force TCP to timeout. All flows simultaneously enter this state.

However, if a TCP flow experiences correlated packet losses, it will be forced to enter the retransmission timeout mechanism, as shown in the picture. We refer to these events as outages; a typical such event is when all the packets from a window of data are lost. In the picture, I have shown the outage with a red line, and you can see that in this scenario both the green and the blue flow experience correlated packet losses. The key message from this slide is twofold. First, the outage only needs to be on the order of an RTT (remember, this is the short time-scale at which TCP operates), yet TCP will back off for minRTO = 1 sec, i.e., the longer time-scale, provided that SRTT + 4*RTTVAR is less than 1 sec, which is a reasonable assumption. Later in the talk I will treat the case where this does not hold. Second, observe that all flows enter the RTO period simultaneously, and this period lasts equally long for all flows. Why is this important? Because we know when to hit the system next.

The Shrew Attack (2/3)
When flows attempt to simultaneously exit timeout and enter slow-start… Shrew pulses again and forces flows synchronously back into the timeout state.

It is of course expected that protocols react identically to such events; that is why they are defined. However, observe that such deterministic protocol behavior can be exploited by a malicious attacker: once the timeout expires and both flows try to recover, we hit them again… Thus, the Shrew attacks exploit protocol determinism: the fact that all TCP flows will back off for the same minRTO period…

The Shrew Attack (3/3) Shrew periodically repeats pulse
RTT-time-scale outages spaced at minRTO periods can deny service to TCP. Flows synchronize their state to the Shrew.

By repeating these short outages on RTO time-scales, it is possible to significantly deny service to TCP flows, as shown in the figure. The key mechanism is flow synchronization: these periodic outages are able to accurately synchronize TCP flows, so that all flows behave as dictated by a single attacker.

Shrew Principles Shrews exploit protocol homogeneity and determinism
Protocols react in a pre-defined way. Tradeoff of vulnerability vs. predictability. Periodic outages synchronize TCP flow states and deny their service. Slow time-scale protocol mechanisms enable low-rate attacks. Outages at RTO scale, pulses at RTT scale imply low average rate.

So to summarize: a single RTT-length outage forces all TCP flows to simultaneously enter the timeout. All flows respond identically and back off for the minRTO period. We exploit this protocol determinism, repeat the outage after the minRTO period, and force all TCP flows to re-enter the timeout. In this way, by creating periodic outages, a single attacker synchronizes TCP flows and denies their service. Outages occur relatively slowly (RTO scale) and can be induced with a low average rate. The question is how one should create these outages:

Creating Outages in the Network
Shrew: square-wave stream (l ~ RTT, T ~ minRTO). Optimal pattern in paper. Low-rate “TCP friendly” DoS is hard to detect. Counter-DoS mechanisms are tuned for high-rate attacks. Detecting Shrews may have unacceptably many false alarms (due to legitimate bursty flows).

Well, by simply sending periodic bursts into the network. In the figure I have shown a simple square-wave DoS stream, which is the general denial of service pattern that we use. It has peak magnitude R, burst length l, and attack period T. Recall that the burst length should be on the order of the flow's round-trip time and that the period of the attack is on the time-scale of the minRTO parameter; this implies that the denial of service stream will have a very low average rate. The point of these attacks being low rate is that they are hard to detect, because most counter-DoS mechanisms are tuned for sledge-hammer attacks, which are high rate. Detecting Shrews is inherently hard because many legitimate flows in the Internet can burst for very short intervals, so detecting shrews may produce unacceptably many false alarms. We next want to see whether such a stream can accurately create outages in the network, and what happens when we multiplex this stream with a TCP flow.
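The square wave and its average rate can be made concrete with a short sketch. The symbols R, l and T are the ones used above; the function names and the numeric values in the example (a 1.5 Mb/s peak, a 70 ms burst, a 1 sec period) are illustrative assumptions chosen only to match the stated scales (l on the order of an RTT, T on the order of minRTO).

```python
def shrew_rate(t: float, peak_rate: float, burst_len: float, period: float) -> float:
    """Instantaneous sending rate of the square wave at time t (seconds).

    The stream sends at peak_rate (R) for the first burst_len (l) seconds of
    every period (T) and is silent for the rest of the period.
    """
    return peak_rate if (t % period) < burst_len else 0.0


def shrew_average_rate(peak_rate: float, burst_len: float, period: float) -> float:
    """Long-run average rate of the square wave: R * l / T."""
    return peak_rate * burst_len / period


if __name__ == "__main__":
    R, l, T = 1.5e6, 0.07, 1.0  # illustrative values, not taken from a figure
    avg = shrew_average_rate(R, l, T)
    print(f"average rate: {avg / 1e3:.0f} kb/s, i.e. {100 * avg / R:.1f}% of the peak")
```

Because the duty cycle l/T is small when l is on the RTT scale and T on the minRTO scale, the average rate stays a small fraction of the peak even when the peak equals the link capacity.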

Outline Shrew attack Simulation and Internet experiments
DoS detection mechanisms minRTO randomization

The Shrew in Action How much is TCP throughput degraded? DoS stream:
R = C = 1.5 Mb/s; l = 70 ms (~TCP RTT).

So here we have a simple simulation experiment with a single bottleneck link shared by a TCP flow and a DoS flow. The parameters are as follows: the DoS burst rate equals the link capacity, while the length of the burst is 70 ms, which approximately equals the TCP flow's round-trip time. In the figure, we have plotted the throughput of the DoS flow vs. the inter-burst period. You can see that as the inter-burst period increases, the average normalized rate of the DoS stream decreases. And what happens with the TCP throughput...

The Shrew in Action Shrews induce null frequency near RTO
Shrew has low average rate, approximately .08C. Analytical model accurately predicts degradation. (Straight line: when there is no DoS flow.)

Well, we see that when the DoS flow sends at a high rate, TCP service is denied because of its well-known vulnerability to high-rate flows. However, as explained previously, TCP also shows significant vulnerability at relatively longer attack periods. We call these time-scales of the attack the null-TCP time-scales, meaning that these are the attack time-scales at which TCP degrades the most. As hypothesized above, these time-scales are determined by the minRTO parameter, and the most interesting time-scale of the attack is exactly the 1 sec time-scale, since this is where the TCP throughput is brought to zero and the average rate of the attack is minimized. In this particular scenario it is 7% of the link capacity.
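The location of the null time-scales can be reproduced with a very simple model. The sketch below is an illustration under stated assumptions and is not necessarily the analytical model referred to on the slide: every pulse that hits an active flow causes a timeout, the timeout always lasts exactly minRTO (no exponential back-off), and the flow uses the whole link whenever it is not in timeout. The function name is hypothetical.

```python
import math


def normalized_throughput(period: float, min_rto: float = 1.0) -> float:
    """Fraction of time a flow can send under a periodic pulse of given period.

    After a pulse the flow is silent for min_rto seconds. It then sends until
    the next pulse, which arrives at the first multiple of `period` that is
    not earlier than min_rto. The cycle then repeats.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    pulses_per_cycle = math.ceil(min_rto / period)
    cycle = pulses_per_cycle * period          # time between two effective pulses
    return (cycle - min_rto) / cycle           # share of the cycle spent sending


if __name__ == "__main__":
    for T in (0.5, 0.75, 1.0, 1.5, 2.0):
        print(f"T = {T:4.2f} s  ->  throughput fraction {normalized_throughput(T):.3f}")
```

In this toy model the throughput is zero whenever the period equals minRTO or an integer fraction of it, which matches the nulls described above; the largest such period, T = minRTO, gives the attacker the lowest average rate.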

Challenges for Shrews
Aggregation: vulnerable due to Shrew-induced flow synchronization. RTT heterogeneity: Shrews are high-RTT pass filters. DoS peak rate: less-than-bottleneck bursts can damage short-RTT flows. Short-lived TCP flows: Web browsing. Internet experiments: can Shrews be successful on the Internet?

We just showed that the shrew attack can be devastating to TCP and difficult to detect. But this was a very specific scenario. Now we look at a broad class of scenarios to explore the Shrew attack in more depth. First, what we saw is that we can deny service to a single TCP flow and to homogeneous TCP aggregates without sending too much traffic into the network. However, we know that RTTs are heterogeneous and range from several ms to several hundreds of ms; how does that influence the attacks? The next challenge is the peak rate at which one needs to burst to cause the outages. The majority of the traffic in the Internet consists of short-lived flows; can we deny service to them? Next, I will provide the results from our experiments performed in the Internet. Finally, there are many counter-DoS mechanisms out there, both end-point-based and router-based, and we explore whether they are able to detect these streams.

Shrews vs. Short-lived TCP Traffic
Scenario: Web browsing [FGHW99]. Average damage to a mouse (<100 pkts) = 400% delay increase; to an elephant (>100 pkts) = 24500% delay increase.

The next challenge is short-lived flows, which are known to form the majority of the traffic in the Internet. The question is whether low-rate streams can hurt such traffic. The scenario is shown in the upper right figure. We have a pool of clients and a pool of servers. The clients request files from the servers, and these files are transferred from R1 to R2. At the same time, we attack this traffic with a DoS stream. The picture below shows the results of the simulation. The y-axis shows the file response times normalized to the response times when there is no attack in the system, and the x-axis is the file size. The blue curve is a reference line showing response times when there is no attack. The red dots are averaged response times under the DoS attack. Observe first that longer files are more vulnerable than short files. The average damage to a mouse flow (fewer than 100 packets) is a delay increase of 400%, while the average delay for longer files (more than 100 packets) increases by a factor of 245. ---- OUT: Indeed, if the file is only a few packets long, it may be transferred between two outages. However, as the file size increases, it becomes more vulnerable to DoS attacks, i.e., some flows' response times are degraded by more than a factor of 1000.

Shrews vs. Short-lived TCP Traffic
Scenario: Web browsing. Larger files more vulnerable; most suffer, some benefit.

Internet Experiments: Scenario
Scenario: victim on a lightly loaded 10 Mb/sec LAN. Attacker on same LAN, nearby LAN, or over WAN. WAN path: EPFL to ETH, 8 hops (10/100/OC-12).

We also performed Internet experiments at three different sites: Rice University, and ETH and EPFL in Switzerland. (For the EPFL talk - give the name of a person who gave you the account (Martin Vitterli) and make a joke) All the attacks were performed against TCP Sack, and we launched the attacks from machines on the same LAN as the victim node, from a nearby LAN, and over a wide area network. In the figure, we again have normalized TCP throughput as a function of the period of the attack. In summary, we were able to throttle TCP throughput down to approximately 7-10% of the link capacity, while the average rate of the DoS stream was of the same order. Probably the most interesting scenario is the one where the attack was launched through a high-speed WAN, from EPFL to ETH. The key issue here is that the DoS stream is not significantly distorted when sent through a WAN; indeed, since the utilization of the network core and of high-speed links is very low, it is actually possible to launch these attacks over a wide area network.

Internet Experiments: Results
Shrew average rate: 909 kb/sec (R = 10 Mb/sec, l = 100 msec, T = 1.1 sec). TCP throughput: 9.8 Mb/sec without Shrew; 1.2 Mb/sec with Shrew, 87.8% degradation.
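The figures on this slide can be cross-checked with two lines of arithmetic. The sketch below only recomputes the slide's own numbers; the function names are hypothetical.

```python
def average_rate_bps(peak_bps: float, burst_s: float, period_s: float) -> float:
    """Average rate of a square-wave stream: R * l / T."""
    return peak_bps * burst_s / period_s


def degradation_percent(baseline: float, attacked: float) -> float:
    """Relative throughput loss, in percent of the baseline."""
    return 100.0 * (baseline - attacked) / baseline


if __name__ == "__main__":
    # Slide values: R = 10 Mb/sec, l = 100 msec, T = 1.1 sec.
    avg = average_rate_bps(10e6, 0.100, 1.1)
    print(f"Shrew average rate: {avg / 1e3:.0f} kb/sec")
    # Slide values: 9.8 Mb/sec without the Shrew, 1.2 Mb/sec with it.
    print(f"TCP degradation: {degradation_percent(9.8, 1.2):.1f}%")
```

Both results agree with the slide: roughly 909 kb/sec for the attack stream, about 9% of the 10 Mb/sec link, against an 87.8% loss in TCP throughput.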

Counter DoS mechanisms Robust TCP variants (NewReno, Sack…) Router detection mechanisms (RED, RED-PD, …) minRTO randomization

Detecting Shrews
Shrews have low average rate, yet send high-rate bursts on short time-scales. Key questions: Can algorithms intended to find high-rate attacks detect Shrews? Can we tune the algorithms to detect Shrews without having too many false alarms? A number of schemes can detect malicious flows, e.g., RED-PD: use the packet drop history to detect high-bandwidth flows and preferentially drop packets from these flows.

These low-rate streams send at sufficiently low average rates; however, they may burst at high rates for very short time intervals, and the question is whether these flows could easily be detected and throttled by existing counter-DoS mechanisms. A number of schemes have been proposed to detect malicious flows, and here we evaluate RED with preferential dropping (RED-PD), a scheme that uses packet drop history to detect high-bandwidth flows and preferentially drops packets from these flows. The key questions that we want to answer are: can algorithms intended to find “sledge-hammer” attacks detect shrews, and could we tune these algorithms to detect shrews without having too many false alarms? We will next show experiments indicating that the answer to both of these questions is no.

Router-Assisted Mechanisms
Scenario: 9 TCP Sack flows with RED and RED-PD. RED-PD only detects Shrews with unnecessarily high rate. Reducing the RED-PD measurement time-scale results in excessive false positives.

Here, we have an experiment with 9 TCP Sack flows, where the AQM is RED and RED-PD. In the figure, we have the TCP and DoS throughputs plotted as a function of the period of the attack for both RED and RED-PD. First, observe that while RED-like randomization actually helps in smoothing out the TCP dips, a DoS stream is still able to significantly throttle an aggregate of TCP Sack flows, and neither RED nor RED-PD is able to defend the system: there is a huge dip at the minRTO time-scale of the attack. Second, observe that RED-PD is tuned to detect high-rate flows; the only difference between RED and RED-PD is that RED-PD is actually able to detect the DoS stream, but only when the period of the attack is below 500 ms, i.e., when the DoS stream has a relatively high rate.

Counter DoS mechanisms minRTO randomization

End-point minRTO Randomization
Observe: Shrews exploit protocol homogeneity and determinism. Question: Can minRTO randomization alleviate the threat of Shrews? TCP flows' approach: randomize minRTO = uniform(a,b). Shrews' counter approach: given that flows randomize minRTO, the optimal Shrew pulses at time-scale T=b, i.e., waits for all flows to recover and then pulses again.

Another apparent defense strategy is to randomize the minRTO parameter, since a deterministic minRTO is one of the key vehicles for low-rate attacks. So the key question here is whether randomization of the minRTO parameter can alleviate the threat of Shrews. We assume that minRTO is uniformly distributed from a to b and see what happens with the attack. We find that the most damaging attack time-scale is T=b; the intuition is that the attacker should wait for all the flows to recover from the timeout and then hit the system again.

End-point minRTO Randomization
TCP throughput for the T=b time-scale of the Shrew attack. a small: spurious retransmissions [AllPax99]. b large: bad for short-lived (HTTP) traffic. Randomizing the minRTO parameter shifts and smoothes TCP's null time-scales. Fundamental tradeoff between TCP performance and vulnerability to low-rate DoS attacks remains.

So, we have come up with a simple formula for the TCP throughput at the T=b time-scale of the attack, where n is the number of TCP flows and a and b are the parameters of the uniform distribution. OUT ----- For example, this result tells us that if the parameters are a = 1 sec and b = 1.5 sec, the TCP throughput would range from 17% of the bandwidth (for a single TCP flow) up to 34% in the case of many TCP flows. ----- There are two apparent strategies for increasing throughput at the T=b time-scale. First, it appears attractive to decrease a, which would significantly increase TCP throughput. However, recall that this would increase the number of spurious retransmissions and eventually degrade TCP throughput in the absence of any attack. Second, increasing the parameter b can also improve the throughput. However, this is true only when n is high, i.e., when there are enough TCP flows in the aggregate, and only for long-lived flows. Increasing b is not a good option for low-aggregation regimes and, of course, for short-lived flows. In summary, randomizing the minRTO parameter shifts and smoothes TCP's null frequencies. However, a fundamental tradeoff between TCP performance and vulnerability to low-rate DoS attacks remains.
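The effect of randomization at the T=b time-scale can be explored with a small Monte Carlo sketch. It rests on assumptions that are not stated on the slide: each of the n flows draws its minRTO independently from uniform(a, b) after every pulse, and the link is fully used from the moment the first flow recovers until the next pulse. Under these assumptions the expected link utilization has the closed form n/(n+1) * (b-a)/b, which is a derivation made here for illustration rather than a formula quoted from the talk. All function names are hypothetical.

```python
import random


def closed_form_utilization(n: int, a: float, b: float) -> float:
    """Expected utilization under the stated assumptions.

    The earliest of n uniform(a, b) recovery times has mean a + (b - a)/(n + 1),
    so the link is busy for (b - a) * n/(n + 1) out of every b seconds.
    """
    return (n / (n + 1)) * (b - a) / b


def simulated_utilization(n: int, a: float, b: float,
                          periods: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same quantity for a Shrew pulsing every b seconds."""
    rng = random.Random(seed)
    busy = 0.0
    for _ in range(periods):
        first_recovery = min(rng.uniform(a, b) for _ in range(n))
        busy += b - first_recovery  # link is used until the next pulse at time b
    return busy / (periods * b)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, round(closed_form_utilization(n, 1.0, 1.5), 3),
              round(simulated_utilization(n, 1.0, 1.5), 3))
```

With a = 1 sec and b = 1.5 sec the closed form gives about 17% for a single flow and approaches one third as n grows, which is in line with the range mentioned in the notes above.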

Conclusions Shrew principles
Exploit slow-time-scale protocol homogeneity and determinism Real-world vulnerability to Shrew attacks Internet experiment: 87.8% throughput loss without detection Shrews are difficult to detect Low average rate and “TCP friendly” Cannot filter short bursts Fundamental mismatch of attack/defense timescales

Open Questions
Can filters specific to Shrews be designed without excessive false positives? Can end-point algorithms be sufficiently randomized, so that attackers cannot exploit their known reactions and performance is not sacrificed? Reconsider the “TCP friendly” definition.

Backup Slides

Aggregation Homogeneous TCP aggregates are vulnerable
Scenario: 5 TCP flows, homogeneous RTTs. Shrews induce flow synchronization. Analytical model accurately predicts degradation.

Next we want to check whether we can throttle the throughput of TCP aggregates. We have an experiment with 5 TCP flows of the same RTT, attacked by a periodic stream with the peak rate again equal to the bottleneck capacity and with a burst length on the order of the TCP RTT. We again have a picture showing TCP throughput as a function of the period of the attack, and we again see the null TCP time-scales at the same places as in the single-flow scenario. As I have already explained, this is due to flow synchronization, which is why an aggregate of TCP flows behaves the same way as a single TCP flow.

DoS Peak Rate Less-than-bottleneck bursts can damage short-RTT flows
Scenario: 4 TCP flows + DoS. 1 short-RTT and 3 long-RTT flows. DoS outage ~ RTT of the short-RTT flow.

Next, we explore the issue of the peak (or burst) rate: what is the peak rate at which one should burst in order to deny service to a TCP flow? We have the following experiment consisting of 4 TCP flows and a DoS flow. One of these flows has a shorter RTT and the three others have longer RTTs, and the DoS flow has a burst length on the time-scale of the shorter RTT. The plot shows the TCP throughput of the short-RTT flow, while the x-axis represents the peak rate of the DoS flow. Observe that when the peak rate is in the range of one-third of the capacity, it is possible to significantly degrade the throughput of the short-RTT TCP flow. At the same time, the average rate of this low-rate DoS flow is only 3.3% of the link capacity, significantly less than its fair share of 20%. So the question is: what is the mechanism that leads to such behavior?

DoS Peak Rate Long-RTT flows inadvertently collaborate in the attack
DoS flow is masked by long-RTT TCP flows.

Well, the reason is that the longer-RTT flows actually collaborate with the DoS flow and improve the attack. These flows are not denied service, so they keep the buffers filled and, together with the DoS flow, add to the overall rate and the strength of the outage. For that reason, less than the bottleneck link rate is needed to deny service to short-RTT TCP flows. So the picture on the left, which most of us have in mind when imagining this type of attack, should be replaced with the picture on the right-hand side, where the longer-RTT flows actually collaborate and help the DoS flow kill short-RTT flows. ----- OUT: Implication: low-rate periodic open-loop streams can be very harmful to short-RTT TCP traffic (available bandwidth estimation techniques, audio/video sources). ----- The implication of this phenomenon is that very low-rate periodic open-loop streams, which are not necessarily maliciously generated, may deny service to certain TCP flows if their period matches one of TCP's null time-scales. Candidate protocols are some available bandwidth estimation techniques, which send large but short bursts of packets into the network in an attempt to estimate available bandwidth, and audio/video sources that can send periodic bursts of packets into the network.

TCP Variants TCP Reno is the most fragile NewReno? Sack? Scenario:
Tahoe, SACK. DoS stream: burst rate equals the bottleneck capacity; burst length: 30ms, 50ms, 70ms, and 90ms.

Next, we explore different TCP versions. All the experiments up to now were performed with TCP Reno, which is known to be the most fragile TCP version. The question is what happens with more robust versions such as NewReno or Sack. For a fixed burst rate equal to the link capacity, we explore what happens as the burst length increases, starting with a 30ms burst. In the picture we have averaged TCP throughput as a function of the inter-burst period.

TCP Variants Burst length = 30ms TCP Reno is the most fragile
In the picture we have averaged TCP throughput as a function of the inter-burst period. Observe that indeed TCP Reno is the most fragile TCP version, since we see the dip in throughput is most pronounced for TCP Reno, while the other versions have significant problems, but manage to survive.

TCP Variants Burst length = 50ms
TCP is the most vulnerable in the ~1 sec time-scale region due to slow start.

Next, when we increase the outage length to 50ms, observe that all TCP variants have a null time-scale at around 1 sec. Also, observe that all TCP variants are the most fragile in this region because the TCP flow is in slow start and its window size is very small, which means that only a small number of packets needs to be lost in a burst to cause TCP to enter timeout. When the period of the attack goes beyond this region, the window size becomes larger, and a single 50ms burst is no longer enough to cause an outage and throttle TCP throughput.

TCP Variants All TCP variants obtain the same profile
Sufficient pulse width ensures timeout Windows remain small However, when we increase the outage length even further, to 70ms,

TCP Variants Burst length = 90ms
When burst length is severe enough, all TCP stacks are equally fragile.

And 90ms, all TCP variants obtain nearly the same response to the attack. Thus, when the burst length is severe enough, all TCP stacks are equally fragile.

The Role of Time-Scales
Scenario: R = 2 Mb/s; T = 1 sec; l ~ 50-450 ms.

Here we explore the issue of time-scales; the goal is to see how “fat” these bursts should be in order for RED-PD to detect them. Thus, we fix the magnitude of the burst and the period of the attack, vary the burst length from 50 to 450 milliseconds, and observe what happens with TCP traffic.

The Role of Time-Scales
RED-PD detects l = 300 ms shrews. Recall that 30 ms is enough for DoS. A fundamental mismatch: if shorter time-scales are used, the false alarm probability is high (bursty TCP flows).

In the figure, we have shown TCP throughput as a function of the burst length. If the attack were ineffective, we would have straight lines at 1. However, this is not the case. Observe that RED-PD starts detecting the DoS flow at 300ms. Recall that much shorter time-scales are sufficient to throttle entire aggregates of heterogeneous-RTT TCP traffic. This picture also captures the fundamental issue of time-scales. RED-PD detects high-rate flows on longer time-scales, while DoS streams operate at very short time-scales. If these shorter time-scales were used to detect Shrews, then many legitimate short-RTT TCP flows that send packet bursts in the slow-start phase would also be incorrectly detected as malicious, and this is exactly why RED-PD does not use these time-scales. In other words, one may use shorter time-scales to detect Shrews, but any such mechanism would inherently have a high false alarm probability. ----- OUT: So, in the same scenario with 9 TCP Sack flows, we first fix the burst length at 200ms and then change the peak rate from 0.5Mb/s to 5Mb/s. The figure indicates that RED-PD starts detecting and throttling the DoS stream at 4Mb/s, which is more than twice the bottleneck rate. Recall that we showed that DoS streams with a peak rate of 1/3 of the capacity could be very harmful to short-RTT flows.

Shrews vs. Heterogeneous RTTs
Hypothesis: Shrews are high-RTT-pass filters. Service is denied to short-RTT flows.

The next challenge is RTT heterogeneity. RTTs may vary from several ms to several hundreds of ms, and the goal here is to show what happens in such realistic scenarios. Our hypothesis is that the attack should behave as a high-RTT-pass filter, where service is denied to shorter-RTT flows and the cut-off time-scale is determined by the outage length. In other words, longer-RTT flows should survive the outage because it is not long enough to kill these flows.

Flow Filtering Shrews damage short-RTT flows the most Scenario
20 TCP flows; RTT ~ 20-460 ms. Cut-off time-scale ~ 180 ms.

And here is the experiment that confirms our hypothesis. We have 20 TCP flows whose RTTs are uniformly distributed from 20ms to 460ms, and the cut-off time-scale is around 180ms, meaning that service is denied to TCP flows whose RTT is less than 180 ms.
