1
Algorithms for Network Security
George Varghese, UCSD
2
Network Security Background
Current approach: when a new attack appears, analysts work for hours (learning) to obtain a signature. Following this, IDS devices screen traffic for the signature (detection).
Problem 1: Slow learning by humans does not scale as attacks get faster. Example: Slammer reached critical mass in 10 minutes.
Problem 2: Detection of signatures at high speeds (10 Gbps or higher) is hard.
This talk describes two proposals, built on interesting algorithms, for rethinking the learning and detection problems.
3
Dealing with Slow Learning by Humans by Automating Signature Extraction
(OSDI 2004, joint with S. Singh, C. Estan, and S. Savage)
Since worms use automation to get speed, on the principle of fighting fire with fire, why not use automation to learn signatures? We assume unsupervised learning: we have no labelled examples. After some of us (including folks at CMU) introduced this model in 2004, there has been a spate of work in this area. But I want to describe not just the technique, but also our experience using it, first at UCSD and then at NetSift, a company we founded to commercialize this technology.
4
Extracting Worm Signatures by Content Sifting
Unsupervised learning: monitor the network and look for strings common to traffic with worm-like behavior. Signatures can then be used for detection.
[Figure: a packet with header (SRC, DST, PROT: TCP) and a payload hex dump showing the Kibvu.B signature captured by EarlyBird on May 14th, 2004]
5
Assumed Characteristics of Worm Behavior we used for Learning
Content Prevalence: the payload of a worm is seen frequently.
Address Dispersion: the payload of a worm is seen traversing between many distinct hosts.
Two key characteristics of worm-like behavior drive our approach. First, if a worm is spreading, the content strings that constitute the worm will be far more common than, say, an email I send to a colleague. Second, the communication will involve many distinct hosts (high in-degree and out-degree). Both behaviors hold true if a worm is successfully spreading. Note that we make no assumptions about the attack vectors the worm chooses or the address-selection mechanism it uses to spread. Optionally, one can also incorporate characteristics common to large classes of worms, such as random probing. Central to all of this is the ability to treat substrings as first-class entities.
6
Detector at Vantage Point
The Basic Algorithm. We have a detector at a vantage point (for example a firewall, a router, or a switch) watching traffic, say at the entrance to a network or at a data center, and implementing "content sifting". It maintains a Prevalence Table, which counts the number of times each piece of content is seen, and an Address Dispersion Table, which counts the number of unique sources and destinations associated with that content. The next few slides play through an animation of the algorithm on traffic among hosts A through E and cnn.com; both tables start empty.
7
Detector at Vantage Point
The Basic Algorithm (animation step 1). A content string travels from B to A. The Prevalence Table records it with count 1, and the Address Dispersion Table records 1 source (B) and 1 destination (A).
8
Detector at Vantage Point
The Basic Algorithm (animation step 2). A different string travels from A to C. The Prevalence Table now holds two entries, each with count 1; the Dispersion Table shows 1 source (B) and 1 destination (A) for the first string, and 1 source (A) and 1 destination (C) for the second.
9
Detector at Vantage Point
The Basic Algorithm (animation step 3). The first string repeats, this time from D to B. Its prevalence count rises to 2, with 2 sources (B, D) and 2 destinations (A, B); the second string stays at 1 source (A) and 1 destination (C).
10
Detector at Vantage Point
The Basic Algorithm (animation step 4). The first string appears a third time, from E to D. Its prevalence count reaches 3, with 3 sources (B, D, E) and 3 destinations (A, B, D): high prevalence combined with high dispersion, the telltale pattern of a spreading worm. The second string remains at 1 source (A) and 1 destination (C). A code sketch of this basic algorithm follows below.
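To make the animation concrete, here is a minimal Python sketch of the basic algorithm. The table layout, the thresholds, and the alert condition are illustrative assumptions for exposition, not the EarlyBird implementation.

```python
from collections import defaultdict

PREVALENCE_THRESHOLD = 3   # illustrative
DISPERSION_THRESHOLD = 3   # illustrative

prevalence = defaultdict(int)      # content -> repeat count (Prevalence Table)
sources = defaultdict(set)         # content -> distinct sources (Dispersion Table)
destinations = defaultdict(set)    # content -> distinct destinations

def process(content, src, dst):
    """Update both tables for one observed payload; report worm-like content."""
    prevalence[content] += 1
    sources[content].add(src)
    destinations[content].add(dst)
    if (prevalence[content] >= PREVALENCE_THRESHOLD
            and len(sources[content]) >= DISPERSION_THRESHOLD
            and len(destinations[content]) >= DISPERSION_THRESHOLD):
        print(f"ALERT: worm-like content {content!r}")

# The animation from the slides:
process("worm?", "B", "A")
process("web",   "A", "C")
process("worm?", "D", "B")
process("worm?", "E", "D")   # third sighting triggers the alert
```

Note how the tables grow with every distinct payload seen; that is exactly the scalability problem discussed on the next slide.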
11
What are the challenges?
Computation: we have a total of 12 microseconds of processing time per packet at 1 Gbps line rate. And we are not just talking about processing packet headers; we need deep packet inspection, not to match known strings but to learn frequent strings.
State: on a fully loaded 1 Gbps link, the basic algorithm could generate a 1 GByte table in less than 10 seconds.
This basic algorithm is cute, and it works, but it is not scalable. Prevailing network devices operate in the range of 100 Mbps to 10 Gbps, and we need an algorithm that scales to those line rates. The computation budget severely limits the number of memory references per packet; imagine processing a 1500-byte payload in under 12 microseconds. The more pressing limitation is state: we can fill a gigabyte with the prevalence table alone in under 10 seconds. Our two challenges, then, are to dramatically reduce both memory and computation while still tracking content over time. The remainder of this talk focuses on that.
Back of the envelope for the naive approach: a 1500-byte packet contains 1461 signatures of 40 bytes, requiring about 6000 memory references (roughly 4 memory references per byte); with 60 ns DRAM this limits us to a maximum rate of about 0.03 Gbps.
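The back-of-the-envelope numbers above follow directly; this snippet just redoes the slide's arithmetic (the 4-references-per-byte figure is taken from the slide):

```python
packet_bytes = 1500
substr_len = 40
dram_ns = 60

substrings = packet_bytes - substr_len + 1      # 1461 signatures per packet
mem_refs = 4 * packet_bytes                     # ~4 references/byte => ~6000
time_per_packet = mem_refs * dram_ns * 1e-9     # ~360 microseconds
max_rate_mbps = packet_bytes * 8 / time_per_packet / 1e6

print(substrings, mem_refs, round(max_rate_mbps), "Mbps")  # 1461 6000 33 Mbps
```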
12
Idea 1: Index fixed length substrings
Approach 1: index all substrings. Problem: too many substrings, hence too much computation and too much state.
Approach 2: index each packet as a single string. Problem: easily evadable (e.g., Witty, viruses).
Approach 3: index all contiguous substrings of a fixed length S. This will track everything of length S and larger.
The ideal is to index all possible substrings, but there are too many of them. The other extreme is to index the whole packet, which saves computation and state, but a simple byte shift would then evade our system. Our approach is to index all contiguous substrings of a fixed length. Imagine a byte stream with the characters A through K: indexing every contiguous substring of fixed length 4 means taking a 4-byte window and sliding it across the byte stream (see the sketch below). Note that even though we track fixed-length substrings, we will catch everything of that length and larger.
What if the invariant is shorter than 40 bytes? The number 40 is not intrinsic; it can be lowered at a cost, and there are a number of ways to do so. We have yet to see a worm invariant even close to 40 bytes (polymorphism raises a whole set of other issues). The compromise is that if the invariant is at least that long, we will detect it; what we give up is that a worm whose invariant fits in fewer than 40 bytes will not be caught. For the remainder of the talk we use substrings of length 40, which we think is sufficient, since all the worms we have witnessed have had invariants longer than 40 bytes.
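As an illustration of Approach 3 (function and variable names are mine, not from the talk), a sliding window over a byte stream yields every contiguous length-S substring:

```python
def windows(stream: bytes, s: int):
    """Yield every contiguous substring of length s (Approach 3)."""
    for i in range(len(stream) - s + 1):
        yield stream[i:i + s]

# With s = 4, the slide's byte stream A..K yields ABCD, BCDE, ..., HIJK.
print([w.decode() for w in windows(b"ABCDEFGHIJK", 4)])
```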
13
Idea 2: Incremental Hash Functions
Use hashing to reduce state: a 40-byte string becomes an 8-byte hash. Use an incremental hash function to reduce computation: the Rabin fingerprint is an efficient incremental hash.
First, note that we can store a hash of each substring instead of the substring itself, reducing state. But even with fixed-length substrings, hashing every window still carries computation overhead. The fix is a hash function that can be computed incrementally, reusing the current hash as the starting point for the next one as the window slides. The Rabin fingerprint is one such efficient incremental hash. Additionally, Rabin fingerprints have the property that a substring generates the same hash no matter where it appears in the byte stream. For example, given packet P1 containing "RANDABCDOM" and packet P2 containing "RABCDANDOM", computing the fingerprint of the byte sequence ABCD in P1 and in P2 yields the same fingerprint.
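Here is a sketch of the incremental idea using a simple polynomial rolling hash; real Rabin fingerprints work over GF(2) polynomials, but the O(1)-per-byte update rule is analogous. The constants are illustrative.

```python
BASE = 257
MOD = (1 << 61) - 1      # large Mersenne prime modulus

def roll(stream: bytes, s: int):
    """Yield (offset, fingerprint) for every length-s window, O(1) per byte."""
    top = pow(BASE, s - 1, MOD)      # weight of the byte leaving the window
    fp = 0
    for i, b in enumerate(stream):
        if i >= s:
            fp = (fp - stream[i - s] * top) % MOD   # drop the outgoing byte
        fp = (fp * BASE + b) % MOD                  # shift in the new byte
        if i >= s - 1:
            yield i - s + 1, fp

# Position independence: "ABCD" fingerprints identically in both packets.
fp1 = dict(roll(b"RANDABCDOM", 4))
fp2 = dict(roll(b"RABCDANDOM", 4))
assert fp1[4] == fp2[1]
```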
14
Insight 3: Don't need to track every substring
Approach 1: sub-sample packets. If we choose 1 in N packets, it takes N times as long to detect the worm.
Approach 2: deterministic or random selection of offsets. Susceptible to simple evasion attacks, and there is no guarantee that we sample the same substring in every packet.
Approach 3: sample based on the hash of the substring (as Manber et al. did in Agrep). Value sampling: sample a fingerprint if its last N bits equal a value V. The number of bits N can be set dynamically, and the value V can be randomized for resiliency.
At this point we have considerably reduced the number of substrings we track; even so, at gigabit speeds we still need to shrink the candidate signature pool further. Sub-sampling packets is unattractive: if we sample 1 in every 64 packets, it takes 64 times as long to detect. Selecting fingerprints at random gives no guarantee about how soon we will sample the same fingerprint again, while choosing fingerprints deterministically by byte position is easy to evade. Sampling fingerprints by their value guarantees that once we choose a fingerprint, we will always choose it in the future. The implementation is simply an equality check on the last N bits of the fingerprint (a sketch follows below); N can be adjusted dynamically to keep up with line rate, and the comparison value can be randomized so an attacker cannot predict which substrings are sampled. This affects both prevalence and dispersion counting, and it guarantees that if the invariant part of the worm payload is long enough (quantified on the next slide), we will detect it.
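A minimal sketch of the value-sampling test from Approach 3 (parameter names are mine):

```python
N_BITS = 6     # sample 1/64 of fingerprints
VALUE = 0      # comparison value; can be randomized for resiliency

def sampled(fp: int, n_bits: int = N_BITS, value: int = VALUE) -> bool:
    """Track a fingerprint only if its last n_bits equal value."""
    return (fp & ((1 << n_bits) - 1)) == value
```

Because the decision depends only on the fingerprint value, the same substring is either always sampled or never sampled, regardless of which packet or offset it appears in.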
15
Implementing Insight 3: Value Sampling
Value sampling implementation: to select 1/64 of fingerprints, sample when the last 6 bits equal 0.
Ptrack = probability of selecting at least one substring of length S from an L-byte invariant. With the last 6 bits equal to 0 (F = 1/64) and 40-byte substrings (S = 40), Ptrack = 99.64% for a 400-byte invariant.
To exemplify: consider the byte stream A through K and slide the window across it. For some windows the last 6 bits of the fingerprint are 0 and we SAMPLE the fingerprint; for others they are not and we IGNORE it. We have analyzed the probability of tracking or missing invariants under value sampling (details in the paper); the key takeaway is that if the invariant portion of the worm payload is 400 bytes or longer, we have an extremely high probability of sampling at least one 40-byte sequence from it.
16
Implementing Insight 3: Value Sampling (continued)
Ptrack = probability of selecting at least one substring of length S from an L-byte invariant:
Ptrack = 1 - e^(-F(L - S + 1))
For the last 6 bits equal to 0 (F = 1/64) and 40-byte substrings (S = 40): Ptrack = 92.00% for a 200-byte invariant, and Ptrack = 99.64% for a 400-byte invariant.
Because we use the same hash function everywhere, a substring that appears in two different packets yields the same fingerprint, so the odds of sampling the same substring in both packets are 100%.
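The slide's numbers follow from this formula; a quick check:

```python
import math

def p_track(L: int, S: int = 40, F: float = 1 / 64) -> float:
    """P(sample at least one length-S substring of an L-byte invariant)."""
    return 1 - math.exp(-F * (L - S + 1))

print(f"{p_track(200):.2%}")   # ~92%
print(f"{p_track(400):.2%}")   # ~99.64%
```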
17
Insight 4: Repeated substrings are uncommon
[Figure: cumulative fraction of signatures vs. number of repeats; only 1% of the 40-byte substrings repeat more than once]
So far we have considerably reduced the number of candidate substrings we process, but we need to dramatically reduce the state as well. We observe that in actual traffic most substrings appear just once: frequently appearing substrings are rare. If we use a data structure that tracks only the truly common substrings, the state requirement largely goes away. It is fine not to count everything; if we can capture the substrings that occur often, we need not keep state for 99% of what goes on. Content prevalence thus acts as a high-pass filter, simply because of how traffic is distributed, and focusing only on high-frequency content can reduce memory by two or more orders of magnitude.
18
Implementing Insight 4: Use an approximate high-pass filter
Multistage filters use randomized techniques to implement a high-pass filter with low memory and few false positives [EstanVarghese02]; the approach is similar to that of Motwani et al. We use the content hash as a flow identifier. This gives three orders of magnitude improvement over the naive approach (one entry per string).
Rather than maintaining state for every substring as in the naive approach, we use a randomized data structure called a multistage filter, originally developed to identify high-bandwidth flows. A piece of content is hashed by independent hash functions into each stage; each bucket is a counter that is incremented on every hit. Only if all the counters are above the prevalence threshold (we use a threshold of 3) do we create an entry for the substring in the dispersion table (see the figure on the next slide and the sketch after it). The number of false-positive entries is bounded by roughly constant + n/k^d, where n is the number of signatures thrown at the multistage filter, k is a factor proportional to the memory of each stage, and d is the number of stages; the constant is inversely proportional to the threshold. There are additional heuristics that further reduce false positives.
19
Multistage Filters
[Figure: a packet window is hashed by three independent hash functions (Hash 1, Hash 2, Hash 3) into three stages of counters (Stage 1, 2, 3); each hash increments one counter per stage, and a comparator checks each counter against the threshold. Only if all counters are above threshold is the content INSERTed into the Dispersion Table.]
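A minimal sketch of the multistage filter in the figure, assuming three stages and the threshold of 3 mentioned above; the stage size and hash construction are illustrative choices of mine:

```python
import hashlib

STAGES = 3
BUCKETS = 1024        # counters per stage (illustrative)
THRESHOLD = 3         # prevalence threshold from the talk

counters = [[0] * BUCKETS for _ in range(STAGES)]

def bucket(fp: int, stage: int) -> int:
    """Independent per-stage hash of the content fingerprint (fp < 2**64)."""
    h = hashlib.blake2b(fp.to_bytes(8, "big"), salt=bytes([stage]) * 16)
    return int.from_bytes(h.digest()[:4], "big") % BUCKETS

def update(fp: int) -> bool:
    """Increment one counter per stage; True => insert into dispersion table."""
    hits = []
    for s in range(STAGES):
        b = bucket(fp, s)
        counters[s][b] += 1
        hits.append(counters[s][b])
    return all(c >= THRESHOLD for c in hits)   # all stages must agree
```

Since every stage must exceed the threshold, a rare substring passes only if it collides with heavy content in all d stages at once, which is what drives the false positives down exponentially in the number of stages.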
20
Insight 5: Prevalent substrings with high dispersion are rare
21
Insight 5: Prevalent substrings with high dispersion are rare
A naive approach would maintain a list of sources (or destinations) per substring. But we only care whether dispersion is high, so approximate counting suffices. Scalable bitmap counters sample a larger virtual bitmap, then scale up and adjust for the error; this gives an order of magnitude less memory than the naive approach with acceptable error (<30%).
There are well-known techniques for approximate distinct counting, but they are heavyweight; we developed a problem-specific one for worms that trades accuracy for state, reducing memory by roughly a factor of 5 while keeping the error within reasonable bounds. The key insight is that we only care whether the counts are large, and we can leverage a property of worms: a worm outbreak has an increasing number of source and destination IPs associated with it. The error can be bounded as a function of the number of bitmaps; with 3 bitmaps, the error factor is about 0.285.
22
Implementing Insight 5: Scalable Bitmap Counters
Hash: hash the source (or destination) address to a position in a bitmap and set that bit.
Sample: keep only a sample of the bitmap.
Estimate: scale up the sampled count.
Adapt: periodically increase the scaling factor.
With 3 32-bit bitmaps, the error factor is 2/(2^numBitmaps - 1) = 2/7, or about 28.5%.
In case you are curious, here is how it works. Imagine a bitmap as large as the maximum number of sources; then all we need to do is hash each source address and count the set bits. Unfortunately, to count a million sources we would need a million bits. Instead, if the hash falls in the first 32 positions of the virtual bitmap we mark the bit, else we ignore it; at the end, we count the set bits and scale up. For example, to count up to 32K using 32 bits we have scaled down by 1K, so we scale the count up by 1K. The trouble is that if there are only 10 sources this is wildly inaccurate, so we start with a small scaling factor and grow it over time, which matches exactly how a worm outbreak unfolds.
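A simplified sketch of one scaled bitmap follows; the real scheme keeps 3 bitmaps at different scaling factors and adapts as counts grow, and the hash and the linear-counting estimator here are my illustrative choices, not the paper's exact estimator.

```python
import hashlib
import math

BITS = 32                    # physical bitmap size

class ScaledBitmap:
    """Keep a 32-bit sample of a virtual bitmap that is 2**scale times larger."""
    def __init__(self, scale: int):
        self.scale = scale
        self.bitmap = 0

    def add(self, addr: str) -> None:
        h = int.from_bytes(hashlib.sha1(addr.encode()).digest()[:8], "big")
        pos = h % (BITS << self.scale)   # position in the virtual bitmap
        if pos < BITS:                   # keep only the sampled slice
            self.bitmap |= 1 << pos

    def estimate(self) -> float:
        zeros = BITS - bin(self.bitmap).count("1")
        if zeros == 0:
            return math.inf              # saturated: move to a larger scale
        # linear-counting estimate for the slice, scaled up to the virtual size
        return BITS * math.log(BITS / zeros) * (1 << self.scale)
```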
23
High Speed Implementation: Practical Content Sifting
Memory ("state") scaling: hashes of fixed-size substrings; multistage filters let us focus on the prevalent substrings (total size is 2 MB); scalable bitmap counters give scalable counting of sources and destinations.
CPU ("computation") scaling: incremental hash functions; value sampling at 1/64 still detects all known worms.
To summarize, our high-speed implementation uses multistage filters to detect the prevalent substrings, and bitmap counters to count distinct hosts, so that memory utilization stays minimal.
24
Implementing Content Sifting
[Figure: content-sifting pipeline for the payload "IAMAWORM". Each value-sampled substring, e.g. Key = RabinHash("IAMA"), first updates the multistage filter, which applies dynamic per-port thresholds (0.146 µs; 0.349, 0.037). If the prevalence exceeds the threshold, the Address Dispersion Table is searched: ADTEntry = Find(Key) (0.021 µs). If an entry is found, its repeat count and its scaled bitmap counters for sources and destinations (5 bytes) are updated (0.027 µs); otherwise a new entry is created and inserted (0.37 µs).]
Overall cost: 0.042 µs per byte in the software implementation, with 1/64 value sampling.
25
Deployment Experience
1: A large fraction of the UCSD campus traffic. Traffic mix: approximately 5000 end-hosts, plus dedicated servers for campus-wide services (DNS, NFS, etc.). Line rate varies between 100 and 500 Mbps.
2: A fraction of a local ISP's traffic (DEMO). Traffic mix: dialup customers and leased-line customers. Line rate roughly 100 Mbps.
3: A fraction of a second local ISP's traffic. Traffic mix: inbound/outbound traffic of a large hosting center. Line rate roughly 300 Mbps.
26
False Positives we encountered
Common protocol headers: mainly HTTP and SMTP headers, plus distributed (P2P) system protocol headers. These are handled by a procedural whitelist covering a small number of popular protocols. Non-worm epidemic activity: SPAM and peer-to-peer content such as BitTorrent.
[Example false positive, a Gnutella handshake: "GNUTELLA CONNECT/0.6 ... X-Max-TTL: 3 ... X-Dynamic-Querying: 0.1 ... X-Version: ... X-Query-Routing: 0.1 ... User-Agent: LimeWire/... Vendor-Message: 0.1 ... X-Ultrapeer-Query-Routing: ..."]
Too many false positives can render a system unusable, and since our goal is automation, false positives must be rare or have minimal impact. Out of the box, over the measurement period, most alerts were whitelistable with a procedural whitelist describing these protocol types. What remains is SPAM and BitTorrent. SPAM is sent through many relays (one-to-many); with most P2P protocols the content does not match across hosts, but BitTorrent is peculiar in that the same piece of content is replicated many-to-many. A number of heuristics could be tried for these two remaining classes, and we do not pretend to have solved the SPAM problem; a saving grace is that such traffic is bursty, and a perfectly reasonable approach is to rate-limit it, since it is not the worst thing in the world if it reaches you more slowly. Interestingly, enterprises have no interest in receiving spam or P2P traffic, so this weakness of our approach is viewed by them as a feature. If epidemic-style protocols (such as mailing lists) become popular, they would pose a similar challenge.
27
Other Experience
Lesson 1: From experience, static whitelisting is still not sufficient for HTTP and P2P; we needed other, more dynamic whitelisting techniques.
Lesson 2: Signature selection is key. From worms like Blaster we get several options, and a major delay in signature release today is "vetting" signatures.
Lesson 3: The approach works better for vulnerability-based mass attacks; it does not work for directed attacks or attacks based on social engineering, where the repetition rate is low.
Lesson 4: Major IDS vendors have moved to vulnerability signatures. Automated approaches to these (CMU) are very useful, but automated exploit-signature detection may also be useful as an additional piece of defense in depth for truly zero-day attacks.
28
Related Work and issues
Three roughly concurrent pieces of work: Autograph (CMU), Honeycomb (Cambridge), and EarlyBird (us); EarlyBird is the only one of the three that operates in real time. Further work at CMU extends Autograph to polymorphic worms (which EarlyBird can also handle in real time) and automates vulnerability signatures. Open issues: encryption, P2P false positives like BitTorrent, etc.
29
Part 2: Detection of Signatures with Minimal Reassembly
I'd like to kick off with a 20-minute overview of NetSift: where it came from, where it is now, and where it might go. (To appear in SIGCOMM 06, joint with F. Bonomi and A. Fingerhut of Cisco Systems.)
30
Membership Check via Bloom Filter
[Figure: membership check via Bloom filter. A field is extracted from the packet and hashed by three functions (Hash 1, Hash 2, Hash 3), each selecting one bit in the bitmap of its stage (Stage 1, 2, 3). Each stage asks "equal to 1?", and an ALERT is raised only if all bits are set.]
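A minimal Bloom-filter membership sketch matching the figure (three hash functions, one bitmap per stage); the sizes and salted-hash construction are illustrative:

```python
import hashlib

M = 1 << 16        # bits per stage bitmap (illustrative)
K = 3              # hash functions / stages

bitmaps = [0] * K  # one bitmap per stage, stored as Python ints

def positions(item: bytes):
    """One bit position per stage, from independently salted hashes."""
    for k in range(K):
        h = hashlib.sha256(bytes([k]) + item).digest()
        yield k, int.from_bytes(h[:8], "big") % M

def insert(item: bytes) -> None:
    for k, p in positions(item):
        bitmaps[k] |= 1 << p

def maybe_member(item: bytes) -> bool:
    """False is definite; True may be a false positive (hence the later steps)."""
    return all(bitmaps[k] >> p & 1 for k, p in positions(item))
```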
31
Example 1: String Matching (Step 1: Sifting using Anchor Strings)
String Database to Block -> Anchor Strings -> Bloom Filter: each string ST0 through STn contributes an anchor A0 through An, which is hashed into the Bloom filter.
Imagine we wish to block all packets that contain one or more instances of a large number of attack strings, ST0 through STn, at any offset. Since header checks alone are hard at 20 Gbps, doing a memory reference for every offset (1500 in a 1500-byte packet) is challenging. In NetSift, we used sifting to cope with this challenge. The first sifting step is based on a Bloom filter, replicated four times to meet the per-byte speed requirement. But a Bloom filter is based on hashing, hashing needs fixed-size strings, and the attack strings may not all be the same length. So in a preprocessing step, the software picks a fixed-length anchor string A0 through An within each attack string (the anchor can be anywhere in the corresponding string), hashes the anchors into a Bloom filter, and discards the actual anchors. In Step 1 of string matching, the hardware then quickly sifts out only those packets that hit the Bloom filter at some offset.
Sushil Singh, G. Varghese, J. Huber, Sumeet Singh, Patent Application
32
String Matching Step 2: Standard hashing
[Figure: hash buckets 0 through m; each string STi with anchor Ai hashes into one bucket]
In Step 2, we do standard hashing based on the offset and anchor length found in Step 1. To facilitate this, the software builds a standard hash table. However, the trick is that instead of standard hashing with chaining, we use a much more compact structure called a bit tree.
33
Matching Step 3: Bit Trees instead of chaining
[Figure: four strings ST2, ST8, ST11, ST17 that hash to the same bucket, and the bit tree over bit positions L1, L2, L3 that separates them]
To understand a bit tree, assume that ST2, ST8, ST11, and ST17 hash into the same bucket in Step 2. To decide which, if any, of these four strings matches, the hardware walks the bit tree shown at the bottom. The bit tree is built as follows. The software first picks a bit position L1 that divides the set of four strings into two groups, two with that bit equal to 0 and two with it equal to 1; this separates out ST8 and ST11. Similarly, the software finds a bit position L2 in which ST8 has bit 0 and ST11 has bit 1, and a position L3 that separates ST2 from ST17. Thus a bit-tree node is compact: it holds only a bit position, a value, and two pointers. Step 3 is also a sifting step, because at its end we can only suspect that the packet matches a particular string in the database, say ST8; we cannot be sure, because we have not checked all the bytes. Thus we need a final Step 4 that compares all bytes of ST8 with the packet bytes at the offset specified by the leaf of the bit tree. Note that this last step can be done much more slowly: in NetSift, we stored the actual string bytes in DRAM, and the Bloom filters and bit trees in on-chip SRAM.
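A sketch of building and walking such a bit tree over the strings in one bucket; this is a simplification (the real structure works relative to the anchor offset), and the bucket contents are hypothetical:

```python
def bit(data: bytes, pos: int) -> int:
    """Return bit number pos of the byte string (MSB first)."""
    return data[pos // 8] >> (7 - pos % 8) & 1

def build(strings):
    """Software side: pick bit positions that split the bucket's strings."""
    if len(strings) == 1:
        return strings[0]                 # leaf: one candidate string
    for pos in range(8 * len(min(strings, key=len))):
        zeros = [s for s in strings if bit(s, pos) == 0]
        ones = [s for s in strings if bit(s, pos) == 1]
        if zeros and ones:                # this position discriminates
            return (pos, build(zeros), build(ones))
    raise ValueError("bucket contains duplicate strings")

def walk(tree, data: bytes):
    """Hardware side: follow the packet's bits down to a single leaf."""
    while isinstance(tree, tuple):
        pos, zero, one = tree
        tree = one if bit(data, pos) else zero
    return tree  # Step 4 must still verify every byte of this candidate

bucket = [b"ST02", b"ST08", b"ST11", b"ST17"]   # hypothetical bucket
assert walk(build(bucket), b"ST08") == b"ST08"
```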
34
Problem is harder than it may appear
Network IDS devices are beginning to be folded into network devices like switches (Cisco, Force10). Instead of appliances that work at 1 or 2 Gbps, we need IDS line cards, or better still chips, that scale to 20 Gbps. Because attacks can be fragmented into pieces that are sent out of order, and even with inconsistent data, the standard approach has been to reassemble the TCP stream and normalize it to remove the inconsistency. In theory, normalization requires storing one round-trip delay's worth of data per connection, which at 20 Gbps is huge, not to mention the computation needed to index this state. Worse, we have to do regular-expression matching, not just exact match (Cristi).
35
Headache: dealing with Evasions
The case of the misordered fragment: SEQ = 13, DATA = "ACK" arrives before SEQ = 10, DATA = "ATT"; a scanner that does not reorder never sees "ATTACK".
The case of the interspersed chaff: SEQ = 10, TTL = 10, "ATT"; SEQ = 13, TTL = 1, "JNK"; SEQ = 13, "ACK". The low-TTL chaff dies before reaching the receiver, so the receiver sees "ATTACK" while a naive monitor sees the chaff instead.
The case of the overlapping segments: SEQ = 10, "ATTJNK" followed by SEQ = 13, "ACK"; whether the receiver sees "ATTJNK" or "ATTACK" depends on its overlap-resolution policy.
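To see why the overlapping-segments case is ambiguous, here is a toy reassembler showing that first-wins and last-wins overlap policies yield different byte streams (the policy names are mine; real stacks differ by OS):

```python
def reassemble(segments, last_wins: bool) -> str:
    """segments: list of (seq, data) pairs. Toy byte-level TCP reassembly."""
    buf = {}
    for seq, data in segments:
        for i, ch in enumerate(data):
            if last_wins or seq + i not in buf:
                buf[seq + i] = ch       # overlap resolved per policy
    return "".join(buf[k] for k in sorted(buf))

segs = [(10, "ATTJNK"), (13, "ACK")]
print(reassemble(segs, last_wins=False))  # ATTJNK
print(reassemble(segs, last_wins=True))   # ATTACK
```

An IDS whose policy disagrees with the end host's can be made to miss "ATTACK" entirely, which is why normalization (and its state cost) is needed.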
36
Conclusions
It is surprising what one can do with network algorithms; at first glance, learning seems much harder than lookups or QoS. The underlying principle in both algorithms is "sifting": reducing the traffic to be examined to a manageable amount and then doing more cumbersome checks. There are lots of caveats in practice: the problem is a moving target.