1
ECE 526 – Network Processing Systems Design
Network Security: String Matching Algorithms
Chapter 17: George Varghese
2
Ning Weng, ECE 526
Goal
Gain basic knowledge of how to improve network security from a network processing system design perspective
3
Outline
Signature-based IDSs
String matching algorithms
─ Boyer-Moore
─ Aho-Corasick
─ Bloom filter
─ Approximate searching
─ Approximate searching based on Bloom filters
Summary
4
Internet Security
The Internet lacks security
─ Example?
What is Internet security?
─ Confidentiality: data is kept private
─ Integrity: data is protected from modification or destruction
─ Availability: data or service remains accessible
What are the current approaches?
─ Engineering?
─ Non-engineering?
─ Intrusion Detection Systems (IDSs)
5
Intrusion Detection Systems
Two types of Intrusion Detection Systems (IDSs)
─ Signature detection: matches events against the signatures of known attacks
─ Anomaly detection: uses statistical or learning theory to identify aberrant events
Three important tasks
─ String matching: searching for suspicious strings in packet payloads
─ Traceback: detecting an intruder who uses a forged source address
─ Detecting the onset of a new worm without prior knowledge
Problems with current IDSs
─ Very slow
─ High false-positive rate
─ False positive: answering a membership query positively when the element is not in the set
6
Snort Rule Example
Snort
─ A lightweight, open-source intrusion detection system
─ www.snort.org
Snort rule example:
alert tcp $BAD 80 -> $GOOD 90 \
(content: "perl.exe"; msg: "detected perl.exe";)
─ Looks for the string "perl.exe" in a TCP packet from IP $BAD, port 80 to IP $GOOD, port 90
─ Upon detection, generates an alert with the message "detected perl.exe"
Question: when a packet arrives, how do we check it?
Question: what about multiple rules?
String matching is the bottleneck
7
String Searching: Brute Force
An arbitrary string can be anywhere in the packet
Naive approach
Input: string of size m; packet of size n (assuming n > m)
For i := 0 to n-m do
  For j := 0 to m-1 do
    Compare string[j] with packet[i+j]
    If not equal, exit the inner loop
  If all m characters were equal, report a match at position i
Complexity:
─ Worst case O(m*n)
─ Best case O(n)
Can we do better?
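The naive loop above can be sketched in Python as follows. This is a minimal illustration, not code from the course; the function name `brute_force_search` and the sample payload are assumptions made for the example.

```python
def brute_force_search(pattern: bytes, packet: bytes) -> list[int]:
    """Return every offset i at which pattern occurs in packet."""
    m, n = len(pattern), len(packet)
    matches = []
    for i in range(n - m + 1):            # candidate start positions 0..n-m
        for j in range(m):
            if pattern[j] != packet[i + j]:
                break                     # mismatch: exit the inner loop
        else:
            matches.append(i)             # inner loop ended without a mismatch
    return matches


if __name__ == "__main__":
    # Hypothetical payload containing the string from the Snort rule.
    print(brute_force_search(b"perl.exe", b"GET /cgi-bin/perl.exe HTTP/1.0"))
```

The worst case O(m*n) shows up on inputs such as a pattern "aaab" scanned against a long run of "a" characters, where almost every inner loop runs to the last character before failing.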
8
Boyer-Moore: Example
Improves on brute force by comparing the last character first and by skipping over a larger number of characters on a mismatch
How to build the skip table?
9
Boyer-Moore: Skip Table
How far to skip when the last character does not match
For example
─ Pattern: CAB
─ Last: A B C D E …
─ Skip: 1 * 2 3 3 …
Care is needed with repeated letters
For example
─ Pattern: ABBA
─ Last: A B C D E …
─ Skip: * 1 4 4 4 …
Skip[c] = distance of the last occurrence of c from the end of the pattern
(* marks the pattern's own last character, distance 0; characters not in the pattern skip the full pattern length)
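The Skip[c] rule can be checked against both examples with a short Python sketch. The names `build_skip_table` and `skip_for` are hypothetical, and a dictionary stands in for the 256-entry table a real implementation would likely use.

```python
def build_skip_table(pattern: str) -> dict[str, int]:
    """Map each pattern character to its distance from the end of the pattern."""
    m = len(pattern)
    skip = {}
    for idx, ch in enumerate(pattern):
        # A later occurrence overwrites an earlier one, so repeated
        # letters keep the distance of their LAST occurrence.
        skip[ch] = m - 1 - idx
    return skip


def skip_for(skip: dict[str, int], ch: str, m: int) -> int:
    """Characters absent from the pattern skip the whole pattern length m."""
    return skip.get(ch, m)


if __name__ == "__main__":
    for pattern in ("CAB", "ABBA"):
        table = build_skip_table(pattern)
        print(pattern, [skip_for(table, c, len(pattern)) for c in "ABCDE"])
```

A distance of 0 corresponds to the `*` entry on the slide: the character equals the last character of the pattern, so no skip is taken and the full comparison runs instead.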
10
Boyer-Moore: Algorithm
Input: pattern of size m; packet of size n
i := 0
While i <= n-m do
  If pattern[m-1] = packet[i+m-1] then // last character first
    For j := 0 to m-1 do
      Compare pattern[j] with packet[i+j] // one by one, sequentially
    i := i+1
  Else
    i := i + skip[packet[i+m-1]] // skip
Complexity:
─ Best case O(n/m)
─ Worst case still O(n*m)
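A runnable Python sketch of this simplified, last-character-only variant is given below. It follows the slide's pseudocode rather than the full Boyer-Moore algorithm with its additional shift rules; the function name `bm_search` is an assumption for illustration.

```python
def bm_search(pattern: bytes, packet: bytes) -> list[int]:
    """Simplified Boyer-Moore: test the last character first, else skip."""
    m, n = len(pattern), len(packet)
    if m == 0 or m > n:
        return []
    # skip[c] = distance of the last occurrence of c from the end of the pattern
    skip = {c: m - 1 - idx for idx, c in enumerate(pattern)}
    matches = []
    i = 0
    while i <= n - m:
        last = packet[i + m - 1]
        if last == pattern[m - 1]:
            # Last character matches: compare the whole window.
            if packet[i:i + m] == pattern:
                matches.append(i)
            i += 1
        else:
            # Mismatch: align the window with the last occurrence of `last`,
            # or jump the full pattern length if it never occurs.
            i += skip.get(last, m)
    return matches


if __name__ == "__main__":
    print(bm_search(b"ABBA", b"ABBABBA"))
```

The best case O(n/m) arises when the inspected character never occurs in the pattern, so every step jumps m positions; the worst case stays O(n*m) because a matching last character only advances the window by one.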
11
Aho-Corasick
Failure pointer
─ Prevents restarting at the top of the trie when a failure occurs
─ A new attempt is made by shifting
How about multiple strings?
[Figure: example pattern BABAR]
12
Multiple String Trie Construction
Example: P = {he, she, his, hers}
[Figure: state machine for P with states 0-9; legend: initial state, accepting state, state transition function]
13
Aho-Corasick: Searching
[Figure: state machine with states 0-9 searching the input stream "hxhers"; labels: input stream, matching string]
The input stream is scanned only once
Complexity: linear time
14
Aho-Corasick: Summary
Pros:
─ Computational complexity: worst case O(n)
─ Scans the input once and outputs all matches
Cons:
─ Requires constructing a finite state machine
─ Failure pointers needed
─ Too big to fit on chip: each node has up to 256 pointers
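To make the trie, failure pointers, and single-pass scan concrete, here is a minimal Python sketch of Aho-Corasick using the pattern set P = {he, she, his, hers} from the earlier slide. It uses dictionaries for transitions instead of 256-entry pointer arrays, and the names `build_automaton` and `search` are assumptions for illustration.

```python
from collections import deque


def build_automaton(patterns):
    goto = [{}]      # goto[state][char] -> next state; state 0 is the root
    fail = [0]       # failure pointer of each state
    output = [[]]    # patterns recognized on entering each state
    for p in patterns:
        state = 0
        for ch in p:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(p)
    # Breadth-first pass: depth-1 states fail to the root; deeper states
    # follow their parent's failure pointer until a transition exists.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def search(text, automaton):
    """Scan text once and return (start offset, pattern) for every match."""
    goto, fail, output = automaton
    state, matches = 0, []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]          # follow failure pointers, never rescan
        state = goto[state].get(ch, 0)
        for p in output[state]:
            matches.append((pos - len(p) + 1, p))
    return matches


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(search("hxhers", automaton))
```

The storage concern on the slide comes from the array-based layout: with one pointer per possible byte value, every state costs 256 entries, whereas this dictionary version trades that space for slower lookups.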
15
Hashing
An efficient set membership query mechanism
─ Programming is trivial
─ Query complexity: O(n) best case (n: size of packet)
─ Query accuracy: false positives possible
However, to handle collisions
─ Each hash entry contains a list of the IDs of all elements that share the hash value
─ Minimum storage requirement: O(n*w), where n is the number of elements and w is the minimum width of each element
Question: can we trade accuracy for storage using the hashing idea?
16
Bloom Filter
Data structure proposed by Burton Bloom
Randomized data structure
─ Strings are stored using multiple hash functions (programming)
─ A string's presence is checked based on multiple bits (querying)
Membership queries can result in false positives
Powerful tool for
─ Content networks
─ Route traceback
─ Network measurements
─ Intrusion detection
17
Bloom Filter Programming
Instead of one hash function, use k independent hash functions
Instead of n*w bits of storage, only an m-bit vector is required
Initially all bits are cleared
Programming sets one bit per hash function
─ A bit stays set if two elements hash to the same position
18
Bloom Filter Querying
Procedure:
String x is hashed by the k hash functions
Each hash function pinpoints one bit in the m-bit vector
The k selected bits are ANDed
If the result is 0, x is not a member; otherwise x is reported as a member (possibly a false positive)
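Programming and querying can be sketched together in Python. This is an illustrative sketch under stated assumptions: the class name `BloomFilter` is hypothetical, and the k "independent" hash functions are approximated by SHA-256 with a one-byte seed prefix (so k must stay below 256), which is a convenience for the example and not what a hardware design would use.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = [0] * m               # m-bit vector, initially all cleared

    def _positions(self, item: bytes):
        # Assumption: seeded SHA-256 stands in for k independent hash functions.
        for seed in range(self.k):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        """Programming: set the bit selected by each hash function."""
        for pos in self._positions(item):
            self.bits[pos] = 1            # stays set if already set by another item

    def query(self, item: bytes) -> bool:
        """Querying: AND the k selected bits."""
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    bf = BloomFilter(m=1024, k=4)
    bf.add(b"perl.exe")
    print(bf.query(b"perl.exe"))          # a stored string is always reported
```

A stored string can never be missed, since all of its bits were set during programming; the only possible error is a false positive, where other strings happen to have set all k bits.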
19
Bloom Filter: False Positive Rate
n: number of strings to be stored
k: number of hash functions
m: size of the bit array
The false positive probability
─ f = (1/2)^k, with k at its optimal value
─ Optimal number of hash functions: k = ln 2 * m/n ≈ 0.693 * m/n
The false positive rate decreases exponentially with the number of hash functions and the memory
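The relationship between the two formulas can be checked numerically. The sketch below uses the standard approximation f = (1 - e^(-k*n/m))^k for a general k, which is not on the slide but reduces to (1/2)^k at k = ln 2 * m/n; the function names and the sample sizes are assumptions for illustration.

```python
import math


def optimal_k(m: int, n: int) -> float:
    """Optimal number of hash functions for an m-bit array holding n strings."""
    return math.log(2) * m / n


def false_positive_rate(m: int, n: int, k: float) -> float:
    """Standard approximation of the false positive probability for any k."""
    return (1.0 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    m, n = 10_000, 1_000                  # hypothetical sizes: 10 bits per string
    k = optimal_k(m, n)
    print(k, false_positive_rate(m, n, k), 0.5 ** k)
```

Because k is proportional to m/n at the optimum, doubling the memory per stored string doubles the exponent in (1/2)^k, which is the exponential decrease noted on the slide. In practice k is rounded to an integer.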
20
Counting Bloom Filters
Member deletion
─ Deleting a member requires clearing all of its related bits
─ A bit, once set in the bit vector, cannot be cleared easily, because it may have been set by multiple members
Solution
─ Assume member deletion is rare
─ Counting Bloom filter
  Keep a counter per bit, updated when an element is added or deleted
  The bit in the m-bit vector is reset when its counter reaches 0
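A counting Bloom filter can be sketched by replacing each bit with a counter. As before, this is a minimal illustration: the class name `CountingBloomFilter` is hypothetical, seeded SHA-256 is an assumed stand-in for k independent hash functions, and the bit vector is left implicit (a bit is considered set while its counter is above 0).

```python
import hashlib


class CountingBloomFilter:
    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.counters = [0] * m           # one counter per bit position

    def _positions(self, item: bytes):
        # Assumption: seeded SHA-256 stands in for k independent hash functions.
        return [
            int.from_bytes(hashlib.sha256(bytes([seed]) + item).digest()[:8], "big") % self.m
            for seed in range(self.k)
        ]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: bytes) -> None:
        """Only meaningful for items that were previously added."""
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1   # the bit clears only when this reaches 0

    def query(self, item: bytes) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))


if __name__ == "__main__":
    cbf = CountingBloomFilter(m=1024, k=4)
    cbf.add(b"perl.exe")
    cbf.add(b"cmd.exe")
    cbf.remove(b"perl.exe")
    print(cbf.query(b"cmd.exe"))          # shared positions stay set for cmd.exe
```

The cost is storage: each position needs several bits for its counter instead of one, which is why the slide treats deletion as the rare case.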
21
Approximate String Searching Using Bloom Filters
22
Approximate String Searching
John W. Lockwood et al., "Deep Packet Inspection Using Parallel Bloom Filters"
23
Summary
Algorithm | Idea | Computation | Storage | Problem
Brute force | Naive | O(m*n) | - | Slow
Boyer-Moore | Skip | O(m*n) worst, O(n/m) best | 0.1 MB (10K rules) | Shift table needed
Aho-Corasick | Trie | O(n) worst case | 50 MB (1500 rules) | Storage demanding
Bloom filter | Approximate searching | O(n) | 0.1 MB (10K rules) | False positives
24
For Next Class
Read Comer: chapters 6 and 9
Final project (Option 1)
─ Project group finalized by 9/19/07: group leader, email me your group members; no more than 3 members per group
─ Project topic finalized by 9/28/07: group leader, email me your topic
Paper presentation + final exam (Option 2)
─ 9/19/07: group leader, email me your group members; no more than 2 members per group
─ Based on one or two assigned papers (<20 min)