Published by Anita Worland. Modified over 9 years ago.
1
Variable-Stride Multi-Pattern Matching For Scalable Deep Packet Inspection
Nan Hua (1), Haoyu Song (2), T. V. Lakshman (2)
(1) Georgia Tech, (2) Bell Labs, Alcatel-Lucent
April 12, 2015
2
All Rights Reserved © Alcatel-Lucent 2009 2 | IEEE INFOCOM | April 2009
Introduction
Deep Packet Inspection (DPI)
– Stateful inspection of packet header + packet payload
– Applications: Network Intrusion Detection & Prevention, Lawful Inspection, Censorship, Quality of Service, …
Focus of this work: Fixed String Pattern Matching
Why important?
– Key component of signature-based DPI systems
– The basis for advanced inspection
– Performance bottleneck
Requirements
– High-speed, real-time in-line processing
– Low memory storage and bandwidth consumption
– Low false positive rate and low miss rate
– Resilience to worst-case scenarios
3
Classical Algorithm: Aho-Corasick DFA (1975)
Sets the foundation for most recent multi-pattern matching algorithms
Consumes one byte/character per lookup cycle
– At 10GbE/OC192 rates (~1 gigabyte/sec) this means roughly 10^9 lookups per second
Too many state transitions even for a small string set
– State fan-out = alphabet size
[Figure: Aho-Corasick DFA for the string set {he, his, him, her}, with the init state and accept states marked. Failure transitions back to the init state are not shown.]
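As a reference point for the one-byte-per-lookup behaviour, here is a minimal Python sketch of a textbook Aho-Corasick matcher for the slide's string set. It is not taken from the deck: the names `build_automaton` and `search` are hypothetical, and it follows failure links at search time instead of precomputing a full DFA table.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto/fail/output tables for a set of string patterns."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first pass: a state's failure link points to the longest
    # proper suffix of its path that is also a path from the root.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out

def search(text, automaton):
    """Consume one character per step; return (end_index, pattern) pairs."""
    goto, fail, out = automaton
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hits.extend((i, p) for p in out[s])
    return hits
```

For example, `search("ushers", build_automaton(["he", "his", "him", "her"]))` returns `[(3, 'he'), (4, 'her')]`. Every input character costs at least one table lookup, which is the per-byte bottleneck this slide points at.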
4
All Rights Reserved © Alcatel-Lucent 2009 4 | IEEE INFOCOM | April 2009 Increasing Throughput Through Parallelism Multiple parallel load-balancing search engines Memory Bandwidth Intensive Complex packet scheduler Overall cost depends on each single engine Make a single search engine scalable Simple pipeline does not work due to the DFA feedback path Superscalar & Multi-threading works with complex packet scheduler Examine multiple bytes or characters per lookup step Our goal: Improving throughput without exploding the memory Better state machine implementation Better (on-chip and off-chip) memory organization
5
A Naive Realization of Multi-Byte Pattern Matching
Patterns, split into fixed 4-byte segments:
s1 : technical → tech | nica | l
s2 : technically → tech | nica | lly
s3 : tel → tel
s4 : telephone → tele | phon | e
s5 : phone → phon | e
s6 : elephant → elep | hant
[Figure: state machine over these segments with states q0 to q7; edge labels tech, nica, l, lly, tele, phon, e, elep, hant, tel; outputs s1 to s6.]
Input alignment problem: e.g. it can match “phone” but not “iphone”
Still one character per lookup, but speedup can be achieved by …
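The alignment problem can be reproduced with a few lines of Python. This is a simplified model, not the slide's state machine: `chunks` and `naive_stride_match` are hypothetical helpers that cut both the input and a single pattern into fixed 4-byte segments and compare segment by segment.

```python
def chunks(s, stride):
    """Cut s into consecutive fixed-size segments (the last may be shorter)."""
    return [s[i:i + stride] for i in range(0, len(s), stride)]

def naive_stride_match(text, pattern, stride=4):
    """True only if the pattern starts on a stride boundary of the text."""
    p = chunks(pattern, stride)
    t = chunks(text, stride)
    for i in range(len(t) - len(p) + 1):
        window = t[i:i + len(p)]
        # All full segments must agree; the final pattern segment may be
        # a prefix of the corresponding text segment.
        if window[:-1] == p[:-1] and window[-1].startswith(p[-1]):
            return True
    return False
```

With these definitions `naive_stride_match("telephone", "phone")` is true because "phon" starts at offset 4, while `naive_stride_match("iphone", "phone")` is false because the one-byte shift turns the input segments into "ipho" and "ne".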
6
Deploying Multiple Multi-byte Search Engines
Replicate the table for different shift offsets
– Wastes memory storage
One lookup for each offset
– Wastes memory bandwidth
Many previous works can be classified as using this approach: ANCS ’05, JSAC ’06, …
[Figure: example input stream “technxyzicallyab”]
7
Amending Bandwidth with Storage (ISCA ’06)
Combining all possible offsets into one state machine leads to memory explosion
– State fan-out = Sⁿ, where S is the alphabet size and n is the stride
[Figure: DFA for one pattern, “abba”, over the alphabet {a, b}]
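To put the fan-out formula in numbers, assume a byte-oriented alphabet, S = 256 (an assumption for illustration; the slide's figure uses the two-letter alphabet {a, b}):

```latex
\[
\text{fan-out} = S^{n}: \qquad
256^{1} = 256, \quad
256^{2} = 65\,536, \quad
256^{4} = 2^{32} \approx 4.3 \times 10^{9}.
\]
```

Even a stride of 2 multiplies the per-state fan-out by 256, which is why a single combined state machine quickly becomes impractical to store.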
8
Key Idea of Variable-Stride DFA (VS-DFA)
What is the problem with the naive approach?
– The segments within source and target are not aligned
How do humans recognize string patterns in natural language?
– Using words as atomic units, separated by spaces and punctuation
– Example: “this talk is interesting!” versus “I thinkthistalkisboring!”
[Figure: source (data flow) “technxyzicallyab” and signature (to be matched) “technically”]
9
Identifying Atomic Units using Winnowing
Winnowing [S. Schleimer et al., SIGMOD ’03] extracts a document's signature for similarity comparison
First: hash every k consecutive characters, say, k = 2
Second: select the max hash value within a w-byte sliding window, say, w = 3
Third (our extension): partition the string into blocks at the positions of the chosen values
[Figure: the string “technxyzicallyab” annotated with per-position hash values; the selected values mark the block boundaries.]
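The three steps can be sketched in Python as follows. Several details are assumptions rather than the authors' choices: CRC32 stands in for the unspecified hash function, ties are broken towards the rightmost maximum, and each cut is placed right after the first byte of a selected k-gram. The function names are hypothetical.

```python
import zlib

def kgram_hashes(data: bytes, k: int) -> list[int]:
    """Step 1: hash every run of k consecutive bytes (CRC32 is a stand-in)."""
    return [zlib.crc32(data[i:i + k]) for i in range(len(data) - k + 1)]

def selected_positions(hashes: list[int], w: int) -> list[int]:
    """Step 2: in every window of w hashes, keep the position of the maximum."""
    chosen = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        best = max(window)
        # Assumption: ties go to the rightmost maximum in the window.
        chosen.add(start + w - 1 - window[::-1].index(best))
    return sorted(chosen)

def split_blocks(data: bytes, k: int = 2, w: int = 3) -> list[bytes]:
    """Step 3: cut after the first byte of each selected k-gram (assumed)."""
    cuts = [p + 1 for p in selected_positions(kgram_hashes(data, k), w)]
    bounds = [0] + cuts + [len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]
```

Calling `split_blocks(b"technxyzicallyab")` yields variable-size blocks of at most w bytes each. Because every selection depends only on a local window, the blocks between the first and last cut of a pattern reappear unchanged when that pattern is embedded in a longer stream, which is the alignment property the fixed-stride scheme lacks. The actual block boundaries depend on the hash function, so they will not match the slide's figure.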
10
Segmenting Strings to Blocks using Winnowing
Each pattern string is divided into a head block, one or more core blocks, and a tail block
– The core blocks are context independent
– The head block and the tail block are context dependent
– Some short patterns can be coreless or indivisible
Key idea: use the core blocks to identify the pattern, then use the head and tail to verify the match
     pattern        head block   winnowed core blocks   tail block
s4:  confident      conf         id                     ent
s5:  confidential   conf         id|ent                 ial
s3:  identical      id           ent|ica                l
s1:  ridiculous     r            id|ic|ulo|u            s
s6:  entire         ent          (empty-core)           ire
s7:  set            ---          (indivisible)          ---
s2:  authenticate   auth         ent|ica                te
11
Building the Variable-Stride DFA
[Table: head string, core string and tail string of patterns s1 to s7 (confident, confidential, identical, ridiculous, entire, set, authenticate), compiled into the DFA.]
[Figure: the compiled VS-DFA. Core blocks (id, ent, ic, ulo, u, ica) label the transitions from q0 through states q1, q2, q3, q11, q12, q14 and q15; matching states carry head | tail pairs: s1: r | s, s2: auth | te, s3: id | l, s4: conf | ent, s5: conf | ial.]
Short patterns are handled by TCAM: s6 (ent | ire) and s7 (set)
A difference from Aho-Corasick: the jump transition highlighted in the figure can sometimes be removed
12
Pattern Matching System using VS-DFA
Data Stream (Payload) → Winnowing Module → Blocks Queue → Block-based State Machine → Match Result
– Winnowing Module: multiple bytes per cycle
– Block-based State Machine: one block per cycle
Throughput depends on the state machine
[Figure: payload “technxyz icallyab connecti” is winnowed into variable-size blocks, queued, and consumed by the state machine.]
13
State Machine Implementation
VS-DFA comprises two tables: the State Transition Table (STT) and the Match Table (MT)
Both are implemented as efficient hash tables
(a) State Transition Table (STT): hash key = (start state, block), value = end state
q0,  id  → q14   (start transition)
q0,  ent → q1    (start transition)
q14, ic  → q2
q3,  u   → q11
q14, ent → q15
q1,  ica → q12
q15, ica → q12
q2,  ulo → q3
(b) Match Table (MT)
State   Head   Tail
q14     conf   ent
q15     conf   ial
q12     auth   te
q11     r      s
q12     id     l
Depth column: the last row (q12, id, l) has depth 2; the remaining values (1, 3, 2, 2) cannot be assigned to rows from this transcript.
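A rough Python sketch of how two hash tables could drive the matching, using plain dictionaries. The STT entries are transcribed from the slide; everything else is an assumption made for illustration: the pattern names attached to MT rows, the reading of depth as the number of core blocks, the omission of the q11 row (its depth is unclear in the transcript), and the simplified restart rule used in place of real failure handling. `match_blocks` is a hypothetical name.

```python
# State Transition Table: (state, block) -> next state, as on the slide.
STT = {
    ("q0", "id"): "q14", ("q0", "ent"): "q1",
    ("q14", "ic"): "q2", ("q2", "ulo"): "q3", ("q3", "u"): "q11",
    ("q14", "ent"): "q15", ("q1", "ica"): "q12", ("q15", "ica"): "q12",
}

# Match Table: state -> [(head, tail, depth, pattern)].
# Assumption: depth = number of core blocks of the pattern.
MT = {
    "q14": [("conf", "ent", 1, "confident")],
    "q15": [("conf", "ial", 2, "confidential")],
    "q12": [("auth", "te", 2, "authenticate"), ("id", "l", 2, "identical")],
}

def match_blocks(blocks):
    """Feed one block per step; verify head and tail when a core is seen."""
    text = "".join(blocks)
    pos, state, lengths, hits = 0, "q0", [], []
    for block in blocks:
        pos += len(block)
        lengths.append(len(block))
        nxt = STT.get((state, block))
        if nxt is None:
            # Simplified restart from q0, not a true failure function.
            nxt = STT.get(("q0", block), "q0")
        state = nxt
        for head, tail, depth, name in MT.get(state, []):
            core_start = pos - sum(lengths[-depth:])
            # Head must end right before the core, tail must follow it.
            if text[:core_start].endswith(head) and text[pos:].startswith(tail):
                hits.append(name)
    return hits
```

For instance, `match_blocks(["conf", "id", "ent", "ial"])` returns `['confident', 'confidential']`, and `match_blocks(["id", "ent", "ica", "l"])` returns `['identical']` after taking the q15 → q12 jump. Each step costs one STT lookup regardless of the block's length, which is where the multi-byte speedup comes from.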
14
Using TCAM to Handle Short Patterns
An “empty-core” pattern can still benefit from the segmentation
An indivisible pattern needs max{w, w+k-2} replications
[Figure: TCAM layout with a head field (w bytes) and a tail field (w+k-2 bytes); the empty-core pattern “entire” occupies one entry, the indivisible pattern “tes” is replicated across several entries.]
15
Defending Against Single-Byte Blocks
The expected throughput speedup is (w+1)/2
Prone to Denial-of-Service attacks
– Single-byte blocks can lower the throughput
– Adversaries can easily construct repeated single-byte blocks by sending repeated patterns
We can reduce or even eliminate single-byte blocks by applying combination rules to the data stream and the patterns at the same time
– Combine up to w consecutive single-byte blocks into one block
– Maintain the block synchronization feature
– See the paper for details
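The (w+1)/2 figure is consistent with the density result from the winnowing paper, under the assumption that hash values behave like independent random numbers: a fraction 2/(w+1) of positions is selected on average, so the average block length is about the reciprocal.

```latex
\[
\Pr[\text{position selected}] = \frac{2}{w+1}
\quad\Longrightarrow\quad
\mathbb{E}[\text{block length}] \approx \frac{w+1}{2},
\qquad \text{e.g. } w = 3 \;\Rightarrow\; 2 \text{ bytes per lookup.}
\]
```

This is an average over well-mixed input; a stream crafted to produce single-byte blocks falls back to one byte per lookup, which is the attack the combination rules are meant to blunt.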
16
Evaluation: Pattern Sets & Memory Efficiency
Snort-full and ClamAV-full also include the fixed strings extracted from the regular expressions (in Snort) or the advanced rules (in ClamAV)
17
Evaluation Results: Tradeoffs of w and k
Larger w or k results in smaller memory
Larger w or k results in larger TCAM
Larger w results in higher throughput
Results shown are for Snort-fixed; results for ClamAV are similar
18
Conclusion & Future Work
Multi-pattern matching is a key building block of a DPI system
VS-DFA can process multiple bytes per step with small memory size and memory bandwidth consumption
A single VS-DFA search engine can support 10Gbps+ throughput
Future Work
– Find segmentation algorithms other than Winnowing that are better suited to our application
– Use a larger stride for higher throughput without incurring the short-pattern penalty
– Extend the algorithm to support regular expression matching