Space-Time Tradeoffs in Software-based Deep Packet Inspection. Authors: Anat Bremler-Barr, Yotam Harchol, and David Hay. Published in Proc. IEEE HPSR 2011.
2 Goal: software-based DPI; AC-based (exact matching); reduced memory size; fit in CPU cache; worst-case throughput.
3 Aho-Corasick: forward transitions (to deeper states) and failure transitions. Given a state s: Depth(s), e.g. Depth(S4) = 2, Depth(S13) = 3; Label(s), e.g. Label(S4) = BD, Label(S13) = BCA, Label(S12) = CDBCAB. Failure transitions to S0 are omitted.
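The definitions above can be sketched in a few lines of Python. This is a minimal textbook Aho-Corasick construction, not the authors' implementation. The pattern set is an assumption chosen so that states with labels BD, BCA, and CDBCAB exist, and the state numbers it produces need not match the figure.

```python
from collections import deque

# Assumed pattern set for illustration only.
PATTERNS = ["E", "BE", "BD", "BCD", "BCAA", "CDBCAB"]

def build_aho_corasick(patterns):
    """Return (goto, fail, depth, label, accept) for the pattern set."""
    goto = [{}]        # goto[s][c]: forward transition target
    depth = [0]        # depth[s]: length of the path from the root S0
    label = [""]       # label[s]: symbols along the path from the root
    accept = [False]   # accept[s]: True if some pattern ends at s
    for p in patterns:
        s = 0
        for c in p:
            if c not in goto[s]:
                goto.append({})
                depth.append(depth[s] + 1)
                label.append(label[s] + c)
                accept.append(False)
                goto[s][c] = len(goto) - 1
            s = goto[s][c]
        accept[s] = True
    fail = [0] * len(goto)
    queue = deque(goto[0].values())   # depth-1 states fail to the root
    while queue:                      # breadth-first, so shallower states are done first
        s = queue.popleft()
        for c, t in goto[s].items():
            f = fail[s]
            while f and c not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(c, 0)
            accept[t] = accept[t] or accept[fail[t]]
            queue.append(t)
    return goto, fail, depth, label, accept

def scan(text, goto, fail, accept):
    """Return the end positions in text where at least one pattern matches."""
    s, hits = 0, []
    for i, c in enumerate(text):
        while s and c not in goto[s]:
            s = fail[s]            # failure transition (no symbol consumed)
        s = goto[s].get(c, 0)      # forward transition, or stay at the root
        if accept[s]:
            hits.append(i)
    return hits

goto, fail, depth, label, accept = build_aho_corasick(PATTERNS)
```

With this pattern set, `depth[label.index("BCA")]` is 3, and the state labeled CDBCA fails to the state labeled BCA, which is the transition taken when the text CDBCAA is scanned.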
4 State Structure (1/3): Lookup Table Format
State   A    B    C    D    E
S2      S0   S2   S5   S4   S3
S4      S0   S2   S7   S0   S1
S5      S0   S2   S7   S6   S1
S13     S14  S2   S7   S0   S1
…
The lookup table format is used for states with more than 64 forward transitions.
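5 State Structure (2/3): Linear Format
Lookup table rows, for comparison:
State   A    B    C    D    E
S2      S0   S2   S5   S4   S3
S5      S0   S2   S7   S6   S1
Linear format (failure transition in parentheses, followed by the forward transitions):
S4 (S0): no forward transitions
S5 (S7): D -> S6
S2 (S0): C -> S5, D -> S4, E -> S3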
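6 State Structure (3/3): Bitmap Format
Lookup table rows, for comparison:
State   A    B    C    D    E
S2      S0   S2   S5   S4   S3
S5      S0   S2   S7   S6   S1
Linear format:
S5 (S7): D -> S6
S2 (S0): C -> S5, D -> S4, E -> S3
Bitmap format (one bit per symbol A to E marks a forward transition; the forward pointers follow in symbol order, then the failure pointer):
S5: bitmap 00010; pointers S6; failure S7
S2: bitmap 00111; pointers S5, S4, S3; failure S0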
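The three state formats differ only in how the next state is found for an input symbol. The sketch below shows one lookup per format for states S2 and S5 from the slides. The dictionary layout, the function names, and the bitmap strings (derived from which symbols have forward transitions) are assumptions made for illustration; a real encoding would pack these into bits.

```python
ALPHABET = "ABCDE"

def lookup_table_next(row, symbol):
    """Lookup table format: one next-state entry per alphabet symbol."""
    return row[ALPHABET.index(symbol)]

def linear_next(state, symbol):
    """Linear format: scan the (symbol, target) pairs.
    None means there is no forward transition, so the caller follows state["fail"]."""
    for sym, target in state["forward"]:
        if sym == symbol:
            return target
    return None

def bitmap_next(state, symbol):
    """Bitmap format: bit i is '1' if ALPHABET[i] has a forward transition.
    The pointer index is the number of set bits before position i."""
    i = ALPHABET.index(symbol)
    bits = state["bitmap"]
    if bits[i] == "0":
        return None
    return state["pointers"][bits[:i].count("1")]

# States S2 and S5 from the slides, written out by hand.
S2_ROW = ["S0", "S2", "S5", "S4", "S3"]
S2_LINEAR = {"fail": "S0", "forward": [("C", "S5"), ("D", "S4"), ("E", "S3")]}
S2_BITMAP = {"fail": "S0", "bitmap": "00111", "pointers": ["S5", "S4", "S3"]}
S5_BITMAP = {"fail": "S7", "bitmap": "00010", "pointers": ["S6"]}
```

For symbol D, all three S2 encodings yield S4. The lookup table answers in one access but stores an entry per symbol, while the linear and bitmap formats store only the forward transitions and fall back to the failure pointer otherwise.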
7 Path-Compression (1/3): One-way branch states are compressed. Problems: incoming failure transitions and outgoing failure transitions. Solutions: no incoming failure transition is allowed; multiple outgoing transition fields.
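8 Path-Compression (2/3)
Before: Sa -A-> Sb -B-> Sc -C-> Sd, with failure transitions Sa -> Sx, Sb -> Sy, Sc -> Sz.
  Sa: (A, Sb), (*, Sx)
  Sb: (B, Sc), (*, Sy)
  Sc: (C, Sd), (*, Sz)
After: Sa -ABC-> Sd, stored as one compressed state.
  Path length 3, next state Sd
  (A, Sx), (B, Sy), (C, Sz)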
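The before/after layout on this slide can be sketched as a small record builder. The record fields (`length`, `symbols`, `fails`, `next`) and the helper names are hypothetical, and the sketch covers only the outgoing failure fields; it does not handle failure transitions that point into the middle of a compressed path.

```python
def compress_chain(chain, goto, fail):
    """Collapse a chain of one-way branch states into one record.
    chain lists the states in order, e.g. ["Sa", "Sb", "Sc", "Sd"]."""
    symbols, fails = [], []
    for s, t in zip(chain, chain[1:]):
        (c, target), = goto[s].items()   # raises if s is not a one-way branch
        assert target == t
        symbols.append(c)
        fails.append(fail[s])            # one failure field per compressed symbol
    return {"length": len(symbols), "symbols": "".join(symbols),
            "fails": fails, "next": chain[-1]}

def step(record, offset, symbol):
    """Advance inside a compressed path from position offset."""
    if symbol == record["symbols"][offset]:
        if offset + 1 == record["length"]:
            return ("state", record["next"])   # path fully matched
        return ("offset", offset + 1)          # still inside the path
    return ("fail", record["fails"][offset])   # take the stored failure transition

# The chain from the slide, written out by hand.
goto = {"Sa": {"A": "Sb"}, "Sb": {"B": "Sc"}, "Sc": {"C": "Sd"}}
fail = {"Sa": "Sx", "Sb": "Sy", "Sc": "Sz"}
record = compress_chain(["Sa", "Sb", "Sc", "Sd"], goto, fail)
```

The resulting record holds the path ABC, the length 3, the next state Sd, and the failure targets Sx, Sy, Sz, which mirrors the "after" entries on the slide.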
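9 Path-Compression (3/3): Tuck et al. (INFOCOM 2004)
Before:
  Sa: (A, Sb), (*, Sx);  Sb: (B, Sc), (*, Sy);  Sc: (C, Sd), (*, Sz)
  Si: (T, Sj), (*, Sp);  Sj: (A, Sk), (*, Sq);  Sk: (*, Sb)
After:
  Sa -ABC-> Sd: path length 3, next state Sd; (A, Sx), (B, Sy), (C, Sz)
  Si -TA-> Sk: path length 2, next state Sk; (T, Sp), (A, Sq)
  Sk: (*, ???)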
10 Aho-Corasick Path Compression: Before and After. Example texts: CDBCAB and CDBCAA.
11 Leaves-Compression
Trie leaves contain only a failure transition, so a leaf can be removed by adding one bit to each forward transition to indicate an accepting state. The process can be applied recursively.
Example (Sa -A-> Sb -B-> Sc):
  Original:    (A, Sb), (B, Sc), (*, Sx)
  1st process: (A, Sb, 0), (B, Sx, 1)
  2nd process: (AB, Sx, 1)
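A minimal sketch of the idea, under stated assumptions: the automaton is given as plain dictionaries, every forward transition carries a match bit, and a transition into a leaf is redirected to the leaf's failure target. The example automaton (for the assumed patterns ABC and B) and the function names are hypothetical, not taken from the paper.

```python
def compress_leaves(goto, fail, accept, root=0):
    """Remove leaves; return (trans, new_fail) with trans[s][c] = (target, match_bit)."""
    def is_leaf(s):
        return s != root and not goto[s]

    def resolve(s):
        # A leaf has no forward transitions, so being in a leaf is equivalent
        # to being in its failure target. Repeat in case that is a leaf too.
        while is_leaf(s):
            s = fail[s]
        return s

    trans, new_fail = {}, {}
    for s, edges in goto.items():
        if is_leaf(s):
            continue
        new_fail[s] = resolve(fail[s])
        trans[s] = {c: (resolve(t), int(t in accept)) for c, t in edges.items()}
    return trans, new_fail

def scan_compressed(text, trans, new_fail, root=0):
    """Return the end positions in text where a match bit is seen."""
    s, hits = root, []
    for i, c in enumerate(text):
        while s != root and c not in trans[s]:
            s = new_fail[s]
        s, bit = trans[s].get(c, (root, 0))
        if bit:
            hits.append(i)
    return hits

# Hand-built automaton for the assumed patterns "ABC" and "B":
# 0 = root, 1 = A, 2 = AB, 3 = ABC (leaf), 4 = B (leaf).
goto = {0: {"A": 1, "B": 4}, 1: {"B": 2}, 2: {"C": 3}, 3: {}, 4: {}}
fail = {0: 0, 1: 0, 2: 4, 3: 0, 4: 0}
accept = {2, 3, 4}   # state 2 accepts because its suffix B is a pattern
trans, new_fail = compress_leaves(goto, fail, accept)
```

After compression, states 3 and 4 no longer appear as transition sources or targets, and the match information they carried lives in the bits on the transitions that used to enter them.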
12 Use both techniques: add one bit for every symbol of a compressed path. Set the bit of the i-th symbol when either (1) the transition with the first i symbols of the path leads to an accepting state, or (2) the failure transition of the pre-compressed state reached after the first i symbols of the path leads to a leaf.
[Figure: states Sa, Sb, Sc, Sd, S0, Sp, Sq; edge labels B, E; compressed entries AB, 0; C, 1; D, 1.]
13 Leaves Compression: Before and After
14 Pointer Compression: Many transitions go to states whose depth is small. 31% of the failure transitions go to depth-1 states, and an additional 35% go to depth-2 states.
15 Variable-Size Pointers: Two pointer lengths, 2 bits and 2 + log2|S| bits.
00: go to state S0.
01: go to a depth-1 state (identified from the current symbol).
10: go to a depth-2 state (identified from the last symbol plus the current symbol; there are fewer valid pairs, so hashing is used).
11: go to the next state through a regular pointer.
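The encoding above can be sketched as follows. The bit-string representation, the function names, and the two lookup dictionaries (`depth1` keyed by one symbol, `depth2` keyed by a symbol pair and standing in for the hash table) are assumptions made for illustration.

```python
import math

def encode_pointer(target, depth, num_states):
    """Return the pointer to target as a bit string with a 2-bit tag."""
    width = max(1, math.ceil(math.log2(num_states)))
    if depth[target] == 0:
        return "00"                      # root S0
    if depth[target] == 1:
        return "01"                      # recovered from the current symbol
    if depth[target] == 2:
        return "10"                      # recovered from the last two symbols
    return "11" + format(target, f"0{width}b")   # regular pointer

def decode_pointer(bits, last_two, depth1, depth2):
    """Recover the target state; last_two holds the last two symbols read."""
    tag = bits[:2]
    if tag == "00":
        return 0
    if tag == "01":
        return depth1[last_two[-1]]
    if tag == "10":
        return depth2[last_two]
    return int(bits[2:], 2)

# Hypothetical automaton fragment with 16 states:
# state 1 = "B" (depth 1), state 2 = "BC" (depth 2), state 3 = "BCA" (depth 3).
depth = {0: 0, 1: 1, 2: 2, 3: 3}
depth1 = {"B": 1}
depth2 = {"BC": 2}
NUM_STATES = 16
```

With 16 states, a pointer to a depth-1 or depth-2 state costs 2 bits instead of 6, which is where the savings come from when most failure transitions target shallow states.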
16 Huffman Coding: Huffman coding allocates short codes to frequent symbols and long codes to infrequent ones. A lookup table provides the symbol-to-Huffman-code conversion. This idea is not used.
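For reference, a minimal sketch of how such a symbol-to-code lookup table could be built with a standard Huffman construction. The symbol frequencies are made up for illustration and the function name is hypothetical.

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Return a dict mapping each symbol to its Huffman code (a bit string)."""
    tiebreak = count()   # unique counter so the dicts are never compared
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in freq}
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in codes1.items()}
        merged.update({s: "1" + code for s, code in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Made-up frequencies: A is frequent, C and D are rare.
codes = huffman_codes({"A": 5, "B": 2, "C": 1, "D": 1})
```

Here the frequent symbol A receives a 1-bit code while C and D receive 3-bit codes, and the returned dictionary plays the role of the lookup table mentioned on the slide.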
17 Evaluation Environment: Two environments. (1) Core 2 Duo 2.53 GHz (2 cores), 32 KB L1, 3 MB L2. (2) Core i GHz (4 cores), 32 KB L1, 256 KB L2, 8 MB L3.
18 Evaluation Traffic. Patterns: Snort; ClamAV (partial). Traffic: DARPA (real life); exhaustive traversal; failure-path traversal; worst case.
19 Space Requirement
20 Throughput
21 Memory Access
22 L1 Cache Miss Ratio
23 Miss ratio of Larger L2 Cache