Author : Xinming Chen,Kailin Ge,Zhen Chen and Jun Li Publisher : ANCS, 2011 Presenter : Tsung-Lin Hsieh Date : 2011/12/14 1
Introduction Related Work Background Proposed Algorithm : AC-Suffix-Tree Algorithm Performance Analysis 2
TCP and IP fragmentation can be used to evade signature detection at IDS / IPS. The common defense is buffering and reassembling packets. However, buffering of out-of-sequence packets can become impractical on high speed links due to limited fast memory capacity. 3
In this paper, AC-Suffix-Tree, a buffer free scheme for string matching is proposed, which detects patterns across out-of-sequence packets without buffering and reassembly. This novel algorithm associates the classical AC (ACA) algorithm with a pattern suffix tree to search patterns with only the state numbers of AC automaton and suffix tree stored. 4
What is the current situation of packet reordering in Internet? In 2005, Dharmapurikar found that packet reordering in TCP traffic only affects 2-3% of the overall traffic[6]. An older paper reports that 90% of the TCP packets were reordered in the trace of Dec and Jan [3], but Dharmapurikar claims it was because the older generation of router architecture. 5
Pattern Suffix Tree : Let X = {abaaba,ababab}, suffix set of X is {a,ba,aba,aaba,baaba,b,ab,bab,abab,babab} 6
The return value contains the stop state and a “fact” mark. Once the input string is not finished but there is no available next state, fact is false; and once the input string is finished but PST is not finished, fact is true. So fact = true means str is a proper factor of some patterns in X. 7
A simple situation of two packets’ reordering. When packet y2 comes first, a pattern may exist between the two packets only if some prefix of y2 is one suffix of the patterns. 8
For example : y i =aaba, y j =abaa stop at s 6 append path(s 6 ) 9
What if the pattern x crosses more than two segments? A information merging mechanism is used to merge the PST state records in successive blocks. the return value “fact” of PST is used to identify the proper factor of x. fact = true means the entire segment is a proper factor of x, thus needs to merge the PST state with the predecessor segment. 10
Example : Pattern set X = {abaaba, ababab} Input Y = y 1 y 2 y 3 y 4, where y 1 = bbaa,y 2 = baba,y 3 = baab,y 4 = aabb 11 -> flow number -> sequence number -> length -> state of ACA -> state of PST
First input is y 3 : (baab) passing y 3 to both ACA & PST Buffer contains (1,8,4,2,11,true) 12
Second input is y 1 : (bbaa) passing y 1 to both ACA & PST Buffer contains (1,8,4,2,11,true), (1,0,4,2,11,false) 13
Third input is y 4 : (aabb) combine y 4 with its predecessor (1,8,4,2,11,true) ACA begin with s2,PST begin with s11 Buffer contains (1,0,4,1,8,false), (1,8,8,0,12,false) 14
Fourth input is y 2 : (baba) combine y 2 with (1,0,4,1,8,false) & (1,8,8,0,12,false) path(12) appended to y 2 ’s tail -> bababaaba ACA match with {abaaba,ababab} Buffer clean all records with fid = 1. 15
Compression of Suffix Tree : idea - using a suffix array instead of a tree Pre-processing time will be longer but not the focus 16
Pattern set is chosen from snort,released on 2010/07/22. no regular expressions included. Use traces generated by their own program. Running on PC Pentium 2-core CPU,4GB RAM,32- bit XP 17
Processing speed for different traces with long set 18
Memory usage of AC and suffix tree 19
1,3,2,4,5, 6,7,8,9,10 1,4,5,6,7 8,9,10,2,3 1,3,4,6,7 8,9,2,10,5 20