1 Efficient Rule Matching for Large Scale Systems
  Packet Classification – A Case Study
  Alok Tongaonkar, Stony Brook University

2 Rule Based Systems
  Applications in security
    - Intrusion detection systems
    - Firewalls
    - Access control systems
  Policy is specified in terms of a database of rules
  Enforcement involves identifying the applicable rule(s)

3 Fundamental Operation
  Given an input p with attributes {p1, p2, ..., pk}, identify the rules Ri from {R1, R2, ..., Rn} that match p
  Ri: condition -> action
    e.g. R1: dhost == PLUTO && dport == HTTP && content: "Bad command" -> DENY
  Challenge
    Rule matching algorithms do not scale well, either in space or in time

4 Matching Algorithms
  n = no. of rules, k = no. of attributes
  Linear Search
    - Match one rule at a time
    - Space efficient – O(n*k)
    - Matching time grows linearly with the number of rules – O(n)
  Table-based Search
    - Columns correspond to attributes
    - Rows correspond to rules
    - Wastes space when many rules specify "*" for many attributes – O(n*k)
    - Efficient matching in hardware/multiprocessor: match different attributes in parallel and combine the results
    - In a uniprocessor environment, matching time is O(n)
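
The linear search above can be sketched as follows. This is only an illustration: the rule table, the dictionary representation and the name linear_match are assumptions, not taken from the slides. A rule that omits an attribute treats it as the "*" wildcard.

```python
# Hypothetical rule table: each rule maps attribute -> required value.
# An attribute missing from a rule acts as the "*" wildcard.
RULES = [
    ("R1", {"dhost": "PLUTO", "dport": 80}),
    ("R2", {"dport": 22}),
    ("R3", {"dhost": "PLUTO"}),
]


def linear_match(packet, rules=RULES):
    """Visit all n rules in turn; each visit checks at most k attributes."""
    matched = []
    for name, condition in rules:
        if all(packet.get(attr) == value for attr, value in condition.items()):
            matched.append(name)
    return matched


print(linear_match({"dhost": "PLUTO", "dport": 80}))  # ['R1', 'R3']
```

Every packet pays for a pass over the whole rule list, which is why this approach stops being practical once the rule set reaches thousands of entries.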

5 Matching Algorithms contd.
  Decision Tree (trie-like structure)
    - Each node corresponds to a test on an attribute
    - Matching time – O(k)
      The no. of attributes is an order of magnitude smaller than the no. of rules
    - Size – can be exponential in n
      Minimization of a decision tree is an NP-complete problem!
  Goal
    Develop efficient techniques for rule matching that scale to support thousands of rules

6 Outline
  Problem Formulation
  Techniques
    - Minimize duplication
    - Benign non-determinism
    - Polynomial bound
    - Utility
  Results

7 Packet Classification
  A mechanism that
    - inspects network packets
    - determines how to process a packet based on the values of header fields and the payload
  Applications
    - Firewalls: identify the highest priority matching rule
    - Intrusion Detection Systems: use unordered rules; identify all matching rules
    - Network Monitoring: determine whether a packet satisfies any of the conditions

8 Objective
  Promote sharing of tests
    - not restricted to equality tests
    - we need to support inequalities, disequalities, and bit-masking operations
  Flexibility to support diverse applications
    - Ordered (firewalls) and unordered (intrusion detection) rule sets
    - Packet filtering (network monitoring)

9 Problem Formulation
  Tests involve a variable x and one or two constants (denoted by c and c1).
  Equality tests: x == c
    - tcp_sport == 80
  Equality tests with bitmasks: x & c1 == c
    - tcp_flags & 0x03 == 0x03
  Disequality tests: x != c
    - tcp_sport != 80
  Disequality tests with bitmasks: x & c1 != c
    - tcp_flags & 0x03 != 0x03
  Inequality tests: x <= c
    - tcp_dport <= 1024
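
A minimal sketch of how the five test forms could share one representation, assuming packets are plain dictionaries of integer header fields. The Predicate class and its attribute names are hypothetical and only serve the illustration.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Predicate:
    field: str                      # packet field, e.g. "tcp_sport"
    op: Callable[[int, int], bool]  # operator.eq, operator.ne or operator.le
    const: int                      # the constant c
    mask: Optional[int] = None      # the optional bitmask c1

    def holds(self, packet: dict) -> bool:
        value = packet[self.field]
        if self.mask is not None:
            value &= self.mask      # apply the bitmask before comparing
        return self.op(value, self.const)


packet = {"tcp_sport": 80, "tcp_dport": 443, "tcp_flags": 0x13}
tests = [
    Predicate("tcp_sport", operator.eq, 80),          # x == c
    Predicate("tcp_flags", operator.eq, 0x03, 0x03),  # x & c1 == c
    Predicate("tcp_sport", operator.ne, 80),          # x != c
    Predicate("tcp_flags", operator.ne, 0x03, 0x03),  # x & c1 != c
    Predicate("tcp_dport", operator.le, 1024),        # x <= c
]
results = [t.holds(packet) for t in tests]
print(results)  # [True, True, False, False, True]
```

Keeping the mask optional means a single class covers both the masked and unmasked variants of each test.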

10 Rules and priorities
  A rule R is a conjunction of tests
    - (dport == 22) && (sport <= 1024) && (flags & 0xb == 0x3)
  A set of rules may be partially ordered by a priority relation
    - The priority of R is denoted as Pri(R).
  A rule R matches a packet p if:
    - the packet satisfies R, i.e., R(p) is true
    - the packet does not satisfy any rule that has higher priority than R
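
The matching definition above can be illustrated with a small sketch. It assumes priorities are integers where a larger number means higher priority; the slides only require a partial order, so this encoding, the rule set and the name matching_rules are all assumptions made for the example.

```python
# Each rule: (name, priority, predicate over a packet dictionary).
RULES = [
    ("R1", 2, lambda p: p["dport"] == 22 and p["sport"] <= 1024
                         and (p["flags"] & 0xb) == 0x3),
    ("R2", 1, lambda p: p["dport"] == 22),
    ("R3", 1, lambda p: p["sport"] <= 1024),
]


def matching_rules(packet, rules=RULES):
    """Return rules the packet satisfies that no higher-priority satisfied rule overrides."""
    satisfied = [(name, pri) for name, pri, pred in rules if pred(packet)]
    return [name for name, pri in satisfied
            if not any(other > pri for _, other in satisfied)]


print(matching_rules({"dport": 22, "sport": 1000, "flags": 0x3}))  # ['R1']
print(matching_rules({"dport": 22, "sport": 1000, "flags": 0x0}))  # ['R2', 'R3']
```

Giving every rule the same priority yields the unordered behaviour needed for intrusion detection, while distinct priorities yield the ordered firewall behaviour.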

11 Decision Tree for Packet Classification
  R1: (icmp_type == ECHO)
  R2: (icmp_type == ECHO_REPLY) && (ttl == 1)
  R3: (ttl == 1)
  [Figure: decision tree whose root is labelled {R1, R2, R3} {}. The root branches on icmp_type (== ECHO, == ECHO_REPLY, or neither); lower nodes branch on ttl (== 1, != 1). Node labels include {R1, R3}, {R2, R3}, {R3}, {R1} and {}.]
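
One possible hand-built encoding of a decision tree for R1, R2 and R3, treated as an unordered rule set so that all matching rules are reported. The tuple layout and helper names are illustrative assumptions, and the tree on the slide may be shaped differently. The constants are the standard ICMP type codes for echo request and echo reply.

```python
ECHO, ECHO_REPLY = 8, 0  # ICMP type codes


def leaf(*rules):
    """A leaf is the set of rules announced as matching."""
    return frozenset(rules)


# Internal node: (field, {value: child}, child_for_any_other_value)
TTL_UNDER_ECHO = ("ttl", {1: leaf("R1", "R3")}, leaf("R1"))
TTL_UNDER_REPLY = ("ttl", {1: leaf("R2", "R3")}, leaf())
TTL_UNDER_OTHER = ("ttl", {1: leaf("R3")}, leaf())
ROOT = ("icmp_type",
        {ECHO: TTL_UNDER_ECHO, ECHO_REPLY: TTL_UNDER_REPLY},
        TTL_UNDER_OTHER)


def classify(packet, node=ROOT):
    """Follow one branch per level, so each field is examined at most once."""
    while not isinstance(node, frozenset):
        field, branches, default = node
        node = branches.get(packet[field], default)
    return node


print(sorted(classify({"icmp_type": ECHO, "ttl": 1})))         # ['R1', 'R3']
print(sorted(classify({"icmp_type": ECHO_REPLY, "ttl": 64})))  # []
```

The packet is classified after two field lookups regardless of how many rules exist, which is the O(k) matching time attributed to decision trees.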

12 Exponential Blowup
  R1: x == 1   R2: x == 2   R3: x == 3   R4: x == 4
  R5: y == 1   R6: y == 2   R7: y == 3   R8: y == 4
  [Figure: decision tree that branches on x (1, 2, 3, 4, else) and then on y (1, 2, 3, 4) below the x branches, with leaves such as {R1, R5}, {R1, R6}, {R2, R5} and {R2, R6}.]

13 Decision Tree Construction
  Decompose and reorder tests to increase sharing of tests among rules
  R1: x == 5
  R2: x & 0x03 != 1
  [Figure: decision trees for R1 and R2 with edges labelled x & 0x03 != 1, x & 0x03 == 1, x == 5 and x != 5, and node labels {R1}, {R2}, {R1, R2} and {}.]

14 Condition Factorization
  Decomposing rules into combinations of more primitive tests
    - Similar to factorization of integers
    - Based on the residue operation, which is analogous to integer division
  Residue
    - We want to determine if there is a match for a rule C1
    - We have so far tested a condition C2
    - A residue captures the additional tests that need to be performed at this point to verify C1

15 Residue Operation
  The residue C1/C2 is another condition C3 such that:
    1. C2 ∧ C3 ⇒ C1
    2. C1 ∧ C2 ⇒ C3
  Examples
    - C1: x ∈ [1, 20], C2: x ∈ [15, 25], C3: x <= 20
    - C1: x ∈ [1, 20], C2: x == 15, C3: true
    - C1: x ∈ [1, 20], C2: x == 35, C3: false
    - C1: x ∈ [1, 20], C2: y == 15, C3: x ∈ [1, 20]
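
The four examples above can be reproduced with a small sketch restricted to closed integer intervals on a single variable, with an equality test written as a one-point interval. The tuple encoding (variable, low, high) and the function name are assumptions, and the function returns one valid residue rather than a canonical one.

```python
def residue(c1, c2):
    """Return a condition C3 = C1/C2 for interval conditions (var, lo, hi).

    The result is True, False, or a (var, lo, hi) tuple in which None
    marks a bound that no longer needs to be checked.
    """
    var1, lo1, hi1 = c1
    var2, lo2, hi2 = c2
    if var1 != var2:
        return c1                      # C2 says nothing about C1's variable
    if hi2 < lo1 or lo2 > hi1:
        return False                   # C1 and C2 cannot both hold
    lo = lo1 if lo2 < lo1 else None    # lower bound C2 does not already imply
    hi = hi1 if hi2 > hi1 else None    # upper bound C2 does not already imply
    if lo is None and hi is None:
        return True                    # C2 already implies C1
    return (var1, lo, hi)


c1 = ("x", 1, 20)
print(residue(c1, ("x", 15, 25)))  # ('x', None, 20), i.e. x <= 20
print(residue(c1, ("x", 15, 15)))  # True
print(residue(c1, ("x", 35, 35)))  # False
print(residue(c1, ("y", 15, 15)))  # ('x', 1, 20)
```

In the first case only the upper bound survives, because knowing x ∈ [15, 25] already guarantees x >= 1.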

16 Computing Residue on Tests

17 Build Algorithm
  Recursive procedure
    - Takes a node s as its first parameter
    - Builds the subtree that is rooted at s
  It takes two other parameters
    - Candidate Set (Cs): rules that haven't completed a match, but for which a future match can't be ruled out either
    - Match Set (Ms): all rules for which a match can be announced at s

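18 Minimize Duplication
  R1: x == 1 && y == 1
  R2: x == 2 && y == 2
  R3: y == 3
  [Figure: decision tree that tests x first (branches 1, 2, else) and then y under each branch. The y tests are duplicated, and the leaves include {R1}, {R2} and {}, with {R3} appearing on several branches.]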

19 Minimize Duplication
  R1: x == 1 && y == 1
  R2: x == 2 && y == 2
  R3: y == 3
  [Figure: decision tree that tests y first and then x, with leaves {R1}, {R2}, {R3} and {}.]
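
The effect of test order on tree size can be checked with a simplified recursive builder. This is a sketch only: it handles equality tests, takes a fixed variable order that must cover every variable used in the rules, and the names build and count_nodes are hypothetical. It threads a candidate set and a match set through the recursion in the spirit of the build algorithm, but it is not the algorithm from the talk.

```python
RULES = {
    "R1": {"x": 1, "y": 1},
    "R2": {"x": 2, "y": 2},
    "R3": {"y": 3},
}


def build(candidates, order, matched=frozenset()):
    """candidates: {rule: remaining equality tests}; matched: rules already satisfied."""
    if not candidates:
        return sorted(matched)                    # leaf: announce the match set
    var, rest = order[0], order[1:]
    if all(var not in tests for tests in candidates.values()):
        return build(candidates, rest, matched)   # var discriminates nothing here
    values = sorted({tests[var] for tests in candidates.values() if var in tests})
    children = {}
    for value in values + ["else"]:
        next_candidates, next_matched = {}, set(matched)
        for name, tests in candidates.items():
            if var in tests and tests[var] != value:
                continue                          # rule is ruled out on this branch
            remaining = {v: c for v, c in tests.items() if v != var}
            if remaining:
                next_candidates[name] = remaining
            else:
                next_matched.add(name)
        children[value] = build(next_candidates, rest, frozenset(next_matched))
    return (var, children)


def count_nodes(tree):
    if isinstance(tree, list):
        return 1
    _, children = tree
    return 1 + sum(count_nodes(child) for child in children.values())


x_first = count_nodes(build(RULES, ["x", "y"]))
y_first = count_nodes(build(RULES, ["y", "x"]))
print(x_first, y_first)  # 12 9
```

Testing x first copies R3 into every branch because R3 does not mention x, whereas testing y first resolves R3 at the root and avoids that duplication.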

20 Benign Non-determinism
  Two rules R1 and R2 are said to be independent of each other if they do not have a common test
  Build separate trees for each independent set
  Match packets against each tree: non-determinism without incurring any performance penalty
  If R1 and R2 are independent, a packet may match R1, R2, both, or neither.
    - Number of nodes of the tree for R1 is k1, for R2 is k2.
    - Number of nodes of the tree for R1 ∪ R2 is k1 * k2.
    - Combined number of nodes of the independent trees for R1 and R2 is k1 + k2.

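21 Exponential Blowup
  R1: x == 1   R2: x == 2   R3: x == 3   R4: x == 4
  R5: y == 1   R6: y == 2   R7: y == 3   R8: y == 4
  [Figure: the combined decision tree that branches on x and then on y, with leaves such as {R1, R5}, {R1, R6}, {R2, R5} and {R2, R6}, shown next to two independent trees: one testing only x with leaves such as {R1} and {R2}, and one testing only y with leaves such as {R5} and {R6}.]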
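
For the eight rules above, the saving from independent trees can be sketched as follows. The lookup tables stand in for the two single-variable trees, and the leaf-counting helpers assume m tested values per variable plus one "else" branch; all names are illustrative.

```python
X_TREE = {1: "R1", 2: "R2", 3: "R3", 4: "R4"}  # tree that tests only x
Y_TREE = {1: "R5", 2: "R6", 3: "R7", 4: "R8"}  # tree that tests only y


def match_independent(packet):
    """Match against each independent tree and take the union of the results."""
    hits = [X_TREE.get(packet["x"]), Y_TREE.get(packet["y"])]
    return {rule for rule in hits if rule is not None}


def leaves_combined(m):
    """One tree testing x then y: every x branch repeats the whole y subtree."""
    return (m + 1) ** 2


def leaves_separate(m):
    """Two independent trees, each with m value branches and one else branch."""
    return 2 * (m + 1)


print(sorted(match_independent({"x": 1, "y": 2})))  # ['R1', 'R6']
print(leaves_combined(4), leaves_separate(4))       # 25 10
```

Both variables are still examined exactly once per packet, so the separate trees shrink the structure without adding tests at matching time.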

22 Ensuring Polynomial Bounds
  Breadth of a tree is a function of the breadth of its subtrees
  Select a polynomial bound to satisfy at each node
  Pick tests that satisfy the bound
  Otherwise, pick a test that comes closest to satisfying this constraint and make some outgoing edges non-deterministic

23 Improving Matching Time
  Utility: how much a test goes towards checking a rule
    - based on the notion of assigning costs to tests and rules
    - compare the cost of a rule with the combined cost of a test and the residue of the rule w.r.t. the test
  Select strategy (size reduction is more important than matching time)
    1. Pick a discriminating test when available
       - Pick the test with higher utility
    2. Examine opportunities for benign non-determinism
    3. Pick tests that satisfy the polynomial bound

24 Tree Size

25 Matching Time

26 Summary
  Developed a new technique for fast packet classification
    - Flexible: supports diverse applications in a uniform framework
    - Promotes sharing of tests
  Developed novel techniques for generating packet classification trees that
    - have polynomial size
    - have virtually constant matching time
  Demonstrated the gains from our technique for intrusion detection systems and firewalls

