Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs.

Slides:



Advertisements
Similar presentations
Deep Packet Inspection: Where are We? CCW08 Michela Becchi.
Advertisements

Deep packet inspection – an algorithmic view Cristian Estan (U of Wisconsin-Madison) at IEEE CCW 2008.
Fast Submatch Extraction using OBDDs Liu Yang 1, Pratyusa Manadhata 2, William Horne 2, Prasad Rao 2, Vinod Ganapathy 1 Rutgers University 1 HP Laboratories.
CS 44 – Aug. 29 Regular operations –Union construction Section Nondeterminism –2 kinds of FAs –How to trace input –NFA design makes “union” operation.
4b Lexical analysis Finite Automata
CSC 361NFA vs. DFA1. CSC 361NFA vs. DFA2 NFAs vs. DFAs NFAs can be constructed from DFAs using transitions: Called NFA- Suppose M 1 accepts L 1, M 2 accepts.
Lecture 23UofH - COSC Dr. Verma 1 COSC 3340: Introduction to Theory of Computation University of Houston Dr. Verma Lecture 23.
Complexity and Computability Theory I Lecture #4 Rina Zviel-Girshin Leah Epstein Winter
DFA Minimization Jeremy Mange CS 6800 Summer 2009.
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 3 School of Innovation, Design and Engineering Mälardalen University 2012.
Efficient Memory Utilization on Network Processors for Deep Packet Inspection Piti Piyachon Yan Luo Electrical and Computer Engineering Department University.
Compiler Construction
XFA : Faster Signature Matching With Extended Automata Author: Randy Smith, Cristian Estan and Somesh Jha Publisher: IEEE Symposium on Security and Privacy.
Courtesy Costas Busch - RPI1 Non Deterministic Automata.
A hybrid finite automaton for practical deep packet inspection Department of Computer Science and Information Engineering National Cheng Kung University,
1 Regular expression matching with input compression : a hardware design for use within network intrusion detection systems Department of Computer Science.
1 Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Department of Computer Science and Information Engineering National.
1 Performing packet content inspection by longest prefix matching technology Authors: Nen-Fu Huang, Yen-Ming Chu, Yen-Min Wu and Chia- Wen Ho Publisher:
1.Defs. a)Finite Automaton: A Finite Automaton ( FA ) has finite set of ‘states’ ( Q={q 0, q 1, q 2, ….. ) and its ‘control’ moves from state to state.
Deep Packet Inspection with Regular Expression Matching Min Chen, Danny Guo {michen, CSE Dept, UC Riverside 03/14/2007.
1 Non-Deterministic Finite Automata. 2 Alphabet = Nondeterministic Finite Automaton (NFA)
Liu Yang New Pattern Matching Algorithms for Network Security Applications Liu Yang Department of Computer Science Rutgers University April 4th, 2013.
High-Speed Parallel Processing of Protocol-Aware Signatures Jordi Ros-Giralt, James Ezick, Peter Szilagyi, Richard Lethin Unclassified, DISTRIBUTION STATEMENT.
An Improved Algorithm to Accelerate Regular Expression Evaluation Author : Michela Becchi 、 Patrick Crowley Publisher : ANCS’07 Presenter : Wen-Tse Liang.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
REGULAR LANGUAGES.
1 Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Fang Yu Microsoft Research, Silicon Valley Work was done in UC Berkeley,
An Improved Algorithm to Accelerate Regular Expression Evaluation Author: Michela Becchi, Patrick Crowley Publisher: 3rd ACM/IEEE Symposium on Architecture.
Space-Time Tradeoffs in Software-Based Deep Packet Inspection Anat Bremler-Barr Yotam Harchol ⋆ David Hay IDC Herzliya, Israel Hebrew University, Israel.
Space-Time Tradeoffs in Software-Based Deep Packet Inspection Anat Bremler-Barr Yotam Harchol ⋆ David Hay IDC Herzliya, Israel Hebrew University, Israel.
SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection Authors: Ayesha Khalid, Rajat Sen†, Anupam Chattopadhyay Publisher:
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Authors: Fang Yu, Zhifeng Chen, Yanlei Diao, T. V. Lakshman, Randy H.
4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides.
An Efficient Regular Expressions Compression Algorithm From A New Perspective  Author: Tingwen Liu, Yifu Yang, Yanbing Liu, Yong Sun, Li Guo  Publisher:
Converting NFAs to DFAs How a Syntax Analyser is constructed.
Lexical Analysis: Finite Automata CS 471 September 5, 2007.
Deterministic Finite Automaton for Scalable Traffic Identification: the Power of Compressing by Range Authors: Rafael Antonello, Stenio Fernandes, Djamel.
Algorithms to Accelerate Multiple Regular Expressions Matching for Deep Packet Inspection Sailesh Kumar Sarang Dharmapurikar Fang Yu Patrick Crowley Jonathan.
CMSC 330: Organization of Programming Languages Finite Automata NFAs  DFAs.
Extending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions Publisher : Conference on emerging Networking EXperiments and Technologies.
TCAM –BASED REGULAR EXPRESSION MATCHING SOLUTION IN NETWORK Phase-I Review Supervised By, Presented By, MRS. SHARMILA,M.E., M.ARULMOZHI, AP/CSE.
A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching Yao Song 11/05/2015.
Author : Yang Xu, Lei Ma, Zhaobo Liu, H. Jonathan Chao Publisher : ANCS 2011 Presenter : Jo-Ning Yu Date : 2011/12/28.
Author : Randy Smith & Cristian Estan & Somesh Jha Publisher : IEEE Symposium on Security & privacy,2008 Presenter : Wen-Tse Liang Date : 2010/10/27.
Theory of Computing CSCI 356/541 Lab Session. Outline Lab 1: Finite Automata  Construct and Run Construct and Run  Manipulating Transitions Manipulating.
Donghyun (David) Kim Department of Mathematics and Physics North Carolina Central University 1 Chapter 1 Regular Languages Some slides are in courtesy.
Finite Automata Chapter 1. Automatic Door Example Top View.
TFA: A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Yang Song and H. Jonathan Chao Publisher: ACM/IEEE.
A Fast Regular Expression Matching Engine for NIDS Applying Prediction Scheme Author: Lei Jiang, Qiong Dai, Qiu Tang, Jianlong Tan and Binxing Fang Publisher:
An Improved DFA for Fast Regular Expression Matching Author : Domenico Ficara 、 Stefano Giordano 、 Gregorio Procissi Fabio Vitucci 、 Gianni Antichi 、 Andrea.
Author : S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese Publisher : ANCS ‘07 Presenter : Jo-Ning Yu Date : 2011/04/20.
Accelerating Multi-Pattern Matching on Compressed HTTP Traffic Dr. Anat Bremler-Barr (IDC) Joint work with Yaron Koral (IDC), Infocom[2009]
Advanced Algorithms for Fast and Scalable Deep Packet Inspection Author : Sailesh Kumar 、 Jonathan Turner 、 John Williams Publisher : ANCS’06 Presenter.
1 Section 11.2 Finite Automata Can a machine(i.e., algorithm) recognize a regular language? Yes! Deterministic Finite Automata A deterministic finite automaton.
Deterministic Finite Automata Nondeterministic Finite Automata.
Deflating the Big Bang: Fast and Scalable Deep Packet Inspection with Extended Finite Automata Date:101/3/21 Publisher:SIGCOMM 08 Author:Randy Smith Cristian.
CS412/413 Introduction to Compilers Radu Rugina Lecture 3: Finite Automata 25 Jan 02.
Compiler Construction Lecture Three: Lexical Analysis - Part Two CSC 2103: Compiler Construction Lecture Three: Lexical Analysis - Part Two Joyce Nakatumba-Nabende.
CIS Automata and Formal Languages – Pei Wang
A DFA with Extended Character-Set for Fast Deep Packet Inspection
Lexical analysis Finite Automata
Yan Chen Department of Electrical Engineering and Computer Science
Binary Decision Diagrams
Finite Automata.
4b Lexical analysis Finite Automata
Discrete Controller Synthesis
4b Lexical analysis Finite Automata
Chapter 1 Regular Language
A Hybrid Finite Automaton for Practical Deep Packet Inspection
Presentation transcript:

Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Signature matching in IDS Find instances of network packets that match attack signatures 2 signatures Sig: /.*evil.*/ innocent 12evil34

Matching regular expressions Signatures represented as regular expressions (RE) Finite Automata –Represent and operate regular expressions Time Efficiency –Throughput must cope with Gbps link speeds Space Efficiency –Must fit in main memory of NIDS 3 Time-Space Tradeoff in Finite Automata

Time/Space tradeoff 4 DFAs NFAs Processing time NFA-OBDD Space efficient DFA >= 4 orders of magnitude DFA : Deterministic Finite Automaton NFA : Non- Deterministic Finite Automaton IDEAL

Contributions of our paper NFA-OBDD: New data structure that offers fast regular expression matching with space consumption comparable to that of NFAs –Up to 1645x faster than NFAs with comparable memory consumption –Speed is competitive with DFA variants DFA runs out of memory for our signature sets –Outperforms or is competitive with PCRE 5 Key Idea: Boolean encoding of NFA operation

Trends and challenges 6 Signature set size increased 5x in the last 5 years Challenge Perform fast matching with low memory consumption

Combining DFAs 7 /.*ab.*cd//.*ef.*gh//.*ab.*cd |.*ef.*gh/ Picture courtesy : [Smith et al. Oakland’08] Multiplicative increase in number of states

Combining NFAs 8 M N /.*ab.*cd//.*ef.*gh//.*ab.*cd |.*ef.*gh/ Additive increase in number of states

NFAs more compact than DFAs 9 Real Snort signatures have large counter values Signature /.*1[0|1] {3} / NFA DFA

10 Signature 1: /.*ab.*.{15}cd/ Signature 2: /.*ef.*.{10}gh/ DFA NFA

Outline  Problem Definition  Our Contribution  NFA-OBDD model and operation  Implementation  Evaluation  Related Work  Conclusion 11

NFA-OBDDs: Main idea Why are NFAs slow? –NFA frontiers may contain multiple states –Each frontier state must be processed for each symbol in the input. Idea: Represent and operate NFA frontiers symbolically using Boolean functions –Entire frontier can be modified using a single Boolean formula –Use ordered binary decision diagrams (OBDDs) to represent Boolean formulae

Encoding NFA transition functions An NFA for (0|1)*1, ∑ = {0,1} Transition function 13 xiyf(x,i,y) Frontier: Set of current states Size: O(n); n= # of states Set membership function -Disjunction of binary values of member states

Determine new frontier after processing input: Next set of states = Checking acceptance: NFA operation 14 SAT(Set_of_ Accept_States(x) Frontier(x)) Transition_Function(x,i,y) Frontier(x) Input_Symbol(i) Map y→x Ǝ x,i ) (

Ordered binary Decision Diagram (OBDD) [Bryant 86] Compact representation of Boolean function 15 The BDD of f(x,i,y) with order i < x < y xiyf(x,i,y)

NFA-OBDD NFA-OBDD: NFA representation and operation using OBDDs OBDD Representation of –Transitions –Frontiers Current set of states –Input symbols –Set of accepting states –Set of start states 16

Space efficiency of NFA-OBDDs NFA-OBDD construction: –Uses same combination algorithm as NFAs –OBDD data structure itself utilizes the redundancy of the binary function table 17 Rapid growth of signature set has little impact on NFA-OBDD space consumption (unlike DFAs)

Outline  Problem Definition  Our Contribution  NFA-OBDD model and operation  Implementation  Evaluation  Related Work  Conclusion 18

Experimental apparatus 19 C++ and CUDD package for OBDDs

Regular expression sets Snort HTTP signature set –1503 regular expressions from March 2007 –2612 regular expressions from October 2009 Snort FTP signature set –98 regular expressions from October 2009 Extracted regular expressions from pcre and uricontent fields of signatures 20

Traffic traces HTTP traces –33 traces –Size: 5.1MB –1.24 GB –One week period in Aug 2009 from Web server of the CS department at Rutgers FTP Traces –2 FTP traces –Size: 19.4MB, 24.7 MB –Two week period in March 2010 from FTP server of the CS department at Rutgers 21

Experimental results For 1503 REs from HTTP Signatures 22 *Intel Core2 Duo E7500, 2.93GHz; Linux-2.6; 2GB RAM* 1645x 9-26x 10x

Experimental results For 2612 REs from HTTP signatures 23 MDFA ran out of memory for 2612 REs

Experimental results For 98 RE from FTP signatures 24 MDFA ran out of memory for 98 REs

Multibyte matching Matches k>1 input symbol in a single step Also possible with NFA-OBDDs –Use OBDDs to represent k-step transitive closure of NFA transition function –See paper for details. Brief summary of experimental results –2-stride NFA-OBDD doubles the throughput –Outperforms 2-stride NFA by 3 orders of magnitude 25

Related work Multiple DFAs [Yu et al., ANCS’06] Extended finite automata [Smith et al., Oakland’08, SIGCOMM’08] D 2 FA [Kumar et al., SIGCOMM’06] Hybrid finite automata [Becchi et al., ANCS’08] Multibyte speculative matching [Luchaup et al., RAID’09] Many more – see paper for details 26

Conclusion NFA-OBDDs –Outperform NFAs by three orders of magnitude Up to 1645× in the best case Retain space efficiency of NFAs –Outperform or competitive with the PCRE package –Competitive with variants of DFAs but drastically less memory-intensive 27

Improving Signature Matching using Binary Decision Diagrams Thank You Liu Rezwana Vinod Randy

BDD var ordering 29

30

NFA-OBDD vs NFA NFA-OBDD Time Efficient –Process a set of states in one step Space Efficient –Use the same combination algorithm 31

NFA-OBDD operation Temp] = OBDD[Transition Function] OBDD[Frontier] OBDD[Input Symbol] OBDD[Next Set of States] = Map x→y ( Ǝ x,i OBDD[Temp]) Checking Acceptance : OBDD[Set of Accept States] OBDD[Frontier] 32

Solution for Mutibyte Matching 33

Experimental results 2 Stride 1400 HTTP 34

Experimental results 2 Stride 1400 HTTP signatures zoomed 35

Experimental Results 2 stride ftp 95 signatures 36

37

Matching multiple input symbol 38

39 accept sig

accept sig

NFA is compact 41 Recent signatures have large counter with value up to 500 Signature /.*1[0|1] {3}/