A High Throughput String Matching Architecture for Intrusion Detection and Prevention
Lin Tan, U of Illinois, Urbana-Champaign
Tim Sherwood, UC Santa Barbara

Outline
Why String Matching
  – Matching against multiple strings
The Aho-Corasick Algorithm
  – The Devil in the Constants
A Bit-Split Algorithm
Hardware Design and Analysis
Conclusions

To Protect and Serve
Our machines are constantly under attack.
We cannot rely on end users; we need networks that actively defend themselves.
This requires the protection system to operate at 10 to 40 Gb/s. (We aim at current and next-generation networks.)
IDS/IPS are promising ways of providing protection.
  – Market for such systems: $918.9 million by the end of
  – Snort: a widely accepted open source IDS

Our Contributions
String Matching Architecture
  – 0.4MB and 10Gbps for the Snort rule set (>10,000 characters)
Bit-Split String Matching Algorithm
  – Reduces out edges per state from 256 to 2
Performance/area beats the best techniques we examined by a factor of 10 or more.

Scanning for Intrusions
Most IDSs define a set of rules.
  – A string defines a suspicious transmission.
  – CodeRed worm: web flow established, uricontent with "/root.exe"
We are not building a full IDS; rather, we are building the primitives from which full systems can be built.
[Figure: Traffic In, Scan, Traffic Out; Software IDS]

Multiple String Matching
The multiple string matching problem:
  – Input: a set of strings/patterns S, and a buffer b
  – Output: every occurrence of an element of S in b
  – Extra constraint: b is really a stream
How to implement:
  – Option 1) search for each string independently
  – Option 2) combine the strings and search for all of them at once
A string can be anywhere in the payload of a packet.
[Figure: an example set of strings and an input stream in which they occur]
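As a baseline for comparison, Option 1 can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the function name `find_all_naive` is hypothetical, and the pattern set is borrowed from the {he, she, his, hers} example used later in the slides. It needs the whole buffer in memory and rescans it once per pattern, which is what the stream constraint and Option 2 are meant to avoid.

```python
def find_all_naive(patterns, buf):
    """Option 1: search for each pattern independently.

    Returns a sorted list of (start_offset, pattern) pairs. The buffer is
    rescanned once per pattern, so the work grows with len(patterns).
    """
    hits = []
    for pat in patterns:
        start = buf.find(pat)
        while start != -1:
            hits.append((start, pat))
            # Resume one byte later so overlapping occurrences are found.
            start = buf.find(pat, start + 1)
    return sorted(hits)


PATTERNS = [b"he", b"she", b"his", b"hers"]
print(find_all_naive(PATTERNS, b"ushers"))
# [(1, b'she'), (2, b'he'), (2, b'hers')]
```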

Why Hardware
Snort: >1,000 rules, growing at 1 rule/day or more
Active research into automated rule building
Strings are not limited to [a-z]+
We need a high-speed string matching technique with stringent worst-case performance.
  – Many algorithms are targeted at average-case performance.
Aho-Corasick can scan once and output all matches, but it is too big to fit on-chip.

The Aho-Corasick Algorithm
Given a finite set P of patterns, build a deterministic finite automaton G accepting the set of all patterns in P.

An AC Automaton Example
Example: P = {he, she, his, hers}
[Figure: AC automaton for P, showing the initial state, the accepting states, and the state transition function. Edges pointing back to State 0 are not shown.]
Construction: linear time.
Search for all patterns in P: linear time.
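The example automaton can be reproduced with the textbook Aho-Corasick construction. The sketch below is a generic version, not the authors' implementation; the names `build_ac` and `search` are hypothetical. It stores a dense row of 256 next-state entries per state, which is the table layout whose size the next slide examines.

```python
from collections import deque


def build_ac(patterns):
    """Return (delta, out): delta[s][byte] is the next state and out[s] is
    the set of pattern indices recognized on entering state s."""
    goto, out = [{}], [set()]
    for idx, pat in enumerate(patterns):          # build the trie
        s = 0
        for ch in pat.encode("ascii"):
            if ch not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(idx)
    # Breadth-first pass: failure links plus a dense 256-entry row per state.
    fail = [0] * len(goto)
    delta = [[0] * 256 for _ in goto]
    queue = deque()
    for ch, t in goto[0].items():
        delta[0][ch] = t
        queue.append(t)
    while queue:
        s = queue.popleft()
        out[s] |= out[fail[s]]                    # inherit shorter matches
        for ch in range(256):
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = delta[fail[s]][ch]
                delta[s][ch] = t
                queue.append(t)
            else:
                delta[s][ch] = delta[fail[s]][ch]
    return delta, out


def search(delta, out, patterns, data):
    """Scan data once; return (start_offset, pattern) for every match."""
    s, hits = 0, []
    for pos, byte in enumerate(data):
        s = delta[s][byte]
        for idx in sorted(out[s]):
            hits.append((pos + 1 - len(patterns[idx]), patterns[idx]))
    return hits


PATTERNS = ["he", "she", "his", "hers"]
DELTA, OUT = build_ac(PATTERNS)
print(len(DELTA), "states")                       # 10 states
print(search(DELTA, OUT, PATTERNS, b"hxhers"))    # [(2, 'he'), (2, 'hers')]
```

With the patterns inserted in this order the states come out numbered 0 to 9, with states 2, 5, 7 and 9 accepting.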

Linear Time: So What's the Problem?
[Figure: table of next-state pointers, one row per state]
How to implement it on chip?
Problem: the table is too big to fit on-chip
  – ~10,000 nodes
  – 256 out edges per node
  – Requires 16,384 × 256 × 14 bits = ~10MB
Solution: partition into small state machines
  – Fewer strings per machine
  – Fewer out edges per machine
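The size estimate can be checked with a few lines of arithmetic. The sketch below counts next-state pointers only, using the numbers on the slide. That product alone comes to 7 MiB, the same order of magnitude as the quoted ~10MB; the quoted figure presumably rounds up or includes per-state data not itemized on the slide, which is an assumption here. The second estimate is purely hypothetical: it assumes 8 binary machines that each keep the same state count and pointer width, and it should not be read as the derivation of the 0.4MB result.

```python
STATES = 16_384                            # table rows, from the slide
OUT_EDGES = 256                            # one pointer per byte value
POINTER_BITS = (STATES - 1).bit_length()   # 14 bits address 16,384 states


def table_bytes(states, out_edges, pointer_bits, machines=1):
    """Bytes of next-state pointer storage; match outputs are not counted."""
    return machines * states * out_edges * pointer_bits // 8


full = table_bytes(STATES, OUT_EDGES, POINTER_BITS)
# Hypothetical: 8 two-way machines with the same state count and width.
split = table_bytes(STATES, 2, POINTER_BITS, machines=8)

print(f"256-way table: {full / 2**20:.2f} MiB")
print(f"8 x 2-way    : {split / 2**20:.2f} MiB ({full // split}x smaller)")
```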

Our Main Idea: Bit-Split
Partition the rules (P) into smaller sets (P_0 to P_n).
Build an AC state machine for each subset.
For each DFA P_i, rip the state machine apart into 8 tiny state machines (B_i0 through B_i7).
  – Each of them searches for 1 bit in the 8-bit encoding of an input character.
  – Only if all the different B machines agree can there actually be a match.

Binary Encoding
P_0 = { he, she, his, hers }
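The encodings themselves do not survive in the transcript, but they are the standard 8-bit ASCII codes and can be printed directly. In the sketch below the helper `bit_of` is hypothetical, and numbering the bits from the most significant end (bit 0 first) is an assumption, chosen because it reproduces the state sets listed for the bit-3 machine in the example that follows.

```python
PATTERNS = ["he", "she", "his", "hers"]


def bit_of(ch, j):
    """Bit j of the 8-bit code of ch, with j = 0 the most significant bit."""
    return (ord(ch) >> (7 - j)) & 1


for ch in sorted(set("".join(PATTERNS))):
    code = format(ord(ch), "08b")
    print(f"{ch}  {code}  bit3={bit_of(ch, 3)}  bit4={bit_of(ch, 4)}")
# e  01100101  bit3=0  bit4=0
# h  01101000  bit3=0  bit4=1
# i  01101001  bit3=0  bit4=1
# r  01110010  bit3=1  bit4=0
# s  01110011  bit3=1  bit4=0
```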

An Example of Bit-Split
P_0 = { he, she, his, hers }
[Figure: AC automaton for P_0 (edges pointing back to State 0 are not shown) and the bit-split machine B_03 built from it, with transitions on input bits 0 and 1. Each B_03 state stands for a set of P_0 states:]
b0 {0}, b1 {0,1}, b2 {0,3}, b3 {0,1,2,6}, b4 {0,1,4}, b5 {0,3,7,8}, b6 {0,1,2,5,6}, b7 {0,3,9}

Compact State Set
P_0 = { he, she, his, hers }
[Figure: AC automaton for P_0 (edges pointing back to State 0 are not shown) and B_03 with each state set reduced to the accepting states it contains:]
b0 {}, b1 {}, b2 {}, b3 {2}, b4 {}, b5 {7}, b6 {2,5}, b7 {9}

An Example of Bit-Split
P_0 = { he, she, his, hers }
[Figure: AC automaton for P_0 (edges pointing back to State 0 are not shown) with the compact bit-split machines B_03 and B_04:]
B_03: b0 {}, b1 {}, b2 {}, b3 {2}, b4 {}, b5 {7}, b6 {2,5}, b7 {9}
B_04: b0 {}, b1 {}, b2 {}, b3 {}, b4 {2}, b5 {}, b6 {2,5}, b7 {}, b8 {2,7}, b9 {9}
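The state sets in this example can be regenerated with a subset construction restricted to one bit of the input byte. The sketch below is one way to do it, under stated assumptions rather than the authors' code: the automaton is written out by hand with the usual failure links for this pattern set, state numbers are chosen to agree with the sets shown for B_03, bit 0 is taken to be the most significant bit, and `step` and `bit_split` are hypothetical names.

```python
# AC automaton for {he, she, his, hers}, written out by hand (assumed
# numbering: it reproduces the B_03 sets listed above).
GOTO = {(0, "h"): 1, (1, "e"): 2, (0, "s"): 3, (3, "h"): 4, (4, "e"): 5,
        (1, "i"): 6, (6, "s"): 7, (2, "r"): 8, (8, "s"): 9}
FAIL = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 0, 7: 3, 8: 0, 9: 3}
ACCEPTING = {2, 5, 7, 9}


def step(state, byte):
    """Next AC state on one input byte, following failure links as needed."""
    ch = chr(byte)
    while state != 0 and (state, ch) not in GOTO:
        state = FAIL[state]
    return GOTO.get((state, ch), 0)


def bit_split(bit):
    """Build the binary machine for one bit (bit 0 = most significant).

    Returns (sets, table): sets[b] is the set of AC states that bit-split
    state b stands for, and table[b] = (next on input 0, next on input 1).
    """
    mask = 0x80 >> bit
    sets = [frozenset([0])]
    index = {sets[0]: 0}
    table = []
    b = 0
    while b < len(sets):        # breadth-first; sets grows as states appear
        row = []
        for value in (0, 1):
            # All AC states reachable on any byte whose chosen bit == value.
            target = frozenset(step(s, byte) for s in sets[b]
                               for byte in range(256)
                               if bool(byte & mask) == bool(value))
            if target not in index:
                index[target] = len(sets)
                sets.append(target)
            row.append(index[target])
        table.append(tuple(row))
        b += 1
    return sets, table


sets3, table3 = bit_split(3)
for b, members in enumerate(sets3):
    print(f"b{b}", sorted(members), "compact:", sorted(members & ACCEPTING),
          "next:", table3[b])
```

For this example the bit-3 machine has 8 states and the bit-4 machine has 10, so neither exceeds the 10 states of the original automaton.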

Nice Properties
The number of states in B_ij is rigorously bounded by the number of states in P_i.
  – No exponential blow-up in state
Linear construction time
Possible to traverse multiple edges at a time to multiply throughput

Matching on the Example
[Figure: AC automaton for P_0 scanning the input stream]
Input stream: hxhers
Only scan the input stream once.

Matching on the Example
[Figure: AC automaton for P_0 and the compact bit-split machines B_03 and B_04 run on the input hxhe, yielding state 2]
How do you "combine" the results from the different state machines?
Only if all the state machines agree is there actually a match.

How to Implement
The AC state machine is equivalent to the 8 tiny state machines.
The 8 tiny state machines can run independently, which means in parallel.
  – Intersection is done with a bit-wise AND.
  – 8 is intuitive but not optimal.
How do we build a system to implement this algorithm?
  – Our algorithm makes it feasible to do so on-chip.
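The equivalence can be tried out in software by running eight one-bit machines side by side and ANDing their partial match vectors after every byte. The sketch below is an illustration under assumptions, not the hardware design: it repeats the small hand-written automaton for {he, she, his, hers} so that it runs on its own, encodes a match vector with the leftmost bit standing for "he", numbers bits from the most significant end, and uses hypothetical names throughout.

```python
GOTO = {(0, "h"): 1, (1, "e"): 2, (0, "s"): 3, (3, "h"): 4, (4, "e"): 5,
        (1, "i"): 6, (6, "s"): 7, (2, "r"): 8, (8, "s"): 9}
FAIL = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 0, 7: 3, 8: 0, 9: 3}
# Match vector per accepting AC state, bit order (he, she, his, hers).
# State 5 ("she") also reports "he", which is a suffix of it.
OUT = {2: 0b1000, 5: 0b1100, 7: 0b0010, 9: 0b0001}


def step(state, byte):
    """Next AC state on one input byte, following failure links as needed."""
    ch = chr(byte)
    while state != 0 and (state, ch) not in GOTO:
        state = FAIL[state]
    return GOTO.get((state, ch), 0)


def bit_split(bit):
    """Return (table, pmv) for one bit: table[b][v] is the next state on
    input bit v, pmv[b] the partial match vector of bit-split state b."""
    mask = 0x80 >> bit
    sets = [frozenset([0])]
    index = {sets[0]: 0}
    table, pmv = [], []
    b = 0
    while b < len(sets):
        vec = 0
        for s in sets[b]:               # OR the outputs of the member states
            vec |= OUT.get(s, 0)
        pmv.append(vec)
        row = []
        for value in (0, 1):
            target = frozenset(step(s, byte) for s in sets[b]
                               for byte in range(256)
                               if bool(byte & mask) == bool(value))
            if target not in index:
                index[target] = len(sets)
                sets.append(target)
            row.append(index[target])
        table.append(row)
        b += 1
    return table, pmv


MACHINES = [bit_split(bit) for bit in range(8)]


def scan(data):
    """Return the full match vector after each input byte."""
    states = [0] * 8
    vectors = []
    for byte in data:
        full = 0b1111
        for bit, (table, pmv) in enumerate(MACHINES):
            states[bit] = table[states[bit]][(byte >> (7 - bit)) & 1]
            full &= pmv[states[bit]]    # a match needs every machine to agree
        vectors.append(full)
    return vectors


print([format(v, "04b") for v in scan(b"hxhers")])
# ['0000', '0000', '0000', '1000', '0000', '0001']
```

Each machine only ever looks at one bit per byte, so the eight state updates inside the loop are independent of one another; the loop serializes work that hardware can do in parallel.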

A Hardware Implementation
A rule module is equivalent to an AC state machine.
  – Rule modules, and likewise tiles, are structurally equivalent.
  – All full match vectors are concatenated to indicate which strings are matched.
One tile stores one tiny bit-split state machine.
[Figure: String Match Engine. Labels: Byte from Payload (8 bits); Rule Modules 0 to N; within a rule module, a Control Block and Tiles 0 to 3 with 2-bit inputs [0:1], [2:3], [4:5], [6:7], 16-bit Partial Match Vectors, and a Full Match Vector; Complete Set of Matches for All Rules. State Machine Tile: Input (2 bits from each byte), Current State, decoder, 4 Next State Pointers (8 bits each), Partial Match Vector (16 bits), 4:1 Mux, Output Latch, Config Data.]

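An Efficient Implementation
[Figure: partial match vectors (PMV) produced by Tiles 0 to 3 on the input "hxhe", one character per cycle, and the full match vector P obtained by combining them. Recoverable values, with tiles in the order extracted:]
Cycle 3, input e: PMVs 1100, 1111, 1000, 1000; P = 1000
Cycle 2, input h: PMVs 0000, 1110, 0000, 0000; P = 0000
Cycle 1, input x: PMV 1000 for one tile; P = 0000
Cycle 0, input h: PMV 0000 for one tile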
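The combining step in this figure is a plain bit-wise AND across the tiles. The short sketch below applies it to the two cycles whose partial match vectors are complete in the transcript; the dictionary layout and the name `full_match_vector` are illustrative assumptions, and the tile order does not matter for an AND.

```python
from functools import reduce
from operator import and_

# Partial match vectors for input "hxhe", four tiles, cycles 2 and 3 only.
CYCLE_PMVS = {
    2: [0b0000, 0b1110, 0b0000, 0b0000],   # input byte 'h'
    3: [0b1100, 0b1111, 0b1000, 0b1000],   # input byte 'e'
}


def full_match_vector(pmvs):
    """AND the tiles' partial match vectors into one full match vector."""
    return reduce(and_, pmvs)


for cycle, pmvs in sorted(CYCLE_PMVS.items()):
    print(f"cycle {cycle}: P = {full_match_vector(pmvs):04b}")
# cycle 2: P = 0000
# cycle 3: P = 1000
```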

Performance of Hardware
Key metric: Throughput * Characters / Area

Related Work
Software-based
  – Good for ~100Mb/s, common case
FPGA-based
  – Many schemes map rules down to a specialized circuit
      Near-optimal utilization of hardware resources
  – Implementing state machines on block RAMs [Cho and Mangione-Smith]
  – Concurrent with our work: mapping state machines to on-chip SRAM [Aldwairi et al.]
  – Bloom filters [Dharmapurikar et al.]
      Excellent filter in the common case
TCAM-based
  – Require all patterns to be shorter than or equal to the TCAM width
  – Cutting long patterns: 2Gbps with 295KB TCAM [Yu et al.]

Conclusions
New Tile-Based Architecture
  – 0.4MB and 10Gbps for the Snort rule set (>10,000 characters)
  – Can be used for other applications, e.g. IP lookups, packet classification
New Bit-Split Algorithm
  – General-purpose enough for many other applications, e.g. spam detection, peephole optimization, IP lookups, packet classification
  – Feasible to implement on other tile-based architectures

Thank you! Questions?
