ECE 526 – Network Processing Systems Design
Network Security: String Matching Algorithms
Chapter 17: George Varghese

Ning Weng, ECE 526

Goal
Gain the basic knowledge needed to improve network security from a network processing system design perspective.

Outline
Signature-based IDSs
String matching algorithms
─ Boyer-Moore
─ Aho-Corasick
─ Bloom filter
─ Approximate searching
─ Approximate searching based on Bloom filters
Summary

Internet Security
The Internet lacks security
─ Example?
What is Internet security?
─ Confidentiality: data is kept private
─ Integrity: data is protected from modification or destruction
─ Availability: data or service remains accessible
What are the current approaches?
─ Engineering?
─ Non-engineering?
─ Intrusion Detection Systems (IDSs)

Intrusion Detection Systems
Two types of Intrusion Detection Systems (IDSs)
─ Signature detection: matches events against the signatures of known attacks
─ Anomaly detection: uses statistical or learning theory to identify aberrant events
Three important tasks
─ String matching: searching for suspicious strings in packet payloads
─ Traceback: detecting an intruder who uses a forged source address
─ Detecting the onset of a new worm without prior knowledge
The problems of current IDSs
─ Very slow
─ High false-positive rate
─ False positive: answering a membership query positively when the element is not in the set

Snort Rule Example
Snort:
─ A lightweight, open-source intrusion detection system
─ Snort rule example:
  alert tcp $BAD 80 -> $GOOD 90 \
  (content: "perl.exe"; msg: "detected perl.exe";)
─ Looks for the string "perl.exe" in a TCP packet going from IP $BAD, port 80 to IP $GOOD, port 90
─ Upon detection, generates an alert with the message "detected perl.exe"
Question: when a packet arrives, how do we check it?
Question: what about multiple rules?
String matching is the bottleneck.
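To make the two questions concrete, here is a minimal sketch of checking one packet against a list of rules. The `Rule` and `Packet` classes, the `check_packet` function, and the example addresses are all hypothetical and are not Snort's actual data structures; the payload test simply uses Python's built-in substring search.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical, simplified stand-in for a Snort-style content rule."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    content: bytes
    msg: str


@dataclass
class Packet:
    """Hypothetical packet: header fields plus raw payload."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes


def check_packet(packet: Packet, rules: list[Rule]) -> list[str]:
    """Return the alert message of every rule the packet triggers."""
    alerts = []
    for rule in rules:
        header_matches = (
            packet.src_ip == rule.src_ip
            and packet.src_port == rule.src_port
            and packet.dst_ip == rule.dst_ip
            and packet.dst_port == rule.dst_port
        )
        # The payload scan is the expensive step: one string search per rule.
        if header_matches and rule.content in packet.payload:
            alerts.append(rule.msg)
    return alerts


# Example addresses stand in for $BAD and $GOOD (assumed values).
rules = [Rule("192.0.2.1", 80, "198.51.100.1", 90, b"perl.exe", "detected perl.exe")]
pkt = Packet("192.0.2.1", 80, "198.51.100.1", 90, b"GET /cgi-bin/perl.exe")
alerts = check_packet(pkt, rules)
```

With many rules, the loop runs one payload search per rule, which is why the slide singles out string matching as the bottleneck.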

String Searching: Brute Force
An arbitrary string can be anywhere in the packet.
Naive approach
Input: string of size m; packet of size n (assuming n > m)
  For i := 0 to n-m do
    For j := 0 to m-1 do
      Compare string[j] with packet[i+j]
      If not equal, exit the inner loop
    If the inner loop completed without a mismatch, report a match at i
Complexity:
─ Worst case: O(m*n)
─ Best case: O(n)
Can we do better?
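The naive approach can be sketched in Python as follows. The function name `brute_force_search` is an assumption for illustration; the loop structure follows the pseudocode on the slide.

```python
def brute_force_search(pattern: bytes, packet: bytes) -> list[int]:
    """Return every offset at which pattern occurs in packet."""
    m, n = len(pattern), len(packet)
    matches = []
    for i in range(n - m + 1):              # i = 0 .. n-m
        for j in range(m):                  # j = 0 .. m-1
            if pattern[j] != packet[i + j]:
                break                       # mismatch: exit the inner loop
        else:
            matches.append(i)               # inner loop saw no mismatch
    return matches


offsets = brute_force_search(b"perl.exe", b"GET /cgi-bin/perl.exe HTTP/1.0")
```

The best case of O(n) arises when the first character mismatches at every offset; the worst case of O(m*n) arises when almost every comparison succeeds, as with pattern "aab" in a packet of all "a" bytes.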

Boyer-Moore: Example
Improves on brute force by comparing the last character first and by skipping over a larger number of characters on a mismatch.
How do we build the skip table?

Boyer-Moore: Skip Table
How far to skip when the last character does not match.
For example
─ Pattern: CAB
─ Last: A B C D E …
─ Skip: 1 * 2 3 3 …
Care is needed with repeated letters.
For example
─ Pattern: ABBA
─ Last: A B C D E …
─ Skip: * …
Skip[c] = distance of the last occurrence of c from the end of the pattern
(In the CAB example, * marks the pattern's own last character, and a character that does not occur in the pattern skips the full pattern length.)
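A minimal sketch of building the skip table from the definition above. The function name `build_skip_table` is an assumption; the table stores only characters that occur in the pattern, and a caller is assumed to fall back to the pattern length m for any other character. A distance of 0 corresponds to the * entry on the slide.

```python
def build_skip_table(pattern: str) -> dict[str, int]:
    """Map each pattern character to the distance of its last
    occurrence from the end of the pattern."""
    m = len(pattern)
    skip = {}
    for idx, ch in enumerate(pattern):
        # Later occurrences overwrite earlier ones, so the value kept
        # is the one for the LAST occurrence of ch.
        skip[ch] = m - 1 - idx
    return skip


cab = build_skip_table("CAB")     # {'C': 2, 'A': 1, 'B': 0}
abba = build_skip_table("ABBA")   # {'A': 0, 'B': 1}
```

For ABBA the repeated letters matter: the first A is at distance 3 from the end, but the last A is at distance 0, so A gets 0 and B gets 1.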

Boyer-Moore: Algorithm
Input: pattern of size m; packet of size n
  i := 0
  While i <= n-m do
    If pattern[m-1] = packet[i+m-1] then   // last character first
      For j := 0 to m-1 do
        Compare pattern[j] with packet[i+j]   // one by one, sequentially
      i := i+1
    Else
      i := i + skip[packet[i+m-1]]   // skip
Complexity:
─ Best case: O(n/m)
─ Worst case: still O(n*m)
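The pseudocode above can be sketched in Python as below. This follows the slide's simplified version, which uses only the skip table keyed on the last character of the current window; the full Boyer-Moore algorithm also has a good-suffix rule that is not covered here. The function name `skip_search` is an assumption.

```python
def skip_search(pattern: str, packet: str) -> list[int]:
    """Simplified Boyer-Moore style search using only a skip table."""
    m, n = len(pattern), len(packet)
    if m == 0 or m > n:
        return []
    # skip[c] = distance of the last occurrence of c from the pattern end
    skip = {ch: m - 1 - idx for idx, ch in enumerate(pattern)}
    matches = []
    i = 0
    while i <= n - m:
        last = packet[i + m - 1]
        if last == pattern[m - 1]:          # last character first
            if packet[i:i + m] == pattern:  # then compare the whole window
                matches.append(i)
            i += 1
        else:
            # Characters absent from the pattern skip the full length m.
            # The skip is at least 1 here, because only the pattern's
            # own last character has distance 0.
            i += skip.get(last, m)
    return matches


found = skip_search("CAB", "ABCABCAB")
```

The best case of O(n/m) occurs when the window's last character never appears in the pattern, so every step jumps a full m positions.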

Aho-Corasick
Failure pointer
─ Prevents restarting at the top of the trie when a failure occurs
─ A new attempt is made by shifting
How about multiple strings?
Example string: BABAR

Multiple String Trie Construction
Example: P = {he, she, his, hers}
[Figure: trie for P, labeling the initial state, the accepting states, and the state transition function]

Aho-Corasick: Searching
Example input stream: hxhers
─ The input stream is scanned only once
─ Complexity: linear time
[Figure: trie for P with the input stream and the matching string highlighted]
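A compact sketch of Aho-Corasick construction and search for the example set P = {he, she, his, hers}. The function names `build_automaton` and `ac_search` are assumptions, and each state stores its transitions in a dictionary rather than in a fixed array of pointers, which is a choice made here for brevity.

```python
from collections import deque


def build_automaton(patterns):
    """Build trie transitions, failure pointers, and output lists."""
    goto = [{}]   # goto[state][char] -> next state; state 0 is the root
    fail = [0]    # failure pointer of each state
    out = [[]]    # patterns recognized on reaching each state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first pass: depth-1 states keep failure pointer 0.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]   # inherit matches ending here
    return goto, fail, out


def ac_search(text, automaton):
    """Scan text once; return (start_offset, pattern) for every match."""
    goto, fail, out = automaton
    s, hits = 0, []
    for pos, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]                      # follow failure pointers
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((pos - len(p) + 1, p))
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
hits = ac_search("hxhers", automaton)
```

On the input "hxhers", the character x sends the automaton back to the root through the failure pointer, and the scan then reports "he" and "hers", both starting at offset 2.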

Aho-Corasick: Summary
Pros:
─ Computational complexity: worst case O(n)
─ Can scan once and output all matches
Cons:
─ Requires constructing a finite state machine
─ Failure pointers needed
─ Too big to fit on chip: each node has up to 256 pointers

Hashing
An efficient set membership query mechanism
─ Programming: trivial
─ Query complexity: O(n) best case (n: size of packet)
─ Query accuracy: false positives are possible
However, to handle collisions
─ Each hash entry contains a list of the IDs of all elements that share the hash value
─ Minimum storage requirement: O(n*w), where n is the number of elements and w is the minimum width of each element
Question: can we trade accuracy for storage using the hashing idea?

Bloom Filter
Data structure proposed by Burton Bloom
Randomized data structure
─ Strings are stored using multiple hash functions (programming)
─ A string's presence is checked based on multiple bits (querying)
Membership queries can result in false positives
A powerful tool for
─ Content networks
─ Route traceback
─ Network measurements
─ Intrusion detection

Bloom Filter Programming
Instead of one hash function, use k independent hash functions.
Instead of n*w bits of storage, only an m-bit vector is required.
Initially, all bits are cleared.
Programming sets one bit per hash function
─ A bit stays set if two elements hash to the same position

Bloom Filter Querying
Procedure:
─ String x is hashed by the k hash functions
─ Each hash function pinpoints one bit in the m-bit vector
─ The values of those k bits are ANDed
─ If the result is 0, x is not a member; otherwise x is reported as a member (possibly a false positive)
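Programming and querying can be sketched together as below. The class name `BloomFilter` and its methods are assumptions, and the k hash functions are simulated by salting SHA-256 with the function index, which is a convenience chosen here and not something the slides specify.

```python
import hashlib


class BloomFilter:
    """m-bit vector programmed and queried through k hash functions."""

    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.bits = [0] * m                 # initially all bits are cleared

    def _positions(self, item: bytes):
        # Simulate k hash functions by salting one hash (assumes k < 256).
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def program(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1              # stays set if already set

    def query(self, item: bytes) -> bool:
        # AND of the k selected bits: any 0 means "definitely not a member".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=4)
bf.program(b"perl.exe")
is_member = bf.query(b"perl.exe")
```

A query for a programmed string always returns True. A query for any other string usually returns False, but it can return True if all k of its bits happen to have been set by other strings.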

Bloom Filter: False Positive Rate
n: number of strings to be stored
k: number of hash functions
m: size of the bit array
The false positive probability
─ f = (1/2)^k, when k is set to its optimal value
─ Optimal number of hash functions: k = ln 2 * m/n ≈ 0.693 * m/n
The false positive rate decreases exponentially with the number of hash functions and the memory size.
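The formulas above follow from the standard approximate analysis of a Bloom filter, sketched here under the assumption that the k hash functions are independent and uniform over the m bit positions. The intermediate steps are not on the slide.

```latex
% Probability that a given bit is still 0 after programming n strings:
\Pr[\text{bit} = 0] = \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}

% A false positive needs all k probed bits to be 1:
f = \left(1 - e^{-kn/m}\right)^{k}

% Minimizing f over k gives
k = \frac{m}{n}\ln 2 \approx 0.693\,\frac{m}{n},
\qquad\text{at which } e^{-kn/m} = \tfrac{1}{2}

% and therefore
f = \left(\tfrac{1}{2}\right)^{k} \approx (0.6185)^{m/n}
```

For example, with m/n = 10 bits per stored string, the optimal k is about 7 and f is a little under 1%.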

Counting Bloom Filters
Member deletion
─ Deleting a member requires clearing all of its related bits
─ A bit once set in the bit vector cannot be cleared easily, because the same bit can be set by multiple members
Solution
─ Assume member deletion is a rare case
─ Counting Bloom filter
  ─ A counter is updated when an element is added or deleted
  ─ A bit in the m-bit vector is reset when its counter value reaches 0
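A minimal sketch of the counting idea, in which each position holds a counter and a position counts as "set" while its counter is above 0. The class name `CountingBloomFilter` and the salted SHA-256 hashing are assumptions, and `remove` is only meaningful for items that were previously added.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant with one counter per position."""

    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _positions(self, item: bytes):
        # Simulate k hash functions by salting one hash (assumes k < 256).
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: bytes) -> None:
        # Assumes item was added earlier; removing a non-member would
        # corrupt the counters of real members.
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1

    def query(self, item: bytes) -> bool:
        # A position counts as a set bit while its counter is above 0.
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(m=256, k=3)
cbf.add(b"he")
cbf.add(b"she")
cbf.remove(b"he")
still_there = cbf.query(b"she")
```

Deleting "he" only decrements its counters, so any position shared with "she" stays above 0 and "she" is still reported as a member.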

Approximate String Searching Using Bloom Filters

Approximate String Searching
John W. Lockwood et al., “Deep Packet Inspection Using Parallel Bloom Filters”

Summary
Algorithm    | Idea                  | Computation               | Storage            | Problem
Brute force  | Naive                 | O(m*n)                    |                    | Slow
Boyer-Moore  | Skip                  | O(m*n) worst, O(n/m) best | 0.1 MB (10K rules) | Shift table needed
Aho-Corasick | Trie                  | O(n) worst case           | 50 MB (1500 rules) | Storage demanding
Bloom filter | Approximate searching | O(n)                      | 0.1 MB (10K rules) | False positives

For Next Class
Read Comer: chapters 6 and 9
Final project (Option 1)
─ Project group finalized by 9/19/07: group leader, e-mail me your group members. Each group has no more than 3 members.
─ Project topic finalized by 9/28/07: group leader, e-mail me your topic.
Paper presentation + final exam (Option 2)
─ By 9/19/07: group leader, e-mail me your group members. Each group has no more than 2 members.
─ Based on one or two assigned papers (<20 min)