Space-Time Tradeoffs in Software-Based Deep Packet Inspection Anat Bremler-Barr Yotam Harchol ⋆ David Hay IDC Herzliya, Israel Hebrew University, Israel.


OWASP Israel 2011 (was also presented in IEEE HPSR 2011)
Parts of this work were supported by European Research Council (ERC) Starting Grant no
⋆ Supported by the Check Point Institute for Information Security

Outline
- Motivation
- Background
- New Compression Techniques
- Experimental Results
- Conclusions

Network Intrusion Detection Systems
Classify packets according to:
- Header fields: source IP & port, destination IP & port, protocol, etc.
- Packet payload (data)
[Figure: IP packets arriving from the Internet; payload inspection is labeled Deep Packet Inspection]

Deep Packet Inspection
The environment:
- (D)RAM: high-capacity, slow memory
- Cache memory: locality-based, low-capacity, fast memory

Our Contributions
- Literature assumption: try to fit the data structure in cache → efforts to compress the data structures
- Our paper: is it beneficial?
- In reality, even in a non-compressed implementation, most memory accesses go to the cache
- BUT: one can attack the non-compressed implementation by reducing its locality, pushing it out of the cache and making it much slower!
- How to mitigate this attack? Compress even further. Our new techniques: 60% less memory

Complexity DoS Attack
- Find a gap between the average case and the worst case
- Engineer input that exploits this gap
- Launch a Denial of Service attack on the system
[Figure: Internet, real-life traffic, throughput]


Aho-Corasick Algorithm [Aho, Corasick; 1975]
- Build a Deterministic Finite Automaton (DFA)
- Traverse the DFA, byte by byte
- Accepting state → pattern found
Example: {E, BE, BD, BCD, CDBCAB, BCAA}
[Figure: the DFA for the example pattern set (states s0 to s14), with the states visited on input BCDBCAB highlighted: s0, s2, s5, s6, s9, s10, s11, s12]
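The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `build_dfa` and `scan` are hypothetical, state numbers are assigned in insertion order and do not match the figure, and the alphabet is restricted to the five symbols of the example.

```python
from collections import deque

def build_dfa(patterns, alphabet):
    """Build a full Aho-Corasick DFA: one next-state entry per (state, symbol)."""
    # Step 1: build the trie of all patterns.
    trie = [{}]            # trie[s][symbol] -> child state
    accepting = [set()]    # patterns reported on entering state s
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in trie[s]:
                trie.append({})
                accepting.append(set())
                trie[s][ch] = len(trie) - 1
            s = trie[s][ch]
        accepting[s].add(p)

    # Step 2: breadth-first pass that fills in every missing transition
    # by copying it from the failure state (the longest proper suffix).
    fail = [0] * len(trie)
    delta = [dict() for _ in trie]
    for ch in alphabet:
        delta[0][ch] = trie[0].get(ch, 0)
    queue = deque(trie[0].values())
    while queue:
        s = queue.popleft()
        accepting[s] |= accepting[fail[s]]
        for ch in alphabet:
            if ch in trie[s]:
                child = trie[s][ch]
                fail[child] = delta[fail[s]][ch]
                delta[s][ch] = child
                queue.append(child)
            else:
                delta[s][ch] = delta[fail[s]][ch]
    return delta, accepting

def scan(text, delta, accepting):
    """Traverse the DFA symbol by symbol; report (start offset, pattern) pairs."""
    s, matches = 0, []
    for i, ch in enumerate(text):
        s = delta[s][ch]   # exactly one table lookup per input symbol
        for p in accepting[s]:
            matches.append((i - len(p) + 1, p))
    return matches

delta, accepting = build_dfa(["E", "BE", "BD", "BCD", "CDBCAB", "BCAA"], "ABCDE")
print(len(delta), sorted(scan("BCDBCAB", delta, accepting)))
```

On the example input, the scan reports BCD at offset 0 and CDBCAB at offset 1, and the automaton has 15 states, matching s0 to s14 in the figure.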

Aho-Corasick Algorithm [Aho, Corasick; 1975]
Naïve implementation: represent the transition function in a table of |Σ|×|S| entries
- Σ: alphabet
- S: set of states
Lookup time: one memory access per input symbol
Space: in reality, 70MB to gigabytes…
[Table: transition table with one column per symbol (A to E) and one row per state (S0, S1, S2, …)]
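To get a feel for these sizes, the table footprint can be computed directly from |Σ|×|S|. The sketch below uses the state counts reported in the experimental setup (77,182 for Snort, 745,303 for ClamAV). The byte alphabet of 256 symbols and the 4-byte entry width are assumptions made for illustration, and `naive_table_bytes` is a hypothetical helper name.

```python
def naive_table_bytes(num_states, alphabet_size=256, entry_bytes=4):
    """Size of a full |alphabet| x |states| transition table.

    alphabet_size=256 (one column per byte value) and entry_bytes=4
    are illustrative assumptions, not figures from the talk.
    """
    return num_states * alphabet_size * entry_bytes

for name, states in [("Snort", 77_182), ("ClamAV (half set)", 745_303)]:
    size = naive_table_bytes(states)
    print(f"{name}: {size} bytes ({size / 2**20:.0f} MiB)")
```

Under these assumptions the Snort table alone is about 79 million bytes, which is in line with the lower end of the range quoted on the slide.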

Potential Complexity DoS Attack
1. Exhaustive Traversal Adversarial Traffic
   - Traverses as many states of the automaton as possible
   - Bad locality: bad for the naïve implementation (will not utilize the cache)
[Figure: the example automaton]

Alternative Implementation [Aho, Corasick; 1975]
- A failure transition goes to the state that matches the longest suffix of the input so far
- Lookup time: at most two memory accesses per input symbol (via amortized analysis)
- Space: at most the number of symbols in the pattern set; depends on implementation
[Figure: the example automaton with forward transitions and failure transitions]
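A minimal sketch of the failure-transition variant follows, with a counter for the number of transitions taken so the amortized bound can be observed. The names `build_failure_automaton` and `scan_with_failures` are hypothetical, and each followed transition is counted as one memory access, which is a simplification.

```python
from collections import deque

def build_failure_automaton(patterns):
    """Trie with failure links: stores only forward edges plus one fail pointer per state."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)

    queue = deque(goto[0].values())      # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, child in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            # Longest proper suffix of child's label that is also a trie prefix.
            fail[child] = goto[f].get(ch, 0)
            out[child] |= out[fail[child]]
            queue.append(child)
    return goto, fail, out

def scan_with_failures(text, goto, fail, out):
    """Return (matches, transitions taken); transitions <= 2 * len(text)."""
    s, steps, matches = 0, 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]                  # failure transition
            steps += 1
        s = goto[s].get(ch, 0)           # forward transition (or stay at root)
        steps += 1
        for p in out[s]:
            matches.append((i - len(p) + 1, p))
    return matches, steps

g, f, o = build_failure_automaton(["E", "BE", "BD", "BCD", "CDBCAB", "BCAA"])
print(scan_with_failures("BCDBCAB", g, f, o))
```

Each forward transition increases the current depth by at most one and each failure transition decreases it by at least one, which is why the total number of transitions stays within twice the input length.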

Potential Complexity DoS Attack
1. Exhaustive Traversal Adversarial Traffic
   - Traverses as many states of the automaton as possible
   - Bad locality: bad for the naïve implementation (will not utilize the cache)
2. Failure-path Traversal Adversarial Traffic
   - Traverses as many failure transitions as possible
   - Bad for the failure-path based automaton (many memory accesses per input symbol)
[Figure: the example automaton]

Prior Work: Compress the State Representation
Three encodings of a state; each stores the forward pointers, a failure pointer (7 in the example) and a match flag (False in the example):
- Lookup Table: one forward entry per symbol (A to E)
- Bitmap Encoded: a bitmap of length |Σ| alongside the forward pointers; bits can be counted using the popcnt instruction
- Linear Encoded: stores the size (2 in the example) and the symbols alongside the forward pointers
[Figure: the example automaton and one state shown in each of the three encodings]
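The bitmap encoding can be illustrated with a short sketch: one bit per symbol marks whether a forward transition exists, and the number of set bits below that position gives the index into a packed pointer array. The names `encode_state` and `lookup` and the state numbers in the usage line are hypothetical; Python's `bin(...).count("1")` stands in for the hardware `popcnt` instruction.

```python
ALPHABET = "ABCDE"   # the five-symbol alphabet of the running example

def encode_state(transitions):
    """Pack a {symbol: next_state} map into (bitmap, packed_targets)."""
    bitmap, targets = 0, []
    for i, sym in enumerate(ALPHABET):
        if sym in transitions:
            bitmap |= 1 << i             # bit i set: symbol i has a forward edge
            targets.append(transitions[sym])
    return bitmap, targets

def lookup(bitmap, targets, sym):
    """Return the forward target for sym, or None if the state has none."""
    i = ALPHABET.index(sym)
    if not (bitmap >> i) & 1:
        return None                      # caller would follow the failure pointer
    # Rank query: count set bits below position i (popcnt in hardware).
    rank = bin(bitmap & ((1 << i) - 1)).count("1")
    return targets[rank]

# Hypothetical state with forward edges on C, D and E.
bitmap, targets = encode_state({"C": 3, "D": 5, "E": 4})
print(bin(bitmap), targets, lookup(bitmap, targets, "D"))
```

The state then costs |Σ| bits plus one pointer per existing edge, instead of |Σ| pointers.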


Path Compression
- One-way branches can be represented using a single state, similarly to PATRICIA tries
- Problem: incoming failure transitions
- Solution: compress only states with no incoming failure transitions
[Figure: the example automaton before and after path compression; the one-way branch B, C, A, B is merged into a single state s9']
[Chart: Tuck et al. (2004): 100%; our path compression: 75%]

Leaves Compression
- By definition, leaves have no forward transitions
- Their single purpose is to indicate a match
  - We can push this indication up by adding a bit to each pointer
  - Then leaves can be eliminated from the automaton by copying their failure transition up
- 3% more space reduction
- Reduces the number of transitions taken
[Figure: the example automaton before and after leaves compression; pointers that indicate a match are marked with *]

Pointer Compression
- In the Snort IDS pattern set, 79% of the fail pointers point to states at depths 0, 1, 2
- Add two bits to encode the depth of the pointer:
  00: Depth 0
  01: Depth 1
  10: Depth 2
  11: Depth 3 and deeper

Depth     Pointers
0 (s0)    13%
1         31%
2         35%
≥ 3       21%

- Depth ≤ 2: the 16-bit pointer is replaced by 2 bits
- Depth > 2: the 16-bit pointer is replaced by 2 bits (11) followed by the 16-bit pointer
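The saving can be worked out from the depth distribution above. The sketch below assumes one reading of the slide: every fail pointer carries a 2-bit depth tag, and only pointers to depth 3 or deeper also keep the 16-bit pointer. `expected_pointer_bits` is a hypothetical helper name.

```python
# Share of fail pointers per target depth, taken from the table above.
# Key 3 stands for "depth 3 and deeper".
DEPTH_SHARE = {0: 0.13, 1: 0.31, 2: 0.35, 3: 0.21}

def expected_pointer_bits(depth_share, tag_bits=2, pointer_bits=16):
    """Average encoded size of a fail pointer, in bits (assumed encoding)."""
    total = 0.0
    for depth, share in depth_share.items():
        # Shallow targets need only the tag; deep targets need tag + pointer.
        bits = tag_bits + (pointer_bits if depth >= 3 else 0)
        total += share * bits
    return total

avg = expected_pointer_bits(DEPTH_SHARE)
print(f"{avg:.2f} bits on average, versus 16 bits uncompressed")
```

Under this reading the average is 2 + 0.21 × 16 = 5.36 bits per fail pointer.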

Pointer Compression
Determine the next state from the pointer depth:
- 0: Go to root
- 1: Use a lookup table using the last symbol
- 2: Use a hash table using the last two symbols
- ≥ 3: Use the stored pointer

Depth 1 lookup table:
Symbol    State
A         -
B         s2
C         s7
D         -
E         s1

Depth 2 hash table: last 2 symbols → next state

[Chart: Tuck et al. (2004): 100%; our path compression: 75%; pointer compression: 41%]
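The depth-based resolution works because a state at depth d is fully identified by the last d input symbols. A minimal sketch, assuming the depth-1 table shown on the slide: the depth-2 entries and the name `resolve_fail_target` are illustrative assumptions, and a Python dict stands in for the hash table.

```python
ROOT = "s0"
DEPTH1_TABLE = {"B": "s2", "C": "s7", "E": "s1"}   # from the slide's lookup table
DEPTH2_TABLE = {"BC": "s5", "CD": "s8"}            # illustrative entries only

def resolve_fail_target(depth_code, recent, stored_pointer=None):
    """Resolve a compressed fail pointer.

    depth_code: the 2-bit tag (0, 1, 2, or 3 for "depth 3 and deeper").
    recent: the most recently read input symbols, as a string.
    stored_pointer: explicit target, present only when depth_code == 3.
    """
    if depth_code == 0:
        return ROOT                        # depth 0: the only candidate is the root
    if depth_code == 1:
        return DEPTH1_TABLE[recent[-1]]    # depth 1: keyed by the last symbol
    if depth_code == 2:
        return DEPTH2_TABLE[recent[-2:]]   # depth 2: keyed by the last two symbols
    return stored_pointer                  # deeper: fall back to the stored pointer

print(resolve_fail_target(1, "CDB"), resolve_fail_target(2, "ABC"))
```

Only pointers tagged 3 need an explicit target in memory; the other three cases are answered from two small shared tables.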

Function Inlining
- The compressed implementation makes more memory accesses
- The initial implementation was based on a few functions calling each other
- Avoiding function calls (by inlining their code) reduced the total number of memory reads by 36%


Experimental Setup

Test systems:
- System 1: MacBook Pro, Core 2 Duo 2.53GHz dual core; L2 cache: 3MB (shared); L3 cache: -
- System 2: iMac, Core i7 2.93GHz quad core; L2 cache: 256KB (per core); L3 cache: 8MB (shared)
- L1 cache: 16KB (data, per core)

Pattern sets:
                                  Snort     ClamAV*
Patterns                          31,094    16,710
States in naïve implementation    77,182    745,303
* We used only half of the ClamAV signatures for our tests

Real-life traffic logs taken from MIT DARPA

Space Requirement
[Chart: memory footprint [MB]]

Memory Accesses per Input Symbol
[Chart: memory accesses per input symbol]

L1 Data Cache Miss Rate
Intel Core 2 Duo (2 cores), 16KB L1 data cache, 3MB L2 cache
[Chart: L1 data cache miss rate]

L2 Cache Miss Rate
Intel Core 2 Duo (2 cores), 16KB L1 data cache, 3MB L2 cache
- Real-life traffic: 0.7% L2 cache miss rate
- Adversarial traffic: 23% L2 cache miss rate
- Maximal L2 miss rate: 0.06%
[Chart: L2 cache miss rate]

Space vs. Time
[Chart: our implementation compared with the naïve implementation]


Conclusions
It is crucial to model the cache in software-based Deep Packet Inspection:
- The naïve Aho-Corasick implementation has a huge memory footprint, but works well on real-life traffic due to locality of reference
- The naïve implementation can be easily attacked, making it 7 times slower, even though it has a constant number of memory accesses
We also show new compression techniques:
- 60% less memory than the best prior-art compression
- Stable throughput, better performance under attacks

Questions? Thank you!