SWM: Simplified Wu-Manber for GPU- based Deep Packet Inspection Author: Lucas Vespa, Ning Weng Publisher: The 2012 International Conference on Security.

Slides:



Advertisements
Similar presentations
1 Overview Assignment 4: hints Memory management Assignment 3: solution.
Advertisements

Authors: Wei Lin, Bin Liu Publisher: ICPADS, 2008 (IEEE International Conference on Parallel and Distributed Systems) Presenter: Chia-Yi, Chu Date: 2014/03/05.
Practical Caches COMP25212 cache 3. Learning Objectives To understand: –Additional Control Bits in Cache Lines –Cache Line Size Tradeoffs –Separate I&D.
Scalable Multi-Cache Simulation Using GPUs Michael Moeng Sangyeun Cho Rami Melhem University of Pittsburgh.
Segmentation and Paging Considerations
Exploiting Graphics Processors for High- performance IP Lookup in Software Routers Author: Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu.
Authors: Raphael Polig, Kubilay Atasu, and Christoph Hagleitner Publisher: FPL, 2013 Presenter: Chia-Yi, Chu Date: 2013/10/30 1.
Hermes: An Integrated CPU/GPU Microarchitecture for IPRouting Author: Yuhao Zhu, Yangdong Deng, Yubei Chen Publisher: DAC'11, June 5-10, 2011, San Diego,
Modern Information Retrieval
Deterministic Memory- Efficient String Matching Algorithms for Intrusion Detection Nathan Tuck, Timothy Sherwood, Brad Calder, George Varghese Department.
Efficient IP-Address Lookup with a Shared Forwarding Table for Multiple Virtual Routers Author: Jing Fu, Jennifer Rexford Publisher: ACM CoNEXT 2008 Presenter:
Improved TCAM-based Pre-Filtering for Network Intrusion Detection Systems Department of Computer Science and Information Engineering National Cheng Kung.
A Multipattern Matching Algorithm Using Sampling and Bit Index Author: Jinhui Chen, Zhongfu Ye, Min Tang Publisher: Computer Systems and Applications,
Compact State Machines for High Performance Pattern Matching Department of Computer Science and Information Engineering National Cheng Kung University,
1 DRES:Dynamic Range Encoding Scheme for TCAM Coprocessors Authors: Hao Che, Zhijun Wang, Kai Zheng and Bin Liu Publisher: IEEE Transactions on Computers,
1 Performing packet content inspection by longest prefix matching technology Authors: Nen-Fu Huang, Yen-Ming Chu, Yen-Min Wu and Chia- Wen Ho Publisher:
1 Scalable Multigigabit Pattern Matching for Packet Inspection Authors: Ioannis Sourdis,Dionosios N. Pnevmatikatos and Stamatis Vassiliadis Publisher:
Hashing General idea: Get a large array
1 Scalable Pattern-Matching via Dynamic Differentiated Distributed Detection (D 4 ) Author: Kai Zheng, Hongbin Lu Publisher: GLOBECOM 2008 Presenter: Han-Chen.
Gnort: High Performance Intrusion Detection Using Graphics Processors Giorgos Vasiliadis, Spiros Antonatos, Michalis Polychronakis, Evangelos Markatos,
Gregex: GPU based High Speed Regular Expression Matching Engine Date:101/1/11 Publisher:2011 Fifth International Conference on Innovative Mobile and Internet.
A Fast Algorithm for Multi-Pattern Searching Sun Wu, Udi Manber May 1994.
Data Cache Prefetching using a Global History Buffer Presented by: Chuck (Chengyan) Zhao Mar 30, 2004 Written by: - Kyle Nesbit - James Smith Department.
SHOCK: A Worst-Case Ensured Sub-linear Time Pattern Matching Algorithm for Inline Anti-Virus Scanning Author: Nen-Fu Huang, Wen-Yen Tsai Publisher: IEEE.
 Author: Tsern-Huei Lee  Publisher: 2009 IEEE Transation on Computers  Presenter: Yuen-Shuo Li  Date: 2013/09/18 1.
High-Performance Packet Classification on GPU Author: Shijie Zhou, Shreyas G. Singapura and Viktor K. Prasanna Publisher: HPEC 2014 Presenter: Gang Chi.
A Novel Cache Architecture with Enhanced Performance and Security Zhenghong Wang and Ruby B. Lee.
Fast forwarding table lookup exploiting GPU memory architecture Author : Youngjun Lee,Minseon Jeong,Sanghwan Lee,Eun-Jin Im Publisher : Information and.
MIDeA :A Multi-Parallel Instrusion Detection Architecture Author: Giorgos Vasiliadis, Michalis Polychronakis,Sotiris Ioannidis Publisher: CCS’11, October.
An Improved Algorithm to Accelerate Regular Expression Evaluation Author: Michela Becchi, Patrick Crowley Publisher: 3rd ACM/IEEE Symposium on Architecture.
Author: Haoyu Song, Fang Hao, Murali Kodialam, T.V. Lakshman Publisher: IEEE INFOCOM 2009 Presenter: Chin-Chung Pan Date: 2009/12/09.
Shift-based Pattern Matching for Compressed Web Traffic Author: Anat Bremler-Barr, Yaron Koral,Victor Zigdon Publisher: IEEE HPSR,2011 Presenter: Kai-Yang,
Silberschatz, Galvin and Gagne  2002 Modified for CSCI 399, Royden, Operating System Concepts Operating Systems Lecture 34 Paging Implementation.
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Authors: Fang Yu, Zhifeng Chen, Yanlei Diao, T. V. Lakshman, Randy H.
Author: Sriram Ramabhadran, George Varghese Publisher: SIGMETRICS’03 Presenter: Yun-Yan Chang Date: 2010/12/29 1.
Towards a Billion Routing Lookups per Second in Software  Author: Marko Zec, Luigi, Rizzo Miljenko Mikuc  Publisher: SIGCOMM Computer Communication Review,
GPEP : Graphics Processing Enhanced Pattern- Matching for High-Performance Deep Packet Inspection Author: Lucas John Vespa, Ning Weng Publisher: 2011 IEEE.
8.1 Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Paging Physical address space of a process can be noncontiguous Avoids.
Parallelization and Characterization of Pattern Matching using GPUs Author: Giorgos Vasiliadis 、 Michalis Polychronakis 、 Sotiris Ioannidis Publisher:
EQC16: An Optimized Packet Classification Algorithm For Large Rule-Sets Author: Uday Trivedi, Mohan Lal Jangir Publisher: 2014 International Conference.
StriD 2 FA: Scalable Regular Expression Matching for Deep Packet Inspection Author: Xiaofei Wang, Junchen Jiang, Yi Tang, Bin Liu, and Xiaojun Wang Publisher:
IP Routing Processing with Graphic Processors Author: Shuai Mu, Xinya Zhang, Nairen Zhang, Jiaxin Lu, Yangdong Steve Deng, Shu Zhang Publisher: IEEE Conference.
Multi-Core Development Kyle Anderson. Overview History Pollack’s Law Moore’s Law CPU GPU OpenCL CUDA Parallelism.
INFAnt: NFA Pattern Matching on GPGPU Devices Author: Niccolo’ Cascarano, Pierluigi Rolando, Fulvio Risso, Riccardo Sisto Publisher: ACM SIGCOMM 2010 Presenter:
Author : Sarang Dharmapurikar, John Lockwood Publisher : IEEE Journal on Selected Areas in Communications, 2006 Presenter : Jo-Ning Yu Date : 2010/12/29.
Author : Weirong Jiang, Yi-Hua E. Yang, and Viktor K. Prasanna Publisher : IPDPS 2010 Presenter : Jo-Ning Yu Date : 2012/04/11.
A Fast Regular Expression Matching Engine for NIDS Applying Prediction Scheme Author: Lei Jiang, Qiong Dai, Qiu Tang, Jianlong Tan and Binxing Fang Publisher:
Sunpyo Hong, Hyesoon Kim
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Publisher : ANCS’ 06 Author : Fang Yu, Zhifeng Chen, Yanlei Diao, T.V.
GFlow: Towards GPU-based High- Performance Table Matching in OpenFlow Switches Author : Kun Qiu, Zhe Chen, Yang Chen, Jin Zhao, Xin Wang Publisher : Information.
COMP SYSTEM ARCHITECTURE PRACTICAL CACHES Sergio Davies Feb/Mar 2014COMP25212 – Lecture 3.
An Improved Multi-Pattern Matching Algorithm for Large-Scale Pattern Sets Author : Zhan Peng, Yu-Ping Wang and Jin-Feng Xue Conference: IEEE 10th International.
Heterogeneous Computing using openCL lecture 4 F21DP Distributed and Parallel Technology Sven-Bodo Scholz.
My Coordinates Office EM G.27 contact time:
Gnort: High Performance Network Intrusion Detection Using Graphics Processors Date:101/2/15 Publisher:ICS Author:Giorgos Vasiliadis, Spiros Antonatos,
Author : Tzi-Cker Chiueh, Prashant Pradhan Publisher : High-Performance Computer Architecture, Presenter : Jo-Ning Yu Date : 2010/11/03.
Exploiting Graphics Processors for High-performance IP Lookup in Software Routers Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu IEEE INFOCOM.
NFV Compute Acceleration APIs and Evaluation
Oracle SQL*Loader
Challenging Cloning Related Problems with GPU-Based Algorithms
Accelerating MapReduce on a Coupled CPU-GPU Architecture
Faster File matching using GPGPU’s Deephan Mohan Professor: Dr
Kiran Subramanyam Password Cracking 1.
Lecture 3: Main Memory.
A New String Matching Algorithm Based on Logical Indexing
Author: Domenico Ficara ,Gianni Antichi ,Nicola Bonelli ,
Knuth-Morris-Pratt Algorithm.
CS703 - Advanced Operating Systems
6- General Purpose GPU Programming
Virtual Memory 1 1.
Presentation transcript:

SWM: Simplified Wu-Manber for GPU- based Deep Packet Inspection Author: Lucas Vespa, Ning Weng Publisher: The 2012 International Conference on Security and Management Presenter: Ye-Zhi Chen Date: 2012/02/20

Introduction In this work we present SWM, a simplified, multiple stride, Wu-Manber like algorithm for GPU-based deep packet inspection. SWM uses a novel method to group patterns such that the shift tables are simplified and therefore appropriate for SIMD operation

Wu-manber Wu-Manber constructs a shift table which stores a shift value (in bytes) for all B- byte substrings in the first m bytes of each pattern Ex: HELLO, B=2, m=5 Shift table operation begins by examining B-bytes of a packet starting at offset m-B+1 If shift value ≠ 0, shift p bytes and examine the B-bytes at the location If shift value = 0, compare to any patterns that share this B-byte substring as a suffix SubstringShift value (p) HE3 EL2 LL1 LO0 others4

Wu-Manber Drawback of Wu-Manber: It may be many patterns that need sequential comparison Methods to improve by Wu-Manber: 1.Using a larger value B helps reduce the number of patterns that share suffixes. 2.The shift tables for larger values of B utilize a hash table to reduce memory.

SWM Goals Avoid using larger values for B Use a direct indexed lookup for each B-bytes Avoid sequential pattern comparison to the packet text Create shift tables with full patterns rather than the first m bytes of each pattern

SWM If all patterns have a unique suffix, then any time that a stride of zero occurs, the current B-bytes are known to belong to only one specific pattern SWM pattern Grouping : Find the minimal number of groups such that no two patterns in a group share a suffix. (minimal graph coloring problem) SWM Shifting Table Construction: 1. Finding the number of characters m, in the shortest pattern. 2. Find the shift value for any B-byte substring we use the distance in bytes v from the end of the pattern that the substring occurs. The shift value for any substring is calculated to be MIN( v, m - B + 1)

SWM m=10, B=2

SWM Group Balancing In order to equalize the processing time for each pattern set and minimize the overall latency for processing a packet.

Architecture CPU : 1. Create the SWM shift tables and transferring the tables to the local memory of the GPU compute units. 2. Maintain a current packet buffer which is mapped to the global memory of the GPU 3. reads results from the matching buffer on the GPU and reports any potential attack patterns

Architecture GPU (ATI graphic card with openCL) : 1. Each stream core in a GPU can processing one group of patterns. 2. The local data store (LDS) of each compute unit, and the private memory of each stream core, contain the shift tables necessary for SWM kernel operation. each stream core processes a separate packet and any matches are reported back to the CPU

Experiment