
1 Gnort: High Performance Intrusion Detection Using Graphics Processors
Giorgos Vasiliadis, Spiros Antonatos, Michalis Polychronakis, Evangelos Markatos, Sotiris Ioannidis
Institute of Computer Science, Foundation for Research and Technology Hellas

2 General Idea
How to speed up the processing throughput of intrusion detection systems by offloading the pattern matching operations to the GPU.
Giorgos Vasiliadis, ICS-FORTH

3 Introduction
The problem
– Network Intrusion Detection Systems (NIDS) rely on string matching to detect and prevent well-known attacks
– String matching accounts for up to 75% of the total CPU processing
String matching algorithms
– Aho-Corasick
Specialized hardware devices (NPs, FPGAs, ASICs)
– Complex to modify and program
– Poor flexibility
Graphics cards
– Easy to program
– Powerful and ubiquitous
– Researchers have begun exploring ways to tap their power for non-graphics applications

4 Why use the GPU?
The GPU is specialized for compute-intensive, highly parallel computation

5 NVIDIA GeForce SIMD Architecture
Many multiprocessors
Each multiprocessor contains many stream processors
Memory model
– Shared on-chip memory: 1 cycle
– Constant memory: 400-600 cycles; 1 cycle if cached
– Texture memory: 400-600 cycles; 1 cycle if cached
– Global device memory: 400-600 cycles
The GPU can be used as a general-purpose processor, capable of executing many threads in parallel

6 The Aho-Corasick Algorithm
Used in most modern NIDSes
→ Scans for multiple patterns simultaneously
Preprocesses all patterns to build a state machine
The state machine is used to scan for multiple patterns simultaneously in linear time
→ Complexity is independent of the number of patterns
Example: P = {he, she, his, hers}
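The example pattern set from this slide can be worked through in a few lines of Python. This is a minimal sketch of the textbook Aho-Corasick construction (goto, failure and output tables), not the Snort or Gnort code; the function names `build_automaton` and `search` are chosen here for illustration.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, failure and output tables for a set of patterns."""
    goto = [{}]      # goto[state][char] -> next state (trie edges)
    fail = [0]       # failure link of each state
    out = [set()]    # patterns recognised when a state is entered
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)
    # Breadth-first pass: depth-1 states keep failure link 0,
    # deeper states derive theirs from their parent's link.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out

def search(text, automaton):
    """Return (start_index, pattern) for every match, in one pass."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches

automaton = build_automaton(["he", "she", "his", "hers"])
print(sorted(search("ushers", automaton)))
```

Each input character is consumed once regardless of how many patterns were loaded, which is the property the slide refers to.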

7 Mapping Aho-Corasick on GPU
How to represent the state machine?
Snort represents each state as an array of pointers
– It is difficult to map them onto GPU memory
→ Transform to a 2D array
– Can easily be bound to texture memory
Texture fetches are cached
Aho-Corasick exhibits strong locality of reference
Random-access memory reads
→ Using texture memory speeds up GPU execution by about 19%
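To illustrate the pointer-free layout, the sketch below builds the state machine directly as a dense 2D table with one row per state and one column per byte value, so that scanning is a single table lookup per input byte. It is an illustrative Python model only: the names `build_table` and `count_matches` are assumptions, and binding the table to texture memory is not shown.

```python
from collections import deque

ALPHABET = 256  # one column per possible byte value

def build_table(patterns):
    """Return (table, accepting); table[state][byte] is the next state."""
    table = [[-1] * ALPHABET]   # -1 marks "no trie edge yet"
    accepting = [False]
    for pat in patterns:
        state = 0
        for b in pat:
            if table[state][b] == -1:
                table.append([-1] * ALPHABET)
                accepting.append(False)
                table[state][b] = len(table) - 1
            state = table[state][b]
        accepting[state] = True
    fail = [0] * len(table)
    queue = deque()
    for b in range(ALPHABET):
        nxt = table[0][b]
        if nxt == -1:
            table[0][b] = 0          # unmatched bytes stay in the root
        else:
            queue.append(nxt)
    # Breadth-first pass replaces every missing edge with the edge the
    # failure state would take, so no failure links remain at scan time.
    while queue:
        state = queue.popleft()
        accepting[state] = accepting[state] or accepting[fail[state]]
        for b in range(ALPHABET):
            nxt = table[state][b]
            if nxt == -1:
                table[state][b] = table[fail[state]][b]
            else:
                fail[nxt] = table[fail[state]][b]
                queue.append(nxt)
    return table, accepting

def count_matches(payload, table, accepting):
    """Count payload positions at which at least one pattern ends."""
    state = 0
    hits = 0
    for b in payload:
        state = table[state][b]      # one lookup per input byte
        hits += accepting[state]
    return hits

table, accepting = build_table([b"he", b"she", b"his", b"hers"])
print(len(table), count_matches(b"ushers", table, accepting))
```

Because every row has the same width, the table can be stored row-major in one contiguous buffer, which is what makes it straightforward to copy to a device.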

8 Parallelizing Packet Searching (1/2)
Assigning a single packet to each multiprocessor
→ Each packet is copied to the shared memory of the multiprocessor
→ Stream processors search different parts of the packet concurrently
→ Overlapping computation
Matching patterns may span consecutive chunks of the packet
→ Same amount of work per stream processor
Stream processors will be synchronized
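The overlap requirement can be sketched as follows. Each chunk is extended by the length of the longest pattern minus one, so a match that starts in one chunk and ends in the next is still seen by the chunk where it starts. This is a sequential Python model under stated assumptions: the helper names are hypothetical, and `bytes.find` stands in for the per-thread automaton scan.

```python
def split_with_overlap(payload, n_chunks, max_pattern_len):
    """Return (start, chunk) pairs; chunks overlap by max_pattern_len - 1."""
    chunk_len = max(1, -(-len(payload) // n_chunks))  # ceiling division
    overlap = max_pattern_len - 1
    chunks = []
    for start in range(0, len(payload), chunk_len):
        end = min(len(payload), start + chunk_len + overlap)
        chunks.append((start, payload[start:end]))
    return chunks

def find_all(payload, patterns, n_chunks):
    """Search every chunk independently, as parallel threads would."""
    max_len = max(len(p) for p in patterns)
    found = set()  # a set drops matches reported twice in overlap regions
    for start, chunk in split_with_overlap(payload, n_chunks, max_len):
        for pat in patterns:
            pos = chunk.find(pat)
            while pos != -1:
                found.add((start + pos, pat))
                pos = chunk.find(pat, pos + 1)
    return sorted(found)

patterns = [b"he", b"she", b"his", b"hers"]
print(find_all(b"xxhersxx", patterns, n_chunks=4))
```

With four chunks of two bytes and no overlap, `hers` would straddle two chunks and be missed; the extra three bytes per chunk are the overlapping computation the slide mentions.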

9 Parallelizing Packet Searching (2/2)
Assigning a single packet to each stream processor
→ Each packet is processed by a different stream processor
→ No overlapping computation
→ Different amount of work per stream processor
Stream processors of the same multiprocessor will have to wait until all have finished
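The cost of the waiting can be estimated with a toy model, assuming that scan time is proportional to packet length and that a group of stream processors finishes only when its longest packet is done. The function name and the group size used below are illustrative assumptions, not measurements from the talk.

```python
def group_utilization(packet_lengths, group_size):
    """Fraction of processor time spent on useful work in a toy model."""
    if not packet_lengths:
        raise ValueError("need at least one packet")
    busy = 0    # byte-steps spent scanning
    total = 0   # byte-steps the groups are occupied, including idle time
    for i in range(0, len(packet_lengths), group_size):
        group = packet_lengths[i:i + group_size]
        busy += sum(group)
        # Every processor in the group is held until the longest packet ends.
        total += max(group) * group_size
    return busy / total

print(group_utilization([800, 800, 800, 800], group_size=4))
print(group_utilization([1500, 100, 100, 100], group_size=4))
```

Equal-length packets keep every processor busy, while one long packet among short ones leaves most of the group idle.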

10 Software Mapping
Packets are transferred to the GPU in batches
– Performs much better than transferring each packet separately
→ Packets are stored in a buffer that is copied to the GPU when it gets full
Use page-locked memory to store the packets
– Higher transfer throughput from host to device
– Copies are performed using DMA, without occupying the CPU
CPU and GPU execution can overlap
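The batching scheme can be sketched as a buffer that accumulates payloads and hands the whole batch over in one call once it is full. This is a host-side Python model only: `PacketBatch` and its `flush` callback are hypothetical names, the callback stands in for the host-to-device copy, and page-locked memory and DMA are not modeled.

```python
class PacketBatch:
    """Accumulate packet payloads and flush them as one contiguous buffer."""

    def __init__(self, capacity, flush):
        self.capacity = capacity   # batch size in bytes
        self.flush_fn = flush      # stand-in for the copy to the GPU
        self.buf = bytearray()
        self.offsets = []          # (offset, length) of each packet

    def add(self, payload):
        # Flush first if this payload would overflow a non-empty buffer.
        if self.buf and len(self.buf) + len(payload) > self.capacity:
            self.flush()
        self.offsets.append((len(self.buf), len(payload)))
        self.buf += payload

    def flush(self):
        if self.offsets:
            self.flush_fn(bytes(self.buf), list(self.offsets))
            self.buf.clear()
            self.offsets.clear()

sent = []
batch = PacketBatch(capacity=10, flush=lambda data, offs: sent.append((data, offs)))
for payload in (b"aaaa", b"bbbb", b"cccc"):
    batch.add(payload)
batch.flush()  # push out the partially filled last batch
print(sent)
```

Keeping the offset list alongside the buffer lets the receiving side locate each packet inside the batch without any per-packet transfer.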

11 Evaluation (1/2)
Scalability as a function of the number of patterns
We ran Snort using randomly generated patterns
All patterns are matched against every packet
The payload trace contained 800-byte UDP packets with random payloads
→ Throughput remains constant as the number of patterns increases
→ 2.4x faster than the CPU

12 Evaluation (2/2)
Throughput as a function of the packet size
We ran Snort using 1000 random patterns
All patterns are matched against every packet
→ 2.3 Gbit/s for full packets
→ 3.2x faster than the CPU
→ The two GPU implementations do not differ significantly in performance

13 Evaluation with real input and rules
Experimental setup
– Two PCs connected via a 1 Gbit/s Ethernet switch
To directly compare with prior work [Jacob et al], we re-implemented the Knuth-Morris-Pratt (KMP) and Boyer-Moore (BM) algorithms on the GPU.

14 Evaluation with real input and rules
Snort loaded about 8000 patterns. Preprocessors and PCRE were disabled
→ Original Snort (AC) cannot process all packets at rates higher than 300 Mbit/s
→ GPU-assisted Snort (AC1, AC2) begins to lose packets at 600 Mbit/s
→ The sustained rate doubles (200% of the original)
→ The KMP and BM algorithms used in [Jacob et al] perform worse in all cases

15 Conclusion
Graphics cards can be used effectively to speed up Network Intrusion Detection Systems.
– Low cost
– Easy to program
Future work includes
– Transferring the packets directly from the NIC to the GPU
– Utilizing multiple GPUs on multi-slot motherboards

16 Thank you
Any questions?
gvasil@ics.forth.gr

