Deep Packet Inspection: Where are We? CCW08 Michela Becchi
Michela Becchi – 2/27/ /23/2008
Assumption
- The packet payload is not encrypted and can therefore be inspected
Background: Rule-set complexity
- Practical rule-sets:
  - Snort, as of November 2007
    - 8536 rules, 5549 Perl Compatible Regular Expressions
      - 99% with character ranges ([c1-ck], \s, \w, …)
      - 16.3% with dot-star terms (.*, [^c1..ck]*)
      - 44% with counting constraints (.{n,m}, [^c1..ck]{n,m})
- Rule-set proposals:
  - [R. Sommer and V. Paxson, CCS 2003]
  - [J. Newsome et al, Security and Privacy Symposium 2005]
  - [Y. Xie et al, SIGCOMM 2008]
Deep packet inspection = regular expression matching at line rate (finite automata based techniques)
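The three feature classes counted on this slide can be detected roughly from the pattern text. The following is a minimal sketch, not the method used to produce the Snort statistics: the detector regexes, the function names `classify` and `feature_share`, and the `SAMPLE` patterns are all assumptions made for illustration, and the detectors are heuristics that do not parse the pattern (an escaped `\[` would be miscounted, for instance).

```python
import re

# Heuristic detectors, one per feature class named on the slide.
# They inspect the pattern text only and do not parse it.
CHAR_RANGE = re.compile(r"\[[^\]]+\]|\\[swdSWD]")            # [c1-ck], \s, \w, ...
DOT_STAR = re.compile(r"\.\*|\[\^[^\]]+\]\*")                 # .* or [^c1..ck]*
COUNTING = re.compile(r"(?:\.|\[[^\]]+\])\{\d+(?:,\d*)?\}")   # .{n,m} or [..]{n,m}


def classify(pattern):
    """Return which of the three feature classes appear in `pattern`."""
    return {
        "char_range": bool(CHAR_RANGE.search(pattern)),
        "dot_star": bool(DOT_STAR.search(pattern)),
        "counting": bool(COUNTING.search(pattern)),
    }


def feature_share(patterns):
    """Percentage of patterns that exhibit each feature class."""
    totals = {"char_range": 0, "dot_star": 0, "counting": 0}
    for p in patterns:
        for name, present in classify(p).items():
            totals[name] += present
    return {name: 100.0 * n / len(patterns) for name, n in totals.items()}


# Hypothetical patterns for illustration; they are not taken from Snort.
SAMPLE = [r"GET\s+/admin", r"user=.*passwd=", r"cmd=[^\n]{100,}", r"abc"]
```

Calling `feature_share(SAMPLE)` gives the share of the sample patterns in each class. Running the same function over a real rule-set would need a proper PCRE parser to be trustworthy.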
Target Architectures
Regex-matching engine, ordered by available parallelism:
- FPGA logic
- Memory-centric architectures
  - FPGA / ASIC + memory
  - General purpose processors
  - Network processors
Challenges
- FPGA logic (NFA)
  - Logic cell utilization
  - Clock frequency
- Memory-centric architectures: FPGA / ASIC + memory, general purpose processors, network processors (DFA)
  - Memory space
  - Memory bandwidth
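The memory-space challenge for DFAs can be illustrated with a small experiment. The sketch below applies subset construction to an NFA for the pattern `.*a.{n}` over a two-symbol alphabet and counts the resulting DFA states. The pattern, the alphabet, and the function name `count_dfa_states` are assumptions chosen for illustration; they do not come from the talk.

```python
def count_dfa_states(n, alphabet="ab"):
    """Count the DFA states produced by subset construction for `.*a.{n}`.

    NFA layout (n + 2 states): state 0 loops on every symbol and also moves
    to state 1 on 'a'; states 1..n advance on any symbol; state n + 1 accepts.
    """
    start = frozenset([0])
    seen = {start}
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for symbol in alphabet:
            target = {0}  # state 0 stays active because of the leading .*
            for s in current:
                if s == 0 and symbol == "a":
                    target.add(1)
                elif 1 <= s <= n:
                    target.add(s + 1)
            target = frozenset(target)
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    return len(seen)
```

For this pattern the NFA grows linearly with n, while the DFA has to remember which of the last n + 1 symbols were 'a', so `count_dfa_states(n)` returns 2^(n+1).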
Directions for DFA-based solutions
Target: memory-centric architectures (FPGA / ASIC + memory, general purpose processors, network processors)
- DFA compression + encoding
  - Default transitions (D2FA)
  - Alphabet reduction
- State explosion
  - Multiple-DFA
  - Hybrid-FA
  - History-based-FA
  - XFA

- Generality: covered regex classes, automatability
- Suitable memory architecture
- Average and worst case bound
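The idea behind default transitions can be sketched as follows: a state stores only the transitions that differ from those of its default target, and a lookup follows default pointers, without consuming input, until a stored transition is found. This is a simplified illustration, assuming every non-root state defaults directly to one root state that keeps its full table; the published D2FA construction chooses default targets more carefully. The names `compress`, `next_state`, and the three-state example DFA (which recognises "ab" over the alphabet a, b, c) are hypothetical.

```python
def compress(full, root):
    """Drop transitions that a state shares with `root`.

    Simplifying assumption: every non-root state uses `root` as its default.
    """
    labeled = {root: dict(full[root])}  # the root keeps its full table
    default = {}
    for state, row in full.items():
        if state == root:
            continue
        default[state] = root
        labeled[state] = {
            sym: dst for sym, dst in row.items() if full[root][sym] != dst
        }
    return labeled, default


def next_state(labeled, default, state, symbol):
    """Follow default transitions until a labeled one on `symbol` exists."""
    while symbol not in labeled[state]:
        state = default[state]  # a default transition consumes no input
    return labeled[state][symbol]


# Hypothetical full DFA for "ab" anywhere in the input; state 2 accepts.
FULL = {
    0: {"a": 1, "b": 0, "c": 0},
    1: {"a": 1, "b": 2, "c": 0},
    2: {"a": 1, "b": 0, "c": 0},
}
LABELED, DEFAULT = compress(FULL, root=0)
```

In this example the nine transitions of the full table shrink to four stored ones. The price is that a lookup may touch more than one state per input symbol, which is why the memory architecture and the worst-case bound matter.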
Multiple Flow Handling
- FPGA logic (NFA)
  - Peak performance on single flow
  - No intrinsic multiple-flow support
- Memory-centric architectures: FPGA / ASIC + memory, general purpose processors, network processors (Multiple-DFA, Hybrid-FA, History-based-FA, XFA)
  - Amount of per-flow state
    - Active states
    - Counters
    - History bits
    - …
- Can we aggregate throughput over multiple flows?
- Can we face denial of service attacks based on multiple flows?
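Per-flow state in a memory-based design can be sketched as a table keyed by flow identifier. In this minimal example a single DFA is shared by all flows and only the current state is saved between packets, so a match can span packet boundaries. The class `FlowMatcher`, its method `feed`, the flow identifiers, and the small DFA for "ab" are assumptions for illustration; missing table entries are taken to fall back to the start state.

```python
class FlowMatcher:
    """One shared DFA, with the active state saved per flow between packets."""

    def __init__(self, table, start, accepting):
        self.table = table          # state -> {byte value -> next state}
        self.start = start
        self.accepting = accepting
        self.flow_state = {}        # flow id -> current DFA state

    def feed(self, flow_id, payload):
        """Process one packet payload; return True if a match completed in it."""
        state = self.flow_state.get(flow_id, self.start)
        matched = False
        for byte in payload:        # one input character per iteration
            # Assumption: a missing entry means "go back to the start state".
            state = self.table[state].get(byte, self.start)
            if state in self.accepting:
                matched = True
        self.flow_state[flow_id] = state   # the only per-flow state kept here
        return matched


# Hypothetical DFA for "ab" anywhere in the stream; state 2 accepts.
AB_TABLE = {
    0: {ord("a"): 1},
    1: {ord("a"): 1, ord("b"): 2},
    2: {ord("a"): 1},
}
matcher = FlowMatcher(AB_TABLE, start=0, accepting={2})
```

Feeding `b"xxa"` and then `b"b"` on the same flow reports the match on the second packet, while an interleaved packet on another flow is unaffected. With several DFAs, counters, or history bits, the saved record per flow grows accordingly.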
Some Results
- About 500 complex regexes from Snort NIDS
- FPGA logic (NFA) – Xilinx Virtex 5 – 1 flow
  - 6.1 Gbps, using slices (46% utilization on XC5VLX50)
  - Note: XC5VLX330 device has 51,840 slices
- FPGA/ASIC + memory (projected)
  - Multiple-DFA: 13 DFAs, < 1 MB footprint each
  - 2 Gbps on single flow assuming 500 MHz
- NP: Intel IXP2800
  - 5 1.4 GHz, 5 flows
  - 1 KB scratchpad, 5 MB SRAM, 128 MB DRAM
  - Multiple-DFA (13 DFAs): Mbps
  - Hybrid-FA: Mbps
Discussion
- FPGAs offer an easier way to support large data-sets of complex regular expressions
- On memory-based architectures
  - High parallelism, large memory bandwidth and low memory latency are necessary to guarantee high throughput
  - Complex rule-sets bring data-structure and algorithmic challenges
- Multiple-flow support is necessary
- Finite state machine performance bottleneck:
  - One input character is processed at each iteration
- Open question: less complex patterns allowing tokenizers
- Payload encryption
  - Anomaly detection and probabilistic methods
  - Deep packet inspection still available as a filtering/classification tool within private networks
Code available at: