High Performance Pattern Matching using Bloom–Bloomier Filter

Presentation transcript:

High Performance Pattern Matching using Bloom–Bloomier Filter
2019/9/10
Author: Nguyen Duy Anh Tuan, Bui Trung Hieu, Tran Ngoc Thinh
Presenter: Yi-Hsien Wu
Conference: Electrical Engineering/Electronics Computer Telecommunications and Information Technology (ECTI-CON), 2010 International Conference
Date: 2017/8/9
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.
CSIE CIAL Lab

Outline
Introduction
Background
System Architecture
FPGA Performance Implementation
Conclusion
National Cheng Kung University CSIE Computer & Internet Architecture Lab

Introduction
With the growth of the Internet, information security is becoming more and more critical, and antivirus protection is one of its most important aspects. Despite many improvements, antivirus programs still have to match a file stream against static patterns of known viruses. As the number of viruses grows, this process consumes a noticeable amount of resources and slows down the entire system. In addition, software-based solutions cannot keep up with gigabit networks. FPGA-based systems are a popular hardware technology because of their high-speed operation and the flexibility to change the application.

Background (Bloom Filter)
Bloom Filter: a method that uses an index table to check whether a given string exists in a pre-defined set. Initially, all entries in the index table, an array of 1-bit entries, are set to '0'. Each member of the pre-defined set is then hashed to k positions in the index table by k hash functions, and the entries at those positions are set to '1'. This process is repeated until all members of the pre-defined set have been hashed into the index table. Example figure: k = 3, Xi = element.

Background (Bloom Filter)
Membership of a string is checked in a similar way. First, the string is fed to the same k hash functions to obtain k entries in the index table. If any of these entries is '0', the string is not a member of the set; otherwise, its membership is uncertain and a further check is required. This uncertainty is caused by the "false positive" problem of hash-based systems. The probability of a false positive depends on the number of hash functions (k), the size of the set (n), and the length of the index table (m).
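
The insertion and membership check described on the two Bloom Filter slides can be sketched in a few lines of Python. This is a minimal software illustration, not the hardware design: the class name `BloomFilter`, the method names, and the use of salted SHA-256 as the k hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Index table of m one-bit entries probed by k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # all entries start at '0'

    def _positions(self, item: bytes) -> list[int]:
        # Assumed hash family: SHA-256 salted with the hash index i (k < 256).
        return [
            int.from_bytes(hashlib.sha256(bytes([i]) + item).digest()[:8], "big") % self.m
            for i in range(self.k)
        ]

    def add(self, item: bytes) -> None:
        # Set the k hashed entries to '1'.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        # Any '0' entry proves non-membership; all '1' means "maybe".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=64, k=3)
for pattern in (b"virus-a", b"virus-b"):
    bf.add(pattern)
```

A `True` result from `might_contain` only means that a further exact comparison is needed; a `False` result is always definitive.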

Background (Bloomier Filter)
Bloomier Filter: it can indicate exactly which pattern in the set is the best match for the searched string, so the query time is constant. Its algorithm is similar to that of the Bloom Filter, but its index table is constructed differently. Each entry stores more information, so the size of an entry depends on what information is encoded.
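
The slide does not spell out how the Bloomier index table is built. The sketch below shows one common XOR-based construction, in which XOR-ing a key's k entries yields the value stored for that key; it is not necessarily the encoding the authors use. The function names, the SHA-256-based hashing, the per-hash table segments, and the retry over seeds are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def slots(key: bytes, k: int, seg: int, seed: int) -> list[int]:
    """One slot per hash function; hash i owns table segment i, so slots are distinct."""
    out = []
    for i in range(k):
        digest = hashlib.sha256(f"{seed}:{i}:".encode() + key).digest()
        out.append(i * seg + int.from_bytes(digest[:8], "big") % seg)
    return out


def build(values: dict[bytes, int], k: int = 3, seg: int = 16, max_seeds: int = 100):
    """Return (table, seed) such that XOR-ing a key's k slots gives its value."""
    for seed in range(max_seeds):
        users = defaultdict(set)  # slot -> keys not yet peeled that use it
        for key in values:
            for s in slots(key, k, seg, seed):
                users[s].add(key)
        order = []  # (key, private slot), in peeling order
        ready = [s for s, ks in users.items() if len(ks) == 1]
        while ready:
            s = ready.pop()
            if len(users[s]) != 1:
                continue  # its last user was already peeled via another slot
            key = next(iter(users[s]))
            order.append((key, s))
            for t in slots(key, k, seg, seed):
                users[t].discard(key)
                if len(users[t]) == 1:
                    ready.append(t)
        if len(order) < len(values):
            continue  # peeling got stuck: retry with different hashes
        table = [0] * (k * seg)
        # Assign in reverse peeling order so a private slot is never overwritten.
        for key, s in reversed(order):
            code = values[key]
            for t in slots(key, k, seg, seed):
                if t != s:
                    code ^= table[t]
            table[s] = code
        return table, seed
    raise RuntimeError("no suitable hash seed found")


def lookup(table: list[int], seed: int, key: bytes, k: int = 3) -> int:
    """Constant-time query: XOR the k hashed entries."""
    seg = len(table) // k
    result = 0
    for s in slots(key, k, seg, seed):
        result ^= table[s]
    return result


pattern_ids = {b"virus-a": 1, b"virus-b": 2, b"virus-c": 3, b"worm-x": 4}
table, seed = build(pattern_ids)
```

For a string that is not in the set, `lookup` still returns some value, so the result always has to be confirmed by comparing against the stored pattern.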

System Architecture
Our BBF system is shown in Fig. 2. It includes a Character Scanning Unit, a Comparison Unit, and an off-chip memory that stores the original patterns. The Character Scanning Unit has three main components: the Standalone Hash Unit (SHU), the Bloom–Bloomier Unit (BBU), and an on-chip memory that stores suspected strings. If one of the BBUs signals a match, the address of the corresponding original pattern and the current scanning string (the suspected string) are passed to the Comparison Unit, which determines whether the string is identical to the original pattern. When an exact match is confirmed, the system reports the match together with the ID of the pattern.

System Architecture
Character Scanning Unit: With a Bloomier Filter alone, we would have to access off-chip memory to do a comparison for every searched string. In this respect, the Bloomier Filter is worse than the Bloom Filter, which performs a comparison only when all bits in the hashed positions are '1'. Our solution is to combine the Bloomier Filter with a Bloom Filter in order to minimize the comparison process, which requires access to low-speed off-chip memory. We add one more bit to each index table entry, called the "Bloom bit". Together, the Bloom bits in the index table act as an independent Bloom Filter, which we use to check whether the query string may be in the set. If all Bloom bits in the hashed positions are '1', we decode the information from the Bloomier-encoded bits and then start comparing the string.
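
The query path of this combined entry can be illustrated with a toy example. Everything here is hypothetical: the 4-entry table is built by hand, the `SLOTS` dictionary stands in for k = 2 hardware hash functions, the entry layout (Bloom bit in bit 0, Bloomier code above it) is an assumption, and the Bloomier code is assumed to be XOR-encoded so that it decodes to a pattern ID.

```python
from typing import Optional

BLOOM_BIT = 1  # assumed layout: bit 0 is the Bloom bit, upper bits the Bloomier code

# Hand-built toy index table for two patterns with IDs 1 and 2.
# entry = (bloomier_code << 1) | bloom_bit
INDEX_TABLE = [
    (1 << 1) | 1,  # slot 0: used only by "EICAR"
    (0 << 1) | 1,  # slot 1: shared by both patterns
    (2 << 1) | 1,  # slot 2: used only by "WORM"
    0,             # slot 3: unused, so its Bloom bit stays '0'
]
OFF_CHIP = {1: "EICAR", 2: "WORM"}  # pattern ID -> original pattern

# Hypothetical stand-in for the hash functions: fixed slots per string.
SLOTS = {"EICAR": (0, 1), "WORM": (1, 2), "HELLO": (1, 3), "FALSE": (0, 1)}


def query(s: str) -> tuple[Optional[int], bool]:
    """Return (matched pattern ID or None, whether off-chip memory was read)."""
    entries = [INDEX_TABLE[i] for i in SLOTS[s]]
    if not all(e & BLOOM_BIT for e in entries):
        return None, False  # rejected on-chip by the Bloom bits
    pattern_id = 0
    for e in entries:
        pattern_id ^= e >> 1  # decode the Bloomier-encoded bits
    matched = OFF_CHIP.get(pattern_id) == s  # exact comparison, off-chip
    return (pattern_id if matched else None), True
```

In this toy setup, "HELLO" is rejected without touching off-chip memory, while "FALSE" passes the Bloom bits (a false positive) and is only rejected by the exact comparison.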

System Architecture
Another problem with this structure is a limitation of SRAM-based FPGA chips. k hash functions require k random lookups to the memory (index table) in a single clock cycle, whereas the SRAM on the FPGA supports only 2 queries at a time. To solve this problem, we break the Bloomier bits into (k/2) parts and encode them into (k/2) separate index tables. k hash functions are still used, and each pair of them corresponds to one index table.
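
One way to picture this split is sketched below. It assumes that k is even, that the Bloomier bits form a pattern ID cut into contiguous chunks, and that hash function h reads table h // 2 through port h % 2; the slide confirms none of these details, and the function names are invented for the example.

```python
def port_plan(k: int) -> list[tuple[int, int, int]]:
    """Map each hash function to (hash index, index table, memory port)."""
    if k % 2:
        raise ValueError("this sketch assumes an even number of hash functions")
    return [(h, h // 2, h % 2) for h in range(k)]


def split_bits(value: int, total_bits: int, parts: int) -> list[int]:
    """Cut a total_bits-wide value into `parts` chunks, lowest bits first."""
    width = -(-total_bits // parts)  # ceiling division
    mask = (1 << width) - 1
    return [(value >> (i * width)) & mask for i in range(parts)]


def join_bits(chunks: list[int], total_bits: int) -> int:
    """Inverse of split_bits: reassemble the chunks decoded from each table."""
    width = -(-total_bits // len(chunks))
    value = 0
    for i, chunk in enumerate(chunks):
        value |= chunk << (i * width)
    return value


# With k = 4 there are two index tables, each read twice per cycle.
plan = port_plan(4)
# A 12-bit ID is wide enough for 2,751 patterns; each table stores 6 of the bits.
chunks = split_bits(2750, total_bits=12, parts=2)
```

Under these assumptions no table ever sees more than two reads in a cycle, which fits the two-queries-at-a-time limit described above.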

System Architecture
Example 1 demonstrates the operation of the Bloom–Bloomier Filter.

System Architecture
Hash Function Consideration: The number of hash functions used in the BBF has a major impact on system performance because it affects the false positive rate of the filter. A high false positive rate means that more suspected strings need to be checked.
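
The trade-off can be made concrete with the standard approximation for a plain Bloom Filter, p ≈ (1 - e^(-kn/m))^k. The sketch below applies that textbook formula; it does not model the split index tables of the BBF, the function names are invented, and the table length used in the example is a hypothetical value, since the slides do not state m.

```python
import math


def false_positive_rate(k: int, n: int, m: int) -> float:
    """Approximate false positive rate of a Bloom Filter with k hashes,
    n stored elements and an index table of m one-bit entries."""
    return (1.0 - math.exp(-k * n / m)) ** k


def best_k(n: int, m: int, k_max: int = 16) -> int:
    """Number of hash functions (1..k_max) that minimizes the approximation."""
    return min(range(1, k_max + 1), key=lambda k: false_positive_rate(k, n, m))


# n = 2,751 patterns comes from the slides; m = 32,768 bits is an assumed size.
for k in (2, 4, 6, 8):
    print(k, false_positive_rate(k, n=2751, m=32768))
```

Adding hash functions helps only up to a point: beyond roughly (m/n)·ln 2 of them, the table fills with '1' bits and the rate rises again.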

System Architecture
Comparison Unit: It has 2 parts, the Priority FIFO Module and the Comparator Module.
Priority FIFO: collects all output data of the BBUs and passes it to the Comparator one item at a time. We connect the outputs of all BBU-FIFOs to a common bus and selectively fetch the BBU-FIFOs' contents into the Priority FIFO.
Comparator Module: reads data from the Priority FIFO and uses this information to compare the original pattern from off-chip memory with the corresponding suspected string from on-chip memory. Whenever an exact match is detected, the Comparator reports the pattern's ID and halts the system until the next stream arrives.
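
A software analogy of this data flow is sketched below. The slides do not say how the BBU-FIFOs are selected, so the sketch assumes a fixed priority (lowest BBU index first); the function name, the dictionary-based memories, and the (pattern address, string address) item format are likewise assumptions.

```python
from collections import deque
from typing import Optional


def comparison_unit(
    bbu_fifos: list[deque],
    off_chip: dict[int, tuple[int, bytes]],  # pattern address -> (pattern ID, pattern)
    on_chip: dict[int, bytes],               # string address -> suspected string
) -> Optional[int]:
    """Return the ID of the first exactly matching pattern, or None."""
    priority_fifo: deque = deque()
    # Assumed policy: drain the BBU-FIFOs in fixed order, lowest index first.
    for fifo in bbu_fifos:
        while fifo:
            priority_fifo.append(fifo.popleft())
    # Comparator: process the candidates one by one.
    while priority_fifo:
        pattern_addr, string_addr = priority_fifo.popleft()
        pattern_id, pattern = off_chip[pattern_addr]
        if on_chip[string_addr] == pattern:
            return pattern_id  # report the ID and stop until the next stream
    return None


fifos = [deque([(0, 0)]), deque([(1, 1)])]
patterns = {0: (10, b"abc"), 1: (11, b"xyz")}
suspects = {0: b"abd", 1: b"xyz"}
matched_id = comparison_unit(fifos, patterns, suspects)
```

Here the first candidate is a false positive (`b"abd"` differs from `b"abc"`), so the comparator moves on and reports the second pattern.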

System Architecture
Database update: Our system relies mostly on memory: on-chip memory stores the index tables and off-chip memory stores the content of the original patterns. To remove a pattern from the database, we simply set its pointer in off-chip memory to null; when the Comparator encounters this invalid value, it drops the current suspected string and proceeds to examine the next one. Adding new patterns is not as easy as removing them: the software running on the PC has to reconstruct the index table and then transmit the new index table values via the Communication Module of the BBF system to replace the old index table.
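
The removal path can be sketched as follows, assuming a pointer table indexed by pattern ID that refers into a separate pattern store. Both structures and the function names are hypothetical; the slides only state that the pointer is set to null.

```python
from typing import Optional


def remove_pattern(pointer_table: list[Optional[int]], pattern_id: int) -> None:
    """Remove a pattern by nulling its pointer; the index table is left untouched."""
    pointer_table[pattern_id] = None


def confirm(
    pointer_table: list[Optional[int]],
    pattern_store: dict[int, bytes],
    pattern_id: int,
    suspect: bytes,
) -> bool:
    """Comparator check: drop the suspected string if the pointer is null."""
    addr = pointer_table[pattern_id]
    if addr is None:
        return False  # pattern was removed; move on to the next string
    return pattern_store[addr] == suspect


pointers: list[Optional[int]] = [0, 16]
store = {0: b"virus-a", 16: b"virus-b"}
before = confirm(pointers, store, 1, b"virus-b")
remove_pattern(pointers, 1)
after = confirm(pointers, store, 1, b"virus-b")
```

The filter may still flag the removed pattern as a candidate, but the null pointer stops it from ever being reported, which is why removal needs no index table rebuild.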

FPGA Performance Implementation
Our system is implemented on the Altera DE2 Development and Education board, whose FPGA chip is a Cyclone II EP2C35F672C6. We use Quartus II 9.1 Web Edition for hardware synthesis, mapping, placement, and routing. We implement only the patterns with lengths from 10 to 20 characters; there are 2,751 such patterns, with a total of 43,951 characters.

FPGA Performance Implementation
Because of the limitations of the low-cost FPGA chip and of the synthesis tool, our system can operate at only 128 MHz (1 Gbps). Table 1 lists the hardware consumption of all components in the system, which consists of 9 SHUs, 11 BBUs with their index tables and FIFOs, the Comparator Module, the Priority FIFO, and the on-chip memory that stores suspected strings. Our system also uses 54.5 kilobytes of the off-chip memory available on the DE2 board to store all original patterns. Table 2 compares our system with previous systems in terms of on-chip memory usage.
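
The 1 Gbps figure is consistent with scanning one 8-bit character per clock cycle at 128 MHz. The one-character-per-cycle rate is an assumption made here to reproduce the arithmetic; the slides do not state it, and the function name is invented.

```python
def throughput_gbps(clock_hz: float, bits_per_cycle: int = 8) -> float:
    """Scan throughput, assuming one 8-bit character is consumed per clock cycle."""
    return clock_hz * bits_per_cycle / 1e9


rate = throughput_gbps(128e6)  # 128 MHz x 8 bits = 1.024 Gbps
```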

Conclusion
Our system is an effective solution for accelerating pattern matching in ClamAV. Its novel aspect is the combination of the Bloom Filter and the Bloomier Filter to minimize the number of off-chip memory accesses for exact pattern comparison. In the near future, our system will support all ClamAV simple patterns and some kinds of wildcards. We also intend to create a system called Hybrid Antivirus Solution, a combination of hardware and software that takes full advantage of the high speed of the FPGA-based system as well as the flexibility of a PC application. An antivirus application running on the PC uses heuristic algorithms to discover unknown viruses, while the FPGA-based system scans the file stream to detect known viruses by their signatures.