Authors: Raphael Polig, Kubilay Atasu, and Christoph Hagleitner Publisher: FPL, 2013 Presenter: Chia-Yi, Chu Date: 2013/10/30 1.

Slides:



Advertisements
Similar presentations
Deep Packet Inspection: Where are We? CCW08 Michela Becchi.
Advertisements

Computer Architecture
System Integration and Performance
Authors: Wei Lin, Bin Liu Publisher: ICPADS, 2008 (IEEE International Conference on Parallel and Distributed Systems) Presenter: Chia-Yi, Chu Date: 2014/03/05.
Dr. Rabie A. Ramadan Al-Azhar University Lecture 3
A HIGH-PERFORMANCE IPV6 LOOKUP ENGINE ON FPGA Author : Thilan Ganegedara, Viktor Prasanna Publisher : FPL 2013.
TOPIC : Finite State Machine(FSM) and Flow Tables UNIT 1 : Modeling Module 1.4 : Modeling Sequential circuits.
Boosting XML filtering through a scalable FPGA-based architecture A. Mitra, M. Vieira, P. Bakalov, V. Tsotras, W. Najjar.
Multithreaded FPGA Acceleration of DNA Sequence Mapping Edward Fernandez, Walid Najjar, Stefano Lonardi, Jason Villarreal UC Riverside, Department of Computer.
1 Asynchronous Bit-stream Compression (ABC) IEEE 2006 ABC Asynchronous Bit-stream Compression Arkadiy Morgenshtein, Avinoam Kolodny, Ran Ginosar Technion.
400 Gb/s Programmable Packet Parsing on a Single FPGA Authors : Michael Attig 、 Gordon Brebner Publisher: 2011 Seventh ACM/IEEE Symposium on Architectures.
Using Cell Processors for Intrusion Detection through Regular Expression Matching with Speculation Author: C˘at˘alin Radu, C˘at˘alin Leordeanu, Valentin.
Processor Technology and Architecture
Data Parallel Algorithms Presented By: M.Mohsin Butt
1 The scanning process Goal: automate the process Idea: –Start with an RE –Build a DFA How? –We can build a non-deterministic finite automaton (Thompson's.
The Control Unit: Sequencing the Processor Control Unit: –provides control signals that activate the various microoperations in the datapath the select.
1 A Virus Scanning Engine Using a Parallel Finite-Input Memory Machine and MPUs Author: Hiroki Nakahara, Tsutomu Sasao, Munehiro Matsuura, and Yoshifumi.
1 Multi-Core Architecture on FPGA for Large Dictionary String Matching Department of Computer Science and Information Engineering National Cheng Kung University,
1 Regular expression matching with input compression : a hardware design for use within network intrusion detection systems Department of Computer Science.
1 Scalable Pattern Matching for High Speed Networks Authors: Christopher R.Clark and David E. Schemmel Publisher: Proceedings of IEEE Symposium on Field-
Chapter 4 Processor Technology and Architecture. Chapter goals Describe CPU instruction and execution cycles Explain how primitive CPU instructions are.
State Machines Timing Computer Bus Computer Performance Instruction Set Architectures RISC / CISC Machines.
1 Performing packet content inspection by longest prefix matching technology Authors: Nen-Fu Huang, Yen-Ming Chu, Yen-Min Wu and Chia- Wen Ho Publisher:
A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications From J. Fowers, G. Brown, P. Cooke, and G. Stitt, University.
Deep Packet Inspection with Regular Expression Matching Min Chen, Danny Guo {michen, CSE Dept, UC Riverside 03/14/2007.
Chapter 6 Memory and Programmable Logic Devices
Chapter 3 Digital Logic Structures. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 3-2 Building Functions.
CHP - 9 File Structures. INTRODUCTION In some of the previous chapters, we have discussed representations of and operations on data structures. These.
ECE 526 – Network Processing Systems Design Network Processor Architecture and Scalability Chapter 13,14: D. E. Comer.
 Author: Kubilay Atasu, Florian Doerfler, Jan van Lunteren, Christoph Hagleitner  Publisher: 2013 FPL  Presenter: Yuen-Shuo Li  Date: 2013/10/30 1.
1 of 23 Fouts MAPLD 2005/C117 Synthesis of False Target Radar Images Using a Reconfigurable Computer Dr. Douglas J. Fouts LT Kendrick R. Macklin Daniel.
(TPDS) A Scalable and Modular Architecture for High-Performance Packet Classification Authors: Thilan Ganegedara, Weirong Jiang, and Viktor K. Prasanna.
FPGA Based String Matching for Network Processing Applications Janardhan Singaraju, John A. Chandy Presented by: Justin Riseborough Albert Tirtariyadi.
TFA : A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Tang Song and H. Jonathan Chao Publisher: Technical.
designKilla: The 32-bit pipelined processor Brought to you by: Victoria Farthing Dat Huynh Jerry Felker Tony Chen Supervisor: Young Cho.
An Efficient Regular Expressions Compression Algorithm From A New Perspective  Author: Tingwen Liu, Yifu Yang, Yanbing Liu, Yong Sun, Li Guo  Publisher:
GPEP : Graphics Processing Enhanced Pattern- Matching for High-Performance Deep Packet Inspection Author: Lucas John Vespa, Ning Weng Publisher: 2011 IEEE.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
Parallelization and Characterization of Pattern Matching using GPUs Author: Giorgos Vasiliadis 、 Michalis Polychronakis 、 Sotiris Ioannidis Publisher:
Lecture 16: Reconfigurable Computing Applications November 3, 2004 ECE 697F Reconfigurable Computing Lecture 16 Reconfigurable Computing Applications.
TRANSITION DIAGRAM BASED LEXICAL ANALYZER and FINITE AUTOMATA Class date : 12 August, 2013 Prepared by : Karimgailiu R Panmei Roll no. : 11CS10020 GROUP.
1 Optimization of Regular Expression Pattern Matching Circuits on FPGA Department of Computer Science and Information Engineering National Cheng Kung University,
StrideBV: Single chip 400G+ packet classification Author: Thilan Ganegedara, Viktor K. Prasanna Publisher: HPSR 2012 Presenter: Chun-Sheng Hsueh Date:
Algorithm and Programming Considerations for Embedded Reconfigurable Computers Russell Duren, Associate Professor Engineering And Computer Science Baylor.
Author : Sarang Dharmapurikar, John Lockwood Publisher : IEEE Journal on Selected Areas in Communications, 2006 Presenter : Jo-Ning Yu Date : 2010/12/29.
Chapter 3 Digital Logic Structures. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 3-2 Transistor: Building.
A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching Yao Song 11/05/2015.
Author : Randy Smith & Cristian Estan & Somesh Jha Publisher : IEEE Symposium on Security & privacy,2008 Presenter : Wen-Tse Liang Date : 2010/10/27.
Memory-Efficient and Scalable Virtual Routers Using FPGA Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan,
1 Fundamentals of Computer Science Combinational Circuits.
TFA: A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Yang Song and H. Jonathan Chao Publisher: ACM/IEEE.
LaFA Lookahead Finite Automata Scalable Regular Expression Detection Authors : Masanori Bando, N. Sertac Artan, H. Jonathan Chao Masanori Bando N. Sertac.
Sequential Logic Flip-Flop Circuits By Dylan Smeder.
Author: Weirong Jiang, Viktor K. Prasanna Publisher: th IEEE International Conference on Application-specific Systems, Architectures and Processors.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
Introduction to Intrusion Detection Systems. All incoming packets are filtered for specific characteristics or content Databases have thousands of patterns.
Gnort: High Performance Network Intrusion Detection Using Graphics Processors Date:101/2/15 Publisher:ICS Author:Giorgos Vasiliadis, Spiros Antonatos,
Range Hash for Regular Expression Pre-Filtering Publisher : ANCS’ 10 Author : Masanori Bando, N. Sertac Artan, Rihua Wei, Xiangyi Guo and H. Jonathan Chao.
400 Gb/s Programmable Packet Parsing on a Single FPGA Author: Michael Attig 、 Gordon Brebner Publisher: ANCS 2011 Presenter: Chun-Sheng Hsueh Date: 2013/03/27.
Raphael Polig, Kubilay Atasu, Christoph Hagleitner, Theresa Xu, Akihiro Nakayama 30 August 2016 Annotation-Based Finite-State Transducers on Reconfigurable.
Author: Yun R. Qu, Shijie Zhou, and Viktor K. Prasanna Publisher:
2018/4/27 PiDFA : A Practical Multi-stride Regular Expression Matching Engine Based On FPGA Author: Jiajia Yang, Lei Jiang, Qiu Tang, Qiong Dai, Jianlong.
CHP - 9 File Structures.
A DFA with Extended Character-Set for Fast Deep Packet Inspection
Regular Expression Matching in Reconfigurable Hardware
Regular Expression Acceleration at Multiple Tens of Gb/s
Scalable Memory-Less Architecture for String Matching With FPGAs
Compact DFA Structure for Multiple Regular Expressions Matching
High-Performance Pattern Matching for Intrusion Detection
Presentation transcript:

Authors: Raphael Polig, Kubilay Atasu, and Christoph Hagleitner Publisher: FPL, 2013 Presenter: Chia-Yi, Chu Date: 2013/10/30 1

 Introduction  Dictionary Matching and Text analytics  Architecture  Implementation  Evaluation 2

 A novel architecture consisting of a deterministic finite- state automaton (DFA) to detect single token-based patterns and a non-deterministic finite-state automaton (NFA) to find token sequences or multitoken patterns. ◦ DFA: “High-performance pattern-matching for intrusion detection,” INFOCOM ◦ NFA: “Fast regular expression matching using fpgas,” FCCM  Implemented on an Altera Stratix IV GX530, and were able to process up to 16 documents in parallel at a peak throughput rate of 9.7 Gb/s. 3

 Text Analytics (TA) ◦ Extracting structured information from large amounts of unstructured data  Dictionary Matching (DM) ◦ the detection of particular sets of words and patterns ◦ Before DM is performed, a document is split into parts called tokens. ◦ By splitting the text at 1.whitespaces and/or 2.other special characters or 3.more complex patterns 4

5

 Text analytics adds specific details to pattern matching ◦ Tokenization  The text data gets split into parts, so-called tokens  The output of the tokenizer is a set of tuples, each consisting of a start and end offset of a token within the text data.  Matches found by the pattern matching algorithm are only valid if the start and end offsets of the found pattern match the offsets for the given token. ◦ similar to the regular expression \bJohn\b where John is only valid at word boundaries. 6

 Patterns consisting of a sequence of tokens are referred to as multi-token patterns. ◦ e.g., Assuming John and Doe are defined as tokens, it does not matter by what characters they are separated as long as these two tokens appear in sequence.  Text analytics allows the use of simple regular expressions in dictionaries ◦ e.g., a dictionary may contain various date formats  Requires exact operation ◦ Exclude the approaches causing false positive results 7

 High-level view of the architecture ◦ Single token Matching  A DFA-based architecture is used to provide a deterministic scan rate of the document text data. ◦ Multi token Matching  An NFA based circuitry is used which advances its states on a token by token basis. ◦ All its detected patterns are reported as a combination of 1.the start offset 2.the end offset 3.the dictionary identifier value 8

 High-level view of the architecture 9

 Overall hardware architecture ◦ 2 individual DMA engines  document text data  token offset values ◦ The text data  8-bit ASCII characters ◦ The token offsets  a tuple of two 32-bit integer values representing the start and end offsets of a token. 10

 Overall hardware architecture ◦ Counter  as character pointer to keep track of the current position in the document text. ◦ Boundary detector  determining the start and end characters of a token by comparing the character pointer with the incoming offset stream 11

 Overall hardware architecture ◦ Boundary detector  It controls the operation of the BFSMs by issuing start-of-token and end-of token pulses ◦ Decoder  generate the appropriate transition signals for the multi-token chains implementing the NFA 12

 BFSM ◦ A scalable Balanced Routing Table finite-state machine ◦ For regular expression matching with deterministic performance. ◦ AC-DFA based ◦ The default rules define the extra links that are necessary in an AC-DFA to return from failing patterns and only depend on the incoming character ◦ Transition rules depend on both the current state and the incoming character and define the positive transitions in the DFA 13

14

 BFSM 1.Start from default state, if a token is detected by the Boundary Detector  Compares the character pointer value to the start offset field for the current token. 2.If these values match a start of token signal is activated  Allowing the BFSM to leave its default state using the current character. 3.The subsequent transitions are carried out without any external interaction 4.When the end of a token is detected by the Boundary Detector  An end of token signal is activated 5.When the end of a token is signaled,  The BFSM will be reset to its default state regardless of the calculated next state. 15

 Multi-token NFA ◦ Custom hardware logic as described by Sidhu and Prasanna ◦ Each element has its own small machine consisting of registers connected by AND gates ◦ The number of registers is equal to the number of tokens in the multi-token element. ◦ A single token match id is decoded and fed to the individual register chains  where it is combined with the previous matches. ◦ When a set of single token matches appears in the correct sequence, a multi-token match is signalled 16

 Multi-token NFA 17

18

 Result reporting ◦ A single result is composed of a 32-bit dictionary identifier and the start and end offset position of the element found in the current document. ◦ A single element may be contained in multiple dictionaries  The amount of data generated by the dictionary matcher may exceed the size of the actual document processed. ◦ Cascaded result lookup is used to efficiently store the information 19

 Result reporting 20

 Compiler ◦ Takes the dictionaries as plain text files as input ◦ Each file is considered as one dictionary ◦ Each line in a file as a dictionary element ◦ Multi-token sequences  Be defined in a dictionary file by separating the tokens by a whitespace character 21

 Altera Startix-IV GX530 FPGA ◦ 250 MHz ◦ Latency can be neglected if a continuous stream of document data can be maintained at high frequency. ◦ Dual-port BlockRAM ◦ The number of pipeline stages is determined by  the single token decoder  grows with the number of single token patterns used in the multi- token patterns  the multi-token result multiplexer  the number of multi-token patterns 22

 Altera Quartus 12.1  The resource numbers were generated using a two threaded dual BFSM capable of processing two by two interleaved streams.  A set of test documents with sizes ranging from 100 B to 10 MB. 23

 Resource utilization ◦ Examine the impact of the size of the dictionary containing single token elements only ◦ The size of the transition rule memory 24

 Resource utilization 25

26

 Throughput 27

 Comparison with original software ◦ Intel ® XEON ® E5530 with 2.40 GHz using 16 threads ◦ Maximum throughput of 6.43 MB/s ◦ Compared with the 4 stream implementation  14x to 75x improvement 28

 Comparison with related work 1.The throughput 2.The technology available at implementation time 3.Storage efficiency  logical cells per character (LC/char)  memory bits per character (bit/char) 29

 Comparison with related work 30