Automated Parser Generation for High-Speed NIDS


Automated Parser Generation for High-Speed NIDS
Hongyu Gao, Clint Sbisa

Motivation
- Processing speed is a crucial concern for NIDS/NIPS.
- Throughput is limited by the rate at which packets can be parsed.
- Inefficient parsing leads to slow speeds and bottlenecks.

Current Solutions: Binpac
- A declarative language and compiler.
- Designed to simplify the task of constructing complex protocol parsers.
- Constructs a full parse tree.

Current Solutions: Netshield
- Integrates a high-speed protocol parser to provide fast parsing.
- Parsers are manually written, which is tedious and error-prone.

Proposed Solution
A protocol parser generator that:
- Reads the protocol specification.
- Outputs a parser for that specific protocol.
- Produces a matching-aware parser: it focuses on the fields needed by matching and skips unnecessary fields.

Proposed Solutions: Comparison
- Our solution: automated parser generation and fast parsing.
- Netshield parser: fast parsing, but no automated generation.
- Binpac parser: automated generation, but slower parsing.

Design Principles
- The parsing process should avoid recursive calls.
- Parse trees are not used during the parsing phase.
- Skip unneeded information: after parsing one field, the parser should be able to jump quickly to the next necessary field.
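As a minimal sketch of this non-recursive, skip-ahead style (the message layout and field names below are hypothetical, not from the paper), a parser can walk a flat table of fields and simply advance an offset past anything it does not need:

```python
# Minimal sketch of non-recursive, skip-ahead parsing. The layout and
# field names are hypothetical illustrations.

def parse(buf, layout):
    """layout: list of (name, length, needed) tuples in wire order.
    Returns {name: bytes} for needed fields only; unneeded bytes are
    skipped by advancing the offset -- no recursion, no parse tree."""
    out = {}
    off = 0
    for name, length, needed in layout:
        if needed:
            out[name] = buf[off:off + length]
        # Needed or not, just jump past the field.
        off += length
    return out

# Hypothetical 3-field message: only 'cmd' and 'arg' matter for matching.
layout = [("cmd", 1, True), ("pad", 4, False), ("arg", 2, True)]
msg = b"\x01AAAA\x00\x07"
print(parse(msg, layout))  # {'cmd': b'\x01', 'arg': b'\x00\x07'}
```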

Detailed Design
The parser consists of three parts:
- A pair of buffer pointers.
- A field table (the key data structure).
- A table pointer.

Detailed Design: Field Table
Each row of the table (Field 1, Field 2, …, Field n) has six columns: Metadata, Field type, Field value, Field length, Garbage length, Next field.

Detailed Design: Field Table Columns
There are six columns in our current design:
- Field metadata: a structure containing the field name, field length, and other metadata.
- Field type: an enum marking whether the field is used directly in matching (type 1), used to parse other fields that matching needs (type 2), or neither.
- Field value: start and end pointers into the buffer.
- Field length: a function that returns the number of bytes in the field.
- Garbage length: the length of the unnecessary fields that follow; used to skip them.
- Next field: a method that decides the next field in the table.
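The six columns could be modeled as one table row like this (a sketch: all names are illustrative, and, as described above, field length and next field are stored as callables rather than plain values):

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Sketch of one row of the field table; names are illustrative.
# Type codes follow the slide: 1 = used directly in matching,
# 2 = used to parse other needed fields, 0 = neither.

@dataclass
class FieldEntry:
    metadata: dict                              # field name, etc.
    field_type: int                             # 0, 1, or 2
    value: Optional[Tuple[int, int]]            # (start, end) offsets in the buffer
    field_length: Callable[[bytes, int], int]   # computes length at parse time
    garbage_length: int                         # bytes to skip after this field
    next_field: Callable[[bytes, int], Optional[int]]  # index of the next row

entry = FieldEntry(
    metadata={"name": "msg_type"},
    field_type=1,
    value=None,                          # filled in during parsing
    field_length=lambda buf, off: 1,     # fixed 1-byte field
    garbage_length=4,                    # skip 4 unneeded bytes afterwards
    next_field=lambda buf, off: None,    # last necessary field
)
print(entry.field_length(b"", 0))  # 1
```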

How to Realize the System
It is difficult to determine the number of necessary fields from the very beginning, so we adopt a two-phase approach:
1. Generate a table of all fields in the specification.
2. Compress the generated table to produce the table actually used in parsing.

How to Realize the System
The value of each column is computed as follows:
- Metadata: obtained from the hierarchical protocol structure in memory (during parser generation); the length value is obtained from the field-length function (during parsing).
- Field type: determined by consulting both the rule set and the protocol specification (during parser generation).
- Field value: depends on the buffer pointer and the field-length function (during parsing).

How to Realize the System
- Field length: note that this attribute is a function that computes the field length; it is not a simple number. The function can be fixed by referring to the protocol specification.
- Garbage length: in phase one of table generation, the garbage length is 0 for every field.
- Next field: for fixed-order fields, the next field can be determined during parser generation by searching the hierarchical protocol structure. For branch cases, however, the function that decides the next field is defined by the protocol specification.
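For a branch case, the next-field decision amounts to a small function over an already-parsed value. Here is a hypothetical example (the table indices and the 0x01 discriminator are made up for illustration) where a 1-byte type field selects between two continuations:

```python
# Hypothetical branch: a 1-byte 'type' field decides which field comes
# next. The table indices and the 0x01 value are illustrative only.

REQUEST_BODY = 2   # index of the request-body row in the field table
RESPONSE_BODY = 5  # index of the response-body row

def next_field_after_type(buf, off):
    """Return the field-table index of the field that follows 'type'."""
    return REQUEST_BODY if buf[off] == 0x01 else RESPONSE_BODY

print(next_field_after_type(b"\x01", 0))  # 2
print(next_field_after_type(b"\x02", 0))  # 5
```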

How to Realize the System
Compress the phase-one table to get the final table:
- All type 1 and type 2 fields are kept; these are called necessary fields.
- Starting from each necessary field, all fixed-order, consecutive unnecessary fields that follow are merged into its garbage length, until an unnecessary field is reached that is involved in computing some necessary field.
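Assuming each phase-one row carries a type code (1 or 2 = necessary) and a fixed length, the compression step could be sketched as below. Note that this sketch omits the dependency check from the last bullet: an unnecessary field that other fields' computations depend on would stop the merge.

```python
# Sketch of the table-compression step: keep necessary fields (type 1/2)
# and fold the lengths of the trailing fixed-order, unnecessary fields
# into the preceding necessary field's garbage length. Rows here are
# hypothetical (name, type_code, length) tuples; the dependency check
# that would stop a merge is omitted for brevity.

def compress(rows):
    """rows: list of (name, type_code, length). Returns the final table
    as (name, type_code, length, garbage_length) tuples, necessary
    fields only."""
    out = []
    for name, type_code, length in rows:
        if type_code in (1, 2):      # necessary field: keep it
            out.append([name, type_code, length, 0])
        elif out:                    # unnecessary: merge into garbage length
            out[-1][3] += length
    return [tuple(r) for r in out]

phase_one = [
    ("version", 1, 1),
    ("reserved", 0, 3),   # unnecessary: merged into 'version' garbage
    ("flags", 0, 1),      # also merged
    ("length", 2, 2),     # necessary again
]
print(compress(phase_one))
# [('version', 1, 1, 4), ('length', 2, 2, 0)]
```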

Questions? Suggestions?