Automated Parser Generation for High-Speed NIDS
Hongyu Gao, Clint Sbisa
Motivation
- Processing speed is a crucial concern for NIDS/NIPS
- Speed is limited by the rate of parsing packets
- Inefficient parsing leads to slow speeds and bottlenecks
Current Solutions: Binpac
- Declarative language and compiler
- Designed to simplify the task of constructing complex protocol parsers
- Constructs a full parse tree
Current Solutions: Netshield
- Integrates a high-speed protocol parser to provide fast parsing
- Parsers are written manually, which is tedious and error-prone
Proposed Solution
- A protocol parser generator
  - Reads the protocol specification
  - Outputs the parser for the specific protocol
- The parser is aware of matching
- The parser focuses on the fields needed for matching and skips unnecessary fields
Proposed Solution: Comparison

                     Automated parser generation?
                     Yes               No
Fast parsing: Yes    Our solution      Netshield parser
Fast parsing: No     Binpac parser
Design Principles
- The parsing process should avoid recursive calls
  - Parse trees are not used in the parsing phase
- Skip unneeded information
  - After parsing one field, the parser should be able to jump quickly to the next necessary field
Detailed design
- The parser consists of three parts
  - A pair of buffer pointers
  - A field table (the key data structure)
  - A table pointer
Detailed design on field table
- Labels in the table diagram: Metadata, Field type, Field length, Garbage length, Next field
- One row per field: Field 1, Field 2, …, Field n
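A minimal sketch of what one field-table row could look like, written in Python purely for illustration. The names FieldEntry, fixed and TABLE, and the toy protocol, are assumptions and do not come from the generated code. The length entries are modeled as functions because a field's length may depend on values parsed earlier. The Metadata entry is left out because the slide does not say what it holds.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical row layout; every name here is illustrative.
@dataclass
class FieldEntry:
    name: str                               # label used only by this sketch
    field_type: str                         # e.g. "uint8" or "bytes"
    field_length: Callable[[dict], int]     # bytes occupied by the field
    garbage_length: Callable[[dict], int]   # bytes to skip after the field
    next_field: Optional[int]               # index of the next row, None at the end

def fixed(n: int) -> Callable[[dict], int]:
    """Length function for a field whose size never changes."""
    return lambda parsed: n

# Toy protocol (an assumption): 1-byte type, 1-byte length, then a payload
# whose size is given by the previously parsed length field.
TABLE = [
    FieldEntry("msg_type", "uint8", fixed(1), fixed(0), 1),
    FieldEntry("length", "uint8", fixed(1), fixed(0), 2),
    FieldEntry("payload", "bytes", lambda parsed: parsed["length"], fixed(0), None),
]
```

Storing the index of the next row, rather than relying on row order, is one way a driver could jump directly to the next needed field.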
Detailed Design on Parser
Implementation
- Basic approach:
  - Fixed driver
  - Fixed data structure
  - Protocol-specific table content
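The split between a fixed driver and protocol-specific table content can be sketched as follows. This is an illustration in Python under stated assumptions, not the generated parser: the tuple layout, the parse function, and the toy protocol are all hypothetical. The driver is a plain loop with no recursion, and only the contents of TABLE would change from one protocol to another.

```python
def parse(table, data):
    """Fixed, non-recursive driver: walk the table, read needed fields, skip garbage."""
    pos, end = 0, len(data)   # pair of buffer pointers: cursor and end of input
    row = 0                   # table pointer: index of the current row
    parsed = {}
    while row is not None:
        name, kind, length_fn, garbage_fn, next_row = table[row]
        length = length_fn(parsed)
        if pos + length > end:
            raise ValueError(f"input truncated while reading {name!r}")
        raw = data[pos:pos + length]
        parsed[name] = int.from_bytes(raw, "big") if kind == "int" else raw
        pos += length + garbage_fn(parsed)   # jump over bytes nobody needs
        row = next_row
    return parsed

# Protocol-specific table content (toy protocol, an assumption):
# 1-byte type, 2 reserved bytes that matching never uses, 1-byte length, payload.
# Row layout: (name, kind, field length, garbage length, next row).
TABLE = [
    ("msg_type", "int", lambda p: 1, lambda p: 2, 1),
    ("length", "int", lambda p: 1, lambda p: 0, 2),
    ("payload", "bytes", lambda p: p["length"], lambda p: 0, None),
]

PACKET = bytes([7, 0xAA, 0xBB, 3]) + b"abc"
RESULT = parse(TABLE, PACKET)
```

Here the two reserved bytes never get a row of their own; they are skipped through the garbage length of msg_type.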
Related files
How to realize the system
- Determine the size of the field table
  - Start with one root node in the protocol parse tree
  - Iteratively substitute each complex field with multiple simpler fields
- Determine the FieldLength function
  - Retrieve the information from the Type class
  - Type::attr_length_expr_, Type::attr_oneline_, etc.
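The table-sizing step above can be sketched as an iterative flattening of the parse tree. This is a Python illustration under assumptions: the nested-dict tree representation, the flatten function, and the example message are hypothetical and are not how Binpac's Type class stores fields.

```python
def flatten(root):
    """Substitute complex fields with their children until only simple fields remain."""
    fields = [root]                          # start with the single root node
    while any("children" in f for f in fields):
        expanded = []
        for f in fields:
            # A complex field is replaced in place by its simpler sub-fields.
            expanded.extend(f["children"] if "children" in f else [f])
        fields = expanded
    return fields

# Hypothetical message layout: a header record followed by a payload.
MESSAGE = {
    "name": "message",
    "children": [
        {"name": "header", "children": [{"name": "msg_type"}, {"name": "length"}]},
        {"name": "payload"},
    ],
}

FLAT = flatten(MESSAGE)
TABLE_SIZE = len(FLAT)   # one table row per simple field
```

The loop runs once per nesting level, so the field order of the original specification is preserved in the flattened list.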
How to realize the system, cont’d
- Determine the GarbageLength function
  - Before compression, GarbageLength returns “0” for every field
- Compress the table
  - Look ahead at the subsequent fields
  - Merge the lengths of unused fields into the garbage length of the field that precedes them
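The compression step can be sketched as a single pass over the flattened fields. This is a simplified Python illustration: it assumes every unused field has a fixed length, whereas the design uses a GarbageLength function so that lengths can be computed at parse time. The compress function, the field names, and the separate count for unused bytes before the first needed field are all assumptions.

```python
def compress(fields, needed):
    """Fold unused fields into the garbage length of the preceding needed field.

    fields: list of (name, length_in_bytes) in wire order
    needed: set of field names that matching actually uses
    """
    table = []
    leading_garbage = 0   # unused bytes before the first needed field
    for name, length in fields:
        if name in needed:
            # Before compression, every kept field starts with garbage 0.
            table.append({"name": name, "length": length, "garbage": 0})
        elif table:
            table[-1]["garbage"] += length
        else:
            leading_garbage += length
    return leading_garbage, table

# Hypothetical fixed-length header; only two fields matter for matching.
FIELDS = [("version", 1), ("flags", 2), ("msg_type", 1),
          ("reserved", 4), ("checksum", 2), ("length", 2)]
LEADING, COMPRESSED = compress(FIELDS, {"msg_type", "length"})
```

In this example six fields shrink to two rows: msg_type carries a garbage length of 6 (reserved plus checksum), and the 3 bytes of version and flags are counted as leading garbage.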
Snapshot for generated code
Demo
Questions? Suggestions?