ANTLR with ASTs
Abstract Syntax Trees
ANTLR can be instructed to produce ASTs as the output of the parser.
ANTLR uses a prefix notation for representing ASTs.
A tree like this:

    +
   / \
  3   4

has the text representation (+ 3 4).
Abstract Syntax Trees
A tree like this:

    +
   / \
  3   *
     / \
    4   5

has the following text representation: (+ 3 (* 4 5))
Every tree can be represented in a linear format using this prefix notation.
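The prefix notation can be produced by a short recursive routine: a leaf prints as its own text, and an interior node prints as an opening parenthesis, the root, each child, and a closing parenthesis. The sketch below is only an illustration. The PrefixDemo class and its Node type are hypothetical and are not part of the ANTLR runtime.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixDemo {
    // Hypothetical minimal tree node, used only for this illustration.
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // Leaves print as their text; interior nodes print as (root child1 child2 ...).
    static String toPrefix(Node n) {
        if (n.children.isEmpty()) {
            return n.text;
        }
        StringBuilder sb = new StringBuilder("(").append(n.text);
        for (Node c : n.children) {
            sb.append(' ').append(toPrefix(c));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node tree = new Node("+", new Node("3"),
                new Node("*", new Node("4"), new Node("5")));
        System.out.println(toPrefix(tree)); // prints (+ 3 (* 4 5))
    }
}
```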
ASTs
Instead of recognizing a token sequence and firing off an action, we generate an AST.
By instructing the parser to output ASTs, we can create an input stream called a CommonTreeNodeStream, which contains a sequence of tokens that captures the tree structure.
We can then write a second, “tree walking” grammar that describes that sequence of input tokens, with actions that process the trees.
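To see how a flat sequence can capture tree structure, it may help to serialize a tree by hand. ANTLR 3 tree node streams mark the start and end of each child list with imaginary DOWN and UP navigation nodes. The sketch below mimics that idea with plain strings. It is a conceptual illustration only, not the runtime's implementation, and the NodeStreamDemo class and its Node type are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class NodeStreamDemo {
    // Hypothetical minimal tree node, used only for this illustration.
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // Depth-first serialization: emit the root, then DOWN, the children, and UP
    // whenever the node has children. Leaves emit only their own text.
    static void flatten(Node n, List<String> out) {
        out.add(n.text);
        if (n.children.isEmpty()) {
            return;
        }
        out.add("DOWN");
        for (Node c : n.children) {
            flatten(c, out);
        }
        out.add("UP");
    }

    public static void main(String[] args) {
        Node tree = new Node("+", new Node("3"),
                new Node("*", new Node("4"), new Node("5")));
        List<String> stream = new ArrayList<>();
        flatten(tree, stream);
        System.out.println(stream); // prints [+, DOWN, 3, *, DOWN, 4, 5, UP, UP]
    }
}
```

A tree grammar rule then matches this flat sequence in much the same way a parser rule matches ordinary tokens.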
Tree Walkers
The trees emitted by the parser are much more concise than the parse tree used to discover the structure in the original token sequence.
Since the structure is simpler, a tree walker grammar usually has far fewer rules than the original grammar.
Tree Walkers
Interpreting ASTs is much faster than interpreting the original token sequence.
You will want to generate ASTs for most grammars.
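As a rough picture of what interpreting an AST involves, the sketch below evaluates an arithmetic tree with one recursive method. Precedence and grouping are already encoded in the tree shape, so the walker needs only one case per operator. This is hand-written Java standing in for the code a tree grammar would generate. The AstEvaluator class and its Node type are hypothetical, and only + and * on integers are assumed.

```java
final class AstEvaluator {
    // Hypothetical binary AST node: leaves carry an integer literal,
    // interior nodes carry an operator and exactly two children.
    static final class Node {
        final String text;
        final Node left;
        final Node right;

        Node(String text) {
            this(text, null, null);
        }

        Node(String text, Node left, Node right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }
    }

    // Walk the tree bottom-up: evaluate both operands, then apply the operator.
    static int eval(Node n) {
        if (n.left == null) {
            return Integer.parseInt(n.text);
        }
        int l = eval(n.left);
        int r = eval(n.right);
        switch (n.text) {
            case "+":
                return l + r;
            case "*":
                return l * r;
            default:
                throw new IllegalArgumentException("unknown operator: " + n.text);
        }
    }

    public static void main(String[] args) {
        // The tree (+ 3 (* 4 5))
        Node tree = new Node("+", new Node("3"),
                new Node("*", new Node("4"), new Node("5")));
        System.out.println(eval(tree)); // prints 23
    }
}
```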
Translation Data Flow
Characters -> Lexer -> Tokens -> Parser -> AST -> Tree Walker -> Output
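The whole data flow can be imitated in a few dozen lines of hand-written Java. The sketch below is not ANTLR-generated code: the PipelineDemo class, its tiny expression language (integers, + and *), and the choice of postfix text as the output are all assumptions made for illustration. Each stage consumes only the product of the stage before it.

```java
import java.util.ArrayList;
import java.util.List;

final class PipelineDemo {
    // Hypothetical binary AST node; leaves have no children.
    static final class Node {
        final String text;
        final Node left;
        final Node right;

        Node(String text, Node left, Node right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }
    }

    // Lexer: characters -> tokens (integers, '+', '*'); whitespace is skipped.
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            } else if (c == '+' || c == '*') {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }

    // Parser: tokens -> AST, for  expr : term ('+' term)* ;  term : INT ('*' INT)* ;
    private final List<String> tokens;
    private int pos = 0;

    PipelineDemo(List<String> tokens) {
        this.tokens = tokens;
    }

    Node expr() {
        Node left = term();
        while (pos < tokens.size() && tokens.get(pos).equals("+")) {
            pos++;
            left = new Node("+", left, term());
        }
        return left;
    }

    Node term() {
        Node left = atom();
        while (pos < tokens.size() && tokens.get(pos).equals("*")) {
            pos++;
            left = new Node("*", left, atom());
        }
        return left;
    }

    Node atom() {
        if (pos >= tokens.size() || !Character.isDigit(tokens.get(pos).charAt(0))) {
            throw new IllegalArgumentException("integer expected at token " + pos);
        }
        return new Node(tokens.get(pos++), null, null);
    }

    // Tree walker: AST -> output, here the expression rewritten in postfix form.
    static String toPostfix(Node n) {
        if (n.left == null) {
            return n.text;
        }
        return toPostfix(n.left) + " " + toPostfix(n.right) + " " + n.text;
    }

    // Runs all three stages in order.
    static String translate(String input) {
        PipelineDemo parser = new PipelineDemo(lex(input));
        Node ast = parser.expr();
        if (parser.pos != parser.tokens.size()) {
            throw new IllegalArgumentException("unexpected token at position " + parser.pos);
        }
        return toPostfix(ast);
    }

    public static void main(String[] args) {
        System.out.println(translate("3 + 4 * 5")); // prints 3 4 5 * +
    }
}
```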