CS 280 Data Structures Professor John Peterson
Lexer Project
Questions?
Must be in by Friday. Solutions will be posted after class.
The next project is nearly ready; look in the wiki.
Parse Trees
One of the big deals in computer science is context-free languages. We use these to create recursive structures (trees) from linear ones (strings or sequences). There is a whole lot of theory underneath; we’ll skip most of it and concentrate on the practical stuff.
Example: English
The Problem
Given:
  A sequence of tokens
  A grammar that gives structure to these tokens
Produce:
  A parse tree that covers the sequence
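To make the input and output concrete, here is a minimal Python sketch. The `Node` class, the `leaves` helper, and the hand-built tree are illustrative assumptions, not part of the course project code. The sketch reads "covers the sequence" as: the leaves of the tree, read left to right, are exactly the input tokens.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One parse tree node: a grammar name (or a token) plus its children."""
    label: str
    children: list = field(default_factory=list)


def leaves(node):
    """Return the labels of the leaf nodes, left to right."""
    if not node.children:
        return [node.label]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


# Input: a linear sequence of tokens.
tokens = ["a", "=", "b", ";"]

# Output: a tree built by hand here; a parser would build it from the grammar.
tree = Node("assignment", [
    Node("var", [Node("a")]),
    Node("="),
    Node("var", [Node("b")]),
    Node(";"),
])

# The tree covers the sequence when its leaves match the tokens.
covers = leaves(tree) == tokens
```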
Grammars
Names: the left side of a production is a name, and this name can be used in other productions
Constants: specific pieces of the underlying token-level language
Sequence: x y means that y follows x
Choice: (x | y) means either x or y may appear here
Optionals: [x] means x may appear here, or may be left out
Repetition: (x)* means x may be repeated any number of times, including zero
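One way to see what each piece of notation means is to write a small matcher that interprets it. The sketch below is an assumption-laden illustration, not the course's implementation: productions are encoded as nested tuples with made-up tags ("const", "name", "seq", "choice", "opt", "star"), tokens are plain strings, and the bracketed-list grammar is invented for the example.

```python
def match(rule, tokens, pos, grammar):
    """Try to match rule at tokens[pos:]; return the new position or None."""
    kind = rule[0]
    if kind == "const":      # a specific token
        if pos < len(tokens) and tokens[pos] == rule[1]:
            return pos + 1
        return None
    if kind == "name":       # use of another production's name
        return match(grammar[rule[1]], tokens, pos, grammar)
    if kind == "seq":        # x y: y follows x
        for part in rule[1:]:
            pos = match(part, tokens, pos, grammar)
            if pos is None:
                return None
        return pos
    if kind == "choice":     # (x | y): take the first alternative that fits
        for alternative in rule[1:]:
            end = match(alternative, tokens, pos, grammar)
            if end is not None:
                return end
        return None
    if kind == "opt":        # [x]: x may appear, or may be left out
        end = match(rule[1], tokens, pos, grammar)
        return pos if end is None else end
    if kind == "star":       # (x)*: zero or more x's
        while True:
            end = match(rule[1], tokens, pos, grammar)
            if end is None or end == pos:   # no further progress: stop
                return pos
            pos = end
    raise ValueError(f"unknown rule kind: {kind}")


def accepts(start, tokens, grammar):
    """True if the production named start matches the whole token list."""
    return match(("name", start), tokens, 0, grammar) == len(tokens)


# Hypothetical grammar:
#   list  = '[' [items] ']'
#   items = item (',' item)*
#   item  = 'x' | 'y'
grammar = {
    "list": ("seq", ("const", "["), ("opt", ("name", "items")), ("const", "]")),
    "items": ("seq", ("name", "item"),
              ("star", ("seq", ("const", ","), ("name", "item")))),
    "item": ("choice", ("const", "x"), ("const", "y")),
}
```

With this encoding, `accepts("list", "[ x , y ]".split(), grammar)` and `accepts("list", "[ ]".split(), grammar)` both succeed, while `"[ x , ]"` is rejected because an item must follow each comma.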
Example: Java
Tokens: a = a + b * c;
Grammar:
  statement = assignment
  assignment = var '=' addexp ';'
  addexp = mulexp ('+' mulexp)*
  mulexp = aexp ('*' aexp)*
  aexp = var | num | '(' addexp ')'
How Does this Work?
You need to know where to start (here, “statement”).
This grammar is constructed so that you can always decide what to do based on the next token (peek).
When you have a choice, always go as far as possible: keep matching a repetition or an optional part for as long as the next token fits.
If you get to a place where the current token doesn’t fit into the grammar, you have a “parse error”.
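These rules can be sketched as a small recursive descent parser for the Java assignment grammar, with one method per production. This is an illustration under stated assumptions, not the project solution: tokens are plain strings, `str.isidentifier` and `str.isdigit` stand in for the lexer's var and num token kinds, and the names `Parser`, `ParseError`, and `parse` are made up for the example. The tree is returned as nested tuples.

```python
class ParseError(Exception):
    """Raised when the current token does not fit the grammar."""


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it (None at end of input)."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        """Consume and return the next token."""
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, token):
        """Consume the given constant token or report a parse error."""
        if self.peek() != token:
            raise ParseError(f"expected {token!r} but found {self.peek()!r}")
        return self.advance()

    def statement(self):   # statement = assignment
        return ("statement", self.assignment())

    def assignment(self):  # assignment = var '=' addexp ';'
        var = self.var()
        self.expect("=")
        value = self.addexp()
        self.expect(";")
        return ("assignment", var, "=", value, ";")

    def addexp(self):      # addexp = mulexp ('+' mulexp)*
        children = [self.mulexp()]
        while self.peek() == "+":   # go as far as possible
            children.append(self.advance())
            children.append(self.mulexp())
        return ("addexp", *children)

    def mulexp(self):      # mulexp = aexp ('*' aexp)*
        children = [self.aexp()]
        while self.peek() == "*":
            children.append(self.advance())
            children.append(self.aexp())
        return ("mulexp", *children)

    def aexp(self):        # aexp = var | num | '(' addexp ')'
        token = self.peek()   # the next token alone picks the alternative
        if token == "(":
            self.advance()
            inner = self.addexp()
            self.expect(")")
            return ("aexp", "(", inner, ")")
        if token is not None and token.isdigit():
            return ("aexp", ("num", self.advance()))
        return ("aexp", self.var())

    def var(self):
        token = self.peek()
        if token is None or not token.isidentifier():
            raise ParseError(f"expected a variable but found {token!r}")
        return ("var", self.advance())


def parse(tokens):
    """Parse one statement, starting from the 'statement' production."""
    parser = Parser(tokens)
    tree = parser.statement()
    if parser.peek() is not None:
        raise ParseError(f"unexpected token {parser.peek()!r} after statement")
    return tree
```

Calling `parse("a = a + b * c ;".split())` returns a tree in which `b * c` sits under a single mulexp node, so the multiplication is grouped before the addition. An input such as `a = + ;` raises `ParseError`, because `+` cannot start an aexp.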