CS /11/12 Matthew Rodgers
What are LL and LR parsers? What grammars do they parse? What is the difference between LL and LR? Why do we care?
Top-Down
o LL(k) parsers are top-down parsers
o LL(1) parsing is deterministic
o This is probably the way of parsing grammars that you are most familiar with
Bottom-Up
o LR(k) parsers are bottom-up parsers
o The languages that have an LR(k) grammar are exactly the deterministic context-free languages
o Any language that has an LR(k) grammar, for some k, also has an LR(1) grammar
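Consider the following grammar:
o (1) S → F
o (2) S → (S+F)
o (3) F → a
Input: (a+a)
The parsing table for this grammar is shown below. Each entry is the number of the rule to apply for that nonterminal and input symbol.

     (    )    a    +    $
S    2         1
F              3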
The stack is initialized with the start symbol, S, and the top of the stack is compared to the first symbol of the input
Since the top of the stack is the nonterminal S and not (, the parser looks at the table to see which rule to apply
After applying the rule, it tries again
It now finds ( both at the front of the input and on top of the stack, so it removes both
The parser continues in this way until it reaches the end symbol, $, or rejects the string
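The procedure above can be sketched as a small table-driven parser. This is a minimal illustration, not a definitive implementation: the names `RULES`, `TABLE` and `ll1_parse` are made up for this example, each input character is treated as one token, and the rule numbers are assumed to be 1, 2, 3 in the order the grammar is listed.

```python
# Grammar rules, numbered in the order they are listed (assumed numbering).
RULES = {
    1: ("S", ["F"]),
    2: ("S", ["(", "S", "+", "F", ")"]),
    3: ("F", ["a"]),
}

# TABLE[nonterminal][next input symbol] -> number of the rule to apply.
TABLE = {
    "S": {"(": 2, "a": 1},
    "F": {"a": 3},
}


def ll1_parse(text):
    """Return the list of rule numbers applied, or raise SyntaxError."""
    tokens = list(text) + ["$"]      # one character per token, then the end symbol
    stack = ["$", "S"]               # the top of the stack is the end of the list
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in TABLE:             # nonterminal on top: consult the table
            rule = TABLE[top].get(look)
            if rule is None:
                raise SyntaxError(f"unexpected {look!r} at position {pos}")
            applied.append(rule)
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:            # terminal (or $) matches the input: remove both
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r} at position {pos}")
    return applied


print(ll1_parse("(a+a)"))  # [2, 1, 3, 3]
```

For the input (a+a) the parser applies rule 2 first, then rule 1, then rule 3 twice, which is the leftmost derivation of the string.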
Given an LR(1) grammar, we can produce a shift-reduce parser table
Shift: "shifts" an input symbol onto the parser's stack and builds a node in the parse tree labeled by that symbol
Reduce: "reduces" a string of symbols on top of the stack to a nonterminal symbol using a grammar rule
o When it does this, it builds the corresponding piece of the parse tree
However, many LR(1) grammars produce parse tables that are too large to be practical
Instead we use LALR parsing
Derivation   Parse Stack   Unparsed Input   Action
(1+(2+3))    ε             (1+(2+3))        Shift
(1+(2+3))    (             1+(2+3))         Shift
(1+(2+3))    (1            +(2+3))          Reduce E → num
(E+(2+3))    (E            +(2+3))          Reduce S → E
(S+(2+3))    (S            +(2+3))          Shift
(S+(2+3))    (S+           (2+3))           Shift
(S+(2+3))    (S+(          2+3))            Shift
(S+(2+3))    (S+(2         +3))             Reduce E → num
(S+(E+3))    (S+(E         +3))             Reduce S → E
(S+(S+3))    (S+(S         +3))             Shift
(S+(S+3))    (S+(S+        3))              Shift
Etc.

Grammar:
o S → S + E | E
o E → num | (S)
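The trace above can be reproduced with a short hand-written shift-reduce loop. This is only a sketch under stated assumptions: it is not a table-driven LR parser, it simply reduces whenever the top of the stack matches a right-hand side and shifts otherwise, which happens to be enough for this particular grammar. The function name `shift_reduce` is made up for this example, and each token is assumed to be a single character, with any digit standing for num.

```python
def shift_reduce(tokens):
    """Parse with S -> S + E | E and E -> num | ( S ).

    Returns the list of actions taken, or raises SyntaxError.
    """
    stack, actions = [], []
    rest = list(tokens)

    def try_reduce():
        # Try each right-hand side against the top of the stack.
        if stack and stack[-1].isdigit():
            stack[-1:] = ["E"]
            return "reduce E -> num"
        if stack[-3:] == ["(", "S", ")"]:
            stack[-3:] = ["E"]
            return "reduce E -> ( S )"
        if stack[-3:] == ["S", "+", "E"]:
            stack[-3:] = ["S"]
            return "reduce S -> S + E"
        if stack[-1:] == ["E"]:
            stack[-1:] = ["S"]
            return "reduce S -> E"
        return None

    while True:
        action = try_reduce()
        if action is None:
            if not rest:
                break                      # nothing to reduce and nothing to shift
            stack.append(rest.pop(0))      # shift the next input symbol
            action = "shift"
        actions.append(action)

    if stack != ["S"]:
        raise SyntaxError(f"could not reduce input to S; stack is {stack}")
    return actions


for step in shift_reduce("(1+(2+3))"):
    print(step)
```

The first eleven actions printed match the Action column of the table; the remaining steps finish the reductions that the slide abbreviates as "Etc."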
First we shall define a simple grammar
o (1) E → E * B
o (2) E → E + B
o (3) E → B
o (4) B → 0
o (5) B → 1
We also add a new rule, (0) S → E, which the parser uses as its final accepting rule
To create a parsing table for this grammar we must introduce a special symbol, ∙, which marks the parser's current position within a rule: the symbols to its left have already been read from the input, and the symbols to its right are what the parser expects next
E.g. E → E ∙ + B
o This shows that the E has already been processed and the parser is looking for a + symbol next
A rule with a dot in it is called an item
There is one item for each position the dot can take along the right-hand side of the rule
Since a parser may not know in advance which grammar rule to use, when creating our table we must use sets of items to consider all the possibilities
E.g.
o S → ∙ E
o E → ∙ E * B
o E → ∙ E + B
o E → ∙ B
o B → ∙ 0
o B → ∙ 1
The first line is the initial item of the item set. Since we need to consider all possibilities whenever the dot comes before a nonterminal, we must take the closure of the set: here the dot is before E, so we add every rule for E with the dot at the start. (By extension, we must do the same for B, as shown by the 5th and 6th items.)
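The closure step can be sketched as follows. This is an illustrative sketch, not a definitive implementation: an item is represented as a pair (rule index, dot position), and the names `GRAMMAR`, `closure` and `show` are assumptions made for this example.

```python
# Rules indexed 0..5; rule 0 is the added accepting rule S -> E.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "*", "B")),
    ("E", ("E", "+", "B")),
    ("E", ("B",)),
    ("B", ("0",)),
    ("B", ("1",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add every item implied by a dot standing before a nonterminal.

    An item is a pair (rule index, dot position).
    """
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # The dot is before a nonterminal: add all of its rules, dot at the start.
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return result


def show(item):
    """Format an item such as (1, 1) in dotted notation."""
    rule, dot = item
    lhs, rhs = GRAMMAR[rule]
    symbols = list(rhs)
    symbols.insert(dot, "∙")
    return f"{lhs} → {' '.join(symbols)}"


for item in sorted(closure({(0, 0)})):
    print(show(item))
```

Starting from the single item S → ∙ E, the loop adds the three E rules and then the two B rules, giving the six items listed above.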
Set 0
o S → ∙ E
o E → ∙ E * B
o E → ∙ E + B
o E → ∙ B
o B → ∙ 0
o B → ∙ 1
Set 1
o B → 0 ∙
Set 2
o B → 1 ∙
Set 3
o S → E ∙
o E → E ∙ * B
o E → E ∙ + B
Set 4
o E → B ∙
Set 5
o E → E * ∙ B
o B → ∙ 0
o B → ∙ 1
Set 6
o E → E + ∙ B
o B → ∙ 0
o B → ∙ 1
Set 7
o E → E * B ∙
Set 8
o E → E + B ∙
Item Set   *    +    0    1    E    B
0                    1    2    3    4
1
2
3          5    6
4
5                    1    2         7
6                    1    2         8
7
8

Each transition can be found by following an item set to the new item set that is created from it
o E.g. item set 7 is spawned from item set 5 by moving the dot over B
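The item sets and their transitions can also be computed mechanically. The sketch below is illustrative only: items are pairs (rule index, dot position), the names `GRAMMAR`, `closure`, `goto` and `item_sets` are assumptions made for this example, and symbols are tried in the column order * + 0 1 E B so that the sets come out numbered 0 to 8 as on the slides.

```python
GRAMMAR = [
    ("S", ("E",)),            # rule 0, the added accepting rule
    ("E", ("E", "*", "B")),
    ("E", ("E", "+", "B")),
    ("E", ("B",)),
    ("B", ("0",)),
    ("B", ("1",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = ["*", "+", "0", "1", "E", "B"]


def closure(items):
    """Close a set of (rule index, dot position) items."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (rule, dot + 1)
        for rule, dot in items
        if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol
    }
    return closure(moved)


def item_sets():
    """Return (list of item sets, {(set number, symbol): target set number})."""
    sets = [closure({(0, 0)})]
    transitions = {}
    current = 0
    while current < len(sets):
        for symbol in SYMBOLS:
            target = goto(sets[current], symbol)
            if not target:
                continue                   # no item has the dot before this symbol
            if target not in sets:
                sets.append(target)        # a new item set is spawned
            transitions[(current, symbol)] = sets.index(target)
        current += 1
    return sets, transitions


sets, transitions = item_sets()
print(len(sets))                 # 9
print(transitions[(5, "B")])     # 7
```

The last line shows the transition mentioned above: item set 7 is reached from item set 5 on B.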
After creating the item sets and the transitions, follow the steps below to finish the table
1) The columns for nonterminals are copied to the goto table.
2) The columns for the terminals are copied to the action table as shift actions.
3) An extra column for '$' (end of input) is added to the action table; it contains acc for every item set that contains the item S → E ∙.
4) If an item set i contains an item of the form A → w ∙, and A → w is rule m with m > 0, then the row for state i in the action table is completely filled with the reduce action rm.
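The four steps above can be sketched in code. This is a hedged illustration: the `TRANSITIONS` and `COMPLETED` dictionaries are typed in by hand from the item sets for this grammar rather than computed, and all names (`build_tables` included) are assumptions made for this example.

```python
TERMINALS = ["*", "+", "0", "1"]
NONTERMINALS = ["E", "B"]

# (item set, symbol) -> item set, read off the item sets for this grammar.
TRANSITIONS = {
    (0, "0"): 1, (0, "1"): 2, (0, "E"): 3, (0, "B"): 4,
    (3, "*"): 5, (3, "+"): 6,
    (5, "0"): 1, (5, "1"): 2, (5, "B"): 7,
    (6, "0"): 1, (6, "1"): 2, (6, "B"): 8,
}

# Item sets that contain an item with the dot at the end -> that rule's number.
COMPLETED = {1: 4, 2: 5, 3: 0, 4: 3, 7: 1, 8: 2}


def build_tables(n_states=9):
    action = {state: {} for state in range(n_states)}
    goto = {state: {} for state in range(n_states)}
    for (state, symbol), target in TRANSITIONS.items():
        if symbol in NONTERMINALS:
            goto[state][symbol] = target           # step 1
        else:
            action[state][symbol] = ("s", target)  # step 2
    for state, rule in COMPLETED.items():
        if rule == 0:
            action[state]["$"] = ("acc",)          # step 3
        else:
            # Step 4: fill the whole row. A real generator would report a
            # conflict here if a cell were already occupied; this sketch
            # simply overwrites, which is harmless for this grammar.
            for symbol in TERMINALS + ["$"]:
                action[state][symbol] = ("r", rule)
    return action, goto


action, goto = build_tables()
print(action[3])   # {'*': ('s', 5), '+': ('s', 6), '$': ('acc',)}
print(goto[0])     # {'E': 3, 'B': 4}
```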
        Action                     Goto
State   *    +    0    1    $      E    B
0                 s1   s2          g3   g4
1       r4   r4   r4   r4   r4
2       r5   r5   r5   r5   r5
3       s5   s6             acc
4       r3   r3   r3   r3   r3
5                 s1   s2               g7
6                 s1   s2               g8
7       r1   r1   r1   r1   r1
8       r2   r2   r2   r2   r2
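A driver loop that uses this table might look like the sketch below. It is an illustration under stated assumptions, not a definitive implementation: the reduce rows are filled in every action column as step 4 describes, `RULES` records only each rule's left-hand side and right-hand-side length, each input character is one token, and the names `ACTION`, `GOTO`, `RULES` and `lr_parse` are made up for this example.

```python
def row(entry):
    """A table row with the same reduce action in every action column."""
    return dict.fromkeys("*+01$", entry)


ACTION = {
    0: {"0": "s1", "1": "s2"},
    1: row("r4"),
    2: row("r5"),
    3: {"*": "s5", "+": "s6", "$": "acc"},
    4: row("r3"),
    5: {"0": "s1", "1": "s2"},
    6: {"0": "s1", "1": "s2"},
    7: row("r1"),
    8: row("r2"),
}
GOTO = {0: {"E": 3, "B": 4}, 5: {"B": 7}, 6: {"B": 8}}

# rule number -> (left-hand side, length of the right-hand side)
RULES = {1: ("E", 3), 2: ("E", 3), 3: ("E", 1), 4: ("B", 1), 5: ("B", 1)}


def lr_parse(text):
    """Return the rule numbers in the order they are reduced, or raise SyntaxError."""
    tokens = list(text) + ["$"]
    states = [0]                     # stack of states; the top is the end of the list
    pos = 0
    reductions = []
    while True:
        act = ACTION[states[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return reductions
        kind, number = act[0], int(act[1:])
        if kind == "s":              # shift: push the new state, consume the token
            states.append(number)
            pos += 1
        else:                        # reduce: pop one state per right-hand-side symbol
            lhs, length = RULES[number]
            del states[-length:]
            states.append(GOTO[states[-1]][lhs])
            reductions.append(number)


print(lr_parse("1+1"))  # [5, 3, 5, 2]
```

For the input 1+1 the parser reduces by rules 5, 3, 5 and 2 before accepting, which is the rightmost derivation of the string in reverse.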
"Lookahead" LR parsing: a deterministic, shift-reduce parser
Most practical (non-natural) languages can be described by an LALR grammar
LALR parser tables are fairly small
Yacc is a parser-generation tool that creates LALR parsers
"LL Parser." Wikipedia. Wikimedia Foundation, 11 Sept Web. 12 Nov "LR Parser." Wikipedia. Wikimedia Foundation, 11 July Web. 12 Nov Rich, Elaine. "Context-Free Parsing." Automata, Computability and Complexity: Theory and Applications. Upper Saddle River, NJ: Pearson Prentice Hall, Print.