CS 404 Introduction to Compiler Design
Lecture 4
Ahmed Ezzat
Top-Down Parsing LL(1), Bottom-Up Parsing LR
1. Top-down Parsing
  Predictive: try to guess which production rule to apply next, given
    The current non-terminal symbol
    One or more look-ahead terminal symbols
  Two ways to do predictive parsing
    Use recursive procedures
    Use a predictive parsing table
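
The "recursive procedures" approach can be sketched as one procedure per non-terminal, each choosing a production by looking at the next token. This is a minimal sketch, not the lecture's own code: it assumes the left-factored expression grammar used later in this lecture (E → T X, X → + E | ε, T → ( E ) | int Y, Y → * T | ε), and the names PredictiveParser, peek, match and accepts are hypothetical.

```python
class PredictiveParser:
    """Recursive-descent recognizer (assumed grammar, see lead-in):
    E -> T X ; X -> + E | eps ; T -> ( E ) | int Y ; Y -> * T | eps
    """

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Consume one token, or fail if the lookahead is not the expected one.
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def parse_E(self):
        self.parse_T()
        self.parse_X()

    def parse_X(self):
        if self.peek() == "+":      # X -> + E
            self.match("+")
            self.parse_E()
        # any other lookahead: X -> eps, consume nothing

    def parse_T(self):
        if self.peek() == "(":      # T -> ( E )
            self.match("(")
            self.parse_E()
            self.match(")")
        elif self.peek() == "int":  # T -> int Y
            self.match("int")
            self.parse_Y()
        else:
            raise SyntaxError(f"unexpected token {self.peek()!r}")

    def parse_Y(self):
        if self.peek() == "*":      # Y -> * T
            self.match("*")
            self.parse_T()
        # any other lookahead: Y -> eps


def accepts(tokens):
    """Return True if the token list is a sentence of the assumed grammar."""
    parser = PredictiveParser(tokens)
    try:
        parser.parse_E()
        parser.match("$")
        return True
    except SyntaxError:
        return False
```

Each procedure inspects only one look-ahead token and never backtracks, which is what makes the parser predictive.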
LL(1) Grammar
  A restricted class of grammars that can be parsed with no need to backtrack
  The parser uses an explicit stack rather than recursive calls
  LL(k) parsing means that k tokens of lookahead are used
  LL(1):
    L: scan the input string from left to right
    L: a left-most derivation is applied at each step
    1: one input symbol of lookahead
Two Separate Steps
  Construct the LL(1) parsing table
    Compute FIRST and FOLLOW
    Construct the parsing table
  Parse strings using the parsing table
    Parsing stack: holds grammar symbols (non-terminals and tokens)
FIRST and FOLLOW sets
  FIRST(α) contains every terminal that can begin a string derived from α (and ε if α can derive ε)
  FOLLOW(A) contains every terminal that can appear immediately after A in a valid sentential form
Compute FIRST
  If X is a terminal, then FIRST(X) = {X}
  If X → ε is a production, then add ε to FIRST(X)
  If X is a non-terminal and X → Y1 Y2 … Yk is a production, then add terminal z to FIRST(X) if, for some i, z is in FIRST(Yi) and ε is in FIRST(Yj) for all j < i
  If ε is in FIRST(Yj) for all j = 1 … k, then add ε to FIRST(X)
Compute FIRST for a String α
  (FI4) For α = X1 X2 … Xn
    Add all non-ε symbols of FIRST(X1) to FIRST(α)
    Add all non-ε symbols of FIRST(Xj) to FIRST(α) if ε is in FIRST(Xi) for all i < j
    Add ε to FIRST(α) if ε is in FIRST(Xi) for all i
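
The FIRST rules above can be sketched as a fixed-point iteration: keep applying the rules until no set grows. This is a minimal sketch under assumptions: the grammar is the left-factored expression grammar from later in this lecture, encoded as a dictionary, and the names GRAMMAR, first_of_string and compute_first are hypothetical.

```python
EPS = "ε"

# Assumed encoding: non-terminal -> list of right-hand sides (lists of symbols).
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], [EPS]],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], [EPS]],
}


def first_of_string(symbols, first):
    """FIRST of a symbol string X1 X2 ... Xn, given FIRST of each non-terminal."""
    result = set()
    for sym in symbols:
        # A symbol that is not a non-terminal is its own FIRST set.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result        # Xi cannot vanish, so later symbols are hidden
    result.add(EPS)              # every Xi can derive ε
    return result


def compute_first(grammar):
    """Iterate the FIRST rules until no set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

For the assumed grammar, compute_first(GRAMMAR) yields FIRST(E) = FIRST(T) = {(, int}, FIRST(X) = {+, ε} and FIRST(Y) = {*, ε}.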
Compute FOLLOW
  (FO1) Put $ in FOLLOW(S), where S is the start symbol ($ is called the endmarker)
  (FO2) If A → αBβ, then put FIRST(β) (except ε) into FOLLOW(B)
  (FO3) If A → αB, then put FOLLOW(A) into FOLLOW(B)
  (FO4) If A → αBβ and β ⇒* ε, then put FOLLOW(A) into FOLLOW(B)
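
Rules FO1 to FO4 can be sketched the same way, as a loop that repeats until no FOLLOW set grows. This is a sketch under assumptions: the grammar is the left-factored expression grammar from later in this lecture, its FIRST sets are written out by hand rather than computed, and the names GRAMMAR, FIRST, first_of_string and compute_follow are hypothetical.

```python
EPS, END = "ε", "$"

# Assumed grammar encoding and hand-computed FIRST sets.
GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], [EPS]],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], [EPS]],
}
FIRST = {"E": {"(", "int"}, "X": {"+", EPS}, "T": {"(", "int"}, "Y": {"*", EPS}}


def first_of_string(symbols):
    """FIRST of a symbol string; the empty string yields {ε}."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # terminal: FIRST(x) = {x}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                          # FO1
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            for rhs in productions:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:          # only non-terminals have FOLLOW
                        continue
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}        # FO2
                    if EPS in beta_first:           # FO3 (β empty) and FO4 (β ⇒* ε)
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

With start symbol E this gives FOLLOW(E) = FOLLOW(X) = {), $} and FOLLOW(T) = FOLLOW(Y) = {+, ), $}.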
Predictive Parsing and Left factoring Example
  Assume the following grammar:
    E → T + E | T
    T → int | int * T | (E)
  Hard to predict because:
    For T, two productions start with int
    For E, both productions start with T, so it is not clear which one to choose
  The grammar must be left-factored before being used for predictive parsing
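
Left factoring one non-terminal can be sketched as: group its productions by first symbol, pull out the longest common prefix of each group, and move the differing tails to a fresh non-terminal. This is a simplified sketch, not a general algorithm: it performs a single pass, assumes at most one group of productions shares a prefix, and the names common_prefix, left_factor and fresh_name are hypothetical.

```python
EPS = "ε"


def common_prefix(sequences):
    """Longest prefix shared by all symbol lists."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(nonterminal, productions, fresh_name):
    """One pass of left factoring for a single non-terminal.

    Assumption: at most one group of productions shares a first symbol,
    so a single fresh non-terminal name is enough.
    """
    groups = {}
    for rhs in productions:
        groups.setdefault(rhs[0], []).append(rhs)

    factored = {nonterminal: []}
    for group in groups.values():
        if len(group) == 1:
            factored[nonterminal].append(group[0])   # nothing to factor
            continue
        prefix = common_prefix(group)
        factored[nonterminal].append(prefix + [fresh_name])
        # Tails after the prefix; an empty tail becomes an ε-production.
        factored[fresh_name] = [rhs[len(prefix):] or [EPS] for rhs in group]
    return factored
```

For example, left_factor("E", [["T", "+", "E"], ["T"]], "X") returns E → T X with X → + E | ε.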
Predictive Parsing and Left factoring Example
  Assume the following grammar:
    E → T + E | T
    T → int | int * T | (E)
  Factor out common prefixes of productions, possibly introducing ε-productions:
    E → T X
    X → + E | ε
    T → ( E ) | int Y
    Y → * T | ε
  Resulting LL(1) parsing table (blank entries are errors):
         int     *       +       (       )       $
    E    T X                     T X
    X                    + E             ε       ε
    T    int Y                   ( E )
    Y            * T     ε               ε       ε
Construct the Parsing Table
  For each production rule A → α
    [M1] For each terminal a in FIRST(α), add A → α to M[A,a]
    [M2] If ε is in FIRST(α), add A → α to M[A,b] for each terminal b in FOLLOW(A) (b can be $)
  Undefined entries of M are error entries
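
Rules M1 and M2 can be sketched as a loop over all productions. This is a sketch under assumptions: the grammar is the left-factored expression grammar of this lecture, its FIRST and FOLLOW sets are written out by hand, and the names GRAMMAR, FIRST, FOLLOW, first_of_string and build_table are hypothetical. Raising an error on a doubly filled entry is an added check, since a grammar with such a conflict is not LL(1).

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "X"]],
    "X": [["+", "E"], [EPS]],
    "T": [["(", "E", ")"], ["int", "Y"]],
    "Y": [["*", "T"], [EPS]],
}
# Hand-computed sets for the assumed grammar.
FIRST = {"E": {"(", "int"}, "X": {"+", EPS}, "T": {"(", "int"}, "Y": {"*", EPS}}
FOLLOW = {"E": {")", END}, "X": {")", END},
          "T": {"+", ")", END}, "Y": {"+", ")", END}}


def first_of_string(symbols):
    """FIRST of a right-hand side."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # terminal (or ε): its own FIRST set
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(grammar):
    """Return M as a dict keyed by (non-terminal, terminal)."""
    table = {}
    for lhs, productions in grammar.items():
        for rhs in productions:
            rhs_first = first_of_string(rhs)
            targets = rhs_first - {EPS}              # M1
            if EPS in rhs_first:                     # M2
                targets |= FOLLOW[lhs]
            for terminal in targets:
                if (lhs, terminal) in table:
                    raise ValueError(f"not LL(1): conflict at M[{lhs},{terminal}]")
                table[lhs, terminal] = rhs
    return table  # missing keys are the error entries
```

Keys that are absent from the returned dictionary play the role of the error entries.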
Use Parsing Table to Parse
  Push $ and then the start symbol S onto the stack (S on top); append $ to the end of the input string
  X is the symbol on top of the stack, a is the current input symbol
  If X = a = $, success
  If X = a ≠ $, pop X and advance the input
  If X is a terminal and X ≠ a, error
  If X is a non-terminal
    If M[X,a] = {X → UVW}, replace X by WVU (U on top)
    If M[X,a] has no rule, error
Use Parsing Table to Parse: Example
  E → T X
  X → + E | ε
  T → ( E ) | int Y
  Y → * T | ε
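
The table-driven procedure can be sketched for this example grammar as follows. It is a minimal sketch: the table is entered by hand from rules M1 and M2, and the names TABLE, NONTERMINALS and ll1_parse are hypothetical.

```python
EPS, END = "ε", "$"

# Hand-entered LL(1) table for the example grammar; missing keys are errors.
TABLE = {
    ("E", "int"): ["T", "X"], ("E", "("): ["T", "X"],
    ("X", "+"): ["+", "E"], ("X", ")"): [EPS], ("X", END): [EPS],
    ("T", "int"): ["int", "Y"], ("T", "("): ["(", "E", ")"],
    ("Y", "*"): ["*", "T"], ("Y", "+"): [EPS],
    ("Y", ")"): [EPS], ("Y", END): [EPS],
}
NONTERMINALS = {"E", "X", "T", "Y"}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    stack = [END, start]                 # start symbol on top of $
    tokens = list(tokens) + [END]        # attach $ to the end of the input
    pos = 0
    output = []
    while True:
        top, a = stack[-1], tokens[pos]
        if top == END and a == END:
            return output                # success
        if top in NONTERMINALS:
            rhs = TABLE.get((top, a))
            if rhs is None:
                raise SyntaxError(f"no rule for M[{top},{a}]")
            stack.pop()
            # Push the right-hand side reversed so its first symbol is on top.
            stack.extend(sym for sym in reversed(rhs) if sym != EPS)
            output.append((top, rhs))
        elif top == a:
            stack.pop()                  # terminal matches: pop and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, got {a!r}")
```

Calling ll1_parse(["int", "*", "int"]) applies productions for E, T, Y, T, Y, X in that order, which is a left-most derivation of the input.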
2. Bottom-up Parsing
  Starts from the leaf nodes of the parse tree and works upward until it reaches the root node
Bottom-up Parsing
  Start with a string of terminals
  Build up from the leaves of the parse tree
  Apply production rules backwards (reduction)
  When the start symbol is reached and the input is exhausted, parsing is done
  Shift-reduce is one common type of bottom-up parsing
Bottom-up Parsing
  Shift-Reduce Parsing:
    Shift: push the next input symbol onto the stack and advance the input pointer
    Reduce: when the top of the stack holds the complete right-hand side (RHS) of a grammar rule, replace it with the rule's left-hand side (LHS)
  LR Parser: a non-recursive, shift-reduce, bottom-up parser
    SLR(1): Simple LR parser; works on the smallest class of grammars
    LR(1): works on the complete set of LR(1) grammars
    LALR(1): Look-Ahead LR parser; works on an intermediate class of grammars. The number of states is the same as in SLR(1).
Bottom-up Parsing - Example
  A bottom-up parser traces a rightmost derivation in reverse
    int * int + int
    int * T + int     (reduce by T → int)
    T + int           (reduce by T → int * T)
    T + T             (reduce by T → int)
    T + E             (reduce by E → T)
    E                 (reduce by E → T + E)
Shift-reduce Parsing
  Uses a context-free grammar (which need not be LL(1))
  Uses a stack to keep track of the symbols seen so far
  Hard to do by hand; best done with a parser generator such as Yacc
  Basic idea:
    Shift the next symbol onto the stack
    When the top of the stack contains the complete right-hand side of a production, reduce by that rule
When to Shift or Reduce?
  Reduce if the top of the stack matches the right-hand side of a production rule (a handle)
    Need to recognize handles
  If the parser cannot reduce and there is more input, shift
  If it can neither shift nor reduce, error
  Use the Action and Goto tables to help decide
Shift or Reduce Example
  Shift: move | one place to the right
    Shifts a terminal onto the left string
    ABC|XYZ ⇒ ABCX|YZ
  Reduce: apply an inverse production rule at the right end of the left string
    If A → xy is a production rule, then Cbxy|ijk ⇒ CbA|ijk
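
The two moves can be sketched on a pair of lists, the left string (the stack) and the remaining input to the right of the bar. This sketch only replays a hand-written script of moves for int * int + int, using the reductions T → int, T → int * T, E → T and E → T + E; deciding when to shift and when to reduce is the job of the Action and Goto tables and is not modelled here. The names shift, reduce_top, run and SCRIPT are hypothetical.

```python
def shift(left, right):
    """Move the bar one place right: the next terminal joins the left string."""
    if not right:
        raise ValueError("nothing left to shift")
    return left + [right[0]], right[1:]


def reduce_top(left, lhs, rhs):
    """Replace rhs at the right end of the left string by lhs."""
    if not rhs or left[-len(rhs):] != rhs:
        raise ValueError(f"{rhs} is not at the right end of {left}")
    return left[:-len(rhs)] + [lhs]


def run(tokens, script):
    """Replay a script of moves and record each configuration 'left | right'."""
    left, right = [], list(tokens)
    trace = [" ".join(left) + " | " + " ".join(right)]
    for step in script:
        if step == "shift":
            left, right = shift(left, right)
        else:
            lhs, rhs = step
            left = reduce_top(left, lhs, rhs)
        trace.append(" ".join(left) + " | " + " ".join(right))
    return left, right, trace


# Hand-written moves for: int * int + int
SCRIPT = [
    "shift", "shift", "shift",
    ("T", ["int"]),
    ("T", ["int", "*", "T"]),
    "shift", "shift",
    ("T", ["int"]),
    ("E", ["T"]),
    ("E", ["T", "+", "E"]),
]
```

Running run("int * int + int".split(), SCRIPT) ends with the left string reduced to the single symbol E and no input remaining.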
LR Parsing
  Left-to-right scan of the input
  Right-most derivation in reverse order
  Efficient, table-based shift-reduce parsing
  Can handle more grammars than LL(1)
  Can handle most programming languages
LR Parsing Data Structure
  Stack of states {S0, …, Sm}
  Action Table: Action[S,a], where S is a state and a is a terminal. Tells the parser whether to:
    Shift and go to state S'
    Reduce by a production (R)
    Accept (A) the source code, or
    Signal a syntactic error (E)
  Goto Table: Goto[S,X], where X is a non-terminal. Defines the next state after a reduction to X
LR Parsing Data Structure
  [Figure: LR Parser Model. Labels: Stack (Sm, Sm-1, …, $), Input (a1 … ai … an), LR Parsing Program, ACTION, GOTO, Output]
LR Parsing Algorithm
  Initially push S0
  Given state S on top of the stack and input symbol a
  If Action[S,a] = shift S'
    Push a, then S' onto the stack
    Move to the next input symbol
LR Parsing Algorithm (continued)
  If Action[S,a] = reduce A → X1 X2 … Xn
    Pop off n states (and n grammar symbols) to find Su on top of the stack
    Push A
    Push new state Goto[Su,A]
    Output production A → X1 X2 … Xn
  If Action[S,a] = accept, done!
  If Action[S,a] = error, error!
LR Parsing With Only States on Stack
  If Action[S,a] = shift S'
    Push S' onto the stack
  If Action[S,a] = reduce A → X1 X2 … Xn
    Pop off n states to find Su on top of the stack
    Push new state Goto[Su,A]
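
The states-only variant of the LR algorithm can be sketched as a small driver loop. Everything specific here is an assumption for illustration and not from the lecture: the toy grammar (1) S → ( S ) and (2) S → x, the hand-built ACTION and GOTO tables for it, and the names PRODUCTIONS, ACTION, GOTO and lr_parse.

```python
# Toy grammar (assumed): 1: S -> ( S )    2: S -> x
PRODUCTIONS = {1: ("S", ["(", "S", ")"]), 2: ("S", ["x"])}

# Hand-built tables for the toy grammar; missing ACTION entries are errors.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_parse(tokens):
    """Return the production numbers used (a rightmost derivation in reverse)."""
    stack = [0]                          # initially push state S0
    tokens = list(tokens) + ["$"]
    pos = 0
    output = []
    while True:
        state, a = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, a), ("error", None))
        if kind == "shift":
            stack.append(arg)            # push the new state only
            pos += 1                     # move to the next input symbol
        elif kind == "reduce":
            lhs, rhs = PRODUCTIONS[arg]
            del stack[len(stack) - len(rhs):]      # pop n states
            stack.append(GOTO[stack[-1], lhs])     # push Goto[Su, A]
            output.append(arg)                     # output the production
        elif kind == "accept":
            return output
        else:
            raise SyntaxError(f"unexpected {a!r} in state {state}")
```

For the input ( ( x ) ) the driver reports productions 2, 1, 1: first x is reduced to S, then each parenthesized S is reduced in turn.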
END