CS 321 Programming Languages and Compilers Bottom Up Parsing
Bottom-Up Parsing 2 Bottom-up Parsing: Shift-reduce parsing
Grammar H:
L → E ; L | E
E → a | b
Input: a;a;b has parse tree L(E(a) ; L(E(a) ; L(E(b)))), where X(...) denotes a tree with root X and the listed children.
Bottom-Up Parsing 3 Data for Shift-reduce Parser
Input string: sequence of tokens being checked for grammatical correctness
Stack: sentential form representing the input seen so far
Trees constructed: the parse trees built so far at the current point in the parse
Bottom-Up Parsing 4 Operations of shift-reduce parser
Shift: move input token to the set of trees as a singleton tree.
Reduce: coalesce one or more trees into a single tree (according to some production).
Accept: terminate and accept the token stream as grammatically correct.
Reject: terminate and reject the token stream.
Bottom-Up Parsing 5 A parse for grammar H
Input    Stack     Action          Trees
a;a;b    (empty)   Shift           (none)
;a;b     a         Reduce E → a    a
;a;b     E         Shift           E(a)
a;b      E ;       Shift           E(a) ;
Here E(a) denotes the tree with root E and child a.
Bottom-Up Parsing 6 A parse for grammar H
Input    Stack     Action          Trees
;b       E ;a      Reduce E → a    E(a) ; a
;b       E;E       Shift           E(a) ; E(a)
b        E;E;      Shift           E(a) ; E(a) ;
Bottom-Up Parsing 7 A parse for grammar H
Input    Stack     Action            Trees
$        E;E; b    Reduce E → b      E(a) ; E(a) ; b
$        E;E;E     Reduce L → E      E(a) ; E(a) ; E(b)
$        E;E;L     Reduce L → E;L    E(a) ; E(a) ; L(E(b))
Bottom-Up Parsing 8 A parse for grammar H
Input    Stack     Action            Trees
$        E;L       Reduce L → E;L    E(a) ; L(E(a) ; L(E(b)))
$        L         Accept            L(E(a) ; L(E(a) ; L(E(b))))
Bottom-Up Parsing 9 Bottom-up Parsing
Characteristic automaton for an “LR” grammar: tells when to shift, reduce, accept, or reject
–Handles
–Viable prefixes
Bottom-Up Parsing 10 A parse for grammar H
Input    Stack     Action            Stack+Input
a;a;b    (empty)   Shift             a;a;b
;a;b     a         Reduce E → a      a;a;b
;a;b     E         Shift             E ;a;b
a;b      E ;       Shift             E ;a;b
;b       E ;a      Reduce E → a      E ;a;b
;b       E;E       Shift             E ; E ;b
b        E;E;      Shift             E ; E ;b
$        E;E; b    Reduce E → b      E ; E ;b
$        E;E;E     Reduce L → E      E ; E ; E
$        E;E;L     Reduce L → E;L    E ; E;L
$        E;L       Reduce L → E;L    E;L
$        L         Accept            L
Bottom-Up Parsing 11 “Bottom-up” parsing?
Stack+Input, read from the last step of the parse back to the first:
L
E;L
E ; E;L
E ; E ; E
E ; E ;b
E ;a;b
a;a;b
[Figure: the parse tree L(E(a) ; L(E(a) ; L(E(b)))), built from the leaves up.]
Bottom-Up Parsing 12 Handles and viable prefixes
Stack + remaining input = sentential form
Handle = the part of the sentential form that is reduced in each step
Viable prefix = a prefix of a sentential form in a right-most derivation that does not extend beyond the end of the handle
E.g. viable prefixes for H: (E;)*(E | L | a | b) and, since any prefix of a viable prefix is itself viable, the strings (E;)* as well.
Viable prefixes form a regular set.
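Because the viable prefixes of H form a regular set, membership can be checked with an ordinary regular expression. The sketch below assumes each grammar symbol is written as a single character; the name is_viable_prefix is made up for this illustration. The optional final group also admits the bare prefixes (E;)*.

```python
import re

# Viable prefixes of H over the symbols E, L, a, b and ';'.
# (E;)* followed by at most one of E, L, a, b.
VIABLE = re.compile(r"(E;)*(E|L|a|b)?")

def is_viable_prefix(symbols: str) -> bool:
    """True if the string of grammar symbols is a viable prefix of H."""
    return VIABLE.fullmatch(symbols) is not None

print(is_viable_prefix("E;E;a"))  # a stack that can occur during a parse
print(is_viable_prefix("a;"))     # cannot occur: a must be reduced to E first
```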
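Bottom-Up Parsing 13 Characteristic Finite State Machine (CFSM)
Viable prefixes of H are recognized by this CFSM (states 0 to 6, start state 0):
0 --L--> 1, 0 --E--> 2, 0 --a--> 3, 0 --b--> 4
2 --;--> 5
5 --E--> 2, 5 --a--> 3, 5 --b--> 4, 5 --L--> 6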
Bottom-Up Parsing 14 How a Bottom-Up Parser Works
Run the CFSM on the symbols in the stack.
If a transition is possible on the incoming input symbol, then shift; otherwise reduce.
–Still need to decide which rule to use for the reduction.
Bottom-Up Parsing 15 Characteristic automaton
[Figure: the CFSM for H, states 0 to 6, start state 0.]
Viable prefixes:
In a;a;b, the prefix a leads to state 3
In E ;a;b, the prefix E ;a leads to state 3
In E;E; b, the prefix E;E; b leads to state 4
In E;E;E, the prefix E;E;E leads to state 2
In E;E;L, the prefix E;E;L leads to state 6
In E;L, the prefix E;L leads to state 6
Bottom-Up Parsing 16 Characteristic automaton
[Figure: the CFSM for H, states 0 to 6, start state 0.]
State   Action
0, 5    shift (if possible)
1       accept
2       reduce L → E if at EOF, shift otherwise
3       reduce E → a
4       reduce E → b
6       reduce L → E;L
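The state/action table can be turned into a small parser for H. This is a minimal sketch, not the course's implementation: the names GOTO, REDUCE, run_cfsm and parse are chosen for illustration, tokens are assumed to be the single characters a, b and ;, and the CFSM is re-run over the whole stack at every step, as on the earlier slide, rather than caching states.

```python
# Transitions of the CFSM for H (states 0..6, start state 0).
GOTO = {
    (0, "L"): 1, (0, "E"): 2, (0, "a"): 3, (0, "b"): 4,
    (2, ";"): 5,
    (5, "L"): 6, (5, "E"): 2, (5, "a"): 3, (5, "b"): 4,
}
# Reduction performed in each reducing state: (left side, right side).
REDUCE = {
    2: ("L", ("E",)),
    3: ("E", ("a",)),
    4: ("E", ("b",)),
    6: ("L", ("E", ";", "L")),
}

def run_cfsm(stack):
    """Run the CFSM over the stack symbols and return the state reached."""
    state = 0
    for symbol in stack:
        state = GOTO[(state, symbol)]
    return state

def parse(tokens):
    """Return the list of actions taken; raise SyntaxError on reject."""
    stack, rest, actions = [], list(tokens), []
    while True:
        state = run_cfsm(stack)
        look = rest[0] if rest else "$"   # "$" marks end of input
        if state == 1 and look == "$":
            actions.append("accept")
            return actions
        if (state, look) in GOTO:         # shift if possible
            stack.append(rest.pop(0))
            actions.append("shift")
        elif state in REDUCE and (state != 2 or look == "$"):
            # State 2 reduces only at end of input; states 3, 4, 6 always reduce.
            lhs, rhs = REDUCE[state]
            del stack[len(stack) - len(rhs):]
            stack.append(lhs)
            actions.append(f"reduce {lhs} -> {' '.join(rhs)}")
        else:
            raise SyntaxError(f"reject in state {state} at {look!r}")

for step in parse("a;a;b"):
    print(step)
```

Running it on a;a;b prints the same sequence of shifts and reductions as the parse table for H, ending in accept.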
Bottom-Up Parsing 17 Example: expression grammar
E → E + T | T
T → T * P | P
P → id
id+id+id+id has parse tree: E(E(E(E(T(P(id))) + T(P(id))) + T(P(id))) + T(P(id)))
Bottom-Up Parsing 18 A parse in this grammar
Input          Stack     Action
id+id+id+id    (empty)   Shift
+id+id+id      id        Reduce
+id+id+id      P         Reduce
+id+id+id      T         Reduce
+id+id+id      E         Shift
id+id+id       E +       Shift
+id+id         E +id     Reduce
+id+id         E + P     Reduce
+id+id         E + T     Reduce
+id+id         E         Shift
id+id          E +       Shift
Bottom-Up Parsing 19 A parse in this grammar (cont.)
Input    Stack      Action
+id      E + id     Reduce
+id      E + P      Reduce
+id      E + T      Reduce
+id      E          Shift
id       E +        Shift
$        E +id      Reduce
$        E + P      Reduce
$        E + T      Reduce
$        E          Reduce
$        E          Accept
Bottom-Up Parsing 20 Characteristic Finite State Machine
The CFSM recognizes viable prefixes (strings of grammar symbols that can appear on the stack).
[Figure: CFSM for the expression grammar, with transitions on id, E, T, P, + and *.]
Bottom-Up Parsing 21 Definitions
–Rightmost derivation
–Right-sentential form
–Handle
–Viable prefix
–Characteristic automaton
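Bottom-Up Parsing 22 Rightmost Derivation
Definition: A rightmost derivation in G is a derivation S = w_0 ⇒ w_1 ⇒ ... ⇒ w_n such that, for each step i, the rightmost non-terminal in w_i is replaced to obtain w_{i+1}.
For instance, in grammar H:
L ⇒ E;L ⇒ E;E ⇒ E;b ⇒ a;b is right-most
L ⇒ E;L ⇒ a;L ⇒ a;E ⇒ a;b is not right-most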
Bottom-Up Parsing 23 Right-sentential Forms
Definition: A right-sentential form is any sentential form that occurs in a right-most derivation.
E.g., for grammar H, any of these: L, E;L, E;E;L, E;E;E, E;E;b, E;a;b, a;a;b
Bottom-Up Parsing 24 Handles
Definition: Assume the i-th step of a rightmost derivation is:
w_i = u_i A v_i ⇒ u_i β v_i = w_{i+1}
Then (β, |u_i|) is the handle of w_{i+1}.
In an unambiguous grammar, any sentence has a unique rightmost derivation, and so we can talk about “the” handle rather than “a” handle.
Bottom-Up Parsing 25 The Plan Construct a parser by first constructing the CFSM, then constructing GOTO and ACTION tables from it. Construction has two parts: –“LR(0)” construction of the CFSM –“SLR(1)” construction of tables
Bottom-Up Parsing 26 Constructing the CFSM: States
The states in the CFSM are created by taking the “closure” of “LR(0) items”.
Given a production “L → E ; L”, these are all the induced “LR(0) items”:
L → • E ; L
L → E • ; L
L → E ; • L
L → E ; L •
Bottom-Up Parsing 27 What is “•”?
The “•” in an item such as “L → E • ; L” represents the state of the parse.
Only the part of the tree to the left of the dot (here, E) is fully developed.
Bottom-Up Parsing 28 A State in the CFSM: Closure of LR(0) Item
For a set I of LR(0) items, calculate closure(I):
1. if “A → α • B β” is in closure(I), then for every production “B → γ”, “B → • γ” is in closure(I)
2. closure(I) is the smallest set that contains I and has property (1)
Bottom-Up Parsing 29 Closure of LR(0) Item: Example
H:
L → E ; L | E
E → a | b
closure({L → E ; • L}) = {L → E ; • L, L → • E ; L, L → • E, E → • a, E → • b}
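The closure rule can be sketched as a worklist loop. The representation is an assumption made for this example: an item is a triple (left side, right side, dot position), and GRAMMAR, closure and show are illustrative names.

```python
# Grammar H as (left side, right side) pairs.
GRAMMAR = [
    ("L", ("E", ";", "L")),
    ("L", ("E",)),
    ("E", ("a",)),
    ("E", ("b",)),
]

def closure(items):
    """Smallest set containing items and closed under the closure rule."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot == len(rhs):
            continue                      # dot at the end: nothing to add
        for lhs, production in GRAMMAR:
            new_item = (lhs, production, 0)
            if lhs == rhs[dot] and new_item not in result:
                result.add(new_item)      # add B -> • γ for the symbol after the dot
                work.append(new_item)
    return frozenset(result)

def show(item):
    """Render an item with the dot at its position."""
    lhs, rhs, dot = item
    return f"{lhs} -> " + " ".join(rhs[:dot] + ("•",) + rhs[dot:])

for item in sorted(closure({("L", ("E", ";", "L"), 2)})):
    print(show(item))
```

Starting from the single item with the dot before the final L, the loop adds both L productions and then both E productions, giving the five items of the example.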
Bottom-Up Parsing 30 LR(0) Machine
Given grammar G with goal symbol S, augment the grammar by adding a new goal symbol S’ and the production S’ → S.
States = sets of LR(0) items
Start state = closure({S’ → • S})
All states are considered to be final (the set of viable prefixes is closed under prefix).
transition(I, X) = closure({A → α X • β | A → α • X β ∈ I})
Bottom-Up Parsing 31 Example: LR(0) CFSM Construction
Augment the grammar H:
L’ → L
L → E ; L | E
E → a | b
Initial state I0 is the closure of the “augmenting rule”:
closure({L’ → • L}) = {L’ → • L, L → • E ; L, L → • E, E → • a, E → • b}
Bottom-Up Parsing 32 Example: Transitions from I0
transition(I0, L) = {L’ → L •} = I1
transition(I0, E) = {L → E • ; L, L → E •} = I2
transition(I0, a) = {E → a •} = I3
transition(I0, b) = {E → b •} = I4
There are no other transitions from I0. There are no transitions possible from I1. Now consider the transitions from I2.
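Bottom-Up Parsing 33 Transitions from I2
transition(I2, ;) = {L → E ; • L, L → • E ; L, L → • E, E → • a, E → • b} = I5 (new state)
Transitions from the new state I5:
transition(I5, L) = {L → E ; L •} = I6 (new state)
transition(I5, E) = I2
transition(I5, a) = I3
transition(I5, b) = I4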
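Bottom-Up Parsing 34 The CFSM Transition Diagram for H
States:
I0 = {L’ → • L, L → • E ; L, L → • E, E → • a, E → • b}
I1 = {L’ → L •}
I2 = {L → E • ; L, L → E •}
I3 = {E → a •}
I4 = {E → b •}
I5 = {L → E ; • L, L → • E ; L, L → • E, E → • a, E → • b}
I6 = {L → E ; L •}
Transitions:
I0 --L--> I1, I0 --E--> I2, I0 --a--> I3, I0 --b--> I4
I2 --;--> I5
I5 --L--> I6, I5 --E--> I2, I5 --a--> I3, I5 --b--> I4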
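The whole LR(0) construction for the augmented grammar H can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: items are (left side, right side, dot) triples, and the order of SYMBOLS together with breadth-first exploration is chosen only so that the state numbers come out as I0 to I6 as on the slides.

```python
from collections import deque

# Augmented grammar H; the first production is the augmenting rule.
GRAMMAR = [
    ("L'", ("L",)),
    ("L", ("E", ";", "L")),
    ("L", ("E",)),
    ("E", ("a",)),
    ("E", ("b",)),
]
SYMBOLS = ["L", "E", "a", "b", ";"]   # exploration order (assumption, see above)

def closure(items):
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            for lhs, production in GRAMMAR:
                new_item = (lhs, production, 0)
                if lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)

def transition(state, x):
    """Move the dot over x in every item that allows it, then close."""
    moved = {(l, r, d + 1) for (l, r, d) in state if d < len(r) and r[d] == x}
    return closure(moved) if moved else None

def build_cfsm():
    start = closure({GRAMMAR[0] + (0,)})
    states, trans, queue = [start], {}, deque([start])
    while queue:
        state = queue.popleft()
        for x in SYMBOLS:
            target = transition(state, x)
            if target is None:
                continue
            if target not in states:      # a new state was discovered
                states.append(target)
                queue.append(target)
            trans[(states.index(state), x)] = states.index(target)
    return states, trans

states, trans = build_cfsm()
for (source, x), target in sorted(trans.items()):
    print(f"I{source} --{x}--> I{target}")
```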
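Bottom-Up Parsing 35 Characteristic Finite State Machine for H
States 0 to 6, start state 0:
0 --L--> 1, 0 --E--> 2, 0 --a--> 3, 0 --b--> 4
2 --;--> 5
5 --E--> 2, 5 --a--> 3, 5 --b--> 4, 5 --L--> 6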
Bottom-Up Parsing 36 How LR(1) parsers work
GOTO table: transition function of the characteristic automaton in tabular form
ACTION table: maps (state, input symbol) to an action
Procedure:
–Use the GOTO table to run the CFSM over the stack. Suppose the state reached at the top of the stack is σ.
–Take the action given by ACTION(σ, a), where a is the incoming input symbol.
Bottom-Up Parsing 37 Action Table for H
Bottom-Up Parsing 38 SLR(1) Parser Construction
GOTO table is the move function from the LR(0) CFSM.
ACTION table is constructed from the CFSM as follows:
–If state i contains A → α • a β, then ACTION(i,a) = Shift.
–If state i contains A → α •, then ACTION(i,a) = Reduce A → α. (But if A is L’, the action is Accept.)
–Otherwise, reject.
But,...
Bottom-Up Parsing 39 SLR(1) Parser Construction
Rules for the ACTION table can involve shift/reduce conflicts. So the actual rule for reduce actions is:
If state i contains A → α •, then ACTION(i,a) = Reduce A → α, for all a ∈ FOLLOW(A).
E.g. state 2 for grammar H yields a shift/reduce conflict: should you shift the “;” or reduce by “L → E”?
This is resolved by looking at the “follow set” for L. FOLLOW(L) = {$}, so state 2 reduces only on $ and shifts on “;”.
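The SLR(1) rule can be sketched for H by combining the CFSM with the FOLLOW sets. In this sketch the terminal transitions, the completed items per state, and the FOLLOW sets are all written out by hand (FOLLOW(E) = {;, $} is worked out for this example, not taken from the slides); SHIFTS, COMPLETE and slr_action_table are illustrative names.

```python
# Terminal transitions of the CFSM for H: these become shift entries.
SHIFTS = {(0, "a"): 3, (0, "b"): 4, (2, ";"): 5, (5, "a"): 3, (5, "b"): 4}
# Completed item A -> α • found in each state, as (A, α).
COMPLETE = {
    1: ("L'", "L"),
    2: ("L", "E"),
    3: ("E", "a"),
    4: ("E", "b"),
    6: ("L", "E ; L"),
}
# FOLLOW sets for the augmented grammar (computed by hand, an assumption here).
FOLLOW = {"L'": {"$"}, "L": {"$"}, "E": {";", "$"}}

def slr_action_table():
    """Build ACTION; raise ValueError if two actions compete for one entry."""
    action = {}

    def put(key, value):
        if key in action and action[key] != value:
            raise ValueError(f"conflict at {key}: {action[key]} vs {value}")
        action[key] = value

    for state, a in SHIFTS:
        put((state, a), "shift")
    for state, (lhs, rhs) in COMPLETE.items():
        for a in FOLLOW[lhs]:             # reduce only on symbols in FOLLOW(A)
            put((state, a), "accept" if lhs == "L'" else f"reduce {lhs} -> {rhs}")
    return action                         # missing entries mean reject

table = slr_action_table()
print(table[(2, ";")], "/", table[(2, "$")])
```

State 2 gets shift on “;” and reduce L → E on $, so no conflict is reported.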
Bottom-Up Parsing 40 FIRST and FOLLOW sets
FIRST(α) = {a | α ⇒* a β} ∪ {ε | α ⇒* ε}
FOLLOW(A) = {a | S ⇒* α A a β}
(The end marker $ is treated as following the goal symbol S.)
Bottom-Up Parsing 41 Calculating FIRST sets
Create table Fi mapping each non-terminal in N to a set of terminals (possibly including ε); initially, Fi(A) = ∅ for all A.
Repeat until Fi does not change:
–For each A → α ∈ P, Fi(A) := Fi(A) ∪ FIRST(α, Fi)
where FIRST(α, Fi) is defined as follows:
–FIRST(ε, Fi) = {ε}
–FIRST(aβ, Fi) = {a}
–FIRST(Bβ, Fi) = (Fi(B) − {ε}) ∪ FIRST(β, Fi), if ε ∈ Fi(B); Fi(B), otherwise
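The fixed-point computation can be sketched directly from these rules. The grammar encoding as (left side, right side) pairs, the names first_of and first_sets, and the second grammar G2 (which has an ε-production, unlike H) are assumptions introduced for this example.

```python
EPS = "ε"

def first_of(seq, fi, nonterminals):
    """FIRST(seq, Fi) for a sequence of grammar symbols."""
    if not seq:
        return {EPS}                       # FIRST(ε, Fi) = {ε}
    head, rest = seq[0], seq[1:]
    if head not in nonterminals:
        return {head}                      # FIRST(aβ, Fi) = {a}
    if EPS in fi[head]:                    # B can vanish: also look at β
        return (fi[head] - {EPS}) | first_of(rest, fi, nonterminals)
    return set(fi[head])

def first_sets(grammar):
    nonterminals = {lhs for lhs, _ in grammar}
    fi = {a: set() for a in nonterminals}  # initially Fi(A) = ∅
    changed = True
    while changed:                         # repeat until Fi does not change
        changed = False
        for lhs, rhs in grammar:
            new = first_of(rhs, fi, nonterminals)
            if not new <= fi[lhs]:
                fi[lhs] |= new
                changed = True
    return fi

H = [("L", ("E", ";", "L")), ("L", ("E",)), ("E", ("a",)), ("E", ("b",))]
G2 = [("S", ("A", "b")), ("A", ("a",)), ("A", ())]   # hypothetical: A -> a | ε

print(first_sets(H))
print(first_sets(G2))
```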
Bottom-Up Parsing 42 Calculating FOLLOW sets
Create table Fo mapping each non-terminal in N to a set of terminals (possibly including $); initially, Fo(A) = ∅ for all A, except Fo(S) = {$}.
Repeat until Fo does not change:
–For each production A → α B β, Fo(B) := Fo(B) ∪ (FIRST(β) − {ε})
–For each production A → α B, Fo(B) := Fo(B) ∪ Fo(A)
–For each production A → α B β, if ε ∈ FIRST(β), Fo(B) := Fo(B) ∪ Fo(A)
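A minimal sketch of the FOLLOW computation follows, with a compact iterative FIRST helper so the block runs on its own. The names first_of, first_sets and follow_sets and the grammar encoding are assumptions for this example; the second and third rules are merged into one test, since an empty β has FIRST(β) = {ε}.

```python
EPS, END = "ε", "$"

def first_of(seq, fi):
    """FIRST of a symbol sequence; symbols not in fi are terminals."""
    out = set()
    for sym in seq:
        f = fi.get(sym, {sym})
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}                     # every symbol could vanish

def first_sets(grammar):
    fi = {lhs: set() for lhs, _ in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            new = first_of(rhs, fi)
            if not new <= fi[lhs]:
                fi[lhs] |= new
                changed = True
    return fi

def follow_sets(grammar, start):
    fi = first_sets(grammar)
    fo = {lhs: set() for lhs, _ in grammar}
    fo[start].add(END)                     # Fo(S) = {$}
    changed = True
    while changed:                         # repeat until Fo does not change
        changed = False
        for lhs, rhs in grammar:
            for i, b in enumerate(rhs):
                if b not in fo:
                    continue               # only non-terminals have FOLLOW sets
                f = first_of(rhs[i + 1:], fi)
                new = f - {EPS}
                if EPS in f:               # β is empty or can derive ε
                    new |= fo[lhs]
                if not new <= fo[b]:
                    fo[b] |= new
                    changed = True
    return fo

H = [("L", ("E", ";", "L")), ("L", ("E",)), ("E", ("a",)), ("E", ("b",))]
print(follow_sets(H, "L"))
```

For H this gives FOLLOW(L) = {$}, the set used to resolve the conflict in state 2.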