LR parser: an efficient bottom-up parser for a large and useful class of context-free grammars. The “L” stands for a left-to-right scan of the input; the “R” stands for constructing a rightmost derivation in reverse.
Why LR parsers are attractive:
(1) LR parsers can be constructed for most programming languages.
(2) The LR parsing method is more general than the LL parsing method.
(3) LR parsers can detect syntactic errors as soon as possible on a left-to-right scan of the input.
However, implementing an LR parser by hand for a typical programming-language grammar is too much work. =====> Parser Generator
Techniques for producing LR parsing tables:
Simple LR (SLR) - LR(0) items, FOLLOW
Canonical LR (CLR) - LR(1) items
Lookahead LR (LALR) - ① LR(1) items ② LR(0) items, lookahead
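LR parser model
Stack : S_0 X_1 S_1 X_2 ... X_m S_m, where each S_i is a state and each X_i ∈ V is a grammar symbol; S_m is on top of the stack.
Configuration of an LR parser :
(S_0 X_1 S_1 ... X_m S_m, a_i a_{i+1} ... a_n $)
The first component is the stack contents and the second is the unscanned input. The parser consults the top state S_m and the current input symbol a_i.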
LR Parsing Table (ACTION table + GOTO table)
The LR parsing algorithm ::= the same as the shift-reduce parsing algorithm, driven by the parsing table.
Four actions : shift, reduce, accept, error
1. ACTION[S_m, a_i] = shift S ::=
   (S_0 X_1 S_1 ... X_m S_m, a_i a_{i+1} ... a_n $) ⊢ (S_0 X_1 S_1 ... X_m S_m a_i S, a_{i+1} ... a_n $)
2. ACTION[S_m, a_i] = reduce A → α and |α| = r ::=
   (S_0 X_1 S_1 ... X_m S_m, a_i a_{i+1} ... a_n $) ⊢ (S_0 X_1 S_1 ... X_{m-r} S_{m-r}, a_i a_{i+1} ... a_n $),
   then, with GOTO(S_{m-r}, A) = S, ⊢ (S_0 X_1 S_1 ... X_{m-r} S_{m-r} A S, a_i a_{i+1} ... a_n $)
3. ACTION[S_m, a_i] = accept: parsing is completed.
4. ACTION[S_m, a_i] = error: the parser has discovered an error and calls an error recovery routine.
G : 1. LIST → LIST , ELEMENT
    2. LIST → ELEMENT
    3. ELEMENT → a
Parsing Table : where sj means shift and stack state j, ri means reduce by the production numbered i, acc means accept, and blank means error.
Input : a , a
Parsing configuration : the initial configuration is (S_0, a , a $).
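The four actions can be sketched as a small table-driven loop. This is a minimal sketch, not the slide's own table: the ACTION and GOTO entries below were derived by hand from the item sets and FOLLOW sets that these slides give for the same LIST grammar, and the names `PRODUCTIONS`, `ACTION`, `GOTO`, and `lr_parse` are hypothetical. For brevity the stack holds only states; the grammar symbols X_i are implied by them.

```python
# Productions of G, numbered as on the slide: (left side, length of right side).
PRODUCTIONS = {
    1: ("LIST", 3),     # LIST -> LIST , ELEMENT
    2: ("LIST", 1),     # LIST -> ELEMENT
    3: ("ELEMENT", 1),  # ELEMENT -> a
}

# Hand-derived SLR table for G (an assumption, not copied from the slide).
ACTION = {
    (0, "a"): ("shift", 3),
    (1, ","): ("shift", 4), (1, "$"): ("accept", None),
    (2, ","): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, ","): ("reduce", 3), (3, "$"): ("reduce", 3),
    (4, "a"): ("shift", 3),
    (5, ","): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "LIST"): 1, (0, "ELEMENT"): 2, (4, "ELEMENT"): 5}


def lr_parse(tokens):
    """Return the production numbers used, in order, or raise SyntaxError."""
    stack = [0]                       # state stack, initial state 0
    tokens = list(tokens) + ["$"]     # append the end marker
    pos = 0
    reductions = []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:             # blank entry means error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)         # push the new state, consume the symbol
            pos += 1
        elif kind == "reduce":
            lhs, rhs_len = PRODUCTIONS[arg]
            del stack[len(stack) - rhs_len:]        # pop r = |alpha| states
            stack.append(GOTO[(stack[-1], lhs)])    # push GOTO(S_{m-r}, A)
            reductions.append(arg)
        else:                         # accept
            return reductions


print(lr_parse(["a", ",", "a"]))  # [3, 2, 3, 1]
```

The printed list is the rightmost derivation of a , a in reverse: reduce by 3, 2, 3, then 1.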
Methods for constructing an LR parsing table from a grammar : ① SLR ② LALR ③ CLR
Definition : an LR(0) item ::= a production with a dot at some position of the right side.
ex) A → XYZ ∈ P gives the items [A → .XYZ], [A → X.YZ], [A → XY.Z], [A → XYZ.]
mark symbol ::= the symbol after the dot, if it exists.
kernel item ::= [A → α.β] if α ≠ ε or A = S'.
closure item ::= [A → .α] : the result of performing the CLOSURE operation.
reduce item ::= [A → α.]
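One possible encoding of these definitions, assuming an item is stored as a (left side, right side, dot position) triple. The representation and all function names are illustrative choices, not part of the slides.

```python
def lr0_items(lhs, rhs):
    """All LR(0) items of the production lhs -> rhs."""
    rhs = tuple(rhs)
    return [(lhs, rhs, dot) for dot in range(len(rhs) + 1)]


def mark_symbol(item):
    """The symbol after the dot, or None if the dot is at the end."""
    _, rhs, dot = item
    return rhs[dot] if dot < len(rhs) else None


def is_reduce_item(item):
    """[A -> alpha.] : the dot is at the right end."""
    _, rhs, dot = item
    return dot == len(rhs)


def is_kernel_item(item, start="S'"):
    """Dot not at the left end, or an item of the new start symbol."""
    lhs, _, dot = item
    return dot > 0 or lhs == start


def show(item):
    """Render an item in the bracket notation used on the slides."""
    lhs, rhs, dot = item
    return f"[{lhs} -> {' '.join(rhs[:dot])}.{' '.join(rhs[dot:])}]"


ITEMS = lr0_items("A", ["X", "Y", "Z"])
for item in ITEMS:
    print(show(item), "mark symbol:", mark_symbol(item))
```

With this encoding, an item whose right side is empty has its dot at both ends at once, so it is a reduce item without being a kernel item.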
[A → α.β] means that an input string derivable from α has just been seen; if an input string derivable from β is seen next, we may be able to reduce by the production A → αβ.
Definition : Augmented Grammar
G = (V_N, V_T, P, S) ⇒ G' = (V_N ∪ {S'}, V_T, P ∪ {S' → S}, S'), where S' is a new start symbol not in V_N.
The purpose of this new starting production is to indicate to the parser when it should stop parsing and announce acceptance of the input. That is, acceptance occurs when and only when the parser is about to reduce by S' → S.
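Augmentation is a one-line transformation. A minimal sketch, assuming productions are (left side, right side) pairs; the function name `augment` and the choice to name the new symbol by appending a prime are assumptions made for illustration.

```python
def augment(nonterminals, terminals, productions, start):
    """Return G' = (V_N + {S'}, V_T, P + {S' -> S}, S')."""
    new_start = start + "'"
    # Keep adding primes until the name is not an existing grammar symbol.
    while new_start in nonterminals or new_start in terminals:
        new_start += "'"
    new_productions = [(new_start, (start,))] + list(productions)
    return nonterminals | {new_start}, terminals, new_productions, new_start


P = [
    ("LIST", ("LIST", ",", "ELEMENT")),
    ("LIST", ("ELEMENT",)),
    ("ELEMENT", ("a",)),
]
VN, VT, P_AUG, S_AUG = augment({"LIST", "ELEMENT"}, {"a", ","}, P, "LIST")
print(S_AUG, P_AUG[0])
```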
Definition : CLOSURE(I) = I ∪ {[B → .γ] | [A → α.Bβ] ∈ CLOSURE(I), B → γ ∈ P}
Meaning : [A → α.Bβ] in CLOSURE(I) indicates that, at some point in the parsing process, we next expect to see a substring derivable from Bβ as input. If B → γ is a production, we would also expect to see a substring derivable from γ at this point. For this reason, we also include [B → .γ] in CLOSURE(I).
Computing Algorithm:
Algorithm CLOSURE(I);
begin
  CLOSURE := I;
  repeat
    if [A → α.Bβ] ∈ CLOSURE and B → γ ∈ P then
      if [B → .γ] ∉ CLOSURE then
        CLOSURE := CLOSURE ∪ {[B → .γ]}
      fi
    fi
  until no change
end.
Example 1) E' → E
           E → E + T | T
           T → T * F | F
           F → (E) | id
CLOSURE({[E' → .E]}) = {[E' → .E], [E → .E+T], [E → .T], [T → .T*F], [T → .F], [F → .(E)], [F → .id]}.
CLOSURE({[E → E.+T]}) = {[E → E.+T]}.
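The repeat-until loop can be sketched in Python and checked against Example 1. The triple representation of items (left side, right side, dot position) and the names `GRAMMAR` and `closure` are assumptions made for illustration.

```python
# Expression grammar of Example 1; each production is (left side, right side).
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar=GRAMMAR):
    """Add [B -> .gamma] for every item with B after the dot, until no change."""
    result = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(result):
            if dot == len(rhs):
                continue                  # reduce item: no mark symbol
            mark = rhs[dot]
            for p_lhs, p_rhs in grammar:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == mark and new_item not in result:
                    result.add(new_item)
                    changed = True
    return result


print(len(closure({("E'", ("E",), 0)})))          # 7
print(len(closure({("E", ("E", "+", "T"), 1)})))  # 1
```

The second call adds nothing because the mark symbol + is a terminal and has no productions.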
GOTO(I,X)
Definition : GOTO(I,X) = CLOSURE({[A → αX.β] | [A → α.Xβ] ∈ I}).
Meaning : If I is the set of items that are valid for some viable prefix γ, then GOTO(I,X) is the set of items that are valid for the viable prefix γX.
ex) I = {[E' → E.], [E → E.+T]}
    GOTO(I,+) = CLOSURE({[E → E+.T]}) = {[E → E+.T], [T → .T*F], [T → .F], [F → .(E)], [F → .id]}
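GOTO moves the dot over X in every item that has X as its mark symbol, then takes the closure. A minimal sketch reproducing the example above, with items as (left side, right side, dot position) triples; the names `GRAMMAR`, `closure`, and `goto` are illustrative.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items):
    """Worklist form of CLOSURE: expand each item once."""
    result = set(items)
    work = list(result)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            for p_lhs, p_rhs in GRAMMAR:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return result


def goto(items, symbol):
    """CLOSURE of the items obtained by moving the dot over `symbol`."""
    moved = {
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in items
        if dot < len(rhs) and rhs[dot] == symbol
    }
    return closure(moved)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
print(len(goto(I, "+")))  # 5
```

For a symbol that is not a mark symbol of any item in I, the result is the empty set.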
C_0
C_0 = {CLOSURE({[S' → .S]})} ∪ {GOTO(I,X) | I ∈ C_0, X ∈ V}
We are now ready to give the algorithm to construct C_0, the canonical collection of sets of LR(0) items for an augmented grammar:
Construction algorithm of C_0.
Algorithm Canonical_Collection;
begin
  C_0 := {CLOSURE({[S' → .S]})};
  repeat
    for I ∈ C_0 do
      Closure := CLOSURE(I);
      for each X ∈ MARK SYMBOL of Closure do
        J := GOTO(I,X);
        if some J_i ∈ C_0 satisfies J_i = J then GOTO[I,X] := J_i
        else GOTO[I,X] := J; C_0 := C_0 ∪ {J}
        fi
      end for
    end for
  until no change
end.
G : LIST → LIST , ELEMENT
    LIST → ELEMENT
    ELEMENT → a
Augmented Grammar G' : ACCEPT → LIST
                       LIST → LIST , ELEMENT
                       LIST → ELEMENT
                       ELEMENT → a
C_0 :
I_0 : CLOSURE({[ACCEPT → .LIST]}) = {[ACCEPT → .LIST], [LIST → .LIST,ELEMENT], [LIST → .ELEMENT], [ELEMENT → .a]}.
GOTO(I_0,LIST) = I_1 = {[ACCEPT → LIST.], [LIST → LIST.,ELEMENT]}.
GOTO(I_0,ELEMENT) = I_2 = {[LIST → ELEMENT.]}.
GOTO(I_0,a) = I_3 = {[ELEMENT → a.]}.
GOTO(I_1,',') = I_4 = {[LIST → LIST,.ELEMENT], [ELEMENT → .a]}.
GOTO(I_4,ELEMENT) = I_5 = {[LIST → LIST,ELEMENT.]}.
GOTO(I_4,a) = I_3.
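The six item sets above can be reproduced mechanically. A minimal sketch, assuming items are (left side, right side, dot position) triples and that grammar symbols are tried in the fixed order given by the hypothetical list `SYMBOLS`, so that states come out numbered as on the slide. It uses a worklist where the slide's algorithm uses repeat-until; all names are illustrative.

```python
GRAMMAR = [
    ("ACCEPT", ("LIST",)),
    ("LIST", ("LIST", ",", "ELEMENT")),
    ("LIST", ("ELEMENT",)),
    ("ELEMENT", ("a",)),
]
SYMBOLS = ["LIST", "ELEMENT", "a", ","]   # fixed order keeps numbering stable


def closure(items):
    result = set(items)
    work = list(result)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            for p_lhs, p_rhs in GRAMMAR:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    return closure({(l, r, d + 1) for l, r, d in items
                    if d < len(r) and r[d] == symbol})


def canonical_collection():
    """Return (list of item sets, transitions {(i, X): j})."""
    start_lhs, start_rhs = GRAMMAR[0]
    states = [closure({(start_lhs, start_rhs, 0)})]
    transitions = {}
    i = 0
    while i < len(states):                 # states grows as new sets appear
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if not target:
                continue                   # symbol is not a mark symbol here
            if target not in states:
                states.append(target)
            transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


STATES, TRANSITIONS = canonical_collection()
print(len(STATES))               # 6
print(TRANSITIONS[(4, "a")])     # 3
```

The second printed value shows GOTO(I_4,a) landing on the existing set I_3 instead of creating a new state.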
Definition : GOTO graph ::= a directed graph in which the nodes are labeled by the sets of items and the edges by grammar symbols. Ex)
Example 1) G : PR → b DL ; SL e
               DL → d ; DL | d
               SL → s ; SL | s
Renaming : PR as P, DL as D, SL as S.
For an ε-production, the LR(0) item [A → .] is both a closure item and a reduce item.
Three methods for constructing the parsing table
SLR (Simple LR) - C_0, FOLLOW
CLR (Canonical LR) - C_1
LALR (Lookahead LR) - C_1, or C_0 with lookaheads
Here C_1 denotes the canonical collection of sets of LR(1) items.
SLR method ::= the method of constructing the SLR parsing table from C_0.
Constructing Algorithm: C_0 = {I_0, I_1, I_2, ..., I_n}
1. ACTION[i,a] := "shift j" if [A → α.aβ] ∈ I_i and GOTO(I_i,a) = I_j.
2. ACTION[i,a] := "reduce A → α", for all a ∈ FOLLOW(A), if [A → α.] ∈ I_i and A ≠ S'.
3. ACTION[i,$] := "accept" if [S' → S.] ∈ I_i.
4. GOTO[i,A] := j if GOTO(I_i,A) = I_j.
5. "error" for all undefined entries; the initial state is i if [S' → .S] ∈ I_i.
For reduce items, FOLLOW is used to decide on which input symbols to reduce.
G : 0. A → L   (A : ACCEPT, L : LIST, E : ELEMENT)
    1. L → L , E
    2. L → E
    3. E → a
FOLLOW(A) = {$}
FOLLOW(L) = {',', $}
FOLLOW(E) = {',', $}
Parsing Table :
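The entries can be computed mechanically from C_0 and the FOLLOW sets above. A minimal sketch of the five SLR rules, assuming the FOLLOW sets are taken as given, items are (left side, right side, dot position) triples, and symbols are tried in the fixed order `SYMBOLS` so that states are numbered I_0 to I_5 as in the slides. All names are illustrative, and the result is computed here, not copied from a slide.

```python
GRAMMAR = [                      # production number = list index
    ("A", ("L",)),               # 0. A -> L
    ("L", ("L", ",", "E")),      # 1. L -> L , E
    ("L", ("E",)),               # 2. L -> E
    ("E", ("a",)),               # 3. E -> a
]
FOLLOW = {"A": {"$"}, "L": {",", "$"}, "E": {",", "$"}}   # taken as given
NONTERMINALS = {"A", "L", "E"}
SYMBOLS = ["L", "E", "a", ","]


def closure(items):
    result = set(items)
    work = list(result)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs):
            for p_lhs, p_rhs in GRAMMAR:
                new_item = (p_lhs, p_rhs, 0)
                if p_lhs == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol):
    return closure({(l, r, d + 1) for l, r, d in items
                    if d < len(r) and r[d] == symbol})


def canonical_collection():
    states = [closure({GRAMMAR[0] + (0,)})]
    transitions = {}
    i = 0
    while i < len(states):
        for symbol in SYMBOLS:
            target = goto(states[i], symbol)
            if target:
                if target not in states:
                    states.append(target)
                transitions[(i, symbol)] = states.index(target)
        i += 1
    return states, transitions


def slr_table():
    states, transitions = canonical_collection()
    action, goto_table = {}, {}

    def set_action(key, value):
        # A second, different value for the same entry is an SLR conflict.
        if action.setdefault(key, value) != value:
            raise ValueError(f"conflict at {key}")

    for (i, symbol), j in transitions.items():
        if symbol in NONTERMINALS:
            goto_table[(i, symbol)] = j                # rule 4
        else:
            set_action((i, symbol), ("shift", j))      # rule 1
    for i, state in enumerate(states):
        for lhs, rhs, dot in state:
            if dot < len(rhs):
                continue
            if lhs == GRAMMAR[0][0]:
                set_action((i, "$"), ("accept", None))  # rule 3
            else:
                number = GRAMMAR.index((lhs, rhs))
                for a in FOLLOW[lhs]:                   # rule 2
                    set_action((i, a), ("reduce", number))
    return action, goto_table                           # rule 5: missing = error


ACTION, GOTO = slr_table()
for key in sorted(ACTION):
    print(key, ACTION[key])
```

No conflict is raised for this grammar, so G is SLR under this construction.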