1
Bottom-Up Syntax Analysis
Mooly Sagiv
http://www.math.tau.ac.il/~msagiv/courses/wcc02.html
Textbook: Modern Compiler Implementation in C, Chapter 3
2
Efficient Parsers
Pushdown automata
Deterministic
Report an error as soon as the input is not a prefix of a valid program
Not usable for all context free grammars
[Figure: bison takes a context free grammar and produces a parser, or reports "ambiguity errors"; the parser turns tokens into a parse tree]
3
Kinds of Parsers
Top-Down (Predictive Parsing) LL
  – Construct the parse tree in a top-down manner
  – Find the leftmost derivation
  – For every non-terminal and token, predict the next production
Bottom-Up LR
  – Construct the parse tree in a bottom-up manner
  – Find the rightmost derivation in reverse order
  – For every potential right-hand side and token, decide when a production is found
4
Bottom-Up Syntax Analysis
Input
  – A context free grammar
  – A stream of tokens
Output
  – A syntax tree or an error
Method
  – Construct the parse tree in a bottom-up manner
  – Find the rightmost derivation in reverse order
  – For every potential right-hand side and token, decide when a production is found
  – Report an error as soon as the input is not a prefix of a valid program
5
Plan
Pushdown automata
Bottom-up parsing (given a parser table)
Constructing the parser table
Interesting non LR grammars
6
Pushdown Automaton
[Figure: a control unit consults the parser-table, reads the input (terminated by $), and maintains a stack (with $ at the bottom)]
7
Bottom-Up Parser Actions
reduce A → β
  – Pop |β| symbols and |β| states from the stack
  – Apply the associated action
  – Push A together with the state goto[top, A] onto the stack
shift X
  – Push X onto the stack
  – Advance the input
accept
  – Parsing is complete
error
  – Report an error
8
A Parser Table for S → a S b | ε
state  description   a    b            $            S
0      initial       s1   r S → ε      r S → ε      4
1      parsed a      s1   r S → ε      r S → ε      2
2      parsed S           s3
3      parsed a S b       r S → a S b  r S → a S b
4      final                           acc
9
state  a    b            $            S
0      s1   r S → ε      r S → ε      4
1      s1   r S → ε      r S → ε      2
2           s3
3           r S → a S b  r S → a S b
4                        acc

stack                          input   action
0($)                           aabb$   s1
0($) 1(a)                      abb$    s1
0($) 1(a) 1(a)                 bb$     r S → ε
0($) 1(a) 1(a) 2(S)            bb$     s3
0($) 1(a) 1(a) 2(S) 3(b)       b$      r S → a S b
0($) 1(a) 2(S)                 b$      s3
0($) 1(a) 2(S) 3(b)            $       r S → a S b
0($) 4(S)                      $       acc
10
state  a    b            $            S
0      s1   r S → ε      r S → ε      4
1      s1   r S → ε      r S → ε      2
2           s3
3           r S → a S b  r S → a S b
4                        acc

stack                    input   action
0($)                     abb$    s1
0($) 1(a)                bb$     r S → ε
0($) 1(a) 2(S)           bb$     s3
0($) 1(a) 2(S) 3(b)      b$      r S → a S b
0($) 4(S)                b$      err
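The two traces above can be replayed with a small table-driven driver. The following is a minimal sketch in Python, not the course's implementation: the names PRODUCTIONS, ACTION, GOTO and parse are illustrative, the reduce entries are assumed to sit in both the b and $ columns, and the stack holds only states because the symbols are implied by them.

```python
# Parser table for S -> a S b | epsilon, transcribed from the slides.
# Each production is stored as (left-hand side, length of right-hand side).
PRODUCTIONS = {
    "S->eps": ("S", 0),
    "S->aSb": ("S", 3),
}

# ACTION[(state, token)] is ("shift", state), ("reduce", production) or ("accept", None).
# A missing entry means "error".
ACTION = {
    (0, "a"): ("shift", 1),
    (0, "b"): ("reduce", "S->eps"),
    (0, "$"): ("reduce", "S->eps"),
    (1, "a"): ("shift", 1),
    (1, "b"): ("reduce", "S->eps"),
    (1, "$"): ("reduce", "S->eps"),
    (2, "b"): ("shift", 3),
    (3, "b"): ("reduce", "S->aSb"),
    (3, "$"): ("reduce", "S->aSb"),
    (4, "$"): ("accept", None),
}

# GOTO[(state, nonterminal)] is the state pushed after a reduce.
GOTO = {(0, "S"): 4, (1, "S"): 2}


def parse(text):
    """Return True if `text` is accepted, False at the first error entry."""
    tokens = list(text) + ["$"]
    stack = [0]  # stack of states only
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:
            return False  # empty table cell: report an error
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            if length:
                del stack[-length:]  # pop one state per right-hand-side symbol
            target = GOTO.get((stack[-1], lhs))
            if target is None:
                return False
            stack.append(target)
        else:
            return True  # accept


print(parse("aabb"), parse("abb"))
```

With this table, parse("aabb") follows the first trace and accepts, while parse("abb") stops in state 4 with b still unread, as in the second trace.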
11
A Parser Table for S → A B | A, A → a, B → a
state  description      a          $            A  B  S
0      initial          s1                      2     5
1      parsed a from A  r A → a    r A → a
2      parsed A         s3         r S → A         4
3      parsed a from B             r B → a
4      parsed AB                   r S → A B
5      final                       acc
12
stack                input   action
0($)                 aa$     s1
0($) 1(a)            a$      r A → a
0($) 2(A)            a$      s3
0($) 2(A) 3(a)       $       r B → a
0($) 2(A) 4(B)       $       r S → A B
0($) 5(S)            $       acc

state  a          $            A  B  S
0      s1                      2     5
1      r A → a    r A → a
2      s3         r S → A         4
3                 r B → a
4                 r S → A B
5                 acc
13
A Parser Table for E → E + E | id
state  description  id   +           $              E
0      initial      s1                              2
1      parsed id         r E → id    r E → id
2      parsed E          s3          acc
3      parsed E+    s1                              4
4      parsed E+E        s3          r E → E + E
14
The Challenge
How to construct a parser-table from a given grammar
LR(1) grammars
  – Left to right scanning
  – Rightmost derivations (in reverse)
  – 1 token of look-ahead
Different solutions
  – Operator precedence
  – SLR(1): Simple LR(1)
  – CLR(1): Canonical LR(1)
  – LALR(1): Look-Ahead LR(1), used by Yacc, Bison, CUP
15
Grammar Hierarchy
[Figure: nested grammar classes, SLR(1) ⊂ LALR(1) ⊂ CLR(1) ⊂ non-ambiguous CFG, with LL(1) inside CLR(1)]
16
Constructing an SLR parsing table
Add a production S’ → S$
Construct a finite automaton accepting "valid stack symbols"
  – The states of the automaton become the states of the parsing table
  – Determine shift operations
  – Determine goto operations
Construct reduce entries by analyzing the grammar
17
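A Finite Automaton for S’ → S$, S → a S b | ε
state  description   a    b    $    S
0      initial       s1             4
1      parsed a      s1             2
2      parsed S           s3
3      parsed a S b
4      final
[Figure: automaton with states 0 to 4 and transitions 0 --a--> 1, 1 --a--> 1, 0 --S--> 4, 1 --S--> 2, 2 --b--> 3]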
18
Constructing a Finite Automaton
NFA
  – For X → X_1 X_2 … X_n the states are the items [X → X_1 X_2 … X_i . X_i+1 … X_n]
  – "prefixes of rhs (handles)"
  – X_1 X_2 … X_i is at the top of the stack and we expect X_i+1 … X_n
The initial state is [S’ → . S$]
  – δ([X → X_1 … X_i . X_i+1 … X_n], X_i+1) = [X → X_1 … X_i X_i+1 . X_i+2 … X_n]
  – For every production X_i+1 → α: δ([X → X_1 X_2 … X_i . X_i+1 … X_n], ε) = [X_i+1 → . α]
Convert into DFA
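The NFA and the conversion to a DFA can be folded into the usual closure and goto functions over sets of items. The sketch below is in Python and assumes an encoding that is not in the slides: an item is a pair (production index, dot position), and the names GRAMMAR, closure, goto and build_dfa are illustrative.

```python
# Grammar from the slides: S' -> S $, S -> a S b | epsilon.
# An item is a pair (production index, dot position).
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("a", "S", "b")),
    ("S", ()),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Follow the epsilon moves: add [B -> . gamma] when the dot precedes B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_dfa():
    """Return (states, edges); edges maps (state index, symbol) to a state index."""
    states = [closure({(0, 0)})]
    edges = {}
    i = 0
    while i < len(states):
        state = states[i]
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        # "$" leads to accept rather than to a new state, so it is skipped here.
        for symbol in sorted(symbols - {"$"}):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(i, symbol)] = states.index(target)
        i += 1
    return states, edges


STATES, EDGES = build_dfa()
print(len(STATES), "states,", len(EDGES), "edges")
```

The sketch numbers states in discovery order, so the indices need not match the state numbers used on the slides; the item sets and transitions are what should agree.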
19
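NFA for S’ → S$, S → a S b | ε
States: [S’ → .S$], [S’ → S.$], [S → .aSb], [S → .], [S → a.Sb], [S → aS.b], [S → aSb.]
Transitions:
  – [S’ → .S$] --S--> [S’ → S.$]
  – [S’ → .S$] --ε--> [S → .aSb] and [S → .]
  – [S → .aSb] --a--> [S → a.Sb]
  – [S → a.Sb] --S--> [S → aS.b]
  – [S → a.Sb] --ε--> [S → .aSb] and [S → .]
  – [S → aS.b] --b--> [S → aSb.]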
20
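DFA for S’ → S$, S → a S b | ε, obtained from the NFA with states [S’ → .S$], [S’ → S.$], [S → .aSb], [S → .], [S → a.Sb], [S → aS.b], [S → aSb.]
DFA states:
  – {[S’ → .S$], [S → .aSb], [S → .]} (initial)
  – {[S → a.Sb], [S → .aSb], [S → .]}
  – {[S’ → S.$]}
  – {[S → aS.b]}
  – {[S → aSb.]}
DFA transitions:
  – {[S’ → .S$], [S → .aSb], [S → .]} --a--> {[S → a.Sb], [S → .aSb], [S → .]}
  – {[S’ → .S$], [S → .aSb], [S → .]} --S--> {[S’ → S.$]}
  – {[S → a.Sb], [S → .aSb], [S → .]} --a--> itself
  – {[S → a.Sb], [S → .aSb], [S → .]} --S--> {[S → aS.b]}
  – {[S → aS.b]} --b--> {[S → aSb.]}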
21
state  items                            a    b    $    S
0      [S’ → .S$, S → .aSb, S → .]      s1             4
1      [S → a.Sb, S → .aSb, S → .]      s1             2
2      [S → aS.b]                            s3
3      [S → aSb.]
4      [S’ → S.$]                                 acc
[Figure: the DFA whose states are these item sets, with transitions 0 --a--> 1, 1 --a--> 1, 0 --S--> 4, 1 --S--> 2, 2 --b--> 3]
22
Filling reduce entries
For an item [A → α.] we need to know the tokens that can follow A in a derivation from S’
Follow(A) = {t | S’ ⇒* α A t β}
See the textbook for an algorithm for constructing Follow from a given grammar
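The slides defer the Follow algorithm to the textbook. The sketch below shows one common fixed-point formulation in Python; it is not necessarily the textbook's version, and the names first_of_sequence and follow_sets, as well as the encoding of grammars as (lhs, rhs) pairs, are assumptions made for illustration.

```python
def first_of_sequence(symbols, first, nullable, nonterminals):
    """Return (terminals that can start `symbols`, whether `symbols` can derive epsilon)."""
    out = set()
    for sym in symbols:
        if sym not in nonterminals:
            out.add(sym)
            return out, False
        out |= first[sym]
        if sym not in nullable:
            return out, False
    return out, True


def follow_sets(grammar):
    """Compute Follow for every nonterminal by iterating until nothing changes."""
    nonterminals = {lhs for lhs, _ in grammar}
    nullable = set()
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            # Update First and nullable for the left-hand side.
            starts, eps = first_of_sequence(rhs, first, nullable, nonterminals)
            if not starts <= first[lhs]:
                first[lhs] |= starts
                changed = True
            if eps and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            # Whatever can start the rest of the rhs can follow each nonterminal in it.
            for i, sym in enumerate(rhs):
                if sym in nonterminals:
                    starts, eps = first_of_sequence(rhs[i + 1:], first, nullable, nonterminals)
                    new = starts | (follow[lhs] if eps else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


# Grammar from the slides: S' -> S $, S -> a S b | epsilon.
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("a", "S", "b")),
    ("S", ()),
]
print(sorted(follow_sets(GRAMMAR)["S"]))
```

For this grammar the result for S contains b (from S → a S b) and $ (from S’ → S$), which are the columns that receive reduce entries.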
23
Follow(S) = {b, $}
state  items                            a    b            $            S
0      [S’ → .S$, S → .aSb, S → .]      s1   r S → ε      r S → ε      4
1      [S → a.Sb, S → .aSb, S → .]      s1   r S → ε      r S → ε      2
2      [S → aS.b]                            s3
3      [S → aSb.]                            r S → a S b  r S → a S b
4      [S’ → S.$]                                         acc
[Figure: the DFA whose states are these item sets, with transitions 0 --a--> 1, 1 --a--> 1, 0 --S--> 4, 1 --S--> 2, 2 --b--> 3]
24
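Example: E → E + E | E * E | id
DFA states (numbered 1 to 7 in the order listed):
  1: [S’ → .E$], [E → .E + E], [E → .E * E], [E → .id]
  2: [E → id.]
  3: [S’ → E.$], [E → E. + E], [E → E. * E]
  4: [E → E +. E], [E → .E + E], [E → .E * E], [E → .id]
  5: [E → E *. E], [E → .E + E], [E → .E * E], [E → .id]
  6: [E → E + E.], [E → E. + E], [E → E. * E]
  7: [E → E * E.], [E → E. + E], [E → E. * E]
Transitions:
  – 1 --id--> 2, 1 --E--> 3
  – 3 --+--> 4, 3 --*--> 5
  – 4 --id--> 2, 4 --E--> 6
  – 5 --id--> 2, 5 --E--> 7
  – 6 --+--> 4, 6 --*--> 5
  – 7 --+--> 4, 7 --*--> 5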
25
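Interesting Non SLR(1) Grammar
S’ → S$
S → L = R | R
L → * R | id
R → L
Partial DFA:
  – Initial state: [S’ → .S$], [S → .L=R], [S → .R], [L → .*R], [L → .id], [R → .L]
  – On L: [S → L.=R], [R → L.]
  – Then on =: [S → L=.R], [R → .L], [L → .*R], [L → .id]
Follow(R) = {$, =}
Since = is in Follow(R), the state {[S → L.=R], [R → L.]} has a shift/reduce conflict on =.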
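The conflict in this grammar can be checked mechanically: a state has an SLR(1) shift/reduce conflict when a token that can be shifted also lies in Follow of the left-hand side of a completed item. The Python sketch below assumes the same illustrative item encoding (production index, dot position) and takes Follow(R) = {$, =} directly from the slide rather than computing it; the helper names are hypothetical.

```python
# Grammar from the slide, with the added production S' -> S $.
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
FOLLOW = {"R": {"$", "="}}  # only the set stated on the slide


def closure(items):
    """Add [B -> . gamma] for every item whose dot precedes nonterminal B."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def slr_conflicts(state):
    """Return {token: [production indices]} where a shift competes with a reduce."""
    shifts = {GRAMMAR[p][1][d] for p, d in state
              if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] not in NONTERMINALS}
    conflicts = {}
    for p, d in state:
        lhs, rhs = GRAMMAR[p]
        if d == len(rhs):  # completed item [lhs -> rhs .]
            for token in shifts & FOLLOW.get(lhs, set()):
                conflicts.setdefault(token, []).append(p)
    return conflicts


start = closure({(0, 0)})
state_after_L = goto(start, "L")
print(slr_conflicts(state_after_L))
```

Here the state reached on L reports a conflict on = between shifting and reducing by production 5 (R → L), while the initial state reports none.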
26
LR(1) Parser
Item [A → α . β, t]
  – α is at the top of the stack and we are expecting βt
LR(1) State
  – Sets of items
LALR(1) State
  – Merge LR(1) states whose items differ only in their look-ahead
27
Interesting Non LR(1) Grammars
Ambiguous
  – Arithmetic expressions
  – Dangling-else
Common derived prefix
  – A → B_1 a b | B_2 a c
  – B_1 → ε
  – B_2 → ε
Optional non-terminals
  – St → OptLab Ass
  – OptLab → id : | ε
  – Ass → id := Exp
28
A motivating example
Create a desk calculator
Challenges
  – Non-trivial syntax
  – Recursive expressions (semantics)
  – Operator precedence
29
Solution (lexical analysis)
/* desk.l */
%{
#include <stdlib.h>  /* atoi */
%}
%%
[0-9]+      { yylval = atoi(yytext); return NUM; }
"+"         { return PLUS; }
"-"         { return MINUS; }
"/"         { return DIV; }
"*"         { return MUL; }
"("         { return LPAR; }
")"         { return RPAR; }
"//".*[\n]  ;  /* comment */
[\t\n ]     ;  /* whitespace */
.           { error("illegal symbol", yytext[0]); }
30
Solution (syntax analysis)
/* desk.y */
%{
#include <stdio.h>  /* printf */
%}
%token NUM
%left PLUS MINUS
%left MUL DIV
%token LPAR RPAR
%start P
%%
P : E               { printf("%d\n", $1); }
  ;
E : NUM             { $$ = $1; }
  | LPAR E RPAR     { $$ = $2; }
  | E PLUS E        { $$ = $1 + $3; }
  | E MINUS E       { $$ = $1 - $3; }
  | E MUL E         { $$ = $1 * $3; }
  | E DIV E         { $$ = $1 / $3; }
  ;
%%
#include "lex.yy.c"

flex desk.l
bison -y desk.y
cc y.tab.c -ll -ly
31
Summary
LR is a powerful technique
Generates efficient parsers
Generation tools exist
  – Bison, yacc, CUP
But some grammars need to be tuned
  – Shift/Reduce conflicts
  – Reduce/Reduce conflicts
  – Efficiency of the generated parser