4. Bottom-up Parsing Chih-Hung Wang


1 4. Bottom-up Parsing Chih-Hung Wang
Compilers
References
1. C. N. Fischer and R. J. LeBlanc. Crafting a Compiler with C. Pearson Education Inc., 2009.
2. D. Grune, H. Bal, C. Jacobs, and K. Langendoen. Modern Compiler Design. John Wiley & Sons, 2000.
3. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley (2nd Ed. 2006).

2 Creating a bottom-up parser automatically
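Left-to-right parse, Rightmost derivation (constructed in reverse)
Create a node when all its children are present.
handle: the nodes representing the right-hand side of a production
[Figure: partial parse tree over the input "aap ( noot mies )", with nodes labelled IDENT, rest_expression, expression, rest_expr, and term]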

3 LR(0) Parsing
Theoretically important but too weak to be useful.
running example: expression grammar
input → expression EOF
expression → expression ‘+’ term | term
term → IDENTIFIER | ‘(’ expression ‘)’
short-hand notation
Z → E $
E → E ‘+’ T | T
T → i | ‘(’ E ‘)’

4 LR(0) Parsing
Keep track of progress inside potential handles when consuming input tokens.
LR items: N → α • β
initial set S0:
Z → • E $
E → • E ‘+’ T
E → • T
T → • i
T → • ‘(’ E ‘)’
grammar:
Z → E $
E → E ‘+’ T
E → T
T → i
T → ‘(’ E ‘)’

5  Closure algorithm for LR(0)
The important part is the inference rule: it predicts new handle hypotheses from the hypothesis that we are looking for a certain non-terminal, and is sometimes called the prediction rule. It corresponds to an ε-move, in that it allows the automaton to move to another state without consuming input.
Reduce item: an item with the dot at the end
Shift item: any other item
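The prediction rule can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the tuple representation `(lhs, rhs, dot)` for an item and the names `closure`, `is_reduce_item`, and `show` are assumptions made here. It computes the initial item set S0 of the running example.

```python
# Grammar in the slides' short-hand notation; each rule is (lhs, rhs).
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Return the LR(0) closure of a set of items (lhs, rhs, dot)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # Prediction rule: the dot is in front of a non-terminal N,
            # so add N -> . gamma for every rule N -> gamma.
            for n, gamma in GRAMMAR:
                if n == rhs[dot]:
                    item = (n, gamma, 0)
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return result


def is_reduce_item(item):
    """A reduce item has its dot at the end; all others are shift items."""
    _, rhs, dot = item
    return dot == len(rhs)


def show(item):
    """Render an item with '.' marking the dot position."""
    lhs, rhs, dot = item
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


S0 = closure({("Z", ("E", "$"), 0)})
for item in sorted(S0):
    print(show(item))
```

Starting from the single item Z → • E $, the closure adds the four predicted items, which gives the five-item set S0 listed on slide 4; none of them is a reduce item.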

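6 Transition Diagram
[Figure: fragment of the transition diagram. The state shown contains the items Z → • E $, E → • E ‘+’ T, and E → • T; the edges are labelled T, i, E, ‘+’, and $.]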

7 LR(0) parsing example (1)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0
input: i + i $
action: shift input token (i) onto the stack; compute new state

8 LR(0) parsing example (2)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 i S1
input: + i $
action: reduce handle on top of the stack; compute new state
Q: What does state S1 look like?
A: Write it down on the blackboard, including the transition. Do so for each new state in the remainder of the animation.

9 LR(0) parsing example (3)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 T S2
input: + i $
partial tree: T(i)
action: reduce handle on top of the stack; compute new state

10 LR(0) parsing example (4)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 E S3
input: + i $
partial tree: E(T(i))
action: shift input token on top of the stack; compute new state

11 LR(0) parsing example (5)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 E S3 + S4
input: i $
partial tree: E(T(i))
action: shift input token on top of the stack; compute new state

12 LR(0) parsing example (6)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 E S3 + S4 i S1
input: $
partial tree: E(T(i))
action: reduce handle on top of the stack; compute new state
Q: Is it allowed to re-use state S1?
A: Yes.

13 LR(0) parsing example (7)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 E S3 + S4 T S5
input: $
partial trees: E(T(i)) and T(i)
action: reduce handle on top of the stack; compute new state
Note: we cannot re-use state S2.

14 LR(0) parsing example (8)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 E S3
input: $
partial tree: E(E(T(i)) + T(i))
action: shift input token on top of the stack; compute new state

15 LR(0) parsing example (9)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 E S3 $ S6
input: (empty)
partial tree: E(E(T(i)) + T(i))
action: reduce handle on top of the stack; compute new state

16 LR(0) parsing example (10)
grammar: Z → E $   E → E ‘+’ T   E → T   T → i   T → ‘(’ E ‘)’
stack: S0 Z
input: (empty)
tree: Z(E(E(T(i)) + T(i)) $)
action: accept!

17 Precomputing the item set (1)
Initial item set

18 Precomputing the item set (2)
Next item set

19 Complete transition diagram

20 The LR push-down automaton
Two major moves and a minor move
Shift move: remove the first token from the present input and push it onto the stack.
Reduce move N → α: remove the symbols of α from the stack, then push N onto the stack.
Termination: the input has been parsed successfully when it has been reduced to the start symbol.
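The shift and reduce moves can be sketched as a small table-driven loop. This is an illustration under stated assumptions, not the tables shown on the slides: states 0 to 6 follow the numbering of the animation (S0 to S6), the numbering 7 to 9 for the parenthesis states is assumed, rule numbers follow slide 36, and the names `RULES`, `ACTION`, `GOTO`, and `parse` are hypothetical. Because this is LR(0), the action depends on the state alone.

```python
# Rule numbers as on slide 36: rule -> (lhs, rhs).
RULES = {
    1: ("Z", ("E", "$")),
    2: ("E", ("T",)),
    3: ("E", ("E", "+", "T")),
    4: ("T", ("i",)),
    5: ("T", ("(", "E", ")")),
}

# LR(0): one action per state, either "shift" or a rule number to reduce.
ACTION = {0: "shift", 1: 4, 2: 2, 3: "shift", 4: "shift",
          5: 3, 6: 1, 7: "shift", 8: "shift", 9: 5}

# Transitions on tokens and non-terminals (states 7-9 are assumed numbering).
GOTO = {
    (0, "i"): 1, (0, "T"): 2, (0, "E"): 3, (0, "("): 7,
    (3, "+"): 4, (3, "$"): 6,
    (4, "i"): 1, (4, "T"): 5, (4, "("): 7,
    (7, "i"): 1, (7, "T"): 2, (7, "E"): 8, (7, "("): 7,
    (8, "+"): 4, (8, ")"): 9,
}


def parse(tokens):
    """Return the rule numbers in reduction order; raise SyntaxError on error."""
    tokens = list(tokens)
    stack = [0]  # alternating: state, symbol, state, ...
    pos = 0
    reductions = []
    while True:
        state = stack[-1]
        action = ACTION[state]
        if action == "shift":
            if pos >= len(tokens) or (state, tokens[pos]) not in GOTO:
                raise SyntaxError(f"unexpected input at position {pos}")
            token = tokens[pos]
            pos += 1
            stack += [token, GOTO[(state, token)]]
        else:
            lhs, rhs = RULES[action]
            del stack[-2 * len(rhs):]  # pop the handle and its states
            reductions.append(action)
            if lhs == "Z":  # reduced to the start symbol: accept
                if pos != len(tokens):
                    raise SyntaxError("input continues after end marker")
                return reductions
            stack += [lhs, GOTO[(stack[-1], lhs)]]


print(parse("i+i$"))
```

For the input `i+i$` the loop performs the same sequence of moves as the animation on slides 7 to 16: the reductions are T → i, E → T, T → i, E → E ‘+’ T, and finally Z → E $.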

21 GOTO and ACTION tables

22 LR(0) parsing of the input i+i$

23 Another Example of LR(0) from Fischer (1)—closure0
G1 example

24 Another Example of LR(0) from Fischer (2)
G2 example
S' → S $
S → ID | ε

25 Another Example of LR(0) from Fischer (3)

26 Another Example of LR(0) from Fischer (4)

27 Algorithm of LR(0) Construction (1)-goto Function

28 Algorithm of LR(0) Construction (2)- CFSM

29 Algorithm of LR(0) Construction (3)- goto Table

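30 LR comments
Bottom-up parsing, unlike top-down parsing, has no problem with left recursion.
On the other hand, bottom-up parsing has a slight problem with right recursion: the stack can grow in proportion to the length of the input, because no reduction is possible until the end of the right-recursive construct has been read.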

31 LR(0) conflicts (1) shift-reduce conflict
Exists in a state when table construction cannot use the next k tokens to decide whether to shift the next input token or call for a reduction.
array indexing: T → i ‘[’ E ‘]’
T → i • ‘[’ E ‘]’ (shift)
T → i • (reduce)
ε-rule: RestExpr → ε
Expr → Term • RestExpr (shift)
RestExpr → • (reduce)

32 LR(0) conflicts (2) reduce-reduce conflict
Exists when table construction cannot use the next k tokens to distinguish between multiple reductions that could be applied in the inadequate state.
assignment statement: Z → V := E $
V → i • (reduce)
T → i • (reduce)
(different reduce rules)
A typical LR(0) table contains many conflicts.
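Both kinds of conflict can be detected by inspecting the items of a single state. The sketch below assumes closed LR(0) item sets represented as `(lhs, rhs, dot)` tuples; the function name `classify` is hypothetical.

```python
def classify(state):
    """Return the LR(0) conflict kinds present in one item set.

    Each item is (lhs, rhs, dot), with rhs a tuple of grammar symbols.
    """
    reduce_items = [it for it in state if it[2] == len(it[1])]
    shift_items = [it for it in state if it[2] < len(it[1])]
    kinds = set()
    if reduce_items and shift_items:
        kinds.add("shift-reduce")
    if len(reduce_items) > 1:
        kinds.add("reduce-reduce")
    return kinds


# Array indexing: T -> i .  and  T -> i . [ E ]
array_state = {("T", ("i",), 1), ("T", ("i", "[", "E", "]"), 1)}
# Epsilon rule: Expr -> Term . RestExpr  and  RestExpr -> .
eps_state = {("Expr", ("Term", "RestExpr"), 1), ("RestExpr", (), 0)}
# Assignment statement: V -> i .  and  T -> i .
assign_state = {("V", ("i",), 1), ("T", ("i",), 1)}

print(classify(array_state))
print(classify(eps_state))
print(classify(assign_state))
```

The first two states report a shift-reduce conflict and the third a reduce-reduce conflict, matching the examples on slides 31 and 32.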

33 Handling LR(0) conflicts
Use a one-token look-ahead
Use a two-dimensional ACTION table
Different constructions of the ACTION table:
SLR(1): Simple LR
LR(1)
LALR(1): Look-Ahead LR

34 SLR(1) parsing
A handle should not be reduced to a non-terminal N if the look-ahead is a token that cannot follow N.
reduce N → α • iff token ∈ FOLLOW(N)
FOLLOW(N):
FOLLOW(Z) = { $ }
FOLLOW(E) = { ‘+’, ‘)’, $ }
FOLLOW(T) = { ‘+’, ‘)’, $ }
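The FOLLOW sets above can be reproduced with a standard fixed-point computation. The sketch below is simplified under two assumptions that hold for the running example: the grammar has no ε-rules, and FOLLOW of the start symbol Z is seeded with the end marker $. The function names are hypothetical.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST of each non-terminal (assumes no epsilon rules)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(start="Z", end_marker="$"):
    """FOLLOW of each non-terminal (assumes no epsilon rules)."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add(end_marker)  # assumption: seed the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for k, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if k + 1 < len(rhs):
                    # Whatever can start the next symbol can follow sym.
                    nxt = rhs[k + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:
                    # sym ends the rule: everything following lhs follows sym.
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


FOLLOW = follow_sets()
print(FOLLOW)
```

An SLR(1) table then places a reduce entry for N → α • only in the columns of the tokens in `FOLLOW[N]`.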

35 SLR(1) ACTION table shift

36 SLR(1) ACTION/GOTO table
Rules:
1: Z → E $
2: E → T
3: E → E ‘+’ T
4: T → i
5: T → ‘(’ E ‘)’
Legend:
sn: shift to state n
rn: reduce rule n

37 Example of resolving conflicts (1)
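A new rule: T → i ‘[’ E ‘]’
Rules:
1: Z → E $
2: E → T
3: E → E ‘+’ T
4: T → i
5: T → ‘(’ E ‘)’
6: T → i ‘[’ E ‘]’
[Table: ACTION/GOTO table with columns "state" and "stack symbol / look-ahead token" (i + ( ) [ ] $ E T). The column positions of the entries were lost in extraction; the entries per state are:]
0: s5 s7 s1 s6
1: s3 s2
2: r1
3: s4
4: r3
5: r4
6: r2
7: s8
8: s9
9: r5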

38 Example of resolving conflicts (2)
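Rules:
1: Z → E $
2: E → T
3: E → E ‘+’ T
4: T → i
5: T → ‘(’ E ‘)’
6: T → i ‘[’ E ‘]’
[Table: ACTION/GOTO table with columns "state" and "stack symbol / look-ahead token" (i + ( ) [ ] $ E T). The column positions of the entries were lost in extraction; the entries per state are:]
0: s5 s7 s1 s6
1: s3 s2
2: r1
3: s4
4: r3
5: r4 s10
6: r2
7: s8
8: s9
9: r5
State 5 contains both T → i • and T → i • ‘[’ E ‘]’; its row now holds r4 as well as s10.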

39 Another Example of LR(0) Conflicts(1)

40 Another Example of LR(0) Conflicts(2)

41 Another Example of LR(0) Conflicts(3)
num plus num times num $

42 Another Example of LR(0) Conflicts(4)
Follow(E)= {plus, $}

43 Unfortunately … SLR(1) leaves many shift-reduce conflicts unsolved
problem: the FOLLOW(N) set is the union of all look-aheads of all alternatives of N in all states
example:
S → A | x b
A → a A b | B
B → x
Follow(S) = {$}
Follow(A) = {b, $}
Follow(B) = {b, $}
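The conflict in this example can be made concrete by building one row of the SLR(1) ACTION table. The sketch below is an illustration with hypothetical names (`slr_action_row`, `state_after_x`); it takes the FOLLOW sets from the slide as given and looks at the state reached from the initial state on x, which contains S → x • b and B → x •.

```python
def slr_action_row(state, follow, terminals):
    """Map each look-ahead token to the set of SLR(1) actions in one state."""
    row = {}
    for lhs, rhs, dot in state:
        if dot < len(rhs):
            # Shift item: contributes a shift on the terminal after the dot.
            if rhs[dot] in terminals:
                row.setdefault(rhs[dot], set()).add("shift")
        else:
            # Reduce item: reduce on every token in FOLLOW(lhs).
            for token in follow[lhs]:
                row.setdefault(token, set()).add(
                    f"reduce {lhs} -> {' '.join(rhs)}")
    return row


FOLLOW = {"S": {"$"}, "A": {"b", "$"}, "B": {"b", "$"}}
TERMINALS = {"a", "b", "x", "$"}

# State reached on x: S -> x . b  and  B -> x .
state_after_x = {("S", ("x", "b"), 1), ("B", ("x",), 1)}

row = slr_action_row(state_after_x, FOLLOW, TERMINALS)
conflicts = {tok: acts for tok, acts in row.items() if len(acts) > 1}
print(conflicts)
```

The column for b receives both a shift and a reduce by B → x, because b is in Follow(B) through the alternative A → a A b, even though in this particular state B can only be followed by $.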

44 SLR(1) automaton

45 Another Example of SLR Problem
Follow(A)={b, c, $}

46 Make the Grammar SLR(1) Follow(A1)={b, $}

47 Another Example 2 of SLR(1) (1)
G3:
S → E $
E → E + T | T
T → T * P | P
P → ID | ( E )

48 Another Example 2 of SLR(1) (2)
SLR(1) Action Table for G3
G3:
S → E $
E → E + T | T
T → T * P | P
P → ID | ( E )
Follow(E) = { $, +, ) }
Follow(T) = { $, +, ), * }

49 SLR(1) Conflicts Ex2
Elem → ( List , Elem )
Elem → Scalar
List → List , Elem
List → Elem
Scalar → ID
Scalar → ( Scalar )
Follow(Elem) = { “)”, “,”, … }

50 LR(1) parsing
The LR(1) technique does not rely on FOLLOW sets, but rather keeps the specific look-ahead with each item.
LR(1) item: N → α • β {σ}
ε-closure for LR(1) item sets:
if set S contains an item P → α • N β {σ}, then for each production rule N → γ, S must contain the item N → • γ {τ}, where τ = FIRST(β {σ})

51 Creating look-ahead sets
Extended definition of FIRST sets
If FIRST(β) does not contain ε, FIRST(β{σ}) is just equal to FIRST(β); if β can produce ε, FIRST(β{σ}) contains all the tokens in FIRST(β), excluding ε, plus the tokens in σ.
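The extended FIRST definition and the LR(1) closure rule fit together as in the following sketch. It uses the grammar S → A | x b, A → a A b | B, B → x from slide 43. The representation is an assumption made here: an item with look-ahead set {σ} is stored as one tuple `(lhs, rhs, dot, token)` per token in σ, and the names `first_of_sequence`, `first_sets`, and `closure` are hypothetical.

```python
EPS = "ε"

GRAMMAR = [
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("B",)),
    ("B", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of_sequence(beta, sigma, first):
    """Extended FIRST(beta {sigma}) as defined on the slide."""
    result = set()
    for sym in beta:
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result       # beta cannot vanish: sigma is not needed
    return result | set(sigma)  # all of beta can produce ε: add sigma


def first_sets():
    """FIRST of each non-terminal; EPS marks a nullable non-terminal."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of_sequence(rhs, {EPS}, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def closure(items, first):
    """LR(1) closure; an item is (lhs, rhs, dot, look-ahead token)."""
    result = set(items)
    worklist = list(items)
    while worklist:
        lhs, rhs, dot, look = worklist.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # tau = FIRST(beta {sigma}), beta being what follows N.
            tau = first_of_sequence(rhs[dot + 1:], {look}, first)
            for n, gamma in GRAMMAR:
                if n != rhs[dot]:
                    continue
                for token in tau:
                    item = (n, gamma, 0, token)
                    if item not in result:
                        result.add(item)
                        worklist.append(item)
    return result


FIRST = first_sets()
S0 = closure({("S", ("A",), 0, "$"), ("S", ("x", "b"), 0, "$")}, FIRST)
after_a = closure({("A", ("a", "A", "b"), 1, "$")}, FIRST)
print(sorted(S0))
print(sorted(after_a))
```

In `S0` the item B → • x carries only the look-ahead $, so after shifting x the reduce by B → x is restricted to $ and no longer clashes with the shift on b. In `after_a` the predicted items carry the look-ahead b instead, since b follows A in A → a • A b.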

52 LR(1) automaton

53 Another Example of LR(1) Construction (1)

54 Another Example of LR(1) Construction (2)

55 Another Example of LR(1) Construction (3)

56 Another Example of LR(1) Construction (4)

57 Another Example of LR(1) Construction (5)

58 Another Example 2 LR(1) (1)
LR(1) for G3
G3:
S → E $
E → E + T | T
T → T * P | P
P → ID | ( E )

59 Another Example 2 LR(1) (2)

60 Another Example 2 LR(1) (3)
Action Table

61 LR(1) parsing comments
The LR(1) automaton is more discriminating than the SLR(1) automaton. In fact, it is so strong that any language that can be parsed from left to right with a one-token look-ahead in linear time can be parsed using LR(1).
LR(1) tables are big.
Combine “equal” sets by merging look-ahead sets: LALR(1).

62 LALR(1) S3 and S10 are similar in that they are equal if one ignores the look-ahead sets, and so are S4 and S9, S6 and S11, and S8 and S12.
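The merging step can be sketched as grouping LR(1) states by their core, that is, by their items with the look-aheads stripped. The state names P, Q, and R below are hypothetical stand-ins (the numbering of the slide's figure is not reproduced here), items are `(lhs, rhs, dot, look-ahead token)` tuples, and `merge_by_core` is an assumed name.

```python
from collections import defaultdict


def merge_by_core(lr1_states):
    """Merge LR(1) states that are equal once look-aheads are ignored.

    lr1_states maps a state name to a set of (lhs, rhs, dot, look-ahead).
    Returns a dict from each core to (merged state names, merged items).
    """
    merged = defaultdict(lambda: ([], set()))
    for name, items in lr1_states.items():
        core = frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in items)
        names, union = merged[core]
        names.append(name)
        union |= items  # look-ahead sets of equal cores are united
    return dict(merged)


# Hypothetical LR(1) states for S -> A | x b, A -> a A b | B, B -> x.
predicted = {
    ("A", ("a", "A", "b"), 0, "b"),
    ("A", ("B",), 0, "b"),
    ("B", ("x",), 0, "b"),
}
STATES = {
    "P": {("A", ("a", "A", "b"), 1, "$")} | predicted,
    "Q": {("A", ("a", "A", "b"), 1, "b")} | predicted,
    "R": {("B", ("x",), 1, "b")},
}

lalr = merge_by_core(STATES)
for names, items in lalr.values():
    print(names, len(items))
```

P and Q differ only in the look-ahead of their kernel item, so they collapse into one LALR(1) state whose kernel item carries both $ and b; R has a different core and stays separate.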

63 LALR(1) automaton

64 Another Example for LALR(1) (1)
G3:
S → E $
E → E + T | T
T → T * P | P
P → ID | ( E )

65 Another Example for LALR(1) (2)
G3:
S → E $
E → E + T | T
T → T * P | P
P → ID | ( E )

66 Practice Derive the LALR(1) ACTION/GOTO table for the grammar in Fig. 2.95

67 Making a grammar LR(1) – or not
Although the chances of a grammar being LR(1) are much larger than those of it being SLR(1) or LL(1), one often encounters a grammar that still is not LR(1). The reason is generally that the grammar is ambiguous.
For example:
if_statement -> ‘if’ ‘(’ expression ‘)’ statement | ‘if’ ‘(’ expression ‘)’ statement ‘else’ statement
statement -> … | if_statement | …
The statement
if (x>0) if (y>0) p=0; else q=0;
has two possible syntax trees, depending on which ‘if’ the ‘else’ is attached to.

68 Possible syntax trees (1)

69 Possible syntax trees (2)

70 Other Examples of Ambiguous Grammar (1)

71 Other Examples of Ambiguous Grammar (2)

72 Resolving shift-reduce conflicts (1)
The longest possible sequence of grammar symbols is taken for reduction.
In a shift-reduce conflict, do shift.
Another example, input: i * i + i
E → E • ‘+’ E (shift)
E → E ‘*’ E • (reduce)
[Figure: the two candidate parse trees for E * E + E]

73 Resolving shift-reduce conflicts (2)
The use of precedences between tokens
Example: a shift-reduce conflict on t:
P → α • t β {…} (shift item)
Q → γ u R • {… t …} (reduce item)
where R is either empty or one non-terminal.
If the look-ahead is t, we perform one of the following three actions:
If symbol u has a higher precedence than symbol t, we reduce.
If t has a higher precedence than symbol u, we shift.
If both have equal precedence, we also shift.
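The three cases reduce to a single comparison. The sketch below is an illustration only: the table `PRECEDENCE` and its numeric levels are assumptions (a larger number means the token binds more tightly), and `resolve` is a hypothetical name.

```python
# Assumed precedence levels: a larger number binds more tightly.
PRECEDENCE = {"+": 1, "*": 2}


def resolve(u, t, precedence=PRECEDENCE):
    """Resolve a shift-reduce conflict between a handle ending in
    operator u (reduce item) and the look-ahead token t (shift item)."""
    if precedence[u] > precedence[t]:
        return "reduce"
    # t has higher precedence, or both are equal: shift.
    return "shift"


# Input i * i + i: the stack holds E * E and the look-ahead is '+'.
print(resolve("*", "+"))
# Input i + i * i: the stack holds E + E and the look-ahead is '*'.
print(resolve("+", "*"))
```

With these levels, E ‘*’ E is reduced before ‘+’ is shifted, so i * i + i groups as (i * i) + i. Shifting on equal precedence, as the slide prescribes, makes an operator group to the right.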

74 Bottom-up parser: yacc/bison
The most widely used parser generator is yacc.
Yacc is an LALR(1) parser generator.
A yacc look-alike called bison is provided by GNU.

75 A very high-level view of text analysis techniques

76 Yacc code example (constructing parser tree)

77 Yacc code example (auxiliary code)

