Published by Peter Dickerson. Modified over 9 years ago.
1
CPSC 4600: Bottom-up Parsing
  Reading: Sections 4.5 and 4.7 from ASU
2
Predictive Parsing Summary
  First and Follow sets are used to construct predictive parsing tables.
  For non-terminal A and input t, use a production A → α where t ∈ First(α).
  For non-terminal A and input t, if ε ∈ First(A) and t ∈ Follow(A), use a production A → α where ε ∈ First(α).
  Recursive descent without backtracking does not need the parse table explicitly.
3
Bottom-Up Parsing (1)
  Bottom-up parsing is more general than top-down parsing, and just as efficient.
  It builds on ideas in top-down parsing.
  Bottom-up is the preferred method in practice.
4
Bottom-Up Parsing (2)
  Table-driven, using an explicit stack (non-recursive).
  The stack can be viewed as containing both terminals and nonterminals.
  Basic operations:
    Shift: move terminals from the input stream onto the stack until the right-hand side of an appropriate production has been identified on top of the stack.
    Reduce: replace the symbols on top of the stack that match the right-hand side of an appropriate production with the nonterminal on the left-hand side of that production.
5
An Introductory Example
  Bottom-up parsers don't need left-factored grammars.
  Hence we can revert to the "natural" grammar for our example:
    E → T + E | T
    T → num * T | num | (E)
  Consider the string: num * num + num
6
The Idea
  Bottom-up parsing reduces a string to the start symbol by inverting productions:

    Input               Productions Used
    num * num + num
    num * T + num       T → num
    T + num             T → num * T
    T + T               T → num
    T + E               E → T
    E                   E → T + E
7
Right-most Derivation
  In a rightmost derivation, the rightmost nonterminal of a sentential form is replaced at each derivation step.
  Question: find the rightmost derivation of the string num * num + num
8
Observation
  Read the productions found by the bottom-up parse in reverse (i.e., from bottom to top).
  This is a rightmost derivation!

    num * num + num
    num * T + num       T → num
    T + num             T → num * T
    T + T               T → num
    T + E               E → T
    E                   E → T + E
9
Important Facts
  A bottom-up parser traces a rightmost derivation in reverse.
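The claim above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: it replays the five reductions from the example on num * num + num, then reverses the resulting sentential forms and checks that each step rewrites only the rightmost nonterminal. The names `REDUCTIONS`, `replay`, and `is_rightmost_step` are hypothetical, and the reduction positions were worked out by hand for this one string.

```python
# Each reduction is (position in the token list, production lhs, production rhs).
# Positions are hand-derived for "num * num + num" (an assumption of this sketch).
REDUCTIONS = [
    (2, "T", ["num"]),            # num * num + num  ->  num * T + num
    (0, "T", ["num", "*", "T"]),  #                  ->  T + num
    (2, "T", ["num"]),            #                  ->  T + T
    (2, "E", ["T"]),              #                  ->  T + E
    (0, "E", ["T", "+", "E"]),    #                  ->  E
]

def replay(tokens, reductions):
    """Apply the reductions in order and return every sentential form seen."""
    current = list(tokens)
    forms = [current]
    for pos, lhs, rhs in reductions:
        assert current[pos:pos + len(rhs)] == rhs, "reduction does not match"
        current = current[:pos] + [lhs] + current[pos + len(rhs):]
        forms.append(current)
    return forms

def is_rightmost_step(before, after, nonterminals):
    """True if `after` differs from `before` only at the rightmost nonterminal."""
    idx = max(i for i, sym in enumerate(before) if sym in nonterminals)
    suffix = before[idx + 1:]  # terminals to the right must be untouched
    return after[len(after) - len(suffix):] == suffix and after[:idx] == before[:idx]
```

Reversing `replay(["num", "*", "num", "+", "num"], REDUCTIONS)` gives E, T + E, T + T, T + num, num * T + num, num * num + num, which is the rightmost derivation asked for earlier.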
10
A Bottom-up Parse
    E
    T + E
    T + T
    T + num
    num * T + num
    num * num + num
  [Figure: parse tree for num * num + num]
11
A Bottom-up Parse in Detail (1)
    num * num + num
  [Figure: the leaves num * num + num, no tree built yet]
12
A Bottom-up Parse in Detail (2)
    num * T + num
    num * num + num
  [Figure: partial parse tree after reducing T → num]
13
A Bottom-up Parse in Detail (3)
    T + num
    num * T + num
    num * num + num
  [Figure: partial parse tree after reducing T → num * T]
14
A Bottom-up Parse in Detail (4)
    T + T
    T + num
    num * T + num
    num * num + num
  [Figure: partial parse tree after reducing the last num to T]
15
A Bottom-up Parse in Detail (5)
    T + E
    T + T
    T + num
    num * T + num
    num * num + num
  [Figure: partial parse tree after reducing E → T]
16
A Bottom-up Parse in Detail (6)
    E
    T + E
    T + T
    T + num
    num * T + num
    num * num + num
  [Figure: complete parse tree after reducing E → T + E]
17
Bottom-up Parsing
  A trivial bottom-up parsing algorithm:
    Let I = input string
    repeat
      pick a non-empty substring β of I where X → β is a production
      if no such β, backtrack
      replace one β by X in I
    until I = "S" (the start symbol) or all possibilities are exhausted
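The trivial algorithm can be sketched in Python as a recursive search with backtracking. This is only an illustration under assumptions: the grammar is the example grammar from the earlier slides, the function name `naive_parse` is hypothetical, and the `seen` set (which remembers forms that already failed) is an addition that the slide does not mention.

```python
GRAMMAR = [
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("num", "*", "T")),
    ("T", ("num",)),
    ("T", ("(", "E", ")")),
]

def naive_parse(form, start="E", seen=None):
    """Return a list of reductions that turn `form` into (start,), or None."""
    if seen is None:
        seen = set()
    if form == (start,):
        return []
    if form in seen:           # this form was already tried without success
        return None
    seen.add(form)
    for i in range(len(form)):
        for lhs, rhs in GRAMMAR:
            if form[i:i + len(rhs)] == rhs:          # substring matches X -> beta
                reduced = form[:i] + (lhs,) + form[i + len(rhs):]
                rest = naive_parse(reduced, start, seen)
                if rest is not None:
                    return [(lhs, rhs)] + rest
    return None                # every choice failed: the caller backtracks
```

For example, `naive_parse(("num", "*", "num", "+", "num"))` first tries reducing the leftmost num to T, fails, and backtracks until it finds a sequence of reductions ending in E. The search can visit many dead ends, which is the running-time concern raised on the next slide.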
18
Observations
  Does the algorithm terminate, and if so, when?
  What is the running time of the algorithm?
  If there is more than one choice for the substring to be replaced (reduced), which one should be chosen?
19
Where Do Reductions Happen
  Recall: a bottom-up parser traces a rightmost derivation in reverse.
  Let αβω be a rightmost sentential form.
  Assume the next reduction is by X → β.
  Then ω is a string of terminals.
  Why? Because αXω → αβω is a step in a rightmost derivation.
20
Shift-Reduce Parsing
  Bottom-up parsing uses only two kinds of actions:
    Shift
    Reduce
21
Shift
  Shift: move # (marking the part of the input that has been processed) one place to the right.
  This shifts a terminal to the left string:
    ABC#xyz ⇒ ABCx#yz
22
Reduce
  Apply an inverse production at the right end of the left string.
  If A → xy is a production, then
    Cbxy#ijk ⇒ CbA#ijk
23
The Example with Shift-Reduce Parsing

    # num * num + num       shift
    num # * num + num       shift
    num * # num + num       shift
    num * num # + num       reduce T → num
    num * T # + num         reduce T → num * T
    T # + num               shift
    T + # num               shift
    T + num #               reduce T → num
    T + T #                 reduce E → T
    T + E #                 reduce E → T + E
    E #
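The trace above can be replayed with two small functions, one per action. This is a minimal Python sketch rather than a parser: the action sequence in `ACTIONS` is copied from the slide, not decided by the program, and the names `shift`, `reduce`, and `run` are hypothetical.

```python
def shift(left, right):
    """Move the first remaining input terminal to the end of the left string."""
    return left + [right[0]], right[1:]

def reduce(left, right, lhs, rhs):
    """Replace `rhs` at the right end of the left string by `lhs`."""
    assert left[len(left) - len(rhs):] == rhs, "handle is not at the right end"
    return left[:len(left) - len(rhs)] + [lhs], right

# The action sequence from the slide, in order.
ACTIONS = [
    ("shift",), ("shift",), ("shift",),
    ("reduce", "T", ["num"]),
    ("reduce", "T", ["num", "*", "T"]),
    ("shift",), ("shift",),
    ("reduce", "T", ["num"]),
    ("reduce", "E", ["T"]),
    ("reduce", "E", ["T", "+", "E"]),
]

def run(tokens, actions):
    """Apply the actions and return each configuration as 'left # right'."""
    left, right = [], list(tokens)
    trace = [" ".join(left + ["#"] + right)]
    for act in actions:
        if act[0] == "shift":
            left, right = shift(left, right)
        else:
            left, right = reduce(left, right, act[1], act[2])
        trace.append(" ".join(left + ["#"] + right))
    return trace
```

Because `left` only ever changes at its right end, it behaves as a stack, which is the observation the later slide "The Stack" makes.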
24
A Shift-Reduce Parse in Detail (1)
    # num * num + num
  [Figure: the leaves num * num + num, no tree built yet]
25
A Shift-Reduce Parse in Detail (2)
    # num * num + num
    num # * num + num
26
A Shift-Reduce Parse in Detail (3)
    # num * num + num
    num # * num + num
    num * # num + num
27
A Shift-Reduce Parse in Detail (4)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
28
A Shift-Reduce Parse in Detail (5)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
  [Figure: partial parse tree after reducing T → num]
29
A Shift-Reduce Parse in Detail (6)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
    T # + num
  [Figure: partial parse tree after reducing T → num * T]
30
A Shift-Reduce Parse in Detail (7)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
    T # + num
    T + # num
31
A Shift-Reduce Parse in Detail (8)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
    T # + num
    T + # num
    T + num #
32
A Shift-Reduce Parse in Detail (9)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
    T # + num
    T + # num
    T + num #
    T + T #
  [Figure: partial parse tree after reducing the last num to T]
33
A Shift-Reduce Parse in Detail (10)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
    T # + num
    T + # num
    T + num #
    T + T #
    T + E #
  [Figure: partial parse tree after reducing E → T]
34
A Shift-Reduce Parse in Detail (11)
    # num * num + num
    num # * num + num
    num * # num + num
    num * num # + num
    num * T # + num
    T # + num
    T + # num
    T + num #
    T + T #
    T + E #
    E #
  [Figure: complete parse tree after reducing E → T + E]
35
The Stack
  The left string can be implemented by a stack.
  The top of the stack is the #.
  Shift pushes a terminal on the stack.
  Reduce pops 0 or more symbols off the stack (the production's rhs) and pushes a non-terminal on the stack (the production's lhs).
36
Key Issue (will be resolved by algorithms)
  How do we decide when to shift or reduce?
  Consider the step: num # * num + num
  We could reduce by T → num, giving T # * num + num
  A fatal mistake: there is then no way to reduce to the start symbol E.
37
Conflicts
  Generic shift-reduce strategy:
    If there is a handle on top of the stack, reduce.
    Otherwise, shift.
  But what if there is a choice?
    If it is legal to shift or reduce, there is a shift-reduce conflict.
    If it is legal to reduce by two different productions, there is a reduce-reduce conflict.
38
Conflict Example
  Consider the ambiguous grammar:
    E → E + E | E * E | (E) | num
39
One Shift-Reduce Parse

    Input                   Action
    # num * num + num       shift
    ...                     ...
    E * E # + num           reduce E → E * E
    E # + num               shift
    E + # num               shift
    E + num #               reduce E → num
    E + E #                 reduce E → E + E
    E #
40
Another Shift-Reduce Parse

    Input                   Action
    # num * num + num       shift
    ...                     ...
    E * E # + num           shift
    E * E + # num           shift
    E * E + num #           reduce E → num
    E * E + E #             reduce E → E + E
    E * E #                 reduce E → E * E
    E #
41
Observations
  In the second step, E * E # + num, we can either shift or reduce by E → E * E.
  The choice determines the precedence of + and * (the same kind of choice between two equal operators determines associativity).
  As noted previously, the grammar can be rewritten to enforce precedence.
  Precedence declarations are an alternative.
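To see that the two parses really differ, the conflict can be resolved both ways on concrete numbers. The sketch below is a simplified Python illustration, not the slides' method: it handles only E → E + E | E * E | num, folds the reduction E → num into the shift, and uses a hypothetical flag `prefer_reduce` to decide every shift-reduce conflict the same way.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(tokens, prefer_reduce):
    """Shift-reduce evaluation for E -> E + E | E * E | num.

    Whenever `E op E` is on top of the stack and input remains, both shift
    and reduce are legal; `prefer_reduce` picks one of them.
    """
    stack, rest = [], list(tokens)
    while True:
        can_reduce = len(stack) >= 3 and stack[-2] in OPS
        if can_reduce and (prefer_reduce or not rest):
            right, op, left = stack.pop(), stack.pop(), stack.pop()
            stack.append(OPS[op](left, right))   # reduce E -> E op E
        elif rest:
            stack.append(rest.pop(0))            # shift; a number already counts as an E
        else:
            break
    assert len(stack) == 1
    return stack[0]
```

With the tokens `[2, "*", 3, "+", 4]`, reducing first groups the input as (2 * 3) + 4, while shifting first groups it as 2 * (3 + 4), so the two strategies return different values for the same string.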
42
Overview
  LR(k) parsing
    L: scan input Left to right
    R: produce a Rightmost derivation (in reverse)
    k: tokens of lookahead
  LR(0): zero tokens of lookahead
  SLR: Simple LR; like LR(0) but uses FOLLOW sets to build more "precise" parsing tables
43
Basic Terminologies
  Handle: a substring that matches the right side of a production, and whose reduction to that production's left side constitutes one step along the reverse of the rightmost derivation of the string from the start nonterminal of the grammar.
44
Model of Shift-Reduce Parsing
  Stack + input = current right-sentential form.
  Locate the handle during parsing: shift zero or more terminals (tokens) onto the stack until a handle β is on top of the stack.
  Replace the handle with the proper non-terminal (handle pruning): reduce β to A, where A → β.
45
Model of an LR Parser
  [Figure: model of an LR parser]
46
Problem: when to shift, when to reduce?
  Recall the grammar:
    E → T + E | T
    T → num * T | num | (E)
  How do we know when to reduce and when to shift?
47
Model of Shift-Reduce Parsing
  Stack + input = current right-sentential form.
  Locate the handle during parsing: shift zero or more terminals onto the stack until a handle is on top of the stack.
  Replace the handle with the proper non-terminal (handle pruning).
48
What we need to know to do LR parsing
  LR(0) states
    describe the states in which the parser can be
    Note: LR(0) states are used by both LR(0) and SLR parsers
  Parsing tables
    transitions between LR(0) states
    actions to take at a transition: shift, reduce, accept, error
  How to construct LR(0) states
  How to construct parsing tables
  How to drive the parser
49
An LR(0) state = a set of LR(0) items
  An LR(0) item [X → α.β] says that
    the parser is looking for an X
    it has an α on top of the stack
    it expects to find in the input a string derived from β.
  Notes:
    [X → α.aβ] means that if a is on the input, it can be shifted. That is: a is a correct token to see on the input, and shifting a would not "over-shift" (the stack is still a viable prefix).
    [X → α.] means that we could reduce by X → α.
50
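LR(0) states
  [Figure: DFA of LR(0) item sets. The item sets are listed below; the edges, labeled E, T, (, num, * and ), are not fully recoverable from the slide text.]

    { S' → .E,  E → .T,  E → .T + E,  T → .(E),  T → .num * T,  T → .num }
    { S' → E. }
    { E → T.,  E → T. + E }
    { T → num. * T,  T → num. }
    { T → (.E),  E → .T,  E → .T + E,  T → .(E),  T → .num * T,  T → .num }
    { E → T + E. }
    { E → T +. E,  E → .T,  E → .T + E,  T → .(E),  T → .num * T,  T → .num }
    { T → num *.T,  T → .(E),  T → .num * T,  T → .num }
    { T → num * T. }
    { T → (E.) }
    { T → (E). }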
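The item sets on this slide can be computed with the usual closure and goto operations. The following Python sketch assumes the augmented example grammar (S' → E added to the grammar used throughout the deck); the function names and the item encoding as `(lhs, rhs, dot_position)` tuples are choices made for this illustration.

```python
GRAMMAR = [
    ("S'", ("E",)),
    ("E", ("T",)),
    ("E", ("T", "+", "E")),
    ("T", ("(", "E", ")")),
    ("T", ("num", "*", "T")),
    ("T", ("num",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add X -> .gamma for every nonterminal X that appears right after a dot."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(l, r, d + 1) for l, r, d in state if d < len(r) and r[d] == symbol}
    return closure(moved) if moved else None

def lr0_states():
    """Return all reachable item sets and the DFA edges between them."""
    start = closure({("S'", ("E",), 0)})
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop()
        for sym in {r[d] for _, r, d in state if d < len(r)}:
            target = goto(state, sym)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges
```

For this grammar the construction yields eleven item sets, matching the eleven sets listed on the slide. Only the start state is guaranteed to be index 0; the numbering of the others depends on set iteration order and carries no meaning.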
51
SLR Parsing
  Remember the state of the automaton on each prefix of the stack.
  Change the stack to contain pairs ⟨Symbol, DFA State⟩.
52
SLR Parsing (Contd.)
  For a stack ⟨sym1, state1⟩ ... ⟨symn, staten⟩, staten is the final state of the DFA on sym1 … symn.
  Detail: the bottom of the stack is ⟨any, start⟩, where
    any is any dummy symbol
    start is the start state of the DFA
53
Goto Table
  Define Goto[i, A] = j if the DFA has a transition from state i to state j on A, where A is a nonterminal.
  Goto is just the transition function of the DFA.
  It is one of the two parsing tables.
54
Parser Moves
  Shift x
    Push ⟨a, x⟩ on the stack
    a is the current input
    x is a DFA state
  Reduce X → α
    As before
  Accept
  Error
55
Action Table
  For each state s_i and terminal a:
    If s_i has item X → α.aβ and there is a transition on terminal a from state i to state j, then Action[i, a] = shift j
    If s_i has item X → α. and a ∈ Follow(X) and X ≠ S', then Action[i, a] = reduce X → α
    If s_i has item S' → S., then Action[i, $] = accept
    Otherwise, Action[i, a] = error
56
SLR Parsing Algorithm
  Let I = w$ be the initial input
  Let j = 0
  Let DFA state 1 have item S' → .S
  Let stack = ⟨dummy, 1⟩
  repeat
    case action[top_state(stack), I[j]] of
      shift k: push ⟨I[j++], k⟩
      reduce X → α: pop |α| pairs, I[--j] = X   // prepend X to input
      accept: halt normally
      error: halt and report error
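The algorithm above can be put together with the table rules from the previous slides in a short Python sketch. Several things here are assumptions of the sketch rather than the slides' text: the FOLLOW sets are written in by hand for the example grammar, the start state is numbered 0 instead of 1, and a reduce pushes the pair ⟨X, Goto[state, X]⟩ directly, which is the common formulation and has the same effect as prepending X to the input and then moving over it. All function and variable names are hypothetical.

```python
GRAMMAR = [("S'", ("E",)), ("E", ("T", "+", "E")), ("E", ("T",)),
           ("T", ("num", "*", "T")), ("T", ("num",)), ("T", ("(", "E", ")"))]
NONTERMINALS = {"S'", "E", "T"}
# FOLLOW sets worked out by hand for this grammar (an assumption of the sketch).
FOLLOW = {"E": {"$", ")"}, "T": {"+", "$", ")"}}

def closure(items):
    """LR(0) closure: add X -> .gamma for each nonterminal X after a dot."""
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def build_tables():
    """Build the Action and Goto tables following the rules on the slides."""
    states = [closure({("S'", ("E",), 0)})]
    action, goto = {}, {}
    i = 0
    while i < len(states):
        for sym in sorted({r[d] for _, r, d in states[i] if d < len(r)}):
            target = closure({(l, r, d + 1) for l, r, d in states[i]
                              if d < len(r) and r[d] == sym})
            if target not in states:
                states.append(target)
            j = states.index(target)
            if sym in NONTERMINALS:
                goto[(i, sym)] = j
            else:
                action[(i, sym)] = ("shift", j)
        i += 1
    for i, state in enumerate(states):
        for lhs, rhs, dot in state:
            if dot == len(rhs):                      # completed item X -> alpha.
                if lhs == "S'":
                    action[(i, "$")] = ("accept",)
                else:
                    for a in FOLLOW[lhs]:
                        assert (i, a) not in action, "grammar is not SLR"
                        action[(i, a)] = ("reduce", lhs, rhs)
    return action, goto

def slr_parse(tokens):
    """Parse `tokens`; return the reductions performed, or raise SyntaxError."""
    action, goto = build_tables()
    stack = [("dummy", 0)]                           # pairs (symbol, DFA state)
    tokens = list(tokens) + ["$"]
    j, reductions = 0, []
    while True:
        act = action.get((stack[-1][1], tokens[j]), ("error",))
        if act[0] == "shift":
            stack.append((tokens[j], act[1]))
            j += 1
        elif act[0] == "reduce":
            _, lhs, rhs = act
            del stack[len(stack) - len(rhs):]        # pop |alpha| pairs
            stack.append((lhs, goto[(stack[-1][1], lhs)]))
            reductions.append((lhs, rhs))
        elif act[0] == "accept":
            return reductions
        else:
            raise SyntaxError(f"unexpected {tokens[j]!r} at position {j}")
```

Calling `slr_parse(["num", "*", "num", "+", "num"])` returns the same five reductions as the hand-worked shift-reduce trace earlier in the deck. The parser never inspects the symbol half of a stack pair, only the state half.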
57
Notes on SLR Parsing Algorithm
  Note that the algorithm uses only the DFA states and the input.
  The stack symbols are never used!
  However, we still need the symbols for semantic actions.
58
The Compiler So Far
  Lexical analysis
    Detects inputs with illegal tokens
  Parsing
    Detects inputs with ill-formed parse trees
  Semantic analysis
    Last "front end" phase
    Catches all remaining errors
59
Typical Semantic Errors
  Multiple declarations: a variable should be declared (in the same scope) at most once.
  Undeclared variable: a variable should not be used before being declared.
  Type mismatch: the type of the left-hand side of an assignment should match the type of the right-hand side.
  Wrong arguments: methods should be called with the right number and types of arguments.
60
Sample Semantic Analyzer
  For each scope in the program:
    Process the declarations:
      add new entries to the symbol table (or a similar structure), and
      report any variables that are multiply declared.
    Process the statements:
      find uses of undeclared variables,
      use the symbol-table information to determine the type of each expression, and
      find type errors.
61
Scope Rules for Pascal-
  Rule 6.1: All constants, types, variables, and procedures defined in the same block must have different names.
  Rule 6.2: A constant, type, or variable defined in a block is normally known from the end of its declaration to the end of the block. A procedure defined in a block B is normally known from the beginning of the procedure to the end of the block B.
  Rule 6.3: Consider a block Q that defines an object x. If Q contains a block R that defines another object named x, the first object is unknown in the scope of the second object.
62
Pascal- Program (1)
    { Begin Standard Block }
    program P;
      type T = array[1..100] of integer;
      var x: T;

      procedure Q(x: integer);
        const c = 13;
      begin ... x ... end {Q};

      procedure R;
        var b, c: Boolean;
      begin ... x ... end {R};

    begin ... end. {P}
    { End Standard Block }
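The scope rules can be illustrated with a block-structured symbol table. The following Python sketch is an assumption-laden illustration, not the compiler's actual data structure: the class `SymbolTable` and its methods `new_block`, `end_block`, `define`, and `find` are hypothetical names chosen to echo the block handling in the deck, and each entry stores only a descriptive string.

```python
class SymbolTable:
    """A stack of blocks; each block maps a name to a description of the object."""

    def __init__(self):
        self.blocks = [{}]        # the outermost ("standard") block
        self.errors = []

    def new_block(self):
        """Enter a nested block."""
        self.blocks.append({})

    def end_block(self):
        """Leave the current block; its names become unknown."""
        self.blocks.pop()

    def define(self, name, kind):
        """Rule 6.1: names defined in the same block must differ."""
        block = self.blocks[-1]
        if name in block:
            self.errors.append(f"multiple declaration of {name}")
        else:
            block[name] = kind

    def find(self, name):
        """Rule 6.3: the innermost definition hides outer ones."""
        for block in reversed(self.blocks):
            if name in block:
                return block[name]
        self.errors.append(f"undeclared name {name}")
        return None
```

Walking through the program above with this table: inside Q, `find("x")` yields the integer parameter; inside R, where no x is defined, it yields the array variable of program P; and c may be defined in both Q and R without error because the two definitions live in different blocks.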
63
Pascal- Program (2)
    {Constant = Numeral | ConstantName.}
    procedure Constant(Stop: Symbols);
    begin
      if Symbol = Numeral1 then
        Expect(Numeral1, Stop)
      else if Symbol = Name1 then
        begin
          Find(Argument);
          Expect(Name1, Stop)
        end
      else
        SyntaxError(Stop)
    end;
64
Pascal- Program (3)
    {ConstantDefinition = ConstantName '=' Constant ';'.}
    procedure ConstantDefinition(Stop: Symbols);
    begin
      ExpectName(Name, Symbols[Equal1, Semicolon1] + ConstantSymbols + Stop);
      Expect(Equal1, ConstantSymbols + Symbols[Semicolon1] + Stop);
      Constant(Symbols[Semicolon1] + Stop);
      Define(Name);
      Expect(Semicolon1, Stop)
    end;
65
Pascal- Program (4)
    {Program = 'program' ProgramName ';' BlockBody '.'}
    procedure Programx(Stop: Symbols);
    begin
      Expect(Program1, Symbols[Name1, Semicolon1, Period1] + BlockSymbols + Stop);
      Expect(Name1, Symbols[Semicolon1, Period1] + BlockSymbols + Stop);
      Expect(Semicolon1, Symbols[Period1] + BlockSymbols + Stop);
      NewBlock;
      BlockBody(Symbols[Period1] + Stop);
      EndBlock;
      Expect(Period1, Stop)
    end;
66
Pascal- Program (5-1)
    {Constant = Numeral | ConstantName.}
    procedure Constant(var Value: integer; var Typex: Pointer; Stop: Symbols);
    begin
      if Symbol = Numeral1 then
        begin
          Value := Argument;
          Typex := TypeInteger;
          Expect(Numeral1, Stop)
        end
      else if Symbol = Name1 then
        begin
          Find(Argument, Object);
          if Object@.Kind = Constantx then
            begin
              Value := Object@.ConstValue;
              Typex := Object@.ConstType;
            end
67
Pascal- Program (5-2)
          else
            begin
              KindError(Object);
              Value := 0;
              Typex := TypeUniversal;
            end;
          Expect(Name1, Stop)
        end
      else
        begin
          SyntaxError(Stop);
          Value := 0;
          Typex := TypeUniversal;
        end;
68
Pascal- Program (6)
    {ConstantDefinition = ConstantName '=' Constant ';'.}
    procedure ConstantDefinition(Stop: Symbols);
    var
      Name, Value: integer;
      Constx, Typex: Pointer;
    begin
      ExpectName(Name, Symbols[Equal1, Semicolon1] + ConstantSymbols + Stop);
      Expect(Equal1, ConstantSymbols + Symbols[Semicolon1] + Stop);
      Constant(Value, Typex, Symbols[Semicolon1] + Stop);
      Define(Name, Constantx, Constx);
      Constx@.ConstValue := Value;
      Constx@.ConstType := Typex;
      Expect(Semicolon1, Stop)
    end;
69
Static and Dynamic Scope
    #include <stdio.h>

    int x = 1;
    char y = 'a';

    void p() {
      double x = 2.5;
      printf("%c\n", y);
      { int y[10]; }
    }

    void q() {
      int y = 42;
      printf("%d\n", x);
      p();
    }

    int main() {
      char x = 'b';
      q();
      return 0;
    }
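The difference between the two scoping disciplines can be simulated directly. The Python sketch below rests on one reading of the slide's program, which the extracted text does not fully confirm: x = 1 and y = 'a' are global, main declares char x = 'b' and calls q, q declares int y = 42, prints x and calls p, and p declares double x = 2.5 and prints y. The function `simulate` and its helpers are hypothetical; the `%d` and `%c` conversions are imitated with `ord` and `chr`.

```python
def simulate(dynamic):
    """Return the lines the program would print under static or dynamic scoping."""
    out = []
    globals_ = {"x": 1, "y": "a"}
    call_stack = [globals_]            # environments of the active calls, oldest first

    def lookup(name, static_env):
        if dynamic:
            # Dynamic scope: search the most recent active call first.
            for env in reversed(call_stack):
                if name in env:
                    return env[name]
            raise NameError(name)
        # Static scope: the enclosing environment is fixed by the program text.
        return static_env[name]

    def p():
        call_stack.append({"x": 2.5})
        y = lookup("y", globals_)      # printf("%c\n", y)
        out.append(y if isinstance(y, str) else chr(y))
        call_stack.pop()

    def q():
        call_stack.append({"y": 42})
        x = lookup("x", globals_)      # printf("%d\n", x)
        out.append(str(x if isinstance(x, int) else ord(x)))
        p()
        call_stack.pop()

    call_stack.append({"x": "b"})      # main's local: char x = 'b'
    q()
    return out
```

Under static scoping both lookups resolve to the globals, so q sees x = 1 and p sees y = 'a'. Under dynamic scoping q finds main's x ('b', printed as its character code) and p finds q's y (42, printed as a character).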