Published by Tavion Billy. Modified over 9 years ago.
1
Chap. 5, Top-Down Parsing
J. H. Wang
Mar. 29, 2011
2
Outline
Overview
LL(k) Grammars
Recursive-Descent LL(1) Parsers
Table-Driven LL(1) Parsers
Obtaining LL(1) Grammars
A Non-LL(1) Language
Properties of LL(1) Parsers
Parse Table Representation
Syntactic Error Recovery and Repair
3
Overview
Two forms of top-down parsers
  – Recursive-descent parsers
  – Table-driven LL parsers: LL(k), to be explained later
Compiler compilers (or parser generators)
  – With a CFG as the language's definition, parsers can be constructed automatically
  – A language revision, update, or extension is applied by generating a new parser from the changed grammar
  – The grammar is proved unambiguous if parser construction succeeds
4
Top-Down Parsing
Top-down
  – Grows a parse tree from the root to the leaves
Predictive
  – Must predict which production rule to apply
LL(k)
  – Scans the input left to right, produces a leftmost derivation, uses k symbols of lookahead
Recursive descent
  – Can be implemented by a set of mutually recursive procedures
5
LL(k) Grammars
Recall from Chap. 2
  – There is a parsing procedure for each nonterminal A
  – The procedure is responsible for accomplishing one step of the derivation by applying one of A's productions
  – It chooses the production by inspecting the next k tokens
The Predict set of a production for A is the set of tokens that trigger that production
  – The Predict set is determined by the production's right-hand side (RHS)
6
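We need a strategy for choosing productions
  – $\mathrm{Predict}_k(p)$: the set of length-$k$ token strings that predict the application of rule $p$
Input string: $\alpha a \beta \in \Sigma^*$, with the derivation so far $S \Rightarrow^*_{lm} \alpha A y_1 \ldots y_n$
  – $P = \{\, p \in \mathrm{ProductionsFor}(A) \mid a \in \mathrm{Predict}(p) \,\}$
$P$ is the empty set -> syntax error
$P$ contains more than one production -> nondeterminism
$P$ contains exactly one production -> the parse proceeds with that production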
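The three cases for P can be sketched in Python as follows. This is only an illustration: the function name `choose_production`, the dictionary layout (keys are `(nonterminal, rhs)` pairs, values are Predict sets), and the sample Predict sets are assumptions, not part of the slides.

```python
def choose_production(predict, nonterminal, lookahead):
    """Return the RHS of the unique production for `nonterminal` predicted by `lookahead`."""
    # P = { p in ProductionsFor(A) | a in Predict(p) }
    candidates = [rhs for (lhs, rhs), tokens in predict.items()
                  if lhs == nonterminal and lookahead in tokens]
    if not candidates:            # P is empty: syntax error
        raise SyntaxError(f"no production for {nonterminal} on {lookahead!r}")
    if len(candidates) > 1:       # more than one production: nondeterminism
        raise ValueError(f"{nonterminal} is not LL(1) on {lookahead!r}")
    return candidates[0]          # exactly one production

# Hypothetical Predict sets for: Rest -> + Term Rest | λ  (λ is the empty tuple)
PREDICT = {
    ("Rest", ("+", "Term", "Rest")): {"+"},
    ("Rest", ()): {")", "$"},
}
```

With these sets, `choose_production(PREDICT, "Rest", "+")` selects the first production, and a lookahead of `"id"` raises `SyntaxError`.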
7
How to Compute Predict(p)
To predict production $p$: $A \rightarrow X_1 \ldots X_m$, $m \ge 0$
  – The set of terminal symbols that are first produced in some derivation from $X_1 \ldots X_m$
  – If $X_1 \ldots X_m$ can derive $\lambda$, also those terminal symbols that can follow $A$
  – (Fig. 5.1)
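The two ingredients above (terminals that start the RHS, plus terminals that can follow A when the RHS can vanish) can be sketched as a fixed-point computation. This is a minimal Python sketch under stated assumptions, not the algorithm of Fig. 5.1: the grammar encoding (a dict from nonterminal to a list of RHS tuples, with the empty tuple standing for λ), the names `first_of` and `predict_sets`, and the sample grammar are all hypothetical.

```python
EPS = "λ"  # marker for the empty string (an assumed convention)

def first_of(seq, first, nonterminals):
    """First set of a symbol sequence; includes EPS if the whole sequence can derive λ."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in nonterminals else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out

def predict_sets(grammar, start, end="$"):
    nonterminals = set(grammar)
    first = {a: set() for a in nonterminals}
    follow = {a: set() for a in nonterminals}
    follow[start].add(end)
    changed = True
    while changed:  # grow First and Follow until nothing changes
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new_first = first_of(rhs, first, nonterminals)
                if not new_first <= first[lhs]:
                    first[lhs] |= new_first
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym not in nonterminals:
                        continue
                    tail = first_of(rhs[i + 1:], first, nonterminals)
                    new_follow = tail - {EPS}
                    if EPS in tail:  # the rest can vanish: inherit Follow(lhs)
                        new_follow |= follow[lhs]
                    if not new_follow <= follow[sym]:
                        follow[sym] |= new_follow
                        changed = True
    predict = {}
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of(rhs, first, nonterminals)
            tokens = rhs_first - {EPS}
            if EPS in rhs_first:  # RHS can derive λ: add what can follow lhs
                tokens |= follow[lhs]
            predict[(lhs, rhs)] = tokens
    return predict

# Hypothetical grammar: E -> T E'   E' -> + T E' | λ   T -> id
GRAMMAR = {"E": [("T", "E'")], "E'": [("+", "T", "E'"), ()], "T": [("id",)]}
```

For this grammar, the λ-production of `E'` is predicted only by the end marker `$`, which comes from the Follow set rather than from the RHS.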
9
For an LL(1) grammar, the productions for each nonterminal A must have disjoint Predict sets
Not all CFGs are LL(1)
  – More lookahead may be needed: LL(k), k>1
  – A more powerful parsing method may be required (Chap. 6)
  – The grammar may be ambiguous
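The disjointness requirement can be checked mechanically once the Predict sets are known. The sketch below assumes the Predict sets are given as a dict keyed by `(nonterminal, rhs)`; the function name `ll1_conflicts` and both sample tables are illustrative assumptions.

```python
from itertools import combinations

def ll1_conflicts(predict):
    """List pairs of productions for the same nonterminal whose Predict sets overlap."""
    conflicts = []
    for (p, tokens_p), (q, tokens_q) in combinations(predict.items(), 2):
        overlap = tokens_p & tokens_q
        if p[0] == q[0] and overlap:
            conflicts.append((p, q, overlap))
    return conflicts

# Hypothetical Predict sets: two productions for Stmt both start with "if"
BAD = {
    ("Stmt", ("if", "Expr", "then", "Stmt")): {"if"},
    ("Stmt", ("if", "Expr", "then", "Stmt", "else", "Stmt")): {"if"},
    ("Stmt", ("other",)): {"other"},
}
GOOD = {
    ("Rest", ("+", "Term", "Rest")): {"+"},
    ("Rest", ()): {")", "$"},
}
```

An empty result means the grammar passes the LL(1) test for the given sets; `BAD` yields one conflict on the token `if`.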
12
[Figure: pseudocode listing that uses S, MATCH, PEEK, ADVANCE, and ERROR]
13
Recursive-Descent LL(1) Parsers
Input: token stream ts
  – PEEK(): examines the next input token without advancing the input
  – ADVANCE(): advances the input by one token
To construct a recursive-descent parser
  – We write a separate procedure for each nonterminal A
  – For each production $p_i$, we check each symbol in the RHS $X_1 \ldots X_m$
    Terminal symbol: MATCH(ts, $X_i$)
    Nonterminal symbol: call $X_i$(ts)
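A minimal Python sketch of this construction follows, with one procedure per nonterminal. The grammar (Expr -> Term Rest, Rest -> + Term Rest | λ, Term -> id | ( Expr )), the `TokenStream` class, and the Predict sets hard-coded in each procedure are assumptions chosen for illustration; only the roles of PEEK, ADVANCE, and MATCH come from the slides.

```python
class TokenStream:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        """Examine the next token without advancing."""
        return self.tokens[self.pos]

    def advance(self):
        """Advance by one token, never moving past the end marker."""
        if self.pos < len(self.tokens) - 1:
            self.pos += 1

def match(ts, expected):
    if ts.peek() != expected:
        raise SyntaxError(f"expected {expected!r}, found {ts.peek()!r}")
    ts.advance()

def expr(ts):                      # Expr -> Term Rest
    if ts.peek() in ("id", "("):
        term(ts)
        rest(ts)
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in Expr")

def rest(ts):
    if ts.peek() == "+":           # Rest -> + Term Rest
        match(ts, "+")
        term(ts)
        rest(ts)
    elif ts.peek() in (")", "$"):  # Rest -> λ, predicted by what can follow Rest
        pass
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in Rest")

def term(ts):
    if ts.peek() == "id":          # Term -> id
        match(ts, "id")
    elif ts.peek() == "(":         # Term -> ( Expr )
        match(ts, "(")
        expr(ts)
        match(ts, ")")
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in Term")

def parse(tokens):
    ts = TokenStream(tokens)
    expr(ts)
    match(ts, "$")
    return True
```

For example, `parse(["id", "+", "(", "id", ")"])` succeeds, while `parse(["id", "+"])` raises `SyntaxError` inside `term`.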
14
[Figure: pseudocode listing that uses PEEK]
15
[Figure: pseudocode listing that uses MATCH]
16
Table-Driven LL(1) Parsers
Creating recursive-descent parsers can be automated, but there are drawbacks
  – Size of the parser code
  – Inefficiency: overhead of method calls and returns
To create table-driven parsers, we use a stack to simulate the actions of MATCH() and the calls to the nonterminals' procedures
  – Terminal symbol: MATCH
  – Nonterminal symbol: table lookup
  – (Fig. 5.8)
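The stack discipline can be sketched in Python as below. This is an illustrative driver, not a transcription of Fig. 5.8: the name `ll1_parse`, the table layout (a dict from `(nonterminal, lookahead)` to an RHS tuple), and the hand-written table for a small expression grammar are assumptions.

```python
def ll1_parse(tokens, table, start, end="$"):
    """Return the productions applied, in leftmost-derivation order."""
    nonterminals = {lhs for (lhs, _) in table}
    tokens = list(tokens) + [end]
    pos = 0
    stack = [end, start]              # top of stack is the end of the list
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in nonterminals:       # nonterminal: table lookup
            rhs = table.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"no entry for ({top}, {lookahead!r})")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))   # push RHS so its first symbol is on top
        elif top == lookahead:        # terminal: match and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    return applied

# Hypothetical table for: Expr -> Term Rest, Rest -> + Term Rest | λ, Term -> id | ( Expr )
TABLE = {
    ("Expr", "id"): ("Term", "Rest"),
    ("Expr", "("): ("Term", "Rest"),
    ("Rest", "+"): ("+", "Term", "Rest"),
    ("Rest", ")"): (),
    ("Rest", "$"): (),
    ("Term", "id"): ("id",),
    ("Term", "("): ("(", "Expr", ")"),
}
```

Each loop iteration either pops a terminal and consumes one token or replaces a nonterminal by an RHS, so no procedure calls are needed.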
17
[Figure: pseudocode for the table-driven parser, using PUSH, POP, MATCH, PEEK, APPLY, and ERROR]
18
How to Build LL(1) Parse Table
The table is indexed by the top-of-stack (TOS) symbol and the next input token
  – Row: nonterminal symbol
  – Column: next input token
  – (Fig. 5.9)
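Given Predict sets, filling the table amounts to writing each production into the cells of its row that are selected by its Predict tokens. The Python sketch below assumes Predict sets keyed by `(nonterminal, rhs)`; the name `fill_table` and the sample sets are hypothetical, and the procedure of Fig. 5.9 may differ in detail.

```python
def fill_table(predict):
    """Build a dict mapping (nonterminal, token) to the RHS to apply."""
    table = {}
    for (lhs, rhs), tokens in predict.items():
        for token in tokens:
            if (lhs, token) in table:
                # Two productions claim the same cell: the grammar is not LL(1)
                raise ValueError(f"conflict in cell ({lhs}, {token!r})")
            table[(lhs, token)] = rhs
    return table

# Hypothetical Predict sets for: Rest -> + Term Rest | λ, Term -> id | ( Expr )
PREDICT = {
    ("Rest", ("+", "Term", "Rest")): {"+"},
    ("Rest", ()): {")", "$"},
    ("Term", ("id",)): {"id"},
    ("Term", ("(", "Expr", ")")): {"("},
}
```

Cells that receive no production stay absent from the dict; a parser treats a missing cell as a syntax error.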
19
[Figure: pseudocode for FILLTABLE]
21
Obtaining LL(1) Grammars
It's easy to violate the requirement of a unique prediction for each combination of nonterminal and lookahead symbol
  – Common prefixes
  – Left recursion
22
Common Prefixes
Two productions for the same nonterminal begin with the same string of grammar symbols
  – Ex. (Fig. 5.12): not LL(k)
Factoring transformation
  – Fig. 5.13
  – Ex. (Fig. 5.14)
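One way to factor common prefixes is sketched below in Python: find the longest prefix shared by at least two productions of a nonterminal, move the differing tails into a fresh nonterminal, and repeat. This is an illustration of the idea, not a transcription of Fig. 5.13; the name `left_factor`, the `_tail` naming scheme for fresh nonterminals (assumed not to clash with existing names), and the sample grammar are assumptions.

```python
def common_prefix(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]

def left_factor(grammar):
    """grammar maps each nonterminal to a list of RHS tuples; () stands for λ."""
    grammar = {lhs: list(alts) for lhs, alts in grammar.items()}
    work = list(grammar)
    counter = 0
    while work:
        lhs = work.pop()
        alts = grammar[lhs]
        best = ()                         # longest prefix shared by two alternatives
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                prefix = common_prefix(alts[i], alts[j])
                if len(prefix) > len(best):
                    best = prefix
        if not best:
            continue
        counter += 1
        fresh = f"{lhs}_tail{counter}"    # hypothetical fresh nonterminal name
        shared = [r for r in alts if r[:len(best)] == best]
        others = [r for r in alts if r[:len(best)] != best]
        grammar[lhs] = others + [best + (fresh,)]
        grammar[fresh] = [r[len(best):] for r in shared]
        work.extend([lhs, fresh])         # both may need further factoring
    return grammar

# Hypothetical grammar with a common prefix
G = {"Stmt": [("if", "Expr", "then", "StmtList", "endif"),
              ("if", "Expr", "then", "StmtList", "else", "StmtList", "endif")]}
```

After factoring, the choice between the two forms is postponed until the parser sees `endif` or `else`, which a single lookahead token can decide.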
23
[Figure: pseudocode for FACTOR]
24
[Figure: pseudocode for ELIMINATELEFTRECURSION]
25
Left Recursion
A production is left recursive if its LHS symbol is also the first symbol of its RHS
  – E.g. $\mathit{StmtList} \rightarrow \mathit{StmtList}\ ;\ \mathit{Stmt}$
  – $A \rightarrow A\alpha \mid \beta$
  – (Fig. 5.15 & Fig. 5.16)
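A common textbook way to remove immediate left recursion rewrites A -> Aα | β as A -> β A_rest and A_rest -> α A_rest | λ. The Python sketch below implements that standard rewrite, which may differ in form from the procedure in Fig. 5.15; the function name, the `_rest` naming scheme, and the added Stmt alternative in the sample grammar are assumptions.

```python
def eliminate_immediate_left_recursion(grammar):
    """grammar maps each nonterminal to a list of RHS tuples; () stands for λ."""
    result = {}
    for lhs, alts in grammar.items():
        # Split A -> A α (keep α) from the non-recursive alternatives β
        alphas = [rhs[1:] for rhs in alts if rhs and rhs[0] == lhs]
        betas = [rhs for rhs in alts if not rhs or rhs[0] != lhs]
        if not alphas:
            result[lhs] = list(alts)
            continue
        rest = lhs + "_rest"              # hypothetical fresh nonterminal name
        result[lhs] = [beta + (rest,) for beta in betas]
        result[rest] = [alpha + (rest,) for alpha in alphas] + [()]
    return result

# StmtList -> StmtList ; Stmt | Stmt   (the second alternative is assumed here)
G = {"StmtList": [("StmtList", ";", "Stmt"), ("Stmt",)]}
```

The rewritten grammar derives the same statement lists, but each new production begins with a symbol other than its own LHS, so an LL(1) parser no longer loops on it.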
27
A Non-LL(1) Language
Almost all common programming language constructs are LL(1)
  – One exception: if-then-else (the dangling else problem)
  – Can be resolved by mandating that each else is matched to its closest unmatched then
  – (Fig. 5.17)
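The closest-unmatched-then rule falls out naturally in a recursive-descent parser that consumes an `else` as soon as it sees one. The Python sketch below is an illustration under assumptions: the token vocabulary (`if`, `expr`, `then`, `else`, `other`), the tuple-based tree shape, and the name `parse_stmt` are hypothetical.

```python
def parse_stmt(tokens, pos=0):
    """Parse Stmt -> if expr then Stmt [else Stmt] | other; return (tree, next position)."""
    tok = tokens[pos] if pos < len(tokens) else None
    if tok == "other":
        return "other", pos + 1
    if tok == "if":
        if tokens[pos + 1:pos + 3] != ["expr", "then"]:
            raise SyntaxError("expected 'expr then' after 'if'")
        then_part, pos = parse_stmt(tokens, pos + 3)
        # Greedy choice: an else seen here belongs to this, the closest, then
        if pos < len(tokens) and tokens[pos] == "else":
            else_part, pos = parse_stmt(tokens, pos + 1)
            return ("if", then_part, else_part), pos
        return ("if", then_part, None), pos
    raise SyntaxError(f"unexpected token {tok!r}")

TOKENS = "if expr then if expr then other else other".split()
```

On `TOKENS`, the inner `if` receives the `else` and the outer `if` has no else branch, which is the grouping the rule mandates.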
29
Ambiguous (Chap. 6)
  – E.g. if expr then if expr then other else other
    if expr then { if expr then other else other }
    if expr then { if expr then other } else other
    -> at least two distinct parses
Dangling bracket language (DBL)
  – $\mathrm{DBL} = \{\, [^i\, ]^j \mid i \ge j \ge 0 \,\}$
    if expr then Stmt -> [ (opening bracket)
    else Stmt -> ] (optional closing bracket)
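Membership in DBL is easy to state directly from the definition: some number of opening brackets followed by no more than that many closing brackets. A small Python sketch follows; the function name `in_dbl` is an assumption.

```python
def in_dbl(s):
    """True if s has the form '[' * i + ']' * j with i >= j >= 0."""
    opens = len(s) - len(s.lstrip("["))   # length of the leading run of '['
    rest = s[opens:]
    return set(rest) <= {"]"} and len(rest) <= opens
```

The test itself is simple; the difficulty for an LL(k) parser lies in deciding, with bounded lookahead, which opening brackets will later be closed.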
30
Fig. 5.18(a)
  – $S \rightarrow [\ S\ CL \mid \lambda$
    $CL \rightarrow\ ] \mid \lambda$
    E.g. [[]
Fig. 5.18(b)
  – $S \rightarrow [\ S \mid T$
    $T \rightarrow [\ T\ ] \mid \lambda$
31
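It's not LL(k) for any k: the same lookahead predicts both productions for S
  – $[ \in \mathrm{Predict}(S \rightarrow [\,S)$ and $[ \in \mathrm{Predict}(S \rightarrow T)$
  – $[[ \in \mathrm{Predict}_2(S \rightarrow [\,S)$ and $[[ \in \mathrm{Predict}_2(S \rightarrow T)$
  – …
  – $[^k \in \mathrm{Predict}_k(S \rightarrow [\,S)$ and $[^k \in \mathrm{Predict}_k(S \rightarrow T)$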
32
Properties of LL(1) Parsers
A correct, leftmost parse is constructed
All LL(1) grammars are unambiguous
All table-driven LL(1) parsers operate in linear time and space with respect to the length of the parsed input
33
Thanks for Your Attention!