3.2 Language and Grammar
3.2.7 Left Factoring
When two alternatives share a common prefix, it is unclear which production to use: A → αβ1 | αβ2
Left factoring the grammar: A → αA', A' → β1 | β2
Example: Dangling Else
stmt → if expr then stmt else stmt | if expr then stmt | other
Left factoring:
stmt → if expr then stmt optional_else_part | other
optional_else_part → else stmt | ε
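The transformation above can be sketched in Python. This is a minimal, one-level sketch rather than a complete algorithm: the function names, the tuple representation of alternatives (with `()` standing for ε), and the primed name given to the new non-terminal (which plays the role of optional_else_part) are assumptions made for illustration.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols shared by the start of every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)


def left_factor(head, alternatives):
    """Left-factor the alternatives of one non-terminal (one level only)."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[:1], []).append(alt)  # group by first symbol

    result = {head: []}
    fresh = head
    for first, group in groups.items():
        if len(group) == 1 or first == ():
            result[head].extend(group)              # nothing to factor
            continue
        prefix = common_prefix(group)
        fresh += "'"                                # hypothetical new non-terminal name
        result[head].append(prefix + (fresh,))
        result[fresh] = [alt[len(prefix):] for alt in group]
    return result


grammar = left_factor("stmt", [
    ("if", "expr", "then", "stmt", "else", "stmt"),
    ("if", "expr", "then", "stmt"),
    ("other",),
])
for nonterminal, alts in grammar.items():
    print(nonterminal, "→", " | ".join(" ".join(a) or "ε" for a in alts))
```

The new non-terminal is not factored again, so a grammar with nested common prefixes would need the function applied repeatedly.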
3.2.8 Non-Context-Free Language Constructs
L1 = {wcw | w ∈ (a | b)*}: checking that identifiers are declared before they are used
L2 = {a^n b^m c^n d^m | n ≥ 0, m ≥ 0}: checking that the number of formal parameters agrees with the number of actual parameters
L3 = {a^n b^n c^n | n ≥ 0}: requirements for typesetting
Similar languages that are context-free:
L1' = {w c w^R | w ∈ (a | b)*}: S → aSa | bSb | c
L2' = {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}: S → aSd | aAd, A → bAc | bc
L2'' = {a^n b^n c^m d^m | n ≥ 1, m ≥ 1}: S → AB, A → aAb | ab, B → cBd | cd
L3' = {a^n b^n | n ≥ 1}: S → aSb | ab
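L3' = {a^n b^n | n ≥ 1}: S → aSb | ab
L3' cannot be described by a regular expression.
Suppose there exists a DFA D with k states that accepts L3'.
Suppose D reaches states s0, s1, …, sk after reading ε, a, aa, …, a^k, respectively.
D has only k states, so for an input with more than k a's some state must be entered twice: si = sj for some i < j.
Since a^i b^i is accepted, there is a path labeled b^i from si to an accepting state f. But then a^j b^i, which is not in L3', is accepted as well: a contradiction.
[Figure: path labeled a^i from s0 to si; path labeled a^(j-i) from si back to si; path labeled b^i from si to f]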
3.2.9 Formal Language
Grammar G = (VT, VN, S, P)
Type 0 (unrestricted): α → β, where α, β ∈ (VN ∪ VT)*, |α| ≥ 1
Type 1 (context-sensitive): |α| ≤ |β|, except for S → ε
Type 2 (context-free): A → β, where A ∈ VN, β ∈ (VN ∪ VT)*
Type 3 (regular): A → aB or A → a, where A, B ∈ VN, a ∈ VT
Example: L3 = {a^n b^n c^n | n ≥ 1}
S → aSBC
S → aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc
Derivation of a^n b^n c^n:
S ⇒* a^(n-1) S (BC)^(n-1)
S ⇒+ a^n (BC)^n
S ⇒+ a^n B^n C^n
S ⇒+ a^n b B^(n-1) C^n
S ⇒+ a^n b^n C^n
S ⇒+ a^n b^n c C^(n-1)
S ⇒+ a^n b^n c^n
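The derivation can be replayed mechanically. The following Python sketch applies the productions of the example in the same order as the derivation steps and records every sentential form; the function name `derive` and the use of plain string rewriting are assumptions made for illustration, not part of the original material.

```python
def derive(n):
    """Derive a^n b^n c^n (n >= 1) and return all sentential forms in order."""
    form = "S"
    steps = [form]

    def apply(lhs, rhs):
        nonlocal form
        form = form.replace(lhs, rhs, 1)  # rewrite the leftmost occurrence
        steps.append(form)

    for _ in range(n - 1):
        apply("S", "aSBC")                # S ⇒* a^(n-1) S (BC)^(n-1)
    apply("S", "aBC")                     # S ⇒+ a^n (BC)^n
    while "CB" in form:
        apply("CB", "BC")                 # sort into a^n B^n C^n
    for lhs, rhs in [("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]:
        while lhs in form:
            apply(lhs, rhs)               # turn B's and C's into terminals
    return steps


print(" ⇒ ".join(derive(2)))
```

Each production has a right side at least as long as its left side, which is the Type 1 condition stated in the classification.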
Review
Regular expressions
Context-free grammar: definition; derivation (leftmost, rightmost); parse tree; ambiguity; elimination of ambiguity
Restrictions: left factoring (A → αβ1 | αβ2); eliminating left recursion (A ⇒+ Aα)
Grammar classes: Type 0, Type 1, Type 2, Type 3
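3.3 Top-Down Parsing
3.3.1 Common Approaches
Example: S → aCb, C → cd | c
Given input w = acb:
1. Expand S → aCb and match a.
2. Try C → cd: c matches, but d does not match b, so backtrack.
3. Try C → c: c matches, then b matches, and the parse succeeds.
Not suitable for grammars with left recursion or common left factors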
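The backtracking behaviour of this example can be sketched in Python. The dictionary representation of the grammar and the names `parse` and `accepts` are assumptions for illustration; the generator simply tries each alternative in order and falls back to the next one when a later symbol fails to match.

```python
GRAMMAR = {
    "S": [["a", "C", "b"]],
    "C": [["c", "d"], ["c"]],
}


def parse(symbols, text, pos=0):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        for alternative in GRAMMAR[head]:          # try the alternatives in order
            for mid in parse(alternative, text, pos):
                yield from parse(rest, text, mid)  # a failure here causes backtracking
    elif pos < len(text) and text[pos] == head:
        yield from parse(rest, text, pos + 1)      # terminal matched


def accepts(text):
    return any(end == len(text) for end in parse(["S"], text))


print(accepts("acb"), accepts("acdb"), accepts("ab"))
```

On a left-recursive grammar this sketch would recurse without consuming input, which is one reason the approach is unsuitable for such grammars.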
3.3.2 LL(1) Grammar
How can the grammar be restricted to avoid backtracking? Define two functions, FIRST and FOLLOW.
FIRST(α) = {a | α ⇒* a…, a ∈ VT}
1. In particular, if α ⇒* ε, then ε ∈ FIRST(α).
2. If A → B…, then FIRST(B) is added to FIRST(A).
For any two alternatives αi and αj of the same non-terminal: FIRST(αi) ∩ FIRST(αj) = ∅, provided that ε is in neither FIRST(αi) nor FIRST(αj).
FOLLOW(A) = {a | S ⇒* …Aa…, a ∈ VT}
1. If A can be the rightmost symbol of a sentential form, place $ in FOLLOW(A).
2. If there is a production A → αB, or a production A → αBβ in which β ⇒* ε, then everything in FOLLOW(A) is in FOLLOW(B).
LL(1) Grammar
For any production A → α | β, the following must hold:
FIRST(α) ∩ FIRST(β) = ∅
If β ⇒* ε, then FIRST(α) ∩ FOLLOW(A) = ∅
Characteristics of LL(1) grammars: no common left factors; not ambiguous; no left recursion
Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
FIRST(E') = { +, ε }
FIRST(T') = { *, ε }
FOLLOW(E) = FOLLOW(E') = { ), $ }
FOLLOW(T) = FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
Computing FIRST and FOLLOW
FIRST sets:
1. If X → a…, place the terminal a in FIRST(X).
2. If X → ε, place ε in FIRST(X).
3. If X → Y…, where Y is a non-terminal, add everything in FIRST(Y)\{ε} to FIRST(X).
4. If X → Y1Y2…Yk, where Y1, Y2, …, Yi-1 are non-terminals whose FIRST sets all contain ε, place FIRST(Yj)\{ε} in FIRST(X) for j = 1, 2, …, i. In particular, if every one of Y1, …, Yk derives ε, place ε in FIRST(X).
Example:
S → BA
A → BS | d
B → aA | bS | c
FIRST(B) = {a, b, c}
FIRST(A) = {a, b, c, d}
FIRST(S) = {a, b, c}
FOLLOW sets:
1. For the start symbol S, place $ in FOLLOW(S).
2. If there is a production A → αBβ, place FIRST(β)\{ε} in FOLLOW(B). Note that α may be empty.
3. If there is a production A → αB, or a production A → αBβ with β ⇒* ε (that is, ε is in FIRST(β)), place everything in FOLLOW(A) in FOLLOW(B). Again, α may be empty.
Example:
S → BA
A → BS | d
B → aA | bS | c
FIRST(B) = {a, b, c}
FIRST(A) = {a, b, c, d}
FIRST(S) = {a, b, c}
FOLLOW(S) = ?
FOLLOW(A) = ?
FOLLOW(B) = ?
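The rules for FIRST and FOLLOW amount to a fixed-point iteration: keep applying them until no set grows. The Python sketch below does this for the example grammar S → BA, A → BS | d, B → aA | bS | c. The function names and the dictionary representation are assumptions for illustration; an ε-production would be written as an empty list.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "S": [["B", "A"]],
    "A": [["B", "S"], ["d"]],
    "B": [["a", "A"], ["b", "S"], ["c"]],
}


def first_of(symbols, first):
    """FIRST of a string of symbols, given the FIRST sets of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                         # every symbol can vanish (or the string is empty)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                          # repeat until no FIRST set grows
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of(alt, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                  # rule 1
    changed = True
    while changed:                          # repeat until no FOLLOW set grows
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue            # FOLLOW is defined for non-terminals only
                    beta_first = first_of(alt[i + 1:], first)
                    new = beta_first - {EPS}        # rule 2
                    if EPS in beta_first:
                        new |= follow[head]         # rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "S")
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```

Running the sketch is one way to check hand-computed answers for the three FOLLOW sets.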
3.3.3 Recursive Descent Parsing
A set of procedures, one for each non-terminal
The procedures may be recursive
Example:
type → simple | ↑id | array [simple] of type
simple → integer | char | num dotdot num
An auxiliary procedure:
    procedure match(t : token);
    begin
        if lookahead = t then
            lookahead := nexttoken()
        else
            error()
    end;
type → simple | ↑id | array [simple] of type
    procedure type;
    begin
        if lookahead in {integer, char, num} then
            simple()
        else if lookahead = '↑' then begin
            match('↑'); match(id)
        end
        else if lookahead = array then begin
            match(array); match('['); simple();
            match(']'); match(of); type()
        end
        else
            error()
    end;
simple → integer | char | num dotdot num
    procedure simple;
    begin
        if lookahead = integer then
            match(integer)
        else if lookahead = char then
            match(char)
        else if lookahead = num then begin
            match(num); match(dotdot); match(num)
        end
        else
            error()
    end;
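For readers who want to run the procedures, here is a rough Python transcription. It is a sketch under assumptions: tokens are plain strings, the pointer symbol is written as the token "↑", a "$" end marker is appended to the input, and the names `Parser`, `ParseError` and `parse_type` are invented for this example.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input (assumed)
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        if self.lookahead != t:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1                        # advance to the next token

    def type_(self):                         # type → simple | ↑id | array [simple] of type
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "↑":
            self.match("↑")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in type")

    def simple(self):                        # simple → integer | char | num dotdot num
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in simple")


def parse_type(tokens):
    """Parse a complete token list as a type; raise ParseError on failure."""
    parser = Parser(tokens)
    parser.type_()
    parser.match("$")


parse_type("array [ num dotdot num ] of integer".split())
```

Each method looks only at the current lookahead token to choose an alternative, so no backtracking is needed.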
3.3.4 Non-recursive Predictive Parsing
[Figure: model of a non-recursive predictive parser. The predictive parsing program reads the input (a + b $), manipulates a stack (X Y Z), consults the parsing table M, and produces the output.]
Parsing table M (partial)
Non-terminal | id       | +          | *          | . . .
E            | E → TE'  |            |            |
E'           |          | E' → +TE'  |            |
T            | T → FT'  |            |            |
T'           |          | T' → ε     | T' → *FT'  |
F            | F → id   |            |            |
Moves made by the parser to accept id * id + id
Stack      | Input           | Output
$E         | id * id + id$   |
$E'T       | id * id + id$   | E → TE'
$E'T'F     | id * id + id$   | T → FT'
$E'T'id    | id * id + id$   | F → id
$E'T'      | * id + id$      |
$E'T'F*    | * id + id$      | T' → *FT'
$E'T'F     | id + id$        |
$E'T'id    | id + id$        | F → id
LL(1) Parsing
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Input: id * id
Stack      | Input      | Output
$E         | id * id $  |
$E'T       | id * id $  | E → TE'
$E'T'F     | id * id $  | T → FT'
$E'T'id    | id * id $  | F → id
$E'T'      | * id $     |
$E'T'F*    | * id $     | T' → *FT'
$E'T'F     | id $       |
$E'T'id    | id $       | F → id
$E'T'      | $          |
$E'        | $          | T' → ε
$          | $          | E' → ε
The parse tree is built depth first, in the order of the output: E → TE', T → FT', F → id, T' → *FT', F → id, T' → ε, E' → ε.
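The stack moves traced for id * id can be reproduced with a small table-driven driver. The Python sketch below is illustrative only: the table is written out by hand for the expression grammar, including the entries for (, ) and $ that follow from the usual table construction, and the names `TABLE` and `predictive_parse` are assumptions.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# M[non-terminal, lookahead] -> production body (an empty list stands for ε)
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}


def predictive_parse(tokens):
    """Return the productions output while parsing, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]                      # top of the stack is the end of the list
    output, pos = [], 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            output.append(f"{top} → {' '.join(body) or 'ε'}")
            stack.extend(reversed(body))    # push the body, leftmost symbol on top
        elif top == look:
            pos += 1                        # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return output


for production in predictive_parse(["id", "*", "id"]):
    print(production)
```

The list of productions printed is a leftmost derivation of the input, which is why the parse tree grows depth first from the left.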
3.3.5 Constructing the Predictive Parsing Table
(1) For each production A → α, execute steps (2) and (3).
(2) For each terminal a in FIRST(α), add A → α to M[A, a].
(3) If ε is in FIRST(α), then for each terminal b (including $) in FOLLOW(A), add A → α to M[A, b].
(4) Label every undefined entry of M as error.
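Steps (1) to (4) translate almost line by line into code. The Python sketch below applies them to the expression grammar, taking the FIRST and FOLLOW sets from the earlier example as given data instead of recomputing them. The names `PRODUCTIONS`, `first_of` and `build_table` are assumptions for illustration, and a missing dictionary key plays the role of an error entry.

```python
EPS = "ε"

PRODUCTIONS = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]

# FIRST and FOLLOW sets as listed in the example for this grammar.
FIRST = {"E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
         "E'": {"+", EPS}, "T'": {"*", EPS}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}


def first_of(body):
    """FIRST of a production body; an empty body yields {ε}."""
    result = set()
    for sym in body:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table():
    table = {}
    for head, body in PRODUCTIONS:          # step (1)
        body_first = first_of(body)
        targets = body_first - {EPS}        # step (2)
        if EPS in body_first:
            targets |= FOLLOW[head]         # step (3)
        for terminal in targets:
            table.setdefault((head, terminal), []).append(body)
    return table                            # step (4): absent keys are error entries


M = build_table()
for (head, terminal), bodies in sorted(M.items()):
    print(f"M[{head}, {terminal}] =", [" ".join(b) or EPS for b in bodies])
```

Each cell holds a list so that a grammar that is not LL(1) would show up as a cell with more than one production.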
Multiply defined entry
Non-terminal | other         | b         | else                            | . . .
stmt         | stmt → other  |           |                                 |
e_part       |               |           | e_part → else stmt, e_part → ε  |
expr         |               | expr → b  |                                 |
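Multiply defined entry (resolved)
Non-terminal | other         | b         | else                | . . .
stmt         | stmt → other  |           |                     |
e_part       |               |           | e_part → else stmt  |
expr         |               | expr → b  |                     |
Keeping only e_part → else stmt in M[e_part, else] associates each else with the closest preceding then.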
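The multiply defined entry can be found mechanically by running the table construction and reporting every cell that receives more than one production. The Python sketch below does this for the left-factored dangling-else grammar stmt → if expr then stmt e_part | other, e_part → else stmt | ε, expr → b. The FIRST set of each body and the FOLLOW sets are worked out by hand and supplied as data, and the names used are assumptions for illustration.

```python
EPS = "ε"

# (head, body, FIRST(body)) for each production; FIRST sets derived by hand.
PRODUCTIONS = [
    ("stmt", "if expr then stmt e_part", {"if"}),
    ("stmt", "other", {"other"}),
    ("e_part", "else stmt", {"else"}),
    ("e_part", EPS, {EPS}),
    ("expr", "b", {"b"}),
]

# FOLLOW sets derived by hand for this grammar.
FOLLOW = {"stmt": {"else", "$"}, "e_part": {"else", "$"}, "expr": {"then"}}


def conflicts():
    """Return the table cells that hold more than one production."""
    table = {}
    for head, body, body_first in PRODUCTIONS:
        targets = body_first - {EPS}
        if EPS in body_first:
            targets = targets | FOLLOW[head]   # ε-alternative goes under FOLLOW(head)
        for terminal in targets:
            table.setdefault((head, terminal), []).append(body)
    return {cell: bodies for cell, bodies in table.items() if len(bodies) > 1}


print(conflicts())
```

The only conflicting cell is M[e_part, else], because else is both in FIRST(else stmt) and in FOLLOW(e_part).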
Exercises: 3.4(b)(c), 3.6(a)(b), 3.8