6/4/2016 IT 327

The most practical parsers: predictive parsers
1. Input (token string)
2. Stack and parsing table
3. Output (syntax tree, intermediate code)
No backtracking.

Two kinds of predictive parsers:
Bottom-up: the syntax tree is built up from the leaves.
  Example: LR(1) parser
    L: left-to-right scanning
    R: rightmost derivation (in reverse)
    1: one symbol of look-ahead
Top-down: the syntax tree is built up from the root.
  Example: LL(1) parser
    L: left-to-right scanning
    L: leftmost derivation
    1: one symbol of look-ahead

LL(1) Grammar
1. S → A S b
2. S → C
3. A → a
4. C → c C
5. C → ε

LL(1) Parsing Table ($ is the end-of-file symbol)
      a   b   c   $
S     1   2   2   2
A     3
C         5   4   5

A leftmost derivation of aaaccbbb:
S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb

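The derivation above can be reproduced mechanically from the parsing table. The following is a minimal sketch, not the course's implementation: the names RULES, TABLE and leftmost_derivation are illustrative, and the empty string stands for ε. It keeps the current sentential form as a string and expands its leftmost nonterminal using the table entry for the current look-ahead token.

```python
# Grammar from the slide; "" stands for the empty string (epsilon).
RULES = {1: ("S", "ASb"), 2: ("S", "C"), 3: ("A", "a"), 4: ("C", "cC"), 5: ("C", "")}

# LL(1) parsing table: (nonterminal, look-ahead token) -> rule number.
TABLE = {
    ("S", "a"): 1, ("S", "b"): 2, ("S", "c"): 2, ("S", "$"): 2,
    ("A", "a"): 3,
    ("C", "b"): 5, ("C", "c"): 4, ("C", "$"): 5,
}


def leftmost_derivation(text):
    """Return the list of sentential forms, or raise SyntaxError."""
    tokens = text + "$"      # "$" marks the end of the input
    form = "S"               # current sentential form
    pos = 0                  # form[:pos] has already been matched against the input
    steps = [form]
    while pos < len(form):
        symbol, look = form[pos], tokens[pos]
        if symbol in "SAC":                       # nonterminal: expand it via the table
            rule = TABLE.get((symbol, look))
            if rule is None:
                raise SyntaxError(f"no table entry for ({symbol}, {look})")
            form = form[:pos] + RULES[rule][1] + form[pos + 1:]
            steps.append(form)
        elif symbol == look:                      # terminal: match it and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {symbol!r}, found {look!r}")
    if tokens[pos] != "$":
        raise SyntaxError("unexpected input after the end of the sentence")
    return steps


print(" => ".join(leftmost_derivation("aaaccbbb")))
```

Because only one table entry exists per (nonterminal, token) pair, the loop never has to undo a choice.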
Recursive-descent Parser
Grammar: 1. S → A S b   2. S → C   3. A → a   4. C → c C   5. C → ε

LL(1) Parsing Table
      a   b   c   $
S     1   2   2   2
A     3
C         5   4   5

S(): switch (token) {
    case a: A(); S(); get(b); build S → A S b; break;
    case b: C(); build S → C; break;
    case c: C(); build S → C; break;
    case $: C(); build S → C; break;
}
The cases cover all possible terminals and the end-of-file symbol.

Recursive-descent Parser
Grammar: 1. S → A S b   2. S → C   3. A → a   4. C → c C   5. C → ε

LL(1) Parsing Table
      a   b   c   $
S     1   2   2   2
A     3
C         5   4   5

A(): switch (token) {
    case a: get(a); build A → a; break;
    case b: error; break;
    case c: error; break;
    case $: error; break;
}

Recursive-descent Parser
Grammar: 1. S → A S b   2. S → C   3. A → a   4. C → c C   5. C → ε

LL(1) Parsing Table
      a   b   c   $
S     1   2   2   2
A     3
C         5   4   5

C(): switch (token) {
    case a: error; break;
    case b: build C → ε; break;
    case c: get(c); C(); build C → c C; break;
    case $: build C → ε; break;
}

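The three procedures S(), A() and C() translate almost line for line into a runnable recognizer. The sketch below makes some assumptions: the class name Parser and the helper parse are illustrative, and instead of building tree nodes it records the number of each rule at the moment the rule is chosen, which yields the rules in leftmost-derivation order.

```python
class Parser:
    def __init__(self, text):
        self.tokens = text + "$"   # "$" marks the end of the input
        self.pos = 0
        self.rules = []            # rule numbers, in the order they are chosen

    @property
    def token(self):
        return self.tokens[self.pos]

    def get(self, expected):
        """Consume one token, which must equal `expected`."""
        if self.token != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def S(self):
        if self.token == "a":              # table entry (S, a) = 1
            self.rules.append(1)
            self.A()
            self.S()
            self.get("b")
        elif self.token in "bc$":          # table entries (S, b), (S, c), (S, $) = 2
            self.rules.append(2)
            self.C()
        else:
            raise SyntaxError(f"unexpected token {self.token!r}")

    def A(self):
        if self.token == "a":              # table entry (A, a) = 3
            self.rules.append(3)
            self.get("a")
        else:
            raise SyntaxError(f"unexpected token {self.token!r}")

    def C(self):
        if self.token == "c":              # table entry (C, c) = 4
            self.rules.append(4)
            self.get("c")
            self.C()
        elif self.token in "b$":           # table entries (C, b), (C, $) = 5
            self.rules.append(5)
        else:
            raise SyntaxError(f"unexpected token {self.token!r}")


def parse(text):
    parser = Parser(text)
    parser.S()
    parser.get("$")                        # the whole input must be consumed
    return parser.rules


print(parse("aaaccbbb"))
```

Each procedure inspects only the current token, so no call ever has to be retried.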
LL(1) Parsing
Grammar: 1. S → A S b   2. S → C   3. A → a   4. C → c C   5. C → ε

LL(1) Parsing Table
      a   b   c   $
S     1   2   2   2
A     3
C         5   4   5

Input: aaaccbbb

Pending calls, step by step:
S();
A(); S(); get(b);
get(a); S(); get(b);
S(); get(b);
A(); S(); get(b); get(b);
get(a); S(); get(b); get(b);
S(); get(b); get(b);
A(); S(); get(b); get(b); get(b);
get(a); S(); get(b); get(b); get(b);
S(); get(b); get(b); get(b);
C(); get(b); get(b); get(b);
get(c); C(); get(b); get(b); get(b);
C(); get(b); get(b); get(b);
get(c); C(); get(b); get(b); get(b);
C(); get(b); get(b); get(b);
get(b); get(b); get(b);
get(b); get(b);
get(b);

Corresponding leftmost derivation:
S ⇒ ASb ⇒ aSb ⇒ aASbb ⇒ aaSbb ⇒ aaASbbb ⇒ aaaSbbb ⇒ aaaCbbb ⇒ aaacCbbb ⇒ aaaccCbbb ⇒ aaaccbbb

LL(1) Grammar
A grammar that has an LL(1) parsing table, i.e., a parsing table with no conflicts (at most one rule per entry).
LL(1) grammars allow ε-productions.

Grammar: 1. S → A S b   2. S → C   3. A → a   4. C → c C   5. C → ε

LL(1) Parsing Table
      a   b   c   $
S     1   2   2   2
A     3
C         5   4   5

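The slides give the parsing table without showing how it is obtained. One standard way to build it is from FIRST and FOLLOW sets; the sketch below uses that textbook construction, which is an assumption here rather than something stated on the slide. All names (first_of, build_table, is_ll1) are illustrative, symbols are single characters, and the empty string stands for ε. A grammar is LL(1) exactly when no table entry receives more than one rule.

```python
EPS = ""  # the empty string stands for epsilon


def first_of(seq, first):
    """FIRST set of a symbol string, given the FIRST sets of the nonterminals."""
    out = set()
    for sym in seq:
        if sym not in first:              # terminal: it starts every derived string
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                          # every symbol can vanish (or seq is empty)
    return out


def build_table(grammar, start):
    """grammar: list of (rule number, left side, right side string)."""
    nonterminals = {lhs for _, lhs, _ in grammar}
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}
    follow[start].add("$")
    changed = True
    while changed:                        # iterate until nothing grows any more
        changed = False
        for _, lhs, rhs in grammar:
            updates = [(first[lhs], first_of(rhs, first))]
            for i, sym in enumerate(rhs):
                if sym in nonterminals:
                    rest = first_of(rhs[i + 1:], first)
                    extra = follow[lhs] if EPS in rest else set()
                    updates.append((follow[sym], (rest - {EPS}) | extra))
            for target, new in updates:
                if not new <= target:
                    target |= new         # in-place update of the FIRST/FOLLOW set
                    changed = True
    table = {}
    for number, lhs, rhs in grammar:
        f = first_of(rhs, first)
        lookaheads = (f - {EPS}) | (follow[lhs] if EPS in f else set())
        for token in lookaheads:
            table.setdefault((lhs, token), []).append(number)
    return table


def is_ll1(table):
    return all(len(rules) == 1 for rules in table.values())


GRAMMAR = [(1, "S", "ASb"), (2, "S", "C"), (3, "A", "a"), (4, "C", "cC"), (5, "C", EPS)]
table = build_table(GRAMMAR, "S")
print(sorted(table.items()), is_ll1(table))
```

For the grammar on this slide the computed entries agree with the table shown, and every entry holds a single rule.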
Not every CFG is an LL(1) grammar
<stmt> ::= <if-stmt> | s1 | s2
<if-stmt> ::= if <expr> then <stmt> else <stmt> | if <expr> then <stmt>
<expr> ::= e1 | e2

The statement "if e1 then if e2 then s1 else s2" can be read in two ways:

if (a > 2)
    if (b > 1) b++;
    else a++;

if (a > 2)
    if (b > 1) b++;
else a++;

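The recursive-descent parser does not work for every CFG
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

Input: id+id*id

E(): switch (token) {
    case id: E(); ...
}
Left recursion: on id, E() calls E() again without consuming a token, so the recursion never terminates.
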
Left recursion
A left-recursive grammar:
1. A → Aα
2. A → β
A ⇒ Aα ⇒ Aαα ⇒ Aααα ⇒ βααα

After removing the left recursion:
1. A → βA'
2. A' → αA'
3. A' → ε
A ⇒ βA' ⇒ βαA' ⇒ βααA' ⇒ βαααA' ⇒ βααα

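Eliminating left recursion
Left-recursive grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

Equivalent grammar without left recursion:
1. E → T E'
2. E' → + T E'
3. E' → ε
4. T → F T'
5. T' → * F T'
6. T' → ε
7. F → ( E )
8. F → id
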
An algorithm for eliminating immediate left recursion
Given a CFG G, let A be one of its nonterminal symbols such that G has a production A → Aα.
1. Add a new nonterminal symbol A' to G;
2. For each production A → β such that A is not the first symbol of β, replace it by A → βA';
3. For each production A → Aα, replace it by A' → αA';
4. Add A' → ε to G.

Before:  A → Aα    A → β
After:   1. A → βA'   2. A' → αA'   3. A' → ε

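The four steps can be sketched in a few lines. The representation is an assumption made for illustration: a grammar is a dict from each nonterminal to a list of right sides, each right side is a tuple of symbols, and the empty tuple stands for ε. The function name eliminate_immediate is hypothetical, and the sketch assumes that the name A' is not already in use and that every α is non-empty.

```python
def eliminate_immediate(grammar, a):
    """Return a copy of `grammar` in which nonterminal `a` is no longer
    immediately left-recursive."""
    alphas = [rhs[1:] for rhs in grammar[a] if rhs[:1] == (a,)]   # from A -> A alpha
    betas = [rhs for rhs in grammar[a] if rhs[:1] != (a,)]        # from A -> beta
    if not alphas:
        return dict(grammar)              # nothing to do
    new = a + "'"                         # step 1: the new nonterminal A'
    result = dict(grammar)
    result[a] = [beta + (new,) for beta in betas]                 # step 2: A  -> beta A'
    result[new] = [alpha + (new,) for alpha in alphas]            # step 3: A' -> alpha A'
    result[new].append(())                                        # step 4: A' -> epsilon
    return result


expr = {"E": [("E", "+", "T"), ("T",)]}
print(eliminate_immediate(expr, "E"))
```

Applied to E → E + T | T, this yields E → T E' and E' → + T E' | ε.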
Indirect left recursion
1. S → A a
2. S → b
3. A → S d
4. A → e

S ⇒ Aa ⇒ Sda ⇒ Aada ⇒ Sdada ⇒ bdada

Indirect left recursion
Start:
1. S → A a
2. S → b
3. A → A c
4. A → S d
5. A → e

Find all immediate left recursions (A → Aα | β becomes A → βA', A' → αA' | ε):
1. S → A a
2. S → b
3. A → SdA'
4. A → eA'
5. A' → cA'
6. A' → ε

If any remains, remove the last nonterminal symbol Z that has a rule Z → X… by substituting its rules where Z is used (here Z = A):
1. S → SdA'a
2. S → eA'a
3. S → b
4. A' → cA'
5. A' → ε

Find all immediate left recursions again:
1. S → eA'aS'
2. S → bS'
3. S' → dA'aS'
4. S' → ε
5. A' → cA'
6. A' → ε

Repeat.

An algorithm for eliminating left recursion
Given a CFG G, let A_1, A_2, ..., A_n be its nonterminal symbols.

for i := n down to 1 do {
    for j := n down to i+1 do {        // find one level of indirection
        for each production A_i → A_j ω do {
            for each production A_j → δ, add A_i → δ ω to the grammar;
            remove A_i → A_j ω;
        }
    } // end for j
    eliminate the immediate left recursion caused by A_i
} // end for i

The inner loop ranges over the nonterminals that have already been processed (j > i), so that after the substitution no rule of A_i begins with any of them.

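A runnable sketch of this procedure follows, with the assumptions stated up front. The nonterminals are processed from last to first, and only nonterminals that have already been processed (index j > i) are substituted. The grammar representation (dict of lists of symbol tuples, with the empty tuple for ε) and the function name eliminate_left_recursion are illustrative. The input grammar is assumed to have no ε-productions and no cycles.

```python
def eliminate_left_recursion(grammar, order):
    """grammar: dict nonterminal -> list of right sides (tuples of symbols).
    order: the nonterminals A_1 .. A_n as a list."""
    g = {a: list(rhss) for a, rhss in grammar.items()}
    n = len(order)
    for i in reversed(range(n)):                      # i := n down to 1
        ai = order[i]
        for j in reversed(range(i + 1, n)):           # j := n down to i+1
            aj = order[j]
            expanded = []
            for rhs in g[ai]:
                if rhs[:1] == (aj,):                  # A_i -> A_j omega
                    expanded += [delta + rhs[1:] for delta in g[aj]]
                else:
                    expanded.append(rhs)
            g[ai] = expanded
        # Eliminate the immediate left recursion caused by A_i.
        alphas = [rhs[1:] for rhs in g[ai] if rhs[:1] == (ai,)]
        if alphas:
            betas = [rhs for rhs in g[ai] if rhs[:1] != (ai,)]
            new = ai + "'"
            g[ai] = [beta + (new,) for beta in betas]
            g[new] = [alpha + (new,) for alpha in alphas] + [()]
    return g


example = {
    "S": [("A", "a"), ("b",)],
    "A": [("A", "c"), ("S", "d"), ("e",)],
}
for lhs, rhss in eliminate_left_recursion(example, ["S", "A"]).items():
    print(lhs, "->", " | ".join(" ".join(rhs) or "ε" for rhs in rhss))
```

On the example S → Aa | b, A → Ac | Sd | e the result contains S → eA'aS' | bS', S' → dA'aS' | ε and A' → cA' | ε. The rules of A are kept as well, although S no longer refers to A.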
A grammar for if statements
1. S → iCtSE
2. S → a
3. E → eS
4. E → ε
5. C → b

      a   b   e     i   t   $
S     2             1
E             3,4           4
C         5

Is it an LL(1) grammar? Is there an LL(1) parsing table for it? No! The entry for (E, e) contains two rules.

A grammar for if statements
1. S → iCtSE
2. S → a
3. E → eS
4. E → ε
5. C → b

      a   b   e     i   t   $
S     2             1
E             3,4           4
C         5

Why is there a conflict?
Input: i b t i b t a e ...
S ⇒ ... ⇒ ibtSE ⇒ ... ⇒ ibt ibtSE E ⇒ ibt ibtaE E
With the look-ahead e, both rules apply to the inner E:
Rule 4: ibt ibta E ⇒ ibt ibta eS ...     (the e goes with the outer i)
Rule 3: ibt ibta eS E ...                 (the e goes with the inner i)

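The conflict reflects a genuine ambiguity, which can be checked by counting derivations. The sketch below is an illustration only; the names RULES and count_parses are hypothetical, and the empty tuple stands for ε. It counts the distinct parse trees of a string under the if-statement grammar by trying every way to split the string among the symbols of a right side.

```python
from functools import lru_cache

RULES = {
    "S": [("i", "C", "t", "S", "E"), ("a",)],
    "E": [("e", "S"), ()],
    "C": [("b",)],
}


def count_parses(text):
    """Number of distinct parse trees of `text` with start symbol S."""

    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        # Number of ways the symbol sequence derives text[i:j].
        if not symbols:
            return 1 if i == j else 0
        head, rest = symbols[0], symbols[1:]
        if head not in RULES:                       # terminal: must match one character
            return seq(rest, i + 1, j) if text[i:i + 1] == head else 0
        total = 0
        for k in range(i, j + 1):                   # head derives text[i:k]
            ways = sum(seq(rhs, i, k) for rhs in RULES[head])
            if ways:
                total += ways * seq(rest, k, j)
        return total

    return seq(("S",), 0, len(text))


print(count_parses("ibtibtaea"))
```

The string ibtibtaea has two parse trees, one for each choice of rule for the inner E, while ibtaea has only one.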
A grammar for if statements
1. S → iCtSE
2. S → a
3. E → eS
4. E → ε
5. C → b

      a   b   e     i   t   $
S     2             1
E             3,4           4
C         5

Can we have an unambiguous equivalent grammar for this grammar? Yes!
In general, No! Some inherently ambiguous languages exist.
Can we write a program to get an unambiguous equivalent grammar from any grammar of a language that is known to be not inherently ambiguous? No!
Can we write a program to test whether a given grammar is ambiguous? No!

Is there an LL(2) grammar? Yes!
1. S → A B
2. A → a A
3. A → a
4. B → b B
5. B → c

Language: { a^m b^n c | m ≥ 1 and n ≥ 0 }, e.g. a a a a a b b b b c

LL(2) Parsing Table
      a                         b   c
S     1
A     aa: 2   ab: 3   ac: 3
B                               4   5

We need to look two symbols ahead in order to determine which rule should be used.

LL(2) Parsing
1. S → A B
2. A → a A
3. A → a
4. B → b B
5. B → c

LL(2) Parsing Table
      a                         b   c
S     1
A     aa: 2   ab: 3   ac: 3
B                               4   5

Pending calls                Remaining input
S();                         a a a b c
A(); B();                    a a a b c
get(a); A(); B();            a a a b c
A(); B();                    a a b c
get(a); A(); B();            a a b c
A(); B();                    a b c
get(a); B();                 a b c
B();                         b c
get(b); B();                 b c
B();                         c
get(c);                      c

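The same recursive-descent style works with two symbols of look-ahead. In this sketch the class name LL2Parser and the helpers peek and parse are illustrative. The input is padded with two end markers so that two look-ahead symbols always exist, and only A() needs to inspect the second one.

```python
class LL2Parser:
    def __init__(self, text):
        self.tokens = text + "$$"      # padding: two look-ahead symbols always exist
        self.pos = 0
        self.rules = []                # rule numbers, in the order they are chosen

    def peek(self, k):
        """Return the k-th look-ahead symbol (k = 0 is the current token)."""
        return self.tokens[self.pos + k]

    def get(self, expected):
        if self.peek(0) != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek(0)!r}")
        self.pos += 1

    def S(self):
        if self.peek(0) != "a":
            raise SyntaxError("S must start with a")
        self.rules.append(1)           # S -> A B
        self.A()
        self.B()

    def A(self):
        if self.peek(0) != "a":
            raise SyntaxError("A must start with a")
        if self.peek(1) == "a":        # look-ahead aa: rule 2, A -> a A
            self.rules.append(2)
            self.get("a")
            self.A()
        elif self.peek(1) in "bc":     # look-ahead ab or ac: rule 3, A -> a
            self.rules.append(3)
            self.get("a")
        else:
            raise SyntaxError(f"unexpected {self.peek(1)!r} after a")

    def B(self):
        if self.peek(0) == "b":        # rule 4, B -> b B
            self.rules.append(4)
            self.get("b")
            self.B()
        elif self.peek(0) == "c":      # rule 5, B -> c
            self.rules.append(5)
            self.get("c")
        else:
            raise SyntaxError(f"unexpected token {self.peek(0)!r}")


def parse(text):
    parser = LL2Parser(text)
    parser.S()
    parser.get("$")                    # the whole input must be consumed
    return parser.rules


print(parse("aaabc"))
```

The rule sequence for aaabc follows the trace above: rule 1, then rule 2 twice, then rules 3, 4 and 5.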
Is there an LL(1) grammar equivalent to the following LL(2) grammar?
1. S → A B
2. A → a A
3. A → a
4. B → b B
5. B → c

Language: { a^m b^n c | m ≥ 1 and n ≥ 0 }, e.g. a a a a a b b b b c

Yes:
1. S → a A B
2. A → a A
3. A → ε
4. B → b B
5. B → c

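No left-recursive grammar is an LL(k) grammar
Left-recursive:
1. S → S A
2. S → a
3. A → b
S ⇒ SA ⇒ SAA ⇒ SAAA ⇒ SAAAA ⇒ ... ⇒ aAAAAA... ⇒ ... ⇒ abbbbbb

Equivalent grammar without left recursion:
1. S → a S'
2. S' → A S'
3. S' → ε
4. A → b

Left-recursive:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

Equivalent grammar without left recursion:
1. E → T E'
2. E' → + T E'
3. E' → ε
4. T → F T'
5. T' → * F T'
6. T' → ε
7. F → ( E )
8. F → id

But we can effectively find an equivalent grammar without left recursion. Are we happy with this?
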
Does every LL(2) grammar have an equivalent LL(1) grammar? No.

An LL(2) grammar with no equivalent LL(1) grammar:
1. S → a S A
2. S → ε
3. A → a b S
4. A → c

An LL(k) grammar, k ≥ 2, with no equivalent LL(k-1) grammar:
1. S → a S A
2. S → ε
3. A → a^(k-1) b S
4. A → c

LL(1) ⊂ LL(2) ⊂ LL(3) ⊂ ... ⊂ LL(k) ⊂ LL(k+1) ⊂ ...     Kurki-Suonio [1969]

There exists a DCFL that is not LL(k) for any k (Stearns [1970]):
{ a^n | n ≥ 0 } ∪ { a^n b^n | n ≥ 0 }

An LL(k) grammar, k ≥ 2 (Kurki-Suonio [1969]):
1. S → a S A
2. S → ε
3. A → a^(k-1) b S
4. A → c
This grammar is unambiguous, as every LL(k) grammar is.

Is there an unambiguous CFG that is not an LL(k) grammar? Yes.

LL(1) Parser Implementation
1. E → T E'
2. E' → + T E'
3. E' → ε
4. T → F T'
5. T' → * F T'
6. T' → ε
7. F → ( E )
8. F → n
p.s. Let n be any positive integer less than …

      n   +   *   (   )   $
E     1           1
E'        2           3   3
T     4           4
T'        6   5       6   6
F     8           7

Programming Assignment
Details will be announced later.
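As a starting point, a table-driven parser with an explicit stack might look like the sketch below. It is not the assignment's required design: the names tokenize and parse, the choice to return the list of applied rule numbers, and the tokenizer (which maps any run of decimal digits to the token n and does not check the size limit on n) are all assumptions made for illustration.

```python
import re

RULES = {
    1: ("E", ["T", "E'"]),
    2: ("E'", ["+", "T", "E'"]),
    3: ("E'", []),
    4: ("T", ["F", "T'"]),
    5: ("T'", ["*", "F", "T'"]),
    6: ("T'", []),
    7: ("F", ["(", "E", ")"]),
    8: ("F", ["n"]),
}
TABLE = {
    ("E", "n"): 1, ("E", "("): 1,
    ("E'", "+"): 2, ("E'", ")"): 3, ("E'", "$"): 3,
    ("T", "n"): 4, ("T", "("): 4,
    ("T'", "+"): 6, ("T'", "*"): 5, ("T'", ")"): 6, ("T'", "$"): 6,
    ("F", "n"): 8, ("F", "("): 7,
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def tokenize(text):
    """Turn the input into grammar tokens; every integer literal becomes 'n'."""
    tokens = []
    for number, other in re.findall(r"(\d+)|(\S)", text):
        if number:
            tokens.append("n")
        elif other in ("+", "*", "(", ")"):
            tokens.append(other)
        else:
            raise SyntaxError(f"unexpected character {other!r}")
    return tokens + ["$"]


def parse(text):
    """Return the rule numbers of the leftmost derivation, or raise SyntaxError."""
    tokens = tokenize(text)
    stack = ["$", "E"]                 # the top of the stack is the end of the list
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:        # expand using the parsing table
            rule = TABLE.get((top, look))
            if rule is None:
                raise SyntaxError(f"no table entry for ({top}, {look})")
            applied.append(rule)
            stack.extend(reversed(RULES[rule][1]))   # leftmost symbol ends up on top
        elif top == look:              # terminal (or $): match it and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return applied


print(parse("2+3*4"))
```

The right side of each rule is pushed in reverse so that its leftmost symbol is processed first, which is what makes the rule sequence a leftmost derivation.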