Presentation is loading. Please wait.

Presentation is loading. Please wait.

Parsing 4 Dr William Harrison Fall 2008

Similar presentations


Presentation on theme: "Parsing 4 Dr William Harrison Fall 2008"— Presentation transcript:

1 Parsing 4 Dr William Harrison Fall 2008 HarrisonWL@missouri.edu
CS4430 – Compilers I

2 Today Continuing the second phase “Predictive” parsing
“parsing” or grammatical analysis discovers the real structure of a program and represents it in a computationally useful way “Predictive” parsing also called “recursive descent” parsing Last time: basics of LL(1) parsing follow sets, table-driven predictive parsing, etc. Today: finish predictive parsing Reading: You should be reading Chapter 2 of “Modern Compiler Design”

3 Parsing concepts Language Grammar Parser
* all inter-related, but different notions

4 Review: Parse Trees from Derivations
S  if E then S else S S  begin S L S  print E L  end L ; S L E  num = num Parse Tree Associated with Derivation S begin S L print E end 1 = 1 S  begin S L  begin print E L  begin print 1=1 L  begin print 1=1 end

5 The Process of constructing Parsers
Language Design Construct CFG Recognize its Form Parser Language Parser start again if not in appropriate form (e.g., LL(1), LR(1),…)

6 Predictive Parsing S  if E then S else S S  begin S L S  print E
The first token on the RHS of each rule is unique. S  if E then S else S S  begin S L S  print E L  end L ; S L E  num = num a.k.a. “recursive descent” If the first symbol on the r.h.s. of the productions is unique, then this grammar is LL(1).

7 “Rolling your own” Predictive Parsers
Has one function for each non-terminal, and one clause for each production int tok = getToken(); void advance() { tok = getToken(); } void eat(int t) { if (tok = t) { advance() } else { error(); } } void S(){switch(tok) { case IF: eat(IF); E(); eat(THEN); S(); eat(ELSE); S(); break; case BEGIN: … void E() { eat(NUM); eat(EQ); eat(NUM); S  if E then S else S S  begin S L S  print E L  end L ; S L E  num = num

8 Road map for today Determining if a grammar is LL(1)
First sets “Follow” sets “Massaging” a grammar into LL Using the technique of “left factoring” …and eliminating “left recursion”

9 Review: Table-driven Predictive Parsing
S  E $ E  T E’ E’  + T E’ E’  - T E’ E’   T  F T’ T’  * F T’ T’  / F T’ T’   F  id F  num F  ( E ) front of token stream + id $ * S E E’ T T’ F 1 2 3 5 6 9 7 9 10 Actions are: predict(production) match, accept, and error top of the “parse stack” empty=error

10 Remaining Questions about Predictive Parsing
We were “given” that grammar and prediction table Can all grammars be given a similar table Alas, no – only LL(1) grammars How did we come up with that table? “First” and “Follow” sets Techniques for converting some grammars into LL(1) Eliminating left-recursion Left factoring We’ll discuss these now

11 FIRST(g) g is a sequence of symbols (terminal or non-terminal)
For example, the right hand side of a production FIRST returns the set of all possible terminal symbols that begin any string derivable from g. Consider two productions X  g1 and X  g2 If FIRST (g1 ) and FIRST (g2 ) have symbols in common, then the prediction mechanism will not know which way to choose.  The Grammar is not LL(1)! If FIRST (g1 ) and FIRST (g2 ) have no symbols in common, then perhaps LL(1) can be used. We need some more formalisms.

12 FOLLOW sets and nullable productions
FOLLOW(X) X is a non-terminal The set of terminals that can immediately follow X. nullable(X) True if, and only if, X can derive to the empty string λ. FIRST, FOLLOW & nullable can be used to construct predictive parsing tables for LL(1) parsers. Can build tables that support LL(2), LL(3), etc. X  A X B C  X d B  a b FOLLOW(X) = ? X  a B X  B B  d B  λ nullable(X) = ?

13 FOLLOW sets and nullable productions
FOLLOW(X) X is a non-terminal The set of terminals that can immediately follow X. nullable(X) True if, and only if, X can derive to the empty string λ. FIRST, FOLLOW & nullable can be used to construct predictive parsing tables for LL(1) parsers. Can build tables that support LL(2), LL(3), etc. X  A X B C  X d B  a b FOLLOW(X) = { d, a } X  a B X  B B  d B  λ nullable(X) = true

14 Constructing the parse table
For every non-terminal X and token t: t X X  g Enter the production (X  g) if t FIRST(g), If X is nullable, enter the production (X  g) if t FOLLOW(g)

15 Constructing the parse table
What if there are more than one production? t X X  g1,X  g

16 Constructing the parse table
What if there are more than one production? t X  g1,X  g X Then the grammar cannot be parsed with “predictive parsing”, and it is (by definition) not LL(1).

17 Shortcomings of LL Parsers
Recursive descent renders a readable parser. depends on the first terminal symbol of each sub-expression providing enough information to choose which production to use. But consider a predictive parser for this grammar E  E + T E  T void E(){switch(tok) { case ?: E(); eat(TIMES); T();  no way of choosing production case ?: T(); } void T(){eat(ID);}

18 Eliminating left recursion
Consider this grammar it’s “left-recursive” Can not use LL(1) – why? Consider this alternative different grammar This derives the same language Now there is no left recursion. There is a generalization for more complex grammars. E  E + T E  T E  T E’ E’  + T E’ E’  λ

19 Removing Common Prefixes
Consider It’s not LL(1) – why? We can transform this grammar, and make it LL(1). S  if E then S else S S  if E then S S  if E then S X X  λ X  else S * Remark: The resulting grammar is not as readable.

20 In Class Discussion (1) S  S ; S S  id := E E  id E  num E  E + E
How can we transform this grammar So that it accepts the same language, but Has no left recursion We may have to use left factoring

21 In Class Discussion (2) S  S1 ; S  was S  S ; S S  S1 S1  id := E E  E + E1  was E  E+E E  E1 E1  id E1  num Prefix elimination doesn’t work here Write a grammar that accepts the same language, but Is not ambiguous Has no left recursion

22 Class Discussion (3) S  S1 S2 S1  id := E S2  ; S1 S2 S2 
E  E1 E2 E1  id E1  num E2  + E1 E2 E2  Write a grammar that accepts the same language, but Has no left recursion Before: (# is a terminal) X  X # Y X  Y After: X  Y X2 X2  # Y X2 X2  This final grammar is LL(1).

23 Next time LR Parsing LR(k) grammars are more expressive than LL(k) grammars, …but it’s not as obvious how to parse with them.


Download ppt "Parsing 4 Dr William Harrison Fall 2008"

Similar presentations


Ads by Google