Computing Follow(A) : All Non-Terminals

Slides:



Advertisements
Similar presentations
Compiler Construction
Advertisements

Top-Down Parsing.
1 Predictive parsing Recall the main idea of top-down parsing: Start at the root, grow towards leaves Pick a production and try to match input May need.
1 Chapter 4: Top-Down Parsing. 2 Objectives of Top-Down Parsing an attempt to find a leftmost derivation for an input string. an attempt to construct.
Professor Yihjia Tsai Tamkang University
Top-Down Parsing.
– 1 – CSCE 531 Spring 2006 Lecture 7 Predictive Parsing Topics Review Top Down Parsing First Follow LL (1) Table construction Readings: 4.4 Homework: Program.
CPSC 388 – Compiler Design and Construction
COP4020 Programming Languages Computing LL(1) parsing table Prof. Xin Yuan.
Syntax and Semantics Structure of programming languages.
Top-Down Parsing - recursive descent - predictive parsing
Chapter 5 Top-Down Parsing.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
Lesson 5 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Syntax and Semantics Structure of programming languages.
Joey Paquet, 2000, Lecture 5 Error Recovery Techniques in Top-Down Predictive Syntactic Analysis.
4 4 (c) parsing. Parsing A grammar describes syntactically legal strings in a language A recogniser simply accepts or rejects strings A generator produces.
1 Problems with Top Down Parsing  Left Recursion in CFG May Cause Parser to Loop Forever.  Indeed:  In the production A  A  we write the program procedure.
Pembangunan Kompilator.  The parse tree is created top to bottom.  Top-down parser  Recursive-Descent Parsing ▪ Backtracking is needed (If a choice.
CH4.1 CSE244 Sections : Syntax Analysis Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Top-Down Parsing The parse tree is created top to bottom. Top-down parser –Recursive-Descent Parsing.
Parsing Top-Down.
LL(1) Parser. What does LL signify ? The first L means that the scanning takes place from Left to right. The first L means that the scanning takes place.
More Parsing CPSC 388 Ellen Walker Hiram College.
Top-down Parsing Recursive Descent & LL(1) Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
1 Context free grammars  Terminals  Nonterminals  Start symbol  productions E --> E + T E --> E – T E --> T T --> T * F T --> T / F T --> F F --> (F)
1 Nonrecursive Predictive Parsing  It is possible to build a nonrecursive predictive parser  This is done by maintaining an explicit stack.
Top-Down Parsing.
Top-Down Predictive Parsing We will look at two different ways to implement a non- backtracking top-down parser called a predictive parser. A predictive.
Parsing methods: –Top-down parsing –Bottom-up parsing –Universal.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
Chapter 2 (part) + Chapter 4: Syntax Analysis S. M. Farhad 1.
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Syntax and Semantics Structure of programming languages.
Error recovery in predictive parsing An error is detected during the predictive parsing when the terminal on top of the stack does not match the next input.
Problems with Top Down Parsing
Compiler Construction
Programming Languages Translator
Context free grammars Terminals Nonterminals Start symbol productions
Chapter 4 Syntax Analysis
Parsing IV Bottom-up Parsing
Compilers Welcome to a journey to CS419 Lecture15: Syntax Analysis:
Table-driven parsing Parsing performed by a finite state machine.
Introduction to Top Down Parser
Top-down parsing cannot be performed on left recursive grammars.
SYNTAX ANALYSIS (PARSING).
Top-Down Parsing.
4 (c) parsing.
Chapter 4: Syntax Analysis Part 2: Top-Down Parsing
Top-Down Parsing CS 671 January 29, 2008.
Lecture 7 Predictive Parsing
CS 540 George Mason University
Chapter 4: Syntax Analysis
Compiler Design 7. Top-Down Table-Driven Parsing
Top-Down Parsing Identify a leftmost derivation for an input string
Top-Down Parsing The parse tree is created top to bottom.
Chapter 4 Top Down Parser.
Ambiguity, Precedence, Associativity & Top-Down Parsing
Ambiguity in Grammar, Error Recovery
Compiler design Error recovery in top-down predictive parsing
Predictive Parsing Lecture 9 Wed, Feb 9, 2005.
Designing a Predictive Parser
Bottom-Up Parsing “Shift-Reduce” Parsing
Syntax Analysis - Parsing
Lecture 7 Predictive Parsing
Nonrecursive Predictive Parsing
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Predictive Parsing Program
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Computing Follow(A) : All Non-Terminals 1. Place $ in Follow(S), where S is the start symbol and $ signals end of input 2. If there is a production A B, then everything in First() is in Follow(B) except for . 3. If A B is a production, or A B and    (First() contains  ), then everything in Follow(A) is in Follow(B) (Whatever followed A must follow B, since nothing follows B from the production rule) * We’ll calculate Follow for two grammars.

The Algorithm for Follow – pseudocode 1. Initialize Follow(X) for all non-terminals X to empty set. Place $ in Follow(S), where S is the start NT. 2. Repeat the following step until no modifications are made to any Follow-set For any production X  X1 X2 … Xm For j=1 to m, if Xj is a non-terminal then: Follow(Xj)=Follow(Xj)(First(Xj+1,…,Xm)-{}); If First(Xj+1,…,Xm) contains  or Xj+1,…,Xm=  then Follow(Xj)=Follow(Xj) Follow(X);

Computing Follow : 1st Example Recall: S  i E t SS’ | a S’  eS |  E  b First(S) = { i, a } First(S’) = { e,  } First(E) = { b }

Computing Follow : 1st Example Recall: S  i E t SS’ | a S’  eS |  E  b First(S) = { i, a } First(S’) = { e,  } First(E) = { b } Follow(S) – Contains $, since S is start symbol Since S  i E t SS’ , put in First(S’) – not  Since S’  , Put in Follow(S) Since S’  eS, put in Follow(S’) So…. Follow(S) = { e, $ } Follow(S’) = Follow(S) HOW? Follow(E) = { t } *

Example 2 Compute Follow for: E  TE’ E’  + TE’ |  T  FT’ T’  * FT’ |  F  ( E ) | id

Example 2 Compute Follow for: Follow First E $ ) E ( id E’ $ ) E’  + E  TE’ E’  + TE’ |  T  FT’ T’  * FT’ |  F  ( E ) | id Follow E $ ) E’ $ ) T + $ ) T’ + $ ) F + * $ ) First E ( id E’  + T ( id T’  * F ( id

Constructing Parsing Table Algorithm: Table has one row per non-terminal / one column per terminal (incl. $ ) Repeat Steps 2 & 3 for each rule A Terminal a in First()? Add A to M[A, a ] 3.1  in First()? Add A  to M[A, b ] for all terminals b in Follow(A). 3.2  in First() and $ in Follow(A)? Add A  to M[A, $ ] 4. All undefined entries are errors.

Constructing Parsing Table – Example 1 S  i E t SS’ | a S’  eS |  E  b First(S) = { i, a } First(S’) = { e,  } First(E) = { b } Follow(S) = { e, $ } Follow(S’) = { e, $ } Follow(E) = { t }

Constructing Parsing Table – Example 1 S  i E t SS’ | a S’  eS |  E  b First(S) = { i, a } First(S’) = { e,  } First(E) = { b } Follow(S) = { e, $ } Follow(S’) = { e, $ } Follow(E) = { t } S  i E t SS’ S  a E  b First(i E t SS’)={i} First(a) = {a} First(b) = {b} S’  eS S   First(eS) = {e} First() = {} Follow(S’) = { e, $ } Non-terminal INPUT SYMBOL a b e i t $ S S a S iEtSS’ S’ S’  S’ eS S  E E b

Constructing Parsing Table – Example 2 E  TE’ E’  + TE’ |  T  FT’ T’  * FT’ |  F  ( E ) | id First(E,F,T) = { (, id } First(E’) = { +,  } First(T’) = { *,  } Follow(E,E’) = { ), $} Follow(F) = { *, +, ), $ } Follow(T,T’) = { +, ) , $}

Constructing Parsing Table – Example 2 E  TE’ E’  + TE’ |  T  FT’ T’  * FT’ |  F  ( E ) | id First(E,F,T) = { (, id } First(E’) = { +,  } First(T’) = { *,  } Follow(E,E’) = { ), $} Follow(F) = { *, +, ), $ } Follow(T,T’) = { +, ) , $} Expression Example: E  TE’ : First(TE’) = First(T) = { (, id } M[E, ( ] : E  TE’ M[E, id ] : E  TE’ (by rule 2) E’  +TE’ : First(+TE’) = + : M[E’, +] : E’  +TE’ (by rule 3) E’   :  in First( ) T’   :  in First( ) by rule 2 M[E’, )] : E’   (3.1) M[T’, +] : T’   (3.1) M[E’, $] : E’   (3.2) M[T’, )] : T’   (3.1) (Due to Follow(E’) M[T’, $] : T’   (3.2)

LL(1) Grammars L : Scan input from Left to Right L : Construct a Leftmost Derivation 1 : Use “1” input symbol as lookahead in conjunction with stack to decide on the parsing action LL(1) grammars == they have no multiply-defined entries in the parsing table. Properties of LL(1) grammars: Grammar can’t be ambiguous or left recursive Grammar is LL(1) when A  1.  &  do not derive strings starting with the same terminal a 2. Either  or  can derive , but not both. Note: It may not be possible for a grammar to be manipulated into an LL(1) grammar

Predictive Parsing Program Error Recovery When Do Errors Occur? Recall Predictive Parser Function: a + b $ Y X Z Input Predictive Parsing Program Stack Output Parsing Table M[A,a] If X is a terminal and it doesn’t match input. If M[ X, Input ] is empty – No allowable actions Consider two recovery techniques: A. Panic Mode B. Phrase-level Recovery

Panic-Mode Recovery Assume a non-terminal on the top of the stack. Idea: skip symbols on the input until a token in a selected set of synchronizing tokens is found. The choice for a synchronizing set is important. some ideas: define the synchronizing set of A to be FOLLOW(A). then skip input until a token in FOLLOW(A) appears and then pop A from the stack. Resume parsing... add symbols of FIRST(A) into synchronizing set. In this case we skip input and once we find a token in FIRST(A) we resume parsing from A. Productions that lead to  if available might be used. If a terminal appears on top of the stack and does not match to the input == pop it and and continue parsing (issuing an error message saying that the terminal was inserted).

Panic Mode Recovery, II General Approach: Modify the empty cells of the Parsing Table. if M[A,a] = {empty} and a belongs to Follow(A) then we set M[A,a] = “synch” Error-recovery Strategy : If A=top-of-the-stack and a=current-input, If A is NT and M[A,a] = {empty} then skip a from the input. If A is NT and M[A,a] = {synch} then pop A. If A is a terminal and A!=a then pop token (essentially inserting it).

Revised Parsing Table / Example Non-terminal INPUT SYMBOL id + * ( ) $ E E’ T T’ F ETE’ TFT’ Fid E’+TE’ T’ T’*FT’ F(E) E’ Skip input symbol From Follow sets. Pop top of stack NT “synch” action

Revised Parsing Table / Example(2) $E’T $E’T’F $E’T’id $E’T’ $E’T’F* $E’ $E’T+ $ + id * + id$ id * + id$ * + id$ + id$ id$ STACK INPUT Remark error, skip + Possible Error Msg: “Misplaced + I am skipping it” error, M[F,+] = synch F has been popped Possible Error Msg: “Missing Term”

Writing Error Messages Keep input counter(s) Recall: every non-terminal symbolizes an abstract language construct. Examples of Error-messages for our usual grammar E = means expression. top-of-stack is E, input is + “Error at location i, expressions cannot start with a ‘+’” or “error at location i, invalid expression” Similarly for E, * E’= expression ending. Top-of-stack is E’, input is * or id “Error: expression starting at j is badly formed at location i” Requires: every time you pop an ‘E’ remember the location

Writing Error-Messages, II Messages for Synch Errors. Top-of-stack is F input is + “error at location i, expected summation/multiplication term missing” Top-of-stack is E input is ) “error at location i, expected expression missing”

Writing Error Messages, III When the top-of-the stack is a terminal that does not match… E.g. top-of-stack is id and the input is + “error at location i: identifier expected” Top-of-stack is ) and the input is terminal other than ) Every time you match an ‘(‘ push the location of ‘(‘ to a “left parenthesis” stack. this can also be done with the symbol stack. When the mismatch is discovered look at the left parenthesis stack to recover the location of the parenthesis. “error at location i: left parenthesis at location m has no closing right parenthesis” E.g. consider ( id * + (id id) $

Incorporating Error-Messages to the Table Empty parsing table entries can now fill with the appropriate error-reporting techniques.

Phrase-Level Recovery Fill in blanks entries of parsing table with error handling routines that do not only report errors but may also: change/ insert / delete / symbols into the stack and / or input stream + issue error message Problems: Modifying stack has to be done with care, so as to not create possibility of derivations that aren’t in language infinite loops must be avoided Essentially extends panic mode to have more complete error handling

How Would You Implement TD Parser Stack – Easy to handle. Write ADT to manipulate its contents Input Stream – Responsibility of lexical analyzer Key Issue – How is parsing table implemented ? One approach: Assign unique IDS Non-terminal INPUT SYMBOL id + * ( ) $ E E’ T T’ F ETE’ TFT’ Fid E’+TE’ T’ T’*FT’ F(E) E’ synch All rules have unique IDs Also for blanks which handle errors Ditto for synch actions

Revised Parsing Table: INPUT SYMBOL Non-terminal id + * ( ) $ E 1 18 19 1 9 10 E’ 20 2 21 22 3 3 T 4 11 23 4 12 13 T’ 24 6 5 25 6 6 F 8 14 15 7 16 17 1 ETE’ 2 E’+TE’ 3 E’ 4 TFT’ 5 T’*FT’ 6 T’ 7 F(E) 8 Fid 9 – 17 : Sync Actions 18 – 25 : Error Handlers

Revised Parsing Table: (2) Each # ( or set of #s) corresponds to a procedure that: Uses Stack ADT Gets Tokens Prints Error Messages Prints Diagnostic Messages Handles Errors

How is Parser Constructed ? One large CASE statement: state = M[ top(s), current_token ] switch (state) { case 1: proc_E_TE’( ) ; break ; … case 8: proc_F_id( ) ; case 9: proc_sync_9( ) ; case 17: proc_sync_17( ) ; case 18: case 25: } Combine  put in another switch Some sync actions may be same Some error handlers may be similar Procs to handle errors

Final Comments – Top-Down Parsing So far, We’ve examined grammars and language theory and its relationship to parsing Key concepts: Rewriting grammar into an acceptable form Examined Top-Down parsing: Brute Force : Transition diagrams & recursion Elegant : Table driven We’ve identified its shortcomings: Not all grammars can be made LL(1) ! Bottom-Up Parsing - Future