Lecture # 10 Grammar Problems

Presentation transcript:

Problems with grammar
- Ambiguity
- Left Recursion
- Left Factoring
- Removal of Useless Symbols
These can create problems for the parser during the syntax analysis phase.

Grammar Problem
Consider
    S → if E then S else S | if E then S
- What is the parse tree for: if E then if E then S else S
- There are two possible parse trees: the else can attach to the inner if or to the outer if.
This problem is called ambiguity. A CFG is ambiguous if at least one terminal string has more than one leftmost derivation from the start symbol.
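The two parse trees can be counted mechanically. The sketch below is only an illustration, not part of the lecture: the function name count_parse_trees, the dictionary encoding of the grammar, and the placeholder terminals e and s (added so that E and S can derive terminal strings) are all assumptions. It also assumes the grammar has no ε-productions and no unit cycles.

```python
from functools import lru_cache


def count_parse_trees(grammar, start, tokens):
    """Count the parse trees that derive `tokens` from `start`.

    Assumes no epsilon productions and no unit cycles, so every
    symbol on a right-hand side covers at least one token.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(symbol, i, j):
        # A terminal matches exactly one identical token.
        if symbol not in grammar:
            return 1 if j == i + 1 and tokens[i] == symbol else 0
        return sum(count_sequence(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_sequence(rhs, i, j):
        # Number of ways the symbols in `rhs` can cover tokens[i:j].
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        total = 0
        # `first` takes tokens[i:k]; leave one token per remaining symbol.
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_sequence(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))


# Hypothetical concrete version of the slide's grammar:
# lowercase e and s are placeholder terminals, not from the lecture.
DANGLING_ELSE = {
    "S": [("if", "E", "then", "S", "else", "S"),
          ("if", "E", "then", "S"),
          ("s",)],
    "E": [("e",)],
}

tokens = "if e then if e then s else s".split()
print(count_parse_trees(DANGLING_ELSE, "S", tokens))  # prints 2
```

A count above 1 for any input is enough to show that a grammar is ambiguous; a count of 1 for one input proves nothing about the grammar as a whole.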

Ambiguity
- There is no general algorithm that decides whether a CFG is ambiguous (the problem is undecidable).
- There is no standard procedure for eliminating ambiguity.
- Some languages are inherently ambiguous: every grammar that generates them is ambiguous.

How to eliminate Ambiguity? Method 1
If the ambiguity is of the form:
    S → α S β S | α1 | ... | αn
Rewrite:
    S → α S β S' | S'
    S' → α1 | ... | αn

How to eliminate Ambiguity? Method 2
Binding with parentheses:
    S → S v S | S ^ S | ~ S | A
    A → p | q | r
The string pvq^r has two parse trees: one groups it as (p v q) ^ r, the other as p v (q ^ r).

How to eliminate Ambiguity?
Ambiguity can be eliminated by parenthesizing the right-hand sides of the two binary rules, as shown below:
    S → (S v S) | (S ^ S) | ~ S | A
    A → p | q | r

How to eliminate Ambiguity?
The parenthesizing technique is simple but has a serious drawback: adding new terminal symbols alters the language. Even so, the technique is very useful in programming languages.

How to eliminate Ambiguity? Method 3
Fixing the order of applying rules: the following grammar is ambiguous because bcb can be derived in two different ways:
    S → bS | Sb | c

How to eliminate Ambiguity?
We can modify the grammar so that the left-side b's, if any, are always generated first. The string bcb then has only one parse tree, and the grammar is unambiguous.
    S → bS | A
    A → Ab | c
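The definition of ambiguity in terms of leftmost derivations can be checked directly for bcb. The following is a rough sketch under stated assumptions: the function name count_leftmost_derivations and the dictionary encoding are hypothetical, and the search only terminates for grammars without ε-productions and without unit cycles, since it prunes sentential forms that grow longer than the target.

```python
def count_leftmost_derivations(grammar, start, target):
    """Count the leftmost derivations of `target` from `start`.

    Assumes no epsilon productions and no unit cycles, so a
    sentential form longer than the target can be discarded.
    """
    target = tuple(target)

    def expand(form):
        if len(form) > len(target):
            return 0
        for pos, symbol in enumerate(form):
            if symbol in grammar:
                break  # leftmost nonterminal found
            if symbol != target[pos]:
                return 0  # terminal prefix already mismatches
        else:
            # No nonterminal left: count it only if it is the target.
            return 1 if form == target else 0
        return sum(expand(form[:pos] + rhs + form[pos + 1:])
                   for rhs in grammar[symbol])

    return expand((start,))


AMBIGUOUS = {"S": [("b", "S"), ("S", "b"), ("c",)]}
FIXED = {"S": [("b", "S"), ("A",)],
         "A": [("A", "b"), ("c",)]}

print(count_leftmost_derivations(AMBIGUOUS, "S", "bcb"))  # prints 2
print(count_leftmost_derivations(FIXED, "S", "bcb"))      # prints 1
```

The two derivations in the first grammar are S ⇒ bS ⇒ bSb ⇒ bcb and S ⇒ Sb ⇒ bSb ⇒ bcb; the rewritten grammar allows only S ⇒ bS ⇒ bA ⇒ bAb ⇒ bcb.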

How to eliminate Ambiguity? Method 4
Eliminate redundant rules: the CFG below is ambiguous because it can generate ab through either B or D.
    S → B | D
    B → ab | b
    D → ab | d
We can delete one of the two ab rules to make the grammar unambiguous:
    S → B | D
    B → ab | b
    D → d

Grammar problems
Because a top-down parser builds a leftmost derivation while scanning the input from left to right, productions of the form A → A x may cause endless recursion. Such grammars are called left-recursive, and they must be transformed before a top-down parser can use them.
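The endless recursion is easy to reproduce. The sketch below is illustrative only: it assumes a second alternative A → y (not on the slide) so that the grammar generates something, and the function names are hypothetical. The first procedure mirrors A → A x | y literally; the second parses the same strings after rewriting the rule as A → y B, B → x B | ε.

```python
def parse_a_left_recursive(tokens, pos=0):
    """Naive procedure for A -> A x | y; returns the end position or None."""
    # Alternative A -> A x: the procedure calls itself at the same
    # position before consuming any token, so it never makes progress.
    end = parse_a_left_recursive(tokens, pos)
    if end is not None and end < len(tokens) and tokens[end] == "x":
        return end + 1
    # Alternative A -> y is never reached.
    if pos < len(tokens) and tokens[pos] == "y":
        return pos + 1
    return None


def parse_a_transformed(tokens, pos=0):
    """Procedure for A -> y B, B -> x B | epsilon (no left recursion)."""
    if pos >= len(tokens) or tokens[pos] != "y":
        return None
    pos += 1
    # B -> x B | epsilon, written as a loop instead of a recursive call.
    while pos < len(tokens) and tokens[pos] == "x":
        pos += 1
    return pos


def recurses_forever(parser, tokens):
    """Report whether `parser` exhausts Python's recursion limit."""
    try:
        parser(tokens)
    except RecursionError:
        return True
    return False


print(recurses_forever(parse_a_left_recursive, list("yxx")))  # prints True
print(parse_a_transformed(list("yxx")))                       # prints 3
```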

Left Recursion
A grammar is left recursive if, for some non-terminal A, there is a derivation A ⇒+ A α.
There are three types of left recursion:
- direct (A → A x)
- indirect (A → B C, B → A)
- hidden (A → B A, B → ε)

How to eliminate Left recursion?
To eliminate direct left recursion, replace
    A → A α1 | A α2 | ... | A αm | β1 | β2 | ... | βn
with
    A → β1 B | β2 B | ... | βn B
    B → α1 B | α2 B | ... | αm B | ε
where B is a new non-terminal and no βi starts with A.
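The replacement rule translates almost line for line into code. This is a minimal sketch, assuming right-hand sides are encoded as tuples of symbols and ε as the empty tuple; the function name and its parameters are illustrative, not taken from the lecture.

```python
EPSILON = ()  # the empty right-hand side stands for epsilon


def eliminate_direct_left_recursion(name, alternatives, new_name):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion.

    Returns a dict with the new productions for `name` and, if any
    rewriting was needed, for the fresh nonterminal `new_name`.
    """
    # alpha_i: what follows the leading A in each left-recursive rule.
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == name]
    # beta_j: the alternatives that do not start with A.
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != name]
    if not alphas:
        return {name: list(alternatives)}
    return {
        name: [beta + (new_name,) for beta in betas],
        new_name: [alpha + (new_name,) for alpha in alphas] + [EPSILON],
    }


# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
result = eliminate_direct_left_recursion(
    "E", [("E", "+", "T"), ("T",)], "E'")
for head, alts in result.items():
    print(head, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```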

Left recursion
How about this:
    S → E
    E → E+T
    E → T
    T → E-T
    T → id
There is direct recursion: E → E+T
There is indirect recursion: T → E-T, E → T
Algorithm for eliminating indirect recursion:
    List the nonterminals in some order A1, A2, ..., An
    for i = 1 to n
        for j = 1 to i-1
            if there is a production Ai → Aj γ, replace Aj with its rhs
        eliminate any direct left recursion on Ai

Eliminating indirect left recursion
Ordering: S, E, T, F
Initial grammar:
    S → E
    E → E+T | T
    T → E-T | F
    F → E*F | id
i=S: nothing to do.
i=E: eliminate the direct left recursion on E:
    S → E
    E → TE'
    E' → +TE' | ε
    T → E-T | F
    F → E*F | id
i=T, j=E: replace E in T → E-T:
    T → TE'-T | F
then eliminate the direct left recursion on T:
    S → E
    E → TE'
    E' → +TE' | ε
    T → FT'
    T' → E'-TT' | ε
    F → E*F | id

Eliminating indirect left recursion
i=F, j=E: replace E in F → E*F:
    F → TE'*F | id
i=F, j=T: replace T in F → TE'*F:
    F → FT'E'*F | id
then eliminate the direct left recursion on F:
    S → E
    E → TE'
    E' → +TE' | ε
    T → FT'
    T' → E'-TT' | ε
    F → idF'
    F' → T'E'*FF' | ε
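The whole procedure can be replayed in code to check the hand computation. This is a sketch under assumptions: the grammar is a dictionary from nonterminal to a list of symbol tuples, ε is the empty tuple, and each new nonterminal is named by appending a prime. The function name is illustrative.

```python
def eliminate_left_recursion(grammar, order):
    """Apply the ordering-based algorithm to a copy of `grammar`."""
    g = {nt: [tuple(rhs) for rhs in alts] for nt, alts in grammar.items()}
    for i, a_i in enumerate(order):
        # Substitute every earlier nonterminal A_j that leads a rule of A_i.
        for a_j in order[:i]:
            expanded = []
            for rhs in g[a_i]:
                if rhs and rhs[0] == a_j:
                    expanded.extend(delta + rhs[1:] for delta in g[a_j])
                else:
                    expanded.append(rhs)
            g[a_i] = expanded
        # Eliminate any direct left recursion that is now visible on A_i.
        alphas = [rhs[1:] for rhs in g[a_i] if rhs and rhs[0] == a_i]
        if alphas:
            betas = [rhs for rhs in g[a_i] if not rhs or rhs[0] != a_i]
            new = a_i + "'"  # naming scheme is an assumption
            g[a_i] = [beta + (new,) for beta in betas]
            g[new] = [alpha + (new,) for alpha in alphas] + [()]
    return g


GRAMMAR = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("E", "-", "T"), ("F",)],
    "F": [("E", "*", "F"), ("id",)],
}

result = eliminate_left_recursion(GRAMMAR, ["S", "E", "T", "F"])
for head, alts in result.items():
    print(head, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```

With the ordering S, E, T, F the printed productions agree with the final grammar of the worked example. A different ordering generally produces a different, but still non-left-recursive, grammar.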

Grammar problems
Consider
    S → if E then S else S | if E then S
- Which of the two productions should we use to expand non-terminal S when the next token is if?
- We can solve this problem by factoring out the common part of these rules. This postpones the decision about which rule to choose until we have more information (namely, whether there is an else or not).
- This is called left factoring.

Left factoring
    A → α β1 | α β2 | ... | α βn | γ
becomes
    A → α B | γ
    B → β1 | β2 | ... | βn
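One factoring step can be sketched as follows. The function names, the tuple encoding of right-hand sides, and the extra alternative S → s are assumptions made for illustration. The sketch factors only the first group of alternatives that share a leading symbol, so repeated application may be needed for larger grammars.

```python
def common_prefix(sequences):
    """Longest sequence of symbols shared by the start of all sequences."""
    prefix = []
    for symbols in zip(*sequences):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return tuple(prefix)


def left_factor_once(name, alternatives, new_name):
    """Factor the first group of alternatives that share a first symbol."""
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[:1], []).append(rhs)
    for first, group in groups.items():
        if first and len(group) > 1:
            alpha = common_prefix(group)            # the shared part
            others = [rhs for rhs in alternatives if rhs not in group]
            return {
                name: [alpha + (new_name,)] + others,          # A -> alpha B | gamma
                new_name: [rhs[len(alpha):] for rhs in group],  # B -> beta_i
            }
    return {name: list(alternatives)}


# S -> if E then S else S | if E then S | s   (s is a placeholder terminal)
factored = left_factor_once(
    "S",
    [("if", "E", "then", "S", "else", "S"),
     ("if", "E", "then", "S"),
     ("s",)],
    "S'")
for head, alts in factored.items():
    print(head, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alts))
```

After factoring, a parser expands S on the token if without committing to either form, and chooses between else S and ε only when it reaches S'.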

Grammar problems
A symbol X ∈ V is useless if
- there is no derivation from X to any string of terminals (non-terminating), or
- there is no derivation from S that reaches a sentential form containing X (non-reachable).
Reduced grammar = a grammar that does not contain any useless symbols.

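Useless symbols
In order to remove useless symbols, apply two algorithms:
- First, remove all non-terminating symbols.
- Then, remove all non-reachable symbols.
The order is important!
- For example, consider S ⇒+ α X β where α contains a non-terminating symbol. What will happen if we apply the algorithms in the wrong order? X is first judged reachable; if removing the non-terminating symbol then deletes the only path to X, X stays in the grammar although it can no longer be reached.
Concrete example: S → AB | a, A → a. B is non-terminating, so S → AB is deleted and A becomes unreachable. The correct order yields S → a; the wrong order leaves the useless rule A → a behind.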

Useless symbols
Example
Initial grammar:
    S → AB | CA
    A → a
    B → CB | AB
    C → cB | b
    D → aD | d
Algorithm 1 (terminating symbols):
- A is in because of A → a
- C is in because of C → b
- D is in because of D → d
- S is in because A, C are in and S → CA
B is never added, because both of its productions contain B.

Useless symbols
Example continued
After algorithm 1:
    S → CA
    A → a
    C → b
    D → aD | d
Algorithm 2 (reachable symbols):
- S is in because it is the start symbol
- C and A are in because S is in and S → CA
D is never reached from S, so it is removed.
Final grammar:
    S → CA
    A → a
    C → b
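Both algorithms are small fixed-point computations, and running them on the example confirms the final grammar. The sketch below makes some assumptions of its own: every nonterminal must appear as a dictionary key (with an empty list if it has no productions), any other symbol is treated as a terminal, and the function names are illustrative.

```python
def terminating_symbols(grammar):
    """Nonterminals that derive at least one string of terminals."""
    terminating = set()
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            if nt in terminating:
                continue
            # nt terminates if some rhs uses only terminals and
            # nonterminals already known to terminate.
            if any(all(s not in grammar or s in terminating for s in rhs)
                   for rhs in alts):
                terminating.add(nt)
                changed = True
    return terminating


def remove_non_terminating(grammar):
    """Algorithm 1: drop non-terminating symbols and rules using them."""
    keep = terminating_symbols(grammar)
    return {nt: [rhs for rhs in alts
                 if all(s not in grammar or s in keep for s in rhs)]
            for nt, alts in grammar.items() if nt in keep}


def remove_unreachable(grammar, start):
    """Algorithm 2: drop nonterminals that cannot be reached from start."""
    reachable, stack = {start}, [start]
    while stack:
        for rhs in grammar.get(stack.pop(), []):
            for s in rhs:
                if s in grammar and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {nt: alts for nt, alts in grammar.items() if nt in reachable}


EXAMPLE = {
    "S": [("A", "B"), ("C", "A")],
    "A": [("a",)],
    "B": [("C", "B"), ("A", "B")],
    "C": [("c", "B"), ("b",)],
    "D": [("a", "D"), ("d",)],
}
reduced = remove_unreachable(remove_non_terminating(EXAMPLE), "S")
print(reduced)

# Order matters: S -> AB | a, A -> a, with B having no productions.
SMALL = {"S": [("A", "B"), ("a",)], "A": [("a",)], "B": []}
right_order = remove_unreachable(remove_non_terminating(SMALL), "S")
wrong_order = remove_non_terminating(remove_unreachable(SMALL, "S"))
print(right_order)  # only S -> a remains
print(wrong_order)  # A -> a is left behind
```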