AUTOMATA THEORY. Chapter 05: CONTEXT-FREE GRAMMARS AND LANGUAGES.



Introduction
Context-free grammars (CFGs) have played a central role in compiler technology since the 1960s.
They turned the implementation of parsers from a time-consuming, ad hoc task into a routine job.
Parsers are functions that discover the structure of a program.

An informal example
Let us consider the language of palindromes.
A palindrome is a string that reads the same forward and backward, such as "otto" or "madamimadam".
We describe only the palindromes over the alphabet {0, 1}, for example 0110 and 11011.

A Context-free Grammar for Palindromes
1. P → ε
2. P → 0
3. P → 1
4. P → 0P0
5. P → 1P1
This grammar covers binary strings only.

Definition of CFG
A CFG is a way of describing a language by recursive rules called productions.
A CFG consists of:
1. A finite set of terminals (also called symbols or terminal symbols).
2. A finite set of variables (also called nonterminals).
3. A start symbol (also called the start variable).
4. A finite set of productions (also called rules).

Definition of CFG (continued)
Each production consists of:
a. The head of the production, which is a variable.
b. The production symbol →.
c. The body of the production, a string of zero or more terminals and variables.

Definition of CFG (continued)
The four components of a CFG G can be represented as follows:
G = (V, T, P, S)
where V is the set of variables, T the set of terminals, P the set of productions, and S the start variable.

A Context-free Grammar for Palindromes
The grammar for the palindromes is represented by
G_pal = ({P}, {0, 1}, A, P)
where A represents the set of five productions:
P → ε
P → 0
P → 1
P → 0P0
P → 1P1
This grammar covers binary strings only.
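The 4-tuple above can be written down almost literally in code. The following is a minimal sketch, not part of the original slides: the names VARIABLES, TERMINALS, PRODUCTIONS, G_PAL and derivable_from_P are illustrative assumptions, and the empty string "" stands for ε. The membership test follows the five productions one by one.

```python
# Illustrative encoding of G_pal = ({P}, {0, 1}, A, P); all names are hypothetical.
VARIABLES = {"P"}
TERMINALS = {"0", "1"}
PRODUCTIONS = {"P": ["", "0", "1", "0P0", "1P1"]}  # "" stands for ε
START = "P"
G_PAL = (VARIABLES, TERMINALS, PRODUCTIONS, START)


def derivable_from_P(w: str) -> bool:
    """Return True if P =>* w, mirroring the five productions directly."""
    if any(ch not in TERMINALS for ch in w):
        return False
    # Productions 1-3: P -> ε | 0 | 1
    if w in ("", "0", "1"):
        return True
    # Productions 4-5: P -> 0P0 | 1P1 (first and last symbols must agree)
    if w[0] == w[-1]:
        return derivable_from_P(w[1:-1])
    return False


if __name__ == "__main__":
    for word in ("0110", "11011", "01"):
        print(word, derivable_from_P(word))
```

The recursion peels one matching symbol off each end, which is exactly what productions 4 and 5 add when read from head to body.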

Example of CFG
A CFG for simple expressions in which the operators '+' and '*' are present. Identifiers use only the letters 'a' and 'b' and the digits '0' and '1'. Every identifier must begin with a or b, which may be followed by any string in {a, b, 0, 1}*.
G = ({E, I}, T, P, E)
T = {0, 1, a, b, +, *, (, )}
Productions:
1. E → I
2. E → E+E
3. E → E*E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1

Derivation using grammar
Derivation of (ab+ab0):
E ⇒ (E) ⇒ (E+E) ⇒ (I+E) ⇒ (Ib+E) ⇒ (ab+E) ⇒ (ab+I) ⇒ (ab+I0) ⇒ (ab+Ib0) ⇒ (ab+ab0)
Productions:
1. E → I
2. E → E+E
3. E → E*E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1
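The derivation of (ab+ab0) can be replayed mechanically by naming the production used at each step. This is a small sketch under stated assumptions: the helper names apply_leftmost and derive are hypothetical, every grammar symbol is a single character, and each step rewrites the leftmost variable (which is how the derivation above happens to proceed).

```python
# Productions numbered 1-10 as on the slide: number -> (head, body).
RULES = {
    1: ("E", "I"), 2: ("E", "E+E"), 3: ("E", "E*E"), 4: ("E", "(E)"),
    5: ("I", "a"), 6: ("I", "b"), 7: ("I", "Ia"), 8: ("I", "Ib"),
    9: ("I", "I0"), 10: ("I", "I1"),
}
VARIABLES = {"E", "I"}


def apply_leftmost(form: str, rule_no: int) -> str:
    """Rewrite the leftmost variable of `form` using production `rule_no`."""
    head, body = RULES[rule_no]
    for i, symbol in enumerate(form):
        if symbol in VARIABLES:
            if symbol != head:
                raise ValueError(f"rule {rule_no} does not apply to {symbol!r}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variable left to rewrite")


def derive(start: str, rule_numbers: list) -> list:
    """Return the list of sentential forms produced by the given rules."""
    forms = [start]
    for number in rule_numbers:
        forms.append(apply_leftmost(forms[-1], number))
    return forms


if __name__ == "__main__":
    steps = derive("E", [4, 2, 1, 8, 5, 1, 9, 8, 5])
    print(" => ".join(steps))
```

The rule sequence 4, 2, 1, 8, 5, 1, 9, 8, 5 reproduces the nine steps of the derivation and ends in the terminal string (ab+ab0).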

Example of CFG
A CFG for syntactically correct infix algebraic expressions in the variables x, y and z.
G = ({S}, T, P, S)
T = {x, y, z, -, +, *, /, (, )}
Productions:
S → x
S → y
S → z
S → S + S
S → S - S
S → S * S
S → S / S
S → ( S )

Derivation using grammar
Productions:
S → x
S → y
S → z
S → S + S
S → S - S
S → S * S
S → S / S
S → ( S )

An informal example

An example of CFG

LMD and RMD
LMD (Leftmost Derivation): At each step we replace the leftmost variable by one of its production bodies. Such a derivation is called a leftmost derivation. We indicate that a derivation is leftmost by using the relations ⇒_lm for a single step and ⇒*_lm for zero or more steps.
RMD (Rightmost Derivation): At each step we replace the rightmost variable by one of its production bodies. Such a derivation is called a rightmost derivation. We indicate that a derivation is rightmost by using the relations ⇒_rm for a single step and ⇒*_rm for zero or more steps.

Leftmost Derivation
CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
LMD of a*(a+b00), where every step is a ⇒_lm step:
E ⇒ E*E ⇒ I*E ⇒ a*E ⇒ a*(E) ⇒ a*(E+E) ⇒ a*(I+E) ⇒ a*(a+E) ⇒ a*(a+I) ⇒ a*(a+I0) ⇒ a*(a+I00) ⇒ a*(a+b00)

Rightmost Derivation
CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
RMD of a*(a+b00), where every step is a ⇒_rm step:
E ⇒ E*E ⇒ E*(E) ⇒ E*(E+E) ⇒ E*(E+I) ⇒ E*(E+I0) ⇒ E*(E+I00) ⇒ E*(E+b00) ⇒ E*(I+b00) ⇒ E*(a+b00) ⇒ I*(a+b00) ⇒ a*(a+b00)
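Whether a given sequence of sentential forms really is a leftmost or a rightmost derivation can be checked step by step. The sketch below is illustrative only: one_step and is_derivation are hypothetical helper names, and symbols are assumed to be single characters.

```python
PRODUCTIONS = {
    "E": ["I", "E+E", "E*E", "(E)"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}


def one_step(form: str, leftmost: bool = True) -> set:
    """All forms obtained by rewriting the leftmost (or rightmost) variable."""
    positions = [i for i, symbol in enumerate(form) if symbol in PRODUCTIONS]
    if not positions:
        return set()  # a terminal string cannot be rewritten further
    i = positions[0] if leftmost else positions[-1]
    return {form[:i] + body + form[i + 1:] for body in PRODUCTIONS[form[i]]}


def is_derivation(forms: list, leftmost: bool = True) -> bool:
    """Check that each form follows from the previous one in a single step."""
    return all(b in one_step(a, leftmost) for a, b in zip(forms, forms[1:]))


LMD = ["E", "E*E", "I*E", "a*E", "a*(E)", "a*(E+E)", "a*(I+E)", "a*(a+E)",
       "a*(a+I)", "a*(a+I0)", "a*(a+I00)", "a*(a+b00)"]
RMD = ["E", "E*E", "E*(E)", "E*(E+E)", "E*(E+I)", "E*(E+I0)", "E*(E+I00)",
       "E*(E+b00)", "E*(I+b00)", "E*(a+b00)", "I*(a+b00)", "a*(a+b00)"]

if __name__ == "__main__":
    print(is_derivation(LMD, leftmost=True))
    print(is_derivation(RMD, leftmost=False))
```

Both sequences end in the same terminal string; they differ only in the order in which the variables are replaced.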

The Language of a Grammar
If G = (V, T, P, S) is a CFG, the language of G, denoted L(G), is the set of terminal strings that have derivations from the start symbol. That is,
L(G) = { w in T* | S ⇒* w }
where ⇒* means derivation in zero or more steps using the productions of G.
If a language L is the language of some context-free grammar, then L is said to be a context-free language, or CFL.
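For a small grammar, the definition of L(G) can be explored by enumerating every terminal string up to a given length. The following is a rough sketch, not a general algorithm: language_up_to is a hypothetical name, symbols are single characters, and the search is only guaranteed to stop for grammars such as the palindrome grammar, where every production either removes the variable or adds terminals.

```python
from collections import deque

# Palindrome grammar; "" stands for ε.
PRODUCTIONS = {"P": ["", "0", "1", "0P0", "1P1"]}


def language_up_to(productions: dict, start: str, max_len: int) -> set:
    """Collect all terminal strings w with start =>* w and len(w) <= max_len."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is terminal.
        pos = next((i for i, s in enumerate(form) if s in productions), None)
        if pos is None:
            words.add(form)
            continue
        for body in productions[form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            terminal_count = sum(1 for s in new_form if s not in productions)
            # Prune forms that already contain too many terminals.
            if terminal_count <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words


if __name__ == "__main__":
    words = language_up_to(PRODUCTIONS, "P", 3)
    print(sorted(words, key=lambda w: (len(w), w)))
    # ['', '0', '1', '00', '11', '000', '010', '101', '111']
```

Every string printed reads the same forward and backward, as expected for the palindrome grammar.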

Parse Tree
A parse tree is a tree representation of a derivation that shows clearly how the symbols of a terminal string are grouped into substrings.
When used in a compiler, the parse tree is the data structure that represents the source program.
In a compiler, the tree structure of the source program facilitates the translation of the source program into executable code by allowing natural, recursive functions to perform this translation.
It is a graphical representation of a derivation.

Constructing Parse Tree
Let us fix a grammar G = (V, T, P, S). The parse trees for G are trees that satisfy the following conditions:
1. Each interior node is labeled by a variable in V.
2. Each leaf is labeled by either a variable, a terminal, or ε. A leaf labeled ε must be the only child of its parent.
3. If an interior node is labeled A, and its children are labeled X1, X2, …, Xk respectively, from the left, then A → X1X2…Xk is a production in P.
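The three conditions translate into a short recursive check. This is a sketch under assumptions: the Node class and is_parse_tree are hypothetical names, the grammar is the palindrome grammar, and the label "ε" marks an ε-leaf.

```python
from dataclasses import dataclass, field

EPSILON = "ε"
TERMINALS = {"0", "1"}
PRODUCTIONS = {"P": ["", "0", "1", "0P0", "1P1"]}  # "" stands for ε


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def is_parse_tree(node: Node) -> bool:
    """Check the three parse-tree conditions for the palindrome grammar."""
    if not node.children:
        # Condition 2: a leaf is a variable, a terminal, or ε.
        return (node.label in PRODUCTIONS or node.label in TERMINALS
                or node.label == EPSILON)
    # Condition 1: an interior node must be labeled by a variable.
    if node.label not in PRODUCTIONS:
        return False
    labels = [child.label for child in node.children]
    # An ε-leaf is only allowed as an only child; it stands for the empty body.
    body = "" if labels == [EPSILON] else "".join(labels)
    # Condition 3: the children, read from the left, form a production body.
    if body not in PRODUCTIONS[node.label]:
        return False
    return all(is_parse_tree(child) for child in node.children)


# Parse tree for 0110: P -> 0P0, inner P -> 1P1, innermost P -> ε.
TREE_0110 = Node("P", [
    Node("0"),
    Node("P", [Node("1"), Node("P", [Node(EPSILON)]), Node("1")]),
    Node("0"),
])

if __name__ == "__main__":
    print(is_parse_tree(TREE_0110))
```

A tree whose root P has the children 0 and 1 is rejected, because P → 01 is not a production.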

Parse Tree Example
A parse tree showing the derivation of I+E from E:
E
├─ E
│  └─ I
├─ +
└─ E

Parse Tree Example (continued)
A parse tree showing the derivation P ⇒* 0110.
Productions:
1. P → ε
2. P → 0
3. P → 1
4. P → 0P0
5. P → 1P1
Tree:
P
├─ 0
├─ P
│  ├─ 1
│  ├─ P
│  │  └─ ε
│  └─ 1
└─ 0

The Yield of a Parse Tree
If we look at the leaves of any parse tree and concatenate them from the left, we get a string called the yield of the parse tree, which is always a string that is derived from the root variable.
Of particular importance are the parse trees for which:
1. The yield is a terminal string. That is, all leaves are labeled either with a terminal or with ε.
2. The root is labeled by the start symbol.
The yields of these trees are exactly the strings in the language of the grammar.
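Computing the yield is a left-to-right walk over the leaves. In this minimal sketch a tree is a nested tuple (label, child, child, …), which is an assumed representation chosen for brevity, and tree_yield is a hypothetical name.

```python
EPSILON = "ε"


def tree_yield(tree: tuple) -> str:
    """Concatenate the leaf labels from the left; ε contributes nothing."""
    label, *children = tree
    if not children:
        return "" if label == EPSILON else label
    return "".join(tree_yield(child) for child in children)


# Parse tree for a*(a+b00) in the expression grammar E, I.
TREE = ("E",
        ("E", ("I", ("a",))),
        ("*",),
        ("E",
         ("(",),
         ("E",
          ("E", ("I", ("a",))),
          ("+",),
          ("E", ("I", ("I", ("I", ("b",)), ("0",)), ("0",)))),
         (")",)))

# A tree whose leaves still contain variables: its yield is not terminal.
PARTIAL = ("E", ("E", ("I",)), ("+",), ("E",))

if __name__ == "__main__":
    print(tree_yield(TREE))     # a*(a+b00)
    print(tree_yield(PARTIAL))  # I+E
```

The second tree shows that a yield may still contain variables; only trees with a terminal yield and the start symbol at the root contribute strings to L(G).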

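Parse tree showing a*(a+b00)
E
├─ E
│  └─ I
│     └─ a
├─ *
└─ E
   ├─ (
   ├─ E
   │  ├─ E
   │  │  └─ I
   │  │     └─ a
   │  ├─ +
   │  └─ E
   │     └─ I
   │        ├─ I
   │        │  ├─ I
   │        │  │  └─ b
   │        │  └─ 0
   │        └─ 0
   └─ )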

Parse tree showing ( x + y ) * x - z * y / ( x + x )

Parse tree showing The man read this book

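Inference, Derivations, and Parse Trees
The following ways of showing that a terminal string w belongs to the language of a variable A are equivalent:
1. Recursive inference
2. Derivation
3. Leftmost derivation
4. Rightmost derivation
5. Parse tree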

Self Study
Theorems 5.12, 5.14, 5.18

Ambiguous Grammar
So far we have assumed that a grammar uniquely determines a structure for each string in its language. Not every grammar does.
When a grammar fails to provide a unique structure for some string, it is known as an ambiguous grammar.
That is, some string in the language has more than one parse tree (equivalently, more than one leftmost derivation).

Ambiguous Grammar example
Let us consider a CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
Expression: a+a*a
LMD (each step is ⇒_lm): E ⇒ E+E ⇒ I+E ⇒ a+E ⇒ a+E*E ⇒ a+I*E ⇒ a+a*E ⇒ a+a*I ⇒ a+a*a
RMD (each step is ⇒_rm): E ⇒ E*E ⇒ E*I ⇒ E*a ⇒ E+E*a ⇒ E+I*a ⇒ E+a*a ⇒ I+a*a ⇒ a+a*a

LMD
E
├─ E
│  └─ I
│     └─ a
├─ +
└─ E
   ├─ E
   │  └─ I
   │     └─ a
   ├─ *
   └─ E
      └─ I
         └─ a
Fig: Tree with yield a+a*a

RMD
E
├─ E
│  ├─ E
│  │  └─ I
│  │     └─ a
│  ├─ +
│  └─ E
│     └─ I
│        └─ a
├─ *
└─ E
   └─ I
      └─ a
Fig: Tree with yield a+a*a
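One way to see the ambiguity concretely is to count the parse trees of a string. The sketch below is an illustration only: count_parse_trees is a hypothetical name, and the counting relies on the specific shape of the grammar E → I | E+E | E*E | (E), in which an identifier has exactly one parse tree.

```python
from functools import lru_cache


def count_parse_trees(text: str) -> int:
    """Count parse trees of `text` with root E in the ambiguous grammar."""

    def is_identifier(s: str) -> bool:
        # I -> a | b | Ia | Ib | I0 | I1: starts with a or b, then a, b, 0, 1.
        return len(s) >= 1 and s[0] in "ab" and all(c in "ab01" for c in s)

    @lru_cache(maxsize=None)
    def count_e(i: int, j: int) -> int:
        s = text[i:j]
        if not s:
            return 0
        total = 1 if is_identifier(s) else 0        # E -> I
        for k in range(i + 1, j - 1):               # E -> E+E | E*E
            if text[k] in "+*":
                total += count_e(i, k) * count_e(k + 1, j)
        if s[0] == "(" and s[-1] == ")":            # E -> (E)
            total += count_e(i + 1, j - 1)
        return total

    return count_e(0, len(text))


if __name__ == "__main__":
    print(count_parse_trees("a+a*a"))    # 2
    print(count_parse_trees("(a+a)*a"))  # 1
```

The string a+a*a has two parse trees, one with + at the root and one with * at the root, while the parenthesized string has only one.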

Removing Ambiguity from Grammar
There are two causes of ambiguity in the expression grammar:
1. The precedence of operators is not respected.
2. A sequence of identical operators can group either from the left or from the right.

[Figures (Prof. Busch, LSU): two derivation trees for the same expression, labeled "Good Tree" and "Bad Tree"; the expression result is computed using each tree.]

Removing Ambiguity from Grammar
The solution to the problem of enforcing precedence is to introduce several different variables.
1. A factor is an expression that cannot be broken apart by any adjacent operator. The only factors in our expression language are:
   i. Identifiers: it is not possible to separate the letters of an identifier by attaching an operator.
   ii. Any parenthesized expression, no matter what appears inside the parentheses.
2. A term is an expression that cannot be broken apart by the '+' operator. A term is a product of one or more factors.
3. An expression is a sum of one or more terms.

Removing Ambiguity from Grammar
Let us consider the ambiguous CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
An unambiguous expression grammar:
I → a | b | Ia | Ib | I0 | I1
F → I | (E)
T → F | T*F
E → T | E+T

Unambiguous Grammar example
CFG:
I → a | b | Ia | Ib | I0 | I1
F → I | (E)
T → F | T*F
E → T | E+T
Expression: a+a*a
Derivation: E ⇒ E+T ⇒ T+T ⇒ F+T ⇒ I+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+I*F ⇒ a+a*F ⇒ a+a*I ⇒ a+a*a
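Because the unambiguous grammar fixes both precedence and grouping, a simple hand-written parser can follow it rule by rule. The following is a sketch, not a method taken from the slides: the function name parse is an assumption, the left-recursive rules T → T*F and E → E+T are implemented as loops (a standard way to handle left recursion by hand), and the result is a simplified operator tree of nested tuples rather than the full parse tree.

```python
def parse(text: str):
    """Parse `text` with E -> T | E+T, T -> F | T*F, F -> I | (E)."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def identifier():                 # I -> a | b | Ia | Ib | I0 | I1
        nonlocal pos
        start = pos
        if peek() not in ("a", "b"):
            raise SyntaxError(f"identifier expected at position {pos}")
        pos += 1
        while peek() in ("a", "b", "0", "1"):
            pos += 1
        return text[start:pos]

    def factor():                     # F -> I | (E)
        nonlocal pos
        if peek() == "(":
            pos += 1
            inner = expression()
            if peek() != ")":
                raise SyntaxError(f"')' expected at position {pos}")
            pos += 1
            return inner
        return identifier()

    def term():                       # T -> F | T*F (groups from the left)
        nonlocal pos
        left = factor()
        while peek() == "*":
            pos += 1
            left = ("*", left, factor())
        return left

    def expression():                 # E -> T | E+T (groups from the left)
        nonlocal pos
        left = term()
        while peek() == "+":
            pos += 1
            left = ("+", left, term())
        return left

    tree = expression()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree


if __name__ == "__main__":
    print(parse("a+a*a"))  # ('+', 'a', ('*', 'a', 'a'))
```

For a+a*a the only result has + at the top and the product a*a beneath it, which matches the single derivation shown above.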

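Inherent Ambiguity
A context-free language is inherently ambiguous if every grammar for it is ambiguous. An example of such a language:
L = {a^n b^n c^m d^m | n ≥ 1, m ≥ 1} ∪ {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}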

Unambiguous Grammar example
E
├─ E
│  └─ T
│     └─ F
│        └─ I
│           └─ a
├─ +
└─ T
   ├─ T
   │  └─ F
   │     └─ I
   │        └─ a
   ├─ *
   └─ F
      └─ I
         └─ a
Fig: Tree with yield a+a*a
E ⇒ E+T ⇒ T+T ⇒ F+T ⇒ I+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+I*F ⇒ a+a*F ⇒ a+a*I ⇒ a+a*a

Example of CFG
A CFG that generates prefix expressions with operands x and y and binary operators +, -, *.
Productions:
E → x
E → y
E → +EE
E → -EE
E → *EE
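Each production of the prefix grammar starts with a different terminal, so a recognizer can decide which production to use by looking at one symbol. The sketch below illustrates this; parse_prefix and is_prefix_expression are hypothetical names.

```python
def parse_prefix(s: str, i: int = 0) -> int:
    """Return the index just past one expression E that starts at s[i]."""
    if i >= len(s):
        raise ValueError("unexpected end of input")
    if s[i] in "xy":                      # E -> x | y
        return i + 1
    if s[i] in "+-*":                     # E -> +EE | -EE | *EE
        after_first = parse_prefix(s, i + 1)
        return parse_prefix(s, after_first)
    raise ValueError(f"unexpected symbol {s[i]!r} at position {i}")


def is_prefix_expression(s: str) -> bool:
    """True if the whole string is derivable from E."""
    try:
        return parse_prefix(s) == len(s)
    except ValueError:
        return False


if __name__ == "__main__":
    for candidate in ("+x*yx", "+x", "xy"):
        print(candidate, is_prefix_expression(candidate))
```

The string +x*yx is accepted, while +x (missing operand) and xy (extra operand) are rejected.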

Example of CFG
Design a CFG for the set of all strings with an equal number of a's and b's.
Productions:
S → aSbS | bSaS | ε

Example of CFG
Design a CFG over {a, b} such that no string in L(G) has ba as a substring.
Productions:
S → aS | Sb | a | b
This grammar generates the nonempty strings in which every a comes before every b.

Example of CFG
Design a CFG for the regular expression 0*1(0+1)*.
Productions:
S → A1B
A → 0A | ε
B → 0B | 1B | ε
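The grammar can be checked against the regular expression by brute force on short strings. This is a sketch under assumptions: derives is a hypothetical helper that tries every way of splitting the string among the symbols of a production body, which terminates here because the grammar has no left recursion (it would not be safe for a left-recursive grammar). In Python's re syntax the expression 0*1(0+1)* is written 0*1[01]*.

```python
import re
from functools import lru_cache
from itertools import product

# "" stands for ε.
PRODUCTIONS = {"S": ["A1B"], "A": ["0A", ""], "B": ["0B", "1B", ""]}


@lru_cache(maxsize=None)
def derives(symbols: str, w: str) -> bool:
    """True if the string of grammar symbols `symbols` derives w."""
    if not symbols:
        return w == ""
    head, rest = symbols[0], symbols[1:]
    if head not in PRODUCTIONS:
        # Terminal: it must match the next input symbol.
        return w.startswith(head) and derives(rest, w[1:])
    # Variable: try every body and every split point of w.
    return any(
        derives(body, w[:k]) and derives(rest, w[k:])
        for body in PRODUCTIONS[head]
        for k in range(len(w) + 1)
    )


def agrees_with_regex(max_len: int) -> bool:
    """Compare grammar and regex on all binary strings up to max_len."""
    pattern = re.compile(r"0*1[01]*")
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            if derives("S", w) != bool(pattern.fullmatch(w)):
                return False
    return True


if __name__ == "__main__":
    print(agrees_with_regex(6))
```

Agreement on short strings is evidence, not a proof, that the grammar and the regular expression describe the same language.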

Example of CFG

Application of CFG
CFGs were originally conceived as a way to describe natural languages.
Two of their uses today:
1. Parsers
2. Markup languages (HTML, XML)
Parsers:
A parse tree is a graphical representation of a derivation.
Parsing is the process of determining whether a string of tokens can be generated by a grammar.
A compiler may not actually construct a parse tree. However, a parser must be capable of constructing such a tree.
A parser can be constructed for any grammar. The CFG is an essential concept for the implementation of parsers.

YACC Parser Generator
Tools such as YACC take a CFG as input and produce a parser.
    Exp : Id            {…}
        | Exp '+' Exp   {…}
        | Exp '*' Exp   {…}
        | '(' Exp ')'   {…}
        ;
    Id  : 'a'           {…}
        | 'b'           {…}
        | Id 'a'        {…}
        | Id 'b'        {…}
        | Id '0'        {…}
        | Id '1'        {…}
        ;

Rules for YACC Parser Generator
Rules:
1. The colon is used as the production symbol, in place of →.
2. Productions with the same head are grouped together, with bodies separated by the vertical bar.
3. The list of bodies for a given head ends with a semicolon.
4. Terminals are quoted with single quotes.
5. Variable names are unquoted.

Markup Language
Markup languages form a family of languages whose strings are documents with certain marks (called tags) in them.
Tags tell us something about the semantics of the various strings within the document.
a) The text as viewed:
The things I hate:
1. ABC xyz
2. AB ABC XYZ xy
b) The HTML source (tags omitted): The things I hate ABC xyz AB ABC XYZ xy
Tags used:
EM → emphasized string
P → paragraph
OL → ordered list
LI → list item

1. Char → a | A | …
2. Text → ε | Char Text
3. Doc → ε | Element Doc
4. Element → Text | <EM> Doc </EM> | <P> Doc | <OL> List </OL> | …
5. ListItem → <LI> Doc
6. List → ε | ListItem List

Thank You