Lecture 17: Theory of Automata:2014 Context Free Grammars.

Slides:



Advertisements
Similar presentations
So far... A language is a set of strings over an alphabet. We have defined languages by: (i) regular expressions (ii) finite state automata Both (i) and.
Advertisements

SIMPLIFYING GRAMMARS Definition: A useless symbol of a context-free
CS5371 Theory of Computation
Foundations of (Theoretical) Computer Science Chapter 2 Lecture Notes (Section 2.1: Context-Free Grammars) David Martin With some.
1 Module 28 Context Free Grammars –Definition of a grammar G –Deriving strings and defining L(G) Context-Free Language definition.
1 Lecture 29 Context Free Grammars –Examples of “real-life” grammars –Definition of a grammar G –Deriving strings and defining L(G) Context-Free Language.
COMMONWEALTH OF AUSTRALIA Copyright Regulations 1969 WARNING This material has been reproduced and communicated to you by or on behalf of Monash University.
Normal forms for Context-Free Grammars
Transparency No. P2C1-1 Formal Language and Automata Theory Part II Pushdown Automata and Context-Free Languages.
Chapter 3: Formal Translation Models
Specifying Languages CS 480/680 – Comparative Languages.
Lecture 9UofH - COSC Dr. Verma 1 COSC 3340: Introduction to Theory of Computation University of Houston Dr. Verma Lecture 9.
Context-Free Grammars Chapter 3. 2 Context-Free Grammars and Languages n Defn A context-free grammar is a quadruple (V, , P, S), where  V is.
L ECTURE 2 Chapter 3 Recursive Definitions. R ECURSIVE D EFINITION It is method of defining sets.
Lecture 21: Languages and Grammars. Natural Language vs. Formal Language.
Theory Of Automata By Dr. MM Alam
Formal Grammars Denning, Sections 3.3 to 3.6. Formal Grammar, Defined A formal grammar G is a four-tuple G = (N,T,P,  ), where N is a finite nonempty.
Lecture 16 Oct 18 Context-Free Languages (CFL) - basic definitions Examples.
CS/IT 138 THEORY OF COMPUTATION Chapter 1 Introduction to the Theory of Computation.
Context-free Grammars Example : S   Shortened notation : S  aSaS   | aSa | bSb S  bSb Which strings can be generated from S ? [Section 6.1]
A sentence (S) is composed of a noun phrase (NP) and a verb phrase (VP). A noun phrase may be composed of a determiner (D/DET) and a noun (N). A noun phrase.
Languages, Grammars, and Regular Expressions Chuck Cusack Based partly on Chapter 11 of “Discrete Mathematics and its Applications,” 5 th edition, by Kenneth.
Lecture # 19. Example Consider the following CFG ∑ = {a, b} Consider the following CFG ∑ = {a, b} 1. S  aSa | bSb | a | b | Λ The above CFG generates.
Chapter 5 Context-Free Grammars
Grammars CPSC 5135.
Languages & Grammars. Grammars  A set of rules which govern the structure of a language Fritz Fritz The dog The dog ate ate left left.
CS 3240: Languages and Computation Context-Free Languages.
So far... A language is a set of strings over an alphabet. We have defined languages by: (i) regular expressions (ii) finite state automata Both (i) and.
Lecture # 5 Pumping Lemma & Grammar
Complexity and Computability Theory I Lecture #9 Instructor: Rina Zviel-Girshin Lea Epstein.
Regular Grammars Chapter 7. Regular Grammars A regular grammar G is a quadruple (V, , R, S), where: ● V is the rule alphabet, which contains nonterminals.
CMSC 330: Organization of Programming Languages Context-Free Grammars.
Context Free Grammars.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 11 Midterm Exam 2 -Context-Free Languages Mälardalen University 2005.
Lecture 11 Theory of AUTOMATA
1 A well-parenthesized string is a string with the same number of (‘s as )’s which has the property that every prefix of the string has at least as many.
1 Simplification of Context-Free Grammars Some useful substitution rules. Removing useless productions. Removing -productions. Removing unit-productions.
CS 3240 – Chapter 5. LanguageMachineGrammar RegularFinite AutomatonRegular Expression, Regular Grammar Context-FreePushdown AutomatonContext-Free Grammar.
CS 208: Computing Theory Assoc. Prof. Dr. Brahim Hnich Faculty of Computer Sciences Izmir University of Economics.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
Grammars Hopcroft, Motawi, Ullman, Chap 5. Grammars Describes underlying rules (syntax) of programming languages Compilers (parsers) are based on such.
Grammars CS 130: Theory of Computation HMU textbook, Chap 5.
Recursive Definations Regular Expressions Ch # 4 by Cohen
1 Chapter 6 Simplification of CFGs and Normal Forms.
Grammars A grammar is a 4-tuple G = (V, T, P, S) where 1)V is a set of nonterminal symbols (also called variables or syntactic categories) 2)T is a finite.
Introduction Finite Automata accept all regular languages and only regular languages Even very simple languages are non regular (  = {a,b}): - {a n b.
Context-Free Languages
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Lecture 16: Modeling Computation Languages and Grammars Finite-State Machines with Output Finite-State Machines with No Output.
Formal Languages and Grammars
Discrete Structures ICS252 Chapter 5 Lecture 2. Languages and Grammars prepared By sabiha begum.
Mathematical Foundations of Computer Science Chapter 3: Regular Languages and Regular Grammars.
Transparency No. 1 Formal Language and Automata Theory Homework 5.
Chomsky Normal Form.
1 A well-parenthesized string is a string with the same number of (‘s as )’s which has the property that every prefix of the string has at least as many.
Lecture 02: Theory of Automata:2014 Asif Nawaz Theory of Automata.
Lecture 03: Theory of Automata:2014 Asif Nawaz Theory of Automata.
Recap lecture 31 Context Free Grammar, Terminals, non- terminals, productions, CFG, context Free language, examples.
Context Free Grammars & Parsing CPSC 388 Fall 2001 Ellen Walker Hiram College.
Chapter 2. Formal Languages Dept. of Computer Engineering, Hansung University, Sung-Dong Kim.
Describing Syntax and Semantics Chapter 3: Describing Syntax and Semantics Lectures # 6.
Context-Free Grammars: an overview
Formal Language & Automata Theory
Even-Even Devise a grammar that generates strings with even number of a’s and even number of b’s.
Theory Of Automata By Dr. MM Alam
CHAPTER 2 Context-Free Languages
Finite Automata and Formal Languages
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Teori Bahasa dan Automata Lecture 9: Contex-Free Grammars
Recap lecture 30 Deciding whether two languages are equivalent or not, example, deciding whether an FA accept any string or not, method 3, examples, finiteness.
Presentation transcript:

Lecture 17: Theory of Automata:2014 Context Free Grammars

Lecture 17: Theory of Automata: Context Free Grammars Three fundamental areas covered in the book are 1. Theory of Automata 2. Theory of Formal Languages 3. Theory of Turing Machines We have completed the first area. We begin exploring the second area in this chapter.

Lecture 17: Theory of Automata: Chapter topics Syntax As a Method for Defining Languages Symbolism for Generative Grammars Trees Lukasiewicz Notation Ambiguity The Total Language Tree

Lecture 17: Theory of Automata: Syntax as a Method for Defining Languages In Chapter 3 we recursively defined the set of valid arithmetic expressions as follows: Rule 1: Any number is in the set AE. Rule 2: If x and y are in AE, then so are (x), -(x), (x + y), (x - y), (x * y), (x/y), (x ** y) where ** is our notation for exponentiation Note that we use parentheses around every component factor to avoid ambiguity expressions such as and 8/4/2. There is a different way for defining the set AE: using a set of substitution rules similar to the grammatical rules.

Lecture 17: Theory of Automata: Defining AE by substitution rules Start → AE AE → (AE + AE) AE → (AE - AE) AE → (AE * AE) AE → (AE/AE) AE → (AE ** AE) AE → (AE) AE → -(AE) AE → AnyNumber

Lecture 17: Theory of Automata: Example We will show that ((3 + 4) * (6 + 7)) is in AE Start  AE  (AE * AE)  ((AE + AE) * (AE + AE))  ((3 + 4) * (6 + 7))

Lecture 17: Theory of Automata: Definition of Terms A word that cannot be replaced by anything is called terminal. – In the above example, the terminals are the phrase AnyNumber, and the symbols + - * / ** ( ) A word that must be replaced by other things is called nonterminal. – The nonterminals are Start and AE. The sequence of applications of the rules that produces the finished string of terminals from the starting symbol is called a derivation or a generation of the word. The grammatical rules are referred to as productions.

Lecture 17: Theory of Automata: Symbolism for Generative Grammars Definition: A context-free grammar (CFG) is a collection of three things: 1. An alphabet ∑ of letters called terminals from which we are going to make strings that will be the words of a language. 2. A set of symbols called nonterminals, one of which is the symbol S, standing for “start here”. 3. A finite set of productions of the form: One nonterminal → finite string of terminals and/or nonterminals where the strings of terminals and nonterminals can consist of only terminals, or of only nonterminals, or of any mixture of terminals and nonterminals, or even the empty string. We require that at least one production has the nonterminal S as its left side.

Lecture 17: Theory of Automata: Definition: The language generated by a CFG is the set of all strings of terminals that can be produced from the start symbol S using the productions as substitutions. A language generated by a CFG is called a context-free language (CFL). Notes: The language generated by a CFG is also called the language defined by the CFG, or the language derived from the CFG, or the language produced by the CFG. We insist that nonterminals be designated by capital letters, whereas terminals are designated by lowercase letters and special symbols.

Lecture 17: Theory of Automata: Example Let the only terminal be a and the productions be Prod1 S → aS Prod2 S → Λ If we apply Prod 1 six times and then apply Prod 2, we generate the following: S  aS  aaS  aaaS  aaaaS  aaaaaS  aaaaaaS  aaaaaaΛ = aaaaaa If we apply Prod2 without Prod1, we find that Λ is in the language generated by this CFG. Hence, this CFL is exactly a*. Note: the symbol “→” means “can be replaced by”, whereas the symbol “  ” means “can develop to”.

Lecture 17: Theory of Automata: Let the terminals be a and b, the only nonterminal be S, and the productions be Prod1 S → aS Prod2 S → bS Prod3 S → Λ The word ab can be generated by the derivation S  aS  abS  abΛ = ab The word baab can be generated by S  bS  baS  baaS  baabS  baabΛ = baab Clearly, the language generated by the above CFG is (a + b)*.

Lecture 17: Theory of Automata: Example Let the terminals be a and b, the the nonterminal be S and X, and the productions be Prod 1 S → XaaX Prod 2 X → aX Prod 3 X → bX Prod 4 X → Λ We already know from the previous example that the last three productions will generate any possible strings of a’s and b’s from the nonterminal X. Hence, the words generated from S have the form anything aa anything

Lecture 17: Theory of Automata: Hence, the language produced by this CFG is (a + b)*aa(a + b)* which is the language of all words with a double a in them somewhere. For example, the word baabb can be generated by S  XaaX  bXaaX  baaX  baaX  baabX  baabbX  baabbΛ = baabb

Lecture 17: Theory of Automata: Example Consider the CFG: S → aSb S → Λ It is easy to verify that the language generated by this CFG is the non-regular language {a n b n}. For example, the word a 4 b 4 is derived by S  aSb  aaSbb  aaaSbbb  aaaaSbbbb  aaaaΛbbbb = aaaabbbb

Lecture 17: Theory of Automata: Derivation and some Symbols If v and w are strings of terminals and non-terminals »denotes the derivation of w from v of length n steps » derivation of w from v in one or more steps » derivation of w from v in zero or more steps of application of rules of grammar G.

Lecture 17: Theory of Automata: Sentential Form A string w ε(n U ∑)* is a sentential form of G if there is a derivation A string w is a sentence of G if there is a derivation in G The language of G, denoted by L(G) is the set

Lecture 17: Theory of Automata: Example It is not difficult to show that the following CFG generates the non-regular language {a n ba n }: S → aSa S → b Can you show that the CFG below generates the language PALINDROME, another non-regular language? S → aSa S → bSb S → a S → b S → Λ

Lecture 17: Theory of Automata: Disjunction Symbol | Let us introduce the symbol | to mean disjunction (or). We use this symbol to combine all the productions that have the same left side. For example, the CFG Prod 1 S → XaaX Prod 2 X → aX Prod 3 X → bX Prod 4 X → Λ can be written more compactly as Prod 1 S → XaaX Prod 2 X → aX|bX|Λ

Lecture 17: Theory of Automata: Example S  aSa | aBa B  bB | b First production builds equal number of a’s on both sides and recursion is terminated by S  aBa Recursion of B  bB may add any number of b’s and terminates with B  b L(G) = {a n b m a n n>0, m>0}

Lecture 17: Theory of Automata: example L(G) = {a n b m c m d 2n | n>0, m>0} Consider relationship between leading a’s and trailing d’s. S  aSdd In the middle equal number of b’s and c’s S  A A  bAc This middle recursion terminates by A  bc. Grammar will be S  aSdd | aAdd A  bAc | bc

Lecture 17: Theory of Automata: Example Consider another CFG S  aSb | aSbb | Λ Language defined is L(G) = {a n b m | 0 ≤ n≤ m≤ 2n}

Lecture 17: Theory of Automata: Example A grammar that generates the language consisting of even-length string over {a, b} S  aO | bO | Λ O  aS | bS S and O work as counters i.e. when an S is in a sentential form that marks even number of terminals have been generated Presence of O in a sentential form indicates that an odd number of terminals have been generated. The strategy can be generalized, say for string of length exactly divisible by 3 we need three counters to mark 0, 1, 2 S  aP | bP | Λ P  aQ | bQ Q  aS | bS