CS5371 Theory of Computation

Slides:



Advertisements
Similar presentations
C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
Advertisements

Closure Properties of CFL's
CFGs and PDAs Sipser 2 (pages ). Long long ago…
LR-Grammars LR(0), LR(1), and LR(K).
About Grammars CS 130 Theory of Computation HMU Textbook: Sec 7.1, 6.3, 5.4.
FORMAL LANGUAGES, AUTOMATA, AND COMPUTABILITY
1 Introduction to Computability Theory Lecture12: Decidable Languages Prof. Amos Israeli.
CFGs and PDAs Sipser 2 (pages ). Last time…
Chap 2 Context-Free Languages. Context-free Grammars is not regular Context-free grammar : eg. G 1 : A  0A1substitution rules A  Bproduction rules B.
Foundations of (Theoretical) Computer Science Chapter 2 Lecture Notes (Section 2.1: Context-Free Grammars) David Martin With some.
1 CSC 3130: Automata theory and formal languages Tutorial 4 KN Hung Office: SHB 1026 Department of Computer Science & Engineering.
CS Master – Introduction to the Theory of Computation Jan Maluszynski - HT Lecture 4 Context-free grammars Jan Maluszynski, IDA, 2007
104 Closure Properties of Regular Languages Regular languages are closed under many set operations. Let L 1 and L 2 be regular languages. (1) L 1  L 2.
January 14, 2015CS21 Lecture 51 CS21 Decidability and Tractability Lecture 5 January 14, 2015.
Introduction to the Theory of Computation John Paxton Montana State University Summer 2003.
CS5371 Theory of Computation Lecture 4: Automata Theory II (DFA = NFA, Regular Language)
CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG)
January 15, 2014CS21 Lecture 61 CS21 Decidability and Tractability Lecture 6 January 16, 2015.
Context-Free Grammars Chapter 3. 2 Context-Free Grammars and Languages n Defn A context-free grammar is a quadruple (V, , P, S), where  V is.
Today Chapter 2: (Pushdown automata) Non-CF languages CFL pumping lemma Closure properties of CFL.
CS5371 Theory of Computation Lecture 12: Computability III (Decidable Languages relating to DFA, NFA, and CFG)
INHERENT LIMITATIONS OF COMPUTER PROGRAMS CSci 4011.
INHERENT LIMITATIONS OF COMPUTER PROGAMS CSci 4011.
Lecture 21: Languages and Grammars. Natural Language vs. Formal Language.
Chapter 4 Context-Free Languages Copyright © 2011 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1.
Formal Grammars Denning, Sections 3.3 to 3.6. Formal Grammar, Defined A formal grammar G is a four-tuple G = (N,T,P,  ), where N is a finite nonempty.
Lecture 16 Oct 18 Context-Free Languages (CFL) - basic definitions Examples.
نظریه زبان ها و ماشین ها فصل دوم Context-Free Languages دانشگاه صنعتی شریف بهار 88.
Formal Languages Context free languages provide a convenient notation for recursive description of languages. The original goal of CFL was to formalize.
Context-free Grammars Example : S   Shortened notation : S  aSaS   | aSa | bSb S  bSb Which strings can be generated from S ? [Section 6.1]
Context-Free Grammars Normal Forms Chapter 11. Normal Forms A normal form F for a set C of data objects is a form, i.e., a set of syntactically valid.
CSCI 2670 Introduction to Theory of Computing September 20, 2005.
The Pumping Lemma for Context Free Grammars. Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar.
Context-Free Grammars – Chomsky Normal Form Lecture 16 Section 2.1 Wed, Sep 26, 2007.
CSCI 2670 Introduction to Theory of Computing September 21, 2004.
Pushdown Automata (PDAs)
Context-free Grammars [Section 2.1] - more powerful than regular languages - originally developed by linguists - important for compilation of programming.
Classification of grammars Definition: A grammar G is said to be 1)Right-linear if each production in P is of the form A  xB or A  x where A and B are.
CSCI 2670 Introduction to Theory of Computing September 15, 2005.
Chapter 5 Context-Free Grammars
Grammars CPSC 5135.
Languages & Grammars. Grammars  A set of rules which govern the structure of a language Fritz Fritz The dog The dog ate ate left left.
Lecture # 9 Chap 4: Ambiguous Grammar. 2 Chomsky Hierarchy: Language Classification A grammar G is said to be – Regular if it is right linear where each.
Dept. of Computer Science & IT, FUUAST Automata Theory 2 Automata Theory V Context-Free Grammars andLanguages.
CS 3240: Languages and Computation Context-Free Languages.
Complexity and Computability Theory I Lecture #9 Instructor: Rina Zviel-Girshin Lea Epstein.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 11 Midterm Exam 2 -Context-Free Languages Mälardalen University 2005.
1 Simplification of Context-Free Grammars Some useful substitution rules. Removing useless productions. Removing -productions. Removing unit-productions.
Closure Properties Lemma: Let A 1 and A 2 be two CF languages, then the union A 1  A 2 is context free as well. Proof: Assume that the two grammars are.
CS 208: Computing Theory Assoc. Prof. Dr. Brahim Hnich Faculty of Computer Sciences Izmir University of Economics.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
Grammars CS 130: Theory of Computation HMU textbook, Chap 5.
Introduction Finite Automata accept all regular languages and only regular languages Even very simple languages are non regular (  = {a,b}): - {a n b.
Context-Free Languages
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Donghyun (David) Kim Department of Mathematics and Physics North Carolina Central University 1 Chapter 2 Context-Free Languages Some slides are in courtesy.
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen Department of Computer Science University of Texas-Pan American.
Exercises on Chomsky Normal Form and CYK parsing
Chomsky Normal Form.
Theory of Languages and Automata By: Mojtaba Khezrian.
Theory of Computation Automata Theory Dr. Ayman Srour.
CSCI 2670 Introduction to Theory of Computing September 16, 2004.
Lecture 17: Theory of Automata:2014 Context Free Grammars.
Computability Joke. Context-free grammars Parsing. Chomsky
Context-Free Grammars: an overview
Formal Language & Automata Theory
Context-Free Languages
CHAPTER 2 Context-Free Languages
فصل دوم Context-Free Languages
Presentation transcript:

CS5371 Theory of Computation Lecture 7: Automata Theory V (CFG, CFL, CNF)

Objectives Introduce Context-free Grammar (CFG) and Context-free Language (CFL) Show that Regular Language can be described by CFG Terminology related to CFG Leftmost Derivation, Ambiguity, Chomsky Normal Form (CNF) Converting a CFG into CNF

Context-free Grammar (Example) A  0A1 A  B B  # Substitution Rules Variables A, B Terminals 0,1,# Start Variable A Important: Substitution Rule in CFG has a special form: Exactly one variable (and nothing else) on the left side of the arrow

How does CFG generate strings? A  0A1 A  B B  # Write down the start symbol Find a variable that is written down, and a rule that starts with that variable; Then, replace the variable with the rule Repeat the above step until no variable is left

How does CFG generate strings? A  0A1 A  B B  # Step 1. A (write down the start variable) Step 2. 0A1 (find a rule and replace) Step 3. 00A11 (find a rule and replace) Step 4. 00B11 (find a rule and replace) Step 5. 00#11 (find a rule and replace) Now, the string 00#11 does not have any variable. We can stop.

How does CFG generate strings? The sequence of substitutions to generate a string is called a derivation E.g., A derivation of the string 000#111 in the previous grammar is A  0A1  00A11  000A111  000B111  000#111 The same information can be represented pictorially by a parse tree (next slide)

Parse Tree A A A A 1 A 1 B 1 #

Language of the Grammar In fact, the previous grammar can generate strings #, 0#1, 00#11, 000#111, … The set of all strings that can be generated by a grammar G is called the language of G, denoted by L(G) The language of the previous grammar is {0n#1n | n  0 }

CFG (Formal Definition) A CFG is a 4-tuple (V,T, R, S), where V is a finite set of variables T is a finite set of terminals R is a set of substitution rules, where each rule consists of a variable (left side of the arrow) and a string of variables and terminals (right side of the arrow) S 2 V called the start variable

CFG (terminology) Let u and v be strings of variables and terminals We say u derives v, denoted by u  v, if u = v, or there exists u1, u2, …, uk, k  0 such that u  u1  u2  …  uk  v In other words, for a grammar G = (V,T,R,S), L(G) = { w 2 T*| S  w } * *

CFG (more examples) Let G = ( {S}, {a,b}, R, S ), and the set of rules, R, is S  aSb | SS |  This notation is an abbreviation for S  aSb S  SS S   What will this grammar generate? If we think of a as “(” and b as “)”, G generates all strings of properly nested parentheses

Is the following a CFG? G = { {A,B}, {0,1}, R, A } A  0B1 | A | 0 B  1B0 | 1 0B  A

Designing CFG Can we design CFG for {0n1n | n  0} [ {1n0n | n  0} ? Do we know CFG for {0n1n | n  0}? Do we know CFG for {1n0n | n  0}?

Designing CFG CFG for the language L1 = {0n1n | n  0} S  0S1 |  CFG for L1 [ L2 S  S1 | S2 S1  0S11 |  S2  1S20 | 

Designing CFG Can we design CFG for {02n13n | n  0}? Yes, by “linking” the occurrence of 0’s with the occurrence of 1’s The desired CFG is: S  00S111 | 

Can we construct the CFG for the language { w | w is a palindrome } ? Assume that the alphabet of w is {0,1} Examples for palindrome: 010, 0110, 001100, 01010, 1101011, …

Regular Language & CFG Theorem: Any regular language can be described by a CFG. How to prove? (By construction)

Assume that q0 is the start state of D Regular Language & CFG Proof: Let D be the DFA recognizing the language. Create a distinct variable Vi for each state qi in D. Make V0 the start variable of CFG Add a rule Vi  aVj if (qi,a) = qj Add a rule Vi   if qi is an accept state Assume that q0 is the start state of D Then, we can show that the above CFG generates exactly the same language as D (how to show?)

Regular Language & CFG (Example) 1 DFA 1 q0 q1 start CFG G = ( {V0, V1}, {0,1}, R, V0 ), where R is V0  0V0 | 1V1 |  V1  1V1 | 0V0

Leftmost Derivation A derivation which always replace the leftmost variable in each step is called a leftmost derivation E.g., Consider the CFG for the properly nested parentheses ( {S}, {(,)}, R, S ) with rule R: S  ( S ) | SS |  Then, S  SS  (S)S  ( )S  ( ) ( S )  ( ) ( ) is a leftmost derivation But, S  SS  S(S)  (S)(S)  ( ) ( S )  ( ) ( ) is not a leftmost derivation However, we note that both derivations correspond to the same parse tree

Ambiguity Sometimes, a string can have two or more leftmost derivations!! E.g., Consider CFG ( {S}, {+,x,a}, R, S) with rules R: S  S + S | S x S | a The string a + a x a has two leftmost derivations as follows: S  S + S  a + S  a +S x S  a + a x S  a + a x a S  S x S  S + S x S  a +S x S  a + a x S

Ambiguity If a string has two or more leftmost derivations in a CFG G, we say the string is derived ambiguously in G A grammar is ambiguous if some strings is derived ambiguously Note that the two leftmost derivations in the previous example correspond to different parse trees (see next slide) In fact, each leftmost derivation corresponds to a unique parse tree

Two parse trees for a + a x a

Fun Fact: Inherently Ambiguous Sometimes when we have an ambiguous grammar, we can find an unambiguous grammar that generates the same language However, some language can only be generated by ambiguous grammar E.g., { anbncm | n, m  0} [ {anbmcm | n, m  0} Such language is called inherently ambiguous See last year’s bonus exercise in Homework 2

Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule is of the form A  BC A  a where a is any terminal A,B,C are variables B, C cannot be start variable However, S   is allowed

Converting a CFG to CNF Theorem: Any context-free language can be generated by a context-free grammar in Chomsky Normal Form. Hint: When is a general CFG not in Chomsky Normal Form?

Proof Idea The only reasons for a CFG not in CNF: Start variable appears on right side It has  rules, such as A   It has unit rules, such as A  A, or B  C Some rules does not have exactly two variables or one terminal on right side Prove idea: Convert a grammar into CNF by handling the above cases

The Conversion (step 1) Proof: Let G be the context-free grammar generating the context-free language. We want to convert G into CNF. Step 1: Add a new start variable S0 and the rule S0  S, where S is the start variable of G This ensures that start variable of the new grammar does not appear on right side

The Conversion (step 2) Step 2: We take care of all  rules. To remove the rule A  , for each occurrence of A on the right side of a rule, we add a new rule with that occurrence deleted. E.g., R  uAvAw causes us to add the rules: R  uAvw, R  uvAw, R  uvw If we have the rule R  A, we add R   unless we had previously removed R   After removing A  , the new grammar still generates the same language as G.

The Conversion (step 3) Step 3: We remove the unit rule A  B. To do so, for each rule B  u (where u is a string of variables and terminals), we add the rule A  u. E.g., if we have A  B, B  aC, B  CC, we add: A  aC, A  CC After removing A  B, the new grammar still generates the same language as G.

The Conversion (step 4) Step 4: Suppose we have a rule A  u1 u2 …uk, where k > 2 and each ui is a variable or a terminal. We replace this rule by A  u1A1, A1  u2A2, A2  u3A3, …, Ak-2  uk-1uk After the change, the string on the right side of any rule is either of length 1 (a terminal) or length 2 (two variables, or 1 variable + 1 terminal, or two terminals)

The Conversion (step 4 cont.) To remove a rule A  u1u2 with some terminals on the right side, we replace the terminal ui by a new variable Ui and add the rule Ui  ui After the change, the string on the right side of any rule is exactly a terminal or two variables

The Conversion (example) Let G be the grammar on the left side. We get the new grammar on the right side after the first step. S  ASA | aB A  B | S B  b |  S0  S S  ASA | aB A  B | S B  b | 

The Conversion (example) After that, we remove B   S0  S S  ASA | aB | a A  B | S |  B  b S0  S S  ASA | aB A  B | S B  b |  Before removing B   After removing B  

The Conversion (example) After that, we remove A   S0  S S  ASA | aB | a A  B | S |  B  b S0  S S  ASA | aB | a | SA | AS | S A  B | S B  b Before removing A   After removing A  

The Conversion (example) Then, we remove S  S and S0  S S0  ASA | aB | a | SA | AS S  ASA | aB | a | A  B | S B  b S0  S S  ASA | aB | a | SA | AS A  B | S B  b After removing S  S After removing S0  S

The Conversion (example) Then, we remove A  B S0  ASA | aB | a | SA | AS S  ASA | aB | a | A  B | S B  b S0  ASA | aB | a | SA | AS S  ASA | aB | a | A  b | S B  b Before removing A  B After removing A  B

The Conversion (example) Then, we remove A  S S0  ASA | aB | a | SA | AS S  ASA | aB | a | A  b | ASA | aB | a | SA | AS B  b S0  ASA | aB | a | SA | AS S  ASA | aB | a | A  b | S B  b Before removing A  S After removing A  S

The Conversion (example) Then, we apply Step 4 S0  AA1 | UB | a | SA | AS S  AA1 | UB | a | SA | AS A  b | AA1 | UB | a | SA | B  b A1  SA U  a S0  ASA | aB | a | SA | AS S  ASA | aB | a | A  b | ASA | aB | a | SA | AS B  b After Step 4 Grammar is in CNF Before Step 4