20 What is it about? Models of Language Generation; Models of Language Recognition. [Figure: G and M, with sample strings aaba, acba, aaba, ...]

21 Language: (1) The words, their pronunciation, and the methods of combining them used and understood by a community. (2) A system of signs and symbols and rules for using them that is used to carry information. (From a Webster dictionary.)

22 Formal Languages and Grammars: definition
(1) A phrase-structured (also called type 0) grammar is a 4-tuple $G = \langle V_T, V_N, P, S \rangle$, where
$V_T$: terminal alphabet (called morphemes by linguists),
$V_N$: nonterminals (also called variables, or syntactic categories),
$V = V_T \cup V_N$: total alphabet,
$S \in V_N$: the start symbol, and
$P$: a finite set of production (also called rewriting) rules of the form $\alpha \to \beta$, which means $\alpha$ generates (or produces) $\beta$, where $\alpha \in V^* V_N V^*$ and $\beta \in V^*$.

23 Formal Languages and Grammars: definition (cont'd) Notice that $V^* V_N V^*$ is the set of strings over the total alphabet that contain at least one nonterminal symbol. For two strings $w_1$ and $w_2$, we write $w_1 \Rightarrow w_2$ to denote that $w_2$ can be derived from $w_1$ by applying one production rule of a grammar $G$. We write $w_1 \Rightarrow^* w_2$ to denote that $w_2$ can be derived from $w_1$ by applying some finite number of production rules, including zero. The language of a grammar $G$, denoted by $L(G)$, is the set of strings over $V_T$ that can be generated by $G$ starting with the start symbol $S$, i.e., $L(G) = \{ x \mid x \in V_T^* \text{ and } S \Rightarrow^* x \}$. Following the convention, we will use uppercase letters for nonterminal symbols and lowercase letters for terminal symbols.
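The two derivation relations and L(G) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rules are (left side, right side) pairs of strings, uppercase letters are nonterminals as in the convention above, and the names `one_step`, `language_up_to` and `RULES` are hypothetical. The search discards sentential forms longer than `max_len`, so it is complete only for noncontracting grammars; for other grammars it may miss strings.

```python
from collections import deque

def one_step(w, rules):
    """Return every string w2 such that w => w2 (one rule applied once)."""
    results = set()
    for lhs, rhs in rules:
        i = w.find(lhs)
        while i != -1:
            # Rewrite this occurrence of the left side only.
            results.add(w[:i] + rhs + w[i + len(lhs):])
            i = w.find(lhs, i + 1)
    return results

def language_up_to(rules, start="S", max_len=6):
    """Collect the terminal strings x with start =>* x and len(x) <= max_len.

    Assumption: sentential forms longer than max_len are pruned, which is
    safe only when no rule shortens a string.
    """
    seen = {start}
    queue = deque([start])
    terminal_strings = set()
    while queue:
        w = queue.popleft()
        if not any(c.isupper() for c in w):
            terminal_strings.add(w)   # no nonterminal left: w is in L(G)
            continue
        for nxt in one_step(w, rules):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return terminal_strings

# Hypothetical sample grammar: S -> aSb | ab
RULES = [("S", "aSb"), ("S", "ab")]
```

For example, `one_step("S", RULES)` gives the two strings derivable in one step, and `language_up_to(RULES, max_len=6)` collects the members of the language of length at most 6.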

24 Formal Languages and Grammars: definition (cont'd)
(2) Context-sensitive (type 1) grammars are type 0 grammars with the following restriction: $|\alpha| \le |\beta|$ (i.e., noncontracting), except for $S \to \varepsilon$, where $\varepsilon$ denotes the null string.
(3) Context-free (type 2) grammars are type 0 grammars with the restriction $|\alpha| = 1$, i.e., the left side of every production rule has only one symbol, which is a nonterminal.
(4) Regular (type 3) grammars are type 2 grammars with the restriction $\beta = xB$ or $\beta = x$, for some $x \in V_T^*$ and $B \in V_N$.
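The four restrictions can be checked mechanically. The following is a minimal sketch, assuming rules are given as (left side, right side) string pairs, uppercase letters are nonterminals, and the empty string stands for the null string; the function name `grammar_type` is hypothetical. It reports the highest-numbered type whose restriction every rule satisfies, testing type 3 first.

```python
def grammar_type(rules, start="S"):
    """Return 3, 2, 1 or 0 according to the restrictions on the rules."""
    def is_nonterminal(c):
        return c.isupper()

    # Type 0 requirement: every left side contains at least one nonterminal.
    for lhs, _ in rules:
        if not any(is_nonterminal(c) for c in lhs):
            raise ValueError(f"left side {lhs!r} has no nonterminal")

    # Type 2: the left side is a single symbol (hence a single nonterminal).
    context_free = all(len(lhs) == 1 for lhs, _ in rules)

    def right_linear(rhs):
        # rhs = x or rhs = xB: a nonterminal may appear only as the last symbol.
        return not any(is_nonterminal(c) for c in rhs[:-1])

    if context_free and all(right_linear(rhs) for _, rhs in rules):
        return 3
    if context_free:
        return 2
    # Type 1: noncontracting, except for start -> null string.
    if all(len(lhs) <= len(rhs) or (lhs == start and rhs == "")
           for lhs, rhs in rules):
        return 1
    return 0
```

For instance, `grammar_type([("S", "0S"), ("S", "A"), ("A", "1A"), ("A", "")])` returns 3, because every right side has at most one nonterminal and it sits at the right end.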

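25 EXAMPLES
type 0: $G = \langle \{a\}, \{S, A, B, C, D, E\}, P, S \rangle$, where
$P = \{\, S \to ACaB \mid a,\ \ AD \to AC,\ \ Ca \to aaC,\ \ aE \to Ea,\ \ CB \to DB \mid E,\ \ AE \to \varepsilon,\ \ aD \to Da \,\}$
$L(G) = \{ a^{2^n} \mid n \ge 0 \}$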
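A worked derivation may help to see how this grammar doubles the number of a's. The sketch below derives aa, reading the rules as $Ca \to aaC$ (C moves right, doubling each a), $CB \to E$ (stop), $aE \to Ea$ (E moves left) and $AE \to \varepsilon$ (erase the markers):

```latex
\begin{align*}
S &\Rightarrow ACaB   && (S \to ACaB) \\
  &\Rightarrow AaaCB  && (Ca \to aaC) \\
  &\Rightarrow AaaE   && (CB \to E) \\
  &\Rightarrow AaEa   && (aE \to Ea) \\
  &\Rightarrow AEaa   && (aE \to Ea) \\
  &\Rightarrow aa     && (AE \to \varepsilon)
\end{align*}
```

Choosing $CB \to DB$ instead of $CB \to E$ at the third step sends D back to the left end ($aD \to Da$), where $AD \to AC$ starts another doubling pass: $AaaCB \Rightarrow AaaDB \Rightarrow^* ADaaB \Rightarrow ACaaB \Rightarrow^* AaaaaCB \Rightarrow AaaaaE \Rightarrow^* AEaaaa \Rightarrow aaaa$.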

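26 EXAMPLES
type 1: $G = \langle \{a, b, c\}, \{S, B, C\}, P, S \rangle$, where
$P = \{\, S \to aSBC \mid aBC,\ \ CB \to BC,\ \ bB \to bb,\ \ aB \to ab,\ \ bC \to bc,\ \ cC \to cc \,\}$
$L(G) = \{ a^i b^i c^i \mid i \ge 1 \}$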
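As an illustration, one derivation of aabbcc in this grammar runs as follows. The rule $CB \to BC$ sorts the B's in front of the C's before the nonterminals are turned into terminals from left to right:

```latex
\begin{align*}
S &\Rightarrow aSBC    && (S \to aSBC) \\
  &\Rightarrow aaBCBC  && (S \to aBC) \\
  &\Rightarrow aaBBCC  && (CB \to BC) \\
  &\Rightarrow aabBCC  && (aB \to ab) \\
  &\Rightarrow aabbCC  && (bB \to bb) \\
  &\Rightarrow aabbcC  && (bC \to bc) \\
  &\Rightarrow aabbcc  && (cC \to cc)
\end{align*}
```

No step shortens the string, in line with the noncontracting restriction on type 1 grammars.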

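27 EXAMPLES
type 2: $G = \langle \{0, 1\}, \{S, A, B\}, P, S \rangle$, where $P = \{\, S \to ASB \mid \varepsilon,\ \ A \to 0,\ \ B \to 1 \,\}$ and $L(G) = \{ 0^i 1^i \mid i \ge 0 \}$
type 3: $G = \langle \{0, 1\}, \{S, A\}, P, S \rangle$, where $P = \{\, S \to 0S \mid A,\ \ A \to 1A \mid \varepsilon \,\}$ and $L(G) = \{ 0^i 1^j \mid i, j \ge 0 \}$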

28 Remarks on Grammars The following remarks summarize subtle conceptual aspects of the formal grammars and languages defined in class. Let $G = \langle V_T, V_N, P, S \rangle$ be a grammar. The set of rules $P$ has no explicitly defined order that must be observed when a string is derived. Recall that the language $L(G)$ is the set of terminal strings that can be generated by applying a finite sequence of production rules. However, not every sequence of production rules produces a terminal string. We may end up with a string containing a nonterminal symbol that can never derive a terminal (or null) string. For example, consider the grammar below, which is type 1. (For convenience, we show only the set of production rules, written according to the convention, because $V_T$, $V_N$, and the start symbol $S$ can be identified from them.)
(1) $S \to ABC$ (2) $AB \to ab$ (3) $BC \to bc$ (4) $bC \to bc$
Clearly, only rules (1), (2), (4) applied in this order derive the terminal string abc, which is the only member of the language of the grammar. If you apply (1) followed by (3), you are stuck with Abc, which cannot be a member of the language because it contains the nonterminal symbol A.
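The dead end described above can be reproduced by exhaustively exploring every derivation of the four-rule grammar. This is a small sketch, not part of the original slides: the names `one_step`, `explore` and `RULES` are hypothetical, uppercase letters are taken to be nonterminals, and the search terminates only because this particular grammar has finitely many sentential forms.

```python
def one_step(w, rules):
    """Return every string obtained from w by applying one rule once."""
    results = set()
    for lhs, rhs in rules:
        i = w.find(lhs)
        while i != -1:
            results.add(w[:i] + rhs + w[i + len(lhs):])
            i = w.find(lhs, i + 1)
    return results

def explore(rules, start="S"):
    """Return (terminal strings, stuck strings) reachable from start.

    A string is "stuck" if it still contains a nonterminal but no rule
    applies to it.
    """
    seen = {start}
    frontier = [start]
    terminal, stuck = set(), set()
    while frontier:
        w = frontier.pop()
        successors = one_step(w, rules)
        if not any(c.isupper() for c in w):
            terminal.add(w)
        elif not successors:
            stuck.add(w)
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return terminal, stuck

# Rules (1) to (4) from the slide.
RULES = [("S", "ABC"), ("AB", "ab"), ("BC", "bc"), ("bC", "bc")]
```

Calling `explore(RULES)` separates the reachable terminal strings from the forms, such as Abc, on which a derivation gets stuck.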

29 Remarks on Grammars (cont'd) Rule (3) of the grammar above is useless in the sense that it does not contribute to the generation of the language. We can delete the rule from the grammar without affecting the language of the grammar. In general, the decision problem of whether an arbitrary grammar has a useless rule is unsolvable. However, if we restrict the problem to the class of context-free grammars (type 2), we can effectively clean up such useless rules, if any. We will learn how to do this. The grammars that we have defined in class are sequential in the sense that only one rule is allowed to apply at a time. Notice that in the above grammar, if we apply rules $AB \to ab$ and $BC \to bc$ simultaneously to the string ABC, which is derived from S, we get the terminal string abbc, which is not a member of the language according to our definition. There is a class of grammars in which more than one rule can be applied simultaneously. We call such rules parallel rewriting rules. In general, it is very difficult to study parallel rewriting grammars. However, the language of a context-free grammar does not depend on how the rules are applied. We get the same language regardless of the mode of rule application, sequential or parallel. Why? The answer is left for the reader.

30 Remarks on Grammars (cont'd) Context-free grammars are defined as type 0 (not type 1) grammars with the restriction $|\alpha| = 1$. It follows that a context-free grammar can have a contracting rule, like $A \to \varepsilon$, while type 1 grammars cannot have contracting rules except for $S \to \varepsilon$. Later we will see that every context-free grammar with $\varepsilon$-production rules can be converted to a grammar whose only $\varepsilon$-production is $S \to \varepsilon$, present if the grammar produces the null string. By definition, a regular grammar cannot have rules of any of the following forms, where A, B, C are arbitrary nonterminals and a, b are terminals:
$A \to bBC$, $A \to abBa$, $A \to Ba$
We can define the same class of regular languages using production rules restricted to the forms $A \to Bx$ or $A \to x$. Notice that the nonterminal symbol on the right side of such a production rule, if any, must be at the left end of the string. We call these rules left linear, and the rules defined in class right linear. However, the definition does not allow a type 3 grammar to have both left linear and right linear forms (e.g., $S \to aB$, $B \to Sb \mid b$).
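One way to see why mixing the two forms is excluded is to derive a few strings from the mixed example $S \to aB$, $B \to Sb \mid b$. The derivation below is a worked illustration added here, not part of the original slides:

```latex
\begin{align*}
S &\Rightarrow aB \Rightarrow ab \\
S &\Rightarrow aB \Rightarrow aSb \Rightarrow aaBb \Rightarrow aabb \\
S &\Rightarrow^* a^{n-1} S\, b^{n-1} \Rightarrow a^{n} B\, b^{n-1} \Rightarrow a^{n} b^{n}
\end{align*}
```

Each round through $S \to aB$ and $B \to Sb$ adds one a on the left and one b on the right, so the grammar generates $\{ a^n b^n \mid n \ge 1 \}$, which is a standard example of a language that is not regular. Allowing both forms in one grammar would therefore take us outside the class of regular languages.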