So far... A language is a set of strings over an alphabet. We have defined languages by: (i) regular expressions (ii) finite state automata Both (i) and.

Presentation transcript:

So far...
A language is a set of strings over an alphabet. We have defined languages by:
(i) regular expressions
(ii) finite state automata
Both (i) and (ii) give us exactly the same class of languages.
Languages serve two purposes in computing:
(a) communicating instructions or information
(b) defining valid communications
What about languages outwith this class?

Specifying Non-Regular Languages
We have already seen a number of languages that are not regular. In particular,
- {a^n b^n : n ≥ 0}
- the language of matched round brackets
- arithmetic expressions
- standard programming languages
are not regular. However, these languages are all systematic constructions, and can be clearly and explicitly defined.
Consider L = {a^n b^n : n ≥ 0}:
(i) ε ∈ L (ε is the empty string)
(ii) if x ∈ L, then axb ∈ L
(iii) nothing else is in L
This is a clear and concise specification of L. Can we use it to generate members of L?

Generating Languages
Using the previous definition of L, and the notion of string substitution, we can give a generative definition of L. Let X be a new symbol.
1) X -> ε
2) X -> aXb
This definition says that if we have a symbol X, we can replace it by the empty string ε, or by aXb. We now define L to be all strings over {a,b} formed by starting with X and applying rules 1) and 2) until we get a string with no X's.
Example:
X => aXb => aaXbb => aabb
X => aXb => aaXbb => aaaXbbb => aaabbb
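
The two rules can be tried out directly with string substitution. The following is a minimal sketch, not part of the original slides; the function name generate and its parameter are made up for illustration.

```python
def generate(max_growth):
    """List the strings obtained by applying rule 2 up to max_growth times,
    then finishing with rule 1."""
    results = []
    form = "X"  # every derivation starts from the symbol X
    for _ in range(max_growth + 1):
        # Rule 1 (X -> empty string): erase X to finish the derivation.
        results.append(form.replace("X", ""))
        # Rule 2 (X -> aXb): wrap one more a...b pair around X.
        form = form.replace("X", "aXb")
    return results


print(generate(3))  # ['', 'ab', 'aabb', 'aaabbb']
```

Each string produced has the form a^n b^n, because rule 2 always adds one a and one b together.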

Grammar
Formalising the previous notion of a generative definition based on string substitution, we get:
A grammar is a 4-tuple, G = (N, T, S, P), where
- N is a finite alphabet, called the non-terminals;
- T is a finite alphabet, called the terminals;
- N ∩ T = ∅;
- S ∈ N is the start symbol; and
- P is a finite set of productions of the form α -> β, where α ∈ (N ∪ T)+, α has at least one member from N, and β ∈ (N ∪ T)*.
Thus the previous example is a grammar where
N = {X}
T = {a, b}
S = X
P = {X -> ε, X -> aXb}
so G = ({X}, {a, b}, X, {X -> ε, X -> aXb})
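
As an illustration, the 4-tuple can be written down as a small data structure and the side conditions of the definition checked mechanically. This is only a sketch: the class name Grammar, its field names, and the choice of one character per symbol are assumptions made here, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    start: str               # S
    productions: tuple       # P, as (left side, right side) string pairs

    def is_well_formed(self):
        """Check the conditions stated in the definition of a grammar."""
        if self.nonterminals & self.terminals:       # N and T must be disjoint
            return False
        if self.start not in self.nonterminals:      # S must be in N
            return False
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            # The left side is non-empty and contains a non-terminal.
            if not any(ch in self.nonterminals for ch in lhs):
                return False
            # Both sides use only symbols from N and T.
            if not all(ch in symbols for ch in lhs + rhs):
                return False
        return True


# The example grammar; the empty string "" stands for the empty right side.
G = Grammar(frozenset("X"), frozenset("ab"), "X", (("X", ""), ("X", "aXb")))
print(G.is_well_formed())  # True
```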

Definitions and Notation
Let G = (N, T, S, P) be a grammar.
If s, t, x, y, u and v are strings s.t. s = xuy, t = xvy, and (u -> v) ∈ P, then s directly derives t, written s => t.
If there is a sequence of strings s_0, s_1, ..., s_n s.t. s_0 => s_1 => ... => s_(n-1) => s_n, then s_0 derives s_n, written s_0 =>* s_n.
A sentential form of G is a string w ∈ (N ∪ T)* s.t. S =>* w.
A sentence of G is a sentential form w ∈ T*, i.e. one with no non-terminals.
The language defined by G is the set of all sentences of G, denoted L(G).
Examples, for the grammar with productions S -> ε | aSb (the previous grammar with X renamed to S):
aaaSbbb => aaaaSbbbb.
S =>* aaaabbbb.
aaaSbbb is a sentential form of G.
aaaabbbb is a sentence of G.
L(G) = {ε, ab, aabb, aaabbb, ...}, which is {a^n b^n : n ≥ 0}
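
These definitions translate almost word for word into code. The sketch below assumes single-character symbols and productions given as (u, v) string pairs; the function names are invented for illustration. The search bound counts terminals, which is enough for the example grammar but would not stop the search for every grammar.

```python
from collections import deque


def direct_derivations(s, productions):
    """Yield every t with s => t: s = xuy, t = xvy for some production u -> v."""
    for u, v in productions:
        i = s.find(u)
        while i != -1:
            yield s[:i] + v + s[i + len(u):]
            i = s.find(u, i + 1)


def sentences(start, productions, nonterminals, max_len):
    """Collect the sentences of length <= max_len by breadth-first search."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        if not any(ch in nonterminals for ch in form):
            found.add(form)  # no non-terminals left: a sentence
            continue
        for nxt in direct_derivations(form, productions):
            terminal_count = sum(ch not in nonterminals for ch in nxt)
            if terminal_count <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(found, key=lambda w: (len(w), w))


P = [("S", ""), ("S", "aSb")]
print(list(direct_derivations("aaaSbbb", P)))  # ['aaabbb', 'aaaaSbbbb']
print(sentences("S", P, {"S"}, 6))             # ['', 'ab', 'aabb', 'aaabbb']
```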

Definitions and Notation (cont.)
Notation: we normally order the set of productions, and assign them numbers. If x => y by using rule number i, then we write x =>_i y.
α -> β_1 | β_2 | β_3 ... | β_n is shorthand for
α -> β_1
α -> β_2
:
α -> β_n
In general, non-terminals will be uppercase, while terminals will be lowercase.
A context-free grammar (CFG) is one in which all productions are of the form α -> β, where α ∈ N, i.e. the left-hand side is a single non-terminal.
A context-free language (CFL) is one that can be defined by a context-free grammar.

Context-Free Grammars
A CFG is called context-free because the left-hand side of every production contains only a single symbol, and so a production can be applied to a symbol without needing to consider the symbol's context.
We only consider context-free grammars in this course.
Some languages are not context-free. Example: {a^n b^n c^n : n ≥ 0}
Some languages cannot be defined by any grammar. It is believed that these are the same languages that cannot be defined by any algorithm or effective procedure.
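
Whether a grammar is context-free can be read off its productions alone. A minimal sketch, assuming productions are stored as (left side, right side) string pairs; the function name and the second example production are hypothetical.

```python
def is_context_free(productions, nonterminals):
    """True if every left-hand side is exactly one non-terminal."""
    return all(len(lhs) == 1 and lhs in nonterminals for lhs, _ in productions)


# X -> (empty) | aXb has single non-terminals on the left.
print(is_context_free([("X", ""), ("X", "aXb")], {"X"}))   # True
# A made-up production aX -> ab depends on the a before X, so it is not context-free.
print(is_context_free([("aX", "ab")], {"X"}))              # False
```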


Example CFG
G = ({S}, {a, +, *, (, )}, S, {S -> S+S | S*S | (S) | a})
This is a grammar of algebraic expressions. The productions are:
1) S -> S + S
2) S -> S * S
3) S -> (S)
4) S -> a
Example derivation:
S => S * S => a * S => a * (S) => a * (S + S) => a * (a + S) => a * (a + a)
Note that there are many other ways of deriving the same string.
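
The example derivation always rewrites the leftmost S, so it can be replayed from its list of rule numbers alone. The sketch below assumes that leftmost convention; the names RULES and leftmost_derivation are invented here, and spaces are omitted from the strings.

```python
# Right-hand sides of the four numbered productions for S.
RULES = {1: "S+S", 2: "S*S", 3: "(S)", 4: "a"}


def leftmost_derivation(rule_numbers, start="S"):
    """Apply each numbered rule to the leftmost S and return all sentential forms."""
    forms = [start]
    for i in rule_numbers:
        # str.replace with count 1 rewrites only the leftmost occurrence.
        forms.append(forms[-1].replace("S", RULES[i], 1))
    return forms


print(" => ".join(leftmost_derivation([2, 4, 3, 1, 4, 4])))
# S => S*S => a*S => a*(S) => a*(S+S) => a*(a+S) => a*(a+a)
```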

Why Grammar?
In English, the grammar is the set of conventions defining the structure of sentences, e.g.
- a sentence must have a subject and an object
- verbs must agree with nouns, e.g. "John walks" and "John and Mary walk"
- adjectives come before nouns, e.g. "the red car" and not "the car red"
We have shown a formalisation of this notion. We can now write explicit, clear statements of what sentences are in a language. Grammars can be used in the processing of natural language by computer (4th year option), in formalising design, in pattern recognition, and many other areas.

A grammar for a small part of English
S -> NP VP
NP -> Det NP1 | PN
NP1 -> Adj NP1 | N
Det -> a | the
PN -> peter | paul | mary
Adj -> large | black
N -> dog | cat | horse
VP -> V NP
V -> is | likes | hates
Can you derive: peter is a large black cat
Example derivations:
S => NP VP => PN VP => mary VP => mary V NP => mary hates NP => mary hates Det NP1 => mary hates the NP1 => mary hates the N => mary hates the dog
S => NP VP => NP V NP => NP V Det NP1 => NP V a NP1 => NP V a Adj NP1 => NP is a Adj NP1 => NP is a Adj Adj NP1 => NP is a large Adj NP1 => NP is a large Adj N => NP is a large black N => NP is a large black cat => PN is a large black cat => peter is a large black cat
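
Whether a word sequence can be derived from S can also be checked by a program. The following is a minimal sketch of a top-down recogniser with backtracking; it works here because this grammar has no left recursion and no empty productions. The names GRAMMAR, parse and is_sentence are chosen for illustration.

```python
# Each non-terminal maps to its alternatives; anything not listed is a word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "NP1"], ["PN"]],
    "NP1": [["Adj", "NP1"], ["N"]],
    "Det": [["a"], ["the"]],
    "PN": [["peter"], ["paul"], ["mary"]],
    "Adj": [["large"], ["black"]],
    "N": [["dog"], ["cat"], ["horse"]],
    "VP": [["V", "NP"]],
    "V": [["is"], ["likes"], ["hates"]],
}


def parse(symbols, words):
    """True if the sequence of symbols can derive exactly the list of words."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Try every production for the leftmost non-terminal.
        return any(parse(alt + rest, words) for alt in GRAMMAR[first])
    # A terminal must match the next word.
    return bool(words) and words[0] == first and parse(rest, words[1:])


def is_sentence(text):
    return parse(["S"], text.split())


print(is_sentence("peter is a large black cat"))  # True
print(is_sentence("the dog hates"))               # False
```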

Regular Grammars
A grammar is regular if each production is of the form:
(i) A -> t, or
(ii) A -> tB, or
(iii) A -> ε
where A, B ∈ N, t ∈ T.
Example:
S -> aA | bB
A -> aS | a
B -> bS | b
Is this a sentence of the language? aaaabb
S => aA => aaS => aaaA => aaaaS => aaaabB => aaaabb
The language generated by this grammar is the language denoted by (aa + bb)+
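
One way to test membership is to read the regular grammar as a nondeterministic automaton whose states are the non-terminals. The sketch below does that and compares the result with Python's re module, where the slides' (aa + bb)+ is written (aa|bb)+. The extra state name ACCEPT and the function name accepts are assumptions made for this illustration.

```python
import re
from itertools import product

# Productions of the example grammar, as (left side, right side) pairs.
PRODUCTIONS = [("S", "aA"), ("S", "bB"), ("A", "aS"),
               ("A", "a"), ("B", "bS"), ("B", "b")]
ACCEPT = "#"  # hypothetical extra state meaning "derivation finished"


def accepts(word, start="S"):
    """Track the set of non-terminals that could remain after reading word."""
    states = {start}
    for ch in word:
        nxt = set()
        for lhs, rhs in PRODUCTIONS:
            if lhs in states and rhs and rhs[0] == ch:
                # A -> tB moves to B; A -> t moves to the accepting state.
                nxt.add(rhs[1] if len(rhs) == 2 else ACCEPT)
        states = nxt
    # A production A -> (empty) would make A itself accepting.
    finishing = {lhs for lhs, rhs in PRODUCTIONS if rhs == ""}
    return ACCEPT in states or bool(states & finishing)


print(accepts("aaaabb"))  # True

# Compare with the regular expression on every string of length <= 6.
for n in range(7):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert accepts(w) == bool(re.fullmatch(r"(aa|bb)+", w))
```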

Regular Grammars and Regular Languages
Thus we now have three different definitions of the one class of languages:
- regular expressions
- finite state automata
- regular grammars
Theorem (stated here without proof): a language is regular iff it can be defined by a regular grammar.
All three are useful in Computing Science.

Example CFG (2)
1) S -> XaaX
2) X -> aX
3) X -> bX
4) X -> ε
S => XaaX => bXaaX => baXaaX => babXaaX => babaaX => babaaaX => babaaabX => babaaab
This grammar defines the language (a + b)*aa(a + b)*

...as a Regular Grammar
1) S -> aS
2) S -> bS
3) S -> aM
4) M -> aB
5) B -> aB
6) B -> bB
7) B -> ε
S => bS => baS => babS => babaM => babaaB => babaaaB => babaaabB => babaaab
S => bS => baM => baaB => baa
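
Because several rules can apply to the same symbol (S -> aS and S -> aM both start with a), finding a derivation for a given string needs some search. The following sketch tries the rules in order and backtracks; the names RULES and derive are invented here. The derivation it finds may differ from the one shown above, since a string can have more than one.

```python
# Right-hand sides for each non-terminal; "" stands for the empty string.
RULES = {
    "S": ["aS", "bS", "aM"],
    "M": ["aB"],
    "B": ["aB", "bB", ""],
}


def derive(target, state="S", done=""):
    """Return the sentential forms of a derivation of target, or None."""
    form = done + state           # terminals produced so far, then the non-terminal
    remaining = target[len(done):]
    for rhs in RULES[state]:
        if rhs == "":
            if remaining == "":
                return [form, done]   # erase the non-terminal and stop
        elif remaining.startswith(rhs[0]):
            tail = derive(target, rhs[1], done + rhs[0])
            if tail is not None:
                return [form] + tail
    return None                   # no rule works from here: backtrack


print(" => ".join(derive("baa")))  # S => bS => baM => baaB => baa
print(derive("abab"))              # None, since abab does not contain aa
```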

Backus-Naur Form
A notation devised for defining the language Algol 60. PASCAL syntax rules are often presented in this form.
Example: ::= ::= real | integer | boolean ::= identifier | identifier
This formalism is equivalent to CFGs, where names enclosed in angle brackets < > are non-terminals, names in bold are terminals, and ::= is the same as the -> notation.

Constructing Grammars
Suppose we wanted to construct a grammar for the language of all strings of the form accc...cb or abab...abcc...cabab...ab, where ab is repeated n times on each side of the c's.
We need to find rules to create:
(i) sequences of strings - ccc...c
(ii) bracketed strings - accc...cb, and
(iii) nested strings - abab...ab abab...ab
Sequencing
A -> aA | ε
or
A -> Aa | ε
e.g. A => aA => aaA => ... => aaaaaA => aaaaa
Bracketing
A -> aBb, B -> xB | ε
or
A -> Bb, B -> ax | Bx
e.g. A => aBb => axBb => axxBb => ... => axxxxxb

Constructing Grammars (cont.)
Nesting
A -> aAb | B
B -> xB | ε
e.g. A => aAb => aaAbb => aaaAbbb => ... => aaaaaAbbbbb => aaaaaBbbbbb => ... => aaaaaxxxBbbbbb => aaaaaxxxbbbbb
Example:
S -> abSab | abBab
B -> cB | c
What language does this generate? (Say it precisely)
The language (ab)^n c^m (ab)^n, where n > 0 and m > 0.
Example derivations:
S => abBab => abcBab => ... => abccccab
S => abSab => ababSabab => abababBababab => abababcBababab => ... => abababccccababab
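
The nesting structure of this grammar can be mirrored by a pair of mutually dependent functions, one per non-terminal. This is a sketch under that reading of the productions S -> abSab | abBab and B -> cB | c; the function names are invented for illustration.

```python
def matches_B(w):
    """B -> cB | c: one or more c's."""
    return len(w) >= 1 and set(w) == {"c"}


def matches_S(w):
    """S -> abSab | abBab: peel one ab from each end, then recurse."""
    # The shortest string from S is abcab, which has length 5.
    if len(w) < 5 or not (w.startswith("ab") and w.endswith("ab")):
        return False
    inner = w[2:-2]
    return matches_B(inner) or matches_S(inner)


print(matches_S("abccccab"))          # True  (n = 1, m = 4)
print(matches_S("abababccccababab"))  # True  (n = 3, m = 4)
print(matches_S("abcabab"))           # False (unequal numbers of ab)
```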