CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Context-free.

CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Context-free languages Fall 2010

Context-free grammar
A → 0A1
A → B
B → #
A, B are variables; 0, 1, # are terminals; A is the start variable.
This is a derivation: A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111

Context-free grammar
A context-free grammar (CFG) is (V, Σ, R, S) where
– V is a finite set of variables or non-terminals
– Σ is a finite set of terminals (V ∩ Σ = ∅)
– R is a set of productions or substitution rules of the form A → α, where A is a variable in V and α is a string of variables and terminals
– S is a variable called the start variable
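The definition above can be written down as a small data structure. The following is only a sketch: the class name `CFG`, its field names, and the method `is_well_formed` are assumptions made for illustration, and each production A → α is stored as a pair (A, tuple of symbols).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """A grammar (V, Sigma, R, S); the field names are illustrative."""
    variables: frozenset
    terminals: frozenset
    rules: tuple  # pairs (A, alpha), alpha a tuple of symbols
    start: str

    def is_well_formed(self) -> bool:
        # V and Sigma must be disjoint.
        if self.variables & self.terminals:
            return False
        # S must be a variable.
        if self.start not in self.variables:
            return False
        # Every rule has a variable on the left and only known symbols on the right.
        symbols = self.variables | self.terminals
        return all(
            head in self.variables and all(s in symbols for s in body)
            for head, body in self.rules
        )


# The grammar A -> 0A1 | B, B -> # used in these slides.
G = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    rules=(("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))),
    start="A",
)
```

Calling `G.is_well_formed()` checks the three conditions of the definition; a grammar whose start symbol is a terminal would be rejected.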

The grammar of English
[Figure: parse tree of the sentence "a girl with a flower likes the boy". The words are tagged ART NOUN PREP ART NOUN VERB ART NOUN and grouped into CMPLX-NOUN, PREP-PHRASE, NOUN-PHRASE, CMPLX-VERB, VERB-PHRASE and SENTENCE.]

The grammar of English
SENTENCE → NOUN-PHRASE VERB-PHRASE
NOUN-PHRASE → CMPLX-NOUN
NOUN-PHRASE → CMPLX-NOUN PREP-PHRASE
VERB-PHRASE → CMPLX-VERB
VERB-PHRASE → CMPLX-VERB PREP-PHRASE
PREP-PHRASE → PREP CMPLX-NOUN
CMPLX-NOUN → ARTICLE NOUN
CMPLX-VERB → VERB NOUN-PHRASE
CMPLX-VERB → VERB
ARTICLE → a
ARTICLE → the
NOUN → boy
NOUN → girl
NOUN → flower
VERB → likes
VERB → touches
VERB → sees
PREP → with
variables: SENTENCE, NOUN-PHRASE, …
terminals: a, the, boy, girl, flower, likes, touches, sees, with
start variable: SENTENCE
This grammar describes (a part of) English.

Derivations in English
(1) SENTENCE → NOUN-PHRASE VERB-PHRASE
(2) NOUN-PHRASE → CMPLX-NOUN
(3) NOUN-PHRASE → CMPLX-NOUN PREP-PHRASE
(4) VERB-PHRASE → CMPLX-VERB
(5) VERB-PHRASE → CMPLX-VERB PREP-PHRASE
(6) PREP-PHRASE → PREP CMPLX-NOUN
(7) CMPLX-NOUN → ARTICLE NOUN
(8) CMPLX-VERB → VERB NOUN-PHRASE
(9) CMPLX-VERB → VERB
(10) ARTICLE → a
(11) ARTICLE → the
(12) NOUN → boy
(13) NOUN → girl
(14) NOUN → flower
(15) VERB → likes
(16) VERB → touches
(17) VERB → sees
(18) PREP → with
A derivation, with the rule used at each step:
SENTENCE ⇒ NOUN-PHRASE VERB-PHRASE (1)
⇒ CMPLX-NOUN VERB-PHRASE (2)
⇒ ARTICLE NOUN VERB-PHRASE (7)
⇒ a NOUN VERB-PHRASE (10)
⇒ a boy VERB-PHRASE (12)
⇒ a boy CMPLX-VERB (4)
⇒ a boy VERB (9)
⇒ a boy sees (17)

Grammars for programming languages
E → E + E
E → E * E
E → ( E )
E → 0
E → 1
…
E → 9
Variables: E
Terminals: +, *, (, ), 0, 1, …, 9
E ⇒ E * E ⇒ ( E ) * E ⇒ ( E + E ) * E ⇒ (2 + E) * E ⇒ (2 + 3) * E ⇒ (2 + 3) * 5
(2 + 3) * 5 means: "add 2 and 3, and then multiply by 5"
bash-3.2$ python
Python (r265:79359, Mar , 01:32:55)
>>> (2+3)*5
25

Notation and conventions
E → E + E
E → E * E
E → ( E )
E → N
N → 0N
N → 1N
N → 0
N → 1
Shorthand:
E → E + E | E * E | ( E ) | N
N → 0N | 1N | 0 | 1
Variables: E, N
Terminals: +, *, (, ), 0, 1
Start variable: E
Conventions: variables are in UPPERCASE; the start variable comes first.

Derivation
A derivation is a sequential application of productions, one production per step:
E → E + E | E * E | ( E ) | N
N → 0N | 1N | 0 | 1
E ⇒ E * E ⇒ (E) * E ⇒ (E) * N ⇒ (E + E) * N ⇒ (E + E) * 1 ⇒ (E + N) * 1 ⇒ (N + N) * 1 ⇒ (N + 1N) * 1 ⇒ (N + 10) * 1 ⇒ (1 + 10) * 1
α ⇒ β: β is obtained from α in one production
α ⇒* β: β is obtained from α in zero or more productions
E ⇒* (1 + 10) * 1
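The one-step relation can be sketched in a few lines of Python. This is an illustration under assumptions: every symbol is a single character, spaces are dropped, and the names `RULES`, `one_step` and `is_derivation` are invented here.

```python
# Productions of the expression grammar, keyed by variable.
RULES = {
    "E": ["E+E", "E*E", "(E)", "N"],
    "N": ["0N", "1N", "0", "1"],
}


def one_step(form):
    """All sentential forms obtainable from `form` by applying one production."""
    results = set()
    for i, symbol in enumerate(form):
        for body in RULES.get(symbol, []):
            # Replace the single occurrence at position i by the rule body.
            results.add(form[:i] + body + form[i + 1:])
    return results


def is_derivation(steps):
    """True if each form follows from the previous one in exactly one production."""
    return all(b in one_step(a) for a, b in zip(steps, steps[1:]))


STEPS = ["E", "E*E", "(E)*E", "(E)*N", "(E+E)*N", "(E+E)*1",
         "(E+N)*1", "(N+N)*1", "(N+1N)*1", "(N+10)*1", "(1+10)*1"]
```

Here `is_derivation(STEPS)` checks a derivation of (1 + 10) * 1 step by step; a list that replaces two variables at once fails the check, because each step may apply only one production.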

Context-free languages
The language of a CFG is the set of all strings of terminals that can be derived from the start variable:
L(G) = {w : w ∈ Σ* and S ⇒* w}
Questions we will ask:
– I give you a CFG, what is the language?
– I give you a language, write a CFG for it
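For a small grammar, the first question can be explored by brute force. The sketch below lists every terminal string derivable in at most a given number of productions; the function name `language_sample` and the single-character symbol encoding are assumptions, and the result is only a finite sample of L(G), not the whole language.

```python
def language_sample(rules, start, max_steps):
    """Terminal strings derivable from `start` in at most `max_steps` productions."""
    frontier = {start}
    found = set()
    for _ in range(max_steps):
        next_forms = set()
        for form in frontier:
            for i, symbol in enumerate(form):
                for body in rules.get(symbol, []):
                    next_forms.add(form[:i] + body + form[i + 1:])
        # A form with no variables left is a string of the language.
        found |= {f for f in next_forms if not any(s in rules for s in f)}
        frontier = next_forms
    return found


# The grammar A -> 0A1 | B, B -> # from the analysis example.
RULES = {"A": ["0A1", "B"], "B": ["#"]}
SAMPLE = language_sample(RULES, "A", 4)
```

With four steps the sample is #, 0#1 and 00#11, which already suggests the pattern of equally many 0s and 1s around a single #.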

Analysis example 1
A → 0A1 | B
B → #
Can you derive:
00#11? Yes: A ⇒ 0A1 ⇒ 00A11 ⇒ 00B11 ⇒ 00#11
#? Yes: A ⇒ B ⇒ #
00#111? No, the numbers of 0s and 1s are unequal
00##11? No, there are too many #
L(G) = {0^n #1^n : n ≥ 0}

Analysis example 1
A → 0A1 | B
B → #
variables: A, B
terminals: 0, 1, #
start variable: A
Can you derive: 00#11, 00#111, 00##11, #?
What is the language of this CFG?
L = {0^n #1^n : n ≥ 0}

Analysis example 2
S → SS | (S) | ε
Can you derive:
()? Yes: S ⇒ (S) (rule 2) ⇒ () (rule 3)
(()())? Yes: S ⇒ (S) ⇒ (SS) ⇒ ((S)S) ⇒ ((S)(S)) ⇒ (()(S)) ⇒ (()())

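Parse trees
A parse tree gives a more compact representation of a derivation.
S → SS | (S) | ε
S ⇒ (S) ⇒ (SS) ⇒ ((S)S) ⇒ ((S)(S)) ⇒ (()(S)) ⇒ (()())
[Figure: parse tree for (()()). The root S has children (, S, ); that S has two children S S; each of these has children (, S, ); each innermost S has the single child ε.]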

Parse trees
One parse tree can represent several derivations:
S ⇒ (S) ⇒ (SS) ⇒ ((S)S) ⇒ ((S)(S)) ⇒ (()(S)) ⇒ (()())
S ⇒ (S) ⇒ (SS) ⇒ ((S)S) ⇒ (()S) ⇒ (()(S)) ⇒ (()())
S ⇒ (S) ⇒ (SS) ⇒ (S(S)) ⇒ ((S)(S)) ⇒ (()(S)) ⇒ (()())
S ⇒ (S) ⇒ (SS) ⇒ (S(S)) ⇒ (S()) ⇒ ((S)()) ⇒ (()())
[Figure: the parse tree for (()()) shared by all four derivations]
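A parse tree is easy to model as nested data. In this sketch a node is either a terminal string (with "" standing for ε) or a pair (variable, children); this encoding and the names `tree_yield`, `count_steps`, `EMPTY`, `PAIR` and `TREE` are assumptions made for illustration.

```python
def tree_yield(tree):
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(tree, str):
        return tree
    _, children = tree
    return "".join(tree_yield(child) for child in children)


def count_steps(tree):
    """Number of internal nodes, i.e. productions used by any derivation of this tree."""
    if isinstance(tree, str):
        return 0
    _, children = tree
    return 1 + sum(count_steps(child) for child in children)


EMPTY = ("S", [""])                              # S -> epsilon
PAIR = ("S", ["(", EMPTY, ")"])                  # S -> (S), inner S -> epsilon
TREE = ("S", ["(", ("S", [PAIR, PAIR]), ")"])    # S -> (S), inner S -> SS
```

The yield of `TREE` is (()()), and it has six internal nodes, matching the six steps of each derivation listed above: the derivations differ only in the order in which the nodes are expanded.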

Analysis example 2
S → SS | (S) | ε
Can you derive:
(()()? No, because the numbers of ( and ) are unequal
())())? No, because there is a prefix with an excess of )

Analysis example 2
S → SS | (S) | ε
L(G) = {w : w has the same number of ( and ), and no prefix of w has more ) than (}
Example: (()())()
Parsing rules:
– Divide w up into blocks with the same number of ( and )
– Each block is in L(G)
– Parse each block recursively
[Figure: parse tree for (()())()]
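The membership condition and the parsing rules can be sketched as follows. The function names `in_language`, `blocks` and `parse` are assumptions, blocks are taken to be the shortest balanced pieces, and `blocks` and `parse` expect a string that is already in the language.

```python
def in_language(w):
    """Same number of ( and ), and no prefix with more ) than (."""
    depth = 0
    for c in w:
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth < 0:       # a prefix with an excess of )
                return False
        else:
            return False        # not a string over { (, ) }
    return depth == 0


def blocks(w):
    """Split a balanced string into its shortest balanced blocks."""
    out, depth, start = [], 0, 0
    for i, c in enumerate(w):
        depth += 1 if c == "(" else -1
        if depth == 0:
            out.append(w[start:i + 1])
            start = i + 1
    return out


def parse(w):
    """Nested lists: each block (x) becomes the parse of x."""
    return [parse(block[1:-1]) for block in blocks(w)]
```

For (()())() the blocks are (()()) and (), and `parse` returns `[[[], []], []]`, which mirrors the nesting of the parse tree.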

Design example 1
L = {0^n 1^n | n ≥ 0}
These strings have recursive structure:
S → 0S1 | ε

Design example 2
L = numbers without leading zeros
allowed: 0, 109, 2, 23
not allowed: ε, 01, 003
S → 0 | LN
N → ND | ε
D → 0 | L
L → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
S is any number, L is the leading digit, N is the string of remaining digits, D is any digit.
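The grammar translates directly into a recognizer. This is a sketch, with the function name `is_number` chosen for illustration; each branch is labelled with the production it mirrors.

```python
def is_number(w):
    """Recognize decimal numbers without leading zeros."""
    if w == "0":
        return True                                  # S -> 0
    if not w or w[0] not in "123456789":
        return False                                 # S -> LN needs a leading digit L
    return all(c in "0123456789" for c in w[1:])     # N -> ND | epsilon
```

The strings 0, 109, 2 and 23 are accepted, while 01, 003 and the empty string are rejected.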

Design examples
L = {0^n 1^n 0^m 1^m | n ≥ 0, m ≥ 0}
These strings have two parts: L = L1 L2, where
L1 = {0^n 1^n | n ≥ 0}
L2 = {0^m 1^m | m ≥ 0}
Rules for L1: S1 → 0 S1 1 | ε
L2 is the same as L1, so:
S → S1 S1
S1 → 0 S1 1 | ε

Design examples
L = {0^n 1^m 0^m 1^n | n ≥ 0, m ≥ 0}
These strings have nested structure:
inner part: 1^m 0^m
outer part: 0^n … 1^n
S → 0S1 | I
I → 1I0 | ε
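The nested structure suggests a recognizer that peels matching symbols off both ends. The sketch below follows the two productions literally; the names `match_s` and `match_i` are assumptions.

```python
def match_i(w):
    """I -> 1 I 0 | epsilon"""
    if w == "":
        return True
    return len(w) >= 2 and w[0] == "1" and w[-1] == "0" and match_i(w[1:-1])


def match_s(w):
    """S -> 0 S 1 | I"""
    if len(w) >= 2 and w[0] == "0" and w[-1] == "1" and match_s(w[1:-1]):
        return True
    return match_i(w)
```

For example, 001011 is accepted (outer part 00…11, inner part 10), while 0110 is rejected.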

Design examples
L = {x : x has two 0-blocks with the same number of 0s}
Example of an allowed string: 01011
Split x into three parts A B C:
A (initial part): ε, or ends in 1
B (middle part)
C (final part): ε, or begins with 1

Design examples
S → ABC
A → ε | U1        (A: ε, or ends in 1)
U → 0U | 1U | ε   (U: any string)
C → ε | 1U        (C: ε, or begins with 1)
B → 0D0 | 0B0     (B has recursive structure: same number of 0s on both sides, at least one 0)
D → 1U1 | 1       (D: begins and ends in 1)

Context-free versus regular
Write a CFG for the language (0 + 1)*111:
S → U111
U → 0U | 1U | ε
Can you do so for every regular language? Yes: every regular language is context-free.
[Diagram: regular expression, NFA, DFA]

From regular to context-free
regular expression      CFG
∅                       grammar with no rules
ε                       S → ε
a (alphabet symbol)     S → a
E1 + E2                 S → S1 | S2
E1E2                    S → S1S2
E1*                     S → SS1 | ε
Here S1 and S2 are the start symbols of the grammars for E1 and E2. In all cases, S becomes the new start symbol.
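The table can be turned into a recursive conversion. The following is a sketch under assumptions: a regular expression is encoded as nested tuples such as `("star", ("union", ("sym", "0"), ("sym", "1")))`, fresh variables are named S0, S1, …, and the alphabet is assumed not to contain those names. None of these conventions come from the slides.

```python
import itertools


def regex_to_cfg(regex):
    """Return (start variable, list of rules) for a tuple-encoded regular expression."""
    counter = itertools.count()
    rules = []

    def build(r):
        s = f"S{next(counter)}"              # fresh start symbol for this subexpression
        kind = r[0]
        if kind == "empty":                  # empty set: grammar with no rules
            pass
        elif kind == "eps":                  # S -> epsilon
            rules.append((s, ()))
        elif kind == "sym":                  # S -> a
            rules.append((s, (r[1],)))
        elif kind == "union":                # S -> S1 | S2
            s1, s2 = build(r[1]), build(r[2])
            rules.append((s, (s1,)))
            rules.append((s, (s2,)))
        elif kind == "concat":               # S -> S1 S2
            s1, s2 = build(r[1]), build(r[2])
            rules.append((s, (s1, s2)))
        elif kind == "star":                 # S -> S S1 | epsilon
            s1 = build(r[1])
            rules.append((s, (s, s1)))
            rules.append((s, ()))
        else:
            raise ValueError(f"unknown regex node: {kind!r}")
        return s

    start = build(regex)
    return start, rules


START, RULES = regex_to_cfg(("star", ("union", ("sym", "0"), ("sym", "1"))))
```

For (0 + 1)* this yields the start variable S0 with rules S0 → S0 S1 | ε, S1 → S2 | S3, S2 → 0, S3 → 1.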

Context-free versus regular
Is every context-free language regular? No:
S → 0S1 | ε
L = {0^n 1^n : n ≥ 0} is context-free but not regular.
[Diagram: the regular languages are contained in the context-free languages]