Context-Free Languages


Theory of Computation: Context-Free Languages

Last Time
Is $L = \{0^i x \mid i \ge 0,\ x \in \{0,1\}^*,\ |x| \le i\}$ regular?
Is $L = \{0^i \mid i \text{ is prime}\}$ regular?
Dr. A. Mooney, Dept. of Computer Science, NUI Maynooth

Context-Free Languages
Context-free languages (CFLs) are described using context-free grammars (CFGs). A CFG is a simple recursive method of specifying grammar rules that generate the strings of a language; the languages generated in this way are the CFLs.

Context-Free Grammars
The following is an example of a CFG, call it G1:
    A → 1A0
    A → B
    B → #
A and B are variables; 0, 1 and # are terminals. A grammar consists of a collection of substitution rules (productions). A is the start variable in this case; the start variable usually occurs on the left-hand side of the topmost rule.

Context-Free Grammars
Use a grammar to describe a language by generating each string of the language:
1. Write down the start variable.
2. Find a variable that is written down and a rule whose left-hand side is that variable. Replace the written variable with the right-hand side of this rule.
3. Repeat step 2 until no variables remain.

Context-Free Grammars
G1 generates the string 1111#0000 using the following sequence of substitutions:
    A ⇒ 1A0 ⇒ 11A00 ⇒ 111A000 ⇒ 1111A0000 ⇒ 1111B0000 ⇒ 1111#0000
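
The derivation above can be replayed mechanically. The following is a minimal sketch, assuming the rules of G1 are stored in a dictionary; the names `RULES` and `derive` are illustrative and do not come from the slides.

```python
# Rules of G1 from the slides; "A" is the start variable.
RULES = {"A": ["1A0", "B"], "B": ["#"]}

def derive(n):
    """Return the derivation of 1^n # 0^n as a list of sentential forms."""
    form = "A"
    steps = [form]
    for _ in range(n):
        # Apply A -> 1A0 to the single A in the current form.
        form = form.replace("A", RULES["A"][0], 1)
        steps.append(form)
    # Apply A -> B, then B -> #, which removes the last variable.
    form = form.replace("A", RULES["A"][1], 1)
    steps.append(form)
    form = form.replace("B", RULES["B"][0], 1)
    steps.append(form)
    return steps

print(" => ".join(derive(4)))
```

Calling `derive(4)` walks through the same seven sentential forms as the slide, ending in 1111#0000.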

Context-Free Grammars
All strings generated in this manner constitute the language of the grammar. L(G1) denotes the language of grammar G1. One can show that L(G1) is $\{1^n \# 0^n \mid n \ge 0\}$. Any language that can be generated by some context-free grammar is called a context-free language.

Context-Free Grammars
Note: for convenience, several rules with the same left-hand variable are often abbreviated. A → 1A0 and A → B may be written as A → 1A0 | B, where “|” means “or”.

Definition of a CFG
A context-free grammar is a 4-tuple $(V, \Sigma, R, S)$, where
- $V$ is a finite set of variables,
- $\Sigma$ is a finite set of terminals, disjoint from $V$,
- $R$ is a finite set of rules, and
- $S \in V$ is the start variable.

Context-Free Grammars
In grammar G1, V = {A, B}, Σ = {0, 1, #}, S = A, and R is the collection of rules:
    A → 1A0
    A → B
    B → #

Context-Free Grammars
Consider G3 = ({S}, {a, b}, R, S). The set of rules R is
    S → aSb | SS | ε
This grammar generates strings such as ab, abab, aababb and aaabbb. Reading a as “(” and b as “)”, L(G3) is the language of all strings of properly nested parentheses.
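
Membership in L(G3) can be tested by a recursion that follows the three rules directly. This is a minimal sketch; the function name `generates` is illustrative, and the memoised recursion is just one convenient way to organise the check.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def generates(w):
    """True if S =>* w under the rules S -> aSb | SS | ε."""
    if w == "":
        return True                                   # S -> ε
    if w[0] == "a" and w[-1] == "b" and generates(w[1:-1]):
        return True                                   # S -> aSb
    # S -> SS: try every split into two non-empty parts.
    return any(generates(w[:k]) and generates(w[k:]) for k in range(1, len(w)))

for w in ["ab", "abab", "aababb", "aaabbb", "ba", "aab"]:
    print(repr(w), generates(w))
```

Only non-empty splits are tried for S → SS, since a split with an empty half reduces to deriving the whole string from a single S.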

Context-Free Grammars
Consider the language of palindromes: $L = \{w \in \{a,b\}^* \mid w = w^R\}$, where $w^R$ is the reverse of $w$. This language is not regular, but it can be generated by the following rules:
    S → aSa
    S → bSb
    S → a
    S → b
    S → ε
Here V = {S}, the start variable is S, Σ = {a, b}, and R is the set of rules above. Check that the grammar produces exactly the palindromes. What about a CFG for the non-palindromes?
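
One way to carry out the suggested check on small cases is to write a recogniser that follows the five rules and compare it with the definition w = w^R. This is a sketch for illustration; `is_generated` is a made-up name, and an exhaustive test up to a fixed length is evidence, not a proof.

```python
from itertools import product

def is_generated(w):
    """True if S =>* w under S -> aSa | bSb | a | b | ε."""
    if len(w) <= 1:
        return w in ("", "a", "b")        # S -> ε, S -> a, S -> b
    # S -> aSa or S -> bSb: the outer symbols must match.
    return w[0] == w[-1] and w[0] in "ab" and is_generated(w[1:-1])

# Compare with the definition w == reverse(w) for all strings up to length 6.
mismatches = [
    "".join(t)
    for n in range(7)
    for t in product("ab", repeat=n)
    if is_generated("".join(t)) != ("".join(t) == "".join(t)[::-1])
]
print(mismatches)
```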

Derivation Trees
Consider the same example grammar: [grammar not preserved in the transcript]
And a derivation of: [string not preserved in the transcript]
Costas Busch - RPI, Fall 2006

[Figure sequence: the derivation tree (parse tree) is built up step by step, with its yield marked at each step.]

Sometimes, derivation order doesn't matter
Leftmost derivation: [not preserved in the transcript]
Rightmost derivation: [not preserved in the transcript]
Both give the same derivation tree.

Designing CFGs
Many CFLs are the union of simpler ones. Construct CFGs for the smaller, simpler languages and then combine them to give the CFG for the larger CFL.

Designing CFGs
Construct a grammar for the language $\{0^n 1^n \mid n \ge 0\} \cup \{1^n 0^n \mid n \ge 0\}$.
First construct the grammar $S_1 \to 0 S_1 1 \mid \varepsilon$ for the language $\{0^n 1^n \mid n \ge 0\}$ and the grammar $S_2 \to 1 S_2 0 \mid \varepsilon$ for the language $\{1^n 0^n \mid n \ge 0\}$. Then add the rule $S \to S_1 \mid S_2$ to give the grammar:
    $S \to S_1 \mid S_2$
    $S_1 \to 0 S_1 1 \mid \varepsilon$
    $S_2 \to 1 S_2 0 \mid \varepsilon$

Designing CFGs
We can construct a CFG for a regular language by first constructing a DFA for the language. A DFA may be converted into an equivalent CFG as follows:
1. Make a variable $R_i$ for each state $q_i$ of the DFA.
2. Add the rule $R_i \to a R_j$ to the CFG if $\delta(q_i, a) = q_j$ is a transition in the DFA.
3. Add the rule $R_i \to \varepsilon$ if $q_i$ is an accept state of the DFA.
4. Make $R_0$ the start variable of the grammar, where $q_0$ is the start state of the machine.
Verify that it works!
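
The conversion can be sketched in a few lines. The encoding below is an assumption made for illustration: the DFA is given as a transition dictionary, rules are stored as tuples, and the example DFA (binary strings with an even number of 1s) is not taken from the slides.

```python
def dfa_to_cfg(states, delta, start, accepting):
    """Build the rules R_i -> a R_j and R_i -> ε from a DFA.

    delta maps (state, symbol) to the next state. A right-hand side is a
    tuple: (a, "Rj") for R_i -> a R_j, and () for R_i -> ε.
    """
    rules = {f"R{q}": [] for q in states}
    for (q, a), r in delta.items():
        rules[f"R{q}"].append((a, f"R{r}"))
    for q in accepting:
        rules[f"R{q}"].append(())
    return rules, f"R{start}"

def derives(rules, var, w):
    """True if the variable var derives the string w in the grammar."""
    if w == "":
        return () in rules[var]
    return any(
        rhs and rhs[0] == w[0] and derives(rules, rhs[1], w[1:])
        for rhs in rules[var]
    )

# Hypothetical example DFA: state 0 = even number of 1s seen, state 1 = odd.
DELTA = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
RULES, START_VAR = dfa_to_cfg([0, 1], DELTA, start=0, accepting=[0])

print(RULES)
print(derives(RULES, START_VAR, "1001"), derives(RULES, START_VAR, "1011"))
```

Because every rule has the form $R_i \to a R_j$ or $R_i \to \varepsilon$, a derivation in the grammar retraces exactly one run of the DFA, which is why `derives` can consume the input one symbol at a time.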

Union, concatenation and closure of CFGs
Theorem: If $L_1$ and $L_2$ are CFLs, then $L_1 \cup L_2$, $L_1 L_2$ and $L_1^*$ are also CFLs. That is, the context-free languages are closed under union, concatenation and Kleene closure.
The proofs are constructive. Begin with two grammars, $G_1 = (V_1, \Sigma, R_1, S_1)$ and $G_2 = (V_2, \Sigma, R_2, S_2)$, generating the CFLs $L_1$ and $L_2$ respectively.

Union of CFGs
The new CFG $G_x$ is made as follows:
- $\Sigma$ remains the same
- $S_x$ is the new start variable
- $V_x = V_1 \cup V_2 \cup \{S_x\}$
- $R_x = R_1 \cup R_2 \cup \{S_x \to S_1 \mid S_2\}$
Explanation: all we have done is augment the variable set with a new start variable and allow it to be replaced by the start variable of either grammar. So we generate strings from either $L_1$ or $L_2$, i.e. $L_1 \cup L_2$.

Concatenation of CFGs
The new CFG $G_y$ is made as follows:
- $\Sigma$ remains the same
- $S_y$ is the new start variable
- $V_y = V_1 \cup V_2 \cup \{S_y\}$
- $R_y = R_1 \cup R_2 \cup \{S_y \to S_1 S_2\}$
Explanation: again, all we have done is augment the variable set with a new start variable and allow it to be replaced by the concatenation of the two original start variables. So we generate strings that begin with a string from $L_1$ and end with a string from $L_2$, i.e. $L_1 L_2$.

Kleene Closure of CFGs
The new CFG $G_z$ is made as follows:
- $\Sigma$ remains the same
- $S_z$ is the new start variable
- $V_z = V_1 \cup \{S_z\}$
- $R_z = R_1 \cup \{S_z \to S_1 S_z \mid \varepsilon\}$
Explanation: again we have augmented the variable set with a new start variable, and allowed it to be replaced by either $S_1 S_z$ or $\varepsilon$. This means we can generate zero or more strings, each made by expanding the variable $S_1$, i.e. $L_1^*$.
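
The three constructions are small enough to write out directly. The sketch below assumes a grammar is stored as a triple (variables, rules, start) with right-hand sides as tuples, and that the two variable sets are disjoint (rename variables first if they are not); the alphabet is left implicit because it does not change. The slides do not state these assumptions, and the function names are illustrative.

```python
# A grammar is modelled as (variables, rules, start). rules maps each variable
# to a list of right-hand sides; a right-hand side is a tuple of symbols and
# the empty tuple () stands for ε.

def union(g1, g2, new_start="Sx"):
    (v1, r1, s1), (v2, r2, s2) = g1, g2
    assert v1.isdisjoint(v2) and new_start not in v1 | v2
    rules = {**r1, **r2, new_start: [(s1,), (s2,)]}        # Sx -> S1 | S2
    return v1 | v2 | {new_start}, rules, new_start

def concat(g1, g2, new_start="Sy"):
    (v1, r1, s1), (v2, r2, s2) = g1, g2
    assert v1.isdisjoint(v2) and new_start not in v1 | v2
    rules = {**r1, **r2, new_start: [(s1, s2)]}            # Sy -> S1 S2
    return v1 | v2 | {new_start}, rules, new_start

def star(g1, new_start="Sz"):
    v1, r1, s1 = g1
    assert new_start not in v1
    rules = {**r1, new_start: [(s1, new_start), ()]}       # Sz -> S1 Sz | ε
    return v1 | {new_start}, rules, new_start

# The two grammars from the "Designing CFGs" example.
G_A = ({"S1"}, {"S1": [("0", "S1", "1"), ()]}, "S1")
G_B = ({"S2"}, {"S2": [("1", "S2", "0"), ()]}, "S2")

variables, rules, start = union(G_A, G_B)
print(start, "->", rules[start])
```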

Ambiguity

Grammar for mathematical expressions: [grammar not preserved in the transcript]
Example strings: [not preserved in the transcript]
[Figure label: one symbol of the grammar denotes any number.]

A leftmost derivation for: [string and derivation not preserved in the transcript]

Another leftmost derivation for: [string and derivation not preserved in the transcript]

Two derivation trees for: [figure not preserved in the transcript]

Compute the expression result using the tree: [figure: a “Good Tree” and a “Bad Tree”]

Two different derivation trees may cause problems in applications which use the derivation trees:
- evaluating expressions
- in general, compilers for programming languages

Ambiguous Grammar
A context-free grammar is ambiguous if there is a string which has two different derivation trees or, equivalently, two different leftmost derivations. (Two different derivation trees give two different leftmost derivations and vice versa.)
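
Ambiguity can be observed by counting derivation trees. The sketch below uses a hypothetical expression grammar, E → E + E | E * E | a, chosen only for illustration because the grammar on the original slides is not preserved in this transcript; `count_trees` is likewise an illustrative name.

```python
from functools import lru_cache

def count_trees(s):
    """Count derivation trees of s under the assumed grammar
    E -> E + E | E * E | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Trees deriving the substring s[i:j] from E.
        total = 1 if s[i:j] == "a" else 0              # E -> a
        for k in range(i + 1, j - 1):
            if s[k] in "+*":                           # E -> E + E or E -> E * E
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

for s in ["a", "a+a", "a+a*a"]:
    print(s, count_trees(s))
```

A count greater than one for some string means that this grammar is ambiguous in the sense defined above. For a+a*a the two trees correspond to grouping the sum first or the product first.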

Example: this grammar [not preserved in the transcript] is ambiguous, since the string [not preserved] has two derivation trees.

The grammar is also ambiguous because the string has two leftmost derivations.

Another ambiguous grammar:
    IF_STMT → if EXPR then STMT
            | if EXPR then STMT else STMT
The upper-case names (IF_STMT, EXPR, STMT) are variables; if, then and else are terminals. This is a very common piece of grammar in programming languages.

The string
    if expr1 then if expr2 then stmt1 else stmt2
has two derivation trees:
- In the first, the outer IF_STMT is “if expr1 then STMT”, and STMT expands to “if expr2 then stmt1 else stmt2”, so the else belongs to the inner if.
- In the second, the outer IF_STMT is “if expr1 then STMT else stmt2”, and STMT expands to “if expr2 then stmt1”, so the else belongs to the outer if.

In general, ambiguity is bad and we want to remove it. Sometimes it is possible to find a non-ambiguous grammar for a language, but in general we cannot do so.

A successful example: an ambiguous grammar and an equivalent non-ambiguous grammar that generates the same language. [Both grammars are not preserved in the transcript.]

Unique derivation tree for: [string and figure not preserved in the transcript]

An unsuccessful example: [language not preserved in the transcript] is inherently ambiguous: every grammar that generates this language is ambiguous.

Example (ambiguous) grammar for the language: [not preserved in the transcript]

The string [not preserved] always has two different derivation trees (for any grammar). For example: [figure not preserved in the transcript]

Pushdown Automata (PDA)
Pushdown automata are similar to nondeterministic finite automata but have an extra component: a stack. The stack provides additional memory beyond the finite amount available in the state control, which allows pushdown automata to recognise some nonregular languages.

PDA
[Figure: a finite automaton (state control, input) shown beside a pushdown automaton (state control, input, stack).]

PDA
Why is a stack useful? It can hold an unlimited amount of information. Recall that a finite automaton is unable to recognise the language $\{0^n 1^n \mid n \ge 0\}$ because it cannot store arbitrarily large numbers in its finite memory. A PDA does not have this problem: it can use the stack to record how many 0s it has seen.

PDA
A PDA can write symbols on the stack and read them back later. Writing a symbol “pushes down” all the other symbols on the stack. Only the top symbol of the stack can ever be read, and once read it is removed. Writing a symbol is known as “pushing” and reading a symbol is known as “popping”.

PDA
The PDA has no built-in way of checking for an empty stack. It gets around this by initially placing a special symbol, $, on the stack. If it ever sees the $ again, it knows that the stack is effectively empty.

PDA
Important: deterministic and nondeterministic PDAs are not equivalent in power. Nondeterministic PDAs recognise certain languages which no deterministic PDA can recognise.

Formal Definition of PDAs
The formal definition of a PDA is similar to that of a finite automaton, except for the stack. The stack contains symbols drawn from some alphabet. The machine may use different alphabets for its input and its stack, so we need to specify both an input alphabet Σ and a stack alphabet Γ.

Formal Definition of PDAs
In order to formally define a PDA we need to determine the transition function. Recall that $\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}$ and $\Gamma_\varepsilon = \Gamma \cup \{\varepsilon\}$. The domain of the transition function is $Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon$. Therefore the current state, the next input symbol read and the top symbol of the stack determine the next move of the PDA. Either symbol may be $\varepsilon$, meaning that the machine may move without reading a symbol from the input or from the stack.

Formal Definition of PDAs
What is the range of the transition function? The machine may enter some new state and possibly write a symbol on the top of the stack. The function δ can represent this by returning a member of $Q$ along with a member of $\Gamma_\varepsilon$, i.e. a member of $Q \times \Gamma_\varepsilon$. Several legal next moves may be allowed. The transition function incorporates this nondeterminism in the usual way, by returning a set of members of $Q \times \Gamma_\varepsilon$, that is, a member of the power set $\mathcal{P}(Q \times \Gamma_\varepsilon)$.

Formal Definition of PDAs
A pushdown automaton is a 6-tuple $(Q, \Sigma, \Gamma, \delta, q_0, F)$, where $Q$, $\Sigma$, $\Gamma$ and $F$ are all finite sets, and
- $Q$ is the set of states,
- $\Sigma$ is the input alphabet,
- $\Gamma$ is the stack alphabet,
- $\delta: Q \times \Sigma_\varepsilon \times \Gamma_\varepsilon \to \mathcal{P}(Q \times \Gamma_\varepsilon)$ is the transition function,
- $q_0 \in Q$ is the start state, and
- $F \subseteq Q$ is the set of accept states.

PDA
The computation of a PDA $M$ is as follows. $M$ accepts input $w$ if $w$ can be written as $w = w_1 w_2 \cdots w_m$, where each $w_i \in \Sigma_\varepsilon$, and there exist a sequence of states $r_0, r_1, \ldots, r_m \in Q$ and strings $s_0, s_1, \ldots, s_m \in \Gamma^*$ that satisfy the following:
1. $r_0 = q_0$ and $s_0 = \varepsilon$: $M$ starts out properly, in the start state and with an empty stack.
2. For $i = 0, 1, \ldots, m-1$, we have $(r_{i+1}, b) \in \delta(r_i, w_{i+1}, a)$, where $s_i = at$ and $s_{i+1} = bt$ for some $a, b \in \Gamma_\varepsilon$ and $t \in \Gamma^*$: $M$ moves properly according to the state, the stack and the next input symbol.
3. $r_m \in F$: an accept state occurs when the end of the input is reached.

PDA
In a state transition we write “a, b → c” to signify that when the machine reads input a it may replace the symbol b on top of the stack with c. Any of a, b, c can be ε:
- If a is ε, the machine may make this transition without reading any input symbol.
- If b is ε, the machine may make the transition without reading and popping any stack symbol.
- If c is ε, the machine does not write any symbol on the stack.
Can we design a PDA to recognise the language $\{0^n 1^n \mid n \ge 0\}$?
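
One possible answer to the question above can be explored by simulation. The sketch below encodes a four-state PDA for the language of strings 0^n 1^n as a transition table in the “a, b → c” style and searches its configurations breadth-first to handle nondeterminism. The state names q1 to q4, the table and the function `accepts` are assumptions for illustration, not the machine from the lecture.

```python
from collections import deque

EPS = ""  # stands for ε

# δ: (state, input symbol or ε, popped symbol or ε)
#      -> set of (next state, pushed symbol or ε)
DELTA = {
    ("q1", EPS, EPS): {("q2", "$")},   # ε, ε -> $ : mark the bottom of the stack
    ("q2", "0", EPS): {("q2", "0")},   # 0, ε -> 0 : push each 0 that is read
    ("q2", "1", "0"): {("q3", EPS)},   # 1, 0 -> ε : first 1 pops a 0
    ("q3", "1", "0"): {("q3", EPS)},   # 1, 0 -> ε : each further 1 pops a 0
    ("q3", EPS, "$"): {("q4", EPS)},   # ε, $ -> ε : marker seen, stack is empty
}
START, ACCEPT = "q1", {"q1", "q4"}

def accepts(w):
    """Breadth-first search over configurations (state, position, stack)."""
    start = (START, 0, ())             # the stack is a tuple, top at the end
    seen, queue = {start}, deque([start])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(w) and state in ACCEPT:
            return True
        inputs = [EPS] + ([w[pos]] if pos < len(w) else [])
        pops = [EPS] + ([stack[-1]] if stack else [])
        for a in inputs:
            for b in pops:
                for nxt, c in DELTA.get((state, a, b), ()):
                    rest = stack[:-1] if b else stack
                    new = (nxt, pos + len(a), rest + ((c,) if c else ()))
                    if new not in seen:
                        seen.add(new)
                        queue.append(new)
    return False

for w in ["", "01", "0011", "001", "10", "0101"]:
    print(repr(w), accepts(w))
```

The search terminates for this particular table because no ε-move can be repeated indefinitely; a general simulator would need a guard against ε-loops that keep pushing symbols.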

Context Free
A language is context-free if and only if some pushdown automaton recognises it. Since every regular language is recognised by a finite automaton, and every finite automaton is automatically a pushdown automaton that ignores its stack, every regular language is also a context-free language.

Regular and Context-Free Languages

Non-Context-Free Languages
Recall that we looked at the pumping lemma previously for showing that certain languages are not regular. We will look at a similar lemma for context-free languages.

Pumping Lemma
The pumping lemma states that every context-free language has a special value called the pumping length, such that all strings in the language of at least that length can be “pumped”. Pumped, this time, means that the string can be divided into five parts such that the 2nd and 4th parts may be repeated together any number of times and the resulting string is still in the language.

Pumping Lemma
If $A$ is a context-free language, then there is a number $p$ (the pumping length) where, if $s$ is a string in $A$ of length at least $p$, then $s$ may be divided into five parts, $s = uvxyz$, such that:
1. for each $i \ge 0$, $u v^i x y^i z \in A$,
2. $|vy| > 0$, and
3. $|vxy| \le p$.

Pumping Lemma
Let us look at some examples. Use the pumping lemma to show that the following languages are not context-free:
- $B = \{a^n b^n c^n \mid n \ge 0\}$
- $C = \{a^i b^j c^k \mid 0 \le i \le j \le k\}$
- $D = \{ww \mid w \in \{0,1\}^*\}$
- $E = \{0^n 1^n 0^n 1^n \mid n \ge 0\}$
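
The argument for B can be made concrete by brute force on a small case. The sketch below fixes p = 3 purely as an example, tries every division s = uvxyz allowed by the lemma for the string aaabbbccc, and tests a few values of i. The helper names and the bound `max_i` are assumptions, and a finite search of this kind illustrates the proof idea without replacing the proof, which must work for every p.

```python
import re

def in_B(s):
    """Membership in B: strings a^n b^n c^n."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))

def in_AB(s):
    """Membership in the context-free language of strings a^n b^n, for contrast."""
    m = re.fullmatch(r"(a*)(b*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def pumpable_splits(s, p, member, max_i=3):
    """All divisions s = uvxyz with |vy| > 0 and |vxy| <= p such that
    u v^i x y^i z stays in the language for i = 0..max_i."""
    good = []
    n = len(s)
    for a in range(n + 1):                         # u = s[:a]
        for d in range(a, min(a + p, n) + 1):      # vxy = s[a:d]
            for b in range(a, d + 1):              # v = s[a:b]
                for c in range(b, d + 1):          # x = s[b:c], y = s[c:d]
                    u, v, x, y, z = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if v + y and all(
                        member(u + v * i + x + y * i + z)
                        for i in range(max_i + 1)
                    ):
                        good.append((u, v, x, y, z))
    return good

print(pumpable_splits("aaabbbccc", 3, in_B))       # no division survives
print(len(pumpable_splits("aaabbb", 3, in_AB)))    # some divisions do
```

Every division of aaabbbccc fails because vxy, having length at most p, touches at most two of the three blocks, so pumping changes at most two of the three counts.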