CSCI 2670 Introduction to Theory of Computing September 15, 2005
Agenda Yesterday –Introduce context-free grammars Today –No quiz! Quiz postponed until Monday –Regular expressions, pumping lemma, CFG’s –Build CFG’s
Context-free grammar definition A context-free grammar is a 4-tuple (V, Σ, R, S), where 1. V is a finite set called the variables, 2. Σ is a finite set, disjoint from V, called the terminals, 3. R is a finite set of rules, with each rule being a variable and a string of variables and terminals, and 4. S ∈ V is the start variable.
More definitions If u, v, and w are strings of variables and terminals, and A → w is a rule of the grammar, we say uAv yields uwv –Denoted uAv ⇒ uwv If a sequence of rules leads from u to v, i.e., u ⇒ u_1 ⇒ u_2 ⇒ … ⇒ v, we denote this u ⇒* v (the * is normally written above the double arrow) The language of the grammar is {w ∈ Σ* | S ⇒* w}
Example A → Ab | Bb, B → aBb | ab; V = {A, B}; Σ = {a, b}; R is the set of rules listed above; S = A. The language of this grammar is {w ∈ {a,b}* | w = a^n b^m, m > n > 0}
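To make the "yields" relation concrete, here is a minimal Python sketch that carries out a leftmost derivation in the example grammar A → Ab | Bb, B → aBb | ab. The names `RULES`, `leftmost_step`, and `derive` are illustrative choices, not part of the lecture, and the sketch assumes every symbol is a single character.

```python
# Rules of the example grammar: variable -> list of right-hand sides.
RULES = {"A": ["Ab", "Bb"], "B": ["aBb", "ab"]}


def leftmost_step(form, rhs):
    """Apply one rule to the leftmost variable of the sentential form."""
    for i, sym in enumerate(form):
        if sym in RULES:
            if rhs not in RULES[sym]:
                raise ValueError(f"{sym} -> {rhs} is not a rule")
            # uAv yields uwv: keep u and v, replace A by w.
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no variable left to replace")


def derive(start, choices):
    """Return every sentential form produced by applying `choices` in order."""
    forms = [start]
    for rhs in choices:
        forms.append(leftmost_step(forms[-1], rhs))
    return forms


derivation = derive("A", ["Bb", "aBb", "ab"])
print(" => ".join(derivation))  # A => Bb => aBbb => aabbb
```

The final form aabbb is a^2 b^3, which fits the stated language since m = 3 > n = 2 > 0.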
Designing CFG’s Requires creativity There are some guidelines to help –Union of two CFG’s –Converting a DFA to a CFG –Linked terminals –Recursive behavior
Designing the union of two CFG’s For the union of k CFG’s, design each CFG separately with starting variables S_1, S_2, …, S_k and combine using the rule S → S_1 | S_2 | … | S_k What is the CFG for the following language? {a^i b^j c^k | i, j, k ≥ 0 and i = j or j = k} = {a^i b^j c^k | i, j, k ≥ 0 and i = j} ∪ {a^i b^j c^k | i, j, k ≥ 0 and j = k}
Example First design {a^i b^j c^k | i, j, k ≥ 0 and i = j}: S_1 → S_1c | A, A → aAb | ε Then design {a^i b^j c^k | i, j, k ≥ 0 and j = k} –(use different variables): S_2 → aS_2 | B, B → bBc | ε Finally, add the “unifying” rule S → S_1 | S_2
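One way to sanity-check a union grammar like this is to enumerate every short string it generates and compare the result with the set definition. The sketch below does that with a bounded fixed-point computation. The function `language_up_to`, the dictionary encoding of the rules, and the length bound `N` are assumptions made for illustration, not something from the lecture.

```python
def language_up_to(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    rules maps each variable to a list of right-hand sides, each a list of
    symbols; any symbol that is not a key of rules is treated as a terminal.
    """
    lang = {var: set() for var in rules}
    changed = True
    while changed:  # repeat until no variable gains a new string
        changed = False
        for var, bodies in rules.items():
            for body in bodies:
                partial = {""}  # strings derivable from the prefix seen so far
                for sym in body:
                    choices = lang[sym] if sym in rules else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                if not partial <= lang[var]:
                    lang[var] |= partial
                    changed = True
    return lang[start]


# S1 generates a^i b^i c^k, S2 generates a^i b^j c^j, S is their union.
RULES = {
    "S": [["S1"], ["S2"]],
    "S1": [["S1", "c"], ["A"]],
    "A": [["a", "A", "b"], []],   # [] stands for the empty string
    "S2": [["a", "S2"], ["B"]],
    "B": [["b", "B", "c"], []],
}

N = 6
generated = language_up_to(RULES, "S", N)
expected = {"a" * i + "b" * j + "c" * k
            for i in range(N + 1) for j in range(N + 1) for k in range(N + 1)
            if i + j + k <= N and (i == j or j == k)}
print(generated == expected)
```

Agreement up to a length bound is evidence, not a proof, but it catches most mistakes in a hand-built grammar.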
Converting DFA’s into CFG’s For each state q_i in the DFA, make a variable R_i for your CFG. For each transition rule δ(q_i, a) = q_k in your DFA, add the rule R_i → aR_k to your CFG. For each accept state q_a in your DFA, add the rule R_a → ε. If q_0 is the start state in your DFA, then R_0 is the starting variable in your CFG.
Example [Figure: state diagram of a DFA with states q_1, q_2, q_3] V = {R_1, R_2, R_3}; R_1 → 0R_3 | 1R_2; R_2 → 0R_1 | 1R_3; R_3 → 0R_3 | 1R_3; R_2 → ε; R_1 is the start variable
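The conversion is mechanical enough to sketch in a few lines of Python. The function name `dfa_to_cfg` and the dictionary encoding of the transition function are illustrative assumptions. The transition table is read off the example grammar, and the self-loop of state 3 on both 0 and 1 is an assumption, since the original diagram is not available here.

```python
def dfa_to_cfg(delta, start, accepting):
    """Build CFG rules from a DFA.

    delta maps (state, symbol) -> next state. Returns (rules, start_variable),
    where rules maps a variable name to its list of right-hand sides.
    """
    rules = {}
    # One rule R_i -> a R_k for every transition delta(q_i, a) = q_k.
    for (state, symbol), target in delta.items():
        rules.setdefault(f"R{state}", []).append(f"{symbol}R{target}")
    # One rule R_a -> ε for every accept state q_a.
    for state in accepting:
        rules.setdefault(f"R{state}", []).append("ε")
    return rules, f"R{start}"


DELTA = {
    (1, "0"): 3, (1, "1"): 2,
    (2, "0"): 1, (2, "1"): 3,
    (3, "0"): 3, (3, "1"): 3,  # assumed: state 3 loops on both symbols
}

rules, start_variable = dfa_to_cfg(DELTA, start=1, accepting={2})
for var in sorted(rules):
    print(var, "->", " | ".join(rules[var]))
print("start variable:", start_variable)
```

Each variable R_i generates exactly the strings that take the DFA from q_i to an accept state, which is why the start variable corresponds to the start state.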
Linked terminals Terminals may be “linked” to one another in that they have the same number of occurrences (or a related number) –{0^n 1^n | n ≥ 0}, {x^n y^(2n) | n > 0} Add terminals simultaneously –S → 0S1 | ε –S → xSyy | xyy
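As a small illustration of why adding linked terminals in the same rule keeps their counts in step, the following Python sketch applies S → xSyy repeatedly and then finishes with S → xyy. The helper name `derive_xy` is hypothetical.

```python
def derive_xy(n):
    """Apply S -> xSyy (n - 1) times, then S -> xyy, and return the result."""
    if n < 1:
        raise ValueError("the language requires n > 0")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "xSyy")  # each step adds one x and two y's
    return form.replace("S", "xyy")


print(derive_xy(3))  # xxxyyyyyy
```

Every rule application adds one x and two y's together, so the result is always x^n y^(2n).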
Recursive behavior Some languages may be built of pieces that are within the language –Legal pairing of parentheses For these languages, you will want a recursive rule –S → SS Not all recursive rules will be that easy!
Example Construct a CFG generating all strings in {0,1}* that have equal numbers of 0’s and 1’s S → S0S1S | S1S0S | ε
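It is not obvious at a glance that S → S0S1S | S1S0S | ε produces every balanced string, so a brute-force check for short strings can be reassuring. The Python sketch below builds the set of generated strings up to a length bound by repeatedly applying the two recursive rules to strings already known to be generated, then compares it with a direct count of 0's and 1's. The function name `generated_up_to` and the bound `MAX_LEN` are illustrative assumptions.

```python
from itertools import product


def generated_up_to(max_len):
    """Strings of length <= max_len generated by S -> S0S1S | S1S0S | ε."""
    lang = {""}  # S -> ε
    while True:
        new = set()
        for u, v, w in product(lang, repeat=3):
            if len(u) + len(v) + len(w) + 2 <= max_len:
                new.add(u + "0" + v + "1" + w)  # S -> S0S1S
                new.add(u + "1" + v + "0" + w)  # S -> S1S0S
        if new <= lang:  # nothing new: fixed point reached
            return lang
        lang |= new


MAX_LEN = 6
generated = generated_up_to(MAX_LEN)
balanced = {"".join(bits)
            for n in range(MAX_LEN + 1)
            for bits in product("01", repeat=n)
            if bits.count("0") == bits.count("1")}
print(generated == balanced)
```

The check only covers strings up to the chosen bound. A full argument would proceed by induction on the length of the string.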
Ambiguity Consider the CFG ({S}, {0, 1, +, *}, R, S), where the rules of R are S → 0 | 1 | S + S | S * S Derive the string 0 * 1 + 1
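Ambiguity S → 0 | 1 | S + S | S * S [Figure: two parse trees for 0 * 1 + 1. In the first, the root uses S → S * S, with the left S yielding 0 and the right S expanding by S → S + S to 1 + 1. In the second, the root uses S → S + S, with the left S expanding by S → S * S to 0 * 1 and the right S yielding 1.] Different parse trees!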
Definition of ambiguity A context-free grammar G generates a string w ambiguously if G has two different parse trees for w –Derivations that differ only in the order in which rules are applied do not indicate ambiguity
Derivation & ambiguity A derivation of a string w in a grammar G is a leftmost derivation if every step of the derivation replaces the leftmost variable. A string is derived ambiguously in CFG G if it has two or more different leftmost derivations. The grammar G is ambiguous if it generates some string ambiguously –Some context-free languages are inherently ambiguous: every grammar that generates them is ambiguous
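For the expression grammar S → 0 | 1 | S + S | S * S, the number of parse trees of a string (equivalently, the number of its leftmost derivations) can be counted with a short recursion: every tree for a longer string has an operator at its root, and the subtrees on either side are counted independently. The Python sketch below is specific to this one grammar. The function name `count_parse_trees` is an illustrative choice, and strings are written without spaces.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parse_trees(s):
    """Number of parse trees of s under S -> 0 | 1 | S+S | S*S."""
    if s in ("0", "1"):
        return 1  # S -> 0 or S -> 1
    total = 0
    for i, ch in enumerate(s):
        if ch in "+*":
            # Root rule S -> S+S or S -> S*S splitting s at position i.
            total += count_parse_trees(s[:i]) * count_parse_trees(s[i + 1:])
    return total


print(count_parse_trees("0*1+1"))  # 2
```

A count greater than 1 for any string shows that the grammar is ambiguous. A count of 0 means the string is not generated at all.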