1
Chapter 4 Normal Forms for CFGs
2
4.5 Chomsky Normal Form
• Defn 4.4.1 A CFG G = (V, Σ, P, S) is in Chomsky normal form if each rule in G has one of the following forms:
    i) A → BC
    ii) A → a
    iii) S → λ
  where A, B, C ∈ V, B, C ∈ V − {S}, and a ∈ Σ.
• Chomsky normal form is a simplified normal form that restricts the length and composition of the R.H.S. of each rule in a CFG.
• The derivation tree for a string generated by a CFG in Chomsky normal form is a binary tree.
3
4.5 Chomsky Normal Form
• Theorem 4.4.2 Let G = (V, Σ, P, S) be a CFG. There is an algorithm to construct a grammar G’ = (V’, Σ’, P’, S’) in Chomsky normal form that is equivalent to G.
  Proof (sketch):
    (i) For each rule A → w, where |w| > 1, replace each terminal a in w by a distinct variable Y and create the new rule Y → a.
    (ii) For each modified rule X → w’, w’ is either a terminal or a string in V⁺. Rules of the latter form must be broken into a sequence of rules, each of whose R.H.S. consists of two variables.
• Example 4.4.1
• One application of CFGs in Chomsky normal form: constructing binary search trees to accomplish “optimal” time and space search complexity for parsing an input string.
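The two proof steps of Theorem 4.4.2 can be sketched in Python. This is a minimal illustration under stated assumptions, not the construction from the text: a grammar is a dict from variables to lists of right-hand sides (tuples of single-character symbols, with `()` standing for λ), λ-rules other than S → λ and chain rules are assumed to be removed already, and the names `to_chomsky`, `alts`, `Y_a` and `T1`, `T2`, … are hypothetical. Unlike step (i), the sketch reuses one variable per terminal instead of a distinct one per occurrence. The input is the chain-rule-free grammar G_C of Example 4.3.1.

```python
def alts(text):
    # "aA|b|" -> [("a", "A"), ("b",), ()]; symbols are single characters.
    return [tuple(alt) for alt in text.split("|")]


def to_chomsky(rules):
    """Sketch of the Chomsky normal form construction.

    Assumes no chain rules and no λ-rules except on the start symbol.
    """
    variables = set(rules)
    new_rules = {a: [] for a in rules}
    terminal_var = {}  # terminal -> variable that derives it
    counter = 0

    def var_for(terminal):
        # Hypothetical naming scheme: Y_a derives the terminal a.
        if terminal not in terminal_var:
            name = "Y_" + terminal
            terminal_var[terminal] = name
            new_rules[name] = [(terminal,)]
        return terminal_var[terminal]

    for a, rhss in rules.items():
        for rhs in rhss:
            if len(rhs) <= 1:
                new_rules[a].append(rhs)  # A → a or S → λ is already in form
                continue
            # Step (i): replace every terminal by a variable.
            body = [s if s in variables else var_for(s) for s in rhs]
            # Step (ii): break a long R.H.S. into two-variable rules.
            head = a
            while len(body) > 2:
                counter += 1
                fresh = "T" + str(counter)  # hypothetical fresh variable
                new_rules[fresh] = []
                new_rules[head].append((body[0], fresh))
                head, body = fresh, body[1:]
            new_rules[head].append(tuple(body))
    return new_rules


g_c = {
    "S": alts("ACA|CA|AA|AC|aAa|aa|bB|b|cC|c|"),
    "A": alts("aAa|aa|bB|b|cC|c"),
    "B": alts("bB|b"),
    "C": alts("cC|c"),
}
cnf = to_chomsky(g_c)
for variable, rhss in cnf.items():
    print(variable, "→", " | ".join(" ".join(r) or "λ" for r in rhss))
```

The fresh names are assumed not to clash with existing variables; a fuller version would check for that.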
4
4.1 Grammar Transformations
• Lemma 4.1.1 Let G = (V, Σ, P, S) be a CFG. There is a CFG G’ = (V’, Σ, P’, S’) that satisfies
    i) L(G) = L(G’)
    ii) Rules in P’ are of the form A → w, where A ∈ V’ and w ∈ ((V − {S’}) ∪ Σ)*.
  Proof. If S is a recursive variable, then construct G’ by creating a new start symbol S’ and adding S’ → S to P’, i.e., G’ = (V ∪ {S’}, Σ, P ∪ {S’ → S}, S’). If S ⇒* u in G, then S’ ⇒ S ⇒* u in G’, where u ∈ Σ*.
• Example 4.1.1 Assume that P in G includes S → aS | AB | AC. Then P’ in G’ should include
    S’ → S
    S → aS | AB | AC
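The construction in the proof of Lemma 4.1.1 is small enough to sketch directly. The snippet below is an illustration only: the function name `add_new_start` and the dict-of-tuples grammar encoding are assumptions, and the recursion test is deliberately conservative (it treats S as recursive whenever S occurs on any right-hand side). Only the S rules of Example 4.1.1 are given in the text, so only those are encoded.

```python
def add_new_start(rules, start, new_start="S'"):
    """Return (rules', start') where start' never occurs on a right-hand side."""
    # Conservative check: any occurrence of the start symbol on a
    # right-hand side is treated as making it recursive.
    occurs = any(start in rhs for rhss in rules.values() for rhs in rhss)
    if not occurs:
        return dict(rules), start
    new_rules = dict(rules)
    new_rules[new_start] = [(start,)]  # the added rule S' → S
    return new_rules, new_start


# S → aS | AB | AC from Example 4.1.1 (rules for A, B, C are not given there).
p = {"S": [("a", "S"), ("A", "B"), ("A", "C")]}
p_new, s_new = add_new_start(p, "S")
print(s_new, p_new[s_new])  # prints: S' [('S',)]
```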
5
4.2 Elimination of λ-rules
• Nullable variables are variables that can derive λ.
• A grammar w/o nullable variables is called a noncontracting grammar, since the application of a rule cannot decrease the length of the sentential form.
• It is desirable to avoid generating (nullable) variables that are subsequently removed by λ-rules.
• The removal of nullable variables from a grammar guarantees that, during the derivation of a terminal string, each variable generates terminal symbol(s).
• The derivation of a terminal string in a grammar G is more cost-effective if G is noncontracting than if G is contracting.
6
4.2 Elimination of λ-rules
• Algorithm 4.2.1 Construction of the Set of Nullable Variables
  Input: A CFG G = (V, Σ, P, S)
  Output: Set of nullable variables
  1. NULL := { A | A → λ ∈ P }
  2. Repeat
     2.1. PREV := NULL
     2.2. For each variable A ∈ V do
            If A → w, where w ∈ PREV*, then NULL := NULL ∪ { A }
     Until NULL = PREV.
  (Note: w ∈ PREV* indicates that the R.H.S. w of A → w consists entirely of nullable variables.)
7
4.2 Elimination of λ-rules
Example 4.2.1 Given the following CFG
    S → ACA
    A → aAa | B | C
    B → bB | b
    C → cC | λ
Using Algorithm 4.2.1, the set of nullable variables can be computed:
    Iteration   NULL          PREV
    0           { C }
    1           { A, C }      { C }
    2           { S, A, C }   { A, C }
    3           { S, A, C }   { S, A, C }
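The nullable-variable computation can be reproduced with a short Python sketch, run on the grammar of this example. The encoding is an assumption made for illustration: a dict from variables to lists of right-hand sides, each a tuple of symbols, with `()` standing for λ. The function name `nullable_variables` is hypothetical.

```python
def nullable_variables(rules):
    """Fixed-point computation of the set of nullable variables."""
    # Step 1: variables with a rule A → λ.
    null = {a for a, rhss in rules.items() if () in rhss}
    # Step 2: repeat until no new variable is added.
    while True:
        prev = set(null)
        for a, rhss in rules.items():
            # A is nullable if some R.H.S. consists only of variables in PREV.
            if any(all(sym in prev for sym in rhs) for rhs in rhss):
                null.add(a)
        if null == prev:
            return null


grammar = {
    "S": [("A", "C", "A")],
    "A": [("a", "A", "a"), ("B",), ("C",)],
    "B": [("b", "B"), ("b",)],
    "C": [("c", "C"), ()],
}
print(sorted(nullable_variables(grammar)))  # prints: ['A', 'C', 'S']
```

As in the trace, B is never added because every B rule contains the terminal b.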
8
4.2 Elimination of λ-rules
• An essentially noncontracting grammar G
    ● includes S → λ, if λ ∈ L(G)
    ● excludes any other λ-rules
    ● yields only noncontracting derivations, with the exception of S ⇒ λ.
• Theorem 4.2.3 Let G = (V, Σ, P, S) be a CFG. There is a CFG G_L = (V_L, Σ, P_L, S_L) w/o λ-rules, other than possibly S_L → λ, that satisfies
    ● L(G_L) = L(G).
    ● S_L is not a recursive variable.
    ● A → λ ∈ P_L iff λ ∈ L(G) and A = S_L.
9
4.2 Elimination of λ-rules
• Constructing G_L from G.
    (i) S_L is not a recursive variable. Any recursive start variable can be made nonrecursive according to Lemma 4.1.1.
    (ii) If λ ∈ L(G), then S_L → λ ∈ P_L.
    (iii) Delete all the λ-rules in G, except S_L → λ, from P_L as follows: if A → w ∈ P, w = w_1 A_1 w_2 A_2 … w_k A_k w_{k+1}, and A_1, …, A_k are nullable variables, then create 2^k − 1 new rules in P_L, i.e., the set of 2^k possible combinations of including and excluding A_1, A_2, …, A_k (a combination that would yield A → λ is omitted).
• Example 4.2.2 { S, A, C } are the nullable variables of G, where
    G:   S → ACA
         A → aAa | B | C
         B → bB | b
         C → cC | λ
    G_L: S → ACA | AC | CA | AA | A | C | λ
         A → aAa | aa | B | C
         B → bB | b
         C → cC | c
10
4.2 Elimination of λ-rules
• Example 4.2.3 Let G be the grammar
    S → ABC
    A → aA | λ
    B → bB | λ
    C → cC | λ
  G generates a*b*c*. The nullable variables of G are S, A, B, and C. The equivalent grammar of G w/o λ-rules is G_L, where
    S → ABC | AB | AC | BC | A | B | C | λ
    A → aA | a
    B → bB | b
    C → cC | c
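The construction of G_L can be sketched in Python and checked against this example. The sketch rests on assumptions: the start symbol is already non-recursive, the set of nullable variables is supplied by the caller, and the grammar is encoded as a dict from variables to lists of right-hand sides (tuples of single-character symbols, `()` for λ). The names `eliminate_lambda_rules` and `alts` are hypothetical.

```python
from itertools import product


def alts(text):
    # "aA|" -> [("a", "A"), ()]; symbols are single characters.
    return [tuple(alt) for alt in text.split("|")]


def eliminate_lambda_rules(rules, start, nullable):
    """Build the rules of G_L from those of G and the nullable set."""
    new_rules = {}
    for a, rhss in rules.items():
        out = []
        for rhs in rhss:
            # Every nullable occurrence can be kept or dropped.
            choices = [((s,), ()) if s in nullable else ((s,),) for s in rhs]
            for combo in product(*choices):
                new_rhs = tuple(s for part in combo for s in part)
                # Skip λ and duplicates.
                if new_rhs and new_rhs not in out:
                    out.append(new_rhs)
        new_rules[a] = out
    if start in nullable:
        new_rules[start].append(())  # keep S → λ when λ ∈ L(G)
    return new_rules


g = {"S": alts("ABC"), "A": alts("aA|"), "B": alts("bB|"), "C": alts("cC|")}
g_l = eliminate_lambda_rules(g, "S", {"S", "A", "B", "C"})
for variable, rhss in g_l.items():
    print(variable, "→", " | ".join("".join(r) or "λ" for r in rhss))
```

The rules may come out in a different order than on the slide; the sets of rules are the same.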
11
4.3 Elimination of Chain rules
• Definition. A rule of the form A → B, where A, B ∈ V, which simply renames a variable in a derivation, is a chain rule.
• The removal of chain rules increases the number of rules in the grammar, but reduces the length of derivations.
• Example. Consider the set of rules P
    A → aA | a | B
    B → bB | b | C
  The chain rule A → B is eliminated by (i) adding A → w for every rule B → w, and (ii) deleting A → B. Hence, P is modified as
    A → aA | a | bB | b | C
    B → bB | b | C
  However, another chain rule, A → C, was created.
12
4.3 Elimination of Chain rules
• Algorithm 4.3.1. Construction of the Set CHAIN(A)
  Input: A CFG G = (V, Σ, P, S) and a variable A
  Output: CHAIN(A), the set of variables derivable from A using only chain rules
  1. CHAIN(A) := { A }
  2. PREV := ∅
  3. Repeat
     3.1. NEW := CHAIN(A) − PREV
     3.2. PREV := CHAIN(A)
     3.3. For each variable B ∈ NEW do
            For each rule B → C do
              CHAIN(A) := CHAIN(A) ∪ { C }
     Until CHAIN(A) = PREV.
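Algorithm 4.3.1 translates almost line by line into Python. The sketch below is illustrative: the function name `chain` and the dict-of-tuples encoding are assumptions. It uses the rule fragment A → aA | a | B, B → bB | b | C from the chain-rule example; since that fragment gives no rules for C, C is encoded here with an empty rule list.

```python
def chain(rules, a):
    """Compute CHAIN(a): variables reachable from a via chain rules only."""
    variables = set(rules)
    chain_a = {a}
    prev = set()
    while chain_a != prev:
        new = chain_a - prev       # variables added in the previous pass
        prev = set(chain_a)
        for b in new:
            for rhs in rules[b]:
                # A chain rule has a single variable as its R.H.S.
                if len(rhs) == 1 and rhs[0] in variables:
                    chain_a.add(rhs[0])
    return chain_a


p = {
    "A": [("a", "A"), ("a",), ("B",)],
    "B": [("b", "B"), ("b",), ("C",)],
    "C": [],  # rules for C are not given in the fragment
}
print(sorted(chain(p, "A")))  # prints: ['A', 'B', 'C']
```

Tracking NEW keeps each variable's rules from being scanned more than once.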
13
4.3 Elimination of Chain rules
• Theorem 4.3.3 Let G = (V, Σ, P, S) be a CFG. There is a CFG G_C = (V, Σ, P_C, S) that satisfies
    1) L(G_C) = L(G).
    2) G_C has no chain rules.
  Proof. Using P and CHAIN(A), we compute the A rules in G_C. The rule A → w is in P_C if there exist a variable B and a string w such that
    i) B ∈ CHAIN(A)
    ii) B → w ∈ P
    iii) w ∉ V (ensures that P_C does not contain chain rules).
  Let w ∈ L(G) and let A ⇒* B be a maximal sequence of chain rules used to derive w. The derivation of w in G has the form
    S ⇒* uAv ⇒* uBv ⇒ upv ⇒* w,
  and it can be replaced in G_C by
    S ⇒* uAv ⇒ upv ⇒* w,
  where B → p ∈ P is not a chain rule.
14
4.3 Elimination of Chain rules
Example 4.3.1. Given the following CFG
    S → ACA | CA | AA | AC | A | C | λ
    A → aAa | aa | B | C
    B → bB | b
    C → cC | c
Using Algorithm 4.3.1, we generate
    CHAIN(S) = { S, A, C, B }
    CHAIN(A) = { A, C, B }
    CHAIN(B) = { B }
    CHAIN(C) = { C }
Using these CHAIN sets, we generate G_C, where P_C of G_C is
    S → ACA | CA | AA | AC | aAa | aa | bB | b | cC | c | λ
    A → aAa | aa | bB | b | cC | c
    B → bB | b
    C → cC | c
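The rule set P_C of this example can be recomputed with a short Python sketch of the construction in Theorem 4.3.3. It is an illustration under assumptions: grammars are dicts from variables to lists of right-hand sides (tuples of single-character symbols, `()` for λ), and the names `chain`, `eliminate_chain_rules` and `alts` are hypothetical. The CHAIN sets are computed here with a simple worklist instead of the NEW/PREV bookkeeping of Algorithm 4.3.1.

```python
def alts(text):
    # "bB|b|" -> [("b", "B"), ("b",), ()]; symbols are single characters.
    return [tuple(alt) for alt in text.split("|")]


def chain(rules, a):
    """Variables reachable from a using only chain rules (worklist version)."""
    result, frontier = {a}, [a]
    while frontier:
        b = frontier.pop()
        for rhs in rules[b]:
            if len(rhs) == 1 and rhs[0] in rules and rhs[0] not in result:
                result.add(rhs[0])
                frontier.append(rhs[0])
    return result


def eliminate_chain_rules(rules):
    """A → w is kept iff B ∈ CHAIN(A), B → w ∈ P and w is not a variable."""
    new_rules = {}
    for a in rules:
        out = []
        for b in sorted(chain(rules, a)):
            for rhs in rules[b]:
                is_chain_rule = len(rhs) == 1 and rhs[0] in rules
                if not is_chain_rule and rhs not in out:
                    out.append(rhs)
        new_rules[a] = out
    return new_rules


g = {
    "S": alts("ACA|CA|AA|AC|A|C|"),
    "A": alts("aAa|aa|B|C"),
    "B": alts("bB|b"),
    "C": alts("cC|c"),
}
g_c = eliminate_chain_rules(g)
for variable, rhss in g_c.items():
    print(variable, "→", " | ".join("".join(r) or "λ" for r in rhss))
```

The rules are printed in a different order than on the slide, but each variable ends up with the same set of right-hand sides.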
15
4.7 Removal of Direct Left Recursion
• Recursion is necessary to generate strings of arbitrary length. Left recursion causes problems, not recursion in general.
• Directly left recursive rules (e.g. A → Aa) introduce the possibility of unending computations, since repeated applications of them fail to generate a prefix that can terminate the parse.
• To avoid the possibility of a non-terminating parse, directly left recursive rules must be removed.
• Example.
    Direct left-recursive rules        No direct left-recursive rules
    G_1: A → Aa | b                    G_1’: A → bZ | b
                                             Z → aZ | a
    G_2: A → Aa | Ab | b | c           G_2’: A → bZ | cZ | b | c
                                             Z → aZ | bZ | a | b
    G_3: A → AB | BA | a               G_3’: A → BAZ | aZ | BA | a
         B → b | c                           Z → BZ | B
                                             B → b | c
16
4.7 Removal of Direct Left Recursion
• The removal of any direct left recursion requires the addition of a new variable Z to V, which introduces a set of directly right recursive rules.
• To remove direct left recursion on variable A, the A rules are divided into
    directly left recursive rules: A → Au_1 | Au_2 | … | Au_j
    non-directly left recursive rules: A → v_1 | v_2 | … | v_k, where A is not the first symbol in v_i (1 ≤ i ≤ k)
• Strategy: build a string in a left-to-right manner by (i) applying the non-recursive rules first, and then (ii) constructing the remaining symbols by right recursion.
    A → v_1 | … | v_k | v_1 Z | … | v_k Z
    Z → u_1 | … | u_j | u_1 Z | … | u_j Z
• Example 4.5.1 Given the rules A → Aa | Aab | bb | b
    The A rules are: A → bb | b | bbZ | bZ
    The Z rules are: Z → a | ab | aZ | abZ
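The transformation of the A rules into A and Z rules can be sketched in Python and checked against Example 4.5.1. This is a minimal illustration with assumptions: right-hand sides are tuples of single-character symbols, the function name `remove_direct_left_recursion` is hypothetical, the new variable name is chosen by the caller, and the grammar is assumed to contain no rule A → A.

```python
def remove_direct_left_recursion(rules, a, z):
    """Replace the directly left recursive rules of variable a using new variable z."""
    # u_i: what follows A in each rule A → A u_i.
    recursive = [rhs[1:] for rhs in rules[a] if rhs and rhs[0] == a]
    # v_i: the rules that do not start with A.
    others = [rhs for rhs in rules[a] if not rhs or rhs[0] != a]
    if not recursive:
        return dict(rules)  # nothing to remove
    new_rules = dict(rules)
    new_rules[a] = others + [v + (z,) for v in others]        # A → v_i | v_i Z
    new_rules[z] = recursive + [u + (z,) for u in recursive]  # Z → u_i | u_i Z
    return new_rules


# A → Aa | Aab | bb | b from Example 4.5.1.
g = {"A": [tuple("Aa"), tuple("Aab"), tuple("bb"), tuple("b")]}
g_new = remove_direct_left_recursion(g, "A", "Z")
for variable, rhss in g_new.items():
    print(variable, "→", " | ".join("".join(r) for r in rhss))
# prints:
# A → bb | b | bbZ | bZ
# Z → a | ab | aZ | abZ
```

Only direct left recursion on the chosen variable is handled; indirect left recursion through other variables is outside this sketch.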