Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example.

Similar presentations


Presentation on theme: "1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example."— Presentation transcript:

1 1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example. The productions take the form , where  and  are strings over an alphabet of terminals and nonterminals. Read  as, “  produces ,” “  derives ,” or “  is replaced by .” The following four expressions are productions for a grammar. Terminals: {a, b}, the alphabet of the language. Nonterminals: {S, B}, the grammar symbols (uppercase letters), disjoint from terminals. Start symbol: S, a specified nonterminal alone on the left side of some production. Sentential form: any string of terminals and/or nonterminals. Derivation: a transformation of sentential forms by means of productions as follows: If x  y is a sentential form and    is a production, then the replacement of  by  in x  y to obtain x  y is a derivation step, which we denote by x  y  x  y. Example Derivation: S  aSB  aaSBB  aaBB  aabBB  aabbB  aabbb. This is a leftmost derivation, where each step replaces the leftmost nonterminal. The symbol  + means one or more steps and  * means zero or more steps. So we could write S  + aabbb or S  * aabbb or aSB  * aSB, and so on. S  aSB S  S   B  bB B  b. Alternative Short Form S  aSB |  B  bB | b.

2 2 The Language of a Grammar The language of a grammar is the set of terminal strings derived from the start symbol. Example. Can we find the language of the grammar S  aSB |  and B  bB | b? Solution: Examine some derivations to see if a pattern emerges. S  S   S  aSB  aB  ab S  aSB  aB  abB  abbB  abbb S  aSB  aaSBB  aaBB  aabB  aabb S  aSB  aaSBB  aaBB  aabBB  aabbBB  aabbbB  aabbbb. So we have a pretty good idea that the language of the grammar is {a n b n+k | n, k  N}. Quiz (1 minute). Describe the language of the grammar S  a | bcS. Solution: {(bc) n a | n  N}. Construction of Grammars Example. Find a grammar for {a n b | n  N}. Solution: We need to derive any string of a’s followed by b. The production S  aS can be used to derive strings of a’s. The production S  b will stop the derivation and produce the desired string ending with b. So a grammar for the language is S  aS | b. Quiz (1 minute). Find a grammar for {ba n | n  N}. Solution: S  Sa | b. Quiz (1 minute). Find a grammar for {(ab) n | n  N}. Solution: S  Sab |  or S  abS | .

3 3 Rules for Combining Grammars Let L and M be two languages with grammars that have start symbols A and B, respectively, and with disjoint sets of nonterminals. Then the following rules apply. L  M has a grammar starting with S  A | B. LM has a grammar starting with S  AB. L* has a grammar starting with S  AS | . Example. Find a grammar for {a m b m c n | m, n  N}. Solution: The language is the product LM, where L = {a m b m | m  N} and M = {c n | n  N}. So a grammar for LM can be written in terms of grammars for L and M as follows. S  AB A  aAb |  B  cB | . Example. Find a grammar for the set Odd, of odd decimal numerals with no leading zeros, where, for example, 305  Odd, but 0305  Odd. Solution: Notice that Odd can be written in the form Odd = (PD*)*O, where O = {1, 3, 5, 7, 9}, P = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and D = {0}  P. Grammars for O, P, and D can be written with start symbols A, B, and C as: A  1 | 3 | 5 | 7 | 9, B  A | 2 | 4 | 6 | 8, and C  B | 0. Grammars for D* and PD* and (PD*)* can be written with start symbols E, F, and G as: E  CE | , F  BE, and G  FG | . So a grammar for Odd with start symbol S is S  GA.

4 4 Example. Find a grammar for the language L defined inductively by, Basis: a, b, c  L. Induction: If x, y  L then ƒ(x), g(x, y)  L. Solution: We can get some idea about L by listing some of its strings. a, b, c, ƒ(a), ƒ(b), …, g(a, a), …, g(ƒ(a), ƒ(a)), …, ƒ(g(b, c)), …, g(ƒ(a), g(b, ƒ(c))), … So L is the set of all algebraic expressions made up from the letters a, b, c, and the function symbols ƒ and g of arities 1 and 2, respectively. A grammar for L can be written as S  a | b | c | ƒ(S) | g(S, S). For example, a leftmost derivation of g(ƒ(a), g(b, ƒ(c))) can be written as S  g(S, S)  g(ƒ(S), S)  g(ƒ(a), S)  g(ƒ(a), g(S, S))  g(ƒ(a), g(b, S))  g(ƒ(a), g(b, ƒ(S)))  g(ƒ(a), g(b, ƒ(c))). S g ( S, S ) ƒ ( S ) b a A Parse Tree is a tree that represents a derivation. The root is the start symbol and the children of a nonterminal node are the symbols (terminals, nonterminals, or  ) on the right side of the production used in the derivation step that replaces that node. Example. The tree shown in the picture is the parse tree for the following derivation: S  g(S, S)  g(ƒ(S), S)  g(ƒ(a), S)  g(ƒ(a), b).

5 5 Example. Is the grammar S  SaS | b ambiguous? Solution: Yes. For example, the string babab has two distinct leftmost derivations: S  SaS  SaSaS  baSaS  babaS  babab. S  SaS  baS  baSaS  babaS  babab. The parse trees for the derivations are pictured. S S a S bb b S b bb Ambiguous Grammar: Means there is at least one string with two distinct parse trees, or equivalently, two distinct leftmost derivations or two distinct rightmost derivations. Quiz (2 minutes). Show that the grammar S  abS | Sab | c is ambiguous. Solution: The string abcab has two distinct leftmost derivations: S  abS  abSab  abcab and S  Sab  abSab  abcab. Unambiguous Grammar: Sometimes one can find a grammar that is not ambiguous for the language of an ambiguous grammar. Example. The previous example showed S  SaS | b is ambiguous. The language of the grammar is {b, bab, babab, …}. Another grammar for the language is S  baS | b. It is unambiguous because S produces either baS or b, which can’t derive the same string. Example. The previous quiz showed S  c | abS | Sab is ambiguous. Its language is {(ab) m c(ab) n | m, n  N}. Another grammar for the language is S  abS | cT and T  abT | . It is unambiguous because S produces either abS or cT, which can’t derive the same string. Similarly, T produces either abT or , which can’t derive the same string.


Download ppt "1 Section 3.3 Grammars A grammar is a finite set of rules, called productions, that are used to describe the strings of a language. Notational Example."

Similar presentations


Ads by Google