AUTOMATA THEORY. Chapter 05 CONTEXT-FREE GRAMMARS AND LANGUAGES.


1 AUTOMATA THEORY

2 Chapter 05 CONTEXT-FREE GRAMMARS AND LANGUAGES

3 Introduction
Context-free grammars (CFGs) have played a central role in compiler technology since the 1960s.
They turned the implementation of parsers from an ad hoc implementation task into a routine one.
Parsers are functions that discover the structure of a program.

4 An informal example
Let us consider the language of palindromes.
A palindrome is a string that reads the same forward and backward, such as otto or madamimadam.
Here we describe only the palindromes over the alphabet {0,1}, for example 0110 and 11011.

5 A Context-free Grammar for Palindromes
1. P → ε
2. P → 0
3. P → 1
4. P → 0P0
5. P → 1P1
This grammar covers binary strings only.

6 Definition of CFG
A CFG is a way of describing a language by recursive rules called productions.
A CFG consists of:
1. A finite set of symbols that form the strings of the language, called terminals or terminal symbols.
2. A finite set of variables, also called nonterminals.
3. A start symbol (start variable), which is one of the variables.
4. A finite set of productions, or rules.

7 Definition of CFG (continued)
Each production consists of:
a. The head of the production, which is a variable.
b. The production symbol →.
c. The body of the production, a string of zero or more terminals and variables.

8 Definition of CFG (continued)
The four components of a CFG G can be represented as follows:
G = (V, T, P, S)
where V is the set of variables, T the set of terminals, P the set of productions, and S the start variable.

9 A Context-free Grammar for Palindromes
The grammar G_pal for the palindromes is represented by
G_pal = ({P}, {0,1}, A, P)
where A represents the set of five productions:
P → ε
P → 0
P → 1
P → 0P0
P → 1P1
This grammar covers binary strings only.
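
As a rough illustration of the four-component view, the palindrome grammar can be written down as plain data and used to test membership. This is only a sketch: the names CFG, G_PAL and derives_pal are made up for this example, and the recognizer is hand-written for this one grammar rather than a general parsing algorithm.

```python
from typing import NamedTuple

class CFG(NamedTuple):
    """G = (V, T, P, S); the field names are illustrative only."""
    variables: frozenset
    terminals: frozenset
    productions: tuple  # (head, body) pairs; the empty body "" stands for ε
    start: str

G_PAL = CFG(
    variables=frozenset({"P"}),
    terminals=frozenset({"0", "1"}),
    productions=(("P", ""), ("P", "0"), ("P", "1"), ("P", "0P0"), ("P", "1P1")),
    start="P",
)

def derives_pal(w: str) -> bool:
    """True if P =>* w, found by undoing productions 4 and 5 from the outside in."""
    if any(c not in G_PAL.terminals for c in w):
        return False
    if w in ("", "0", "1"):              # productions 1, 2 and 3
        return True
    if w[0] == w[-1]:                    # production 4 (0P0) or 5 (1P1)
        return derives_pal(w[1:-1])
    return False

print(derives_pal("0110"), derives_pal("11011"), derives_pal("01"))
```

The recursion mirrors the grammar: a string is accepted exactly when it is one of the three base cases or has matching outer symbols around a shorter accepted string.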

10 Example of CFG
A CFG for simple expressions with the operators '+' and '*'. Identifiers use only the letters 'a' and 'b' and the digits '0' and '1'. Every identifier must begin with a or b, which may be followed by any string in {a,b,0,1}*.
G = ({E,I}, T, P, E)
T = {0,1,a,b,+,*,(,)}
Productions:
1. E → I
2. E → E+E
3. E → E*E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1

11 Derivation using grammar
Derivation of (ab+ab0). The number in parentheses is the production used at each step.
1. E ⇒ (E)   (4)
2. ⇒ (E+E)   (2)
3. ⇒ (I+E)   (1)
4. ⇒ (Ib+E)   (8)
5. ⇒ (ab+E)   (5)
6. ⇒ (ab+I)   (1)
7. ⇒ (ab+I0)   (9)
8. ⇒ (ab+Ib0)   (8)
9. ⇒ (ab+ab0)   (5)
Productions:
1. E → I
2. E → E+E
3. E → E*E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1
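
The derivation above can be replayed mechanically. The sketch below is an illustration, not part of the original slides: the names PRODUCTIONS, apply_leftmost and derive are invented here, and the code assumes that every step rewrites the leftmost variable, which happens to hold for this particular derivation.

```python
# Productions numbered as on the slide: number -> (head, body).
PRODUCTIONS = {
    1: ("E", "I"), 2: ("E", "E+E"), 3: ("E", "E*E"), 4: ("E", "(E)"),
    5: ("I", "a"), 6: ("I", "b"), 7: ("I", "Ia"), 8: ("I", "Ib"),
    9: ("I", "I0"), 10: ("I", "I1"),
}

def apply_leftmost(form: str, number: int) -> str:
    """Rewrite the leftmost variable of `form` with production `number`."""
    head, body = PRODUCTIONS[number]
    pos = next((i for i, c in enumerate(form) if c in "EI"), None)
    if pos is None or form[pos] != head:
        raise ValueError(f"production {number} does not apply to {form!r}")
    return form[:pos] + body + form[pos + 1:]

def derive(numbers, start="E"):
    """Return every sentential form produced by applying `numbers` in order."""
    forms = [start]
    for n in numbers:
        forms.append(apply_leftmost(forms[-1], n))
    return forms

for form in derive([4, 2, 1, 8, 5, 1, 9, 8, 5]):
    print(form)
```

Feeding in the production numbers 4, 2, 1, 8, 5, 1, 9, 8, 5 reproduces the nine steps and ends in (ab+ab0).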

12 Example of CFG
A CFG for syntactically correct infix algebraic expressions in the variables x, y and z.
G = ({S}, T, P, S)
T = {x, y, z, -, +, *, /, (, )}
Productions:
S → x
S → y
S → z
S → S + S
S → S - S
S → S * S
S → S / S
S → ( S )

13 Derivation using grammar
Productions:
S → x
S → y
S → z
S → S + S
S → S - S
S → S * S
S → S / S
S → ( S )

14 An informal example

15 An example of CFG

16

17 LMD and RMD
LMD (leftmost derivation): at each step we replace the leftmost variable by one of its production bodies. Such a derivation is called a leftmost derivation. We indicate that a derivation is leftmost by using the relations ⇒_lm and ⇒*_lm, for one or many steps respectively.
RMD (rightmost derivation): at each step we replace the rightmost variable by one of its production bodies. Such a derivation is called a rightmost derivation. We indicate that a derivation is rightmost by using the relations ⇒_rm and ⇒*_rm, for one or many steps respectively.

18 Leftmost Derivation
CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
Leftmost derivation of a*(a+b00):
E ⇒_lm E*E ⇒_lm I*E ⇒_lm a*E ⇒_lm a*(E) ⇒_lm a*(E+E) ⇒_lm a*(I+E) ⇒_lm a*(a+E) ⇒_lm a*(a+I) ⇒_lm a*(a+I0) ⇒_lm a*(a+I00) ⇒_lm a*(a+b00)

19 Rightmost Derivation
CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
Rightmost derivation of a*(a+b00):
E ⇒_rm E*E ⇒_rm E*(E) ⇒_rm E*(E+E) ⇒_rm E*(E+I) ⇒_rm E*(E+I0) ⇒_rm E*(E+I00) ⇒_rm E*(E+b00) ⇒_rm E*(I+b00) ⇒_rm E*(a+b00) ⇒_rm I*(a+b00) ⇒_rm a*(a+b00)
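
Whether a listed sequence of sentential forms really is a leftmost or a rightmost derivation can be checked step by step. The following is a small sketch under the assumption that E and I are the only variables; the names BODIES, step_ok and is_derivation are hypothetical helpers made up for this example.

```python
# Production bodies of the expression grammar, grouped by head variable.
BODIES = {
    "E": ["I", "E+E", "E*E", "(E)"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}

def step_ok(before: str, after: str, mode: str) -> bool:
    """True if `after` results from rewriting the leftmost ("lm") or
    rightmost ("rm") variable of `before` with one production body."""
    positions = [i for i, c in enumerate(before) if c in BODIES]
    if not positions:
        return False
    pos = positions[0] if mode == "lm" else positions[-1]
    return any(before[:pos] + body + before[pos + 1:] == after
               for body in BODIES[before[pos]])

def is_derivation(forms, mode: str) -> bool:
    """Check every consecutive pair of sentential forms."""
    return all(step_ok(a, b, mode) for a, b in zip(forms, forms[1:]))

LMD = ["E", "E*E", "I*E", "a*E", "a*(E)", "a*(E+E)", "a*(I+E)", "a*(a+E)",
       "a*(a+I)", "a*(a+I0)", "a*(a+I00)", "a*(a+b00)"]
RMD = ["E", "E*E", "E*(E)", "E*(E+E)", "E*(E+I)", "E*(E+I0)", "E*(E+I00)",
       "E*(E+b00)", "E*(I+b00)", "E*(a+b00)", "I*(a+b00)", "a*(a+b00)"]

print(is_derivation(LMD, "lm"), is_derivation(RMD, "rm"))
```

Both sequences derive the same string a*(a+b00); they differ only in the order in which the variables are rewritten.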

20 The Language of a Grammar
If G = (V,T,P,S) is a CFG, the language of G, denoted L(G), is the set of terminal strings that have derivations from the start symbol. That is,
$L(G) = \{\, w \in T^{*} \mid S \Rightarrow_{G}^{*} w \,\}$
If a language L is the language of some context-free grammar, then L is said to be a context-free language, or CFL.
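
To make the definition of L(G) concrete, the short strings of a language can be enumerated by exploring leftmost derivations breadth first. This is a sketch with assumptions: the function name language_up_to is invented here, uppercase letters are treated as variables, and the search is only guaranteed to stop for grammars in which every production body that contains a variable also contains a terminal (true for the palindrome grammar used below, false for a grammar with a rule such as S → SS).

```python
from collections import deque

def language_up_to(productions, start, max_len):
    """Terminal strings w with start =>* w and len(w) <= max_len.
    `productions` maps each variable to its list of bodies ("" is ε)."""
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:              # no variable left: a string of L(G)
            found.add(form)
            continue
        for body in productions[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            n_terminals = sum(1 for c in new if not c.isupper())
            if n_terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

PAL = {"P": ["", "0", "1", "0P0", "1P1"]}
print(sorted(language_up_to(PAL, "P", 3), key=lambda w: (len(w), w)))
```

Pruning on the number of terminals is safe because terminals are never removed once they appear in a sentential form.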

21 Parse Tree
A parse tree is a tree representation of derivations that shows clearly how the symbols of a terminal string are grouped into substrings.
When used in a compiler, the parse tree is the data structure that represents the source program.
In a compiler, the tree structure of the source program facilitates the translation of the source program into executable code by allowing natural, recursive functions to perform this translation.
It is a graphical representation of a derivation.

22 Constructing Parse Tree
Let us fix on a grammar G = (V,T,P,S). The parse trees for G are trees with the following conditions:
1. Each interior node is labeled by a variable in V.
2. Each leaf is labeled by either a variable, a terminal, or ε. If a leaf is labeled ε, it must be the only child of its parent.
3. If an interior node is labeled A, and its children are labeled X1, X2, ..., Xk respectively, from the left, then A → X1X2...Xk is a production.

23 Parse Tree Example
A parse tree showing the derivation of I+E from E.
      E
    / | \
   E  +  E
   |
   I

24 Parse Tree Example (continued)
A parse tree showing the derivation P ⇒* 0110.
      P
    / | \
   0  P  0
    / | \
   1  P  1
      |
      ε
Productions:
1. P → ε
2. P → 0
3. P → 1
4. P → 0P0
5. P → 1P1

25 The Yield of a Parse Tree
If we look at the leaves of any parse tree and concatenate them from the left, we get a string called the yield of the parse tree, which is always a string that is derived from the root variable.
Of special importance are the parse trees such that:
1. The yield is a terminal string. That is, all leaves are labeled either with a terminal or with ε.
2. The root is labeled by the start symbol.
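
A parse tree and its yield are easy to model directly. The sketch below is illustrative only: the Node class and the functions tree_yield and is_parse_tree are names chosen for this example, and the tree built at the end is the one for 0110 in the palindrome grammar.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                       # a variable, a terminal, or "ε"
    children: list = field(default_factory=list)

PAL = {"P": ["", "0", "1", "0P0", "1P1"]}   # "" stands for the body ε

def tree_yield(node: Node) -> str:
    """Concatenate the leaf labels from left to right; ε contributes nothing."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)

def is_parse_tree(node: Node, productions) -> bool:
    """Check that every interior node and its children form a production."""
    if not node.children:
        return True
    body = "".join("" if c.label == "ε" else c.label for c in node.children)
    return (node.label in productions
            and body in productions[node.label]
            and all(is_parse_tree(c, productions) for c in node.children))

tree = Node("P", [Node("0"),
                  Node("P", [Node("1"), Node("P", [Node("ε")]), Node("1")]),
                  Node("0")])
print(tree_yield(tree), is_parse_tree(tree, PAL))
```

A leaf labeled by a variable is allowed by the definition, so tree_yield simply returns that label; the yield is then a sentential form rather than a terminal string.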

26 Parse tree showing a*(a+b00)
        E
      / | \
     E  *  E
     |   / | \
     I  (  E  )
     |   / | \
     a  E  +  E
        |     |
        I     I
        |    / \
        a   I   0
           / \
          I   0
          |
          b

27 Parse tree showing ( x + y ) * x - z * y / ( x + x )

28 Parse tree showing The man read this book

29 Inference, Derivations, and Parse Trees
[Figure: diagram relating recursive inference, parse tree, derivation, leftmost derivation, and rightmost derivation.]

30 Self Study
Theorems 5.12, 5.14, 5.18

31 Ambiguous Grammar
Ideally, a grammar uniquely determines a structure for each string in its language. However, not every grammar provides unique structures.
When a grammar fails to provide a unique structure for some string, it is known as an ambiguous grammar.
In an ambiguous grammar, some string has more than one parse tree (equivalently, more than one leftmost derivation).

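32 Ambiguous Grammar example
Let us consider a CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
Expression: a + a*a
Leftmost derivation: E ⇒ E+E ⇒ I+E ⇒ a+E ⇒ a+E*E ⇒ a+I*E ⇒ a+a*E ⇒ a+a*I ⇒ a+a*a
Rightmost derivation: E ⇒ E*E ⇒ E*I ⇒ E*a ⇒ E+E*a ⇒ E+I*a ⇒ E+a*a ⇒ I+a*a ⇒ a+a*a
The first derivation starts with E+E and the second with E*E, so they correspond to two different parse trees for the same string.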

33 Parse tree for the LMD
        E
      / | \
     E  +  E
     |   / | \
     I  E  *  E
     |  |     |
     a  I     I
        |     |
        a     a
Fig: Tree with yield a+a*a

34 Parse tree for the RMD
           E
         / | \
        E  *  E
      / | \   |
     E  +  E  I
     |     |  |
     I     I  a
     |     |
     a     a
Fig: Tree with yield a+a*a
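
The ambiguity can also be observed by counting parse trees. The sketch below is one possible way to do this for the grammar E → I | E+E | E*E | (E) with identifiers over {a,b,0,1}; the function names count_I and count_E are made up for this example, and the approach (trying every split point, with memoization) is chosen for clarity rather than efficiency.

```python
from functools import lru_cache

def count_I(s: str) -> int:
    """Identifiers have exactly one parse tree: a or b, then letters or digits."""
    ok = bool(s) and s[0] in "ab" and all(c in "ab01" for c in s)
    return 1 if ok else 0

@lru_cache(maxsize=None)
def count_E(s: str) -> int:
    """Number of distinct parse trees with root E and yield s."""
    total = count_I(s)                                   # E -> I
    for k, c in enumerate(s):
        if c in "+*":                                    # E -> E+E or E -> E*E
            total += count_E(s[:k]) * count_E(s[k + 1:])
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":     # E -> (E)
        total += count_E(s[1:-1])
    return total

print(count_E("a+a*a"))
```

A count greater than one for any string is enough to show that the grammar is ambiguous; parentheses remove the choice, so (a+a)*a has a single tree.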

35 Removing Ambiguity from Grammar
There are two causes of ambiguity in the expression grammar:
1. The precedence of operators is not respected.
2. A sequence of identical operators can group either from the left or from the right.

36 Prof. Busch - LSU Two derivation trees for

37 take

38 Good Tree Bad Tree Compute expression result using the tree

39 Removing Ambiguity from Grammar
The solution to the problem of enforcing precedence is to introduce several different variables.
1. A factor is an expression that cannot be broken apart by any adjacent operator. The only factors in our expression language are:
   i. Identifiers: it is not possible to separate the letters of an identifier by attaching an operator.
   ii. Any parenthesized expression, no matter what appears inside the parentheses.
2. A term is an expression that cannot be broken by the '+' operator. A term is a product of one or more factors.
3. An expression is a sum of one or more terms.

40 Removing Ambiguity from Grammar
Let us consider a CFG:
E → I | E+E | E*E | (E)
I → a | b | Ia | Ib | I0 | I1
An unambiguous expression grammar:
I → a | b | Ia | Ib | I0 | I1
F → I | (E)
T → F | T*F
E → T | E+T

41 Unambiguous Grammar example
CFG:
I → a | b | Ia | Ib | I0 | I1
F → I | (E)
T → F | T*F
E → T | E+T
Expression: a + a*a
Leftmost derivation: E ⇒ E+T ⇒ T+T ⇒ F+T ⇒ I+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+I*F ⇒ a+a*F ⇒ a+a*I ⇒ a+a*a

42 Inherent Ambiguity
Topic 5.4.4
$L = \{a^n b^n c^m d^m \mid n \ge 1,\ m \ge 1\} \cup \{a^n b^m c^m d^n \mid n \ge 1,\ m \ge 1\}$

43 Unambiguous Grammar example
        E
      / | \
     E  +  T
     |   / | \
     T  T  *  F
     |  |     |
     F  F     I
     |  |     |
     I  I     a
     |  |
     a  a
Fig: Tree with yield a+a*a
E ⇒ E+T ⇒ T+T ⇒ F+T ⇒ I+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+I*F ⇒ a+a*F ⇒ a+a*I ⇒ a+a*a
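
The layering into expressions, terms and factors maps naturally onto a hand-written parser. The sketch below is an assumption-laden illustration, not something taken from the slides: the function parse and its helpers are invented names, and the left-recursive rules E → E+T and T → T*F are replaced by loops, a standard recursive-descent substitution that keeps the same left-to-right grouping. The result is a nested tuple (operator, left, right).

```python
import re

def parse(text: str):
    """Parse text with E -> T | E+T, T -> F | T*F, F -> I | (E)."""
    tokens = re.findall(r"[ab][ab01]*|[+*()]", text)
    if "".join(tokens) != text:
        raise ValueError("unexpected character in input")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def expr():                      # E: a sum of one or more terms
        node = term()
        while peek() == "+":
            take()
            node = ("+", node, term())
        return node

    def term():                      # T: a product of one or more factors
        node = factor()
        while peek() == "*":
            take()
            node = ("*", node, factor())
        return node

    def factor():                    # F: an identifier or a parenthesized E
        tok = take()
        if tok == "(":
            node = expr()
            if take() != ")":
                raise ValueError("expected )")
            return node
        if tok in ("+", "*", ")"):
            raise ValueError(f"unexpected {tok!r}")
        return tok

    tree = expr()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return tree

print(parse("a+a*a"))
```

Because '*' is handled one level below '+', the product a*a is grouped first, which is the single structure the unambiguous grammar allows for a+a*a.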

44 Example of CFG
A CFG that generates prefix expressions with operands x and y and binary operators +, -, *.
Productions:
E → x
E → y
E → +EE
E → -EE
E → *EE
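
Each production of the prefix grammar starts with a different terminal, so a recognizer can decide which production to use by looking at one symbol. The following is a minimal sketch; parse_prefix and is_prefix_expr are names invented for this example.

```python
def parse_prefix(s: str, i: int = 0) -> int:
    """Return the index just past one E starting at s[i], or -1 on failure."""
    if i >= len(s):
        return -1
    if s[i] in "xy":                 # E -> x | y
        return i + 1
    if s[i] in "+-*":                # E -> +EE | -EE | *EE
        j = parse_prefix(s, i + 1)   # first operand
        return parse_prefix(s, j) if j != -1 else -1
    return -1

def is_prefix_expr(s: str) -> bool:
    """True if the whole string is derived from E."""
    return parse_prefix(s) == len(s)

print(is_prefix_expr("*+xyx"), is_prefix_expr("+x"))
```

The string *+xyx corresponds to the infix expression (x+y)*x, while +x is rejected because the operator is missing its second operand.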

45 Example of CFG
Design a CFG for the set of all strings with an equal number of a's and b's.
Productions:
S → aSbS | bSaS | ε
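
One way to gain confidence in this grammar is to compare it with a direct count on all short strings. The sketch below assumes nothing beyond the three productions; the function name derives is hypothetical, and the check is a brute-force experiment on strings up to length 8, not a proof.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def derives(s: str) -> bool:
    """True if S =>* s using S -> aSbS | bSaS | ε."""
    if s == "":                                   # S -> ε
        return True
    partner = {"a": "b", "b": "a"}.get(s[0])
    if partner is None:
        return False
    # s = x u y v with x = s[0], y = partner, and u, v both derived from S
    return any(s[k] == partner and derives(s[1:k]) and derives(s[k + 1:])
               for k in range(1, len(s)))

agree = all(
    derives("".join(w)) == (w.count("a") == w.count("b"))
    for n in range(9)
    for w in product("ab", repeat=n)
)
print(agree)
```

The memoized search tries every position for the matching symbol, which is exactly the freedom the productions aSbS and bSaS give.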

46 Example of CFG
Design a CFG over {a, b} such that no string in L(G) has ba as a substring.
Productions:
S → aS | Sb | a | b

47 Example of CFG
Design a CFG for the regular expression 0*1(0+1)*.
Productions:
S → A1B
A → 0A | ε
B → 0B | 1B | ε

48 Example of CFG

49 Application of CFG
CFGs were originally conceived as a way to describe natural languages.
Two of their uses in computing:
1. Parsers
2. Markup languages (HTML, XML)
Parsers:
A parse tree is a graphical representation of a derivation.
Parsing is the process of determining whether a string of tokens can be generated by a grammar.
A compiler may not actually construct a parse tree. However, a parser must be capable of constructing such a tree.
A parser can be constructed for any grammar. The CFG is an essential concept for the implementation of parsers.

50 YACC Parser Generator
Tools such as YACC take a CFG as input and produce a parser.
Exp : Id {…}
    | Exp '+' Exp {…}
    | Exp '*' Exp {…}
    | '(' Exp ')' {…}
    ;
Id  : 'a' {…}
    | 'b' {…}
    | Id 'a' {…}
    | Id 'b' {…}
    | Id '0' {…}
    | Id '1' {…}
    ;

51 Rules for YACC Parser Generator
Rules:
1. A colon is used as the production symbol, in place of →.
2. All productions with a given head are grouped together, with their bodies separated by the vertical bar.
3. The list of bodies for a given head ends with a semicolon.
4. Terminals are quoted with single quotes.
5. Variable names are unquoted.

52 Markup Language
There is a family of languages called markup languages. The strings in these languages are documents with certain marks (called tags) in them.
Tags tell us something about the semantics of the various strings within the document.
The things I hate:
1. ABC xyz
2. AB ABC XYZ xy
a) The text as viewed
The things I hate ABC xyz AB ABC XYZ xy
b) The HTML source
Tags used:
EM → emphasized string
P → paragraph
OL → ordered list
LI → list item

53 Part of an HTML grammar
1. Char → a | A | …
2. Text → ε | Char Text
3. Doc → ε | Element Doc
4. Element → Text | <EM> Doc </EM> | <P> Doc | <OL> List </OL> | …
5. ListItem → <LI> Doc
6. List → ε | ListItem List

54 Thank You

