Published by Allan Carroll. Modified over 9 years ago.
1 Alphabets: An Alphabet is a finite set of symbols. We will usually use Σ to denote the alphabet of input symbols or “terminal characters.”
String: A String (or “sentence” or “word”) is a finite sequence of symbols.
The length of x is |x|.
The empty string is ε; |ε| = 0.
Concatenation: xy or x·y. εx = xε = x. x^2 = xx.
Σ+ : the set of all strings over Σ of length 1 or more.
Σ* : the set of all strings over Σ of length 0 or more. Σ* = Σ+ ∪ {ε}.
x is a Prefix of y if there exists a z such that y = xz.
x is a Proper Prefix of y if x is a prefix of y and z ≠ ε (that is, x ≠ y).
2 A Language is some subset of Σ*.
Terminals are members of Σ.
Another set of symbols (alphabet) is the set of non-terminals (or variables or syntactic categories), which represent strings of terminals.
Vocabulary symbols are terminals or non-terminals.
Concatenation of languages: L · M = {xy | x ∈ L, y ∈ M}
L^i = concatenation of L with itself i times, L^0 = {ε}
L ∪ M = {x | x ∈ L or x ∈ M}
The closure of L is L* = ∪_{i=0}^{∞} L^i = L^0 ∪ L^1 ∪ L^2 ∪ ...
The positive closure of L is L+ = ∪_{i=1}^{∞} L^i = L^1 ∪ L^2 ∪ ...
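The set operations on this slide can be tried out on small finite languages. The following is a minimal sketch, not part of the original slides; the helper names `concat`, `power`, and `closure_up_to` are illustrative, and since L* is infinite in general, only a finite approximation L^0 ∪ ... ∪ L^n is computed.

```python
def concat(L, M):
    """L · M = {xy | x in L, y in M}."""
    return {x + y for x in L for y in M}

def power(L, i):
    """L^i, with L^0 = {""} (the language holding only the empty string)."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def closure_up_to(L, n):
    """Finite approximation of L*: L^0 ∪ L^1 ∪ ... ∪ L^n."""
    out = set()
    for i in range(n + 1):
        out |= power(L, i)
    return out

if __name__ == "__main__":
    print(sorted(concat({"0", "01"}, {"1", ""})))
    print(sorted(closure_up_to({"a"}, 3)))
```

Dropping L^0 from the union in `closure_up_to` gives the same kind of approximation for the positive closure L+.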
3 A Production rule is written as α → β or α ::= β.
A phrase structure grammar G is a quadruple (N, Σ, P, S), where
N: finite set of non-terminals.
Σ: finite alphabet of terminals.
P: a set of productions.
S: the start symbol, S ∈ N.
Example: G1 = ({A, S}, {0,1}, P, S) where P is
S → 0A1
0A → 00A1
A → ε
If γαδ is a string in (N ∪ Σ)* and α → β is a production in G, then we say γαδ directly derives γβδ and write γαδ ⇒ γβδ.
⇒+ : derives in one or more steps.
⇒* : derives in zero or more steps.
4 If S ⇒* α then α is called a Sentential Form of G.
If S ⇒* x then x is called a Sentence of G.
The language generated by G, written L(G), is {x | x ∈ Σ* and S ⇒* x}
Now, G1 = ({A, S}, {0,1}, P, S) where P is
S → 0A1
0A → 00A1
A → ε
therefore L(G1) = {0^n 1^n | n > 0}
CONVENTIONS:
Terminals: a, b, c, d, 0, 1, +, (, ), begin
Non-terminals: A, B, C, D, S
Vocabulary symbols: U, V, W, X, Y, Z
Strings of terminals: u, v, w, x, y, z
Strings of vocabulary symbols: α, β, γ
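The claim L(G1) = {0^n 1^n | n > 0} can be checked for short sentences by applying the productions mechanically. The sketch below is an illustration rather than part of the slides: it treats the three productions of G1 as string-rewriting rules and explores sentential forms breadth-first. The function name `sentences` and the length bound are assumptions made for the example.

```python
from collections import deque

# Productions of G1 as (left side, right side); "" stands for ε.
RULES = [("S", "0A1"), ("0A", "00A1"), ("A", "")]

def sentences(max_len, start="S"):
    """Return every sentence of G1 with at most max_len symbols."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        form = queue.popleft()
        if all(c in "01" for c in form):
            found.add(form)          # only terminals left: a sentence
            continue
        for lhs, rhs in RULES:
            pos = form.find(lhs)
            while pos != -1:
                new = form[:pos] + rhs + form[pos + len(lhs):]
                # Each form holds one non-terminal, which can only vanish,
                # so forms longer than max_len + 1 cannot yield a short sentence.
                if len(new) <= max_len + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return found

if __name__ == "__main__":
    print(sorted(sentences(6), key=len))
```

Because the production 0A → 00A1 has two symbols on its left side, G1 is not written in context-free form, which is why plain substring replacement is used here instead of a parse-tree based expansion.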
5 Type 0: Unrestricted Grammars: α → β, any.
Type 1: Context Sensitive Grammars (CSG): for all α → β, |α| ≤ |β|.
Type 2: Context Free Grammars (CFG): for all α → β, α ∈ N (i.e., A → β).
Type 3: Right (or Left)-Linear Grammars: all productions are of the form A → x or A → xB.
G2 = ({S, B, C}, {a, b, c}, P, S)
P: S → aSBC
S → abC
CB → BC
bB → bb
bC → bc
cC → cc
Which type? What is the language?
G3: S → S + S
S → S * S
S → (S)
S → a
Which type? What language?
6 An Ambiguous Grammar is one for which some sentence has two or more different parse trees.
// Show that the last grammar on the previous page is ambiguous. //
// Try to prove the following CFG is ambiguous:
S → AB | CD
A → 0A | ε
B → 1B2 | ε
C → 2C | ε
D → 0D1 | ε //
// Try to prove the following CFG is ambiguous(?!):
S → if X then S | M
M → if X then M else S
X → X + T | T
T → T * F | F
F → (X) | a //
7 Is L = {0^n 1^n | n ≥ 1} a Context Free Language? Yes, since S → 0S1 | 01 generates L.
A RECOGNIZER is a machine (system) with a finite description that can accept a terminal string for some grammar and determine whether the string is in the language generated by the grammar.
A PARSER can, in addition, find a derivation for the string.
PARSING Alternatives: Suppose we want to parse id * id + id in G0:
E → E + T | T
T → T * F | F
F → (E) | id
[Figure: parse tree for id * id + id. The root E expands to E + T; the left E expands to T, then T * F, where T → F → id and F → id; the right T expands to F → id.]
This parse tree might be created with a left-most derivation or a right-most derivation as follows:
8 Left-most derivation (lm):
E ⇒ E + T
⇒ T + T
⇒ T * F + T
⇒ F * F + T
⇒ id * F + T
⇒ id * id + T
⇒ id * id + F
⇒ id * id + id
Try it yourself!
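The left-most derivation above can be replayed step by step. The following is a small sketch under stated assumptions: sentential forms are lists of symbols, the non-terminals are written E, T, F as in the grammar G0, and the list `CHOICES` (a name introduced here) records which production is applied at each step.

```python
NONTERMINALS = {"E", "T", "F"}

def leftmost_step(form, lhs, rhs):
    """Rewrite the leftmost non-terminal of `form` using the production lhs -> rhs."""
    for i, sym in enumerate(form):
        if sym in NONTERMINALS:
            if sym != lhs:
                raise ValueError(f"leftmost non-terminal is {sym}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")

# Productions in the order used to derive id * id + id.
CHOICES = [
    ("E", ["E", "+", "T"]),
    ("E", ["T"]),
    ("T", ["T", "*", "F"]),
    ("T", ["F"]),
    ("F", ["id"]),
    ("F", ["id"]),
    ("T", ["F"]),
    ("F", ["id"]),
]

def derive(choices, start="E"):
    """Return the list of sentential forms produced by the given choices."""
    forms = [[start]]
    for lhs, rhs in choices:
        forms.append(leftmost_step(forms[-1], lhs, rhs))
    return forms

if __name__ == "__main__":
    for form in derive(CHOICES):
        print(" ".join(form))
```

A right-most derivation could be replayed the same way by scanning the form from the right end instead.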
9 Pumping Lemma for Regular Sets
Let L be a regular set. Then there exists a constant p > 0 depending on L such that every w ∈ L with |w| ≥ p can be written as w = xyz, where 0 < |y| ≤ p and xy^k z ∈ L for all k ≥ 0.
Pf: Let M be a finite automaton that accepts L. Let p be the number of states in M. Select w ∈ L such that |w| ≥ p; then w can be written as a_1 a_2 a_3 ... a_{n-1} a_n, and M reads it along the state sequence s_0 s_1 s_2 ... s_{n-1} s_n.
Since n+1 > p, not all of the states can be distinct. Let s_i = s_j for some i < j ≤ i+p; now let x = a_1 a_2 ... a_i, y = a_{i+1} ... a_j, z = a_{j+1} ... a_n.
Now we can delete y from w or insert y any number of times and we will still go from the start state to a final state. So xy^k z ∈ L for every k ≥ 0. QED.
[Figure: the feeling of pumping, with w split into x, y, z]
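The proof is constructive: the split w = xyz comes from the first state that repeats while M reads w. The sketch below illustrates this on a hypothetical two-state DFA (even number of 1s), which is an assumption for the example and does not come from the slides; `pump_split` and `accepts` are illustrative names.

```python
# Hypothetical DFA: accepts binary strings with an even number of 1s.
DELTA = {("e", "0"): "e", ("e", "1"): "o",
         ("o", "0"): "o", ("o", "1"): "e"}
START, FINALS = "e", {"e"}

def accepts(w):
    state = START
    for ch in w:
        state = DELTA[(state, ch)]
    return state in FINALS

def pump_split(w):
    """Split w into (x, y, z) at the first repeated state s_i = s_j, i < j."""
    seen = {START: 0}            # state -> position at which it was first reached
    state = START
    for pos, ch in enumerate(w, 1):
        state = DELTA[(state, ch)]
        if state in seen:
            i = seen[state]
            return w[:i], w[i:pos], w[pos:]
        seen[state] = pos
    return None                  # w is shorter than the number of states

if __name__ == "__main__":
    x, y, z = pump_split("1001")
    print(repr(x), repr(y), repr(z))
    print([accepts(x + y * k + z) for k in range(5)])
```

Since the loop stops at the first repetition, the returned split also satisfies |xy| ≤ p, which is the stronger form of the lemma found in many textbooks.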
10 Qz1: Prove L = {0^n 1^n | n ≥ 1} is NOT regular.
Pf: Assume L is regular. Pick a “large enough” string in L, say w = 0^p 1^p. Now, show that no substring y of w can be pumped. One of the following must be true.
(1) y = 0^i for some i ≥ 1, but xy^2 z = 0^{p+i} 1^p ∉ L
(2) y = 1^i for some i ≥ 1, but xy^2 z = 0^p 1^{p+i} ∉ L
(3) y = 0^i 1^j for some i, j ≥ 1, but xy^2 z = 0^{p-i} 0^i 1^j 0^i 1^j 1^{p-j} ∉ L
Therefore, L is NOT regular. QED.
11 Qz2: Prove L = {0^p | p is a prime number} is NOT regular.
Pf: Assume L is regular [thus any sufficiently long string of L is pumpable].
Let w = xyz where x = 0^p, y = 0^q, z = 0^r, p, r ≥ 0, q > 0. Then 0^{p+nq+r} ∈ L for each n ≥ 0, that is, p+nq+r is prime for each n ≥ 0.
But this is impossible: let n = p+2q+r+2, then p+nq+r = (q+1)(p+2q+r), which is a product of two natural numbers each greater than 1.
So, if n = p+2q+r+2, then p+nq+r is NOT prime. This contradicts the assumption that xy^n z ∈ L for each n ≥ 0. QED.
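The identity p+nq+r = (q+1)(p+2q+r) for n = p+2q+r+2 follows by expanding p + (p+2q+r+2)q + r = (p+2q+r) + q(p+2q+r). A quick numeric check, written as a sketch with illustrative helper names, confirms it for small values:

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def pumped_length(p, q, r):
    """Return (n, p + n*q + r) for the choice n = p + 2q + r + 2."""
    n = p + 2 * q + r + 2
    return n, p + n * q + r

if __name__ == "__main__":
    for p in range(6):
        for q in range(1, 6):        # q > 0 as required by the lemma
            for r in range(6):
                n, length = pumped_length(p, q, r)
                assert length == (q + 1) * (p + 2 * q + r)
                assert not is_prime(length)
    print("identity holds and every pumped length is composite")
```

Both factors exceed 1 because q ≥ 1 gives q+1 ≥ 2 and p+2q+r ≥ 2.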
12 Nondeterministic Finite Automata
[Figure: NFA transition diagram with states q0, q1, q2, q3, q4, a start arrow, and edges labeled 0,1 / 0 / 1 / 1]
It is Greek to you?
Δ, δ: delta
Σ, σ: sigma
τ: tau
Γ, γ: gamma
Φ, φ: phi
Z, ζ: zeta
13 Def: DFA: M = (K, Σ, δ, S0, F) where
K = set of states,
Σ = input alphabet,
S0 ∈ K, the start state,
F ⊆ K, the set of final states, and
δ: K × Σ → K, the transition function.
Theorem: Let L be a set accepted by a nondeterministic finite state automaton. Then there exists a deterministic finite state automaton that accepts L.
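The theorem is usually proved with the subset construction, in which each DFA state is a set of NFA states. The slides do not give the construction, so the following is a sketch of that standard technique under stated assumptions: the NFA has no ε-moves, and the example automaton (strings over {0,1} ending in 01) is hypothetical and is not the NFA drawn on the earlier slide.

```python
def subset_construction(nfa_delta, start, finals, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        for a in alphabet:
            target = frozenset(t for s in current
                               for t in nfa_delta.get((s, a), ()))
            dfa_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is final if it contains at least one final NFA state.
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, start_set, dfa_finals

def dfa_accepts(dfa_delta, start, finals, w):
    state = start
    for ch in w:
        state = dfa_delta[(state, ch)]
    return state in finals

# Hypothetical NFA: guess when the final "01" begins.
NFA = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
DFA = subset_construction(NFA, "q0", {"q2"}, "01")

if __name__ == "__main__":
    for w in ["01", "1101", "10", ""]:
        print(repr(w), dfa_accepts(*DFA, w))
```

Only the subsets reachable from {q0} are built, so the resulting DFA is often much smaller than the full power set of K.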
14 Prove the grammar G with productions S → 0S1 | 01 generates exactly L = {0^n 1^n | n ≥ 1}.
PROOF: First show L(G) ⊆ L (i.e., the grammar generates only strings in L).
Inductive hypothesis: If w ∈ L(G) is derived in k steps, then w ∈ L.
Basis: k = 1, the only one-step derivation is S ⇒ 01 and 01 ∈ L.
Inductive step: assume the inductive hypothesis is true for k = k0 ≥ 1; show it is true for k = k0+1 > 1. Since k > 1 the first step must be S ⇒ 0S1 ⇒* 0x1 = w. But S ⇒* x takes exactly k0 steps, so by hypothesis x ∈ L, say x = 0^i 1^i, i ≥ 1. Then w = 0x1 = 0^{i+1} 1^{i+1} ∈ L.
Now show L ⊆ L(G) (i.e., the grammar generates all strings of L).
Inductive hypothesis: If w ∈ L and |w| = 2k, then w ∈ L(G).
Basis: k = 1, the only string in L of length 2 is 01. But S ⇒ 01, so 01 ∈ L(G).
Inductive step: assume the inductive hypothesis is true for k = k0 ≥ 1; show it is true for k = k0+1 > 1. Since the length of w is 2k, w = 0^k 1^k. By the inductive hypothesis 0^{k-1} 1^{k-1} ∈ L(G) and thus S ⇒* 0^{k-1} 1^{k-1}. So S ⇒ 0S1 ⇒* 0 0^{k-1} 1^{k-1} 1 = w is a valid derivation for w. Thus w ∈ L(G).
L(G) ⊆ L and L ⊆ L(G), so L = L(G).
15 A Push-Down Automaton (PDA) is a septuple P = (Q, Σ, Γ, δ, q0, Z, F), where
Q is a finite set of states,
Σ is a finite input alphabet,
Γ is a finite stack alphabet,
δ maps elements of Q × (Σ ∪ {ε}) × Γ into finite subsets of Q × Γ*,
q0 ∈ Q is the start state,
Z ∈ Γ is the start stack symbol,
F ⊆ Q is the set of final states.
Example: Let P = ({q0, q1, q2}, {0,1}, {Z, 0}, δ, q0, Z, {q0}) where
δ(q0, 0, Z) = {(q1, 0Z)}
δ(q1, 0, 0) = {(q1, 00)}
δ(q1, 1, 0) = {(q2, ε)}
δ(q2, 1, 0) = {(q2, ε)}
δ(q2, ε, Z) = {(q0, ε)}
L(P) = {0^n 1^n | n ≥ 1}? Why?
16 A Configuration of P is a triple (q, w, γ) ∈ Q × Σ* × Γ*.
A Move (q, aw, Zγ) ⊢ (q_i, w, γ_i γ) occurs if (q_i, γ_i) ∈ δ(q, a, Z).
An Initial Configuration is (q0, w, Z).
A string w is Accepted by P if (q0, w, Z) ⊢* (q, ε, γ) for some q ∈ F, γ ∈ Γ*.
The Language Accepted by P, L(P), is the set of all strings P accepts.
Continuing the example from the previous page:
(q0, 0011, Z) ⊢ (q1, 011, 0Z) ⊢ (q1, 11, 00Z) ⊢ (q2, 1, 0Z) ⊢ (q2, ε, Z) ⊢ (q0, ε, ε)
(On the original slide a stand-in symbol is temporarily used for the move relation.)
Now, try to build a PDA that accepts L = {ww^R | w ∈ {0, 1}+}.
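The move relation can be simulated directly by searching over configurations. The sketch below encodes the example PDA P from the previous slide and accepts by final state, following the definition above. The function name `accepts` and the breadth-first strategy are choices made for this illustration, and the search assumes the PDA has no ε-loop that grows the stack forever.

```python
from collections import deque

# δ of the example PDA; the stack is a string whose first character is the top.
DELTA = {
    ("q0", "0", "Z"): {("q1", "0Z")},
    ("q1", "0", "0"): {("q1", "00")},
    ("q1", "1", "0"): {("q2", "")},
    ("q2", "1", "0"): {("q2", "")},
    ("q2", "", "Z"): {("q0", "")},
}
FINALS = {"q0"}

def accepts(w, delta=DELTA, start="q0", bottom="Z", finals=FINALS):
    """True if (start, w, bottom) reaches (q, ε, γ) with q a final state."""
    first = (start, w, bottom)
    seen = {first}
    queue = deque([first])
    while queue:
        q, rest, stack = queue.popleft()
        if rest == "" and q in finals:
            return True
        if not stack:
            continue                     # no move is possible on an empty stack
        top, below = stack[0], stack[1:]
        moves = [(t, rest, push + below)
                 for t, push in delta.get((q, "", top), ())]
        if rest:
            moves += [(t, rest[1:], push + below)
                      for t, push in delta.get((q, rest[0], top), ())]
        for cfg in moves:
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False

if __name__ == "__main__":
    for w in ["0011", "01", "001", "0101", ""]:
        print(repr(w), accepts(w))
```

One observation that bears on the "Why?" question: since q0 is both the start state and the only final state, the initial configuration (q0, ε, Z) is already accepting under this definition, so the simulator also reports the empty string as accepted.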
17 δ(q0, 0, Z) = {(q0, 0Z)}
δ(q0, 1, Z) = {(q0, 1Z)}
δ(q0, 0, 1) = {(q0, 01)}
δ(q0, 1, 0) = {(q0, 10)}
δ(q0, 0, 0) = {(q0, 00), (q1, ε)}
δ(q0, 1, 1) = {(q0, 11), (q1, ε)}
δ(q1, 0, 0) = {(q1, ε)}
δ(q1, 1, 1) = {(q1, ε)}
δ(q1, ε, Z) = {(q1, ε)}
Two items are included in δ(q0, 0, 0) and in δ(q0, 1, 1), thus it is a Nondeterministic PDA.
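The nondeterministic choice (keep pushing, or guess the middle and start popping) can be explored by trying every branch. The sketch below encodes the transitions listed above. The slide does not state the acceptance condition, so acceptance by empty stack is an assumption here, suggested by the last rule popping Z; `accepts_empty_stack` is an illustrative name.

```python
from collections import deque

# Transitions for {w w^R}; the stack is a string whose first character is the top.
DELTA = {
    ("q0", "0", "Z"): {("q0", "0Z")},
    ("q0", "1", "Z"): {("q0", "1Z")},
    ("q0", "0", "1"): {("q0", "01")},
    ("q0", "1", "0"): {("q0", "10")},
    ("q0", "0", "0"): {("q0", "00"), ("q1", "")},
    ("q0", "1", "1"): {("q0", "11"), ("q1", "")},
    ("q1", "0", "0"): {("q1", "")},
    ("q1", "1", "1"): {("q1", "")},
    ("q1", "", "Z"): {("q1", "")},
}

def accepts_empty_stack(w, start="q0", bottom="Z"):
    """Assumed acceptance: all input consumed and the stack emptied."""
    first = (start, w, bottom)
    seen = {first}
    queue = deque([first])
    while queue:
        q, rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if not stack:
            continue
        top, below = stack[0], stack[1:]
        moves = [(t, rest, push + below)
                 for t, push in DELTA.get((q, "", top), ())]
        if rest:
            moves += [(t, rest[1:], push + below)
                      for t, push in DELTA.get((q, rest[0], top), ())]
        for cfg in moves:                # both branches of a choice are queued
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False

if __name__ == "__main__":
    for w in ["0110", "00", "01", "010", ""]:
        print(repr(w), accepts_empty_stack(w))
```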
18 A Deterministic PDA is one in which
(1) ∀q ∈ Q, Z ∈ Γ, whenever δ(q, ε, Z) ≠ ∅, then δ(q, a, Z) = ∅ ∀a ∈ Σ.
(2) ∀q ∈ Q, a ∈ (Σ ∪ {ε}), Z ∈ Γ, δ(q, a, Z) contains at most one element.
Converting a CFG to a PDA:
For each production A → α, make (q, α) ∈ δ(q, ε, A).
For each a ∈ Σ, make (q, ε) ∈ δ(q, a, a).
How to show whether some specific language L is a CFL?
1. If L is NOT a CFL, then we may prove it by the pumping lemma for CFLs.
2. If L is a CFL, then we may prove it by
(a) giving a deterministic/nondeterministic pushdown automaton for L (but sometimes a DPDA does not exist, since DPDAs accept only a subset of all CFLs), or
(b) giving a context-free grammar for L.
19 Theorem: For any CFL L, there exists a constant p depending on L such that ∀z ∈ L, where |z| ≥ p, z may be written as z = uvwxy such that
1. |vx| ≥ 1 (i.e., v and x are not both ε)
2. |vwx| ≤ p
3. uv^i w x^i y ∈ L ∀i ≥ 0.
{The proof is similar to the one for regular languages.}
Prove L = {a^i b^i c^i | i ≥ 0} is NOT a CFL.
Proof: If it were, by the pumping lemma for CFLs, ∃p > 0 such that every z ∈ L with |z| ≥ p can be split as above. Let z = a^p b^p c^p = uvwxy such that
(i) |vx| ≥ 1
(ii) |vwx| ≤ p
(iii) uv^i w x^i y ∈ L ∀i ≥ 0.
20 But
(1) suppose vwx = a^j, j ≤ p, then uwy = a^{p-l} b^p c^p ∉ L, where l = |vx|; since |vx| ≠ 0, l ≠ 0. This contradicts (iii), which requires uwy ∈ L when i = 0. The same argument holds for vwx = b^j or vwx = c^j.
(2) suppose vwx = a^j b^k, j, k ≤ p, then uwy = a^{p-l'} b^{p-l''} c^p ∉ L; since |vx| ≠ 0, either l' ≠ 0 or l'' ≠ 0 or both. This contradicts (iii), which requires uwy ∈ L when i = 0. The same argument holds for vwx = b^j c^k.
(3) suppose vwx = a^j b^p c^k with j, k ≥ 1; but |vwx| ≤ p, so vwx cannot contain both a's and c's.
Thus, there are no pumpable substrings. It follows that L cannot be context free.
21 Begin by extending to FIRST_k and FOLLOW_k:
FIRST_k(α) = { w | (|w| < k and α ⇒* w) or (|w| = k and α ⇒* wx for some x) }
The domain of FIRST_k is extended to sets of strings in the natural way.
FOLLOW_k(A) = { w | S ⇒* αAβ and w ∈ FIRST_k(β) }
G is LL(k) for some fixed k iff whenever there are two leftmost derivations
S ⇒* wAα ⇒ wβα ⇒* wx and
S ⇒* wAα ⇒ wγα ⇒* wy
and β ≠ γ, then FIRST_k(x) ≠ FIRST_k(y).
22 S → Abc | aAcb
A → ε | b | c
For left-sentential form S:
FIRST_1(Abc) = { b, c }
FIRST_1(aAcb) = { a }
For left-sentential form Abc:
FIRST_1(εbc) = { b }
FIRST_1(bbc) = { b }
FIRST_1(cbc) = { c }
FIRST_2(εbc) = { bc }
FIRST_2(bbc) = { bb }
FIRST_2(cbc) = { cb }
In left-sentential form Acb:
FIRST_2(εcb) = { cb }
FIRST_2(bcb) = { bc }
FIRST_2(ccb) = { cc }
No multiply defined entries, so we know this is an LL(2) grammar.
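The FIRST_k sets of the non-terminals can be computed by iterating to a fixed point, concatenating FIRST_k sets symbol by symbol and truncating to k characters. The following is a minimal sketch of that standard approach, not taken from the slides. It assumes single-character symbols, a grammar given as a dict from non-terminal to right-hand sides, and "" standing for ε; `first_k` is an illustrative name.

```python
def first_k(grammar, k):
    """Return {A: FIRST_k(A)} for every non-terminal A of the grammar."""
    first = {A: set() for A in grammar}

    def first_of_string(alpha):
        # k-truncated concatenation of the FIRST_k sets of the symbols of alpha.
        result = {""}
        for X in alpha:
            options = first[X] if X in grammar else {X}
            result = {(u + v)[:k] for u in result for v in options}
        return result

    changed = True
    while changed:                       # repeat until no set grows any more
        changed = False
        for A, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs) - first[A]
                if new:
                    first[A] |= new
                    changed = True
    return first

# Grammar of this slide: S → Abc | aAcb, A → ε | b | c
GRAMMAR = {"S": ["Abc", "aAcb"], "A": ["", "b", "c"]}

if __name__ == "__main__":
    for k in (1, 2):
        result = first_k(GRAMMAR, k)
        print(k, {A: sorted(ws) for A, ws in result.items()})
```

For k = 2 the two alternatives of S give {bc, bb, cb} and {ac, ab}, which are disjoint, in line with the conclusion that two symbols of lookahead suffice.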
23
      FIRST_1    FOLLOW_1   FIRST_2              FOLLOW_2
S     a, b, c    $          ab, ac, bb, bc, cb   $$
A     ε, b, c    b, c       ε, b, c              bc, cb
Some grammars are not LL(k) for any k. For instance,
S → A | B
A → aAb | 0
B → aBbb | 1
The grammar G with L(G) = {a^n 0 b^n | n ≥ 0} ∪ {a^n 1 b^{2n} | n ≥ 0} is not LL(k).
Assume it were. S ⇒ A ⇒* a^n 0 b^n and S ⇒ B ⇒* a^n 1 b^{2n} for any n.
Let k = 2m, m ∈ I+, then FIRST_k(a^{2m} 0 b^{2m}) = FIRST_k(a^{2m} 1 b^{4m}), but A ≠ B.
Since k is arbitrary, G is not LL(k) for any k.