Published by Alondra Dormer. Modified over 10 years ago.
CSCI 3130: Formal Languages and Automata Theory
Tutorial 5
Hung Chun Ho
Office: SHB 1026
Department of Computer Science & Engineering
Agenda
Cocke-Younger-Kasami (CYK) algorithm
  Parsing CFG in normal form
Pushdown Automata (PDA)
  Design
CYK Algorithm
Bottom-up Parsing for normal form
Cocke-Younger-Kasami Algorithm
Used to parse a context-free grammar in Chomsky normal form (or simply normal form).
Every production is of type:
  X → YZ
  X → a
  S → ε
Normal Form Example:
  S → AB
  A → CC | a | c
  B → BC | b
  C → CB | BA | c
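The three production shapes above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the `(head, body)` encoding, the name `is_normal_form`, and the convention that uppercase letters are variables and lowercase letters are terminals are all assumptions made for illustration.

```python
# Example grammar from the slide, encoded as (head, body) pairs.
# Assumption: uppercase = variable, lowercase = terminal, "" = ε.
GRAMMAR = [
    ("S", "AB"),
    ("A", "CC"), ("A", "a"), ("A", "c"),
    ("B", "BC"), ("B", "b"),
    ("C", "CB"), ("C", "BA"), ("C", "c"),
]


def is_normal_form(rules, start="S"):
    """Return True if every rule is X -> YZ, X -> a, or start -> ε."""
    for head, body in rules:
        if len(body) == 2 and body.isupper():
            continue  # X -> YZ: exactly two variables
        if len(body) == 1 and body.islower():
            continue  # X -> a: exactly one terminal
        if body == "" and head == start:
            continue  # S -> ε: only allowed for the start variable
        return False
    return True


print(is_normal_form(GRAMMAR))
```

A unit production such as `("S", "A")` or a mixed body such as `("S", "aB")` fails the check, which is why such rules have to be removed before running CYK.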
CYK Algorithm - Idea
= Algorithm 2 in Lecture Note (10L8.pdf)
Idea: Bottom-up parsing
Algorithm: Given a string s of length N:
  For k = 1 to N:
    For every substring of length k:
      Determine what variable(s) can derive it
CYK Algorithm - Example
CFG:
  S → AB
  A → CC | a | c
  B → BC | b
  C → CB | BA | c
Parse abbc
CYK Algorithm – Idea (1)
Idea: We parse the substrings of abbc in this order:
  Length-1 substrings: a, b, b, c
  Length-2 substrings: ab, bb, bc
  Length-3 substrings: abb, bbc
  Length-4 substring: abbc
Done!
CYK Algorithm – Idea (2)
Idea: Parsing of longer substrings depends on parsing of shorter substrings.
Example: abb may be decomposed as
  ab + b
  a + bb
If we know how to parse ab and b (or a and bb), then we know how to parse abb.
CYK Algorithm – Substring
Denote sub(i, j) := substring with start index = i and end index = j
Example: For abbc, sub(2,4) = bbc
This notation is not meant to complicate things; it is just for convenience in the following discussion.
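As a small illustration of the notation, here is a sketch of `sub(i, j)` with 1-based, inclusive indices, together with a generator that lists substrings shortest first (the order CYK processes them in). The helper names are hypothetical and not from the slides.

```python
def sub(s, i, j):
    """1-based, inclusive substring, matching the slide's sub(i, j)."""
    return s[i - 1:j]


def substrings_by_length(s):
    """Yield (i, j, sub(i, j)) for every substring, shortest first."""
    n = len(s)
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            yield (i, j, sub(s, i, j))


for i, j, piece in substrings_by_length("abbc"):
    print(f"sub({i},{j}) = {piece}")
```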
CYK Algorithm – Table
Each cell corresponds to a substring and stores the variables deriving that substring.
Rows: length of substring. Columns: start index of substring (a b b c).
Example: the cell for the substring of length 3 starting at index 2, i.e., sub(2,4) = bbc.
CYK Algorithm – Simulation
Base case: length = 1
The possible choices of variable(s) can be found by scanning through each production.
  sub(1,1) = a: A
  sub(2,2) = b: B
  sub(3,3) = b: B
  sub(4,4) = c: A, C
CYK Algorithm – Simulation
Loop: length = 2
For each substring of length 2:
  Decompose it into shorter substrings
  Check the cells below it
Let's parse the substring ab.
CYK Algorithm – Simulation
For sub(1,2) = ab, it can be decomposed:
  ab = a + b = sub(1,1) + sub(2,2)
  Possible choices: AB
  Scan rules: S → AB matches
  Variable: S
CYK Algorithm – Simulation
For sub(2,3) = bb, it can be decomposed:
  bb = b + b = sub(2,2) + sub(3,3)
  Possible choices: BB
  Scan rules: no suitable rules are found
  The CFG cannot parse this substring, so the cell is ∅
CYK Algorithm – Simulation
For sub(3,4) = bc, it can be decomposed:
  bc = b + c = sub(3,3) + sub(4,4)
  Possible choices: BA, BC
  Scan rules: C → BA and B → BC match
  Variables: B, C
CYK Algorithm – Simulation
For sub(1,3) = abb:
  abb = ab + b = sub(1,2) + sub(3,3)
  Possible choices: SB
  Scan rules: no suitable variables found yet
But there is another way to decompose the string.
CYK Algorithm – Simulation
For sub(1,3) = abb:
  abb = a + bb = sub(1,1) + sub(2,3)
  Possible choices: ∅
The smaller substring bb cannot be parsed, so this decomposition cannot parse abb. There is no need to scan the rules.
CYK Algorithm – Simulation
For sub(1,3) = abb:
  abb = sub(1,1) + sub(2,3) gives no valid parsing
  abb = sub(1,2) + sub(3,3) gives no valid parsing
Cannot parse: the cell is ∅
CYK Algorithm – Simulation
For sub(2,4) = bbc:
  bbc = sub(2,2) + sub(3,4): possible choices BB, BC
  bbc = sub(2,3) + sub(4,4): possible choices ∅
  Variable: B (from B → BC)
CYK Algorithm – Simulation
Finally, for sub(1,4) = abbc:
  abbc = sub(1,1) + sub(2,4): possible choice AB
  abbc = sub(1,2) + sub(3,4): possible choices SB, SC
  abbc = sub(1,3) + sub(4,4): possible choices ∅
  Variable: S (from S → AB)
This cell represents the original string, and it contains S, so abbc is in the language.
CYK Algorithm – Parse Tree
abbc is in the language! How do we obtain the parse tree?
Tracing back the derivations:
  sub(1,4) is derived using S → AB from sub(1,1) and sub(2,4)
  sub(1,1) is derived using A → a
  sub(2,4) is derived using B → BC from sub(2,2) and sub(3,4)
  …
So, also record the derivations used!
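One way to "record the used derivations" is to store, for each (substring, variable) pair, the split point and the two variables that produced it. The sketch below assumes that design; the function names and the tuple encoding of the tree are illustrative choices, not taken from the lecture notes, and only the first derivation found for each entry is kept.

```python
GRAMMAR = [
    ("S", "AB"),
    ("A", "CC"), ("A", "a"), ("A", "c"),
    ("B", "BC"), ("B", "b"),
    ("C", "CB"), ("C", "BA"), ("C", "c"),
]


def cyk_with_backpointers(word, rules):
    """Map (i, j, variable) to a terminal or to (k, left_var, right_var)."""
    n = len(word)
    back = {}
    # Base case: single characters derived by rules X -> a.
    for i in range(1, n + 1):
        for head, body in rules:
            if body == word[i - 1]:
                back[(i, i, head)] = body
    # Longer substrings: try every split point k and every rule X -> YZ.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):
                for head, body in rules:
                    if (len(body) == 2
                            and (i, k, body[0]) in back
                            and (k + 1, j, body[1]) in back):
                        # Keep only the first derivation found.
                        back.setdefault((i, j, head), (k, body[0], body[1]))
    return back


def build_tree(back, i, j, var):
    """Rebuild a parse tree as nested tuples by following back-pointers."""
    entry = back[(i, j, var)]
    if isinstance(entry, str):
        return (var, entry)  # leaf: variable derives one terminal
    k, left, right = entry
    return (var, build_tree(back, i, k, left), build_tree(back, k + 1, j, right))


back = cyk_with_backpointers("abbc", GRAMMAR)
print(build_tree(back, 1, 4, "S"))
```

For abbc this follows exactly the trace above: S → AB at the root, A → a on the left, and B → BC for bbc on the right.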
CYK Algorithm – Parse Tree
Obtained from the table:
  S → AB, where A → a and B derives bbc
  B → BC, where B → b and C derives bc
  C → BA, where B → b and A → c
CYK Algorithm – Conclusion
A bottom-up parsing algorithm
Dynamic programming: the solution of a subproblem (parsing of a substring) depends on that of smaller subproblems
Before employing the CYK algorithm, convert the grammar into normal form:
  Remove ε-productions
  Remove unit-productions
CYK Algorithm – Detailed
D = "On input w = w1 w2 … wn:
  If w = ε and S → ε is a rule, accept.
  For i = 1 to n:
    For each variable A:
      Test whether A → b is a rule, where b = wi.
      If so, place A in table(i, i).
  For l = 2 to n:
    For i = 1 to n – l + 1:
      Let j = i + l – 1.
      For k = i to j – 1:
        For each rule A → BC:
          If table(i, k) contains B and table(k+1, j) contains C, put A in table(i, j).
  If S is in table(1, n), accept. Otherwise, reject."
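The pseudocode above translates almost line for line into Python. This is a minimal sketch under stated assumptions: rules are `(head, body)` pairs with single-character symbols, the table is a dictionary keyed by `(i, j)` with 1-based indices, and the names `cyk_table` and `accepts` are made up for this example.

```python
GRAMMAR = [
    ("S", "AB"),
    ("A", "CC"), ("A", "a"), ("A", "c"),
    ("B", "BC"), ("B", "b"),
    ("C", "CB"), ("C", "BA"), ("C", "c"),
]


def cyk_table(word, rules):
    """Fill table(i, j) with the set of variables deriving sub(i, j)."""
    n = len(word)
    table = {}
    # Base case: substrings of length 1.
    for i in range(1, n + 1):
        table[(i, i)] = {head for head, body in rules if body == word[i - 1]}
    # Substrings of length 2..n, shortest first.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):  # split into sub(i, k) + sub(k+1, j)
                for head, body in rules:
                    if (len(body) == 2
                            and body[0] in table[(i, k)]
                            and body[1] in table[(k + 1, j)]):
                        cell.add(head)
            table[(i, j)] = cell
    return table


def accepts(word, rules, start="S"):
    """Accept iff the start variable derives the whole word."""
    if word == "":
        return (start, "") in rules  # only S -> ε can derive ε
    return start in cyk_table(word, rules)[(1, len(word))]


print(accepts("abbc", GRAMMAR))
```

The three nested loops over length, start index, and split point give the usual cubic running time in the length of the word, multiplied by the number of rules.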
Pushdown Automata
NFA with infinite memory/states
Pushdown Automata
PDA ~= NFA, with a stack as memory
Transition:
  NFA – Depends on input
  PDA – Depends on input and top of stack
    Push a symbol onto the stack
    Pop a symbol from the stack
    Read a terminal from the input string
Transitions are non-deterministic (possibly ε)
Pushdown Automata and NFA
Accept:
  NFA – Go to an Accept state
  PDA – Go to an Accept state
PDA – Example 1
Given the following language, design a PDA for it:
L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}
PDA – Example 1 - Idea
Idea: The input has two sections
  First half: all '0's
  Second half: all '1's
#'1' depends on #'0':
  #'0' ≤ #'1' ≤ #'0' × 2
PDA – Example 1 – Solution
L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}
[State diagram: states q0, q1, q2, q3; transition labels ε,ε/$; 0,ε/X; ε,ε/ε; 1,X/ε; 1,X/X; ε,$/ε]
PDA – Example 1 – Explain Solution
Let's try a string: w = 00111 (see the whiteboard for the simulation).
ε,ε/$ indicates the start of parsing.
0,ε/X saves information about #'0': #'X' in the stack = #'0'.
The transitions on '1' account for #'1': #'0' ≤ #'1' ≤ #'0' × 2.
For each 'X' consumed, the PDA reads either one '1' or two '1's.
ε,$/ε indicates the end of parsing.
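To try strings such as w = 00111 without a whiteboard, a nondeterministic PDA can be simulated by searching over configurations (state, input position, stack). The sketch below is an illustration only: labels are read as "input, pop / push", and the exact layout of the slide's diagram is not recoverable from the text, so the transition table `EX1` is an assumed layout that uses the slide's labels plus a hypothetical helper state `p` for the "two '1's per X" branch.

```python
from collections import deque


def pda_accepts(word, transitions, start, accepting):
    """Breadth-first search over (state, input position, stack) configurations.

    transitions maps (state, read, pop) to a list of (next_state, push);
    "" stands for ε, and the top of the stack is the end of the string.
    """
    seen = set()
    queue = deque([(start, 0, "")])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if state in accepting and pos == len(word):
            return True
        for (src, read, pop), targets in transitions.items():
            if src != state:
                continue
            if read and (pos >= len(word) or word[pos] != read):
                continue
            if pop and not stack.endswith(pop):
                continue
            rest = stack[:len(stack) - len(pop)]
            for dst, push in targets:
                queue.append((dst, pos + len(read), rest + push))
    return False


# Assumed layout; state "p" is hypothetical and not named on the slide.
EX1 = {
    ("q0", "", ""): [("q1", "$")],                # ε,ε/$  start of parsing
    ("q1", "0", ""): [("q1", "X")],               # 0,ε/X  one X per '0'
    ("q1", "", ""): [("q2", "")],                 # ε,ε/ε  switch to the '1's
    ("q2", "1", "X"): [("q2", ""), ("p", "X")],   # 1,X/ε or 1,X/X
    ("p", "1", "X"): [("q2", "")],                # second '1' for the same X
    ("q2", "", "$"): [("q3", "")],                # ε,$/ε  end of parsing
}

print(pda_accepts("00111", EX1, "q0", {"q3"}))
```

The search terminates here because none of the ε-moves in this table can be repeated indefinitely; a table with ε-loops that grow the stack would need a bound on the stack size.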
PDA – Example 2
Given the following language, design a PDA for it:
L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}
PDA – Example 2 – Idea
Idea:
  Sequentially read (multiple) 'a', 'b', 'c' and 'd'
  Maintain:
    #'a' + #'c'
    #'b' + #'d'
  If these numbers are equal, accept
PDA – Example 2 – Solution
L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}
[State diagram: states q1 to q5 are labelled; transition labels:
  ε,ε/$ and ε,$/ε
  ε,ε/ε
  on a: a,ε/X
  on b: b,X/ε; b,$/$Y; b,Y/YY
  on c: c,X/XX; c,$/$X; c,Y/ε
  on d: d,X/ε; d,$/$Y; d,Y/YY]
PDA – Example 2 – Explain Solution
The PDA moves through its stages in order: start, a, b, c, d, end.
Each X in the stack = an extra a or c.
Each Y in the stack = an extra b or d.
X and Y 'cancel' each other, so the stack contains only X's or only Y's.
No X's and no Y's means #a + #c = #b + #d: accept.
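The cancelling behaviour of X and Y can be followed directly on a stack. The sketch below is a simplification rather than the slide's machine: it collapses the states into a phase index that enforces the a, b, c, d order, and applies the stack rules from the transition labels (push your own symbol on `$` or on a matching symbol, pop when the opposite symbol is on top). The function name is hypothetical.

```python
def accepts_example2(word):
    """Check membership in { a^i b^j c^k d^l : i + k = j + l } with a stack."""
    order = "abcd"
    stack = ["$"]   # bottom-of-stack marker, as pushed by ε,ε/$
    phase = 0       # index of the latest letter seen; letters must not go back
    for ch in word:
        if ch not in order or order.index(ch) < phase:
            return False  # unknown symbol, or letters out of order
        phase = order.index(ch)
        mine, other = ("X", "Y") if ch in "ac" else ("Y", "X")
        if stack[-1] == other:
            stack.pop()          # e.g. b,X/ε or c,Y/ε: cancel one symbol
        else:
            stack.append(mine)   # e.g. a,ε/X, b,$/$Y, c,X/XX, d,Y/YY
    # Only the marker left means no extra X's and no extra Y's (ε,$/ε).
    return stack == ["$"]


for w in ["abcd", "aabd", "bc", "abc", "cb"]:
    print(w, accepts_example2(w))
```

Because the top of the stack always determines whether to push or pop, the stack never holds X's and Y's at the same time, which matches the invariant stated above.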