The CYK Algorithm
Presented by Aalapee Patel and Tyler Ondracek
CS6800, Spring 2014

A Membership Problem
Determine whether a given string is a member of the language defined by a context-free grammar.
Given a context-free grammar G and a string w
– G = (V, Σ, P, S) where
  » V is a finite set of variables
  » Σ (the alphabet) is a finite set of terminal symbols
  » P is a finite set of rules
  » S is the start symbol (a distinguished element of V)
Is w in the language of G?

CYK Algorithm
Developed by J. Cocke, D. Younger, and T. Kasami to answer the membership problem
The input grammar should be in Chomsky Normal Form
– A → BC
– A → a
– S → λ
where B, C ∈ V – {S}
Uses bottom-up parsing
Uses dynamic programming (a table-filling algorithm)

CYK Basic Idea
Let u = x_1 x_2 … x_n be the string to be tested for membership, and let x_{i,j} denote the substring x_i … x_j
Step 1: For each substring x_{i,i} of u of length 1, find the set of variables A with a rule A → x_{i,i}
Step 2: For each substring x_{i,i+1} of u of length 2, find the set of variables A that derive it (A ⇒* x_{i,i+1})
  :
Step n: For the whole string x_{1,n}, find the set of variables A that derive it (A ⇒* x_{1,n})
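
The steps above can be sketched in Python as follows. This is one possible implementation under stated assumptions, not the presenters' code: the grammar is a dict from variables to right-hand sides with single-character symbols, cells are indexed by 1-based inclusive pairs (i, j), and the names `cyk_table` and `accepts` are illustrative. The grammar used is the one listed on the last example slide.

```python
from collections import defaultdict

# Assumed encoding of the grammar from the last example slide.
GRAMMAR = {
    "S": ["AB", "BC"],
    "A": ["BA", "a"],
    "B": ["CC", "b"],
    "C": ["AB", "a"],
}

def cyk_table(rules, word):
    """Map each (i, j) to the set of variables deriving x_i..x_j."""
    # Reverse index: right-hand side -> variables that have that rule.
    heads_of = defaultdict(set)
    for head, bodies in rules.items():
        for body in bodies:
            heads_of[body].add(head)

    n = len(word)
    table = {}
    # Step 1: substrings of length 1 come directly from rules A -> a.
    for i in range(1, n + 1):
        table[(i, i)] = set(heads_of.get(word[i - 1], set()))
    # Steps 2..n: longer substrings combine two shorter ones.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):  # split into x_i..x_k and x_{k+1}..x_j
                for b in table[(i, k)]:
                    for c in table[(k + 1, j)]:
                        cell |= heads_of.get(b + c, set())
            table[(i, j)] = cell
    return table

def accepts(rules, word, start="S"):
    """Membership test; the empty string needs a rule S -> lambda ("")."""
    if word == "":
        return "" in rules.get(start, [])
    return start in cyk_table(rules, word)[(1, len(word))]

print(accepts(GRAMMAR, "aaba"))  # True
```

The three nested loops over length, start position, and split point give the usual O(n^3) table-filling cost for a fixed grammar.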

The Diagonal Table Approach
W_{1,4}
W_{1,3}  W_{2,4}
W_{1,2}  W_{2,3}  W_{3,4}
W_{1,1}  W_{2,2}  W_{3,3}  W_{4,4}
w_1      w_2      w_3      w_4
Formula for filling the table:
W_{i,j} = (W_{i,i}, W_{i+1,j}), (W_{i,i+1}, W_{i+2,j}), …, (W_{i,j-1}, W_{j,j})
Each pair (X, Y) stands for the concatenations BC with B in X and C in Y; W_{i,j} receives every variable A that has a rule A → BC for one of these concatenations
The last cell to be filled, the top cell W_{1,4}, determines whether the string w is in L(G)
If it contains the start symbol S, then w is in L(G)
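
The pairs named by the formula can be enumerated directly. A small sketch, with the illustrative helper name `splits` and 1-based inclusive indices assumed:

```python
def splits(i, j):
    """Cell pairs combined when filling W_{i,j}: ((i, k), (k + 1, j))."""
    return [((i, k), (k + 1, j)) for k in range(i, j)]

print(splits(1, 4))
# [((1, 1), (2, 4)), ((1, 2), (3, 4)), ((1, 3), (4, 4))]
```

A cell for a substring of length m has m - 1 such pairs, so the cells of the bottom row (length 1) have none and are filled from the terminal rules instead.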

CYK Table Filling Example
W_{1,4}
W_{1,3}  W_{2,4}
W_{1,2}  W_{2,3}  W_{3,4}
{A, C}   {B}      {B}      {A}
c        b        b        a
The first row to be filled (substrings of length 1) does not use the previous slide's formula; each cell simply holds the variables whose rule(s) produce that symbol

CYK Table Filling Example
W_{1,4}
W_{1,3}  W_{2,4}
{S, C}   W_{2,3}  W_{3,4}
{A, C}   {B}      {B}      {A}
c        b        b        a
W_{1,2} = (W_{1,1}, W_{2,2}) = {A, C}{B} = {AB, CB}. Which rules have AB or CB on the right-hand side?

CYK Table Filling Example
W_{1,4}
W_{1,3}  W_{2,4}
{S, C}   ∅        W_{3,4}
{A, C}   {B}      {B}      {A}
c        b        b        a
W_{2,3} = (W_{2,2}, W_{3,3}) = {B}{B} = {BB}. Which rules have BB on the right-hand side?

CYK Table Filling Example
W_{1,4}
W_{1,3}  W_{2,4}
{S, C}   ∅        {C}
{A, C}   {B}      {B}      {A}
c        b        b        a
W_{3,4} = (W_{3,3}, W_{4,4}) = {B}{A} = {BA}. Which rules have BA on the right-hand side?

CYK Table Filling Example
W_{1,4}
{C}      W_{2,4}
{S, C}   ∅        {C}
{A, C}   {B}      {B}      {A}
c        b        b        a
W_{1,3} = (W_{1,1}, W_{2,3}), (W_{1,2}, W_{3,3}) = {A, C}∅ ∪ {S, C}{B} = {SB, CB}, since pairing with the empty set yields no concatenations. Which rules have SB or CB on the right-hand side?

CYK Table Filling Example
W_{1,4}
{C}      {B}
{S, C}   ∅        {C}
{A, C}   {B}      {B}      {A}
c        b        b        a
W_{2,4} = (W_{2,2}, W_{3,4}), (W_{2,3}, W_{4,4}) = {B}{C} ∪ ∅{A} = {BC}, since pairing with the empty set yields no concatenations. Which rules have BC on the right-hand side?

CYK Table Filling Example
{S, A, C}
{C}      {B}
{S, C}   ∅        {C}
{A, C}   {B}      {B}      {A}
c        b        b        a
W_{1,4} = (W_{1,1}, W_{2,4}), (W_{1,2}, W_{3,4}), (W_{1,3}, W_{4,4}) = {A, C}{B} ∪ {S, C}{C} ∪ {C}{A} = {AB, CB, SC, CC, CA}. Which rules have AB, CB, SC, CC or CA on the right-hand side?
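
The candidate right-hand sides for the top cell can be reproduced from the cell values shown on this slide. A short sketch: the dict `W` copies those values, and `concat` is an illustrative helper name. It stops at the candidate pairs, because turning them into variables requires the rules of the example grammar.

```python
from itertools import product

def concat(left, right):
    """All two-symbol strings BC with B from left and C from right."""
    return {b + c for b, c in product(left, right)}

# Cell values copied from the table on this slide (1-based indices).
W = {
    (1, 1): {"A", "C"}, (2, 4): {"B"},
    (1, 2): {"S", "C"}, (3, 4): {"C"},
    (1, 3): {"C"},      (4, 4): {"A"},
}

candidates = set()
for k in range(1, 4):  # the three splits of W_{1,4}
    candidates |= concat(W[(1, k)], W[(k + 1, 4)])

print(sorted(candidates))  # ['AB', 'CA', 'CB', 'CC', 'SC']
```

Because `product` over an empty set yields nothing, an empty cell on either side of a split contributes no candidates.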

CYK Table Filling Example
{S, A, C}
{C}      {B}
{S, C}   ∅        {C}
{A, C}   {B}      {B}      {A}
c        b        b        a
The string w is in the language, since W_{1,n} (here W_{1,4}) contains the start symbol S

CYK Table Filling Example
W_{1,4}
W_{1,3}  W_{2,4}
W_{1,2}  W_{2,3}  W_{3,4}
W_{1,1}  W_{2,2}  W_{3,3}  W_{4,4}
c        b        a        a
The first row to be filled (substrings of length 1) does not use the formula; each cell simply holds the variables whose rule(s) produce that symbol

CYK Table Filling Example
W_{1,4}
W_{1,3}  W_{2,4}
W_{1,2}  W_{2,3}  W_{3,4}
W_{1,1}  W_{2,2}  W_{3,3}  W_{4,4}
a        a        b        a
The first row to be filled (substrings of length 1) does not use the formula; each cell simply holds the variables whose rule(s) produce that symbol
Grammar:
S → AB | BC
A → BA | a
B → CC | b
C → AB | a