1
Syllabus Text Books Classes Reading Material Assignments Grades Links Forum Text Books 88-680
Natural Language Processing, Lesson Nine
Bottom Up Parsing
Ido Dagan
Department of Computer Science, Bar-Ilan University
2
The CYK Parsing Algorithm
Cocke-Younger-Kasami
Assumes the grammar is in CNF (and depends on this!)
Based on a “dynamic programming” approach:
–Build solutions compositionally from sub-solutions
–Store sub-solutions and re-use them whenever necessary
Recognition version: decide whether $S \Rightarrow^* w$
3
CNF
Chomsky Normal Form (CNF):
–The grammar is ε-free
–Each production of the grammar is either of the form $A \to B\,C$ or $A \to a$ (i.e., either 2 non-terminal symbols or 1 terminal symbol on the RHS)
Any CFG can be converted into a weakly equivalent CFG in Chomsky Normal Form
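The two permitted rule shapes can be checked mechanically. The following is a minimal sketch, not part of the original slides; the representation of rules as `(lhs, rhs)` pairs and the name `is_cnf` are assumptions made for illustration.

```python
def is_cnf(rules, nonterminals):
    """Return True if every rule is A -> B C (two non-terminals) or A -> a (one terminal).

    `rules` is a list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    This representation is an assumption of the sketch.
    """
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        if len(rhs) == 2 and all(s in nonterminals for s in rhs):
            continue  # A -> B C
        if len(rhs) == 1 and rhs[0] not in nonterminals:
            continue  # A -> a
        return False  # epsilon rule, unit rule, mixed RHS, or RHS longer than 2
    return True


NONTERMINALS = {"S", "NP", "VP", "Det", "Noun", "Verb"}
cnf_rules = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("VP", ("Verb", "NP")),
    ("Det", ("the",)),
    ("Noun", ("boy",)),
    ("Verb", ("ate",)),
]
# Adding a unit production (one non-terminal on the RHS) breaks CNF.
not_cnf_rules = cnf_rules + [("NP", ("Noun",))]

print(is_cnf(cnf_rules, NONTERMINALS))      # True
print(is_cnf(not_cnf_rules, NONTERMINALS))  # False
```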
4
The CYK Parsing Algorithm
Principles of the Algorithm:
–Input is: $w = x_1 \ldots x_n$
–We denote by $w_{ij} = x_i x_{i+1} \ldots x_{i+j-1}$ the substring of $w$ of length $j$ starting with $x_i$
–For every $w_{ij}$ and for every variable $A$ in the grammar, the algorithm determines whether $A \Rightarrow^* w_{ij}$
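The indexing here is 1-based, and the second index is a length rather than an end position, which differs from the usual slice conventions in code. A small sketch of the correspondence, assuming the input is a Python list of words (the helper name `substring` is hypothetical):

```python
def substring(words, i, j):
    """Return w_ij: the j symbols starting at 1-based position i."""
    if i < 1 or j < 1 or i + j - 1 > len(words):
        raise ValueError("w_ij is undefined for these i, j")
    return words[i - 1 : i - 1 + j]


w = "the boy ate the cake".split()
print(substring(w, 1, 2))  # ['the', 'boy']
print(substring(w, 3, 3))  # ['ate', 'the', 'cake']
```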
5
The CYK Parsing Algorithm
The algorithm works on substrings of increasing length:
We start with substrings of length 1: $w_{i1} = x_i$, for $1 \le i \le n$
–$A \Rightarrow^* w_{i1}$ if $A \to x_i$ is a rule in the grammar.
We then continue with substrings of length 2, 3, ...
6
The CYK Parsing Algorithm
For a substring $w_{ij}$ we consider all possible ways of breaking it into two parts $w_{ik}$ and $w_{i+k,\,j-k}$, with $1 \le k < j$
$A \Rightarrow^* w_{ij}$ if, for some such $k$:
–$A \to B\,C$ is a rule in the grammar
–$B \Rightarrow^* w_{ik}$
–$C \Rightarrow^* w_{i+k,\,j-k}$
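To make the split concrete, the sketch below enumerates every way of breaking a substring into two non-empty parts, using the same (start, length) indexing as the slides. The function name `splits` is an assumption for illustration.

```python
def splits(i, j):
    """List the (left, right) parts of w_ij, each part given as (start, length)."""
    return [((i, k), (i + k, j - k)) for k in range(1, j)]


# A substring of length 3 starting at position 1 has two split points.
print(splits(1, 3))  # [((1, 1), (2, 2)), ((1, 2), (3, 1))]
```

A substring of length j always has j-1 split points, so substrings of length 1 have none and are handled by the lexical rules alone.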
7
CYK
Finally, since $w = w_{1n}$, we need to verify that $S \Rightarrow^* w_{1n}$
8
CYK
We keep the results for every $w_{ij}$ in a table $V_{ij}$
Each table entry $V_{ij}$ contains the set of variables $A$ such that $A \Rightarrow^* w_{ij}$
Note that we only need to fill in entries up to the diagonal – the longest substring starting at $x_i$ is of length $n-i+1$
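Written in terms of the table, the steps of the algorithm can be summarized as a recurrence. This is a restatement of the slides, where $P$ denotes the set of productions of the grammar (a symbol introduced here, not used in the original):

```latex
\begin{align*}
V_{i1} &= \{\, A \mid (A \to x_i) \in P \,\}, && 1 \le i \le n \\
V_{ij} &= \{\, A \mid \exists k,\ 1 \le k < j,\ (A \to B\,C) \in P,\
            B \in V_{ik},\ C \in V_{i+k,\,j-k} \,\}, && 2 \le j \le n,\ 1 \le i \le n-j+1 \\
w \in L(G) &\iff S \in V_{1n}
\end{align*}
```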
9
Example
Baby grammar
–S → NP VP
–NP → Det Nominal | ProperNoun | Det Noun | NP PP
–Nominal → Noun Noun | Nominal Noun Nominal
–VP → Verb NP | VP VP PP
–PP → Preposition NP
–Noun → boy | cat | dog | cake | candle
–Verb → likes | ate | drinks
–Det → a | the
–Preposition → from | to | on | near | with
10
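CYK table for "the boy ate the cake"
(Row j holds the substrings of length j; each entry sits in the column of the substring's last word; X marks cells that are not needed.)
length | 1 | 2 | 3 | 4 | 5
word | the | boy | ate | the | cake
1 | Det | Noun | Verb | Det | Noun
2 | X | NP | - | - | NP
3 | X | X | - | - | VP
4 | X | X | X | - | -
5 | X | X | X | X | S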
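The example sentence "the boy ate the cake" can be checked with a short recognizer. The following is a minimal sketch rather than the lecture's own code: it uses only a CNF fragment of the baby grammar (the rules needed for this sentence), and the rule encoding and the name `cyk_recognize` are assumptions made for illustration.

```python
from collections import defaultdict


def cyk_recognize(words, binary_rules, lexical_rules, start="S"):
    """Return (accepted, V), where V[(i, j)] is the set of variables deriving w_ij.

    i is the 1-based start position and j the substring length.
    binary_rules: list of (A, B, C) for A -> B C
    lexical_rules: list of (A, a) for A -> a
    """
    n = len(words)
    V = defaultdict(set)

    # Substrings of length 1: A is in V[i, 1] if A -> x_i is a rule.
    for i in range(1, n + 1):
        for A, a in lexical_rules:
            if a == words[i - 1]:
                V[(i, 1)].add(A)

    # Substrings of length j = 2..n, starting at i = 1..n-j+1.
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            for k in range(1, j):  # split into w_ik and w_{i+k, j-k}
                for A, B, C in binary_rules:
                    if B in V[(i, k)] and C in V[(i + k, j - k)]:
                        V[(i, j)].add(A)

    return start in V[(1, n)], V


# CNF fragment of the baby grammar (assumed subset, enough for the example).
BINARY = [("S", "NP", "VP"), ("NP", "Det", "Noun"), ("VP", "Verb", "NP")]
LEXICAL = [("Det", "the"), ("Noun", "boy"), ("Noun", "cake"), ("Verb", "ate")]

accepted, V = cyk_recognize("the boy ate the cake".split(), BINARY, LEXICAL)
print(accepted)   # True
print(V[(1, 2)])  # {'NP'}  "the boy"
print(V[(3, 3)])  # {'VP'}  "ate the cake"
```

Keying the table by (start, length) keeps the code aligned with the $V_{ij}$ notation; an implementation indexed by (start, end) would work equally well.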
11
Time Complexity of CYK
$O(n^3)$
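The cubic bound can be seen by counting. There are $n-j+1$ substrings of length $j$, and each has $j-1$ split points. The worked sum below is added as an illustration, not taken from the slides; $|P|$ (the number of productions) is a symbol introduced here:

```latex
\begin{align*}
\text{table entries:}\quad & \sum_{j=1}^{n} (n-j+1) = \frac{n(n+1)}{2} \\
\text{(entry, split) pairs:}\quad & \sum_{j=2}^{n} (n-j+1)(j-1) = \frac{(n-1)\,n\,(n+1)}{6} = O(n^3) \\
\text{total work:}\quad & O\!\left(|P| \cdot n^3\right) \text{ if every rule is tried at every split}
\end{align*}
```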
12
Adding Parsing to CYK
We need to construct parse trees for strings in L(G)
Idea:
–Keep back-pointers to the table entries that we combine
–At the end, reconstruct a parse from the back-pointers
This allows us to find all parse trees (possibly an exponential number)
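One possible way to realize the back-pointer idea is sketched below. It is an illustration under stated assumptions, not the lecture's implementation: each table entry stores, per variable, the list of (split point, left variable, right variable) triples that produced it, and trees are rebuilt recursively. The grammar is a CNF fragment of the baby grammar; the rule VP → VP PP in particular is an assumed reading, and the function names are hypothetical.

```python
from collections import defaultdict


def cyk_parse(words, binary_rules, lexical_rules):
    """Fill back[(i, j)][A] with every way A was derived over w_ij."""
    n = len(words)
    back = defaultdict(lambda: defaultdict(list))
    for i in range(1, n + 1):
        for A, a in lexical_rules:
            if a == words[i - 1]:
                back[(i, 1)][A].append(a)  # leaf: the word itself
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            for k in range(1, j):
                for A, B, C in binary_rules:
                    if B in back[(i, k)] and C in back[(i + k, j - k)]:
                        # Back-pointer: the split and the two variables combined.
                        back[(i, j)][A].append((k, B, C))
    return back


def trees(back, i, j, A):
    """Yield every parse tree for variable A over w_ij as nested tuples."""
    for entry in back[(i, j)][A]:
        if isinstance(entry, str):
            yield (A, entry)
        else:
            k, B, C = entry
            for left in trees(back, i, k, B):
                for right in trees(back, i + k, j - k, C):
                    yield (A, left, right)


# Assumed CNF fragment; VP -> VP PP is a guess at the intended rule.
BINARY = [
    ("S", "NP", "VP"), ("NP", "Det", "Noun"), ("NP", "NP", "PP"),
    ("VP", "Verb", "NP"), ("VP", "VP", "PP"), ("PP", "Preposition", "NP"),
]
LEXICAL = [
    ("Det", "the"), ("Det", "a"), ("Noun", "boy"), ("Noun", "cake"),
    ("Noun", "candle"), ("Verb", "ate"), ("Preposition", "with"),
]

sentence = "the boy ate the cake with a candle".split()
back = cyk_parse(sentence, BINARY, LEXICAL)
parses = list(trees(back, 1, len(sentence), "S"))
print(len(parses))  # 2: the PP attaches either to the NP or to the VP
```

Enumerating all trees can take exponential time even though filling the table is cubic, which is why the trees are produced lazily by a generator here.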
13
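CYK table for "the boy ate the cake with a candle"
(Row j holds the substrings of length j; each entry sits in the column of the substring's last word; X marks cells that are not needed.)
length | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
word | the | boy | ate | the | cake | with | a | candle
1 | Det | Noun | Verb | Det | Noun | P | Det | Noun
2 | X | NP | - | - | NP | - | - | NP
3 | X | X | - | - | VP | - | - | PP
4 | X | X | X | - | - | - | - | -
5 | X | X | X | X | S | - | - | NP
6 | X | X | X | X | X | - | - | VP
7 | X | X | X | X | X | X | - | -
8 | X | X | X | X | X | X | X | S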
14
CNF
Why did we want a Grammar in CNF?
15
Chart Parsing of non-CNF Grammars
Version from Allen’s NLU book
Bottom up, left to right
Starts with lexical constituents (POS)
Matches found constituents to extend right-hand sides of rules, until the entire right-hand side is matched
Stores constituents in a chart, to allow repeated use
27
Complexity
$O(N^3)$
Constant depends on grammar parameters