Download presentation
Presentation is loading. Please wait.
Published byMyron Dalton Modified over 9 years ago
1
October 2005CSA3180: Parsing Algorithms 21 CSA3050: NLP Algorithms Parsing Algorithms 2 Problems with DFTD Parser Earley Parsing Algorithm
2
October 2005CSA3180: Parsing Algorithms 22 Problems with DFTD Parser Left Recursion Ambiguity Inefficiency
3
October 2005CSA3180: Parsing Algorithms 23 Left Recursion A grammar is left recursive if it contains at least one non-terminal A for which A * A and * Intuitive idea: derivation of that category includes itself along its leftmost branch. NP NP PP NP NP and NP NP DetP Nominal DetP NP ' s
4
October 2005CSA3180: Parsing Algorithms 24 Infinite Search
5
October 2005CSA3180: Parsing Algorithms 25 Dealing with Left Recursion Reformulate the grammar –A A | as –A A' A' A' | –Disadvantage: different (and probably unnatural) parse trees. Use a different parse algorithm
6
October 2005CSA3180: Parsing Algorithms 26 Ambiguity Coordination Ambiguity: different scope of conjunction: Black cats and dogs like to play Attachment Ambiguity: a constituent can be added to the parse tree in different places: I shot an elephant in my pyjamas VP → VP PP NP → NP PP
7
October 2005CSA3180: Parsing Algorithms 27 Catalan Numbers The nth Catalan number counts the ways of dissecting a polygon with n+2 sides into triangles by drawing nonintersecting diagonals. No of PPs# parses 22 35 414 5132 6469 71430 84867
8
October 2005CSA3180: Parsing Algorithms 28 Handling Disambiguation Statistical disambiguation Semantic knowledge.
9
October 2005CSA3180: Parsing Algorithms 29 Repeated Parsing of Subtrees a flight4 from Indianapolis3 to Houston2 on TWA1 A flight from Indianapolis3 A flight from Indianapolis to Houston 2 A flight from Indianapolis to Houston on TWA 1
10
October 2005CSA3180: Parsing Algorithms 210 Earley Algorithm Dynamic Programming: solution involves filling in table of solutions to subproblems. Parallel Top Down Search Worst case complexity = O(N 3 ) in length N of sentence. Table, called a chart, contains N+1 entries ● book● that● flight● 0 1 2 3
11
October 2005CSA3180: Parsing Algorithms 211 The Chart Each table entry contains a list of states Each state represents all partial parses that have been reached so far at that point in the sentence. States are represented using dotted rules containing information about –Rule/subtree: which rule has been used –Progress: dot indicates how much of rule's RHS has been recognised. –Position: text segment to which this parse applies
12
October 2005CSA3180: Parsing Algorithms 212 Examples of Dotted Rules Initial S Rule S → ● VP, [0,0] Partially recognised NP NP → Det ● Nominal, [1,2] Fully recognised VP VP → V VP ●, [0,3] These states can also be represented graphically
13
October 2005CSA3180: Parsing Algorithms 213 The Chart
14
October 2005CSA3180: Parsing Algorithms 214 Earley Algorithm Main Algorithm: proceeds through each text position, applying one of the three operators below. Predictor: Creates "initial states" (ie states whose RHS is completely unparsed). Scanner: checks current input when next category to be recognised is pre-terminal. Completer: when a state is "complete" (nothing after dot), advance all states to the left that are looking for the associated category.
15
October 2005CSA3180: Parsing Algorithms 215 Early Algorithm – Main Function
16
October 2005CSA3180: Parsing Algorithms 216 Early Algorithm – Sub Functions
17
October 2005CSA3180: Parsing Algorithms 217
18
October 2005CSA3180: Parsing Algorithms 218
19
October 2005CSA3180: Parsing Algorithms 219 fl
20
October 2005CSA3180: Parsing Algorithms 220 Retrieving Trees To turn recogniser into a parser, representation of each state must also include information about completed states that generated its constituents
21
October 2005CSA3180: Parsing Algorithms 221
22
October 2005CSA3180: Parsing Algorithms 222 Chart[3] ↑ Extra Field
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.