Probabilistic Parsing
Ling 571
Fei Xia
Week 5: 10/25-10/27/05
Outline
Lexicalized CFG (recap)
Hw5 and Project 2
Parsing evaluation measures: ParseVal
Collins’ parser
TAG
Parsing summary
Lexicalized CFG recap
Important equations
Lexicalized CFG
Lexicalized rules: sparse data problem
–First generate the head
–Then generate the unlexicalized rule
Lexicalized models
An example: he likes her
Head-head probability
Head-rule probability
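As a sketch of the usual forms of these two quantities (the notation is illustrative and may differ in detail from the model used in the lecture):

```latex
% Head-rule probability: expand a constituent of category C with head word h
% using an unlexicalized rule C -> alpha.
P_{rule}(C \rightarrow \alpha \mid C, h)

% Head-head probability: generate the head word of a child constituent given
% its category and the parent's category and head word.
P_{head}(h_{c} \mid C_{c}, C_{p}, h_{p})
```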
Estimate parameters
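A sketch of how such parameters are typically estimated: relative frequencies from a treebank, smoothed by interpolation (the particular back-off distribution chosen below is illustrative):

```latex
% Maximum-likelihood (relative-frequency) estimate, e.g. for the head-rule probability:
P_{ML}(C \rightarrow \alpha \mid C, h) =
  \frac{\mathrm{Count}(C \rightarrow \alpha,\; h)}{\mathrm{Count}(C,\; h)}

% Because lexicalized counts are sparse, interpolate with a less specific
% (unlexicalized) distribution:
\hat{P}(C \rightarrow \alpha \mid C, h) =
  \lambda \, P_{ML}(C \rightarrow \alpha \mid C, h)
  + (1 - \lambda) \, P_{ML}(C \rightarrow \alpha \mid C)
```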
Building a statistical tool
Design a model:
–Objective function: generative model vs. discriminative model
–Decomposition: independence assumptions
–The types of parameters and the parameter size
Training: estimate model parameters
–Supervised vs. unsupervised
–Smoothing methods
Decoding: find the best structure for the input given the model
Team Project 1 (Hw5)
Form a team: programming language, schedule, expertise, etc.
Understand the lexicalized model
Design the training algorithm
Work out the decoding (parsing) algorithm: augment the CYK algorithm
Illustrate the algorithms with a real example
Team Project 2
Task: parse real data with a real grammar extracted from a treebank
Parser: PCFG or lexicalized PCFG
Training data: English Penn Treebank, sections 02-21
Development data: section 00
Team Project 2 (cont)
Hw6: extract a PCFG from the treebank
Hw7: make sure your parser works given a real grammar and real sentences; measure parsing performance
Hw8: improve parsing results
Hw10: write a report and give a presentation
Parsing evaluation measures
Evaluation of parsers: ParseVal
Labeled recall: # of correct constituents in the parser output / # of constituents in the gold standard
Labeled precision: # of correct constituents in the parser output / # of constituents in the parser output
Labeled F-measure: the harmonic mean of labeled precision and labeled recall
Complete match: % of sentences where recall and precision are 100%
Average crossing: # of crossing brackets per sentence
No crossing: % of sentences which have no crossing brackets
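A minimal sketch of how labeled precision, recall, and F-measure are computed over (label, start, end) constituent spans (the function below is illustrative, not EVALB itself):

```python
from collections import Counter

def parseval(gold_spans, test_spans):
    """Compute labeled precision, recall, and F-measure over
    (label, start, end) constituent tuples (duplicates are counted)."""
    gold, test = Counter(gold_spans), Counter(test_spans)
    correct = sum((gold & test).values())            # matched constituents
    recall = correct / sum(gold.values())
    precision = correct / sum(test.values())
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# The PP-attachment example from the next slides:
gold = [("VP", 1, 6), ("NP", 2, 3), ("PP", 4, 6), ("NP", 5, 6)]
test = [("VP", 1, 6), ("NP", 2, 6), ("NP", 2, 3), ("PP", 4, 6), ("NP", 5, 6)]
print(parseval(gold, test))    # precision = 4/5, recall = 4/4
```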
An example
Gold standard: (VP (V saw) (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope))))
Parser output: (VP (V saw) (NP (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope)))))
ParseVal measures
Gold standard: (VP, 1, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6)
System output: (VP, 1, 6), (NP, 2, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6)
Recall = 4/4, Precision = 4/5, crossing = 0
A different annotation
Gold standard: (VP (V saw) (NP (Det the) (N’ (N man))) (PP (P with) (NP (Det a) (N’ (N telescope)))))
Parser output: (VP (V saw) (NP (Det the) (N’ (N man) (PP (P with) (NP (Det a) (N’ (N telescope)))))))
ParseVal measures (cont)
Gold standard: (VP, 1, 6), (NP, 2, 3), (N’, 3, 3), (PP, 4, 6), (NP, 5, 6), (N’, 6, 6)
System output: (VP, 1, 6), (NP, 2, 6), (N’, 3, 6), (PP, 4, 6), (NP, 5, 6), (N’, 6, 6)
Recall = 4/6, Precision = 4/6, crossing = 1
EVALB
A tool that calculates the ParseVal measures
To run it: evalb -p parameter_file gold_file system_output
A copy is available in my dropbox
You will need it for Team Project 2
Summary of parsing evaluation measures
ParseVal is the most widely used; the F-measure is the most important number
The results depend on the annotation style
EVALB is a tool that calculates the ParseVal measures
Other measures are also used, e.g., accuracy of dependency links
History-based models
History-based approaches
A history-based approach maps (T, S) into a decision sequence d_1, ..., d_n
The probability of tree T for sentence S is then the product of the decision probabilities
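As a sketch, using the standard history-based formulation (the exact decomposition depends on the parser):

```latex
% Each decision is conditioned on the history of earlier decisions:
P(T, S) = \prod_{i=1}^{n} P(d_i \mid d_1, \ldots, d_{i-1})
```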
History-based models (cont)
PCFGs can be viewed as history-based models
There are other history-based models:
–Magerman’s parser (1995)
–Collins’ parsers (1996, 1997, ...)
–Charniak’s parsers (1996, 1997, ...)
–Ratnaparkhi’s parser (1997)
Collins’ models
Model 1: generative model of (Collins, 1996)
Model 2: add the complement/adjunct distinction
Model 3: add wh-movement
Model 1
First generate the head constituent label
Then generate the left and right dependents
Model 1 (cont)
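A sketch of the head-outward generation in Model 1 (following Collins (1997); the actual model also conditions each dependent on a distance measure, omitted here):

```latex
% A rule P(h) -> L_n(l_n) ... L_1(l_1) H(h) R_1(r_1) ... R_m(r_m) is generated
% head-outward; STOP symbols terminate generation in each direction.
P(\mathit{RHS} \mid P, h) =
  P_h(H \mid P, h)
  \times \prod_{i=1}^{n+1} P_l\big(L_i(l_i) \mid P, H, h\big)
  \times \prod_{j=1}^{m+1} P_r\big(R_j(r_j) \mid P, H, h\big),
\qquad L_{n+1} = R_{m+1} = \mathit{STOP}
```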
An example Sentence: Last week Marks bought Brooks.
Model 2
Generate a head label H
Choose left and right subcat frames
Generate left and right arguments
Generate left and right modifiers
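A sketch of how Model 2 inserts the subcategorization frames into the Model 1 decomposition (again simplified from Collins (1997); distance and other conditioning details are omitted):

```latex
% After the head label H, generate left/right subcat frames LC and RC;
% each argument generated is removed from the remaining subcat frame.
P_h(H \mid P, h)
  \times P_{lc}(LC \mid P, H, h) \times P_{rc}(RC \mid P, H, h)
  \times \prod_{i} P_l\big(L_i(l_i) \mid P, H, h, LC_i\big)
  \times \prod_{j} P_r\big(R_j(r_j) \mid P, H, h, RC_j\big)
```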
An example
Model 3
Add traces and wh-movement
Given that the LHS of a rule has a gap, there are three ways to pass down the gap:
–Head: S(+gap) → NP VP(+gap)
–Left: S(+gap) → NP(+gap) VP
–Right: SBAR(that)(+gap) → WHNP(that) S(+gap)
Parsing results
Model     LR      LP
Model 1   87.4%   88.1%
Model 2   88.1%   88.6%
Model 3   88.1%   88.6%
Tree Adjoining Grammar (TAG)
TAG
TAG basics
Extensions of TAG:
–Lexicalized TAG (LTAG)
–Synchronous TAG (STAG)
–Multi-component TAG (MCTAG)
–...
TAG basics
A tree-rewriting formalism (Joshi et al., 1975)
It can generate mildly context-sensitive languages
The primitive elements of a TAG are elementary trees
Elementary trees are combined by two operations: substitution and adjoining
TAG has been used in:
–parsing, semantics, discourse, etc.
–machine translation, summarization, generation, etc.
Two types of elementary trees
Initial tree: an S tree anchored by the verb “draft”, with NP substitution nodes
Auxiliary tree: a VP tree anchored by the adverb “still”, with foot node VP*
Substitution operation
They draft policies
Adjoining operation: an auxiliary tree rooted in Y, with foot node Y*, is adjoined at a Y node of another tree
They still draft policies
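A minimal sketch of the two operations on toy tree objects (this representation and the helper names are mine, not code from the course); it rebuilds the “they still draft policies” example:

```python
class Node:
    """A toy tree node; 'subst' marks substitution sites, 'foot' marks the foot node."""
    def __init__(self, label, children=None, subst=False, foot=False):
        self.label, self.children = label, children or []
        self.subst, self.foot = subst, foot

    def __str__(self):
        if not self.children:
            return self.label
        return "(%s %s)" % (self.label, " ".join(str(c) for c in self.children))

def iter_nodes(node):
    yield node
    for c in node.children:
        yield from iter_nodes(c)

def substitute(site, initial_tree):
    """Replace a substitution node with an initial tree of the same label."""
    assert site.subst and site.label == initial_tree.label
    site.children, site.subst = initial_tree.children, False

def adjoin(node, aux_tree):
    """Adjoin an auxiliary tree at a node with a matching label:
    the node's subtree moves under the auxiliary tree's foot node."""
    foot = next(n for n in iter_nodes(aux_tree) if n.foot)
    assert node.label == aux_tree.label == foot.label
    foot.children, foot.foot = node.children, False
    node.children = aux_tree.children

# Build "they draft policies", then adjoin "still" at VP.
vp = Node("VP", [Node("V", [Node("draft")]), Node("NP", subst=True)])
s = Node("S", [Node("NP", subst=True), vp])
substitute(s.children[0], Node("NP", [Node("PN", [Node("they")])]))
substitute(vp.children[1], Node("NP", [Node("N", [Node("policies")])]))
adjoin(vp, Node("VP", [Node("ADVP", [Node("ADV", [Node("still")])]), Node("VP", foot=True)]))
print(s)  # (S (NP (PN they)) (VP (ADVP (ADV still)) (VP (V draft) (NP (N policies)))))
```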
Derivation tree
(figure: the elementary trees, the derived tree, and the corresponding derivation tree)
Derived tree vs. derivation tree
The mapping between derived trees and derivation trees is not 1-to-1
Finding the best derivation is not the same as finding the best derived tree
Wh-movement: What do they draft?
(figure: elementary trees for “what”, “do”, “they”, and “draft”, and the derived tree with the fronted NP “what” co-indexed with the object position of “draft”)
Long-distance wh-movement: What does John think they draft?
(figure: auxiliary trees for “does” and for “think” (S → NP VP, VP → V S*) adjoin into the “draft” tree, so the fronted “what” stays co-indexed with the object of “draft”)
Who did you have dinner with?
(figure: elementary trees and derived tree; the stranded preposition “with” is introduced by an auxiliary tree with foot node VP*, and the fronted “who” is co-indexed with its object)
TAG extensions
Lexicalized TAG (LTAG)
Synchronous TAG (STAG)
Multi-component TAG (MCTAG)
...
STAG
The primitive elements in STAG are elementary tree pairs
Used for machine translation (MT)
Summary of TAG
A formalism beyond CFG
Primitive elements are trees, not rules
Extended domain of locality
Two operations: substitution and adjoining
Parsing algorithms for TAG
Statistical parsers for TAG
Algorithms for extracting TAGs from treebanks
Parsing summary
Types of parsers
Phrase structure vs. dependency tree
Statistical vs. rule-based
Grammar-based or not
Supervised vs. unsupervised
Our focus:
–phrase structure
–mainly statistical
–mainly grammar-based: CFG, TAG
–supervised
Grammars
Chomsky hierarchy:
–Unrestricted grammar (type 0)
–Context-sensitive grammar
–Context-free grammar
–Regular grammar
Human languages are beyond context-free
Other formalisms:
–HPSG, LFG
–TAG
–Dependency grammars
Parsing algorithms for CFG
Top-down
Bottom-up
Top-down with bottom-up filtering
Earley algorithm
CYK algorithm:
–requires the CFG to be in CNF
–can be augmented to deal with PCFG, lexicalized CFG, etc.
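A minimal sketch of the Viterbi (probabilistic) CYK recognizer for a PCFG in CNF; the grammar encoding and names below are illustrative, and a real parser would also keep back-pointers to recover the best tree:

```python
from collections import defaultdict

def pcky(words, lexical, binary):
    """Viterbi CYK for a PCFG in CNF.
    lexical: {terminal: [(A, prob), ...]}   for rules A -> word
    binary:  {(B, C): [(A, prob), ...]}     for rules A -> B C
    Returns best[i][j][A] = probability of the best A spanning words[i:j]."""
    n = len(words)
    best = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                      # length-1 spans
        for A, p in lexical.get(w, []):
            best[i][i + 1][A] = max(best[i][i + 1][A], p)
    for length in range(2, n + 1):                     # longer spans, bottom-up
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):                  # split point
                for B, pb in best[i][k].items():
                    for C, pc in best[k][j].items():
                        for A, p in binary.get((B, C), []):
                            score = p * pb * pc
                            if score > best[i][j][A]:
                                best[i][j][A] = score
    return best

# Tiny illustrative grammar (probabilities are made up):
lexical = {"he": [("NP", 1.0)], "her": [("NP", 1.0)], "likes": [("V", 1.0)]}
binary = {("V", "NP"): [("VP", 1.0)], ("NP", "VP"): [("S", 1.0)]}
chart = pcky(["he", "likes", "her"], lexical, binary)
print(chart[0][3]["S"])   # 1.0
```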
Extensions of CFG
PCFG: find the most likely parse tree
Lexicalized CFG:
–uses weaker independence assumptions
–accounts for certain types of lexical and structural dependencies
Beyond CFG
History-based models:
–Collins’ parsers
TAG:
–a tree-rewriting formalism
–mildly context-sensitive
–many extensions: LTAG, STAG, ...
Statistical approach
Modeling:
–choose the objective function
–decompose the function: common equations (joint, conditional, marginal probabilities), independence assumptions
Training:
–supervised vs. unsupervised
–smoothing
Decoding:
–dynamic programming
–pruning
Evaluation of parsers
Accuracy: ParseVal
Robustness
Resources needed
Efficiency
Richness
Other things
Converting into CNF:
–CFG
–PCFG
–Lexicalized CFG
Treebank annotation:
–tagset: syntactic labels, POS tags, function tags, empty categories
–format: indentation, brackets
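As a sketch of the PCFG case, only the long right-hand-side step of CNF conversion is shown (handling unit productions and terminal rules is omitted, and the intermediate-symbol naming scheme is my own):

```python
def binarize(rules):
    """Convert PCFG rules with long right-hand sides into binary rules.
    rules: list of (lhs, rhs_tuple, prob).  A -> B C D [p] becomes
    A -> B A|C_D [p] and A|C_D -> C D [1.0], preserving the tree probability."""
    out = []
    for lhs, rhs, p in rules:
        while len(rhs) > 2:
            new_sym = "%s|%s" % (lhs, "_".join(rhs[1:]))   # intermediate symbol
            out.append((lhs, (rhs[0], new_sym), p))
            lhs, rhs, p = new_sym, rhs[1:], 1.0
        out.append((lhs, rhs, p))
    return out

print(binarize([("VP", ("V", "NP", "PP"), 0.2)]))
# [('VP', ('V', 'VP|NP_PP'), 0.2), ('VP|NP_PP', ('NP', 'PP'), 1.0)]
```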