Slide 1: Prototype-Driven Grammar Induction
Aria Haghighi and Dan Klein
Computer Science Division, University of California, Berkeley
Slide 2: Grammar Induction
[Diagram: grammar induction pipeline showing a grammar engineer, EM, data, and the output grammar.]
Slide 3: Central Questions
- How do we specify what we want to learn? ("NPs are things like DT NN.")
- How do we fix observed errors? ("That's not quite it!")
Slide 4: Experimental Set-up
- Binary grammar: symbols {X_1, X_2, ..., X_n} plus POS tags, with rules of the form X_i -> X_j X_k
- Data: WSJ-10 (7k sentences)
- Evaluation: labeled F1; upper bound for a binary grammar: 86.1
Slide 5: Unconstrained PCFG Induction
- Learn a PCFG with EM, via the inside-outside algorithm (Lari & Young '90); see the sketch below
- [Chart diagram: inside score of a span (i, j) and outside score of the rest of the sentence, 0..n]
- Results: poor (labeled F1 of 26.3; see the summary on slide 24)
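Not shown on the slide, here is a minimal sketch of the inside pass that inside-outside EM builds on; the dictionary-based grammar format, the function name, and the toy grammar are illustrative assumptions, not the authors' implementation:

```python
from collections import defaultdict

def inside_scores(words, lexicon, rules):
    """Inside pass for a binary PCFG.

    lexicon: dict mapping (symbol, word) -> P(word | symbol)
    rules:   dict mapping (parent, left, right) -> P(left right | parent)
    Returns inside[(i, j, X)] = P(X derives words[i:j]).
    """
    n = len(words)
    inside = defaultdict(float)
    # Base case: spans of length 1 are generated by the lexicon.
    for i, w in enumerate(words):
        for (sym, word), p in lexicon.items():
            if word == w:
                inside[(i, i + 1, sym)] += p
    # Recursive case: build longer spans from pairs of smaller ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point
                for (parent, left, right), p in rules.items():
                    inside[(i, j, parent)] += (
                        p * inside[(i, k, left)] * inside[(k, j, right)])
    return inside

# Toy usage: probability that S derives "the koala sat"
lexicon = {("DT", "the"): 1.0, ("NN", "koala"): 1.0, ("VBD", "sat"): 1.0}
rules = {("NP", "DT", "NN"): 1.0, ("S", "NP", "VBD"): 1.0}
scores = inside_scores(["the", "koala", "sat"], lexicon, rules)
print(scores[(0, 3, "S")])  # 1.0
```

The full algorithm also computes outside scores and re-estimates rule probabilities from expected counts; the inside pass above is the core chart computation.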
Slide 6: Encoding Knowledge
- What's an NP?
- One option: semi-supervised learning
Slide 7: Encoding Knowledge
- What's an NP? For instance: DT NN, JJ NNS, NNP
- Our option: prototype learning
Slide 8: Prototype List
[Table: each phrase type paired with a few prototype POS sequences.]
Slide 9: How to Use Prototypes?
[Parse tree for "The hungry koala sat in the tree" (DT JJ NN VBD IN DT NN), with NP, VP, and PP spans marked.]

Phrase | Prototype
NP | DT NN
PP | IN DT NN
VP | VBD IN DT NN
Slide 10: Distributional Similarity [Clark '01; Klein & Manning '01]
- Prototype clusters: NP = {(DT NN), (JJ NNS), (NNP NNP)}; VP = {(VBD DT NN), (MD VB NN), (VBN IN NN)}; PP = {(IN NN), (IN PRP), (TO CD CD)}
- A candidate sequence (DT JJ NN) has context distribution { <> __ VBD : 0.3, VBD __ <> : 0.2, IN __ VBD : 0.1, ... } (<> marks the sentence boundary)
- Its context distribution is closest to the NP cluster, so it gets proto=NP; see the sketch below
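A minimal sketch of this step, under assumptions not on the slide: contexts are (left tag, right tag) pairs, plain cosine similarity stands in for whatever divergence the full system uses, and the 0.35 threshold is arbitrary:

```python
import math
from collections import Counter

def context_distribution(sequence, tagged_sents):
    """Distribution over (left tag, right tag) contexts of `sequence`,
    with each sentence padded by a "<>" boundary marker on both sides."""
    counts = Counter()
    k = len(sequence)
    for tags in tagged_sents:
        padded = ["<>"] + list(tags) + ["<>"]
        for i in range(1, len(padded) - k):
            if tuple(padded[i:i + k]) == tuple(sequence):
                counts[(padded[i - 1], padded[i + k])] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse distributions."""
    dot = sum(v * q.get(c, 0.0) for c, v in p.items())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def proto_label(sequence, proto_dists, tagged_sents, threshold=0.35):
    """Label a POS sequence with its most similar prototype cluster,
    or NONE when no cluster is similar enough."""
    d = context_distribution(sequence, tagged_sents)
    best_label, best_sim = "NONE", threshold
    for label, proto_dist in proto_dists.items():
        sim = cosine(d, proto_dist)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label
```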
Slide 11: Distributional Similarity
- Same prototype clusters as above
- The sequence (IN DT) is not close to any prototype's context distribution, so it gets proto=NONE
Slide 12: Prototype CFG+ Model
[Parse tree for "The hungry koala sat in the tree", with spans annotated by their proto features, e.g. proto=NP and proto=NONE.]
Slide 13: Prototype CFG+ Model
Each tree node contributes a CFG rule factor and a proto feature factor, e.g. P(DT NP | NP) and P(proto=NP | NP):

P_{CFG+}(T, F) = \prod_{X(i,j) \to \gamma \,\in\, T} P(\gamma \mid X) \, P(p_{i,j} \mid X)

[Parse tree for "The hungry koala sat in the tree" with S, NP, VP, and PP nodes illustrating the two factors.]
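A minimal sketch of scoring a fixed tree under the CFG+ equation above; the tuple tree representation and the parameter dictionaries are illustrative assumptions:

```python
import math

def cfg_plus_log_score(tree, rule_probs, proto_probs, proto_of_span):
    """Log score of a fixed tree T under the prototype CFG+ model:
    each node X over span (i, j) rewriting as gamma contributes
    log P(gamma | X) + log P(proto(i, j) | X).

    tree: nested (label, i, j, children) tuples; POS leaves have children == [].
    """
    label, i, j, children = tree
    if not children:
        return 0.0  # POS leaves carry no rule or proto factor in this sketch
    gamma = tuple(child[0] for child in children)
    proto = proto_of_span.get((i, j), "NONE")
    score = math.log(rule_probs[(label, gamma)])    # CFG rule factor
    score += math.log(proto_probs[(label, proto)])  # proto feature factor
    return score + sum(cfg_plus_log_score(child, rule_probs, proto_probs,
                                          proto_of_span) for child in children)

# Toy usage: NP(0,2) -> DT NN, where span (0, 2) carries proto=NP
tree = ("NP", 0, 2, [("DT", 0, 1, []), ("NN", 1, 2, [])])
print(cfg_plus_log_score(tree, {("NP", ("DT", "NN")): 1.0},
                         {("NP", "NP"): 1.0}, {(0, 2): "NP"}))  # 0.0
```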
Slide 14: Prototype CFG+ Induction
- Experimental set-up: add prototypes; estimate context distributions from the BLLIP corpus
- Results: unlabeled F1 of 66.9
Slide 15: Constituent-Context Model [Klein & Manning '02]
Each span of "The hungry koala sat in the tree" is a constituent (+) or distituent (-); the model generates each span's yield and context:
- Constituent: P(VBD IN DT NN | +) (yield), P(NN __ <> | +) (context)
- Distituent: P(NN VBD | -) (yield), P(JJ __ IN | -) (context)
Slide 16: Constituent-Context Model
Given a bracketing matrix B over the sentence, with yield y_{i,j} and context c_{i,j} for each span:

P_{CCM}(B, F') = \prod_{(i,j)} P(B_{i,j}) \, P(y_{i,j} \mid B_{i,j}) \, P(c_{i,j} \mid B_{i,j})
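A minimal sketch of the CCM product above for a fixed bracketing; the parameter dictionaries and the boundary marker are illustrative assumptions:

```python
import math

def ccm_log_score(tags, brackets, p_bracket, p_yield, p_context):
    """Log of the CCM product over all spans (i, j):
    log P(B_ij) + log P(y_ij | B_ij) + log P(c_ij | B_ij)."""
    padded = ["<>"] + list(tags) + ["<>"]  # "<>" marks the sentence boundary
    n = len(tags)
    score = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            b = (i, j) in brackets          # constituent (+) or distituent (-)
            y = tuple(tags[i:j])            # the span's yield
            c = (padded[i], padded[j + 1])  # the tags flanking the span
            # The tiny floor stands in for proper smoothing of unseen events.
            score += math.log(p_bracket.get(b, 1e-9))
            score += math.log(p_yield.get((y, b), 1e-9))
            score += math.log(p_context.get((c, b), 1e-9))
    return score
```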
Slide 17: Intersected Model
- The two models capture different aspects of syntax (CCM: bracketing; CFG+: labeled rules)
- Intersected EM [Klein '05; Liang et al. '06]:

P(T, F, F') = P_{CCM}(B(T), F') \, P_{CFG+}(T, F)
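A minimal sketch of the intersected score, reusing the cfg_plus_log_score and ccm_log_score sketches above; B(T) extracts the bracketing induced by the tree, and the helper names are assumptions:

```python
def brackets_of_tree(tree):
    """B(T): the set of spans (i, j) covered by nonterminal nodes of T."""
    label, i, j, children = tree
    spans = {(i, j)} if children else set()
    for child in children:
        spans |= brackets_of_tree(child)
    return spans

def intersected_log_score(tree, tags, cfg_params, ccm_params):
    """log P(T, F, F') = log P_CCM(B(T), F') + log P_CFG+(T, F)."""
    return (ccm_log_score(tags, brackets_of_tree(tree), *ccm_params)
            + cfg_plus_log_score(tree, *cfg_params))
```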
Slide 18: Grammar Induction Experiments
- Intersect CFG+ and CCM: add CCM brackets
- Results: [table]
Slide 19: Reacting to Errors (Possessive NPs)
[Side-by-side comparison: our tree vs. target tree.]
Slide 20: Reacting to Errors
- Add category NP-POS with prototype NN POS
- [New analysis tree.]
Slide 21: Error Analysis (Modal VPs)
[Side-by-side comparison: our tree vs. target tree.]
Slide 22: Reacting to Errors
- Add category VP-INF with prototype VB NN
- [New analysis tree.]
Slide 23: Fixing Errors
- Supplement the prototype list with NP-POS and VP-INF
- Results: unlabeled F1 of 78.2
Slide 24: Results Summary
[Bar chart of labeled F1, 0-100:]

System | Labeled F1
PCFG | 26.3
CCM | 56.8
PROTO | 62.2
Best | 65.1
Upper Bound | 86.1

~50% error reduction from PCFG to Best.
Slide 25: Conclusion
- Prototype-driven learning: a flexible, weakly supervised framework
- Merges distributional clustering techniques with structured models
Slide 26: Thank You!
http://www.cs.berkeley.edu/~aria42
Slide 27: Lots of numbers [additional results table]
Slide 28: More numbers [additional results table]
Slide 29: Yet more numbers [additional results table]