Learning and Inference for Hierarchically Split PCFGs
Slav Petrov and Dan Klein

The Game of Designing a Grammar
Annotation refines base treebank symbols to improve statistical fit of the grammar:
- Parent annotation [Johnson ’98]
- Head lexicalization [Collins ’99, Charniak ’00]
- Automatic clustering?

Learning Latent Annotations [Matsuzaki et al. ’05]
EM algorithm:
- Brackets are known
- Base categories are known
- Only induce subcategories
Just like Forward-Backward for HMMs.
[Figure: parse tree of "He was right." with latent node annotations X1 to X7, annotated with Forward and Backward passes]
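To make the Forward-Backward analogy concrete, here is a minimal sketch of the bottom-up (inside) pass over a tree whose brackets and base categories are fixed, summing over the latent subcategories. The toy bracketing of "He was right.", the tags, the choice of K = 2 subcategories, and all probabilities are assumptions for illustration; they are not taken from the slides.

```python
from itertools import product

K = 2  # number of latent subcategories per base category (assumed)

def inside(node, rule_p, emit_p):
    """Return [P(words below node | label with subcategory x) for x in range(K)]."""
    label, body = node
    if isinstance(body, str):  # preterminal directly above a word
        return [emit_p[(label, x, body)] for x in range(K)]
    left, right = body         # binary node: the bracket itself is known
    in_l = inside(left, rule_p, emit_p)
    in_r = inside(right, rule_p, emit_p)
    scores = []
    for x in range(K):
        total = 0.0
        # Sum over the unknown subcategories of both children.
        for y, z in product(range(K), repeat=2):
            total += rule_p[(label, x, left[0], y, right[0], z)] * in_l[y] * in_r[z]
        scores.append(total)
    return scores

def tree_likelihood(tree, root_p, rule_p, emit_p):
    """Likelihood of the observed tree, marginalizing over all subcategories."""
    return sum(root_p[x] * s for x, s in enumerate(inside(tree, rule_p, emit_p)))

# Hypothetical toy tree and parameters.
tree = ("S", (("PRP", "He"), ("VP", (("VBD", "was"), ("JJ", "right")))))
rule_p = {}
for x, y, z in product(range(K), repeat=3):
    rule_p[("S", x, "PRP", y, "VP", z)] = 0.25
    rule_p[("VP", x, "VBD", y, "JJ", z)] = 0.25
emit_p = {
    ("PRP", 0, "He"): 0.4, ("PRP", 1, "He"): 0.2,
    ("VBD", 0, "was"): 0.5, ("VBD", 1, "was"): 0.5,
    ("JJ", 0, "right"): 0.1, ("JJ", 1, "right"): 0.3,
}
root_p = [0.5, 0.5]
lik = tree_likelihood(tree, root_p, rule_p, emit_p)
```

A full EM step would pair this with the matching top-down (outside) pass to obtain expected rule counts; that part is omitted here.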

Overview
Limit of computational resources
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing

Refinement of the DT tag
[Figure: DT split directly into DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement of the DT tag
[Figure: DT refined by repeated binary splits]
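The splitting step of hierarchical training can be sketched as follows: every subcategory is divided in two, and the two copies receive slightly perturbed parameters so that a later EM pass can pull them apart. The function name, the noise level, and the toy DT distribution are assumptions for illustration, and the EM re-estimation between rounds is left out.

```python
import random

def split_in_two(emissions, noise=0.01, seed=0):
    """Split each subcategory's word distribution into two jittered copies.

    emissions: list of {word: probability}, one dict per subcategory.
    noise: relative size of the random perturbation (assumed value).
    """
    rng = random.Random(seed)
    refined = []
    for dist in emissions:
        for _ in range(2):
            jittered = {w: p * (1.0 + noise * rng.uniform(-1.0, 1.0))
                        for w, p in dist.items()}
            z = sum(jittered.values())
            # Renormalize so each copy is still a probability distribution.
            refined.append({w: p / z for w, p in jittered.items()})
    return refined

# Hypothetical starting point: a single unsplit DT tag.
dt = [{"the": 0.6, "a": 0.3, "this": 0.1}]
for _ in range(2):          # DT -> 2 subcategories -> 4 subcategories
    dt = split_in_two(dt)   # (EM re-estimation would run here between rounds)
```

Without the perturbation the two copies would be identical, and EM would have no reason to move them apart.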

Hierarchical Estimation Results
Model                  F1
Baseline               87.3
Hierarchical Training  88.4

Refinement of the "," tag
- Splitting all categories the same amount is wasteful.

Adaptive Splitting
- Want to split complex categories more
- Idea: split everything, roll back splits which were least useful
- Criterion for each split: (likelihood with split reversed) / (likelihood with split)
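The roll-back decision can be sketched in log space: for each split, compare the data log-likelihood with the split in place against the log-likelihood with that one split reversed, then undo the splits that cost the least. The function name, the `fraction` parameter, and all numbers below are hypothetical.

```python
def splits_to_roll_back(log_lik_split, log_lik_merged, fraction=0.5):
    """Pick the least useful splits.

    log_lik_split: log-likelihood of the data with every split in place.
    log_lik_merged: {split name: log-likelihood with only that split reversed}.
    fraction: share of splits to undo (assumed parameter).
    """
    # Loss from reversing a split = -log(likelihood reversed / likelihood split).
    loss = {name: log_lik_split - merged for name, merged in log_lik_merged.items()}
    ranked = sorted(loss, key=loss.get)  # smallest loss first
    return ranked[: int(len(ranked) * fraction)]

# Hypothetical numbers: reversing the "," or TO split barely hurts the likelihood.
rolled_back = splits_to_roll_back(
    -1000.0,
    {"DT": -1040.0, ",": -1000.1, "NP": -1100.0, "TO": -1000.5},
)
```

In this toy setting the "," and TO splits are undone, while DT and NP keep their refinements.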

Adaptive Splitting Results
Model             F1
Previous          88.4
With 50% Merging  89.5

Number of Phrasal Subcategories
[Figure: bar chart of the number of subcategories per phrasal category; highlighted labels: NP, VP, PP, and X, NAC]

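Number of Lexical Subcategories
[Figure: bar chart of the number of subcategories per part-of-speech tag; highlighted labels: TO, ",", POS, and NN, NNS, NNP, JJ]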

Smoothing
- Heavy splitting can lead to overfitting
- Idea: Smoothing allows us to pool statistics
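One way to pool statistics, shown here as a sketch, is to interpolate each subcategory's distribution with the average over all subcategories of the same base category. The interpolation weight `alpha` and the toy DT distributions are assumptions, not values given on the slides.

```python
def smooth(sub_dists, alpha=0.01):
    """Shrink each subcategory's distribution toward the mean of its siblings.

    sub_dists: list of {outcome: probability}, one per subcategory of a base tag.
    alpha: interpolation weight (assumed value).
    """
    n = len(sub_dists)
    outcomes = set().union(*sub_dists)  # every outcome seen by any subcategory
    mean = {o: sum(d.get(o, 0.0) for d in sub_dists) / n for o in outcomes}
    return [
        {o: (1.0 - alpha) * d.get(o, 0.0) + alpha * mean[o] for o in outcomes}
        for d in sub_dists
    ]

# Hypothetical DT subcategories that have specialized completely.
dt = [{"the": 1.0}, {"a": 0.5, "this": 0.5}]
smoothed = smooth(dt, alpha=0.1)
```

After smoothing, the first subcategory assigns a small nonzero probability to "a" and "this", and each distribution still sums to one.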

Result Overview
Model           F1
Previous        89.5
With Smoothing  90.7

Linguistic Candy
Proper Nouns (NNP):
NNP-14  Oct.   Nov.       Sept.
NNP-12  John   Robert     James
NNP-2   J.     E.         L.
NNP-1   Bush   Noriega    Peters
NNP-15  New    San        Wall
NNP-3   York   Francisco  Street
Personal pronouns (PRP):
PRP-0   It     He         I
PRP-1   it     he         they
PRP-2   it     them       him

Comparative adverbs (RBR):
RBR-0   further  lower    higher
RBR-1   more     less     More
RBR-2   earlier  Earlier  later
Cardinal Numbers (CD):
CD-7    one      two      Three
CD-4    1989     1990     1988
CD-11   million  billion  trillion
CD-0    1        50       100
CD-3    1        30       31
CD-9    78       58       34

Inference
She heard the noise.
Exhaustive parsing: 1 min per sentence

Coarse-to-Fine Parsing [Goodman ’97, Charniak & Johnson ’05]
[Figure: Treebank -> Coarse grammar (NP … VP) -> Parse -> Prune -> Refined grammar (NP-17, NP-12, NP-1, VP-6, VP-31, …) -> Parse]

Hierarchical Pruning
Consider again the span 5 to 12:
coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … …
Pruning threshold: < t
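The level-by-level pruning of a single span can be sketched as follows. It assumes that the quantity compared against t is a posterior probability, that subcategory i of a symbol splits into subcategories 2i and 2i+1 at the next level, and that the per-level scores are supplied as plain dictionaries; all of these, along with the numbers, are illustrative assumptions.

```python
def prune_span(levels, t):
    """Return the symbols that survive pruning at the finest level for one span.

    levels: list of {(base, index): posterior}, coarsest grammar first.
    t: pruning threshold; items scoring below t are dropped.
    """
    allowed = set(levels[0])  # at the coarsest level every symbol is a candidate
    survivors = set()
    for scores in levels:
        # Keep only candidates that are allowed and score at least t.
        survivors = {sym for sym in allowed if scores.get(sym, 0.0) >= t}
        # Only the two children of a surviving symbol are considered next.
        allowed = {(base, 2 * i + j) for (base, i) in survivors for j in (0, 1)}
    return survivors

# Hypothetical posteriors for one span under two successive grammars.
levels = [
    {("QP", 0): 0.001, ("NP", 0): 0.6, ("VP", 0): 0.39},
    {("QP", 0): 0.05, ("NP", 0): 0.5, ("NP", 1): 0.002,
     ("VP", 0): 0.3, ("VP", 1): 0.1},
]
kept = prune_span(levels, t=0.01)
```

QP is pruned at the coarse level, so its refinements are never considered at the next level, whatever score they would have received.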

Intermediate Grammars
[Figure: learning produces the sequence X-Bar = G0, G1, G2, G3, G4, G5, G6 = G; for example DT -> DT1, DT2 -> DT1 … DT4 -> DT1 … DT8]

Projected Grammars
[Figure: learning produces X-Bar = G0, G1, …, G6 = G; projection πi maps the final grammar G to the coarser grammars π0(G), π1(G), π2(G), π3(G), π4(G), π5(G)]

Final Results (Efficiency)
- Parsing the development set (1600 sentences)
- Berkeley Parser: 10 min, implemented in Java
- Charniak & Johnson ’05 Parser: 19 min, implemented in C

Final Results (Accuracy)
Language  Parser                                ≤ 40 words F1  all F1
ENG       Charniak & Johnson ’05 (generative)   90.1           89.6
ENG       This Work                             90.6           90.1
GER       Dubey ’05                             76.3           -
GER       This Work                             80.8           80.1
CHN       Chiang et al. ’02                     80.0           76.6
CHN       This Work                             86.3           83.4

Extensions
- Acoustic modeling [Petrov, Pauls & Klein ’07]
- Infinite Grammars: Nonparametric Bayesian Learning [Liang, Petrov, Jordan & Klein ’07]

Conclusions
- Split & Merge Learning
  - Hierarchical Training
  - Adaptive Splitting
  - Parameter Smoothing
- Hierarchical Coarse-to-Fine Inference
  - Projections
  - Marginalization
- Multi-lingual Unlexicalized Parsing

Thank You! http://nlp.cs.berkeley.edu
