1 Learning Accurate, Compact, and Interpretable Tree Annotation
Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein

2-4 The Game of Designing a Grammar
- Annotation refines base treebank symbols to improve statistical fit of the grammar
- Parent annotation [Johnson ’98] (see the code sketch after this list)
- Head lexicalization [Collins ’99, Charniak ’00]
- Automatic clustering?
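
For concreteness, here is a minimal sketch of the parent-annotation idea, assuming the nltk library for tree handling; the helper name annotate_parents and the example sentence are ours, not from the talk, and real systems typically restrict the annotation to phrasal categories.

```python
# Parent annotation [Johnson '98]: relabel each nonterminal with its parent's
# category (e.g. NP -> NP^S) so the grammar read off the treebank conditions
# on more context. Uses nltk.Tree; `annotate_parents` is our own helper.
from nltk import Tree

def annotate_parents(tree, parent_label=None):
    """Return a copy of `tree` with every nonterminal label suffixed by its
    parent's label; the leaves (words) are left untouched."""
    if isinstance(tree, str):                    # a leaf / terminal word
        return tree
    label = tree.label() if parent_label is None else f"{tree.label()}^{parent_label}"
    return Tree(label, [annotate_parents(child, tree.label()) for child in tree])

if __name__ == "__main__":
    t = Tree.fromstring("(S (NP (PRP He)) (VP (VBD was) (ADJP (JJ right))) (. .))")
    print(annotate_parents(t))
    # (S (NP^S (PRP^NP He)) (VP^S (VBD^VP was) (ADJP^VP (JJ^ADJP right))) (.^S .))
```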

5 Previous Work: Manual Annotation
- Manually split categories:
  - NP: subject vs. object
  - DT: determiners vs. demonstratives
  - IN: sentential vs. prepositional
- Advantages: fairly compact grammar; linguistic motivations
- Disadvantages: performance leveled out; manually annotated [Klein & Manning ’03]

  Model                    F1
  Naïve Treebank Grammar   72.6
  Klein & Manning ’03      86.3

6 Previous Work: Automatic Annotation Induction [Matsuzaki et al. ’05, Prescher ’05]
- Label all nodes with latent variables; same number k of subcategories for all categories
- Advantages: automatically learned
- Disadvantages: grammar gets too large; most categories are oversplit while others are undersplit

  Model                  F1
  Klein & Manning ’03    86.3
  Matsuzaki et al. ’05   86.7

7 Previous work is complementary
- Manual annotation: allocates splits where needed; very tedious; compact grammar; misses features
- Automatic annotation: splits uniformly; automatically learned; large grammar; captures many features
- This work: allocates splits where needed; automatically learned; compact grammar; captures many features

8 Learning Latent Annotations
EM algorithm, just like Forward-Backward for HMMs (a toy sketch follows below):
- Brackets are known
- Base categories are known
- Only induce subcategories
[Figure: parse tree with latent subcategory variables X1-X7 over the sentence "He was right.", with forward and backward passes indicated]
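
The E-step this slide alludes to can be written compactly: with the tree fixed, an upward (inside) and downward (outside) pass over its nodes yields posteriors over the latent subcategories, in direct analogy to Forward-Backward on a chain. Everything below (the tree encoding, the rules and emissions dictionaries, the implicit root weight of 1 per subcategory) is our own toy setup, not the authors' code.

```python
import numpy as np

def inside(node, rules, emissions, memo):
    """Upward pass: memo[id(node)][x] = P(words under node | subcategory x)."""
    label = node[0]
    if isinstance(node[1], str):                      # preterminal: (tag, word)
        score = emissions[(label, node[1])]
    else:                                             # binary node: (label, left, right)
        left, right = node[1], node[2]
        i_l = inside(left, rules, emissions, memo)
        i_r = inside(right, rules, emissions, memo)
        rule = rules[(label, left[0], right[0])]      # shape (k, k, k): P(A_x -> B_y C_z)
        score = np.einsum('xyz,y,z->x', rule, i_l, i_r)
    memo[id(node)] = score
    return score

def outside(node, o, rules, memo, posteriors, z):
    """Downward pass: collects P(subcategory | sentence) for every node."""
    posteriors.append((node[0], o * memo[id(node)] / z))
    if isinstance(node[1], str):
        return
    left, right = node[1], node[2]
    rule = rules[(node[0], left[0], right[0])]
    i_l, i_r = memo[id(left)], memo[id(right)]
    outside(left,  np.einsum('xyz,x,z->y', rule, o, i_r), rules, memo, posteriors, z)
    outside(right, np.einsum('xyz,x,y->z', rule, o, i_l), rules, memo, posteriors, z)

# Toy run: k = 2 subcategories, a two-word tree, made-up probabilities.
k, rng = 2, np.random.default_rng(0)
tree = ("S", ("NP", "He"), ("VP", "was"))
rules = {("S", "NP", "VP"): rng.dirichlet(np.ones(k * k), size=k).reshape(k, k, k)}
emissions = {("NP", "He"): np.array([0.6, 0.1]), ("VP", "was"): np.array([0.2, 0.3])}

memo, posteriors = {}, []
likelihood = inside(tree, rules, emissions, memo).sum()   # root weight of 1 per subcategory
outside(tree, np.ones(k), rules, memo, posteriors, likelihood)
for label, p in posteriors:
    print(label, p)                                       # each posterior vector sums to 1
```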

9 Overview
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing
[Chart annotation: limit of computational resources]

10 Refinement of the DT tag
[Figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4]

11 Refinement of the DT tag
[Figure]

12 Hierarchical refinement of the DT tag
[Figure]
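
A schematic sketch of the hierarchical refinement loop illustrated on slides 10-13, in our own simplified representation: each round splits every subcategory of a symbol in two (copying its parameters and adding a little noise to break symmetry), after which EM would be re-run on the refined grammar. The EM step itself is the inside-outside pass sketched above and is only stubbed here via the `run_em` parameter.

```python
import numpy as np

def split_in_two(rule_probs, noise=0.01, rng=None):
    """rule_probs: array of shape (k, m), one row of normalized rule
    probabilities per subcategory. Returns a (2k, m) array in which each
    subcategory becomes two slightly perturbed copies of itself."""
    if rng is None:
        rng = np.random.default_rng(0)
    doubled = np.repeat(rule_probs, 2, axis=0)                 # children copy the parent
    doubled = doubled * (1.0 + noise * rng.standard_normal(doubled.shape))
    return doubled / doubled.sum(axis=1, keepdims=True)        # renormalize each row

def hierarchical_training(grammar, corpus, run_em, rounds=6):
    """grammar: dict symbol -> rule-probability array as above. `run_em`
    is a stand-in for the inside-outside E-step plus count-normalizing M-step."""
    for _ in range(rounds):                  # 1 -> 2 -> 4 -> ... subcategories per symbol
        grammar = {sym: split_in_two(p) for sym, p in grammar.items()}
        grammar = run_em(grammar, corpus)
    return grammar

# Hierarchy for a single tag: DT -> (DT-1, DT-2) -> (DT-1-1, ..., DT-2-2)
dt = np.array([[0.5, 0.3, 0.2]])             # one DT subcategory over three outcomes
dt = split_in_two(split_in_two(dt))
print(dt.shape)                               # (4, 3)
```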

13 Hierarchical Estimation Results

  Model                  F1
  Baseline               87.3
  Hierarchical Training  88.4

14 Refinement of the ',' tag
- Splitting all categories the same amount is wasteful:

15 The DT tag revisited
- Oversplit?

16-18 Adaptive Splitting
- Want to split complex categories more
- Idea: split everything, roll back splits which were least useful

19 Adaptive Splitting
- Evaluate the loss in likelihood from removing each split (see the sketch below):
  loss = (data likelihood with split reversed) / (data likelihood with split)
- No loss in accuracy when 50% of the splits are reversed.
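
A rough sketch of how such a loss could be computed and used, in our own simplified form: for one occurrence of a split category, the likelihood the sentence would have had with the two sibling subcategories merged can be approximated from their inside/outside scores, and the splits whose total loss is smallest are the ones rolled back. The function names and the exact approximation below are our assumptions, not a transcription of the authors' code.

```python
def merge_loss_at_node(i1, o1, i2, o2, w1, w2, sentence_likelihood):
    """Approximate (likelihood with split reversed) / (likelihood with split)
    for one occurrence of a split category.
    i*, o*: inside/outside scores of the two sibling subcategories at this node;
    w*: their relative frequencies (w1 + w2 = 1)."""
    merged_inside = w1 * i1 + w2 * i2
    merged_outside = o1 + o2
    merged_likelihood = (sentence_likelihood
                         - (i1 * o1 + i2 * o2)       # remove the split pair's contribution
                         + merged_inside * merged_outside)
    return merged_likelihood / sentence_likelihood

def splits_to_reverse(log_loss, fraction=0.5):
    """log_loss: dict mapping each split to the sum of log(merge_loss_at_node)
    over all its occurrences (values near 0 mean the split barely helps).
    Roll back the `fraction` of splits that cost the least likelihood."""
    ranked = sorted(log_loss, key=log_loss.get, reverse=True)   # least useful first
    return ranked[: int(len(ranked) * fraction)]

# Two subcategories that behave almost identically give a ratio near 1,
# so their split is a prime candidate for merging.
print(merge_loss_at_node(i1=0.30, o1=0.02, i2=0.28, o2=0.02, w1=0.5, w2=0.5,
                         sentence_likelihood=0.05))
```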

20 Adaptive Splitting Results

  Model              F1
  Previous           88.4
  With 50% Merging   89.5

21 Number of Phrasal Subcategories

22 Number of Phrasal Subcategories
[Chart: bars for NP, VP, and PP highlighted]

23 Number of Phrasal Subcategories
[Chart: bars for X and NAC highlighted]

24 Number of Lexical Subcategories
[Chart: bars for TO, ',', and POS highlighted]

25 Number of Lexical Subcategories
[Chart: bars for IN, DT, RB, and VBx highlighted]

26 Number of Lexical Subcategories
[Chart: bars for NN, NNS, NNP, and JJ highlighted]

27 Smoothing
- Heavy splitting can lead to overfitting
- Idea: smoothing allows us to pool statistics

28 Linear Smoothing
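
A minimal sketch of the linear smoothing idea, under our own representation: each subcategory's rule distribution is shrunk toward the mean of all subcategories of the same base category, so rarely observed subcategories pool statistics with their siblings. The interpolation weight alpha below is illustrative, not the value used in the paper.

```python
import numpy as np

def linear_smooth(rule_probs, alpha=0.01):
    """rule_probs: array of shape (k, m), one row of rule probabilities per
    subcategory of one base category. Every row is interpolated with the
    mean row, then renormalized."""
    mean = rule_probs.mean(axis=0, keepdims=True)
    smoothed = (1.0 - alpha) * rule_probs + alpha * mean
    return smoothed / smoothed.sum(axis=1, keepdims=True)

dt = np.array([[0.9, 0.1, 0.0],      # a subcategory that has driven one rule to zero
               [0.2, 0.5, 0.3]])
print(linear_smooth(dt))             # the zero entry becomes small but nonzero
```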

29 Result Overview

30

31
  Model            F1
  Previous         89.5
  With Smoothing   90.7

32-33 Final Results

  Parser                   F1 (≤ 40 words)   F1 (all words)
  Klein & Manning ’03      86.3              85.7
  Matsuzaki et al. ’05     86.7              86.1
  Collins ’99              88.6              88.2
  Charniak & Johnson ’05   90.1              89.6
  This Work                90.2              89.7

34 Linguistic Candy
- Proper nouns (NNP):
  NNP-14: Oct., Nov., Sept.
  NNP-12: John, Robert, James
  NNP-2:  J., E., L.
  NNP-1:  Bush, Noriega, Peters
  NNP-15: New, San, Wall
  NNP-3:  York, Francisco, Street
- Personal pronouns (PRP):
  PRP-0: It, He, I
  PRP-1: it, he, they
  PRP-2: it, them, him

35 Linguistic Candy
- Relative adverbs (RBR):
  RBR-0: further, lower, higher
  RBR-1: more, less, More
  RBR-2: earlier, Earlier, later
- Cardinal numbers (CD):
  CD-7:  one, two, Three
  CD-4:  1989, 1990, 1988
  CD-11: million, billion, trillion
  CD-0:  1, 50, 100
  CD-3:  1, 30, 31
  CD-9:  78, 58, 34

36 Conclusions
- New ideas: hierarchical training, adaptive splitting, parameter smoothing
- State-of-the-art parsing performance: F1 improves from 63.4 (X-Bar initializer) to 90.2
- Linguistically interesting grammars to sift through

37 Thank You! petrov@eecs.berkeley.edu

38 Other things we tried
- X-Bar vs. structurally annotated grammar: the X-Bar grammar starts at lower performance but provides more flexibility
- Better smoothing: tried different (hierarchical) smoothing methods; all worked about the same
- (Linguistically) constraining rewrite possibilities between subcategories: hurts performance
- EM automatically learns that most subcategory combinations are meaningless: ≥ 90% of the possible rewrites have 0 probability (see the sketch below)
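
As a small illustration of how the last point could be checked, here is a sketch that counts the fraction of subcategory-level rewrites A_x -> B_y C_z with (near-)zero probability after training; the grammar representation matches the toy E-step sketch above and is our assumption, not the authors' data structure.

```python
import numpy as np

def fraction_zero_rewrites(rules, tol=1e-10):
    """rules: dict (A, B, C) -> array of shape (k, k, k) holding
    P(A_x -> B_y C_z). Returns the fraction of entries that are ~0."""
    total = sum(r.size for r in rules.values())
    zeros = sum(int((r < tol).sum()) for r in rules.values())
    return zeros / total

# On a trained grammar the slide's claim would correspond to a value >= 0.9.
example = {("S", "NP", "VP"): np.array([[[1.0, 0.0], [0.0, 0.0]],
                                        [[0.0, 0.0], [0.0, 1.0]]])}
print(fraction_zero_rewrites(example))   # 0.75 for this tiny example
```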

