1
In Search of a More Probable Parse: Experiments with DOP* and the Penn Chinese Treebank Aaron Meyers Linguistics 490 Winter 2009
2
Syntax 101 Given a sentence, produce a syntax tree (a parse) Example: ‘Mary likes books’ Software that does this is known as a parser
3
Grammars Context-Free Grammar (CFG) ▫Simple rules describing potential configurations ▫From example: S → NP VP; NP → Mary; VP → V NP; V → likes; NP → books Problems with ambiguity
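As an illustration, here is a minimal sketch of this toy grammar in Python using NLTK (the toolkit is my assumption; the slides don't name one):

```python
# Sketch of the slide's toy CFG in NLTK (assumed toolkit, not from the slides).
import nltk

grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> 'Mary' | 'books'
    VP -> V NP
    V -> 'likes'
""")

# A chart parser enumerates every parse the grammar licenses;
# an ambiguous grammar would yield more than one tree per sentence.
parser = nltk.ChartParser(grammar)
for tree in parser.parse(['Mary', 'likes', 'books']):
    print(tree)
```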
4
Tree Substitution Grammar (TSG) Incorporates larger tree fragments Substitution operator (◦) combines fragments A context-free grammar is a trivial TSG [figure: two tree fragments combined with ◦ to yield a complete parse tree]
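A rough sketch of the substitution operator on fragments encoded as (label, children) tuples, purely to illustrate how ◦ works; this is not the representation dopdis uses:

```python
# Fragments are (label, children) tuples; a node with children=None is an
# open substitution site, and a leaf word has children=[]. Illustrative only.

def substitute(fragment, piece):
    """Replace the leftmost open site in `fragment` whose label matches the
    root of `piece`. Returns (new_tree, done_flag)."""
    label, children = fragment
    if children is None:                      # open substitution site
        if label == piece[0]:
            return piece, True
        return fragment, False
    new_children = []
    done = False
    for child in children:
        if not done:
            child, done = substitute(child, piece)
        new_children.append(child)
    return (label, new_children), done

# ('S' fragment with two open NP sites) ◦ NP ◦ NP = full tree
s = ('S', [('NP', None), ('VP', [('V', [('likes', [])]), ('NP', None)])])
mary = ('NP', [('Mary', [])])
books = ('NP', [('books', [])])
tree, _ = substitute(s, mary)      # fills the subject NP
tree, _ = substitute(tree, books)  # fills the object NP
print(tree)
```

Applied left to right, the two substitutions rebuild exactly the parse from the earlier CFG example.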
5
Treebanks Database of sentences and corresponding syntax trees ▫Trees are hand-annotated Penn Treebanks among most commonly used Grammars can be created automatically from a treebank (training) ▫Extract rules (CFG) or fragments (TSG) directly from trees
6
Learning Grammar from Treebank Many rules or fragments will occur repeatedly ▫Incorporate frequencies into grammar ▫Probabilistic Context-Free Grammar (PCFG), Stochastic Tree Substitution Grammar (STSG) Data-Oriented Parsing (DOP) model ▫DOP1 (1992): Type of STSG ▫Describes how to extract fragments from a treebank for inclusion in grammar (model) ▫Generally limit fragments to a certain max depth
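For the simplest case, extracting CFG rules with relative-frequency estimates, here is a sketch of how a PCFG can be read off a treebank (same toy tuple encoding as above; STSG fragment extraction works analogously but over subtrees):

```python
# Sketch: relative-frequency PCFG estimation from a treebank of
# (label, children) tuple trees. Not the DOP1/DOP* extraction itself.
from collections import Counter

def count_rules(tree, counts):
    """Count each rule lhs -> (child labels) occurring in `tree`."""
    label, children = tree
    if not children:                       # word leaf: no rule
        return
    counts[(label, tuple(child[0] for child in children))] += 1
    for child in children:
        count_rules(child, counts)

def estimate_pcfg(treebank):
    """P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    counts = Counter()
    for tree in treebank:
        count_rules(tree, counts)
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}
```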
7
Penn Chinese Treebank Latest version 6.0 (2007) ▫Xinhua newswire (7339 sentences) ▫Sinorama news magazine (7106 sentences) ▫Hong Kong news (519 sentences) ▫ACE Chinese broadcast news (9246 sentences)
9
Penn Chinese Treebank and DOP Previous experiments (2004) with Penn Chinese Treebank and DOP1 ▫1473 trees selected from Xinhua newswire ▫Fragment depth limited to three levels or less
10
An improved DOP model: DOP* Challenges with DOP1 model ▫Computationally inefficient (exponential increase in number of fragments extracted) ▫Statistically inconsistent A new estimator: DOP* (2005) ▫Limits fragment extraction by estimating optimal fragments using subsets of training corpus ▫Linear rather than exponential increase in fragments ▫Statistically consistent (accuracy increases as size of training corpus increases)
11
Research Question & Hypothesis Will a DOP* parser applied to the Penn Chinese Treebank show significant improvement in accuracy for a model incorporating fragments up to depth five compared to a model incorporating only fragments up to depth three? Hypothesis: Yes, accuracy will significantly increase ▫Deeper fragments allow parser to capture non- local dependencies in syntax usage/preference
12
Selecting training and testing data Subset of Xinhua newswire (2402 sentences) ▫Includes only IP trees (no headlines or fragments) ▫Excluded sentences of average or greater length The remaining 1402 sentences were divided three times into random training/test splits ▫Each test split has 140 sentences ▫The other 1262 sentences are used for training
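A sketch of how such splits could be generated (the helper and fixed seed are hypothetical; the slides don't describe the shuffling procedure):

```python
# Sketch: three random 1262-train / 140-test splits of the 1402 sentences.
import random

def make_splits(sentence_ids, n_splits=3, test_size=140, seed=42):
    """Return n_splits (train, test) pairs, each from a fresh shuffle."""
    rng = random.Random(seed)          # fixed seed so splits are reproducible
    splits = []
    for _ in range(n_splits):
        shuffled = list(sentence_ids)
        rng.shuffle(shuffled)
        splits.append((shuffled[test_size:], shuffled[:test_size]))
    return splits

splits = make_splits(range(1402))      # 1402 retained sentences
```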
13
Preparing the trees Penn Treebank trees converted to dopdis format Chinese characters converted to alphanumeric codes Standard tree normalizations ▫Removed empty nodes ▫Removed A-over-A and X-over-A unaries ▫Stripped functional tags
Original: (IP (NP-PN-SBJ (NR 上海 ) (NR 浦东 )) (VP …
Converted: (ip,[(np,[(nr,[(hmeiahodpp_,[])]),(nr,[(hodoohmejc_,[])])]),(vp, …
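A sketch of those normalizations on the toy tuple encoding used earlier (the real dopdis converter, including the alphanumeric recoding of Chinese characters, is not shown in the slides):

```python
# Sketch of the listed normalizations on (label, children) tuple trees.

def normalize(tree):
    """Drop empty nodes, collapse unary chains, strip functional tags."""
    label, children = tree
    if label == '-NONE-':                # empty (trace) node: drop it
        return None
    if not children:                     # word leaf: keep as-is
        return tree
    label = label.split('-')[0].split('=')[0]   # NP-PN-SBJ -> NP
    kept = [n for n in (normalize(c) for c in children) if n is not None]
    if not kept:
        return None                      # all children were empty nodes
    if len(kept) == 1 and kept[0][1]:    # A-over-A / X-over-A unary: keep child
        return kept[0]
    return (label, kept)
```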
14
Training & testing the parser DOP* parser is created by training a model with the training trees The parser is then tested by processing the test sentences ▫Parse trees returned by parser are compared with original parse trees from treebank Standard evaluation metrics computed: labeled recall, labeled precision, and f-score (mean) Repeated for each depth level, test/training split
15
Parsing Results

Depth   Labeled Recall   Labeled Precision   F-score
1       59.01%           58.14%              58.57%
3       71.64%           67.42%              69.47%
5       72.27%           67.80%              69.96%
16
Other interesting statistics

Depth   #Fragments Extracted   Total Training Time (hours)   Total Testing Time (hours)   Seconds / Sentence
1       6,687                  1.659                         0.336                        8.64
3       50,533                 3.342                         0.605                        15.56
5       166,760                4.099                         6.069                        156.06

Training time at depth-3 and depth-5 is similar, even though depth-5 has a much higher fragment count. Testing time at depth-5, though, is ten times higher than at depth-3!
17
Conclusion Parsing results for the other two testing/training splits remain to be obtained; if they are similar: Increasing fragment extraction depth from three to five does not significantly improve accuracy for a DOP* parser over the Penn Chinese Treebank ▫Determine statistical significance ▫Any practical benefit is negated by the increased parsing time
18
Future Work Increase size of training corpus ▫DOP* estimation is consistent: accuracy should increase as a larger training corpus is used Perform experiment with DOP1 model ▫Accuracy obtained with DOP* was lower than in previous experiments using DOP1 (Hearne & Way 2004) Qualitative analysis ▫Which constructions are captured more accurately?
19
Future Work Perform experiments with other corpora ▫Other sections of the Chinese Treebank ▫Other treebanks: Penn Arabic Treebank, … Increase capacity and stability of the dopdis system ▫Encountered various failures on larger runs, with crashes after as long as 36 hours ▫Efficiency could be increased by larger memory support (64-bit architecture) and by storing and indexing fragments with a relational database system