Smoothing Issues in the Structured Language Model
Woosung Kim, Sanjeev Khudanpur, and Jun Wu
The Center for Language and Speech Processing, The Johns Hopkins University
3400 N. Charles Street, Barton Hall, Baltimore, MD 21218
{woosung, sanjeev,

Introduction
- The Structured Language Model (SLM) is an attempt to exploit the syntactic structure of natural language.
- It consists of three modules: a predictor, a tagger, and a parser.
- It jointly assigns a probability to a word sequence and its parse structure.
- The SLM still suffers from data sparseness; Deleted Interpolation (DI) has been used to smooth it.
- This work uses Kneser-Ney smoothing to improve its performance.

The Structured Language Model (SLM)
(Figure: example of a partial parse and of probability estimation in the SLM, for the sentence "The contract ended with a loss of cents after" with part-of-speech tags DT NN VBD IN DT NN IN CD NNS; partial constituents are headed by "contract" (NP), "loss" (NP), "cents" (NP), "of" (PP), "with" (PP), and "ended" (VP), and the predictor, tagger, and parser each contribute to the parse-tree probability.)

Experimental Setup
Two corpora:
- Wall Street Journal (WSJ), UPenn Treebank: for LM PPL tests.
- Switchboard (SWB): for ASR WER tests as well as LM PPL.

Tokenization:
- The original SWB tokenization (e.g., "They're", "It's") is not suitable for syntactic analysis.
- Treebank tokenization (e.g., "They 're", "It 's") is used instead.

Database Size Specifications (in words):

  Item                  WSJ           SWB
  Word Vocabulary       10K (open)    21K (closed)
  Part-of-Speech Tags   40            49
  Non-Terminal Tags     54            64
  Parser Operations     136           112
  LM Dev. Set           885K          2.07M
  LM Check Set          117K          216K
  LM Test Set           82K           20K
  ASR Test Set          -

Kneser-Ney Smoothing
- Backoff (KN-BO), applied to the predictor alone or to all modules.
- Nonlinear interpolation (NI), optionally with deleted estimation.

Results
Language Model Perplexity (per row: 3gram; SLM at EM iterations EM0 and EM3; interpolated SLM at EM0 and EM3):
- Deleted Interpolation: WSJ 162, 166, 154, 149, 146; SWB 70, 73, 72, 67, 66
- KN-BO (Predictor): WSJ 152, 139, 137; SWB 64, 63, 60
- KN-BO (All Modules): WSJ 170, 153, 141, 140; SWB 65, 61
- Nonlinear Interpolation: WSJ 132, 131
- NI w/ Deleted Estimation: WSJ 145, 150, 130

(Figure: test-set PPL as a function of the interpolation weight λ.)

N-Best Rescoring: speech -> recognizer with the baseline LM -> 100-best hypotheses -> rescoring with the new LM -> single best hypothesis.

ASR WER for SWB:
- Deleted Interpolation: 3gram 39.1%, SLM 38.6%, interpolated 38.2%
- KN-BO (Predictor): 3gram 38.3%, SLM 37.7%, interpolated 37.5%
- KN-BO (All Modules): 37.8%
- Nonlinear Interpolation: 38.1%, 37.6%
- NI w/ Deleted Estimation

Concluding Remarks
- KN smoothing of the SLM shows modest but consistent improvements in both PPL and WER.

Future Work
- Combine the SLM with Maximum Entropy models, although Maximum Entropy model training requires heavy computation.
- Feature selection for the Maximum Entropy models is expected to yield fruitful results.

This research was partially supported by the U.S. National Science Foundation via STIMULATE grant No.
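The backoff Kneser-Ney estimate discussed above can be illustrated at the bigram level. The following is a minimal sketch, not the poster's implementation: the function name, the toy corpus, and the fixed discount of 0.75 are illustrative assumptions. The SLM applies the same idea to the conditional distributions of its predictor, tagger, and parser.

```python
from collections import Counter, defaultdict

def train_kn_bigram(tokens, discount=0.75):
    """Backoff Kneser-Ney bigram estimator over a token list (toy sketch)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    history = Counter(tokens[:-1])                 # c(u): count of u as a bigram history
    vocab = sorted(set(tokens))

    # Continuation probability: fraction of distinct bigram types ending in w.
    preceding = defaultdict(set)
    for (u, w) in bigrams:
        preceding[w].add(u)
    total_types = len(bigrams)
    p_cont = {w: len(us) / total_types for w, us in preceding.items()}

    def prob(w, u):
        """P_KN(w | u) for a history u observed in training."""
        c_u = history[u]
        c_uw = bigrams[(u, w)]
        if c_uw > 0:
            return (c_uw - discount) / c_u          # absolute discounting of seen bigrams
        # Mass freed by discounting, spread over unseen words by continuation probability.
        n1plus = sum(1 for (uu, _) in bigrams if uu == u)
        backoff_mass = discount * n1plus / c_u
        unseen_mass = sum(p_cont.get(v, 0.0) for v in vocab if bigrams[(u, v)] == 0)
        return backoff_mass * p_cont.get(w, 0.0) / unseen_mass if unseen_mass else 0.0

    return prob
```

For any seen history, the discounted probabilities of observed bigrams plus the redistributed backoff mass sum to one, which is the normalization property the backoff scheme relies on.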