1
A Multi-Span Language Modeling Framework for Speech Recognition
Jimmy Wang, Speech Lab, NTU
2
Outline
1. Introduction
2. N-gram Language Modeling
3. Smoothing and Clustering of N-gram Language Models
4. LSA Modeling
5. Hybrid LSA+N-gram Language Model
6. Conclusion
3
INTRODUCTION
Why a language model matters: the same (or nearly the same) pronunciation can map to very different word sequences.
ㄌㄧㄡˊ ㄅㄤ ㄧㄡˇ ㄒㄩㄝˇ ㄢˋ ㄓㄨㄚ ㄉㄠˋ ㄧˊ ㄉㄨㄟˋ ㄒㄧㄤˋ
劉邦友血案抓到一對象 (a suspect was caught in the Liu Pang-yu murder case) vs. 劉邦友血案抓到一隊象 (a troop of elephants was caught in the Liu Pang-yu murder case)
ㄕㄨㄟ ㄐㄧㄠ ㄧ ㄨㄢ ㄉㄨㄛ ㄕㄠ ㄑㄧㄢ
水餃一碗多少錢 (how much for a bowl of dumplings) vs. 睡覺一晚多少錢 (how much for a night's sleep)
4
INTRODUCTION
Stochastic modeling of speech recognition: given acoustic evidence A, find the word sequence W* = argmax_W P(W | A) = argmax_W P(A | W) P(W), where P(A | W) is the acoustic model and P(W) is the language model addressed in this work.
5
INTRODUCTION
N-gram language modeling has been the formalism of choice for ASR because of its reliability, but it can only impose local constraints. For global constraints, parsing and rule-based grammars have only been successful in small-vocabulary applications.
6
INTRODUCTION
N-gram+LSA (Latent Semantic Analysis) language models integrate local constraints via the N-gram component and global constraints via the LSA component.
7
N-gram Language Model
Assume each word depends only on the previous N-1 words (N words in total), i.e. an N-gram model is an (N-1)th-order Markov model. Example: P(象 | … 抓到一隊) ≈ P(象 | 抓到, 一隊).
Perplexity: PP(W) = P(w_1 w_2 … w_T)^(-1/T) for a test set of T words, the inverse probability of the test set normalized by its length.
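A minimal sketch of how perplexity is computed, assuming the model is exposed as a probability function prob_fn(word, history); the names are illustrative, not from the slides:

```python
import math

def perplexity(prob_fn, words, order=3):
    """Perplexity = 2 ** (average negative log2 probability per word)."""
    log_prob = 0.0
    for i, w in enumerate(words):
        history = tuple(words[max(0, i - order + 1):i])
        log_prob += math.log2(prob_fn(w, history))
    return 2 ** (-log_prob / len(words))

# Sanity check: a uniform model over a 1000-word vocabulary has perplexity 1000.
uniform = lambda w, history: 1.0 / 1000
print(perplexity(uniform, ["抓到", "一隊", "象"]))   # ~1000.0
```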
8
N-gram Language Model
N-gram training from a text corpus: corpus size ranges from hundreds of megabytes to several gigabytes.
Maximum likelihood approach: P("the" | "nothing but") = C("nothing but the") / C("nothing but").
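A minimal sketch of the maximum likelihood (relative frequency) estimate from corpus counts, assuming the corpus is already tokenized into sentences; the code is illustrative, not from the slides:

```python
from collections import Counter

def train_trigram_mle(corpus):
    """Relative-frequency trigram estimates: P(w | a b) = C(a b w) / C(a b)."""
    tri, bi = Counter(), Counter()
    for sentence in corpus:
        for a, b, c in zip(sentence, sentence[1:], sentence[2:]):
            tri[(a, b, c)] += 1
            bi[(a, b)] += 1

    def prob(w, history):
        if len(history) < 2:
            return 0.0
        a, b = history[-2:]
        return tri[(a, b, w)] / bi[(a, b)] if bi[(a, b)] else 0.0

    return prob
```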
9
Smoothing and Clustering
Maximum likelihood estimates are terrible on test data: if C(xyz) = 0, the estimated probability is 0. Smooth by interpolating the trigram estimate with lower-order estimates, e.g. P(z | xy) ≈ λ C(xyz)/C(xy) + (1 - λ) P(z | y), and find 0 < λ < 1 by optimizing on held-out data.
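A sketch of tuning the interpolation weight on held-out data, assuming p_tri and p_bi are probability functions like the one above; the grid of candidate weights is an assumption:

```python
import math

def choose_lambda(p_tri, p_bi, held_out, candidates=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Pick 0 < lambda < 1 maximizing held-out log-likelihood of the interpolated
    model P(w | x y) = lambda * P_ML(w | x y) + (1 - lambda) * P(w | y)."""
    def log_likelihood(lam):
        total = 0.0
        for sentence in held_out:
            for a, b, c in zip(sentence, sentence[1:], sentence[2:]):
                p = lam * p_tri(c, (a, b)) + (1 - lam) * p_bi(c, (b,))
                total += math.log(max(p, 1e-12))   # floor to guard against zero counts
        return total
    return max(candidates, key=log_likelihood)
```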
10
Smoothing and Clustering
Clustering groups words into classes of similar things. For example, P(Tuesday | party on) and P(Tuesday | celebration on) can both be estimated as P(WEEKDAY | EVENT), with clusters WEEKDAY = {Sunday, Monday, Tuesday, ...} and EVENT = {party, celebration, birthday, ...}. Clustering can give good results when there is very little training data.
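A minimal class-based sketch of the WEEKDAY/EVENT idea; the cluster table and the two probability tables are hypothetical placeholders:

```python
# Hypothetical hand-built clusters, mirroring the slide's example.
CLUSTER = {"Sunday": "WEEKDAY", "Monday": "WEEKDAY", "Tuesday": "WEEKDAY",
           "party": "EVENT", "celebration": "EVENT", "birthday": "EVENT"}

def class_bigram_prob(w, prev, p_class, p_word_in_class):
    """P(w | prev) ~= P(class(w) | class(prev)) * P(w | class(w)),
    so 'Tuesday' after 'party' and after 'celebration' share one estimate."""
    cw = CLUSTER.get(w, w)          # fall back to the word itself if unclustered
    cp = CLUSTER.get(prev, prev)
    return p_class.get((cw, cp), 0.0) * p_word_in_class.get((w, cw), 0.0)
```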
11
Smoothing and Clustering
Word clustering methods:
1. Build clusters by hand.
2. Use Part-of-Speech (POS) tags.
3. Automatic clustering: swap words between clusters to minimize perplexity.
Automatic clustering approaches:
1. Top-down splitting (decision tree): fast.
2. Bottom-up merging: accurate.
12
LSA MODELING
Word-document co-occurrence matrix W:
V = vocabulary of size M (M = 40000~80000).
T = training corpus of N documents (N = 80000~100000).
c_ij = number of occurrences of word w_i in document d_j.
n_j = total number of words in d_j.
E_i = normalized entropy of w_i in the corpus T.
Each cell is weighted as W_ij = (1 - E_i) * c_ij / n_j.
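A sketch of building the weighted word-document matrix from the quantities defined above; the cell weighting W_ij = (1 - E_i) * c_ij / n_j follows the usual LSA recipe and is assumed here:

```python
import numpy as np

def build_lsa_matrix(docs, vocab):
    """W[i, j] = (1 - E_i) * c_ij / n_j, with E_i the normalized entropy of word i."""
    idx = {w: i for i, w in enumerate(vocab)}
    C = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc:
            if w in idx:
                C[idx[w], j] += 1                      # c_ij: count of word i in doc j
    n_j = np.maximum(C.sum(axis=0), 1)                 # n_j: words per document
    t_i = np.maximum(C.sum(axis=1, keepdims=True), 1)  # total count of each word
    P = C / t_i
    H = -np.sum(np.where(P > 0, P * np.log(np.where(P > 0, P, 1)), 0.0), axis=1)
    E = H / np.log(len(docs))                          # normalized entropy E_i in [0, 1]
    return (1.0 - E)[:, None] * C / n_j
```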
13
LSA MODELING
Vector representation via SVD (Singular Value Decomposition) of W: W ≈ U S V^T, where
U is M x R, its row vectors u_i representing words,
S is the R x R diagonal matrix of singular values,
V is N x R, its row vectors v_j representing documents.
Experiments with different values suggested that R = 100~300 provides an adequate balance.
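A sketch of the truncated SVD step using NumPy; R = 200 is just one value inside the 100~300 range mentioned above:

```python
import numpy as np

def lsa_decompose(W, R=200):
    """Truncated SVD W ~= U_R S_R V_R^T: rows of U_R S_R are word vectors,
    rows of V_R S_R are document vectors."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_R, S_R, V_R = U[:, :R], np.diag(s[:R]), Vt[:R, :].T
    word_vecs = U_R @ S_R      # one R-dimensional vector per vocabulary word
    doc_vecs = V_R @ S_R       # one R-dimensional vector per training document
    return word_vecs, doc_vecs, S_R
```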
14
LSA MODELING
Language modeling: estimate P(w_q | H_{q-1}), where H_{q-1} is the overall history of the current document, represented in the LSA space.
Word-clustered LSA model: this clustering takes the global context into account and hence captures more semantic information.
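The slides do not show the exact form of P_LSA(w_q | H_{q-1}); as a simplified stand-in, one can score each word by the closeness of its LSA vector to the pseudo-document vector and normalize over the vocabulary:

```python
import numpy as np

def lsa_prob(word_vecs, d_tilde):
    """P_LSA(w | history): softmax over cosine similarity between each word vector
    and the pseudo-document vector d_tilde (a simplification, not the exact model)."""
    norms = np.linalg.norm(word_vecs, axis=1) * np.linalg.norm(d_tilde) + 1e-12
    sims = word_vecs @ d_tilde / norms
    scores = np.exp(sims)
    return scores / scores.sum()
```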
15
LSA+N-gram Language Model
Integration with N-grams via maximum entropy estimation: estimate P(w_q | H_{q-1}), where H_{q-1} is the overall history seen by both the N-gram component and the LSA component.
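The slide names maximum entropy estimation for the integration; as a simpler illustration only, one can multiply the local N-gram distribution by the global LSA distribution and renormalize over the vocabulary:

```python
import numpy as np

def combine_ngram_lsa(p_ngram, p_lsa):
    """Combine local (N-gram) and global (LSA) evidence for the next word by
    multiplying the two distributions and renormalizing (illustrative only)."""
    scores = np.asarray(p_ngram) * np.asarray(p_lsa)
    return scores / scores.sum()
```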
16
LSA+N-gram Language Model
Context scope selection: in practice the topic, and hence the prior probability, changes over time, so we must define the current document history or limit the size of the history considered.
Exponential forgetting: discount older words with a decay factor 0 < λ <= 1; λ = 1 keeps the full history, smaller λ forgets old context faster.
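A sketch of exponential forgetting applied to the running pseudo-document vector; the update rule shown here is a simplified assumption and the decay value 0.98 is illustrative:

```python
import numpy as np

def update_pseudo_doc(d_prev, word_vec, lam=0.98):
    """Fold the current word into the document history, down-weighting older
    context by 0 < lam <= 1; lam = 1 keeps the full history."""
    return lam * np.asarray(d_prev) + np.asarray(word_vec)
```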
17
LSA+N-gram Language Model
Initialization of V0: at the start, the pseudo-document V0 may be represented as:
1. The zero vector.
2. The centroid vector of all training documents.
3. If the domain is known, the centroid of the corresponding region of the LSA space.
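A sketch covering the three initialization options above; the function and argument names are illustrative:

```python
import numpy as np

def init_pseudo_doc(doc_vecs, mode="centroid", domain_idx=None):
    """Initialize V0 as a zero vector, the centroid of all training documents,
    or the centroid of a known-domain subset of documents."""
    if mode == "zero":
        return np.zeros(doc_vecs.shape[1])
    if mode == "domain" and domain_idx is not None:
        return doc_vecs[domain_idx].mean(axis=0)
    return doc_vecs.mean(axis=0)       # centroid of all training documents
```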
18
CONCLUSION
The hybrid N-gram+LSA model performs much better than the traditional N-gram model, reducing perplexity by about 25% and WER by about 14%. LSA performs well on within-domain test data but less well on cross-domain test data. Discounting obsolete data with exponential forgetting helps when the topics change incrementally.
19
CONCLUSION
LSA modeling is much more sensitive to content words than to function words, which makes it a natural complement to N-gram modeling. Given a suitable domain-adaptation framework, the hybrid LSA+N-gram model should improve perplexity and recognition rate even further.