Experiments in Adaptive Language Modeling – Lidia Mangu & Geoffrey Zweig


1 Experiments in Adaptive Language Modeling – Lidia Mangu & Geoffrey Zweig

2 Motivation
– Multi-domain recognition
– IBM Superhuman Recognition Program
  – Switchboard / Fisher
  – Voicemail
  – Call Center
  – ICSI Meetings
– One-size LM may not fit all
  – Even a gigantic LM

3 Lots of Past Work
– Kneser & Steinbiss '93: "On the Dynamic Adaptation of Stochastic Language Modeling" – tune mixing weights to suit particular text
– Chen, Gauvain, Lamel, Adda & Adda '01: "Language Model Adaptation for Broadcast News Transcription" – build and add new LMs from relevant training data
– Florian & Yarowsky '99: hierarchical LMs
– Gao, Li & Lee '00: upweight training counts whose frequency is similar to that in test
– Seymore & Rosenfeld '97: interpolate topic LMs
– Bacchiani & Roark '03: MAP adaptation for voicemail
– Many others

4 Plan of Attack
– No adaptation: the Superhuman LM – an 8-way interpolated LM from multiple domains
– Baseline adaptation: adjust interpolation weights per conversation
– Extended adaptation: build a new LM from relevant training data
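The baseline adaptation above amounts to tuning the mixture weights of a linearly interpolated LM. A minimal sketch of the interpolation itself, with two toy unigram components standing in for the eight domain LMs (all names and values are illustrative, not the authors' implementation):

```python
# Linear interpolation of component LMs: P(w|h) = sum_i lambda_i * P_i(w|h).
def interpolated_prob(word, history, components, weights):
    """Mixture probability of `word` given `history`; weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(lam * p(word, history) for lam, p in zip(weights, components))

# Toy usage: two unigram "LMs" standing in for the eight atomic models.
lm_a = lambda w, h: {"hello": 0.7, "world": 0.3}.get(w, 0.0)
lm_b = lambda w, h: {"hello": 0.2, "world": 0.8}.get(w, 0.0)
p = interpolated_prob("hello", (), [lm_a, lm_b], [0.5, 0.5])
```

Per-conversation adaptation then reduces to re-estimating the `weights` vector on that conversation's text while the component LMs stay fixed.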

5 Description of Atomic LMs
– SWB + CallHome: 3.4M words, 1.4M 3-gms
– Broadcast News: 148M words, 38M 3-gms
– Financial Call Centers: 655K words, 303K 3-gms
– UW Web data (conversational-like): 192M words, 48M 3-gms
– SWB Cellular: 244K words, 134K 3-gms
– UW Web data (meeting-like): 28M words, 12M 3-gms
– UW Newsgroup data: 102M words, 34M 3-gms
– Voicemail: 1.1M words, 551K 3-gms

6 Description of Lattice-Building Models & Process
– Generate lattices with a bigram LM
  – Word-internal acoustic context
  – 3.6K acoustic units; 142K Gaussians
  – PLP + VTLN + FMLLR + MMI
– LM rescoring with the 8-way interpolated LM
– Acoustic rescoring with a cross-word AM
  – 10K acoustic units; 589K Gaussians
  – PLP + VTLN + FMLLR + ML
– Adapt on the scripts of the last step
  – Adjust interpolation weights to minimize perplexity on the decoded scripts
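The weight-adaptation step above (minimizing perplexity of the interpolated LM on the decoded scripts) is conventionally done by EM over the mixture weights. A minimal sketch under that assumption; `em_weights` and the toy per-token probabilities are illustrative, not the authors' code:

```python
# EM re-estimation of interpolation weights on adaptation text.
# comp_probs[t][i] = P_i(w_t | h_t): probability of token t under the i-th
# atomic LM (in practice, computed over the decoded scripts).
def em_weights(comp_probs, n_iters=50):
    k = len(comp_probs[0])
    lam = [1.0 / k] * k                       # start from uniform weights
    for _ in range(n_iters):
        post_sums = [0.0] * k
        for probs in comp_probs:
            mix = sum(l * p for l, p in zip(lam, probs))
            for i in range(k):                # E-step: component posteriors
                post_sums[i] += lam[i] * probs[i] / mix
        lam = [s / len(comp_probs) for s in post_sums]  # M-step
    return lam

# Toy data: component 0 assigns higher probability to every token,
# so EM should shift nearly all weight onto it.
comp_probs = [(0.5, 0.1), (0.4, 0.1), (0.3, 0.2)]
lam = em_weights(comp_probs)
```

Each EM iteration provably does not increase the mixture's perplexity on the adaptation text, which is why this simple loop suffices for per-conversation weight tuning.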

7 Baseline Adaptation Results

              Unadapted  Supervised  Unsupervised
Meetings      40.2       39.6        39.9
Call-center1  42.1       40.8        41.0
Call-center2  39.2       38.1        37.7
Swb '98       32.5       32.0        32.1
Voicemail     26.9       25.1        25.7
Ave. Benefit             -1.1%       -0.9%
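The "Ave. Benefit" row is reproducible as the mean change, in absolute points, relative to the unadapted column. A quick check of the arithmetic (variable names are ours):

```python
# Rows: (unadapted, supervised, unsupervised) error rates from the table above.
rows = {
    "Meetings":     (40.2, 39.6, 39.9),
    "Call-center1": (42.1, 40.8, 41.0),
    "Call-center2": (39.2, 38.1, 37.7),
    "Swb '98":      (32.5, 32.0, 32.1),
    "Voicemail":    (26.9, 25.1, 25.7),
}
sup_deltas = [s - u for u, s, _ in rows.values()]
unsup_deltas = [n - u for u, _, n in rows.values()]
avg_sup = sum(sup_deltas) / len(sup_deltas)        # about -1.06 -> -1.1
avg_unsup = sum(unsup_deltas) / len(unsup_deltas)  # -0.9
```

So the slide's averages are absolute-point reductions, not relative percentage reductions.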

8 Results on RT03

         Unadapted  Supervised  Unsupervised
Fisher   24.2       24.0        24.1
SWB      33.0       32.9        32.8
All      28.8       28.6
Benefit             -0.2%

9 Conclusions
– Simple adaptation is effective for a multi-domain system
  – Contrasts with some previous results on Broadcast News
– Not very sensitive to initial decoding errors
– Dynamic LM construction remains to be explored

