Language Model Using HTK
Raymond Sastraputera
Overview
- Introduction
- Implementation
- Experimentation
- Conclusion
Introduction
- Language modelling
- HTK 3.3 (Windows binary)
- Word-based n-grams
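For context, an n-gram model approximates the probability of a word sequence by conditioning each word on only the previous n-1 words; this is the standard formulation rather than anything specific to this talk:

  P(w_1, \ldots, w_N) \approx \prod_{i=1}^{N} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})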
Implementation
- Database preparation
  - Word map
  - N-gram files
- Mapping OOV words
  - Vocabulary list
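As a rough sketch, these preparation steps correspond to the HTK 3.3 HLM tools LNewMap, LGPrep, and LGCopy. All file and directory names here (empty.wmap, train.txt, db.0, db.1, lm, vocab.wlist), the map name, and the buffer sizes are hypothetical placeholders in the style of the HTK book tutorial, not values from this work; consult the HTK book for the exact option set.

  # Create an empty word map (the map name "corpus" is a placeholder)
  LNewMap -f WFC corpus empty.wmap

  # Scan the training text: emit gram files (up to 4-grams) and grow the word map
  LGPrep -T 1 -a 100000 -b 200000 -n 4 -d db.0 empty.wmap train.txt

  # Sort and merge the raw gram files into sequenced data files
  LGCopy -T 1 -b 200000 -d db.1 db.0/wmap db.0/gram.*

  # Map words outside the vocabulary list to the OOV class symbol
  LGCopy -T 1 -o -m lm/lm.wmap -w vocab.wlist -b 200000 -d lm db.1/wmap db.1/data.*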
Implementation
- Language model generation: unigram, bigram, trigram, 4-gram
- Perplexity measurement
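Likewise, model generation and perplexity measurement map onto the HLM tools LBuild and LPlex; the model names, cutoff values, and test file below are illustrative assumptions continuing the sketch above, so the flags should be checked against the HTK book.

  # Build back-off models of increasing order from the mapped data files
  LBuild -T 1 -n 1 lm/lm.wmap lm/ug lm/data.*                      # unigram
  LBuild -T 1 -n 2 -c 2 1 lm/lm.wmap lm/bg lm/data.*               # bigram
  LBuild -T 1 -n 3 -c 2 1 -c 3 1 lm/lm.wmap lm/tg lm/data.*        # trigram
  LBuild -T 1 -n 4 -c 2 1 -c 3 1 -c 4 1 lm/lm.wmap lm/fg lm/data.* # 4-gram

  # Measure each model's perplexity on held-out text
  # (-t treats the input as a plain text stream)
  LPlex -T 1 -n 4 -t lm/fg test.txt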
Results

  N-gram    Perplexity
  unigram     401.7305
  bigram      131.8727
  trigram     113.2483
  4-gram      109.9411
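These figures use the standard perplexity definition, where lower is better; the steady drop from unigram to 4-gram reflects the benefit of longer context:

  PP = \exp\left( -\frac{1}{N} \sum_{i=1}^{N} \ln P(w_i \mid w_{i-n+1}, \ldots, w_{i-1}) \right)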
Conclusion and Summary
- Higher-order n-grams give lower perplexity
- But they need more memory
- Too high an order leads to overfitting
- Multiple back-offs are a waste of time
References
1. HTK, the Hidden Markov Model Toolkit: http://htk.eng.cam.ac.uk/
Thank you!
Any questions?