Language model using HTK

Presentation transcript:

Language model using HTK
Raymond Sastraputera

Overview: Introduction, Implementation, Experimentation, Conclusion

Introduction: language model, HTK 3.3, n-grams, Windows binary
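The n-gram idea behind the slides can be illustrated with a toy maximum-likelihood bigram estimator. This is a plain-Python sketch of the underlying formula P(w2 | w1) = count(w1 w2) / count(w1), not HTK's own estimation, which additionally applies discounting and back-off:

```python
from collections import Counter

def bigram_probs(tokens):
    """Maximum-likelihood bigram estimates P(w2 | w1) from a token list."""
    histories = Counter(tokens[:-1])            # counts of each history word w1
    bigrams = Counter(zip(tokens, tokens[1:]))  # counts of each pair (w1, w2)
    return {(w1, w2): c / histories[w1] for (w1, w2), c in bigrams.items()}

tokens = "<s> the cat sat on the mat </s>".split()
probs = bigram_probs(tokens)
# P(cat | the) = count(the cat) / count(the) = 1 / 2
print(probs[("the", "cat")])  # 0.5
```
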

Implementation: database preparation (word map, n-gram file); mapping OOV words against a vocabulary list
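HTK's language-modelling tools build the word map and handle out-of-vocabulary (OOV) words during database preparation; as a sketch, mapping OOV words against a vocabulary list amounts to replacing unlisted tokens with an unknown symbol. The `<UNK>` spelling here is an assumed placeholder, not necessarily the symbol HTK uses:

```python
def map_oov(tokens, vocab, unk="<UNK>"):
    """Replace any token not in the vocabulary with the unknown symbol."""
    return [t if t in vocab else unk for t in tokens]

vocab = {"the", "cat", "sat"}
print(map_oov(["the", "dog", "sat"], vocab))  # ['the', '<UNK>', 'sat']
```
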

Implementation: language model generation (unigram, bigram, trigram, 4-gram) and perplexity evaluation
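The perplexity used to evaluate each model follows the standard definition: the exponential of the negative average log-probability the model assigns to the test words. A minimal sketch:

```python
import math

def perplexity(log_probs):
    """Perplexity = exp(-mean log-probability), natural-log convention."""
    return math.exp(-sum(log_probs) / len(log_probs))

# If the model assigns every word probability 1/4, perplexity is exactly 4.
lp = [math.log(0.25)] * 10
print(round(perplexity(lp), 6))  # 4.0
```

Lower perplexity means the model is, on average, less "surprised" by the test text, which is why the higher-order models in the results slide score better.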

Result (test-set perplexity by n-gram order):
  unigram   401.7305
  bigram    131.8727
  trigram   113.2483
  4-gram    109.9411

Conclusion and Summary: higher-order n-grams give lower perplexity but use more memory; too high an order leads to overfitting, and building many backed-off models is a waste of time.

Reference 1. HTK (http://htk.eng.cam.ac.uk/)

Thank you. Any questions?