Language Model Using HTK
Raymond Sastraputera
Overview
- Introduction
- Implementation
- Experimentation
- Conclusion
Introduction
- Language modelling
- HTK 3.3 (Windows binary)
- Word-based n-grams
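For context, an n-gram model approximates the probability of a word sequence by conditioning each word on only the previous n-1 words; this is the standard formulation rather than anything specific to this talk:

  P(w_1, \ldots, w_N) \approx \prod_{i=1}^{N} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})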
Implementation
- Database preparation
  - Word map
  - N-gram files
- Mapping OOV words
  - Vocabulary list
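As a rough sketch, these preparation steps correspond to the HTK 3.3 HLM tools LNewMap, LGPrep, and LGCopy. All file and directory names here (empty.wmap, train.txt, db.0, db.1, lm, vocab.wlist), the map name, and the buffer sizes are hypothetical placeholders in the style of the HTK book tutorial, not values from this work; consult the HTK book for the exact option set.

  # Create an empty word map (the map name "corpus" is a placeholder)
  LNewMap -f WFC corpus empty.wmap

  # Scan the training text: emit gram files (up to 4-grams) and grow the word map
  LGPrep -T 1 -a 100000 -b 200000 -n 4 -d db.0 empty.wmap train.txt

  # Sort and merge the raw gram files into sequenced data files
  LGCopy -T 1 -b 200000 -d db.1 db.0/wmap db.0/gram.*

  # Map words outside the vocabulary list to the OOV class symbol
  LGCopy -T 1 -o -m lm/lm.wmap -w vocab.wlist -b 200000 -d lm db.1/wmap db.1/data.*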
Implementation
- Language model generation: unigram, bigram, trigram, 4-gram
- Perplexity measurement
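Likewise, model generation and perplexity measurement map onto the HLM tools LBuild and LPlex; the model names, cutoff values, and test file below are illustrative assumptions continuing the sketch above, so the flags should be checked against the HTK book.

  # Build back-off models of increasing order from the mapped data files
  LBuild -T 1 -n 1 lm/lm.wmap lm/ug lm/data.*                      # unigram
  LBuild -T 1 -n 2 -c 2 1 lm/lm.wmap lm/bg lm/data.*               # bigram
  LBuild -T 1 -n 3 -c 2 1 -c 3 1 lm/lm.wmap lm/tg lm/data.*        # trigram
  LBuild -T 1 -n 4 -c 2 1 -c 3 1 -c 4 1 lm/lm.wmap lm/fg lm/data.* # 4-gram

  # Measure each model's perplexity on held-out text
  # (-t treats the input as a plain text stream)
  LPlex -T 1 -n 4 -t lm/fg test.txt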
Results

  N-gram    Perplexity
  unigram     401.7305
  bigram      131.8727
  trigram     113.2483
  4-gram      109.9411
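These figures use the standard perplexity definition, where lower is better; the steady drop from unigram to 4-gram reflects the benefit of longer context:

  PP = \exp\left( -\frac{1}{N} \sum_{i=1}^{N} \ln P(w_i \mid w_{i-n+1}, \ldots, w_{i-1}) \right)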
Conclusion and Summary
- Higher-order n-grams give lower perplexity
- But they need more memory
- Too high an order leads to overfitting
- Multiple back-offs are a waste of time
References
1. HTK, the Hidden Markov Model Toolkit: http://htk.eng.cam.ac.uk/
Thank you!
Any questions?