Digital Speech Processing HW3


Digital Speech Processing HW3
蔡政昱 吳全勳
2013/11/20

Outline
- Introduction
- SRILM
- Requirement
- Submission Format


Introduction

Input (注音文):
讓 他 十分 ㄏ怕 只 ㄒ望 ㄗ己 明ㄋ 度 別 再 這ㄇ ㄎ命 了
演ㄧ ㄩ樂 產ㄧ ㄐ入 積ㄐ ㄓ型 提ㄕ 競爭ㄌ

Your HW3 output:
讓 他 十分 害怕 只 希望 自己 明年 度 別 再 這麼 苦命 了
演藝 娛樂 產業 加入 積極 轉型 提升 競爭力

Introduction

Suppose we have an imperfect acoustic model with some phone losses: the finals of some characters are lost, so the recognizer output mixes characters with bare initials (注音文), e.g.

一ㄢㄧ ㄩㄌㄜ ㄔㄢㄧㄝ → Acoustic Model → 演ㄧ ㄩ樂 產ㄧ

What can we do to decode such 注音文?

Introduction

In general, we can use a language model. Write the observed sequence as Z = z1 z2 … zN (for example, Z = 演ㄧ ㄩ樂 產ㄧ) and a candidate character sequence as W = w1 w2 … wN. We decode by

W* = argmax_W P(W | Z) = argmax_W P(Z | W) P(W) / P(Z) = argmax_W P(Z | W) P(W)

since P(Z) is independent of W. P(W) is available from a bigram language model.
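The decoding rule on this slide can be written out in full. As a sketch, factoring P(Z | W) into per-character terms is an assumption here; the bigram factorization of P(W) is what this homework uses:

```latex
\begin{aligned}
W^* &= \arg\max_W P(W \mid Z)
     = \arg\max_W \frac{P(Z \mid W)\,P(W)}{P(Z)}
     = \arg\max_W P(Z \mid W)\,P(W) \\
    &\approx \arg\max_W \prod_{i=1}^{N} P(z_i \mid w_i)\,
      \prod_{i=1}^{N} P(w_i \mid w_{i-1})
\end{aligned}
```

Here P(w_i | w_{i-1}) comes from the bigram LM, and the 注音-to-character map constrains which w_i are possible for each observed z_i, so the search reduces to maximizing the bigram product over candidate sequences.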

Introduction

(Slide figure: a decoding lattice for the input 演 ㄧ ㄩ 樂. Each observed symbol expands into its candidate characters, e.g. ㄧ → 業 / 藝 and ㄩ → 餘 / 娛 / 於, with transition probabilities such as 0.02, 0.2, 0.1, 0.01, 0.3 on the edges.)

So…
- We need to build a character-based bigram language model.
- Use the language model to decode the sequence.
- There is a nice toolkit to help you.


SRILM
SRI Language Modeling Toolkit
http://www.speech.sri.com/projects/srilm/
- A toolkit for building and applying various statistical language models
- The C++ classes in SRILM are very useful
- You will use and reproduce some SRILM programs in this homework

SRILM
Download the executables from the course website. Platforms:
- i686: 32-bit GNU/Linux
- i686-m64: 64-bit GNU/Linux (CSIE workstation)
- cygwin: 32-bit Windows with the Cygwin environment
If you want to use the C++ library, you can build it from the source code.

SRILM
You are strongly recommended to read the FAQ on the course website. Possibly useful code in SRILM:
- $SRIPATH/misc/src/File.cc (.h)
- $SRIPATH/lm/src/Vocab.cc (.h)
- $SRIPATH/lm/src/ngram.cc (.h)
- $SRIPATH/lm/src/testError.cc (.h)

SRILM
Segment the corpus into characters:
perl separator_big5.pl corpus.txt > corpus_seg.txt

SRILM
./ngram-count -text corpus_seg.txt -write lm.cnt -order 2
  -text: input text filename
  -write: output count filename
  -order: order of the n-gram language model
./ngram-count -read lm.cnt -lm bigram.lm -unk -order 2
  -read: input count filename
  -lm: output language model filename
  -unk: map OOV words to <unk>; without this, all OOV words are removed

Example

corpus_seg.txt:
在 國 民 黨 失 去 政 權 後 第 一 次 參 加 元 旦 總 統 府 升 旗 典 禮
有 立 委 感 慨 國 民 黨 不 團 結 才 會 失 去 政 權
有 立 委 則 猛 批 總 統 陳 水 扁
人 人 均 顯 得 百 感 交 集
…

lm.cnt:
夏 11210
俸 267
鴣 7
衹 1
微 11421
檎 27
…

trigram.lm (the left column holds log10 probabilities):
\data\
ngram 1=6868
ngram 2=1696830
ngram 3=4887643

\1-grams:
-1.178429 </s>
-99 <s> -2.738217
-1.993207 一 -1.614897
-4.651746 乙 -1.370091
…

SRILM
./disambig -text $file -map $map -lm $LM -order $order
  -text: input filename
  -map: a mapping from 注音/國字 to 國字. You should generate this mapping yourself from the given Big5-ZhuYin.map, either using Excel or by writing a simple program on your own.
  -lm: input language model

SRILM

Big5-ZhuYin.map (one character per line; alternative readings separated by /):
一 ㄧˊ/ㄧˋ/ㄧ_
乙 ㄧˇ
丁 ㄉㄧㄥ_
乃 ㄋㄞˇ
…
長 ㄔㄤˊ/ㄓㄤˇ
行 ㄒㄧㄥˊ/ㄏㄤˊ

ZhuYin-Big5.map:
ㄅ 八 匕 卜 不 卞 巴 比 丙 包 …
八 八
匕 匕
卜 卜
…
ㄆ 仆 匹 片 丕 叵 平 扒 扑 疋 …
仆 仆
匹 匹

Be aware of polyphones (破音字). There should be spaces between all characters.
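The inversion of Big5-ZhuYin.map can be sketched as follows. This is a rough sketch, not the reference solution: the function name is mine, and the exact output format should follow the examples above. Each reading's first symbol (its initial, 2 bytes in Big5) becomes a key that collects the character, and every character also maps to itself:

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Invert Big5-ZhuYin.map: each input line is "<char> <reading1>/<reading2>/...".
// The first symbol of every reading (2 bytes in Big5) becomes a key collecting
// the character; each character also maps to itself (e.g. 八 -> 八).
std::map<std::string, std::vector<std::string>>
buildZhuYinBig5(const std::vector<std::string>& lines) {
    std::map<std::string, std::vector<std::string>> inv;
    for (const std::string& line : lines) {
        std::istringstream iss(line);
        std::string ch, readings;
        if (!(iss >> ch >> readings)) continue;   // skip malformed lines
        inv[ch].push_back(ch);                    // identity entry
        std::istringstream rs(readings);
        std::string reading;
        std::set<std::string> seen;  // one polyphone may repeat an initial
        while (std::getline(rs, reading, '/')) {
            if (reading.size() < 2) continue;
            std::string initial = reading.substr(0, 2);  // first Big5 symbol
            if (seen.insert(initial).second)
                inv[initial].push_back(ch);
        }
    }
    return inv;
}
```

Writing each key followed by its space-separated candidates then yields the ZhuYin-Big5.map format shown above.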


Requirement (I)
1. Segment the corpus and all test data into characters:
   ./separator_big5.pl corpus.txt corpus_seg.txt
   ./separator_big5.pl <testdata/xx.txt> <testdata/xx.txt>
2. Train the character-based bigram LM:
   Get counts: ./ngram-count -text corpus_seg.txt -write lm.cnt -order 2
   Compute probabilities: ./ngram-count -read lm.cnt -lm bigram.lm -unk -order 2
3. Generate the map from Big5-ZhuYin.map (see FAQ 4).
4. Use disambig to decode testdata/xx.txt:
   ./disambig -text $file -map $map -lm $LM -order $order > $output

Requirement (II)
Implement your own version of disambig using dynamic programming (the Viterbi algorithm). In the trellis, the vertical axis holds the candidate characters for each position of the input sequence.
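The dynamic programming involved can be sketched as below. This is a minimal sketch under my own interface (the function name and the bigram-score callback are mine, not SRILM's); in the real task the candidates come from your ZhuYin-Big5 map, the scores from the bigram LM, and <s>/</s> must be handled at the sentence boundaries:

```cpp
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Viterbi over a trellis: cands[i] lists the candidate characters for
// position i; logp(prev, cur) returns a bigram log-probability.
// Returns the highest-scoring character sequence.
std::vector<std::string> viterbi(
    const std::vector<std::vector<std::string>>& cands,
    const std::function<double(const std::string&, const std::string&)>& logp) {
    const double NEG_INF = -std::numeric_limits<double>::infinity();
    size_t N = cands.size();
    if (N == 0) return {};
    std::vector<std::vector<double>> delta(N);  // best score per candidate
    std::vector<std::vector<int>> back(N);      // backpointers
    delta[0].assign(cands[0].size(), 0.0);      // uniform start (or P(w|<s>))
    back[0].assign(cands[0].size(), -1);
    for (size_t i = 1; i < N; ++i) {
        delta[i].assign(cands[i].size(), NEG_INF);
        back[i].assign(cands[i].size(), -1);
        for (size_t j = 0; j < cands[i].size(); ++j)
            for (size_t k = 0; k < cands[i - 1].size(); ++k) {
                double s = delta[i - 1][k] + logp(cands[i - 1][k], cands[i][j]);
                if (s > delta[i][j]) { delta[i][j] = s; back[i][j] = (int)k; }
            }
    }
    // pick the best final candidate and trace the path back
    int best = 0;
    for (size_t j = 1; j < delta[N - 1].size(); ++j)
        if (delta[N - 1][j] > delta[N - 1][best]) best = (int)j;
    std::vector<std::string> out(N);
    for (int i = (int)N - 1; i >= 0; --i) {
        out[i] = cands[i][best];
        best = back[i][best];
    }
    return out;
}
```

Runtime is O(N * C^2) for N positions with at most C candidates each, which is why a trigram extension (see the bonus) needs pruning.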

Requirement (II)
- You have to use C++ or MATLAB. You are strongly recommended to use C++ for speed.
- Using SRILM's library will save you a lot of time (please refer to the FAQ).
- Your output format should be consistent with SRILM's, e.g.:
  <s> 這 是 一 個 範 例 格 式 </s>
  There is an <s> at the beginning of each sentence, a </s> at the end, and whitespace between all characters.

How to deal with Big5
- All testing files are encoded in Big5.
- A Chinese character in Big5 is always 2 bytes, i.e., char[2] in C++.
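So a Big5 line can be split into characters by consuming 2 bytes whenever the lead byte is non-ASCII. A small sketch (the helper name is mine; Big5 lead bytes fall in the high range, so checking >= 0x80 is the usual heuristic):

```cpp
#include <string>
#include <vector>

// Split a Big5-encoded string into characters: a lead byte >= 0x80 starts
// a 2-byte Chinese character; anything else (ASCII, spaces) is 1 byte.
std::vector<std::string> splitBig5(const std::string& s) {
    std::vector<std::string> chars;
    for (size_t i = 0; i < s.size(); ) {
        size_t len =
            (static_cast<unsigned char>(s[i]) >= 0x80 && i + 1 < s.size()) ? 2 : 1;
        chars.push_back(s.substr(i, len));
        i += len;
    }
    return chars;
}
```

Remember the cast to unsigned char: plain char is signed on most platforms, so a Big5 lead byte would otherwise compare as negative.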


Submission format
Files required:
- The ZhuYin-Big5.map you generate
- The decoded results of the 10 test data produced by SRILM's disambig: result1/1.txt ~ 10.txt
- The decoded results of the 10 test data produced by your disambig: result2/1.txt ~ 10.txt
- All source code (your disambig and your program for the map generation)
- Makefile (if C++ is used)
- Report
Do not include SRILM-related files, corpus_seg.txt, or language models.
Compress everything into one zip file named "hw3_[your_student_ID].zip" and upload it to CEIBA.

Submission format (report)
The report should include:
1. Your environment (CSIE workstation, Cygwin, …)
2. How to compile your program (if C++ is used)
3. How to execute your program, with examples, e.g.: ./program -a xxx -b yyy
4. What you have done
The report must be NO more than two A4 pages, and must NOT contain "what you have learned".

Grading
- Requirement I (40%)
- Requirement II (40%)
  Note: if you use C++ and there is no Makefile in the submitted zip file, the score in this part will be halved.
- Report (20%)
- Bonus (15%)
  - Character-based trigram language model (10%) (needs pruning for speed)
  - Other strategies (5%)

If you have any questions…
FAQ: http://speech.ee.ntu.edu.tw/courses/DSP2013Autumn/
蔡政昱 r02942067@ntu.edu.tw
吳全勳 r02922002@ntu.edu.tw