專題研究 WEEK 4 - LIVE DEMO Prof. Lin-Shan Lee TA. Hsiang-Hung Lu,Cheng-Kuan Wei.

Slides:



Advertisements
Similar presentations
Visit the ccScan Website Scan, Import, and Automatically File documents to the Cloud SCAN, IMPORT, AND AUTOMATICALLY FILE DOCUMENTS TO SALESFORCE ® Introduction.
Advertisements

Building an ASR using HTK CS4706
Speech Recognition Part 3 Back end processing. Speech recognition simplified block diagram Speech Capture Speech Capture Feature Extraction Feature Extraction.
Acoustic Model Adaptation Based On Pronunciation Variability Analysis For Non-Native Speech Recognition Yoo Rhee Oh, Jae Sam Yoon, and Hong Kook Kim Dept.
1 Feel free to contact us at
CIS101 Introduction to Computing Week 05. Agenda Your questions CIS101 Survey Introduction to the Internet & HTML Online HTML Resources Using the HTML.
CIS101 Introduction to Computing
專題研究 WEEK3 LANGUAGE MODEL AND DECODING Prof. Lin-Shan Lee TA. Hung-Tsung Lu.
專題研究 WEEK3 LANGUAGE MODEL AND DECODING Prof. Lin-Shan Lee TA. Hung-Tsung Lu,Cheng-Kuan Wei.
Automatic Transcript Generation Helmer Strik A 2 RT Dept. of Language & Speech University of Nijmegen.
Lecturer: Ghadah Aldehim
12/13/2007Chia-Ho Ling1 SRILM Language Model Student: Chia-Ho Ling Instructor: Dr. Veton Z. K ë puska.
Name:Venkata subramanyan sundaresan Instructor:Dr.Veton Kepuska.
1 Intro to Linux - getting around HPC systems Himanshu Chhetri.
1M4 speech recognition University of Sheffield M4 speech recognition Martin Karafiát*, Steve Renals, Vincent Wan.
Arthur Kunkle ECE 5525 Fall Introduction and Motivation  A Large Vocabulary Speech Recognition (LVSR) system is a system that is able to convert.
Digital Alchemy | 5750 Stratum Drive Fort Worth, Texas | Phone: Fax: | Rate Code Manager Tutorial.
Constructing Your Own Corpus from Written Language.
Electronic Thesis and Dissertation Database Errors Ryan Mestre Luke Schmader Client: Zhiwu Xie Blacksburg March 3, 2014 Virginia Tech CS 4624.
Diamantino Caseiro and Isabel Trancoso INESC/IST, 2000 Large Vocabulary Recognition Applied to Directory Assistance Services.
Digital Speech Processing Homework 3
Where hands say it all.  Start with demo to illustrate our idea.
17.0 Distributed Speech Recognition and Wireless Environment References: 1. “Quantization of Cepstral Parameters for Speech Recognition over the World.
DSP homework 1 HMM Training and Testing
1M4 speech recognition University of Sheffield M4 speech recognition Vincent Wan, Martin Karafiát.
Comparison of the SPHINX and HTK Frameworks Processing the AN4 Corpus Arthur Kunkle ECE 5526 Fall 2008.
Welcome to the World of Wikis Classroom Management: An Evidence-Based Independent Study Instructions for Navigating the wiki and submitting projects.
Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee National Taiwan University, Taiwan.
DireXions – Your Tool Box just got Bigger PxPlus Version Control System Using TortoiseSVN Presented by: Jane Raymond.
Hyperparameter Estimation for Speech Recognition Based on Variational Bayesian Approach Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee and Keiichi.
Natural Language Processing Course Project: Zhao Hai 赵海 Department of Computer Science and Engineering Shanghai Jiao Tong University
College Physics I. Download the following files: Syllabus WebAssign Quick Start Guide (Online homework) All the documents are available at the website:
Automatic Speech Recognition: Conditional Random Fields for ASR Jeremy Morris Eric Fosler-Lussier Ray Slyh 9/19/2008.
Temple University Training Acoustic model using Sphinx Train Jaykrishna shukla,Mubin Amehed& cara Santin Department of Electrical and Computer Engineering.
By The Supreme Team CMPT 275 Assignment 2 May 29, 2009.
專題研究 (4) HDecode_live Prof. Lin-Shan Lee, TA. Yun-Chiao Li 1.
專題研究 (2) Feature Extraction, Acoustic Model Training WFST Decoding
Supervisor: Dr. Elsayed Eissa Hemayed. o Marwa Ibrahim Lamey. Mayada Ibrahim Aly. o Mona Sherif Ahmed. o Suad Mohamed Barakat. o Marwa Ibrahim Lamey.
The HTK Book (for HTK Version 3.2.1) Young et al., 2002.
Digital Speech Processing HW3
New Communication Platform of Our Class.
DISCRETE HIDDEN MARKOV MODEL IMPLEMENTATION DIGITAL SPEECH PROCESSING HOMEWORK #1 DISCRETE HIDDEN MARKOV MODEL IMPLEMENTATION Date: Oct, Revised.
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences Recurrent Neural Network-based Language Modeling for an Automatic.
Speech Processing 1 Introduction Waldemar Skoberla phone: fax: WWW:
Technical lssues for the Knowledge Engineering Competition Stefan Edelkamp Jeremy Frank.
Digital Speech Processing Homework /05/04 王育軒.
語音訊號處理之初步實驗 NTU Speech Lab 指導教授: 李琳山 助教: 熊信寬
BIG SALES GUIDE TO UPDATE PRODUCT PRICE IN BULK. Programs you need to have: LibreOffice Download Link, Click HERE !
QQ Registration & Usage
Scan, Import, and Automatically file documents to Box Introduction
Automatic Speech Recognition
Date: October, Revised by 李致緯
Prof. Lin-shan Lee TA. Roy Lu
Dr. ElSayed Eissa Hemayed
專題研究 week3 Language Model and Decoding
Prof. Lin-shan Lee TA. Lang-Chi Yu
Digital Speech Processing
MICROSOFT OUTLOOK SUPPORT PHONE NUMBER CALL AT OUR TOLL FREE NUMBER: # # For more information,
Hotmail Sign in Error, , Hotmail Login Support
MLP Based Feedback System for Gas Valve Control in a Madison Symmetric Torus Andrew Seltzman Dec 14, 2010.
EEG Recognition Using The Kaldi Speech Recognition Toolkit
Automatic Speech Recognition: Conditional Random Fields for ASR
Prof. Lin-shan Lee TA. Po-chun, Hsu
專題研究 WEEK 5 – Deep Neural Networks in Kaldi
2007 SPEECH PROJECT PRESENTATION
Voice Activation for Wealth Management
專題研究 WEEK 5 – Deep Neural Networks in Kaldi
Visual Recognition of American Sign Language Using Hidden Markov Models 문현구 문현구.
Prof. Lin-shan Lee TA. Roy Lu
Da-Rong Liu, Kuan-Yu Chen, Hung-Yi Lee, Lin-shan Lee
Presentation transcript:

專題研究 WEEK 4 - LIVE DEMO Prof. Lin-Shan Lee TA. Hsiang-Hung Lu,Cheng-Kuan Wei

Outline  Review  Live Demo introduction  Restriction  To do  To think  FAQ

Review: achieving ASR system  Automatic Speech Recognition System  Input wave -> output text

Review  Week 1: feature extraction  compute-mfcc-feat  add-delta  compute-cmvn-stats  apply-cmvn  File format: scp,ark

Review  Week 2: training acoustic model  monophone  clustering tree  triphone  Models: final.mdl, tree

Review  Week 3: decoding and training lm  SRILM( ngram-count/ kn-smoothing )  Kaldi – WFST decoding  HTK – Viterbi decoding  Vulcan( kaldi format -> HTK format )  Models: final.mmf tiedlist

Live Demo  Now we integrated them into a real-world ASR system for you.  You could upload your own models.  Now give a shot! Experience your own ASR in a “real” way.

Live Demo   Demo

Restriction  MFCC with dim 39 only.  Fixed phone set. (Chinese phones)  LM must be one of unigram/bigram/trigram model.

To Do  Sign up Live Demo with your account.  Please inform TA of your account name for activation.  Test with basic model embedded in the system.  Upload your model  LM/LEX/TREE/MDL  For better performance, you may re-train your models.  Test with your own models.

To Think  Compare the basic models with your own models, what is the main difference?  Do you know of what kind your training data are?  train.text/dev.text/test.text  How about manually tagging your own lexicon and train your own language model?  呂相弘 CH-l CH_y CH_si CH_i CH_a CH_N# CH_h CH_o CH_N#  這是 呂相弘 的 廣告  Guess about the training data of the basic models.  Exemplify your description.

FAQ  Q: How to download the models in the workstation?  A:  FileZilla FileZilla  MobaXterm MobaXterm  sftp command in your linux OS.

FAQ  Q: Why I always got server error?  A:  Make sure you got models uploaded.  Is the timestamp empty?

FAQ  Q: In corpus mode, why I always got error or 0 accuracy?  A:  Make sure your corpus is written under UTF-8 encoding.  In notepad, the default is ANSI.  In vim, the default is UTF-8.

FAQ  Q: In corpus mode, why I got negative accuracy?  A:  Accuracy is actually calculated by (length – error ) / length.

Q&A  This system is just online for the first semester.  Any bug is expected.  If you got any question, contact TA through FB group or instantly.  We need you feedback about UI/function.  Feel free saying about anything. 