Prof. Lin-shan Lee TA. Po-chun, Hsu

Slides:



Advertisements
Similar presentations
Visit the ccScan Website Scan, Import, and Automatically File documents to the Cloud SCAN, IMPORT, AND AUTOMATICALLY FILE DOCUMENTS TO SALESFORCE ® Introduction.
Advertisements

For Details Visit : or For any Help Contact the Librarian EBSCOhost 2.0.
數位語音處理概論 HW#2-1 HMM Training and Testing
專題研究 WEEK 4 - LIVE DEMO Prof. Lin-Shan Lee TA. Hsiang-Hung Lu,Cheng-Kuan Wei.
CIS101 Introduction to Computing Week 05. Agenda Your questions CIS101 Survey Introduction to the Internet & HTML Online HTML Resources Using the HTML.
專題研究 WEEK3 LANGUAGE MODEL AND DECODING Prof. Lin-Shan Lee TA. Hung-Tsung Lu.
專題研究 WEEK3 LANGUAGE MODEL AND DECODING Prof. Lin-Shan Lee TA. Hung-Tsung Lu,Cheng-Kuan Wei.
Accessing iGO Using a PC, tablet or smartphone, go to to login. Using the username and password provided on your ACR form at the.
Starter for 10 Unit 10: Flickr & YouTube Transform IT SFT10_Flickr_YouTube.
Welcome to the Second Tutorial Welcome to the second part of this communication system website tutorial! This tutorial is for church planters. When you.
Podcasting World Languages OEC 2009 JoNelle Gardner
Arthur Kunkle ECE 5525 Fall Introduction and Motivation  A Large Vocabulary Speech Recognition (LVSR) system is a system that is able to convert.
Step by step to adding sound files and embedding YouTube video to your FYIcode mobile site.
Discipline, Crime, and Violence August New DCV Application The DCV application and submission process has been revised beginning with the
DSP homework 1 HMM Training and Testing
Welcome to the World of Wikis Classroom Management: An Evidence-Based Independent Study Instructions for Navigating the wiki and submitting projects.
Request username and password if you don’t already have one.
Grid Programming on Taiwan Unigrid Platform. Outline Introduction to Taiwan Unigrid How to use Taiwan Unigrid.
Web CEO Assessment Training Module Cancellation and search Engine Ranking by Kat Orense (Oct 06, 2011)
 Go to YouTube and click “create account” on the top right of the page.YouTube  If you already have a Google account (i.e. gmail) then you may use this.
Welcome to OneNote Dan McAllister Just arriving? Sign-in near the door Grab a handout Just arriving? Sign-in near the door Grab a handout Finished for.
UDL Book Builder for Teachers Image sources: Cast UDL.
專題研究 (4) HDecode_live Prof. Lin-Shan Lee, TA. Yun-Chiao Li 1.
專題研究 (2) Feature Extraction, Acoustic Model Training WFST Decoding
Digital Speech Processing HW3
DISCRETE HIDDEN MARKOV MODEL IMPLEMENTATION DIGITAL SPEECH PROCESSING HOMEWORK #1 DISCRETE HIDDEN MARKOV MODEL IMPLEMENTATION Date: Oct, Revised.
Building the Wiki together A training manual. The page of the Economic Justice Working Group.
Homework 2 Hints. General Tips Remember what FORM view you are in! – Design, form, and layout view TABLE views include: – Design and Datasheet view.
HW2-2 Speech Analysis TA: 林賢進
BIG SALES GUIDE TO UPDATE PRODUCT PRICE IN BULK. Programs you need to have: LibreOffice Download Link, Click HERE !
How to create a website from scratch.  You should have an internet access.  Visit  You need to create a new account OR.
Migrating Wordpress Migrating Wordpress can sometimes get more complicated as it should. There is no plugin that does this for you, the best way is to.
DU REDCap Introduction
QQ Registration & Usage
beas group AG Beas sql guide Martin Heigl CTO
Automatic Speech Recognition
Development Environment
Susannah sinclair or INTRODUCTION TO SONIA Susannah sinclair or
Training Documentation – Replacing GSPR with RFQ 2.0
Date: October, Revised by 李致緯
Prof. Lin-shan Lee TA. Roy Lu
D.Y.O. Web The new and easy way to create and maintain your own professional dynamic website.
Diversey Rebates End User Manual.
專題研究 week3 Language Model and Decoding
Prof. Lin-shan Lee TA. Lang-Chi Yu
Digital Speech Processing
Using Excel with Google Maps
Bomgar Remote support software
Managing Student Test Settings
Test Information Distribution Engine (TIDE)
How Can I Download My Transactions Directly Into Quicken
Creating assignments: Best Practices
Adding and editing users
Technical Support Overview and Training
Adding and Editing Users
EEG Recognition Using The Kaldi Speech Recognition Toolkit
Volunteer & Teacher Online Registration
ALLIANCE VENDOR WEBINAR -- Annual Spend
專題研究 WEEK 5 – Deep Neural Networks in Kaldi
2007 SPEECH PROJECT PRESENTATION
Voice Activation for Wealth Management
community.afpnet.org/home
Introduction to “TEAMS”
Summer 2013 Prepared by Prof. B. INDEX
專題研究 WEEK 5 – Deep Neural Networks in Kaldi
Adobe Acrobat DC Accessibility Adobe Acrobat Functionality, Part I
Prof. Lin-shan Lee TA. Roy Lu
Da-Rong Liu, Kuan-Yu Chen, Hung-Yi Lee, Lin-shan Lee
STANDARD ACCOUNT: SOLUTION QUICK GUIDE
Chloe Riley | Research Commons Librarian |
Presentation transcript:

Prof. Lin-shan Lee TA. Po-chun, Hsu (r07942095@ntu.edu.tw) 專題研究 WEEK 4 - LIVE DEMO Prof. Lin-shan Lee TA. Po-chun, Hsu (r07942095@ntu.edu.tw)

Outline Review Live Demo introduction Restriction TODO To think FAQ

Review: achieving ASR system Automatic Speech Recognition System Input wave -> output text

Review Week 1: feature extraction compute-mfcc-feat add-delta compute-cmvn-stats apply-cmvn File format: scp,ark

Review Week 2: training acoustic model monophone clustering tree triphone Models: final.mdl, tree

Review Week 3: decoding and training lm SRILM( ngram-count/ kn-smoothing ) Kaldi – WFST decoding HTK – Viterbi decoding Vulcan( kaldi format -> HTK format ) Models: final.mmf tiedlist

Live Demo Now we integrated them into a real-world ASR system for you. You could upload your own models. Now give a shot! Experience your own ASR in a “real” way.

Live Demo https://140.112.21.35:54285/ Demo Remember: use https. Ignore the warnings.

Restriction MFCC with dim 39 only. Fixed phone set. (Chinese phones) LM must be one of unigram/bigram/trigram model.

WARNING Do not press the “Record” button before you upload your model or choose the basic model, otherwise the server will shut down.

To Do Sign up Live Demo with your account. Please fill out the Google form to apply for your account. https://goo.gl/forms/JQPo6YLLkzFszXhG2 FB Group: 數位語音專題 Test with basic model embedded in the system. Upload your model LM/LEX/TREE/MDL For better performance, you may re-train your models. Test with your own models.

To Think Compare the basic models with your own models, what is the main difference? Do you know of what kind your training data are? train.text/dev.text/test.text How about manually tagging your own lexicon and train your own language model? Guess about the training data of the basic models. Exemplify your description.

Language Model : Training Text (2/2) cut -d ' ' -f 1 --complement $train_text > ./exp/lm/LM_train.text

Language Model : ngram-count (3/3) Lexicon lexicon=material/lexicon.train.txt

FAQ Q: How to download the models in the workstation? A: FileZilla MobaXterm “sftp” or “scp” command in your linux OS.

FAQ Q: Why I always got server error? A: Make sure you got models uploaded. Is the timestamp field empty? TREE: exp/tri/tree MDL: exp/tri/final.mdl LM: exp/lm/lm.arpa.txt Lexicon:material/lexicon.decode.txt (or material/lexicon.train.txt) Warning: If using material/lexicon.train.txt, remember to add the following two lines to the top of the file. <s> sil </s> sil

FAQ Q: In corpus mode, why I always got error or 0 accuracy? A: Make sure your corpus is written under UTF-8 encoding. In notepad, the default is ANSI. In vim, the default is UTF-8.

FAQ Q: In corpus mode, why I got negative accuracy? A: Accuracy is actually calculated by (length – error ) / length. 請參考數位語音處理概論 ch8 page 11

Q&A This system is just online. Any bug is expected. If you got any question, contact TA through FB group or email instantly. We need you feedback about UI/function. Feel free saying about anything. Email: r07942095@ntu.edu.tw