Prof. Lin-shan Lee TA. Po-chun, Hsu

Slides:

Advertisements

Similar presentations

Visit the ccScan Website Scan, Import, and Automatically File documents to the Cloud SCAN, IMPORT, AND AUTOMATICALLY FILE DOCUMENTS TO SALESFORCE ® Introduction.

Advertisements

For Details Visit : or For any Help Contact the Librarian EBSCOhost 2.0.

數位語音處理概論 HW#2-1 HMM Training and Testing

專題研究 WEEK 4 - LIVE DEMO Prof. Lin-Shan Lee TA. Hsiang-Hung Lu,Cheng-Kuan Wei.

CIS101 Introduction to Computing Week 05. Agenda Your questions CIS101 Survey Introduction to the Internet & HTML Online HTML Resources Using the HTML.

專題研究 WEEK3 LANGUAGE MODEL AND DECODING Prof. Lin-Shan Lee TA. Hung-Tsung Lu.

專題研究 WEEK3 LANGUAGE MODEL AND DECODING Prof. Lin-Shan Lee TA. Hung-Tsung Lu,Cheng-Kuan Wei.

Accessing iGO Using a PC, tablet or smartphone, go to to login. Using the username and password provided on your ACR form at the.

Starter for 10 Unit 10: Flickr & YouTube Transform IT SFT10_Flickr_YouTube.

Welcome to the Second Tutorial Welcome to the second part of this communication system website tutorial! This tutorial is for church planters. When you.

Podcasting World Languages OEC 2009 JoNelle Gardner

Arthur Kunkle ECE 5525 Fall Introduction and Motivation  A Large Vocabulary Speech Recognition (LVSR) system is a system that is able to convert.

Step by step to adding sound files and embedding YouTube video to your FYIcode mobile site.

Discipline, Crime, and Violence August New DCV Application The DCV application and submission process has been revised beginning with the

DSP homework 1 HMM Training and Testing

Welcome to the World of Wikis Classroom Management: An Evidence-Based Independent Study Instructions for Navigating the wiki and submitting projects.

Request username and password if you don’t already have one.

Grid Programming on Taiwan Unigrid Platform. Outline Introduction to Taiwan Unigrid How to use Taiwan Unigrid.

Web CEO Assessment Training Module Cancellation and search Engine Ranking by Kat Orense (Oct 06, 2011)

 Go to YouTube and click “create account” on the top right of the page.YouTube  If you already have a Google account (i.e. gmail) then you may use this.

Welcome to OneNote Dan McAllister Just arriving? Sign-in near the door Grab a handout Just arriving? Sign-in near the door Grab a handout Finished for.

UDL Book Builder for Teachers Image sources: Cast UDL.

專題研究 (4) HDecode_live Prof. Lin-Shan Lee, TA. Yun-Chiao Li 1.

專題研究 (2) Feature Extraction, Acoustic Model Training WFST Decoding

Digital Speech Processing HW3

DISCRETE HIDDEN MARKOV MODEL IMPLEMENTATION DIGITAL SPEECH PROCESSING HOMEWORK #1 DISCRETE HIDDEN MARKOV MODEL IMPLEMENTATION Date: Oct, Revised.

Building the Wiki together A training manual. The page of the Economic Justice Working Group.

Homework 2 Hints. General Tips Remember what FORM view you are in! – Design, form, and layout view TABLE views include: – Design and Datasheet view.

HW2-2 Speech Analysis TA: 林賢進

BIG SALES GUIDE TO UPDATE PRODUCT PRICE IN BULK. Programs you need to have: LibreOffice Download Link, Click HERE !

How to create a website from scratch.  You should have an internet access.  Visit  You need to create a new account OR.

Migrating Wordpress Migrating Wordpress can sometimes get more complicated as it should. There is no plugin that does this for you, the best way is to.

DU REDCap Introduction

QQ Registration & Usage

beas group AG Beas sql guide Martin Heigl CTO

Automatic Speech Recognition

Development Environment

Susannah sinclair or INTRODUCTION TO SONIA Susannah sinclair or

Training Documentation – Replacing GSPR with RFQ 2.0

Date: October, Revised by 李致緯

Prof. Lin-shan Lee TA. Roy Lu

D.Y.O. Web The new and easy way to create and maintain your own professional dynamic website.

Diversey Rebates End User Manual.

專題研究 week3 Language Model and Decoding

Prof. Lin-shan Lee TA. Lang-Chi Yu

Digital Speech Processing

Using Excel with Google Maps

Bomgar Remote support software

Managing Student Test Settings

Test Information Distribution Engine (TIDE)

How Can I Download My Transactions Directly Into Quicken

Creating assignments: Best Practices

Adding and editing users

Technical Support Overview and Training

Adding and Editing Users

EEG Recognition Using The Kaldi Speech Recognition Toolkit

Volunteer & Teacher Online Registration

ALLIANCE VENDOR WEBINAR -- Annual Spend

專題研究 WEEK 5 – Deep Neural Networks in Kaldi

2007 SPEECH PROJECT PRESENTATION

Voice Activation for Wealth Management

community.afpnet.org/home

Introduction to “TEAMS”

Summer 2013 Prepared by Prof. B. INDEX

專題研究 WEEK 5 – Deep Neural Networks in Kaldi

Adobe Acrobat DC Accessibility Adobe Acrobat Functionality, Part I

Prof. Lin-shan Lee TA. Roy Lu

Da-Rong Liu, Kuan-Yu Chen, Hung-Yi Lee, Lin-shan Lee

STANDARD ACCOUNT: SOLUTION QUICK GUIDE

Chloe Riley | Research Commons Librarian |

Presentation transcript:

Prof. Lin-shan Lee TA. Po-chun, Hsu (r07942095@ntu.edu.tw) 專題研究 WEEK 4 - LIVE DEMO Prof. Lin-shan Lee TA. Po-chun, Hsu (r07942095@ntu.edu.tw)

Outline Review Live Demo introduction Restriction TODO To think FAQ

Review: achieving ASR system Automatic Speech Recognition System Input wave -> output text

Review Week 1: feature extraction compute-mfcc-feat add-delta compute-cmvn-stats apply-cmvn File format: scp,ark

Review Week 2: training acoustic model monophone clustering tree triphone Models: final.mdl, tree

Review Week 3: decoding and training lm SRILM( ngram-count/ kn-smoothing ) Kaldi – WFST decoding HTK – Viterbi decoding Vulcan( kaldi format -> HTK format ) Models: final.mmf tiedlist

Live Demo Now we integrated them into a real-world ASR system for you. You could upload your own models. Now give a shot! Experience your own ASR in a “real” way.

Live Demo https://140.112.21.35:54285/ Demo Remember: use https. Ignore the warnings.

Restriction MFCC with dim 39 only. Fixed phone set. (Chinese phones) LM must be one of unigram/bigram/trigram model.

WARNING Do not press the “Record” button before you upload your model or choose the basic model, otherwise the server will shut down.

To Do Sign up Live Demo with your account. Please fill out the Google form to apply for your account. https://goo.gl/forms/JQPo6YLLkzFszXhG2 FB Group: 數位語音專題 Test with basic model embedded in the system. Upload your model LM/LEX/TREE/MDL For better performance, you may re-train your models. Test with your own models.

To Think Compare the basic models with your own models, what is the main difference? Do you know of what kind your training data are? train.text/dev.text/test.text How about manually tagging your own lexicon and train your own language model? Guess about the training data of the basic models. Exemplify your description.

Language Model : Training Text (2/2) cut -d ' ' -f 1 --complement $train_text > ./exp/lm/LM_train.text

Language Model : ngram-count (3/3) Lexicon lexicon=material/lexicon.train.txt

FAQ Q: How to download the models in the workstation? A: FileZilla MobaXterm “sftp” or “scp” command in your linux OS.

FAQ Q: Why I always got server error? A: Make sure you got models uploaded. Is the timestamp field empty? TREE: exp/tri/tree MDL: exp/tri/final.mdl LM: exp/lm/lm.arpa.txt Lexicon:material/lexicon.decode.txt (or material/lexicon.train.txt) Warning: If using material/lexicon.train.txt, remember to add the following two lines to the top of the file. <s> sil </s> sil

FAQ Q: In corpus mode, why I always got error or 0 accuracy? A: Make sure your corpus is written under UTF-8 encoding. In notepad, the default is ANSI. In vim, the default is UTF-8.

FAQ Q: In corpus mode, why I got negative accuracy? A: Accuracy is actually calculated by (length – error ) / length. 請參考數位語音處理概論 ch8 page 11

Q&A This system is just online. Any bug is expected. If you got any question, contact TA through FB group or email instantly. We need you feedback about UI/function. Feel free saying about anything. Email: r07942095@ntu.edu.tw