2007 SPEECH PROJECT PRESENTATION

Speakers: 吳宗翰, 陳柏偉. Advisor: Prof. Lin-Shan Lee. Mentor: MNO2. Date: 2007/10/4

Outline: Overall Tasks; Acoustic Model; Language Model; Normalization; Conversion; Linux Learning and Findings; Future Work; Reference

Overall Tasks

Acoustic Model (1/2) Resources: hmmset.mmf, training.scp

Acoustic Model (2/2) Goal: retrain hmmset.mmf. Difficulties: 1. Error codes. 2. Unfamiliarity with the command flags and directories.
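One way to narrow down directory-related errors before retraining is to check that every file listed in training.scp actually exists. The following is a minimal sketch, assuming the script file lists one path per line; the function name `missing_scp_entries` and the `base_dir` parameter are hypothetical and not part of HTK.

```python
from pathlib import Path


def missing_scp_entries(scp_path, base_dir="."):
    """Return the entries of a script file whose paths do not exist on disk.

    Assumption: one path per line. Relative paths are resolved against
    base_dir, which defaults to the current working directory.
    """
    missing = []
    for line in Path(scp_path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        path = Path(entry)
        if not path.is_absolute():
            path = Path(base_dir) / path
        if not path.exists():
            missing.append(entry)
    return missing
```

Calling `missing_scp_entries("training.scp")` from the directory where the training command is run returns an empty list when all listed files are reachable.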

Language Model Fundamental: corpora → .lm (the language model file is built from text corpora). Resource: yahoo_news_utf8
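The step from corpora to a language model can be illustrated with bigram counts. This is a sketch only, assuming the corpus is already segmented into words; it uses plain maximum-likelihood estimates, whereas a real toolkit would add smoothing and write a .lm file. The function name `bigram_probabilities` and the `<s>`/`</s>` boundary markers are illustrative choices, not taken from the slides.

```python
from collections import Counter


def bigram_probabilities(sentences):
    """Estimate P(word | previous word) by maximum likelihood.

    sentences: iterable of word lists, already segmented.
    Returns a dict mapping (previous, word) to a probability.
    """
    history_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        history_counts.update(tokens[:-1])          # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return {
        (prev, word): count / history_counts[prev]
        for (prev, word), count in bigram_counts.items()
    }


corpus = [["今天", "天氣", "好"], ["今天", "下雨"]]
probs = bigram_probabilities(corpus)
print(probs[("今天", "天氣")])
```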

Normalization Deals with: 1. Arabic numerals. 2. English in full caps. Result: norm_yahoo_news.txt
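The slide does not say how the two cases were handled, so the following is only a sketch under stated assumptions: each Arabic digit is read as a single Chinese digit, and a run of capital letters is spelled out letter by letter. Reading digits one by one suits years and phone numbers but not quantities (25 would need 二十五), which a full normalizer has to handle separately. The name `normalize` is hypothetical.

```python
import re

# Assumption: digits are read one at a time.
DIGITS = dict(zip("0123456789", "零一二三四五六七八九"))


def normalize(text):
    """Replace Arabic digits and spell out all-caps English runs."""
    # Map every ASCII digit to the corresponding Chinese digit character.
    text = re.sub(r"[0-9]", lambda m: DIGITS[m.group()], text)
    # Split runs of two or more capital letters into space-separated letters.
    text = re.sub(r"[A-Z]{2,}", lambda m: " ".join(m.group()), text)
    return text


print(normalize("2007年NBA"))
```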

Conversion Convert encoding: utf8 → big5. Difficulties: the program aborted during conversion, and the aborted run produced no output. Cause
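The transcript does not record the cause of the abort. One common reason a UTF-8 to Big5 conversion stops, offered here as an assumption and not as the presenters' diagnosis, is a character that has no Big5 code point. The sketch below converts with Python's built-in `big5` codec and lists such characters; the function names are hypothetical.

```python
def utf8_to_big5(data, errors="strict"):
    """Convert UTF-8 bytes to Big5 bytes.

    With errors="strict" an unencodable character raises UnicodeEncodeError;
    with errors="replace" it is written as "?" so the run still produces output.
    """
    return data.decode("utf-8").encode("big5", errors=errors)


def find_unencodable(text, encoding="big5"):
    """Return (index, character) pairs that the target encoding cannot represent."""
    bad = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            bad.append((index, char))
    return bad
```

Running `find_unencodable` on the corpus first shows which characters would stop a strict conversion, so they can be removed or replaced during normalization.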

Segment Fundamental: separate the text into individual words (word segmentation). Difficulties: the command “make”
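The segmenter used in the project is not named in the slides. As an illustration of what word segmentation does, here is a sketch of forward maximum matching, a simple dictionary-based technique that may differ from the tool the presenters built with make. The lexicon and the `max_len` limit are hypothetical.

```python
def segment(text, lexicon, max_len=4):
    """Split text into words by forward maximum matching.

    At each position the longest lexicon entry (up to max_len characters)
    is taken; a character with no match becomes a one-character word.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


lexicon = {"語音", "辨識", "系統", "語音辨識"}
print(segment("語音辨識系統", lexicon))
```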

Learnings and Findings: The use of Linux. Searching the HTKBook, Google, and discussion boards.

Future Work: Overcome the problems we met when trying to retrain the acoustic model. Finish the language model. Add more mixtures to improve recognition.

Reference
PPT used in Speech Project 2007 Winter:
1. Lin-Shan Lee. “Introduction to Speech Signal Processing”.
2. Meng, Chao-Hong. “Speech Project 2007 Winter”.
3. Meng, Chao-Hong. “Documents”.
4. 林光哲. “HTK-based Basic Recognition System” (HTK-based基礎辨識系統).
Linux: 鳥哥的Linux. http://linux.vbird.org/
Perl: http://www.hcchien.org/
HTKBook