Speech Assessment 語音評測 J.-S. Roger Jang ( 張智星 ) Multimedia Information Retrieval Lab CS Dept, Tsing.

Presentation transcript:

Speech Assessment 語音評測 J.-S. Roger Jang ( 張智星 ) Multimedia Information Retrieval Lab CS Dept, Tsing Hua Univ, Taiwan

-2- Outline
- Introduction
- Methods
- Problems to be solved
- Demos

-3- Speech Assessment
- Speech assessment: how should an utterance be assessed for the purpose of learning a spoken language?
  - Assessment levels: syllables, words, sentences, paragraphs
  - Assessment criteria: timbre, tone, energy, rhythm, co-articulation, ...
  - Feedback: high-level corrections and suggestions

-4- Related Disciplines
- Related disciplines for speech assessment:
  - Language learning:
    - CALL: Computer Assisted Language Learning
    - CAPT: Computer Assisted Pronunciation Training
  - Speech technology:
    - UV: Utterance Verification

-5- Our Approach
- Basic approach to timbre assessment:
  - Lexicon net construction (usually a sausage net)
  - Forced alignment to identify phone boundaries
  - Phone scoring based on several criteria, such as ranking, histograms, posterior probability, etc.
  - Weighted average of phone scores to get the syllable score
  - Weighted average of syllable scores to get the sentence score
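The last two steps of this approach (phone scores to syllable score, syllable scores to sentence score) can be sketched as below. This is only an illustration: the slide does not say which weights are used, so weighting by duration in frames is an assumption, and the function names and the sample scores are hypothetical.

```python
def weighted_average(scores, weights):
    """Return sum(score * weight) / sum(weight)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def syllable_score(phones):
    """phones: list of (phone_score, duration_in_frames) pairs.

    Assumption: each phone is weighted by its duration, as found by
    forced alignment. The slides do not specify the weighting scheme.
    """
    scores = [score for score, _ in phones]
    weights = [duration for _, duration in phones]
    return weighted_average(scores, weights)


def sentence_score(syllables):
    """syllables: list of syllables, each a list of (score, duration) pairs.

    Assumption: each syllable is weighted by its total duration.
    """
    scores = [syllable_score(phones) for phones in syllables]
    weights = [sum(duration for _, duration in phones) for phones in syllables]
    return weighted_average(scores, weights)


# Hypothetical phone scores (0-100) and durations for a two-syllable utterance.
sentence = [
    [(80.0, 10), (60.0, 30)],  # syllable 1: two phones
    [(90.0, 20)],              # syllable 2: one phone
]
print(syllable_score(sentence[0]))
print(sentence_score(sentence))
```

Other weightings (for example, uniform weights or weights that emphasize vowels) fit the same two-level structure; only the weight lists change.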

-6- Basic Assessment Criteria
- Timbre
  - Based on acoustic models
- Tone
  - Based on tone recognition (for tonal languages)
  - Based on pitch similarity with the target utterance
- Energy
  - Based on energy comparison with the target utterance
- Rhythm
  - Based on duration comparison with the target utterance
- Fluency
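As an illustration of the rhythm criterion, the sketch below compares syllable durations of a learner's utterance with those of the target utterance. The slide only states that rhythm is based on duration comparison; normalizing by total duration (to discount overall speaking rate), the distance measure, and the 0-100 mapping are all assumptions, and `rhythm_score` is a hypothetical name.

```python
def normalize(durations):
    """Convert durations to proportions of the total utterance length."""
    total = sum(durations)
    if total <= 0:
        raise ValueError("total duration must be positive")
    return [d / total for d in durations]


def rhythm_score(target_durations, test_durations):
    """Score rhythm similarity on a 0-100 scale (100 = identical proportions).

    Assumption: both utterances have already been aligned to the same
    syllable sequence, so the two lists have equal length.
    """
    if len(target_durations) != len(test_durations):
        raise ValueError("utterances must have the same number of syllables")
    target = normalize(target_durations)
    test = normalize(test_durations)
    # Half the sum of absolute differences between two proportion
    # vectors always lies in [0, 1].
    distance = 0.5 * sum(abs(a - b) for a, b in zip(target, test))
    return 100.0 * (1.0 - distance)


# Same rhythm at half the speaking rate still scores 100.
print(rhythm_score([20, 30, 50], [40, 60, 100]))
# A learner who stretches the first syllable and drops the second scores lower.
print(rhythm_score([50, 50], [100, 0]))
```

The same pattern (align, normalize, compare with the target) could be applied to energy or pitch contours, with a distance suited to those features.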

-7- Additional Assessment Criteria
- English
  - Stress
    - Levels (word or sentence)
    - Meanings
  - Intonation
    - Declarative sentence
    - Interrogative sentence
  - Co-articulation
    - "A red apple."
    - "Did you call me?"
    - "Hit and run"
- Mandarin
  - Tone
  - Retroflex or not
  - Co-articulation
    - Erhua (兒化音)

-8- Problems to be Solved
- Score-related issues
  - Optimization
  - Consistency
  - Interpretability
- Identification of confusing phones (e.g., pronunciation by Japanese speakers)
- Slight adaptation
- Paragraph-level assessment
- Content construction

-9- Demo: Practice of Four-Character Mandarin Idioms (一語中的)
- The level (difficulty) of an idiom is based on its frequency via Google search:
  - 孤掌難鳴 ===> 260,000
  - 鶼鰈情深 ===> 43,300
  - 亡鈇意鄰 ===> 22,700
  - 舉案齊眉 ===> 235,000
- Can be adapted for English learning
- Next step: multithreading, fast decoding via FSM
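A minimal sketch of how search-hit counts might be turned into difficulty levels, using the four counts listed above. The slide says only that the level is based on frequency; the assumption that rarer idioms are harder, the three-level scale, and the thresholds are hypothetical.

```python
# Hit counts as listed on the slide.
HIT_COUNTS = {
    "孤掌難鳴": 260_000,
    "鶼鰈情深": 43_300,
    "亡鈇意鄰": 22_700,
    "舉案齊眉": 235_000,
}


def difficulty_level(hit_count, thresholds=(100_000, 30_000)):
    """Map a hit count to a level: 1 (easy), 2 (medium), 3 (hard).

    Assumption: fewer search hits means a rarer, harder idiom.
    The threshold values are hypothetical.
    """
    easy_min, medium_min = thresholds
    if hit_count >= easy_min:
        return 1
    if hit_count >= medium_min:
        return 2
    return 3


def rank_by_difficulty(hit_counts):
    """Return idioms ordered from easiest (most hits) to hardest (fewest)."""
    return sorted(hit_counts, key=hit_counts.get, reverse=True)


for idiom in rank_by_difficulty(HIT_COUNTS):
    print(idiom, HIT_COUNTS[idiom], difficulty_level(HIT_COUNTS[idiom]))
```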

-10- Demo: Recitation Machine (唸唸不忘)
- Supports Mandarin and English
- Supports user-defined recitation scripts
- Next step: multithreading for recording and recognition

-11- Demo: Dialog Practice via Videos
- Dialog-based practice and evaluation

-12- Demos on PC and PMP
- PC software
  - Lucy's Café: Speech and Score
- PMP
  - Mandarin practice device (華語練習機)

-13- Demo: Embedded Systems
- Chicken Run (落跑雞)
- Penguin for Tang Poetry (唐詩企鵝)
- Robot Fighter (蘿蔔戰士)
- Singing Bass & Dog (大嘴鱸魚和唱歌狗)

-14- On-going Work
- On-going work:
  - Tone recognition and assessment
  - Retroflex and non-retroflex recognition
  - Detection of erhua (兒化音)
- Demo page:
  - http://mirlab.org/mir_main/demo.htm