Towards Automatic Fluency Assessment

Presentation transcript:

Towards Automatic Fluency Assessment
Suma Bhat

The Problem
- Fluency: a subjective quantity
- Design of the right quantifiers is critical
- Measurement of the quantifiers is required
- Problem: automatic assessment

Problem is Hard
- Fluency is a subjective quantity
- Not readily measurable
- More than just the opposite of disfluency
- Key: design of good quantifiers

Previous Work
- Main work by Cucchiarini et al.
- Quantitative assessment of fluency in speech is possible
- Measured quantifiers: syllable rate and frequency of pauses
- Use of ASR

Our Work
- Previous work used language-specific training data to build the ASR
- Our focus: low-level acoustic measurements
- Our thesis: low-level acoustic variables are good quantifiers of fluency

Experiments
- Start with the speech signal
- View acoustic data at a coarse level
- Syllables are well represented by their corresponding vocalic nuclei
- Vocalic nuclei and silent pauses can be detected automatically
- Goal: correlate with human judgment

Data
- Data from rated assessment
- Classroom recordings of second-language Chinese speech
- 20 s snippets rated for: Speech Flow, Phonological Control, Lexical Accuracy, Disfluency, Delivery Skills, Fluency
- Fluency most correlated with Speech Flow and Disfluency

Assessment of Fluency (Speech Flow, Disfluency)

dur1 = duration of speech without silent pauses
dur2 = total duration of speech

Name                                  Definition
Articulation rate                     Number of syllable nuclei / dur1
Rate of speech                        Number of syllable nuclei / dur2
Phonation/time ratio                  dur1 / dur2
Mean length of silent pauses          Mean length of silent pauses
Number of silent pauses per second    Number of silent pauses / dur2
Number of filled pauses per second    Number of filled pauses / dur2
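The definitions above can be sketched in a few lines of Python. This is only an illustration, not the code behind the slides: the function name, the dataclass, and the input format (a syllable-nucleus count, a list of silent-pause durations, a filled-pause count) are assumptions, as is the reading that dur1 equals dur2 minus the total silent-pause time.

```python
from dataclasses import dataclass


@dataclass
class FluencyMeasures:
    articulation_rate: float         # syllable nuclei / dur1
    rate_of_speech: float            # syllable nuclei / dur2
    phonation_time_ratio: float      # dur1 / dur2
    mean_silent_pause_length: float  # seconds
    silent_pauses_per_second: float  # silent pauses / dur2
    filled_pauses_per_second: float  # filled pauses / dur2


def fluency_measures(n_nuclei, silent_pause_durations, n_filled_pauses, dur2):
    """Compute the six quantifiers for one snippet (hypothetical helper).

    dur2 is the total duration in seconds; dur1 is assumed to be dur2
    minus the summed silent-pause time.
    """
    total_pause = sum(silent_pause_durations)
    dur1 = dur2 - total_pause
    if dur2 <= 0 or dur1 <= 0:
        raise ValueError("durations must be positive")
    n_pauses = len(silent_pause_durations)
    mean_pause = total_pause / n_pauses if n_pauses else 0.0
    return FluencyMeasures(
        articulation_rate=n_nuclei / dur1,
        rate_of_speech=n_nuclei / dur2,
        phonation_time_ratio=dur1 / dur2,
        mean_silent_pause_length=mean_pause,
        silent_pauses_per_second=n_pauses / dur2,
        filled_pauses_per_second=n_filled_pauses / dur2,
    )
```

For a made-up 20 s snippet with 40 nuclei, three silent pauses totalling 2 s, and two filled pauses, `fluency_measures(40, [1.0, 0.5, 0.5], 2, 20.0)` gives a rate of speech of 2.0 per second and a phonation/time ratio of 0.9.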

Acoustic Measurements
- Downsample to 16 kHz
- Use intensity information
- Segment the utterance into regions of speech and silence (gives silent-pause-related information)
- Detect vocalic regions in the speech segments (gives syllable-related information)
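The slide does not say how the intensity-based segmentation was done. One simple possibility, shown here purely as a sketch, is to compute frame-level RMS intensity in dB and mark as silent any sufficiently long run of frames that falls well below the loudest frame. The function name, the 10 ms frame size, the 25 dB drop, and the 0.2 s minimum pause length are all assumptions, not values from the presentation.

```python
import math


def find_silent_pauses(samples, sample_rate=16000, frame_s=0.01,
                       drop_db=25.0, min_pause_s=0.2):
    """Return (start_s, end_s) pairs for silent pauses in a mono signal.

    A frame counts as silent when its RMS level is more than `drop_db`
    below the loudest frame; runs shorter than `min_pause_s` are ignored.
    All thresholds are illustrative assumptions.
    """
    frame_len = int(sample_rate * frame_s)
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return []

    # Frame-level intensity in dB (floored to avoid log of zero).
    levels = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        levels.append(20.0 * math.log10(max(rms, 1e-10)))
    threshold = max(levels) - drop_db

    # Collect runs of silent frames that are long enough to be pauses.
    pauses = []
    start = None
    for i in range(n_frames + 1):
        silent = i < n_frames and levels[i] < threshold
        if silent and start is None:
            start = i
        elif not silent and start is not None:
            begin_s = start * frame_len / sample_rate
            end_s = i * frame_len / sample_rate
            if end_s - begin_s >= min_pause_s:
                pauses.append((begin_s, end_s))
            start = None
    return pauses
```

The returned pause list supplies the silent-pause count and durations; subtracting the total pause time from the snippet length gives the speech duration without pauses. Detecting vocalic nuclei would need a further step, for example peak-picking on the intensity contour within speech regions, which is not shown.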

Correlation with Human Rating

Acoustic Measurement          Human-Rated Speech Flow   Human-Rated Disfluency
Articulation rate                      0.19                     0.38
Rate of speech                         0.30                     0.5
Phonation time ratio                   0.58                     0.72
Length of silent pause                -0.36                    -0.35
Frequency of silent pause             -0.4                     -0.52
Frequency of filled pause             -0.21                    -0.43
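As a quick reading aid, the correlations reported above can be ranked programmatically. The snippet below only re-enters the table values; the dictionary layout and the helper name are illustrative choices.

```python
# Correlations copied from the table: (Speech Flow, Disfluency).
CORRELATIONS = {
    "Articulation rate": (0.19, 0.38),
    "Rate of speech": (0.30, 0.5),
    "Phonation time ratio": (0.58, 0.72),
    "Length of silent pause": (-0.36, -0.35),
    "Frequency of silent pause": (-0.4, -0.52),
    "Frequency of filled pause": (-0.21, -0.43),
}


def extremes(column):
    """Return (most positive, most negative) measure for a rating column.

    column 0 is Speech Flow, column 1 is Disfluency.
    """
    by_value = sorted(CORRELATIONS, key=lambda name: CORRELATIONS[name][column])
    return by_value[-1], by_value[0]


print(extremes(0))  # ('Phonation time ratio', 'Frequency of silent pause')
print(extremes(1))  # ('Phonation time ratio', 'Frequency of silent pause')
```

For both rating dimensions, phonation time ratio is the strongest positive correlate and frequency of silent pauses the strongest negative one.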

Mean Values of Quantifiers

Acoustic Measurement              Non-fluent   Fluent
Articulation rate (/s)               2.59       3.21
Rate of speech (/s)                  1.22       1.93
Phonation time ratio                 0.47       0.59
Length of silent pause (s)           1.18       1.05
Frequency of silent pause (/s)       0.83       0.58
Frequency of filled pause (/s)       0.1        0.16
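From the definitions, rate of speech equals articulation rate times phonation time ratio for any single snippet, since (nuclei/dur1) x (dur1/dur2) = nuclei/dur2. The sketch below applies that identity to the group means in the table as a rough sanity check; the helper name is an assumption. Exact agreement is not expected, because a mean of products generally differs from a product of means.

```python
def implied_rate_of_speech(articulation_rate, phonation_time_ratio):
    """Rate of speech implied by the per-snippet identity ROS = AR * PTR."""
    return articulation_rate * phonation_time_ratio


# Group means copied from the table: (articulation rate, phonation ratio, rate of speech).
GROUP_MEANS = {
    "Non-fluent": (2.59, 0.47, 1.22),
    "Fluent": (3.21, 0.59, 1.93),
}

for group, (ar, ptr, reported) in GROUP_MEANS.items():
    implied = round(implied_rate_of_speech(ar, ptr), 2)
    print(f"{group}: implied {implied}, reported {reported}")
# Non-fluent: implied 1.22, reported 1.22
# Fluent: implied 1.89, reported 1.93
```

The implied and reported values agree closely for both groups, which is consistent with the three measures being computed from the same nucleus counts and durations.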

Conclusion
Key acoustic features:
- Phonation time ratio is most positively correlated with Speech Flow
- Frequency of silent pauses is most negatively correlated with Speech Flow
Summary: quantifiers obtained by low-level acoustic measurements are useful for non-rated assessment

Looking Ahead
- Measurements on more rated speech
- Detection of filled pauses
- Look for additional quantifiers: poor pronunciation, vocabulary richness
- Automatic classification of fluency