Towards Automatic Fluency Assessment
Suma Bhat
The Problem
- Fluency: a subjective quantity
- Design of the right quantifiers is critical
- Measurement of the quantifiers is required
- Problem: automatic assessment
Problem is Hard
- Fluency
  - Subjective quantity
  - Not readily measurable
  - More than just the opposite of disfluency
- Key: design of good quantifiers
Previous Work
- Main work by Cucchiarini et al.
  - Quantitative assessment of fluency in speech is possible
  - Quantifiers measured: syllable rate and frequency of pauses
  - Use of ASR
Our Work
- Previous work used language-specific training data to build ASR
- Our focus: low-level acoustic measurements
- Our thesis: low-level acoustic variables are good quantifiers of fluency
Experiments
- Start with the speech signal
- View acoustic data at a coarse level
- Syllables are well represented by their corresponding vocalic nuclei
- Vocalic nuclei and silent pauses can be detected automatically
- Goal: correlate with human judgment
Data
- Data from rated assessment
- Classroom recordings of second-language Chinese speech
- 20 s snippets rated for:
  - Speech Flow
  - Phonological Control
  - Lexical Accuracy
  - Disfluency
  - Delivery Skills
  - Fluency
- Fluency is most correlated with Speech Flow and Disfluency
Assessment of Fluency
- Speech Flow and Disfluency
- dur1 = duration of speech without silent pauses
- dur2 = total duration of speech

Name                                  Definition
Articulation rate                     Number of syllable nuclei / dur1
Rate of speech                        Number of syllable nuclei / dur2
Phonation/time ratio                  dur1 / dur2
Mean length of silent pauses          Total length of silent pauses / number of silent pauses
Number of silent pauses per second    Number of silent pauses / dur2
Number of filled pauses per second    Number of filled pauses / dur2
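The definitions in the table can be written out as a short Python sketch. The function and field names are illustrative rather than taken from the slides, and the sketch assumes that dur1 is obtained as dur2 minus the summed length of the silent pauses. The input values in the example are made up.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FluencyMeasures:
    articulation_rate: float          # syllable nuclei / dur1
    rate_of_speech: float             # syllable nuclei / dur2
    phonation_time_ratio: float       # dur1 / dur2
    mean_silent_pause_length: float   # seconds
    silent_pauses_per_second: float   # silent pauses / dur2
    filled_pauses_per_second: float   # filled pauses / dur2


def fluency_measures(n_nuclei: int,
                     silent_pause_lengths: Sequence[float],
                     n_filled_pauses: int,
                     dur2: float) -> FluencyMeasures:
    """Compute the six quantifiers for one snippet (durations in seconds)."""
    total_pause = sum(silent_pause_lengths)
    # Assumption: speaking time is total time minus silent-pause time.
    dur1 = dur2 - total_pause
    if dur2 <= 0 or dur1 <= 0:
        raise ValueError("durations must leave a positive speaking time")
    n_silent = len(silent_pause_lengths)
    mean_pause = total_pause / n_silent if n_silent else 0.0
    return FluencyMeasures(
        articulation_rate=n_nuclei / dur1,
        rate_of_speech=n_nuclei / dur2,
        phonation_time_ratio=dur1 / dur2,
        mean_silent_pause_length=mean_pause,
        silent_pauses_per_second=n_silent / dur2,
        filled_pauses_per_second=n_filled_pauses / dur2,
    )


# Made-up 20 s snippet: 30 nuclei, three silent pauses, two filled pauses.
example = fluency_measures(30, [1.0, 1.5, 2.5], 2, 20.0)
print(example)
```

Returning all six values in one record keeps the quantifiers for a snippet together, which is convenient when they are later compared against human ratings.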
Acoustic Measurements
- Downsample to 16 kHz
- Use intensity information
- Segment the utterance into regions of speech and silence: gives silent-pause-related information
- Detect vocalic regions in the speech segments: gives syllable-related information
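One possible way to carry out the speech/silence segmentation step from intensity alone is sketched below in plain Python. This is not the procedure used in the study: the 10 ms frame, the 25 dB drop below the loudest frame, and the 0.2 s minimum pause length are all assumed values, and the function names are hypothetical. Detection of vocalic regions is not covered.

```python
import math
from typing import List, Sequence, Tuple


def frame_intensity_db(samples: Sequence[float], sample_rate: int,
                       frame_s: float = 0.01) -> Tuple[List[float], int]:
    """Mean-square intensity in dB for consecutive non-overlapping frames."""
    frame_len = max(1, int(round(sample_rate * frame_s)))
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        levels.append(10.0 * math.log10(mean_sq + 1e-12))  # floor avoids log(0)
    return levels, frame_len


def find_silent_pauses(samples: Sequence[float], sample_rate: int,
                       threshold_db: float = 25.0,   # assumed value
                       min_pause_s: float = 0.2      # assumed value
                       ) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of sufficiently long quiet runs."""
    levels, frame_len = frame_intensity_db(samples, sample_rate)
    if not levels:
        return []
    cutoff = max(levels) - threshold_db
    min_frames = max(1, round(min_pause_s * sample_rate / frame_len))
    pauses = []
    run_start = None
    # The infinite sentinel closes a quiet run that reaches the end of the signal.
    for i, level in enumerate(levels + [float("inf")]):
        if level < cutoff:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_frames:
                pauses.append((run_start * frame_len / sample_rate,
                               i * frame_len / sample_rate))
            run_start = None
    return pauses


# Synthetic check signal: 1 s tone, 0.5 s silence, 1 s tone at 16 kHz.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(sr)]
signal = tone + [0.0] * (sr // 2) + tone
print(find_silent_pauses(signal, sr))
```

The pause intervals returned here supply the number and lengths of silent pauses, and subtracting their total from the utterance length gives the speaking time.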
Correlation with Human Rating

Acoustic measurement          Human-rated Speech Flow    Human-rated Disfluency
Articulation rate             0.19                       0.38
Rate of speech                0.30                       0.5
Phonation time ratio          0.58                       0.72
Length of silent pause        -0.36                      -0.35
Frequency of silent pause     -0.4                       -0.52
Frequency of filled pause     -0.21                      -0.43
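The slides do not say which correlation coefficient was used. Assuming a Pearson correlation between one acoustic measurement and one human rating across snippets, the computation could look like the sketch below. The function name and the sample values are illustrative only and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up values: phonation time ratio and a human rating for five snippets.
phonation_ratio = [0.45, 0.52, 0.58, 0.61, 0.70]
human_rating = [2.0, 3.0, 3.0, 4.0, 5.0]
print(round(pearson_r(phonation_ratio, human_rating), 2))
```

A value near 1 or -1 indicates that the measurement tracks the rating closely, while a value near 0 indicates little linear relationship.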
Mean Values of Quantifiers

Acoustic measurement               Non-fluent    Fluent
Articulation rate (/s)             2.59          3.21
Rate of speech (/s)                1.22          1.93
Phonation time ratio               0.47          0.59
Length of silent pause (s)         1.18          1.05
Frequency of silent pause (/s)     0.83          0.58
Frequency of filled pause (/s)     0.1           0.16
Conclusion
- Key acoustic features
  - Phonation time ratio is most positively correlated with Speech Flow
  - Frequency of silent pauses is most negatively correlated with Speech Flow
- Summary: quantifiers obtained by low-level acoustic measurements are useful for non-rated assessment
Looking Ahead
- Measurements on more rated speech
- Detection of filled pauses
- Look for additional quantifiers
  - Poor pronunciation
  - Vocabulary richness
- Automatic classification of fluency