Vocal microtremor in normophonic and mildly dysphonic speakers Jean Schoentgen Université Libre Bruxelles Brussels - Belgium.

Vocal microtremor (definition)
- Modulation of the phonatory frequency
- Distinct from pathological vocal tremor
- Hz (Titze, 1995; Sataloff, 1997)
- Two features: modulation level and modulation frequency

Vocal microtremor (examples)

Motivation?
- Tremor data are scarce
- Vocal jitter and microtremor are baseline phenomena
- Measurement of vocal tremor frequency via the cycle length time series
- Test predictions of a simulation model of jitter and tremor (Schoentgen, 2001)

Experiment I: Objectives?
- Recording data (tremor level and frequency)
- Differences between vowel timbres?
- Differences between male and female speakers?
- Differences between normophonic and mildly dysphonic speakers?

Corpora
- Sustained vowels [a], [i], [u]
- 22 males, 16 females (normophonic)
- 16 males, 28 females (dysphonic)
- Voice type: monocycle periodic
- Register: modal
- No register or type breaks, or voice arrests
- No cycle length outliers
- No excessive additive noise or jitter
- No pathological vocal tremor

Method (tremor frequency)
1. Estimation of the average cycle length
2. Upsampling (160 kHz) and low-pass filtering of the speech signal
3. Extraction of the vocal cycle length time series via peak picking
4. Removal of frequency drift or glissando
5. Calculation of the magnitude spectrum of the time series
6. Search for the statistically significant spectral peaks
7. Tremor frequency = weighted average of spectral peak positions
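Steps 4 to 7 above can be sketched as follows, starting from a cycle length series that has already been extracted. This is a minimal illustration under stated assumptions, not the procedure used in the study: drift removal is reduced to subtracting a straight line, the statistical significance test is replaced by a simple relative threshold, and the function name, the `min_hz` and `max_hz` search limits and the `rel_threshold` value are all hypothetical.

```python
import cmath
import math

def tremor_frequency(cycle_lengths, min_hz=1.0, max_hz=15.0, rel_threshold=0.5):
    """Magnitude-weighted mean position (Hz) of the main spectral peaks.

    cycle_lengths: successive vocal cycle lengths in seconds.
    min_hz, max_hz, rel_threshold: illustrative values, not from the slides.
    """
    n = len(cycle_lengths)
    mean_len = sum(cycle_lengths) / n

    # Drift removal (simplified): subtract a least-squares straight line.
    mid = (n - 1) / 2.0
    sxx = sum((i - mid) ** 2 for i in range(n))
    slope = sum((i - mid) * (c - mean_len) for i, c in enumerate(cycle_lengths)) / sxx
    resid = [c - mean_len - slope * (i - mid) for i, c in enumerate(cycle_lengths)]

    # Magnitude spectrum by direct DFT. There is one sample per cycle, so the
    # sampling rate of the series is approximated by 1 / (average cycle length).
    fs = 1.0 / mean_len
    bins = []
    for k in range(1, n // 2):
        freq = k * fs / n
        if freq < min_hz:
            continue
        if freq > max_hz:
            break
        coeff = sum(r * cmath.exp(-2j * math.pi * k * i / n) for i, r in enumerate(resid))
        bins.append((freq, abs(coeff)))

    # Stand-in for the significance test: keep local maxima whose magnitude
    # reaches a fixed fraction of the largest one.
    top = max(m for _, m in bins)
    peaks = []
    for j, (freq, mag) in enumerate(bins):
        left = bins[j - 1][1] if j > 0 else 0.0
        right = bins[j + 1][1] if j < len(bins) - 1 else 0.0
        if mag >= rel_threshold * top and mag >= left and mag >= right:
            peaks.append((freq, mag))

    # Tremor frequency = weighted average of the retained peak positions.
    return sum(f * m for f, m in peaks) / sum(m for _, m in peaks)
```

The lower limit `min_hz` plays the role of a cutoff below which spectral peaks are not counted as microtremor, so changing it can change the reported modulation frequency.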

Examples of spectra

Method (tremor level)
1. Upsampling (160 kHz) and low-pass filtering of the speech signal
2. Extraction of the vocal cycle length time series via peak picking
3. Removal of frequency drift or glissando
4. Smoothing of the time series to decrease jitter
5. Tremor level = standard deviation of the smoothed cycle length perturbations, divided by the average cycle length
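Steps 4 and 5 above might look like the sketch below. The slides do not say which smoother was used, so a centered moving average is assumed here purely for illustration; the function name and the `window` length are hypothetical, and drift removal is assumed to have been done beforehand.

```python
import math

def tremor_level(cycle_lengths, window=5):
    """Relative tremor level in percent.

    cycle_lengths: drift-free vocal cycle lengths in seconds.
    window: odd length of the moving average (an assumed smoother).
    """
    n = len(cycle_lengths)
    mean_len = sum(cycle_lengths) / n
    perturbations = [c - mean_len for c in cycle_lengths]

    # Smooth the perturbations so that fast cycle-to-cycle jitter is attenuated
    # while the slower modulation is kept. Edge samples are dropped.
    half = window // 2
    smoothed = []
    for i in range(half, n - half):
        segment = perturbations[i - half:i + half + 1]
        smoothed.append(sum(segment) / len(segment))

    # Sample standard deviation of the smoothed perturbations.
    centre = sum(smoothed) / len(smoothed)
    variance = sum((s - centre) ** 2 for s in smoothed) / (len(smoothed) - 1)

    # Divide by the average cycle length and express as a percentage.
    return 100.0 * math.sqrt(variance) / mean_len
```

A longer window removes more jitter but also attenuates faster modulation components, so the choice of smoother affects the level that is reported.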

Results (1)
No statistically significant differences in modulation frequency (Hz) or modulation level (%) between:
- Male and female speakers
- Normophonic and mildly dysphonic speakers
- Vowel timbres

Results (2)
Feature                 Inter-quartile interval
Modulation level        0.4 % %
Modulation frequency    Hz

Results (3)
Dissimilarities between modulation data reported by different studies are due to different cutoff frequencies below which spectral peaks are considered not to contribute to vocal microtremor.

Experiment II: Objective
Compare the size of the vocal cycle length perturbations owing to jitter and to frequency tremor.

Corpus & Method
22 male and 16 female speakers sustained [a], [i] and [u].
- Upsampling (160 kHz) and low-pass filtering of the speech signal
- Extraction of the vocal cycle length time series via signal zero-crossings
- Removal of frequency drift or glissando
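Extraction of cycle lengths from zero-crossings can be sketched as below. The sketch assumes that the signal has already been low-pass filtered so that each vocal cycle contains exactly one negative-to-positive crossing; the function name is hypothetical and the linear interpolation between samples is an illustrative choice, not a detail given in the slides.

```python
import math

def cycle_lengths_from_zero_crossings(signal, fs):
    """Cycle lengths in seconds from negative-to-positive zero-crossings.

    signal: samples of the low-pass filtered speech signal.
    fs: sampling frequency in Hz (for example 160000 after upsampling).
    """
    crossing_times = []
    for i in range(1, len(signal)):
        before, after = signal[i - 1], signal[i]
        if before < 0.0 <= after:
            # Linear interpolation locates the crossing between two samples.
            fraction = -before / (after - before)
            crossing_times.append((i - 1 + fraction) / fs)
    # One cycle length per pair of successive crossings.
    return [t2 - t1 for t1, t2 in zip(crossing_times, crossing_times[1:])]
```

A high sampling frequency matters here because the perturbations of interest are a small fraction of a cycle, so the timing of each crossing must be resolved much more finely than the cycle length itself.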

Linear auto-regressive analysis of the cycle length time series (e.g. Schoentgen, 1995)
- Present perturbation = weighted sum of past perturbations + de-correlated noise
- De-correlated noise -> vocal jitter
- Weighted sum -> vocal tremor (by default)
- Calculate the sample standard deviation of each (and divide by the average cycle length)
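The decomposition above can be illustrated with the following sketch. It assumes a Yule-Walker fit solved by the Levinson-Durbin recursion and an arbitrary model order; the slides specify neither the estimator nor the order, and all function names are hypothetical. The input series is assumed to be drift-free and not constant.

```python
import math

def ar_coefficients(x, order):
    """Yule-Walker AR coefficients via the Levinson-Durbin recursion.

    Returns a list a such that x[t] is predicted by sum(a[k] * x[t - 1 - k]).
    """
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag)) / n for lag in range(order + 1)]
    a = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[k] * r[m - 1 - k] for k in range(m - 1))
        refl = acc / err
        a = [a[k] - refl * a[m - 2 - k] for k in range(m - 1)] + [refl]
        err *= 1.0 - refl * refl
    return a

def sample_std(values):
    centre = sum(values) / len(values)
    return math.sqrt(sum((v - centre) ** 2 for v in values) / (len(values) - 1))

def split_tremor_and_jitter(cycle_lengths, order=4):
    """Return (tremor %, jitter %) from a drift-free cycle length series."""
    n = len(cycle_lengths)
    mean_len = sum(cycle_lengths) / n
    x = [c - mean_len for c in cycle_lengths]
    a = ar_coefficients(x, order)
    # Weighted sum of past perturbations: attributed to tremor (by default).
    predicted = [sum(a[k] * x[t - 1 - k] for k in range(order)) for t in range(order, n)]
    # De-correlated remainder: attributed to jitter.
    residual = [x[t] - p for t, p in zip(range(order, n), predicted)]
    return (100.0 * sample_std(predicted) / mean_len,
            100.0 * sample_std(residual) / mean_len)
```

For a series made of a slow sinusoidal modulation plus weak uncorrelated noise, the first returned value should dominate, since the modulation is predictable from past cycles and the noise is not.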

Example

Results (1): [a] (inter-quartile ranges)
                              male    female
Relative raw perturbations    %       %
Modulation level              %       %
Relative vocal jitter         %       %

Results (2)
- Vocal jitter (%) < vocal tremor (%) (statistically significant)
- Moderate significant correlation between vocal jitter and tremor (in %)
- No significant tremor differences between vowel timbres
- No significant tremor differences between speaker genders

Conclusion
- Vocal frequency (micro)tremor data can be obtained via the cycle length time series
- This may be generalized to pathological tremor data, but an additional stage may then be required: re-sampling the cycle length time series at equal intervals
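The re-sampling stage mentioned in the conclusion could be sketched as below. A cycle length series has one value per cycle, so its samples are unevenly spaced in time whenever the cycle lengths vary. The sketch places each value at the onset time of its cycle and interpolates linearly onto a uniform grid; the function name, the choice of linear interpolation and the target rate are assumptions made for illustration.

```python
def resample_equal_intervals(cycle_lengths, rate_hz):
    """Linearly interpolate a cycle length series onto a uniform time grid.

    cycle_lengths: at least two successive cycle lengths in seconds.
    rate_hz: sampling rate of the output series (an assumed parameter).
    """
    # Onset time of each cycle = sum of the lengths of the preceding cycles.
    times = []
    elapsed = 0.0
    for c in cycle_lengths:
        times.append(elapsed)
        elapsed += c

    step = 1.0 / rate_hz
    resampled = []
    j = 0
    k = 0
    while k * step <= times[-1]:
        t = k * step
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] <= t:
            j += 1
        weight = (t - times[j]) / (times[j + 1] - times[j])
        resampled.append(cycle_lengths[j] + weight * (cycle_lengths[j + 1] - cycle_lengths[j]))
        k += 1
    return resampled
```

After this stage the series has a well-defined sampling rate, so spectral peak positions can be read directly in Hz without approximating the rate by the inverse of the average cycle length.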