Assessment of Vocal Noise via Bi-directional Long-term Linear Prediction of Running Speech F. Bettens *, F. Grenez *, J. Schoentgen *,** * Université Libre.

Presentation transcript:

Assessment of Vocal Noise via Bi-directional Long-term Linear Prediction of Running Speech
F. Bettens*, F. Grenez*, J. Schoentgen*,**
* Université Libre de Bruxelles
** National Fund for Scientific Research, Belgium

Cause: Vocal Dysperiodicities
- Vocal Fold Dynamics: Diplophonia, Bi-Phonation, Random Vibrations
- Perturbations: Vocal Jitter & Shimmer, Frequency & Amplitude Tremor
- (Audible) Additive Noise Owing to Turbulence: Breathiness, Breathy Voice, Whispery Voice, …
- “Parasitic” Vibrations: Vibrations of the Ventricular Folds or Ary-Epiglottic Ligaments, …
- Transients: Pitch Breaks, Phonation Breaks, Timbre Breaks, …

Existing Cues of Vocal Noise
Detection of individual vocal cycles (or harmonics)
⇒ Steady vowel fragments
⇒ (Pseudo-)Periodicity
- Period Perturbation Quotient
- Amplitude Perturbation Quotient
- Harmonics-to-Noise Ratio

Objectives: Analyses of Dysperiodicities
Drop the requirement that speech fragments be:
- (Pseudo-)Periodic
- Steady
Any Speech Fragment:
- Modal Voices & (Very) Hoarse Voices
- Sustained Vowels & Running Speech

Motivation: Analysis of Running Speech
Voicing in running speech
- Variable acoustic impedance
- Voicing onsets & offsets
- Variable pressure drops
- Variable laryngeal positions
Voice Loading

Double Linear Predictive Analysis
Conventional short-term linear prediction: forward short-term prediction error e_S[n]
Long-term linear prediction: forward double prediction error e_L[n]
Remove existing correlations ⇒ unpredictable noise component (Qi, 1999)
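
A minimal sketch of the two prediction stages in their textbook form. The prediction order p, the coefficients a_i, the single long-term tap b, and the use of one tap rather than several are assumptions made for illustration; they are not taken from the slides.

```latex
% Stage 1: short-term linear prediction of the speech signal x[n]
e_S[n] = x[n] - \sum_{i=1}^{p} a_i \, x[n-i]

% Stage 2: long-term linear prediction applied to the short-term error,
% with P the long-term prediction distance
e_L[n] = e_S[n] - b \, e_S[n-P]
```

In this form e_L[n] is what remains after both short-range and cycle-to-cycle correlations have been removed.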

Double Linear Predictive Analysis
Drawbacks:
- e_S[n] is an artificial signal
- the dysperiodicities in the weighted sum of x[n] are omitted
- e_L[n] is inflated to the right of unvoiced/voiced boundaries
⇒ Solutions:
- remove the short-term linear predictive analysis stage
- proceed to bi-directional analysis

Bi-directional Long-term Prediction
Forward long-term linear prediction: forward long-term prediction error
Backward long-term linear prediction: backward long-term prediction error
Bi-directional long-term linear prediction: keep the “best” (frame by frame): bi-directional long-term prediction error
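
The frame-by-frame selection can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the single-tap predictor, the least-squares gain fitted per frame, the zero prediction where the reference sample falls outside the signal, and the names `frame_error`, `energy` and `bidirectional_error` are all hypothetical. "Best" is taken here to mean the error with the lower energy in the frame.

```python
def frame_error(x, start, stop, lag):
    """Long-term prediction error on x[start:stop].

    Each sample x[n] is predicted as gain * x[n - lag]. A positive lag looks
    into the past (forward prediction), a negative lag into the future
    (backward prediction). The gain is a least-squares fit on this frame
    (an assumption of this sketch).
    """
    # Reference samples one prediction distance away; 0.0 where unavailable.
    ref = [x[n - lag] if 0 <= n - lag < len(x) else 0.0
           for n in range(start, stop)]
    cur = x[start:stop]
    den = sum(r * r for r in ref)
    gain = sum(c * r for c, r in zip(cur, ref)) / den if den > 0.0 else 0.0
    return [c - gain * r for c, r in zip(cur, ref)]


def energy(e):
    """Sum of squared samples."""
    return sum(v * v for v in e)


def bidirectional_error(x, lag, frame_len):
    """Keep, frame by frame, whichever of the two errors has less energy."""
    out = []
    for start in range(0, len(x), frame_len):
        stop = min(start + frame_len, len(x))
        fwd = frame_error(x, start, stop, lag)    # predicted from the past
        bwd = frame_error(x, start, stop, -lag)   # predicted from the future
        out.extend(fwd if energy(fwd) <= energy(bwd) else bwd)
    return out
```

With a signal that starts with silence and then becomes periodic, the forward error of the first cycle after the onset is large because there is nothing in the past to predict from, whereas the backward error of the same frame is small. Selecting the lower-energy error per frame avoids that inflation.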

Long-term Prediction Distance: P
Maximum of the auto-correlation function
Example: steady vowel [a] (dysphonic speaker): P = 184 (2 cycles)
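
Picking the prediction distance at the maximum of the auto-correlation function can be sketched as below. The unnormalized auto-correlation, the bounded search range and the function names `autocorrelation` and `prediction_distance` are assumptions of this sketch; the slides do not specify them.

```python
def autocorrelation(x, lag):
    """Unnormalized auto-correlation of x at a positive lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def prediction_distance(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] that maximizes the auto-correlation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda k: autocorrelation(x, k))
```

If the search range spans several cycle lengths, the maximum may fall on a multiple of the cycle length; the example on the slide reports a distance P = 184 that corresponds to 2 cycles.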

Vocal Noise Cue
Signal-to-Dysperiodicity Ratio (SDR)
Example: steady vowel [a]
Signals shown: speech signal x[n] and bi-directional long-term prediction error e_L[n], for a healthy speaker and a dysphonic speaker
SDR values shown: 31.2 dB and 10.1 dB
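
A hedged sketch of how such a ratio can be computed, assuming the usual definition of a signal-to-error ratio in decibels: ten times the base-10 logarithm of the signal energy over the prediction-error energy. The slides do not give the formula, and the function name `sdr_db` is hypothetical.

```python
import math


def sdr_db(x, e):
    """Signal-to-dysperiodicity ratio in dB (assumed definition).

    x: speech signal samples; e: long-term prediction error samples.
    """
    signal_energy = sum(v * v for v in x)
    error_energy = sum(v * v for v in e)
    if error_energy == 0.0:
        return math.inf  # perfectly predictable signal
    return 10.0 * math.log10(signal_energy / error_energy)
```

Under this definition a smaller prediction error gives a larger SDR, so a more regular voice scores higher.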

Results 1: Sentence (1 female speaker; modal phonation type)
Sentence: “Il est sorti avant le jour”
Signals shown: speech signal, forward long-term prediction error, bi-directional long-term prediction error; segments [il]

Results 2: Sentence (1 female speaker; 5 phonation types)
Sentence: “Il est sorti avant le jour”

Conclusion
Forward and backward long-term prediction makes it possible to analyze any speech signal in order to assess vocal noise (i.e., vocal dysperiodicities).
The analysis makes no assumptions about the periodicity or stationarity of the speech signal.