Glottal Source Parameterization: A Comparative Study Authors: Ixone Arroabarren, Alfonso Carlosena UNIVERSIDAD PÚBLICA DE NAVARRA Dpt. Electrical Engineering.

Glottal Source Parameterization: A Comparative Study
Authors: Ixone Arroabarren, Alfonso Carlosena
UNIVERSIDAD PÚBLICA DE NAVARRA, Dpt. Electrical and Electronic Engineering, Campus de Arrosadía, E Pamplona, Navarra, SPAIN
VOICE QUALITY: FUNCTIONS, ANALYSIS AND SYNTHESIS, ISCA Tutorial and Research Workshop, GENEVA - AUGUST 27-29, 2003
International Speech Communication Association

SUMMARY
0. Introduction
1. New Glottal Source Parameters
2. NAQ and PSP in the LF model
3. Practical considerations
4. Natural Speech Analysis
5. Conclusions

0. INTRODUCTION
The study of Voice Quality
Speech, Singing -> Inverse Filtering -> Glottal Source and Vocal Tract Response -> Glottal Source Parameterization -> Glottal Source Parameters (Open Quotient, Asymmetry Coefficient, Spectral Tilt)
- Direct Estimation methods
  - Time Domain: Measurement of Landmarks
  - Frequency Domain: Spectral Correlates, (H1-H2) ~ Oq
- Fit Estimation methods
  - A model fitting procedure
- New Glottal Source Parameters: NAQ and PSP
  - Time domain references are avoided
  - Amplitude and Fundamental frequency Normalized Parameters

1. NEW GLOTTAL SOURCE PARAMETERS (I)
Normalized Amplitude Quotient (NAQ)
- Strongly correlated with the Closing Quotient
- Normalized in amplitude and frequency
Figure panels: Simplified Glottal Source Waveform (Closing Quotient); Natural Speech, Inverse Filtered Glottal Source (Normalized Amplitude Quotient)
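A minimal sketch of a NAQ calculation, assuming the commonly used definition NAQ = f_ac / (d_peak * T), where f_ac is the peak-to-peak amplitude of the glottal flow, d_peak is the magnitude of the negative peak of the flow derivative, and T is the fundamental period. The function name, its arguments, and the triangular test pulse are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def naq(flow, fs):
    """NAQ of one glottal cycle (assumed definition: f_ac / (d_peak * T))."""
    flow = np.asarray(flow, dtype=float)
    period = len(flow) / fs               # T, in seconds
    f_ac = flow.max() - flow.min()        # peak-to-peak flow amplitude
    derivative = np.diff(flow) * fs       # first-difference flow derivative
    d_peak = -derivative.min()            # magnitude of the negative peak
    return f_ac / (d_peak * period)

# Hypothetical triangular pulse: 100 samples at 10 kHz (F0 = 100 Hz),
# 40 samples opening, 20 samples closing, 40 samples closed.
fs = 10_000
pulse = np.concatenate([
    np.linspace(0.0, 1.0, 40, endpoint=False),
    np.linspace(1.0, 0.0, 20, endpoint=False),
    np.zeros(40),
])
print(naq(pulse, fs))
```

For a triangular pulse the closing slope is constant, so NAQ coincides with the Closing Quotient (closing time over period, here 20/100), which illustrates the strong correlation noted on the slide. Scaling the pulse amplitude leaves the value unchanged.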

1. NEW GLOTTAL SOURCE PARAMETERS (II)
Parabolic Spectral Parameter (PSP)
- Inverse Filtered Glottal Source -> Pitch Synchronous Glottal Source Spectrum
- Parabolic Curve: a depends on Fundamental Frequency (figure panels: F0 = 100 Hz, F0 = 200 Hz)
- Normalized in amplitude and frequency
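A simplified sketch of the idea behind the PSP, under stated assumptions: a parabola a*k^2 + b is fitted by least squares to the low-frequency part of the log-magnitude spectrum of one glottal cycle, and a is divided by a_max, the value obtained for a constant (DC) signal of the same length. The fixed FFT size and fixed number of bins are simplifying assumptions of this sketch, as are all function and variable names; this is not the authors' exact procedure.

```python
import numpy as np

def parabola_coefficient(cycle, n_fft=2048, n_bins=8):
    """Least-squares 'a' of a*k**2 + b on the first n_bins of the dB spectrum."""
    spectrum = np.abs(np.fft.rfft(cycle, n_fft))
    # Amplitude normalization: 0 dB at the DC bin.
    log_spec = 20.0 * np.log10(spectrum[:n_bins] / spectrum[0])
    k = np.arange(n_bins)
    design = np.column_stack([k**2, np.ones(n_bins)])
    coeffs = np.linalg.lstsq(design, log_spec, rcond=None)[0]
    return coeffs[0]

def psp(flow_cycle, n_fft=2048, n_bins=8):
    """Parabola coefficient of the cycle relative to that of a DC signal."""
    flow_cycle = np.asarray(flow_cycle, dtype=float)
    a = parabola_coefficient(flow_cycle, n_fft, n_bins)
    # Frequency normalization: reference signal with the same period length.
    a_max = parabola_coefficient(np.ones(len(flow_cycle)), n_fft, n_bins)
    return a / a_max

# Hypothetical triangular flow pulse, 100 samples per period.
pulse = np.concatenate([
    np.linspace(0.0, 1.0, 40, endpoint=False),
    np.linspace(1.0, 0.0, 20, endpoint=False),
    np.zeros(40),
])
print(psp(pulse))
```

Because a_max is computed from a signal with the same number of samples as the cycle, the ratio compensates for the dependence of a on the fundamental frequency, and dividing the spectrum by its DC bin removes the dependence on amplitude.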

2. NAQ AND PSP IN THE LF MODEL (I)
LF model: Glottal Source Derivative
- Direct Synthesis Parameters
- Timing Parameters
- Glottal Source Parameters: Open Quotient, Asymmetry Coefficient, Spectral Tilt

2. NAQ AND PSP IN THE LF MODEL (II)
NAQ and PSP versus (Oq, asymmetry coefficient, ft)
Both parameters are correlated

3. PRACTICAL CONSIDERATIONS
- Sampling Frequency
- NAQ Calculation: One fundamental period extraction
- PSP Calculation: Right Period Extraction, Wrong Period Extraction
- Maximum Error
  - High Error => Loss of resolution
  - Low Error => Wrong values

4. NATURAL SPEECH ANALYSIS (I)
Voice Material
Speech database 1 (American English)
- 1 male
- Sentence: "She has left for a great party today."
- EGG
- Author: Christophe d'ALESSANDRO, ©2003, LIMSI-CNRS, France
VOCAL QUALITIES
Voice        Modal   Modal
Vocal Tract  Small   Long
Tension      Relax   Tense
Pitch        Low     High

4. NATURAL SPEECH ANALYSIS (II)
Analysis Procedure
Speech, Singing -> Inverse Filtering -> Glottal Source Derivative and Vocal Tract Response -> Period Extraction -> NAQ Calculation, PSP Calculation

4. NATURAL SPEECH ANALYSIS (III)
Results: Vowel [a:], Vowel [e]

4. NATURAL SPEECH ANALYSIS (IV)
Results: Vowel [I]

5. CONCLUSIONS
- The NAQ and the PSP are two global parameters => f(Oq, asymmetry coefficient, ft)
- Although their definitions are different, they parameterize the Glottal Source in the same way
- The NAQ calculation is more robust than the PSP calculation
- Although PSP is a spectral measurement, it depends on how the glottal period is extracted
