Voice source characterisation
Gerrit Bloothooft, UiL-OTS, Utrecht University

Emasters School Leuven 2002

Voice research
To describe and model the properties of the vocal sound source from the viewpoints of:
– Physiology
– Acoustics
– Perception

Importance of the voice
Speech synthesis
– Towards natural-sounding synthesis
Speech recognition
– Using source properties in recognition
Speaker recognition/identification
– Voice source characteristics are essential
Diagnosis
– Pathologies, voice classifications

Voice possibilities
Limited use of voice in speech
Range of the fundamental frequency
Vocal intensity range
Spectral variation

Focus in this presentation
How do acoustic voice source characteristics vary as a function of F0 and vocal intensity?

Voice profile measurement
Thirties: intensity range as a function of pitch
– manual measurement
Eighties: automatic computation of F0 and intensity
– computer measurement
– visual feedback
– additional parameters

Measurement unit
One decibel
One semitone
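A minimal sketch of how a measurement could be mapped onto a grid with a resolution of one semitone by one decibel. The reference frequency of 55 Hz and the function names are assumptions made for illustration; the slides do not state how the grid is anchored.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=55.0):
    """Distance of f0_hz above ref_hz in semitones (12 per octave).

    ref_hz = 55 Hz is an assumed reference, not taken from the slides.
    """
    return 12.0 * math.log2(f0_hz / ref_hz)


def cell_index(f0_hz, level_db, ref_hz=55.0):
    """Map one (F0, intensity) measurement to a 1 semitone x 1 dB cell."""
    return (round(hz_to_semitones(f0_hz, ref_hz)), round(level_db))
```

With this assumed reference, 110 Hz lies 12 semitones above the origin, so every octave spans 12 columns of the grid.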

Measurement procedure
Subject in front of computer screen
Microphone on headset (30 cm)
Just phonate, sing, and see the result immediately
Best results with a recording protocol
Feedback stimulates extreme phonations

[Figure: Voice profile / density. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: sample density.]

[Figure: Voice profile / speech area. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: sample density.]

Acoustic voice quality parameters
Jitter
– Stability of periodicity
– Asymmetry in vocal folds
Crest factor
– Max amplitude divided by RMS amplitude (root of the average energy)
– Relates to spectral slope
Many more …
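The two parameters named above can be sketched as follows. Both definitions are assumptions: the crest factor is taken as peak amplitude over RMS amplitude expressed in dB, and jitter as the common "local" measure (mean absolute difference between consecutive periods, relative to the mean period). The slides do not specify the exact formulas used in the original system.

```python
import math


def crest_factor_db(samples):
    """Peak amplitude relative to RMS amplitude, in dB (assumed definition)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)


def local_jitter_percent(periods):
    """Mean absolute difference of consecutive periods over the mean period, in %.

    `periods` is a list of glottal cycle durations (any time unit).
    """
    diffs = [abs(a - b) for a, b in zip(periods[1:], periods[:-1])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period
```

Under this definition a pure sine wave has a crest factor of about 3 dB, and a perfectly periodic signal has zero jitter.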

[Figure: Crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: crest factor.]

Real-time presentation
Screen presentation
– One data point per F0-I cell
Advanced data storage [new]
– Full audio signal
– Full distribution of data per F0-I cell
– Data for screen presentation

Advantages
Reusability of recordings
Statistical analysis per F0-I cell
Study of time-varying behavior
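A rough sketch of the difference between keeping one data point per F0-I cell and keeping the full distribution per cell. The class name, its methods, and the choice of the median as the single value shown on screen are all hypothetical; the slides do not describe the actual storage format.

```python
from collections import defaultdict
from statistics import median


class VoiceProfileStore:
    """Hypothetical store that keeps every parameter value per F0-I cell."""

    def __init__(self):
        self._cells = defaultdict(list)

    def add(self, semitone, level_db, value):
        """Append one measured value to the cell (semitone, level_db)."""
        self._cells[(semitone, level_db)].append(value)

    def cell_values(self, semitone, level_db):
        """Full distribution of stored values for one cell."""
        return list(self._cells.get((semitone, level_db), []))

    def screen_grid(self):
        """One value per cell for display (median is an assumed summary)."""
        return {cell: median(vals) for cell, vals in self._cells.items()}
```

Because the per-cell lists are retained, other statistics can be computed later from the same recording without measuring again.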

[Figure: Crest factor. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: crest factor.]

Median smoothing of crest factor
[Figure: Crest factor, median smoothed. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: crest factor.]
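Median smoothing over the F0-intensity grid could look like the sketch below. The 3 x 3 neighbourhood (`radius=1`) and the dictionary layout are assumptions; the slides do not state the window size, and empty cells are simply skipped here.

```python
from statistics import median


def median_smooth(grid, radius=1):
    """Replace each cell value by the median of its filled neighbourhood.

    grid maps (semitone, dB) -> crest factor; cells absent from grid
    are ignored. radius=1 gives an assumed 3 x 3 window.
    """
    smoothed = {}
    for (st, db) in grid:
        neighbours = [
            grid[(st + i, db + j)]
            for i in range(-radius, radius + 1)
            for j in range(-radius, radius + 1)
            if (st + i, db + j) in grid
        ]
        smoothed[(st, db)] = median(neighbours)
    return smoothed
```

A single outlying cell surrounded by consistent values is pulled back to the level of its neighbours, while the outer contour of the profile is unchanged because no new cells are created.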

Vocal registers
Different movement patterns of the vocal folds
– Pulse register (creaky voice)
– Modal register
– Falsetto register

Pulse register
Less than 50 Hz
Irregular
Long closed period

[Figure: Pulse register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Modal register
“Normal” use of voice
Active role of M. vocalis
Vocal folds thick and completely vibrating
Wide range in F0 and intensity
Flat spectrum

[Figure: Modal register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Falsetto register
Higher pitches
M. vocalis passive, vocal ligaments tensed by M. cricothyroideus
Edge vibration of vocal folds
Sound poor in higher harmonics (in untrained subjects)

[Figure: Falsetto register. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

[Figure: Register overlap. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Chest and head voice
Refer to secondary vibratory sensations in the body
Chest voice: loud modal register
Head voice:
– males: higher, softer modal register in overlap area with falsetto register
– women: falsetto register

[Figure: Chest voice and head voice. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); labelled areas: chest, head.]

Registers and voice profiles
With a description using
– Iso-crest factor lines
– Iso-jitter lines

[Figure: Iso-crest factor lines at 4 dB and 6 dB. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: crest factor.]

[Figure: Iso-jitter lines at 3 %. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); plotted parameter: jitter (%).]

New representation
Areas defined by iso-parameter lines
– crest factor < 4 dB
– crest factor between 4 dB and 6 dB
– crest factor > 6 dB
– jitter < 3 %
– [relative rise time < 6 %]

Areas in the phonetogram
[Figure: Areas in the phonetogram. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]
Jitter > 3 %: unstable
RRT < 6 %: pressed-like
Crest factor < 4 dB: sine-like
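The thresholds above can be read as a simple labelling rule per phonetogram cell. The sketch below is one possible reading: the function name, the handling of values exactly on a boundary, and the label strings for the two upper crest factor bands are assumptions, and RRT (relative rise time) is treated as optional because the slides put it in brackets.

```python
def describe_cell(crest_db, jitter_pct, rrt_pct=None):
    """Return descriptive labels for one cell from its parameter values."""
    labels = []
    if jitter_pct > 3.0:
        labels.append("unstable")
    if rrt_pct is not None and rrt_pct < 6.0:
        labels.append("pressed-like")
    if crest_db < 4.0:
        labels.append("sine-like")
    elif crest_db <= 6.0:
        labels.append("crest 4-6 dB")  # assumed label for the middle band
    else:
        labels.append("crest > 6 dB")  # assumed label for the upper band
    return labels
```

Applying such a rule to every cell yields the areas whose borders are the iso-parameter lines.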

Vocal registers in the phonetogram
[Figure: Vocal registers in the phonetogram. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL); marked lines: falsetto upper boundary, modal lower boundary, chest voice boundary.]

Comparison of voice profiles
Characterisation of
– Voice pathologies
– Voice classifications
Reuse stored voice profiles of subjects with known voice history

Important features
Contour has limited value
– but most research goes in that direction (norm profiles)
Distribution of acoustical parameters across the voice profile tells much more

We need
Unit for comparison
– Voice profile unit defined by small range of F0 and vocal intensity
– Distributions of acoustic voice parameters per unit
– Probability density function per parameter
Model
– Hidden Markov Model

Unit model
Two unconnected states per phonetogram unit
Vocal registers
Start and end of phonation

Correspondences
Speech               Voice profile
phoneme model        F0/I unit model
not labeled          labeled by F0 and I
spectral envelope    acoustic voice parameters
language model       unrestricted transitions
“forced alignment recognition”
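A heavily simplified sketch of what scoring against unit models might look like. It assumes that each F0/I unit has a small set of states, that each state models a single acoustic parameter with a Gaussian density, and that unrestricted transitions mean all state transitions are equally likely, so the best state can be chosen frame by frame. None of these modelling details are given in the slides, and all names are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def score_frames(frames, unit_models):
    """Score frames that are already labelled with their F0/I unit.

    frames: list of (cell, value) pairs, cell being the known F0/I unit.
    unit_models: cell -> list of (mean, std), one entry per state.
    Returns the total log-likelihood and the chosen state per frame.
    """
    total = 0.0
    path = []
    for cell, value in frames:
        scores = [gaussian_logpdf(value, m, s) for m, s in unit_models[cell]]
        best = max(range(len(scores)), key=scores.__getitem__)
        path.append(best)
        total += scores[best]
    return total, path
```

Because each frame carries its own F0 and intensity label, only the choice of state within the unit is left open, which resembles forced alignment more than open recognition.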

[Figure: Crest factor distributions.]

[Figure: Distinctiveness, showing the most distinctive states. Axes: fundamental frequency (Hz) and vocal intensity (dB SPL).]

Conclusions
Voice profiles can enhance our understanding of vocal behaviour in a visually attractive way
Current data storage opens a series of important research topics
Market opportunities for “light” versions