Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology Mark Hasegawa-Johnson


Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology Mark Hasegawa-Johnson University of Illinois at Urbana-Champaign, USA

Lecture 2: Acoustics of Vowel and Glide Production
One-Dimensional Linear Acoustics
– The Acoustic Wave Equation
– Transmission Lines
– Standing Wave Patterns
One-Tube Models
– Schwa
– Front cavity resonance of fricatives
Two-Tube Models
– The vowel /a/
– Helmholtz Resonator
– The vowels /u,i,e/
Perturbation Theory
– The vowels /u/, /o/ revisited
– Glides

1. One-Dimensional Acoustic Wave Equation and Solutions

Acoustics: Constitutive Equations

Acoustic Plane Waves: Time Domain

Acoustic Plane Waves: Frequency Domain

Solution for a Tube with Constant Area and Hard Walls
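
The equations on the slides above did not survive in this transcript. As a reminder only, the standard textbook forms for lossless one-dimensional acoustics are sketched below. The symbols are assumptions rather than the lecture's own notation: p is pressure, v is particle velocity, \rho is air density, c is the speed of sound.

```latex
\begin{align}
  \frac{\partial p}{\partial x} &= -\rho\,\frac{\partial v}{\partial t}
    && \text{(Newton's second law)} \\
  \frac{\partial v}{\partial x} &= -\frac{1}{\rho c^{2}}\,\frac{\partial p}{\partial t}
    && \text{(conservation of mass, adiabatic gas law)} \\
  \frac{\partial^{2} p}{\partial t^{2}} &= c^{2}\,\frac{\partial^{2} p}{\partial x^{2}}
    && \text{(wave equation)} \\
  p(x,t) &= p^{+}\!\left(t - \tfrac{x}{c}\right) + p^{-}\!\left(t + \tfrac{x}{c}\right)
    && \text{(time-domain solution)} \\
  P(x,j\omega) &= P^{+} e^{-j\omega x/c} + P^{-} e^{+j\omega x/c}
    && \text{(frequency-domain solution)}
\end{align}
```

The forward-going and backward-going amplitudes P+ and P- are the two unknowns that the boundary conditions of each tube model determine.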

2. One-Tube Models

Boundary Conditions (tube extending from x = 0 to x = L)

Resonant Frequencies

Standing Wave Patterns

Standing Wave Patterns: Quarter-Wave Resonators
Tube Closed at the Left End, Open at the Right End

Standing Wave Patterns: Half-Wave Resonators
Tube Closed at Both Ends
Tube Open at Both Ends
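
The resonance formulas for these two tube types are not preserved in the transcript. The standard results for an ideal, lossless uniform tube of length L are given below. They are consistent with the c/4L, 3c/4L, 5c/4L values and the 0 Hz, 2000 Hz, 4000 Hz series quoted on later slides. The index n is an assumed notation.

```latex
\begin{align}
  \text{Quarter-wave (closed--open):} \quad
    f_n &= \frac{(2n-1)\,c}{4L}, & n &= 1, 2, 3, \dots \\
  \text{Half-wave (closed--closed or open--open):} \quad
    f_n &= \frac{n\,c}{2L}, & n &= 0, 1, 2, \dots
\end{align}
```

The n = 0 term of the half-wave series is the 0 Hz partial-tube resonance that the Helmholtz resonance later replaces.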

Schwa and Invv (inverted v), the vowels in “a tug”
F1 = 500 Hz = c/4L
F2 = 1500 Hz = 3c/4L
F3 = 2500 Hz = 5c/4L
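
The arithmetic on this slide can be checked with a minimal sketch. The function name is hypothetical. The speed of sound (354 m/s) and the tract length (17.7 cm) are assumptions chosen so that c/4L falls on 500 Hz. Neither value is stated on the slide.

```python
def quarter_wave_resonances(length_m, count=3, c=354.0):
    """Resonances (Hz) of an ideal uniform tube closed at one end, open at the other.

    c is the speed of sound in m/s (assumed value, not from the slides).
    """
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


# Assumed vocal tract length: 17.7 cm, so that c / 4L = 500 Hz.
formants = quarter_wave_resonances(0.177)
print([round(f) for f in formants])
```

The odd multiples (1, 3, 5) arise because a closed-open tube supports only odd quarter-wavelength standing waves.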

Front Cavity Resonances of a Fricative
/s/: Front Cavity Resonance = 4500 Hz
4500 Hz = c/4L if Front Cavity Length is L = 1.9 cm
/sh/: Front Cavity Resonance = 2200 Hz
2200 Hz = c/4L if Front Cavity Length is L = 4.0 cm
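
Inverting the quarter-wave relation gives the front cavity length implied by a measured resonance. The sketch below assumes c = 35400 cm/s, which the slide does not state, and uses a hypothetical function name. Under that assumption the results land close to, but not exactly on, the rounded lengths quoted above.

```python
def front_cavity_length_cm(resonance_hz, c_cm_per_s=35400.0):
    """Length L (cm) of a quarter-wave front cavity whose lowest resonance is c / 4L.

    c_cm_per_s is an assumed speed of sound, not a value given in the lecture.
    """
    return c_cm_per_s / (4.0 * resonance_hz)


for label, resonance in (("/s/", 4500.0), ("/sh/", 2200.0)):
    print(label, round(front_cavity_length_cm(resonance), 2), "cm")
```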

3. Two-Tube Models

Conservation of Mass at the Juncture of Two Tubes
Tube areas: A1 and A2 = A1/2
Velocities: U1(x,t) and U2(x,t) = 2 U1(x,t)
Total liters/second transmitted = (velocity) × (tube area)
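
The continuity condition on this slide amounts to one line of arithmetic: the product of velocity and area is the same on both sides of the junction. A minimal sketch with a hypothetical function name and illustrative numbers:

```python
def velocity_after_junction(velocity_1, area_1, area_2):
    """Velocity in tube 2 such that volume flow (velocity * area) is continuous."""
    return velocity_1 * area_1 / area_2


# Halving the area doubles the velocity, as on the slide (A2 = A1 / 2).
print(velocity_after_junction(1.0, 4.0, 2.0))  # 2.0
```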

Two-Tube Model: Two Different Sets of Waves
Tube 1: Incident Wave P1+, Reflected Wave P1-
Tube 2: Incident Wave P2-, Reflected Wave P2+

Two-Tube Model: Solution in the Time Domain

Two-Tube Model in the Frequency Domain

Approximate Solution of the Two-Tube Model, A1 >> A2
Approximate solution: Assume that the two tubes are completely decoupled, so that the formants include
- F(BACK CAVITY) = c/4L_BACK
- F(FRONT CAVITY) = c/4L_FRONT

The Vowels /AA/, /AH/
L_BACK = 8.8 cm → F2 = c/4L_BACK = 1000 Hz
L_FRONT = 12.6 cm → F1 = c/4L_FRONT = 700 Hz
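
The decoupled approximation treats each cavity as its own quarter-wave resonator, so the two lowest formants follow directly from the two lengths. The sketch below assumes c = 35400 cm/s (not stated on the slide) and uses a hypothetical function name. With the lengths given above it returns values within a few hertz of 700 Hz and 1000 Hz.

```python
def decoupled_two_tube_formants(l_back_cm, l_front_cm, c_cm_per_s=35400.0):
    """Lowest resonance of each cavity, treated as independent quarter-wave tubes.

    Returns the two frequencies (Hz) sorted from low to high.
    c_cm_per_s is an assumed speed of sound.
    """
    back = c_cm_per_s / (4.0 * l_back_cm)
    front = c_cm_per_s / (4.0 * l_front_cm)
    return sorted([back, front])


f1, f2 = decoupled_two_tube_formants(l_back_cm=8.8, l_front_cm=12.6)
print(round(f1), round(f2))
```

Because the longer cavity always gives the lower frequency, F1 here comes from the 12.6 cm section and F2 from the 8.8 cm section.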

Acoustic Impedance

Low-Frequency Approximations of Acoustic Impedance

Helmholtz Resonator (resonance condition relating the impedances Z1 and Z2 of the two tube sections)

The Vowel /i/
Back Cavity = Pharynx. Resonances: 0 Hz, 2000 Hz, 4000 Hz
Front Cavity = Palatal Constriction. Resonances: 0 Hz, 2500 Hz, 5000 Hz
Back Cavity Volume = 70 cm^3
Front Cavity Length/Area = 7 cm^-1
→ 1/(2π√(MC)) = 250 Hz
Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
Resulting formants: 250 Hz, 2000 Hz, 2500 Hz

The Vowel /u/: A Two-Tube Model
Back Cavity = Mouth + Pharynx. Resonances: 0 Hz, 1000 Hz, 2000 Hz
Front Cavity = Lips. Resonances: 0 Hz, 18000 Hz, …
Back Cavity Volume = 200 cm^3
Front Cavity Length/Area = 2 cm^-1
→ 1/(2π√(MC)) = 250 Hz
Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
Resulting formants: 250 Hz, 1000 Hz, 2000 Hz
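
The Helmholtz frequency 1/(2π√(MC)) can be evaluated from the cavity volume and the length-to-area ratio of the constriction. The sketch below uses the standard low-frequency expressions M = ρ·(l/A) for acoustic mass and C = V/(ρc²) for acoustic compliance, so the air density cancels. These expressions, the function name, and c = 35400 cm/s are assumptions, since the slide's impedance formulas are not preserved. With the numbers quoted for /i/ and /u/ the result is roughly 250 to 280 Hz, in the neighborhood of the rounded 250 Hz on the slides.

```python
import math


def helmholtz_frequency(volume_cm3, length_over_area_per_cm, c_cm_per_s=35400.0):
    """Helmholtz resonance (Hz): 1 / (2 * pi * sqrt(M * C)).

    M * C = (l / A) * V / c**2 after the air density cancels.
    c_cm_per_s is an assumed speed of sound.
    """
    mc = length_over_area_per_cm * volume_cm3 / c_cm_per_s ** 2
    return 1.0 / (2.0 * math.pi * math.sqrt(mc))


print("/i/:", round(helmholtz_frequency(70.0, 7.0)), "Hz")
print("/u/:", round(helmholtz_frequency(200.0, 2.0)), "Hz")
```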

The Vowel /u/: A Four-Tube Model
Two Helmholtz Resonators = Two Low-Frequency Formants!
F1 = 250 Hz
F2 = 500 Hz
F3 = Pharynx resonance, c/2L = 2000 Hz
Tube sections (figure labels): Pharynx, Velar Tongue Body Constriction, Mouth, Lips

4. Perturbation Theory

Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)
A(x) is constant everywhere, except for one small perturbation.
Method:
1. Compute formants of the “unperturbed” vocal tract.
2. Perturb the formant frequencies to match the area perturbation.

Conservation of Energy Under Perturbation

“Sensitivity” Functions

Sensitivity Functions for the Quarter-Wave Resonator (Lips Open)
[Figure: sensitivity functions along the tube, 0 to L, with labels /AA/, /ER/, /IY/, /W/]

Sensitivity Functions for the Half-Wave Resonator (Lips Rounded)
[Figure: sensitivity functions along the tube, 0 to L, with labels /L, OW/ and /UW/]
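
The sensitivity curves themselves are figures that did not survive in the transcript. As a sketch under stated assumptions, the standard first-order result for an ideal uniform tube is that a small area change ΔA over a short length Δx at position x shifts formant n by ΔF/F = (ΔA/A)(Δx/L)·S_n(x), where S_n(x) = sin²(k_n x) − cos²(k_n x) = −cos(2 k_n x) is the normalized kinetic minus potential energy density. Here x runs from 0 at the glottis to L at the lips. The function names, and the idealization of rounded lips as a fully closed end, are assumptions.

```python
import math


def sensitivity(x_over_l, formant, lips_open=True):
    """Normalized sensitivity S_n(x) of formant n to a local area increase.

    x_over_l: position along the tube, 0 = glottis, 1 = lips.
    lips_open=True  -> quarter-wave tube, k_n * L = (2n - 1) * pi / 2
    lips_open=False -> half-wave (closed-closed) tube, k_n * L = n * pi
    """
    if lips_open:
        k_l = (2 * formant - 1) * math.pi / 2.0
    else:
        k_l = formant * math.pi
    # Kinetic minus potential energy density: sin^2 - cos^2 = -cos(2 k x)
    return -math.cos(2.0 * k_l * x_over_l)


def relative_formant_shift(x_over_l, formant, delta_area_ratio,
                           delta_x_over_l, lips_open=True):
    """First-order dF/F for an area change dA/A over a length dx/L at x."""
    return sensitivity(x_over_l, formant, lips_open) * delta_area_ratio * delta_x_over_l


# Narrowing the lips (dA/A = -0.5 over 10% of the tube) lowers F1 of the open tube.
print(relative_formant_shift(1.0, 1, -0.5, 0.1))
```

A positive sensitivity marks a velocity maximum, where a constriction lowers the formant. A negative sensitivity marks a pressure maximum, where a constriction raises it.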

Formant Frequencies of Vowels From Peterson & Barney, 1952

Summary
Acoustic wave equation easiest to solve in frequency domain, for example:
– Solve two boundary condition equations for P+ and P-, or
– Solve the two-tube model (four equations in four unknowns)
Quarter-Wave Resonator: Open at one end, Closed at the other
– Schwa or Invv (“a tug”)
– Front cavity resonance of a fricative or stop
Half-Wave Resonator: Closed at the glottis, Nearly closed at the lips
– /uw/
Two-Tube Models
– Exact solution: use reflection coefficient
– Approximate solution: decouple the tubes, solve separately
Helmholtz Resonator
– When the two-tube model seems to have resonances at 0 Hz, use, instead, the Helmholtz Resonance frequency, computed with low-frequency approximations of acoustic impedance
– /iy/: F1 is a Helmholtz Resonance
– /uw/ and /ow/: Both F1 and F2 are Helmholtz Resonances
Perturbation Theory
– Perturbed area → Perturbed formants
– Sensitivity function explains most vowels and glides in one simple chart