Presentation transcript:

Spring Wave Oscillations
External force causes oscillations
Governing equation: $f = \frac{1}{2\pi}\sqrt{k/m}$
– The spring stiffness and the quantity of mass determine the frequency
Force determines displacement
Damping resistance leads to exponential decay (like a filter)
– With no resistance the system would oscillate indefinitely (it is only marginally stable)
Differences from sound
– Components are discrete, not distributed
– There is no traveling wave
Symbols: k = spring constant, x = displacement, F = external force, B = damper, m = mass, f = oscillation frequency
[Figure: Spring Oscillation]
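As a rough illustration of the governing equation, the sketch below computes the undamped frequency and the exponential decay envelope. The function names and the envelope form $x_0 e^{-Bt/(2m)}$ (the standard underdamped mass-spring-damper result) are assumptions added for illustration, not taken from the slides.

```python
import math


def natural_frequency_hz(k: float, m: float) -> float:
    """Undamped oscillation frequency f = (1 / (2*pi)) * sqrt(k / m), in Hz."""
    return math.sqrt(k / m) / (2.0 * math.pi)


def decay_envelope(t: float, b: float, m: float, x0: float = 1.0) -> float:
    """Amplitude envelope x0 * exp(-b * t / (2 * m)) of an underdamped oscillation.

    Assumes the usual linear damper model (force = -b * velocity).
    """
    return x0 * math.exp(-b * t / (2.0 * m))


if __name__ == "__main__":
    # A stiffer spring or a lighter mass raises the frequency.
    print(natural_frequency_hz(k=100.0, m=1.0))
    print(natural_frequency_hz(k=400.0, m=1.0))
    # The envelope shrinks over time when b > 0 and stays constant when b == 0.
    print(decay_envelope(t=1.0, b=0.5, m=1.0))
```

Quadrupling the stiffness doubles the frequency, and setting `b` to zero leaves the envelope constant, which matches the remark about a system with no resistance.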

Rope Impulse Waves
A hand jerk is an impulse
– An impulse causes a wave to travel
– The rate of the impulses determines the frequency
Rope stiffness determines wave velocity
Reverse waves start at the fixed end
– Resonates when waves correlate
– Cancels when waves don’t correlate
– Eventually a steady state is reached and we perceive no travel along the rope (standing wave)
– Boundary conditions model the interaction
Sound differences
– Longitudinal versus transverse motion
– Waves escape for non-stop sounds

Acoustic Compression Waves
Resistance – sources of energy loss
– Ex: Vocal tract walls
Impedance (Z) – Resistance to wave motion
Capacitance – air resistance to compression
Reflection – Portion of waves that reflect back to the source
[Figure: Compression and Rarefaction]

Vocal Tract Tube Model
A chain of short uniform tubes connected in series
– Each slice has a fixed area
– Add slices for the model to become more continuous
Mathematical equations
– Time domain: $y(n) = u(n) * v(n) * r(n)$
– z-domain: $Y(z) = U(z)\,V(z)\,R(z)$
– U = Glottal source
– V = Vocal tract model
– R = Lip radiation model
Reminder: * means convolution
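The equivalence between convolution in time and multiplication in the z-domain can be checked numerically. The sketch below is only an illustration: the three short sequences stand in for the source, tract, and radiation responses and are made-up values, and the DFT (via NumPy's FFT) is used as a sampled version of the z-transform.

```python
import numpy as np


def cascade_time_domain(u, v, r):
    """y(n) = u(n) * v(n) * r(n), computed by chained convolution."""
    return np.convolve(np.convolve(u, v), r)


def cascade_frequency_domain(u, v, r):
    """Same output computed as Y = U V R on the unit circle, then inverted."""
    n = len(u) + len(v) + len(r) - 2  # length of the full linear convolution
    spectrum = np.fft.fft(u, n) * np.fft.fft(v, n) * np.fft.fft(r, n)
    return np.real(np.fft.ifft(spectrum))


if __name__ == "__main__":
    u = np.array([1.0, 0.5])          # hypothetical source samples
    v = np.array([1.0, -0.9, 0.2])    # hypothetical tract impulse response
    r = np.array([1.0, -0.97])        # hypothetical radiation response
    print(cascade_time_domain(u, v, r))
    print(cascade_frequency_domain(u, v, r))
```

Zero-padding each FFT to the full output length is what makes the product match a linear rather than a circular convolution.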

Cylinder model(s)
– Rough model of throat and mouth cavity
– With nasal cavity
[Figure: two cylinder models driven by voice excitation; end conditions labeled "open" and "open/closed"]

Vocal Tract Tube Model
Assumptions that have little impact on accuracy
1. Vocal system is linear time invariant (LTI)
– Speech is not fully linear, but the model is still a good approximation
– LTI: an impulse to the system has the same response regardless of time
2. Vocal tract shape is static
– The vocal tract changes shape much more slowly than the sampling rate
3. Model employs straight tubes which don’t bend
– The bends between tubes do not significantly alter the acoustics
4. Model assumes a discrete number of tubes
– Has no impact if we set the number of tubes to a large enough number
5. Vocal tract is lossless (Note: an extended IIR model can account for losses)
– Vocal tract walls add resistance and reverberation that tend to cancel
– Glottal and lip losses dominate vocal tube loss
– The IIR model can be extended to model loss, if necessary

Vocal Tract Tube Model
Assumptions that impact accuracy
All-pole modeling
– The ear is more sensitive to peaks than valleys, which is fortunate
– Adding poles can sometimes approximate valleys, but not exactly
– Accurate for modeling vowels, but not perfect for obstruents
– Poles have difficulty modeling ripples in the speech signal
Radiation from the lips
– Effects are much more complicated than assuming a simple tube
– Example: high frequencies are more narrowly directed than low ones
Source-filter separation
– Interrelationships between vocal tract muscles are complicated
Glottal source
– Source signals are complicated, not represented by simple F0 impulses
– Example: Louder speech involves more than increasing the source strength

Definition of Terms
Consider sound propagating along a tube
Define:
– $u(x,t)$ = particle velocity of wave
– $U(x,t)$ = volume velocity ($u(x,t) \cdot A$, where $A$ is area)
– $p(x,t)$ = wave pressure variation compared to when x = 0
– $\rho$ = density of air at sea level
– $c$ = velocity of sound
– $d = x F_s / c$ = discrete distance normalized by the sample rate $F_s$; x = tube length
– $n$ = normalized time of the n-th sample
Notes:
– Electrical circuit voltage/current is analogous to acoustic $p(x,t)$ and $u(x,t)$
– Discrete versions of $u(x,t)$ and $p(x,t)$ are respectively $u(d,n)$ and $p(d,n)$
– $u[n]$ is a discrete sample wave measurement at point n
– $u^+[n]$ is a forward traveling wave; $u^-[n]$ is a backward traveling wave

Tube Model Characteristics
Determined by the boundaries between tubes
– Some of the input travels forward and some reflects back
– Junctions between tubes completely define velocity/pressure
– $u[d,n] = u^+[n-d] - u^-[n+(D-d)]$ (D = length of tube, d = point from start)
Interpretation: Wave velocity at some point in a tube equals forward wave ($u^+$) from the previous tube minus backward reflected wave ($u^-$) from the next tube
– $p[d,n] = \frac{\rho c}{A}\left(u^+[n-d] + u^-[n+(D-d)]\right)$
Interpretation: Wave pressure at some point in a tube equals forward wave ($u^+$) from the previous tube plus backward reflected wave ($u^-$) from the next tube
Author states: These expressions are derived from first-principle solution of wave equations and properties of air.

Flow and Pressure at Junctions
Wave flow at the junction between tubes k and k+1
$u_k[D_k, n] = u_{k+1}[0, n]$ and $p_k[D_k, n] = p_{k+1}[0, n]$
Using formulas on previous slide
– $u_k^+ - u_k^- = u_{k+1}^+ - u_{k+1}^-$ and $\frac{1}{A_k}\left(u_k^+ + u_k^-\right) = \frac{1}{A_{k+1}}\left(u_{k+1}^+ + u_{k+1}^-\right)$
– Forward minus backward wave velocity from the end of tube k = Forward minus backward wave velocity from beginning of k+1
– Pressure at junction is the sum of the forward and backward pressure
Multiply pressure formula by $A_k$ and add two formulas
$2u_k^+ = \frac{A_k}{A_{k+1}}\left(u_{k+1}^+ + u_{k+1}^-\right) + u_{k+1}^+ - u_{k+1}^-$
$2u_k^+ = \frac{A_k + A_{k+1}}{A_{k+1}}\,u_{k+1}^+ + \frac{A_k - A_{k+1}}{A_{k+1}}\,u_{k+1}^-$
$u_k^+ = \frac{A_k + A_{k+1}}{2A_{k+1}}\,u_{k+1}^+ + \frac{A_k - A_{k+1}}{2A_{k+1}}\,u_{k+1}^-$
$u_k^+ = \frac{A_k + A_{k+1}}{2A_{k+1}}\,u_{k+1}^+ - \frac{A_{k+1} - A_k}{2A_{k+1}}\,u_{k+1}^-$
By similar math (subtracting two formulas instead of adding) we can derive the formula for $u_k^-$
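The two junction conditions (equal net flow, equal pressure) can be solved directly for the waves on the tube-k side. The sketch below does that and leaves the continuity check to the caller; the function and variable names are illustrative, and the common factor $\rho c$ is dropped from the pressure condition because it cancels on both sides.

```python
def waves_left_of_junction(u_fwd_next, u_bwd_next, area_k, area_next):
    """Solve the junction conditions for (u_k^+, u_k^-).

    Conditions used (rho * c cancels from the pressure equation):
        u_k^+ - u_k^- = u_{k+1}^+ - u_{k+1}^-
        (u_k^+ + u_k^-) / A_k = (u_{k+1}^+ + u_{k+1}^-) / A_{k+1}
    """
    ratio = area_k / area_next
    total = u_fwd_next + u_bwd_next   # proportional to pressure in tube k+1
    net = u_fwd_next - u_bwd_next     # net flow in tube k+1
    u_fwd_k = 0.5 * (ratio * total + net)   # add the two conditions
    u_bwd_k = 0.5 * (ratio * total - net)   # subtract the two conditions
    return u_fwd_k, u_bwd_k


if __name__ == "__main__":
    fwd, bwd = waves_left_of_junction(1.0, 0.25, area_k=1.0, area_next=3.0)
    print(fwd - bwd)          # net flow, should equal 1.0 - 0.25
    print((fwd + bwd) / 1.0)  # pressure term, should equal (1.0 + 0.25) / 3.0
```

When the two areas are equal, the waves pass through unchanged, which is a quick sanity check on the algebra.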

Reflection Coefficients
Definition: $r_k = \frac{A_{k+1} - A_k}{A_{k+1} + A_k}$
Previous page formula
$u_k^+ = \frac{A_k + A_{k+1}}{2A_{k+1}}\,u_{k+1}^+ - \frac{A_{k+1} - A_k}{2A_{k+1}}\,u_{k+1}^-$
We want to convert the $u_k^+$ formula to use reflection coefficients
$\frac{1}{1+r_k} = \frac{1}{1 + \frac{A_{k+1} - A_k}{A_{k+1} + A_k}}$
Multiply numerator and denominator by $A_{k+1} + A_k$ and simplify
$\frac{1}{1+r_k} = \frac{A_{k+1} + A_k}{A_{k+1} + A_k + A_{k+1} - A_k} = \frac{A_{k+1} + A_k}{2A_{k+1}}$
Similarly, we can show that:
$\frac{r_k}{1+r_k} = \frac{A_{k+1} - A_k}{2A_{k+1}}$
Finally:
$u_k^+ = \frac{1}{1+r_k}\,u_{k+1}^+ - \frac{r_k}{1+r_k}\,u_{k+1}^-$ (Formula 11.16a in book)
And by similar derivation we derive Formula 11.16b in book
$u_k^- = -\frac{r_k}{1+r_k}\,u_{k+1}^+ + \frac{1}{1+r_k}\,u_{k+1}^-$
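The definition of $r_k$ and the two identities used in the substitution are easy to verify numerically. This is a small sketch with made-up tube areas; the function names are assumptions for illustration.

```python
def reflection_coefficient(area_k: float, area_next: float) -> float:
    """r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k)."""
    return (area_next - area_k) / (area_next + area_k)


def reflection_coefficients(areas):
    """One coefficient per junction between consecutive tube areas."""
    return [reflection_coefficient(a, b) for a, b in zip(areas, areas[1:])]


if __name__ == "__main__":
    areas = [1.0, 3.0, 3.0, 0.5]  # hypothetical cross-sectional areas
    for (a, b), r in zip(zip(areas, areas[1:]), reflection_coefficients(areas)):
        # Both identities from the derivation, evaluated side by side.
        print(r, 1 / (1 + r), (b + a) / (2 * b), r / (1 + r), (b - a) / (2 * b))
```

A widening junction gives a positive coefficient, a narrowing one gives a negative coefficient, and equal areas give zero (no reflection).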

General Model
Wave at junction
$u_k^+ = \frac{1}{1+r_k}\,u_{k+1}^+ - \frac{r_k}{1+r_k}\,u_{k+1}^-$
$u_k^- = -\frac{r_k}{1+r_k}\,u_{k+1}^+ + \frac{1}{1+r_k}\,u_{k+1}^-$
Take Z transform of wave at junction
$U_k^+(z) = \frac{z^{D_k}}{1+r_k}\,U_{k+1}^+(z) - \frac{r_k z^{D_k}}{1+r_k}\,U_{k+1}^-(z)$
$U_k^-(z) = -\frac{r_k z^{-D_k}}{1+r_k}\,U_{k+1}^+(z) + \frac{z^{-D_k}}{1+r_k}\,U_{k+1}^-(z)$
Note
– The z exponent is positive for forward traveling waves and negative for backward traveling waves: $z^{D_k}$ advances a signal by $D_k$ samples and $z^{-D_k}$ delays it by $D_k$ samples
– Assumptions: Reflection waves from the lips = 0
– All reflection back to the glottis is absorbed by the lungs

Single Tube Model
Flow from tube to lips ($U_L^+$) without reflection, and flow back from the tube ($U_1^-(z)$) to the glottis
$U_1^+(z) = \frac{z^{D}}{1+r_L}\,U_L^+(z)$ and $U_1^-(z) = -\frac{r_L z^{-D}}{1+r_L}\,U_L^+(z)$
Flow from glottis to tube
$U_G^+(z) = \frac{z^{D_G}}{1+r_G}\,U_1^+(z) - \frac{r_G z^{D_G}}{1+r_G}\,U_1^-(z)$
We model the glottis as a tube of length 0, so $z^{D_G} = 1$
$U_G^+(z) = \frac{1}{1+r_G}\,U_1^+(z) - \frac{r_G}{1+r_G}\,U_1^-(z)$
$= \frac{1}{1+r_G}\cdot\frac{z^{D}}{1+r_L}\,U_L^+(z) + \frac{r_G}{1+r_G}\cdot\frac{r_L z^{-D}}{1+r_L}\,U_L^+(z)$
$= \frac{z^{D} + r_G r_L z^{-D}}{(1+r_G)(1+r_L)}\,U_L^+(z)$
Transfer function for vocal tract: $V(z) = U_L / U_G$
$V(z) = \frac{(1+r_G)(1+r_L)}{z^{D} + r_G r_L z^{-D}} = \frac{(1+r_G)(1+r_L)\,z^{-D}}{1 + r_G r_L z^{-2D}}$
Conclusion: A one-tube model treats the vocal tract as a filter that resonates when it is applied to a source-generated signal.

Analysis: Single Tube
Vocal transform function: $V(z) = \frac{(1+r_G)(1+r_L)\,z^{-D}}{1 + r_G r_L z^{-2D}}$
Compute D
– $D = x F_s / c = 0.17 \cdot 16000 / 340 = 8$
– 0.17 m = length of male vocal tract, 16000 is the sampling rate, 340 meters per second is the speed of sound
– Exponent of denominator is $-2D = -16$
Conclusions
– A filter to model the vocal tract only needs 16 terms
– One tube needed for every thousand samples per second
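The arithmetic for D can be reproduced in a few lines. This sketch assumes a speed of sound of roughly 340 m/s (the value that makes the slide's result of 8 come out); the function name is hypothetical.

```python
def tube_delay_samples(length_m: float, fs_hz: float, c_m_per_s: float = 340.0) -> float:
    """D = x * Fs / c: one-way travel time through the tube, in samples.

    c_m_per_s = 340.0 is an assumed round value for the speed of sound in air.
    """
    return length_m * fs_hz / c_m_per_s


if __name__ == "__main__":
    d = tube_delay_samples(0.17, 16000.0)
    print(d)              # about 8 samples one way
    print(2 * round(d))   # round trip sets the denominator order: 16
    # Sample rate divided by filter order: about one term per 1000 samples/s.
    print(16000.0 / (2 * round(d)))
```

Changing `fs_hz` shows why the filter order scales with the sample rate for a fixed tract length.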

Frequency Response: Single Tube
Vocal transform function: $V(z) = \frac{(1+r_G)(1+r_L)\,z^{-D}}{1 + r_G r_L z^{-2D}}$
Replace $(1+r_G)(1+r_L)$ by a constant gain factor G
Special case
– If glottis closed: $r_G = 1$
– If lips fully open: $r_L = 1$
– With D = 5, so that G = 4: $V(e^{i\omega}) = \frac{4e^{-i5\omega}}{1 + e^{-i10\omega}} = \frac{4}{e^{i5\omega} + e^{-i5\omega}} = \frac{2}{\cos(5\omega)}$
Peaks when $\cos(5\omega) = 0$
[Figure: Frequency Response]
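To see where the single-tube response peaks, the sketch below evaluates V on the unit circle and lists the frequencies at which $\cos(D\omega) = 0$. It assumes both reflection coefficients are close to +1 (here 0.9, an illustrative value that keeps the peaks finite); the function names are hypothetical.

```python
import numpy as np


def single_tube_response(omega, d, r_g, r_l):
    """V(e^{i omega}) = (1+r_G)(1+r_L) z^{-D} / (1 + r_G r_L z^{-2D})."""
    z = np.exp(1j * omega)
    return (1 + r_g) * (1 + r_l) * z ** (-d) / (1 + r_g * r_l * z ** (-2 * d))


def resonance_frequencies_hz(d, fs_hz, count):
    """Frequencies where cos(D * omega) = 0: f = (2m + 1) * Fs / (4 * D)."""
    m = np.arange(count)
    return (2 * m + 1) * fs_hz / (4.0 * d)


if __name__ == "__main__":
    d, fs = 8, 16000.0
    print(resonance_frequencies_hz(d, fs, 3))
    # Magnitude at the first resonance versus at DC, with slightly lossy ends.
    omega_peak = np.pi / (2 * d)
    print(abs(single_tube_response(omega_peak, d, 0.9, 0.9)))
    print(abs(single_tube_response(0.0, d, 0.9, 0.9)))
```

With D = 8 at 16 kHz the resonances fall at odd multiples of 500 Hz, the familiar quarter-wavelength pattern of a tube closed at one end and open at the other.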

Multi-Tube Model
Start from the lips
Add a tube and plug in the formula
Continue to add tubes, working backwards
Finish the recursion when we reach the glottis
$V(z) = \frac{z^{-P/2}\,\prod_{k=0}^{N}(1+r_k)}{1 - a_1 z^{-1} - a_2 z^{-2} - \dots - a_P z^{-P}}$
This turns out to be a standard IIR filter
Linear prediction can help us automate how to accurately compute these coefficients
Note: The text derives this using successive matrix multiplication. That approach is fine and easy to implement in a for loop. For our purposes though, the final formula is what is most important.
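A denominator of the form $1 - a_1 z^{-1} - \dots - a_P z^{-P}$ corresponds to the difference equation $y[n] = G\,x[n] + \sum_k a_k\,y[n-k]$. The sketch below implements that recursion directly; it ignores the pure delay $z^{-P/2}$ in the numerator, and the coefficient values in the example are made up rather than derived from any tube areas.

```python
def all_pole_filter(x, a, gain=1.0):
    """Run y[n] = gain * x[n] + sum_{k=1..P} a[k-1] * y[n-k] over the input x.

    `a` holds a_1 .. a_P with the sign convention of the slide's denominator
    1 - a_1 z^-1 - ... - a_P z^-P. Samples before n = 0 are taken as zero.
    """
    y = []
    for n, sample in enumerate(x):
        acc = gain * sample
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                acc += coeff * y[n - k]  # feedback from earlier outputs
        y.append(acc)
    return y


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 7
    # Hypothetical two-pole resonator; not computed from a real vocal tract.
    print(all_pole_filter(impulse, a=[1.2, -0.8]))
```

Feeding an impulse shows the filter's ringing response; feeding a glottal-like pulse train would give a crude vowel-like signal.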

The All Pole Model
Using a gain factor, G, an IIR filter can model the vocal tract
The model needs a pole for each 1 kHz of the sample rate
– A multi-pole model results in more non-zero filter coefficients
– Adding extra poles will not increase accuracy
– Factoring a filter’s polynomial roughly estimates the formants
– Actual pole locations don’t correspond to specific tubes. They result from all of the interactions between the tubes
– Adding zeros to the model can account for anti-resonances
CELP (Code Excited Linear Prediction)
– Stores banks of vocal tract IIR filters for signal compression
– The residue is the difference between the actual and modeled signal
– Idea: It takes fewer bits to save the residue than the original signal
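The residue idea can be illustrated with a toy predictor: each sample is predicted from earlier samples, and only the prediction error is kept. This is a minimal sketch of linear-prediction residual computation, not of CELP itself; the predictor coefficients are given by hand rather than estimated.

```python
def prediction_residual(x, a):
    """e[n] = x[n] - sum_{k=1..P} a[k-1] * x[n-k], with x[n] = 0 for n < 0."""
    residual = []
    for n, sample in enumerate(x):
        predicted = sum(
            coeff * x[n - k]
            for k, coeff in enumerate(a, start=1)
            if n - k >= 0
        )
        residual.append(sample - predicted)
    return residual


if __name__ == "__main__":
    # A decaying signal that a one-pole model predicts perfectly after n = 0.
    signal = [1.0, 0.5, 0.25, 0.125]
    print(prediction_residual(signal, a=[0.5]))
```

When the model fits well the residual is mostly small values, which is why it is cheaper to encode than the original samples.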

Radiation
A speech signal travels some distance before being heard
There is impedance at the lips that alters the signal and somewhat increases the high-frequency amplitudes
This can be modeled by an additional filter, but an accurate model is quite complicated
An adequate model
– Transfer function: $R(z) = P(z)/U_L(z)$
– Pressure (P) = Output ($U_L$) times impedance (R)
– Typical implementation: $R(z) = 1 - \alpha z^{-1}$ where $0.95 < \alpha < 0.98$
– Result: approximately 6 dB boost per octave
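The typical implementation $R(z) = 1 - \alpha z^{-1}$ is a first-order difference, $y[n] = x[n] - \alpha\,x[n-1]$. The sketch below applies it and evaluates its gain on the unit circle; the default $\alpha = 0.97$ is simply one value inside the range given above.

```python
import cmath
import math


def radiation_filter(x, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], taking x[-1] as zero."""
    y = []
    previous = 0.0
    for sample in x:
        y.append(sample - alpha * previous)
        previous = sample
    return y


def radiation_gain(omega, alpha=0.97):
    """Magnitude |1 - alpha * e^{-i omega}| at normalized frequency omega (rad/sample)."""
    return abs(1 - alpha * cmath.exp(-1j * omega))


if __name__ == "__main__":
    print(radiation_filter([1.0, 1.0, 1.0, 1.0]))  # a constant input is nearly removed
    print(radiation_gain(0.0), radiation_gain(math.pi))  # low vs. high frequency gain
```

The gain rises from about $1 - \alpha$ at DC to $1 + \alpha$ at the Nyquist frequency, which is the high-frequency emphasis the slide describes.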

Glottal Source
Designing a highly accurate model is an open research area
Simplistic model
– Feeding impulses at regular intervals
– Unfortunately, it doesn’t accurately mimic the vocal fold vibration
Alternate models
– $u[n] = \tfrac{1}{2}\left(1 - \cos(\pi n / N_1)\right)$ if $0 \le n \le N_1$; $\cos\left(\pi (n - N_1) / (2N_2)\right)$ if $N_1 \le n \le N_2$; $N_2 \le n \le N_3$
– Adding zeroes $U(z) = \prod_{k=0}^{M} (1 - u_k z^{-1}) / (1 - z)^2$ to model jitter (varying periods) and shimmer (varying amplitudes)
– Modeling with a formula that models an initial negative impulse (Liljencrants-Fant)
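A pulse with a raised-cosine opening phase and a quarter-cosine closing phase can be generated as below. This is a sketch under stated assumptions: it follows the common Rosenberg-style convention in which the closing phase lasts `n2` samples after `n1` and the pulse is zero for the rest of the period, which may differ from the exact segment boundaries intended on the slide. The parameter values are made up.

```python
import math


def glottal_pulse(n1: int, n2: int, period: int):
    """One period of a Rosenberg-style glottal pulse.

    Assumed segments (an interpretation, not taken verbatim from the slide):
      0 <= n <= n1        : 0.5 * (1 - cos(pi * n / n1))       opening phase
      n1 < n <= n1 + n2   : cos(pi * (n - n1) / (2 * n2))      closing phase
      otherwise           : 0                                  closed phase
    """
    pulse = []
    for n in range(period):
        if n <= n1:
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n1)))
        elif n <= n1 + n2:
            pulse.append(math.cos(math.pi * (n - n1) / (2.0 * n2)))
        else:
            pulse.append(0.0)
    return pulse


if __name__ == "__main__":
    # Hypothetical timing: 40-sample opening, 16-sample closing, 100-sample period.
    one_period = glottal_pulse(40, 16, 100)
    print(max(one_period), one_period[:5])
```

Repeating the period, with small random changes to its length and amplitude, is one simple way to experiment with the jitter and shimmer mentioned above.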

Nasal Cavity
A parallel filter is needed
– It is a static articulator
– A simpler all-pole filter will suffice
Difficulties
– Nasal consonants: the oral wave reflects back, causing anti-resonances
– Nasalized vowels: Some of the nasal wave filters back to the oral cavity
Possible complete model
– Anti-resonances are modeled with zeros
– All-pole model for the pharynx + back of mouth
– All-pole model for nasal cavity
– Poles and zeroes for the oral cavity
– A splitting and feedback operation

Speech Models for Nasal Speech
P = Pharynx, M = Mouth, R_L = Radiation from lips, N = Nasal, R_N = Radiation from nose
[Figure: block diagrams driven by the source U(z)]
– Vowel or glide: U(z) → P(z)M(z) → R_L(z)
– Nasalized vowel or glide: U(z) → P(z)M(z) → R_L(z), with a parallel branch N(z) → R_N(z)
– Nasal: U(z) → P(z)M(z) → N(z) → R_N(z)

Speech Models for Obstruent Sounds
Placement of tongue
– Separates the mouth into two sections: V_b(z) is the vocal tract to the back of the mouth; M(z) is the mouth
– The air reflects to the back portion, causing anti-resonances
– Frequency increases as the point of constriction moves towards the front
– Poles and zeroes are needed to model the anti-resonances
[Figure: block diagrams]
– Voiced obstruent: U(z) and a noise source drive V_b(z), M(z), R_L(z)
– Unvoiced: a noise source drives V_b(z), M(z), R_L(z)