Investigating physical properties of speech sounds

Name: Investigating physical properties of speech sounds
Uploaded: 2017-10-11T19:57:24+00:00
Duration: PTM21S46
Description: Investigating physical properties of speech sounds

Acoustic Phonetics
Investigating physical properties of speech sounds
From chapter 7, Rogers (2000)

Speech Sound Representation Reconsidered
Articulatory phonetic approach: describes sounds according to how they are produced
Problems with this approach
  Representation is only in terms of symbols → sounds are not like that in reality
  It does not reflect that some sounds are more easily confused with each other in perception than others
  E.g. i/e vs. a/k, s/f vs. s/m
So we need another way of describing speech sounds
Reviving Sonus

Acoustic representation of speech sounds
Represents sounds as they are
Visual rather than symbolic representation
Depends more on perception than on production or articulation
Physical properties are analyzed
Similarities and differences between sounds are revealed

Acoustic definition of sound
Variation in air pressure
Movements of air particles
An audible disturbance of a medium produced by a source
The source: any object that vibrates
  E.g. musical instruments, human vocal cords, microphone
The medium: any elastic substance that carries vibration
  E.g. air, water

Advantages of acoustic representation
The real, physical mechanism of speech communication is represented
No convention, no confusion, no controversy
Gradual changes in sounds are shown
  Example: how loud a sound is
Small variations are shown
Helpful for understanding how computers synthesize speech and how speech recognition works

What to represent?
Three aspects in which sounds can differ
  Pitch
  Loudness
  Quality
  (Length)

How to represent acoustically?
Sound is air particle movement
The best and agreed way of expressing air particle movements: Waveform
Another necessary way of representing sound: Spectrum

Waveforms

Waveform properties
Simple harmonic movement + time elapse → waveform
Individual particles move only backward and forward

Air particle movement
[Figure: displacement of an air particle over time, with labels for no force, initial force, elasticity, and inertia]

Simple Waveform

Speech sound properties shown in waveforms
Differentiation of sounds
  Sounds are different, which is crucial for human speech as a means of communication
Ways in which sounds can differ
  Perceptually: pitch, loudness, quality
  Acoustically: frequency, amplitude, phase
Waveform shows differences in
  Acoustic correlate of loudness → amplitude
  Acoustic correlate of pitch → frequency

Amplitude representing Loudness
(b) The red waveform has twice the amplitude of the blue one → the red sound is louder

Amplitude (cntd.)
Air pressure fluctuation
The extent of the maximum variation in air pressure from normal during a sound
Unit: bel, decibel (dB; 1/10 of a bel), Bark
  dB: 10 × the common logarithm of a power ratio
Twice the amplitude is not heard as twice as loud
Loud sound: particles move farther and more rapidly
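
As a rough illustration of why twice the amplitude is not twice as loud, the sketch below converts an amplitude ratio to decibels. The helper name `amplitude_ratio_to_db` is made up for this example, and the factor 20 rests on the assumption that power is proportional to amplitude squared, so that 10·log10 of a power ratio becomes 20·log10 of an amplitude ratio.

```python
import math


def amplitude_ratio_to_db(amplitude, reference):
    """Level difference in dB between two amplitudes (hypothetical helper)."""
    if amplitude <= 0 or reference <= 0:
        raise ValueError("amplitudes must be positive")
    # Assumption: power is proportional to amplitude squared,
    # hence 20 * log10 of the amplitude ratio rather than 10 * log10.
    return 20.0 * math.log10(amplitude / reference)


# Doubling the amplitude adds roughly 6 dB rather than doubling the level.
print(round(amplitude_ratio_to_db(2.0, 1.0), 2))
```

Under these assumptions a tenfold increase in amplitude corresponds to 20 dB, which is why the scale compresses large physical differences into small numbers.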

Frequency representing Pitch
[Figure: panels (a) and (b)]

Frequency (cntd.)
The rate at which a sound source vibrates
Sound sources: tuning forks, vocal cords, etc.
Units: Hz, cps (cycles per second)
Depends on
  Length of the pendulum
  Length of tuning fork prongs
F(requency) = 1/T(period)
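
The relation F = 1/T can be tried out with a minimal sketch. The function names are invented for this example, and the two sample values (a 10 ms period and a 440 Hz tone) are chosen only for illustration.

```python
def frequency_from_period(period_s):
    """F = 1/T: frequency in Hz from the period in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


def period_from_frequency(frequency_hz):
    """T = 1/F: period in seconds from the frequency in Hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


# A vibration that repeats every 10 ms completes about 100 cycles per second.
print(frequency_from_period(0.010))
# One cycle of a 440 Hz tone lasts a little over 2 ms.
print(period_from_frequency(440.0) * 1000, "ms")
```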

Frequency (cntd.)
Standard A frequency: 440 Hz
Octave: a note which is exactly twice the frequency of another note
  E.g. A (440 Hz), A’ (880 Hz), A’’ (1760 Hz)
Audible frequency
  Human: 20 Hz (or 16 Hz) – 20 kHz
  Bats: 20 kHz – 100 kHz
Fastest telephone vibration: 35 kHz
Most human speech sound frequencies: below 8 kHz
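
The octave relation (each octave doubles the frequency) can be sketched as follows. The function names are hypothetical; the 440 Hz starting point and the 20 Hz to 20 kHz range are taken from the slide.

```python
import math


def octaves_above(base_hz, n):
    """Frequency n octaves above base_hz (each octave doubles the frequency)."""
    return base_hz * (2 ** n)


def octaves_between(low_hz, high_hz):
    """Number of octaves separating two frequencies."""
    return math.log2(high_hz / low_hz)


# A, A' and A'' from the slide.
print([octaves_above(440.0, n) for n in range(3)])
# The human audible range quoted above, expressed in octaves.
print(round(octaves_between(20.0, 20000.0), 2))
```

Seen this way, the human range of 20 Hz to 20 kHz spans a little under ten octaves.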

Frequency (cntd.)
Pitch and frequency are not in a linear relationship
Fairly linear only at low frequencies
The same difference in Hz sounds greater at low frequencies than at high frequencies
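
One way to put numbers on this non-linearity is the mel scale. The slides do not name it, so its use here is an assumption, as are the formula variant (a commonly used approximation) and the test frequencies of 100, 200, 1000 and 1100 Hz.

```python
import math


def hz_to_mel(frequency_hz):
    """Approximate perceived pitch in mels (one common formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


# The same 100 Hz step, taken low and high in the frequency range.
low_step = hz_to_mel(200.0) - hz_to_mel(100.0)
high_step = hz_to_mel(1100.0) - hz_to_mel(1000.0)

# True if the low-frequency step is the larger one on this scale.
print(low_step > high_step)
```

On this approximation the 100 Hz step between 100 and 200 Hz covers roughly twice as many mels as the step between 1000 and 1100 Hz.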

Phase difference

Phase (cntd.)
Phase differences cause different waveforms
But human ears do not perceive phase differences
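
The first point can be checked numerically with a small sketch: two complex waves built from the same components, differing only in the phase of one component, have visibly different waveforms but the same overall level (RMS). The component frequencies, amplitudes and the 90 degree shift are assumptions chosen for the example; the claim about perception is the slide's and is not something this code demonstrates.

```python
import math


def complex_wave(components, t):
    """Sum of sinusoids; components is a list of (frequency_hz, amplitude, phase_rad)."""
    return sum(a * math.sin(2 * math.pi * f * t + p) for f, a, p in components)


def sample_one_period(components, period_s, n=1000):
    """n evenly spaced samples covering one period of the wave."""
    return [complex_wave(components, period_s * i / n) for i in range(n)]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Same two components; only the phase of the 200 Hz component differs.
in_phase = [(100.0, 1.0, 0.0), (200.0, 0.5, 0.0)]
shifted = [(100.0, 1.0, 0.0), (200.0, 0.5, math.pi / 2)]

wave_a = sample_one_period(in_phase, 0.01)
wave_b = sample_one_period(shifted, 0.01)

# The waveforms differ sample by sample...
max_difference = max(abs(a - b) for a, b in zip(wave_a, wave_b))
print(max_difference > 0.1)
# ...but their overall level is the same.
print(abs(rms(wave_a) - rms(wave_b)) < 1e-9)
```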

Waveform is not sufficient..
Two sounds with the same pitch and loudness can still differ
  Example: violin A vs. piano A
  Example: [i] vs. [a]
Another way of representation is needed → Spectrum

More about waveform first..
To know about spectrum and its representation of quality, we need to know more about waveform

Types of Waveforms: Pure tones vs. Complex waves
Most sound sources, including human speech, produce complex vibrations
Pure tone: simple harmonic motion (SHM), with only one frequency
Complex wave: more than one harmonic motion, multiple frequencies
Pure tone + pure tone of the same frequency and phase → another pure tone
Pure tone + pure tones of different frequencies → a complex tone
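
The two addition rules can be sketched in a few lines. All names, the 8000 Hz sampling rate and the chosen amplitudes are assumptions made for illustration. The check for "not a pure tone" uses a property of sinusoids: a sinusoid with period T satisfies x(t + T/2) = -x(t).

```python
import math


def pure_tone(frequency_hz, amplitude, t, phase=0.0):
    """Value of a sinusoid at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t + phase)


def add_tones(tones, t):
    """Sum of in-phase pure tones; tones is a list of (frequency_hz, amplitude)."""
    return sum(pure_tone(f, a, t) for f, a in tones)


times = [i / 8000.0 for i in range(80)]  # 10 ms at an assumed 8000 Hz rate
half_period = 0.005                      # half the period of a 100 Hz wave

# Same frequency and phase: the sum equals a single 100 Hz tone of amplitude 3.
same_freq = [(100.0, 1.0), (100.0, 2.0)]
sum_is_pure = all(
    abs(add_tones(same_freq, t) - pure_tone(100.0, 3.0, t)) < 1e-9 for t in times
)

# Different frequencies: the sum repeats every 10 ms but is no longer sinusoidal,
# because shifting it by half a period does not simply flip its sign.
mixed = [(100.0, 1.0), (200.0, 1.0)]
mixed_is_sinusoid = all(
    abs(add_tones(mixed, t + half_period) + add_tones(mixed, t)) < 1e-9 for t in times
)

print(sum_is_pure, mixed_is_sinusoid)
```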

Pure tone (Simple Wave, simple harmonic motion, Sinusoid, Sine wave)

Complex wave
[Figure: a complex wave and its component pure tones, including a 100 Hz component]

[a] production by a female speaker
[Figure: complex wave of [a] produced by a female speaker]

Types of Waveform: Repetitive vs. non-repetitive wave
Strictly repetitive (periodic): sine wave, ideal sounds
Virtually repetitive (periodic): vowels, sonorants
Non-repetitive (aperiodic): obstruents, white noise (most complex), click
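
A simple way to tell a repetitive wave from a non-repetitive one numerically is to compare the signal with a copy of itself shifted by one candidate period. The slides do not describe this method; the autocorrelation measure, the function name, the 1600 Hz sampling rate and the use of seeded random noise as a stand-in for an aperiodic sound are all assumptions of this sketch.

```python
import math
import random


def normalized_autocorrelation(samples, lag):
    """Correlation between a signal and itself shifted by lag samples (lag > 0)."""
    head = samples[:-lag]
    tail = samples[lag:]
    numerator = sum(a * b for a, b in zip(head, tail))
    denominator = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
    return numerator / denominator if denominator else 0.0


sample_rate = 1600  # assumed sampling rate in Hz

# Strictly periodic: a 100 Hz sine wave, one second long.
sine = [math.sin(2 * math.pi * 100.0 * i / sample_rate) for i in range(1600)]

# Aperiodic: seeded Gaussian noise standing in for white noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]

lag = sample_rate // 100  # 16 samples, i.e. one period of 100 Hz
periodic_score = normalized_autocorrelation(sine, lag)
aperiodic_score = normalized_autocorrelation(noise, lag)

# Close to 1 for the sine wave, close to 0 for the noise.
print(round(periodic_score, 3), abs(aperiodic_score) < 0.2)
```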

Periodic vs non-periodic wave
[Figure: aperiodic wave [s] and periodic wave [a]]

Limitation of Waveform Representation
Sound can be heard in 3 different ways: loudness, pitch, quality
Quality can’t be represented directly in waveforms
A new way of representation is needed → Spectrum

Spectrum

Background Knowledge on Spectrum
Sound waves can be either simple or complex
  Simple: sinusoid
  Complex: combined simple waves with different frequencies
Sound quality is determined by the way such simple waves combine into a complex wave
If a complex wave can be split into its simple waves, we can see the secret

Waveform and Spectrum (100Hz + 200Hz + 300Hz )
[Figure: waveform and its spectrum; spectrum x-axis: frequency (100, 200, 300 Hz); y-axis: amplitude]
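
How a spectrum is obtained from a waveform can be sketched with a plain discrete Fourier transform. This is a minimal illustration, not the procedure used for the slide's figure: the component amplitudes (4, 2 and 1), the 1600 Hz sampling rate and the 0.1 s duration are assumptions, and the function name is made up.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate):
    """Map each frequency bin (Hz) to the amplitude of that component (naive DFT)."""
    n = len(samples)
    spectrum = {}
    for k in range(n // 2):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        # The DC bin is scaled by 1/n, every other bin by 2/n.
        scale = 1.0 / n if k == 0 else 2.0 / n
        spectrum[k * sample_rate / n] = scale * abs(x_k)
    return spectrum


sample_rate = 1600  # assumed sampling rate in Hz
components = [(100.0, 4.0), (200.0, 2.0), (300.0, 1.0)]  # (Hz, assumed amplitude)

# 0.1 s of the complex wave 100 Hz + 200 Hz + 300 Hz.
samples = [
    sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in components)
    for i in range(160)
]

spectrum = amplitude_spectrum(samples, sample_rate)
# Keep only the bins that carry noticeable energy.
peaks = {f: round(a, 3) for f, a in spectrum.items() if a > 0.01}
print(peaks)
```

The waveform mixes the three components into one curve, while the spectrum separates them again into three bars at 100, 200 and 300 Hz.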

An Example of Spectrum

Formants shown in spectrum
Formant: frequency component(s) with boosted energy
Formant frequency: the frequency of a formant
Reason for formant shaping: filtering function of the vocal tract
Decisive aspect of sound quality
For vowels, three formants (F1, F2, F3) are especially important for their distinction

An Example of Formant: Vowel [ə]

An Example of Formant: Vowel [e]
[Figure: spectrum of the vowel [e]; x-axis: frequency, 50 to 4050 Hz; y-axis: amplitude; peaks labelled F1, F2, F3]

Disadvantages of Spectrum Representation
Less intuitive
  X-axis denotes frequency
No time-varying representation
Hard to see interaction with waveforms
Thus, a new way of representation is needed → Spectrogram

Spectrogram & its reading

What is spectrogram?
In use since the 1940s
Another representation of frequency domain analysis
The most popular way of representing spectral information
3-dimensional representation
  X-axis: Time
  Y-axis: Frequency
  Darkness (or color): Energy
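
The three dimensions can be made concrete with a bare-bones sketch: cut the signal into short frames, take a spectrum of each frame, and store the energy per frame and frequency bin. This is a simplification under stated assumptions (non-overlapping frames, no window function, a naive DFT, a 1600 Hz sampling rate, and an invented test signal that switches from 100 Hz to 300 Hz); real spectrogram software is more elaborate.

```python
import cmath
import math


def spectrogram(samples, sample_rate, frame_len):
    """Return (frame start times in s, bin frequencies in Hz, energy[frame][bin])."""
    bin_frequencies = [k * sample_rate / frame_len for k in range(frame_len // 2)]
    frame_times, energy = [], []
    # Non-overlapping frames with no window function (a simplification).
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        row = []
        for k in range(frame_len // 2):
            x_k = sum(
                s * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i, s in enumerate(frame)
            )
            row.append(abs(x_k) ** 2)  # energy = darkness in the picture
        frame_times.append(start / sample_rate)
        energy.append(row)
    return frame_times, bin_frequencies, energy


sample_rate = 1600  # assumed sampling rate in Hz
# Invented test signal: 0.1 s at 100 Hz followed by 0.1 s at 300 Hz.
signal = [math.sin(2 * math.pi * 100.0 * i / sample_rate) for i in range(160)]
signal += [math.sin(2 * math.pi * 300.0 * i / sample_rate) for i in range(160)]

times, freqs, energy = spectrogram(signal, sample_rate, frame_len=80)
# For each time frame, the frequency with the most energy.
strongest = [freqs[row.index(max(row))] for row in energy]
print(list(zip(times, strongest)))
```

Reading the result frame by frame shows what a spectrum alone cannot: the dominant frequency changes partway through the signal.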

Waveform & Spectrogram aligned

Spectrogram example (color resolution of word “compute”)

Spectrogram example (grayscale of word “compute”)

Types of spectrogram
Wideband spectrogram: better time resolution
Narrowband spectrogram: better frequency resolution
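
The trade-off follows from the length of the analysis window: a short window gives fine time detail but a wide analysis bandwidth, and a long window does the opposite. The sketch below uses the rule of thumb that the bandwidth is roughly the reciprocal of the window duration; that approximation, the function names, the 5 ms and 25 ms windows and the 100 Hz fundamental are all assumptions for illustration, and the exact bandwidth depends on the window shape.

```python
def analysis_bandwidth_hz(window_s):
    """Rule-of-thumb frequency resolution for a window of the given duration."""
    return 1.0 / window_s


def resolves_harmonics(window_s, f0_hz):
    """True if harmonics spaced f0_hz apart are wider apart than the bandwidth."""
    return analysis_bandwidth_hz(window_s) < f0_hz


f0 = 100.0  # assumed fundamental frequency of the voice in Hz

# Short window (wideband): harmonics blur together.
print(resolves_harmonics(0.005, f0))
# Long window (narrowband): individual harmonics become visible.
print(resolves_harmonics(0.025, f0))
```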

Wideband vs. Narrowband spectrograms of the question "Is Pat sad, or mad?"
The 5th, 10th and 15th harmonics have been marked by white squares in two of the vowels

Advantages & Disadvantages
Advantages
  Time alignment
Disadvantages
  Less reliable than waveform

Vowel Spectrogram
Formant frequencies are critical cues for vowel distinction
F1: Height
  high vowels: low F1
F2: Backness
  back vowels: low F2
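
These two rules can be written as simple comparisons. The sketch is illustrative only: the function names and the labels "vowel A" and "vowel B" are hypothetical, and the two F1/F2 pairs are example values in Hz that are not tied to specific vowels here.

```python
def higher_vowel(a, b):
    """The vowel with the lower F1 is the higher (closer) vowel."""
    return a if a["F1"] < b["F1"] else b


def more_back_vowel(a, b):
    """The vowel with the lower F2 is the more back vowel."""
    return a if a["F2"] < b["F2"] else b


# Hypothetical labels with example formant values in Hz.
vowel_a = {"label": "vowel A", "F1": 280, "F2": 2250}
vowel_b = {"label": "vowel B", "F1": 710, "F2": 1100}

print(higher_vowel(vowel_a, vowel_b)["label"])
print(more_back_vowel(vowel_a, vowel_b)["label"])
```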

Examples of formant frequencies of English monophthongs
[ʌ]
F3 (Hz): 2900, 2550, 2490, 2640, 2380, 2300, 2500, 2390
F2 (Hz): 2250, 1900, 1770, 1660, 1100, 1030, 870, 1500, 1190
F1 (Hz): 280, 400, 550, 690, 710, 450, 310, 900, 640

From http://hctv.humnet.ucla.edu/departments/linguistics
"heed, hid, head, had, hod, hawed, hood, who'd" (a male speaker, American English)

Consonant Spectrogram
General
  Acoustic structure more complicated than vowels
  Adjacent sounds (especially vowels) convey important information → locus
  High frequency characteristics → especially for fricatives and affricates

What is LOCUS
Information on formant transitions from vowels into obstruents or from obstruents into vowels
The target frequency that each formant transition is heading toward as an obstruction is made, or the frequency the transition comes from as the obstruction is released
Characteristic of the consonantal place and manner → roughly the same in different vowel contexts

Stops
General
  Fairly distinct locus for each place
  Burst
  Silence during the closure (only at syllable onset position)
  Virtually no difference during the closure

Stops (cntd.)
Voicing distinction
  Voiced: vertical striations, less abrupt burst, frequently weakened to be like fricatives or approximants
  Voiceless: generally abrupt burst in the higher frequency area

Stops (cntd.)
Place distinction
  Bilabial
    relatively low F2, F3 locus → rising into and falling out of vowel
    weak and spread vertical lines
  Alveolar
    F2 locus about 1800 Hz
    strong vertical lines
  Velar
    velar pinch: vowel F2, F3 merging
    often double burst
    long formant transitions

Stops (cntd.)
Manner distinction
Cues: silence duration, VOT, F0 of the following vowel
  Aspirated [pʰ]: short, long, high
  Tense [p’]
  Lax [p]: mid, low

Examples -- “a bab, a dad, a gag”

Place dependent loci

Fricatives
General
  Random noise pattern, especially in high frequency regions
Place distinction
  Labiodental [f, v]: rising locus into the following vowel
  Dental [θ, ð]: major energy above 6000 Hz
  Alveolar [s, z]: major energy above 4000 Hz
  Alveopalatal [š, ž] (IPA [ʃ, ʒ]): major energy above 2000 Hz
  Glottal [h]: the trace of formant frequencies of neighbouring vowels

Fricatives (cntd.)
Weak vs. strong
  Strong [s, z, š, ž]: darker bands
  Weak [f, v, θ, ð]: spread and fainter
  Voiced [v, ð]: often so weak that they are confused with nasals or approximants
Cues to tell [θ] from [f]: higher formants of [θ] fall into adjacent vowels

Example – “fie, thigh, sigh, shy”

Example – “ever, weather, fizzer, pleasure”

Nasals
General
  Formants similar to vowels but fainter
  Very low F1 (about 250 Hz), F2 (about 2500 Hz), and F3 (about 3250 Hz)
Place distinction
  bilabial [m]: downward F2, F3 locus
  alveolar [n]: smaller F2 transition
  velar [ŋ]: velar pinch

Examples -- “a Pam, a tan, a kang”

Liquids & Approximants
General
  Formants similar to vowels but fainter (especially in high frequency regions)
  Approximately F1 (250 Hz), F2 (1200 Hz), F3 (2400 Hz)
  Slow formant movements

Liquids & Approximants (cntd.)
Phone specific properties
  Labial glide [w]:
    very low F1 and F2, which get very close to each other
    relatively low F3
    rapid falloff of spectral amplitude (formant movements)
  Palatal glide [y]:
    extremely low F1
    extremely high F2, F3

Liquids & Approximants (cntd.)
Phone specific properties (cntd.)
  Flap [ɾ]: soft burst, short duration
  Retroflex [r]: F3 dipping down close to F2; general lowering of F3, F4
  Lateral [l]: low F1, F2 (approx. F1 250 Hz, F2 1200 Hz); usually substantial energy in the high F region

Example – “led, red, wed, yell”

Final remarks on spectrogram
Spectrogram is not the only cue for acoustic distinction of speech sounds.
When there is a mismatch between waveform & spectrogram, the waveform is more reliable in general.
