1
Spectrogram & its reading
by Tae-Yeoub Jang
2
What is a spectrogram?
In use since the 1940s
Another representation of frequency-domain analysis
The most popular way of representing spectral information
Three-dimensional representation
X-axis: Time
Y-axis: Frequency
Darkness (or color): Energy
Reviving Sonus
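The three dimensions listed on this slide (time, frequency, energy) can be illustrated with a minimal short-time Fourier sketch. This is not the tool used to produce the slides; the function name `spectrogram`, the Hamming taper, and the synthetic 1000 Hz test tone are all assumptions made for illustration.

```python
import cmath
import math


def spectrogram(samples, sample_rate, window_ms, shift_ms):
    """Return (times_s, freqs_hz, energy) from a short-time Fourier analysis.

    energy[t][k] is the squared magnitude at frame t and frequency bin k,
    i.e. the value a spectrogram would draw as darkness or color.
    """
    win = int(round(sample_rate * window_ms / 1000))
    hop = int(round(sample_rate * shift_ms / 1000))
    # Hamming taper to reduce spectral leakage (an assumed, common choice).
    taper = [0.54 - 0.46 * math.cos(2 * math.pi * n / (win - 1)) for n in range(win)]
    n_bins = win // 2 + 1
    times, energy = [], []
    for start in range(0, len(samples) - win + 1, hop):
        frame = [samples[start + n] * taper[n] for n in range(win)]
        row = []
        for k in range(n_bins):
            # Naive DFT: fine for a demo, far too slow for real recordings.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / win) for n in range(win))
            row.append(abs(acc) ** 2)
        energy.append(row)
        times.append((start + win / 2) / sample_rate)  # frame centre, X-axis
    freqs = [k * sample_rate / win for k in range(n_bins)]  # Y-axis
    return times, freqs, energy


# Synthetic input: 0.1 s of a 1000 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(800)]
times, freqs, energy = spectrogram(tone, rate, window_ms=10, shift_ms=5)
peak_bin = max(range(len(freqs)), key=lambda k: energy[0][k])
print("strongest frequency in first frame:", freqs[peak_bin], "Hz")
```

For real speech, an FFT-based library routine would replace the naive inner loop; the layout of the result (frames by frequency bins) stays the same.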
3
Spectrogram example (color display of the word “compute”)
4
Spectrogram example (grayscale display of the word “compute”)
5
Wideband vs. narrowband spectrograms of the question "Is Pat sad, or mad?"
The 5th, 10th and 15th harmonics have been marked by white squares in two of the vowels.
6
Types of spectrogram
Wideband spectrogram: better time resolution
e.g., 15 msec window, 1 msec shift, 125 Hz bandwidth
Narrowband spectrogram: better frequency resolution
e.g., 50 msec window, 1 msec shift, 40 Hz bandwidth
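The trade-off between the two types comes from the inverse relation between window length and analysis bandwidth. The sketch below assumes bandwidth ≈ k / T with k = 2; the exact constant depends on the window shape and on how bandwidth is defined, and k = 2 is chosen here only because it roughly reproduces the slide's figures. The 100 Hz pitch is a hypothetical value, and the rule that harmonics separate when the bandwidth is narrower than F0 is a rough guide, not an exact criterion.

```python
def analysis_bandwidth_hz(window_ms, k=2.0):
    """Approximate analysis bandwidth for a window of the given length.

    k is an assumed constant; its true value depends on the window shape
    and on the bandwidth definition.
    """
    return k / (window_ms / 1000.0)


def resolves_harmonics(bandwidth_hz, f0_hz):
    """Harmonics lie f0 apart, so they separate only if the filter is narrower."""
    return bandwidth_hz < f0_hz


F0_HZ = 100.0  # hypothetical pitch of the speaker

# (label, window length in ms, bandwidth quoted on the slide in Hz)
for label, window_ms, slide_bw in [("wideband", 15, 125), ("narrowband", 50, 40)]:
    estimate = analysis_bandwidth_hz(window_ms)
    print(label, "estimated:", round(estimate), "Hz, slide:", slide_bw, "Hz,",
          "harmonics resolved:", resolves_harmonics(slide_bw, F0_HZ))
```

Under these assumptions the narrowband setting shows individual harmonics as horizontal lines, while the wideband setting smears them and shows vertical striations instead.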
7
Advantages & Disadvantages
Advantages: Time alignment
Disadvantages: Less reliable than waveform
8
Vowel Spectrogram
Formant frequencies are critical cues for vowel distinction
F1: Height (high vowels: low F1)
F2: Backness (back vowels: low F2)
9
Example formant frequencies of English monophthongs (Hz)
F3: 2900 2550 2490 2640 2380 2300 2500 2390
F2: 2250 1900 1770 1660 1100 1030 870 1500 1190
F1: 280 400 550 690 710 450 310 900 640
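The reading rule from the vowel slide (low F1 means a high vowel, low F2 means a back vowel) can be applied to two columns of the values above. This is a minimal sketch: the function name `compare_vowels` and the placeholder labels are hypothetical, and pairing the F1 and F2 rows by position is an assumption, since the vowel symbols are not shown with the values.

```python
def compare_vowels(name_a, formants_a, name_b, formants_b):
    """Compare two vowels given as (F1, F2) in Hz.

    Lower F1 is read as a higher vowel, lower F2 as a more back vowel.
    """
    f1_a, f2_a = formants_a
    f1_b, f2_b = formants_b
    higher = name_a if f1_a < f1_b else name_b
    more_back = name_a if f2_a < f2_b else name_b
    return {"higher": higher, "more_back": more_back}


# First and fifth entries of the F1 and F2 rows, paired by position (assumed).
# "vowel 1" and "vowel 5" are placeholder labels, not phonetic symbols.
result = compare_vowels("vowel 1", (280, 2250), "vowel 5", (710, 1100))
print(result)
```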
10
"heed, hid, head, had, hod, hawed, hood, who'd" (a male speaker, American English)
11
Consonant Spectrogram
General
Acoustic structure more complicated than vowels
Adjacent sounds (especially vowels) convey important information: locus
High-frequency characteristics, especially for fricatives and affricates
12
What is LOCUS?
Information on the formant transitions from vowels into obstruents or from obstruents into vowels
The target frequency that each formant transition is heading toward as an obstruction is made, or the frequency the transition comes from as the obstruction is released
Characteristic of the consonantal place and manner; roughly the same in different vowel contexts
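One crude way to make the idea of a transition "coming from" a frequency concrete is to fit a straight line to the measured formant values after the release and evaluate it at the release time. The sketch below does this with ordinary least squares. Real transitions are not straight lines, the function name `extrapolate_locus` is hypothetical, and the measurement values are made up purely for illustration.

```python
def extrapolate_locus(times_ms, formant_hz, release_time_ms):
    """Fit a least-squares line to a formant transition and evaluate it
    at the release time. A rough linear approximation only."""
    n = len(times_ms)
    mean_t = sum(times_ms) / n
    mean_f = sum(formant_hz) / n
    covariance = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_ms, formant_hz))
    variance = sum((t - mean_t) ** 2 for t in times_ms)
    slope = covariance / variance
    return mean_f + slope * (release_time_ms - mean_t)


# Made-up F2 measurements (Hz) at 10, 20, 30 and 40 ms after a stop release.
times_ms = [10, 20, 30, 40]
f2_hz = [1700, 1600, 1500, 1400]
locus = extrapolate_locus(times_ms, f2_hz, release_time_ms=0)
print("extrapolated F2 at release:", locus, "Hz")
```

Repeating the fit for the same consonant before different vowels and checking whether the extrapolated values agree is one way to probe the "roughly the same in different vowel contexts" claim.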
13
Stops
General
Fairly distinct locus for each place
Burst
Silence during the closure (only at syllable onset position)
Virtually no difference during the closure
14
Stops (cntd.)
Voicing distinction
voiced: vertical striations, less abrupt burst, frequently weakened to resemble fricatives or approximants
voiceless: generally abrupt burst in the higher-frequency region
15
Stops (cntd.)
Place distinction
bilabial: relatively low F2, F3 locus, rising into and falling out of the vowel; weak and spread vertical lines
alveolar: F2 locus about 1800 Hz; strong vertical lines
velar: velar pinch (the vowel's F2 and F3 merging); often double burst; long formant transitions
16
Stops (cntd.)
Manner distinction
Silence duration, VOT, vowel F0
aspirated short long high tense lax med low
17
Examples -- “a bab, a dad, a gag”
18
Place-dependent loci
19
Fricatives
General
Random noise pattern, especially in high-frequency regions
Place distinction
Labiodental [f, v]: rising locus into the following vowel
Dental [θ, ð]: major energy above 6000Hz
Alveolar [s, z]: major energy above 4000Hz
Alveopalatal [š, ž]: major energy above 6000Hz
Glottal [h]: the trace of formant frequencies of neighbouring vowels
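A statement such as "major energy above 4000Hz" can be turned into a rough number by measuring what fraction of a frame's spectral energy lies at or above a cutoff. The sketch below is only an illustration: the function name `high_band_fraction` is hypothetical, and two pure tones stand in for a low-frequency sound and a high-frequency sound, since no recorded fricatives are available here.

```python
import cmath
import math


def high_band_fraction(samples, sample_rate, cutoff_hz):
    """Fraction (0.0 to 1.0) of spectral energy at or above cutoff_hz."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(n // 2 + 1):
        # Naive DFT bin; adequate for one short demo frame.
        acc = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        bin_energy = abs(acc) ** 2
        total += bin_energy
        if k * sample_rate / n >= cutoff_hz:
            high += bin_energy
    return high / total if total > 0 else 0.0


rate = 16000
frame_len = 160  # one 10 ms frame
# Stand-in signals (assumed): a 500 Hz tone and a 6000 Hz tone.
low_tone = [math.sin(2 * math.pi * 500 * i / rate) for i in range(frame_len)]
high_tone = [math.sin(2 * math.pi * 6000 * i / rate) for i in range(frame_len)]
low_fraction = high_band_fraction(low_tone, rate, 4000)
high_fraction = high_band_fraction(high_tone, rate, 4000)
print("500 Hz tone, share above 4000 Hz:", round(low_fraction, 3))
print("6000 Hz tone, share above 4000 Hz:", round(high_fraction, 3))
```

On real fricative frames the fraction would fall somewhere between these two extremes, and the cutoff would be varied to match the place of articulation being tested.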
20
Fricatives (cntd.)
Weak vs. strong
Strong [s, z, š, ž]: darker bands
Weak [f, v, θ, ð]: spread and fainter
Voiced [v, ð]: often so weak that they are confused with nasals or approximants
Cues to tell [θ] from [f]: higher formants of [θ] fall into adjacent vowels
21
Example – “fie, thigh, sigh, shy”
22
Example – “ever, weather, fizzer, pleasure”
23
Nasals
General
Formants similar to vowels but fainter
Very low F1 (about 250Hz), F2 (about 2500Hz), and F3 (about 3250Hz)
Place distinction
bilabial [m]: downward F2, F3 locus
alveolar [n]: smaller F2 transition
velar [ŋ]: velar pinch
24
Examples -- “a Pam, a tan, a kang”
25
Liquids & Approximants
General
Formants similar to vowels but fainter (especially in high-frequency regions)
Approximately F1 (250Hz), F2 (1200Hz), F3 (2400Hz)
Change in formant structure
26
Liquids & Approximants (cntd.)
Phone-specific properties
Labial glide [w]: very low F1, F2 ( Hz), which come very close to each other; relatively low F3; rapid falloff of spectral amplitude
Palatal glide [y]: extremely low F1; extremely high F2, F3
27
Liquids & Approximants (cntd.)
Phone-specific properties (cntd.)
Flap [ɾ]: soft burst, short duration
Retroflex [r]: F3 dipping down close to F2; general lowering of F3, F4
Lateral [l]: low F1, F2 (approx. F1 250Hz, F2 1200Hz); usually substantial energy in the high-frequency region
28
Example – “led, red, wed, yell”
29
Final remarks
The spectrogram is not the only cue for acoustic distinction of speech sounds
Very often, the waveform is more reliable