Week 6 – Psychoacoustics ESE 250 S’13 DeHon Kadric Kod Wilson-Shah 1 ESE250: Digital Audio Basics Week 6 February 19, 2013 Human Psychoacoustics
2 Course Map
3 Where are we?
Week 2: Received signal is sampled & quantized: q = PCM[ r ]
Week 3: Quantized signal is coded: c = code[ q ]
Week 4: Sampled signal first transformed into frequency domain: Q = DFT[ q ]
Week 5: Signal oversampled & low-pass filtered: Q = LPF[ DFT(q+n) ]
Week 6: Transformed signal analyzed using human psychoacoustic models
Week 7: Acoustically interesting signal is “perceptually coded”: C = MP3[ Q ]
[Figure: signal chain from r(t) to p(t) with blocks Oversample, DFT, LPF, Perceptual Coding, Store / Transmit, Decode, Produce; intermediate signals q + n, Q + N, Q, C]
[Painter & Spanias. Proc. IEEE, 88(4):451–512, 2000]
4 The Physical Ear
External sound waves
Guided by outer ear into auditory canal
Excite inner ear
Through mechanical linkage connecting ear drum to cochlea
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
5 The Physical Ear
Initiates signal processing
Frequency-domain analysis
Via analog computation
Video: Cochlea
What part of the cochlea vibrates for an 800 Hz square wave?
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
6 The Cognitive Ear
Modern psychoacoustics benefits greatly from
o decades of neural recording
o contemporary brain imaging technology
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
7 Power Spectrum Model of Hearing
Rough picture (main content of today’s lecture):
Critical Bands: the auditory system contains a finite array of adaptively tunable, overlapping bandpass filters
Frequency Bins: humans process a signal’s component (against a noisy background) in the one filter with the closest center frequency
Masking: certain signal components in a given band are “favored” and others are filtered out
Established through decades of psychoacoustic experiments
B.C.J. Moore. Int.Rev.Neurobiol., 70:49–86,
8 Auditory Thresholds
In the lab, you varied the frequency, amplitude and phase of signals
What was the effect of each, if any, on the sound you heard?
Frequency
Amplitude
Phase
9 Auditory Thresholds
Harvey Fletcher (1940)
Played pure tones varying
o frequency, f [Hz]
o intensity, I [dyn · cm^-2]; 1 dyn · cm^-2 = 10^-5 N · cm^-2 = 0.1 Pa
o phase changes tend to be inaudible
Large listener population
o Young
o Acute
Recorded extreme thresholds
faintest audible
greatest tolerable
10 Auditory Thresholds
Results: pain-free hearing range extends at most over 20 Hz – 20 kHz, with sensitivity ≈ 2 · 10^-4 · 0.1 Pa = 20 µPa
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
11 The decibel unit
Define standard pressure: p_0 = 2 · 10^-4 · 0.1 Pa = 20 µPa (threshold of human hearing)
Compute Sound Pressure Level as: L_SPL = 20 log_10(p/p_0) dB
L_SPL for p_1 = 20 µPa, for p_2 = 200 µPa, for p_3 = 20 mPa
Compare to ambient sea-level pressure: 1 atmosphere = 10^5 Pa
Q: why use log-log scale?
A1: dynamic range
A2: “loudness” is a power function
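The sound pressure level formula can be tried out numerically. The following is a minimal sketch, assuming the reference pressure is p_0 = 20 µPa and that the three exercise pressures are 20 µPa, 200 µPa and 20 mPa; the function name `spl_db` is made up for this illustration.

```python
import math

P0_PA = 20e-6  # assumed reference pressure: 20 micropascals


def spl_db(pressure_pa, p0_pa=P0_PA):
    """Sound pressure level in dB: L_SPL = 20 * log10(p / p0)."""
    return 20.0 * math.log10(pressure_pa / p0_pa)


# Exercise values (assumed): each factor of 10 in pressure adds 20 dB.
for label, p in [("20 uPa", 20e-6), ("200 uPa", 200e-6), ("20 mPa", 20e-3)]:
    print(f"{label}: {spl_db(p):.1f} dB SPL")
```

Under these assumptions the three pressures come out at 0, 20 and 60 dB SPL, which shows how the logarithm compresses a 1000:1 pressure range into a 60 dB span.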
12 The decibel unit – Hearing intensity
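The shape of the hearing threshold curve is often approximated with a closed-form fit. The sketch below assumes the threshold-in-quiet approximation attributed to Terhardt, as quoted in the Painter & Spanias survey; the slide itself does not give a formula, and the function name is hypothetical.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate absolute hearing threshold in dB SPL (assumed Terhardt fit).

    Intended for positive frequencies in the audible range.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


# The fit is most sensitive (lowest threshold) near 3 to 4 kHz and rises
# steeply toward both ends of the audible range.
for f in (100, 1000, 3300, 10000, 20000):
    print(f"{f} Hz: {threshold_in_quiet_db(f):.1f} dB SPL")
```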
13 Let’s try to reproduce these results!
We will listen to single sine tones starting at a frequency of 10 kHz, all the way up to 20 kHz, so each student can figure out their cut-off frequency
Suggestions to improve this experiment?
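Test tones like these could be generated as follows. This is a minimal sketch, not the script used in class: the 1 kHz step, the 0.1 s duration, the 0.5 amplitude and the 44.1 kHz sample rate are all assumptions (the rate only needs to exceed twice the highest tone, 20 kHz).

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed, above twice the highest test tone


def sine_tone(freq_hz, duration_s, amplitude=0.5, sample_rate=SAMPLE_RATE):
    """Return a list of samples of a pure sine tone."""
    n = round(duration_s * sample_rate)
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * i) for i in range(n)]


# Assumed test grid: 10 kHz to 20 kHz in 1 kHz steps.
test_freqs = list(range(10000, 20001, 1000))
tones = {f: sine_tone(f, 0.1) for f in test_freqs}
```

Each entry of `tones` could then be written to an audio file or sent to a sound card; that step is left out because it depends on the playback setup.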
14 Animal hearing ranges
Dogs:
Greater hearing range: 40 Hz to 60 kHz
Ultrasonic dog whistles
Mice:
Large ears in comparison to their bodies
Hearing range: 1 kHz to 70 kHz
Can’t hear low-frequency noises
Communicate with high frequency
Distress call (40 kHz), alert of predator
[Pictures from Wikipedia]
15 Why Sinusoids?
Why not some other harmonic series?
Fourier’s analysis shows harmonic analysis could be based on an arbitrary smooth periodic fundamental
Why does the animal receiver use sinusoids?
Hamiltonian mechanics
Simplest physical model of vibrating masses
Coupled spring-mass-damper mechanics
Produce sinusoidal harmonics
Video: Cochlea
[Figure: spring-mass-damper diagram, labels m, x, b, k]
… all sound is produced by vibrating masses …
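The claim that spring-mass-damper mechanics produce sinusoids can be checked on a single oscillator. The sketch below assumes the standard model m·x'' + b·x' + k·x = 0 in the underdamped case, with m, b and k read as mass, damping and spring constant; the function names and the example values are hypothetical.

```python
import math


def damped_frequency_hz(m, b, k):
    """Oscillation frequency (Hz) of m*x'' + b*x' + k*x = 0 when underdamped."""
    decay = b / (2.0 * m)
    disc = k / m - decay ** 2
    if disc <= 0:
        raise ValueError("not underdamped: the mass does not oscillate")
    return math.sqrt(disc) / (2.0 * math.pi)


def displacement(t, m, b, k, x0=1.0):
    """x(t) for initial displacement x0 and zero initial velocity."""
    decay = b / (2.0 * m)
    wd = 2.0 * math.pi * damped_frequency_hz(m, b, k)
    # Exponentially decaying sinusoid: the only oscillatory shape is a sine/cosine.
    return x0 * math.exp(-decay * t) * (math.cos(wd * t) + (decay / wd) * math.sin(wd * t))


# Example: an undamped unit mass tuned to 800 Hz.
k_800 = (2.0 * math.pi * 800.0) ** 2
print(damped_frequency_hz(1.0, 0.0, k_800))
```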
16 Masking - Spatial
Masking Paradigms
“Masker” masking “maskee”
Tone Masking Noise
o pure tone of 80 dB SPL at 1 kHz
o just masks “critical band” noise of 56 dB SPL centered at 1 kHz
Masker-to-Maskee ratio
o Constant for fixed relative frequency and varying amplitude
o Changes with varying relative frequency
[Figure label: 1 “Bark” frequency interval]
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000.]
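The tone-masking-noise example gives a concrete masker-to-maskee ratio. The sketch below assumes that this ratio (80 − 56 = 24 dB) stays fixed as the masker level changes, which the slide states only for a fixed relative frequency; it does not model how the ratio changes across frequency. The function names are made up for illustration.

```python
def masker_to_maskee_ratio_db(masker_db, maskee_db):
    """Level difference at which the masker just masks the maskee."""
    return masker_db - maskee_db


# From the example: an 80 dB tone just masks 56 dB critical-band noise at 1 kHz.
RATIO_DB = masker_to_maskee_ratio_db(80, 56)


def is_masked(masker_db, maskee_db, ratio_db=RATIO_DB):
    """True if the maskee lies at least ratio_db below the masker (same band)."""
    return masker_db - maskee_db >= ratio_db


print(RATIO_DB)           # 24
print(is_masked(60, 36))  # quieter masker, same ratio: just masked
print(is_masked(60, 40))  # maskee too loud to be masked
```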
17 Masking
The first graph shows the masking pattern for a 200 Hz tone
Mostly masks tones around 200 Hz, but also at harmonics
The second graph shows the same plot for different frequencies, but only the fundamental part
Notice that the band gets wider for increasing frequencies
… masker at fundamental can somewhat mask maskees at the harmonics …
… but the “spreading curve” is traditionally depicted over the fundamental only
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
18 Tone Masking Noise
Are the following signals masked?
200 Hz tone at 80 dB
200 Hz tone at 40 dB
300 Hz tone at 40 dB
400 Hz tone at 40 dB
700 Hz tone at 30 dB
19 Masking
Tone Masking Noise (Fig 12)
value above quiet threshold such that a signal at the abscissa frequency can be heard in presence of
top: 200 Hz tone
bottom: various frequencies
Noise Masking Tone (Fig 13)
dots show pure tone magnitude (in dB) required to be audible above noise
o of the magnitude on the middle curve
o centered at that frequency
o with bandwidth at least wider than the bars of Fig 12
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
20 Noise Masking Tone
Are the following signals masked by the noise?
200 Hz at 60 dB
1 kHz at 60 dB
21 Noise Masking Tone
Are the following signals masked by the noise?
200 Hz at 60 dB
o Yes!
1 kHz at 60 dB
22 Noise Masking Tone
Are the following signals masked by the noise?
200 Hz at 60 dB
o No!
1 kHz at 60 dB
23 Noise Masking Tone
Are the following signals masked by the noise?
200 Hz at 60 dB
1 kHz at 60 dB
o No!
25 Masking - Temporal
Temporal Masking
Masker effect persists for tenths of a second
Masker effect is “acausal”
o on timescales of ~2/100 of a second
26 Pitch JND
JND = “just noticeable difference”
change in stimulus that “just” elicits perceptual notice
where “just” means that smaller variations of the stimulus cannot be discerned
What can you say about the JND:
Below 1000 Hz?
o roughly constant
o ~ 3 Hz
Above 1000 Hz?
o roughly log-log linear
o Log[JND(f_2)] - Log[JND(f_1)] ~ n (Log[f_2] - Log[f_1])
Suggests that as frequency increases, broader frequency bands are “assigned” to the same length of cochlear tissue
Remember cochlea model
What is n?
e.g. f_1 = 2000, f_2 = 4000: 10 – 4 ~ n (Log_10[2]) ⇒ n ~ 20
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
27 JND experiment
The following audio files contain a single tone playing for 10 seconds.
The sine starts at 200 Hz, then changes to a higher frequency (201, 202, 203, 205, 210 Hz).
This change occurs after a number of “noises”: 1, 2, 3, 4, 5, 6, 7, 8 or 9.
Can you notice when the change happens?
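A stimulus of this kind could be synthesized as sketched below. This is an illustration under assumptions, not the script behind the class audio files: the switch point is given in seconds through a hypothetical `switch_s` parameter, and the phase is accumulated sample by sample so the frequency change does not introduce an audible click that would give the change away.

```python
import math


def stepped_tone(f_start, f_end, switch_s, total_s, sample_rate=44100, amplitude=0.5):
    """Sine tone that jumps from f_start to f_end at switch_s, phase-continuous."""
    n_total = round(total_s * sample_rate)
    n_switch = round(switch_s * sample_rate)
    samples = []
    phase = 0.0
    for i in range(n_total):
        freq = f_start if i < n_switch else f_end
        samples.append(amplitude * math.sin(phase))
        # Advance the phase by one sample at the current frequency.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples


# Assumed example: 200 Hz changing to 203 Hz halfway through a 10 s tone.
stimulus = stepped_tone(200.0, 203.0, switch_s=5.0, total_s=10.0)
```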
28 Critical Bands
Decades of empirical study reveal that human audio frequency perception is quantized into < 30 “critical bands”
of perceptually near-identical pitch classes
corresponding to ~equal-length bands of cochlear tissue (neurons)
29 Critical Bands: Evidence
Tone masking Noise (Fig. a & c)
o noise audibility threshold
o for small bandwidth noise
o remains constant
o until tone frequency locus
o falls away from critical bandwidth
Noise masking Tone (Fig. b & d)
o same effect
o with masker and maskee roles reversed
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000.]
30 The Bark Scale
“Bark” units:
Uniform JND scale for frequency
Maps frequency intervals into their respective critical band number
[E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]
31 The Bark Scale
Frequency-to-Bark function
First Principles vs. Empirical Modeling
[E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]
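One widely used empirical frequency-to-Bark function can be sketched in a few lines. The formula below is the analytic approximation usually attributed to Zwicker and Terhardt (1980); it is an assumption that it matches the curve shown on the slide, and the function name is hypothetical.

```python
import math


def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark (assumed Zwicker-Terhardt fit)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


# Equal steps in Bark cover wider and wider frequency spans as frequency rises.
for f in (100, 500, 1000, 4000, 10000, 20000):
    print(f"{f} Hz: {hz_to_bark(f):.2f} Bark, band index {int(hz_to_bark(f))}")
```

With this fit the whole 20 Hz to 20 kHz range spans fewer than 30 Bark, consistent with the "< 30 critical bands" figure.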
32 Compression opportunities
Consider the following recording
Any ways to improve the compression?
33 Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: level (dB) vs. frequency, 195 Hz to 205 Hz around 200 Hz; annotation: Masked]
34 Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: level (dB) vs. frequency, 195 Hz to 205 Hz around 200 Hz]
JND: Could only represent integer frequency values
35 Compression opportunities
Zooming in on a smaller portion
Any ways to improve the compression?
[Figure: level (dB) vs. frequency, 195 Hz to 205 Hz around 200 Hz]
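The two opportunities raised on these slides, dropping masked components and storing only integer frequencies, can be combined in a toy sketch. Everything numeric here is hypothetical: the triangular masking threshold around an 80 dB, 200 Hz masker (24 dB below the masker, falling 5 dB per Hz), the component list, and the rule of keeping the loudest component per integer bin. It is not the perceptual coder covered next week.

```python
def toy_mask_db(freq_hz, masker_hz=200.0, masker_db=80.0):
    """Hypothetical masking threshold: 24 dB under the masker, falling 5 dB per Hz."""
    return masker_db - 24.0 - 5.0 * abs(freq_hz - masker_hz)


def compress_components(components, mask_threshold_db):
    """Drop masked (freq_hz, level_db) pairs and round the rest to integer Hz."""
    kept = {}
    for freq_hz, level_db in components:
        if level_db < mask_threshold_db(freq_hz):
            continue  # below the masking threshold, so assumed inaudible
        bin_hz = round(freq_hz)  # JND idea: sub-Hz detail is not stored
        kept[bin_hz] = max(level_db, kept.get(bin_hz, float("-inf")))
    return sorted(kept.items())


components = [(200.0, 80), (197.3, 40), (204.6, 45), (195.2, 35)]
print(compress_components(components, toy_mask_db))
```

In this made-up example the 197.3 Hz component falls under the threshold and is discarded, while the other three survive at 195, 200 and 205 Hz.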
36 Next Week
How can we use what we know about human perception to compress music?
Frequency hearing range
Masking
o Temporal
o Spatial
o JND
o Barks
37 Big Ideas
Sound is a pressure wave that makes the cochlea vibrate with frequencies from ~20 Hz (at the tip) to ~20 kHz (at the base)
This vibration is sinusoidal (physics)
This is why sound harmonics are best represented as sinusoidal signals
Masking
Temporal – A masker tone can mask another tone that is present either right before or a little after the masker
Spatial – A single tone can mask an entire frequency band (that contains the tone) if its intensity is high enough
There are <30 such bands (Bark scale), and they are wider for higher frequencies
38 Admin
Lab 5 report due tomorrow
On Thursday: Lab 6
You will be designing your own experiments
o To measure the range of frequencies you can hear
o To perform spatial masking experiments
39 ESE250: Digital Audio Basics
End Week 6 Lecture
Human Psychoacoustics