PSYC Auditory Science Spatial Hearing Chris Plack
Spatial Hearing Learning Outcomes
–Understand the main cues used to localise sounds: IIDs and ITDs
–Understand other cues to localisation including head movements and pinna effects
–Understand the BMLD
–Understand what is meant by “the precedence effect” and how it is measured
Some Definitions Binaural hearing - hearing with both ears. Localisation - the ability to locate a sound source in the environment. Lateralisation - the ability to localise in the head a sound presented over headphones.
Localisation
Localisation Our ears give us much less information about object location than our eyes. We have only got two spatial channels for hearing (our two ears) compared to arguably several million for vision (the receptors in each retina). However, we can hear sound sources that are beyond our line of sight (e.g., behind our heads), and this helps us to orient attention and can be important for survival.
Binaural Cues Our two ears can be used to localise sounds. A sound will tend to be more intense, and arrive earlier, at the ear closest to the sound source. Hence, we can determine the direction of a sound source based on Interaural Intensity Differences (IIDs, also called ILDs, interaural level differences) and Interaural Time Differences (ITDs).
Intensity Cues Mainly because of the shadowing effect of the head, a sound to the right will be more intense in the right ear than in the left ear. [Figure: left (L) and right (R) ears.]
Interaural Intensity Differences The sound reaching the ear farthest from the source is less intense, mainly because of head shadowing and also because intensity dissipates with distance according to the inverse-square law (the latter is only useful for sounds close to the head). Low-frequency sounds diffract around the head but high-frequency sounds don't, so a high-frequency shadow is cast over the farthest ear. Hence the IID is frequency dependent, being greatest for high frequencies.
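The remark that the inverse-square contribution only matters for sources close to the head can be checked with a rough calculation. The sketch below is an illustration only: it assumes a source on the interaural axis, a straight-line path to each ear (ignoring the head itself), and the 23 cm interaural distance used elsewhere in these slides. The function name `distance_iid_db` is hypothetical.

```python
import math

def distance_iid_db(near_ear_distance_m, interaural_distance_m=0.23):
    """Level difference (dB) between the ears due to distance alone.

    Assumes intensity falls as 1/r^2, so the level difference is
    10*log10((far/near)^2) = 20*log10(far/near).
    """
    far_ear_distance_m = near_ear_distance_m + interaural_distance_m
    return 20.0 * math.log10(far_ear_distance_m / near_ear_distance_m)

# Hypothetical source distances from the nearer ear.
for d in (0.1, 0.5, 2.0, 10.0):
    print(f"{d:5.1f} m from near ear: {distance_iid_db(d):5.2f} dB")
```

Under these assumptions the distance-based difference is several dB for a source a few centimetres away and a small fraction of a dB at 10 m, which is why head shadowing is the main source of IIDs for most sounds.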
Timing Cues Because sound travels relatively slowly (330 m/s), a sound from the right will arrive perceptibly earlier at the right ear than at the left ear. [Figure: left (L) and right (R) ears at Time 1 and Time 2.]
Interaural Time Differences The interaural distance (approx. 23 cm) produces a maximum ITD of about 0.69 ms when the source is directly opposite one ear (90°). The ITD falls to zero as the source moves forward or backwards to be in front (0°) or behind (180°) the listener. Smallest detectable ITD is about 0.01 ms!
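As a rough check on these numbers, the maximum ITD is the interaural distance divided by the speed of sound. The sketch below also assumes a simple sine dependence on azimuth, which is not stated on the slide; it is one common simplification that gives zero ITD at 0° and 180° and the maximum at 90°. The constants are the slide's values; the function name `itd_ms` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 330.0     # value used in the slides
INTERAURAL_DISTANCE_M = 0.23   # approximate value from the slide

def itd_ms(azimuth_deg):
    """ITD in ms for a source at the given azimuth (0 = front, 90 = side).

    Assumption: path difference = interaural distance * sin(azimuth).
    """
    max_itd_s = INTERAURAL_DISTANCE_M / SPEED_OF_SOUND_M_S
    return 1000.0 * max_itd_s * math.sin(math.radians(azimuth_deg))

for azimuth in (0, 30, 60, 90, 120, 180):
    print(f"{azimuth:3d} deg: ITD = {itd_ms(azimuth):.3f} ms")
```

With these values the ITD at 90° comes out just under 0.7 ms, in line with the figure quoted above.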
Ambiguities in ITDs For a continuous pure tone, ambiguities arise when the period of the tone is less than twice the ITD - closest peaks in the waveform may suggest wrong ear is leading. For a sound directly to the side, this occurs for a frequency of about 735 Hz. Ambiguities can be resolved if the tone is modulated, i.e., if there are envelope cues (including abrupt onsets).
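The ambiguity condition can be written as a one-line calculation: a tone is ambiguous when its period is less than twice the ITD, so the limiting frequency is 1 / (2 × ITD). The sketch below is a minimal illustration using the 0.69 ms maximum ITD from the earlier slide; the function names are hypothetical.

```python
def ambiguity_frequency_hz(itd_s):
    """Lowest frequency at which a pure tone's ITD becomes ambiguous."""
    return 1.0 / (2.0 * itd_s)

def is_ambiguous(frequency_hz, itd_s):
    """True when the tone's period is less than twice the ITD."""
    period_s = 1.0 / frequency_hz
    return period_s < 2.0 * itd_s

max_itd_s = 0.69e-3  # source directly to the side
print(f"Limit for a source at the side: {ambiguity_frequency_hz(max_itd_s):.0f} Hz")
print("500 Hz ambiguous?", is_ambiguous(500.0, max_itd_s))
print("1000 Hz ambiguous?", is_ambiguous(1000.0, max_itd_s))
```

The result falls in the low 700s of Hz, in the same region as the value quoted on the slide; the exact figure depends on the head size and speed of sound assumed.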
Ambiguities in ITDs
Duplex Theory The duplex theory suggests that sound localisation is based on interaural time differences at low frequencies and interaural intensity differences at high frequencies. However, for fluctuating high-frequency signals the envelope can carry good timing information. It is now thought that, for most sounds (which have wideband spectra), ITDs may dominate at all frequencies.
Minimum Audible Angle Indicates the smallest change in sound source position that can be detected by the listener. Using sinusoidal signals, the MAA is smallest for frontal signals (1° for frequencies below 1 kHz). Around 1.5 kHz the IIDs are small and ITDs become ambiguous, resulting in an increase in MAA. Performance worsens markedly as the source moves away from a frontal position, but in the real world the listener can move their head!
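One way to relate the 1° MAA at the front to the smallest detectable ITD of about 0.01 ms is to ask how much the ITD changes for a 1° shift in source position. The sketch below is illustrative only and assumes the same simple sine model of ITD against azimuth (not stated in the slides), with the slides' values of 23 cm and 330 m/s. Function names are hypothetical.

```python
import math

MAX_ITD_MS = 1000.0 * 0.23 / 330.0  # about 0.70 ms, from the slides' values

def itd_ms(azimuth_deg):
    """ITD in ms, assuming a sine dependence on azimuth (a simplification)."""
    return MAX_ITD_MS * math.sin(math.radians(azimuth_deg))

def itd_change_ms(azimuth_deg, step_deg=1.0):
    """Change in ITD when the source moves by step_deg from azimuth_deg."""
    return itd_ms(azimuth_deg + step_deg) - itd_ms(azimuth_deg)

front_change = itd_change_ms(0.0)   # 1 degree shift at the front
side_change = itd_change_ms(75.0)   # 1 degree shift well to the side
print(f"Front: {front_change:.4f} ms per degree")
print(f"75 deg: {side_change:.4f} ms per degree")
```

Under this model a 1° shift at the front changes the ITD by roughly 0.012 ms, close to the detection threshold, whereas the same shift at 75° changes it by much less. This is consistent with performance worsening as the source moves away from the front.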
The Cone of Confusion Interaural time and intensity differences are ambiguous. For example, we can’t tell the difference between a sound directly in front and a sound directly behind using IIDs or ITDs. Same IIDs and ITDs for sound source on surface of cone:
The Cone of Confusion Ambiguities can be resolved by:
–Head movements
–Spectral effects of pinna, head, and torso reflections
Head Movements [Figure: (a) ITD = 0; (b) ITD shows left leading; candidate source positions marked "?".] …but what if the sound is too brief?
Effects of Pinna The pinna modifies the sound entering the ear depending on direction, resolving ambiguities and providing cues to source elevation. [Figure: spectra for elevations of 15° and -15°, plotted against frequency (kHz); scale bar 10 dB.]
Effects of Pinna Because of the shape of the concha, sounds at higher elevations have shorter reflected path lengths, hence a notch at a higher frequency.
Hebrank & Wright (1974) JASA 56, p. 1829
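The link between reflected path length and notch frequency can be sketched with a single-reflection model. This is an assumption made for illustration, not the analysis in the cited paper: if the reflected sound travels an extra distance and adds to the direct sound without a phase inversion, the first cancellation occurs where the delay equals half a period, i.e. at f = c / (2 × extra path). The path lengths below are hypothetical.

```python
SPEED_OF_SOUND_M_S = 330.0  # value used in the slides

def first_notch_hz(extra_path_m):
    """First spectral notch for a direct sound plus one delayed reflection.

    The reflection is delayed by extra_path / c; cancellation first occurs
    when that delay is half a period of the tone.
    """
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    return 1.0 / (2.0 * delay_s)

# Hypothetical extra path lengths: shorter for the higher elevation.
for label, extra_path_m in (("lower elevation", 0.020), ("higher elevation", 0.015)):
    print(f"{label}: extra path {extra_path_m * 100:.1f} cm -> "
          f"notch near {first_notch_hz(extra_path_m):.0f} Hz")
```

A shorter reflected path gives a shorter delay and therefore a notch at a higher frequency, which is the relationship described above.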
Evidence for Importance of Spectral Cues in Vertical Localisation
–Accurate vertical localisation only with broadband signals (and only those with energy > 4 kHz).
–Vertical localisation prevented by occlusion of the convolutions in the pinnae (horizontal localisation unaffected apart from front/back distinctions).
–Vertical localisation almost as good with a single ear.
–Vertical localisation sensitive to manipulations of the source spectrum.
Middlebrooks & Green (1991) Ann. Rev. Psychol. 42, p. 135
Distance Perception Loudness is an important cue for distance: in the direct field (little reverberation), the further away the source is, the quieter the sound. This cue works better with familiar sounds (e.g. speech). Direct-to-reverberant energy ratio is another cue: the closer the source, the louder the direct sound will be compared with the early reflected sounds (e.g. Zahorik (2002) JASA 111, p. 1832).
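Both distance cues can be illustrated with a small calculation. The sketch below assumes that the direct sound follows the inverse-square law (about 6 dB quieter per doubling of distance) and, as a simplifying assumption not made in the slides, that the reverberant level is the same everywhere in the room. All levels and names are hypothetical.

```python
import math

def direct_level_db(distance_m, level_at_1m_db=70.0):
    """Level of the direct sound, assuming the inverse-square law."""
    return level_at_1m_db - 20.0 * math.log10(distance_m)

def direct_to_reverberant_db(distance_m, level_at_1m_db=70.0,
                             reverberant_level_db=55.0):
    """Direct-to-reverberant ratio in dB.

    Assumption: the reverberant level does not depend on distance.
    """
    return direct_level_db(distance_m, level_at_1m_db) - reverberant_level_db

for d in (1.0, 2.0, 4.0, 8.0):
    print(f"{d:4.1f} m: direct {direct_level_db(d):5.1f} dB, "
          f"D/R {direct_to_reverberant_db(d):5.1f} dB")
```

Under these assumptions both the direct level and the direct-to-reverberant ratio fall as the source moves away. Unlike overall level, the ratio does not depend on how intense the source is to begin with.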
Binaural Unmasking
Binaural Masking Level Difference Measure tone (signal) threshold in presence of broadband masker with identical signal and masker to both ears. Invert phase of tone (or masker) in one ear so that signal and masker are lateralised differently. Masked signal threshold is lower (binaural release from masking). The difference between in-phase and altered-phase thresholds is called the BMLD.
–NoSo - masker and signal same phase at both ears - poor detection
–NoSπ - masker same phase, signal π radians out of phase - good detection
–NmSm - masker and signal presented monaurally - poor detection
–NoSm - masker same to both ears, signal monaural - good detection
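The four conditions can be made concrete by constructing the masker and signal for each ear. The sketch below is illustrative only: the sample rate, tone frequency, amplitudes, and function names are all hypothetical, and the final check simply summarises the four cases listed above (detection is good when the masker matches at the two ears but the signal does not). It is not a model of the auditory system.

```python
import math
import random

def make_condition(condition, n_samples=800, sample_rate=8000,
                   tone_hz=500.0, tone_amp=0.1, seed=0):
    """Return per-ear masker and signal waveforms for one condition."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    tone = [tone_amp * math.sin(2.0 * math.pi * tone_hz * n / sample_rate)
            for n in range(n_samples)]
    silence = [0.0] * n_samples
    inverted = [-s for s in tone]  # pi radians out of phase
    if condition == "NoSo":
        return {"masker": (noise, noise), "signal": (tone, tone)}
    if condition == "NoSpi":
        return {"masker": (noise, noise), "signal": (tone, inverted)}
    if condition == "NmSm":
        return {"masker": (noise, silence), "signal": (tone, silence)}
    if condition == "NoSm":
        return {"masker": (noise, noise), "signal": (tone, silence)}
    raise ValueError(f"unknown condition: {condition}")

def ear_waveforms(parts):
    """Add masker and signal to give the waveform at each ear."""
    (m_left, m_right), (s_left, s_right) = parts["masker"], parts["signal"]
    left = [m + s for m, s in zip(m_left, s_left)]
    right = [m + s for m, s in zip(m_right, s_right)]
    return left, right

def signal_differs_but_masker_matches(parts):
    """True when only the signal differs between the ears."""
    (m_left, m_right), (s_left, s_right) = parts["masker"], parts["signal"]
    return m_left == m_right and s_left != s_right

for name in ("NoSo", "NoSpi", "NmSm", "NoSm"):
    parts = make_condition(name)
    print(name, "-> binaural advantage expected:",
          signal_differs_but_masker_matches(parts))
```

`ear_waveforms` gives the left and right signals that would be sent to the headphones for a given condition.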
The BMLD is frequency dependent since it relies on ITDs:
Condition   BMLD (dB)
NuSπ        3
NuSo        4
NπSm        6
NoSm        9
NπSo        13
NoSπ        15
N = noise masker, S = signal, u = uncorrelated noise, o = no phase shift, m = monaural, π = 180° phase shift
Huggins Pitch Present the same noise to both ears over headphones - noise is lateralised to the centre of the head. Now decorrelate a narrow band of noise between the ears (so that the band is different between the ears). This band “pops out” and is heard as having a pitch corresponding to the centre frequency of the band: Huggins pitch.
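A stimulus of this kind can be sketched by building the noise from sinusoidal components with random phases. Components outside the chosen band get the same phase in both ears; components inside the band get independent phases in each ear, which decorrelates that band. This is a minimal sketch under stated assumptions, not the method used in any particular study: the component spacing, band edges, duration, and function names are all hypothetical.

```python
import math
import random

def huggins_components(f_lo=100, f_hi=1000, step=10,
                       band_centre=500.0, band_width=50.0, seed=1):
    """Return (frequency, left_phase, right_phase) for each component."""
    rng = random.Random(seed)
    components = []
    for f in range(f_lo, f_hi + 1, step):
        phase_left = rng.uniform(0.0, 2.0 * math.pi)
        in_band = abs(f - band_centre) <= band_width / 2.0
        # Independent phase in the right ear only inside the band.
        phase_right = rng.uniform(0.0, 2.0 * math.pi) if in_band else phase_left
        components.append((float(f), phase_left, phase_right))
    return components

def synthesise(components, duration_s=0.1, sample_rate=8000):
    """Sum the components to give left and right waveforms."""
    n_samples = int(duration_s * sample_rate)
    left, right = [], []
    for n in range(n_samples):
        t = n / sample_rate
        left.append(sum(math.sin(2.0 * math.pi * f * t + pl)
                        for f, pl, _ in components))
        right.append(sum(math.sin(2.0 * math.pi * f * t + pr)
                         for f, _, pr in components))
    return left, right

components = huggins_components()
left, right = synthesise(components)
decorrelated = [f for f, pl, pr in components if pl != pr]
print("Decorrelated components (Hz):", decorrelated)
```

With these settings only the components between 475 and 525 Hz differ between the ears, so the pitch heard would be expected to correspond to roughly 500 Hz.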
Huggins Pitch [Figure: level against frequency (Hz) for the noise. The noise is the same in both ears except for a narrow band, marked at 500 Hz, that is decorrelated between the ears.]
Gockel, Carlyon, and Plack (2010). Can Huggins pitch harmonics be combined with diotic pure tone harmonics to produce a residue pitch? Conditions: mixed-mode (1 HP + 1 NBN) and single-mode (2 HP or 2 NBN).
Present two successive pairs of harmonics. Does the pitch change follow the analytic (spectral) or the synthetic (residue) pitch? [Table columns: Frequencies; F0 (Hz); Harmonic Numbers (1st & 2nd, 2nd & 3rd, 4th & 5th).]
Responses of listeners in the mixed-mode and single-mode conditions were highly correlated. This suggests that Huggins pitch harmonics and diotic harmonics are processed by the same mechanism and are combined after the MSO (medial superior olive).
The Precedence Effect
In a reverberant space (such as a room) sound from a source reflects off the walls, and arrives at the ear from different directions. Why don’t these reflections confuse the auditory system?
The Precedence Effect Direct sound follows shortest path and arrives first. The auditory system takes advantage of this by restricting analysis to the sound arriving first. I.e. the first arriving wavefront takes precedence.
The Precedence Effect For example, a click is presented from two loudspeakers separated by 80°. If the clicks are simultaneous, the sound is heard between the loudspeakers. A delay is then imposed on the left loudspeaker. For delays of 0-1 ms, the sound image moves towards the right loudspeaker. For delays of 1-30 ms, the image is localised at the right loudspeaker with no contribution from the left (the precedence effect). For delays > 30 ms the effect breaks up, and a direct sound and an echo are heard.
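The delay ranges in this demonstration can be summarised as a simple lookup. The sketch below only restates the slide; it is not a model of the effect. It assumes that the upper limit of the precedence region is 30 ms, taken from the 1-30 ms range given above, and treats the boundaries as sharp, which they are not in practice. The function name is hypothetical.

```python
def precedence_percept(delay_ms):
    """Describe what is heard for a given delay on the lagging loudspeaker."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms == 0:
        return "single image between the loudspeakers"
    if delay_ms < 1:
        return "single image shifted towards the leading loudspeaker"
    if delay_ms <= 30:  # assumed upper limit of the precedence region
        return "single image at the leading loudspeaker (precedence effect)"
    return "direct sound and a separate echo"

for delay in (0, 0.5, 5, 30, 50):
    print(f"{delay:>4} ms: {precedence_percept(delay)}")
```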
Virtual Auditory Space
Sounds presented over headphones tend to be lateralised inside the head. However, if we record sounds using two microphones in the ear canals (or in the ear canals of a “dummy head”) then when this recording is presented over headphones it seems external and can be localised outside the head. The cues from the pinna, head, and torso help to give a recording a spacious quality when presented over headphones.
Dummy Head Recordings