1
Spatially Relocated Frequencies and Their Effect on the Localization of a Stereo Image
This research was supported by Delphi Automotive Systems
Robert G. Hartman
May 2003
Copyright © 2003 Rob Hartman. All rights reserved.
2
Content
Introduction
“Hard to localize?”
Localization Cues
Localization Cue Salience
Auditory Scene Analysis
Experimentation
Results and Analysis
Conclusions
References
3
Introduction
4
Introduction
[Figure labels: Lmic, Rmic, Lspkr, Rspkr]
5
Introduction
[Figure: two layouts compared ("? vs. ?"); labels: Tweeter, Midrange / Woofer, Tweeter]
6
Demonstration: Note piano shift from center to right
7
“Hard to localize?”
8
“Hard to localize?”
Popular opinion: “low” frequencies / subwoofers
Smyth [1] states: “experimental evidence suggests that it is difficult to localize mid-to-high frequency signals above about 2.5 kHz, and therefore any stereo imagery is largely dependent on the accurate reproduction of only the low-frequency components of the audio signal” (p. 18).
Minimum Audible Angle (MAA) tests [2, 3]: generally, humans are least sensitive to spatial changes in the “middle frequency” (2-4 kHz) range
9
“Hard to localize?” Do certain conditions make it harder?
Steady/continuous sounds are harder to localize than impulsive sounds (transients carry ILD and ITD localization cues)
Narrow-band sounds (octaves, sinusoids, etc.) are harder to localize than wide-band (complex) sounds: less bandwidth means fewer cues to compare
Frequency range and the acoustic space:
In free field or an anechoic space, middle-frequency tones are harder to localize than low or high [2, 3]
Room modes and reflections most affect the ability to localize low frequencies [4]
10
“Hard to localize?” Ultimately it’s a complex process
Depends on the type of localization cues present:
Physical presence of cues and agreement between them
Psychoacoustical importance relative to one another
Correlation between sources (multi-source)
These factors vary with the spectral content of the sources and their position relative to the listener
11
Localization Cues
12
Localization Cues 1) Interaural Time Differences (ITD)
Due to the constant speed of sound combined with path-length differences to the two ears.
Types: arrival (IATD), phase (IPD), envelope (IETD)
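The path-length idea above can be sketched numerically. This is a minimal far-field approximation, not a model taken from the presentation: it assumes a distant source, two point "ears" a fixed distance apart with no head in between, and illustrative values for the ear spacing and the speed of sound. The function name `itd_seconds` and both constants are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C
EAR_SPACING_M = 0.18        # assumption: illustrative ear-to-ear distance


def itd_seconds(azimuth_deg, ear_spacing_m=EAR_SPACING_M, c=SPEED_OF_SOUND_M_S):
    """Far-field arrival-time difference for a source at azimuth_deg.

    0 degrees is straight ahead, positive angles are to the right.
    The extra path to the far ear is d * sin(azimuth); dividing by the
    speed of sound turns that path difference into a time difference.
    """
    path_difference_m = ear_spacing_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / c


if __name__ == "__main__":
    for azimuth in (0, 15, 30, 45, 90):
        print(f"{azimuth:>3} deg -> {itd_seconds(azimuth) * 1e6:6.1f} us")
```

A spherical-head model would give somewhat larger values at wide angles, so the numbers here are only indicative of the trend: the ITD grows as the source moves away from the median plane.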
13
Localization Cues 2) Interaural Level Differences (ILD)
Due to acoustical interaction of sound with the head and body. ILD varies significantly with frequency
14
Localization Cues 3) Monaural / Pinnae Cues
Spectral influences in the 5-12 kHz range
Differentiates sources with the same ILD or ITD cues (cone of confusion)
Helps avoid front/back confusions and determine vertical height of sources
Least dominant cue; head turns can also be used [5]
15
Localization Cue Salience
16
Localization Cue Salience
Salience is based on physical and perceptual factors
Physical variations: ITD due to spectral content
17
Localization Cue Salience
Physical variations cont. ITD due to Spatial Position
18
Localization Cue Salience
Physical variations cont. ILD due to Spectral and Spatial factors
19
Localization Cue Salience
Physical variations cont.
In reality, complex patterns exist for spectral and spatial (azimuth) variations [6]
[Figure labels: 90, 60, 30]
20
Localization Cue Salience
Perceptual Salience
Assuming the physical levels of the cues are identical, salience depends on the spectral content and the relative dominance of the cues
“Trading experiments”: tests that remove any physical limitations and study only perceptual sensitivity [6]
Sensitivity to IPD is greatest for f < 800 Hz. Above this its influence is reduced, with no effect above 1.6 kHz
ILD has an impact over the entire frequency range, with a slight increase in sensitivity around 2 kHz
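One physical reason that phase cues lose value as frequency rises can be shown with a short calculation. For a steady tone, the phase difference is 2π · f · ITD, and only its value modulo one cycle is observable. The sketch below is illustrative only: the 500 µs ITD is a hypothetical value, the function names are invented here, and the 800 Hz and 1.6 kHz limits on the slide are perceptual results from [6], not outputs of this arithmetic.

```python
import math


def ipd_radians(itd_s, freq_hz):
    """Unwrapped interaural phase difference of a pure tone."""
    return 2.0 * math.pi * freq_hz * itd_s


def wrapped_ipd_radians(itd_s, freq_hz):
    """IPD folded into (-pi, pi], which is all a steady tone can convey."""
    phase = ipd_radians(itd_s, freq_hz) % (2.0 * math.pi)
    return phase - 2.0 * math.pi if phase > math.pi else phase


if __name__ == "__main__":
    itd_s = 500e-6  # hypothetical ITD for a source well off to one side
    for freq_hz in (200, 800, 1600, 3200):
        wrapped = wrapped_ipd_radians(itd_s, freq_hz)
        print(f"{freq_hz:>5} Hz: wrapped IPD = {wrapped / math.pi:+.2f} pi")
```

With this ITD the wrapped phase at 1600 Hz comes out negative, i.e. it points to the opposite side, which is the kind of ambiguity that makes high-frequency phase an unreliable cue.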
21
Localization Cue Salience
“Trading experiments” cont.
Generally, low-frequency ITDs dominate, followed by ILDs. Pinnae cues have the least influence.
For low frequencies, 40 µs ITD = 1 dB ILD, whereas higher frequencies exhibit only ILD sensitivity
Full lateral displacement occurs at … µs ITD or … dB ILD
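The low-frequency trading ratio quoted above (40 µs of ITD per 1 dB of ILD) can be applied as a simple unit conversion. The sketch below assumes the trade is linear, which is a simplification made here and not a claim from the slides, and it uses a hypothetical sign convention in which positive values pull the image to the right. All function names are illustrative.

```python
ITD_US_PER_DB = 40.0  # from the slide: 40 us ITD trades against 1 dB ILD (low frequencies)


def itd_as_ild_db(itd_us):
    """Express an ITD (microseconds) as the ILD (dB) it trades against."""
    return itd_us / ITD_US_PER_DB


def ild_as_itd_us(ild_db):
    """Express an ILD (dB) as the ITD (microseconds) it trades against."""
    return ild_db * ITD_US_PER_DB


def net_cue_db(itd_us, ild_db):
    """Combine both cues into one ILD-equivalent value.

    Assumption: linear trading. Positive = image pulled right,
    negative = image pulled left, zero = the cues cancel.
    """
    return ild_db + itd_as_ild_db(itd_us)


if __name__ == "__main__":
    # A 200 us ITD favouring the right ear against a 5 dB ILD favouring the left.
    print(net_cue_db(itd_us=200.0, ild_db=-5.0))
```

The ratio only applies at low frequencies; as the slide notes, higher frequencies show ILD sensitivity alone, so the conversion should not be used there.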
22
Localization Cue Salience
Summary figures
23
Auditory Scene Analysis
24
Auditory Scene Analysis
Important to consider the effect that multiple “streams” of sound can have on the resulting stereo image
Depending on the temporal and spectral correlation of the streams, the resulting image could SHIFT (summing localization), SPLIT, WIDEN, etc.
25
Auditory Scene Analysis
Precedence Effect
Perceptually suppresses similar events occurring … msec after the original event
Delayed events have an effect on the perceived location, as has been shown (summing localization)
Above 30 msec, audible “echoes” begin to occur
26
Auditory Scene Analysis
Auditory Stream Segregation [8]
Cocktail party effect
Temporal interrelationship: increased time differences cause segregation (precedence effect)
Relative similarity of fundamental frequencies: an increased difference in perceived pitch causes segregation (binaural beats, etc.)
Spectral distribution (harmonics): timbre helps differentiate similar instruments
Perceptual location of the auditory events: sounds with the above differences are more likely to be segregated when their “perceived” spatial locations do not coincide
27
Experimentation
28
Experimentation
Test setup was a stereo pair with an additional offset “Spatially Relocated” (SR) speaker
29
Experimentation
Listeners were asked to comment on the relative shift of the central stereo image caused by moving “low” and “high” frequency bands from the L to the SR channel
30
Experimentation
Frequency bands were chosen based on known localization cue performance [6]
Band A = 20 – 800 Hz (dominant ITDs)
Band B = 800 – 1,600 Hz (reduced ITDs)
Band C = 1,600 – 5,000 Hz (reduced ILD)
Band D = 5,000 – 12,000 Hz (ILD / pinnae)
Band E = 12,000 – 20,000 Hz (dominant ILD)
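To make the band relocation concrete, the following sketch splits a left-channel signal into the part that stays in L and the part sent to the SR speaker, using the band edges listed above. The filtering method is an assumption: a brick-wall FFT mask is used here for brevity, and the slides do not say which filters the experiment actually used. The names `extract_bands` and `relocate` are hypothetical, and the sketch depends on NumPy.

```python
import numpy as np

# Band edges in Hz, as listed on the slide.
BANDS_HZ = {
    "A": (20, 800),
    "B": (800, 1600),
    "C": (1600, 5000),
    "D": (5000, 12000),
    "E": (12000, 20000),
}


def extract_bands(signal, sample_rate, band_names):
    """Keep only the named bands of `signal` (brick-wall FFT mask, an assumption)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = np.zeros(freqs.shape, dtype=bool)
    for name in band_names:
        low, high = BANDS_HZ[name]
        mask |= (freqs >= low) & (freqs < high)
    return np.fft.irfft(spectrum * mask, n=len(signal))


def relocate(left, sample_rate, moved_bands):
    """Split the left channel into (stays in L, goes to SR).

    Content below 20 Hz or at and above 20 kHz belongs to no band and is dropped.
    """
    kept_bands = [name for name in BANDS_HZ if name not in moved_bands]
    stays_in_left = extract_bands(left, sample_rate, kept_bands)
    goes_to_sr = extract_bands(left, sample_rate, moved_bands)
    return stays_in_left, goes_to_sr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 48000
    left = rng.standard_normal(sample_rate)  # one second of white noise
    stays, to_sr = relocate(left, sample_rate, "DE")
    print("energy kept in L:", float(np.sum(stays**2)))
    print("energy sent to SR:", float(np.sum(to_sr**2)))
```

A brick-wall mask rings in the time domain, so a real listening test would more likely use crossover-style filters; the mask is chosen here only because it makes the band bookkeeping easy to verify.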
31
Experimentation
Actual test variables compare moving “high” vs. “low” frequency bands to the SR channel
E vs. Stereo (STR), E vs. A, E vs. AB
DE vs. STR, DE vs. A, DE vs. AB, DE vs. ABC
CDE vs. STR, CDE vs. A, CDE vs. AB
Test signals
Ideal signal was spectrally balanced white noise
Music track also used, despite typical “low levels” of high-frequency energy
32
Demonstration:
White Noise Bursts (E vs. A)
Candy Perfume Girl (E vs. A)
What is Hip? (E vs. A)
33
Results and Analysis
34
Results and Analysis
35
Results and Analysis
36
Results and Analysis Noise Results
Confident “no shift” for E vs. STR and “right” for E vs. A
Band A causes more shift than band E
DE vs. STR suggests DE is “just right” of STR, whereas A is definitely “right” of DE
Band A causes more shift than band DE
CDE is right of STR, but A is also to the right of CDE
Band A causes more shift than band CDE!
37
Results and Analysis
38
Results and Analysis
39
Results and Analysis Music Results
Confident “no shift” for E vs. STR and “right” for E vs. A
Band A causes more shift than band E
Confident “no shift” for DE vs. STR and “right” for DE vs. A
Spectrogram shows low energy in band DE
CDE is “right” of STR, but A is less confident
While band CDE does have some energy, it has less than in white noise. Thus, CDE does not cause as much shift.
40
Results and Analysis
How does moving bands to the SR channel affect the localization cues?
Change in azimuth (15°) creates a new path to the ears
IATD will decrease, due to the smaller path difference
IPD is more complex because of its dependence on spectral content
ILD is expected to change minimally; more for HF than LF
41
Results and Analysis
Why do the low frequency (LF) bands create further shifts of the stereo image than the high frequency (HF) bands?
SR band loudness? Are the LF bands louder than the HF bands?
Type of test signals? Would music with more high-frequency energy produce results similar to the white noise test track?
Localization cue salience? Is the dominance of LF vs. HF cues the cause?
42
Results and Analysis SR Band LOUDNESS
Calculated using Stevens’ Mark VII method [7]
Band E is 1.5 times louder than band A!
Also performed “loudness” listening experiments, which showed similar results
HF bands are typically louder than LF bands
43
Demonstration: SR Band Loudness (E, A, DE, AB, ABC)
44
Results and Analysis Type of Test Signals
To study the apparent difficulty in noticing the spatial relocation of high frequencies, ABX testing was performed w/ two music tracks. Compared “stereo” (A) with shifting left signal’s HF (> 10 kHz) to SR channel (B)
45
Results and Analysis Type of Test Signals
Sampled 52 critical-listening music tracks and chose the tracks with the greatest energy above 10 kHz
Only ~3% of average energy (i.e. large quantities of HF energy are not common in music)
Results show an inability to reliably notice HF spatial relocation (39% and 49% correct responses, worse than guessing)
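ABX scores such as those above are usually judged against the 50% guessing rate with a one-sided binomial test. The sketch below shows that calculation under stated assumptions: the slides do not give the number of trials, so the value of 100 is purely hypothetical, and `p_at_least` is an illustrative helper, not something from the original analysis.

```python
from math import comb


def p_at_least(correct, trials, p_guess=0.5):
    """Probability of getting `correct` or more answers right by pure guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1.0 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    trials = 100  # hypothetical: the trial count is not stated in the slides
    for percent_correct in (39, 49):
        correct = round(trials * percent_correct / 100)
        p_value = p_at_least(correct, trials)
        print(f"{percent_correct}% correct of {trials}: p = {p_value:.3f}")
```

A large p-value means the score is consistent with guessing, which is what "inability to reliably notice" amounts to; with a different trial count the exact probabilities change, but scores at or below 50% never count as evidence of detection.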
46
Results and Analysis Localization Cue Salience?
Moving LF bands affects ITD cues, with minimal ILD changes
Moving HF bands affects ILD cues, with meaningless ITD changes
ITD cues are the most dominant [6], and could be expected to create more noticeable changes to the stereo image
47
Conclusions
48
Conclusions Which sounds are “hard to localize?”
Continuous more difficult than impulsive
Low frequencies in a reflecting room
Middle frequencies (2-4 kHz), over high or low, in an anechoic room
Narrow band (octaves, sinusoids) more difficult than wideband (i.e. more cues to compare gives a better sense of localization)
49
Conclusions How do we localize?
Most sensitive to ITDs (particularly IPD), caused by spectral content and path-length differences between the ears
Lesser sensitivity to ILDs, although most sensitive in the mid-frequency range (~2 kHz)
Least sensitive to monaural/pinna cues, which help avoid front/back confusion and determine the height of a sound. Head tilt and turn provide similar information.
50
Conclusions The RESULTS
Will the LF or HF bands cause greater shift to a stereo image?
Moving LF caused more shift than HF bands w/ white noise
Moving band “E” (>12 kHz) was typically not noticed
In ABX music testing, moving HF energy (>10 kHz) was difficult to discern (worse than guessing)
Is this explainable?
NOT due to band loudness: the HF bands were shown to be LOUDER than the LF bands
Large proportions of HF energy are uncommon in music
Most likely due to the relative dominance of low-frequency ITD cues
51
Conclusions
Future Research
Mono tweeter system:
HF image shift due to improper ILD cues
Using HRTF may help reduce noticeability
Overall perception still dominated by “stereo” low-mid range speakers
Simpler experiments:
Relocate frequencies vertically instead of horizontally
Will create less noticeable shifts
Different variables:
“Equally loud” bands, different loudspeakers, the acoustic space, varied speaker locations
More “scientific” approach with analysis of ear recordings
52
Questions & Answers
Thank you for your participation!
53
References
[1] Smyth, M. (1999). White paper: an overview of the coherent acoustics coding system. Retrieved October 1, 2001.
[2] Mills, A.W. (1958). On the minimum audible angle. J. Acoust. Soc. Am., 30.
[3] Stevens, S.S., & Newman, E.B. (1936). Localization of actual sources of sound. Amer. J. Psychol., 48.
[4] Hartmann, W.M. (1983). Localization of sound in rooms. J. Acoust. Soc. Am., 74 (5).
[5] Fisher, H., & Freedman, S.J. (1968). The role of the pinna in auditory localization. J. Aud. Res., 8.
[6] Blauert, J. (1999). Spatial hearing: the psychophysics of human sound localization. Cambridge, Mass.: The MIT Press. (Original work published in 1974)
[7] Stevens, S.S. (1972). Perceived level of noise by Mark VII and decibels (E). J. Acoust. Soc. Am., 51.
[8] Bregman, A.S. (1999). Auditory scene analysis: the perceptual organization of sound. Cambridge, Mass.: The MIT Press.