1
Auditory Objects of Attention
Chris Darwin, University of Sussex
With thanks to: Rob Hukin (RA), Nick Hill (DPhil), Gustav Kuhn (3rd-year project), MRC
3
Need for sound segregation
Ears receive a mixture of sounds
We hear each sound source as having its own appropriate timbre, pitch and location
Stored information about sounds (e.g. acoustic/phonetic relations) concerns a single source
4
Mechanisms of segregation
Primitive grouping mechanisms, based on general heuristics
Schema-based mechanisms, based on specific knowledge
5
A Paradox
We can attend to sounds coming from a particular direction
  – everyday experience
  – auditory RTs are faster to the cued side (Spence & Driver, 1994)
Interaural time differences (ITDs) are the main cue to the location of a complex sound (Wightman & Kistler, 1992)
6
A Paradox
On the other hand, ITDs are ineffective at grouping together sounds from a single sound source (Culling & Summerfield, 1995; Darwin & Hukin, 1995)
7
Culling & Summerfield (1995): 4 noise bands
8
ITD versus ILD
9
ILD segregates; ITD does not
10
Coincidence detection and ITD
[Figure: left cochlea and right cochlea, frequency channels at 200, 500, 1000 and 2000 Hz, MSO; ITD labels +600 µs, -600 µs, +600 µs; vowel labels EE, AR]
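Coincidence detection across an array of internal delays is commonly modelled as an interaural cross-correlation, with the ITD read off from the delay at which the correlation peaks. The sketch below illustrates that idea only; it is not the model used in the talk. The sample rate, the white-noise signal and the helper names `delay` and `cross_correlation_peak` are all assumptions made for the example.

```python
import random


def delay(signal, n):
    """Delay a signal by n >= 0 samples, zero-padding the start."""
    return [0.0] * n + signal[:len(signal) - n]


def cross_correlation_peak(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive lag means the right-ear signal lags the left-ear signal.
    """
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n in range(len(left)):
            m = n + lag
            if 0 <= m < len(right):
                total += left[n] * right[m]
        if total > best_value:
            best_lag, best_value = lag, total
    return best_lag


SAMPLE_RATE = 50_000  # Hz, assumed so that 600 µs is a whole number of samples
random.seed(0)        # fixed seed so the example is repeatable

# Broadband noise at the left ear; the right ear gets the same noise 30 samples late.
left = [random.gauss(0.0, 1.0) for _ in range(2000)]
right = delay(left, 30)

lag = cross_correlation_peak(left, right, max_lag=50)
itd_us = lag / SAMPLE_RATE * 1e6
print(f"estimated lag: {lag} samples, ITD of about {itd_us:.0f} µs")
```

With a broadband signal the correlation has a single dominant peak, so the estimate is unambiguous; the narrowband case is where the difficulties arise.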
11
Two models of attention
12
Plan
Check Culling & Summerfield's result with more natural sounds
Show evidence for grouping before across-frequency ITD is calculated
Show that ITD can be a very powerful sequential grouping cue
13
Phoneme boundary shift
14
ILD condition
[Figure: Left and Right ear panels; labels "600-Hz" and "no 600-Hz"; target vowel / I / or / /; carrier sentence "Hello, you'll hear the sound X now"]
15
ILD segregates; ITD does not
16
Phase Ambiguity
500 Hz: period = 2 ms
L lags by 1.5 ms is equivalent to L leads by 0.5 ms
Cross-correlation peaks at +0.5 ms and -1.5 ms
The auditory system gives most weight to the peak closest to zero
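The arithmetic behind the ambiguity can be checked directly: for a pure tone, any interaural delay is indistinguishable from the same delay shifted by a whole number of periods. The sketch below lists those equivalent delays and picks the one nearest zero. The function name `equivalent_delays` and the ±3 ms search range are assumptions made for the example; negative values mean the left ear lags, matching the signs on the slide.

```python
import math


def equivalent_delays(freq_hz, delay_ms, max_abs_ms=3.0):
    """List all delays within +/- max_abs_ms that differ from delay_ms
    by a whole number of periods of a tone at freq_hz."""
    period_ms = 1000.0 / freq_hz
    k_min = math.ceil((-max_abs_ms - delay_ms) / period_ms)
    k_max = math.floor((max_abs_ms - delay_ms) / period_ms)
    return [delay_ms + k * period_ms for k in range(k_min, k_max + 1)]


# 500 Hz tone, left ear lagging by 1.5 ms (written as -1.5 ms).
delays = equivalent_delays(500.0, -1.5)

# The candidate closest to zero is the one a zero-weighted system would favour.
closest = min(delays, key=abs)
print(delays, closest)
```

For the 500 Hz case the candidates within range are -1.5, +0.5 and +2.5 ms, and the one closest to zero is +0.5 ms, i.e. the left ear appearing to lead.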
17
Disambiguating phase ambiguity
Narrowband noise at 500 Hz with an ITD of 1.5 ms (3/4 cycle) is heard at the lagging side.
Increasing the noise bandwidth changes the location to the leading side.
Explained by across-frequency consistency of ITD (Jeffress, Trahiotis & Stern).
18
Resolving phase ambiguity
[Figure: cross-correlation peaks for noise delayed in one ear by 1.5 ms; x-axis: delay of cross-correlator (ms), -2.5 to 3.5; y-axis: frequency of auditory filter (Hz), 200 to 800; ear labels L and R]
500 Hz: period = 2 ms. L lags by 1.5 ms or L leads by 0.5 ms?
300 Hz: period = 3.3 ms. L lags by 1.5 ms or L leads by 1.8 ms?
Actual delay: left ear actually lags by 1.5 ms
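The across-frequency argument can be illustrated numerically. Each frequency channel has correlation peaks at the true delay plus or minus whole periods, but only the true delay lines up in every channel. The sketch below idealises each channel's cross-correlation as a cosine and sums across channels; the channel frequencies, the 0.1 ms delay grid and the function names are assumptions made for the example, and the weighting toward zero delay is left out for simplicity.

```python
import math

TRUE_DELAY_MS = -1.5  # negative = left ear lags, as in the slide's sign convention
CHANNELS_HZ = [200, 300, 400, 500, 600, 700, 800]  # hypothetical filter centre frequencies
CANDIDATES_MS = [i / 10 for i in range(-35, 36)]   # cross-correlator delays, -3.5 to 3.5 ms


def channel_correlation(freq_hz, delay_ms):
    """Idealised narrowband cross-correlation: a cosine that peaks at the
    true delay and at every whole period away from it."""
    return math.cos(2 * math.pi * freq_hz * (delay_ms - TRUE_DELAY_MS) / 1000.0)


def summed_correlation(delay_ms):
    """Sum the channel correlations; only a delay shared by all channels scores highly."""
    return sum(channel_correlation(f, delay_ms) for f in CHANNELS_HZ)


# A single 500 Hz channel: several peaks, and the one nearest zero is misleading.
peaks_500 = [d for d in CANDIDATES_MS if channel_correlation(500, d) > 0.999]
single_channel = min(peaks_500, key=abs)

# All channels together: the only delay that is a peak everywhere is the true one.
across_channels = max(CANDIDATES_MS, key=summed_correlation)

print(single_channel, across_channels)
```

Under these assumptions the single 500 Hz channel points to +0.5 ms (left leading), while the summed correlation recovers -1.5 ms (left lagging), which mirrors the shift from the lagging to the leading side as bandwidth increases.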
19
Segregation by onset-time
[Figure: two panels, Synchronous and Asynchronous; y-axis: frequency (Hz), 200 to 800; x-axis: duration (ms), ticks at 0, 400 and at 0, 80, 400]
ITD: ±1.5 ms (3/4 cycle at 500 Hz)
20
Segregated tone changes location
[Figure: pointer IID (dB), -20 to 20, labelled R and L, against onset asynchrony (ms), 0 to 80; curves: Pure, Complex]
21
Segregation by mistuning
[Figure: two panels, In tune and Mistuned; y-axis: frequency (Hz), 200 to 800; x-axis: duration (ms), ticks at 0, 400 and at 0, 80, 400]
22
Mistuned tone changes location
23
Interim Summary
ITD is ineffective for simultaneous segregation
Integration of ITD across frequency is influenced by grouping cues
Question: Can attention be directed on the basis of ITD to grouped objects?
24
Attending to one sentence
"Could you please write the word dog down now" …dog...
"You’ll also hear the sound bird this time"
25
Continuity of attention expt
26
Continuity of Fo vs ITD
Fo differences: 0, 1, 2, 4 semitones
ITD differences: ±45, 91, 181 µs
Normal: Fo & ITD work together
Switched: Fo & ITD opposed
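To give a feel for the size of these Fo differences: in equal temperament, a difference of n semitones corresponds to a frequency ratio of 2^(n/12). The sketch below converts the listed semitone differences into ratios and example frequencies. The 100 Hz base Fo is purely hypothetical and is not taken from the talk.

```python
def semitone_ratio(semitones):
    """Frequency ratio for an equal-tempered interval of `semitones`."""
    return 2.0 ** (semitones / 12.0)


BASE_FO_HZ = 100.0  # hypothetical base Fo, chosen only for illustration

for n in (0, 1, 2, 4):
    ratio = semitone_ratio(n)
    print(f"{n} semitones: ratio {ratio:.4f}, {BASE_FO_HZ:.0f} Hz vs {BASE_FO_HZ * ratio:.1f} Hz")
```

Even the largest difference used, 4 semitones, is a ratio of about 1.26, so the two voices stay fairly close in pitch.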
27
Monotone Fo continuity ineffective
28
Continuity of ITD very effective
29
Summary
ITD is ineffective for simultaneous grouping
ITD provides good spatial separation for grouped objects
Monotone pitch contours are ineffective for source continuity
30
New questions Reverberation? Natural prosody? Talker differences?
31
Simulated reverberant room
32
Reverberation impairs ITD
33
Natural prosodic contours
34
Natural prosody good against reverb
35
Vocal tract change
Me (m)
Higher pitch
Shorter vocal tract (higher formants)
Both (-> f)
36
Vocal tract good against reverb
Effect of reverberation on relative strength of ITD, prosody and vocal tract
[Figure: change in % correct by ITD when opposed by prosody, 0 to 100; conditions: Fo together, Fo original, Fo apart, Fo original + VT; RT60 = 0 vs RT60 = 0.5 s; ITD = ±91 µs]
37
Shadowing sentences
Jemma felt stiff and tired after 3 hours in the hot and stuffy room and she would have liked || …to go outdoors for a breath of fresh air
We had spent our entire time from Cairo to Luxor in a tiny bus with no proper windows and really wanted || …the air conditioning to be switched on
…liked the air conditioning...
38
Shadowing results
[Figure: switches (against ITD) in shadowing (%), 0 to 50; Normal vs Swapped; Same VT vs Different VT; ITD = ±91 µs; conditions: +ITD +Prosody; +ITD +Prosody +Vocal Tract; +ITD -Prosody; +ITD -Prosody -Vocal Tract; significance labels p<0.05 and p<0.002]
39
Summary
ITD no good for simultaneous grouping
…but great for locating grouped objects
ITD messed up by reverberation
Prosody and speaker characteristics less messed up by reverberation