Visual calibration, uncertainty, and attention in spatial auditory processing Norbert Kopčo Dept. of Cybernetics and AI, Technical University of Košice,

1 Visual calibration, uncertainty, and attention in spatial auditory processing Norbert Kopčo Dept. of Cybernetics and AI, Technical University of Košice, Košice, Slovakia Dept. of Cognitive and Neural Systems, Boston University

2 Auditory Processing in Complex Scenes Sound source determination, segregation, grouping (Yost, 1992) Auditory scene analysis (Bregman, 1990) Factors (Yost, 1994): - Spectral separation - Spectral profile - Harmonicity - Spatial separation - Temporal separation - Temporal onsets and offsets - Temporal modulations - Visual information

3 Auditory Pathway and Spatial Hearing (Kandel, Schwartz & Jessell, 2000) Cochlea: peripheral filtering and neural coding Olivary complex: extraction of binaural spatial information Midbrain (inferior colliculus): integration, modulation detection Auditory cortex: auditory object formation, figure/ground separation, auditory scene analysis (ASA), attentional factors

4 OUTLINE Three behavioral experiments (and computational considerations): 1. Spatial specificity and coordinate frame of visual recalibration (ventriloquism aftereffect) in humans and monkeys 2. How a priori information / uncertainty about masker locations influences localizability of a target talker in a multi-talker environment 3. "What and where" model of cortical processing of auditory distance information

5 Visually-induced auditory spatial adaptation in monkeys and humans Norbert Kopčo, I-Fan Lin, Barbara Shinn-Cunningham, Jennifer Groh Center for Cognitive Neuroscience, Duke University Hearing Research Center, Boston University Technical University, Košice, Slovakia

6 Introduction Visual stimuli can affect the perception of sound location, e.g., the ventriloquism effect. [Illustration: "Way to go Red Sox!"] But does the effect persist?

7 Introduction Visual stimuli can affect the perception of sound location, e.g., the ventriloquism effect. [Illustration: "Way to go Red Sox!"] But does the effect persist? - barn owls: prism adaptation (Knudsen et al.) - monkeys: “ventriloquism aftereffect” (Woods and Recanzone, Curr. Biol. 2004)

8 GOALS 1. Ventriloquism “aftereffect” in a saccade task, in monkeys and humans? 2. Can a “spatially specific” aftereffect be induced? - Zwiers et al. (2003): prism adaptation in humans generalizes 3. Reference frame of plasticity? - Visual, auditory, or oculomotor reference frame?

9 Methods Basic idea: 1. Pre-adaptation baseline: Measure auditory saccade accuracy 2. Adaptation phase: Present combined visual-auditory stimuli, with visual location shifted 3. Compare auditory saccade accuracy pre- and post-adaptation

10 Q1: Does it work? Initial experiment Design: Monkey. Pre-adaptation baseline: ~100 auditory-only trials. Adaptation phase: 80% V-A stimuli, with the visual stimulus shifted 6 deg. left or right; 20% auditory-only. Compare auditory-only trials from the adaptation phase to the pre-adaptation phase. Sounds: loudspeakers. Visual stimuli: LEDs.
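
The comparison described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `aftereffect_percent`, the sign convention (positive = rightward), and all numbers are hypothetical and are not data from the study. The sketch expresses the induced shift in auditory-only responses as a percentage of the visual displacement.

```python
from statistics import mean

def aftereffect_percent(pre_errors_deg, post_errors_deg, visual_shift_deg):
    """Return the induced shift as a percentage of the visual displacement.

    Errors are signed (response minus true speaker azimuth), in degrees;
    positive means rightward, and visual_shift_deg uses the same sign.
    """
    induced_shift = mean(post_errors_deg) - mean(pre_errors_deg)
    return 100.0 * induced_shift / visual_shift_deg

# Made-up numbers, for illustration only (not data from the study).
pre = [0.5, -0.5, 1.0, 0.0]    # baseline auditory-only saccade errors
post = [3.0, 2.5, 3.5, 3.0]    # auditory-only errors during adaptation
print(aftereffect_percent(pre, post, visual_shift_deg=6.0))
```

Normalizing by the visual displacement makes leftward and rightward adaptation sessions directly comparable.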

11 RESULTS

12 RESULTS

13 RESULTS

14 RESULTS

15 Q2: specificity

16 Q3: reference frame Eye-centered? Head (ear)-centered? Oculomotor? 3a. What is the reference frame? 3b. Is the reference frame the same for humans and monkeys?

17 Method Fix head to face 0°. Induce shift: - in only one region of space - from a single fixation point Test to see if shift generalizes to the same sub-region in: - head-centered space - eye-centered space [Figure: audiovisual display (FP, LEDs, speakers) and expected behavior; axes: Stimulus Location (°), Magnitude (°)]

18 Predictions for reference frame
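
The logic of the two competing predictions can be written down as a small sketch. Everything here is an assumption made for illustration: the Gaussian shape of the shift profile, its peak and width, and the function names are hypothetical and do not come from the slides. In a head-centered frame the shifted region stays at the same speaker azimuth when fixation moves; in an eye-centered frame it moves with the eyes.

```python
import math

def shift_profile(azimuth_deg, center_deg, peak_deg=3.0, width_deg=10.0):
    """Hypothetical Gaussian-shaped aftereffect around the trained region."""
    return peak_deg * math.exp(-0.5 * ((azimuth_deg - center_deg) / width_deg) ** 2)

def predicted_shift(azimuth_deg, trained_center_deg, trained_fp_deg, test_fp_deg, frame):
    """Predicted shift at a head-centered speaker azimuth for a given fixation point."""
    if frame == "head":
        # The adapted region is fixed relative to the head (and ears).
        center = trained_center_deg
    elif frame == "eye":
        # The adapted region is fixed relative to the eyes, so it moves with fixation.
        center = trained_center_deg + (test_fp_deg - trained_fp_deg)
    else:
        raise ValueError("frame must be 'head' or 'eye'")
    return shift_profile(azimuth_deg, center)

# Trained region at +10 deg with fixation at 0 deg; test with fixation at -20 deg.
for azimuth in (-10, 0, 10):
    head = predicted_shift(azimuth, 10, 0, -20, "head")
    eye = predicted_shift(azimuth, 10, 0, -20, "eye")
    print(azimuth, round(head, 3), round(eye, 3))
```

A mixed representation would fall between the two predictions, for example as a weighted average of the two profiles.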

19 Results: Humans Trained FP, A-only responses: - Shift induced in trained sub-region - Some generalization to untrained regions -> Spatial specificity Shifted FP, A-only responses: - Shift reduced and moved to left -> Mixture of head-centered and eye-centered representation [Figure: audiovisual display (FP, LEDs, speakers); expected responses for head-centered or eye-centered frame]

20 Results: Monkeys Trained FP, A-only responses: - Shift induced in trained sub-region -> Spatial specificity - Generalization to untrained regions (asymmetrical) Shifted FP, A-only responses: - Shift reduced and moved to left -> Representation more mixed than in humans [Figure: audiovisual display (FP, LEDs, speakers); expected responses for head-centered or eye-centered frame]

21 Summary The main results are consistent across species: A locally induced ventriloquism effect results in short-term adaptation, causing 30-to-50% shifts in responses to A-only stimuli from the trained sub-region. The induced shift generalizes outside the trained sub-region, with gradually decreasing strength -> spatially specific. The pattern of induced shift changes as the eyes move. The representation appears to be mixed, in a frame that is not purely head-centered or eye-centered.

22 Discussion Neural adaptation could have been induced at several stages along the pathway. Future work: Examine temporal and spatial factors influencing the eye-centered modulation. Look at other trained sub-regions. [Figure labels: Posterior Parietal Cortex, Cerebrum, Thalamus, Midbrain, Pons; figures from (Kandel, Schwartz, Jessell) and (Purves)]

23 Localizing a speech target in a multitalker mixture Norbert Kopčo (1), Virginia Best (2), and Simon Carlile (2) (1) Technical University of Košice, Košice, Slovakia; Hearing Research Center, Boston University (2) School of Medical Sciences, University of Sydney, Sydney, Australia

24 Introduction Spatial separation of sources enhances speech perception. In complex environments (e.g., with multiple talkers), spatial perception is also important for “sorting” the acoustic scene into objects and focusing attention on sources of interest (Brungart et al., 2001; Freyman et al., 1999; Kidd et al., 2005; Best et al., 2007; Shinn-Cunningham, 2008). Relatively few studies have actually measured localization of speech in a multitalker environment (Yost et al., 1996; Hawley et al., 1999; Drullman and Bronkhorst, 2000; Brungart et al., 2006)

25 Experiment and Goals Study horizontal localization of speech in a multitalker environment. Question 1: How does the presence of maskers influence localization performance? Evaluate the effect of maskers on biases/variability in responses. Question 2: Is performance affected by a priori knowledge / uncertainty about the distribution of masker locations? Compare performance when the masker distribution is fixed vs. varied from trial to trial. Hypothesis: masker location uncertainty will hurt performance.

26 Setup and masker patterns

27 Methods Stimuli: Target: word “two” spoken by a female talker. Maskers: 4 different monosyllabic words, spoken by 4 male talkers (all longer than the target). Target-to-Masker energy ratios (TMR): 0 dB or -5 dB. Task: Subjects pointed their head to the perceived target location. Subjects were asked to indicate a location only if the target was heard (5 catch trials with no target per block to monitor compliance). Conditions (separate blocks): - Control: No masker - Fixed: Masker pattern fixed across block of trials - Mixed: Masker pattern randomly chosen for each trial
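
As a reminder of what the two TMR values mean, here is a minimal sketch of setting a target-to-masker energy ratio. It assumes the ratio is defined against a single masker waveform as 10*log10(E_target / E_masker); the slide does not say whether the ratio was set per masker or against the whole mixture. The function names and the toy waveforms are hypothetical.

```python
import math

def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)

def tmr_db(target, masker):
    """Target-to-masker energy ratio in decibels."""
    return 10.0 * math.log10(energy(target) / energy(masker))

def scale_masker_to_tmr(target, masker, desired_tmr_db):
    """Return a scaled copy of masker so that tmr_db(target, scaled) equals desired_tmr_db."""
    gain = math.sqrt(energy(target) / (energy(masker) * 10.0 ** (desired_tmr_db / 10.0)))
    return [gain * x for x in masker]

target = [1.0, -1.0, 1.0, -1.0]   # toy waveforms, not speech
masker = [0.5, 0.5, -0.5, -0.5]
for level in (0.0, -5.0):
    scaled = scale_masker_to_tmr(target, masker, level)
    print(level, round(tmr_db(target, scaled), 6))
```

At -5 dB the masker carries roughly three times the energy of the target, which is consistent with detection being harder at the lower TMR.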

28 Detection Detection worse at lower TMR, similar in both uncertainty conditions

29 Effect of uncertainty Averaged across patterns and target locations, a priori knowledge does not help [Axis label: masker uncertainty helps / hurts]

30 Effect of uncertainty When looking only at off-masker locations, a priori knowledge does help [Axis label: masker uncertainty helps / hurts]

31 Effect of uncertainty When looking only at on-masker locations, a priori knowledge hurts performance [Axis label: masker uncertainty helps / hurts]

32 Interim summary A priori knowledge of masker locations influences target talker localizability: - Improving performance at locations from which (the subject knows) no masker can come - Degrading performance at locations from which (the subject knows) maskers will come Why does more information cause worse performance? Possible mechanism: - Redistribution of processing resources - “incorrect” strategy: focusing only on off-masker locations Analyze means and standard deviations of responses to gain more insight into behavior
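
The analysis of means and standard deviations mentioned above can be sketched as follows. This is an assumed, simplified version: the function name `bias_and_sd`, the trial format, and the numbers are hypothetical, and the actual analysis in the study may have been organized differently (for example, per masker pattern and TMR).

```python
from collections import defaultdict
from statistics import mean, stdev

def bias_and_sd(trials):
    """Group signed localization errors by target azimuth.

    trials: iterable of (target_azimuth_deg, response_azimuth_deg).
    Returns {target: (bias, standard deviation)}; each target needs
    at least two responses for the standard deviation to be defined.
    """
    errors = defaultdict(list)
    for target, response in trials:
        errors[target].append(response - target)
    return {t: (mean(e), stdev(e)) for t, e in sorted(errors.items())}

# Made-up responses showing compression toward the center.
trials = [(-40, -35), (-40, -37), (40, 34), (40, 36)]
for target, (bias, sd) in bias_and_sd(trials).items():
    print(target, bias, round(sd, 3))
```

With azimuth increasing to the right, a positive bias for left targets together with a negative bias for right targets is the signature of response compression.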

33 Mean Responses: Pattern 1 Compression of responses to peripheral targets [Axis label: bias leftward / rightward]

34 Bias due to Maskers Compression strongest for targets near peripheral maskers [Axis label: bias leftward / rightward]

35 Masker Uncertainty and Bias When masker pattern fixed throughout a block, responses biased away from maskers [Axis label: bias due to masker uncertainty, left / right]

36 Response Variability Complex effect of target location, masking pattern, uncertainty and TMR

37 Uncertainty and Response Variability Patterns 1-3 (grouped maskers): Averaged across locations, no Mixed-Fixed difference. If not averaged, fixing the pattern - helps for off-masker targets - hurts for on-masker targets [Axis label: masker uncertainty helps / hurts]

38 Uncertainty and Response Variability Patterns 1-3 (grouped maskers): Averaged across locations, no Mixed-Fixed difference. If not averaged, fixing the pattern - helps for off-masker targets - hurts for on-masker targets [Axis label: masker uncertainty helps / hurts] Patterns 4-5 (distributed maskers): Effect of uncertainty independent of location or TMR - pattern 4: uncertainty hurts performance - pattern 5: uncertainty helps performance

39 Summary 1. The mixture has complex effects on localization bias and variability - depending on masker pattern, location of target relative to maskers, and TMR - compression of mean localization responses near peripheral maskers - increases in standard deviations, in particular when maskers are distributed 2. Trial-to-trial randomization in the distribution of speech maskers (i.e., masker location uncertainty) modulates the effect of masking: - sometimes exaggerating it (as expected) - but sometimes reducing it (unexpected) These modulatory effects are likely to be due to - change in strategy / assignment of resources: focusing on off-masker locations in the fixed condition - alternative: adaptation

40 Computational approach - Modeling Faller and Merimaa (2004) Source localization in complex listening situations: Selection of binaural cues based on interaural coherence. Journal of the Acoustical Society of America. - computational “effective signal processing” model - computes cross-correlation between the ear inputs - only considers moments when the cross-correlation peaks = when only one source is present and the remaining sources are silent - creates probability distributions of ITD/ILD at these “clean look” moments - “decision maker” selects which source to process
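
The cue-selection idea listed on this slide can be illustrated with a heavily simplified sketch. It is not the published model: it omits the peripheral filterbank and ILD cues, and it replaces the running cross-correlation with fixed-length frames. The function names, frame length, lag range, and threshold are assumptions chosen for illustration. The sketch keeps an ITD estimate (in samples) only from frames in which the maximum of the normalized interaural cross-correlation exceeds a threshold.

```python
import math
import random

def delay(signal, d):
    """Delay a signal by d >= 0 samples, zero-padding the start."""
    return [0.0] * d + list(signal[:len(signal) - d])

def normalized_xcorr(left, right, lag):
    """Normalized cross-correlation of two equal-length frames at an integer lag.

    A positive lag pairs left[i] with right[i + lag], i.e. the right ear lags.
    """
    n = len(left)
    if lag >= 0:
        pairs = [(left[i], right[i + lag]) for i in range(n - lag)]
    else:
        pairs = [(left[i - lag], right[i]) for i in range(n + lag)]
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs) * sum(b * b for _, b in pairs))
    return num / den if den > 0 else 0.0

def coherence_and_itd(left, right, max_lag):
    """Return (interaural coherence, lag in samples at which it occurs)."""
    lags = range(-max_lag, max_lag + 1)
    best = max(lags, key=lambda lag: normalized_xcorr(left, right, lag))
    return normalized_xcorr(left, right, best), best

def select_itd_cues(left, right, frame_len, max_lag, threshold):
    """Keep ITD estimates only from frames whose coherence reaches the threshold."""
    cues = []
    for start in range(0, len(left) - frame_len + 1, frame_len):
        l_frame = left[start:start + frame_len]
        r_frame = right[start:start + frame_len]
        coherence, lag = coherence_and_itd(l_frame, r_frame, max_lag)
        if coherence >= threshold:
            cues.append(lag)
    return cues

rng = random.Random(0)
s1 = [rng.gauss(0, 1) for _ in range(512)]
s2 = [rng.gauss(0, 1) for _ in range(512)]
# One source alone: the right-ear signal is a 3-sample delayed copy of the left.
print(select_itd_cues(s1, delay(s1, 3), frame_len=128, max_lag=8, threshold=0.95))
# Two concurrent sources with opposite ITDs: coherence drops in every frame.
mix_left = [a + b for a, b in zip(s1, delay(s2, 3))]
mix_right = [a + b for a, b in zip(delay(s1, 3), s2)]
print(select_itd_cues(mix_left, mix_right, frame_len=128, max_lag=8, threshold=0.95))
```

With a single source every frame is a “clean look”, whereas fully overlapping sources leave few or no frames above threshold, which is the situation created by concurrent monosyllabic words.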

41 Computational approach - Modeling Faller and Merimaa (2004) Source localization in complex listening situations: Selection of binaural cues based on interaural coherence. Journal of the Acoustical Society of America. Example of distribution of representations with and without the criterion.

42 Computational approach - Modeling Faller and Merimaa (2004) Source localization in complex listening situations: Selection of binaural cues based on interaural coherence. Journal of the Acoustical Society of America. Implications for our data: - This model can be a good starting point, but - The model would fail because no “good looks” are available (concurrent monosyllabic words) - No mechanism in the model to describe how listeners use attention in this task: - how attention is directed towards the target - if/how “a priori” knowledge of masker locations is used - how uncertainty about masker locations influences behavior

43 Cortical representation of auditory distance Norbert Kopčo (1), Jyrki Ahveninen (2) (1) TU Košice, Slovakia (2) Martinos Center for Biomedical Imaging, Harvard Medical School

44 Hypothesis about auditory processing in the cortex There are two parallel pathways that process information about the content (what) and the spatial location (where) of sounds. The organization of auditory information may therefore be similar to that in the visual cortex, where the what/where organization is generally accepted. Problems: - these conclusions are based mainly on animal studies - in humans, some phonetically sensitive areas lie on the wrong side of A1 Rauschecker (2005)

45 fMRI activations of cortical areas Activation of the auditory cortex during presentation of stimuli with varying distance. Primary auditory cortex activated. The “parietal” branch activated: the “What & Where” model fits. Visual area MT, responsible for the visual representation of motion, also activated. Next steps: A tool for testing models of distance-information processing. Kopčo et al. (ARO, 2011)

46 Overall Summary Three behavioral experiments examined sound localization in the horizontal (left-right) dimension: Exp. 1. In both humans and monkeys: - it is possible to induce a spatially specific ventriloquism aftereffect - visually guided recalibration occurs in a mixed coordinate frame Exp. 2. A priori information affects talker localization in a multi-talker mixture - Listeners redistribute resources from masked to non-masked locations Exp. 3. Auditory distance representation is consistent with the "what and where" model. Exp. 1 & 2 published in J Neurosci and J Acoust Soc Am, Exp. 3 in preparation

47 Current Projects Continuation of the projects described above. Spatial auditory attention and speech perception. Contextual Plasticity in Sound Localization (US NIH). Perceptual and cross-modal learning in auditory distance perception (Marie Curie Project, EU FP7)

48 Collaborators and Support Beata Tomoriová, Rudolf Andoga, Luboš Hládek – TU Košice Virginia Best, Barbara Shinn-Cunningham – Boston University Jennifer Groh – Duke University Simon Carlile – University of Sydney Aaron Seitz – University of California, Riverside Jyrki Ahveninen – Harvard Medical School Support by: US National Institutes of Health Human Frontiers Science Program Slovak Science Grant Agency US National Academies of Science Twinning Program

49 Introduction Two sounds can perceptually interact even when they do not overlap in time or frequency. In a previous study of auditory “cuing” (Kopco et al., 2001, 2003), we found that a preceding sound had a strong effect on the perceived location of a target. The current study was designed to examine these interactions in detail, and to determine if these interactions were due to - acoustic effects of the room - low-level neural processing effects (adaptation) - higher level processing (perceptual organization, strategy)

50 Methods We measured azimuthal localization performance for a click target stimulus when preceded by another identical click (coming from a known location): - presented with a short onset asynchrony - from a different azimuthal location - in either an anechoic chamber or an ordinary classroom Performance was compared to that in a control in which there was no preceding distractor click.

51 Stimuli and setup

52 Contextual task-dependent plasticity Responses are biased away from the distractor even on trials on which the target is not preceded by a distractor

53 Click vs. Click-Click: Effect of the Preceding Distractor Perceived target location of frontal sources is “attracted” by lateral distractors. Interactions for SOAs up to 200 ms. Possible mechanisms: - Precedence effect - Short-term adaptation - Grouping of identical target and distractor

54 Click vs Click-Click vs. Click-Click-Click-Click-Click: Effect of Streaming on Perceived Target Azimuth Replacing the single-click distractor by a 5-Hz click-train such that the target does not fall into the temporal pattern of the distractor eliminates the “attractive” bias.
