Dynamic Aspects of the Cocktail Party Listening Problem Douglas S. Brungart Air Force Research Laboratory.


2 Dynamic Aspects of the Cocktail Party Listening Problem Douglas S. Brungart Air Force Research Laboratory

3 2 Credits AFOSR Sponsored Research Team: Brian Simpson Alex Kordik Rich McKinley Mark Ericson Collaborators: Chris Darwin Gerald Kidd

4 3 Introduction 1) Energetic and Informational Masking: Speech in Noise vs Speech in Speech 2) Monaural speech segregation 3) Binaural and Dichotic speech segregation 4) Dynamic aspects of cocktail party problem 5) Audio-Visual cocktail party effects

5 4 Energetic Masking In classic speech-on-noise masking, only one type of masking occurs: Energetic Masking In Energetic Masking: -The masking sound is more intense than the target in one or more critical bands -Some portion of the target signal is inaudible at the periphery
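
The band-wise condition on this slide can be sketched in a few lines. This is only an illustration, assuming the target and masker levels are already available per critical band in dB; the function name and the level values are hypothetical and do not come from the presentation.

```python
def energetically_masked_bands(target_db, masker_db):
    """Return indices of bands where the masker is more intense than the target."""
    if len(target_db) != len(masker_db):
        raise ValueError("target and masker must have the same number of bands")
    return [i for i, (t, m) in enumerate(zip(target_db, masker_db)) if m > t]


# Hypothetical per-band levels in dB (illustrative numbers, not measured data).
target = [60.0, 55.0, 50.0, 45.0]
masker = [50.0, 58.0, 49.0, 52.0]

masked = energetically_masked_bands(target, masker)
print(masked)  # bands where some portion of the target may be inaudible
```

A real analysis would use an auditory filterbank and a masking threshold rather than a plain level comparison; the sketch only mirrors the slide's "more intense in one or more critical bands" wording.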

6 5 Energetic Masking Articulation Theory Energetic masking in speech was studied for years by Fletcher and others at Bell Labs -Articulation Theory -Articulation Index (AI) Allows accurate prediction of intelligibility: -For any phonetically balanced vocabulary -For any continuous noise source -Plus numerous correction factors High-Amplitudes, Reverb, Peak-Clipping, etc.
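
As a rough illustration of how an articulation-index style prediction works, the sketch below uses a commonly cited simplified form: each band's signal-to-noise ratio is mapped onto a 30 dB range (from -12 dB to +18 dB), clipped to 0..1, and combined with band-importance weights. This form, the equal weights, and the SNR values are assumptions for illustration; the slides do not give a formula, and the correction factors they mention are not modeled.

```python
def articulation_index(band_snr_db, band_weights):
    """Simplified AI: importance-weighted sum of clipped, rescaled band SNRs."""
    if len(band_snr_db) != len(band_weights):
        raise ValueError("need one weight per band")
    total_weight = sum(band_weights)
    ai = 0.0
    for snr, weight in zip(band_snr_db, band_weights):
        # Map SNR from [-12, +18] dB onto [0, 1]; clip outside that range.
        audibility = min(1.0, max(0.0, (snr + 12.0) / 30.0))
        ai += (weight / total_weight) * audibility
    return ai


# Hypothetical band SNRs in dB with equal importance weights.
snrs = [18.0, 3.0, -12.0, -20.0]
weights = [1.0, 1.0, 1.0, 1.0]
print(articulation_index(snrs, weights))
```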

7 6 Informational Masking Energetic Masking also occurs in Speech-on-Speech masking -Where signals overlap within a critical band However, informational masking also occurs: Listeners hear two or more audible sounds, but can’t segregate them into separate messages Classic example: multi-tone complexes - No energetic overlap in stimuli, but substantial masking is observed (Kidd, Neff)

8 7 Methods The Coordinate Response Measure (CRM) Data collected with Coordinate Response Measure (CRM) - Originally developed by Moore & McKinley (1980) - Format: Ready (Call Sign) go to (Color) (Number) now. - Target is indicated by call sign Baron - Maskers indicated by other call signs - Complete CRM corpus is available (Bolia et al., 2001) - 8 Talkers in corpus (4 M, 4 F), 2048 Phrases - 8 Talkers x 4 Colors x 8 Numbers x 8 Call Signs - Embedded call-sign ideal for multitalker studies - Similar to many multichannel monitoring tasks
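
The corpus size quoted on this slide follows directly from the phrase format. The sketch below enumerates the combinations and checks the count. Only the call sign Baron appears in the slides; the other call signs and the four color names are assumptions filled in for illustration, and talkers are represented by plain indices.

```python
from itertools import product

TALKERS = range(8)   # 4 male, 4 female per the slide; indices only
NUMBERS = range(1, 9)
# Assumed lists: only "Baron" is named in the presentation.
CALL_SIGNS = ["Arrow", "Baron", "Charlie", "Eagle", "Hopper", "Laker", "Ringo", "Tiger"]
COLORS = ["blue", "green", "red", "white"]


def crm_phrase(call_sign, color, number):
    """Build one phrase in the CRM sentence frame."""
    return f"Ready {call_sign} go to {color} {number} now."


corpus = [
    (talker, crm_phrase(call_sign, color, number))
    for talker, call_sign, color, number in product(TALKERS, CALL_SIGNS, COLORS, NUMBERS)
]

print(len(corpus))  # 8 talkers x 8 call signs x 4 colors x 8 numbers
print(crm_phrase("Baron", "blue", 3))
```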

9 8 Methods The Coordinate Response Measure Your call sign is Baron. Listeners respond by selecting the appropriate colored digit with the computer mouse

10 9 Methods Pros and Cons of CRM Advantages of CRM: Rapid data collection: training and scoring Sentences are reusable Embedded call sign to designate target - does not require a priori designation Disadvantages of CRM: Limited vocabulary - partially offset by lack of context - not phonetically balanced Not “conversationally” realistic CRM emphasizes “speech on speech” masking

12 11 Methods Pros and Cons of CRM Advantages of CRM: Rapid data collection: training and scoring Sentences are reusable Embedded call sign to designate target - does not require a priori designation Disadvantages of CRM: Limited vocabulary - partially offset by lack of context - not phonetically balanced Not “conversationally” realistic CRM emphasizes “informational” masking

13 12 Two-Talker Diotic Listening Results TM=Mod. Noise Masker TN=Cont. Noise Masker TD=Diff. Sex Masker TS=Same Sex Masker TT=Same Talker Masker

14 13 Two-Talker Diotic Listening Error Distribution Most errors match the color and number spoken by the masking talker…. This is indicative of informational masking
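
An error distribution of this kind can be tallied by comparing each response with the target and masker keywords. The sketch below is one possible scoring scheme, not the one used in the study: the category names, the function, and the trial data are all hypothetical, and it assumes the masker's color and number both differ from the target's.

```python
from collections import Counter


def classify_response(response, target, masker):
    """Label a (color, number) response relative to the target and masker phrases."""
    if response == target:
        return "correct"
    if response == masker:
        return "masker"  # both keywords taken from the masking talker
    color_known = response[0] in (target[0], masker[0])
    number_known = response[1] in (target[1], masker[1])
    if color_known and number_known:
        return "mixed"   # one keyword from each talker
    return "other"       # at least one keyword spoken by neither talker


# Hypothetical trials: (response, target, masker), each a (color, number) pair.
trials = [
    (("blue", 3), ("blue", 3), ("red", 5)),
    (("red", 5), ("blue", 3), ("red", 5)),
    (("red", 5), ("blue", 3), ("red", 5)),
    (("blue", 5), ("blue", 3), ("red", 5)),
    (("green", 7), ("blue", 3), ("red", 5)),
]

distribution = Counter(classify_response(r, t, m) for r, t, m in trials)
print(distribution)
```

A large share of "masker" responses relative to "other" is the pattern the slide describes as indicative of informational masking.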

15 14 Three-Talker Diotic Listening Results T=Target Talker M=Mod. Noise Masker D=Diff. Sex Masker S=Same Sex Masker T=Same Talker Masker

16 15 Four-Talker Diotic Listening Results T=Target Talker M=Mod. Noise Masker D=Diff. Sex Masker S=Same Sex Masker T=Same Talker Masker

17 16 3-4 Talker Listening Results

18 17 Dichotic Listening Introduction To this point, all stimuli have been diotic Spatial separation is known to play a role - Cherry’s “Cocktail Party Problem” Dichotic masking is pure informational masking - No contralateral energetic masking occurs Previous results have suggested: - Almost perfect segregation across ears - Cherry, Broadbent, Treisman, Kidd, Neff, etc.

19 18 Dichotic Listening Procedure Dichotic listening procedure was similar to the earlier procedure, but: 1) Talkers were known a priori - 1 male, 1 female target talker 2) 2 Talkers presented in right ear (T and M) 3) Masking signal was presented in left ear

20 19 Dichotic Listening Results With 2 talkers in right ear… Noise in left ear doesn’t interfere (Even when Loud) Speech interferes substantially… (Even when Quiet) Reversed Speech interferes… but only when target in right ear lower than masker in right ear

21 20 Binaural Listening Spatial Separation in Azimuth From the classic “cocktail party effect” Spatial separation improves segregation Diotic vs. 45˚ Separation, same-sex talkers

22 21 Binaural Listening Spatial Separation in Distance

23 22 Binaural Listening Spatial Separation in Distance With Natural Better-Ear SNR Cues, Both speech and noise Benefit from separation in distance

24 23 Binaural Listening Spatial Separation in Distance With normalization, speech is Better but Noise is not

25 24 Dynamic Aspects of Multitalker Listening Most Cocktail-Party Listening Experiments assume either 1) Target talker is known (“Selective Attention”) or 2) Target talker is unknown (“Divided Attention”) Real world listening falls in between these extremes - Attention focused primarily on one talker - Other talkers monitored for “important” info How do listeners adapt to conversational dynamics?

26 25 Dynamic Cocktail Party Effects Multitalker Transition Probability Experiment: 3-Talker Condition 1) Standard CRM task 2) 2, 3, or 4 Spatially Separated Same-Sex Talkers - Close or Far separation for 2 and 3 talkers 3) 5 Transition Probabilities (0-1) 4) 3 Talker Configurations - Talkers selected randomly - Each location assigned a talker - Target talker follows target location 5) Total of 106,200 Trials - Balanced by Target Talker and Target Location
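
One way to picture the transition-probability manipulation is as a sequence of target locations in which the target moves with a fixed probability on each trial. The slides do not define the term precisely, so the sketch below assumes it is the per-trial probability that the target jumps to a different location; the function name, the seed, and the uniform choice among the other locations are all illustrative assumptions.

```python
import random


def target_location_sequence(n_trials, n_locations, p_transition, seed=0):
    """Generate target locations; the target moves with probability p_transition."""
    if n_trials < 1 or n_locations < 2:
        raise ValueError("need at least 1 trial and 2 locations")
    rng = random.Random(seed)
    location = rng.randrange(n_locations)
    sequence = [location]
    for _ in range(n_trials - 1):
        if rng.random() < p_transition:
            # Move to one of the other locations, chosen uniformly (assumption).
            others = [loc for loc in range(n_locations) if loc != location]
            location = rng.choice(others)
        sequence.append(location)
    return sequence


# Three spatially separated talkers, target moves on roughly a quarter of trials.
print(target_location_sequence(n_trials=20, n_locations=3, p_transition=0.25))
```

With `p_transition=0` the target stays put for the whole block, and with `p_transition=1` it moves on every trial, matching the two ends of the 0-1 range listed on the slide.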

27 26 Dynamic Cocktail Party Effects Multitalker Transition Probability Overall Performance Improves Gradually After Transitions

28 27 Conclusions ? 1) Speech-on-Speech ≠ Speech-in-Noise - Deployment of Auditory Attention is Important - Signal “similarity” is a major factor - Spatial separation is particularly beneficial 2) Multitalker Listening is a Dynamic Process - Listeners adapt to source location changes over 5-8 trials - Listeners learn new situations quickly (10 trials) - Listeners adopt optimal listening strategies

