speech played several metres from the listener in a room seems to have the same phonetic content as when played nearby - that is, perception is constant
however, the amount of reverberation varies with distance:
- so the temporal envelopes of speech signals vary considerably, and these envelopes are crucial for identifying the speech
this seems to be an instance of ‘constancy’ (e.g. Watkins & Makin, 2007)
- speech perception ‘takes account’ of the reflections’ level: reverberation in preceding sounds effects a compensation (Watkins, 2005)

real-room reflection patterns:
- taken from an office room, volume = 183.6 m³
- recorded with dummy-head transducers, facing each other
room’s impulse response (RIR) obtained at different distances
- this varies the amount of reflected sound in signals, i.e. the early (50 ms) to late energy ratio: 18 dB at 0.32 m → 2 dB at 10 m, with an A-weighted energy decay rate of 60 dB per 960 ms at 10 m
RIR convolved with ‘dry’ speech recordings
- headphone presentation → monaural ‘real-room’ listening
[figure: speaker and listener positions in the room]
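The early-to-late ratio and the convolution step can be sketched as follows; this is a minimal illustration using a synthetic exponentially decaying RIR, not the measured office-room response (the sampling rate, decay time, and direct-sound gain are arbitrary stand-ins):

```python
import numpy as np

def early_to_late_db(rir, fs, split_ms=50.0):
    """Early (first 50 ms) to late energy ratio of a room impulse
    response, in dB; rir is assumed to start at the direct sound."""
    split = int(fs * split_ms / 1000.0)
    early = np.sum(rir[:split] ** 2)
    late = np.sum(rir[split:] ** 2)
    return 10.0 * np.log10(early / late)

def reverberate(dry, rir):
    """Make a 'wet' signal by convolving dry speech with the RIR."""
    return np.convolve(dry, rir)

# toy RIR: decaying noise with a strong direct sound, as at a near distance
fs = 16000
t = np.arange(int(0.5 * fs)) / fs
rng = np.random.default_rng(0)
rir = rng.standard_normal(t.size) * np.exp(-t / 0.1)
rir[0] = 5.0
db = early_to_late_db(rir, fs)
```

Moving the source further away would shrink the direct sound relative to the decaying tail, lowering this ratio, as in the 18 dB → 2 dB change described above.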

test words: ‘sir’ vs. ‘stir’
distinguished by their temporal envelopes
- notably, the gap (in ‘stir’) before voicing onset
11-step continuum:
- end-point ‘stir’ (step 10), from amplitude modulation of the other end-point, ‘sir’ (step 0)
- the prominent effect of this AM is the gap
- intermediate steps, 1-9, by varying modulation depth
[figure: 200-ms waveforms with the AM function; step 0, ‘sir’ → step 10, ‘stir’]
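The continuum construction can be sketched as interpolation of the AM function’s depth; the signal here is a tone stand-in for the recorded ‘sir’, and the 50-ms gap position is an assumption for illustration, not the timing of the actual stimuli:

```python
import numpy as np

def continuum_step(sir, am_function, step, n_steps=10):
    """One step of the continuum: interpolate the AM function's depth,
    so step 0 returns 'sir' unchanged and step n_steps applies the
    full modulation whose main effect is the gap heard in 'stir'."""
    depth = step / n_steps
    gain = 1.0 - depth * (1.0 - am_function)
    return sir * gain

fs = 16000
t = np.arange(int(0.2 * fs)) / fs          # 200 ms
sir = np.sin(2 * np.pi * 150 * t)          # tone stand-in for recorded 'sir'
am = np.ones(t.size)
am[int(0.05 * fs):int(0.10 * fs)] = 0.0    # assumed gap position before 'voicing'
steps = [continuum_step(sir, am, s) for s in range(11)]
```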

constancy paradigm
apply RIRs to sounds
- vary their distances
listeners identify these test words
- measure the category boundary
‘extrinsic’ context: “next you’ll get _ ”
[figure: mean proportion of ‘sir’ responses vs. continuum step (0-10), with the mean category boundary marked]
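A minimal way to estimate the category boundary from identification data is to interpolate where the proportion of ‘sir’ responses crosses 50%; the response proportions below are made-up illustration values (a logistic fit would be the more usual choice):

```python
import numpy as np

def category_boundary(steps, p_sir):
    """Continuum step at which 'sir' responses cross 50%, by linear
    interpolation between the two surrounding steps."""
    p = np.asarray(p_sir, dtype=float)
    i = int(np.argmax(p < 0.5))        # first step below 50% 'sir'
    x0, x1 = steps[i - 1], steps[i]
    y0, y1 = p[i - 1], p[i]
    return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)

steps = list(range(11))
p_sir = [1.0, 0.98, 0.95, 0.9, 0.8, 0.6, 0.35, 0.15, 0.05, 0.02, 0.0]
boundary = category_boundary(steps, p_sir)  # crosses between steps 5 and 6
```

A shift of this boundary toward higher steps means more ‘sir’ responses overall, which is how the effect of reverberation on the test words is measured.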

[figure: category boundary (steps) for near and far test words, with near, far, none and near, far, noise contexts; second panel: far minus near difference]
increase test-word distance:
- more ‘sir’ responses
- increases the category boundary
- substantial far-near difference
increase the context’s distance:
- reduces the far-near difference
- more ‘constancy’ of test words

[figure: category boundaries and far minus near differences, by context]
Nielsen & Dau (2010):
- noise context: speech-shaped, un-modulated
- behaves like a far context
- similar effect here - why?
idea about modulation masking:
- obscures the gap [t] in ‘stir’
- increases ‘sir’ responses
- mainly from the near contexts, where there’s more modulation
- mostly affects far test-words
less modulation masking from:
- noise contexts
- and from far contexts
hence, the ‘constancy’ effect
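The premise that near (less reverberant) contexts carry more envelope modulation can be illustrated with a crude modulation index, the std/mean of short-frame RMS values; the signals and frame length here are arbitrary stand-ins, not the experimental materials:

```python
import numpy as np

def envelope_modulation_index(x, fs, frame_ms=10.0):
    """Crude modulation measure: std/mean of short-frame RMS values.
    Reverberation smooths temporal envelopes, lowering this index."""
    frame = int(fs * frame_ms / 1000.0)
    n = (len(x) // frame) * frame
    rms = np.sqrt(np.mean(x[:n].reshape(-1, frame) ** 2, axis=1))
    return np.std(rms) / np.mean(rms)

fs = 16000
t = np.arange(fs) / fs
carrier = np.sin(2 * np.pi * 500 * t)
modulated = carrier * (np.sin(2 * np.pi * 4 * t) > 0)  # strongly modulated envelope
flat = carrier * 0.5                                   # unmodulated stand-in
```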

[figure: category boundaries and far minus near differences, by context]
modulation masking → prediction:
- no context - even less masking - even smaller far-near difference
the opposite happens here, so the effect is:
- compensation from the far context
- not masking from the near context
constancy informed by context:
- the cue is the ‘tails’ that reverberation adds at offsets
- near contexts have sharp offsets - clearly, no tails from reverberation
- other contexts (far, noise): no sharp offsets; presence of tails more likely
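The offset-tail cue can be quantified crudely as the time an envelope takes to drop well below its strong portion after the offset; the threshold and the test envelopes below are assumptions for illustration only:

```python
import numpy as np

def offset_tail_ms(env, fs, thresh_db=30.0):
    """Time from the end of an envelope's strong portion until it has
    fallen thresh_db below its peak - a crude offset-'tail' measure;
    sharp offsets give short times, reverberant tails long ones."""
    peak = np.max(env)
    offset = int(np.nonzero(env >= 0.9 * peak)[0][-1])  # end of strong portion
    floor = peak * 10.0 ** (-thresh_db / 20.0)
    below = np.nonzero(env[offset:] < floor)[0]
    end = int(below[0]) if below.size else len(env) - offset
    return 1000.0 * end / fs

fs = 16000
sharp = np.concatenate([np.ones(1600), np.zeros(1600)])   # abrupt offset
decay = np.exp(-np.arange(1600) / (0.05 * fs))            # 50-ms decaying tail
tailed = np.concatenate([np.ones(1600), decay])
```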

[figure: category boundaries and far minus near differences, by context]
context effects are ‘extrinsic’
- are there any intrinsic effects? i.e., from within the test word
‘gating’:
- removes the tail from reverberation at the end of the test-word’s vowel
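Gating itself is simply truncation at the vowel’s offset; a minimal sketch, in which the offset time and the signal are placeholders rather than the experimental values:

```python
import numpy as np

def gate_tail(wet, fs, vowel_offset_s):
    """'Gating': silence everything after the vowel's offset, removing
    the reverberant tail that carries the intrinsic constancy cue."""
    out = np.array(wet, dtype=float)   # copy, leave the input untouched
    out[int(vowel_offset_s * fs):] = 0.0
    return out

fs = 16000
wet = np.ones(fs)                 # stand-in for a reverberated test word
gated = gate_tail(wet, fs, 0.5)   # assumed vowel offset at 0.5 s
```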

1980s drum sound:
- ‘gated reverb.’
- reduces distance info.
- drums → ‘foreground’

[spectrogram: ‘sir’, near; frequency (kHz) vs. time (ms)]

[spectrogram: near → far, adds tails; frequency (kHz) vs. time (ms)]

[spectrogram: far → gated, cuts tails; frequency (kHz) vs. time (200 ms)]

[figure: category boundaries and far minus near differences, by context, with gated test words added]
constancy:
- is reduced by gating, so:
- it’s also informed by intrinsic information that arrives after the consonant
with extrinsic contexts:
- effects of gating are not so apparent
competition: extrinsic > intrinsic

[figure: category boundaries and far minus near differences, by context, with gated test words added]
Nielsen & Dau (2010), some no-context data (with only the far test words):
- compared with near contexts, effects of reverberation were reduced
the same pattern here for constancy:
- on removal of the extrinsic influence, the intrinsic effect emerges

constancy is not effected by modulation masking (or ‘adaptation’):
- this is consistent with data on the detection of sinusoidal AM (Wojtczak & Viemeister, 2005; speech modulation frequencies are too low)
- and with ‘extrinsic’ compensation from preceding contexts in speech (Watkins, 2005; contexts with reversed reverberation don’t → compensation)
there is also some ‘intrinsic’ compensation, from within test words
- informed by tails that arise after the test-word’s consonant
- less influential than the extrinsic compensation from preceding contexts, so its effects only emerge with isolated test-words

[audio demo: harp in / harp out]

[spectrogram: ‘sir’, far; gated; frequency (kHz) vs. time (200 ms)]