CNBH, Physiology Department, Cambridge University www.mrc-cbu.cam.ac.uk/cnbh 2. Experimental procedure The experiment is a 2AFC paradigm design in which.

Slides:



Advertisements
Similar presentations
CNBH, PDN, University of Cambridge Roy Patterson Centre for the Neural Basis of Hearing Department of Physiology, Development and Neuroscience University.
Advertisements

Tom Lentz (slides Ivana Brasileiro)
Chapter 5: Space and Form Form & Pattern Perception: Humans are second to none in processing visual form and pattern information. Our ability to see patterns.
Central Auditory Processing
CNBH, PDN, University of Cambridge Roy Patterson Centre for the Neural Basis of Hearing Department of Physiology, Development and Neuroscience University.
Sounds that “move” Diphthongs, glides and liquids.
SPPA 403 Speech Science1 Unit 3 outline The Vocal Tract (VT) Source-Filter Theory of Speech Production Capturing Speech Dynamics The Vowels The Diphthongs.
Basic Spectrogram & Clinical Application: Consonants
Plasticity, exemplars, and the perceptual equivalence of ‘defective’ and non-defective /r/ realisations Rachael-Anne Knight & Mark J. Jones.
Glides (/w/, /j/) & Liquids (/l/, /r/) Degree of Constriction Greater than vowels – P oral slightly greater than P atmos Less than fricatives – P oral.
Hearing relative phases for two harmonic components D. Timothy Ives 1, H. Martin Reimann 2, Ralph van Dinther 1 and Roy D. Patterson 1 1. Introduction.
Voice quality variation with fundamental frequency in English and Mandarin.
1.0 Introduction Traditional View of phonetic laryngeal contrasts (/t/~/d/, VOICING): F0 drop, F1 drop, pulsing in the gap, CV Ratio, etc. (Kingston et.
“Connecting the dots” How do articulatory processes “map” onto acoustic processes?
Acoustic Characteristics of Vowels
CNBH, Physiology Department, Cambridge University Estimating vocal tract length from formant frequency data using a physical.
Multipitch Tracking for Noisy Speech
Infant sensitivity to distributional information can affect phonetic discrimination Jessica Maye, Janet F. Werker, LouAnn Gerken A brief article from Cognition.
Splice: From vowel offset to vowel onset FIG 3. Example of stimulus spliced from the repetitive syllables. EXPERIMENT 2 (Voicing ID) METHOD Speech materials:
CNBH, PDN, University of Cambridge Roy Patterson Centre for the Neural Basis of Hearing Department of Physiology, Development and Neuroscience University.
“Speech and the Hearing-Impaired Child: Theory and Practice” Ch. 13 Vowels and Diphthongs –Vowels are formed when sound produced at the glottal source.
Nuclear Accent Shape and the Perception of Prominence Rachael-Anne Knight Prosody and Pragmatics 15 th November 2003.
Speech and speaker normalization (in vowel normalization)
Perception of syllable prominence by listeners with and without competence in the tested language Anders Eriksson 1, Esther Grabe 2 & Hartmut Traunmüller.
Vocal Emotion Recognition with Cochlear Implants Xin Luo, Qian-Jie Fu, John J. Galvin III Presentation By Archie Archibong.
CNBH, PDN, University of Cambridge Roy Patterson Centre for the Neural Basis of Hearing Department of Physiology, Development and Neuroscience University.
Development of coarticulatory patterns in spontaneous speech Melinda Fricke Keith Johnson University of California, Berkeley.
Profile of Phoneme Auditory Perception Ability in Children with Hearing Impairment and Phonological Disorders By Manal Mohamed El-Banna (MD) Unit of Phoniatrics,
Interrupted speech perception Su-Hyun Jin, Ph.D. University of Texas & Peggy B. Nelson, Ph.D. University of Minnesota.
Voice Transformations Challenges: Signal processing techniques have advanced faster than our understanding of the physics Examples: – Rate of articulation.
CNBH, Physiology Department, Cambridge University The perception of size and sex in vowel sounds P60 David R. R. Smith and Roy.
Introduction Mel- Frequency Cepstral Coefficients (MFCCs) are quantitative representations of speech and are commonly used to label sound files. They are.
CNBH, Physiology Department, Cambridge University The perception of size in four families of instruments; brass, strings, woodwind.
Phonetics and Phonology
CSD 5400 REHABILITATION PROCEDURES FOR THE HARD OF HEARING Auditory Perception of Speech and the Consequences of Hearing Loss.
Time-Domain Methods for Speech Processing 虞台文. Contents Introduction Time-Dependent Processing of Speech Short-Time Energy and Average Magnitude Short-Time.
Speech Perception 4/6/00 Acoustic-Perceptual Invariance in Speech Perceptual Constancy or Perceptual Invariance: –Perpetual constancy is necessary, however,
Chapter 4 Opener. Figure 4.1 A testing booth set up for the head-turn preference paradigm.
Acoustic Phonetics 3/9/00. Acoustic Theory of Speech Production Modeling the vocal tract –Modeling= the construction of some replica of the actual physical.
Una Y. Chow Stephen J. Winters Alberta Conference on Linguistics November 1, 2014.
1 Speech Perception 3/30/00. 2 Speech Perception How do we perceive speech? –Multifaceted process –Not fully understood –Models & theories attempt to.
Jiwon Hwang Department of Linguistics, Stony Brook University Factors inducing cross-linguistic perception of illusory vowels BACKGROUND.
Speech Science Fall 2009 Oct 26, Consonants Resonant Consonants They are produced in a similar way as vowels i.e., filtering the complex wave produced.
Adaptive Design of Speech Sound Systems Randy Diehl In collaboration with Bjőrn Lindblom, Carl Creeger, Lori Holt, and Andrew Lotto.
Speech Science Fall 2009 Oct 28, Outline Acoustical characteristics of Nasal Speech Sounds Stop Consonants Fricatives Affricates.
♥♥♥♥ 1. Intro. 2. VTS Var.. 3. Method 4. Results 5. Concl. ♠♠ ◄◄ ►► 1/181. Intro.2. VTS Var..3. Method4. Results5. Concl ♠♠◄◄►► IIT Bombay NCC 2011 : 17.
LATERALIZATION OF PHONOLOGY 2 DAY 23 – OCT 21, 2013 Brain & Language LING NSCI Harry Howard Tulane University.
SEPARATION OF CO-OCCURRING SYLLABLES: SEQUENTIAL AND SIMULTANEOUS GROUPING or CAN SCHEMATA OVERRULE PRIMITIVE GROUPING CUES IN SPEECH PERCEPTION? William.
The New Normal: Goodness Judgments of Non-Invariant Speech Julia Drouin, Speech, Language and Hearing Sciences & Psychology, Dr.
Temporal masking of spectrally reduced speech: psychoacoustical experiments and links with ASR Frédéric Berthommier and Angélique Grosgeorges ICP 46 av.
Neurophysiologic correlates of cross-language phonetic perception LING 7912 Professor Nina Kazanina.
IIT Bombay 14 th National Conference on Communications, 1-3 Feb. 2008, IIT Bombay, Mumbai, India 1/27 Intro.Intro.
Sound Waveforms Neil E. Cotter Associate Professor (Lecturer) ECE Department University of Utah CONCEPT U AL TOOLS.
Scaling Studies of Perceived Source Width Juha Merimaa Institut für Kommunikationsakustik Ruhr-Universität Bochum.
Katherine Morrow, Sarah Williams, and Chang Liu Department of Communication Sciences and Disorders The University of Texas at Austin, Austin, TX
Speech Perception.
Nuclear Accent Shape and the Perception of Syllable Pitch Rachael-Anne Knight LAGB 16 April 2003.
Detection of Vowel Onset Point in Speech S.R. Mahadeva Prasanna & Jinu Mariam Zachariah Department of Computer Science & Engineering Indian Institute.
IIT Bombay 17 th National Conference on Communications, Jan. 2011, Bangalore, India Sp Pr. 1, P3 1/21 Detection of Burst Onset Landmarks in Speech.
Control of prosodic features under perturbation in collaboration with Frank Guenther Dept. of Cognitive and Neural Systems, BU Carrie Niziolek [carrien]
Date of download: 5/27/2016 Copyright © 2016 American Medical Association. All rights reserved. From: The Importance of High-Frequency Audibility in the.
Danielle Werle Undergraduate Thesis Intelligibility and the Carrier Phrase Effect in Sinewave Speech.
4aPPa32. How Susceptibility To Noise Varies Across Speech Frequencies
University of Silesia Acoustic cues for studying dental fricatives in foreign-language speech Arkadiusz Rojczyk Institute of English, University of Silesia.
Somatosensory Precision in Speech Production
†Department of Speech Music Hearing, KTH, Stockholm, Sweden
Volume 66, Issue 6, Pages (June 2010)
Speech Perception (acoustic cues)
Attentive Tracking of Sound Sources
/r/ Place: palatal Articulatory phonetics Acoustics
Presentation transcript:

CNBH, Physiology Department, Cambridge University 2. Experimental procedure The experiment is a 2AFC paradigm design in which the subjects are required to discriminate the scale between sequences of utterances. Stimuli is speech syllables resynthesised by STRAIGHT [Kawahara (1997)], also P-centre corrected. STRAIGHT is a high-quality vocoder it separates the VTL and GPR information in speech sounds and resynthesises the same utterance over a range of VTL or GPR values. Speech is taken from a database created at the CNBH. The database contains 180 unique syllables and additional versions resynthesised at many different GPR’s and VTL’s. Discrimination performance is measured at five points in the GPR/SER space (see Fig. 1), where SER is 1/VTL. At each point a discrimination is made between the test size and three larger and three smaller speakers. Level and pitch cues are roved. Measurements are made for 6 groups: CV-sonorants, CV-stops, CV-fricatives VC-sonorants, VC-stops and VC-fricatives. Six subjects, 40 repetitions for each point on psychometric curve. Size discrimination in CV and VC English syllables. P63 D.T. Ives and R.D. Patterson 1. Introduction Size information is contained in sound. In humans, this information is conveyed mainly by vocal tract length. A wide range of different vocal tract lengths exist in the population but humans are able to distinguish the source (speaker) from the content (speech). This suggests that the size and shape information about the vocal-tract can be segregated (Irino and Patterson (2002)). Experiments by Smith and Patterson (2004) used vowel sounds to determine the range of speaker sizes over which the identification of vowel type and discrimination of speaker size was possible. We extend the size discrimination experiment using a more extended database of speech sounds and show that performance increases compared with vowel sounds. We also look at size discrimination performance for different speech sounds and show there is a small improvement for consonant-vowel speech as opposed to vowel-consonant speech. Figure 1. Measurement points on GPR-SER plane. 3. Results Figure 2. Psychometric functions for the five SER/GPR conditions, summed over all subjects and all syllable groups. Decreasing SER Increasing GPR Table 2. JND values for all syllable groups averaged over all subjects. Table 1. Stimulus set showing categories of CV’s and VC’s. 4. Analysis Figure 3. (a) JND values for SER/GPR condition as a function speech group; (b) JND values for speech group as a function SER/GPR condition. Figure 4. Duration of “periodic”, “vowel” and “noise” components for each speech group (error bars show ±1 standard deviation). Figure 5. Vocal tract length and glottal pulse rate distributions of men, women and children (replotted from Peterson and Barney (1952)). Numbered circles show the SER/GPR conditions used in the experiment. SER/GPR=1,3: JND’s stable at 4% for all speech groups. SER/GPR=2: JND increase slightly but more for stops. SER/GPR=4,5: JND’s increase to 7% and 6%, for stops increase is more. All speech groups except for stops and to a lesser degree VC’s the JND values are similar within an SER/GPR condition. Compared to vowels: (Smith and Patterson (2004) SER/GPR=3,4,5: 30% decrease in JND. SER/GPR=1: 60% decrease in JND. SER/GPR=2: 70% decrease in JND. Categorise syllable content into three components: (i) periodic component (including any vowel information), (ii) vowel component, (iii) noise component. CV’s have longer periodic components than VC’s and significantly longer vowel components. Fricatives have the least periodic and vowel content followed by stops and then sonorants, which have the greatest. Only a small variation in the duration of the noise part. The distribution shows 90% of the population. VTL’s and GPR’s that we would encounter in everyday life. Numbered circles show the SER/GPR conditions in the experiment. Size discrimination possible outside the normal ranges of size. Supports the argument of Smith et al. (2004): the auditory system includes a scale normalisation process. 5. Conclusions It is possible to discriminate size using speech. Experiments have shown JND values for size discrimination of between 4% and 6%. Discrimination is good and consistent outside normal range of speech, supports argument of scale transform in auditory system. Improvement in JND values over vowel of between 30% and 70%. Small performance increase for CV over VC, may be due to increased periodic content. References Irino and Patterson (2002). Segregating information about the size and shape of the vocal tract using a time-domain auditory model: The stabilized wavelet-Mellin transform. Speech Communication. Kawahara, H. (1997). Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited. Proceedings IEEE International Conference on Acoustics, Speech & Signal Processing (ICASSP ’97) 2, Smith, D. R. R., Patterson, R. D., Turner, R., Kawahara, H., and Irino, T. (2004). "The processing and perception of size information in speech sounds," (review process, submitted JASA). Peterson, G.E., Barney, H.L. (1952) Control methods used in the study of vowels. JASA 24(2): Acknowledgements Research supported by the U.K. Medical Research Council (G ).