On the Domain of Auditory Restoration in Speech
Oliver Niebuhr
Analysis of Spoken Language at the Dept. of General Linguistics, Christian-Albrechts-Universität zu Kiel
European Conference on Cognitive Science, New Bulgarian University, Sofia, 21st May 2011

Background
Series of experiments with an explicit focus on the perception of „sound segments“, starting with Warren (1970).
→ Speech stimuli in which sound segments were replaced by noise or silence.
Finding: When listeners can interpret the noise as a masker of the speech signal, the missing sound segments are added in perception. An intact speech utterance is heard in addition to a noise pattern.
→ Phonemic restoration
„auditory scene analysis involves a grouping of sounds. The principle of similarity is very important“ (Bregman 1990)

Background
The notion of phonemic restoration was applied to connected speech, based on the following concept, which is reflected in the term „speech reduction“ (= phonological information is lost):
/  / [  ] String of modified sound segments
In speech comprehension, the deleted phonemes or phonemic features are perceptually restored?

Background
However, this phoneme-centered view is oversimplified.
→ A growing body of evidence shows that connected-speech processes are, overall, a non-linear reorganization rather than a deletion or „reduction“ of essential sound features.
„The process stops where words are still distinct“ (Kohler 1990), i.e. the meaning-bearing word may be a better reference unit than the meaning-differentiating phoneme.
Subtle “phonetic essences” of words remain even after a strong “de-segmentalization”, but these essences fall between the cracks of a phonemic perspective.

Background
The imprint of /s/ in the preceding vowel remains after /s/ assimilation (Niebuhr, 2009, 2011).
[Figure labels: 122 ms, 72.8 dB; 90 ms, 78.9 dB; „Voss Shombdon“, /s  /; „Vosh Shombdon“, /  /]

Background
Reorganization of the phonetic essence of German „Ihnen“ (you) (Kohler and Niebuhr, 2011)
(a) „Ich kann das ja mal sagen“ (I can mention this)
(b) „Ich kann Ihnen das ja mal sagen“ (I can mention this to you)
/n/ of „kann“ is very long, palatal [  ] and has two E-max
[k  ] is produced as palatalized [  ]
the two /a/ are fronted and sound like [  ]

Background
“Reduced” connected speech contains no maskers or interruptors. The seemingly deleted information can still be there, but reorganized in a suprasegmental, non-linear representation.
→ Perceptual conditions differ completely from the original line of research on phonemic restoration.
(Q1): Do we find auditory restoration under these conditions, when there is actually nothing to “undelete”?
(Q2): If there is restoration, is the “phoneme” (which is primarily a heuristic, meta-linguistic concept) an adequate operational unit? Wouldn’t a suprasegmental unit like the “syllable” be more suitable?
As regards (Q2), there is experimental evidence in favour of phoneme restoration. However, the corresponding studies used phoneme-oriented tasks like “phoneme-monitoring” and orthography-related judgments.
→ The present study uses a more neutral task.

Stimulus material – The SYLLABLE series
ISO condition: 2-syllable noun (all 650 ms long) + 2-syllable verb (to see/to watch; 700 ms long)
Nouns: (German discount chain), (frog puppet, Muppet Show), (knowledge)
(2+2 syll., equal overall duration 1350 ms)
WORD+ condition:
„Nun wollen wir mal gucken“ (Now, let us see): 7 syll. = +3
„Können wir mit gucken“ (May we watch with you): 6 syll. = +2
„Willst Du den gucken“ (Do you want to watch it): 5 syll. = +1

Stimulus material – The PHONEME series
ISO condition: 2-syllable noun (all 620 ms long) + 2-syllable noun (soft drink; 680 ms long)
Nouns: (German hanseatic city), (hammer), (take it)
(2+2 syll., equal overall duration 1300 ms)
WORD+ condition:
„Willst Du mal Cola“ (Do you want Cola): 5 syll. = +1, +5 phonemes
„Haben wir Cola“ (Do we have Cola): 5 syll. = +1, +3 phonemes
„Nehmen Sie Cola“ (Do you take Cola): 5 syll. = +1, +2 phonemes
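The "+N syll." labels of both series follow from simple arithmetic: the syllable count of the intended wording minus the 4 syllables (2+2) of the ISO stimuli. The sketch below reproduces these labels. It assumes a naive counting rule (one syllable per run of adjacent vowel letters), which is an illustrative shortcut that works for these short phrases and is not the procedure used in the study; the function and variable names are hypothetical.

```python
import re

# Assumption: one syllable per run of adjacent vowel letters. This is a naive
# orthographic heuristic for illustration only, not a general German syllabifier
# and not the author's method.
VOWEL_RUN = re.compile(r"[aeiouyäöü]+", re.IGNORECASE)

# The ISO stimuli consist of 2+2 syllables (stated on the slides).
BASELINE_SYLLABLES = 4

# Intended wordings of the WORD+ stimuli, copied from the slides.
WORDINGS = {
    "SYLLABLE": [
        "Nun wollen wir mal gucken",
        "Können wir mit gucken",
        "Willst Du den gucken",
    ],
    "PHONEME": [
        "Willst Du mal Cola",
        "Haben wir Cola",
        "Nehmen Sie Cola",
    ],
}


def count_syllables(phrase: str) -> int:
    """Count vowel-letter runs as a rough proxy for syllables."""
    return len(VOWEL_RUN.findall(phrase))


def restorable_syllables(phrase: str) -> int:
    """Syllables of the wording beyond the 2+2 syllables of the ISO stimulus."""
    return count_syllables(phrase) - BASELINE_SYLLABLES


if __name__ == "__main__":
    for series, phrases in WORDINGS.items():
        for phrase in phrases:
            print(series, phrase, count_syllables(phrase), restorable_syllables(phrase))
```

The phoneme surplus (+5, +3, +2) cannot be derived this way, since it depends on the phonetic transcriptions, which are not recoverable from the orthography alone.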

Experiment/Task
Pairwise comparisons with regard to stimulus duration between the target stimuli (A) and [  ]-like reference stimuli (X).
Rationale: if there are more or less comprehensive restorations in the target stimuli, they will appear longer than the [  ]-like reference stimuli.
(–) The judgments are indirect and can only provide indirect evidence.
(+) Listeners’ attention is not explicitly drawn to phonemic units, but to the utterance as a whole; both stimuli are speech or speech-like.
The [  ]-like reference stimuli were phonetically constant, but varied in duration from -150 ms to +600 ms relative to the duration of the respective target stimulus (620; 650; 1300; 1350 ms).
Stimulus pairings were presented (with several randomized repetitions) in both AX and XA orders.
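The pairing scheme described here can be sketched as a trial list. The target durations and the -150 ms to +600 ms range are taken from the slide; the 50 ms step size, the number of repetitions, and all names are hypothetical, since the slides do not state them.

```python
import itertools
import random

TARGET_DURATIONS_MS = [620, 650, 1300, 1350]  # target durations listed on the slide
OFFSET_MIN_MS, OFFSET_MAX_MS = -150, 600      # reference duration relative to target (slide)
OFFSET_STEP_MS = 50   # hypothetical: the step size is not given on the slides
N_REPETITIONS = 3     # hypothetical: the slides only say "several" repetitions


def build_trials(seed: int = 0) -> list:
    """Return a shuffled list of AX/XA duration-comparison trials."""
    offsets = range(OFFSET_MIN_MS, OFFSET_MAX_MS + 1, OFFSET_STEP_MS)
    trials = []
    for target, offset, order in itertools.product(
        TARGET_DURATIONS_MS, offsets, ("AX", "XA")
    ):
        for _ in range(N_REPETITIONS):
            trials.append(
                {
                    "target_ms": target,              # stimulus A
                    "reference_ms": target + offset,  # stimulus X
                    "order": order,                   # presentation order
                }
            )
    # Seeded shuffle so that the randomized order is reproducible.
    random.Random(seed).shuffle(trials)
    return trials


if __name__ == "__main__":
    trials = build_trials()
    print(len(trials), trials[0])
```

Presenting every pairing in both orders balances any bias toward judging the first or the second stimulus of a pair as longer.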

Experiment/Task
Two separate experimental sessions with two groups of naïve subjects:
ISO stimuli : SYLLABLE + WORD+ stimuli : PHONEME, judged by subject group #1 in AX/XA pairs
WORD+ stimuli : SYLLABLE + ISO stimuli : PHONEME, judged by subject group #2 in AX/XA pairs

Results 1: Perceived wordings
The target stimuli were presented (together with a list of filler items in an overall randomized order) to a third group of 12 Northern Standard German listeners.
Their task: “Write down what you hear”.
Result: The intended wordings were unequivocally identified in the target stimuli.
It can be assumed that the same perceived wordings underlay the duration judgments of the other two groups of listeners. On this basis, the main results show the following…

Results 2a: SYLLABLE stimuli
Even ISO stimuli appeared slightly longer (on average 37 ms) than the [  ] references with physically identical durations.
The WORD+ stimuli, for which auditory restoration can occur, appear longer than the ISO stimuli.
The more syllables can be restored, the longer the stimulus is perceived to be relative to [  ].
[Figure: conditions +0 syll., +1 syll., +2 syll., +3 syll.]
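The slides do not say how the perceived lengthening (e.g. the 37 ms) was derived from the AX/XA judgments. One common way to quantify such a bias is the point of subjective equality (PSE): the reference offset at which the reference is judged longer in 50% of the trials. The sketch below estimates it by linear interpolation; the function name is hypothetical and the numbers in the usage part are made up for illustration, not data from the study.

```python
def point_of_subjective_equality(offsets_ms, prop_reference_longer):
    """Interpolate the offset at which the reference is judged longer in 50% of trials.

    offsets_ms: reference duration minus target duration, in ascending order.
    prop_reference_longer: proportion of "reference longer" responses per offset.
    A positive result means the target sounds longer than it physically is.
    """
    points = list(zip(offsets_ms, prop_reference_longer))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        # Find the first pair of neighbouring offsets that brackets 50%.
        if p0 <= 0.5 <= p1 and p0 != p1:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("responses never cross 50%")


if __name__ == "__main__":
    # Made-up illustrative proportions, NOT results from the study.
    toy_offsets = [-100, 0, 100, 200]
    toy_props = [0.0, 0.25, 0.75, 1.0]
    print(point_of_subjective_equality(toy_offsets, toy_props))
```

A fitted psychometric function (e.g. a logistic curve) would be the more usual choice with real data; linear interpolation is used here only to keep the sketch dependency-free.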

Results 2b: SYLLABLE stimuli
Average reaction times in the duration-oriented AX/XA comparisons increase with the number of syllables that can in principle be restored in the target stimuli.
[Figure: conditions +0 syll. and +1, +2, +3 syll.]

Results 3a: PHONEME stimuli
Even ISO stimuli appeared slightly longer (on average 66 ms) than the [  ] references with physically identical durations.
The WORD+ stimuli, for which auditory restoration can occur in terms of +1 syll. and +2 to +5 phonemes, appear longer than the ISO stimuli.
The WORD+ stimuli, in which no syllables, but only phonemes can be restored, do not differ from each other.
[Figure: conditions +0 syll., +2 phonemes, +3 phonemes, +5 phonemes]

Results 3b: PHONEME stimuli
Average reaction times for the PHONEME stimuli increase from ISO to WORD+, i.e. when there is a syllable to restore. However, the reaction times remain constant within the WORD+ stimuli, which differ from each other only in the number of restorable phonemes (+2 to +5).
[Figure: conditions +0 syll. and +1 syll., with +2, +3, +5 phonemes]

Discussion
The present study provided clear, but indirect evidence that auditory restoration occurs not just for masked speech, but also for “reduced” connected speech.
→ Previous evidence in favour of auditory restoration was not an artefact of a phoneme- or orthography-oriented task.
However, the operational unit of the restoration seems to be a suprasegmental one, which is larger than the phoneme (i.e. a single speech sound). It need not be what we refer to as ‘syllable’.
Besides the empirical support for auditory restoration, it seems that we do not really know yet when it occurs and what is actually restored:
canonical/full forms in all details?
basic sound and pitch qualities?
just a temporal/rhythmical grid?
This must become a separate line of research in future studies on speech perception.

