Representing Intonational Variation

Representing Intonational Variation Julia Hirschberg CS 4706 9/21/2018

Today
How can we represent meaningful speech variation so that we can compare utterances or assign it in TTS?
  Expanded vs. compressed pitch range?
  Louder vs. softer speech?
  Faster vs. slower speech?
  Differences in intonational prominence?
  Differences in intonational phrasing?
  Differences in pitch contours?

Joshua Steele, 1775
Figure 1. Transcription of Pope’s Happiness (Steele, 1775, p. 13)
Since speech is composed of slides rather than stable, discrete pitches, Steele developed note heads in the form of lines, curves, and circumflex notes that could span several pitches. To these curved heads he added a tail marking the duration of the syllable as a number of beats within a musical bar.

Language Learning Approaches
A simpler approach:
  / IS it INteresting /
  / d’you feel ANGry? /
  / WHAT’S the PROBlem? /
  (McCarthy, 1991:106)
How much variation do we need to capture?
How detailed?
Continuous or categorical features?
If categorical, what are the possible classes?

How Do We Decide?
Auditory: language teachers ask what representations learners can understand
Acoustic: examine the speech signal for critical vs. accidental variation
Experimental approaches
  Identify potentially meaningful variation
  Design production or perception studies to test it
  E.g. what does a contour mean?

Intonation Models
Superpositional models (Fujisaki 1983, Möbius et al. 1993): acoustic/physiological
Linear or tone sequence models
  British school (Kingdon ’58, O’Connor & Arnold ’73, Cruttenden ’97): based on auditory analysis
  American school (Pierrehumbert ’80, ToBI): mainly acoustic analysis
  Dutch school (‘t Hart, Collier and Cohen 1990): perceptual data

Superpositional models
The pitch pattern of intonation is modeled with two components: a phrase component and an accent component. The phrase has a basic shape, and the pitch movements for individual accents are superimposed on that shape.
[Figure: phrase component plus accent component = contour for "Apples, oranges and tomatoes"]
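The superposition idea can be sketched in a few lines of Python. This is only an illustration in the spirit of the Fujisaki model cited earlier: the response-function shapes follow the commonly published form of that model, but the parameter values (alpha, beta, gamma), the command timings, the amplitudes, and all function names are assumptions chosen for the example, not values from the lecture.

```python
import math

def phrase_response(t, alpha=2.0):
    """Phrase component: rises quickly after onset, then decays slowly (0 before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent component: fast step-like rise, capped at gamma (0 before onset)."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0_contour(times, base_hz, phrases, accents):
    """Superimpose phrase and accent components on a baseline, in the log-F0 domain.

    phrases: list of (onset_s, amplitude)
    accents: list of (onset_s, offset_s, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(base_hz)
        log_f0 += sum(a * phrase_response(t - t0) for t0, a in phrases)
        log_f0 += sum(a * (accent_response(t - on) - accent_response(t - off))
                      for on, off, a in accents)
        contour.append(math.exp(log_f0))
    return contour

# Hypothetical two-second utterance sampled every 10 ms.
times = [i * 0.01 for i in range(200)]
phrase_only = f0_contour(times, 100.0, [(0.0, 0.5)], [])
with_accents = f0_contour(times, 100.0, [(0.0, 0.5)],
                          [(0.3, 0.5, 0.4), (1.0, 1.2, 0.4)])
```

Comparing `phrase_only` with `with_accents` shows the accents as local bumps riding on the slowly decaying phrase shape, which is the "plus" in the figure.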

Good for modeling utterance-level trends
  Declination: downtrend in F0 over the course of an utterance
Successful in speech synthesis for languages like Japanese (e.g. little variation in accent type)
Example: Lily and Rosa thought this was divine. Prince William was gorgeous and he was looking for a bride. They dreamed of wedding bells.
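One simple way to quantify declination is to fit a straight line to the F0 samples of an utterance and read off its slope. The sketch below does this with ordinary least squares; the function name and the F0 values are invented for illustration, and a real analysis would first need pitch tracking and removal of unvoiced frames.

```python
def declination_slope(times, f0_values):
    """Least-squares slope (Hz per second) of a line fitted to F0 samples."""
    n = len(times)
    mean_t = sum(times) / n
    mean_f = sum(f0_values) / n
    cov = sum((t - mean_t) * (f - mean_f) for t, f in zip(times, f0_values))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var

# Invented F0 samples (Hz) taken every half second across one utterance.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
f0 = [220.0, 212.0, 205.0, 196.0, 190.0]
slope = declination_slope(times, f0)  # negative slope = downtrend
```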

Disadvantages
  Too rigid: all contours must be modeled with an accent and a phrase component
  Many SAE (Standard American English) contours cannot be captured easily
  Cannot distinguish prominence types
  Cannot capture differences in phrase endings

  No account of different accent types or variations in phrase endings
  No notation system that allows users to share observations from large speech corpora or to compare contours
  Used primarily for synthesis

Tone Sequence Models
Intonation is generated from sequences of categorically different, phonologically distinctive tones
Basic unit of intonational description: the intonation phrase (tone unit, breath group)
  Delimited by pauses, phrase-final lengthening, pitch
Syllables may be stressed or accented
  Accent is aligned with primary stress, e.g. telephone
  Indicated by F0, duration, intensity, voice quality
O’Connor and Arnold 1973: earliest textbook for English instruction that tells the user which contour is appropriate in which context

British School
[Figure labels: prehead, prenuclear accent unit, nuclear accent unit, ‘nucleus’, stressed syllable]
Two types of accent unit in the British School:
  Prenuclear accent units, also called the head
  Nuclear accent units, also called the nucleus
The nuclear accent unit is the last accent unit in the IP
The head comprises all prenuclear accent units
Example: But JOHN’s never BEEN to Jamaica

Six nuclear choices in English (each illustrated in the figure on the word "Jamaica")
  falling
  rising
  rising-falling
  falling-rising
  rising-falling-rising
  level

The American School
American school-type models make a distinction between accents (what makes a particular word prominent) and boundary tones (how a phrase ends)
Autosegmental-metrical or two-tone models
Only two tones, which may be combined
  H = high target
  L = low target

Pierrehumbert 1980
Contours = pitch accents, phrase accents, boundary tones
  Pitch accents: H*, L*, L*+H, L+H*, H*+L, H+L*
  Phrase accents: L-, H-
  Boundary tones: L%, H%
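The tone inventory above lends itself to a small well-formedness check. The sketch below assumes a deliberately simplified ordering (one or more pitch accents, then one phrase accent, then one boundary tone); the full grammar in the literature is richer, and the function and constant names here are hypothetical.

```python
# Tone inventory as listed on the slide.
PITCH_ACCENTS = {"H*", "L*", "L*+H", "L+H*", "H*+L", "H+L*"}
PHRASE_ACCENTS = {"H-", "L-"}
BOUNDARY_TONES = {"H%", "L%"}

def is_wellformed_phrase(labels):
    """True if labels are pitch accent(s), then a phrase accent, then a boundary tone.

    Assumed simplification: exactly one phrase accent and one final boundary tone.
    """
    if len(labels) < 3:
        return False
    *accents, phrase_accent, boundary = labels
    return (all(a in PITCH_ACCENTS for a in accents)
            and phrase_accent in PHRASE_ACCENTS
            and boundary in BOUNDARY_TONES)

declarative = ["H*", "H*", "L-", "L%"]   # accents, then a low phrase ending
question = ["L*", "H-", "H%"]            # low accent, then a high phrase ending
malformed = ["L-", "H*", "L%"]           # phrase accent in accent position
```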

Price, Ostendorf et al.
Break indices: degree of juncture between words
  0 to 8 (none to ‘a lot’)
Example: What I’d like is a nice roast beef sandwich.

To(nes and)B(reak)I(ndices)
Developed by prosody researchers in four meetings over 1991-94
Combines Pierrehumbert ’80 and Price, Ostendorf, et al.
Goals:
  devise a common labeling scheme for Standard American English that is robust and reliable
  promote collection of large, prosodically labeled, shareable corpora

ToBI standards also proposed for Japanese, German, Italian, Spanish, British and Australian English, ...
Minimal ToBI transcription:
  Recording of speech
  F0 contour
  ToBI tiers:
    orthographic tier: words
    break-index tier: degrees of juncture (Price et al. ’89)
    tonal tier: pitch accents, phrase accents, boundary tones (Pierrehumbert ’80)
    miscellaneous tier: disfluencies, non-speech sounds, etc.
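The four tiers map naturally onto a small data structure. The sketch below is one possible in-memory representation, not a ToBI file format: the class names, the field layout, and the example utterance with its times and labels are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Label:
    time: float  # seconds from the start of the recording (assumed convention)
    text: str

@dataclass
class ToBITranscription:
    orthographic: list = field(default_factory=list)   # words
    break_index: list = field(default_factory=list)    # one break index per word
    tonal: list = field(default_factory=list)          # accents and phrase endings
    miscellaneous: list = field(default_factory=list)  # disfluencies, noises, etc.

    def words_with_breaks(self):
        """Pair each word with the break index that follows it (matched by position)."""
        return [(w.text, int(b.text))
                for w, b in zip(self.orthographic, self.break_index)]

# Hypothetical three-word utterance; all times and labels are made up.
utt = ToBITranscription(
    orthographic=[Label(0.30, "nice"), Label(0.65, "roast"), Label(1.10, "beef")],
    break_index=[Label(0.30, "1"), Label(0.65, "1"), Label(1.10, "4")],
    tonal=[Label(0.20, "H*"), Label(0.95, "H*"), Label(1.10, "L-L%")],
)
pairs = utt.words_with_breaks()
```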

Sample ToBI Labeling

Online training material, available at: http://anita. simmons
Evaluation
Good inter-labeler reliability for expert and naive labelers (Silverman et al. ’92, Pitrelli et al. ’94):
  88% agreement on presence/absence of tonal category
  81% agreement on category label
  91% agreement on break indices to within 1 level
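Agreement figures of this kind can be computed by comparing every pair of transcribers on every labeled item. The sketch below shows one such pairwise measure; whether it matches the exact procedure of the cited studies is an assumption, and the three transcribers' labels are invented.

```python
from itertools import combinations

def pairwise_agreement(labels_by_transcriber):
    """Fraction of transcriber pairs, pooled over all items, that chose the same label."""
    agree = 0
    total = 0
    for item_labels in zip(*labels_by_transcriber):
        for a, b in combinations(item_labels, 2):
            agree += (a == b)
            total += 1
    return agree / total

# Invented accent labels from three transcribers for the same four words.
t1 = ["H*", "none", "L+H*", "none"]
t2 = ["H*", "none", "H*", "none"]
t3 = ["H*", "H*", "L+H*", "none"]

# Agreement on the category label itself.
label_agreement = pairwise_agreement([t1, t2, t3])

# Agreement on presence vs. absence of any accent.
presence = [[lab != "none" for lab in t] for t in (t1, t2, t3)]
presence_agreement = pairwise_agreement(presence)
```

Presence/absence agreement is never lower than label agreement, since two transcribers who pick the same label necessarily agree that an accent is present.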

Pitch Accent/Prominence in ToBI
Which items are made intonationally prominent and how: tonal targets/levels, not movements
Accent type:
  H* simple high (declarative)
  L* simple low (yes-no question)
  L*+H scooped, late rise (uncertainty/incredulity)
  L+H* early rise to stress (contrastive focus)
  H+!H* fall onto stress (implied familiarity)

Downstepped accents: !H*, L+!H*, L*+!H
Degree of prominence:
  within a phrase: HiF0 (~nuclear accent)
  across phrases: ??

Prosodic Phrasing in ToBI
‘Levels’ of phrasing:
  intermediate phrase: one or more pitch accents plus a phrase accent, H- or L-
  intonational phrase: one or more intermediate phrases plus a boundary tone, H% or L%
ToBI break-index tier
  0 no word boundary
  1 word boundary

  2 strong juncture with no tonal markings
  3 intermediate phrase boundary
  4 intonational phrase boundary
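Given a break index after each word, the two levels of phrasing can be recovered by cutting the word sequence wherever the index reaches 3 (intermediate phrase) or 4 (intonational phrase). The sketch below reuses the roast beef sentence from an earlier slide, but the break indices assigned to it are invented for the example and the function name is hypothetical.

```python
def split_phrases(words, breaks, level):
    """Group words into phrases, closing a phrase after any break index >= level."""
    phrases, current = [], []
    for word, brk in zip(words, breaks):
        current.append(word)
        if brk >= level:
            phrases.append(current)
            current = []
    if current:  # trailing words with no closing boundary
        phrases.append(current)
    return phrases

words = ["What", "I'd", "like", "is", "a", "nice", "roast", "beef", "sandwich"]
breaks = [1, 1, 3, 1, 1, 1, 1, 1, 4]  # invented indices: one 3 after "like", final 4

intermediate = split_phrases(words, breaks, level=3)
intonational = split_phrases(words, breaks, level=4)
```

With these indices the utterance is one intonational phrase made of two intermediate phrases, split after "like".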

[Figure: example contours for pitch accents L*+H, L*, H* combined with phrase endings H-H%, H-L%, L-H%, L-L%]

[Figure: example contours for pitch accents H*, !H*, H+!H*, L+H* combined with phrase endings H-H%, H-L%, L-H%, L-L%]

ToBI exercises
NB: you will be submitting these exercises for the take-home part of the midterm, so save them!

Next Class
Predicting prosodic assignments from text