Representing Intonational Variation

Slides:

Advertisements

Similar presentations

Tone perception and production by Cantonese-speaking and English- speaking L2 learners of Mandarin Chinese Yen-Chen Hao Indiana University.

Advertisements

ENG 528: Language Change Research Seminar

Perceptual Organization in Intonational Phonology: A Test of Parallelism J. Devin McAuley 1 & Laura C. Dilley 2 Department of Psychology Bowling Green.

Prosody Modeling (in Speech) by Julia Hirschberg Presented by Elaine Chew QMUL: ELE021/ELED021/ELEM March 2012.

“Downstepped contours in the given/new distinction” Agustín Gravano Spoken Language Processing Group Columbia University, New York On the Role of Prosody.

INTONATION Chapters 15 & 16.

Varied, Vivid Expressive How can you use your voice to engage, express, and create meaning?

Automatic Prosodic Event Detection Using Acoustic, Lexical, and Syntactic Evidence Sankaranarayanan Ananthakrishnan, Shrikanth S. Narayanan IEEE 2007 Min-Hsuan.

Analyzing Students’ Pronunciation and Improving Tonal Teaching Ropngrong Liao Marilyn Chakwin Defense.

Niebuhr, D‘Imperio, Gili Fivela, Cangemi 1 Are there “Shapers” and “Aligners” ? Individual differences in signalling pitch accent category.

Modelling Polish Intonation for Speech Synthesis Dominika Oliver 23 May 2002.

Connecting Acoustics to Linguistics in Chinese Intonation Greg Kochanski (Oxford Phonetics) Chilin Shih (University of Illinois) Tan Lee (CUHK) with Hongyan.

FLST: Prosodic Models FLST: Prosodic Models for Speech Technology Bernd Möbius

J-ToBi Jennifer J. Venditti Presentation by James Rishe.

CS 4705 Lecture 22 Intonation and Discourse What does prosody convey? In general, information about: –What the speaker is trying to convey Is this a.

Automatic Prosody Labeling Final Presentation Andrew Rosenberg ELEN Speech and Audio Processing and Recognition 4/27/05.

Context in Multilingual Tone and Pitch Accent Recognition Gina-Anne Levow University of Chicago September 7, 2005.

Dianne Bradley & Eva Fern á ndez Graduate Center & Queens College CUNY Eliciting and Documenting Default Prosody ABRALIN23-FEB-05.

Chapter three Phonology

Intonation September 18, 2014 The Plan for Today Also: I have posted a couple of readings on TOBI (an intonation transcription system) to the course.

Phonetics and Phonology

Intonation and Information Discourse and Dialogue CS359 October 16, 2001.

1 Speech Perception 3/30/00. 2 Speech Perception How do we perceive speech? –Multifaceted process –Not fully understood –Models & theories attempt to.

Alignment of tonal targets: 30 years on Bob Ladd University of Edinburgh.

A prosodically sensitive diphone synthesis system for Korean Kyuchul Yoon Linguistics Department The Ohio State University.

Segmental encoding of prosodic categories: A perception study through speech synthesis Kyuchul Yoon, Mary Beckman & Chris Brew.

Evaluating prosody prediction in synthesis with respect to Modern Greek prenuclear accents Elisabeth Chorianopoulou MSc in Speech and Language Processing.

LATERALIZATION OF PHONOLOGY 2 DAY 23 – OCT 21, 2013 Brain & Language LING NSCI Harry Howard Tulane University.

The Effect of Pitch Span on Intonational Plateaux Rachael-Anne Knight University of Cambridge Speech Prosody 2002.

TOBI, continued (continued) February 2, 2010 Languages! Polish2 Tagalog2 Urdu Spanish Afrikaans Korean Gujarati Italian Russian Swedish Also: Perception.

Recognizing Discourse Structure: Speech Discourse & Dialogue CMSC October 11, 2006.

TOBI Basics April 13, 2010.

INTONATION (Chapter 17).

Language and Speech, 2000, 43 (2), THE BEHAVIOUR OF H* AND L* UNDER VARIATIONS IN PITCH RANGE IN DUTCH RISING CONTOURS Carlos Gussenhoven and Toni.

TOBI: Bi-Tonal Pitch Accents (the exciting conclusion!) February 4, 2016.

TOBI, continued January 29, 2008 The Outlook 1.Return course project reports. 2.New course schedule. 3.Today: Continue the discussion of English Intonation.

TOBI (the exciting conclusion!) February 1, 2011.

Acoustic Cues to Emotional Speech Julia Hirschberg (joint work with Jennifer Venditti and Jackson Liscombe) Columbia University 26 June 2003.

Pitch Tracking + Prosody January 19, 2012 Homework! For Tuesday: introductory course project report Background information on your consultant and the.

Suprasegmental features and Prosody Lect 6A&B LING1005/6105.

On the role of context and prosody in the interpretation of ‘okay’ Julia Agustín Gravano, Stefan Benus, Julia Hirschberg Héctor Chávez, and Lauren Wilcox.

A Text-free Approach to Assessing Nonnative Intonation Joseph Tepperman, Abe Kazemzadeh, and Shrikanth Narayanan Signal Analysis and Interpretation Laboratory,

Lecture Overview Prosodic features (suprasegmentals)

Suprasegmental features and Prosody

English Intonation (introductory lecture)

Investigating Pitch Accent Recognition in Non-native Speech

4AOD Malinnikova Ekaterina

Phonetics SPAU 3343 Chap. 10 – Grasping the melody of language

Tone in Sherpa (Sino-Tibetan) Joyce McDonough1, Rebecca Baier2 and

Functions of intonation 1

Why Study Spoken Language?

Studying Intonation Julia Hirschberg CS /21/2018.

Meanings of Intonational Contours

Representing Intonational Variation

Studying Intonation Julia Hirschberg CS /21/2018.

Intonational and Its Meanings

Intonational and Its Meanings

The American School and ToBI

Intonational Variation in Spoken Dialogue Systems

Meaningful Intonational Variation

Why Study Spoken Language?

Meanings of Intonational Contours

Representing Intonational Variation

Recognizing Structure: Sentence, Speaker, andTopic Segmentation

“Downstepped contours in the given/new distinction”

Predicting Phrasing and Accent

Comparative Studies Avesani et al 1995; Hirschberg&Avesani 1997

Intonational and Its Meanings

Jennifer J. Venditti Presentation by James Rishe

Presentation transcript:

Representing Intonational Variation Julia Hirschberg 11/24/2018

Today How can we represent meaningful speech variation so we can compare utterances? assign in TTS? Expanded vs. compressed pitch range? Louder vs. softer speech? Faster vs. slower speech? Differences in intonational prominence? Differences in intonational phrasing? Differences in pitch contours? 11/24/2018

Joseph Steele, 1775 Figure 1. Transcription of Pope’s Happiness (Steele, 1775, p. 13) Since speech is composed of slides, and not stable, discrete pitches, he developed note heads which were lines, curves, and circumflex notes that could cover several pitches. To these curved heads, he then added a tail which marked the duration of the syllable in a number of beats within a musical bar. 11/24/2018

Language Learning Approaches A simpler approach / IS it INteresting / / d’you feel ANGry? / / WHAT’S the PROBlem? / (McCarthy, 1991:106) How much variation do we need to capture? How detailed? Continuous or categorical features? If categorical, what are the possible classes? 11/24/2018

How Do We Decide? Auditory: Language teachers: what representations can learners understand Acoustic: Examine the speech signal for critical vs. accidental variation Experimental approaches Identify potential meaningful variation Design production or perception studies to test E.g. what does a contour mean? 11/24/2018

Intonation Models Superpositional models (Fujisaki 1983, Möbius et al. 1993): acoustic/physiological Linear or Tone sequence models British school (Kingdon ’58, O’Connor & Arnold ’73, Cruttenden ’97): based on auditory analysis American School (Pierrehumbert ’80, ToBI): mainly acoustic analysis Dutch school (‘t Hart, Collier and Cohen 1990): perceptual data 11/24/2018

Superpositional models Pitch pattern of intonation modeled with two components: phrase component and accent component. Phrase has basic shape, and pitch movements for individual accents are superimposed over basic shape: plus = Apples, oranges and tomatoes 11/24/2018

Good for modeling utterance-level trends Declination: downtrend in f0 over the course of an utterance Successful in speech synthesis for languages like Japanese (little variation in accent type, e.g.) Lily and Rosa thought this was divine. Prince William was gorgeous and he was looking for a bride. They dreamed of wedding bells. 11/24/2018

Disadvantages Disadvantages Too rigid: All contours must be modeled with an accent and a phrase component Many SAE contours cannot be captured easily Cannot distinguish prominence types Cannot capture differences in phrase endings 11/24/2018

No account of different accent types, or variations in phrase endings No notation system which allows users to share observations from large speech corpora or to compare contours Used primarily for synthesis 11/24/2018

Tone Sequence Models Intonation generated from sequences of categorically different, phonologically distinctive tones Basic unit of intonational description: intonation phrase (tone unit, breath group) Delimited by pauses, phrase-final lengthening, pitch Syllables may be stressed or accented Accent aligned with primary stress -- telephone Indicated by F0, duration, intensity, voice quality O’Connor and Arnold 1972: Earliest textbook for English instruction that tells user which contour appropriate in which context 11/24/2018

British School Prenuclear accent unit Nuclear accent unit Prehead ‘Nucleus’ Stressed syllable Two types of accent unit in the British School: Prenuclear accent units; also called the Head Nuclear accent units; also called the Nucleus The nuclear accent unit is the last accent unit in the IP The head comprises all prenuclear accent units But JOHN’s never BEEN to Jamaica 11/24/2018

Six nuclear choices in English J a m i c falling i c rising J a m a c rising-falling i J m falling-rising J a m i c Rising-falling-rising a c i J m level J a m i c 11/24/2018

The American School American school-type models make a distinction between accents (what makes a particular word prominent) and boundary tones (how a phrase ends) Autosegmental metrical or two-tone models Only two tones, which may be combined H = high target L = low target 11/24/2018

Pierrehumbert 1980 Contours = pitch accents, phrase accents, boundary tones Pitch Accents+ Phrase Accents+ Boundary Tone Jd H* L* L*+H L+H* H*+L H+L* L% H% L- H- 11/24/2018

Price, Ostendorf et al Break indices: degree of juncture between words 0  8 (none to ‘a lot’) What I’d like is a nice roast beef sandwich. 11/24/2018

To(nes and)B(reak)I(ndices) Developed by prosody researchers in four meetings over 1991-94 Putting Pierrehumbert ’80 and Price, Ostendorf, et al together Goals: devise common labeling scheme for Standard American English that is robust and reliable promote collection of large, prosodically labeled, shareable corpora 11/24/2018

Minimal ToBI transcription: Recording of speech F0 contour ToBI tiers: ToBI standards also proposed for Japanese, German, Italian, Spanish, British and Australian English,.... Minimal ToBI transcription: Recording of speech F0 contour ToBI tiers: orthographic tier: words break-index tier: degrees of junction (Price et al ‘89) tonal tier: pitch accents, phrase accents, boundary tones (Pierrehumbert ‘80) miscellaneous tier: disfluencies, non-speech sounds, etc. 11/24/2018

Sample ToBI Labeling 11/24/2018

Online training material,available at: http://anita. simmons Evaluation Good inter-labeler reliability for expert and naive labelers: 88% agreement on presence/absence of tonal category, 81% agreement on category label, 91% agreement on break indices to within 1 level (Silverman et al. ‘92,Pitrelli et al ‘94) 11/24/2018

Pitch Accent/Prominence in ToBI Which items are made intonationally prominent and how: tonal targets/levels not movement Accent type: H* simple high (declarative) L* simple low (ynq) L*+H scooped, late rise (uncertainty/ incredulity) L+H* early rise to stress (contrastive focus) H+!H* fall onto stress (implied familiarity) 11/24/2018

Downstepped accents: !H*, L+!H*, L*+!H Degree of prominence: within a phrase: HiF0 (~nuclear accent) across phrases ?? 11/24/2018

Prosodic Phrasing in ToBI ‘Levels’ of phrasing: intermediate phrase: one or more pitch accents plus a phrase accent, H- or L- intonational phrase: 1 or more intermediate phrases + boundary tone, H% or L% ToBI break-index tier 0 no word boundary 1 word boundary 11/24/2018

2 strong juncture with no tonal markings 3 intermediate phrase boundary 4 intonational phrase boundary 11/24/2018

L*+H L* H* H-H% H-L% L-H% L-L% 11/24/2018

H* !H* H+!H* L+H* H-H% H-L% L-H% L-L% 11/24/2018

Next Class Predicting prosodic assignments from text 11/24/2018