Studying Intonation Julia Hirschberg CS 4706 9/21/2018.

Today
  Approaches to studying contour meaning
  Questions people ask
    Does contour X convey a different meaning from contour Y?
    Is contour X used more often in context Z than contour Y?
    Despite what people say/think, not all phenomena X are uttered with contour Y
  What kind of evidence could we get?
    Found data
    Laboratory experiments: production, perception
    Corpus collection

What features can we look at and how do we obtain them?
  Intonation labeling by hand
  Acoustic/prosodic analysis by automatic methods
    Pitch tracking, pause detection, intensity, duration, speaking rate extraction
  Computational linguistic techniques to extract transcript-based (text) features
    Part-of-speech
    Sentence length, …
What techniques do we use for analysis?
  Statistical methods (Splus, Matlab)
  Machine learning techniques
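As one illustration of the automatic pitch tracking mentioned above, the following is a minimal sketch of F0 estimation for a single frame by autocorrelation peak picking. It is not the method used by any particular pitch tracker; the function name `estimate_f0`, the 75-500 Hz search range, and the 0.3 voicing threshold are all assumptions made for the example.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0, voicing_threshold=0.3):
    """Estimate F0 in Hz for one frame of samples, or None if it looks unvoiced.

    The search range and voicing threshold are illustrative assumptions.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None                        # silent frame
    lag_min = int(sample_rate / f0_max)    # shortest period considered
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_r = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalized by the frame energy
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag is None or best_r < voicing_threshold:
        return None
    return sample_rate / best_lag

# Usage: a synthetic 200 Hz tone sampled at 8 kHz (40 ms frame)
tone = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(320)]
f0 = estimate_f0(tone, 8000)
```

Running this frame by frame over an utterance would give a rough pitch track; real trackers add smoothing and octave-error correction, which this sketch omits.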

Some Sample Approaches
  Natural corpus: Hedberg & Sosa 2002
  Introspective, observational: Wilson 1993, Pierrehumbert & Hirschberg 1990/2
  Laboratory (production/perception): Syrdal & Jilka 2004
  Laboratory (brain imaging, e.g. fMRI): Doherty et al 2004

A Prescriptive Approach: Wilson 1993
  Declarative statements fall and yes-no questions rise?
  Wh-questions fall?
  Small final rise signals ‘more to come’?

Corpus Studies of Questions: Hedberg & Sosa 2002
  How are yes-no and wh-questions uttered and how might we explain differences?
    Where is the nuclear stress?
    Where is the semantic ‘focus’? What is the ‘topic’?
    Are the ‘wh words’ accented or not?
  Corpus: 73 questions
    Who saw John? / Who didn’t see John?
    Did John leave? / Didn’t John leave?
    35 whq’s and 38 ynq’s from the McLaughlin Group and Washington Week

Analysis
  Intonational labeling (ToBI) from pitch tracks
  Topic/focus coding
  Frequency distributions of features with question categories
  Prosody of ‘locus of interrogation’
    Wh word in wh-questions
    Fronted auxiliary in yes-no questions
Results
  Ynq’s generally uttered w/ falling or level intonation, not rising (69%)
  Wh-q’s most often uttered with falling intonation (80%)
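The "frequency distributions of features with question categories" step can be sketched as a simple cross-tabulation. This is a minimal sketch, not the authors' analysis; the function name `contour_distribution` and the label strings are hypothetical, and the example labels are toy values, not the Hedberg & Sosa data.

```python
from collections import Counter

def contour_distribution(questions):
    """Return {question_type: {contour: proportion}} from (type, contour) pairs."""
    pairs = list(questions)
    pair_counts = Counter(pairs)
    type_totals = Counter(qtype for qtype, _ in pairs)
    table = {}
    for (qtype, contour), count in pair_counts.items():
        # Proportion of this question type realized with this contour
        table.setdefault(qtype, {})[contour] = count / type_totals[qtype]
    return table

# Toy labels for illustration only (NOT the corpus results)
toy = [("whq", "fall"), ("whq", "fall"), ("whq", "rise"), ("ynq", "rise")]
dist = contour_distribution(toy)
```

With hand-labeled contours for each question in a corpus, the same table gives the per-category percentages reported on slides like this one.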

Results (continued)
  Wh-words (60%) in all wh-questions and neg aux in negative ynq’s (89%) most often uttered with L+H* accent (‘contrastive’ accent) -- why?
  Aux in positive ynq’s often deaccented (41%) or realized with L* (17%) accent -- why?
Conclusions/open questions:
  Why do ynq’s and wh-q’s sometimes rise and sometimes fall?
  Locus of interrogation is accented in wh-q’s and in negative ynq’s to “signal interrogative status of sentence” -- but not in positive ynq’s “due to need to highlight a following element”

Critique
  Is this a good corpus for this investigation?
    Size
    Genre
    What about the speakers?

Syrdal & Jilka 2004
  How are whq’s and ynq’s produced most naturally (for TTS)?
  Same initial hypothesis: whq’s fall and ynq’s rise in American English
  Different approach: production and perception studies
  Production:
    8 (professional) speakers (5F, 3M)
    Read transcripts of actual dialogues

Analysis:
  Intonational (ToBI) labeling from pitch tracks of extracted questions
Results:
  Ynq’s rose in 83% of cases for females and 53% for males
  Wh-q’s always fell for females and fell 79% of the time for male speakers; wh-q’s and statements generally fell
  Nuclear accents in ynq’s: majority L*

Perception studies: acceptability judgments
  Forced choice, 12 listeners
  Stimuli: pairs of ynq’s and whq’s with same voice/different intonation
    17 natural (9 ynq’s, 8 whq’s)
    12 synthesized
  12 subjects (6 and 6)
  Judgments:
    Ynq:
      Natural speech: people preferred standard rise (L* H- H%)
      Synthetic speech: no results
    Whq:
      Natural speech: people preferred falling contours (L- L%) to rising (H-H%) and slightly to ‘continuation rise’ (L- H%)
      Synthetic: no preference
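The contour names used on this slide can be tied to their ToBI phrase-final labels with a small lookup. This is a minimal sketch covering only the three endings named above; the function name `classify_final_contour` is hypothetical, and the code assumes any pitch accent (a token containing `*`) is separated from the phrase accent by a space.

```python
# Phrase accent + boundary tone combinations named on the slide.
# Any other combination is reported as "other" rather than guessed at.
FINAL_CONTOURS = {
    "L-L%": "fall",
    "H-H%": "rise",
    "L-H%": "continuation rise",
}

def classify_final_contour(label):
    """Map a ToBI label such as 'L* H- H%' or 'L- L%' to a coarse contour type."""
    # Drop pitch accents (tokens containing '*'); keep phrase accent and boundary tone
    tokens = [tok for tok in label.split() if "*" not in tok]
    key = "".join(tokens)  # so 'L- L%' and 'L-L%' are treated alike
    return FINAL_CONTOURS.get(key, "other")

# Usage
standard_rise = classify_final_contour("L* H- H%")
```

A table like this makes it easy to turn hand-labeled ToBI strings into the rise/fall categories that the production and perception results are stated in.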

Critique
  How many questions were produced?
  Are professional speakers a good choice?
  Read vs. spontaneous speech?
  For TTS?
  Why no results for synthetic speech?
  Comparison to Hedberg and Sosa

Doherty et al 2004
  How do people process intonation, e.g., in rising questions vs. falling statements vs. falling questions?
    She was talking to her father?
    She was talking to her father.
    Was she talking to her father.
  Research questions:
    Where is the ‘prosody’ portion of the brain?
    What other sectors is it ‘close’ to and what is their function?
    Do particular contours have particular locations?

Method: functional Magnetic Resonance Imaging (fMRI) of subjects presented with digitized recordings
  11 subjects (4M, 7F)
  Note experimental condition!
  150 triples, of which each subject heard only 1 version
    She was talking to her father?
    Was she talking to her father.
    She was talking to her father.
  Monitoring task: Is this a question or a statement?
    Press one key for question, another for statement
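The slide does not say how versions were assigned to subjects. One common way to make sure each subject hears exactly one version of every triple is to rotate versions across triples and subjects, sketched below. The rotation scheme, the function name `assign_versions`, and the version labels are assumptions for illustration, not the procedure of the study.

```python
def assign_versions(n_triples, n_subjects,
                    versions=("rising question", "falling question", "falling statement")):
    """Return {subject: [version heard for each triple]} using a simple rotation.

    Each subject hears one version per triple; the version shifts by one
    from subject to subject so all versions of a triple get used.
    """
    plan = {}
    for subject in range(n_subjects):
        plan[subject] = [versions[(triple + subject) % len(versions)]
                         for triple in range(n_triples)]
    return plan

# Usage with the numbers on the slide: 150 triples, 11 subjects
plan = assign_versions(150, 11)
```

With 150 triples and 3 versions, this rotation gives every subject 50 items of each type, so the question/statement key-press task is balanced within subjects.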

Results:
  Increase in activation when subjects made judgments about tokens w/ rising intonation -- but not falling, whether syntactic question or syntactic statement
  Why?
    Semantic processing? No: illocutionary force is same in rising and falling questions
    Acoustic processing? Maybe…
    Interpreting the rising contour as a question?
  Check lesion studies to see if people with damage in these areas can interpret rising contours…

Critique
  No rising inverted questions?
    “Was she talking to her father?”

Next Class
  How do we represent intonational variation?