Discourse Structure in Generation

Discourse Structure in Generation Julia Hirschberg CS 4706 12/6/2018

Today
- Models of Discourse Structure
  - Do we have them? Grosz & Sidner ’86
- What identifies discourse structure to Hearers?
  - Textual cues
  - Spoken cues
- How can we produce appropriate discourse structure in TTS systems?
- Can we identify discourse structure automatically, from speech?

Is there structure in this discourse?
A beautiful mallard spotted the dove I was feeding. The duck dove supply is small this year. That dove was history in a minute. Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa. To my surprise, I ran into a friend from back home. When I told her of my recent experience she questioned my sanity.

Is this a reasonable structure?
A beautiful mallard spotted the dove I was feeding. The duck dove supply is small this year. That dove was history in a minute. Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa. To my surprise, I ran into a friend from back home. When I told her of my recent experience she questioned my sanity.

This?
A beautiful mallard spotted the dove I was feeding. The duck dove supply is small this year. That dove was history in a minute. Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa. To my surprise, I ran into a friend from back home. When I told her of my recent experience she questioned my sanity.

What information do we use in segmenting a discourse?
- ‘Topic’ coherence?
- Repeated reference?
- ‘Cue’ phrases?
- ????
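Two of the cues listed above, repeated reference and cue phrases, can be sketched as a toy segmenter. This is only an illustration under loud assumptions: the cue-phrase list, the stopword list, and the rule "propose a boundary before a sentence that opens with a cue phrase or shares no content word with the previous sentence" are all hypothetical and do not come from the slides.

```python
import re

# Hypothetical cue-phrase and stopword lists, chosen only for this illustration.
CUE_PHRASES = {"well", "now", "anyway", "so", "okay"}
STOPWORDS = {"a", "the", "i", "to", "of", "for", "in", "was", "is", "my",
             "this", "that", "from", "she", "her", "into", "when", "back"}


def words(sentence):
    """Lowercased word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())


def content_words(sentence):
    """Word set with stopwords removed; a crude stand-in for 'repeated reference'."""
    return {w for w in words(sentence) if w not in STOPWORDS}


def propose_boundaries(sentences):
    """Return each index i where a segment boundary is proposed before sentence i."""
    boundaries = []
    for i in range(1, len(sentences)):
        first = words(sentences[i])[:1]
        starts_with_cue = bool(first) and first[0] in CUE_PHRASES
        shared = content_words(sentences[i - 1]) & content_words(sentences[i])
        if starts_with_cue or not shared:
            boundaries.append(i)
    return boundaries


PASSAGE = [
    "A beautiful mallard spotted the dove I was feeding.",
    "The duck dove supply is small this year.",
    "That dove was history in a minute.",
    "Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa.",
    "To my surprise, I ran into a friend from back home.",
    "When I told her of my recent experience she questioned my sanity.",
]

print(propose_boundaries(PASSAGE))
```

On this passage the repeated word "dove" keeps the first three sentences together, and the cue phrase "Well" triggers the first proposed boundary. Word overlap is a very rough proxy for coreference, so the later boundaries it proposes should be read with caution.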

Structures of Discourse Structure (Grosz & Sidner ’86)
- A leading theory of discourse structure
- Based upon Speaker intentions and Speaker and Hearer attentional state
- Identifies a few, general relations that hold among Speaker intentions
- Identifies a model of attentional state
- Three components:
  - Linguistic structure
  - Intentional structure
  - Attentional state

Linguistic Structure
- What is actually said or written
- How is the linguistic structure represented?
  - Assume discourse is segmented into Discourse Segments (DS)
- What is the basic unit of analysis?
- Do we all segment alike?
- Do we all use the same cues?

Linguistic Structure of Discourse D
S1: A beautiful mallard spotted the dove I was feeding. The duck dove supply is small this year. That dove was history in a minute.
S2: Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa. To my surprise, I ran into a friend from back home. When I told her of my recent experience she questioned my sanity.

Intentional Structure
- Discourse purpose (DP): basic purpose of the Speaker in producing the discourse
- Discourse segment purposes (DSPs): the Speaker’s purpose in producing each segment
- Segments are related to one another by their purposes:
  - Satisfaction-precedence: DSP1 must be satisfied before DSP2
  - Dominance: DSP1 dominates DSP2 if fulfilling DSP2 constitutes part of fulfilling DSP1

Linguistic Structure of Discourse D
DSP1: Describe murder of dove by duck.
S1: A beautiful mallard spotted the dove I was feeding. The duck dove supply is small this year. That dove was history in a minute.
DSP2: Describe meeting of old friend.
S2: Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa. To my surprise, I ran into a friend from back home. When I told her of my recent experience she questioned my sanity.

DSP2: Describe recovery process.
S2:
  DSP3: Describe snack.
  S3: Well, to recover from this horrible scene, I went to the park snack bar for a cup of cocoa.
  DSP4: Describe meeting old friend.
  S4: To my surprise, I ran into a friend from back home.
  DSP5: Describe friend’s reaction.
  S5: When I told her of my recent experience she questioned my sanity.
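The dominance relation can be pictured as a tree in which an embedding segment dominates the segments nested inside it. The sketch below is a hypothetical encoding, not Grosz & Sidner's formalism: the `Segment` class and the `dominates` helper are names invented here, and satisfaction-precedence is left unmodeled because the slides do not say which of these segments stand in that relation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """A discourse segment with its purpose and any embedded segments (hypothetical encoding)."""
    name: str
    purpose: str
    children: List["Segment"] = field(default_factory=list)


def dominates(outer, inner):
    """True if inner is embedded in outer at any depth, i.e. outer's DSP dominates inner's."""
    return any(child is inner or dominates(child, inner) for child in outer.children)


# The finer segmentation of the example: S2 embeds S3, S4 and S5.
s3 = Segment("S3", "Describe snack")
s4 = Segment("S4", "Describe meeting old friend")
s5 = Segment("S5", "Describe friend's reaction")
s2 = Segment("S2", "Describe recovery process", [s3, s4, s5])
s1 = Segment("S1", "Describe murder of dove by duck")

print(dominates(s2, s4))  # S2 embeds S4
print(dominates(s1, s4))  # S1 and S4 are unrelated by dominance
```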

Attentional State: The Focus Stack
- Stack of focus spaces, each containing objects, properties and relations salient during each DS, plus the DSP
- State changes: transition rules controlling the addition/deletion of focus spaces
- Information at lower levels may or may not be available at higher levels
- Focus spaces are pushed onto the stack when
  - A new DS is begun
  - An embedded DS (e.g. a DS dominated by another DS) is begun
- Focus spaces are popped when they are completed
- State of focus stack models felicitous reference, coherence in discourse

S2: DSP2, scene, Speaker, snack_bar, cocoa, friend, home, sanity
S1: DSP1, duck, dove, Speaker, duck_dove_supply
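The push and pop behaviour of the focus stack can be sketched with an ordinary list. The `FocusStack` class is a hypothetical illustration, and the split of entities between the DSP2 and DSP3 focus spaces is assumed here for the example, not taken from the slides.

```python
class FocusStack:
    """Minimal focus stack: each focus space pairs a DSP with its salient entities."""

    def __init__(self):
        self._spaces = []  # the last item is the top of the stack

    def push(self, dsp, entities):
        """Open a focus space when a new or embedded discourse segment begins."""
        self._spaces.append((dsp, set(entities)))

    def pop(self):
        """Close the top focus space when its discourse segment is completed."""
        return self._spaces.pop()

    def top(self):
        """The focus space for the segment currently in progress."""
        return self._spaces[-1]

    def depth(self):
        return len(self._spaces)


stack = FocusStack()
stack.push("DSP2", {"scene", "Speaker", "snack_bar"})  # segment S2 begins
stack.push("DSP3", {"cocoa"})                          # embedded segment begins (entity split assumed)
inner_dsp, inner_entities = stack.pop()                # embedded segment completed
print(inner_dsp, stack.top()[0], stack.depth())
```

After the embedded segment is popped, the DSP2 space is again on top, which is how the model accounts for references back to entities of the embedding segment.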

Limits of the Theory
- Assumes discourses are task-oriented
- Assumes a single, hierarchical structure shared by S and H
- Questions:
  - Do people really build such structures when they converse? Use them in interpreting what others say?
  - How could they do it?

How might people recognize discourse structure?
- Linguistic markers?
  - tense and aspect
  - cue phrases
- Inference of Speaker intentions?
- Inference from task structure?
- Intonational information?

Acoustic and Prosodic Cues to Discourse Structure
- Intuition: Speakers vary acoustic and prosodic cues to convey variation in discourse structure
  - Systematic? In read or spontaneous speech?
- Evidence:
  - Observations from recorded corpora
  - Laboratory experiments
  - Machine learning of discourse structure from acoustic/prosodic features

Prosodic Correlates of Discourse/Topic Structure
- Pitch range: Lehiste ’75, Brown et al ’83, Silverman ’86, Avesani & Vayra ’88, Ayers ’92, Swerts et al ’92, Grosz & Hirschberg ’92, Swerts & Ostendorf ’95, Hirschberg & Nakatani ’96
- Preceding pause: Lehiste ’79, Chafe ’80, Brown et al ’83, Silverman ’86, Woodbury ’87, Avesani & Vayra ’88, Grosz & Hirschberg ’92, Passonneau & Litman ’93, Hirschberg & Nakatani ’96
- Brown et al ’83, Grosz & Hirschberg ’92, Hirschberg & Nakatani ’96
- Rate: Butterworth ’75, Lehiste ’80, Grosz & Hirschberg ’92, Hirschberg & Nakatani ’96
- Amplitude: Brown et al ’83, Grosz & Hirschberg ’92, Hirschberg & Nakatani ’96
- Contour: Brown et al ’83, Woodbury ’87, Swerts et al ’92
Add Audix tree??

Issues
- Do we find significant and reliable cues to discourse structure in prosodic variation
  - When tested against an independent theory of discourse structure?
  - In spontaneous as well as read speech?
- Are Hearers’ interpretations of discourse structure influenced by intonational variation?

Grosz & Hirschberg ’92
- Small corpus of read AP newswire
- Read by professional speaker
- Labeled for discourse structure from text alone or from text and speech
- Pre-ToBI labeled
- Acoustic-prosodic features extracted for each intermediate (level 3) phrase
  - Pitch range and change from prior phrase
  - Intensity (rms) and change in dB from prior phrase
  - Preceding and subsequent pause
  - Speaking rate
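The "change from prior phrase" features can be sketched as simple differences between consecutive phrases. The field names (`pitch_range_hz`, `rms_db`) and all numeric values below are hypothetical placeholders, not measurements from the study.

```python
def add_changes(phrases):
    """Return a copy of each phrase's features plus its change from the prior phrase."""
    out = []
    prev = None
    for phrase in phrases:
        row = dict(phrase)
        for key in ("pitch_range_hz", "rms_db"):
            # The first phrase has no prior phrase, so its change is left undefined.
            row[key + "_change"] = None if prev is None else phrase[key] - prev[key]
        out.append(row)
        prev = phrase
    return out


# Hypothetical per-phrase measurements, for illustration only.
phrases = [
    {"pitch_range_hz": 100.0, "rms_db": 60.0},
    {"pitch_range_hz": 120.0, "rms_db": 65.0},
    {"pitch_range_hz": 90.0, "rms_db": 58.0},
]
features = add_changes(phrases)
for row in features:
    print(row)
```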

- Analysis of phrases in different segment positions: SBEG (segment-beginning), SF (segment-final), parentheticals, quoted speech
- ANOVAs and t-tests on means
- Results:
  - Direct quotes: larger pitch range
  - Parentheticals: smaller range, negative change from prior phrase, negative change in dB, faster rate
  - SBEG: larger range, louder, greater preceding pause, less subsequent pause
  - SF: greater subsequent pause
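A comparison of means of this kind can be sketched with a two-sample t statistic. The slides do not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the pause durations are invented placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference between the means of samples a and b."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


# Hypothetical preceding-pause durations in seconds, for illustration only.
sbeg_pause = [0.2, 0.4, 0.6]   # phrases that begin a segment
other_pause = [0.1, 0.2, 0.3]  # all other phrases
t = welch_t(sbeg_pause, other_pause)
print(round(t, 3))
```

A positive statistic means the first sample has the larger mean. Turning it into a p-value also requires the Welch-Satterthwaite degrees of freedom, which this sketch omits.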

- Machine learning experiments identified:
  - SBEG with 91.5% est. accuracy (cross-validation)
  - SF, 92.5%
  - Attributive tags, 96.9%
  - Direct quotations, 86.4%
  - Indirect quotations, 88.5%
  - Parentheticals, 89.2%
- Conclusion: Acoustic/prosodic information is available to permit Hearers to identify discourse structure…
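Estimating accuracy by cross-validation can be sketched as follows. The slides do not name the learning algorithm, so the one-feature threshold classifier, the fold assignment, and the data below are all assumptions made for illustration.

```python
def fit_stump(xs, ys):
    """Choose the threshold t that maximizes training accuracy for the rule 'x > t'."""
    best_t, best_correct = None, -1
    for t in sorted(set(xs)):
        correct = sum((x > t) == y for x, y in zip(xs, ys))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def cross_val_accuracy(xs, ys, k):
    """k-fold cross-validation: each item is tested by a model that never saw it."""
    n = len(xs)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th item forms one fold
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        t = fit_stump(train_x, train_y)
        correct += sum((xs[i] > t) == ys[i] for i in test_idx)
    return correct / n


# Hypothetical preceding-pause durations (seconds) and segment-beginning labels.
pauses = [0.1, 0.2, 0.15, 0.3, 0.9, 1.1, 0.8, 1.3]
is_sbeg = [False, False, False, False, True, True, True, True]
print(cross_val_accuracy(pauses, is_sbeg, k=4))
```

Although a single threshold separates this toy data perfectly, the cross-validated estimate comes out lower, because one held-out item falls above the threshold learned without it. That gap is why the accuracies are reported as estimates.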

Next
- The midterm
  - Closed book, no notes or electronic devices
  - Will include material through today