Discourse & Dialogue: Introduction Ling 575 A Topics in NLP March 30, 2011.

Slides:



Advertisements
Similar presentations
Introduction to: Automated Essay Scoring (AES) Anat Ben-Simon Introduction to: Automated Essay Scoring (AES) Anat Ben-Simon National Institute for Testing.
Advertisements

5/10/20151 Evaluating Spoken Dialogue Systems Julia Hirschberg CS 4706.
® Towards Using Structural Events To Assess Non-Native Speech Lei Chen, Joel Tetreault, Xiaoming Xi Educational Testing Service (ETS) The 5th Workshop.
INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING NLP-AI IIIT-Hyderabad CIIL, Mysore ICON DECEMBER, 2003.
Dialogue – Driven Intranet Search Suma Adindla School of Computer Science & Electronic Engineering 8th LANGUAGE & COMPUTATION DAY 2009.
Explaining Contrasting Solution Methods Supports Problem-Solving Flexibility and Transfer Bethany Rittle-Johnson Vanderbilt University Jon Star Michigan.
Natural Language and Speech Processing Creation of computational models of the understanding and the generation of natural language. Different fields coming.
U1, Speech in the interface:2. Dialogue Management1 Module u1: Speech in the Interface 2: Dialogue Management Jacques Terken HG room 2:40 tel. (247) 5254.
Combining Prosodic and Text Features for Segmentation of Mandarin Broadcast News Gina-Anne Levow University of Chicago SIGHAN July 25, 2004.
Spoken Language Technologies: A review of application areas and research issues Analysis and synthesis of F0 contours Agnieszka Wagner Department of Phonetics,
Prosodic Cues to Discourse Segment Boundaries in Human-Computer Dialogue SIGDial 2004 Gina-Anne Levow April 30, 2004.
Final Review CS4705 Natural Language Processing. Semantics Meaning Representations –Predicate/argument structure and FOPC Thematic roles and selectional.
Information, action and negotiation in dialogue systems Staffan Larsson Kings College, Jan 2001.
تمرين شماره 1 درس NLP سيلابس درس NLP در دانشگاه هاي ديگر ___________________________ راحله مکي استاد درس: دکتر عبدالله زاده پاييز 85.
CSD 5230 Advanced Applications in Communication Modalities 7/3/2015 AAC 1 Introduction to AAC Orientation to Course Assessment Report Writing.
Welcome: Fact or Opinion
Building the Design Studio of the Future Aaron Adler Jacob Eisenstein Michael Oltmans Lisa Guttentag Randall Davis October 23, 2004.
Prosody and NLP Seminar by Nikhil: Adith: Prachur: 06D05011 We have a presentation this Friday ?
14: THE TEACHING OF GRAMMAR  Should grammar be taught?  When? How? Why?  Grammar teaching: Any strategies conducted in order to help learners understand,
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2005 Lecture 1 21 July 2005.
AQUAINT Kickoff Meeting – December 2001 Integrating Robust Semantics, Event Detection, Information Fusion, and Summarization for Multimedia Question Answering.
Tree Kernels for Parsing: (Collins & Duffy, 2001) Advanced Statistical Methods in NLP Ling 572 February 28, 2012.
Lecture 12: 22/6/1435 Natural language processing Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
Some Thoughts on HPC in Natural Language Engineering Steven Bird University of Melbourne & University of Pennsylvania.
Interactive Dialogue Systems Professor Diane Litman Computer Science Department & Learning Research and Development Center University of Pittsburgh Pittsburgh,
Discourse Markers Discourse & Dialogue CS November 25, 2006.
Author: James Allen, Nathanael Chambers, etc. By: Rex, Linger, Xiaoyi Nov. 23, 2009.
Recognition of meeting actions using information obtained from different modalities Natasa Jovanovic TKI University of Twente.
Overview of Issues in Discourse and Dialogue Gina-Anne Levow CS Discourse and Dialogue September 25, 2006.
Chapter 7. BEAT: the Behavior Expression Animation Toolkit
1 Computational Linguistics Ling 200 Spring 2006.
ELA Common Core Shifts. Shift 1 Balancing Informational & Literary Text.
Overview of Issues in Discourse and Dialogue Gina-Anne Levow CS Discourse and Dialogue September 28, 2004.
RELATIONAL FAULT TOLERANT INTERFACE TO HETEROGENEOUS DISTRIBUTED DATABASES Prof. Osama Abulnaja Afraa Khalifah
Information Technology – Dialogue Systems Ulm University (Germany) Speech Data Corpus for Verbal Intelligence Estimation.
Topics in Artificial Intelligence: Discourse and Dialogue CS 359 Gina-Anne Levow September 25, 2001.
Introduction to CL & NLP CMSC April 1, 2003.
1 Introduction to Software Engineering Lecture 1.
MURI: Integrated Fusion, Performance Prediction, and Sensor Management for Automatic Target Exploitation 1 Dynamic Sensor Resource Management for ATE MURI.
Dept. of Computer Science University of Rochester Rochester, NY By: James F. Allen, Donna K. Byron, Myroslava Dzikovska George Ferguson, Lucian Galescu,
NLP ? Natural Language is one of fundamental aspects of human behaviors. One of the final aim of human-computer communication. Provide easy interaction.
Towards a Theoretical Framework for the Integration of Dialogue Models into Human-Agent Interaction John R. Lee Assistive Intelligence Inc. Andrew B. Williams.
October 2005CSA3180 NLP1 CSA3180 Natural Language Processing Introduction and Course Overview.
HYMES (1964) He developed the concept that culture, language and social context are clearly interrelated and strongly rejected the idea of viewing language.
1 Natural Language Processing Lecture Notes 14 Chapter 19.
Automatic Cue-Based Dialogue Act Tagging Discourse & Dialogue CMSC November 3, 2006.
Date : 2013/03/18 Author : Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell Source : CIKM’12 Speaker : Er-Gang Liu Advisor : Prof. Jia-Ling.
Recognizing Discourse Structure: Speech Discourse & Dialogue CMSC October 11, 2006.
1/21 Automatic Discovery of Intentions in Text and its Application to Question Answering (ACL 2005 Student Research Workshop )
Building & Evaluating Spoken Dialogue Systems Discourse & Dialogue CS 359 November 27, 2001.
Anchor Standards ELA Standards marked with this symbol represent Kansas’s 15%
Dialog Models September 18, 2003 Thomas Harris.
Discourse & Dialogue CS 359 November 13, 2001
Supertagging CMSC Natural Language Processing January 31, 2006.
Support Vector Machines and Kernel Methods for Co-Reference Resolution 2007 Summer Workshop on Human Language Technology Center for Language and Speech.
Lti Shaping Spoken Input in User-Initiative Systems Stefanie Tomko and Roni Rosenfeld Language Technologies Institute School of Computer Science Carnegie.
Introductions. Specialized instruction in Written Expression: The challenges of Learning to Write.
Natural conversation “When we investigate how dialogues actually work, as found in recordings of natural speech, we are often in for a surprise. We are.
Integrating Multiple Knowledge Sources For Improved Speech Understanding Sherif Abdou, Michael Scordilis Department of Electrical and Computer Engineering,
1 ICASSP Paper Survey Presenter: Chen Yi-Ting. 2 Improved Spoken Document Retrieval With Dynamic Key Term Lexicon and Probabilistic Latent Semantic Analysis.
Understanding Naturally Conveyed Explanations of Device Behavior Michael Oltmans and Randall Davis MIT Artificial Intelligence Lab.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
Agent-Based Dialogue Management Discourse & Dialogue CMSC November 10, 2006.
Predicting and Adapting to Poor Speech Recognition in a Spoken Dialogue System Diane J. Litman AT&T Labs -- Research
Some basic considerations a.The age and level of the learners who will be using the materials. b.The extent to which any adopted methodology meets the.
IB Assessments CRITERION!!!.
Natural Language Processing (NLP)
Natural Language Processing (NLP)
CS4705 Natural Language Processing
Natural Language Processing (NLP)
Presentation transcript:

Discourse & Dialogue: Introduction Ling 575 A Topics in NLP March 30, 2011

2 Roadmap Definition(s) of Discourse Different Types of Discourse Goals, Modalities Topics, Tasks in Discourse & Dialogue Course structure Overview of Theoretical Approaches Points of Agreement Points of Variance Dialogue Models and Challenges Issues and Examples in Practice Spoken dialogue systems

3 What is a Discourse? Discourse is: Extended span of text

4 What is a Discourse? Discourse is: Extended span of text Spoken or Written

5 What is a Discourse? Discourse is: Extended span of text Spoken or Written One or more participants

6 What is a Discourse? Discourse is: Extended span of text Spoken or Written One or more participants Language in Use

7 What is a Discourse? Discourse is: Extended span of text Spoken or Written One or more participants Language in Use Expresses goals of participants Processes to produce and interpret

8 Why Discourse? Understanding depends on context Referring expressions: it, that, the screen Word sense: plant Intention: Do you have the time?

9 Why Discourse? Understanding depends on context Referring expressions: it, that, the screen Word sense: plant Intention: Do you have the time? Applications: Discourse in NLP Question-Answering Information Retrieval Summarization Spoken Dialogue Automatic Essay Grading

10 Different Parameters of Discourse Number of participants Multiple participants -> Dialogue

11 Different Parameters of Discourse Number of participants Multiple participants -> Dialogue Modality Spoken vs Written

12 Different Parameters of Discourse Number of participants Multiple participants -> Dialogue Modality Spoken vs Written Goals Transactional (message passing) Interactional (relations, attitudes) Task-oriented

13 Spoken vs Written Discourse Speech Written text

14 Spoken vs Written Discourse Speech Written text No paralinguistic effects “Permanent” Off-line. Edited, Crafted More “structured” Full sentences Complex sentences Subject-Predicate Complex modification More structural markers No disfluencies

15 Spoken vs Written Discourse Speech Paralinguistic effects Intonation, gaze, gesture Transitory Real-time, on-line Less “structured” Fragments Simple, Active, Declarative Topic-Comment Non-verbal referents Disfluencies Self-repairs False Starts Pauses Written text No paralinguistic effects “Permanent” Off-line. Edited, Crafted More “structured” Full sentences Complex sentences Subject-Predicate Complex modification More structural markers No disfluencies

Major Topics & Tasks Reference: Resolution, Generation, Information Structure

Major Topics & Tasks Reference: Resolution, Generation, Information Structure Intention Recognition

Major Topics & Tasks Reference: Resolution, Generation, Information Structure Intention Recognition Discourse Structure Segmentation, Relations

Major Topics & Tasks Reference: Resolution, Generation, Information Structure Intention Recognition Discourse Structure Segmentation, Relations Fundamental components: How do they interact with dimensions of discourse? # Participants, Spoken vs Written,..

Dialogue Systems Components Dialogue Management Evaluation Turn-taking Politeness Stylistics

Course Structure Discussion-oriented course:

Course Structure Discussion-oriented course: Class participation

Course Structure Discussion-oriented course: Class participation Presentations Topic survey

Course Structure Discussion-oriented course: Class participation Presentations Topic survey Project: Proposal Progress Final report

Course Perspectives Foundational: Linguistic view: Understanding basic discourse phenomena Analyzing language use in context

Course Perspectives Foundational: Linguistic view: Understanding basic discourse phenomena Analyzing language use in context Practical/Implementational: Computational view: Developing systems and algorithms for discourse tasks

Course Projects Reflect linguistic and/or computational perspectives

Course Projects Reflect linguistic and/or computational perspectives Option 1: Analytic (Required for Ling elective credit) In-depth analysis of linguistic discourse phenomena Reflect understanding of literature Analyze real data ~15 page term paper

Course Projects Reflect linguistic and/or computational perspectives Option 1: Analytic (Required for Ling elective credit) In-depth analysis of linguistic discourse phenomena Reflect understanding of literature Analyze real data ~15 page term paper Option 2: Implementational Implement, extend algorithms for discourse/dialogue tasks Shorter write-up of approach, evaluation

30 U: Where is A Bug’s Life playing in Summit? S: A Bug’s Life is playing at the Summit theater. U: When is it playing there? S: It’s playing at 2pm, 5pm, and 8pm. U: I’d like 1 adult and 2 children for the first show. How much would that cost? Reference & Knowledge Knowledge sources: From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

31 U: Where is A Bug’s Life playing in Summit? S: A Bug’s Life is playing at the Summit theater. U: When is it playing there? S: It’s playing at 2pm, 5pm, and 8pm. U: I’d like 1 adult and 2 children for the first show. How much would that cost? Reference & Knowledge Knowledge sources: Domain knowledge From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

32 U: Where is A Bug’s Life playing in Summit? S: A Bug’s Life is playing at the Summit theater. U: When is it playing there? S: It’s playing at 2pm, 5pm, and 8pm. U: I’d like 1 adult and 2 children for the first show. How much would that cost? Reference & Knowledge Knowledge sources: Domain knowledge Discourse knowledge From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

33 U: Where is A Bug’s Life playing in Summit? S: A Bug’s Life is playing at the Summit theater. U: When is it playing there? S: It’s playing at 2pm, 5pm, and 8pm. U: I’d like 1 adult and 2 children for the first show. How much would that cost? Reference &Knowledge Knowledge sources: Domain knowledge Discourse knowledge World knowledge From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

34 U: What time is A Bug’s Life playing at the Summit theater? Intention Recognition From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

35 U: What time is A Bug’s Life playing at the Summit theater? Intention Recognition Using keyword extraction and vector-based similarity measures: From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

36 U: What time is A Bug’s Life playing at the Summit theater? Intention Recognition Using keyword extraction and vector-based similarity measures: Intention: Ask-Reference: _time From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

37 U: What time is A Bug’s Life playing at the Summit theater? Intention Recognition Using keyword extraction and vector-based similarity measures: Intention: Ask-Reference: _time Movie: A Bug’s Life From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

38 U: What time is A Bug’s Life playing at the Summit theater? Intention Recognition Using keyword extraction and vector-based similarity measures: Intention: Ask-Reference: _time Movie: A Bug’s Life Theater: the Summit quadplex From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

39 Computational Models of Discourse 1) Hobbs (1985): Discourse coherence based on small number of recursively applied relations

40 Computational Models of Discourse 1) Hobbs (1985): Discourse coherence based on small number of recursively applied relations 2) Grosz & Sidner (1986): Attention (Focus), Intention (Goals), and Structure (Linguistic) of Discourse

41 Computational Models of Discourse 1) Hobbs (1985): Discourse coherence based on small number of recursively applied relations 2) Grosz & Sidner (1986): Attention (Focus), Intention (Goals), and Structure (Linguistic) of Discourse 3) Mann & Thompson (1987): Rhetorical Structure Theory: Hierarchical organization of text spans (nucleus/satellite) based on small set of rhetorical relations

42 Computational Models of Discourse 1) Hobbs (1985): Discourse coherence based on small number of recursively applied relations 2) Grosz & Sidner (1986): Attention (Focus), Intention (Goals), and Structure (Linguistic) of Discourse 3) Mann & Thompson (1987): Rhetorical Structure Theory: Hierarchical organization of text spans (nucleus/satellite) based on small set of rhetorical relations 4) McKeown (1985): Hierarchical organization of schemata

43 Computational Models of Discourse 1) Hobbs (1985): Discourse coherence based on small number of recursively applied relations 2) Grosz & Sidner (1986): Attention (Focus), Intention (Goals), and Structure (Linguistic) of Discourse 3) Mann & Thompson (1987): Rhetorical Structure Theory: Hierarchical organization of text spans (nucleus/satellite) based on small set of rhetorical relations 4) McKeown (1985): Hierarchical organization of schemata

44 Discourse Models: Common Features Hierarchical, Sequential structure applied to subunits Discourse “segments” Need to detect, interpret

45 Discourse Models: Common Features Hierarchical, Sequential structure applied to subunits Discourse “segments” Need to detect, interpret Referring expressions provide coherence Explain and link

46 Discourse Models: Common Features Hierarchical, Sequential structure applied to subunits Discourse “segments” Need to detect, interpret Referring expressions provide coherence Explain and link Meaning of discourse more than that of component utterances

47 Discourse Models: Common Features Hierarchical, Sequential structure applied to subunits Discourse “segments” Need to detect, interpret Referring expressions provide coherence Explain and link Meaning of discourse more than that of component utterances Meaning of units depends on context

48 Theoretical Differences Informational ( Hobbs/RST) Meaning and coherence/reference based on inference/abduction Versus

49 Theoretical Differences Informational ( Hobbs/RST) Meaning and coherence/reference based on inference/abduction Versus Intentional (G&S) Meaning based on (collaborative) planning and goal recognition, coherence based on focus of attention

50 Theoretical Differences Informational ( Hobbs/RST) Meaning and coherence/reference based on inference/abduction Versus Intentional (G&S) Meaning based on (collaborative) planning and goal recognition, coherence based on focus of attention “Syntax” of dialog act sequences versus

51 Theoretical Differences Informational ( Hobbs/RST) Meaning and coherence/reference based on inference/abduction Versus Intentional (G&S) Meaning based on (collaborative) planning and goal recognition, coherence based on focus of attention “Syntax” of dialog act sequences versus Rational, plan-based interaction

52 Challenges Relations: What type: Text, Rhetorical, Informational, Intention, Speech Act? How many? What level of abstraction?

53 Challenges Relations: What type: Text, Rhetorical, Informational, Intention, Speech Act? How many? What level of abstraction? Are discourse segments psychologically real or just useful? How can they de recognized/generated automatically?

54 Challenges Relations: What type: Text, Rhetorical, Informational, Intention, Speech Act? How many? What level of abstraction? Are discourse segments psychologically real or just useful? How can they de recognized/generated automatically? How do you define and represent “context”? How does representation interact with ambiguity resolution (sense/reference)

55 Challenges Relations: What type: Text, Rhetorical, Informational, Intention, Speech Act? How many? What level of abstraction? Are discourse segments psychologically real or just useful? How can they de recognized/generated automatically? How do you define and represent “context”? How does representation interact with ambiguity resolution (sense/reference) How do you identify topic, reference, and focus?

56 Challenges Relations: What type: Text, Rhetorical, Informational, Intention, Speech Act? How many? What level of abstraction? Are discourse segments psychologically real or just useful? How can they de recognized/generated automatically? How do you define and represent “context”? How does representation interact with ambiguity resolution (sense/reference) How do you identify topic, reference, and focus? Identifying relations without cues? Discourse and domain structures

57 Challenges Relations: What type: Text, Rhetorical, Informational, Intention, Speech Act? How many? What level of abstraction? Are discourse segments psychologically real or just useful? How can they de recognized/generated automatically? How do you define and represent “context”? How does representation interact with ambiguity resolution (sense/reference) How do you identify topic, reference, and focus? Identifying relations without cues? Computational complexity of planning/plan recognition Discourse and domain structures

58 Dialogue Modeling Two or more participants – spoken or text Often focus on task-oriented collaborative dialogue

59 Dialogue Modeling Two or more participants – spoken or text Often focus on task-oriented collaborative dialogue Models: Dialogue Grammars: Sequential, hierarchical constraints on dialogue states with speech acts as terminals Small finite set of dialogue acts, often “adjacency pairs” Question/response, check/confirm

60 Dialogue Modeling Two or more participants – spoken or text Often focus on task-oriented collaborative dialogue Models: Dialogue Grammars: Sequential, hierarchical constraints on dialogue states with speech acts as terminals Small finite set of dialogue acts, often “adjacency pairs” Question/response, check/confirm Plan-based Models: Dialogue as special case of rational interaction, model partner goals, plans, actions to extend

61 Dialogue Modeling Two or more participants – spoken or text Often focus on task-oriented collaborative dialogue Models: Dialogue Grammars: Sequential, hierarchical constraints on dialogue states with speech acts as terminals Small finite set of dialogue acts, often “adjacency pairs” Question/response, check/confirm Plan-based Models: Dialogue as special case of rational interaction, model partner goals, plans, actions to extend Multi-layer Models: Incorporate high-level domain plan, discourse plan, adjacency pairs

62 Dialogue Modeling Challenges How rigidly do speakers adhere to dialogue grammars? How many acts? Which ones? How can we recognize these acts? Pairs? Larger structures?

63 Dialogue Modeling Challenges How rigidly do speakers adhere to dialogue grammars? How many acts? Which ones? How can we recognize these acts? Pairs? Larger structures? Mental models How do we model the beliefs and knowledge state of speakers?

64 Dialogue Modeling Challenges How rigidly do speakers adhere to dialogue grammars? How many acts? Which ones? How can we recognize these acts? Pairs? Larger structures? Mental models How do we model the beliefs and knowledge state of speakers? Discourse and domain structures

65 Practical Considerations Full reference resolution, interpretation:

66 Practical Considerations Full reference resolution, planning: Worst case NP-complete, AI-complete

67 Practical Considerations Full reference resolution, planning: Worst case NP-complete, AI-complete Systems must be (close to) real-time

68 Practical Considerations Full reference resolution, planning: Worst case NP- complete, AI-complete Systems must be (close to) real-time Complex models of reference -> Interaction history Often stack-based recency of mention

69 Practical Considerations Full reference resolution, planning: Worst case NP-complete, AI-complete Systems must be (close to) real-time Complex models of reference -> Interaction history Often stack-based recency of mention Planning/Inference -> state-based interaction model

70 Practical Considerations Full reference resolution, planning: Worst case NP-complete, AI-complete Systems must be (close to) real-time Complex models of reference -> Interaction history Often stack-based recency of mention Planning/Inference -> state-based interaction model Questions: Initiative (system/user driven?) Corpus collection Evaluation

71 Spoken Dialogue Modeling Building interactive spoken language systems Based on speech recognition and (often) synthesis

72 Spoken Dialogue Modeling Building interactive spoken language systems Based on speech recognition and (often) synthesis Dominated by practical considerations Limitations of: speech recognizer accuracy, speed, coverage; speech synthesizer speed, fluency, naturalness; plan/intention recognition and reasoning speech and effectiveness

73 Spoken Dialogue Modeling Building interactive spoken language systems Based on speech recognition and (often) synthesis Dominated by practical considerations Limitations of: speech recognizer accuracy, speed, coverage; speech synthesizer speed, fluency, naturalness; plan/intention recognition and reasoning speech and effectiveness Often simplistic but implementable models

74 Spoken Dialogue Modeling Building interactive spoken language systems Based on speech recognition and (often) synthesis Dominated by practical considerations Limitations of: speech recognizer accuracy, speed, coverage; speech synthesizer speed, fluency, naturalness; plan/intention recognition and reasoning speech and effectiveness Often simplistic but implementable models Design and evaluation challenges What is the best dialogue? Fastest? Fewest errors? Most “natural”?

75 Dialogue Management: Motivating Examples Dialogue 1: S: Would you like movie showtime or theater playlist information? U: Movie showtime. S: What movie do you want showtime information about? U: Saving Private Ryan. S: At what theater do you want to see Saving Private Ryan? U: Paramount theater. S: Saving Private Ryan is not playing at the Paramount theater. From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

76 DM: Motivating Examples (Cont’d) Dialogue 2: S: How may I help you? U: When is Saving Private Ryan playing? S: For what theater? U: The Paramount theater. S: Saving Private Ryan is not playing at the Paramount theater, but it’s playing at the Madison theater at 3:00, 5:30, 8:00, and 10:30. From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

77 DM: Motivating Examples (Cont’d) Dialogue 3: S: How may I help you? U: When is Saving Private Ryan playing? S: For what theater? U: The Paramount theater. S: Saving Private Ryan is playing at the Fairmont theater at 6:00 and 8:30. U: I wanted to know about the Paramount theater, not the Fairmont theater. S: Saving Private Ryan is not playing at the Paramount theater, but it’s playing at the Madison theater at 3:00, 5:30, 8:00, and 10:30. From Carpenter and Chu-Carroll, Tutorial on Spoken Dialogue Systems, ACL ‘99

78 Comparison of Sample Dialogues Dialogue 1: System-initiative Implicit confirmation Merely informs user of failed query Mechanical Least efficient Dialogue 2: Mixed-initiative No confirmation Suggests alternative when query fails More natural Most efficient Dialogue 3: –Mixed-initiative –No confirmation –Suggests alternative when query fails –More natural –Moderately efficient

79 Comparison of Sample Dialogues Dialogue 1: System-initiative Dialogue 2: Mixed-initiative Dialogue 3: –Mixed-initiative

80 Comparison of Sample Dialogues Dialogue 1: System-initiative Implicit confirmation Dialogue 2: Mixed-initiative No confirmation Dialogue 3: –Mixed-initiative –No confirmation

81 Comparison of Sample Dialogues Dialogue 1: System-initiative Implicit confirmation Merely informs user of failed query Dialogue 2: Mixed-initiative No confirmation Suggests alternative when query fails Dialogue 3: –Mixed-initiative –No confirmation –Suggests alternative when query fails

82 Comparison of Sample Dialogues Dialogue 1: System-initiative Implicit confirmation Merely informs user of failed query Mechanical Least efficient Dialogue 2: Mixed-initiative No confirmation Suggests alternative when query fails More natural Most efficient Dialogue 3: –Mixed-initiative –No confirmation –Suggests alternative when query fails –More natural –Moderately efficient

83 Dialogue Evaluation System-initiative, explicit confirmation better task success rate lower WER longer dialogues fwer recovery subdialogues less natural Mixed-initiative, no confirmation lower task success rate higher WER shorter dialogues more recovery subdialogues more natural Candidate measures from Chu-Carroll and Carpenter

84 Dialogue System Evaluation Black box: Task accuracy wrt solution key Simple, but glosses over many features of interaction

85 Dialogue System Evaluation Black box: Task accuracy wrt solution key Simple, but glosses over many features of interaction Glass box: Component-level evaluation: E.g. Word/Concept Accuracy, Task success, Turns-to- complete More comprehensive, but Independence? Generalization?

86 Dialogue System Evaluation Black box: Task accuracy wrt solution key Simple, but glosses over many features of interaction Glass box: Component-level evaluation: E.g. Word/Concept Accuracy, Task success, Turns-to-complete More comprehensive, but Independence? Generalization? Performance function: PARADISE[Walker et al]: Incorporates user satisfaction surveys, glass box metrics Linear regression: relate user satisfaction, completion costs

87 Controls flow of dialogue –Openings, Closings, Politeness, Clarification,Initiative –Link interface to backend systems Mechanisms: increasing flexibility, complexity –Finite-state –Template-based –Learning-based Acquisition –Hand-coding, probabilistic dialogue grammars, automata, HMMs Dialogue Management

88 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty How do we integrate different information sources? Task plans and discourse plans Multi-modal cues: Multi-scale syntax, semantics, cue words, intonation, gaze, gesture How can we learn? Cues to discourse structure Dialogue strategies, models

89 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty How do we integrate different information sources? Task plans and discourse plans Multi-modal cues: Multi-scale syntax, semantics, cue words, intonation, gaze, gesture How can we learn? Cues to discourse structure Dialogue strategies, models

90 Broad Challenges How should we represent discourse?

91 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty

92 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty How do we integrate different information sources? Task plans and discourse plans Multi-modal cues: Multi-scale syntax, semantics, cue words, intonation, gaze, gesture How can we learn? Cues to discourse structure Dialogue strategies, models

93 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty How do we integrate different information sources?

94 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty How do we integrate different information sources? Task plans and discourse plans Multi-modal cues: Multi-scale syntax, semantics, cue words, intonation, gaze, gesture How can we learn?

95 Broad Challenges How should we represent discourse? One general model? Fundamentally different? Text/Speech; Monologue/Multiparty How do we integrate different information sources? Task plans and discourse plans Multi-modal cues: Multi-scale syntax, semantics, cue words, intonation, gaze, gesture How can we learn? Cues to discourse structure Dialogue strategies, models

96 Relation Recognition: Intention (Redux) Goals: Match utterance with 1+ dialogue acts, capture information Sample dialogue actions: –Maptask Acknowledgement Instruction/Explanation/Clarification Alignment/Check Question Yes-No/Other Question Affirmative/Negative Reply Other Reply Ready Unidentifiable

97 Relation Recognition: Intention Knowledge sources: Overall dialogue goals Orthographic features, e.g.: punctuation cue words/phrases: “but”, “furthermore”, “so” transcribed words: “would you please”, “I want to” Dialogue history, i.e., previous dialogue act types Dialogue structure, e.g.: subdialogue boundaries, dialogue games dialogue topic changes Prosodic features of utterance: duration, pause, F0, speaking rate Empirical methods/ Manual rule construction: Probabilistic dialogue act classifiers: HMMs Rule-based dialogue act recognition: CART, Transformation-based learning

98 Corpus Collection How would someone accomplish task? What would they say? Sample interaction collection: Wizard-of-Oz: Simulate all or part of a system Subjects interact Provides data for modeling, training, etc