Research Challenges for Spoken Language Dialog Systems Julie Baca, Ph.D. Center for Advanced Vehicular Systems Mississippi State University Computer Science.

Presentation transcript:

Research Challenges for Spoken Language Dialog Systems Julie Baca, Ph.D. Center for Advanced Vehicular Systems Mississippi State University Computer Science Graduate Seminar November 27, 2002

Overview: define dialog systems; describe research issues; present current work; give conclusions and discuss future work.

What is a Dialog System? Current commercial voice products require adherence to a “command and control” language, e.g., User: “Plan Route”. Such interfaces are not robust to variations from the fixed words and phrases.

What is a Dialog System? Dialog systems seek to provide a natural conversational interaction between the user and the computer system, e.g., User: “Is there a way I can get to Canal Street from here?”
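The brittleness of a command-and-control interface can be illustrated with a minimal sketch. The command set and the function name `command_and_control` are assumptions made for illustration, not part of any product described in the slides.

```python
from typing import Optional

# Hypothetical fixed vocabulary; a real product would define its own command set.
COMMANDS = {"plan route", "cancel route", "repeat"}


def command_and_control(utterance: str) -> Optional[str]:
    """Return the matched command, or None if the utterance is not an exact command."""
    normalized = utterance.lower().strip(" ?.!")
    return normalized if normalized in COMMANDS else None


# The fixed phrase is accepted; the conversational request is rejected outright.
print(command_and_control("Plan Route"))
print(command_and_control("Is there a way I can get to Canal Street from here?"))
```

A dialog system aims to accept the second utterance as well, which is why the following slides turn to semantics rather than exact phrase matching.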

Domains for Dialog Systems: travel reservation; weather forecasting; in-vehicle driver assistance; on-line learning environments.

Dialog Systems: Information Flow Must model two-way flow of information: user-to-system and system-to-user.

Dialog System

Research Issues Many fundamental problems must be solved for these systems to mature. Three general areas include: Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Human-Computer Interaction (HCI).

NLP Issue for Dialog Systems: Semantics Must assess meaning, not just syntactic correctness. Therefore, must handle ungrammatical inputs, e.g., “The ……nearest.....station is… …is there a gas station nearby?”

NLP Issue: Semantic Representation 1 For NLP, use semantic grammars Semantic frame with slots and fillers: -> -> “nearest” -> “gas station”

NLP Issue: Semantic Representation 2 Must also represent: “How do I get from Canal Street to Royal Street?” -> -> | -> “Canal St”| “Royal St” -> -> “nearest”|“closest” …
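One way to picture a semantic frame with slots and fillers is slot spotting: each slot has its own pattern, and whatever matches fills the frame regardless of surrounding disfluencies. This is only a sketch. The slot names (`proximity`, `place`, `origin`, `destination`) and the tiny vocabularies are assumptions for illustration and are not taken from the slides.

```python
import re

# Hypothetical slot patterns; names and vocabularies are illustrative only.
SLOT_PATTERNS = {
    "proximity": re.compile(r"\b(nearest|closest)\b", re.I),
    "place": re.compile(r"\b(gas station|restaurant|video store)\b", re.I),
    "origin": re.compile(r"\bfrom ((?:Canal|Royal) (?:Street|St))\b", re.I),
    "destination": re.compile(r"\bto ((?:Canal|Royal) (?:Street|St))\b", re.I),
}


def parse_frame(utterance: str) -> dict:
    """Fill a semantic frame with every slot whose pattern occurs in the utterance."""
    frame = {}
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            frame[slot] = match.group(1).lower()
    return frame


# Works on the ungrammatical input because only the slot fillers matter.
print(parse_frame("The ……nearest.....station is… …is there a gas station nearby?"))
print(parse_frame("How do I get from Canal Street to Royal Street?"))
```

Because the patterns ignore word order and filler words, the sketch assesses meaning rather than syntactic correctness, at the cost of accepting some nonsensical inputs.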

NLP Issue: Semantic Representation 3 Two approaches: (1) Hand-craft the grammar for the application, using robust parsing to understand meaning [1,2]. Problem: time and expense. (2) Use a statistical approach, generating initial rules and using annotated treebanked data to discover the full rule set [3,4]. Problem: requires annotated training data.

ASR/NLP Issue: Reducing Errors Most systems use a loose coupling of ASR and NLP. Try earlier integration of semantics with the recognizer. Incorporate dialog “state” into the underlying statistical model. Problems: increases the search space; requires training data.
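One simple reading of "incorporating dialog state" is to rescore the recognizer's N-best list with a state-dependent prior over intents. The sketch below assumes that reading. The table `STATE_PRIOR`, its probabilities, the state names, and the function `rescore` are all hypothetical, and a tightly integrated system would apply such knowledge inside the search rather than afterwards.

```python
import math

# Hypothetical prior P(intent | dialog state); the numbers are made up for illustration.
STATE_PRIOR = {
    "awaiting_destination": {"route": 0.8, "find_place": 0.2},
    "idle": {"route": 0.5, "find_place": 0.5},
}


def rescore(nbest, state, weight=1.0):
    """Pick the best hypothesis from (text, intent, recognizer_log_score) tuples.

    The recognizer score is combined with the log prior of the hypothesis
    intent given the current dialog state.
    """
    def combined(hypothesis):
        _, intent, log_score = hypothesis
        return log_score + weight * math.log(STATE_PRIOR[state][intent])

    return max(nbest, key=combined)[0]


nbest = [
    ("find canal street", "find_place", -10.0),
    ("to canal street", "route", -10.5),
]
print(rescore(nbest, "idle"))
print(rescore(nbest, "awaiting_destination"))
```

The same list yields a different winner depending on the state, which is the intended effect; the cost is that every state needs its own statistics, which relates to the training data problem named on the slide.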

NLP Issue: Resolving Meaning Using Context Must maintain knowledge of the conversational context. After a request for the nearest gas station, the user says, “What is it close to?” Resolving “it” is anaphora. Another follow-up by the user: “How about …restaurant?” Resolving “…” with “nearest” is ellipsis.

Resolving Meaning: Discourse Analysis To resolve such requests, system must track context of the conversation. This is typically handled by a discourse analysis component in the Dialog Manager.

Dialog Manager: Discourse Analysis Anaphora resolution approach: Use focus mechanism, assuming conversation has focus [5]. For our example, “gas station” is current focus. But how about: “I’m at Food Max. How do I get to a gas station close to it and a video store close to it?” Problem: Resolving the two “its”.
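A single-focus mechanism for the gas station example can be sketched as follows. The class `DiscourseContext`, the frame keys (`referent`, `place`, `proximity`, `query`), and the rule that the most recently requested place becomes the focus are assumptions for illustration, not the mechanism of [5].

```python
from typing import Optional


class DiscourseContext:
    """Tracks one focus entity and the last place request (hypothetical design)."""

    def __init__(self) -> None:
        self.focus: Optional[str] = None
        self.last_request: dict = {}

    def resolve(self, frame: dict) -> dict:
        resolved = dict(frame)
        # Anaphora: the pronoun "it" is replaced by the entity currently in focus.
        if resolved.get("referent") == "it" and self.focus is not None:
            resolved["referent"] = self.focus
        # Ellipsis: a place request without a proximity inherits the previous one.
        if "place" in resolved:
            if "proximity" not in resolved and "proximity" in self.last_request:
                resolved["proximity"] = self.last_request["proximity"]
            self.focus = resolved["place"]
            self.last_request = resolved
        return resolved


ctx = DiscourseContext()
ctx.resolve({"proximity": "nearest", "place": "gas station"})
print(ctx.resolve({"query": "close_to", "referent": "it"}))  # "What is it close to?"
print(ctx.resolve({"place": "restaurant"}))                  # "How about …restaurant?"
```

With only one focus slot, this sketch has no principled way to handle the Food Max utterance: once "gas station" is mentioned it would displace "Food Max" as the focus, so the second "it" could be resolved to the wrong entity.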

Dialog System

Dialog Manager: Clarification Often cannot satisfy request in one iteration. The previous example may require clarification from the user, “Do you want to go to the gas station first?”

HCI Issue: System vs. User Initiative What level of control do you provide user in the conversation?

Mixed Initiative Total system initiative provides low usability. Total user initiative introduces higher error rate. Thus, mixed initiative approach, balancing usability and error rate, is taken most often. Allowing user to adapt the level explicitly has also shown merit [6].

ASR/HCI Issue: Error Handling How to handle possible errors? Assign confidence score to result of recognizer. For results with lower confidence score, request clarification or revert to system-oriented initiative. Can incorporate dialog state in computing confidence score [7].
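The confidence-based strategy can be sketched as a simple threshold policy. The threshold values and the action names are assumptions for illustration; a deployed system would tune them on data.

```python
def choose_action(confidence: float, accept: float = 0.8, clarify: float = 0.5) -> str:
    """Map a recognizer confidence score in [0, 1] to a dialog action.

    The thresholds are hypothetical placeholders, not recommended values.
    """
    if confidence >= accept:
        return "accept"             # act on the recognized request
    if confidence >= clarify:
        return "clarify"            # ask the user to confirm or rephrase
    return "system_initiative"      # fall back to system-directed prompts


for score in (0.93, 0.61, 0.20):
    print(score, choose_action(score))
```

If dialog state is folded into the confidence score, as suggested in [7], only the computation of `confidence` changes; the policy itself can stay the same.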

HCI Issue: Response Generation How to present the response to the user in a way that minimizes cognitive load? Varies depending on whether output is speech-only or speech/visual. Speech-only output must respect user short-term memory limitations, e.g., lists must be short, timed appropriately, and allow repetition. Speech/visual output must be complementary, e.g., importance of redundancy and timing.

HCI Issue: Evaluating Dialog Systems How to compare and evaluate dialog systems? PARADISE (Paradigm for Dialog Systems Evaluation) provides a standard framework [8].

PARADISE: Evaluating Dialog Systems Task success: Was the necessary information exchanged? Efficiency/Cost: number of dialog turns, task completion time. Qualitative: ASR rejections, timeouts, helps. Usability: user satisfaction with ASR, task ease, interaction pace, system response.
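PARADISE [8] combines these measures into a single performance score: roughly, a weighted task-success term minus weighted cost terms, with each measure normalized to z-scores so that different units are comparable. The sketch below assumes that form. The weights and sample data are hypothetical (in PARADISE the weights are estimated by regression against user satisfaction), and the use of the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Normalize values to zero mean and unit variance (all zeros if constant)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def performance(task_success, costs, alpha, weights):
    """Per-dialog score: alpha * N(success) - sum_i weights[i] * N(cost_i).

    task_success: one success measure per dialog.
    costs: dict mapping a cost name to one value per dialog.
    """
    n_success = z_scores(task_success)
    n_costs = {name: z_scores(values) for name, values in costs.items()}
    return [
        alpha * n_success[i] - sum(weights[name] * n_costs[name][i] for name in costs)
        for i in range(len(task_success))
    ]


# Three hypothetical dialogs: higher success and fewer turns give a higher score.
scores = performance(
    task_success=[1.0, 0.5, 0.0],
    costs={"turns": [10, 20, 30]},
    alpha=0.5,
    weights={"turns": 0.5},
)
print(scores)
```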

Current Work Sponsored by CAVS. Examining: in-vehicle environment; manufacturing environment. Multidisciplinary team: CS, ECE, IE (Baca, Picone, Duffy); ECE graduate students (Hualin Gao, Zheng Feng).

Current Work: In-vehicle Dialog System Specific ASR issues for the in-vehicle environment: real-time performance; noise cancellation.

Current Work: In-vehicle Dialog System Other significant issues: reducing error rate; graceful error handling and mixed initiative strategy; response generation to reduce user cognitive load; evaluation.

Current Work: In-vehicle Dialog System Approach: develop prototype in-vehicle system; initial focus on ASR and NLP issues; integrate real-time recognizer [9]; employ noise-cancellation techniques [10]; use semantic grammar for NLP; examine tighter integration of ASR and NLP; incorporate dialog state in underlying statistical models for ASR.

Current Work: In-vehicle Dialog System Second phase, focus on: response generation; mixed initiative strategies; evaluation.

Current Work: Workforce Training Dialog System Significant issues in manufacturing environment. Recognition issues: real-time performance; noisy environments. Understanding issues: multimodal interface for reducing error rate, e.g., voice and pen [11]. HCI/human factors issues: response generation to integrate speech and visual output.

Research Significance Advance the development of dialog systems technology through addressing fundamental issues as they arise in the automotive domains. Potential areas: ASR, NLP, HCI

References
[1] S.J. Young and C.E. Proctor, “The design and implementation of dialogue control in voice operated database inquiry systems,” Computer Speech and Language, vol. 3, no. 4.
[2] W. Ward, “Understanding spontaneous speech,” in Proceedings of International Conference on Acoustics, Speech and Signal Processing, Toronto, Canada, 1991.
[3] R. Pieraccini and E. Levin, “Stochastic representation of semantic structure for speech understanding,” Speech Communication, vol. 11, no. 2.
[4] Y. Wang and A. Acero, “Evaluation of spoken grammar learning in the ATIS domain,” in Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Orlando, Florida.
[5] C. Sidner, “Focusing in the comprehension of definite anaphora,” in Computational Models of Discourse, M. Brady and R. Berwick, eds., Cambridge, MA: The MIT Press, 1983.
[6] D. Litman and S. Pan, “Empirically evaluating an adaptable spoken language dialog system,” in Proceedings of the International Conference on User Modeling, UM ’99, Banff, Canada.
[7] S. Pradhan and W. Ward, “Estimating Semantic Confidence for Spoken Dialogue Systems,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2002), Orlando, Florida, USA, May 2002.

References
[8] M. Walker, et al., “PARADISE: A Framework for Evaluating Spoken Dialogue Agents,” Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL-97).
[9] F. Zheng, J. Hamaker, F. Goodman, B. George, N. Parihar, and J. Picone, “The ISIP 2001 NRL Evaluation for Recognition of Speech in Noisy Environments,” presented at the Speech In Noisy Environments (SPINE) Workshop, Orlando, Florida, USA, November.
[10] F. Zheng and J. Picone, “Robust Low Perplexity Voice Interfaces,” MITRE Corporation, December 31.
[11] S. Oviatt, “Taming Speech Recognition Errors within a Multimodal Interface,” Communications of the ACM, Sept. 2000, vol. 43, no. 9 (special issue on “Conversational Interfaces”).