16.0 Spoken Dialogues References: 1. 11.1 - 11.2.1, Chapter 17 of Huang 2. “Conversational Interfaces: Advances and Challenges”, Proceedings of the IEEE,

Slides:



Advertisements
Similar presentations
Spoken Dialogue Systems SIG-AI Fall 2003 By: Sachin Kamboj.
Advertisements

Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2007) Learning for Semantic Parsing Advisor: Hsin-His.
15.0 Utterance Verification and Keyword/Key Phrase Spotting References: 1. “Speech Recognition and Utterance Verification Based on a Generalized Confidence.
Spoken Dialogue Technology Achievements and Challenges
© GyrusLogic, Inc AI and Natural Language VUI design Peter Trompetter, VP Global Development GyrusLogic, Inc. Tuesday, August 21 at 1:30 PM - C202.
Languages & The Media, 5 Nov 2004, Berlin 1 New Markets, New Trends The technology side Stelios Piperidis
Dialogue – Driven Intranet Search Suma Adindla School of Computer Science & Electronic Engineering 8th LANGUAGE & COMPUTATION DAY 2009.
U1, Speech in the interface:2. Dialogue Management1 Module u1: Speech in the Interface 2: Dialogue Management Jacques Terken HG room 2:40 tel. (247) 5254.
EE 4272Spring, 2003 EE4272: Computer Networks Instructor: Tricia Chigan Dept.: Elec. & Comp. Eng. Spring, 2003.
1 error handling – Higgins / Galatea Dialogs on Dialogs Group July 2005.
Equal-party Conversation System for Language Learning Chih-yu Chao (advisor: Stephanie Seneff) April 14 th, 2006 Dialogs on Dialogs Reading Group.
1 Hidden Markov Model Instructor : Saeed Shiry  CHAPTER 13 ETHEM ALPAYDIN © The MIT Press, 2004.
6/28/20151 Spoken Dialogue Systems: Human and Machine Julia Hirschberg CS 4706.
Statistical Natural Language Processing. What is NLP?  Natural Language Processing (NLP), or Computational Linguistics, is concerned with theoretical.
Semantic Parsing for Robot Commands Justin Driemeyer Jeremy Hoffman.
IVR PROCESS. Introduction IVR is a technology that allows a computer to interact with humans through the use of voice and DTMF tones input via keypad.
© GyrusLogic, Inc A Conversational System That Reduces Development Cost Luis Valles, Chief Scientist GyrusLogic, Inc. Monday, August 7 at 1:30 PM.
Chapter 6 – Architectural Design Lecture 2 1Chapter 6 Architectural design.
MMP - M204 Information Design/Cross Media Publishing - Spoken Language Interfaces - Dr. Ingrid Kirschning (UDLA)1 4. Speech Synthesis –Introduction to.
9/8/20151 Natural Language Processing Lecture Notes 1.
Lecture 12: 22/6/1435 Natural language processing Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
Role-plays for CALL: System Architecture and Resources Sabrina Wilske & Magdalena Wolska Saarland University ICL, Villach, September.
Some Thoughts on HPC in Natural Language Engineering Steven Bird University of Melbourne & University of Pennsylvania.
Spoken Dialogue Systems and the GALAXY Architecture 29 October 2000 Advanced Technology Laboratories 1 Federal Street A&E Building 2W Camden, New Jersey.
Interactive Dialogue Systems Professor Diane Litman Computer Science Department & Learning Research and Development Center University of Pittsburgh Pittsburgh,
Author: James Allen, Nathanael Chambers, etc. By: Rex, Linger, Xiaoyi Nov. 23, 2009.
Research Challenges for Spoken Language Dialog Systems Julie Baca, Ph.D. Assistant Research Professor Center for Advanced Vehicular Systems Mississippi.
Develop a fast semantic decoder for dialogue systems Capability to parse 10 – 100 ASR hypotheses in real time Robust to speech recognition noise Semantic.
Spoken dialog for e-learning supported by domain ontologies Dario Bianchi, Monica Mordonini and Agostino Poggi Dipartimento di Ingegneria dell’Informazione.
17.0 Spoken Dialogues References: , Chapter 17 of Huang 2. “Conversational Interfaces: Advances and Challenges”, Proceedings of the IEEE,
Research Challenges for Spoken Language Dialog Systems Julie Baca, Ph.D. Center for Advanced Vehicular Systems Mississippi State University Computer Science.
Lessons Learned Mokusei: Multilingual Conversational Interfaces Future Plans Explore language-independent approaches to speech understanding and generation.
17.0 Distributed Speech Recognition and Wireless Environment References: 1. “Quantization of Cepstral Parameters for Speech Recognition over the World.
Spoken Dialog Systems and Voice XML Lecturer: Prof. Esther Levin.
Develop a fast semantic decoder Robust to speech recognition noise Trainable on different domains: Tourist information (TownInfo) Air travel information.
Learning Automata based Approach to Model Dialogue Strategy in Spoken Dialogue System: A Performance Evaluation G.Kumaravelan Pondicherry University, Karaikal.
DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS Presenter: Joseph Picone Inst. for Signal and Info. Processing Dept. Electrical and Computer Eng. Mississippi.
Dept. of Computer Science University of Rochester Rochester, NY By: James F. Allen, Donna K. Byron, Myroslava Dzikovska George Ferguson, Lucian Galescu,
1 Representing New Voice Services and Their Features Ken Turner University of Stirling 11th June 2003.
1 Natural Language Processing Lecture Notes 14 Chapter 19.
L C SL C S SpeechBuilder: Facilitating Spoken Dialogue System Creation Eugene Weinstein Project Oxygen Core Team MIT Laboratory for Computer Science
Dialogue systems Volha Petukhova Saarland University 03/07/2015 Einführung in Diskurs and Pragmatik, Sommersemester
Extending VERA (Conference Information) Design Specification & Schedules Arthur Chan (AC) Rohit Kumar (RK) Lingyun Gu (LG)
DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS Presenter: Joseph Picone Inst. for Signal and Info. Processing Dept. Electrical and Computer Eng. Mississippi.
Information state and dialogue management in the TRINDI Dialogue Move Engine Toolkit, Larsson and Traum 2000 D&QA Reading Group, Feb 20 th 2007 Genevieve.
Ontology-driven VoiceXML Dialogues Generation Marta Gatius, Meritxell González TALP Research Center, Technical University of Catalonia, Barcelona Berlin,
CSE467/567 Computational Linguistics Carl Alphonce Computer Science & Engineering University at Buffalo.
Theban Stanley, Julie Baca, Matt Elliott and Joseph Picone Human and Systems Engineering Center for Advanced Vehicular Systems Mississippi State University.
M. Liu, T. Stanley, J. Baca and J. Picone Intelligent Electronic Systems Center for Advanced Vehicular Systems Mississippi State University URL:
Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.
Lti Shaping Spoken Input in User-Initiative Systems Stefanie Tomko and Roni Rosenfeld Language Technologies Institute School of Computer Science Carnegie.
English Syntax Read J & M Chapter 9.. Two Kinds of Issues Linguistic – what are the facts about language? The rules of syntax (grammar) Algorithmic –
金聲玉振 Taiwan Univ. & Academia Sinica 1 Spoken Dialogue in Information Retrieval Jia-lin Shen Oct. 22, 1998.
Develop a fast semantic decoder for dialogue systems Capability to parse 10 – 100 ASR hypothesis in real time Robust to speech recognition noise Trainable.
Integrating Multiple Knowledge Sources For Improved Speech Understanding Sherif Abdou, Michael Scordilis Department of Electrical and Computer Engineering,
CS223: Software Engineering
Speech Processing 1 Introduction Waldemar Skoberla phone: fax: WWW:
Grounding and Repair Joe Tepperman CS 599 – Dialogue Modeling Fall 2005.
Message Source Linguistic Channel Articulatory Channel Acoustic Channel Observable: MessageWordsSounds Features Bayesian formulation for speech recognition:
Predicting and Adapting to Poor Speech Recognition in a Spoken Dialogue System Diane J. Litman AT&T Labs -- Research
Analysis of spontaneous speech
17.0 Spoken Dialogues References: , Chapter 17 of Huang
Spoken Dialogue Systems: Human and Machine
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.
Data, Databases, and DBMSs
Managing Dialogue Julia Hirschberg CS /28/2018.
Tiers vs. Layers.
On the Integration of Speech Recognition into Personal Networks
PROJ2: Building an ASR System
Artificial Intelligence 2004 Speech & Natural Language Processing
Presentation transcript:

16.0 Spoken Dialogues References: , Chapter 17 of Huang 2. “Conversational Interfaces: Advances and Challenges”, Proceedings of the IEEE, Aug “The AT&T spoken language understanding system”, IEEE Trans. on Speech and Audio Processing, vol.14, no.1, pp , “Talking to machine” in ICSLP, “A telephone-based conversational interface for weather information” IEEE Trans. On Speech and Audio Processing, vol. 8, no. 1, pp , “Spoken language understanding”, IEEE Signal Processing Magazine, vol.22, no. 5, pp , 2005

Spoken Dialogue Systems Almost all human-network interactions can be made by spoken dialogue Speech understanding, speech synthesis, dialogue management, discourse analysis System/user/mixed initiatives Reliability/efficiency, dialogue modeling/flow control Transaction success rate/average dialogue turns Database s Sentence Generation and Speech Synthesis Output Speech Input Speech Dialogue Manager Speech Recognition and Understanding User’s Intention Discourse Analysis Response to the user Internet Networks Users Dialogue Server

Key Processes in A Spoken Dialogue A Basic Formulation –goal: the system takes the right actions after each dialogue turn and complete the task successfully finally Three Key Elements –speech recognition and understanding: converting X n to some semantic interpretation F n –discourse analysis: converting S n-1 to S n, the new discourse semantics (dialogue state), given all possible F n –dialogue management: select the most suitable action A n given the discourse semantics (dialogue state) S n X n : speech input from the user in the n-th dialogue turn S n : discourse semantics (dialogue state) at the n-th dialogue turn A n : action (response, actions, etc.) of the system (computer, hand-held device, network server, etc.) after the n-th dialogue turn by dialogue management by discourse analysis by speech recognition and understanding F n : semantic interpretation of the input speech X n

Dialogue Structure Turns –an uninterrupted stream of speech(one or several utterances/sentences) from one participant in a dialogue –speaking turn: conveys new information back-channel turn: acknowledgement and so on(e.g. O. K.) Initiative-Response Pair –a turn may include both a response and an initiative –system initiative: the system always leads the interaction flow user initiative: the user decides how to proceed mixed initiative: both acceptable to some degree Speech Acts(Dialogue Acts) –goal or intention carried by the speech regardless of the detailed linguistic form –forward looking acts conversation opening(e.g. May I help you?), offer(e.g. There are three flights to Taipei…), assert(e.g. I’ll leave on Tuesday), reassert(e.g. No, I said Tuesday), information request(e.g. When does it depart?), etc. –backward looking acts accept(e.g. Yes), accept-part(e.g. O.K., but economy class), reject(e.g. No), signal not clear(e.g. What did you say?), etc. –speech acts  linguistic forms : a many-to-many mapping e.g. “O.K.”  request for confirmation, confirmation –task dependent/independent –helpful in analysis, modeling, training, system design, etc. Sub-dialogues –e.g. “asking for destination”, “asking for departure time”, …..

Language Understanding for Limited Domain Semantic Frames  An Example for Semantic Representation –a semantic class defined by an entity and a number of attributes(or slots) e.g. [Flight]: [Airline]  (United) [Origin]  (San Francisco) [Destination]  (Boston) [Date]  (May 18) [Flight No]  (2306) –“slot-and-filler” structure Sentence Parsing with Context-free Grammar (CFG) for Language Understanding –extension to Probabilistic CFG, integration with N-gram(local relation without semantics), etc. Grammar(Rewrite Rules) S  NP VP NP  N VP  V-cluster PP V-cluster  (would like to) V V  fly| go PP  Prep NP N  Boston | I Prep  to S NP N V-cluster PP VP V NP Prep I would like toflytoBoston N (would like to)

Robust Parsing for Speech Understanding Problems for Sentence Parsing with CFG –ungrammatical utterances –speech recognition errors (substitutions, deletions, insertions) –spontaneous speech problems: um–, cough, hesitation, repetition, repair, etc. –unnecessary details, irrelevant words, greetings, unlimited number of linguistic forms for a given act e.g. to Boston I’m going to Boston, I need be to at Boston Tomorrow um– just a minute– I wish to – I wish to – go to Boston Robust Parsing as an Example Approach –small grammars for particular items in a very limited domain, others handled as fillers e.g. Destination→ Prep CityName Prep → to |for| at CityName → Boston |Los Angeles|... –different small grammars may operate simultaneously –keyword spotting helpful –concept N-gram may be helpful Speech Understanding –two-stage: speech recognition (or keyword spotting) followed by semantic parsing (e.g. robust parsing) –single-stage: integrated into a single stage CityName (Boston,...) direction (to, for...) similar to class-based N-gram Prob(c i |c i-1 ), c i : concept

Discourse Analysis and Dialogue Management Discourse Analysis –conversion from relative expressions(e.g. tomorrow, next week, he, it…) to real objects –automatic inference: deciding on missing information based on available knowledge(e.g. “how many flights in the morning? ” implies the destination/origin previously mentioned) –inconsistency/ambiguity detection (e.g. need clarification by confirmation) –example approach: maintaining/updating the dialogue states(or semantic slots) Dialogue Management –controlling the dialogue flow, interacting with the user, generating the next action e.g. asking for incomplete information, confirmation, clarify inconsistency, filling up the empty slots one-by-one towards the completion of the task, optimizing the accuracy/efficiency/user friendliness of the dialogue –dialogue grammar: finite state machines as an example –plan-based dialogue management as another example –challenging for mixed-initiative dialogues Performance Measure –internal: word error rate, slot accuracy (for understanding), etc. –overall: average success rate (for accuracy), average number of turns (for efficiency), etc. Subdialogue: Conversation Opening Subdialogue: Asking for Destination Subdialogue: Asking for Departure Time Destinatio n filled up Departur e Tine filled up no yes no yes

Client-Server Architecture Galaxy, MIT Integration Platform, AT& T Domain Dependent/Independent Servers Shared by Different Applications/Clients –reducing computation requirements at user (client) by allocating most load at server –higher portability to different tasks Application (Client) Dialogue/ Application Manager The Client API (Application Programming Interface) Resource Manager/Proxy Server) The Server SPI (Server Provider Interface) ASR Server TTS Server Audio Server Database Server GDC Server Telephone/Audio Interface Domain Server HLT Server Client Network Hub