A Framework For Developing Conversational User Interfaces


A Framework For Developing Conversational User Interfaces
James Glass, Eugene Weinstein, Scott Cyphers, Joseph Polifroni
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
Grace Chung
Corporation for National Research Initiatives, Reston, VA, USA
Mikio Nakano
NTT Corporation, Atsugi, Japan

Conversational User Interfaces
- Coding deals with: bit rate reduction, high quality transformation
- Synthesis: generation of intelligible speech of high quality from unrestricted text
- Both fields are much further advanced than ASR
- What is an ASR system?
[Diagram: Human and Computer. Input path: Speech, Recognition, Text, Understanding, Meaning. Output path: Meaning, Generation, Text, Synthesis, Speech.]
Eugene Weinstein, MIT Computer Science and Artificial Intelligence Laboratory. Computer Aided Design on User Interfaces, Jan 16th, 2004

Types of Conversational Interfaces
Conversational systems differ in the degree to which the human or the computer controls the conversation (initiative):
- Directed dialogue (computer initiative): the computer maintains tight control and the human is highly restricted. C: "Please say the departure city."
- Mixed-initiative dialogue: between the two extremes
- Free-form dialogue (human initiative): the human takes complete control and the computer is totally passive. H: "I want to visit my grandmother."

Conversational Interfaces
- Can understand verbal input
  - Speech recognition
  - Language understanding (in context)
- Can engage in dialogue with a user during the interaction
  - Dialogue management
- Can verbalize response
  - Language generation
  - Speech synthesis
[Diagram labels: Audio, Speech Recognition, Language Understanding, Context Resolution, Dialogue Management, Language Generation, Speech Synthesis, Back End]

The Problem With Conversational Interfaces
- Advanced conversational systems are out there
  - Both user and computer can take initiative
  - Goal: conversational skill of the system should approach that of a human operator
- But these systems are built by experts
  - Huge learning curve for novices
  - Tremendous iterative effort required even from experts
- For this reason, most advanced conversational systems remain in research labs
  - e.g. Jupiter weather info system (+1-888-573-TALK): Zue et al., IEEE Trans. SAP, 8(1), 2000
- However, we have seen limited commercial deployment
  - e.g. AT&T's "How May I Help You": Gorin et al., Speech Communication, 23, 1997

Simplifying Conversational System Creation
- Goal: make it easier for both expert and novice developers to create conversational interfaces
  - But still use advanced human language technologies
- Strategy: simplify the configuration process
  - Automatically configure technology components based on examples
  - Allow specification through a web interface or a unified configuration file
[Diagram labels: Web Interface, Configuration File, SpeechBuilder Configuration Engine, Recognition, Understanding, Context Resolution, Dialogue Management, Generation, Synthesis]

Configuring a Conversational Interface: Knowledge Representation
First, define example sentences for in-domain actions:
  Action    Examples
  identify  I would like to know today's weather in Denver
            What will the temperature be on Tuesday
  set       Turn on the radio in the kitchen please
            Can you turn the dining room lights off
Then, define the important concepts present in the actions (attributes):
  Attribute  Values
  city       Boston, Denver, San Francisco, …
  room       living room, dining room, kitchen, …
- Concept values make up the recognizer vocabulary!
- Examples of attributes are automatically matched to attribute classes
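The matching of attribute values in example sentences to their attribute classes can be pictured as a simple lookup. The following is only a minimal sketch under that assumption, not SpeechBuilder's implementation; `ATTRIBUTES` and `spot_attributes` are hypothetical names, and the values are the ones listed on the slide.

```python
import re

# Attribute classes and values taken from the slide's examples.
ATTRIBUTES = {
    "city": ["Boston", "Denver", "San Francisco"],
    "room": ["living room", "dining room", "kitchen"],
}


def spot_attributes(sentence, attributes=ATTRIBUTES):
    """Return {attribute_class: value} for each known value found in the sentence."""
    found = {}
    lowered = sentence.lower()
    for name, values in attributes.items():
        # Check longer values first so multi-word values are preferred.
        for value in sorted(values, key=len, reverse=True):
            pattern = r"\b" + re.escape(value.lower()) + r"\b"
            if re.search(pattern, lowered):
                found[name] = value
                break
    return found


if __name__ == "__main__":
    print(spot_attributes("I would like to know today's weather in Denver"))
    print(spot_attributes("Can you turn the dining room lights off"))
```

A sentence with no known value simply yields an empty dictionary, which mirrors the idea that only configured concept values carry meaning for the domain.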

Starting with a Database Table
Provide a database table to configure the speech interface:
  Name              Phone    Email            Office
  Jim Glass         x3-1640  glass@mit.edu    601
  Scott Cyphers     x3-0248  cyphers@mit.edu  604
  Eugene Weinstein  x3-8569  ecoder@mit.edu   633
- Only some columns are used to access entries (e.g., Name)
- Values of those columns become values for domain concepts
- Default action sentences are automatically generated
- But every table cell can potentially be an answer to a question
- All names of columns become one concept: "property"
  Attributes
    name      Jim Glass, Scott Cyphers, …
    property  Name, Phone, Email, Office
  Actions
    request_property  What is the email for Jim Glass?
    request_office    Where can I find Jim Glass?
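As a rough illustration of how a table could be turned into concepts, the sketch below derives a `name` attribute from the access column and a `property` attribute from the column names, then answers a request_property style question by lookup. The function names are hypothetical and this is not the SpeechBuilder code; the rows are the ones shown on the slide.

```python
# Directory rows copied from the slide's example table.
TABLE = [
    {"Name": "Jim Glass", "Phone": "x3-1640", "Email": "glass@mit.edu", "Office": "601"},
    {"Name": "Scott Cyphers", "Phone": "x3-0248", "Email": "cyphers@mit.edu", "Office": "604"},
    {"Name": "Eugene Weinstein", "Phone": "x3-8569", "Email": "ecoder@mit.edu", "Office": "633"},
]


def derive_attributes(rows, access_columns):
    """Build attribute classes: one per access column, plus 'property' for column names."""
    attributes = {col.lower(): [row[col] for row in rows] for col in access_columns}
    attributes["property"] = list(rows[0].keys())
    return attributes


def request_property(rows, name, prop):
    """Return the cell for the given person and column, or None if there is no match."""
    for row in rows:
        if row["Name"] == name:
            return row.get(prop)
    return None


if __name__ == "__main__":
    print(derive_attributes(TABLE, ["Name"]))
    print(request_property(TABLE, "Jim Glass", "Email"))
```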

Generic Dialogue Manager
Generic Dialogue Manager (Polifroni & Chung, ICSLP 2002):
- Plan system responses
- Regularize common concepts
- Summarize database results
[Diagram labels: domains Hotels, Air Travel, Sports, Weather around the Generic Dialogue Manager; components Audio, Speech Recognition, Language Understanding, Context Resolution, Dialogue Management, Language Generation, Speech Synthesis, Back End]

Context Resolution
Input query, then the resolution steps, each with an example, ending in a query interpreted in context:
- Input query: "Show me restaurants in Cambridge."
- Resolve deixis: "What does this one serve?"
- Resolve pronouns: "What is their phone number?"
- Inherit predicates: "Are there any on Main Street?"
- Incorporate fragments: "What about Massachusetts Ave?"
- Fill in default values: "Give me directions from MIT."

Human Language Technology Details
Approach: use the same technologies as deployed in our mainstream, more complex systems
- Speech Recognizer (Glass, Computer, Speech, and Language, 2003)
  - Trained on 100+ hours of mostly telephone speech
  - Word pronunciations supplied by a large dictionary, generated by rule, or provided by the developer
- Natural Language Understanding (Seneff, Computational Linguistics, 1992)
  - Hierarchical sentence grammar used to parse sentence hypotheses
  - Back off to concept spotting when no full parse is made
- Language Generation (Baptist & Seneff, ICSLP 2000)
  - Used in: SQL (DB query) generation, paraphrasing and URL-encoding of the meaning representation, responses
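To make the "SQL (DB query) generation" use of language generation more concrete, here is a small sketch that turns a key-value meaning representation into a parameterized query. It is an assumption-laden illustration only: the `frame_to_sql` helper, the `restaurants` table name, and the column names are hypothetical and do not come from the generation component cited on the slide.

```python
def frame_to_sql(table, frame, allowed_columns):
    """Build a parameterized SELECT from the frame keys that are known columns."""
    # Only keep keys that are whitelisted column names, so that
    # nothing from the user's utterance ends up in the SQL text itself.
    columns = [key for key in frame if key in allowed_columns]
    sql = f"SELECT * FROM {table}"
    if columns:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in columns)
    params = [frame[col] for col in columns]
    return sql, params


if __name__ == "__main__":
    frame = {"action": "request_name", "city": "Arlington", "cuisine": "Chinese"}
    print(frame_to_sql("restaurants", frame, {"city", "cuisine"}))
```

The values travel separately as parameters, which is the usual way to keep recognized words from being interpreted as SQL.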

Web-based Interface: Defining Actions and Concepts (Attributes)

Web-based Interface: Viewing Sentences
Examining how sentences are reduced to an action and a set of attribute-value pairs

Web-based Interface: Response Generation
- Customizing system responses
- Domain-independent system prompts
- Domain-specific system prompts

Web-based Interface: Editing Pronunciations
Modifying system-generated pronunciations for the vocabulary

Web-based Interface: Context Resolution
Context resolution is configured through masking and inheritance of concepts

Voice Configuration File: An Alternative to the Web Interface
- Entire domain can be specified in a single configuration file
- Allows for automated generation of conversational systems
  <actions>
    <request_name> = i would like a restaurant
                   | can you (show|give) me a Chinese restaurant in Arlington;
  </actions>
  <attributes>
    <cuisine> = Chinese|Taiwanese;
    <city> = Washington | Boston | Arlington;
  </attributes>
  <discourse>
    name masks(city cuisine neighborhood);
  </discourse>
  <constraints>
    <request_name> (city|neighborhood) {prompt_for_city};
  </constraints>
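Because the whole domain lives in one text file, such a file can be read or produced by a script. The sketch below reads only the attributes section of the example above into a dictionary. It is a hypothetical illustration based solely on the syntax visible on the slide; `parse_attributes` is not part of SpeechBuilder, and the real file format may have rules this does not cover.

```python
import re

# The attributes section as it appears in the slide's example.
VCFG_TEXT = """
<attributes>
<cuisine> = Chinese|Taiwanese;
<city> = Washington | Boston | Arlington;
</attributes>
"""


def parse_attributes(text):
    """Return {attribute_name: [values]} from the <attributes> section, if present."""
    section = re.search(r"<attributes>(.*?)</attributes>", text, re.DOTALL)
    if section is None:
        return {}
    result = {}
    # Each entry looks like: <name> = value | value | value;
    for name, values in re.findall(r"<(\w+)>\s*=\s*([^;]*);", section.group(1)):
        result[name] = [value.strip() for value in values.split("|")]
    return result


if __name__ == "__main__":
    print(parse_attributes(VCFG_TEXT))
```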

Deployment
- SpeechBuilder has been functional for the past three years
- Some example domains:
  - Office appliance control
  - Laboratory directory (auto-attendant)
  - Restaurant query system
- Has been used by MIT researchers (experts) as well as novice developers at our sponsor companies
  - Used in a technology transfer workshop for the pervasive computing project (Oxygen)
- SpeechBuilder has been used as an educational tool
  - Computational linguistics class at Georgetown University
  - Summer class at Johns Hopkins University
  - Youngest SpeechBuilder developer: 9 years old

Japanese SpeechBuilder
- Created in collaboration with NTT
- Challenge: segmentation (no spaces between words)

Example Domain
- A hotel application using the generic dialogue manager
- Compiled via SpeechBuilder using the constraints shown previously
- Other generic functionality is automatically included
- Illustrated technical issues:
  - Soliciting necessary information from the user
  - Interpreting fragments correctly in context
  - Canonicalizing relative dates
  - Ordering and summarizing results of a query to the content provider
  - Resolving superlatives and updating the discourse context
  - Interpreting pronouns in context
  - Returning and speaking specific properties
  - Repeating previous replies

Another Example Domain: Object Manipulation System
- Stock SpeechBuilder domain for spoken dialogue
- Custom back end connected to a stereo camera and person tracking algorithm (Demirdjian, WOMOT 2003)

Ongoing and Future Work
- Incorporate speech synthesis
  - Allow use of a concatenative speech synthesizer (Yi et al., ICSLP 2000) in SpeechBuilder
- Allow use of multiple modalities
  - Provide functionality to incorporate multimodal input into systems
- Improve dialogue management tools and modules
  - Improve the ability of SpeechBuilder systems to use more sophisticated dialogue strategies
  - Provide additional generic semantic concepts for use in domains
- Allow system refinement by unsupervised learning
  - Use confidence scores to improve the domain language model (Nakano & Hazen, Eurospeech 2003)
- Allow system modification in real time
  - Need the ability to re-train the recognizer during runtime (Schalkwyk et al., Eurospeech 2003)

Thank You!
- For more information: http://www.sls.csail.mit.edu/
- Email us: ecoder@mit.edu
- Jupiter weather information system: +1-617-258-0300 (outside USA), 1-888-573-TALK (USA toll-free)
- Mercury flight information system: +1-617-258-6040 (outside USA), 1-877-MIT-TALK (USA toll-free)
- Pegasus flight status system: +1-617-258-0301 (outside USA), 1-877-LCS-TALK (USA toll-free)

THE END

- Utility for rapid prototyping of speech-based interfaces
- Used to create demonstrations for the NTT CS Labs open house
- Prototypes were developed with a few days of effort
- Three papers submitted for publication

Human Language Technologies
- Only some columns are used to access entries (e.g., Name)
- Values of those columns become values for domain concepts
- Default action sentences are automatically generated
- But every table cell can potentially be an answer to a question
- Names of non-access columns become a concept

To Configure Response Generation…
- For each concept present in the domain, define how queries about that concept should be answered:
  <telephone> = “The telephone for :name is :phone”
- Define some prompts for generic events, e.g. welcome and goodbye:
  <welcome> = “Welcome to the auto-attendant”
  <no_data> = “Sorry, there was no data matching your request.”
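The `:name` and `:phone` markers in these templates behave like placeholders that are filled from the query result. A minimal sketch of that substitution, assuming a plain key lookup (the `render` function and `TEMPLATES` table are hypothetical, not the actual generation component):

```python
import re

# Response templates copied from the slide.
TEMPLATES = {
    "telephone": "The telephone for :name is :phone",
    "welcome": "Welcome to the auto-attendant",
    "no_data": "Sorry, there was no data matching your request.",
}


def render(template_name, values=None, templates=TEMPLATES):
    """Fill each :key placeholder from values; unknown keys are left as written."""
    values = values or {}
    template = templates[template_name]
    return re.sub(
        r":(\w+)",
        lambda match: str(values.get(match.group(1), match.group(0))),
        template,
    )


if __name__ == "__main__":
    print(render("telephone", {"name": "Jim Glass", "phone": "x3-1640"}))
    print(render("welcome"))
```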

Conversational User Interfaces: Input Side
- Human language technologies: Speech -> Recognition -> Text -> Understanding -> Meaning
  - Text: “Find me a flight to Boston on Tuesday”
  - Meaning: action=flights to_city=Boston day=Tuesday
- “Back-end” technologies: Action, DB

Conversational User Interfaces: Output Side
- Back end: Action, DB
- Human language technologies: Meaning -> Generation -> Text -> Synthesis -> Speech
  - Meaning: flight_num=55 airline=Delta origin=LGA dest=BOS
  - Text: Delta flight, number fifty five from La Guardia to Boston…

Conversational User Interfaces: The Whole Picture. Or Is It?
[Diagram: Speech -> Recognition -> Text -> Understanding -> Meaning -> Action, and Action -> Meaning -> Generation -> Text -> Synthesis -> Speech]

The Missing Pieces: Context and Dialogue
- Context Resolution:
    action=flights to_city=Boston day=Tuesday
    + Last time, the user asked for a flight from LGA
    = action=flights origin=LGA dest=BOS day=Tuesday
- Dialogue Management:
    action=flights to_city=Boston day=Tuesday
    + (no departure city known)
    = “Which city would you like to fly from?”
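The dialogue management half of this slide amounts to noticing that a required piece of information is missing and asking for it. A minimal sketch under that reading follows; the slot name `from_city`, the `REQUIRED` and `PROMPTS` tables, and all function names are assumptions made for illustration and are not taken from the system.

```python
def parse_frame(text):
    """Parse 'key=value key=value' into a dict (values must not contain spaces)."""
    return dict(pair.split("=", 1) for pair in text.split())


# Hypothetical per-action requirements and the prompt to use for each missing slot.
REQUIRED = {"flights": ["from_city", "to_city"]}
PROMPTS = {
    "from_city": "Which city would you like to fly from?",
    "to_city": "Which city would you like to fly to?",
}


def add_context(frame, remembered):
    """Combine remembered values with the new frame; new values take priority."""
    merged = dict(remembered)
    merged.update(frame)
    return merged


def next_prompt(frame):
    """Return the prompt for the first missing required slot, or None if complete."""
    for slot in REQUIRED.get(frame.get("action"), []):
        if slot not in frame:
            return PROMPTS[slot]
    return None


if __name__ == "__main__":
    frame = parse_frame("action=flights to_city=Boston day=Tuesday")
    print(next_prompt(frame))
    print(next_prompt(add_context(frame, {"from_city": "LGA"})))
```

With nothing remembered, the system has to ask; once the earlier mention of LGA is carried over, the frame is complete and no prompt is needed.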

Conversational User Interfaces: The Whole Picture
[Diagram: Speech -> Recognition -> Text -> Understanding -> Meaning -> Context Resolution, Dialogue Management -> Action, and back through Meaning -> Generation -> Text -> Synthesis -> Speech]


Configuring Response Generation…
- For each concept present in the domain, define how queries about that concept should be answered
- Configure some generic prompts for summarizing long results
- Define some prompts for generic events, e.g. welcome
  Property / Condition                    Response
  phone                                   The phone number for :restaurant_name is :phone
  cuisine                                 :restaurant_name serves :cuisine cuisine
  Welcome                                 Welcome to the restaurants domain
  No matches                              I’m sorry, I couldn’t find any restaurants matching your request
  Many matches                            I found five restaurants :items
  item (what to return when summarizing)  :restaurant_name

Configuring Context Resolution
Context resolution (discourse) is configured through masking and inheritance of concepts.
- Inheritance configures how actions remember concepts, e.g.:
    User: “What is the phone number for Jim Glass?”
    System: “Jim Glass’ phone number is 3-1640.”
    User: “What about his email address?”
    System: “Jim Glass’ email address is glass@mit.edu.”
  The Name concept is inherited.
- Masking configures how certain concepts block other concepts, even in the presence of inheritance, e.g.:
    User: “Do you have any restaurants in Boston?”
    System: “In Boston, I have the following…”
    User: “What about in Times Square?”
    System: “In Times Square, New York, I have…”
  The City concept is masked by the Neighborhood concept.
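Inheritance and masking can be read as two rules for merging the previous turn's concepts into the current one. The sketch below is one possible reading, written for illustration only; the `MASKS` table and `resolve` function are hypothetical, and the real discourse component may apply these rules differently.

```python
# Hypothetical masking table: a concept in the new turn blocks the listed
# concepts from being inherited from the previous turn.
MASKS = {"neighborhood": {"city"}}


def resolve(previous, current, masks=MASKS):
    """Inherit concepts from the previous turn unless a current concept masks them."""
    blocked = set()
    for concept in current:
        blocked |= masks.get(concept, set())
    # Inheritance: carry over everything that is not masked.
    resolved = {key: value for key, value in previous.items() if key not in blocked}
    # Concepts mentioned in the current turn always win.
    resolved.update(current)
    return resolved


if __name__ == "__main__":
    # "What about his email address?" inherits the name from the previous turn.
    print(resolve({"name": "Jim Glass", "property": "phone"}, {"property": "email"}))
    # "What about in Times Square?" masks the previously mentioned city.
    print(resolve({"city": "Boston"}, {"neighborhood": "Times Square"}))
```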

Voice Configuration File
Developers can also use the Voice Configuration (VCFG) file format to configure SpeechBuilder domains:
  <actions>
    <request_name> = i would like a restaurant
                   | can you (show|give) me a Chinese restaurant in Arlington;
  </actions>
  <attributes>
    <cuisine> = Chinese|Taiwanese;
    <city> = Washington | Boston | Arlington;
  </attributes>
  <discourse>
    name masks(city cuisine neighborhood);
  </discourse>
  <constraints>
    <request_name> (city|neighborhood) {prompt_for_city};
  </constraints>


Deployment
- SpeechBuilder has been functional for the past three years
- Some example domains:
  - Office appliance control
  - Laboratory directory (auto-attendant)
  - Restaurant query system
- Has been used by MIT researchers (experts) as well as novice developers at our partner companies
- SpeechBuilder has been used by students in:
  - Computational linguistics class at Georgetown University
  - Summer class at Johns Hopkins University
  - Technology transfer workshop for the pervasive computing project (Oxygen)
- In collaboration with NTT, we have developed a Japanese version of SpeechBuilder. Japanese domains:
  - Bus timetable system
  - Weather information system
