SPOKEN LANGUAGE SYSTEMS Spoken Conversational Interaction for Language Learning Stephanie Seneff, Chao Wang, and Julia Zhang Spoken Language Systems Group.

Presentation transcript:

SPOKEN LANGUAGE SYSTEMS
Spoken Conversational Interaction for Language Learning
Stephanie Seneff, Chao Wang, and Julia Zhang
Spoken Language Systems Group
MIT Computer Science and Artificial Intelligence Lab
June 18, 2004

Computer Science and Artificial Intelligence Laboratory

Language Acquisition through Conversational Interaction
Language teachers have limited time to interact with students in dialogue exchanges
Computers provide a non-threatening environment in which to practice communicating
We can leverage our extensive prior research in spoken conversational systems to support language learning
Three-phase interaction framework is envisioned:
  –Preparation: practice phrases, simulated dialogues
  –Conversational Interaction
    *Telephone conversation with graphical support
    *Seamless translation aid
  –Assessment
    *Review dialogue interaction
    *Feedback and fluency scores

SCILL: A Spoken Computer Interface for Language Learning
Domain Expert: Speaks only target language. Has access to information sources.
Tutor: Can provide translations for both user queries and system responses.
Leverage our existing conversational systems to produce an interactive environment for language learning

Technology Requirements
Robust recognition and understanding of foreign-accented speech
  –If recognition is too poor, student may become frustrated
  –Customize vocabulary and linguistic constructs to lesson plans
High quality language translation (limited domains)
Natural and fluent speech synthesis
Ability to automatically generate simulated dialogues
  –System should be able to generate multiple dialogues based on a given lesson topic on the fly
  –Allows the student to see example sentence constructs for a particular lesson
Ability to reconfigure quickly and easily to new lessons
Automatic scoring for fluency, pronunciation, tone quality, etc.

Galaxy Conversational System Architecture
[Diagram: Hub with servers: Audio Server, GUI Server, Speech Recognition, Language Understanding, Context Tracking, Database, Language Generation, Speech Synthesis. Systems shown: Voyager, Jupiter, Pegasus, Mercury.]

SCILL System Overview
[Diagram labels: User Interface, Web Server]

Domains and Languages
Currently focusing on English and Mandarin
System can be configured reversibly to support L1 = English; L2 = Mandarin, or vice versa
Domains center around the travel scenario:
  –Flight reservations
  –Hotel reservations
  –Weather
  –Wake-up call and reminders
  –Navigation assistance (direction finding)

Graphical and Telephone Interactions
Stage 1: Preparation
  –User sees dialogue in both target and native language; mouse clicks support playback of each utterance in target language
  –Web page presents a different simulated dialogue each time it is visited
Stage 2: Telephone Interaction
  –User issues a "call me at" request to instantiate a telephone conversation
  –User can speak in L1 at any time to find out how to say something they forgot
  –User can also ask for translation of replies
Stage 3: Assessment
  –User views their own dialogue at the Web page
  –Mispronounced words are highlighted in red
  –Synthetic speech versions of their utterances can be played

Stage 1: Simulated Hotel Dialogues
Simulated hotel is periodically regenerated with different settings for number of rooms and available amenities
At each turn, simulated user randomly chooses among presented options
Once room is selected, simulated user asks specific questions about the room or the hotel
Thousands of different simulated dialogues can be created by running the system in a batch-mode configuration
Synthetic waveforms for both sides of the conversation are automatically generated
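The random-choice behaviour of the simulated user can be illustrated with a minimal Python sketch. This is not the SCILL implementation: the option menus (`HOTEL`), the follow-up questions (`FOLLOW_UPS`), and the function names are hypothetical, chosen only to show how batch-mode runs with a random simulated user yield many distinct dialogues.

```python
import random

# Hypothetical option menus; the slides do not list the real hotel settings.
HOTEL = {
    "bed": ["double", "queen", "twin"],
    "smoking": ["smoking", "non-smoking"],
}
# Hypothetical follow-up questions asked once a room is selected.
FOLLOW_UPS = ["Does the room have a view?", "How much does the room cost?"]


def simulate_dialogue(rng):
    """Return one simulated dialogue as a list of (speaker, utterance) turns."""
    turns = [("User", "I'd like a room.")]
    for slot, options in HOTEL.items():
        # System presents the options for this slot.
        turns.append(("System", f"Available {slot} options: {', '.join(options)}."))
        # Simulated user randomly chooses among the presented options.
        choice = rng.choice(options)
        turns.append(("User", f"I'd like a {choice} room."))
    # After the room is selected, the simulated user asks a specific question.
    turns.append(("User", rng.choice(FOLLOW_UPS)))
    return turns


def batch(n, seed=0):
    """Generate n dialogues in batch mode; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [simulate_dialogue(rng) for _ in range(n)]
```

Calling `batch(1000)` would produce a thousand dialogues; with richer menus than the two assumed here, the number of distinct dialogues grows multiplicatively with each slot.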

Generating Simulated Conversation
Simulated system configured within Galaxy Framework
Intended for Mandarin speaker learning English
[Diagram labels: Simulated User, English Dialogue System, User Query Frame, Response Frame, Language Generation, English Query, Mandarin Query, English Response, Mandarin Response, Speech Synthesis]

Example Simulated Dialogue (in English)
User: I'd like a room next Thursday and Friday
System: I have found 48 rooms. I have rooms with double, queen, or twin beds.
User: I'd like a queen bed.
System: I have found 13 rooms with a queen bed. There are both smoking and non-smoking rooms
User: How about a non-smoking room?
...
System: How about room 210? It is a non-smoking room with a queen bed. It comes with a view. The room costs $115.
User: That would be fine.
System: You have reserved a room for Thursday and...
User: I'd like a wake-up call tomorrow at 9 a.m.
System: Okay, you'll get a wake-up call at 9 a.m.

Stage 2: Conversational Interaction
Initial version of end-to-end system is in place for the weather domain
  –Rain, snow, wind, temperature, warnings (e.g., tornado), etc.
Recognizer supports both English and Mandarin
  –Seamless language switching
English queries are translated into Mandarin
Mandarin queries are answered in Mandarin
  –User can ask for a translation into English of the response at any time
Currently using Mandarin synthesizer provided by DELTA Electronics
  –Plan to develop high quality domain-dependent Mandarin synthesis using our Envoice concatenative speech synthesizer (Yi, 2003)
System can be configured as telephone-only or as telephone augmented with a Web-based GUI

Bilingual Recognizer Construction
[Diagram labels: English corpus, Chinese corpus, Parse, Interlingua, Generate, English Recognizer Language Model, Chinese Recognizer Language Model, English Network, Chinese Network, Recognizer]
Automatically translate existing English corpus into Mandarin
Use NL grammar to automatically induce language model for both English and Mandarin recognizers
Two recognizers compete in common search space
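As a rough illustration of the competition between the two recognizers, the Python sketch below picks the best-scoring hypothesis across both languages. It is a simplification under stated assumptions: the slides describe competition inside a common search space during decoding, whereas this sketch only compares finished hypotheses, and the tuple layout and scores are hypothetical.

```python
def pick_hypothesis(hypotheses):
    """Return the (language, text, log_score) tuple with the highest score.

    `hypotheses` mixes candidates from the English and Mandarin recognizers.
    Because the winner carries its language tag, language switching needs no
    explicit signal from the user. Scores are assumed to be comparable
    log-probabilities, which is an assumption of this sketch.
    """
    if not hypotheses:
        raise ValueError("no hypotheses to choose from")
    return max(hypotheses, key=lambda h: h[2])


# Hypothetical scores for one utterance.
candidates = [
    ("english", "what is the temperature", -12.5),
    ("mandarin", "qi4 wen1 shi4 duo1 shao3", -31.0),
]
language, text, score = pick_hypothesis(candidates)
```

Here the English hypothesis wins, so downstream components would treat the utterance as an English query to be translated.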

Multilingual Weather Responses
English source: Some thunderstorms may be accompanied by gusty winds and hail
Semantic frame:
  clause: weather_event
  topic: precip_act, name: thunderstorm, num: pl
  quantifier: some
  pred: accompanied_by
  adverb: possibly
  topic: wind, num: pl, pred: gusty
  and: precip_act, name: hail
Frame indexed under wind, rain, storm, and hail
Japanese:
Spanish: Algunas tormentas posiblemente acompañadas por vientos racheados y granizo
Chinese: 些雷雨可能會伴有陣風和冰雹
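The idea of indexing one language-independent frame under several weather topics can be sketched in Python as follows. The nesting of the frame, the `index_keys` and `surface` fields, and the function names are assumptions made for illustration; the slide gives only the flat key/value listing and the surface strings.

```python
from collections import defaultdict

# One frame, with an assumed nesting of the key/value pairs from the slide.
thunderstorm_frame = {
    "clause": "weather_event",
    "topic": {"type": "precip_act", "name": "thunderstorm",
              "num": "pl", "quantifier": "some"},
    "pred": {
        "name": "accompanied_by",
        "adverb": "possibly",
        "topics": [
            {"type": "wind", "num": "pl", "pred": "gusty"},
            {"type": "precip_act", "name": "hail"},
        ],
    },
    # Topics the frame is filed under, as stated on the slide.
    "index_keys": ["wind", "rain", "storm", "hail"],
    # Surface strings per language, standing in for generated output.
    "surface": {
        "en": "Some thunderstorms may be accompanied by gusty winds and hail",
        "es": "Algunas tormentas posiblemente acompañadas por vientos "
              "racheados y granizo",
    },
}


def build_index(frames):
    """Map each index key to the list of frames filed under it."""
    index = defaultdict(list)
    for frame in frames:
        for key in frame["index_keys"]:
            index[key].append(frame)
    return index


def respond(index, keyword, language):
    """Return surface strings in `language` for frames filed under `keyword`."""
    return [f["surface"][language] for f in index.get(keyword, [])]


index = build_index([thunderstorm_frame])
```

A query about hail, for example `respond(index, "hail", "es")`, retrieves the same frame as a query about wind, and the answer language is chosen only at the final step.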

Example Spoken Dialogue Interaction in Weather Domain
User: Bo1 Shi4 Dun4 ming2 tian1 hui4 xia4 yu3 ma5? (Is it going to rain tomorrow in Boston?)
System: Tian1 qi4 yu4 bao4 ming2 tian1 Bo1 shi4 dun4 mei2 you3 yu3. (The forecast calls for no rain tomorrow in Boston)
User: (in English) What is the temperature?
System: (translates) Qi4 wen1 shi4 duo1 shao3?
User: (emulates) Qi4 wen1 shi4 duo1 shao3?
System: Bo1 Shi4 Dun4 ming2 tian1 zui4 gao1 qi4 wen1 er4 she4 shi4 du4, ming2 tian1 ye4 jian1, zui4 di4 qi4 wen1 ling2 xia4 wu3 she4 shi4 du4.
User: Could you translate that?
System: In Boston tomorrow, high 2 degrees Celsius. Tomorrow night, low -5 Celsius.

Example Telephone Interaction in Weather Domain
[Screenshot annotations]
Will it be hot in Dallas today?
System paraphrases query, then replies with temperature information
System translates previous response
User speaks English to find out how to say it in Mandarin
User attempts to repeat what they just heard
To my knowledge, the forecast calls for no rain in Seattle tomorrow. Is there something else?

Stage 3: Plans for Assessment
Phonetic aspects
  –Study relationship between existing confidence scores and phonetic productions
  –Allow realizations of selected phones from native language to compete in recognizer search
Tonal aspects (Mandarin)
  –Use tone recognition system (Wang et al., 1998) to score tone productions; highlight worst-scoring words
  –Use phase-vocoder techniques (Tang et al., 2001) to repair users' tone productions by replacing prosodic contour with native speech patterns
  –Tabulate frequencies of tone errors in typed inputs (pinyin)
Fluency measures
  –Word-by-word speaking rate (Chung & Seneff, 1999)
  –Percentage of utterance containing pauses and disfluencies
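Tabulating tone errors in typed pinyin could look roughly like the Python sketch below. It assumes tone-numbered pinyin as used elsewhere in these slides (for example `qi4 wen1`), a known reference sentence, and a one-to-one syllable alignment; the function names are hypothetical and none of this is taken from the SCILL code.

```python
import re
from collections import Counter

# A syllable is letters followed by a tone digit 1-5 (5 = neutral tone).
SYLLABLE = re.compile(r"^([a-zA-Z]+)([1-5])$")


def split_tone(syllable):
    """Split 'wen1' into ('wen', 1); return (text, None) if no tone digit."""
    cleaned = syllable.strip(".,?!;:()")
    match = SYLLABLE.match(cleaned)
    if not match:
        return cleaned.lower(), None
    return match.group(1).lower(), int(match.group(2))


def tone_errors(reference, typed):
    """Count (expected_tone, typed_tone) confusions, syllable by syllable.

    Only syllables whose letters match are compared, so a wrong word is not
    counted as a tone error. Alignment is positional, which is an assumption
    that breaks down if the learner inserts or drops syllables.
    """
    errors = Counter()
    for ref, hyp in zip(reference.split(), typed.split()):
        ref_base, ref_tone = split_tone(ref)
        hyp_base, hyp_tone = split_tone(hyp)
        if ref_base == hyp_base and ref_tone != hyp_tone:
            errors[(ref_tone, hyp_tone)] += 1
    return errors
```

Summing these counters over many typed inputs would give the frequency table of tone confusions mentioned above, for instance how often tone 4 is typed as tone 1.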

Future Plans (Near Term and Long Term)
Build high quality synthesis capability
Improve recognition, understanding, dialogue, and translation performance
Develop various scoring algorithms for quality assessment of students' speech
Collect and transcribe data from language learners and evaluate both system and students
Refine all aspects of system based on collected data
Develop tools to rapidly port to new domains and languages
  –Automatic grammar induction
  –Generic dialogue modeling
  –Simulated dialogue interactions

Thank you!

Multilingual Translation Framework
Common meaning representation: semantic frame
[Diagram labels: Recognition (SUMMIT), NLU (TINA), Semantic Frame, NLG (GENESIS), Synthesis (ENVOICE); Parsing Rules, Generation Rules, Models, Speech Corpora; languages on both input and output sides: English, Chinese, Spanish, Japanese]

Testing the Effectiveness of Training on Typed Input: Proposed Measures
Compare the quality of spoken dialogue recorded before and after a Web-based training session
Measures of fluency:
  –Syntactic well-formedness
  –Tone production accuracy
  –Frequency of pauses, edits, and filler words
  –Phonetic quality, etc.
Measures of communication success:
  –Frequency of usage of translation assistance
  –Understanding error rate
  –Task completion
  –Time to completion, etc.
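Some of the fluency measures listed above can be computed from a time-aligned transcript, as in this minimal Python sketch. The input format (token, start, end in seconds), the filler-word list, and the 0.3-second pause threshold are all assumptions for illustration, not values from the slides.

```python
# Hypothetical filler-word list; a real system would use a curated set.
FILLERS = {"um", "uh", "er"}


def fluency_measures(words, min_pause=0.3):
    """Compute simple fluency measures from time-aligned words.

    `words` is a list of (token, start_sec, end_sec) tuples in time order.
    A gap between consecutive words counts as a pause only if it lasts at
    least `min_pause` seconds (an assumed threshold).
    """
    if not words:
        return {"words_per_second": 0.0, "pause_fraction": 0.0,
                "filler_rate": 0.0}
    total = words[-1][2] - words[0][1]
    if total <= 0:
        raise ValueError("utterance must have positive duration")
    pause_time = 0.0
    for (_, _, prev_end), (_, next_start, _) in zip(words, words[1:]):
        gap = next_start - prev_end
        if gap >= min_pause:
            pause_time += gap
    fillers = sum(1 for token, _, _ in words if token.lower() in FILLERS)
    return {
        "words_per_second": len(words) / total,
        "pause_fraction": pause_time / total,
        "filler_rate": fillers / len(words),
    }


# Hypothetical alignment for "um ... will it rain".
example = [("um", 0.0, 0.4), ("will", 1.0, 1.3),
           ("it", 1.3, 1.5), ("rain", 1.5, 2.0)]
measures = fluency_measures(example)
```

Comparing these numbers for recordings made before and after a training session is one way the proposed before/after evaluation could be carried out.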

Example Telephone Interaction
User asks: Will it rain tomorrow in Boston?
System paraphrases query, then responds in Chinese
"Please repeat that" in English or Chinese interpreted identically
System repeats response in Chinese
User speaks query in English: seamless language switching
System paraphrases, then translates query into Chinese
User attempts to repeat translation
  –Recognition error: hallucinates an erroneous date (February 30) which will be remembered
System supplies known cities in England
User chooses London
System has no weather for London on February 30
User asks "how about today?"
System provides London's weather today
User asks for a translation into English, which is provided