SCILL: Spoken Conversational Interaction for Language Learning

Stephanie Seneff (seneff@csail.mit.edu), Jim Glass (jrg@csail.mit.edu)
Spoken Language Systems Group, MIT Computer Science and Artificial Intelligence Lab
Steve Young (sjy@eng.cam.ac.uk)
Speech Group, CUED Machine Intelligence Lab

Conversational Interfaces
[Diagram: Audio, Speech Recognition, Language Understanding, Context Resolution, Dialogue Management, Database, Language Generation, Speech Synthesis]

Conversational Interfaces: Galaxy Architecture
[Diagram: a Hub connecting Audio, Speech Recognition, Language Understanding, Context Resolution, Dialogue Management, Database, Language Generation, Speech Synthesis]
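The hub arrangement on this slide can be illustrated with a toy sketch in which every component exchanges a shared frame with a central hub instead of calling other components directly. The `Hub` class, the three stand-in servers, and the frame keys are all hypothetical names chosen for illustration; they are not the Galaxy API.

```python
from typing import Callable, Dict, List

Frame = Dict[str, object]
Server = Callable[[Frame], Frame]


class Hub:
    """Toy hub: servers talk only to the hub, never to each other."""

    def __init__(self) -> None:
        self.servers: Dict[str, Server] = {}

    def register(self, name: str, server: Server) -> None:
        self.servers[name] = server

    def run(self, route: List[str], frame: Frame) -> Frame:
        # Hand a copy of the shared frame to each named server in turn.
        for name in route:
            frame = self.servers[name](dict(frame))
        return frame


# Hypothetical stand-ins for real components.
def recognize(frame: Frame) -> Frame:
    frame["text"] = "will it rain tomorrow in boston"
    return frame


def understand(frame: Frame) -> Frame:
    words = str(frame["text"]).split()
    frame["meaning"] = {"topic": "rain", "city": words[-1], "date": "tomorrow"}
    return frame


def generate(frame: Frame) -> Frame:
    m = frame["meaning"]
    frame["reply"] = f"Looking up {m['topic']} for {m['city']} {m['date']}."
    return frame


hub = Hub()
hub.register("recognition", recognize)
hub.register("understanding", understand)
hub.register("generation", generate)
result = hub.run(["recognition", "understanding", "generation"], {"audio": b""})
print(result["reply"])
```

Because the route is just a list of names, a component can be swapped or a new one inserted without touching the others, which is the property a hub design is meant to give.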

Bilingual Weather Domain: Video Clip

Computer Aids through Conversational Interaction
- Language teachers have limited time to interact with students in dialogue exchanges
- Computers provide a non-threatening environment in which to practice communicating
- A three-phase interaction framework is envisioned:
  - Preparation: practice phrases, simulated dialogues
  - Conversational interaction
    - Telephone conversation with graphical support
    - Seamless translation aid
  - Assessment
    - Review dialog interaction
    - Feedback and fluency scores

SCILL: A Spoken Computer Interface for Language Learning
- Conversational systems provide an interactive environment for language learning
- Domain Expert: speaks only the target language; has access to information sources
- Tutor: can provide translations for both user queries and system responses
- MIT SLS: Bilingual Conversational Dialogue Systems
- CU Speech Group: Speech Recognition and Pronunciation Scoring

Technology Requirements
- Robust recognition and understanding of foreign-accented speech
  - If recognition is too poor, student may become frustrated
- Customize vocabulary and linguistic constructs to lesson plans
- High-quality cross-lingual language generation
- Natural and fluent speech synthesis
- Ability to automatically generate simulated dialogues
  - System should be able to generate multiple dialogues based on a given lesson topic on the fly
  - Allows the student to see example sentence constructs for a particular lesson
- Ability to reconfigure quickly and easily to new lessons
- Automatic scoring for fluency, pronunciation, tone quality, use of vocabulary, etc.

SCILL System Overview
[Diagram: Web Server, User Interface]

Bilingual Spoken Dialogue Interaction: Current Status
- Initial version of the end-to-end system is in place for the weather domain
  - Rain, snow, wind, temperature, warnings (e.g., tornado), etc.
- MIT recognizer supports both English and Mandarin
  - Seamless language switching
  - English queries are translated into Mandarin
  - Mandarin queries are answered in Mandarin
  - User can ask for an English translation of the response at any time
- Currently using an off-the-shelf Mandarin synthesizer from ITRI
  - Plan to develop high-quality domain-dependent Mandarin synthesis using our Envoice tools
- System can be configured as telephone-only or as telephone augmented with a Web-based GUI

Bilingual Recognizer Construction
[Diagram labels: English corpus, Parse, Interlingua, Generate, Chinese corpus, English Recognizer Language Model, Chinese Recognizer Language Model, Recognizer Network (English, Chinese)]
- Create Mandarin corpus by automatically translating existing English corpus
- Automatically induce language model for both English and Mandarin recognizers using NL grammar
- Two recognizers compete in common search space

HTK Mandarin Speech Recognizer
Standard HTK LVCSR setup:
- PLP front-end with 1st/2nd/3rd derivatives, transformed using HLDA
- 3-state cross-word hidden Markov models
- Decision-tree-clustered context-dependent triphones
- N-gram language model smoothed with class-based language model
Except:
- Standard PLP front-end augmented with F0 + derivatives (F0 added after HLDA transformation)
- 46-phone acoustic model set with long final phones split, e.g. uang -> ua ng
- Questions about tone added to decision-tree context clustering

HMM-Based Pronunciation Scoring
Basic approach:
- Estimate posterior probabilities (i.e., confidence scores) of each phone or syllable given the acoustics
- Map confidence scores to a good/bad decision using data labelled by experts
[Diagram: phone sequence "sh ih d ax ..."; a simple approximation of P(p | A) relates confidence scores to human perception; expert rankings separate Bad from Good]
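A minimal sketch of the two steps named on this slide, under stated assumptions: the per-phone acoustic log-likelihoods are made-up numbers, equal priors over phones are assumed so that Bayes' rule reduces to a softmax, and the 0.5 threshold is a placeholder for a value that would be fitted to expert-labelled data. The function names are hypothetical, and this is not presented as the slide's own approximation, which the transcript does not preserve.

```python
import math


def phone_posteriors(log_likelihoods):
    """Convert log p(A | q) for each candidate phone q into P(q | A).

    Assumes equal phone priors, so the posterior is a softmax of the
    log-likelihoods. Subtracting the maximum avoids numeric underflow.
    """
    peak = max(log_likelihoods.values())
    scaled = {q: math.exp(ll - peak) for q, ll in log_likelihoods.items()}
    total = sum(scaled.values())
    return {q: v / total for q, v in scaled.items()}


def judge(target, log_likelihoods, threshold=0.5):
    """Return the target phone's posterior and a good/bad label."""
    p = phone_posteriors(log_likelihoods)[target]
    return p, ("good" if p >= threshold else "bad")


# Hypothetical acoustic scores for one segment where "sh" was expected.
scores = {"sh": -10.0, "s": -12.0, "ch": -13.0}
p, label = judge("sh", scores)
print(f"P(sh | A) = {p:.3f} -> {label}")
```

In practice the threshold (or a more general mapping) would be chosen so that the good/bad decisions agree as closely as possible with the expert rankings.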

Multilingual Translation Framework
- Common meaning representation: semantic frame
[Diagram labels: English, Chinese, Spanish, Japanese (input); Recognition, NLU, Parsing Rules; Semantic Frame; NLG, Generation Models, Synthesis; English, Chinese, Spanish, Japanese (output); Speech Corpora]

Content Understanding and Translation
English: Some thunderstorms may be accompanied by gusty winds and hail
Semantic frame:
  clause: weather_event
    topic: precip_act, name: thunderstorm, num: pl
      quantifier: some
    pred: accompanied_by
      adverb: possibly
      topic: wind, num: pl, pred: gusty
      and: precip_act, name: hail
Frame indexed under weather, wind, rain, storm, and hail
Japanese:
Spanish: Algunas tormentas posiblemente acompañadas por vientos racheados y granizo
Chinese: 一些雷雨可能會伴有陣風和冰雹
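The frame and its index terms can be sketched as nested Python dictionaries. The dictionary layout, the `INDEX_TERMS` table, and the `index_keys` function are assumptions made for illustration; the slide only states which terms the frame is indexed under, not how they are derived.

```python
# Semantic frame from the slide, written as nested dicts (layout is assumed).
frame = {
    "clause": "weather_event",
    "topic": {
        "type": "precip_act",
        "name": "thunderstorm",
        "num": "pl",
        "quantifier": "some",
    },
    "pred": {
        "name": "accompanied_by",
        "adverb": "possibly",
        "topic": {"type": "wind", "num": "pl", "pred": "gusty"},
        "and": {"type": "precip_act", "name": "hail"},
    },
}

# Hypothetical lookup table from frame values to index terms.
INDEX_TERMS = {
    "weather_event": {"weather"},
    "thunderstorm": {"rain", "storm"},
    "wind": {"wind"},
    "hail": {"hail"},
}


def index_keys(node):
    """Walk the frame and collect every index term its values map to."""
    keys = set()
    if isinstance(node, dict):
        for value in node.values():
            keys |= index_keys(value)
    elif isinstance(node, str):
        keys |= INDEX_TERMS.get(node, set())
    return keys


print(sorted(index_keys(frame)))
```

Since the same frame feeds every generation language, one index built this way would serve the English, Spanish, Japanese, and Chinese outputs alike.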

Audio Demonstration
- User asks: “Will it rain tomorrow in Boston?”
- System paraphrases query, then responds in Chinese
- “Please repeat that” in English or Chinese is interpreted identically
- System repeats response in Chinese
- User speaks query in English: seamless language switching
- System paraphrases, then translates query into Chinese
- User attempts to repeat translation
- Recognition error: hallucinates an erroneous date (February 30), which will be remembered
- System supplies known cities in England
- User chooses London
- System has no weather for London on February 30
- User asks “how about today?”
- System provides London’s weather today
- User asks for a translation into English, which is provided

Proposed Translation Procedure
English query: “what is your name”
Linguistic frame (English):
  {c wh_question :topic {q name :poss “you” } :auxil “link” :complement {q object :trace “what” } }
Key-value representation:
  {c eform :attribute “name” :person “you” }
Linguistic frame (Chinese):
  {c wh_question :topic {q name } :pro “you” :verb “call” :complement {q object :trace “what” } }
Chinese query: “ni3 jiao4 shen2_me5 ming2_zi4”
[Diagram labels: parse, generate, transfer]
- If generated query fails to parse, simplify interlingua and generation
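To make the bracketed frame notation on this slide concrete, here is a small sketch of a reader that turns such a string into nested dictionaries. The grammar is inferred only from the three examples shown (`{type name :key value ...}` with quoted strings or nested frames as values); `parse_frame` and the `_type`/`_name` keys are hypothetical names, not part of any MIT SLS tool.

```python
import re

# Tokens: braces, double-quoted strings, or runs of other non-space characters.
TOKEN = re.compile(r'\{|\}|"[^"]*"|[^\s{}]+')


def parse_frame(text):
    """Parse '{c name :key value ...}' into nested dicts (illustrative only)."""
    # Accept the curly quotes used on the slide as well as plain ones.
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    tokens = TOKEN.findall(text)
    frame, end = _parse(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing tokens after frame")
    return frame


def _parse(tokens, pos):
    if tokens[pos] != "{":
        raise ValueError("frame must start with '{'")
    frame = {"_type": tokens[pos + 1], "_name": tokens[pos + 2]}
    pos += 3
    while tokens[pos] != "}":
        key = tokens[pos]
        if not key.startswith(":"):
            raise ValueError(f"expected :key, got {key!r}")
        pos += 1
        if tokens[pos] == "{":
            value, pos = _parse(tokens, pos)  # nested frame
        else:
            value, pos = tokens[pos].strip('"'), pos + 1
        frame[key[1:]] = value
    return frame, pos + 1  # skip the closing brace


english = parse_frame(
    '{c wh_question :topic {q name :poss "you" } '
    ':auxil "link" :complement {q object :trace "what" } }'
)
key_value = parse_frame('{c eform :attribute "name" :person "you" }')
print(english["topic"], key_value)
```

An unbalanced frame raises an `IndexError` in this sketch rather than a friendlier message; a real reader would report the position of the missing brace.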

Proposed Exercise using Typed Inputs
[Screen layout: Type-in Window (Input) and Reply Window (Query, Response)]
Input: Da2 la2 si4 hui4 xia4 yu3 ming2 tian1 ma5?
- System is able to parse query in spite of tone errors and (limited) syntax errors
Next: Dallas rain tomorrow
Next: Los Angeles wind Saturday
Query: Da2 la1 si1 ming2 tian1 hui4 xia4 yu3 ma5?
- System color codes errors in tone and in syntactic constructs
Response: Da2 la1 si1 ming2 tian1 xia4 wu3 xia4 te4 da4 yu3
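The tone errors in the typed input above can be found with a simple comparison against the corrected query. This sketch assumes numbered pinyin (a syllable followed by a tone digit 1 to 5) and matches syllables by their toneless spelling regardless of position, so it ignores word order; `tone_errors` is a hypothetical helper, not the system's actual method.

```python
import re

# A pinyin syllable followed by its tone number (5 = neutral tone).
SYLLABLE = re.compile(r"([a-z]+)([1-5])")


def syllables(pinyin):
    """Split numbered pinyin into (base, tone) pairs."""
    return SYLLABLE.findall(pinyin.lower())


def tone_errors(typed, reference):
    """List (base, typed_tone, expected_tones) for mismatched tones."""
    expected = {}
    for base, tone in syllables(reference):
        expected.setdefault(base, set()).add(tone)
    errors = []
    for base, tone in syllables(typed):
        if base in expected and tone not in expected[base]:
            errors.append((base, tone, sorted(expected[base])))
    return errors


typed = "Da2 la2 si4 hui4 xia4 yu3 ming2 tian1 ma5?"
reference = "Da2 la1 si1 ming2 tian1 hui4 xia4 yu3 ma5?"
errors = tone_errors(typed, reference)
print(errors)
```

Detecting the word-order error in the same input would need a syntactic comparison, which this sketch leaves out.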

Testing the Effectiveness of Training on Typed Input: Proposed Measures
- Compare the quality of spoken dialogue recorded before and after a Web-based training session
- Measures of fluency:
  - Syntactic well-formedness
  - Tone production accuracy
  - Frequency of pauses, edits, and filler words
  - Phonetic quality, etc.
- Measures of communication success:
  - Frequency of usage of translation assistance
  - Understanding error rate
  - Task completion
  - Time to completion, etc.

Technology Goal: Automated Language Understanding
- Once translation ability exists from English to target language, can create reverse system almost effortlessly
[Diagram: English Sentence -> parse -> Interlingual Representation -> generate -> Mandarin Sentence; Corpus Pairs -> Grammar Induction -> Mandarin Parsing Grammar]
- Grammar induction utilizes English parse tree and Mandarin generation lexicon to induce Mandarin parse tree

Building NxN Translation Efficiently
[Diagram: English, Mandarin, Japanese, French, Arabic, Spanish, Urdu, and Korean connected through a shared Interlingua, with Automatic Grammar Induction]
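The efficiency argument behind a shared interlingua is a counting one: translating directly between every ordered pair of N languages needs N x (N - 1) systems, while routing through an interlingua needs one parser and one generator per language, or 2N components. The short sketch below works this out for the eight languages shown on the slide; the function names are illustrative.

```python
def direct_systems(n):
    """Translators needed if every ordered language pair gets its own system."""
    return n * (n - 1)


def interlingua_components(n):
    """One parser into and one generator out of the interlingua per language."""
    return 2 * n


languages = [
    "English", "Mandarin", "Japanese", "French",
    "Arabic", "Spanish", "Urdu", "Korean",
]
n = len(languages)
print(n, direct_systems(n), interlingua_components(n))
```

For these eight languages that is 56 direct systems against 16 interlingua components, and each added language costs two new components rather than 2N new translators.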

Future Plans (Near Term and Long Term)
- Install current version of system at Cambridge University
- Incorporate CU Mandarin recognizer
- Add support for audio input at the computer
- Build high-quality synthesis capability
- Improve understanding, dialogue, and translation performance
- Collect and transcribe data from language learners and assess both system and students
- Develop various scoring algorithms for student fluency
- Refine all aspects of system based on collected data