SPOKEN LANGUAGE SYSTEMS MIT Computer Science and Artificial Intelligence Laboratory Mitchell Peabody, Chao Wang, and Stephanie Seneff June 19, 2004 Lexical Tone Acquisition through Typed Interactions

Presentation transcript:

Lexical Tone Acquisition through Typed Interactions
Mitchell Peabody, Chao Wang, and Stephanie Seneff
Spoken Language Systems, MIT Computer Science and Artificial Intelligence Laboratory
June 19, 2004

Overview
Motivation
Experimental structure
Approach
  – Tone analysis
  – Lexical tone correction
  – Interface
  – Experiment
Discussion
Future work

Motivation
Dialogue systems in language learning
  – Simulated conversations
  – Small domains centered around travel scenarios
      * Flight reservations
      * Hotel reservations
      * Weather
      * Wake-up call and reminders
      * Navigation assistance
  – Feedback on performance
Leverage technology that is mature
Can use existing dialogue systems to enable data collection from non-native speakers

Motivation
Improve pronunciation in Mandarin
  – Phonetic and syllable level
  – Tone / pitch level
Non-native pitch contours do not conform to native contours in Mandarin
  – Affects understanding and interaction with native speakers
  – In possibly embarrassing ways (gan1 vs. gan4)
Recent work has focused on tone production
  – Perceptual training on isolated words (Wang et al., 1999, 2003)
  – Production training (Leather, 1990)
What about non-native speakers’ tone production as it relates to their lexical tone knowledge?
  – Non-native speakers typically confuse or forget the correct lexical tones for less commonly used words
  – How does this affect their ability to speak with proper tones?

Experiment Structure
Experiment conducted in weather domain (Jupiter)
Includes 5 phases
Intention is to introduce student to new, uncommon vocabulary (city names)

Experiment Structure: Speaking Phase 1
Record 10 read sentences in pinyin
  – Can record as many times as desired
  – Baseline when student has perfect knowledge of lexical tone

Experiment Structure: Typing Phase 2
Given 10 prompts, e.g., windy – Monday – Los Angeles
  – Instructed to create well-formed Mandarin sentences from prompts
      * luo1 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5 ?
  – Sentences typed in pinyin with numeric tone markers
  – Only general feedback is given
      * “Your sentence is grammatically correct but contains one or more tone mistakes.”

Experiment Structure: Speaking Phase 3
Record 10 sentences from prompts
  – Can record as many times as desired
  – Used as a “before” model for pitch

Experiment Structure: Typing Phase 4
Given 10 prompts, e.g., windy – Monday – Los Angeles
  – Instructed to create well-formed Mandarin sentences from prompts
      * luo1 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5 ?
  – Specific feedback on tone mistakes is given
      * “You input luo1 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5 but it should be luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5.”
  – Student is required to fix mistakes
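The difference between the general feedback of Typing Phase 2 and the specific feedback of Typing Phase 4 can be sketched in a few lines of Python. This is only an illustration: the function names `tone_mistakes` and `tone_feedback` are hypothetical, the corrected sentence is assumed to come from some tone-correction step, and only the two message templates are taken from the slides.

```python
def tone_mistakes(typed, corrected):
    """Return (position, typed syllable, corrected syllable) for each mismatch."""
    typed_syllables = typed.split()
    corrected_syllables = corrected.split()
    if len(typed_syllables) != len(corrected_syllables):
        raise ValueError("sentences must have the same number of syllables")
    return [
        (i, t, c)
        for i, (t, c) in enumerate(zip(typed_syllables, corrected_syllables))
        if t != c
    ]


def tone_feedback(typed, corrected, specific):
    """Build the feedback message, or return None if there is nothing to fix."""
    if not tone_mistakes(typed, corrected):
        return None
    if specific:
        # Phase 4 style: show the typed sentence next to the corrected one.
        return f"You input {typed} but it should be {corrected}."
    # Phase 2 style: report that mistakes exist without locating them.
    return ("Your sentence is grammatically correct but contains "
            "one or more tone mistakes.")


typed = "luo1 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5"
corrected = "luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5"
print(tone_feedback(typed, corrected, specific=False))
print(tone_feedback(typed, corrected, specific=True))
```

Returning `None` for an error-free sentence is a choice made for this sketch; the slides do not say what the system reports in that case.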

Experiment Structure: Speaking Phase 5
Record 10 sentences from prompts
  – Can record as many times as desired
  – Used as an “after” model for pitch


Approach: Tone analysis
Native versus non-native speaker pitch contours
  – Pitch extracted using algorithm in (Wang and Seneff, 2000)
  – Statistics of each pitch contour over each syllable considered without regard for left or right contexts
Normalization
  – Duration normalized by sampling pitch at 10% intervals
  – Pitch normalized according to:
Comparisons of pitch based on (Wang et al., 2003)
  – Include normalized pitch value, peak, valley, range, peak position, valley position, falling range, and rising range
Example
  – One native speaker, one non-native student
  – DLI Corpus: 4 native speakers (2065 utterances), 20 non-native speakers (4657 utterances)
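A minimal sketch of the duration normalization and a few of the contour statistics listed above, assuming a syllable's pitch track arrives as a plain list of F0 values in Hz. The function names are hypothetical, linear interpolation is an assumption, and "10% intervals" is read here as 11 samples from 0% to 100%. The pitch-value normalization step is left out because its formula is not preserved in the transcript, and falling and rising range are omitted because the slide does not define them.

```python
def resample_contour(pitch, n_points=11):
    """Sample a pitch contour at 0%, 10%, ..., 100% of the syllable duration."""
    if len(pitch) < 2:
        raise ValueError("need at least two pitch values")
    last = len(pitch) - 1
    samples = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)   # fractional index into the contour
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Linear interpolation between the two neighbouring frames (assumption).
        samples.append(pitch[lo] * (1 - frac) + pitch[hi] * frac)
    return samples


def contour_features(samples):
    """Peak, valley, range, and relative positions of peak and valley."""
    peak = max(samples)
    valley = min(samples)
    span = len(samples) - 1
    return {
        "peak": peak,
        "valley": valley,
        "range": peak - valley,
        "peak_position": samples.index(peak) / span,      # 0.0 = start, 1.0 = end
        "valley_position": samples.index(valley) / span,
    }


# A falling contour, roughly the shape expected for tone 4.
falling = [200.0, 180.0, 160.0, 140.0, 120.0]
print(contour_features(resample_contour(falling)))
```

Because every syllable is reduced to the same number of samples, contours of different durations can be compared point by point.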

Approach: Tone analysis example

Approach: Lexical Tone Correction
Normally written in characters
  – 洛杉矶星期一刮风吗?
Pinyin methods
  – Diacritic: luò shān jī xīng qī yī guā fēng ma?
  – Numeric: luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5?
If a student does not know the lexical tone for some word, then this will be reflected in the typed input
  – luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2?
How do we correct these mistakes?
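The two pinyin notations carry the same information, which a short conversion routine makes concrete. The sketch below is not part of the system described in the slides; it applies the usual tone-mark placement convention (mark "a" or "e" if present, the "o" in "ou", otherwise the last vowel) and assumes "v" may be typed for "ü". The function names are hypothetical.

```python
# Marked forms of each vowel for tones 1 to 4; tone 5 (neutral) is unmarked.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable):
    """Convert one numeric-tone syllable, e.g. 'luo4', to diacritic form."""
    base, tone = syllable[:-1], int(syllable[-1])
    base = base.replace("v", "ü")
    if tone == 5:
        return base
    if "a" in base:
        target = base.index("a")
    elif "e" in base:
        target = base.index("e")
    elif "ou" in base:
        target = base.index("o")
    else:
        # Otherwise the mark goes on the last vowel of the syllable.
        target = [i for i, ch in enumerate(base) if ch in TONE_MARKS][-1]
    marked = TONE_MARKS[base[target]][tone - 1]
    return base[:target] + marked + base[target + 1:]


def to_diacritic(sentence):
    """Convert a space-separated numeric-tone sentence to diacritic pinyin."""
    return " ".join(mark_syllable(s) for s in sentence.split())


print(to_diacritic("luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5"))
# luò shān jī xīng qī yī guā fēng ma
```

The numeric form is easier to type and to compare syllable by syllable, which is presumably why the exercise uses it for student input.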

Approach: Lexical Tone Correction
Exploit some features of Chinese
  – Syllable lexicon is small, approximately 420 unique syllables
  – 5 tones (including neutral tone)
Exploit some abilities of TINA
  – Ability to parse weighted word FST using probabilistic models
  – FST normally represents a list of recognizer hypotheses
  – A path through the FST represents the most likely correct parse
Given some input
  1) Generate FST of single sentence
  2) Expand the tones on each syllable
  3) Attempt to parse FST
  4) Path through FST represents corrected tones

FST Example, Step 1: Generate simple FST
Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2

FST Example, Step 2: Assign benefit of doubt to items that appear in lexicon
Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
Items that do not appear in lexicon are removed.

FST Example, Step 3: Expand each syllable to alternate tones
More compact than specifying each possible sentence variant.
Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
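The compactness claim can be checked with simple arithmetic. Assuming every syllable is expanded to all 5 tones, a lattice needs 5 arcs per syllable, while listing every sentence variant needs 5 to the power of the sentence length. The helper names below are hypothetical.

```python
TONES = 5  # four lexical tones plus the neutral tone


def lattice_arcs(n_syllables, tones=TONES):
    """Arcs needed when each syllable position holds its own tone alternatives."""
    return n_syllables * tones


def sentence_variants(n_syllables, tones=TONES):
    """Distinct full sentences if every tone combination is listed separately."""
    return tones ** n_syllables


n = len("luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2".split())
print(n, lattice_arcs(n), sentence_variants(n))
# 9 45 1953125
```

For the nine-syllable example, 45 arcs stand in for nearly two million explicit sentences.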

FST Example, Step 4: Remaining probability is uniformly distributed among alternate tones
Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
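Steps 2 through 4 can be sketched as a list of per-syllable weight tables, one table per position in the sentence. This is an illustration under stated assumptions, not the system's actual FST construction: the function name `build_lattice`, the toy syllable lexicon, and the benefit-of-doubt weight `keep=0.6` are all hypothetical, and alternates are restricted to toned syllables found in the lexicon.

```python
TONES = (1, 2, 3, 4, 5)


def build_lattice(typed_sentence, lexicon, keep=0.6):
    """For each typed syllable, map candidate toned syllables to weights."""
    lattice = []
    for syllable in typed_sentence.split():
        base, typed_tone = syllable[:-1], int(syllable[-1])
        allowed = [t for t in TONES if f"{base}{t}" in lexicon]
        if not allowed:
            raise ValueError(f"no lexicon entry for syllable {base!r}")
        alternates = [t for t in allowed if t != typed_tone]
        arcs = {}
        if typed_tone in allowed:
            # Step 2: the typed form gets the benefit of the doubt.
            typed_weight = keep if alternates else 1.0
            arcs[f"{base}{typed_tone}"] = typed_weight
        else:
            # Step 2: typed forms missing from the lexicon are removed.
            typed_weight = 0.0
        # Steps 3 and 4: spread the remaining mass uniformly over alternates.
        for t in alternates:
            arcs[f"{base}{t}"] = (1.0 - typed_weight) / len(alternates)
        lattice.append(arcs)
    return lattice


toy_lexicon = {"luo2", "luo3", "luo4", "ma1", "ma3", "ma5"}
for arcs in build_lattice("luo3 ma2", toy_lexicon):
    print(arcs)
```

In this toy run "luo3" keeps most of the weight because it is a valid syllable, while "ma2" is dropped and its weight is shared among the forms that are in the lexicon.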

FST Example, Step 5: Parsing reveals the correct tones
Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
Correct: luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5
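The idea that a parse selects the correct tones can be illustrated with a much simpler stand-in for TINA: a best-path search over a toy word lexicon. Everything in this sketch is an assumption made for illustration, including the word list, the prior probabilities, the `keep` weight, and the function names. The real system parses the weighted FST with a probabilistic grammar rather than a word list.

```python
import math

# Hypothetical toy lexicon: word as toned syllables -> assumed prior probability.
LEXICON = {
    ("luo4", "shan1", "ji1"): 0.2,   # Los Angeles
    ("xing1", "qi1", "yi1"): 0.2,    # Monday
    ("gua1", "feng1"): 0.2,          # windy
    ("ma5",): 0.3,                   # question particle
    ("ma3",): 0.1,                   # horse
}


def strip_tone(syllable):
    return syllable[:-1]


def arc_weight(typed, candidate, keep=0.6):
    """Typed tone keeps weight `keep`; the rest is shared by the 4 other tones."""
    return keep if typed == candidate else (1.0 - keep) / 4


def correct_tones(typed_sentence, lexicon=LEXICON, keep=0.6):
    """Return the highest-scoring toned sentence, or None if nothing matches."""
    typed = typed_sentence.split()
    n = len(typed)
    best = [None] * (n + 1)   # best[i] = (log score, syllables) covering typed[:i]
    best[0] = (0.0, [])
    for start in range(n):
        if best[start] is None:
            continue
        for word, prior in lexicon.items():
            end = start + len(word)
            if end > n:
                continue
            span = typed[start:end]
            # A word is a candidate only if it matches once tones are ignored.
            if [strip_tone(s) for s in word] != [strip_tone(s) for s in span]:
                continue
            score = best[start][0] + math.log(prior)
            for t, c in zip(span, word):
                score += math.log(arc_weight(t, c, keep))
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + list(word))
    return None if best[n] is None else " ".join(best[n][1])


print(correct_tones("luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2"))
# luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5
```

In this toy setup the final "ma2" resolves to "ma5" only because the question particle was given a higher prior than "ma3"; in the described system that decision would come from the grammar.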

Approach: Web interface
Student is prompted for city, time, and event
Student types in either:
  – A question concerning this topic in Mandarin using pinyin
  – An English word or phrase for a translation
Student is given feedback

Approach: Experiment
5 phases
  – Read speech
  – Typed with only general feedback in typed portion
  – Recorded prompts
  – Typed with specific feedback in typed portion
  – Recorded prompts
Participants so far are all students in their early to mid-20s in the 1st year of MIT’s Chinese program.
We have made arrangements with the Defense Language Institute to have their students participate in future experiments.


Discussion
Laid out a framework for a set of exercises to help students acquire competency in a foreign language on a specific topic (weather)
Designed an experiment for examining the effects of lexical tone knowledge in non-native speakers
Implemented a robust method capable of correcting lexical tone errors in typed pinyin
Outlined a method for pitch assessment
Premature to make any claims due to data sparseness
Unforeseen benefits of lexical tone correction
  – Can correct erroneous recognizer output with language model
  – Enables non-native speakers with imperfect lexical tone knowledge to accurately transcribe user utterances

Future work
Data collection
  – Invite a large group of students to participate in the exercise
  – Allow students to interact with weather dialogue system
System extensions
  – Provide examples of native speech for sentences typed by students, using high-quality Mandarin from ENVOICE (Yi, 2003)
  – Automatic pitch correction using phase vocoder techniques (Tang et al., 2001)
Assessment
  – Develop context-dependent models to account for tone sandhi and co-articulation effects
  – Develop algorithms for tone assessment
  – Augment with segmental assessment techniques (Kim et al., 2004)
  – Analyze syntactic errors made by non-natives (since prompts require students to form their own sentences)

Thank you! 谢谢!
Questions?