
Presentation transcript:

A Thai Speech Translation System for Medical Dialogs
Tanja Schultz, Dorcas Alexander, Alan W Black, Kay Peterson, Sinaporn Suebvisai, Alex Waibel

Speech Recognition

Rapid development in new languages:
- Limited training data (6 hrs) provided by NECTEC from 34 speakers, plus 8 speakers for development and test
- Romanization of Thai script:
  - allows non-Thai researchers to work with the Roman representation, for example in grammar development
  - the romanized output essentially provides the pronunciation, which makes the speech synthesis component easier
- Current dictionary covers the given 6-hour database = 734 words
- Rapid bootstrapping of acoustic models using a 7-lingual GlobalPhone model set (Ch, Cr, Fr, Ge, Ja, Sp, Tu)
- Results on ASR indicate that rapid bootstrapping can be done successfully for a limited domain (see table)

Word accuracy [%] in Thai language on the evaluation set:
  CI-AM           83.63%
  CD-AM (500)     84.44%
  CD-AM (1000)    82.71%

System Architecture

- Tcl/Tk based Communication Server
- Runs on Windows and Linux platforms
- Integrates several languages: Thai, English, Spa, Ch, ...
- Integrates different speech recognition approaches: decoding along n-grams versus Context Free Grammars
- Integrates different translation approaches: IF-based translation versus statistical MT
- Integrates two approaches to natural language generation from IF: knowledge-based generation with pseudo-unification, and statistical generation
- Allows transmission of IF across devices for (wireless) multi-party translation (see demo: Laptop, PDA)

[Figure: system architecture. Labels shown: Source Lang Speech; Recognition/Analysis: SR+LM, Stat. Analysis SOUP, SR+Parsing (CFG-Grammar); IF; Symbolic Generation GenKit; Statistical Generation IF2NL; Direct SMT; Target Language Text; Synthesis Cepstral; Target Lang Speech; Thai / English medical; English / Thai medical.]

Speech Synthesis

- First Thai voice built in the Festival Speech Synthesis System
- Limited domain, targeting the Hotel Reservation domain
- 235 sentences that covered the main aspects of immediate interest
- Recorded, auto-labeled, and built a synthetic voice using FestVox tools
- Converted to a small-footprint portable version using Cepstral's Theta engine

Rapid synthesis development in new languages:
- Phoneme set shared with Speech Recognition
- Lexicon with a 522-word vocabulary constructed by hand
- Statistically trained letter-to-sound rules to bootstrap the required word coverage
- Unit selection concatenative synthesis
- Phones tagged with syllable and tone information for more fluent results

Translation

Interlingua-based Machine Translation component, Interchange Format (IF):
- abstracts from variation in syntax across languages
- allows monolingual development for analysis and generation
- provides paraphrase back into the source language
- can be easily extended to new languages due to STAR structure

Some extensions due to Thai characteristics:
- The use of a term to indicate the gender of the person:
    Thai: zookhee kha1 - Eng: okay (ending)
    s[acknowledge] (zookhee *[speaker=])
- An affirmation that means more than simply "yes":
    Thai: saap khrap - Eng: know (ending)
    s[affirm+knowledge](saap *[speaker=])
- Verb separation of terms for feasibility and other modalities

Interface:
- Hypothesis (Thai + Roman script)
- Parse tree (CFG)
- Translation
- IF representation
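
The word accuracy figures reported for the Thai recognizer can be illustrated with a short sketch. The slides do not say how the scores were computed, so this assumes the common definition, word accuracy = 100 * (N - S - D - I) / N, where N is the number of reference words and S, D, I are the substitutions, deletions, and insertions in a minimum edit-distance alignment. The function name `word_accuracy` and the sample strings are hypothetical and chosen only for illustration.

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy in percent, assuming 100 * (N - S - D - I) / N.

    S + D + I is taken as the word-level Levenshtein distance between
    the reference transcript and the recognizer hypothesis.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j]: minimum edits to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr

    errors = prev[-1]
    return 100.0 * (len(ref) - errors) / len(ref)


if __name__ == "__main__":
    # Romanized words from the slides, used here only as sample tokens.
    print(word_accuracy("saap khrap", "saap kha1"))
```

Under this definition the score can drop below zero when the hypothesis contains many inserted words, which is one reason it is reported separately from the percentage of correctly recognized words.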