A Thai Speech Translation System for Medical Dialogs

Tanja Schultz, Dorcas Alexander, Alan W Black, Kay Peterson, Sinaporn Suebvisai, Alex Waibel

Speech Recognition

Rapid development in new languages:
- Limited training data (6 hours) provided by NECTEC from 34 speakers, plus 8 speakers for development and test
- Romanization of Thai script, which:
  - allows non-Thai researchers to work with the Roman representation, for example in grammar development
  - makes the romanized output basically provide the pronunciation, which is easier for the speech synthesis component
- Current dictionary covers the given 6-hour database (734 words)
- Rapid bootstrapping of acoustic models using a 7-lingual GlobalPhone model set (Ch, Cr, Fr, Ge, Ja, Sp, Tu)
- ASR results indicate that rapid bootstrapping can be done successfully for a limited domain (see table)

Word accuracy [%] in Thai on the evaluation set:

Acoustic model   Word accuracy [%]
CI-AM            83.63
CD-AM (500)      84.44
CD-AM (1000)     82.71

System Architecture

- Tcl/Tk based Communication Server
- Runs on Windows and Linux platforms
- Integrates several languages: Thai, English, Spa, Ch, ...
- Integrates different speech recognition approaches: decoding along n-grams versus Context Free Grammars
- Integrates different translation approaches: IF-based translation versus statistical MT
- Integrates two natural language generation approaches from IF: knowledge-based generation with pseudo-unification, and statistical generation
- Allows transmission of IF across devices for (wireless) multi-party translation (see demo: laptop and PDA)

[Figure: system architecture diagram. Labels: Recognition/Analysis, SR+LM, Stat. Analysis, SOUP, SR+Parsing (CFG-Grammar), Direct SMT, IF, Symbolic Generation, GenKit, Statistical Generation, IF2NL, Source Lang Speech, Target Language Text, Speech Synthesis, Cepstral, Target Lang Speech, Thai / English medical, English / Thai medical]

Speech Synthesis

- First Thai voice built in the Festival Speech Synthesis System
- Limited-domain voice targeting the hotel reservation domain
- 235 sentences covering the main aspects of immediate interest
- Recorded, auto-labeled, and built into a synthetic voice using FestVox tools
- Converted to a small-footprint portable version using Cepstral's Theta engine

Rapid synthesis development in new languages:
- Phoneme set shared with speech recognition
- Lexicon of 522 words constructed by hand
- Statistically trained letter-to-sound rules to bootstrap the required word coverage
- Unit selection concatenative synthesis
- Phones tagged with syllable and tone information for more fluent results

Translation

Interlingua-based machine translation component. The Interchange Format (IF):
- abstracts from variation in syntax across languages
- allows monolingual development for analysis and generation
- provides paraphrase back into the source language
- can be easily extended to new languages due to its STAR structure

Some extensions due to Thai characteristics:
- The use of a term to indicate the gender of the person:
  Thai: zookhee kha1 - Eng: okay (ending)
  s[acknowledge] (zookhee *[speaker=])
- An affirmation that means more than simply "yes":
  Thai: saap khrap - Eng: know (ending)
  s[affirm+knowledge](saap *[speaker=])
- Verb separation of terms for feasibility and other modalities

Interface: hypothesis in Thai + Roman script, parse tree (CFG), translation, IF representation
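
The poster reports "Word accuracy [%]" for each acoustic model but does not say how the figure was computed. As a minimal sketch, assuming the conventional definition (100 * (N - S - D - I) / N over a minimum edit-distance alignment of the recognizer output against the reference transcript), it could be computed as below. The function name `word_accuracy` and the example word strings are illustrative assumptions, not part of the original system; the example simply reuses romanized words quoted on the poster and is not real evaluation data.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy in percent, assuming the conventional definition:
    100 * (N - S - D - I) / N, where N is the number of reference words
    and S + D + I is the word-level edit distance to the hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    errors = prev[-1]
    return 100.0 * (len(ref) - errors) / len(ref)


# Hypothetical example: one of four reference words is dropped.
print(word_accuracy("zookhee kha1 saap khrap", "zookhee saap khrap"))  # 75.0
```

Because insertions count as errors, this measure can drop below zero when the hypothesis contains many spurious words, which is why it is usually reported alongside the size of the evaluation set.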
