
Mokusei: Multilingual Conversational Interfaces


S. Seneff, J. Glass, T.J. Hazen, J. Polifroni, and V. Zue, MIT Laboratory for Computer Science
Y. Minami, NTT Cyberspace Laboratories

Research Objectives
Explore language-independent approaches to speech understanding and generation
Port human-language technologies for English conversational interfaces to Japanese
Use existing Jupiter domain as test case
–A telephone-only conversational interface for weather information
–More than 500 cities worldwide (~350 in US)
–On-line information from four Web sites
–Use the Galaxy client-server architecture

Language as Interface
Speech Recognition (SUMMIT: Glass et al., ICSLP ’96)
–Lexicon: >2,000 words with phonemic pronunciations
–Phonological modeling:
*Japanese-specific phonological rules, e.g., desu ka → /d e s k a/
*Japanese phonetic units mapped into English ones
–Acoustic modeling:
*Used English models to generate forced transcriptions of utterances
*Retrained acoustic models to create hybrid models
–Language modeling:
*Class n-gram using 60 word classes, trained on ~3,500 read & spontaneous sentences
*Also exploring a class n-gram derived automatically from TINA
Language Understanding (TINA: Seneff, Comp Ling, ’92)
–Japanese grammar contains >900 unique non-terminals
–Translation file maps Japanese words to English equivalent
–Produces same semantic frame as for English inputs
–Left-recursive structure of Japanese requires look-ahead to resolve role of content words
*Parse each content word into structure labeled “object”
*Drop off “object” after next particle, which defines role and position in hierarchy
Language Generation (GENESIS: Glass et al., ICSLP ’94)
–Used English language generation tables as template
–Modified ordering of constituents
–Provided translation lexicon for words
–Many language-specific challenges, including constituent ordering, quantifier translation, and multiple meanings
Speech Synthesis
–NTT Fluet text-to-speech system
Note: Sample sentences from Japanese speakers can be played from PC

Language as Content
Use the same internal representation for Japanese and English
Update from Web sites and satellite feeds at frequent intervals
Parse all data into semantic frames to capture meaning
Scan frames for semantic content and prepare new relational database table entries
English: Some thunderstorms may be accompanied by gusty winds and hail
Japanese: [sentence not preserved in the transcript]
Semantic frame:
clause: weather_event
topic: precip_act, name: thunderstorm, num: pl
quantifier: some
pred: accompanied_by
adverb: possibly
topic: wind, num: pl, pred: gusty
and: precip_act, name: hail
Frame indexed under weather, wind, rain, storm, and hail

Lessons Learned
Our approach to developing multilingual interfaces appears feasible
A top-down approach to parsing can be made effective for left-recursive languages
Word order divergence between English and Japanese motivated a redesign of our language generation component
Novel technique of generating a class n-gram language model using the NL component appears promising
Involvement of a Japanese researcher is essential

Future Plans
Additional data collection from native Japanese speakers
–Nearly 1000 sentences were collected in December
Improvement of individual components
–Vocabulary coverage, acoustic and language models
–Parse coverage
–Continued development of a more sophisticated language generation component
Expansion of weather content for Japan
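
The class n-gram language model listed under Speech Recognition can be illustrated with a minimal class bigram sketch. This is not the SUMMIT implementation: the function names, the toy sentences, and the single CITY class are assumptions made for illustration, and the poster's model uses 60 classes trained on about 3,500 sentences. The sketch factors the word probability as P(class | previous class) times P(word | class), so a word pair never seen in training can still receive probability through its class.

```python
from collections import Counter

def train_class_bigram(sentences, word_to_class):
    """Collect relative-frequency counts for a class bigram model.

    Words without an entry in word_to_class act as their own class.
    """
    class_bigrams = Counter()   # (previous class, class) counts
    class_context = Counter()   # how often a class appears as the previous class
    word_in_class = Counter()   # (word, class) counts
    class_total = Counter()     # class counts
    for words in sentences:
        classes = ["<s>"] + [word_to_class.get(w, w) for w in words]
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
        for w in words:
            c = word_to_class.get(w, w)
            word_in_class[(w, c)] += 1
            class_total[c] += 1
    return class_bigrams, class_context, word_in_class, class_total

def word_prob(model, prev_word, word, word_to_class):
    """P(word | prev_word) = P(class | previous class) * P(word | class)."""
    class_bigrams, class_context, word_in_class, class_total = model
    prev_c = "<s>" if prev_word is None else word_to_class.get(prev_word, prev_word)
    c = word_to_class.get(word, word)
    if class_context[prev_c] == 0 or class_total[c] == 0:
        return 0.0  # unseen context or unseen class; no smoothing in this sketch
    p_class = class_bigrams[(prev_c, c)] / class_context[prev_c]
    p_word = word_in_class[(word, c)] / class_total[c]
    return p_class * p_word

# Toy data with English glosses (hypothetical, not from the Mokusei corpus).
WORD_TO_CLASS = {"boston": "CITY", "tokyo": "CITY"}
SENTENCES = [["boston", "weather"], ["boston", "forecast"], ["tokyo", "weather"]]
MODEL = train_class_bigram(SENTENCES, WORD_TO_CLASS)

# "tokyo forecast" never occurs in the toy data, but "CITY forecast" does.
print(word_prob(MODEL, "tokyo", "forecast", WORD_TO_CLASS))
```

Sharing statistics across a class is what makes the approach attractive for a domain with hundreds of city names and limited training data. A real system would also add smoothing, which this sketch omits.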
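
The look-ahead strategy described under Language Understanding, where each content word is first parsed as a provisional "object" and then given its role by the following particle, can be sketched as a toy procedure. This is a simplification and not TINA, which is a full top-down parser. The particle-to-role table, the function name, and the romanized example are assumptions for illustration only.

```python
# Hypothetical particle-to-role table; a real grammar is far richer.
PARTICLE_ROLES = {
    "wa": "topic",
    "ga": "subject",
    "o": "direct_object",
    "no": "modifier",
}

def assign_roles(tokens):
    """Hold each content word as a provisional "object" until a particle names its role."""
    resolved = []
    pending = None  # content word whose role is not yet known
    for token in tokens:
        if token in PARTICLE_ROLES:
            if pending is not None:
                # The particle resolves the role of the preceding content word.
                resolved.append((pending, PARTICLE_ROLES[token]))
                pending = None
        else:
            if pending is not None:
                # No particle followed the earlier word, so it stays an "object".
                resolved.append((pending, "object"))
            pending = token
    if pending is not None:
        resolved.append((pending, "object"))
    return resolved

# "tokyo no tenki wa": roughly "as for the weather of Tokyo".
print(assign_roles(["tokyo", "no", "tenki", "wa"]))
```

The point of the sketch is only that the role of a word is unknown when it is read and becomes known one token later, which is why a top-down parser needs a placeholder category for left-recursive input.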
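
The Language as Content step, scanning a semantic frame and preparing index entries for a relational table, might look like the following sketch. The nested dictionary layout of the frame, the INDEX_KEYS mapping, and the function names are all assumptions; the poster shows the frame contents and the resulting index keys but not the nesting or the table schema.

```python
# Hypothetical mapping from frame values to index keys.
INDEX_KEYS = {
    "weather_event": ["weather"],
    "thunderstorm": ["rain", "storm"],
    "wind": ["wind"],
    "hail": ["hail"],
}

def index_keys(frame):
    """Walk a nested frame and collect index keys in first-seen order."""
    keys = []

    def visit(node):
        if isinstance(node, dict):
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)
        elif isinstance(node, str):
            for key in INDEX_KEYS.get(node, []):
                if key not in keys:
                    keys.append(key)

    visit(frame)
    return keys

def index_rows(frame_id, frame):
    """Produce (index key, frame id) rows ready for insertion into a table."""
    return [(key, frame_id) for key in index_keys(frame)]

# Frame for "Some thunderstorms may be accompanied by gusty winds and hail".
# The nesting below is a guess at the structure, not the system's actual layout.
FRAME = {
    "clause": "weather_event",
    "topic": {
        "type": "precip_act", "name": "thunderstorm", "num": "pl",
        "quantifier": "some",
        "pred": {
            "name": "accompanied_by", "adverb": "possibly",
            "topic": {
                "name": "wind", "num": "pl", "pred": "gusty",
                "and": {"type": "precip_act", "name": "hail"},
            },
        },
    },
}

print(index_rows(1, FRAME))
```

Because English and Japanese inputs produce the same frame, an index built this way serves queries in either language without language-specific lookup code.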

