Published by Curtis Horn. Modified over 9 years ago.
1
L C S Spoken Language Systems Group
Multilingual Conversational Interfaces: An NTT-MIT Collaboration
Stephanie Seneff
Spoken Language Systems Group, MIT Laboratory for Computer Science
January 13, 2000
2
Collaborators
James Glass (Co-PI, MIT-LCS)
T.J. Hazen (MIT-LCS)
Yasuhiro Minami (NTT Cyberspace Labs)
Joseph Polifroni (MIT-LCS)
Victor Zue (MIT-LCS)
3
What are Conversational Interfaces?
Can communicate with users through conversation
Can understand verbal input
–Speech recognition
–Language understanding (in context)
Can retrieve information from on-line sources
Can verbalize response
–Language generation
–Speech synthesis
Can engage in dialogue with a user during the interaction
Introduction || Approach || Mokusei || Summary
4
Components of Conversational Interfaces
[Diagram. Components: SPEECH RECOGNITION, LANGUAGE UNDERSTANDING, DISCOURSE CONTEXT, DIALOGUE MANAGEMENT, DATABASE, LANGUAGE GENERATION, SPEECH SYNTHESIS. Labels on the connections: Speech, Words, Meaning Representation, Meaning, Sentence, Speech, Graphs & Tables.]
5
System Architecture: Galaxy Hub
[Diagram. Servers connected to the hub: Speech Recognition, Language Understanding, Discourse Resolution, Dialogue Management, Language Generation, Text-to-Speech Conversion, Audio Server, I/O Servers, Application Back-end.]
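The hub-and-servers layout on this slide can be illustrated with a minimal sketch. It is not the Galaxy implementation: the `Hub` class, the server functions, the frame keys, and the route are all hypothetical names chosen for illustration. The only idea taken from the slide is that each component is a separate server and that servers communicate through a central hub rather than with each other.

```python
from typing import Callable, Dict, List

Frame = Dict[str, object]
Server = Callable[[Frame], Frame]


class Hub:
    """Toy hub: servers never call each other, they only exchange frames via the hub."""

    def __init__(self) -> None:
        self.servers: Dict[str, Server] = {}

    def register(self, name: str, server: Server) -> None:
        self.servers[name] = server

    def run(self, route: List[str], frame: Frame) -> Frame:
        # Pass the frame to each named server in turn; each returns an updated frame.
        for name in route:
            frame = self.servers[name](frame)
        return frame


def recognize(frame: Frame) -> Frame:
    # Stand-in for speech recognition: pretend the audio decoded to these words.
    return {**frame, "words": ["weather", "in", "tokyo"]}


def understand(frame: Frame) -> Frame:
    # Stand-in for language understanding: build a tiny meaning representation.
    words = frame["words"]
    return {**frame, "meaning": {"topic": words[0], "city": words[-1]}}


def generate(frame: Frame) -> Frame:
    # Stand-in for language generation: verbalize the meaning representation.
    meaning = frame["meaning"]
    return {**frame, "reply": f"Looking up {meaning['topic']} for {meaning['city']}."}


hub = Hub()
hub.register("recognition", recognize)
hub.register("understanding", understand)
hub.register("generation", generate)

result = hub.run(["recognition", "understanding", "generation"], {"audio": b""})
print(result["reply"])
```

Because every server only sees frames handed to it by the hub, swapping one server (for example a Japanese recognizer for an English one) does not require changes to the others in this sketch.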
6
Application Development at MIT
Jupiter: Weather reports (1997)
–500 cities worldwide
–Information updated three times daily from four web sites, plus a satellite feed
Pegasus: Flight status (1998)
–~4,000 flights in US airspace for 55 major cities
–Information updated every three minutes
–Also uses flight schedule information, updated daily
Voyager: (Greater Boston) traffic and navigation (1998)
–Traffic information updated every three minutes
–Also uses maps and navigation information
Mercury: Travel planning (1999)
–Flight information and reservation for ~250 cities worldwide
–Flight schedule information and pricing
Demonstration: Jupiter in English
7
NTT-MIT Collaborative Research: Mokusei
Explore language-independent approaches to speech understanding and generation
Develop necessary human-language technologies to enable porting of conversational interfaces from English to Japanese
Use existing Jupiter weather-information domain as test case
–It is the most mature English system
–It allows us to explore language technology for interface and content
[Diagram. Two Speech Understanding and two Speech Generation components connected through a Common Meaning Representation to the DATABASE.]
8
Multilingual Conversational Systems: Our Approach
[Diagram. Components: SPEECH RECOGNITION, LANGUAGE UNDERSTANDING, DISCOURSE CONTEXT, DIALOGUE MANAGER, DATABASE, LANGUAGE GENERATION, SPEECH SYNTHESIS. Other labels: Meaning Representation, Graphs & Tables, Rules, Models. Legend: Language Independent, Language Dependent, Language Transparent.]
9
Mokusei: Speech Recognition
Lexicon: >2,000 words
Phonological modeling:
–Japanese-specific phonological rules, e.g.,
*Deletion of /i/ and /u/: desu ka → /d e s k a/
Acoustic modeling:
–Used English models to generate transcriptions for Japanese (read and spontaneous) utterances
–Retrained acoustic models to create hybrid models from a mixture of English and Japanese utterances
Language modeling:
–Class n-gram using 60 word classes
–Also exploring a class n-gram derived automatically from TINA
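The vowel-deletion rule in the phonological modeling item can be sketched as a function that expands a canonical pronunciation into variants. The slide gives only the "desu ka" example; the conditioning context used here (a high vowel between two voiceless consonants) is an assumption, as are the phoneme inventory in `VOICELESS` and the function name `pronunciations`. This is a sketch of the idea, not the recognizer's actual rule set.

```python
# Assumed set of voiceless consonant symbols; not taken from the slides.
VOICELESS = {"k", "s", "t", "h", "p", "sh", "ch", "ts", "f"}
HIGH_VOWELS = {"i", "u"}


def pronunciations(phonemes):
    """Return the canonical pronunciation plus, if different, a reduced variant.

    The reduced variant drops every /i/ or /u/ that sits between two
    voiceless consonants (assumed context for the deletion rule).
    """
    last = len(phonemes) - 1
    reduced = [
        p
        for i, p in enumerate(phonemes)
        if not (
            p in HIGH_VOWELS
            and 0 < i < last
            and phonemes[i - 1] in VOICELESS
            and phonemes[i + 1] in VOICELESS
        )
    ]
    variants = [list(phonemes)]
    if reduced != variants[0]:
        variants.append(reduced)
    return variants


desu_ka = pronunciations(["d", "e", "s", "u", "k", "a"])
print(desu_ka)
```

Keeping both variants lets a recognizer lexicon accept either the full or the reduced pronunciation of the same word sequence.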
10
Mokusei: Language Understanding
Parse query into meaning representation
–Uses same NL system (TINA) as for English
–Top-down parsing strategy with trace mechanism
–Probability model automatically trained
–Chooses best hypothesis from proposed word graph
Japanese grammar contains
–>900 unique nonterminals
–Nearly 2,500 vocabulary items
Translation file maps Japanese words to English equivalent
Produces same semantic frame (i.e., meaning representation) as for English inputs
11
Mokusei: Language Understanding (cont’d)
Problem: Left recursive structure of Japanese requires look-ahead to resolve role of content words
–Nihon wa...
–Nihon no tenki wa...
–Nihon no Tokyo no tenki wa...
Solution: Use trace mechanism
–Parse each content word into structure labeled “object”
–Drop off “object” after next particle, which defines role and position in hierarchy
[Parse diagram for “Nihon no Tokyo no tenki wa doo desu ka”. Node labels: subject, predicate, wa, no, object-1, object-2, object-3.]
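The idea of holding a content word as an unresolved "object" until the next particle fixes its role can be illustrated with a toy loop. This is not TINA's trace mechanism or grammar: the function `assign_roles`, the two-particle inventory, and the output keys are assumptions made for the sketch, and it handles only the "X no Y no Z wa ..." pattern from the slide.

```python
def assign_roles(tokens):
    """Toy role assignment: a content word stays pending until a particle follows it."""
    pending = None   # most recent content word whose role is still unknown ("object")
    modifiers = []   # content words resolved as "no"-modifiers, outermost first
    for i, tok in enumerate(tokens):
        if tok == "no":
            # "X no": X turns out to modify whatever content word comes next.
            modifiers.append(pending)
            pending = None
        elif tok == "wa":
            # "X wa": X turns out to be the subject; the rest is treated as the predicate.
            return {
                "subject": pending,
                "modifiers": modifiers,
                "predicate": " ".join(tokens[i + 1:]),
                "unresolved": None,
            }
        else:
            pending = tok
    # No "wa" seen yet: the last content word still has no role.
    return {"subject": None, "modifiers": modifiers, "predicate": "", "unresolved": pending}


parsed = assign_roles("Nihon no Tokyo no tenki wa doo desu ka".split())
print(parsed)
```

In this sketch "Nihon" ends up as a subject in "Nihon wa" but as a modifier in "Nihon no tenki wa", which is the ambiguity that cannot be resolved until the particle is seen.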
12
Mokusei: Content Processing
Update sources from Web sites and satellite feeds at frequent intervals
–Now harvesting weather reports for ~50 additional Japanese cities
Use the same representation for English and Japanese
Parse all linguistic data into semantic frames to capture meaning
Scan frames for semantic content and prepare new relational database table entries
13
Mokusei: Example of Content Processing
English: Some thunderstorms may be accompanied by gusty winds and hail
clause: weather_event
  topic: precip_act, name: thunderstorm, num: pl
    quantifier: some
  pred: accompanied_by
    adverb: possibly
    topic: wind, num: pl, pred: gusty
    and: precip_act, name: hail
Frame indexed under weather, wind, rain, storm, and hail
Japanese:
Spanish: Algunas tormentas posiblemente acompanadas por vientos racheados y granizo
German: Einige Gewitter moeglicherweise begleitet von boeigem Wind und Hagel
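The indexing step on this slide can be sketched by storing the frame as a nested dictionary and walking it to collect index keys. The dictionary layout (including the `type` key), the `INDEX_KEYS` table, the function `index_keys`, and the choice to always include "weather" are assumptions for illustration; the slide shows only the frame contents and the resulting five keys.

```python
# The frame from the slide as a nested dict; the "type" key is an assumed name
# for the unlabeled value that follows "topic:", "pred:", and "and:".
frame = {
    "clause": "weather_event",
    "topic": {"type": "precip_act", "name": "thunderstorm", "num": "pl", "quantifier": "some"},
    "pred": {
        "type": "accompanied_by",
        "adverb": "possibly",
        "topic": {"type": "wind", "num": "pl", "pred": "gusty"},
        "and": {"type": "precip_act", "name": "hail"},
    },
}

# Hypothetical mapping from frame values to database index keys.
INDEX_KEYS = {"thunderstorm": ["rain", "storm"], "wind": ["wind"], "hail": ["hail"]}


def index_keys(node, found=None):
    """Collect index keys by scanning every nested dict for known names and types."""
    # Assumption: every weather frame is indexed under the general key "weather".
    found = ["weather"] if found is None else found
    if isinstance(node, dict):
        for label in (node.get("name"), node.get("type")):
            for key in INDEX_KEYS.get(label, []):
                if key not in found:
                    found.append(key)
        for value in node.values():
            index_keys(value, found)
    return found


keys = index_keys(frame)
print(keys)
```

Because the keys come from the frame rather than from the surface sentence, the same lookup would apply whichever language the frame is later verbalized in.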
14
Mokusei: Language Generation Using Genesis
Used English language generation tables as template
Modified ordering of constituents
Provided translation lexicon for >4,000 words
Challenges:
–Prepositions had to be marked for role: in_loc, in_time
–Multiple meanings for some other words: e.g., “well inland”
–Complex sentences presented difficulties for constituent ordering
A new version of GENESIS is being developed to support finer control of constituent ordering
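The combination of a translation lexicon with language-specific constituent ordering can be sketched as two small tables applied to one frame. This is not GENESIS or its table format: `LEXICON`, `ORDER`, and `generate` are hypothetical names, and the frame key `in_loc` simply borrows the role label mentioned on the slide. The Japanese words are the ones that appear in the parsing examples elsewhere in the deck.

```python
# Hypothetical translation lexicon: frame value -> surface word per language.
LEXICON = {
    "en": {"weather": "weather", "japan": "Japan"},
    "ja": {"weather": "tenki", "japan": "Nihon"},
}

# Hypothetical ordering rule per language: (first constituent, linking word, second constituent).
# English puts the topic first with a preposition; Japanese puts the location first with "no".
ORDER = {
    "en": ("topic", "in", "in_loc"),
    "ja": ("in_loc", "no", "topic"),
}


def generate(frame, lang):
    """Verbalize a two-constituent frame using the given language's lexicon and ordering."""
    first, link, second = ORDER[lang]
    words = LEXICON[lang]
    return f"{words[frame[first]]} {link} {words[frame[second]]}"


frame = {"topic": "weather", "in_loc": "japan"}
english = generate(frame, "en")
japanese = generate(frame, "ja")
print(english)
print(japanese)
```

Marking the location as `in_loc` rather than as a bare "in" is what lets the ordering table pick a different linking word and position per language in this sketch.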
15
Mokusei: Speech Synthesis
Currently use the NTT Fluet text-to-speech system
Fully integrated into the system
–Runs as a server communicating with the Galaxy hub
16
Mokusei Demonstration
Entire system running at MIT
Access via international telephone call
Scenario: inquiring about weather conditions in Japan and worldwide
Potential problems:
–The system is VERY new!
–System reliability
–14-hour time difference
–Transmission conditions and environmental noise
17
Lessons Learned
Our approach to developing multilingual interfaces appears feasible
–Performance is similar to that of the English system two years ago
A top-down approach to parsing can be made effective for left-recursive languages
Word order divergence between English and Japanese motivated a redesign of our language generation component
Novel technique of generating a class n-gram language model using the NL component appears promising
Involvement of a Japanese researcher is essential
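For readers unfamiliar with class n-grams, the following is a minimal class bigram sketch, in which the probability of a word is the probability of its class given the previous word's class, times the probability of the word within its class. The classes here are assigned by hand and the three training sentences are invented for illustration; the slides do not describe how the 60 classes were built or how the TINA-derived model works. There is no smoothing, so unseen class histories would cause a division by zero.

```python
from collections import Counter


def train_class_bigram(sentences, word_to_class):
    """Estimate an unsmoothed class bigram: P(w | prev) = P(c_w | c_prev) * P(w | c_w)."""
    class_bigrams = Counter()   # counts of (previous class, current class)
    history_counts = Counter()  # counts of each class used as a history
    word_counts = Counter()
    class_counts = Counter()
    for sentence in sentences:
        classes = ["<s>"] + [word_to_class[w] for w in sentence]
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            history_counts[prev] += 1
        for word in sentence:
            word_counts[word] += 1
            class_counts[word_to_class[word]] += 1

    def prob(word, prev_word=None):
        cur = word_to_class[word]
        prev = "<s>" if prev_word is None else word_to_class[prev_word]
        p_class = class_bigrams[(prev, cur)] / history_counts[prev]
        p_word = word_counts[word] / class_counts[cur]
        return p_class * p_word

    return prob


# Hand-assigned toy classes and invented training sentences (assumptions).
WORD_TO_CLASS = {"Nihon": "PLACE", "Tokyo": "PLACE", "no": "NO", "tenki": "TOPIC", "wa": "WA"}
SENTENCES = [
    ["Nihon", "no", "tenki", "wa"],
    ["Nihon", "no", "tenki", "wa"],
    ["Tokyo", "wa"],
]

prob = train_class_bigram(SENTENCES, WORD_TO_CLASS)
p_no_after_tokyo = prob("no", "Tokyo")
print(p_no_after_tokyo)
```

The sequence "Tokyo no" never occurs in the toy training data, yet it receives a nonzero probability because "Tokyo" shares a class with "Nihon"; this sharing of statistics across class members is the usual motivation for class-based models when training data is limited.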
18
Future Work
Additional data collection from native Japanese speakers
–Nearly 2,000 sentences were collected in December and January
Improvement of individual components
–Vocabulary coverage, acoustic and language models
–Parse coverage
–Continued development of a more sophisticated language generation component
Expansion of weather content for Japan