Mobile Speech Translation Systems Design for 2020 11/19/2013 INST603 Term Project MIM, UMD Makoto Asami.



3 Table of Contents
・ Overview of the Project
・ Outline of Speech Translation Systems
・ ① Automatic Speech Recognition (Speech-to-Text)
・ ② Machine Translation
・ ③ Voice Synthesis (Text-to-Speech)
・ Google Translate (for iOS) - Current Mobile Speech Translation Systems -
・ How Mobile Speech Translation Systems Work (Online)
・ Forecast of Mobile Speech Translation Systems in 2020
・ My User Interface Design for Future Systems
・ Conclusion

4 Overview of the Project - A System Design Practice for Mobile Speech Translation Systems in 2020 -
■ Reasons I chose this as the final project:
・ Advances in raw computing power are expected to significantly improve the capability of language translation systems in the near future.
・ Meanwhile, more and more people around the world are expected to communicate with each other. My home country, Japan, also expects many foreign visitors for the 2020 Tokyo Olympic Games.
・ This project therefore studies the current state of speech translation systems and proposes a feasible mobile speech translation solution that would benefit people while traveling abroad and in their daily lives.

5 Outline of Speech Translation Systems
・ A speech translation system typically integrates the following three technologies: Automatic Speech Recognition (ASR), Machine Translation (MT) and Voice Synthesis (TTS).
[Diagram] Speech, Japanese (部屋を予約できますか?) → ① Automatic Speech Recognition (ASR), using speech recognition databases (Japanese) → Text, Japanese → ② Machine Translation (MT), using Japanese-English translation databases → Text, English → ③ Voice Synthesis (TTS), using voice synthesis databases (English) → Speech, English ("Can I reserve a room?")
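The three-stage chain on this slide can be sketched as function composition. This is a minimal illustration only: the class name `SpeechTranslator` and the three stand-in stage functions are hypothetical, and each stub returns a canned value instead of consulting the databases named on the slide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslator:
    """Chains ASR -> MT -> TTS, mirroring the slide's diagram."""
    recognize: Callable[[bytes], str]    # (1) ASR: source speech -> source text
    translate: Callable[[str], str]      # (2) MT: source text -> target text
    synthesize: Callable[[str], bytes]   # (3) TTS: target text -> target speech

    def run(self, audio: bytes) -> bytes:
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


# Hypothetical stand-ins: real stages would use recognition, translation
# and synthesis databases rather than canned values.
def fake_asr(audio: bytes) -> str:
    return "部屋を予約できますか?"


def fake_mt(text: str) -> str:
    phrasebook = {"部屋を予約できますか?": "Can I reserve a room?"}
    return phrasebook.get(text, text)


def fake_tts(text: str) -> bytes:
    # Stand-in for audio: just the UTF-8 bytes of the text.
    return text.encode("utf-8")


pipeline = SpeechTranslator(fake_asr, fake_mt, fake_tts)
result = pipeline.run(b"<japanese audio>")
```

Keeping the stages as separate callables means any one of them can be swapped (for example, an on-device recognizer) without touching the other two.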

6 ① Automatic Speech Recognition (ASR/SRT) - Speech to Text (STT) -
・ Applications include voice user interfaces such as dictation (e.g. word processors, emails, Google Voice Recognition, medical transcription) and hands-free computing (e.g. Windows, Siri).
・ Nuance Dragon NaturallySpeaking ($99.99~): accuracy rate of 93%
・ CMUSphinx: open source toolkit for speech recognition by Carnegie Mellon Univ.
・ Speaker dependent (uses "training"): large vocabulary / limited users (e.g. Windows Speech Recognition)
・ Speaker independent (does not use training): small vocabulary / many users (e.g. automated telephone answering)

7 ・ Often processed in the cloud ・ Requires processing power and storage
[Figure] Recognized sound units "ou pu nn fe isu buk" → "Open Facebook"
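The step from sound units to words in the figure can be illustrated with a toy lexicon lookup. This is a sketch under strong assumptions: the `LEXICON` table and the `decode` function are hypothetical, and the exact-match search shown here is a simplification. Real recognizers score many candidate word sequences with acoustic and language models, which is part of why they need the processing power and storage the slide mentions.

```python
# Hypothetical pronunciation lexicon: sequences of sound units -> word.
LEXICON = {
    ("ou", "pu", "nn"): "Open",
    ("fe", "isu", "buk"): "Facebook",
}
MAX_UNITS = max(len(key) for key in LEXICON)


def decode(units):
    """Greedy longest-match segmentation of sound units into words."""
    words = []
    i = 0
    while i < len(units):
        # Try the longest possible span first, then shorter ones.
        for n in range(min(MAX_UNITS, len(units) - i), 0, -1):
            key = tuple(units[i:i + n])
            if key in LEXICON:
                words.append(LEXICON[key])
                i += n
                break
        else:
            # No lexicon entry starts here: mark the unit as unknown.
            words.append("<unk>")
            i += 1
    return " ".join(words)


decoded = decode("ou pu nn fe isu buk".split())
```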

8 ② Machine Translation (MT)
・ Research has continued since it began at MIT in 1951.
・ The human translation process may be described as: 1. Decoding the meaning of the source text; and 2. Re-encoding this meaning in the target language.
・ To decode the meaning of the source text, the translator must interpret and analyze all the features of the text. The process requires in-depth knowledge of the grammar, idioms, etc., of the source language, as well as the culture of its speakers.
・ Machine translation approaches: rule-based, transfer-based, interlingual, dictionary-based, statistical, example-based, hybrid (statistical + rule-based)
・ Beginning in the late 1980s, as computational power increased and became less expensive, more interest was shown in statistical models for machine translation.
Inside Google Translate
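As an illustration of the statistical approach listed above, one common textbook formulation is the noisy-channel model: among candidate target sentences e for a source sentence f, pick the one that maximizes P(f | e) × P(e). The sketch below assumes that formulation; the slide does not say which model any particular product uses, and every probability in the two tables is invented for the example.

```python
import math

# Hypothetical toy models for the source sentence f = "部屋を予約できますか?".
# LANGUAGE_MODEL approximates P(e): how fluent the English candidate is.
LANGUAGE_MODEL = {
    "Can I reserve a room?": 0.60,
    "Can room reserve I?": 0.05,
    "Is the room reserved?": 0.35,
}
# TRANSLATION_MODEL approximates P(f | e): how well e explains the source.
TRANSLATION_MODEL = {
    "Can I reserve a room?": 0.50,
    "Can room reserve I?": 0.60,
    "Is the room reserved?": 0.10,
}


def best_translation(candidates):
    """Return the candidate e maximizing log P(f | e) + log P(e)."""
    def score(e):
        return math.log(TRANSLATION_MODEL[e]) + math.log(LANGUAGE_MODEL[e])
    return max(candidates, key=score)


choice = best_translation(list(LANGUAGE_MODEL))
```

The word-salad candidate matches the source words best in this toy table, but the language model penalizes it, so the fluent sentence wins overall.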

9 ③ Voice / Speech Synthesis - Text to Speech (TTS) -
・ Artificial production of human speech from written text
・ Applied to screen readers as assistive technology for blind or visually impaired people, among others:
・ Microsoft Narrator: navigates operations on Windows
・ NaturalReader (NaturalSoft): converts text (web pages, PDF files, emails, …) to spoken words. Free version available.
・ Also applied to entertainment: games and animations
[Figure] "Can I reserve a room?"

10 Current Mobile Speech Translation System: Google Translate (for iOS)
・ More than 70 languages can be translated.
・ Free to download and use.
・ Requires an internet connection. ※ Offline mode is available for Android (2.3+)
・ Users can speak, type or handwrite text to translate.
・ Translated results are provided in text and speech.
・ Transcribes and translates quickly, provided the network is fast enough.
・ Keeps a history.

11 How Mobile Speech Translation Systems Work (Online)
[Diagram] A connection of more than 352 Kbps is required.
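The slide does not say where the 352 Kbps figure comes from. One reading that happens to match is the bitrate of uncompressed PCM audio sampled at 22.05 kHz with 16 bits per sample in mono; this is an assumption made here for illustration, not something the source states. The helper name `pcm_bitrate_kbps` is hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed parameters (not from the slide): 22.05 kHz, 16-bit, mono.
rate = pcm_bitrate_kbps(22_050, 16)  # 22,050 * 16 = 352,800 bit/s
```

Under that assumption the required rate works out to 352.8 kbit/s, so streaming raw speech in real time would need slightly more than 352 Kbps of uplink.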

12 Forecast of Mobile Speech Translation Systems in 2020 ①
・ Since the 1950s, a number of scholars have questioned the possibility of achieving fully automatic machine translation of high quality.
・ Some critics claim that there are in-principle obstacles to automating the translation process.
・ A human translator needs a whole workday to translate five pages because about 10% of an average text requires research, and that part takes six [more] hours of work. Accomplishing this with machines would require a higher degree of AI than has yet been attained.
→ Architecture would improve, but will still be imperfect.

13 Forecast of Mobile Speech Translation Systems in 2020 ②
Online Systems in 2020:
・ Advances in CPU processing power and server storage would improve the accuracy of speech recognition and translation (although still not perfect).
・ Wider network penetration would expand the areas where the systems can be used.
・ Network connection costs are incurred.
Offline Systems in 2020:
・ Advances in mobile CPU processing power and mobile storage would improve the accuracy of speech recognition and translation (although not as accurate as online systems).
・ No connection cost; can be used anywhere without a network.
→ Both online and offline systems will be used.

          Network penetration   CPU   Mobile CPU   Server Storage   Mobile Storage   Network Cost   Accuracy
Online    ×                     ×     -            ×                -                ×              High
Offline   -                     -     ×            -                ×                -              Low
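One way to read "both online and offline systems will be used" is a client that prefers the more accurate online engine and falls back to the on-device engine when the network is unreachable. The sketch below assumes that design; `translate_with_fallback` and both stub engines are hypothetical names, not part of any real app.

```python
def translate_with_fallback(text, online, offline):
    """Try the online engine first; use the offline engine on network failure.

    Returns (translation, engine_used).
    """
    try:
        return online(text), "online"
    except ConnectionError:
        return offline(text), "offline"


# Hypothetical stand-in engines for demonstration.
def online_stub(text):
    raise ConnectionError("no network")  # simulates being out of coverage


def offline_stub(text):
    phrasebook = {"部屋を予約できますか?": "Can I reserve a room?"}
    return phrasebook.get(text, text)


translation, engine = translate_with_fallback(
    "部屋を予約できますか?", online_stub, offline_stub
)
```

Returning the engine name alongside the result lets the interface warn the user when the lower-accuracy offline path was used.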

14 My User Interface Design for Future Systems
① Users can correct misrecognition by the system by writing or choosing from alternatives. → Speakers can confirm that what they said was recognized correctly.
② Users can see different interpretations and choose one according to the context. → The same sentence can be interpreted differently in different situations.
③ Corresponding words or phrases are shown in the same color. → Users can better understand the language rather than just receive the result.
[Movie] Fail-Proof Speech Translation System User Interface Design
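Design points ① and ③ can be sketched as two small pieces of interface logic: a confirmation step over ranked recognition alternatives, and a color assignment for aligned phrase pairs. All names here (`Hypothesis`, `confirm_recognition`, `color_alignment`, `PALETTE`) are hypothetical, and the alternatives and alignments are assumed to come from the recognition and translation engines.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float


def confirm_recognition(hypotheses, choice=None, correction=None):
    """Return the text the user confirmed.

    Priority: a written correction, then a picked alternative (index into
    the confidence-ranked list), then the top-ranked hypothesis.
    """
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    if correction is not None:
        return correction
    if choice is not None:
        return ranked[choice].text
    return ranked[0].text


PALETTE = ["red", "blue", "green", "orange"]


def color_alignment(pairs):
    """Give each aligned (source phrase, target phrase) pair one shared color."""
    return [
        (src, tgt, PALETTE[i % len(PALETTE)])
        for i, (src, tgt) in enumerate(pairs)
    ]


alternatives = [
    Hypothesis("Can I reserve a broom?", 0.40),
    Hypothesis("Can I reserve a room?", 0.55),
]
confirmed = confirm_recognition(alternatives)
colored = color_alignment([("部屋を", "a room"), ("予約", "reserve")])
```

If the top hypothesis is wrong, the user can pass `choice=1` to pick the next alternative or `correction="..."` to write the intended sentence before translation runs.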


16 Conclusion
① Given the complex nature of human language communication, speech translation systems in 2020 will still be imperfect.
→ It is essential for the systems to have a fail-proof user interface to avoid critical misunderstandings.
→ Learning foreign languages will remain important, so the systems should be designed not to be relied on exclusively but to help users improve their own knowledge.
② Development of these systems will increase the number of people who can communicate with foreigners. People will thus become more open to the international community, and we will feel closer to each other in 2020.
→ We need to anticipate the social impact of this.


