Published by Benedict Watson. Modified over 9 years ago.
Senior Project – Electrical Engineering - 2008
Tool for Improving Non-Native French Speech Pronunciation
Joseph Ciaburri
Advisors – Professor Catravas, Professor Chilcoat

Abstract: For foreign language students, pronunciation can be one of the most frustrating aspects of mastering the language. Correct pronunciation is harder to capture in a set of rules than grammar is. As a result, students must rely either on instructor critiques or on their own aural judgment. In language laboratories, the student typically hears a native speaker, speaks into a recorder, and listens to his or her voice replayed. Such an approach does not take advantage of the potential for facial movement to provide feedback.

This work investigates adding a video monitor of facial movement to a pronunciation software tool, alongside traditional aural and signal-processing-based techniques. Much like a language laboratory, a native speaker reads a phrase, which the student repeats. Matlab acquires the student's response via a webcam and microphone and replays the attempt, allowing the student to self-diagnose. The audio signal is analyzed and displayed in the frequency domain as a spectrogram, computed with the short-time Fourier transform, and in the quefrency domain as the cepstrum. The initial focus is on vowel sounds. Future work will include efforts to provide a bull's eye comparing a numerical figure of merit with a target reference.

A software tool that focuses on both audio and video for language learning has the potential to increase pronunciation skill while decreasing learning time. This project can also provide a platform for testing the effectiveness and significance of the different feedback mechanisms employed for language pronunciation.
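The spectrogram display described in the abstract can be illustrated with a minimal sketch. The project itself used Matlab and its code is not shown, so this is written in Python with NumPy as a stand-in. The function name stft_magnitude, the frame length, the hop size, and the synthetic 1 kHz test tone are all assumptions made for illustration; they are not taken from the project.

```python
import numpy as np

def stft_magnitude(signal, sample_rate, frame_len=512, hop=128):
    """Hann-windowed short-time Fourier transform (illustrative helper).

    Returns frame-center times (s), bin frequencies (Hz), and a
    (frames x bins) magnitude array, i.e. the data behind a spectrogram.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, magnitude

# Assumed test input: one second of a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

times, freqs, mag = stft_magnitude(tone, sample_rate)
# Strongest frequency bin in each frame.
peak_freqs = freqs[np.argmax(mag, axis=1)]
```

For recorded speech, plotting the logarithm of the magnitude array against time and frequency gives the familiar spectrogram image, in which vowel formants appear as horizontal bands of energy.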
[Figure: Listen and Repeat System (Language Lab) compared with the Proposed System.]

[Figure: Proposed System block diagram. The user hears native speaker audio (Window 1) and responds by speech. A microphone and webcam feed data acquisition, which supplies audio and video of the user speaking (Window 2) and audio and video data for diagnostics (Window 3). An audio/visual databank is also shown. Webcam image source: http://www.logitech.com/repository/471/jpg/3770.1.0.jpg]

Acknowledgements: Professor Rudko, Professor Hanson, Professor Streignitz, Professor Cotter, Professor Catravas, Professor Chilcoat, Professor Pickering

Results: The building blocks for the modules of the teaching tool, shown in the upper right, are ready for implementation. The ability to acquire synchronized audio and video, and to play back the audio and video (not synchronized), forms the basis for the visual and aural feedback. The audio signal, as shown in the analysis to the right, can be displayed in the time domain, the frequency domain, and the quefrency domain, which will allow the signal to be quantified. The time domain can be used to identify phonemes and stress. The frequency domain yields the spectrogram, which is used to identify vowels from their formants and consonants from transitions. The quefrency domain yields the cepstrum, which is used to find the fundamental frequency.

[Figure: Analysis for the word “Analyste” (Analyst). Panels: Time Domain, Frequency Domain (Spectrogram), and Quefrency Domain (Cepstrum), each shown for the native speaker and the non-native speaker.]
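The use of the cepstrum to find the fundamental frequency, mentioned in the results, can be sketched as follows. This is a hedged illustration in Python with NumPy rather than the project's Matlab code, which is not shown. The function names real_cepstrum and estimate_f0, the 60 to 400 Hz search range, and the synthetic pulse-train input are assumptions chosen for the example.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame * np.hanning(len(frame)))
    # A small offset avoids log(0) on empty bins.
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.ifft(log_mag).real

def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency from the cepstral peak.

    The search is limited to quefrencies (in samples) that correspond
    to pitches between f_min and f_max; both limits are assumptions.
    """
    cep = real_cepstrum(frame)
    q_lo = int(sample_rate / f_max)   # shortest pitch period considered
    q_hi = int(sample_rate / f_min)   # longest pitch period considered
    peak = q_lo + int(np.argmax(cep[q_lo:q_hi]))
    return sample_rate / peak

# Assumed test input: a pulse train with a 200 Hz repetition rate,
# a crude stand-in for the glottal pulses of a voiced vowel.
sample_rate = 16000
frame = np.zeros(2048)
frame[:: sample_rate // 200] = 1.0

f0 = estimate_f0(frame, sample_rate)
print(f"estimated fundamental: {f0:.1f} Hz")
```

For this synthetic input the estimate should land close to 200 Hz. Real speech frames are noisier, so a practical tool would likely also check that the cepstral peak is strong enough before treating a frame as voiced.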