Senior Project – Electrical Engineering – 2008
Tool for Improving Non-Native French Speech Pronunciation
Joseph Ciaburri
Advisors – Professor Catravas, Professor Chilcoat

[Figure: Listen and Repeat System (Language Lab) compared with the Proposed System. Image source: http://www.logitech.com/repository/471/jpg/3770.1.0.jpg]

Abstract: For foreign language students, pronunciation can be one of the most frustrating aspects of mastering the language. Correct pronunciation is harder to capture in a set of rules than grammar is. As a result, students must rely either on instructor critiques or on their own aural judgment. In language laboratories, the student typically hears a native speaker, speaks into a recorder, and listens to his or her voice replayed. Such an approach does not take advantage of the potential for facial movement to provide feedback.

This work investigates adding a video monitor of facial movement to a pronunciation software tool, alongside traditional aural and signal-processing-based techniques. Much as in a language laboratory, a native speaker reads a phrase, which the student repeats. MATLAB acquires the student's response via a webcam and microphone and replays the attempt, allowing the student to self-diagnose. The audio signal is analyzed and displayed in the frequency domain as a short-time Fourier transform, in the form of a spectrogram, and in the quefrency domain as the cepstrum. The initial focus is on vowel sounds. Future work will include efforts to provide a bull's eye display comparing a numerical figure of merit with a target reference.

A software tool that uses both audio and video for language learning has the potential to increase pronunciation skill while decreasing learning time. This project can also provide a platform for testing the effectiveness and significance of the different feedback mechanisms employed for language pronunciation.
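The spectrogram mentioned in the abstract can be illustrated with a minimal sketch. The project itself used MATLAB. The Python below is only an assumed stand-in that uses the standard library, and the frame length, hop size, Hann window, sample rate, and synthetic 1000 Hz test tone are illustrative choices, not values taken from the project.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrogram(signal, frame_len=256, hop=128):
    """Short-time Fourier transform magnitudes: one spectrum per frame."""
    # Periodic Hann window reduces leakage between neighbouring bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = fft(frame)
        # Keep only the non-negative frequencies (0 .. fs/2).
        frames.append([abs(c) for c in spectrum[: frame_len // 2 + 1]])
    return frames

# Synthetic stand-in for a recorded vowel: a pure tone (assumed values).
fs = 8000            # sample rate in Hz
tone_hz = 1000       # test-tone frequency in Hz
signal = [math.sin(2 * math.pi * tone_hz * n / fs) for n in range(2048)]

spec = spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 256   # convert bin index to Hz
print(len(spec), "frames; strongest component near", peak_hz, "Hz")
```

Each row of `spec` corresponds to one column of a spectrogram image. With recorded speech, the strongest bands in each row would sit near the formants rather than at a single tone.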
[Figure: System block diagram. Data Acquisition (microphone and webcam) and an Audio/Visual Databank supply audio and video data to three windows presented to the user. Window 1: Native Speaker Speech. Window 2: Audio and Video of User Speaking. Window 3: Diagnostics.]

[Figure: Analysis for the word "Analyste" (Analyst). Panels: Time Domain, Frequency Domain (Spectrogram), and Quefrency Domain (Cepstrum), each shown for a native speaker and a non-native speaker.]

Results: The building blocks for the modules of the teaching tool, shown in the system block diagram, are ready for implementation. The ability to acquire synchronized audio and video, and to play back the audio and video (not synchronized), forms the basis for the visual and aural feedback. The audio signal, as shown in the analysis figure, can be displayed in the time domain, the frequency domain, and the quefrency domain, which will allow the signal to be quantified. The time domain can be used to define the phonemes and stress. The frequency domain is used to produce the spectrogram, which is used to identify vowels from their formants and consonants from their transitions. The quefrency domain is used to produce the cepstrum, which is used to find the fundamental frequency.

Acknowledgements: Professor Rudko, Professor Hanson, Professor Streignitz, Professor Cotter, Professor Catravas, Professor Chilcoat, Professor Pickering
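As an illustration of the cepstral step described in the Results, the following sketch estimates a fundamental frequency from the peak of the real cepstrum. It is an assumed stand-in for the project's MATLAB analysis, not its actual code. The function names, the 50 to 400 Hz search range, the small offset added before the logarithm, and the synthetic 125 Hz harmonic signal are all hypothetical choices made for the example.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def real_cepstrum(frame):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    spectrum = fft([s * w for s, w in zip(frame, window)])
    # Small offset (assumed value) avoids log(0) in empty bins.
    log_mag = [math.log(abs(c) + 1e-6) for c in spectrum]
    # log_mag is real and even, so its inverse DFT equals its
    # forward DFT divided by n.
    return [c.real / n for c in fft(log_mag)]

def estimate_f0(frame, fs, f_min=50.0, f_max=400.0):
    """Return fs / (quefrency of the largest cepstral peak in range)."""
    ceps = real_cepstrum(frame)
    q_min = int(fs / f_max)   # shortest pitch period, in samples
    q_max = int(fs / f_min)   # longest pitch period, in samples
    q_peak = max(range(q_min, q_max + 1), key=lambda q: ceps[q])
    return fs / q_peak

# Synthetic voiced-sound stand-in: 20 harmonics of 125 Hz (assumed values).
fs = 8000
true_f0 = 125.0
frame = [sum(math.sin(2 * math.pi * k * true_f0 * n / fs) / k
             for k in range(1, 21))
         for n in range(1024)]

f0 = estimate_f0(frame, fs)
print("estimated fundamental frequency:", f0, "Hz")
```

The harmonics of a voiced sound are evenly spaced in the spectrum, so the log magnitude spectrum is itself roughly periodic. The cepstrum turns that spacing into a single peak at the pitch period, which is why the search is restricted to quefrencies that correspond to plausible speaking pitches.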