1. Text to Speech Synthesizer
Presentation for COSC 683: Software Engineering Practicum
by Kishore Chidarapu
E00652529
2. Introduction
Project Beneficiaries
General Objectives
Modules
Java Speech API
Interfaces and Methods
Screen shots
“Demo”
Future Enhancements
Thank you
3. Introduction to the Company This project was carried out at Capgemini,
under the guidance of Mr. Srini Kancha, Senior Project Manager.
Capgemini serves industries such as automotive, consumer products, financial services, health and retail.
This project falls under consumer products.
4. Project Beneficiaries
Visually impaired people
Potential users of computer systems
Students and researchers
IT companies
General public
5. General Objective The project develops an application that reads aloud text entered at an input source.
6. Modules Structure analysis
Process the input text to determine where paragraphs, sentences and other structures start and end. For most languages, punctuation and formatting data are used in this stage.
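The structure analysis step can be illustrated with the JDK's own text segmentation. This is a minimal sketch, not the project's implementation: it assumes paragraphs are separated by blank lines and relies on `java.text.BreakIterator` for sentence boundaries. The class name `StructureAnalysis` is hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class StructureAnalysis {

    /** Splits text into paragraphs, assuming a blank line separates them. */
    static List<String> paragraphs(String text) {
        List<String> out = new ArrayList<>();
        for (String p : text.split("\\n\\s*\\n")) {
            if (!p.trim().isEmpty()) {
                out.add(p.trim());
            }
        }
        return out;
    }

    /** Splits one paragraph into sentences using punctuation-based rules. */
    static List<String> sentences(String paragraph) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(paragraph);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = paragraph.substring(start, end).trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "First one. Second one.\n\nThird one.";
        for (String p : paragraphs(text)) {
            System.out.println(sentences(p));
        }
    }
}
```

Abbreviations such as "Dr." can confuse punctuation-based splitting, which is one reason the later preprocessing stage exists.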
7. Modules Text preprocessing
Analyze the input text for special constructs of the language. In English, special treatment is required for abbreviations, acronyms, dates, times, numbers, currency amounts, email addresses and many other forms.
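A small sketch of this kind of normalization follows. It only handles a hand-picked abbreviation table and integers from 0 to 99; the table, the class name `TextNormalizer` and the chosen spoken forms are assumptions for illustration, and a real system would also cover dates, times, currency and email addresses.

```java
import java.util.Map;
import java.util.StringJoiner;

final class TextNormalizer {

    // Hypothetical sample table; real systems use much larger lists.
    private static final Map<String, String> ABBREVIATIONS = Map.of(
            "Mr.", "Mister",
            "Dr.", "Doctor",
            "etc.", "et cetera");

    private static final String[] ONES = {
            "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
            "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
            "seventeen", "eighteen", "nineteen"};

    private static final String[] TENS = {
            "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"};

    /** Spells out an integer in the range 0..99. */
    static String numberToWords(int n) {
        if (n < 0 || n > 99) {
            throw new IllegalArgumentException("only 0..99 supported in this sketch");
        }
        if (n < 20) {
            return ONES[n];
        }
        String tens = TENS[n / 10];
        return n % 10 == 0 ? tens : tens + " " + ONES[n % 10];
    }

    /** Replaces known abbreviations and one- or two-digit numbers with spoken forms. */
    static String normalize(String text) {
        StringJoiner out = new StringJoiner(" ");
        for (String token : text.trim().split("\\s+")) {
            if (ABBREVIATIONS.containsKey(token)) {
                out.add(ABBREVIATIONS.get(token));
            } else if (token.matches("\\d{1,2}")) {
                out.add(numberToWords(Integer.parseInt(token)));
            } else {
                out.add(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("Mr. Smith has 42 books"));
    }
}
```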
8. Modules Text-to-phoneme conversion
Convert each word to phonemes. A phoneme is a basic unit of sound in a language. US English has around 45 phonemes including the consonant and vowel sounds. Different languages have different sets of sounds (different phonemes).
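One common way to do this conversion is a pronunciation dictionary lookup, sketched below. The three-word lexicon, the ARPAbet-style symbols without stress marks, and the `<unk>` placeholder are illustrative assumptions; production systems pair a large lexicon with letter-to-sound rules for unknown words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class PhonemeLookup {

    // Hypothetical miniature lexicon using ARPAbet-style symbols.
    private static final Map<String, List<String>> LEXICON = Map.of(
            "hello", List.of("HH", "AH", "L", "OW"),
            "world", List.of("W", "ER", "L", "D"),
            "text", List.of("T", "EH", "K", "S", "T"));

    static final String UNKNOWN = "<unk>";

    /** Looks up every word; words missing from the lexicon yield a placeholder. */
    static List<String> toPhonemes(String text) {
        List<String> out = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (word.isEmpty()) {
                continue; // produced when the text starts with punctuation
            }
            List<String> phonemes = LEXICON.get(word);
            if (phonemes == null) {
                out.add(UNKNOWN);
            } else {
                out.addAll(phonemes);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(toPhonemes("Hello, world!"));
    }
}
```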
9. Modules Prosody analysis
Process the sentence structure, words and phonemes to determine appropriate prosody for the sentence. This includes the pitch (or melody), the timing (or rhythm), the pausing, the speaking rate, the emphasis on words and many other features.
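As a rough illustration of punctuation-driven prosody, the sketch below picks a pitch contour for a sentence and a pause length after each token. The rules and the millisecond values are assumptions chosen for readability, and treating every question as rising is a deliberate simplification.

```java
final class ProsodySketch {

    enum Contour { FALLING, RISING, EMPHATIC }

    /** Chooses a sentence-final pitch movement from the closing punctuation. */
    static Contour contourFor(String sentence) {
        String s = sentence.trim();
        if (s.isEmpty()) {
            return Contour.FALLING;
        }
        char last = s.charAt(s.length() - 1);
        if (last == '?') {
            return Contour.RISING;
        }
        if (last == '!') {
            return Contour.EMPHATIC;
        }
        return Contour.FALLING;
    }

    /** Returns an assumed pause length in milliseconds after a token. */
    static int pauseAfterMillis(String token) {
        if (token.endsWith(",")) {
            return 200;
        }
        if (token.endsWith(";") || token.endsWith(":")) {
            return 300;
        }
        if (token.endsWith(".") || token.endsWith("?") || token.endsWith("!")) {
            return 500;
        }
        return 0;
    }

    public static void main(String[] args) {
        String sentence = "However, are you there?";
        System.out.println(contourFor(sentence));
        for (String token : sentence.split("\\s+")) {
            System.out.println(token + " -> pause " + pauseAfterMillis(token) + " ms");
        }
    }
}
```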
10. Modules Waveform production
The phonemes and prosody information are used to produce the audio waveform for each sentence. Current systems do this either by concatenating chunks of recorded human speech or by formant synthesis.
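The concatenation approach can be sketched as joining arrays of PCM samples with a short linear crossfade so that the seam does not click. Real systems select units from a recorded speech database; here two synthetic tones stand in for recorded units, and the class name and parameters are assumptions.

```java
final class ConcatenativeSketch {

    /** Generates a sine tone as a stand-in for a recorded speech unit. */
    static short[] tone(double hz, int samples, int sampleRate) {
        short[] out = new short[samples];
        for (int i = 0; i < samples; i++) {
            out[i] = (short) Math.round(8000 * Math.sin(2 * Math.PI * hz * i / sampleRate));
        }
        return out;
    }

    /** Joins two units, blending the last `overlap` samples of a into the first of b. */
    static short[] concatenate(short[] a, short[] b, int overlap) {
        if (overlap < 0 || overlap > a.length || overlap > b.length) {
            throw new IllegalArgumentException("overlap must fit inside both units");
        }
        short[] out = new short[a.length + b.length - overlap];
        int seam = a.length - overlap;
        System.arraycopy(a, 0, out, 0, seam);
        for (int i = 0; i < overlap; i++) {
            double w = (i + 1) / (double) (overlap + 1); // weight of b grows across the seam
            out[seam + i] = (short) Math.round((1 - w) * a[seam + i] + w * b[i]);
        }
        System.arraycopy(b, overlap, out, a.length, b.length - overlap);
        return out;
    }

    public static void main(String[] args) {
        int rate = 16000;
        short[] joined = concatenate(tone(220, 1600, rate), tone(330, 1600, rate), 160);
        System.out.println("samples: " + joined.length);
    }
}
```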
11. Java Speech API The Java Speech API enables developers of speech-enabled applications to incorporate more sophisticated and natural user interfaces into Java applications and applets.
Two core speech technologies are supported through the Java Speech API:
speech recognition
speech synthesis.
12. Design Goals for the Java Speech API Provide support for speech synthesizers and for both command-and-control and dictation speech recognizers.
Provide a robust cross-platform, cross-vendor interface to speech synthesis and speech recognition.
Enable access to state-of-the-art speech technology.
Support integration with other capabilities of the Java platform, including the suite of Java Media APIs.
Be simple, compact and easy to learn.
13. Interfaces and Methods com.sun.speech.freetts.Voice
allocate()
deallocate()
getDomain()
speak()
com.sun.speech.freetts.VoiceManager
VoiceManager getInstance()
getVoice()
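The methods listed above imply a call order: obtain a voice, allocate it, speak, then deallocate. The sketch below shows that order against a hypothetical stand-in interface, `SpeakingVoice`, so it compiles without the FreeTTS jar. With FreeTTS on the classpath, the voice would instead come from `VoiceManager.getInstance().getVoice(name)`, as noted in the comments. `RecordingVoice` is a test double, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

final class VoiceLifecycle {

    /**
     * Speaks each sentence and always releases the voice afterwards.
     * With FreeTTS the equivalent calls would be:
     *   Voice voice = VoiceManager.getInstance().getVoice("kevin16");
     *   voice.allocate(); voice.speak(text); voice.deallocate();
     */
    static int speakAll(SpeakingVoice voice, List<String> sentences) {
        voice.allocate();
        int spoken = 0;
        try {
            for (String sentence : sentences) {
                if (voice.speak(sentence)) {
                    spoken++;
                }
            }
        } finally {
            voice.deallocate(); // release audio resources even if speak() throws
        }
        return spoken;
    }

    public static void main(String[] args) {
        RecordingVoice voice = new RecordingVoice();
        speakAll(voice, List.of("Hello.", "Goodbye."));
        System.out.println(voice.log);
    }
}

/** Hypothetical stand-in mirroring the lifecycle methods named on the slide. */
interface SpeakingVoice {
    void allocate();
    boolean speak(String text);
    void deallocate();
}

/** Test double that records calls instead of producing audio. */
final class RecordingVoice implements SpeakingVoice {
    final List<String> log = new ArrayList<>();

    @Override public void allocate() { log.add("allocate"); }
    @Override public boolean speak(String text) { log.add("speak:" + text); return true; }
    @Override public void deallocate() { log.add("deallocate"); }
}
```

Putting `deallocate()` in a `finally` block is a design choice of this sketch so that the voice is released on every path.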
17. “DEMO”
18. Future Enhancements The TTS conversion tool will be tested on various types of text for naturalness and accuracy, and reviewed by linguistic experts to improve pronunciation. The findings of these reviews will be incorporated into the TTS.
19. Future Enhancements A new button named “Image converter” will be added.
It can be used to recognize the characters of languages such as Chinese, Japanese and Mongolian.
Each recognized character is translated into English and then converted into speech.
20. Future Enhancements
Store the speech generated from the input text as a wave file.
Stream the resulting audio to a destination, which can be any computer on the network.
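Storing synthesized audio as a wave file can be done with the JDK's `javax.sound.sampled` package. The sketch below assumes the synthesizer output is available as 16-bit mono PCM samples, which is an assumption about the pipeline rather than something the slides specify. It writes the WAV data to a byte array; the same bytes could be saved to disk or written to a socket's output stream for the networked case.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WaveWriterSketch {

    /** Encodes 16-bit mono samples as a complete WAV file held in memory. */
    static byte[] toWav(short[] samples, float sampleRate) {
        // Pack samples as little-endian bytes, the byte order WAV uses for PCM.
        byte[] pcm = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            pcm[2 * i] = (byte) (samples[i] & 0xFF);
            pcm[2 * i + 1] = (byte) ((samples[i] >> 8) & 0xFF);
        }
        // 16 bits per sample, 1 channel, signed, little-endian.
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        try (AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(pcm), format, samples.length)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        short[] silence = new short[16000]; // one second at 16 kHz
        System.out.println("WAV bytes: " + toWav(silence, 16000f).length);
    }
}
```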
21. Thank you, one and all