Voice-enabled Image Identification System Design

Voice-enabled Image Identification System Design. Aashish P. Shrestha, Ming Ming Zheng. Multimedia Signal Processing, University of Bridgeport, Connecticut. Prof. B. Barkana. Spring 2009.


Presentation Transcript


  1. Voice-enabled Image Identification System Design • Aashish P. Shrestha, Ming Ming Zheng • Multimedia Signal Processing, University of Bridgeport, Connecticut • Prof. B. Barkana • Spring 2009

  2. Introduction • Voice-enabled applications are widely used today. They are a sub-area of speech recognition. The task of a voice-enabled application is to let a machine accept and recognize commands spoken in a normal human voice.

  3. Overview • Speech Recognition Engine (SAPI) • Voice Signal Processing • Image Identification • System Design • System Performance • Conclusion

  4. Speech Recognition Engine • Microsoft Speech Application Programming Interface (SAPI): Microsoft provides a speech recognition engine in SAPI. The engine converts a speaker's voice into text by comparing the input voice with its voice database. It can also convert text into speech.

  5. Voice Signal Processing • Three main SAPI interfaces and methods used with SpSharedRecoContext: • ISpEventSource: handles events such as the start point of the speech signal. • GetRecognizer: returns a reference to the current recognizer object associated with the context. • ISpeechRecoResult: returns the result of comparing the input voice with the voice data in the speech engine.

  6. Image Identification • Each image is identified by its file name: the system takes the file name as the keyword and saves it in the database. • The final output maps each individual speech result to the image named by the spoken word. • Example: Select “Apple”
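The file-name-as-keyword idea can be illustrated with a minimal Python sketch. It is not the authors' implementation (the slides name C#.net and Microsoft Access); the function names and sample file names are hypothetical, and a plain dictionary stands in for the database.

```python
from pathlib import PurePath


def keyword_from_filename(filename: str) -> str:
    """Derive the spoken keyword from an image file name (extension dropped, lowercased)."""
    return PurePath(filename).stem.strip().lower()


def build_image_index(filenames):
    """Map each keyword to its image file; a dictionary stands in for the database table."""
    return {keyword_from_filename(name): name for name in filenames}


# Hypothetical sample files, used only for illustration.
index = build_image_index(["Apple.jpg", "banana.png", "Cherry.bmp"])

# Looking up the recognized word "apple" selects the matching image file.
selected = index.get("apple")
```

Lowercasing the keyword is an assumption made here so that the lookup does not depend on how the recognizer capitalizes its text result.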

  7. System Design • [Figure: Architecture flow. Components: User's Voice, Recognizer, Speech Engine SAPI (Voice Data), Image Database, Back-end Admin Module, Output. Arrows are numbered 1 to 4.]

  8. System Design • In the first step, the speech engine initializes and loads the voice data according to the database, which stores the picture information. • In the second step, users speak their command into the microphone. If the input voice matches the voice data in the speech engine, the system proceeds to step three and shows the matching image. Meanwhile, in step four, the system displays the recognized text and speaks it aloud using the system voice.
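The four steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' C# code: the speech engine is replaced by plain text input, `IMAGE_DB` is a hypothetical in-memory stand-in for the database, and all function names are invented for the example.

```python
from typing import Optional

# Hypothetical stand-in for the image database: keyword -> image file.
IMAGE_DB = {"apple": "Apple.jpg", "banana": "banana.png"}


def load_vocabulary(image_db) -> set:
    """Step 1: load the words the engine should listen for from the database."""
    return set(image_db)


def match_command(recognized_text: str, vocabulary) -> Optional[str]:
    """Step 2: accept the input only if it matches a loaded word."""
    word = recognized_text.strip().lower()
    return word if word in vocabulary else None


def respond(recognized_text: str, image_db) -> Optional[dict]:
    """Steps 3 and 4: pick the image to show and the text to display and speak."""
    vocabulary = load_vocabulary(image_db)
    word = match_command(recognized_text, vocabulary)
    if word is None:
        return None  # no match: nothing is shown or spoken
    return {"image": image_db[word], "spoken_text": word}


result = respond(" Apple ", IMAGE_DB)
```

In a real system the text passed to `respond` would come from the recognizer, and the returned fields would drive the image display and the text-to-speech output.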

  9. System Requirements • Hardware: PC with speakers and microphone. • Software: Windows 2000/XP/Vista, Microsoft Access, Microsoft SAPI 5.1, C# .NET

  10. System Performance

  11. System Maintenance • A back-end Database Admin Module: • Add a Picture

  12. System Maintenance • Edit or Delete items:
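The add, edit, and delete operations of the admin module can be sketched as below. The slides use Microsoft Access; this sketch substitutes an in-memory SQLite database from the Python standard library, and the table layout (`pictures` with `keyword` and `file_name` columns) and function names are assumptions made for illustration.

```python
import sqlite3


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog; the schema here is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pictures (keyword TEXT PRIMARY KEY, file_name TEXT NOT NULL)"
    )
    return conn


def add_picture(conn, keyword, file_name):
    """Add a picture under its (lowercased) keyword."""
    conn.execute(
        "INSERT INTO pictures (keyword, file_name) VALUES (?, ?)",
        (keyword.lower(), file_name),
    )


def edit_picture(conn, keyword, new_file_name):
    """Point an existing keyword at a different image file."""
    conn.execute(
        "UPDATE pictures SET file_name = ? WHERE keyword = ?",
        (new_file_name, keyword.lower()),
    )


def delete_picture(conn, keyword):
    """Remove a keyword and its image from the catalog."""
    conn.execute("DELETE FROM pictures WHERE keyword = ?", (keyword.lower(),))


def list_pictures(conn):
    """Return all (keyword, file_name) rows, sorted by keyword."""
    return conn.execute(
        "SELECT keyword, file_name FROM pictures ORDER BY keyword"
    ).fetchall()


conn = open_catalog()
add_picture(conn, "Apple", "Apple.jpg")
add_picture(conn, "Banana", "banana.png")
edit_picture(conn, "apple", "apple_v2.jpg")
delete_picture(conn, "banana")
```

After any such change, the speech engine's word list would need to be reloaded from the database so that new keywords become recognizable.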

  13. Demonstration • We will demonstrate our system.

  14. Advantages and Drawback • Advantages: 1. Accurate 2. Fast 3. Robust • Drawback: Recognition is sometimes easily affected by a noisy environment.

  15. Conclusion • From this project, we can see that the voice-enabled application is robust and reliable. Voice-enabled technology has been on the market for about two decades. Voice commands can also be easily integrated with other applications that involve touch-free commands.

  16. Thank you!!!!
