Speech Recognition through Neural Networks

By Mohammad Usman Afzal Mohammad Waseem

Introduction • Speech recognition is the process by which a computer maps an acoustic speech signal to text. It proceeds in stages: digital sampling of the speech, acoustic signal processing, and generation of coefficients. The final stage is the recognition of phonemes, groups of phonemes, and words. A multilayer feed-forward perceptron neural network is used to generate the output.

Speech Production

Speech Recognition Techniques • Feature Extraction • Artificial Intelligence • Pattern Recognition

System Overview

Speech Recognizer

Digitization of speech • Recording the sound • Analog to Digital converter • Sampling and Quantization
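The sampling and quantization steps can be illustrated with a minimal sketch. It assumes a signal normalized to the range [-1, 1] and a uniform signed quantizer; the function name `sample_and_quantize`, the 8 kHz rate, the 16-bit depth, and the 440 Hz test tone are illustrative choices, not values taken from the slides.

```python
import numpy as np

def sample_and_quantize(signal_fn, duration_s, sample_rate_hz, n_bits):
    """Sample a continuous-time signal and quantize it to signed integer codes.

    signal_fn is a hypothetical stand-in for the analog input: a function of
    time (in seconds) that returns amplitudes in [-1, 1].
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    # Sampling: evaluate the signal at evenly spaced instants n / SF.
    t = np.arange(n_samples) / sample_rate_hz
    x = np.clip(signal_fn(t), -1.0, 1.0)
    # Quantization: map each amplitude to the nearest of the available levels.
    max_code = 2 ** (n_bits - 1) - 1
    codes = np.round(x * max_code).astype(np.int32)
    return t, codes

# Example: 10 ms of a 440 Hz tone at 8 kHz, 16 bits per sample (assumed values).
t, codes = sample_and_quantize(
    lambda t: 0.5 * np.sin(2 * np.pi * 440 * t),
    duration_s=0.01, sample_rate_hz=8000, n_bits=16,
)
```

A higher sampling rate captures higher frequencies, and more bits per sample reduce the quantization error; both increase the amount of data that the later stages must handle.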

Filtering the Signal • Aliasing • Remove signal components above half the sampling frequency (SF/2) • Eliminating Pure Noise and Silence • Pure noise and silence are eliminated using the Zero Crossing Rate (ZCR) and the power of the signal
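One way to picture the ZCR and power test is the frame-based sketch below. It treats a frame as silence when its power is low and as pure noise when its zero crossing rate is high. The frame length and both thresholds (`power_min`, `zcr_max`) are hypothetical values chosen for illustration; the slides do not state the actual decision rule, and real thresholds would need tuning on recorded data.

```python
import numpy as np

def frame_signal(x, frame_len):
    """Split a 1-D signal into non-overlapping frames, dropping the remainder."""
    n_frames = len(x) // frame_len
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame that change sign."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def short_time_power(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def keep_speech_frames(x, frame_len=160, power_min=1e-3, zcr_max=0.4):
    """Drop frames that look like silence (low power) or pure noise (high ZCR).

    The thresholds are assumptions for this sketch, not values from the source.
    """
    frames = frame_signal(np.asarray(x, dtype=np.float64), frame_len)
    power = short_time_power(frames)
    zcr = zero_crossing_rate(frames)
    mask = (power >= power_min) & (zcr <= zcr_max)
    return frames[mask], mask

# Example: one silent frame, one 200 Hz tone frame, one rapidly alternating frame.
n = np.arange(160)
silence = np.zeros(160)
tone = 0.5 * np.sin(2 * np.pi * 200 * n / 8000)
noise_like = 0.5 * (-1.0) ** n
kept, mask = keep_speech_frames(np.concatenate([silence, tone, noise_like]))
```

In this example only the tone frame passes both checks. Unvoiced speech sounds can also have a high ZCR, so a practical system would combine the two measures more carefully than this single cutoff does.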

Classification of Signal

Coefficients Measurement • Need for Coefficients • To reduce the large volume of digital signal data to a compact set of values • Advantage of LP (Linear Prediction) Coefficients • Applicable in the time domain as well as the frequency domain

LP Coefficients
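Linear prediction (LP) models each sample as a weighted sum of the preceding samples, and the weights are the LP coefficients. The sketch below computes them with the autocorrelation method and the Levinson-Durbin recursion, which is a common approach; the slides do not say which method the authors used, so treat this as an assumption. The order of 8 matches the 8 coefficients fed to the network, and the function name `lp_coefficients` is illustrative.

```python
import numpy as np

def lp_coefficients(frame, order=8):
    """Estimate LP coefficients of one frame (autocorrelation method).

    Returns (alpha, err) where x[n] is approximated by
    sum(alpha[j-1] * x[n-j] for j in 1..order) and err is the residual energy.
    """
    x = np.asarray(frame, dtype=np.float64)
    # Autocorrelation values r[0..order] of the frame.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)  # error-filter coefficients, a[0] = 1
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        # Reflection coefficient for step i of the Levinson-Durbin recursion.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    # Negate so that the returned weights predict x[n] directly.
    return -a[1:], err

# Example: a decaying exponential is predicted by its previous sample alone.
alpha, err = lp_coefficients(0.9 ** np.arange(200), order=8)
```

For this test signal the first coefficient comes out close to 0.9 and the rest close to zero, which shows how a whole frame of samples collapses into a handful of numbers.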

Feed Forward Perceptron Neural Networks • Input Layer • The 8 LP coefficients are fed to the input layer • Hidden Layer • Output Layer • Generates the output according to the assigned weights
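The forward pass of such a network can be sketched as follows. Only the 8 inputs come from the slides; the hidden layer size, the number of outputs, the sigmoid activation, and the random initial weights are assumptions made for this illustration, and training (adjusting the weights) is not shown.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

class FeedForwardNet:
    """Minimal multilayer perceptron: input layer, one hidden layer, output layer."""

    def __init__(self, n_in=8, n_hidden=12, n_out=4, seed=0):
        # n_hidden and n_out are hypothetical sizes; the source gives only n_in = 8.
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        """Propagate LP coefficient vectors through the hidden and output layers."""
        hidden = sigmoid(x @ self.w1 + self.b1)
        return sigmoid(hidden @ self.w2 + self.b2)

# Example: score a single frame and a batch of three frames of 8 coefficients.
net = FeedForwardNet()
scores = net.forward(np.zeros(8))
batch_scores = net.forward(np.zeros((3, 8)))
```

Each output value could be read as the score of one candidate class, for example one phoneme or word, with the largest score taken as the recognized unit.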

Neural Network

Future Work • Complete Urdu Speech Recognizer • Interface with an Urdu Editor • Interface with Urdu to English Translator
