
Word Recognition Device

C.K. Liang & Oliver Tsai


Why is speech recognition important?

  • Several real world applications.

  • Dictation devices/software, e.g. Dragon NaturallySpeaking.

  • Voice-activated devices can dial telephone numbers, change preset buttons on a car audio system, change TV stations, and more.


How is this possible?

  • Linear Predictive Coding (LPC)

  • LPC models the waveform as the output of an Infinite Impulse Response (IIR) filter.

  • Uses the feedback from past inputs and past outputs to predict future outputs


IIR Filter

a(1)*y(n) = b(1)*x(n) + b(2)*x(n-1) + ... + b(nb+1)*x(n-nb) - a(2)*y(n-1) - ... - a(na+1)*y(n-na)
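
As a rough illustration of the difference equation above, the following Python sketch evaluates it sample by sample. The function name `iir_filter` and the zero-based coefficient lists are assumptions made for this example; they do not come from the slides.

```python
def iir_filter(b, a, x):
    """Evaluate a(1)*y(n) = sum_k b(k+1)*x(n-k) - sum_k a(k+1)*y(n-k).

    b: feedforward coefficients, b[0] holds b(1)
    a: feedback coefficients, a[0] holds a(1) and must be non-zero
    x: input samples; samples before n = 0 are treated as zero
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        # Feedforward part: b(1)*x(n) + b(2)*x(n-1) + ...
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        # Feedback part: - a(2)*y(n-1) - a(3)*y(n-2) - ...
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc / a[0])
    return y


# One-pole example: y(n) = x(n) + 0.5*y(n-1), driven by a unit impulse
impulse = [1.0, 0.0, 0.0, 0.0]
response = iir_filter([1.0], [1.0, -0.5], impulse)
```

The impulse response decays geometrically without ever being cut off, which is the "infinite" part of the filter's name.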


How do we use LPC for speech recognition?

  • Record human speech

  • Pre-emphasis

  • Convolve the pre-emphasis filter with the waveform
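
The pre-emphasis step can be sketched as below. The transcript does not preserve the actual filter, so this example assumes the common first-order form y(n) = x(n) - alpha*x(n-1) with alpha = 0.95; both the form and the value are assumptions, as is the function name `pre_emphasize`.

```python
def pre_emphasize(samples, alpha=0.95):
    """Convolve the samples with the two-tap filter [1, -alpha].

    alpha = 0.95 is an assumed, commonly used value; the slides'
    actual coefficient is not available in the transcript.
    """
    out = []
    prev = 0.0  # x(-1) is taken to be zero
    for s in samples:
        out.append(s - alpha * prev)
        prev = s
    return out


# A constant (low-frequency) input is strongly attenuated...
flat = pre_emphasize([1.0, 1.0, 1.0, 1.0])
# ...while a rapidly alternating (high-frequency) input is boosted.
alternating = pre_emphasize([1.0, -1.0, 1.0, -1.0])
```

Comparing the two outputs shows the point of the step: it lifts the high-frequency content of the speech relative to the low-frequency content before analysis.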


Pre-emphasis Filter


Why are vowel sounds used?


Hamming Window

  • Multiply the 240 samples point by point with a Hamming window

  • Reduce the amplitude on both ends of the window frame
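
A minimal sketch of the windowing step, assuming the standard Hamming definition w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)). The slides do not show which formula was used on the DSP, and the helper names here are illustrative only.

```python
import math

FRAME_LEN = 240  # samples per frame, as stated on the slide


def hamming(n_points):
    """Standard Hamming window of n_points samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def apply_window(frame, window):
    """Multiply the frame by the window, point by point."""
    return [s * w for s, w in zip(frame, window)]


window = hamming(FRAME_LEN)
# Windowing a constant frame exposes the window shape itself:
# small at both ends, close to 1 in the middle.
windowed = apply_window([1.0] * FRAME_LEN, window)
```

Tapering the frame edges this way reduces the discontinuities that an abrupt cut at the frame boundaries would introduce.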


Waveform of a consonant sound


Variance

Sound analysis summary

LPC Coefficients


General Block Diagram

  • A/D converter: 8000 samples/sec

  • Pre-emphasis filter

  • Frame blocking: 30 ms window framing

  • Hamming window

  • Auto-correlation

  • Levinson-Durbin algorithm

  • SSD comparison

  • Output: 4 digital bits


Implementation on Motorola DSP56303

  • Train Device for vowel sound template

  • Recognition Device for vowels


Training for sound template

  • Detect beginning of speech

  • Pre-emphasize 2000 input samples

  • Hamming window 240-sample frame

  • Calculate 10 LPC coefficients

  • Repeat 10 times and store 10 sets of LPC coefficients


Recognition Device

  • Detect beginning of speech

  • Pre-emphasize 2000 input samples

  • Create window frame by shifting 80 samples

  • Hamming window each frame

  • Find 10 LPC coefficients for each frame

  • Compute SSD between the coefficients and those in template
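
The framing and comparison steps above might look like the sketch below. It assumes that SSD means the sum of squared differences and that the closest template wins; the slides do not say how decisions from individual frames are combined, so that part is left out. The template values are made up purely for illustration.

```python
FRAME_LEN = 240   # 240-sample frame, as in the training step
FRAME_SHIFT = 80  # shift between consecutive frames


def split_frames(samples, frame_len=FRAME_LEN, shift=FRAME_SHIFT):
    """Return overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len]
            for s in range(0, last_start + 1, shift)]


def ssd(u, v):
    """Sum of squared differences between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def closest_template(coeffs, templates):
    """Return the label whose stored coefficients give the smallest SSD."""
    return min(templates, key=lambda label: ssd(coeffs, templates[label]))


# 2000 input samples yield 23 frames of 240 samples at a shift of 80.
frames = split_frames(list(range(2000)))

# Hypothetical 2-coefficient templates (the real device stores 10 each).
templates = {"a": [-1.2, 0.5], "e": [-0.3, 0.1]}
best = closest_template([-1.0, 0.4], templates)
```

In this toy case the measured coefficients are much closer to the "a" template, so that label is selected.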


Output Hardware

Map the 4 output bits from the DSP board to 10 corresponding vowel LEDs plus 1 volume-indicator LED using NAND chips


Difficulties encountered

  • Insufficient data memory

  • Indirect connection between microphone and the DSP board

  • Incompatible I/O core302 assembly file

  • Low volume for the sound input


Further Expansion

  • Speech compression

  • Large vocabulary continuous speech recognition with Hidden Markov Model


$$H(z) = \frac{G}{1 + A_1 z^{-1} + A_2 z^{-2} + \cdots + A_{10} z^{-10}}$$

Autocorrelation

$$R_i = \sum_{n=i}^{239} x(n)\, x(n-i), \qquad i = 1, \ldots, 10$$
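
The autocorrelation sum can be sketched in Python as follows. The function name `autocorrelation` is an assumption for this example. The sketch also returns lag 0, because R0 appears in the Levinson-Durbin matrix that follows.

```python
def autocorrelation(frame, max_lag=10):
    """Return [R_0, R_1, ..., R_max_lag] for one windowed frame.

    R_i = sum over n = i .. N-1 of x(n) * x(n - i), where N = len(frame)
    (N = 240 on the slides, giving the upper limit 239).
    """
    n_samples = len(frame)
    return [sum(frame[n] * frame[n - i] for n in range(i, n_samples))
            for i in range(max_lag + 1)]


# Tiny 4-sample frame so the sums are easy to check by hand:
# R_0 = 1 + 4 + 9 + 16, R_1 = 2*1 + 3*2 + 4*3, R_2 = 3*1 + 4*2
r = autocorrelation([1.0, 2.0, 3.0, 4.0], max_lag=2)
```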


Levinson-Durbin Algorithm

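$$
\begin{bmatrix}
R_0 & R_1 & R_2 & \cdots & R_9 \\
R_1 & R_0 & R_1 & \cdots & R_8 \\
R_2 & R_1 & R_0 & \cdots & R_7 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
R_9 & R_8 & R_7 & \cdots & R_0
\end{bmatrix}
\begin{bmatrix} A_1 \\ A_2 \\ A_3 \\ \vdots \\ A_{10} \end{bmatrix}
= -
\begin{bmatrix} R_1 \\ R_2 \\ R_3 \\ \vdots \\ R_{10} \end{bmatrix}
$$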

$$A_n(i) = A_{n-1}(i) + K_n\, A_{n-1}(n-i)$$

$$K_n = -\frac{1}{E_{n-1}} \sum_{i=0}^{n-1} A_{n-1}(i)\, R_{n-i}$$

$$E_n = E_{n-1}\left(1 - K_n^2\right)$$
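
The recursion above can be sketched in Python as shown below. The starting values A_0(0) = 1 and E_0 = R_0 are the usual initial conditions and are assumed here, since the slide does not list them. The function name `levinson_durbin` is likewise illustrative.

```python
def levinson_durbin(r, order):
    """Solve the Toeplitz system for A(1)..A(order) from R(0)..R(order).

    Returns (coefficients, final prediction error E_order).
    Assumes r[0] > 0, i.e. the frame is not silent.
    """
    a = [1.0] + [0.0] * order  # A_0: A(0) = 1, everything else 0
    err = r[0]                 # E_0 = R_0
    for n in range(1, order + 1):
        # K_n = -(1/E_{n-1}) * sum_{i=0}^{n-1} A_{n-1}(i) * R_{n-i}
        k = -sum(a[i] * r[n - i] for i in range(n)) / err
        prev = a[:]
        # A_n(i) = A_{n-1}(i) + K_n * A_{n-1}(n-i), for i = 1..n
        for i in range(1, n + 1):
            a[i] = prev[i] + k * prev[n - i]
        # E_n = E_{n-1} * (1 - K_n^2)
        err *= (1.0 - k * k)
    return a[1:], err


# Order-2 example with R = [30, 20, 11]; the result should satisfy
#   30*A1 + 20*A2 = -20   and   20*A1 + 30*A2 = -11
coeffs, final_err = levinson_durbin([30.0, 20.0, 11.0], order=2)
```

For the device described in the slides, `order` would be 10 and `r` would hold the eleven values R0 through R10 of one windowed frame.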

