Speech Processing - PowerPoint PPT Presentation


Speech Processing. Applications of Images and Signals in High Schools. AEGIS RET All-Hands Meeting, University of Central Florida, June 22, 2012. Contributors: Dr. Veton Këpuska, Faculty Mentor, FIT; Jacob Zurasky, Graduate Student Mentor, FIT.

Presentation Transcript
Speech Processing

Applications of Images and Signals in High Schools

AEGIS RET All-Hands Meeting

University of Central Florida

June 22, 2012


Contributors

Dr. Veton Këpuska, Faculty Mentor, FIT

Jacob Zurasky, Graduate Student Mentor, FIT

Becky Dowell, RET Teacher, BPS Titusville High


Motivation

  • Speech processing has become increasingly useful in everyday technology.

  • Applications

    • Siri on iPhone 4S

    • Automated telephone systems

    • Voice transcription (e.g., dictation software)

    • Hands-free computing (e.g., OnStar)

    • Video games (e.g., Xbox Kinect)

    • Military applications (e.g., aircraft control)

    • Healthcare applications


Motivation

  • Speech recognition requires speech to first be characterized by a set of “features”.

  • Features are used to determine what words are spoken.

  • Understanding how the features are computed is very important.

  • Our project will implement the feature extraction stage of a speech processing application.


Work Completed

  • MATLAB fundamentals

  • Introduction to Signal Processing and Filtering

  • Beginning Project Implementation


Speech Recognition

[Block diagram: Speech → Front End (pre-processing) → Features → Back End (recognition) → Recognized speech. The input speech is a large amount of data (e.g., 256 samples); the features are a reduced data size (e.g., 13 features).]

  • Front End – reduces the amount of data for the back end while keeping enough to accurately describe the signal. The output is a feature vector.

    • 256 samples → 13 features

  • Back End – statistical models classify each feature vector as a particular speech sound.


Discrete Time Signals

  • A computer is a discrete system with finite memory, so it requires a discrete representation of sound

  • Sound represented as a sequence of samples

    • time vs. amplitude

    • Amplitude corresponds to volume (loudness)



Discrete Time Signals

  • Sampling rate (number of samples per second)

    • 8 kHz – telephone

    • 44.1 kHz – CD audio

    • 96 kHz – DVD audio
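
To make the idea of a sampling rate concrete, the following is a minimal MATLAB sketch that samples a pure tone at the 8 kHz telephone rate listed above. The tone frequency (440 Hz) and the duration (0.5 s) are assumed values chosen only for illustration.

```matlab
% Sample a pure tone at the 8 kHz telephone rate.
fs = 8000;            % sampling rate in samples per second
duration = 0.5;       % length of the signal in seconds (assumed value)
f0 = 440;             % tone frequency in Hz (assumed value)

n = 0:(fs * duration - 1);   % sample numbers 0, 1, 2, ...
t = n / fs;                  % time of each sample in seconds
x = sin(2 * pi * f0 * t);    % amplitude of each sample

numSamples = length(x);      % equals fs * duration
fprintf('%d samples represent %.1f s of sound at %d samples/s\n', ...
        numSamples, duration, fs);
```

Doubling the sampling rate doubles the number of samples needed for the same duration, which is why higher-quality audio formats take more memory.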


Frequency Domain

  • Need to analyze signals over frequency rather than time.

  • Sound is composed of many frequencies at the same time

  • Frequency determines the pitch of the sound

  • To recognize the sound, we need to know the frequencies that make the sound.


Fast Fourier Transform (FFT)

  • Algorithm used to transform a signal from the time domain to the frequency domain.

  • MATLAB function:

    fft(x, N)

    x – discrete-time signal

    N – FFT size

  • Symbols used in the transform:

    X – frequency spectrum

    k – frequency bin

    N – FFT size

    n – sample number

    x[n] – input signal
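
The equation that went with this symbol list did not survive in the transcript. Assuming it was the standard discrete Fourier transform (DFT), which is what the FFT computes, the textbook definition written with the same symbols is:

```latex
\[
  X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N},
  \qquad k = 0, 1, \dots, N-1
\]
```

Here each frequency bin k corresponds to the frequency k · fs / N in Hz, where fs is the sampling rate (fs is a symbol introduced for this note, not one from the slide).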


Sine Wave Example

  • MATLAB function sine_sound

    • Generate 3 sine waves and a composite signal

    • Play sound and plot graphs

    • Compute and plot FFT of composite signal


Sine Wave Example

% plays a C major chord (C4, E4, G4)

sine_sound(8000, 261.626, 329.628, 391.995, 1, 4096);
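
The source of sine_sound is not shown in the slides, so the following is only a sketch of what such a function might do, written as a plain script. It assumes the arguments in the call above mean sampling rate, three tone frequencies, duration in seconds, and FFT size; the variable names are illustrative.

```matlab
fs = 8000;                          % sampling rate in Hz
freqs = [261.626 329.628 391.995];  % the three tone frequencies in Hz
duration = 1;                       % seconds
N = 4096;                           % FFT size

t = (0:fs * duration - 1) / fs;     % sample times in seconds
waves = sin(2 * pi * freqs' * t);   % 3-by-8000 matrix, one sine wave per row
composite = sum(waves, 1) / 3;      % averaging keeps amplitude within [-1, 1]

X = fft(composite, N);              % uses the first N samples of the signal
mag = abs(X(1:N/2 + 1));            % magnitudes for bins 0 .. N/2
binHz = (0:N/2) * fs / N;           % frequency of each bin in Hz

% Find the strongest bin within 20 Hz of each tone.
peakHz = zeros(1, 3);
for i = 1:3
    nearby = find(abs(binHz - freqs(i)) < 20);
    [~, j] = max(mag(nearby));
    peakHz(i) = binHz(nearby(j));
end
disp(peakHz);

% plot(binHz, mag); xlabel('Frequency (Hz)'); ylabel('Magnitude');
% sound(composite, fs);             % uncomment to hear the chord
```

The spectrum shows three peaks, one near each tone, even though the three waves are mixed together in the time-domain signal. The peaks land on the nearest bin, so they are accurate to within the bin spacing fs / N.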


Front-End Processing of Speech Recognizer

  • Pre-emphasis

  • Window

  • FFT

  • Mel-Scale

  • log

  • IFFT
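
As a rough illustration of how these stages chain together, the sketch below turns one 256-sample frame into 13 numbers. It is not the project's implementation: the mel-scale step is left out to keep the example short, so the result is a plain real cepstrum rather than mel-frequency cepstral coefficients. The input frame is a synthetic two-tone signal standing in for speech, and the pre-emphasis coefficient 0.97 and the Hamming window are assumed choices.

```matlab
fs = 8000;                 % sampling rate in Hz (assumed)
N = 256;                   % samples per frame, as in the slides
numFeatures = 13;          % size of the feature vector, as in the slides
alpha = 0.97;              % pre-emphasis coefficient (assumed value)

n = 0:N-1;
% Synthetic stand-in for one frame of speech (assumed test signal)
frame = sin(2*pi*200*n/fs) + 0.5 * sin(2*pi*1200*n/fs);

% 1. Pre-emphasis: y[n] = x[n] - alpha * x[n-1]
emphasized = filter([1, -alpha], 1, frame);

% 2. Window: Hamming window computed directly (no toolbox needed)
w = 0.54 - 0.46 * cos(2*pi*n / (N-1));
windowed = emphasized .* w;

% 3. FFT: magnitude spectrum of the frame
spectrum = abs(fft(windowed, N));

% 4. Mel-scale: omitted in this sketch

% 5. log: compress the dynamic range (eps avoids log(0))
logSpectrum = log(spectrum + eps);

% 6. IFFT: back to a time-like domain, giving the real cepstrum
cepstrum = real(ifft(logSpectrum));

features = cepstrum(1:numFeatures);   % keep the first 13 coefficients
disp(features);
```

The frame of 256 samples has been reduced to a 13-element feature vector, which is the data reduction the front end is meant to provide.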


Pre-Emphasis

  • First-order FIR filter

  • In human speech, higher frequencies carry less energy, so this high-frequency roll-off needs to be compensated

  • High-pass filter
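
A first-order FIR high-pass filter of this kind is commonly written as y[n] = x[n] - a·x[n-1]. The sketch below assumes that form with a = 0.97, a typical value that is not stated in the slides, and compares how it treats a low tone and a high tone.

```matlab
alpha = 0.97;               % pre-emphasis coefficient (assumed value)
b = [1, -alpha];            % y[n] = x[n] - alpha * x[n-1]

fs = 8000;                  % sampling rate in Hz (assumed)
n = 0:799;                  % 0.1 s of samples
low  = sin(2*pi*100*n/fs);  % 100 Hz tone
high = sin(2*pi*3000*n/fs); % 3000 Hz tone

lowOut  = filter(b, 1, low);   % denominator 1 means no feedback (FIR)
highOut = filter(b, 1, high);

rmsVal = @(v) sqrt(mean(v.^2));   % root-mean-square level of a signal
fprintf('100 Hz gain:  %.2f\n', rmsVal(lowOut)  / rmsVal(low));
fprintf('3000 Hz gain: %.2f\n', rmsVal(highOut) / rmsVal(high));
```

The low tone comes out much weaker and the high tone comes out stronger, which is the boost to high frequencies that pre-emphasis is meant to give.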


Windowing

  • Separate speech signal into frames

  • Apply a window to smooth the edges of each frame of the speech signal
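
The two steps can be sketched as follows. The frame length of 256 samples matches the example used earlier in the slides; the 50% overlap (hop of 128 samples), the Hamming window, and the test signal are assumptions made for illustration.

```matlab
fs = 8000;                          % sampling rate in Hz (assumed)
x = cos(2*pi*300*(0:fs-1)/fs);      % 1 s test signal standing in for speech

frameLen = 256;                     % samples per frame
hop = 128;                          % step between frame starts (assumed 50% overlap)
numFrames = floor((length(x) - frameLen) / hop) + 1;

% Hamming window computed directly, so no toolbox is required
n = 0:frameLen-1;
w = 0.54 - 0.46 * cos(2*pi*n / (frameLen-1));

frames = zeros(numFrames, frameLen);   % one windowed frame per row
for i = 1:numFrames
    first = (i-1) * hop + 1;           % index of the frame's first sample
    frames(i, :) = x(first:first+frameLen-1) .* w;
end

fprintf('%d frames of %d samples each\n', numFrames, frameLen);
```

The window tapers each frame toward its ends, so a frame no longer starts and stops abruptly. Without it, the sharp edges would show up as extra frequencies in the FFT of the frame.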


Mel-Scale

  • Model sound as humans perceive it – logarithmically.

  • At high frequencies, a larger change in frequency is required to notice a difference

  • Convert linear scale (Hz) to logarithmic scale (mel-scale)
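
The slides do not give a conversion formula. One widely used approximation is mel = 2595 · log10(1 + f / 700), and the sketch below assumes that formula to show that the same 100 Hz step counts for much less on the mel scale at high frequencies. The function names hz2mel and mel2hz are illustrative.

```matlab
% Hz <-> mel conversion using a common approximation (assumed formula)
hz2mel = @(f) 2595 * log10(1 + f / 700);
mel2hz = @(m) 700 * (10 .^ (m / 2595) - 1);

% The same 100 Hz step at a low and at a high frequency
lowStep  = hz2mel(200)  - hz2mel(100);
highStep = hz2mel(4100) - hz2mel(4000);

fprintf('100 -> 200 Hz   = %.1f mel\n', lowStep);
fprintf('4000 -> 4100 Hz = %.1f mel\n', highStep);
```

Because the step near 4000 Hz is a much smaller change in mels than the step near 100 Hz, a larger change in Hz is needed at high frequencies to produce the same perceived difference.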



Connections to High School Mathematics Curriculum

  • Florida Math Standard (NGSSS) MA.912.T.1.8:

    • Solve real world problems involving applications of trigonometric functions using graphing technology when appropriate.

  • Pre-Calculus course

    • Related topics include graphs of trigonometric functions, unit circle, logarithmic scale, complex numbers in trig form


Timeline

  • Week 1

    • MATLAB fundamentals

    • MATLAB Filter Design & Analysis Tool

    • Introduction to Signal Processing, FFT, Filtering

    • Identified topics connected to high school math curriculum

  • Week 2

    • Continued tutorials on signal processing and filtering

    • Implementation of sample code for use in lesson plans

    • Implementation of Pre-emphasis, Windowing, FFT, Cepstral Transform


Timeline

  • Weeks 3–6

    • Implementation of Front-End Speech Processing

    • Work on deliverables.




Thank you!

Questions?