Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling
Chairman: Hung-Chi Yang; Presenter: Yue-Fong Guo; Advisor: Dr. Yeou-Jiunn Chen; Date: 2013.3.20
Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling. V. Karjigi, P. Rao, Dept. of Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India

Presentation Transcript

Classification of place of articulation in unvoiced stops with spectro-temporal surface modeling

V. Karjigi, P. Rao
Dept. of Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India

Received 8 December 2011; received in revised form 12 March 2012; accepted 23 April 2012; available online 1 June 2012

Chairman: Hung-Chi Yang
Presenter: Yue-Fong Guo

Advisor: Dr. Yeou-Jiunn Chen
Date: 2013.3.20

Outline
  • Introduction
  • MFCC
  • 2D-DCT
  • Polynomial surface
Outline
  • GMM
  • Results
  • Conclusion
Introduction
  • Automatic speech recognition (ASR) system
    • The goal is to convert the lexical content of human speech into computer-readable input
    • This differs from speaker recognition and verification, which attempt to identify or confirm the speaker rather than the lexical content of the speech
Introduction
  • Automatic speech recognition (ASR) system
    • Acoustic features
      • Signal processing and feature extraction
        • Mel frequency cepstral coefficients (MFCC)
    • Acoustic model
      • Statistical speech model
        • Gaussian mixture model (GMM)
MFCC
  • Mel frequency cepstral coefficients (MFCC)
    • MFCCs take the frequency sensitivity of human perception into account and are therefore well suited to speech and speaker recognition
MFCC
  • Pre-emphasis
    • The speech signal s(n) is sent to a high-pass filter
  • Frame blocking
  • Hamming windowing
    • Each frame is multiplied by a Hamming window to keep the first and last points of the frame continuous
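
The three steps above can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the pre-emphasis coefficient (0.95), the 16 kHz sampling rate, the 25 ms frame length and the 10 ms hop are all assumed values, and the input signal is synthetic noise.

```python
import numpy as np

def pre_emphasize(signal, alpha=0.95):
    """First-order high-pass filter: y[n] = s[n] - alpha * s[n-1] (alpha is assumed)."""
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal, frame_len=400, hop=160):
    """Split the signal into overlapping frames and apply a Hamming window to each."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

# Hypothetical input: one second of noise at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
frames = frame_and_window(pre_emphasize(speech))
print(frames.shape)
```

Each row of `frames` is one windowed frame, ready for the FFT step.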
MFCC
  • Fast Fourier Transform or FFT
    • Converts the time-domain signal into the frequency domain
  • Triangular bandpass filters
    • Smooth the magnitude spectrum so that the harmonics are flattened, yielding the envelope of the harmonic spectrum
  • Discrete cosine transform or DCT
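
A rough sketch of the FFT, triangular filter bank and DCT steps is given below. The mel-scale formula is the commonly used 2595·log10(1 + f/700) mapping; the filter count (20), FFT size (512), sampling rate (16 kHz) and number of cepstral coefficients (12) are assumptions, and all function names are illustrative.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale, shape (n_filters, n_fft//2 + 1)."""
    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_edges) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising slope
            bank[i, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope, peak of 1 at the center bin
            bank[i, k] = (right - k) / (right - center)
    return bank

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return x @ basis.T

def mfcc_from_frames(frames, sample_rate=16000, n_fft=512, n_filters=20, n_ceps=12):
    """Power spectrum -> mel filter bank energies -> log -> DCT."""
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=-1)) ** 2
    mel_energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(mel_energies + 1e-10), n_ceps)

# Hypothetical input: five windowed frames of 400 samples each.
demo_frames = np.random.default_rng(1).standard_normal((5, 400))
print(mfcc_from_frames(demo_frames).shape)
```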
MFCC
  • Log energy
    • The energy within a frame is also an important feature that can be easily obtained
  • Delta cepstrum
    • In practical speech recognition, delta (differential) cepstral parameters are usually appended to capture how the cepstral parameters change over time
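
The log energy and delta features can be illustrated with the sketch below. The delta is computed with a common regression formula over ±2 frames; the window width and the edge padding are assumptions, since the slides do not specify them.

```python
import numpy as np

def log_energy(frames):
    """Log of the summed squared samples in each frame."""
    return np.log(np.sum(np.asarray(frames, dtype=float) ** 2, axis=1) + 1e-10)

def delta(features, width=2):
    """Regression-based delta over +/- width frames; edges are padded by repetition."""
    features = np.asarray(features, dtype=float)
    n = len(features)
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    num = np.zeros_like(features)
    for m in range(1, width + 1):
        num += m * (padded[width + m: width + m + n] - padded[width - m: width - m + n])
    return num / (2.0 * sum(m * m for m in range(1, width + 1)))

# Hypothetical cepstra that rise linearly over time with slopes 1 and 2 per frame.
cepstra = np.arange(10)[:, None] * np.array([1.0, 2.0])
with_deltas = np.hstack([cepstra, delta(cepstra)])
print(with_deltas.shape)
```

For a linear ramp, the interior delta values equal the slope, which is a quick sanity check on the formula.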
2D-DCT
  • 2D-DCT modeling
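
The slide gives no details of the 2D-DCT model, so the following is only a generic sketch of the idea: a time-frequency patch is transformed with a two-dimensional DCT and the low-order coefficients are kept as a compact description of the spectro-temporal surface. The patch size and the number of retained coefficients are assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    mat[0] /= np.sqrt(2.0)
    return mat

def dct2(patch):
    """2D DCT: transform along the frequency axis (rows), then the time axis (columns)."""
    return dct_matrix(patch.shape[0]) @ patch @ dct_matrix(patch.shape[1]).T

def dct2_features(patch, keep_freq=4, keep_time=3):
    """Keep only the low-order block of coefficients as the feature vector."""
    return dct2(patch)[:keep_freq, :keep_time].ravel()

# Hypothetical patch: 8 frequency bands by 6 frames of log energies.
patch = np.random.default_rng(3).standard_normal((8, 6))
print(dct2_features(patch).shape)
```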
Polynomial surface
  • Polynomial surface modeling
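
The slide content for this model was not preserved, so the sketch below shows only one plausible reading: a low-order bivariate polynomial in frequency and time is fitted to a time-frequency patch by least squares, and the fitted coefficients serve as features. The polynomial order, the normalization of both axes to [-1, 1] and the function names are assumptions.

```python
import numpy as np

def poly_design(n_freq, n_time, order=2):
    """Design matrix with one column per monomial f**i * t**j, i + j <= order."""
    f, t = np.meshgrid(np.linspace(-1, 1, n_freq),
                       np.linspace(-1, 1, n_time), indexing="ij")
    cols = [(f ** i * t ** j).ravel()
            for i in range(order + 1) for j in range(order + 1 - i)]
    return np.stack(cols, axis=1)

def fit_surface(patch, order=2):
    """Least-squares polynomial coefficients for a (frequency x time) patch."""
    design = poly_design(patch.shape[0], patch.shape[1], order=order)
    coeffs, *_ = np.linalg.lstsq(design, patch.ravel(), rcond=None)
    return coeffs

# Hypothetical surface 1 + 2t - 3ft sampled on an 8 x 6 grid.
f, t = np.meshgrid(np.linspace(-1, 1, 8), np.linspace(-1, 1, 6), indexing="ij")
surface = 1.0 + 2.0 * t - 3.0 * f * t
# Coefficient order for order=2: 1, t, t^2, f, f*t, f^2
print(np.round(fit_surface(surface), 6))
```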
GMM
  • Gaussian mixture model (GMM)
    • An effective tool for data modeling and pattern classification
    • The speaker's acoustic features are clustered, and each cluster is then described by a Gaussian density
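
As a sketch of how GMMs are typically used for classification, the code below scores a feature vector under one diagonal-covariance GMM per class and picks the class with the highest log-likelihood. The class labels and all model parameters are made-up toy values for illustration; in practice the parameters would be estimated from training data (for example with the EM algorithm), which is not shown here.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of each row of x under a diagonal-covariance GMM."""
    x = np.atleast_2d(np.asarray(x, dtype=float))[:, None, :]   # (n, 1, d)
    weights = np.asarray(weights, dtype=float)                  # (k,)
    means = np.asarray(means, dtype=float)                      # (k, d)
    variances = np.asarray(variances, dtype=float)              # (k, d)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=2)
    log_comp = np.log(weights) + log_norm + log_exp             # (n, k)
    # Log-sum-exp over components for numerical stability.
    peak = log_comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True))).ravel()

def classify(x, class_models):
    """Return, for each row of x, the class whose GMM gives the highest likelihood."""
    names = list(class_models)
    scores = np.stack([gmm_log_likelihood(x, *class_models[c]) for c in names], axis=1)
    return [names[i] for i in scores.argmax(axis=1)]

# Toy models (weights, means, variances); labels and numbers are hypothetical.
toy_models = {
    "labial": ([0.5, 0.5], [[-2.0, 0.0], [-1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "velar": ([1.0], [[2.0, 0.0]], [[1.0, 1.0]]),
}
print(classify(np.array([[-2.0, 0.0], [2.0, 0.1]]), toy_models))
```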
Databases
  • Databases
    • Evaluated on two distinct datasets
      • American English continuous speech as provided in the TIMIT database
      • Marathi words database specially created for the purpose
Conclusion
  • A comparison of performance with published results on the same task revealed that the spectro-temporal feature systems tested in this work improve upon the best previous systems’ performances in terms of classification accuracies on the specified datasets.