Survey of INTERSPEECH 2013

Reporter: Yi-Ting Wang, 2013/09/10

Presentation Transcript


  1. Survey of INTERSPEECH 2013 Reporter: Yi-Ting Wang 2013/09/10

  2. Outline • Exemplar-based Individuality-Preserving Voice Conversion for Articulation Disorders in Noisy Environments • Robust Speech Enhancement Techniques for ASR in Non-stationary Noise and Dynamic Environments • NMF-based Temporal Feature Integration for Acoustic Event Classification

  3. Exemplar-based Individuality-Preserving Voice Conversion for Articulation Disorders in Noisy Environments Ryo AIHARA, Ryoichi TAKASHIMA, Tetsuya TAKIGUCHI, Yasuo ARIKI Graduate School of System Informatics, Kobe University, Japan

  4. Introduction • We present in this paper a noise-robust voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. • Exemplar-based spectral conversion using NMF is applied to a voice with an articulation disorder in real noisy environments. • NMF is a well-known approach for source separation and speech enhancement. • Poorly articulated noisy speech -> clean articulation

  5. Voice conversion based on NMF
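
The exemplar-based conversion named on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes parallel source and target dictionaries plus a noise dictionary (`A_source`, `A_target`, `A_noise`, all hypothetical names), keeps the dictionaries fixed, estimates non-negative activations with the standard KL-divergence multiplicative update, and then reuses only the source activations with the target dictionary.

```python
import numpy as np

def estimate_activations(X, A, n_iter=200, eps=1e-12):
    """Estimate non-negative H so that A @ H approximates X; A stays fixed."""
    rng = np.random.default_rng(0)
    H = rng.random((A.shape[1], X.shape[1])) + eps
    col_sums = A.sum(axis=0)[:, None] + eps  # equals A^T @ ones
    for _ in range(n_iter):
        # Multiplicative update that does not increase the KL divergence
        H *= (A.T @ (X / (A @ H + eps))) / col_sums
    return H

def convert(X_noisy, A_source, A_target, A_noise):
    """Map a noisy source spectrogram onto the target dictionary.

    All arguments are non-negative matrices with one row per frequency bin.
    A_source and A_target are assumed to hold parallel (aligned) exemplars.
    """
    A = np.hstack([A_source, A_noise])           # joint speech + noise dictionary
    H = estimate_activations(X_noisy, A)
    H_source = H[:A_source.shape[1]]             # discard the noise activations
    return A_target @ H_source                   # converted spectrogram

# Toy usage with random non-negative matrices (no real speech data)
rng = np.random.default_rng(1)
A_s, A_t, A_n = rng.random((20, 5)), rng.random((20, 5)), rng.random((20, 2))
X = A_s @ rng.random((5, 8)) + A_n @ rng.random((2, 8))
Y = convert(X, A_s, A_t, A_n)
print(Y.shape)
```

Dropping the noise activations before reconstruction is what makes this kind of scheme noise robust in principle: the noise dictionary absorbs the interfering energy, and only the speech part is carried over to the target exemplars.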

  6. Constructing the individuality-preserving dictionary

  7. Experimental Results • ATR Japanese speech database.

  8. Conclusions • We proposed a noise robust spectral conversion method based on NMF for a voice with an articulation disorder. • Our VC method can improve the listening intelligibility of words uttered by a person with an articulation disorder in noisy environments.

  9. Robust Speech Enhancement Techniques for ASR in Non-stationary Noise and Dynamic Environments Gang Liu, Dimitrios Dimitriadis, Enrico Bocchieri Center for Robust Speech Systems, University of Texas at Dallas

  10. Introduction • In current ASR systems, the presence of competing speakers greatly degrades recognition performance. • Furthermore, speakers are most often not standing still while speaking. • We use Time Difference of Arrival (TDOA) estimation, multi-channel Wiener filtering, NMF, multi-condition training, and robust feature extraction.

  11. Proposed cascaded system • The problem of source localization/separation is often addressed by TDOA estimation.
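
As a rough illustration of TDOA estimation between two microphone channels, the sketch below uses GCC-PHAT (generalized cross-correlation with phase transform). The slides do not say which estimator the authors used, so GCC-PHAT is an assumption here, and the function name `gcc_phat` and its parameters are hypothetical.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`."""
    n = sig.shape[0] + ref.shape[0]              # zero-pad to avoid circular wrap
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)                       # cross-power spectrum
    R /= np.abs(R) + 1e-12                       # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that lag 0 sits at index max_shift
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Toy usage: white noise delayed by 5 samples at 16 kHz
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate([np.zeros(5), ref[:-5]])
tau = gcc_phat(sig, ref, fs)
print(tau * fs)
```

A positive result means `sig` lags `ref`. For a moving speaker, such an estimate would have to be recomputed over short frames rather than once per utterance.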

  12. Experiment and results

  13. Experiment and results • NMF provides the largest boost, due to the suppression of the non-stationary interfering signals.

  14. Conclusion • We propose a cascaded system for speech recognition dealing with non-stationary noise in reverberant environments. • The proposed system offers average relative improvements of 50% and 45% for the two above-mentioned scenarios.

  15. NMF-based Temporal Feature Integration for Acoustic Event Classification Jimmy Ludena-Choez, Ascension Gallardo-Antolin Dep. of Signal Theory and Communications, Universidad Carlos III de Madrid, Avda de la Universidad 30, 28911 – Leganes (Madrid), Spain

  16. Introduction • This paper proposes a new front-end for Acoustic Event Classification (AEC) tasks based on the combination of the temporal feature integration technique called Filter Bank Coefficients (FC) and Non-Negative Matrix Factorization. • FC captures the dynamic structure in the short-time features. • We present an unsupervised method based on NMF for the design of a filter bank more suitable for AEC.

  17. Audio feature extraction
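
To make the idea of Filter Bank Coefficients concrete, here is a minimal sketch under stated assumptions: for one window of short-time features, take the periodogram of each feature trajectory over time and pool it with a bank of non-negative filters over modulation-frequency bins. The window length, the two-band filter bank, and the name `filter_bank_coefficients` are illustrative only and are not taken from the slides.

```python
import numpy as np

def filter_bank_coefficients(features, filter_bank):
    """Summarize the temporal dynamics of short-time features.

    features:    (T, D) array, T frames of D-dimensional short-time features
    filter_bank: (B, T // 2 + 1) non-negative weights over modulation bins
    returns:     (B, D) array, one coefficient per band and feature dimension
    """
    spectrum = np.fft.rfft(features, axis=0)               # along the time axis
    periodogram = (np.abs(spectrum) ** 2) / features.shape[0]
    return filter_bank @ periodogram

# Toy usage: constant features carry all their energy in the DC band
features = np.ones((8, 3))
bank = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],    # hypothetical DC band
                 [0.0, 1.0, 1.0, 1.0, 1.0]])   # hypothetical "everything else" band
fc = filter_bank_coefficients(features, bank)
print(fc)
```

In this picture, a filter bank learned with NMF would simply take the place of the hand-set `bank` matrix.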

  18. Experiments and results • Here, NMF uses the KL divergence.
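
For reference, NMF under the KL divergence is commonly computed with the multiplicative updates of Lee and Seung. The sketch below is a generic version of that scheme, not the authors' code; `V` stands in for whatever non-negative matrix the filter bank is learned from, and the rank and iteration count are arbitrary.

```python
import numpy as np

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence D(V || V_hat) summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

def nmf_kl(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factor a non-negative V (F x T) into W (F x rank) and H (rank x T)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Update activations with the bases held fixed
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
        # Update bases with the activations held fixed
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Toy usage on a random non-negative matrix
V = np.random.default_rng(2).random((12, 30)) + 0.1
W, H = nmf_kl(V, rank=4)
print(kl_divergence(V, W @ H))
```

Each update keeps `W` and `H` non-negative and does not increase the divergence, which is why no step size needs to be tuned.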

  19. Experiments and results

  20. Conclusions • We have presented a new front-end for AEC based on the combination of FC features and NMF. • NMF is used for the unsupervised learning of the filter bank, which captures the most relevant temporal behavior in the short-time features. • Low modulation frequencies are more important than high ones for distinguishing between different acoustic events. • The experiments have shown that the features obtained with this method achieve significant improvements in the classification performance of an SVM-based AEC system in comparison with the baseline FC parameters.