A Tutorial on Bayesian Speech Feature Enhancement

SCALE Workshop, January 2010. A Tutorial on Bayesian Speech Feature Enhancement. Friedrich Faubel. I. Motivation. Speech Recognition System Overview. A speech recognition system converts speech to text. It basically consists of two components:

    Presentation Transcript
    1. A Tutorial on Bayesian Speech Feature Enhancement • Friedrich Faubel • SCALE Workshop, January 2010

    2. I Motivation

    3. Speech Recognition System: Overview • A speech recognition system converts speech to text. It consists of two components: • Front End: extracts speech features from the audio signal • Decoder: finds the sentence (sequence of acoustic states) that is the most likely explanation for the observed sequence of speech features • Diagram: Speech → Front End → Decoder → Text

    4-7. Speech Feature Extraction: Windowing

    8-9. Speech Feature Extraction: Time-Frequency Analysis • Performing spectral analysis separately for each frame yields a time-frequency representation
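
The windowing and per-frame spectral analysis described on these slides can be sketched as follows. This is a minimal illustration, not the front end used in the tutorial: the function name `power_spectrogram`, the Hamming window, and the frame length and hop (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate) are all assumptions.

```python
import numpy as np

def power_spectrogram(signal, frame_len=400, hop=160):
    """Cut a 1-D signal into overlapping frames, window each frame,
    and return the power spectrum |FFT|^2 of every frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)  # assumed window type
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    spectrum = np.fft.rfft(frames, axis=1)  # one FFT per frame
    return np.abs(spectrum) ** 2            # shape: (n_frames, frame_len // 2 + 1)

# Example: one second of a 1 kHz tone sampled at 16 kHz (assumed values).
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = power_spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())  # frequency bin with the most energy
```

Each row of `spec` is one frame and each column one frequency bin, which is the time-frequency representation the slide refers to. With these settings the bins are 40 Hz apart, so the tone's energy concentrates in a single bin.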

    10. Speech Feature Extraction: Perceptual Representation • Emulation of the logarithmic frequency and intensity perception of the human auditory system
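
One common way to emulate logarithmic frequency and intensity perception is a mel filterbank followed by a logarithm. The sketch below assumes this approach; the slide does not say which frequency scale is used, and the mel formula, the triangular filter shape, the filter count, and all function names here are assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # A widely used mel-scale approximation (an assumption, not from the slides).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft_bins, fs):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    freqs = np.linspace(0.0, fs / 2.0, n_fft_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    bank = np.zeros((n_filters, n_fft_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel(power_spec, bank, floor=1e-10):
    """Warp the frequency axis with the filterbank, then compress intensity with a log."""
    return np.log(np.maximum(power_spec @ bank.T, floor))

bank = mel_filterbank(n_filters=24, n_fft_bins=201, fs=16000)
flat_power = np.ones((3, 201))       # three frames of a flat power spectrum
features = log_mel(flat_power, bank)  # shape: (3, 24)
```

The filters are narrow at low frequencies and wide at high frequencies, which gives the logarithmic frequency resolution; the final logarithm gives the logarithmic intensity scale.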

    11. Background Noise • Background noise distorts speech features • Result: features don’t match the features used during training • Consequence: severely degraded recognition performance

    12. Overview of the Tutorial • I - Motivation • II - The effect of noise on speech features • III - Transforming probabilities • IV - The MMSE solution to speech feature enhancement • V - Model-based speech feature enhancement • VI - Experimental results • VII - Extensions

    13. II Interaction Function: The Effect of Noise

    14. Interaction Function • Principle of Superposition: signals are additive, i.e. noisy speech = clean speech + noise

    15-32. Interaction Function • In the signal domain we have the following relationship between noisy speech, clean speech and noise: • After Fourier transformation, this becomes: • Taking the magnitude square on both sides, we get:
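
The equations on these slides did not survive in the transcript. The three steps they describe follow the standard derivation, reconstructed here under assumed notation: $y(t)$, $x(t)$ and $n(t)$ for the noisy speech, clean speech and noise signals, and $Y$, $X$, $N$ for their Fourier transforms. These symbols are not taken from the slides.

```latex
\begin{align*}
y(t) &= x(t) + n(t)
  && \text{(signal domain, superposition)} \\
Y(\omega) &= X(\omega) + N(\omega)
  && \text{(linearity of the Fourier transform)} \\
|Y(\omega)|^2 &= \bigl(X(\omega) + N(\omega)\bigr)\,
                 \overline{\bigl(X(\omega) + N(\omega)\bigr)} \\
              &= |X(\omega)|^2 + |N(\omega)|^2
                 + 2\,\mathrm{Re}\!\bigl(X(\omega)\,\overline{N(\omega)}\bigr)
  && \text{(magnitude square)}
\end{align*}
```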

    33-36. Interaction Function • Taking the magnitude square on both sides, we get: • Hence, in the power spectral domain we have: (slide annotations: "phase term", "relative phase")
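
The power-domain equation itself is missing from the transcript. In its usual form, written with assumed symbols $X$, $N$ and $Y$ for the clean speech, noise and noisy speech spectra and $\theta$ for the relative phase between speech and noise, it reads:

```latex
\begin{equation*}
|Y(\omega)|^2 = |X(\omega)|^2 + |N(\omega)|^2
  + \underbrace{2\,|X(\omega)|\,|N(\omega)|\cos\theta(\omega)}_{\text{phase term}},
\qquad
\theta(\omega) = \arg X(\omega) - \arg N(\omega).
\end{equation*}
```

The last summand is presumably what the slide labels as the phase term, since it is the only part that depends on the relative phase.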

    37. Interaction Function • The relative phase between two waves describes their relative offset in time (delay) (Figure: two waves plotted over time, with the relative phase marked)

    38. Interaction Function • When two sound sources are present, the following can happen: amplification, attenuation, or cancellation (Figure: four example superpositions, two showing amplification, one cancellation and one attenuation)
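
The cases on this slide can be reproduced with two components of the same frequency that differ only in relative phase. This is a small sketch; the function name `mixture_power` and the unit amplitudes are assumptions chosen for illustration.

```python
import numpy as np

def mixture_power(amp_a, amp_b, rel_phase):
    """Power |A + B|^2 of two same-frequency components,
    where B is shifted by rel_phase (in radians) relative to A."""
    a = amp_a
    b = amp_b * np.exp(1j * rel_phase)
    return abs(a + b) ** 2

in_phase = mixture_power(1.0, 1.0, 0.0)          # amplification
quadrature = mixture_power(1.0, 1.0, np.pi / 2)  # powers simply add
opposite = mixture_power(1.0, 1.0, np.pi)        # cancellation
```

With equal amplitudes, a relative phase of zero gives four times the power of a single source, a relative phase of pi cancels the two sources almost exactly, and a quarter-cycle offset gives the plain sum of the two powers.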

    39-40. Interaction Function • Taking the magnitude square on both sides, we get: • Hence, in the power spectral domain we have: (slide annotation: the relative phase term is zero on average)
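
The claim that the phase term vanishes on average can be checked numerically. The sketch below assumes that the relative phase is uniformly distributed over a full cycle; the magnitudes and sample count are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)
x_mag, n_mag = 2.0, 1.0  # assumed spectral magnitudes of speech and noise

# Draw relative phases uniformly from [0, 2*pi) (an assumption).
theta = rng.uniform(0.0, 2.0 * np.pi, size=200_000)

noisy_power = np.abs(x_mag + n_mag * np.exp(1j * theta)) ** 2
cross_term = 2.0 * x_mag * n_mag * np.cos(theta)

mean_power = noisy_power.mean()  # close to x_mag**2 + n_mag**2
mean_cross = cross_term.mean()   # close to zero
```

Averaged over the phase, the noisy power is therefore approximately the sum of the clean speech power and the noise power.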

    41-45. Interaction Function • Taking the magnitude square on both sides, we get: • Hence, in the power spectral domain we have: • In the log power spectral domain that becomes: (Acero, 1990)
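
The log-domain equation is missing from the transcript. If the phase term is dropped and we write x, n and y for the log power of clean speech, noise and noisy speech (assumed symbols), the power-domain sum becomes y = log(exp(x) + exp(n)) = x + log(1 + exp(n - x)). This is presumably the relationship the slide attributes to Acero (1990). A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def noisy_log_power(x_log, n_log):
    """Log power of noisy speech when the phase term is neglected:
    y = log(exp(x) + exp(n)) = x + log(1 + exp(n - x)).
    np.logaddexp evaluates this without overflow."""
    return np.logaddexp(x_log, n_log)

equal_power = noisy_log_power(0.0, 0.0)  # speech and noise equally strong
high_snr = noisy_log_power(10.0, 0.0)    # speech dominates, y is close to x
low_snr = noisy_log_power(0.0, 10.0)     # noise dominates, y is close to n
```

The function behaves like a smooth maximum: whichever of speech and noise is stronger determines the noisy feature, with a soft transition where the two are comparable.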

    46. Interaction Function • Taking the magnitude square on both sides, we get: • Hence, in the power spectral domain we have: • In the log power spectral domain that becomes: • But is that really right?

    47-48. Interaction Function • The mean of a nonlinearly transformed random variable is not necessarily equal to the nonlinear transform of the random variable’s mean. (Figure: nonlinear transform)
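
A quick numerical illustration of this statement, using the logarithm as the nonlinear transform. The uniform distribution and its range are arbitrary assumptions chosen only to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.uniform(0.5, 1.5, size=100_000)  # random variable with mean 1

mean_of_log = np.log(samples).mean()  # mean of the transformed variable
log_of_mean = np.log(samples.mean())  # transform of the mean

gap = log_of_mean - mean_of_log       # positive because log is concave
```

Because the logarithm is concave, the mean of the log is systematically smaller than the log of the mean, so averaging in the power domain and then taking the log is not the same as averaging in the log domain.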

    49. Interaction Function • Phase-averaged relationship between clean and noisy speech:
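
The formula on this slide is not in the transcript and is not reconstructed here. What can be shown is the effect of averaging over the phase in the log domain instead of the power domain. The sketch below does this numerically for a single frequency bin, assuming a uniformly distributed relative phase; the function name and the example values are hypothetical.

```python
import numpy as np

def phase_averaged_log_power(x_log, n_log, n_points=4096):
    """Average log|X + N e^{j theta}|^2 over a uniform grid of relative phases.
    x_log and n_log are the log powers of clean speech and noise."""
    theta = 2.0 * np.pi * np.arange(n_points) / n_points
    x_mag = np.exp(0.5 * x_log)
    n_mag = np.exp(0.5 * n_log)
    power = np.abs(x_mag + n_mag * np.exp(1j * theta)) ** 2
    return np.log(power).mean()

x_log, n_log = np.log(4.0), np.log(1.0)          # assumed example values
averaged = phase_averaged_log_power(x_log, n_log)
phase_neglected = np.logaddexp(x_log, n_log)     # log(|X|^2 + |N|^2)
```

For a single bin, the phase average of the log power equals max(x, n), a known integral identity, and it is smaller than the value obtained by neglecting the phase term before taking the log. Whether the slide presents the relationship in this form, or in a form averaged over several bins, cannot be confirmed from the transcript.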

    50. III Transforming Probabilities