Advances in WP1


Chania Meeting – May 2007. Advances in WP1. www.loquendo.com.

Presentation Transcript
Summary
  • Test on Hiwire DB with denoising methods developed in the project:
    • Wiener SNR dep. Spectral Subtraction
    • Ephraim-Malah SNR dep. Spectral Attenuation
  • Loquendo FE – UGR PEQ Integration
    • Details
    • Results on Hiwire db
Test Conditions
  • Test on the last 50 utterances of each speaker (50-99)
  • The first 50 utterances of each speaker (0-49) are left for development or adaptation
  • Four noise conditions:
    • Clean
    • Low Noise (SNR = 10 dB)
    • Medium Noise (SNR = 5 dB)
    • High Noise (SNR = -5 dB)
  • 4049 utterances for each condition, from 81 speakers of 4 nationalities
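The per-speaker split described above can be sketched as follows. This is only an illustration, assuming each speaker's utterances are indexed from 0 to 99, with the first 50 (indices 0 to 49) kept for development or adaptation and the last 50 used for testing. The function name `split_utterances` and its parameters are hypothetical and are not part of the Hiwire distribution.

```python
def split_utterances(utterance_ids, n_dev=50):
    """Split one speaker's utterance indices into development and test sets.

    Assumption: indices start at 0; the first n_dev go to development or
    adaptation, and the remaining ones go to the test set.
    """
    ordered = sorted(utterance_ids)
    return ordered[:n_dev], ordered[n_dev:]


# Hypothetical speaker with 100 utterances, indexed 0..99.
dev, test = split_utterances(range(100))
print(dev[0], dev[-1], test[0], test[-1])  # 0 49 50 99
```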
HMM-ANN Models

Two HMM-ANN models have been trained:

  • Telephone 8 kHz: trained with a large telephone corpus (LDC Macrophone + SpeechDat Mobile)
  • Microphone 16 kHz: trained with a collection of microphone corpora (TIMIT, WSJ0-1, vehic1us-ch0)
Comments on Results
  • The 16 kHz models are more accurate on clean speech (90.5% vs. 88.4%)
  • Ephraim-Malah noise reduction always outperforms Wiener spectral subtraction (error reduction of 32.8% vs. 25.7% and 25.7% vs. 21.8%).
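As a reading aid, the sketch below shows how a relative error reduction can be computed from two accuracy figures. It assumes that "E.R." on this slide means relative error reduction, which the slide does not state explicitly. The function name `relative_error_reduction` is hypothetical. The example uses only the clean-speech accuracies quoted above (88.4% for the 8 kHz models, 90.5% for the 16 kHz models); it does not reproduce the denoising figures, whose baselines are not given in the transcript.

```python
def relative_error_reduction(baseline_accuracy, improved_accuracy):
    """Return the relative reduction in error rate, in percent.

    Both arguments are accuracies in percent; error rate = 100 - accuracy.
    """
    baseline_error = 100.0 - baseline_accuracy
    improved_error = 100.0 - improved_accuracy
    if baseline_error <= 0:
        raise ValueError("baseline accuracy must be below 100%")
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Clean speech: 8 kHz models (88.4%) vs. 16 kHz models (90.5%).
er = relative_error_reduction(88.4, 90.5)
print(f"{er:.1f}%")  # 18.1%
```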
PEQ Integration (Loquendo & UGR)
  • Loquendo FE: Denoise (Power Spectrum level)
  • UGR PEQ: Feature Normalization (Frame level, 13 coefficients)
  • Loquendo ASR: Phoneme-based Models
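The two processing stages named on this slide, denoising at the power-spectrum level and feature normalization at the frame level, can be sketched roughly as follows. This is not the Loquendo or UGR implementation. The denoising stage uses a generic textbook Wiener-style gain with a crude per-bin SNR estimate, and the normalization stage uses plain mean and variance normalization as a simple stand-in for PEQ, whose parametric form is not described in the slides. All function names, the `floor` parameter, and the toy input values are assumptions made for illustration.

```python
import math


def wiener_gain(noisy_power, noise_power, floor=0.1):
    """Per-bin Wiener-style gain from a crude SNR estimate (generic form)."""
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        # Estimated SNR for this bin, clipped at zero.
        snr = max(p_noisy / max(p_noise, 1e-12) - 1.0, 0.0)
        # The floor limits how strongly a bin can be attenuated.
        gains.append(max(snr / (1.0 + snr), floor))
    return gains


def denoise_power_spectrum(noisy_power, noise_power, floor=0.1):
    """Apply the gain at the power-spectrum level.

    The gain acts on the amplitude, so the power is scaled by gain squared.
    """
    gains = wiener_gain(noisy_power, noise_power, floor)
    return [g * g * p for g, p in zip(gains, noisy_power)]


def mean_variance_normalize(frames):
    """Normalize each coefficient to zero mean and unit variance.

    Stand-in for PEQ: `frames` is a list of feature vectors (for example
    13 coefficients per frame) from one utterance.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    # A constant coefficient has zero deviation; use 1.0 to avoid dividing by 0.
    stds = [
        math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n_frames) or 1.0
        for k in range(n_coeffs)
    ]
    return [[(f[k] - means[k]) / stds[k] for k in range(n_coeffs)] for f in frames]


# Toy values: three frequency bins, then two frames of two coefficients each.
cleaned = denoise_power_spectrum([4.0, 1.0, 10.0], [1.0, 1.0, 1.0])
normalized = mean_variance_normalize([[1.0, 2.0], [3.0, 2.0]])
print(cleaned)
print(normalized)
```

In this sketch the denoising runs before feature extraction and the normalization runs on the resulting feature frames, mirroring the order of the stages on the slide.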

PEQ Results
  • The HMM-ANN models employed are:
    • WSJ0 models
    • WSJ0 models + E.M. denoising
    • WSJ0 models + E.M. denoising + PEQ
Comments on EM denoising - PEQ
  • On noisy speech (LN, MN, HN):
    • both EM denoising and PEQ yield a good improvement
    • the best results are obtained by combining EM denoising and PEQ normalization
  • On clean speech:
    • EM denoising does not decrease performance
    • PEQ normalization slightly decreases performance
  • PEQ is very useful in mismatched conditions, but can (slightly) decrease performance in matched conditions (e.g. clean speech)
Test on TTS American Voice (Dave)
  • We used the American voice Dave of Loquendo TTS to read the 4049 sentences of the Hiwire DB
  • The large difference in results is due to non-native pronunciation
  • E.g. “Range Forty” pronounced
    • by Dave
    • by a French speaker
    • by a Greek speaker
WP1: Workplan
  • Selection of suitable benchmark databases (m6)
  • Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)
  • Discriminative VAD (training+AURORA3 testing) (m16)
  • Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)
  • Preliminary results on spectral subtraction and HEQ techniques (m24)
  • Integration of denoising and normalization techniques (PEQ) (m33)