Chania meeting may 2007

Advances in WP1 - PowerPoint PPT Presentation





PowerPoint Slideshow about ' Advances in WP1' - dalton



Chania Meeting – May 2007

Advances in WP1

www.loquendo.com


Summary

  • Test on Hiwire DB with denoising methods developed in the project:

    • Wiener SNR dep. Spectral Subtraction

    • Ephraim-Malah SNR dep. Spectral Attenuation

  • Loquendo FE – UGR PEQ Integration

    • Details

    • Results on Hiwire db



HIWIRE DB Test



Test Conditions

  • Test on the last 50 utterances of each speaker (50-99)

  • The first 50 utterances of each speaker (0-49) are left for development or adaptation

  • Four noise conditions:

    • Clean

    • Low Noise (SNR = 10 dB)

    • Medium Noise (SNR = 5 dB)

    • High Noise (SNR = -5 dB)

  • 4049 utterances for each condition, from 81 speakers of 4 nationalities
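The per-speaker split described above can be sketched as follows. This is a minimal illustration only, assuming utterance indices run from 0 to 99 and that the development half is 0-49; the function name and the `boundary` parameter are hypothetical and not part of the Hiwire DB tooling.

```python
# Hypothetical helper: index-based development/test split for one speaker.
def split_utterances(utterance_ids, boundary=50):
    """Return (dev, test): ids below `boundary` go to dev, the rest to test."""
    dev = [u for u in utterance_ids if u < boundary]
    test = [u for u in utterance_ids if u >= boundary]
    return dev, test


# Assumed indexing: 100 utterances per speaker, numbered 0-99.
dev_ids, test_ids = split_utterances(range(100))

# Nominal test-set size for 81 speakers with 50 test utterances each.
nominal_test_size = 81 * len(test_ids)
```

Under these assumptions the nominal size is 81 × 50 = 4050 utterances per condition, one more than the 4049 reported on the slide.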


HMM-ANN Models

Two HMM-ANN models have been trained:

  • Telephone 8 kHz: trained with a large telephone corpus (LDC Macrophone + SpeechDat Mobile)

  • Microphone 16 kHz: trained with a collection of microphone corpora (TIMIT, WSJ0-1, vehic1us-ch0)




Comments on Results

  • The 16 kHz models are more accurate on clean speech (90.5% vs. 88.4%)

  • Ephraim-Malah noise reduction always outperforms Wiener spectral subtraction (32.8% vs. 25.7% and 25.7% vs. 21.8% error reduction, E.R.).
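To make the E.R. figures easier to interpret, here is a small worked sketch. It assumes that E.R. means relative error reduction with respect to a baseline, which the slides do not define explicitly. The function name is hypothetical, and the example applies the formula to the two clean-speech accuracies quoted above purely as an illustration; that comparison is not a figure from the presentation.

```python
def relative_error_reduction(baseline_accuracy, improved_accuracy):
    """Relative reduction of the error rate, in percent.

    Both arguments are accuracies in percent; the error rate is taken
    as 100 minus the accuracy (an assumption about the metric).
    """
    baseline_error = 100.0 - baseline_accuracy
    improved_error = 100.0 - improved_accuracy
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Illustration: 8 kHz models (88.4%) as baseline, 16 kHz models (90.5%).
er_clean = relative_error_reduction(88.4, 90.5)
```

With these inputs the error rate drops from 11.6% to 9.5%, a relative reduction of about 18.1%.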



Loquendo FE – UGR PEQ Integration



PEQ Integration (Loquendo & UGR)

[Block diagram: Loquendo FE → UGR PEQ → Loquendo ASR]

  • Loquendo FE: denoising (power-spectrum level)

  • UGR PEQ: feature normalization (frame level, 13 coefficients)

  • Loquendo ASR: phoneme-based models
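The two processing stages named in the diagram can be illustrated with a rough sketch. Neither function below is the Loquendo front-end or the UGR PEQ algorithm: the first is a textbook Wiener-type gain applied to power spectra, and the second is plain per-coefficient mean/variance normalization, used here as a simpler stand-in for PEQ. All names and the `floor` value are assumptions made for illustration.

```python
import math


def wiener_gain(noisy_power, noise_power, floor=0.1):
    """Textbook Wiener-type gain per frequency bin.

    The a priori SNR is approximated as max(a posteriori SNR - 1, 0).
    `floor` is a hypothetical lower bound on the gain.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        n = max(n, 1e-12)  # guard against division by zero
        snr_prio = max(y / n - 1.0, 0.0)
        gains.append(max(snr_prio / (1.0 + snr_prio), floor))
    return gains


def denoise_power_spectrum(noisy_power, noise_power):
    """Apply the gain at power-spectrum level (power scales with gain squared)."""
    gains = wiener_gain(noisy_power, noise_power)
    return [g * g * y for g, y in zip(gains, noisy_power)]


def normalize_features(frames):
    """Mean/variance normalization of each coefficient over an utterance.

    Stand-in for PEQ: `frames` is a list of equal-length feature vectors
    (e.g. 13 coefficients per frame).
    """
    n_frames = len(frames)
    n_coeff = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeff)]
    stds = []
    for k in range(n_coeff):
        var = sum((f[k] - means[k]) ** 2 for f in frames) / n_frames
        stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero
    return [[(f[k] - means[k]) / stds[k] for k in range(n_coeff)] for f in frames]


# Toy usage: two frequency bins, then two 13-coefficient frames.
clean_power = denoise_power_spectrum([4.0, 1.0], [1.0, 1.0])
normalized = normalize_features([[1.0] * 13, [3.0] * 13])
```

The ordering mirrors the diagram: denoising acts on the power spectrum before feature extraction, while normalization acts on the resulting feature frames.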



PEQ Results

  • The HMM-ANN models employed are:

    • WSJ0 models

    • WSJ0 models + E.M. denoising

    • WSJ0 models + E.M. denoising + PEQ



Comments on EM denoising - PEQ

  • On noisy speech (LN, MN, HN):

    • both EM denoising and PEQ give a good improvement

    • the best results are obtained by combining EM denoising with PEQ normalization

  • On clean speech:

    • EM denoising does not degrade performance

    • PEQ normalization slightly degrades performance

  • PEQ is very useful in mismatched conditions, but can slightly degrade performance in matched conditions (e.g. clean speech)


Test on TTS American Voice (Dave)

  • We have used the American voice DAVE of Loquendo TTS to read the 4049 sentences of the Hiwire DB

  • The large difference in results is due to non-native pronunciation

  • E.g. “Range Forty” pronounced

    • by Dave

    • by a French speaker

    • by a Greek speaker


WP1: Workplan

  • Selection of suitable benchmark databases (m6)

  • Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)

  • Discriminative VAD (training+AURORA3 testing) (m16)

  • Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)

  • Preliminary results on spectral subtraction and HEQ techniques (m24)

  • Integration of denoising and normalization techniques (PEQ) (m33)

