# Speech Enhancement with Binaural Cues Derived from a Priori Codebook

##### Presentation Transcript

1. Speech Enhancement with Binaural Cues Derived from a Priori Codebook. Reporter: Nan Chen, Beijing University of Technology. http://www.bjut.edu.cn/sci/voice/index.htm

2. Contents: 1 Introduction; 2 The Proposed Method; 3 Results and Conclusions.

3. Part 1: Introduction.

4. Introduction. Noise types: street, car, babble, office.

5. Introduction: the traditional method of speech enhancement.

6. Introduction: Binaural Cue Coding (BCC) framework. Purpose: recovering the perception of the original input signals. BCC analysis: extracts the side information of the input signals. BCC synthesis: recovers the input signals by using the side information and the mono signal. Figure 1: Block diagram of analysis and synthesis for BCC.

7. Introduction: Once the discrete Fourier transform (DFT) coefficients of the mono signal are known, the DFT coefficients of each output channel, $S_{c,k}$, can be calculated from Eqs. (1) to (3) (the equations themselves are not preserved in this transcript). In these equations, one term is the inter-channel level difference (ICLD) between channel 1 and channel c for the nth sub-band, and another is a random variable controlled by the inter-channel coherence (ICC).

8. Introduction: BCC recovers the perception of the original input signals, whereas speech enhancement separates the clean signal from the noisy signal. The BCC principle is therefore introduced to estimate the clean signal: the noisy speech is enhanced using the BCC principle, with channel 1 assumed to be the clean speech and channel 2 regarded as the noise. Diagram labels: noisy speech, clean speech, noise.

9. Part 2: The Proposed Method.

10. The Proposed Method: side information. The clean cue consists of the speech and noise level difference (SNLD) and the speech and noise correlation (SNC). The pre-enhanced cue consists of the pre-enhanced speech and noise level difference (PNLD), the pre-enhanced speech and noise correlation (PNC), the posterior SNR (PSNR), and the speech presence probability (SPP).
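
The level-difference and correlation cues listed above can be illustrated with a minimal sketch. It assumes the cues follow the usual BCC-style definitions (a power ratio in dB and a zero-lag normalized cross-correlation per sub-band); the slides do not give the exact formulas, so the function names and the `eps` regularizer are hypothetical.

```python
import math

def level_difference_db(speech_band, noise_band, eps=1e-12):
    """Level difference in dB between two sub-band signals (assumed definition)."""
    p_speech = sum(x * x for x in speech_band)
    p_noise = sum(x * x for x in noise_band)
    # eps avoids log(0) and division by zero for silent bands
    return 10.0 * math.log10((p_speech + eps) / (p_noise + eps))

def correlation(speech_band, noise_band, eps=1e-12):
    """Zero-lag normalized cross-correlation of two sub-band signals (assumed definition)."""
    num = sum(s * n for s, n in zip(speech_band, noise_band))
    den = math.sqrt(sum(s * s for s in speech_band) * sum(n * n for n in noise_band))
    return num / (den + eps)
```

With clean speech and noise as inputs this would give SNLD and SNC; with pre-enhanced speech and an estimated noise it would give PNLD and PNC.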

11. The Proposed Method. Figure 2: Block diagram of the proposed monaural speech enhancement method.

12. The Proposed Method: weighted codebook mapping algorithm. Figure 3: Block diagram of the weighted codebook mapping.

13. The Proposed Method: estimation of the clean cue. 1) Compare the Euclidean distance (ED) between the online pre-enhanced cue and each trained pre-enhanced cue, and choose the M code-vectors with relatively small ED from the trained codebook. 2) Calculate the degree of membership ρ of the chosen code-vectors. 3) Define the weight of each chosen code-vector. 4) Obtain the online clean cue by weighting the trained clean cues stored in the chosen code-vectors. (Eqs. (4) and (5), which accompany steps 2 and 3 on the slide, are not preserved in this transcript.)
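
The four steps of the clean-cue estimation can be sketched as follows. This is only an illustration: the slide's membership and weight formulas are not available, so the sketch assumes an inverse-distance membership and weights normalized to sum to one. The function name, the `(pre_enhanced_cue, clean_cue)` codebook layout, and the parameters `m` and `eps` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length cue vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_codebook_mapping(online_cue, codebook, m=3, eps=1e-12):
    """Estimate a clean cue from an online pre-enhanced cue.

    codebook: list of (pre_enhanced_cue, clean_cue) pairs (assumed layout).
    """
    # Step 1: keep the M code-vectors whose pre-enhanced cue is nearest.
    nearest = sorted(
        ((euclidean(online_cue, pre), clean) for pre, clean in codebook),
        key=lambda pair: pair[0],
    )[:m]
    # Step 2: degree of membership (ASSUMPTION: inverse distance).
    memberships = [1.0 / (dist + eps) for dist, _ in nearest]
    # Step 3: weights (ASSUMPTION: memberships normalized to sum to 1).
    total = sum(memberships)
    weights = [rho / total for rho in memberships]
    # Step 4: weighted sum of the stored clean cues.
    dim = len(nearest[0][1])
    return [
        sum(w * clean[k] for w, (_, clean) in zip(weights, nearest))
        for k in range(dim)
    ]
```

For example, an online cue that lies exactly midway between two trained pre-enhanced cues yields the average of their stored clean cues when `m=2`.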

14. The Proposed Method: speech enhancement. According to the BCC principle, the clean speech and the noise are expressed by Eqs. (6) and (7), which include a random function with zero mean and constant variance. Finally, the noisy speech is enhanced by Eq. (8). (The equations themselves are not preserved in this transcript.)
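
Because Eqs. (6) to (8) are not available, the following is not the authors' formulation. It is a minimal sketch of how a speech-to-noise level difference could, in principle, be turned into a gain on the noisy DFT coefficients, assuming speech and noise are uncorrelated so that their powers add. The function names are hypothetical, and the correlation cue and the random term are ignored here.

```python
import math

def speech_gain_from_level_difference(snld_db):
    """Amplitude gain for the speech channel given a level difference in dB.

    ASSUMPTION: noisy power = speech power + noise power (uncorrelated),
    so the speech share of the power is r / (1 + r) with r = 10^(SNLD/10).
    """
    ratio = 10.0 ** (snld_db / 10.0)
    return math.sqrt(ratio / (1.0 + ratio))

def enhance_band(noisy_coeffs, snld_db):
    """Scale the noisy DFT coefficients of one sub-band by the speech gain."""
    gain = speech_gain_from_level_difference(snld_db)
    return [gain * x for x in noisy_coeffs]
```

A level difference of 0 dB (equal speech and noise power) gives a gain of about 0.707, and the gain approaches 1 as the level difference grows.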

15. Part 3: Results and Conclusions.

16. Results: segmental SNR (SSNR).

17. Results: perceptual evaluation of speech quality (PESQ).

18. Results: log-spectral distance (LSD).
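
For readers unfamiliar with the objective measures named in the results, the sketch below shows common textbook forms of SSNR and LSD. The frame length, the clamping range of -10 to 35 dB, and the input layouts are assumptions, not the settings used in the presentation. PESQ is defined by ITU-T P.862 and is too involved to sketch here.

```python
import math

def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0, eps=1e-12):
    """Mean per-frame SNR in dB over non-overlapping frames, clamped to [lo, hi]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(x * x for x in s)
        error = sum((x - y) ** 2 for x, y in zip(s, e))
        snr = 10.0 * math.log10((signal + eps) / (error + eps))
        values.append(min(max(snr, lo), hi))
    if not values:
        raise ValueError("signal is shorter than one frame")
    return sum(values) / len(values)

def log_spectral_distance_db(clean_power, enhanced_power, eps=1e-12):
    """Mean over frames of the RMS difference between log power spectra (dB).

    Both arguments are lists of frames, each frame a list of power values.
    """
    per_frame = []
    for s, e in zip(clean_power, enhanced_power):
        diffs = [10.0 * math.log10((a + eps) / (b + eps)) for a, b in zip(s, e)]
        per_frame.append(math.sqrt(sum(d * d for d in diffs) / len(diffs)))
    return sum(per_frame) / len(per_frame)
```

Higher SSNR and lower LSD both indicate an enhanced signal that is closer to the clean reference.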

19. Results: 5 dB babble noise. Panels: clean, Ref. A, proposed, Ref. B.

20. Results: 10 dB babble noise. Panels: clean, Ref. A, Ref. B, proposed.

21. Conclusions • We enhance the noisy speech by modeling the spectral detail, which is why the method can reduce the noise between harmonics. • Noise classification is no longer required, because the binaural cues introduced as a priori information are not correlated with the type of noise.

22. Thank you!