
Wavelet-Based Speech Enhancement

Course Project Presentation 1

Mahdi Amiri

April 2003

Sharif University of Technology

Presentation Outline
  • Motivation and Goals
  • Wavelet Transform - Overview
  • Basic Denoising in Wavelet Domain
  • Literature Survey
  • Implementation and Results
  • Conclusions and Future Works

Motivation and Goals

Key Applications

  • Improving perceptual quality of speech
    • Reduce listener’s fatigue
    • Hearing aids
  • Improving performance of
    • Speech coders
    • Voice recognition systems

Motivation and Goals

Goals of SE in Wavelet Domain

  • Variable window size for different frequency components
    • Long time intervals → precise low-frequency information
    • Short time intervals → precise high-frequency information
  • Easy to implement
    • Fast WT computational complexity: O(n)
    • FFT computational complexity: O(n log2 n)
  • Denoising by simple thresholding
    • Real-time implementation


Wavelet Transform - Overview

  • Motivation and Goals
  • Wavelet Transform - Overview
  • Basic Denoising in Wavelet Domain
  • Literature Survey
  • Implementation and Results
  • Conclusions and Future Works


Wavelet Transform - Overview

History

  • Fourier (1807)
  • Haar (1910)
  • Math World


Wavelet Transform - Overview

  • What kind of basis functions could be useful?
    • Impulse Function (Haar): Best time resolution
    • Sinusoids (Fourier): Best frequency resolution
    • We want both of the best resolutions
  • Heisenberg (1930)
    • Uncertainty Principle
      • There is a lower bound for the product of time and frequency resolutions (an intuitive proof is given in [Mac91])


Wavelet Transform - Overview

  • Gabor (1945)
    • Short Time Fourier Transform (STFT)
      • Disadvantage: Fixed window size


Wavelet Transform - Overview

  • Constructing Wavelets
    • Daubechies (1988)
      • Compactly Supported Wavelets
  • Computation of WT Coefficients
    • Mallat (1989)
      • A fast algorithm using filter banks
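
As a rough illustration of the filter-bank idea, the sketch below computes a multi-level DWT with the Haar wavelet, the simplest two-tap filter pair. The function names `haar_dwt` and `haar_idwt` are hypothetical, and the input length is assumed to be divisible by 2^levels. Each level processes half as many samples as the previous one, so the total work is n + n/2 + n/4 + ... < 2n, which is where the O(n) cost comes from.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multi-level Haar DWT via a two-channel filter bank.

    Returns [A_L, D_L, ..., D_1]. Assumes len(signal) is divisible by 2**levels.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass, then downsample by 2
        approx = (even + odd) / np.sqrt(2.0)         # low-pass, then downsample by 2
    return [approx] + details[::-1]

def haar_idwt(coeffs):
    """Inverse of haar_dwt: upsample and merge, coarsest level first."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx
```

Longer wavelets such as the Daubechies family follow the same structure with longer filters and some boundary handling.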


Wavelet Transform - Overview

Multiresolution Signal Representation

Coarse version (Approximation) is more useful than the Detail for:

  • Browsing image databases on the web
  • Signal transmission for communication
  • Denoising

Wavelet Tree Decomposition

  • Wavelet Transform (WT)
  • Undecimated WT (UWT)

We may lose what is in the Detail


Wavelet Transform - Overview

Full Tree Decomposition

  • Wavelet Packet Transform (WPT)
  • Undecimated WPT (UWPT)

S = A1+D1 or S = A1+AD2+DD2 or …

Which decomposition path could be the best choice?

The answer leads us to the Best Basis


Wavelet Transform - Overview

Best Basis Selection Criteria

Cut if:

  • Entropy
    • Coifman, Meyer, Wickerhauser (1992)
  • Rate-Distortion:
    • Vetterli (1995)


Basic Denoising in Wavelet Domain

  • Motivation and Goals
  • Wavelet Transform - Overview
  • Basic Denoising in Wavelet Domain
  • Literature Survey
  • Implementation and Results
  • Conclusions and Future Works


Basic Denoising in Wavelet Domain

Principle

  • A few coefficients in the lower bands suffice to approximate the main features of the clean signal. Hence, by setting the smaller coefficients to zero, we can eliminate noise nearly optimally while preserving the important information of the clean signal.


Basic Denoising in Wavelet Domain

Notation

  • Clean signal
  • Noise signal
  • Noisy signal

Time domain

Wavelet domain


Basic Denoising in Wavelet Domain

Algorithm

  • Framing input noisy signal
  • Forward WT of a frame
  • Thresholding (detail) wavelet coefficients
  • Inverse WT
  • Keep center part of the frame
  • Repeat for all of the frames
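
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: it uses the Haar wavelet, soft thresholding with a fixed threshold, reflect padding at the signal edges, and hypothetical names and defaults (`denoise_frames`, `frame_len`, `keep`, `levels`, `threshold`). `frame_len` must be divisible by 2^levels and `frame_len - keep` must be even.

```python
import numpy as np

def haar_dwt(x, levels):
    """Haar DWT, returns [A_L, D_L, ..., D_1]."""
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return [approx] + details[::-1]

def haar_idwt(coeffs):
    """Inverse Haar DWT."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx

def denoise_frames(noisy, frame_len=256, keep=128, levels=4, threshold=0.5):
    """Frame, forward WT, threshold details, inverse WT, keep the frame centre."""
    noisy = np.asarray(noisy, dtype=float)
    margin = (frame_len - keep) // 2
    n_blocks = int(np.ceil(noisy.size / keep))
    # Pad so that every kept centre block sits inside a full frame.
    right_pad = margin + n_blocks * keep - noisy.size
    padded = np.pad(noisy, (margin, right_pad), mode="reflect")
    out = np.empty(n_blocks * keep)
    for b in range(n_blocks):
        frame = padded[b * keep : b * keep + frame_len]
        coeffs = haar_dwt(frame, levels)
        # Soft-threshold the detail coefficients only; keep the approximation.
        coeffs[1:] = [np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0)
                      for d in coeffs[1:]]
        rec = haar_idwt(coeffs)
        out[b * keep : (b + 1) * keep] = rec[margin : margin + keep]
    return out[: noisy.size]
```

Keeping only the centre of each frame discards the samples most affected by frame-boundary effects, at the cost of transforming overlapping frames.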


Basic Denoising in Wavelet Domain

Threshold Value

VisuShrink [DonJ94b]

Threshold

Estimation of Noise variance

Frame length

For Gaussian white noise:

Another definition (wden.m):

MAD: Median Absolute Deviation
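
For orientation, the universal (VisuShrink) threshold of Donoho and Johnstone is commonly written as T = σ·sqrt(2·ln N), with N the frame length and σ the noise standard deviation, and σ is commonly estimated from the finest-scale detail coefficients as median(|d|) / 0.6745. The sketch below assumes these standard forms; it may differ in detail from the variants shown on the slide, and the function names are hypothetical.

```python
import numpy as np

def mad_sigma(detail):
    """Robust noise std estimate, assuming zero-median detail coefficients."""
    return np.median(np.abs(detail)) / 0.6745

def universal_threshold(detail, frame_len):
    """Universal threshold T = sigma * sqrt(2 ln N) for a frame of N samples."""
    return mad_sigma(detail) * np.sqrt(2.0 * np.log(frame_len))
```

The constant 0.6745 is the 75th percentile of the standard normal distribution, which makes the median-based estimate consistent for Gaussian noise.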


Basic Denoising in Wavelet Domain

Threshold Value

Threshold in the WPT case

For the correlated noise situation: use a level-dependent threshold (SureShrink [DonJ94b])


Basic Denoising in Wavelet Domain

How to Threshold

Hard Thresholding

Soft Thresholding

Comparison:

Discontinuity

Alteration of values
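
The two rules can be written in a few lines. This sketch uses the usual textbook definitions, with `t` as the threshold; the function names are hypothetical. Hard thresholding keeps surviving coefficients unchanged but jumps at |c| = t (the discontinuity), while soft thresholding is continuous but shrinks every surviving coefficient by t (the alteration of values).

```python
import numpy as np

def hard_threshold(c, t):
    """Keep coefficients with |c| > t unchanged, zero the rest."""
    c = np.asarray(c, dtype=float)
    return np.where(np.abs(c) > t, c, 0.0)

def soft_threshold(c, t):
    """Zero coefficients with |c| <= t and shrink the rest toward zero by t."""
    c = np.asarray(c, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)
```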


Literature Survey

  • Motivation and Goals
  • Wavelet Transform - Overview
  • Basic Denoising in Wavelet Domain
  • Literature Survey
  • Implementation and Results
  • Conclusions and Future Works


Literature Survey

[SeoB97], Novelty

  • Title:
    • Speech enhancement with reduction of noise components in the wavelet domain
  • Novelty:
    • Semisoft thresholding [GaoB95]
    • Classification of unvoiced region in WD
    • Different thresholding for unvoiced region


Literature Survey

[SeoB97], Thresholding

  • Semisoft Thresholding: [GaoB95]
    • Less sensitivity to small perturbations in the data
    • Smaller bias

Hard

Soft

Semisoft

Like [DonJ94b]
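
A minimal sketch of semisoft thresholding, assuming the usual two-threshold form attributed to Gao and Bruce: coefficients below `t1` are zeroed, coefficients above `t2` are kept unchanged, and those in between are mapped linearly so the function stays continuous. The names `semisoft_threshold`, `t1` and `t2` are hypothetical, and how [SeoB97] chooses the two thresholds is not shown here.

```python
import numpy as np

def semisoft_threshold(c, t1, t2):
    """Semisoft thresholding with lower threshold t1 and upper threshold t2 (t1 < t2)."""
    c = np.asarray(c, dtype=float)
    mag = np.abs(c)
    # Linear ramp from 0 at |c| = t1 up to t2 at |c| = t2.
    ramp = np.sign(c) * t2 * (mag - t1) / (t2 - t1)
    return np.where(mag <= t1, 0.0, np.where(mag <= t2, ramp, c))
```

Letting `t2` approach `t1` gives hard thresholding, and letting `t2` grow without bound gives soft thresholding, which is why this rule sits between the two.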


Literature Survey

[SeoB97], Unvoiced Regions

  • Separation of unvoiced region
    • Use DWT for finding
    • Calculate average energy of each subband
    • Current speech segment is unvoiced if:


Literature Survey

[SeoB97], Implementations

  • If unvoiced, threshold only the highest frequency band
  • Implementation results
    • Additive white Gaussian noise
    • SNR (-10 dB → 10 dB)
    • “Should we chase those cowboys?”


Literature Survey

[SooKY97], Novelty

  • Title: Wavelet for speech denoising
  • Novelty:
    • Evaluation of different wavelets and different orders (db1-10, coif1-5, sym2-8, bior1.3-6.8)
    • Spectral Subtraction in WD
    • Wiener Filtering in WD (Uses two methods for estimating the a priori SNR)
      • Maximum Likelihood approach
      • Decision Directed approach


Literature Survey

[SooKY97], Thresholding 1

Use DWT and find L levels of decomposition

1. Spectral Subtraction (SS) in WD

if

then

Use similar scheme for

Denoised value 

else

Denoised value 

Expected value of the noise magnitude, could be estimated from silence frames


Literature Survey

[SooKY97], Thresholding 2

2. Wiener Filtering in WD

is the a priori SNR

Estimating

a. Maximum Likelihood

b. Decision Directed

[0, 1], Typ. 0.9
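
The following sketch shows one plausible reading of Wiener filtering with a decision-directed a priori SNR estimate, applied per wavelet coefficient across successive frames. It assumes the Wiener gain ξ/(1+ξ), a single-frame maximum-likelihood-style estimate max(y²/σ² − 1, 0), and the usual decision-directed recursion with smoothing factor `alpha`. The function names and the frame layout are hypothetical, and the exact formulas in [SooKY97] may differ.

```python
import numpy as np

def wiener_gain(xi):
    """Wiener gain for a priori SNR xi."""
    return xi / (1.0 + xi)

def ml_prior_snr(noisy, noise_var):
    """ML-style estimate: a posteriori SNR minus one, floored at zero."""
    return np.maximum(noisy ** 2 / noise_var - 1.0, 0.0)

def decision_directed_wiener(frames, noise_var, alpha=0.9):
    """frames: array of shape (n_frames, n_coeffs) holding noisy wavelet coefficients."""
    frames = np.asarray(frames, dtype=float)
    prev_clean_pow = np.zeros(frames.shape[1])  # |previous clean estimate|^2
    out = np.empty_like(frames)
    for k, y in enumerate(frames):
        # Blend the previous frame's estimate with the current ML-style estimate.
        xi = alpha * prev_clean_pow / noise_var + (1.0 - alpha) * ml_prior_snr(y, noise_var)
        out[k] = wiener_gain(xi) * y
        prev_clean_pow = out[k] ** 2
    return out
```

With `alpha = 0` the recursion reduces to the single-frame estimate; values near 0.9 smooth the SNR estimate across frames.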


Literature Survey

[SooKY97], Implementations

  • Implementation results
    • White Gaussian noise
    • Both male and female voices
    • 10 levels of decomposition


Literature Survey

[SooKY97], Conclusions

  • The methods are not particularly sensitive to the various wavelet types with the exception of Bior3.1
  • Wiener-filtered speech has better SNR values than magnitude subtraction
  • For Wiener filtering, the decision directed approach gives better SNR values than the maximum likelihood approach


Literature Survey

[KimYK01], Novelty

  • Title:
    • Speech enhancement using adaptive wavelet shrinkage
  • Novelty:
    • Adaptive threshold value
      • Threshold value will depend on the variance of estimated clean signal (BayesShrink)
    • Classification of unvoiced region using entropy
      • Applies a smaller threshold for unvoiced regions and calls the method “Adaptive BayesShrink”


Literature Survey

[KimYK01], Threshold Value

  • BayesShrink: the adaptive threshold value for minimizing the Bayesian risk is
  • Thus, the estimated threshold value is found as

Where

[ChaYV00a]
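
As a hedged sketch, the BayesShrink rule of Chang, Yu and Vetterli is usually stated as T = σ_n² / σ_x, where σ_n is the noise standard deviation and σ_x = sqrt(max(σ_y² − σ_n², 0)) is the estimated standard deviation of the clean coefficients in a subband. The function name and the way the noisy variance σ_y² is computed below (mean of squared coefficients) are assumptions, not taken from the slide.

```python
import numpy as np

def bayes_shrink_threshold(detail, noise_sigma):
    """BayesShrink threshold for one subband of noisy detail coefficients."""
    detail = np.asarray(detail, dtype=float)
    var_y = np.mean(detail ** 2)                          # noisy-subband variance (zero mean assumed)
    sigma_x = np.sqrt(max(var_y - noise_sigma ** 2, 0.0))  # clean-signal std estimate
    if sigma_x == 0.0:
        # Subband looks like pure noise: threshold at the largest magnitude.
        return float(np.max(np.abs(detail)))
    return noise_sigma ** 2 / sigma_x
```

The threshold is small where the signal dominates the subband and large where noise dominates, which is what makes it adaptive.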


Literature Survey

[KimYK01], Unvoiced Regions

  • Current region is unvoiced if
  • Unvoiced region has smaller energy, so apply a smaller threshold:

are selected by simulation

The paper does not state which type of entropy is used; it could be:


Literature Survey

[KimYK01], Implementations

  • Implementation results:
    • Additive white Gaussian noise
    • SNR: 0 dB, 10 dB and 20 dB


Literature Survey

[ChaKYK02], Novelty

  • Title: Speech enhancement for non-stationary noise environment by adaptive wavelet packet
  • Novelty:
    • Node dependent thresholding for adaptation in colored or non-stationary noise
    • Noise estimation based on spectral entropy not MAD
    • Modified hard thresholding to alleviate time-frequency discontinuities


Literature Survey

[ChaKYK02], Threshold Value

  • Create WPT and find best basis tree’s leaf nodes
  • Node dependent thresholding
  • Noise estimation could be like: or the following proposed method


Literature Survey

[ChaKYK02], Noise Estimation

  • Estimate the spectral pdf of the wavelet packet coefficients through a B-bin histogram
  • Calculate normalized spectral entropy for each node in adapted wavelet packet tree
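
One way to read these two steps is sketched below: build a B-bin histogram of the coefficient magnitudes in a node, treat the normalized counts as a pdf, and divide the Shannon entropy by ln B so the result lies in [0, 1]. The function name, the default of 16 bins, and the use of magnitudes with equal-width bins are assumptions, and the paper's exact definition may differ.

```python
import numpy as np

def normalized_spectral_entropy(coeffs, n_bins=16):
    """Histogram-based entropy of coefficient magnitudes, scaled to [0, 1]."""
    mags = np.abs(np.asarray(coeffs, dtype=float))
    counts, _ = np.histogram(mags, bins=n_bins)
    p = counts / counts.sum()   # empirical pdf over the bins
    p = p[p > 0]                # empty bins contribute nothing to the sum
    return float(-np.sum(p * np.log(p)) / np.log(n_bins))
```

A node whose magnitudes spread evenly over all bins scores close to 1, while a node whose magnitudes fall in a single bin scores 0.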


Literature Survey

[ChaKYK02], Noise Estimation (cont.)

  • Estimate spectral magnitude intensity by histogram
  • Define an auxiliary threshold
  • Estimate standard deviation of noise

# of C. with magnitude equal to or greater than bin’s amplitude

node_length

bins of C. magnitudes


Literature Survey

[ChaKYK02], Noise Estimation (cont.)

Greater disorder of wavelet coefficients (less voiced, more unvoiced)

More uniform spectral pdf

Bigger values for entropy (0 → 1)

Bigger value for alpha

Smaller # of bins bigger than alpha

Smaller estimation for standard deviation of noise


Literature Survey

[ChaKYK02], Thresholding

Modified Hard Thresholding


Literature Survey

[ChaKYK02], Implementations

  • Implementation results:
    • Pink noise, SNR: -5 dB ~ 15 dB

Subjective tests favored the level-dependent thresholding, but not every time. Even so, the proposed method has better spectral performance (spectrogram).


Literature Survey

[ChaKYK02], Implementations (cont.)

  • SNR (dB) test for various noisy speech: “We like bleu cheese but Victor prefers swiss cheese.” (SNR= 10dB)


Literature Survey

  • To be continued…

Thank You.


References (1 of 2)


References (2 of 2)


Wavelet-Based Speech Enhancement

Course Project Presentation 1

Thank You

FIND OUT MORE AT...

1. http://ce.sharif.edu/~m_amiri/

2. http://www.aictct.com/dml/