

Spatial Covariance Models For Under-Determined Reverberant Audio Source Separation

N. Duong, E. Vincent and R. Gribonval

METISS project team, IRISA/INRIA, France

Oct. 2009

Content
  • Under-determined source separation
  • Spatial covariance models
  • Model parameter estimation
  • Experimental evaluation
  • Conclusion
Under-determined source separation
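  • Use $I$ recorded mixture signals to separate $J$ sources, where $I < J$
  • Convolutive mixing model: denote by $\mathbf{h}_j(\tau)$ the vector of mixing filters from source $j$ to the microphone array. The contribution $\mathbf{c}_j(t)$ of source $s_j$ to all microphones and the vector of mixture signals $\mathbf{x}(t)$ are computed as $\mathbf{c}_j(t) = \sum_{\tau} \mathbf{h}_j(\tau)\, s_j(t - \tau)$ and $\mathbf{x}(t) = \sum_{j=1}^{J} \mathbf{c}_j(t)$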
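The convolutive mixing model can be illustrated with a minimal NumPy sketch. The function name `convolutive_mix`, the array layout, and the toy decaying filters are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def convolutive_mix(sources, filters):
    """Mix J sources onto I microphones.

    sources: (J, T) array of source signals
    filters: (J, I, L) array of mixing filters, one per source/microphone pair
    Returns the source images, shape (J, I, T), and the mixture, shape (I, T).
    """
    J, T = sources.shape
    I = filters.shape[1]
    images = np.zeros((J, I, T))
    for j in range(J):
        for i in range(I):
            # Image of source j at microphone i: filter convolved with the source,
            # truncated to the first T samples.
            images[j, i] = np.convolve(sources[j], filters[j, i])[:T]
    # The mixture is the sum of all source images.
    return images, images.sum(axis=0)

# Hypothetical under-determined example: J = 3 sources, I = 2 microphones.
rng = np.random.default_rng(0)
sources = rng.standard_normal((3, 1000))
decay = np.exp(-np.arange(64) / 16.0)  # toy exponentially decaying filters
filters = rng.standard_normal((3, 2, 64)) * decay
images, mixture = convolutive_mix(sources, filters)
```

With more sources than microphones, the mixture cannot be inverted by a linear filter alone, which is why the later slides rely on a statistical model of the source images.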
BSS approaches

Short-time Fourier transform (STFT)

Sparsity assumption: only FEW sources are active at each time-frequency point

Binary masking: only ONE source is active at each time-frequency point

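L1-norm minimization: at each time-frequency point, select the source coefficients with the smallest L1 norm that reproduce the mixture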
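As a rough sketch of binary masking, the code below assigns each time-frequency point to the single source whose mixing vector best matches the mixture. The selection rule (largest normalized projection) and the names `binary_mask_separate`, `X`, and `A` are assumptions chosen for illustration; practical systems use various direction or clustering criteria.

```python
import numpy as np

def binary_mask_separate(X, A):
    """X: (F, N, I) mixture STFT; A: (F, I, J) mixing vectors, one column per source.
    Returns estimated source images, shape (J, F, N, I)."""
    J = A.shape[2]
    A_unit = A / np.linalg.norm(A, axis=1, keepdims=True)
    # score[f, n, j] = magnitude of the projection of x(n, f) on the unit-norm vector of source j
    score = np.abs(np.einsum('fij,fni->fnj', A_unit.conj(), X))
    winner = score.argmax(axis=2)                       # (F, N)
    mask = winner[None] == np.arange(J)[:, None, None]  # (J, F, N), one True per TF point
    # The winning source receives the whole mixture at that point, the others receive zero.
    return mask[..., None] * X[None]

# Toy check: only source 1 is active, so it should receive the whole mixture.
rng = np.random.default_rng(0)
F, N, I, J = 8, 5, 2, 3
A = rng.standard_normal((F, I, J)) + 1j * rng.standard_normal((F, I, J))
s = rng.standard_normal((F, N)) + 1j * rng.standard_normal((F, N))
X = s[:, :, None] * A[:, None, :, 1]
est = binary_mask_separate(X, A)
```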

Beamforming model

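The mixing vector $\mathbf{h}_j(f)$ is denoted as $\mathbf{a}_j(f)$ and approximated from the distances between each source and the microphones [T. Gustafsson et al.], i.e. for a stereo mixture $\mathbf{a}_j(f)$ has two entries, each a gain and a delay set by the distance from source $j$ to one microphone.

Covariance matrix of the source images: $\mathbf{R}_{\mathbf{c}_j}(n,f) = v_j(n,f)\, \mathbf{a}_j(f)\mathbf{a}_j^H(f)$, where $v_j(n,f)$ is the source variance and $\mathbf{a}_j(f)\mathbf{a}_j^H(f)$ is the rank-1 spatial covariance matrix modeling the mixing process.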
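The steering-vector formula itself is not visible in the transcript. The sketch below assumes the usual free-field form, with gain $1/(\sqrt{4\pi}\, r)$ and delay $r/c$ for a source-to-microphone distance $r$; the function names, the speed of sound, and the example distances are likewise assumptions.

```python
import numpy as np

def steering_vector(freqs, dists, c=343.0):
    """freqs: (F,) frequencies in Hz; dists: (I,) source-to-microphone distances in m.
    Returns an (F, I) array of free-field steering vectors (assumed form)."""
    freqs = np.asarray(freqs, dtype=float)[:, None]
    dists = np.asarray(dists, dtype=float)[None, :]
    gains = 1.0 / (np.sqrt(4.0 * np.pi) * dists)   # spherical spreading
    delays = dists / c                             # propagation delay in seconds
    return gains * np.exp(-2j * np.pi * freqs * delays)

def rank1_covariance(a):
    """a: (F, I) -> (F, I, I), the outer product a(f) a(f)^H at each frequency."""
    return a[:, :, None] * a[:, None, :].conj()

# Hypothetical stereo geometry, with frequencies of a 1024-point STFT at 16 kHz.
freqs = np.fft.rfftfreq(1024, d=1.0 / 16000)
a = steering_vector(freqs, dists=[1.20, 1.25])
R = rank1_covariance(a)
```

Each `R[f]` is Hermitian with zero determinant, which is what "rank 1" means for a 2 x 2 spatial covariance matrix.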

Spatial covariance models
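  • Purpose of the paper: explore extensions of the Gaussian framework, i.e. $\mathbf{c}_j(n,f) \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_{\mathbf{c}_j}(n,f))$ and $\mathbf{R}_{\mathbf{c}_j}(n,f) = v_j(n,f)\, \mathbf{R}_j(f)$, that better account for reverberation
  • We evaluate the potential separation performance by estimating the spatial model parameters from training data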
  • Source separation by Wiener filtering
  • Models for spatial covariance matrix:
    • Rank-1 convolutive model
    • Rank-1 anechoic model
    • Full-rank direct+diffuse model
    • Full-rank unconstrained model.
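Separation by Wiener filtering under this Gaussian model can be sketched as follows, assuming each source image has covariance equal to a source variance times a spatial covariance matrix. The array layout and the names `wiener_separate`, `X`, `v`, and `R` are illustrative assumptions.

```python
import numpy as np

def wiener_separate(X, v, R):
    """Multichannel Wiener filter.

    X: (F, N, I) mixture STFT
    v: (J, F, N) source variances
    R: (J, F, I, I) spatial covariance matrices
    Returns estimated source images, shape (J, F, N, I).
    """
    # Covariance of each source image at each TF point: variance times spatial covariance.
    Rc = v[..., None, None] * R[:, :, None, :, :]   # (J, F, N, I, I)
    Rx = Rc.sum(axis=0)                             # mixture covariance, (F, N, I, I)
    # Solve Rx y = x instead of forming the inverse explicitly.
    y = np.linalg.solve(Rx, X[..., None])           # (F, N, I, 1)
    return (Rc @ y[None])[..., 0]

# Toy example with random Hermitian positive definite spatial covariances.
rng = np.random.default_rng(0)
F, N, I, J = 4, 6, 2, 3
B = rng.standard_normal((J, F, I, I)) + 1j * rng.standard_normal((J, F, I, I))
R = B @ B.conj().swapaxes(-1, -2)
v = rng.uniform(0.1, 1.0, size=(J, F, N))
X = rng.standard_normal((F, N, I)) + 1j * rng.standard_normal((F, N, I))
est = wiener_separate(X, v, R)
```

The estimated images add up to the mixture by construction, since the per-source filters sum to the identity.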
Rank-1 models
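  • Rank-1 anechoic model: $\mathbf{R}_j(f) = \mathbf{a}_j(f)\mathbf{a}_j^H(f)$
  • where $\mathbf{a}_j(f)$ is the steering vector specified in the beamforming approach
  • Rank-1 convolutive model: $\mathbf{R}_j(f) = \mathbf{h}_j(f)\mathbf{h}_j^H(f)$
  • where $\mathbf{h}_j(f)$ is the Fourier transform of the mixing filters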
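For the rank-1 convolutive model, a minimal sketch takes the discrete Fourier transform of the time-domain mixing filters and forms the outer product per frequency bin. The function name, the FFT length, and the toy filters are assumptions; the model itself is only a good approximation when the filters are short compared with the STFT window.

```python
import numpy as np

def rank1_convolutive_covariance(filters, n_fft=1024):
    """filters: (I, L) mixing filters of one source, with L <= n_fft.
    Returns an (F, I, I) array of rank-1 spatial covariance matrices."""
    H = np.fft.rfft(filters, n=n_fft, axis=1).T    # frequency responses, shape (F, I)
    return H[:, :, None] * H[:, None, :].conj()    # outer product per frequency bin

# Toy decaying filters standing in for measured room responses.
rng = np.random.default_rng(0)
filters = rng.standard_normal((2, 256)) * np.exp(-np.arange(256) / 64.0)
R = rank1_convolutive_covariance(filters)
```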
Full-rank direct+diffuse model
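  • Assuming that the direct part and the reverberant part are uncorrelated and that the reverberant part is diffuse: $\mathbf{R}_j(f) = \mathbf{a}_j(f)\mathbf{a}_j^H(f) + \sigma_{\mathrm{rev}}^2\, \mathbf{\Psi}(f)$
  • where $\sigma_{\mathrm{rev}}^2$ and $\mathbf{\Psi}(f)$ can be specified from statistical room acoustics, i.e. they depend on the microphone distance $d$, wall area $A$, and wall reflection coefficient $\beta$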
  • - In the rectangular room:
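The formulas on this slide did not survive in the transcript. The sketch below therefore assumes the standard statistical room acoustics expressions: a sinc-shaped coherence $\sin(2\pi f d/c)/(2\pi f d/c)$ between two microphones in a diffuse field, and a reverberant power $4\beta^2 / (A(1-\beta^2))$. All names, the speed of sound, the example distances, and the reflection coefficient 0.8 are hypothetical; only the room dimensions and the 20 cm microphone distance come from the experiment slide.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s

def diffuse_coherence(freqs, d):
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sin(2 pi f d / c) / (2 pi f d / c)
    return np.sinc(2.0 * np.asarray(freqs, dtype=float) * d / C_SOUND)

def reverb_variance(wall_area, beta):
    # Assumed expression for the power of the diffuse reverberant part
    return 4.0 * beta**2 / (wall_area * (1.0 - beta**2))

def direct_diffuse_covariance(a, freqs, d, wall_area, beta):
    """a: (F, 2) stereo steering vectors. Returns (F, 2, 2) full-rank covariances."""
    omega = diffuse_coherence(freqs, d)
    psi = np.ones((len(freqs), 2, 2))              # diffuse-field covariance structure
    psi[:, 0, 1] = omega
    psi[:, 1, 0] = omega
    direct = a[:, :, None] * a[:, None, :].conj()  # rank-1 direct part
    return direct + reverb_variance(wall_area, beta) * psi

freqs = np.fft.rfftfreq(1024, d=1.0 / 16000)
dists = np.array([1.20, 1.25])                     # hypothetical source-to-microphone distances
a = np.exp(-2j * np.pi * freqs[:, None] * dists / C_SOUND) / (np.sqrt(4 * np.pi) * dists)
wall_area = 2 * (4.45 * 3.35 + 4.45 * 2.5 + 3.35 * 2.5)  # total wall area of the room, in m^2
R = direct_diffuse_covariance(a, freqs, d=0.20, wall_area=wall_area, beta=0.8)
```

Because the diffuse term is positive definite at every non-zero frequency, adding it to the rank-1 direct part yields a full-rank matrix there.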
Full-rank unconstrained model

- A more general model than the previous ones, in which the coefficients of $\mathbf{R}_j(f)$ are not related a priori

- Allows more flexible modeling of the mixing process, since in practice the reverberant part is rarely diffuse and is correlated with the direct part

- Expected to improve separation performance on real-world convolutive mixtures.
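One simple way to obtain an unconstrained full-rank matrix from a known source image is to average the outer products of its STFT coefficients over time and fix the scale, leaving the overall level to the source variance. This is an illustrative estimator under that assumption, not necessarily the one used by the authors; the function name and the trace normalization are choices made here.

```python
import numpy as np

def estimate_unconstrained_covariance(C):
    """C: (F, N, I) STFT of one source's spatial image.
    Returns (F, I, I) Hermitian matrices, each scaled to have trace I."""
    F, N, I = C.shape
    # Time average of c(n, f) c(n, f)^H for each frequency bin
    R = np.einsum('fni,fnk->fik', C, C.conj()) / N
    trace = np.trace(R, axis1=1, axis2=2).real
    # Remove the scale ambiguity between the spatial covariance and the source variance
    return R * (I / trace)[:, None, None]

# Random data standing in for the STFT of a reverberant source image.
rng = np.random.default_rng(0)
C = rng.standard_normal((4, 50, 2)) + 1j * rng.standard_normal((4, 50, 2))
R = estimate_unconstrained_covariance(C)
```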

Model parameter estimation
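  • We investigate the potential separation performance achievable via each model in two contexts:
  • Semi-blind context: the spatial covariance matrices $\mathbf{R}_j(f)$ are estimated from the true source images, but the source variances $v_j(n,f)$ are blindly estimated from the mixture in the maximum-likelihood (ML) sense, i.e. by minimizing $\sum_{n,f} D_{KL}(\hat{\mathbf{R}}_{\mathbf{x}}(n,f) \mid \mathbf{R}_{\mathbf{x}}(n,f))$
  • where $D_{KL}$ is the Kullback-Leibler (KL) divergence between the empirical covariance matrices $\hat{\mathbf{R}}_{\mathbf{x}}(n,f)$ and the model-based matrices $\mathbf{R}_{\mathbf{x}}(n,f) = \sum_j v_j(n,f)\, \mathbf{R}_j(f)$.
  • Oracle context: both $\mathbf{R}_j(f)$ and $v_j(n,f)$ are estimated from the true source images.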
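The KL divergence between two zero-mean Gaussian distributions depends only on their covariance matrices. The sketch below uses the form trace minus log-determinant minus dimension; constant scale factors differ between conventions and are omitted here as an assumption. The function and variable names are illustrative.

```python
import numpy as np

def kl_divergence(R_emp, R_model):
    """KL-type divergence between zero-mean Gaussians with covariances R_emp and R_model.
    Both inputs are Hermitian positive definite, shape (..., I, I)."""
    I = R_emp.shape[-1]
    M = np.linalg.solve(R_model, R_emp)            # R_model^{-1} R_emp
    trace = np.trace(M, axis1=-2, axis2=-1).real
    _, logdet = np.linalg.slogdet(M)               # log |det M|; det M is real and positive here
    return trace - logdet - I

# Two random Hermitian positive definite matrices for a quick check.
rng = np.random.default_rng(0)
B1 = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
B2 = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
R1 = B1 @ B1.conj().T + np.eye(2)
R2 = B2 @ B2.conj().T + np.eye(2)
d_same = kl_divergence(R1, R1)
d_diff = kl_divergence(R1, R2)
```

The divergence is zero when the two matrices coincide and positive otherwise, so minimizing it over the source variances pulls the model covariance toward the empirical one.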
Experiment
  • Purpose:
    • Compare the source separation performance of the model-based algorithms
    • Criteria: SDR, SIR, SAR
  • Experimental setup:
    • Speech length: 5 seconds
    • Sampling rate: 16 kHz
    • Sine window for STFT with length of 1024 taps
  • Room geometry:
    • Room dimensions: 4.45 x 3.35 x 2.5 m
    • Source and microphone height: 1.4 m
    • Microphone distance: d = 20 cm or 5 cm
    • Source-to-microphone distance: 120 cm or 50 cm
    • [Figure: room layout showing sources s1, s2, s3, microphones m1, m2, source-to-microphone distance r, and position labels 1.8 m and 1.5 m]
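The STFT settings above can be sketched as follows. The window length of 1024 and the 16 kHz sampling rate come from the slide; the 50% hop size, the function names, and the random stand-in signal are assumptions.

```python
import numpy as np

def sine_window(length):
    # w[n] = sin(pi (n + 0.5) / length)
    return np.sin(np.pi * (np.arange(length) + 0.5) / length)

def stft(x, win_len=1024, hop=512):
    """Return the STFT of x with shape (n_frames, win_len // 2 + 1)."""
    w = sine_window(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[k * hop:k * hop + win_len] * w for k in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

fs = 16000
x = np.random.default_rng(0).standard_normal(5 * fs)  # stand-in for 5 s of speech
X = stft(x)
```

With a 50% hop, the squared sine window sums to one across overlapping frames, which makes it a common choice when the same window is used for analysis and synthesis.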

Conclusion
  • Proposed to model the convolutive mixing process by full-rank spatial covariance matrices
  • Experimental results confirm that full-rank spatial covariance matrices better account for reverberation and potentially improve separation performance compared to rank-1 matrices.
  • Work in progress:
    • Validated the power of the proposed algorithms on real-world recordings with small source movements (demo session)
    • Blind context: learning the model parameters from the recorded mixture (submitted to ICASSP 2010).
  • Future work:
    • Consider separation of diffuse and semi-diffuse sources