## Informed Source Separation of Orchestra and Soloist Using Masking and Unmasking


**ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA) 2010**

Informed Source Separation of Orchestra and Soloist Using Masking and Unmasking

By Yushen Han and Christopher Raphael
School of Informatics and Computing, Indiana University Bloomington

Saturday 25 September 2010, Makuhari, Japan

**Motivation – Musical Source Separation In General**

• To extract the orchestral accompaniment from any desired recording, given the score
  • can be used in an automatic accompaniment system (e.g. for a piano concerto) or for karaoke
• To isolate one chosen instrument from an ensemble
  • can be used in performance analysis of the soloist

**Motivation – This Project**

• Problem: separation procedures damage each of the separated sources
• Objective: to address the degradation of the separation results
• Strategy: exploit the information redundancy of the musical audio within each source

**Overview**

• Motivation and introduction – system diagram
• Previous – Separation by Spectrogram Masking
• Recent – Repair by Spectrogram Unmasking
  • harmonicity hypothesis tested by Kalman phase smoothing
  • repair by amplitude inference and harmonic transposition
• Examples with evaluation by spectrogram reassignment
• Relevant works – ISS?
• Conclusion and Discussion

**Informed Source Separation System Diagram**

[Figure: system diagram. Block labels: System Input; System Output; Score; Audio-score Alignment; Separation by Spectrogram Masking; Desoled Audio (Damaged); Note-wise audio reconstruction; Note sample library; Note sample models; 2D Spectral Modeling; Phase Estimation by Kalman Smoothing; Harmonicity Hypothesis; Amplitude Inference; Desoled Audio (Repaired); Evaluation According to BASS. Method tags: EM = Expectation-Maximization; DP = Dynamic Programming; ML = Machine Learning (binary classification); HPSS = Harmonic-Percussive Separation. Legend: Mostly Previous Work; Recent Development; Focus of This Paper.]

**Separation “Informed” by Score Following**

[Figure labels: solo; accompaniment]

**Previous: (Binary) Spectrogram Masking**

• Short-time Fourier Transform
• Complementary binary masks (hard binary mask)

**Previous: 2D Note-based Model**

A “template” function $q_m$ for the note model indexed by $m$

**Phase Estimation**

• Amplitude-Phase Decoupling Model: the signal at the $h$th harmonic is described by an amplitude and a phase
  • amplitude: slowly varying at the $h$th harmonic
  • phase: locally linear in $s$ up to a small correction term
• Phase unwrapping

**State-space Model for Phase (cont.)**

• The idea of using a state-space model to estimate phase should be credited to A. Taylan Cemgil.
For the observable phase sequence at the $h$th harmonic, we introduce a state vector with an unobservable component. As time $s$ progresses (discretely), the state vector propagates via the state equation, where the state transition matrix governs the sinusoidal movement of the phase according to the average phase advance at harmonic $h$ over a relatively long period $(s_0, s_1)$, and $w(s)$ is an unobservable, zero-mean random (state) perturbation.

**Illustration of State-space Model for Phase Estimation**

[Figure: the state components $x_1$ and $x_2$ at steps $s$ and $s + 1$]

The state is $x(s) = (x_1(s), x_2(s))^t$, but we only observe $y = x_1$. The observation equation $y(s) = H(s)\,x(s) + r(s)$ connects the observed and the unobservable, where $H(s) = [1 \;\; 0]$ and $r(s) = 0$ is the degenerate random (observation) perturbation.

**Kalman Smoothing**

Following the state-space model, we can obtain the amplitude and phase. The state-space model can be computed with a Kalman filter, but since the phase estimation is offline, we can use Kalman smoothing to update the state estimates backward, incorporating the observations that were not yet “available” at sample $t$ in the forward pass.

**Phase Estimation And Pairwise Unwrapped Phase Difference**
[Figure: clarinet, pitch G#3 (written A#3 on B♭ clarinet) over a crescendo]

**Application of Phase Estimation – Using Pairwise Unwrapped Phase Difference to Test the Harmonicity Hypothesis**

By projecting the unwrapped phase $\theta_i(s)$ from harmonic $i$ to harmonic $j$, we visualize the unwrapped phase difference between harmonics in woodwinds and strings to test the harmonicity hypothesis.

**Damage Repair**

**Experiment**

A 45-second excerpt from the 2nd movement of Ravel’s Piano Concerto in G major.

**Excerpt from the 2nd movement of Ravel’s Piano Concerto in G major**

**Relevant works**

• BSS
• NMF (non-negative “part-based representation” in NMF)
• Latent variable decomposition by Raj, Smaragdis
• Other score-guided separation by Dubnov
• “Informed Source Separation” using watermark by Parvaix
• Harmonic/Percussive Sound Separation (HPSS), by Sagayama, Ono
• Physical Acoustics, Fletcher

**Conclusion and Future Work**

• Harmonic-wise information redundancy, expressed in both amplitude and phase, can be used to infer “partially” damaged notes
• Creating a framework to perform separation/repair on a large scale, with synthesized ground truth and with the BASS performance measurement by E. Vincent
• (coming soon) xavier.informatics.indiana.edu/~yushan/SAPA2010

**FINE**

Thank you for your attention
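
**Appendix sketch: hard binary masking**

The masking step named on the “Previous: (Binary) Spectrogram Masking” slide can be illustrated with a small sketch. This is not the authors' implementation: the function name `apply_binary_masks`, the list-of-lists STFT layout and the toy values are all assumptions made for illustration. In the system described above the mask would come from the score-informed classification; here it is simply given. The sketch only shows the complementary property: every time-frequency bin goes entirely to either the solo or the accompaniment.

```python
def apply_binary_masks(stft, solo_mask):
    """Split an STFT into solo and accompaniment with complementary hard masks.

    stft      : list of frames, each a list of complex STFT bins (assumed layout)
    solo_mask : same shape; 1 assigns a bin to the soloist, 0 to the orchestra
    """
    solo, accomp = [], []
    for frame, mask_row in zip(stft, solo_mask):
        # The solo keeps the bins where the mask is 1 ...
        solo.append([x * m for x, m in zip(frame, mask_row)])
        # ... and the accompaniment keeps the complementary bins (1 - m).
        accomp.append([x * (1 - m) for x, m in zip(frame, mask_row)])
    return solo, accomp


# Toy 2-frame, 3-bin "spectrogram" with made-up values (illustration only).
stft = [[1 + 1j, 2 + 0j, 0.5j],
        [3 + 0j, 1 - 1j, 2j]]
solo_mask = [[1, 0, 0],
             [0, 1, 1]]

solo, accomp = apply_binary_masks(stft, solo_mask)
```

Because the two masks sum to one in every bin, adding the two masked spectrograms recovers the original STFT exactly. The damage the talk refers to arises because a bin in which both sources have energy is given wholly to one of them.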