Convex optimization in sinusoidal modeling for audio signal processing
Convex Optimization in Sinusoidal Modeling for Audio Signal Processing. Michelle Daniels, PhD Student, University of California, San Diego. Outline: introduction to sinusoidal modeling, existing approach, proposed optimization post-processing, testing and results, conclusions, future work.


Convex Optimization in Sinusoidal Modeling for Audio Signal Processing

Michelle Daniels

PhD Student, University of California, San Diego


Outline

  • Introduction to sinusoidal modeling

  • Existing approach

  • Proposed optimization post-processing

  • Testing and results

  • Conclusions

  • Future work


Analysis of Audio Signals

  • Audio signals have rapid variations

    • Speech

    • Music

    • Environmental sounds

  • Assume minimal change over short segments (frames)

  • Analyze on a frame-by-frame basis

    • Constant-length frames (46 ms)

    • Frames typically overlap

  • Any audio signal can be represented as a sum of sinusoids (deterministic components) and noise (stochastic components)
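
The frame-by-frame analysis described above can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `split_into_frames`, the 44.1 kHz sampling rate, and the 50% overlap are assumptions chosen so that a 2048-sample frame lasts about 46 ms.

```python
def split_into_frames(signal, frame_len, hop):
    """Return overlapping, constant-length frames of `signal` (a list of samples)."""
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames

# Assumed values, not taken from the slides: 44.1 kHz sampling and 50% overlap.
sample_rate = 44100
frame_len = 2048                 # 2048 / 44100 s is about 46 ms
hop = frame_len // 2             # consecutive frames share half their samples
signal = [0.0] * sample_rate     # one second of silence as a placeholder input
frames = split_into_frames(signal, frame_len, hop)
```

Trailing samples that do not fill a whole frame are dropped in this sketch; a real analyzer would more likely zero-pad the last frame.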


Sinusoidal Modeling of Audio Signals

  • Given a signal y of length N, represent it as the sum of K component sinusoids plus noise e

  • y and e are N-dimensional vectors

  • Each sinusoid has frequency (w), magnitude (a), and phase (f) parameters

  • K is determined during the analysis process

  • Frequencies are resolved more finely than DFT bins, and no harmonic relationship is required

  • Model, encode, and/or process these components independently

  • Applications:

    • Effects processing (time-scale modification, pitch shifting)

    • Audio compression

    • Feature extraction for machine listening

    • Auditory scene analysis
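
The model equation itself is missing from this slide. A standard form consistent with the parameters listed above would be the following, written here as an assumption with w_k in radians per sample and n as the sample index:

```latex
y[n] = \sum_{k=1}^{K} a_k \cos(w_k n + f_k) + e[n], \qquad n = 0, 1, \dots, N-1
```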


Estimation Algorithm

  • Using frequency domain analysis (e.g. FFT), iterate up to K times, until residual signal is small and/or has a flat spectrum:

    • Identify the highest-magnitude sinusoid in the signal

    • Estimate its frequency w

    • Given w, estimate its magnitude a and phase f

    • Reconstruct the sinusoid

    • Subtract the reconstructed sinusoid to produce a residual signal

  • After all sinusoids have been removed, the final residual contains only noise
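
The loop above can be sketched in NumPy as follows. The slides do not say how each frequency, magnitude, and phase is estimated, so this sketch substitutes two simple choices that are assumptions: the peak bin of a zero-padded, Hann-windowed FFT for the frequency, and a single-sinusoid least-squares fit for the magnitude and phase. The function name and the `pad` and `stop_energy` parameters are hypothetical, and the flat-spectrum stopping test is omitted.

```python
import numpy as np

def estimate_sinusoids(y, max_k, pad=8, stop_energy=1e-6):
    """Greedy peak-picking analysis of one frame; returns (params, residual).

    params is a list of (w, a, f): frequency in radians/sample, magnitude, phase.
    """
    n = np.arange(len(y))
    window = np.hanning(len(y))
    residual = np.asarray(y, dtype=float).copy()
    params = []
    for _ in range(max_k):
        if np.mean(residual ** 2) < stop_energy:
            break  # residual is small: stop early
        # Steps 1-2: the highest peak of a zero-padded FFT gives the frequency.
        spectrum = np.fft.rfft(residual * window, pad * len(y))
        peak_bin = int(np.argmax(np.abs(spectrum)))
        w = 2 * np.pi * peak_bin / (pad * len(y))
        # Step 3: given w, fit magnitude and phase by least squares (assumption).
        basis = np.column_stack([np.cos(w * n), -np.sin(w * n)])
        (c, s), *_ = np.linalg.lstsq(basis, residual, rcond=None)
        a, f = np.hypot(c, s), np.arctan2(s, c)
        # Steps 4-5: reconstruct the sinusoid and subtract it from the residual.
        residual = residual - a * np.cos(w * n + f)
        params.append((w, a, f))
    return params, residual
```

Because each sinusoid is estimated from whatever the previous subtractions left behind, any error in an early estimate propagates into every later one.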






Estimation Challenges

  • Energy in any DFT bin can come from:

    • Multiple sinusoids with similar frequency

    • Both sinusoids and noise

  • Interference from other sinusoids and/or noise results in inaccurate estimates

  • Incorrect estimation of a single sinusoid corrupts the residual signal and affects all subsequent estimates


Possible Solution

  • Optimize frequency, magnitude, and phase to minimize the energy in the residual signal

  • The original parameter estimates are initial estimates for the optimization

  • Sinusoidal approximation:

  • Residual:

  • Optimization problem:
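
The three expressions named above are missing from this slide. Using the document's symbols (w, a, f for the vectors of frequencies, magnitudes, and phases), a plausible reconstruction, stated as an assumption, is:

```latex
\begin{align*}
\hat{y}[n] &= \sum_{k=1}^{K} a_k \cos(w_k n + f_k) \\
r &= y - \hat{y} \\
\min_{w,\,a,\,f} \quad & \lVert y - \hat{y}(w, a, f) \rVert_2
\end{align*}
```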


Is it Convex?

  • Want convexity so the problem is practical to solve

  • Not a convex optimization problem because each element of ŷ is a sum of cosine functions of w and f

  • Want the expression inside the 2-norm to be affine in the optimization variables, so that the objective is convex

  • With fixed frequencies, the optimization of magnitudes and phases can be reformulated as a convex problem

    • Fix frequencies to initial estimates


Convex Optimization Problem

Classic least-squares problem:

Magnitude and phase recovered as:
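
The formulas from this slide are missing here. A minimal sketch of the idea, assuming the usual identity a cos(wn + f) = (a cos f) cos(wn) - (a sin f) sin(wn), is shown below. With the frequencies fixed, the unknowns a cos f and a sin f enter linearly, so ordinary least squares applies. The function name and the sign convention are assumptions, not taken from the slides.

```python
import numpy as np

def refit_magnitudes_and_phases(y, freqs):
    """Least-squares magnitudes and phases for fixed frequencies (rad/sample)."""
    y = np.asarray(y, dtype=float)
    n = np.arange(len(y))[:, None]
    w = np.asarray(freqs, dtype=float)[None, :]
    # a*cos(w*n + f) = (a*cos f)*cos(w*n) - (a*sin f)*sin(w*n): linear in x.
    A = np.hstack([np.cos(w * n), -np.sin(w * n)])
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    c, s = x[:len(freqs)], x[len(freqs):]
    # Magnitude and phase follow from the polar form of each (c, s) pair.
    return np.hypot(c, s), np.arctan2(s, c)
```

All sinusoids are fitted jointly, so two components close in frequency no longer corrupt each other's estimates the way they can in one-at-a-time subtraction.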


Related Work

  • Petre Stoica, Hongbin Li, and Jian Li. “Amplitude estimation of sinusoidal signals: Survey, new results, and an application”, 2000.

    • Mentions least-squares as one approach to estimate amplitude of complex exponentials

    • No discussion of phase estimation

  • Hing-Cheung So. “On linear least squares approach for phase estimation of real sinusoidal signals”, 2005.

    • Focuses on phase estimation

    • Theoretical analysis

  • Not applied specifically to audio signals


Constraints

  • Analytic least-squares solution frequently results in unrealistic magnitude values

    • This is possibly the result of errors in frequency estimates

  • Constraints on magnitudes were required

  • Ideal constraint:

  • Relaxed constraint:

  • Result is a constrained least squares problem that can be solved using a generic quadratic program (QP) solver
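
The slide's ideal and relaxed constraints are missing here. As a purely hypothetical stand-in, the sketch below bounds every element of x by ±`bound`, a linear (box) constraint that keeps the problem a QP. Projected gradient descent takes the place of a generic QP solver so that the example needs only NumPy. The function name and parameters are illustrative.

```python
import numpy as np

def box_constrained_least_squares(A, y, bound, iterations=5000):
    """Minimize ||A x - y||_2 subject to -bound <= x_i <= bound."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1 / largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(iterations):
        gradient = A.T @ (A @ x - y)           # gradient of 0.5 * ||A x - y||^2
        x = np.clip(x - step * gradient, -bound, bound)  # project onto the box
    return x
```

Note that bounding both components of a sinusoid by `bound` limits its magnitude to sqrt(2) * `bound`, so a box of this kind is looser than a direct bound on the magnitude.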


Final Formulation

  • Quadratic Program:

  • Magnitude and phase recovered from x as:
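
The quadratic program itself is missing from this slide. As a sketch under stated assumptions, let A be the matrix of fixed-frequency cosine and sine columns and x the stacked unknowns. Expanding the squared residual norm gives the standard QP form, where G and h are hypothetical placeholders for whatever linear constraints the relaxed formulation uses:

```latex
\begin{align*}
\lVert A x - y \rVert_2^2 &= x^{T} A^{T} A\, x - 2\, y^{T} A\, x + y^{T} y \\
\min_{x} \quad & \tfrac{1}{2}\, x^{T} P x + q^{T} x
  \qquad \text{with } P = 2 A^{T} A,\quad q = -2 A^{T} y \\
\text{subject to} \quad & G x \le h
\end{align*}
```

The constant term y^T y does not affect the minimizer and is dropped.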


Test Signals

  • Model test signals that reproduce challenging aspects of real-world signals

  • Reconstruct signal based on original model parameters and optimized parameters

  • Compare both reconstructions to original test signal and to each other


Test Signal 1: Overlapping Sinusoids

  • Signal consists of two sinusoids close in frequency

  • There is no additive noise, so the residual (the noise component of the model) should be zero
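
A test signal of this kind could be generated as below. The frequencies, sampling rate, and frame length are hypothetical, since the slides do not state the values used. They are chosen so that both sinusoids fall within one DFT bin width.

```python
import math

def two_close_sinusoids(num_samples, sample_rate, freq_hz, separation_hz):
    """Noise-free sum of two unit-magnitude sinusoids `separation_hz` apart."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate)
        + math.cos(2 * math.pi * (freq_hz + separation_hz) * n / sample_rate)
        for n in range(num_samples)
    ]

# Hypothetical values; the slides do not state the frequencies used.
sample_rate, frame_len = 44100, 2048
signal = two_close_sinusoids(frame_len, sample_rate, 1000.0, 10.0)
bin_spacing_hz = sample_rate / frame_len   # about 21.5 Hz, wider than the 10 Hz gap
```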


Results 1: Overlapping Sinusoids

  • Without optimization, there is significant energy left in the residual (very audible)

  • With optimization, the residual power at individual frequencies is reduced by as much as 50 dB (now barely audible)

  • The improvement with optimization generally decreases as the frequency separation is increased


Test Signal 2: Sudden Onset

  • A single sinusoid starts half-way through an analysis frame (the first half is silence)
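
A frame like this could be built as follows. The function name and the default magnitude are illustrative, and the frequency is given in radians per sample.

```python
import math

def sudden_onset_frame(frame_len, freq, magnitude=1.0):
    """Frame whose first half is silence and second half is a sinusoid.

    `freq` is in radians per sample.
    """
    half = frame_len // 2
    return [0.0] * half + [
        magnitude * math.cos(freq * n) for n in range(frame_len - half)
    ]
```

A stationary sinusoid spanning the whole frame cannot match such a signal exactly, so some error remains in both the original and the optimized reconstructions.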


Results 2: Sudden Onset

Original: MSE* = 2.76 × 10⁻⁵

Optimized: MSE* = 4.13 × 10⁻⁶

*MSE = mean squared error
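
To put the two figures in proportion, the short calculation below defines the MSE and compares the reported values. The helper name is illustrative. The ratio comes out at roughly 6.7, or a little over 8 dB.

```python
import math

def mean_squared_error(reference, estimate):
    """Average of the squared sample-by-sample differences."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

mse_original = 2.76e-5     # reported for the original analysis
mse_optimized = 4.13e-6    # reported after optimization
ratio = mse_original / mse_optimized
improvement_db = 10 * math.log10(ratio)
```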


Test Signal 3: Chirp

  • A single sinusoid with constant magnitude and continuously-increasing frequency
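
One way to generate such a signal is a linear sweep, sketched below. The slides say only that the frequency increases continuously, so the linear sweep and all parameter names are assumptions.

```python
import math

def linear_chirp(num_samples, sample_rate, start_hz, end_hz, magnitude=1.0):
    """Constant-magnitude sinusoid whose frequency rises linearly over time."""
    duration = num_samples / sample_rate
    rate = (end_hz - start_hz) / duration      # sweep rate in Hz per second
    samples = []
    for n in range(num_samples):
        t = n / sample_rate
        # Phase is the integral of the instantaneous frequency start_hz + rate * t.
        phase = 2 * math.pi * (start_hz * t + 0.5 * rate * t * t)
        samples.append(magnitude * math.cos(phase))
    return samples
```

Within any single frame the frequency is not constant, which violates the stationarity that a fixed-frequency fit assumes.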


Results 3: Chirp

  • Non-optimized peak magnitudes are close to constant between consecutive frames

  • Optimized peak magnitudes vary significantly from frame to frame

  • The optimization produces peak parameters that do not reflect the underlying real-world phenomenon.


Conclusions

  • Problem can be formulated using convex programming

  • For several classic challenging signals, optimization produces a more accurate model

    • Constraints are necessary to ensure parameter estimates reflect possible real-world phenomena

    • Final formulation is quadratic program

  • Parameters obtained via optimization may still not represent the underlying real-world phenomenon as well as the original analysis does (e.g., the chirp)


Future Work

  • Explore robust optimization techniques to compensate for errors in frequency estimates

  • Integrate optimization into original analysis instead of a post-processing stage

  • Experiment with more real-world signals

  • Further investigate constraints

  • The ultimate goal: three-way joint optimization of frequency, magnitude, and phase


References

  • M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 1.21. http://cvxr.com/cvx, May 2010.

  • R. McAulay and T. Quatieri. Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(4):744-754, Aug 1986.

  • Xavier Serra. A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition. PhD thesis, Stanford University, 1989.

  • Kevin M. Short and Ricardo A. Garcia. Accurate low-frequency magnitude and phase estimation in the presence of DC and near-DC aliasing. In Proceedings of the 121st Convention of the Audio Engineering Society, 2006.

  • Kevin M. Short and Ricardo A. Garcia. Signal analysis using the complex spectral phase evolution (CSPE) method. In Proceedings of the 120th Convention of the Audio Engineering Society, 2006.

  • Hing-Cheung So. On linear least squares approach for phase estimation of real sinusoidal signals. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E88-A(12):3654-3657, December 2005.

  • Petre Stoica, Hongbin Li, and Jian Li. Amplitude estimation of sinusoidal signals: Survey, new results, and an application. IEEE Transactions on Signal Processing, 48(2):338-352, 2000.


Thanks for your attention!

For further information:

http://ccrma.stanford.edu/~danielsm/ifors2011.html



Convex Reformulation

Define:

Change of variables:

Define:
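
The definitions on this slide are missing. One standard reformulation that fits the three headings above is given here as an assumption: the sign convention and the ordering of the entries of x are illustrative choices. With the frequencies w_k fixed, each sinusoid is linear in a_k cos f_k and a_k sin f_k:

```latex
\begin{align*}
a_k \cos(w_k n + f_k) &= a_k \cos f_k \cos(w_k n) - a_k \sin f_k \sin(w_k n) \\
x_k = a_k \cos f_k, \quad & x_{K+k} = a_k \sin f_k, \qquad k = 1, \dots, K \\
A_{n,k} = \cos(w_k n), \quad & A_{n,K+k} = -\sin(w_k n) \\
\hat{y} &= A x \\
a_k = \sqrt{x_k^2 + x_{K+k}^2}, \quad & f_k = \operatorname{atan2}(x_{K+k},\, x_k)
\end{align*}
```

Since the approximation is linear in x, minimizing the 2-norm of y - Ax is a least-squares problem and therefore convex.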


Test Signal: Sinusoid in noise

  • A single sinusoid with stationary frequency and corrupted by additive white Gaussian noise

  • Noise is present at all frequencies, including that of the sinusoid, corrupting magnitude and phase estimates

  • Test repeated using different variances for the noise (varying signal-to-noise ratios)
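
A signal of this kind, with the noise variance set from a target signal-to-noise ratio, could be generated as below. The function name, the seed, and the convention of defining SNR against the sinusoid's average power a²/2 are assumptions.

```python
import math
import random

def sinusoid_in_noise(num_samples, freq, snr_db, magnitude=1.0, seed=0):
    """Stationary sinusoid plus white Gaussian noise at a target SNR (in dB).

    `freq` is in radians per sample. Returns (clean, noisy, noise_variance).
    """
    rng = random.Random(seed)
    signal_power = magnitude ** 2 / 2          # average power of a*cos(.)
    noise_variance = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_variance)
    clean = [magnitude * math.cos(freq * n) for n in range(num_samples)]
    noisy = [c + rng.gauss(0.0, sigma) for c in clean]
    return clean, noisy, noise_variance
```

Comparing the energy of an analysis residual against `noise_variance` then shows how much noise energy the sinusoidal estimate has absorbed.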


Results: Sinusoid in noise

  • Without optimization, the sinusoid’s magnitude is over-estimated and the noise’s energy is under-estimated

  • The optimization gives residual energy slightly closer to the true noise energy.


Results: Overlapping Sinusoids

The optimization is able to compensate for some of the errors in initial magnitude and phase estimation, resulting in a lower MSE.

