# Convex Optimization in Sinusoidal Modeling for Audio Signal Processing




## Convex Optimization in Sinusoidal Modeling for Audio Signal Processing

Michelle Daniels

PhD Student, University of California, San Diego

### Outline

• Introduction to sinusoidal modeling

• Existing approach

• Proposed optimization post-processing

• Testing and results

• Conclusions

• Future work

### Analysis of Audio Signals

• Audio signals have rapid variations

• Speech

• Music

• Environmental sounds

• Assume minimal change over short segments (frames)

• Analyze on a frame-by-frame basis

• Constant-length frames (46 ms)

• Frames typically overlap

• Any audio signal can be represented as a sum of sinusoids (deterministic components) and noise (stochastic components)
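The frame-by-frame analysis described above can be sketched as follows. This is a minimal illustration rather than the presenter's code: only the 46 ms frame length comes from the slide, while the 44.1 kHz sample rate, the 50% overlap, and the function name `split_into_frames` are assumptions.

```python
import numpy as np

def split_into_frames(signal, sample_rate=44100, frame_ms=46.0, overlap=0.5):
    """Slice a 1-D signal into constant-length, overlapping frames."""
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))  # samples between frame starts
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop         # drop any partial last frame
    starts = hop * np.arange(n_frames)
    return np.stack([signal[s:s + frame_len] for s in starts])

# One second of placeholder audio, one frame per row.
frames = split_into_frames(np.random.randn(44100))
```

Each row of `frames` would then be analyzed independently, under the assumption that the signal changes little within one frame.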

### Sinusoidal Modeling of Audio Signals

• Given a signal y of length N, represent it as the sum of K component sinusoids plus noise e

• y and e are N-dimensional vectors

• Each sinusoid has frequency (w), magnitude (a), and phase (f) parameters

• K is determined during the analysis process

• Higher-resolution frequencies than DFT bins, no harmonic relationship required

• Model, encode, and/or process these components independently

• Applications:

• Effects processing (time-scale modification, pitch shifting)

• Audio compression

• Feature extraction for machine listening

• Auditory scene analysis
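For reference, the model stated at the top of this slide is usually written in the form below. The equation on the original slide did not survive extraction, so this is a reconstruction from the surrounding text: the sample index $n$ and the subscript $k$ are notational assumptions, and $\omega_k$ and $\varphi_k$ stand for the slide's w and f.

$$
y[n] = \sum_{k=1}^{K} a_k \cos(\omega_k n + \varphi_k) + e[n], \qquad n = 0, \dots, N-1
$$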

### Estimation Algorithm

• Using frequency-domain analysis (e.g. an FFT), iterate up to K times, or until the residual signal is small and/or has a flat spectrum:

• Identify the highest-magnitude sinusoid in the signal

• Estimate its frequency w

• Given w, estimate its magnitude a and phase f

• Reconstruct the sinusoid

• Subtract the reconstructed sinusoid to produce a residual signal

• After all sinusoids have been removed, the final residual contains only noise
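The loop above can be sketched roughly as follows. The slides do not say how each step is implemented, so every detail here is an assumption made for illustration: the Hann window, the zero-padding factor `pad`, reading magnitude and phase directly off the FFT peak, and the energy-only stopping rule (the flat-spectrum test is omitted).

```python
import numpy as np

def estimate_sinusoids(y, max_k=10, tol=1e-6, pad=8):
    """Greedy peak picking: returns [(w, a, f), ...] and the final residual."""
    y = np.asarray(y, dtype=float)
    n = np.arange(len(y))
    window = np.hanning(len(y))
    residual = y.copy()
    params = []
    for _ in range(max_k):
        if np.mean(residual ** 2) < tol:               # residual is small: stop
            break
        # Zero-padded FFT of the windowed residual (finer frequency grid).
        spectrum = np.fft.rfft(residual * window, pad * len(y))
        peak = np.argmax(np.abs(spectrum[1:-1])) + 1   # strongest bin, skipping DC/Nyquist
        w = 2.0 * np.pi * peak / (pad * len(y))        # frequency in radians per sample
        a = 2.0 * np.abs(spectrum[peak]) / window.sum()  # undo window gain
        f = np.angle(spectrum[peak])                   # phase at sample n = 0
        residual = residual - a * np.cos(w * n + f)    # subtract the reconstruction
        params.append((w, a, f))
    return params, residual
```

Because each estimate is taken from a single spectral peak, any interference at that peak carries straight into the subtraction, which is the weakness the next slide describes.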

### Estimation Challenges

• Energy in any DFT bin can come from:

• Multiple sinusoids with similar frequency

• Both sinusoids and noise

• Interference from other sinusoids and/or noise results in inaccurate estimates

• Incorrect estimation of a single sinusoid corrupts the residual signal and affects all subsequent estimates

### Possible Solution

• Optimize frequency, magnitude, and phase to minimize the energy in the residual signal

• The original parameter estimates are initial estimates for the optimization

• Sinusoidal approximation:

• Residual:

• Optimization problem:
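The three expressions on this slide were lost in extraction. A plausible reconstruction from the surrounding text is given below; it assumes the squared 2-norm as the measure of residual energy, and uses $\omega_k$ and $\varphi_k$ for the slide's w and f.

$$
\hat{y}[n] = \sum_{k=1}^{K} a_k \cos(\omega_k n + \varphi_k), \qquad
r = y - \hat{y}, \qquad
\min_{\omega,\, a,\, \varphi} \; \lVert y - \hat{y} \rVert_2^2
$$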

### Is it Convex?

• Want convexity so the problem is practical to solve

• Not a convex optimization problem because each element of ŷ is a sum of cosine functions of w and f

• Want an affine (linear) function of the optimization variables inside the 2-norm instead

• With fixed frequencies, can reformulate optimization of magnitudes and phases as convex problem

• Fix frequencies to initial estimates

### Convex Optimization Problem

Classic least-squares problem:

Magnitude and phase recovered as:
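A minimal sketch of this least-squares step, assuming the frequencies are already fixed. The function name `fit_magnitude_phase` and the particular sign convention are illustrative choices, not taken from the slides; the idea relies only on the identity a·cos(wn + f) = (a·cos f)·cos(wn) − (a·sin f)·sin(wn), which makes the model linear in the new unknowns.

```python
import numpy as np

def fit_magnitude_phase(y, freqs):
    """Least-squares magnitudes and phases for fixed frequencies (rad/sample)."""
    y = np.asarray(y, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    n = np.arange(len(y))[:, None]                    # column of sample indices
    # Columns: cos(w_k n) for each k, then -sin(w_k n) for each k.
    A = np.hstack([np.cos(n * freqs), -np.sin(n * freqs)])
    x, *_ = np.linalg.lstsq(A, y, rcond=None)         # minimizes ||A x - y||_2
    c, s = x[:len(freqs)], x[len(freqs):]             # c = a*cos(f), s = a*sin(f)
    return np.hypot(c, s), np.arctan2(s, c)           # magnitudes, phases

# Usage: two sinusoids that are close in frequency.
n = np.arange(512)
y = 1.0 * np.cos(0.50 * n + 0.4) + 0.6 * np.cos(0.52 * n - 1.2)
mags, phases = fit_magnitude_phase(y, [0.50, 0.52])
```

Solving for all sinusoids jointly is what lets the fit separate components whose spectral peaks overlap, provided the fixed frequencies are accurate.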

### Related Work

• Petre Stoica, Hongbin Li, and Jian Li. “Amplitude estimation of sinusoidal signals: Survey, new results, and an application”, 2000.

• Mentions least-squares as one approach to estimate amplitude of complex exponentials

• No discussion of phase estimation

• Hing-Cheung So. “On linear least squares approach for phase estimation of real sinusoidal signals”, 2005.

• Focuses on phase estimation

• Theoretical analysis

• Not applied specifically to audio signals

### Constraints

• Analytic least-squares solution frequently results in unrealistic magnitude values

• This is possibly the result of errors in frequency estimates

• Constraints on magnitudes were required

• Ideal constraint:

• Relaxed constraint:

• Result is a constrained least squares problem that can be solved using a generic quadratic program (QP) solver
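The constraint formulas on this slide were lost in extraction and are not reconstructed here. Purely as an illustration of a constrained least-squares problem of this kind, the sketch below assumes a symmetric box bound on every entry of x (which also caps each magnitude at √2 times the bound) and uses projected gradient descent as a stand-in for a generic QP solver. Both the bound and the solver are assumptions, not the presenter's formulation.

```python
import numpy as np

def box_constrained_lstsq(A, y, bound, iters=2000):
    """Minimize ||A x - y||^2 subject to -bound <= x_i <= bound for every i."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step size 1/L, where L = sigma_max(A)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)                      # gradient of 0.5 * ||A x - y||^2
        x = np.clip(x - step * grad, -bound, bound)   # project back onto the box
    return x
```

A dedicated QP solver would normally converge faster, especially when the columns of A are nearly collinear, as they are for sinusoids that are close in frequency.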

### Final Formulation

• Magnitude and phase recovered from x as:

### Test Signals

• Model test signals that reproduce challenging aspects of real-world signals

• Reconstruct signal based on original model parameters and optimized parameters

• Compare both reconstructions to original test signal and to each other

### Test Signal 1: Overlapping Sinusoids

• Signal consists of two sinusoids close in frequency

• There is no additive noise, so the residual (the noise component of the model) should be zero

### Results 1: Overlapping Sinusoids

• Without optimization, there is significant energy left in the residual (very audible)

• With optimization, the residual power at individual frequencies is reduced by as much as 50 dB (now barely audible)

• The improvement with optimization generally decreases as the frequency separation is increased

### Test Signal 2: Sudden Onset

• A single sinusoid starts half-way through an analysis frame (the first half is silence)

### Results 2: Sudden Onset

• Original: MSE = 2.76 × 10⁻⁵

• Optimized: MSE = 4.13 × 10⁻⁶

(MSE = mean squared error)

### Test Signal 3: Chirp

• A single sinusoid with constant magnitude and continuously-increasing frequency
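A test signal of this kind could be generated as follows. The slide only says the frequency increases continuously; the linear sweep, the function name `linear_chirp`, and its parameters are assumptions made for this sketch.

```python
import numpy as np

def linear_chirp(n_samples, w_start, w_end, magnitude=1.0):
    """Constant-magnitude sinusoid whose frequency rises linearly (rad/sample)."""
    n = np.arange(n_samples, dtype=float)
    # Phase is the integral of the instantaneous frequency
    # w(n) = w_start + (w_end - w_start) * n / n_samples.
    phase = w_start * n + 0.5 * (w_end - w_start) * n ** 2 / n_samples
    return magnitude * np.cos(phase)

chirp = linear_chirp(4096, 0.1, 0.5)
```

Within any single analysis frame the frequency is not constant, so a model built from fixed-frequency sinusoids can only approximate this signal.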

### Results 3: Chirp

• Non-optimized peak magnitudes are close to constant between consecutive frames

• Optimized peak magnitudes vary significantly from frame to frame

• The optimization produces peak parameters that do not reflect the underlying real-world phenomenon.

### Conclusions

• Problem can be formulated using convex programming

• For several classic challenging signals, optimization produces a more accurate model

• Constraints are necessary to ensure parameter estimates reflect possible real-world phenomena

• Final formulation is a quadratic program

• Parameters obtained via optimization may still not represent the underlying real-world phenomenon as well as the original analysis (e.g. the chirp)

### Future Work

• Explore robust optimization techniques to compensate for errors in frequency estimates

• Integrate optimization into original analysis instead of a post-processing stage

• Experiment with more real-world signals

• Further investigate constraints

• The ultimate goal: three-way joint optimization of frequency, magnitude, and phase

### References

• M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 1.21. http://cvxr.com/cvx, May 2010.

• R. McAulay and T. Quatieri. Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(4):744-754, Aug 1986.

• Xavier Serra. A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition. PhD thesis, Stanford University, 1989.

• Kevin M. Short and Ricardo A. Garcia. Accurate low-frequency magnitude and phase estimation in the presence of DC and near-DC aliasing. In Proceedings of the 121st Convention of the Audio Engineering Society, 2006.

• Kevin M. Short and Ricardo A. Garcia. Signal analysis using the complex spectral phase evolution (CSPE) method. In Proceedings of the 120th Convention of the Audio Engineering Society, 2006.

• Hing-Cheung So. On linear least squares approach for phase estimation of real sinusoidal signals. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E88-A(12):3654-3657, December 2005.

• Petre Stoica, Hongbin Li, and Jian Li. Amplitude estimation of sinusoidal signals: Survey, new results, and an application. IEEE Transactions on Signal Processing, 48(2):338-352, 2000.

For further information:

http://ccrma.stanford.edu/~danielsm/ifors2011.html

### Convex Reformulation

Define:

Change of variables:

Define:
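The definitions on this slide were lost in extraction. One standard way to carry out such a reformulation is sketched below; the symbols $c_k$, $s_k$, $A$, and $x$ and the sign convention are assumptions, and $\omega_k$, $\varphi_k$ stand for the slide's w and f. With the frequencies fixed, each sinusoid expands as

$$
a_k \cos(\omega_k n + \varphi_k) = c_k \cos(\omega_k n) - s_k \sin(\omega_k n),
\qquad c_k = a_k \cos\varphi_k, \quad s_k = a_k \sin\varphi_k .
$$

Stacking the unknowns as $x = [c_1, \dots, c_K, s_1, \dots, s_K]^T$ and defining the $N \times 2K$ matrix $A$ with entries $A_{n,k} = \cos(\omega_k n)$ and $A_{n,K+k} = -\sin(\omega_k n)$ gives $\hat{y} = Ax$, so the problem becomes

$$
\min_{x} \; \lVert Ax - y \rVert_2^2,
\qquad a_k = \sqrt{c_k^2 + s_k^2}, \quad \varphi_k = \operatorname{atan2}(s_k, c_k).
$$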

### Test Signal: Sinusoid in noise

• A single sinusoid with stationary frequency and corrupted by additive white Gaussian noise

• Noise is present at all frequencies, including that of the sinusoid, corrupting magnitude and phase estimates

• Test repeated using different variances for the noise (varying signal-to-noise ratios)
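A sketch of how such a test signal might be generated for a chosen signal-to-noise ratio. The function name `sinusoid_in_noise`, its parameters, and the fixed random seed are illustrative assumptions; the only fact relied on is that a sinusoid of magnitude a has average power a²/2.

```python
import numpy as np

def sinusoid_in_noise(n_samples, w, magnitude, snr_db, seed=0):
    """Stationary sinusoid plus white Gaussian noise at a target SNR in dB."""
    rng = np.random.default_rng(seed)
    n = np.arange(n_samples)
    clean = magnitude * np.cos(w * n)
    signal_power = magnitude ** 2 / 2.0                 # average power of the sinusoid
    noise_var = signal_power / 10.0 ** (snr_db / 10.0)  # variance giving the target SNR
    noise = rng.normal(0.0, np.sqrt(noise_var), n_samples)
    return clean + noise, clean, noise_var

noisy, clean, noise_var = sinusoid_in_noise(4096, 0.3, 1.0, snr_db=10.0)
```

Comparing the energy of a model's residual against `noise_var` indicates whether the noise energy is being under- or over-estimated.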

### Results: Sinusoid in noise

• Without optimization, the sinusoid’s magnitude is over-estimated and the noise’s energy is under-estimated

• The optimization gives residual energy slightly closer to the true noise energy.

### Results: Overlapping Sinusoids

The optimization is able to compensate for some of the errors in initial magnitude and phase estimation, resulting in a lower MSE.