Segmenting Popular Music Sentence by Sentence
Wan-chi Lee
Basic Idea
• In a song, the energy of the audio signal is low in the gaps between sentences.
• Try to detect these energy gaps.
• Problems:
  • Accompaniment sound is still present in the gaps.
  • The dynamic range of the audio signal varies a lot, so a threshold is hard to choose.
Methods
• Band-pass filter the signal.
  • Here I use a 6th-order elliptic filter with a pass band of 800 Hz to 1.6 kHz.
• Compute the average energy of the signal over a short sliding window.
  • I use a 0.1-second window.
• Detect the valleys of the average energy by piecewise linear approximation.
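The first two steps above can be sketched as follows. This is a minimal sketch, not the implementation behind the slides: the function name `bandpass_energy`, the 1 dB pass-band ripple, the 40 dB stop-band attenuation, the zero-phase filtering, and a hop equal to the window length are all assumptions, since the slides give only the filter order, the pass band, and the window length.

```python
import numpy as np
from scipy.signal import ellip, sosfiltfilt


def bandpass_energy(x, fs, low_hz=800.0, high_hz=1600.0,
                    win_s=0.1, hop_s=0.1, rp_db=1.0, rs_db=40.0):
    """Band-pass filter x, then return the mean energy of each window."""
    # With btype="bandpass", ellip(N, ...) returns a filter of order 2N,
    # so N=3 is one reading of "6th-order" (assumption).
    sos = ellip(3, rp_db, rs_db, [low_hz, high_hz],
                btype="bandpass", output="sos", fs=fs)
    # Zero-phase filtering keeps the envelope aligned in time (assumption).
    y = sosfiltfilt(sos, x)
    win = int(round(win_s * fs))
    hop = int(round(hop_s * fs))
    starts = range(0, len(y) - win + 1, hop)
    # Average energy = mean of the squared samples in each window.
    return np.array([np.mean(y[s:s + win] ** 2) for s in starts])
```

For example, `bandpass_energy(x, 44100)` on a mono signal `x` returns one energy value per 0.1 s; a smaller `hop_s` gives overlapping windows.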
Piecewise Linear Approximation
• I use a top-down method to determine the approximation.
• Specify an error bound.
• Find the segmentation point that best improves the approximation.
• Compute a linear regression for each segment as the approximation.
• If the error bound is not achieved, repeat the steps above.
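The top-down steps above can be sketched in plain Python. The slides do not say how the error is measured, so this sketch assumes the bound applies to each segment's sum of squared residuals; the names `fit_line`, `top_down_pla`, `max_sse`, and `min_len` are hypothetical.

```python
def fit_line(y, lo, hi):
    """Least-squares line over y[lo:hi]; returns (slope, intercept, sse)."""
    n = hi - lo
    xs = range(lo, hi)
    mx = sum(xs) / n
    my = sum(y[lo:hi]) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y[x] - my) for x in xs)
    slope = sxy / sxx if sxx else 0.0
    icpt = my - slope * mx
    sse = sum((y[x] - (slope * x + icpt)) ** 2 for x in xs)
    return slope, icpt, sse


def top_down_pla(y, max_sse, min_len=2):
    """Split [0, len(y)) until each segment's SSE is within max_sse
    or no over-bound segment is long enough to split."""
    segments = [(0, len(y))]
    while True:
        best = None  # (error reduction, segment index, split point)
        for i, (lo, hi) in enumerate(segments):
            err = fit_line(y, lo, hi)[2]
            if err <= max_sse:
                continue  # this segment already meets the bound
            for k in range(lo + min_len, hi - min_len + 1):
                gain = err - fit_line(y, lo, k)[2] - fit_line(y, k, hi)[2]
                if best is None or gain > best[0]:
                    best = (gain, i, k)
        if best is None:
            return segments
        _, i, k = best
        lo, hi = segments[i]
        segments[i:i + 1] = [(lo, k), (k, hi)]
```

Each pass adds the single split that reduces the error most, which is why the method is called top-down. This brute-force search refits every candidate split and is meant for illustration only.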
Segmentation Point
• After finding the linear approximation, choose the points that represent gaps in energy.
• Place some restrictions to keep the segments at a reasonable length.
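One possible way to pick the points is sketched below. The slides do not give the actual selection rule, so everything here is an assumption: the approximation is taken to be reduced to a sorted list of `(frame_index, approx_energy)` vertices, a gap is a vertex lower than both neighbours, and the only length restriction is a minimum spacing `min_len` in frames. The name `pick_gap_points` is hypothetical.

```python
def pick_gap_points(vertices, min_len):
    """Return frame indices of energy valleys at least min_len apart.

    vertices: list of (frame_index, approx_energy), sorted by frame index.
    """
    # A valley is an interior vertex lower than both of its neighbours.
    valleys = [
        (v, t)
        for (_, vp), (t, v), (_, vn)
        in zip(vertices, vertices[1:], vertices[2:])
        if v < vp and v < vn
    ]
    chosen = []
    # Visit the deepest valleys first, so that when two candidates are
    # too close together the lower-energy one is kept.
    for v, t in sorted(valleys):
        if all(abs(t - c) >= min_len for c in chosen):
            chosen.append(t)
    return sorted(chosen)
```

A maximum segment length could be enforced in a similar way, for instance by relaxing the valley test inside segments that come out too long.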
Demo and Discussion
• I used only one feature; other features can be incorporated.
• Heuristic method: no training needed, but many parameters to tune.
• It should be integrated with onset detection so that the segmentation points coincide with onsets.