This document provides an in-depth overview of Dynamic Programming (DP) and its application to Music Information Retrieval (MIR). It discusses the origin of DP, the concept of Dynamic Time Warping (DTW), and how these methods solve multi-stage decision processes that characterize music analysis. Specific applications include query by humming, tempo tracking, and rhythmic similarity measurement. Key references highlight the importance of segmentation and feature selection in these processes. Understanding these methods offers insights into effective audio analysis and alignment techniques in MIR.
Dynamic Programming Carmine Casciato MUMT 611 Thursday March 31st 2005
Overview • Problem Space • Origins • Dynamic Time Warping • Overview of Usage in MIR
Problem Space • Multi-stage decision processes • system S characterized as evolution of vector p • N stages * M decisions/stage • multi-dimensional maximization problems
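A minimal sketch of why the "N stages * M decisions/stage" structure matters, assuming a hypothetical payoff function `gain(k, prev, d)` that depends only on the stage index, the previous decision, and the current decision (this form is an assumption for illustration, not taken from the slides). Exhaustive search visits M**N decision sequences, while the stage-by-stage recursion needs roughly N * M**2 evaluations and should reach the same maximum.

```python
from itertools import product

def best_by_enumeration(gain, n_stages, n_decisions):
    """Try every one of the M**N decision sequences and keep the best total."""
    best = float("-inf")
    for seq in product(range(n_decisions), repeat=n_stages):
        total, prev = 0.0, None
        for k, d in enumerate(seq):
            total += gain(k, prev, d)  # payoff of decision d at stage k
            prev = d
        best = max(best, total)
    return best

def best_by_dp(gain, n_stages, n_decisions):
    """Same maximum, keeping only the best value per most recent decision.

    Assumes n_stages >= 1.
    """
    value = [gain(0, None, d) for d in range(n_decisions)]
    for k in range(1, n_stages):
        value = [
            max(value[p] + gain(k, p, d) for p in range(n_decisions))
            for d in range(n_decisions)
        ]
    return max(value)

# Hypothetical payoff: later stages and higher decisions pay more,
# but changing decision between stages is penalized.
def example_gain(k, prev, d):
    penalty = 0 if prev is None else 2 * abs(d - prev)
    return (d + 1) * (k + 1) - penalty
```

Calling both functions with `example_gain`, 4 stages, and 3 decisions per stage is a quick way to check that they agree on small inputs before trusting the faster one on larger ones.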
Origins • Dynamic Programming (DP) “…the optimal decision to be made at any state of the system.” Bellman (1957) • “Dynamic” refers to the temporal nature of S • Each decision is determined by the max/min cost of the previous state • Allocation problem: x = y + (x − y) • $f_N(x) = \max_{0 \le y \le x} \left[ g(y) + h(x-y) + f_{N-1}\big(ay + b(x-y)\big) \right]$ (with $\min$ in place of $\max$ for cost problems)
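The allocation recursion on this slide can be sketched in a few lines of Python. This is an illustrative discretization, not Bellman's own formulation: the resource x is assumed to be a whole number of units, the leftover quantity ay + b(x-y) is rounded down to an integer so that results can be memoized, and f_0 is assumed to be 0. The function name `allocate` and its parameters are hypothetical.

```python
import math
from functools import lru_cache

def allocate(x, stages, g, h, a, b):
    """Maximum total return over `stages` stages, starting with x units.

    At each stage, y units go to activity g and x - y units to activity h;
    floor(a*y + b*(x - y)) units remain for the next stage (assumption:
    integer units, rounding down).
    """
    @lru_cache(maxsize=None)
    def f(n, units):
        if n == 0:
            return 0.0  # assumption: no return once all stages are used
        best = -math.inf
        for y in range(units + 1):  # 0 <= y <= units
            remaining = math.floor(a * y + b * (units - y))
            best = max(best, g(y) + h(units - y) + f(n - 1, remaining))
        return best

    return f(stages, x)

# Hypothetical returns: g pays 2 per unit but consumes it (a = 0),
# h pays 1 per unit and keeps it for the next stage (b = 1).
total = allocate(4, 3, g=lambda y: 2 * y, h=lambda z: z, a=0.0, b=1.0)
```

Memoizing on the pair (stages left, units left) is what turns the recursion into dynamic programming: each subproblem is solved once, however many decision sequences lead to it.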
Rabiner and Huang 1993 • Dynamic Time Warping (DTW) as solution for time-alignment and normalization of two utterances • (Dis)similarity measurement of two vectors of short-time spectral features is equal to “best” path through feature grid
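The "best path through the feature grid" idea can be sketched as follows. This is a textbook-style DTW with the three basic steps (vertical, horizontal, diagonal) and Euclidean local distance, not necessarily the exact variant in Rabiner and Huang; the function name `dtw` and the tuple-per-frame input format are assumptions made for the example.

```python
import math

def dtw(a, b):
    """Return (dissimilarity, path) for two sequences of feature vectors.

    a and b are lists of equal-dimension tuples, one tuple per frame.
    path lists the aligned (index in a, index in b) pairs.
    """
    n, m = len(a), len(b)
    # D[i][j]: cost of the best path aligning a[:i] with b[:j]
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # local frame distance
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])

    # Backtrack from the end of the grid to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
    path.reverse()
    return D[n][m], path

# The second sequence repeats its middle frame; DTW absorbs the repetition.
slow = [(0.0,), (1.0,), (1.0,), (2.0,)]
fast = [(0.0,), (1.0,), (2.0,)]
distance, alignment = dtw(fast, slow)
```

Here the repeated frame in `slow` is matched to the same frame of `fast`, which is the time normalization the slide describes.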
DTW • Path Constraints • endpoint • monotonicity • local path constraints • global path constraints • slope weighting • locally and globally • Dissimilarity metric, constraints, and weightings are all heuristically determined
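A rough sketch of how some of these constraints might enter the recursion. The specific choices here are assumptions for illustration: a fixed-width band around the diagonal (in the style of a Sakoe-Chiba band) as the global constraint, one weight per step direction as the slope weighting, and absolute difference on scalar features as the dissimilarity metric. The name `constrained_dtw` and its parameters are hypothetical.

```python
import math

def constrained_dtw(a, b, band=None, weights=(1.0, 1.0, 1.0)):
    """DTW cost between two scalar sequences under simple path constraints.

    band:    maximum allowed |i - j| (global constraint); None disables it.
    weights: (vertical, horizontal, diagonal) slope weights applied to the
             local distance of each step.
    """
    n, m = len(a), len(b)
    w_v, w_h, w_d = weights
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0  # endpoint constraint: the path starts at the first frames
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if band is not None and abs(i - j) > band:
                continue  # cell lies outside the global band
            d = abs(a[i - 1] - b[j - 1])
            # Monotonicity: only steps that never move backwards in time.
            D[i][j] = min(
                D[i - 1][j] + w_v * d,
                D[i][j - 1] + w_h * d,
                D[i - 1][j - 1] + w_d * d,
            )
    return D[n][m]  # endpoint constraint: the path ends at the last frames
```

If the band is too narrow for the difference in sequence lengths, no admissible path exists and the function returns infinity, which illustrates why such settings tend to be tuned heuristically.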
Paulus and Klapuri 2002 • Adapts the DTW of Rabiner and Huang (1993) to rhythmic similarity • Depends on correct segmentation of rhythms from the audio signal • Finds the optimal path between feature vectors of loudness and spectral centroid
Usage in MIR • Query by humming • Heo et al. 2003 • Adams et al. 2004 • Nishimura et al. 2001 • Tempo tracking • Raphael 2002 • Feature selection • Chang 1972
References • Adams, N., M. Bartsch, J. Shifrin, and G. Wakefield. 2004. Time series alignment for music information retrieval. In Proceedings of the International Conference on Music Information Retrieval: 303–10. • Bellman, R. 1957. Dynamic Programming. Princeton: Princeton University Press. • Chang, C. 1972. Dynamic programming as applied to feature subset selection in a pattern recognition system. In Proceedings of the ACM annual conference 1: 94–103. • Guo, A., and H. Siegelman. 2004. Time-warped longest common subsequence algorithm for music retrieval. In Proceedings of the International Conference on Music Information Retrieval: 258–61. • Heo, S., M. Suzuki, A. Ito, and S. Makino. 2003. Three dimensional continuous DP algorithm for multiple pitch candidates in music information retrieval system. In Proceedings of the International Conference on Music Information Retrieval. • Nishimura, T., H. Hashiguchi, J. Takita, J. Zhang, M. Goto, and R. Oka. 2001. Music signal spotting retrieval by a humming query using start frame feature dependent continuous dynamic programming. In Proceedings of the International Conference on Music Information Retrieval. • Paulus, J., and A. Klapuri. 2002. Measuring the similarity of rhythmic patterns. In Proceedings of the International Conference on Music Information Retrieval. • Raphael, C. 2002. A hybrid graphical model for rhythmic parsing. Artificial Intelligence 137: 217–38.