
Tree structured and combined methods for comparing metered polyphonic music

Kjell Lemström, David Rizo Valero, José Manuel Iñesta. CMMR’08, May 21, 2008.


Presentation Transcript


  1. Tree structured and combined methods for comparing metered polyphonic music. Kjell Lemström, David Rizo Valero, José Manuel Iñesta. CMMR’08, May 21, 2008

  2. Outline
     • Objectives
     • State of the art
     • Tree representation of monodies and polyphonic songs
     • Comparison of trees for obtaining similarities between songs
     • Geometric methods
     • Combination of methods
     • Experiments
     • Conclusions and work lines

  3. Melodic comparison (symbolic)
     • Given the sequences of notes in the scores … are those tunes the same?

  4. Target • Polyphonic music comparison of whole songs

  5. Approaches to polyphonic comparison
     • Convert into monophonic
        • Use sequence comparison
     • Adapted text retrieval methods
        • PROMS: Clausen et al. ’00
        • Doraisamy and Rüger ’04: n-grams
     • Geometric methods
        • Lubiw and Tanur ’04
        • Ukkonen, Lemström and Mäkinen ’03 + CMMR’08 Session: MUSR: Music Retrieval papers

  6. Tree representation for monodies
     [Figure: a 4/4 bar subdivided level by level: whole (4 beats), half (2+2), quarter (4×1), eighth (8×½); notes C, E, G, F plotted by initial time and duration]
     Tree construction process (Rizo et al. ’03)
     • Based on the logarithmic nature of music notation
     • Each tree level is a subdivision of the level above
     • Leaf labels can be any pitch magnitude
     • Rests are coded the same way as notes
     • Duration is implicitly coded in the tree structure
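The subdivision idea on this slide can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the construction rules of Rizo et al.: the bar is split strictly in halves, every note or rest is given as a hypothetical `(onset, duration, label)` tuple measured in beats, and the names `build_tree` and `MIN_LENGTH` are invented for the example.

```python
from fractions import Fraction

MIN_LENGTH = Fraction(1, 16)  # hypothetical resolution limit, in beats


def build_tree(notes, start, length):
    """Return a leaf label if one note fills [start, start+length), else split in two."""
    for onset, duration, label in notes:
        if onset <= start and onset + duration >= start + length:
            return label  # leaf: its duration is implied by its depth in the tree
    if length <= MIN_LENGTH:
        raise ValueError("no note or rest covers this span")
    half = length / 2
    return [build_tree(notes, start, half),
            build_tree(notes, start + half, half)]


# A made-up 4/4 bar: C (half note), E (quarter), G and F (eighths).
bar = [
    (Fraction(0), Fraction(2), "C"),
    (Fraction(2), Fraction(1), "E"),
    (Fraction(3), Fraction(1, 2), "G"),
    (Fraction(7, 2), Fraction(1, 2), "F"),
]
tree = build_tree(bar, Fraction(0), Fraction(4))
print(tree)  # ['C', ['E', ['G', 'F']]]
```

No duration is stored anywhere in the result: the half note sits one level below the root, the eighth notes three levels below. Notes that cross a subdivision boundary would simply be repeated in both halves here, which is a shortcut of this sketch.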

  7. Tree representation: representation of whole melodies
     • The complete melody (all bars) is a forest (all trees)
     • Bars can be grouped sequentially or hierarchically
     [Figure: bar trees with leaf labels C, E, G, F, A, B, C, G, shown with sequential grouping]

  8. Polyphonic tree representation
     • Process repeated for each voice: replace single labels with sets
     • Propagate from the bottom using set union
     • Actually, the interval from the tonic is represented in the tree, using tree tonality guessing (Rizo et al. ’06)
     [Figure: example tree labelled with the pitch sets {C,E,F,G}, {C,F,G}, {C,G,E}, {C,G}, {G}, {C}, {F} and the corresponding interval sets {0,4,5,7}, {0,5,7}, {0,7}, {0}, {5}]

  9. Polyphonic tree representation
     • Better tree summarization: use duration importance (rhythmic weights) → multiset
     • Rhythmic weight = 2^(h−l), where h = tree height and l = node level
     • Example: l = 1: {C=3,E=2,F=1,G=4}; l = 2: {C=2,E=2,G=2} and {C=1,F=1,G=2}; l = 3: {C=1} and {F=1}
     • Using the Krumhansl-Schmuckler profiles along with the rhythmic weights was also tested: worse results
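The weighted propagation can be illustrated with a small Python sketch. The node layout is an assumption made for the example: each node is a hypothetical `(labels, children)` pair, where `labels` holds the pitches that sound for the whole span of that node, and the tree shape is a guess chosen so that the multisets match the example on the slide.

```python
from collections import Counter


def height(node):
    """Number of levels in the tree, with the root at level 1."""
    _, children = node
    return 1 + max((height(child) for child in children), default=0)


def multiset(node, h, level=1):
    """Weight each label by 2**(h - level) and add up the children's multisets."""
    labels, children = node
    result = Counter({pitch: 2 ** (h - level) for pitch in labels})
    for child in children:
        result += multiset(child, h, level + 1)
    return result


# Assumed tree shape: a chord on the first half of the bar; on the second half
# one voice holds G while another plays C then F.
left = ({"C", "E", "G"}, [])
right = ({"G"}, [({"C"}, []), ({"F"}, [])])
root = (set(), [left, right])

h = height(root)
print(multiset(right, h, level=2))  # C=1, F=1, G=2
print(multiset(root, h))            # C=3, E=2, F=1, G=4
```

With h = 3, a label at level 2 counts twice as much as one at level 3, so longer notes dominate the summary stored at the root.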

  10. Comparing songs
     • Compare songs = compare trees
     • Approaches
        • Classical tree edit distances
           • Shasha
           • Selkow
        • Use only the information of the roots
           • Sequence edit distance
           • Longest Common Subsequence

  11. Tree comparison
     • Use only the information in the roots
     • Roots contain the summary of their children after propagation
     [Figure: the forest for Bar 1, Bar 2, Bar 3, Bar 4, …, Bar N with the labels of the root of each tree, e.g. {C=0.3, E=0.2, G=0.5}, {F=1, G=1, A=1, B=0.2}, {C=0.6, F=0.2}, {C=0.3, E=0.1, G=1}]
     • RootED and LCRS:
        • Let l be a tree level of tree T; compose a sequence S_l(T) with all nodes at that level in the forest
        • RootED and LCRS use l = 1
        • Distance between two songs A and B at levels l_a and l_b:
          d(A, B, l_a, l_b) = stringDistance(S_{l_a}(A), S_{l_b}(B))
          or
          d(A, B, l_a, l_b) = LCS(S_{l_a}(A), S_{l_b}(B))
     • Complexity with l = 1: O(|barsA| × |barsB|)
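A root-level comparison of this kind might look like the following Python sketch. It is a standard dynamic-programming edit distance over the two sequences of bar roots, which gives the O(|barsA| × |barsB|) cost quoted above. The function name `root_edit_distance`, the unit insertion/deletion cost and the 0/1 substitution cost used in the example are all assumptions; the slides' own multiset substitution cost would be passed in as `sub_cost`.

```python
def root_edit_distance(roots_a, roots_b, sub_cost, indel_cost=1.0):
    """Edit distance between two sequences of root labels (one label per bar)."""
    n, m = len(roots_a), len(roots_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,                    # delete a bar of A
                d[i][j - 1] + indel_cost,                    # insert a bar of B
                d[i - 1][j - 1] + sub_cost(roots_a[i - 1], roots_b[j - 1]),
            )
    return d[n][m]


# Placeholder cost (an assumption): 0 for identical multisets, 1 otherwise.
def same_or_not(x, y):
    return 0.0 if x == y else 1.0


song_a = [{"C": 1}, {"G": 1}, {"F": 1}]
song_b = [{"C": 1}, {"F": 1}]
print(root_edit_distance(song_a, song_b, same_or_not))  # 1.0
```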

  12. Multiset substitution cost
     • Define a multiset as a vector:
        • Index = interval from tonic
        • Value = cardinality
        • E.g. {C=1, G=4, B=2} is defined as [1,0,0,0,0,0,0,4,0,0,0,2]
     • Multiset substitution cost between multisets X and Y represented by vectors v and w
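The vector encoding can be written down directly. In this Python sketch the names `PITCH_CLASS` and `multiset_to_vector` are invented, and the tonic is assumed to be C, which is what the example on the slide implies.

```python
PITCH_CLASS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
               "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def multiset_to_vector(multiset, tonic="C"):
    """12-element vector: index = interval from the tonic, value = cardinality."""
    vector = [0] * 12
    for pitch, count in multiset.items():
        interval = (PITCH_CLASS[pitch] - PITCH_CLASS[tonic]) % 12
        vector[interval] += count
    return vector


print(multiset_to_vector({"C": 1, "G": 4, "B": 2}))
# [1, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 2]
```

Because the index is relative to the tonic, the same song transposed to another key maps to the same vector once the tonic is identified.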

  13. Graphical representations
     • P1, P2, P3 algorithms from Ukkonen, Lemström, Mäkinen ’03
     • P2v5, P2v6: indexed versions of P2
        • Not published yet

  14. Method combination
     • Dissimilarity measure for a method = distance between songs
     • Combined dissimilarity measure = combination of distances between songs
     • Combination = sum of normalized distances
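One way to read "sum of normalized distances" is sketched below in Python. The slides do not say how the distances are normalized, so the min-max scaling used here is an assumption, as are the function names `normalize` and `combine`.

```python
def normalize(distances):
    """Min-max scale one method's distances to [0, 1] (assumed normalization)."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [0.0] * len(distances)
    return [(d - lo) / (hi - lo) for d in distances]


def combine(per_method_distances):
    """Sum the normalized distances of all methods, candidate by candidate."""
    normalized = [normalize(d) for d in per_method_distances]
    return [sum(column) for column in zip(*normalized)]


# Distances from one query to three candidate songs under two methods.
tree_method = [0.0, 5.0, 10.0]
geometric_method = [2.0, 2.0, 4.0]
print(combine([tree_method, geometric_method]))  # [0.0, 0.5, 2.0]
```

Scaling first keeps a method with a large numeric range from dominating the sum.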

  15. Experiments
     • Corpora:
        • ICPS (68 files):
           • 7 different polyphonic incipits: Schubert’s Ave Maria, Ravel’s Bolero, Alouette, Happy Birthday, Frère Jacques, Jingle Bells, When The Saints Go Marching In
           • Covers made up of polyphonic piano files + “Band in a box” variations
        • VAR (78 files):
           • Bach Goldberg Variations
           • Bach English Suites variations
           • Some Tchaikovsky variations

  16. Evaluation method
     • Leave one out
        • All-against-all: each song S is compared with the rest of the songs; the result is an ordered list with the most similar songs first
     • Accuracy
        • Top-recognition-rate (TRRn): percentage of queries for which a version of the song S appears among the top n slots
           • Success rate = TRR0
        • Precision-at-|class|
           • |class| = number of versions of the same song
     • Times
        • Exclude preprocessing times: preprocessing is only performed once, at system startup
     • Averages: all results are averages over all queries
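The two accuracy measures can be sketched in Python as follows. Several details are assumptions: `rankings` maps each query to its ordered result list with the query itself left out, `class_of` maps each file to the song it is a version of, `n` counts slots starting from one (the slide's TRR0 suggests its own indexing may differ), and |class| is taken to exclude the query.

```python
def top_recognition_rate(rankings, class_of, n):
    """Percentage of queries with at least one same-song version in the top n slots."""
    hits = sum(
        1 for query, ranked in rankings.items()
        if any(class_of[s] == class_of[query] for s in ranked[:n])
    )
    return 100.0 * hits / len(rankings)


def precision_at_class(rankings, class_of):
    """Average fraction of same-song versions among the top |class| results."""
    total = 0.0
    for query, ranked in rankings.items():
        k = sum(1 for s in class_of if s != query and class_of[s] == class_of[query])
        if k == 0:
            continue  # a song with no other version contributes nothing
        relevant = sum(1 for s in ranked[:k] if class_of[s] == class_of[query])
        total += relevant / k
    return total / len(rankings)


# Made-up example: three versions of song A and two of song B.
class_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
rankings = {
    "a1": ["a2", "b1", "a3", "b2"],
    "a2": ["a1", "a3", "b1", "b2"],
    "a3": ["b1", "b2", "a1", "a2"],
    "b1": ["b2", "a1", "a2", "a3"],
    "b2": ["a1", "b1", "a2", "a3"],
}
print(top_recognition_rate(rankings, class_of, 1))  # 60.0
print(top_recognition_rate(rankings, class_of, 2))  # 80.0
print(precision_at_class(rankings, class_of))       # 0.5
```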

  17. Results: ICPS
     [Figures: success rate of the combined method; time and success rate]

  18. Results: VAR
     • Combined method: again success rate
     [Figure: success rate]

  19. Top-recognition-rate: ICPS Combined method gets a good result

  20. Top-recognition-rate: VAR Combined method is the best one: combined methods are more robust

  21. Conclusions and work lines
     • Very hard task when MIDI files are real ones
     • Preprocess songs: use automatic tonal analysis + tree propagation to remove non-important notes in songs
     • Improve results by combining more different classifiers
     • Tune the tree comparison measures: submitted
     • Add LCS fast implementation from Hyyrö ’04
     • Add confidence values to LCS
     • Include meter extraction methods to build the trees
     [Figure labels: Query, MIDI]

  22. END

  23. Melody = sequence of notes
     • String representation + string distances (Mongeau and Sankoff ’90, Lemström 2000)
        • Example strings: GGAGCBGG and GAGAGGCBB
     • Symbols are combinations of pitch × rhythm
        • Pitch can be: absolute pitch, pitch class, interval from tonic, interval, contour, high-def contour, nothing
        • Rhythm can be: absolute, inter-onset interval, inter-onset ratio, contour, nothing
        • E.g.: (G4,8)(G4,8)(A4,4)(G4,4)(C5,4)(B4,2)(G4,8)(G4,8)
     • Best comparison results using intervals with no rhythm information
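The interval encoding with no rhythm information can be sketched in Python. The sketch assumes note names with a single-digit octave, optional `#` or `b` accidentals, and the common MIDI convention in which C4 = 60; the helper names `midi` and `intervals` are invented.

```python
SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi(name):
    """Convert a note name such as 'G4' or 'C#5' to a MIDI number (C4 = 60)."""
    letter, octave = name[0], int(name[-1])
    accidental = name[1:-1].count("#") - name[1:-1].count("b")
    return 12 * (octave + 1) + SEMITONE[letter] + accidental


def intervals(notes):
    """Drop the rhythm and keep only the pitch interval between successive notes."""
    pitches = [midi(pitch) for pitch, _ in notes]
    return [b - a for a, b in zip(pitches, pitches[1:])]


# The (pitch, figure) example from the slide.
melody = [("G4", 8), ("G4", 8), ("A4", 4), ("G4", 4),
          ("C5", 4), ("B4", 2), ("G4", 8), ("G4", 8)]
print(intervals(melody))  # [0, 2, -2, 5, -1, -4, 0]

# The same tune a whole tone higher and with other durations gives the same string.
transposed = [("A4", 4), ("A4", 4), ("B4", 2), ("A4", 2),
              ("D5", 2), ("C#5", 1), ("A4", 4), ("A4", 4)]
print(intervals(transposed) == intervals(melody))  # True
```

The encoding is invariant to transposition and to rhythm changes, which is also why two tunes with the same intervals and different rhythms cannot be told apart.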

  24. String distances
     • Drawbacks of the comparison without rhythm
     • Wrong results with:
        • Same melodic distance and different rhythm: edit distance
        • Too many ornament notes: edit distance
     [Figure: score examples labelled Hungarian dance, Schubert, Godfather theme]

  25. Tree representation: tree construction process
     • Rules (Rizo et al., 2003)
     • Propagation and pruning
     [Figure: tree as initially coded from the score, with labels s, F, F, A, G, and the tree after pruning with a maximum pruning level defined]

  26. Tree representation: melodic similarity metrics
     • Tree edit distance (Zhang & Shasha, 1989)
        • The distance d(t1, t2) is computed as the cost of the operations needed to transform one tree, t1, into the other, t2
        • Weighted operations: insertion, deletion, replacement
        • Complexity of the tree edit distance: O(|T1| × |T2| × h(T1) × h(T2))
        • The previous pruning process helps to overcome this complexity
     [Figure: two example trees t1 and t2 with leaf labels C, G and A]
     (Zhang & Shasha, “Simple fast algorithms for the editing distance between trees...”. SIAM J. Comput., 18(6): 1245-1262. 1989)
