A Probabilistic Model for Melody Segmentation

  1. A Probabilistic Model for Melody Segmentation By Miguel Ferrand, Peter Nelson, and Geraint Wiggins

  2. Outline • Overview of the model • N-gram models and entropy • A case study • Comparison with an experiment on real listeners • Discussion

  3. Overview • A probabilistic approach to predicting segmentation boundaries in melodies • The model uses no knowledge of music theory; it is a purely mathematical method • Entropy is used as a measure of the unpredictability of musical features • Hypothesis: segmentation boundaries appear where the entropy changes

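  4. N-gram Models (1) • N-gram grammar (a Markov model of order n-1): the probability of a symbol depends on the n-1 symbols that precede it. • The probability of a sequence $s = w_1 \ldots w_l$ of length $l$ is $P(s) = \prod_{i=1}^{l} P(w_i \mid w_{i-n+1}^{i-1})$, where $w_i^j$ denotes the subsequence $w_i \ldots w_j$ and $n$ is the order of the model.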
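The product above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes plain maximum-likelihood counts, and it starts the product at the first symbol that has a full context instead of padding the start of the sequence. The function names and the toy training sequence are hypothetical.

```python
from collections import Counter

def ngram_counts(seq, n):
    """Count every n-gram in seq and the (n-1)-symbol context that starts it."""
    grams = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    contexts = Counter(tuple(seq[i:i + n - 1]) for i in range(len(seq) - n + 1))
    return grams, contexts

def sequence_probability(seq, grams, contexts, n):
    """Multiply P(w_i | previous n-1 symbols) over seq (assumption: no padding,
    so the first n-1 symbols contribute no factor)."""
    prob = 1.0
    for i in range(n - 1, len(seq)):
        gram = tuple(seq[i - n + 1:i + 1])
        ctx = gram[:-1]
        if contexts[ctx] == 0:
            return 0.0  # unseen context: the unsmoothed estimate is zero
        prob *= grams[gram] / contexts[ctx]
    return prob

# Toy symbol sequence (hypothetical data), modelled with bi-grams (n = 2).
training = [0, 1, 0, 1, 0, 0, 1]
grams, contexts = ngram_counts(training, 2)
p = sequence_probability([0, 1, 0], grams, contexts, 2)
print(p)
```

Any n-gram that never occurs in the training data drives the whole product to zero, which is the sparseness problem that smoothing is meant to address.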

  5. N-gram Models (2) • Problems: • Data sparseness: some $P(w_i \mid \ldots) = 0$ • Longer sequences have lower counts when the training corpus is small • Remedy: linear interpolation smoothing. For a tri-gram model, for example, the smoothed estimate is $\hat{P}(w_k \mid w_{k-2}, w_{k-1}) = \lambda_1 P(w_k) + \lambda_2 P(w_k \mid w_{k-1}) + \lambda_3 P(w_k \mid w_{k-2}, w_{k-1})$, where $\lambda_1 + \lambda_2 + \lambda_3 = 1$ and $\lambda_1 < \lambda_2 < \lambda_3$
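As a rough illustration of linear interpolation for a tri-gram model, the sketch below mixes uni-, bi- and tri-gram maximum-likelihood estimates. The weights (0.1, 0.3, 0.6) are an arbitrary assumption chosen only to satisfy the two constraints on the λ values; the slides do not say which values were used. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

def train_trigram(seq):
    """Collect the counts needed for uni-, bi- and tri-gram estimates."""
    return {
        "uni": Counter(seq),
        "bi": Counter(zip(seq, seq[1:])),
        "tri": Counter(zip(seq, seq[1:], seq[2:])),
        "bi_ctx": Counter(seq[:-1]),                    # symbols that have a successor
        "tri_ctx": Counter(zip(seq[:-2], seq[1:-1])),   # pairs that have a successor
        "total": len(seq),
    }

def ratio(num, den):
    """Relative frequency, defined as 0 when the context was never seen."""
    return num / den if den else 0.0

def interpolated_prob(w, prev2, prev1, m, lambdas=(0.1, 0.3, 0.6)):
    """Smoothed P(w | prev2, prev1); the lambdas are illustrative assumptions."""
    l1, l2, l3 = lambdas
    p_uni = ratio(m["uni"][w], m["total"])
    p_bi = ratio(m["bi"][(prev1, w)], m["bi_ctx"][prev1])
    p_tri = ratio(m["tri"][(prev2, prev1, w)], m["tri_ctx"][(prev2, prev1)])
    return l1 * p_uni + l2 * p_bi + l3 * p_tri

model = train_trigram(list("abcabcabd"))
# The tri-gram (c, a, d) never occurs, yet its smoothed probability is non-zero.
p_unseen = interpolated_prob("d", "c", "a", model)
# For a context that was seen, the smoothed estimates still sum to one.
p_total = sum(interpolated_prob(w, "c", "a", model) for w in "abcd")
print(p_unseen, p_total)
```

Giving the largest weight to the tri-gram term reflects the ordering λ1 < λ2 < λ3: the most specific context is trusted most, and the shorter contexts serve as a fallback.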

  6. Entropy • For an N-gram model M, the entropy associated with a context c is $H_c(M) = -\sum_{e} P(e \mid c) \log P(e \mid c)$, where e ranges over all possible successor symbols of c and $P(e \mid c)$ is estimated with linear interpolation smoothing. • Low entropy usually means high predictability.
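The entropy of a context can be sketched in a few lines of Python. The function takes the successor distribution P(e | c) as a dictionary; using base-2 logarithms (entropy in bits) is an assumption, as are the function name and the example distributions.

```python
import math

def context_entropy(successor_probs):
    """Entropy (in bits, an assumption) of a distribution {symbol: P(symbol | context)}."""
    return -sum(p * math.log2(p) for p in successor_probs.values() if p > 0)

# A context after which four successors are equally likely: hard to predict.
uniform = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
# A context after which one successor dominates: easier to predict.
peaked = {"a": 0.5, "b": 0.25, "c": 0.25}

h_uniform = context_entropy(uniform)
h_peaked = context_entropy(peaked)
print(h_uniform, h_peaked)
```

The uniform distribution gives the higher value, which matches the reading of low entropy as high predictability.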

  7. A case study (1) • Deliège’s experiment • Subjects listened to a melody and had to identify segmentation points in real time. (The melody was the solo for English horn from Wagner’s Tristan und Isolde.) • Subjects included both musically trained and untrained listeners. • 8 main segment boundaries were found

  8. A case study (2) • The melody is translated into an event-based representation • Pitch Step (PS): interval to the following event, in semitones • Pitch Contour (PC): the sign of PS, one of {-1, 0, +1} • Duration Ratio (DR): ratio between the durations of the present and the following event • Duration Contour (DC): the direction of the duration change; -1 if DR > 1; +1 if DR < 1; 0 if DR = 1
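The four features can be sketched as follows. Two assumptions are made that the slides do not state explicitly: pitches are given as MIDI note numbers, and DR is the duration of the present event divided by that of the following one (which makes DC = -1 mean "the next note is shorter"). The function names and the four-note example are hypothetical.

```python
from fractions import Fraction

def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def melody_features(pitches, durations):
    """Return one (PS, PC, DR, DC) tuple per pair of consecutive events."""
    features = []
    for i in range(len(pitches) - 1):
        ps = pitches[i + 1] - pitches[i]      # pitch step in semitones
        pc = sign(ps)                         # pitch contour
        # Assumption: DR = present duration / following duration.
        dr = Fraction(durations[i]) / Fraction(durations[i + 1])
        dc = -1 if dr > 1 else (1 if dr < 1 else 0)
        features.append((ps, pc, dr, dc))
    return features

# Hypothetical four-note fragment: MIDI pitches and durations in sixteenth notes.
pitches = [60, 62, 62, 59]
durations = [2, 1, 1, 4]
feats = melody_features(pitches, durations)
for row in feats:
    print(row)
```

Exact fractions are used for DR so that equal durations compare as exactly 1; each feature column then forms its own symbol sequence for the N-gram models.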

  9. A case study (3)

  10. A case study (4) • Tri-gram, bi-gram and uni-gram models were generated for PS, PC, DR and DC. • The standard deviation of the entropy is calculated over a sliding window (size = 10) • Results
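The sliding-window step can be sketched as below. The window size of 10 comes from the slide; the use of the population standard deviation, the one-event step between windows, and the entropy values themselves are assumptions made for illustration.

```python
import statistics

def sliding_std(values, window=10):
    """Standard deviation of each run of `window` consecutive values.

    Assumption: population standard deviation, windows advanced one event at a time.
    """
    return [
        statistics.pstdev(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]

# Hypothetical entropy profile: a stable stretch followed by a jump to a new level.
entropy_profile = [1.0] * 10 + [3.0] * 10
profile_std = sliding_std(entropy_profile, window=10)
print(profile_std)
```

In this toy profile the standard deviation is zero while the window lies entirely on one level and peaks when the window straddles the jump, which is the kind of distinct change the model looks for.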

  11. A case study (5)

  12. A case study (6)

  13. Result • Duration-based features have a much higher entropy variance than pitch-based features, so time-based features are likely to convey more information for segmentation. • Distinct changes in entropy coincided with the melody segment boundaries indicated by listeners.

  14. Discussion • The N-gram model might be an oversimplification for music sequences. • A state depends only on a fixed number of previous states. • However, human memory is not unlimited either. • The ability to establish long-span temporal relations is limited.
