
Automatic Genre Classification of Music Content [A survey]

Nicolas Scaringella, Giorgio Zoia, Daniel Mlynek, IEEE Signal Processing Magazine, March 2006. Presented by Yi-Tang Wang.


Presentation Transcript


  1. Automatic Genre Classification of Music Content [A survey] • Nicolas Scaringella, Giorgio Zoia, Daniel Mlynek, IEEE Signal Processing Magazine, March 2006 • By Yi-Tang Wang

  2. Outline • Introduction • Feature extraction techniques • Genre classification paradigms • Classification results • Future directions & Conclusion

  3. Introduction • EMD (electronic music distribution) • Restoration of analog archives • New content • Music catalogues become huge • What do you want to listen to? • 1 million tracks online • Efficient ways to browse & organize

  4. Introduction (cont.) • Music Genres • Categories to characterize similarities • Boundaries are fuzzy • Automatic Classification • Finding a taxonomy • Hierarchical set of categories • Nontrivial task

  5. Critical issues • Artists, Albums, or Titles • One song to one genre(?) • Albums - heterogeneous material • Artists - several albums • Same Titles? • No agreement on taxonomies • Allmusic, Amazon, Mp3 [2] F. Pachet and D. Cazaly, “A taxonomy of musical genres,” in Proc. Content-Based Multimedia Information Access (RIAO), Paris, France, 2000

  6. Critical issues (cont.) • Ill-defined genre labels • Varied criteria (geographical, historical, etc.) • Dependent on culture • Scalability of genre taxonomies • New genres appear frequently • Merging or splitting • Automatic system

  7. Feature extraction techniques • High-level model • Event-like format (MIDI) • Symbolic format (MusicXML) • Rarely available • Low-level • Audio samples • Low level and low density of info • Do feature extraction • Timbre, Melody, Harmony, Rhythm

  8. Timbre • Sounds with the same pitch and loudness can still sound different • Features to characterize timbre • Temporal features • Energy features • Spectral shape features • Perceptual features • Some have been standardized in MPEG-7
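
As a rough illustration of three of the feature families listed on this slide, the sketch below computes one temporal feature (zero-crossing rate), one energy feature (RMS) and one spectral shape feature (spectral centroid) for a single frame of audio samples. The function names and the choice of these particular features are assumptions made for illustration; the slides do not specify which features to use, and the naive DFT is only practical for short frames.

```python
import math

def zero_crossing_rate(frame):
    """Temporal feature: fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def rms_energy(frame):
    """Energy feature: root-mean-square amplitude of the frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def spectral_centroid(frame, sample_rate):
    """Spectral shape feature: magnitude-weighted mean frequency in Hz.

    Uses a naive O(n^2) DFT so the sketch needs no third-party library.
    """
    n = len(frame)
    if n == 0:
        return 0.0
    weighted_sum = 0.0
    magnitude_sum = 0.0
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        magnitude = math.hypot(re, im)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0
```

For a pure sinusoid the centroid sits at the sinusoid's frequency, which makes a convenient sanity check.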

  9. Timbre (cont.)

  10. Timbre (cont.) • Transformations • New features or increased dimensionality • Transforming into a logarithmic decibel scale is suggested • Texture window • Larger window • Reduce computation • Increase classification accuracy • 1 s • Variable sizes and positions
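
The two ideas on this slide can be sketched as follows: a power value is mapped to a logarithmic decibel scale, and a sequence of per-frame feature vectors is grouped into larger texture windows. The helper names, the floor value and the window and hop sizes are hypothetical; the slides only suggest a window of about 1 s.

```python
import math

def to_decibels(power, floor=1e-10):
    """Map a non-negative power value to decibels; `floor` avoids log(0)."""
    return 10.0 * math.log10(max(power, floor))

def texture_windows(frames, window_len, hop):
    """Group consecutive frames into overlapping texture windows.

    `frames` is a list of per-frame feature vectors, `window_len` the number
    of frames per window and `hop` the step between window starts.
    """
    last_start = len(frames) - window_len
    return [frames[i:i + window_len] for i in range(0, last_start + 1, hop)]
```

With an assumed analysis rate of about 43 frames per second, `window_len=43` would approximate the 1 s texture window mentioned above.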

  11. Timbre (cont.) • Texture model • model of features over texture window: • 1) simple modeling with low-order statistics • 2) modeling with autoregressive model • 3) modeling with distribution estimation algorithms (for example, EM estimation of a GMM of frames)
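
Option 1 above (low-order statistics) is the simplest texture model. A minimal sketch, assuming each texture window is a non-empty list of equal-length frame feature vectors, summarizes the window by the per-dimension mean and variance; the function name is hypothetical.

```python
def texture_statistics(window):
    """Summarize a texture window by per-dimension mean and variance.

    `window` is a non-empty list of frame feature vectors of equal length.
    Returns one vector: all means followed by all (population) variances.
    """
    n = len(window)
    dims = len(window[0])
    means = [sum(frame[d] for frame in window) / n for d in range(dims)]
    variances = [
        sum((frame[d] - means[d]) ** 2 for frame in window) / n
        for d in range(dims)
    ]
    return means + variances
```

The resulting vector has twice the dimensionality of a single frame and discards the ordering of frames inside the window, which is what options 2 and 3 try to model more faithfully.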

  12. Melody & Harmony • Melody • succession of pitched events • Horizontal element • Harmony • pitch simultaneity, chords • Vertical element

  13. Melody & Harmony (cont.) • Pitch function • Characterizing pitch distribution • Amplitude, position of main peak, … • Unfolded • Contains pitch content and info of its range • Folded • Mapped to a single octave • Harmonic content
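
The difference between the unfolded and folded pitch functions can be illustrated with pitch histograms. The sketch below assumes the pitches are already available as MIDI note numbers, which is a simplification: on audio the pitch function would have to be estimated from the signal. All function names are hypothetical.

```python
from collections import Counter

def unfolded_histogram(midi_pitches):
    """Count occurrences of each absolute pitch (keeps octave and range info)."""
    return Counter(midi_pitches)

def folded_histogram(midi_pitches):
    """Map every pitch to a single octave (pitch class 0-11) before counting."""
    return Counter(pitch % 12 for pitch in midi_pitches)

def main_peak(histogram):
    """Return (position, amplitude) of the highest peak; ties go to the lowest position."""
    position, amplitude = max(histogram.items(), key=lambda kv: (kv[1], -kv[0]))
    return position, amplitude
```

For example, the notes C3, C4 and C5 occupy three separate bins in the unfolded histogram but all fall into pitch class 0 once folded.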

  14. Rhythm • No precise definition • Generically, all of the temporal aspects • Periodicity function • Low-level approach, as for the pitch function • 1) tempo: periodicities typically in the range 0.3–1.5 s (i.e., 200–40 bpm) • 2) musical pattern: periodicities between 2 and 6 s (corresponding to the length of one or more measures) • Gouyon et al. derive MFCC-like descriptors
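
The tempo range above follows from bpm = 60 / period, so 0.3 s corresponds to 200 bpm and 1.5 s to 40 bpm. A minimal sketch of a periodicity function, assuming an onset-strength envelope sampled at a known frame rate is already available, picks the autocorrelation peak inside that range. The autocorrelation approach and all names here are assumptions for illustration, not the method of any system cited in the slides.

```python
def period_to_bpm(period_seconds):
    """Convert a beat period in seconds to beats per minute."""
    return 60.0 / period_seconds

def dominant_period(envelope, frame_rate, min_period=0.3, max_period=1.5):
    """Return the period (s) with the strongest autocorrelation in the tempo range.

    `envelope` is a list of onset-strength values, one per frame, and
    `frame_rate` is the number of frames per second.
    """
    min_lag = max(1, round(min_period * frame_rate))
    max_lag = min(len(envelope) - 1, round(max_period * frame_rate))
    if max_lag < min_lag:
        raise ValueError("envelope too short for the requested period range")

    def autocorrelation(lag):
        return sum(envelope[n] * envelope[n + lag] for n in range(len(envelope) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorrelation)
    return best_lag / frame_rate
```

Widening the range to 2–6 s would target the musical-pattern periodicities instead of the tempo.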

  15. Extracting from segments • A small segment may contain sufficient information • Reduces required computation • Typically a 30 s segment • taken 30 s after the beginning • Artist classification • Voice is easier to identify than music only

  16. Local conclusion • Extracting high-level descriptors from polyphonic audio signals is not yet state of the art • Focus on timbre modeling • Timbre may contain sufficient info • 250 ms: 53%; 3 s: 72% • Among 10 genres

  17. Local conclusion (cont.) • Another point of view (pessimistic) • Timbre similarity measure & 20,000 titles distributed over 18 genres • Little correlation • May not be scalable • Take cultural features into account

  18. Genre classification • Expert systems • Unsupervised approach • clustering • Supervised approach • Machine learning algorithms

  19. Expert systems • A knowledge based system made up of a set of rules • No model based on it so far • Expensive to implement and maintain • May yield unexpected interactions

  20. Expert systems (cont.) • Pachet and Cazaly’s work • Genre differences stated with language-based descriptors, e.g., instrumentation

  21. Unsupervised approach • Clustering with similarity measures • Similarity measures • If time invariant • Euclidean distance or cosine distance • Otherwise • Build statistical model (Gaussian or GMMs) • Kullback-Leibler divergence (relative entropy) • Sampling, earth mover’s distance, asymptotic likelihood approximation • Shao et al. use HMMs
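
The measures named on this slide can be sketched as follows. The first two compare fixed-length feature vectors directly; the last compares two single Gaussians with diagonal covariance, for which the Kullback-Leibler divergence has a closed form. The restriction to diagonal covariances and the symmetrized variant are simplifying assumptions; for GMMs no closed form exists, which is why the slide lists sampling and other approximations.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm

def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for Gaussians with diagonal covariance (variances per dimension)."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )

def symmetric_kl(mean_p, var_p, mean_q, var_q):
    """Symmetrized divergence, since KL itself is not symmetric."""
    return (kl_diag_gaussians(mean_p, var_p, mean_q, var_q)
            + kl_diag_gaussians(mean_q, var_q, mean_p, var_p))
```

Because the KL divergence is asymmetric, a symmetrized form such as the one above is a common way to turn it into a usable similarity measure for clustering.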

  22. Unsupervised approach (cont.) • Clustering algorithms • K-means • Shao et al.’s work • Agglomerative hierarchical clustering • SOM (self-organizing map) • Artificial neural network • Maps high-dimensional data onto a lower dimension • GHSOM (growing hierarchical SOM) • Rauber et al.
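
Of the clustering algorithms listed, K-means is the easiest to sketch. The version below assumes each track is represented by a fixed-length feature vector and uses squared Euclidean distance; seeding the centroids with the first k points is an assumption made only to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_labels(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]

def k_means(points, k, iterations=20):
    """Cluster `points` into `k` groups; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic seeding
    for _ in range(iterations):
        labels = assign_labels(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign_labels(points, centroids), centroids
```

K-means needs the number of clusters in advance, whereas agglomerative clustering and GHSOM build a hierarchy, which fits the hierarchical nature of genre taxonomies more naturally.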

  23. Supervised approach • A taxonomy of genres is given • VS. Expert System • No rules (or description of genres) • Supervised machine learning algorithms • KNN (K-Nearest Neighbor) • GMMs (Gaussian Mixture Models) • HMMs (Hidden Markov Models) • LDA (Linear Discriminant Analysis) • SVMs (Support Vector Machines) • ANNs (Artificial Neural Networks)
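
KNN is the simplest of the listed classifiers: a query track receives the majority genre among its k nearest labeled training tracks. A minimal sketch, assuming fixed-length feature vectors and Euclidean distance; the function name and the genre labels in the usage example are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to `query`."""
    neighbours = sorted(
        zip(train_features, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional features for six labeled tracks.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
genres = ["classical"] * 3 + ["rock"] * 3
predicted = knn_classify(features, genres, [0.5, 0.5])
```

Unlike an expert system, no rule describing a genre is written down: the labeled examples alone define the decision.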

  24. Classification results • MIREX genre classification contest • 1,005 / 510 songs over ten genres • 940 / 447 songs over six genres

  25. Classification results

  26. Future directions • Classification into perceptual categories • Moods, emotions • Novelty detection • New or unknown data (not belonging to any class) • Classification with multiple labels • Probably closer to human experience • From taxonomies to folksonomies • Does the taxonomy fit its users?

  27. Conclusion • Definitions of music genres are convoluted • Features → classification → result • Research is evolving from purely objective machine calculations to techniques • Machine learning plays a fundamental role in classification domains

  28. Thank You
