
Further Development of a Classifier for Musical Genre Classification and Retrieval


Presentation Transcript


  1. Further Development of a Classifier for Musical Genre Classification and Retrieval Kris West School of Computing Sciences University of East Anglia kristopher.west@uea.ac.uk

  2. Outline • Background • Genre classification • Approaches • Classifiers • Results • Research Perspectives • Reducing data processing costs • More powerful decision strategies • Other applications • In terms of an Integrated MIR system

  3. Background: Classical Approaches to Genre Classification • Calculate features over a whole piece of audio OR • Calculate features and average over the whole piece • Means and Variances • Model the distribution of classes • GMMs, KNN, LDA, MAP, SVMs etc. • Classify novel input • Classify one vector of features

  4. Background: Our Approach to Genre Classification • Calculate features for every frame • 23 ms frames (512 samples @ 22050 Hz) • Model distribution of individual frames • Classify novel input • Classify each individual frame • Classify whole piece by majority vote
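The frame-level pipeline on this slide can be sketched as follows. `classify_frame` is a hypothetical stand-in for any trained per-frame classifier (a minimal sketch; the toy energy rule is not the talk's actual feature set):

```python
import numpy as np

FRAME_LEN = 512      # 512 samples @ 22050 Hz, roughly 23 ms per frame
SAMPLE_RATE = 22050

def frame_signal(signal, frame_len=FRAME_LEN):
    """Split a 1-D signal into non-overlapping fixed-length frames."""
    n_frames = len(signal) // frame_len
    return signal[:n_frames * frame_len].reshape(n_frames, frame_len)

def classify_piece(frames, classify_frame):
    """Label every frame, then label the whole piece by majority vote."""
    labels = [classify_frame(f) for f in frames]
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]

# Toy stand-in for a trained per-frame classifier: label by frame energy.
toy = lambda f: "rock" if np.mean(f ** 2) > 0.5 else "classical"

signal = np.concatenate([np.ones(FRAME_LEN * 3), np.zeros(FRAME_LEN)])
frames = frame_signal(signal)
print(classify_piece(frames, toy))  # the high-energy label wins the vote
```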

  5. Background: Motivation • Unlikely that all frames belonging to a genre will belong to a single distribution • More likely that there are multiple distributions in feature space, each of which should be fitted with its own model • Number of distributions per class will vary and is hard to predict

  6. Background: Initial comparisons • Classical approach • ~62% accuracy • Our approach • ~60.5% accuracy • Explanation • Classical approach used stronger feature set • Finer distribution of classes in our approach • i.e. harder to separate classes (non-linear) • Classifiers not up to the task • Too much data for an SVM

  7. Background: Develop a better classifier • Base it on a decision tree • Recursive sub-division of feature space • Model many distributions for a single class • No limit on complexity • Keep on growing the tree • Only bounded by accuracy of sampling/feature calculation • Don’t have to define number of components/distributions in advance • Easy integration of disparate feature sets, including categorical variables • Lots of existing research • Classification and Regression Trees • Breiman, L., Friedman, J., Olshen, R. & Stone, C. (1984)

  8. Classification and Regression Trees: Overview

  9. Classification and Regression Trees: Disadvantages • Computational complexity • Has to enumerate a large number of splits • Larger feature vectors = more splits to evaluate • Single splits = n possible splits • Linear combinations = n + n(n-2) possible splits, where n = length of feature vector
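The split counts above can be checked with a few lines of arithmetic; the near-quadratic growth of the linear-combination case is what makes large feature vectors expensive:

```python
def single_split_count(n):
    """Single-variable thresholds: one candidate split dimension per feature."""
    return n

def linear_combination_split_count(n):
    """Linear-combination splits, using the slide's count: n + n(n-2)."""
    return n + n * (n - 2)

# Growth with feature-vector length n:
for n in (10, 50, 200):
    print(n, single_split_count(n), linear_combination_split_count(n))
```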

  10. Classification and Regression Trees: Splitting techniques • Traditionally, nodes are split by a threshold on a single variable, a linear combination of variables or the value of a categorical variable • Any classification scheme can be used to split a node in the tree • Form all combinations of classes within the node’s data • Train a binary classifier for each combination • Evaluate splits and select the best • Evaluated: • Gaussian classifiers (GAUSS-CART) • Fisher criterion Linear Discriminant Analysis (LDA-CART)
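As a rough illustration of classifier-based splitting, the sketch below enumerates binary class combinations and scores each candidate split by weighted Gini impurity. It uses a nearest-centroid classifier as a simple stand-in for the Gaussian or LDA split classifiers named on the slide:

```python
import itertools
import numpy as np

def gini(labels):
    """Gini impurity of a label array (0 for a pure node)."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def split_score(X, y, left_classes):
    """Train a nearest-centroid binary classifier for one class bipartition
    and return the weighted impurity of the split it induces."""
    side = np.isin(y, list(left_classes))
    c_left, c_right = X[side].mean(axis=0), X[~side].mean(axis=0)
    go_left = (np.linalg.norm(X - c_left, axis=1)
               < np.linalg.norm(X - c_right, axis=1))
    n = len(y)
    return (go_left.sum() / n) * gini(y[go_left]) + \
           ((~go_left).sum() / n) * gini(y[~go_left])

def best_split(X, y):
    """Enumerate all binary class combinations and keep the purest split."""
    classes = sorted(set(y))
    best = None
    for r in range(1, len(classes)):
        for combo in itertools.combinations(classes, r):
            s = split_score(X, y, combo)
            if best is None or s < best[0]:
                best = (s, combo)
    return best

# Two well-separated toy classes: the best split should be perfectly pure.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 1, 1])
score, combo = best_split(X, y)
```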

  11. Classification and Regression Trees: Advantages • Computational complexity reduced • Has to enumerate fewer splits • Usually fewer classes than features • Fewer combinations • Use all classes, no subsets

  12. Background: Initial comparisons • Classical approach • ~62% accuracy (~66% with CART) • Our approach (CART) • ~83% accuracy • Explanation • Identifying individual timbres in a genre

  13. Research Perspectives • Reduce computational costs of the technique • Use a segmentation technique • Calculate & append 1st- and 2nd-order differentials of features, or another trajectory • Then use harmonic/simple temporal modelling • Grow simpler, more transparent decision trees? • Use Variable Bit Rate (VBR) • Calculate the difference between frames • Use one frame and a count to represent many similar, sequential frames • Do we need to re-expand or adapt classifier training?
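The VBR idea above, one representative frame plus a count for runs of similar, sequential frames, might look like this (the Euclidean distance threshold is an assumption for illustration):

```python
import numpy as np

def run_length_encode_frames(frames, threshold=0.1):
    """Collapse runs of similar, sequential feature frames into
    [representative_frame, count] pairs; a new run starts when a frame
    differs from the current representative by more than `threshold`."""
    runs = []
    for f in frames:
        if runs and np.linalg.norm(f - runs[-1][0]) <= threshold:
            runs[-1][1] += 1
        else:
            runs.append([f, 1])
    return runs

# Six 1-D feature frames: three near 0, two at 1, one back at 0.
frames = np.array([[0.00], [0.01], [0.02], [1.0], [1.0], [0.0]])
runs = run_length_encode_frames(frames)
print([(r[0].tolist(), r[1]) for r in runs])
```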

  14. Research Perspectives • Try stronger feature sets • Have tried timbral features such as Flux, Roll-off, Centroid (mean and banded) • No effect, or reduced accuracy • Try other instrument identification features • Rise time etc. • Try Beat, Pitch and Rhythmic features

  15. Research Perspectives • Use a more powerful technique than bag-of-frames to decide the final output • Model common frame classification errors • Markov chains • Ergodic Markov chains of frame sequences • N-grams • Analogous to Markov chains • Neural Nets • Train an NNet on the output to decide the final classification • Lots of input lines, one output line per class • All of the above trained on re-substitution of training data • and independent test samples used to validate the tree
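One of the options above, an ergodic Markov chain over frame-label sequences, could be prototyped as a first-order transition model per genre; the toy training sequences and add-one smoothing here are illustrative assumptions, not the talk's design:

```python
import math
from collections import defaultdict

def train_markov(sequences, smoothing=1.0):
    """Estimate first-order transition probabilities (add-one smoothed)
    from sequences of per-frame classifier outputs."""
    counts = defaultdict(lambda: defaultdict(float))
    states = set()
    for seq in sequences:
        states.update(seq)
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    model = {}
    for a in states:
        total = sum(counts[a].values()) + smoothing * len(states)
        model[a] = {b: (counts[a][b] + smoothing) / total for b in states}
    return model

def log_likelihood(model, seq, floor=1e-9):
    """Score a frame-label sequence under a trained chain."""
    return sum(math.log(model.get(a, {}).get(b, floor))
               for a, b in zip(seq, seq[1:]))

# Toy genre models trained on frame-label sequences, then a query piece.
rock = train_markov([list("AABBAABB")])
jazz = train_markov([list("ABABABAB")])
query = list("ABABAB")
best = "jazz" if log_likelihood(jazz, query) > log_likelihood(rock, query) else "rock"
print(best)
```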

  16. Research Perspectives • Apply approach to Instrument Identification • Already performing successful timbral matching • Need database • Record it?

  17. Research Perspectives • Use tree to perform timbral music similarity • Train tree according to genre • Pre-compute a distance measure between the mean vectors of each leaf node and all other leaf nodes • Big matrix (unless tree has already been simplified) • Use co-occurrence of leaf nodes • Group similar nodes into a single symbol • Using a threshold of the distance measure
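A sketch of the pre-computed leaf-node distance matrix and threshold-based grouping described above; the single-link union-find grouping is one possible interpretation, not necessarily the talk's:

```python
import numpy as np

def leaf_distance_matrix(leaf_means):
    """Pairwise Euclidean distances between leaf-node mean feature vectors
    (the pre-computed 'big matrix' from the slide)."""
    M = np.asarray(leaf_means, dtype=float)
    diff = M[:, None, :] - M[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def group_leaves(dist, threshold):
    """Merge leaves closer than `threshold` into shared symbols
    (single-link grouping via a small union-find)."""
    n = len(dist)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i, j] < threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

# Three toy leaf means: the first two are close enough to share a symbol.
D = leaf_distance_matrix([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
groups = group_leaves(D, 1.0)
print(groups)
```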

  18. Research Perspectives • Alternatively • Reduce music to symbol sequence (Leaf node numbers) • Cache distance scores for each leaf node • Collect distances of all query nodes from training example nodes • Normalise for number of frames • For length invariance • Return lowest n scores OR • Use tf*IDF on symbol sequence/Greenstone • Either use N-grams of symbols OR • Treat each symbol as a word
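Treating each leaf-node symbol as a word, the tf*idf variant above might be prototyped like this; plain cosine similarity is assumed as the ranking score, and the symbol sequences are toy data:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Weight leaf-node symbol sequences with tf*idf, treating each
    symbol as a 'word' and each piece as a document."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # document frequency per symbol
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({s: (c / len(doc)) * math.log(n / df[s])
                     for s, c in tf.items()})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse tf*idf vectors."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Three toy pieces reduced to leaf-node symbol sequences; pieces 0 and 2
# share the rarer symbol pattern and should score as more similar.
vecs = tfidf_vectors([[1, 1, 2], [2, 3, 3], [1, 1, 3]])
print(cosine(vecs[0], vecs[2]), cosine(vecs[0], vecs[1]))
```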

  19. Research Perspectives • Use symbol sequence (Leaf node numbers) to perform or augment onset detection • Identify nodes corresponding to transients
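A minimal sketch of the onset idea: flag positions where the leaf-node symbol changes, optionally restricted to a hypothetical whitelist of transient-modelling nodes:

```python
def symbol_change_onsets(symbols, transient_nodes=None):
    """Flag candidate onsets where the leaf-node symbol sequence changes;
    `transient_nodes` is a hypothetical whitelist of nodes known to
    model transients."""
    onsets = []
    for i in range(1, len(symbols)):
        if symbols[i] != symbols[i - 1]:
            if transient_nodes is None or symbols[i] in transient_nodes:
                onsets.append(i)
    return onsets

# Symbol changes at frames 2 and 5; only node 7 is on the whitelist.
print(symbol_change_onsets([3, 3, 7, 7, 7, 2, 2]))
print(symbol_change_onsets([3, 3, 7, 7, 7, 2, 2], transient_nodes={7}))
```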

  20. Research Perspectives • Stick to timbral/Instrument identification/Semantic features • Semantic and episodic memory of music are subserved by distinct neural networks. NeuroImage, Volume 20, Issue 1, September 2003, Pages 244–256. Hervé Platel, Jean-Claude Baron, Béatrice Desgranges, Frédéric Bernard and Francis Eustache

  21. Research Perspectives: Integrated MIR system
