### Pitch-dependent Musical Instrument Identification and Its Application to Musical Sound Ontology

Tetsuro Kitahara*, Masataka Goto**, Hiroshi G. Okuno*

*Grad. Sch’l of Informatics, Kyoto Univ.
**PRESTO JST / Nat’l Inst. Adv. Ind. Sci. & Tech.

IEA/AIE-2003 (24th June 2003 in UK)

Today’s talk

- Musical Instrument Identification
  - Difficulty: The pitch dependency of timbre
  - Solution: Approximating it as a function of F0
  - Experiments
- Musical Sound Ontology
  - A hierarchy of musical instrument sounds
  - Systematically constructed by C5.0

1. What is musical instrument identification?

- To obtain the names of musical instruments from sounds (acoustical signals).
- Useful for automatic music transcription, music information retrieval, etc.

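[Figure: identification pipeline. Feature extraction (e.g. decay speed, spectral centroid) is followed by a Bayes classifier that compares class-conditional densities such as $p(X \mid w_{\text{piano}})$ and $p(X \mid w_{\text{flute}})$ and outputs a label, e.g. <inst>piano</inst>]

$$\hat{w} = \arg\max_{w} p(w \mid X) = \arg\max_{w} p(X \mid w)\, p(w)$$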

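The decision rule on this slide can be illustrated with a minimal sketch. It assumes a single scalar feature and one-dimensional Gaussian class-conditional densities; the means, standard deviations, and priors below are made-up illustrative values, not figures from the talk.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical models of one feature (e.g. a decay-speed value): (mean, std).
models = {"piano": (2.0, 0.5), "flute": (0.2, 0.3)}
# Hypothetical a priori probabilities p(w).
priors = {"piano": 0.5, "flute": 0.5}

def identify(x):
    """Return the instrument w that maximizes p(X|w) p(w)."""
    return max(models, key=lambda w: gaussian_pdf(x, *models[w]) * priors[w])

print(identify(1.8))
print(identify(0.1))
```

Because p(X) is the same for every candidate, maximizing p(X|w) p(w) picks the same instrument as maximizing p(w|X).
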
2. What is the difficulty? The pitch dependency of timbre

e.g. Low-pitch piano sounds decay slowly; high-pitch piano sounds decay fast.

[Figure: waveforms of two piano tones, amplitude (-0.5 to 0.5) against time [s] (0 to 3). (a) Pitch = C2 (65.5 Hz); (b) Pitch = C6 (1048 Hz)]

In previous studies, the pitch dependency was pointed out but has not been dealt with.

3. How is the pitch dependency coped with?

Our solution: Approximate the pitch dependency of each feature as a function of fundamental frequency (F0), i.e. model how each feature varies according to F0.

An F0-dependent multivariate normal distribution has the following two parameters:

- F0-dependent mean function, which captures the pitch dependency (i.e. the position of the distribution at each F0)
- F0-normalized covariance, which captures the non-pitch dependency
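One way to write such a distribution down is the ordinary multivariate normal density with its constant mean replaced by a function of F0. The symbols below are assumptions for illustration: $\mu_w(f)$ for the F0-dependent mean function of instrument $w$, $\Sigma_w$ for its F0-normalized covariance, and $d$ for the dimensionality of the feature vector $X$.

$$p(X \mid w; f) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_w|^{1/2}} \exp\!\left( -\frac{1}{2}\,\bigl(X - \mu_w(f)\bigr)^{\top} \Sigma_w^{-1} \bigl(X - \mu_w(f)\bigr) \right)$$

Under this reading, only the position of the distribution moves with pitch, while its shape stays fixed for each instrument.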

4. Musical instrument identification using F0-dependent multivariate normal distribution

The musical instrument identification method has the following four steps:

- 1st step: Feature extraction
- 2nd step: Dimensionality reduction
- 3rd step: Parameter estimation
- Final step: Using the Bayes decision rule

4. Musical instrument identification using F0-dependent multivariate normal distribution
(1st) Feature extraction

129 features, defined with reference to the literature, are extracted.

(1) Spectral centroid (which captures the brightness of tones)

[Figure: spectral centroid of a piano tone and of a flute tone]
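The slides do not give the exact formula used for this feature. A minimal sketch, assuming the common definition of the spectral centroid as the magnitude-weighted mean frequency of the spectrum, and using NumPy with synthetic test tones:

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Magnitude-weighted mean frequency (Hz) of a real-valued signal."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

sr = 8000                       # sampling rate in Hz (illustrative)
t = np.arange(sr) / sr          # one second of samples
pure = np.sin(2 * np.pi * 440 * t)
# Adding an upper partial makes the tone "brighter" and raises the centroid.
bright = pure + 0.5 * np.sin(2 * np.pi * 1760 * t)

print(spectral_centroid(pure, sr))
print(spectral_centroid(bright, sr))
```

The second tone has more energy at high frequencies, so its centroid is higher, which is the sense in which the feature captures brightness.
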

(2) Decay speed of power

[Figure: power envelopes. Flute: not decayed; Piano: decayed]
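The slides do not define how decay speed is measured. As one possible sketch, the code below assumes it is the slope (in dB per second) of a straight line fitted to the frame-wise power envelope; the frame length and the synthetic tones are illustrative choices.

```python
import numpy as np

def decay_speed_db_per_s(signal, sr, frame_len=400):
    """Slope of a line fitted to the frame-wise power (dB) over time."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Small constant avoids log(0) for silent frames.
    power_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    times = np.arange(n_frames) * frame_len / sr
    slope, _ = np.polyfit(times, power_db, 1)
    return float(slope)

sr = 8000
t = np.arange(sr) / sr
sustained = np.sin(2 * np.pi * 440 * t)                 # flute-like: no decay
decaying = np.exp(-t / 0.5) * np.sin(2 * np.pi * 440 * t)  # piano-like: decays

print(decay_speed_db_per_s(sustained, sr))
print(decay_speed_db_per_s(decaying, sr))
```

A sustained tone gives a slope near zero, while the exponentially decaying tone gives a clearly negative slope.
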

4. Musical instrument identification using F0-dependent multivariate normal distribution
(2nd) Dimensionality reduction

The dimensionality of the feature space is reduced by the following two methods.

129-dimensional feature space
→ PCA (principal component analysis, with a cumulative proportion of 99%)
→ 79-dimensional feature space
→ LDA (linear discriminant analysis)
→ 18-dimensional feature space
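The two-stage reduction can be sketched with NumPy as follows. This is an illustrative implementation on synthetic data, not the authors' code: PCA keeps the leading components up to a 99% cumulative proportion of variance, and a textbook Fisher LDA then projects onto at most (number of classes - 1) axes. With 19 instrument classes, that upper bound is 18, which matches the final dimensionality on the slide.

```python
import numpy as np

def pca_fit(X, proportion=0.99):
    """Return (mean, components) reaching `proportion` of the total variance."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, proportion)) + 1
    return mean, vt[:k].T

def lda_fit(X, y):
    """Return a projection with at most (number of classes - 1) columns."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    s_within = np.zeros((d, d))
    s_between = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        s_within += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        s_between += len(Xc) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(s_within) @ s_between)
    order = np.argsort(eigvals.real)[::-1][: len(classes) - 1]
    return eigvecs.real[:, order]

# Synthetic stand-in data: 3 classes, 10 features (not the 129 features of the talk).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 10)) + 3.0 * y[:, None]

mean, W_pca = pca_fit(X)
X_pca = (X - mean) @ W_pca
W_lda = lda_fit(X_pca, y)
X_lda = X_pca @ W_lda
print(X.shape, "->", X_pca.shape, "->", X_lda.shape)
```
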

4. Musical instrument identification using F0-dependent multivariate normal distribution
(3rd) Parameter estimation

First, the F0-dependent mean function is approximated as a cubic polynomial.

Second, the F0-normalized covariance is computed from the features after the F0-dependent mean has been subtracted from each of them, which eliminates the pitch dependency.
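The two estimation steps can be sketched with NumPy. The assumptions here are mine: the polynomial is fitted by least squares, F0 is expressed on a normalized scale in [0, 1] to keep the fit well conditioned, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 300 tones, 2 features, normalized F0 in [0, 1].
f0 = rng.uniform(0.0, 1.0, size=300)
true_mean = np.column_stack([1.0 - 2.0 * f0 ** 3, 0.5 * f0 + f0 ** 2])
features = true_mean + rng.normal(scale=0.1, size=(300, 2))

# Step 1: fit one cubic polynomial per feature (coefficients: highest power first).
coefs = np.polyfit(f0, features, deg=3)   # shape (4, 2)

def mean_function(f):
    """Evaluate the fitted F0-dependent mean at the F0 values in f."""
    return np.vander(np.atleast_1d(f), 4) @ coefs

# Step 2: subtract the F0-dependent mean, then estimate a single covariance
# matrix from the residuals, pooled over all pitches.
residuals = features - mean_function(f0)
cov = np.cov(residuals, rowvar=False)     # shape (2, 2)
print(cov)
```

Because the mean has been removed at every pitch, the remaining covariance describes the variation that is not explained by F0.
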

4. Musical instrument identification using F0-dependent multivariate normal distribution
(Final) The Bayes decision rule

The instrument $\hat{w}$ satisfying

$$\hat{w} = \arg\max_{w} \left[ \log p(X \mid w; f) + \log p(w; f) \right]$$

is determined as the result.

p(X|w; f):

- A probability density function of the F0-dependent multivariate normal distribution.
- Defined by the F0-dependent mean function and the F0-normalized covariance.
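A minimal sketch of this decision rule with NumPy is given below. The two instrument models, their polynomial coefficients, the shared covariance, and the uniform prior are all hypothetical values chosen for illustration; F0 is again assumed to be on a normalized scale.

```python
import numpy as np

def log_gaussian(x, mean, cov):
    """Log density of a multivariate normal distribution at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)

def identify(x, f, models, log_prior):
    """Return the instrument maximizing log p(X|w; f) + log p(w; f)."""
    scores = {}
    for name, (coefs, cov) in models.items():
        # Evaluate the cubic F0-dependent mean (coefficients: highest power first).
        mean = np.vander([f], coefs.shape[0])[0] @ coefs
        scores[name] = log_gaussian(x, mean, cov) + log_prior(name, f)
    return max(scores, key=scores.get)

cov = 0.1 * np.eye(2)
models = {
    # Hypothetical means: piano (-2f + 1, f), flute (0.5f - 1, 2).
    "piano": (np.array([[0.0, 0.0], [0.0, 0.0], [-2.0, 1.0], [1.0, 0.0]]), cov),
    "flute": (np.array([[0.0, 0.0], [0.0, 0.0], [0.5, 0.0], [-1.0, 2.0]]), cov),
}

def uniform_log_prior(name, f):
    """Assumed prior: every instrument equally likely at every F0."""
    return np.log(1.0 / len(models))

x = np.array([-0.9, 1.0])
print(identify(x, 1.0, models, uniform_log_prior))
print(identify(x, 0.0, models, uniform_log_prior))
```

The same feature vector is assigned to different instruments at different F0 values, which is the behavior an F0-dependent mean is meant to provide.
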

5. Experimental Conditions

- Database: A subset of RWC-MDB-I-2001
  - Consists of solo tones of 19 real instruments over their full pitch ranges.
  - Contains 3 individuals and 3 intensities for each instrument.
  - Contains normal articulation only.
  - The total number of sounds is 6,247.
- 10-fold cross-validation is used.
- Performance is evaluated at both the individual-instrument level and the category level.
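The 10-fold protocol can be sketched in plain Python. The slides do not say how the folds were drawn, so the shuffled, non-stratified split below is an assumption; only the number of sounds (6,247) and the number of folds come from the talk.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(6247, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each round trains on nine folds and tests on the remaining one, so every sound contributes to the reported recognition rate exactly once.
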

6. Experimental Results

| | Individual level (19 classes) | Category level (8 classes) |
|---|---|---|
| Recognition rate | 79.73% | 90.65% |
| Improvement | 4.00% | 2.45% |
| Error reduction (relative) | 16.48% | 20.67% |
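The relative error reduction at the individual level can be reproduced from the recognition rates. The sketch below uses the baseline of 75.73% quoted in the conclusions of the talk; the function name is hypothetical.

```python
def relative_error_reduction(baseline_rate, improved_rate):
    """Percentage of the baseline error that the improved method removes."""
    baseline_error = 100.0 - baseline_rate
    improved_error = 100.0 - improved_rate
    return 100.0 * (baseline_error - improved_error) / baseline_error

reduction = relative_error_reduction(75.73, 79.73)
print(f"{reduction:.2f}%")  # prints 16.48%
```

An absolute gain of 4.00 points removes about one sixth of the 24.27% of tones that were misidentified before.
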

The recognition rates of six instruments were improved by more than 7%.

Piano: the most improved (74.21% → 83.27%), because the piano has a wide pitch range.

7. Musical sound ontology

- A hierarchy of musical instrument sounds
- Important for various applications, e.g.
  - Category-level musical instrument recognition (such as strings, wind instruments)
  - Support for music composition (or arrangement)
- However, its systematic construction has not been reported.
- We report the result of constructing an acoustics-based musical sound ontology using the C5.0 decision tree program.

[Figure: the constructed hierarchy of musical instrument sounds]

- It differs from the conventional hierarchy.
- Acoustic characteristics depend on the pitch as well as the sounding mechanism.
- This hierarchy was known to musicians experientially, but had not previously been constructed by computer.

8. Conclusions

- We proposed a method for musical instrument identification which takes into consideration the pitch dependency of timbre. → Recognition rate improved: 75.73% → 79.73%
- We reported the construction of a musical sound ontology based on acoustic characteristics.
- Future work:
  - Evaluation against mixtures of sounds
  - Development of application systems using the proposed method.

Temporal mean of kurtosis of spectral peaks

[Figure: spectrum showing spectral peaks and non-harmonic components]

If the power of the non-harmonic components is stronger, the kurtosis of the spectral peaks becomes higher.

→ This feature captures how much the spectrum contains non-harmonic components.

Recognition rates at category level

Error reduction (one value per category): 35%, 8%, 23%, 33%, 20%, 13%, 15%, 8%

- Recognition rates for all categories were improved.
- Recognition rates for Piano, Guitar, Strings: 96.7%

Bayes vs k-NN

[Figure: recognition rates of six configurations]

- Bayes (18 dim; PCA+LDA) (the configuration we adopted)
- Bayes (79 dim; PCA only)
- Bayes (18 dim; PCA only)
- 3-NN (18 dim; PCA+LDA)
- 3-NN (79 dim; PCA only)
- 3-NN (18 dim; PCA only)

Jain’s guideline (1982): Having 5 to 10 times as many training data as the number of dimensions seems to be a good practice.

- PCA+LDA+Bayes achieved the best performance.
- 18 dimensions performed better than 79 dimensions: the amount of training data is not enough for 79 dimensions.
- The use of LDA improved the performance, because LDA considers the separation between classes.
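Jain's guideline can be checked roughly against the size of the database. The sketch below rests on two assumptions that the slides do not state: that the guideline is applied per class, and that the 6,247 sounds are spread evenly over the 19 instruments. Under 10-fold cross-validation, nine tenths of the data are available for training.

```python
n_sounds = 6247
n_instruments = 19
train_fraction = 0.9  # 10-fold cross-validation trains on 9 of 10 folds

# Assumed: sounds are evenly distributed over the instruments.
avg_train_per_class = n_sounds * train_fraction / n_instruments

def recommended_range(n_dims):
    """Jain's guideline: 5 to 10 times as many training data as dimensions."""
    return 5 * n_dims, 10 * n_dims

print(f"average training sounds per instrument: {avg_train_per_class:.0f}")
for n_dims in (18, 79):
    low, high = recommended_range(n_dims)
    print(f"{n_dims} dim: guideline asks for {low} to {high}")
```

Under these assumptions, the average training set per instrument exceeds the guideline for 18 dimensions but falls short of even its lower bound for 79 dimensions, which is consistent with the observation that 79 dimensions performed worse.
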

Relationship between training data and dimension

[Figure: recognition rate against dimensionality. Dimensions after PCA, with the cumulative proportion in parentheses: 14 dim. (85%), 18 dim. (88%), 20 dim. (89%), 23 dim. (90%), 32 dim. (93%), 41 dim. (95%), 52 dim. (97%), 79 dim. (99%)]

Hughes’s peaking phenomenon

- The performance peaked at 23 dimensions.
- All results without LDA were worse than those with LDA.

Conventional hierarchy (sounding-mechanism-based)
