Musical Instrument Identification based on F0-dependent Multivariate Normal Distribution

Musical Instrument Identification based on F0-dependent Multivariate Normal Distribution. Tetsuro Kitahara*, Masataka Goto**, Hiroshi G. Okuno*. *Grad. Sch’l of Informatics, Kyoto Univ. **PRESTO JST / Nat’l Inst. Adv. Ind. Sci. & Tech. ICASSP’03 (6-10th Apr. 2003 in Hong Kong).

Musical Instrument Identification based on F0-dependent Multivariate Normal Distribution

Tetsuro Kitahara*, Masataka Goto**, Hiroshi G. Okuno*

*Grad. Sch’l of Informatics, Kyoto Univ.
**PRESTO JST / Nat’l Inst. Adv. Ind. Sci. & Tech.

ICASSP’03 (6-10th Apr. 2003 in Hong Kong)

Today’s talk
  • What is musical instrument identification?
  • What is difficult in musical instrument identification? The pitch dependency of timbre
  • How is the pitch dependency coped with? Approximate it as a function of F0
  • Musical instrument identification using F0-dependent multivariate normal distribution
  • Experimental results
  • Conclusions
1. What is musical instrument identification?
  • It is the task of obtaining the names of musical instruments from sounds (acoustic signals).
  • It is useful for automatic music transcription, music information retrieval, etc.
  • Research on it began recently (in the 1990s).

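[Figure: Feature extraction (e.g. decay speed, spectral centroid), followed by the decision rule w = argmax p(w|X) = argmax p(X|w) p(w), which compares likelihoods such as p(X|w_piano) and p(X|w_flute) and outputs the instrument name, e.g. <inst>piano</inst>]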

2. What is difficult in musical instrument identification?

The pitch dependency of timbre, e.g.
  • Low-pitch piano sound = Slow decay
  • High-pitch piano sound = Fast decay

[Figure: two panels plotted against time [s] (0 to 3 s), vertical axis from -0.5 to 0.5: (a) Pitch = C2 (65.5Hz); (b) Pitch = C6 (1048Hz)]

3. How is the pitch dependency coped with?

Most previous studies have not dealt with the pitch dependency.

Examples:
  • [Martin99] used hierarchical classification.
  • [Brown99] used cepstral coefficients.
  • [Eronen00] used both techniques.
  • [Kashino98] developed a system for computational music scene analysis.
  • [Kashino00] introduced template adaptation and musical contexts.

3. How is the pitch dependency coped with?

Proposal:

Approximate the pitch dependency of each feature as a function of fundamental frequency (F0)

3. How is the pitch dependency coped with?

An F0-dependent multivariate normal distribution has the following two parameters:

  • F0-dependent mean function, which captures the pitch dependency (i.e. the position of the distribution at each F0)
  • F0-normalized covariance, which captures the non-pitch dependency
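
One way to write such a density is sketched below. It assumes the standard multivariate normal form with the constant mean replaced by a function of F0. The symbols \mu_w(f) (mean function of instrument w), \Sigma_w (F0-normalized covariance) and d (number of feature dimensions) are notation introduced here for illustration, not taken from the slides.

```latex
p(X \mid w; f) =
  \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma_w\rvert^{1/2}}
  \exp\!\left(
    -\tfrac{1}{2}\,\bigl(X - \mu_w(f)\bigr)^{\top}
    \Sigma_w^{-1}
    \bigl(X - \mu_w(f)\bigr)
  \right)
```

Under this reading, only the center of the distribution moves with pitch, while its shape stays the same for every F0.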

4. Musical instrument identification using F0-dependent multivariate normal distribution

1st step: Feature extraction

129 features, defined by consulting the literature, are extracted.

e.g. Spectral centroid (which captures the brightness of tones)

[Figure: spectral centroid of a piano tone and of a flute tone]
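
As a rough illustration of this feature, the sketch below computes a spectral centroid as the magnitude-weighted mean frequency of a whole signal. The exact definition used among the 129 features is not given on the slides, so the function name `spectral_centroid`, the use of a single FFT over the whole tone, and the synthetic test tones are all assumptions.

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of the spectrum of `signal`."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0.0:
        return 0.0  # silent input: no meaningful centroid
    return float((freqs * spectrum).sum() / total)

# Synthetic stand-ins (not real instrument tones): adding a strong upper
# partial should move the centroid upwards, i.e. the tone gets "brighter".
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate            # one second of samples
dull = np.sin(2 * np.pi * 440 * t)
bright = dull + 0.8 * np.sin(2 * np.pi * 3520 * t)

print(spectral_centroid(dull, sample_rate))
print(spectral_centroid(bright, sample_rate))
```

A higher value indicates more energy in the upper partials, which is why the feature is read as brightness.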

4. Musical instrument identification using F0-dependent multivariate normal distribution

1st step: Feature extraction

129 features, defined by consulting the literature, are extracted.

e.g. Decay speed of power

[Figure: power of a flute tone (not decayed) and of a piano tone (decayed)]
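
One plausible way to measure a decay speed is to fit a straight line to the short-time power in decibels and take its slope. This is only a sketch: the slides do not define the feature, so `power_decay_slope`, the frame length and the synthetic tones are assumptions made for illustration.

```python
import numpy as np

def power_decay_slope(signal, sample_rate, frame_len=1024):
    """Slope (dB per second) of a line fitted to the short-time power."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Mean power per frame in dB; the small constant avoids log(0).
    power_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    times = (np.arange(n_frames) + 0.5) * frame_len / sample_rate
    slope, _intercept = np.polyfit(times, power_db, 1)
    return float(slope)

# Synthetic stand-ins: a sustained tone and an exponentially decaying one.
sample_rate = 16000
t = np.arange(2 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
sustained = tone                       # power stays roughly constant
decaying = tone * np.exp(-3.0 * t)     # power falls off after the onset

print(power_decay_slope(sustained, sample_rate))
print(power_decay_slope(decaying, sample_rate))
```

A slope near zero corresponds to the "not decayed" case and a clearly negative slope to the "decayed" case.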

4. Musical instrument identification using F0-dependent multivariate normal distribution

2nd step: Dimensionality reduction

First, the 129-dimensional feature space is transformed to a 79-dimensional space by PCA (principal component analysis), with a cumulative proportion value of 99%.

Second, the 79-dimensional feature space is transformed to an 18-dimensional space by LDA (linear discriminant analysis).
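
The two-stage reduction can be sketched with NumPy alone, as below. This is a textbook PCA (keep the leading axes until a cumulative proportion of variance is reached) followed by a textbook Fisher LDA, not necessarily the authors' implementation. The helper names `pca_fit` and `lda_fit` and the synthetic data are assumptions.

```python
import numpy as np

def pca_fit(X, proportion=0.99):
    """Return (mean, components) with enough axes to reach `proportion` of the variance."""
    mean = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X - mean, rowvar=False))
    order = np.argsort(eigvals)[::-1]              # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cumulative, proportion)) + 1, len(eigvals))
    return mean, eigvecs[:, :k]

def lda_fit(X, y, n_components):
    """Return a projection maximizing between-class over within-class scatter."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    within = np.zeros((d, d))
    between = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        mean_c = Xc.mean(axis=0)
        within += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_all)[:, None]
        between += len(Xc) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(within, between))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

# Synthetic stand-in data: 3 classes, 7 features (the last is nearly redundant).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 50)
shift = np.array([2.0, -1.0, 0.5, 0.0, 0.0, 0.0])
base = rng.normal(size=(150, 6)) + y[:, None] * shift
X = np.hstack([base, 2.0 * base[:, :1] + 1e-3 * rng.normal(size=(150, 1))])

mean, components = pca_fit(X, proportion=0.99)
X_pca = (X - mean) @ components
Z = X_pca @ lda_fit(X_pca, y, n_components=2)
print(X.shape, X_pca.shape, Z.shape)
```

The demo projects onto 2 axes because LDA yields at most (number of classes − 1) useful discriminant axes. With 19 instruments that bound would be 18, which is consistent with the dimensionality on the slide.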

4. Musical instrument identification using F0-dependent multivariate normal distribution

3rd step: Parameter estimation

First, the F0-dependent mean function is approximated as a cubic polynomial.
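
A cubic mean function can be estimated by ordinary least squares, for example with `numpy.polyfit`. In the sketch below the pitch variable is expressed in octaves relative to 440 Hz. That choice, the helper name `fit_mean_function` and the synthetic data are assumptions, since the slides do not say how F0 is scaled or how the fit is computed.

```python
import numpy as np

def fit_mean_function(pitch, feature, degree=3):
    """Least-squares polynomial mu(pitch) for one feature dimension."""
    return np.poly1d(np.polyfit(pitch, feature, degree))

# Synthetic stand-in for one feature of one instrument.
# pitch = log2(F0 / 440 Hz), roughly covering C2 to C6 (an assumed scaling).
rng = np.random.default_rng(0)
pitch = rng.uniform(-2.75, 1.25, size=500)
true_mean = np.poly1d([0.1, 0.0, -0.8, 0.5])       # 0.1 x^3 - 0.8 x + 0.5
feature = true_mean(pitch) + rng.normal(scale=0.05, size=500)

mu = fit_mean_function(pitch, feature)
grid = np.linspace(-2.75, 1.25, 9)
max_error = float(np.max(np.abs(mu(grid) - true_mean(grid))))
print(max_error)
```

One such polynomial would be fitted per feature dimension and per instrument.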

4. Musical instrument identification using F0-dependent multivariate normal distribution

3rd step: Parameter estimation

Second, the F0-normalized covariance is obtained after subtracting the F0-dependent mean from each feature, which eliminates the pitch dependency.
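
The normalization can be sketched as follows: fit a mean function per feature dimension, subtract it, and take the covariance of the residuals. The helper name `f0_normalized_covariance`, the octave-scaled pitch variable and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def f0_normalized_covariance(pitch, features, degree=3):
    """features: (n_samples, n_dims). Returns (mean functions, residual covariance)."""
    mean_funcs = [
        np.poly1d(np.polyfit(pitch, features[:, j], degree))
        for j in range(features.shape[1])
    ]
    # Residuals after removing the pitch-dependent mean of every dimension.
    residuals = features - np.column_stack([mu(pitch) for mu in mean_funcs])
    return mean_funcs, np.cov(residuals, rowvar=False)

# Synthetic stand-in: two features with a strong pitch trend plus correlated noise.
rng = np.random.default_rng(1)
pitch = rng.uniform(-2.75, 1.25, size=400)
noise = rng.multivariate_normal([0.0, 0.0], [[0.04, 0.01], [0.01, 0.02]], size=400)
features = np.column_stack([0.9 * pitch, 0.3 * pitch ** 2]) + noise

mean_funcs, cov = f0_normalized_covariance(pitch, features)
raw_cov = np.cov(features, rowvar=False)
print(np.trace(raw_cov), np.trace(cov))
```

The raw covariance mixes the pitch trend with the remaining variation, so its trace is larger. After normalization only the variation that does not follow pitch is left.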

4. Musical instrument identification using F0-dependent multivariate normal distribution

Final step: Using the Bayes decision rule

The instrument w satisfying w = argmax [log p(X|w; f) + log p(w; f)] is determined as the result.

p(X|w; f):
  • A probability density function of the F0-dependent multivariate normal distribution.
  • Defined using the F0-dependent mean function and the F0-normalized covariance.
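
The decision rule can be sketched as below, assuming the density is an ordinary multivariate normal whose mean is evaluated at the observed F0. The names `log_density`, `classify`, `models` and `log_prior`, the two toy instrument models and the uniform prior are hypothetical; the slides do not specify how p(w; f) is obtained.

```python
import numpy as np

def log_density(x, f0, mean_funcs, cov):
    """log p(x | w; f) for a normal density whose mean depends on f0."""
    diff = x - np.array([mu(f0) for mu in mean_funcs])
    _sign, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + mahalanobis)

def classify(x, f0, models, log_prior):
    """Return the w maximizing log p(x | w; f) + log p(w; f)."""
    scores = {
        w: log_density(x, f0, mean_funcs, cov) + log_prior(w, f0)
        for w, (mean_funcs, cov) in models.items()
    }
    return max(scores, key=scores.get)

# Toy 2-D models (hypothetical numbers): each entry is (mean functions, covariance).
models = {
    "piano": ([np.poly1d([1.0, 0.0]), np.poly1d([0.0])], np.eye(2)),
    "flute": ([np.poly1d([-1.0, 0.0]), np.poly1d([2.0])], np.eye(2)),
}

def uniform_log_prior(w, f0):
    # Assumption: equal priors for both instruments at every pitch.
    return np.log(0.5)

print(classify(np.array([0.9, 0.1]), 1.0, models, uniform_log_prior))
print(classify(np.array([-1.1, 1.8]), 1.0, models, uniform_log_prior))
```

Working with log probabilities turns the product p(X|w; f) p(w; f) into a sum, which avoids numerical underflow in higher dimensions.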

5. Experiments (Conditions)
  • Database: A subset of RWC-MDB-I-2001
    • Consists of solo tones of 19 real instruments over their whole pitch ranges.
    • Contains 3 individuals and 3 intensities for each instrument.
    • Contains normal articulation only.
    • The total number of sounds is 6,247.
  • 10-fold cross-validation is used.
  • Performance is evaluated both at the individual-instrument level and at the category level.
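
A 10-fold split over the 6,247 sounds could look like the sketch below. Whether the original folds were random, stratified by instrument, or grouped by individual is not stated on the slides, so the plain random split and the helper name `k_fold_indices` are assumptions.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

folds = list(k_fold_indices(6247, k=10))
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

Each sound is used for testing in exactly one fold and for training in the other nine.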
5. Experiments (Results)

Recognition rates:
  • 79.73% (at individual level, 19 classes)
  • 90.65% (at category level, 8 classes)

Improvement:
  • 4.00% (at individual level)
  • 2.45% (at category level)

Error reduction (relative):
  • 16.48% (at individual level)
  • 20.67% (at category level)
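
The relative error reduction can be checked with a few lines of arithmetic. The sketch below uses the individual-level rates stated in the conclusions of the talk (75.73% before, 79.73% after); the function name `relative_error_reduction` is introduced here for illustration.

```python
def relative_error_reduction(rate_before, rate_after):
    """Percentage of errors removed when the recognition rate rises (rates in %)."""
    error_before = 100.0 - rate_before
    error_after = 100.0 - rate_after
    return 100.0 * (error_before - error_after) / error_before

# Individual level: the error rate drops from 24.27% to 20.27%.
reduction = relative_error_reduction(75.73, 79.73)
print(round(reduction, 2))  # 16.48
```

An absolute gain of 4.00 points therefore removes about one sixth of the errors made before.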

5. Experiments (Results)

The recognition rates of the following 6 instruments were improved by more than 7%.

Piano: the most improved (74.21% → 83.27%), because the piano has a wide pitch range.

6. Conclusions
  • To cope with the pitch dependency of timbre in musical instrument identification, an F0-dependent multivariate normal distribution is proposed.
  • Experimental results: the recognition rate improved from 75.73% to 79.73% (using 6,247 solo tones of 19 instruments).
  • Future work:
    • Evaluation on mixtures of sounds
    • Development of application systems using the proposed method
Recognition rates at category level

[Figure: recognition rates for each of the 8 categories. Error reduction per category: 35%, 8%, 23%, 33%, 20%, 13%, 15%, 8%]

  • Recognition rates for all categories were improved.
  • Recognition rates for Piano, Guitar, Strings: 96.7%
Bayes vs k-NN

[Figure: comparison of the following configurations]
  • Bayes (18 dim; PCA+LDA) (the one we adopt)
  • Bayes (79 dim; PCA only)
  • Bayes (18 dim; PCA only)
  • 3-NN (18 dim; PCA+LDA)
  • 3-NN (79 dim; PCA only)
  • 3-NN (18 dim; PCA only)

Findings:
  • PCA+LDA+Bayes achieved the best performance.
  • 18 dimensions are better than 79 dimensions: the amount of training data is not enough for 79 dimensions.
  • The use of LDA improved the performance: LDA considers the separation between classes.
Jain’s guideline (1982): Having 5 to 10 times as many training data as the number of dimensions seems to be a good practice.
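
A back-of-the-envelope check of this guideline against the experimental setup is sketched below. It rests on two assumptions that the slides do not confirm: that the guideline is applied to the training data of each class, and that the 6,247 sounds are spread evenly over the 19 instruments. The function name `jain_range` is introduced here for illustration.

```python
def jain_range(n_dims, low=5, high=10):
    """Training samples suggested by the 5x to 10x rule of thumb."""
    return low * n_dims, high * n_dims

total_sounds, n_classes, k_folds = 6247, 19, 10
# ASSUMPTION: equal class sizes; 9 of the 10 folds are used for training.
train_per_class = total_sounds * (k_folds - 1) / k_folds / n_classes

for dims in (18, 79):
    low, high = jain_range(dims)
    print(dims, low, high, round(train_per_class, 1), train_per_class >= low)
```

Under these assumptions roughly 296 training sounds are available per instrument, which exceeds the suggested 90 to 180 for 18 dimensions but falls short of the 395 to 790 suggested for 79 dimensions.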

Relationship between training data and dimension

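[Figure: results for the following numbers of dimensions (cumulative proportion of PCA in parentheses): 14 dim. (85%), 18 dim. (88%), 20 dim. (89%), 23 dim. (90%), 32 dim. (93%), 41 dim. (95%), 52 dim. (97%), 79 dim. (99%)]

  • At 23 dimensions, the performance peaked (Hughes’s peaking phenomenon).
  • All results without LDA are worse than those with LDA.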