Class distributions on SOM surfaces for feature extraction and object retrieval

Advisor: Dr. Hsu

Graduate: Kuo-min Wang

Authors: Jorma T. Laaksonen*, J. Markus Koskela, Erkki Oja

2005, Expert Systems with Applications



Outline

  • Motivation

  • Objective

  • Introduction

  • Class Distributions

  • BMU Probabilities

  • BMU Entropy

  • SOM Surface Convolutions

  • Multiple Feature Extractions

  • Bayesian Decision Estimation

  • Personal Opinion


Motivation

  • A Self-Organizing Map (SOM) is typically trained in unsupervised mode, using a large batch of training data.

  • Even from the same data, qualitatively different distributions can be obtained by using different feature extraction techniques.


Objective

  • We use such distributions for comparing different classes and different feature representations of the data in our content-based image retrieval system PicSOM.

  • The information-theoretic measures of entropy and mutual information are suggested to evaluate the compactness of a distribution and the independence of two distributions.


Introduction

  • Image Retrieval

    • Comprises segmentation, feature extraction, representation, and query processing

  • Segmentation

    • Partitions an image into distinct regions, usually by finding the edges of the objects in the image and then deciding whether each region is meaningful

  • Feature extraction

    • Describes the characteristics of a particular region of an image

    • Feature extraction is directly tied to the feature representation, since different representations call for different extraction methods

    • Typical features: color, shape, and texture

Introduction (cont.)

  • We study how object class histograms on SOMs can be given interpretations in terms of probability densities and information-theoretic measures

    • Entropy and mutual information (Cover & Thomas, 1991)

  • A good feature

    • the class is heavily concentrated on only a few nearby map elements, giving a low value of entropy.

    • The mutual information of two features’ distributions is a measure of how independent those features are.

Class Distributions

  • Normalized to unit sum

    • the hit frequencies give a discrete histogram, which is a sample estimate of the probability distribution of the class on the SOM surface.

  • The shape of the distribution depends on several factors

    • The distribution of the original data

      • This cannot be controlled in the very high-dimensional pattern space

    • The feature extraction technique in use affects the metrics and the distribution of all the generated feature vectors

      • Feature invariance

      • Some pattern space directions are retained better than others.

      • When the feature works properly, semantically similar patterns are mapped near each other

Class Distributions (cont.)

  • The overall shape of the training set

    • After the set has been mapped from the original data space to the feature vector space, its shape determines the overall organization of the SOM.

  • The class distribution of the studied object subset or class, relative to the overall shape of the feature vector distribution.

Class Distributions (cont.)

  • Measures of the denseness or locality of feature vectors on a SOM

    • SDH (Pampalk, Rauber, & Merkl, 2002)

      • Each data point is mapped not only to its nearest SOM unit but to its s nearest units, with reciprocally decreasing fractions.

  • Quantitative locality measures

    • Map usage, average pair distance, fragmentation, and purity (Pullwitt, 2002)

    • These fail to take into account the topological structure of the class

BMU Probabilities

  • Calculate the a priori probability of each SOM unit being the BMU for a vector x drawn from the feature space.

  • This probability follows from the probability density function (pdf) of the feature data, as sketched below
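A hedged reconstruction of the formula, assuming the standard formulation: with $f(\mathbf{x})$ the pdf of the feature vectors and $V_i$ the Voronoi region of unit $i$ (defined on the next slide), the a priori BMU probability of unit $i$ is

$$p_i = P\{\mathbf{x} \in V_i\} = \int_{V_i} f(\mathbf{x})\,d\mathbf{x}$$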

BMU Probabilities (cont.)

  • Voronoi region

    • The set of vectors in the original feature space that are closer to the weight vector of unit i than to any other weight vector

    • We are actually replacing the continuous pdf with a discrete probability histogram by counting the number of times that any given map unit is the BMU.

    • The probability histogram of class C on the SOM surface is obtained by counting only the BMU hits of the objects in class C, as sketched below
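A minimal sketch of this counting step (the function name and array layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def class_histogram(weights, X_class):
    """Estimate a class's distribution on the SOM surface.

    weights : (H, W, d) array of trained SOM weight vectors.
    X_class : (n, d) array of feature vectors of one object class.
    Returns an (H, W) BMU-hit histogram normalized to unit sum.
    """
    H, W, d = weights.shape
    flat = weights.reshape(-1, d)
    hits = np.zeros(H * W)
    for x in X_class:
        bmu = np.argmin(np.linalg.norm(flat - x, axis=1))  # best-matching unit
        hits[bmu] += 1
    return (hits / hits.sum()).reshape(H, W)  # unit-sum histogram
```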

BMU Entropy

  • The entropy H of a distribution $P = (P_0, P_1, \dots, P_{K-1})$ is calculated as $H(P) = -\sum_{i=0}^{K-1} P_i \log_2 P_i$, with the convention $0 \log 0 = 0$
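Computed directly from a unit-sum histogram such as the one returned by the hypothetical class_histogram above:

```python
import numpy as np

def bmu_entropy(hist):
    """Shannon entropy (in bits) of a unit-sum histogram on the SOM surface."""
    p = hist[hist > 0]          # drop empty units; by convention 0*log(0) = 0
    return float(-(p * np.log2(p)).sum())
```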

SOM Surface Convolutions

  • Entropy Drawback

    • The calculation of entropies does not yet take into account the spatial topology of the SOM units in any way.

    • It is the topological order of the units that separates the SOM from other vector quantization methods.

  • The convolution method bears similarity to the smoothed data histogram approach (Pampalk et al., 2002)

    • Data points are not mapped one-to-one to their BMUs but are spread over the s closest map units in the feature space.

SOM Surface Convolutions (cont.)

  • The larger the convolution window, the smoother the overall shape of the distribution, as fine details vanish.

  • The selection of a proper size for the convolution mask can be identified as a form of the general scale-space problem; a sketch of the convolution step follows below.
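A minimal sketch of the convolution step, assuming a rectangular SOM grid and a flat mask (the mask shape is an assumption, not the paper's exact choice):

```python
import numpy as np
from scipy.ndimage import convolve

def smooth_histogram(hist, size=3):
    """Convolve a class histogram on the SOM surface with a flat
    size-by-size mask and renormalize it to unit sum."""
    mask = np.ones((size, size)) / size**2        # uniform convolution window
    smoothed = convolve(hist, mask, mode="nearest")
    return smoothed / smoothed.sum()
```

Increasing size yields the smoother, coarser distributions described above.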

Multiple Feature Extractions

  • It is possible to use more than one feature extraction method in parallel.

  • In CBIR, three different feature categories are generally recognized: color, texture, and shape features.

  • Let us denote the distributions of the same object class on two SOMs, trained with different features, by $P = (P_0, P_1, \dots, P_{K-1})$ and $Q = (Q_0, Q_1, \dots, Q_{K-1})$

  • H(P) and H(Q) measure the compactness of the individual features' distributions; the mutual information I(P, Q) can be used for studying the interplay between them, as sketched below
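In standard form (Cover & Thomas, 1991), and assuming the joint histogram $P_{ij}$ is estimated from the pair of BMUs each object has on the two SOMs:

$$I(P, Q) = \sum_{i}\sum_{j} P_{ij} \log_2 \frac{P_{ij}}{P_i\,Q_j} = H(P) + H(Q) - H(P, Q)$$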

Multiple Feature Extractions (cont.)

  • HT and nHT have by far the largest values for mutual information

  • CS and SC have the largest value on both SOMs

  • The mutual information of EH and HT is high on the smaller SOM, but less so on the larger SOM with more resolution.

Bayesian Decision Estimation

  • The Bayesian decision rule can be used to make optimal classifications

  • Posterior probability

  • To decide on the jth object's membership in class C, as sketched below
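A hedged reconstruction using the smoothed class histograms: with $\hat{P}_C$ the unit-sum distribution of class $C$ and $b_j$ the BMU of object $j$, Bayes' rule gives

$$P(C \mid b_j) = \frac{\hat{P}_C(b_j)\,P(C)}{\sum_{C'} \hat{P}_{C'}(b_j)\,P(C')}$$

and object $j$ is assigned to the class that maximizes this posterior.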

Bayesian Decision Estimation (cont.)

  • Query by example

    • Presents a number of images to the user at each query round, and the user is expected to evaluate their relevance to her current task.

  • Relevance feedback

    • Incrementally fine-tunes the selection so that more and more relevant images are shown on consecutive query rounds

  • Choosing the next image for the user

    • Maximal probability of relevance

    • Minimal probability of nonrelevance

Bayesian Decision Estimation (cont.)

  • The relevance feedback problem is handled by updating the distributions (sketched in code below):

    • adding the hits caused by the new relevant and nonrelevant samples to the map units,

    • convolving them with the mask used,

    • and renormalizing the distributions to unit sums

  • Let us denote the history of the query up to the (t − 1)th round by …

  • Maximize the current probability of relevance
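A minimal sketch of one feedback round, reusing the flat mask from the hypothetical smooth_histogram above (the bookkeeping details are assumptions, not the paper's code):

```python
import numpy as np
from scipy.ndimage import convolve

def feedback_round(hits, new_bmus, size=3):
    """Add hits for the newly marked images, convolve with the mask,
    and renormalize to a unit-sum distribution.

    hits     : (H, W) raw BMU-hit counts for the relevant
               (or nonrelevant) images seen so far.
    new_bmus : iterable of (row, col) BMU coordinates of the images
               the user marked on this round.
    """
    for r, c in new_bmus:
        hits[r, c] += 1                          # add the new hits
    mask = np.ones((size, size)) / size**2       # same smoothing mask as before
    smoothed = convolve(hits, mask, mode="nearest")
    return hits, smoothed / smoothed.sum()       # raw counts and the new pdf
```

The next images to show are then, for example, the unseen ones whose BMUs score highest under the relevant distribution, or under its difference from the nonrelevant one.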

Conclusions


  • The entropy of the distribution quantitatively characterizes the compactness of an object class.

  • The proposed method can be used as an efficient way of comparing these features and the SOMs produced with them.

  • We showed that the mutual information of the distributions

    • can be used to identify both the most similar and the most uncorrelated features

    • can also be used to select the subset of feature extraction methods with the most independent features.

  • The Bayesian decision rule

    • can be used for choosing either the most probable class for a data item, or the most likely data item belonging to a given class.

  • Personal Opinions

    • Advantage

      • Combines entropy, mutual information, and smoothing to identify important features and the independence between features.

    • Application

      • Feature extraction

    • Drawback

      • The structure of the paper is not well organized.

      • Some diagrams are unclear and therefore difficult to understand.
