
Fuzzy Entropy based feature selection for classification of hyperspectral data. Mahesh Pal, Department of Civil Engineering, National Institute of Technology Kurukshetra.




Presentation Transcript


  1. Fuzzy Entropy based feature selection for classification of hyperspectral data Mahesh Pal Department of Civil Engineering National Institute of Technology Kurukshetra

  2. Hyperspectral data • Measurement of radiation in the visible to the infrared spectral region in many finely spaced spectral wavebands. • Provide greater detail on the spectral variation of targets than conventional multispectral systems. • The availability of large amounts of data represents a challenge to classification analyses. • Each spectral waveband used in the classification process should add an independent set of information. However, features are highly correlated, suggesting a degree of redundancy in the available information which can have a negative impact on classification accuracy.

  3. An example: MULTISPECTRAL DATA: discrete wavebands, for example Landsat 7: Band 1: 0.45-0.515 µm; Band 2: 0.525-0.605 µm; between 0.45-2.235 µm, a total of six bands. HYPERSPECTRAL DATA: DAIS data: between 0.502-2.395 µm, a total of 72 continuous bands at 10-45 nm bandwidth. Spectral regions: 0.4-0.7 µm visible, 0.7-1.3 µm NIR, 1.0-3.0 µm MIR, 3-100 µm thermal.

  4. Various approaches can be adopted for the appropriate classification of high dimensional data: • Adoption of a classifier that is relatively insensitive to the Hughes effect (Vapnik, 1995). • Use of methods that effectively increase the training set size, i.e. semi-supervised classification (Chi and Bruzzone, 2005) and the use of unlabelled data (Shahshahani and Landgrebe, 1994). • Use of some form of dimensionality reduction procedure prior to the classification analysis.

  5. Feature reduction Two broad categories are feature selection and feature extraction. Feature reduction may speed up the classification process by reducing the data set size, may increase the predictive accuracy, and may improve the ability to understand the classification rules. Feature selection selects a subset of the original features that maintains the information needed to separate the classes, removing redundant features.

  6. Feature selection Three approaches to feature selection are: Filters: use a search algorithm to search through the space of possible features and evaluate each feature with a filter measure such as correlation or mutual information. Wrappers: use a search algorithm to search through the space of possible feature subsets and evaluate each subset by training a classification algorithm. Embedded: some classification processes, such as random forest, produce a ranked list of features during classification. This study explores the usefulness of four filter based feature selection approaches.
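To make the filter idea concrete, the following is a minimal sketch of one filter measure named on the slide, ranking features by the absolute Pearson correlation between each feature and the class label. This is an illustration of the filter concept only, not the exact filters used in the study; the function names are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def filter_rank(X, y):
    """Rank feature indices by |correlation with the class label|, best first.

    X: list of samples (each a list of feature values); y: class labels.
    A filter evaluates each feature independently of any classifier.
    """
    n_features = len(X[0])
    scores = [abs(pearson([row[f] for row in X], y)) for f in range(n_features)]
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)
```

A wrapper would instead score each candidate subset by cross-validated classifier accuracy, which is far more expensive for hyperspectral data with many bands.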

  7. Feature selection approaches • Four filter based feature selection approaches were used. • Entropy • Fuzzy entropy • Signal-to-noise ratio • RELIEF

  8. Entropy and Fuzzy Entropy For a finite set X, if P is the probability distribution on X, Yager's entropy is defined on P. For a fuzzy information system (U, A, V, f) (Hu and Yu, 2005), where U is a finite set of objects and A is the set of features: if Q is a subset of the attribute set A, a fuzzy relation matrix is induced by an indiscernibility relation on Q. The significance of an attribute a is defined from the change in fuzzy entropy when a is removed from Q; if the significance of a is zero, attribute a is considered redundant. Further details of this algorithm can be found in Hu and Yu (2005).
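The formulas on this slide appear as images in the original deck, so here is a hedged sketch of the fuzzy entropy of an indiscernibility relation in the style of Hu and Yu (2005), assuming the common form H(Q) = -(1/n) Σ log2(|[x_i]_Q| / n), where |[x_i]_Q| is the fuzzy cardinality (row sum) of the fuzzy equivalence class of object x_i. The exact form and log base used in the study are assumptions here.

```python
import math

def fuzzy_entropy(relation):
    """Fuzzy entropy of a fuzzy indiscernibility relation (assumed Hu-Yu form).

    `relation` is an n x n matrix with entries in [0, 1];
    relation[i][j] is the similarity of objects i and j under the
    selected attribute subset Q.
    H = -(1/n) * sum_i log2(|[x_i]| / n), with |[x_i]| = sum_j relation[i][j].
    """
    n = len(relation)
    total = 0.0
    for row in relation:
        card = sum(row)  # fuzzy cardinality of the equivalence class of x_i
        total += math.log2(card / n)
    return -total / n
```

Under this form, a crisp identity relation (every object indiscernible only from itself) gives maximal entropy, while an all-ones relation (all objects indiscernible) gives zero entropy; attribute significance can then be scored as the entropy change when an attribute is dropped from Q.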

  9. Signal to noise ratio This approach ranks all features according to how well each feature discriminates between two classes. To use this approach for a multiclass classification problem, a one against one strategy was used in this study.
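The slide does not spell out the score, so the sketch below assumes a common signal-to-noise form, s(f) = |μ1 − μ2| / (σ1 + σ2), computed per feature from the two classes' means and standard deviations.

```python
import math

def signal_to_noise(class1, class2):
    """Rank score for one feature: |mean1 - mean2| / (std1 + std2).

    class1, class2: lists of that feature's values in each class.
    Higher scores indicate features that separate the two classes better.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def std(xs):
        m = mean(xs)
        return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

    return abs(mean(class1) - mean(class2)) / (std(class1) + std(class2))

# A well-separated feature scores higher than an overlapping one.
separated = signal_to_noise([1.0, 1.1, 0.9], [5.0, 5.1, 4.9])
overlapping = signal_to_noise([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
```

For the multiclass case, as on the slide, the score would be computed for every pair of classes (one against one) and the per-feature scores combined across pairs.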

  10. RELIEF • The general idea of RELIEF is to choose the features that best distinguish between classes. • At each step of an iterative process, an instance is chosen at random from the dataset and the weight of each feature is updated according to the distance of this instance to its near-miss and near-hit (Kira and Rendell, 1992). • An instance from the dataset is a near-hit of X if it belongs to the close neighbourhood of X and to the same class as X. • An instance is a near-miss if it belongs to the neighbourhood of X but not to the same class as X.
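The update rule above can be sketched as follows. For determinism this version iterates over every instance rather than sampling at random, and uses squared difference as the per-feature diff; both are simplifications of the original algorithm.

```python
def relief_weights(X, y):
    """RELIEF-style feature weights (after Kira and Rendell, 1992).

    Deterministic sketch: for each instance, find its nearest neighbour
    of the same class (near-hit) and of a different class (near-miss),
    then grow W[f] when the near-miss differs on feature f and shrink it
    when the near-hit does.
    """
    n_features = len(X[0])
    weights = [0.0] * n_features

    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    for i, x in enumerate(X):
        same = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        other = [j for j in range(len(X)) if y[j] != y[i]]
        hit = min(same, key=lambda j: dist(x, X[j]))    # near-hit
        miss = min(other, key=lambda j: dist(x, X[j]))  # near-miss
        for f in range(n_features):
            weights[f] += (x[f] - X[miss][f]) ** 2 - (x[f] - X[hit][f]) ** 2
    return weights
```

A feature that separates the classes accumulates a large positive weight, while an irrelevant feature stays near zero, giving a ranked list from which the top features are kept.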

  11. Data Set • DAIS 7915 sensor by the German Space Agency, flown on 29 June 2000. • The sensor acquires information in 79 bands at a spatial resolution of 5 m in the wavelength range 0.502–12.278 µm. • Seven features located in the mid- and thermal-infrared region and seven features from the 0.502–2.395 µm spectral region were removed due to striping noise. • An area of 512 pixels by 512 pixels and 65 features covering the test site was used.

  12. Training and test data Random sampling was used to collect training and test data using a ground reference image. Eight land cover classes were used, i.e. wheat, water, salt lake, hydrophytic vegetation, vineyards, bare soil, pasture and built-up land. A total of 800 training pixels and 3800 test pixels were used.

  13. Classification Method Support vector machines with a one against one approach for multiclass data were used, with a radial basis function kernel, regularisation parameter C = 5000 and gamma = 2. For each feature selection approach, classification accuracy was obtained with the test dataset. A test for non-inferiority using the McNemar test was applied.
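As a minimal sketch of what the reported parameters refer to, the RBF kernel value is assumed here in its usual form K(x, z) = exp(-gamma * ||x - z||^2); C controls the SVM's regularisation and does not enter the kernel itself.

```python
import math

def rbf_kernel(x, z, gamma=2.0):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2).

    gamma = 2 matches the value reported on the slide; the SVM
    regularisation parameter C = 5000 is set separately at training time.
    """
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

With eight classes, the one against one approach trains 8 x 7 / 2 = 28 binary SVMs and combines their outputs, typically by majority voting.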

  14. Selected features with different feature selection approaches

  15. Classification accuracy with SVM classifier with different selected features

  16. Difference and non-inferiority test results, based on the 95% confidence interval on the estimated difference between the accuracy achieved with all 65 features and that achieved with the feature sets selected by each approach.
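For reference, the McNemar test mentioned on slide 13 compares two classifiers on the same test pixels via the discordant counts only. The continuity-corrected variant below is an assumption, since the slides do not state which form was used.

```python
def mcnemar_statistic(b, c):
    """McNemar chi-squared statistic with continuity correction.

    b: test pixels classified correctly by classifier 1 only,
    c: test pixels classified correctly by classifier 2 only.
    Values above 3.84 indicate a significant difference in accuracy at
    the 5% level (chi-squared distribution with one degree of freedom).
    """
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)
```

A non-inferiority conclusion, as on this slide, instead checks that the confidence interval on the accuracy difference does not cross a pre-set tolerance margin.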

  17. Conclusions • The fuzzy entropy based feature selection approach works well with this dataset and provides comparable performance with a small number of selected features. • Accuracy achieved by the signal to noise ratio and entropy based approaches is also comparable to that achieved with the full dataset, but these approaches require a larger number of selected features than the fuzzy entropy based approach. • Results with the RELIEF based approach show a significant decline in classification accuracy in comparison to the full dataset.
