Review of Statistical Pattern Recognition
Wen-Hung Liao, 10/9/2007
Review Paper
• A.K. Jain, R.P.W. Duin, and J. Mao, "Statistical Pattern Recognition: A Review," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 22, No. 1, pp. 4-37, Jan. 2000.
• More review papers: http://www.ph.tn.tudelft.nl/PRInfo/revpapers.html
Statistical Approach in PR
• Each pattern is represented in terms of d features and is viewed as a point in a d-dimensional feature space.
• Goal: establish decision boundaries that separate patterns belonging to different classes.
• This requires specifying or estimating the probability distribution of the patterns in each class.
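A minimal sketch of this view, assuming Gaussian class-conditional densities (the means, covariances, and priors below are illustrative, not from the slides): each pattern is a point in d-dimensional space, and the decision rule assigns it to the class with the highest posterior.

```python
# Patterns as points in d-dimensional space, classified by a Bayes rule
# under assumed Gaussian class-conditional densities. The means,
# covariances, and priors are illustrative assumptions.
import numpy as np
from scipy.stats import multivariate_normal

d = 2
priors = {"w1": 0.5, "w2": 0.5}
densities = {
    "w1": multivariate_normal(mean=[0.0, 0.0], cov=np.eye(d)),
    "w2": multivariate_normal(mean=[2.0, 2.0], cov=np.eye(d)),
}

def classify(x):
    # Assign x to the class maximizing p(x|wi) * P(wi); the implied
    # decision boundary separates the two class regions.
    return max(priors, key=lambda w: densities[w].pdf(x) * priors[w])

print(classify(np.array([0.2, -0.1])))  # near w1's mean -> "w1"
print(classify(np.array([1.9, 2.3])))   # near w2's mean -> "w2"
```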
Links Between Statistical and Neural Network Methods
• Linear Discriminant Function ↔ Perceptron
• Principal Component Analysis ↔ Auto-Associative Networks
• Nonlinear Discriminant Function ↔ Multilayer Perceptron
• Parzen Window Density-based Classifier ↔ Radial Basis Function Network
Model for Statistical Pattern Recognition
• Training mode: Preprocessing → Feature Extraction/Selection → Learning
• Classification (testing) mode: Preprocessing → Feature Measurement → Classification
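The two modes can be sketched as a pipeline that is fit once in training mode and reused at test time; the specific components below (scaling, PCA, a nearest-centroid classifier) are illustrative choices, not ones prescribed by the paper.

```python
# Sketch of the two operating modes: preprocessing and feature
# extraction are fit in training mode, then reused unchanged in
# classification mode. Component choices are illustrative.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestCentroid

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4)) + np.repeat([[0], [3]], 50, axis=0)
y_train = np.repeat([0, 1], 50)

pipe = Pipeline([
    ("preprocess", StandardScaler()),   # preprocessing
    ("extract", PCA(n_components=2)),   # feature extraction/selection
    ("classify", NearestCentroid()),    # learning / classification
])
pipe.fit(X_train, y_train)              # training mode
print(pipe.predict(X_train[:3]))        # classification (testing) mode
```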
The Curse of Dimensionality
• The performance of a classifier depends on the relationship between sample size, number of features, and classifier complexity.
• As a rule of thumb, the number of training samples needed grows exponentially with the dimensionality of the feature space.
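One way to see why, in a short numpy sketch (sample size and dimensions are illustrative): as d grows, uniformly sampled points become nearly equidistant, so a fixed-size sample covers the feature space ever more sparsely.

```python
# Distance concentration: as d grows, the spread between the nearest
# and farthest neighbor shrinks relative to the distances themselves.
# Sample size and the list of dimensions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 500
for d in (1, 2, 10, 100, 1000):
    X = rng.uniform(size=(n, d))
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances to one point
    ratio = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  (max - min) / min = {ratio:.3f}")
```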
Class-Conditional Probability
• Feature vector of length d: x = (x1, x2, …, xd)
• c classes (or categories): w1, w2, …, wc
• Class-conditional probability: the probability density of observing x given that it belongs to class wi, written p(x|wi)
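The class-conditional density enters classification through Bayes' rule (standard, though not written out on the slide), which combines it with the class prior P(wi) to give the posterior that the decision rule maximizes:

```latex
% Bayes' rule: posterior from class-conditional density and class prior
P(w_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid w_i)\, P(w_i)}
                              {\sum_{j=1}^{c} p(\mathbf{x} \mid w_j)\, P(w_j)}
```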
How Many Features are Enough?
• Question: do more features always mean better classification?
• Answer:
  • Yes, if the class-conditional densities are completely known.
  • No, if the class-conditional densities must be estimated from a limited training set: beyond some point, the added estimation error outweighs the added discriminatory information (the peaking phenomenon; see the sketch below).
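A numpy sketch of this peaking effect, loosely in the spirit of Trunk's classic construction (all parameters below are illustrative assumptions): two Gaussian classes with means ±μ, μ_i = 1/√i, and identity covariance, classified by a linear rule whose mean is estimated from a small training set. The error first drops, then climbs back up as features are added.

```python
# Peaking demo: with the class mean estimated from only n_train samples,
# adding features helps at first, then hurts, because estimation error
# eventually dominates the extra discriminatory information.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, trials = 20, 1000, 20

for d in (1, 5, 20, 100, 1000):
    mu = 1.0 / np.sqrt(np.arange(1, d + 1))  # per-feature separation
    errs = []
    for _ in range(trials):
        # estimate the class mean from a small training sample
        mu_hat = (rng.normal(size=(n_train, d)) + mu).mean(axis=0)
        # plug-in linear rule: classify x by the sign of x . mu_hat;
        # test on samples drawn from the +mu class
        X_test = rng.normal(size=(n_test, d)) + mu
        errs.append(np.mean(X_test @ mu_hat < 0))
    print(f"d={d:4d}  estimated error = {np.mean(errs):.3f}")
```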
Dimensionality Reduction
• Keep the number of features as small as possible (but not too small):
  • fewer features mean lower measurement cost;
  • too few features hurt classification accuracy.
• There is always a trade-off between the two.
Feature Extraction/Selection
• Feature extraction: compute new features from the sensed data, typically by transforming the original measurements (a PCA sketch follows this list).
• Feature selection: select (hopefully) the best subset of the input feature set.
• Feature extraction usually precedes feature selection.
• Both are application-domain dependent.
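As one concrete instance of feature extraction, here is a small numpy sketch of principal component analysis (the data and the number of retained components are illustrative): new features are formed as linear combinations of the inputs along the directions of largest variance.

```python
# PCA as feature extraction: project centered data onto the top-k
# eigenvectors of the sample covariance matrix. Data shape and k are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))            # 150 patterns, 4 raw features
Xc = X - X.mean(axis=0)                  # center the data

cov = Xc.T @ Xc / (len(Xc) - 1)          # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # largest variance first

k = 2                                    # number of extracted features
Z = Xc @ eigvecs[:, order[:k]]           # projected (extracted) features
print(Z.shape)                           # (150, 2)
```

Feature selection, by contrast, would keep a subset of the original four columns rather than forming linear combinations of them.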
Example: Chernoff Faces
• Three classes of faces; each pattern is drawn as a cartoon face whose attributes encode the feature values.
• Feature set: nose length, mouth curvature, eye size, face shape.
• 150 four-dimensional patterns, 50 patterns per class.