
Commonly Used Classification Techniques and Recent Developments


Presentation Transcript


  1. Commonly Used Classification Techniques and Recent Developments Presented by Ke-Shiuan Lynn

  2. Terminology A classifier can be viewed as a function block: it maps an input vector (the feature) to an output (the class). A classifier assigns one class to each point of the input space. The input space is thus partitioned into disjoint subsets, called decision regions, each associated with a class.

  3. Terminology (cont.) [Figure: a two-dimensional input space (Input #1 vs. Input #2) containing inputs of class A and class B, showing decision regions separated by decision boundaries.] The way a classifier classifies inputs is defined by its decision regions. The borderlines between decision regions are called decision-region boundaries, or simply decision boundaries.

  4. Terminology (cont.) In practice, input vectors of different classes are rarely so neatly distinguishable: samples of different classes may have the same input vector. Due to such uncertainty, areas of the input space can be clouded by a mixture of samples of different classes. [Figure: overlapping class samples in the Input #1 vs. Input #2 plane.]

  5. Terminology (cont.) • The optimal classifier is the one expected to produce the fewest misclassifications. • Such misclassifications are due to uncertainty in the problem rather than a deficiency in the decision regions.

  6. Terminology (cont.) • A designed classifier is said to generalize well if it achieves similar classification accuracy on both the training samples and real-world samples.

  7. Types of Models • Decision-Region Boundaries • Probability Density Functions • Posterior Probabilities

  8. Decision-Region Boundaries • This type of model defines decision regions by explicitly constructing boundaries in the input space. • These models attempt to minimize the expected number of misclassifications by placing boundaries appropriately in the input space.

  9. Probability Density Functions (PDFs) • Models of this type attempt to construct, for each class C, a class-conditional probability density function p(x|C) over points x in the input space. • The prior probabilities p(C) are estimated from the given database. • The model assigns the most probable class to an input vector x by selecting the class that maximizes p(C)p(x|C).

  10. Posterior Probabilities • Let there be m possible classes, denoted C1, C2, …, Cm. Models of this type attempt to generate m posterior probabilities p(Ci|x), i = 1, 2, …, m, for any input vector x. • The input vector is then assigned to the class with the maximal p(Ci|x).
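To make slides 9 and 10 concrete, here is a minimal sketch in Python. It assumes two hypothetical classes with known one-dimensional Gaussian class-conditional densities and priors (all numbers are invented for illustration); maximizing p(C)p(x|C) and maximizing the normalized posterior p(C|x) select the same class.

```python
# A minimal sketch of PDF- and posterior-based classification, assuming
# two classes with known 1-D Gaussian densities (hypothetical numbers).
from scipy.stats import norm

priors = {"A": 0.6, "B": 0.4}                            # p(C)
densities = {"A": norm(0.0, 1.0), "B": norm(2.0, 1.5)}   # p(x|C)

def classify(x):
    # Joint scores p(C) * p(x|C) for each class
    joint = {c: priors[c] * densities[c].pdf(x) for c in priors}
    # Normalizing the joint scores yields the posteriors p(C|x)
    total = sum(joint.values())
    posteriors = {c: v / total for c, v in joint.items()}
    # Assign the class with the maximal posterior
    return max(posteriors, key=posteriors.get), posteriors

label, posteriors = classify(1.0)
print(label, posteriors)
```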

  11. Approaches to Modeling • Fixed models • Parametric models • Nonparametric models

  12. Fixed Models A fixed model is used when the exact input-output relationship is known. • Decision-region boundary: a known threshold value (e.g., a particular BMI value for defining obesity) • PDF: when each class's PDF can be obtained • Posterior probability: when the probability that any observation belongs to each class is known

  13. Parametric Models • A parametric model is used when an appropriate parametric mathematical form can be derived. • The development of such models consists of two stages: (1) derive an appropriate parametric form, and (2) tune the parameters to fit the data.

  14. Parametric Models (cont.) • Decision-region boundary: linear discriminant function, e.g., y = ax1 + bx2 + cx3 + d • PDF: multivariate Gaussian function • Posterior probability: logistic regression
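As a concrete example of the first item, the sketch below fits a linear discriminant boundary with scikit-learn; the toy data set and its two classes are hypothetical.

```python
# A sketch of a parametric boundary model: linear discriminant analysis
# learns a linear function of the inputs, like y = ax1 + bx2 + ... + d.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[1.0, 2.0], [2.0, 1.0], [6.0, 5.0], [7.0, 6.0]])  # toy inputs
y = np.array([0, 0, 1, 1])                                       # two classes

model = LinearDiscriminantAnalysis().fit(X, y)
print(model.coef_, model.intercept_)   # the fitted boundary parameters
print(model.predict([[4.0, 4.0]]))     # class assigned to a new point
```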

  15. Nonparametric Models • A nonparametric model is used when the relationships between input vectors and their associated classes are not well understood. • Models of varying smoothness and complexity are generated, and the one with the best generalization is chosen.

  16. Nonparametric Models (cont.) • Decision-region boundary: learning vector quantization (LVQ), k-nearest-neighbor classifier, decision tree • PDF: Gaussian mixture methods, Parzen's window • Posterior probability: artificial neural network (ANN), radial basis function (RBF) network, group method of data handling (GMDH)
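As one nonparametric PDF example, here is a sketch of a Parzen-window classifier built from scikit-learn's kernel density estimator; the samples and the bandwidth are hypothetical.

```python
# A sketch of a Parzen-window model: a kernel density estimate of p(x|C)
# per class, combined with the class priors. Data/bandwidth hypothetical.
import numpy as np
from sklearn.neighbors import KernelDensity

X_a = np.array([[0.0], [0.5], [1.0]])   # samples of class A
X_b = np.array([[2.0], [2.5], [3.0]])   # samples of class B

kde_a = KernelDensity(bandwidth=0.5).fit(X_a)   # log p(x|A) via score_samples
kde_b = KernelDensity(bandwidth=0.5).fit(X_b)
prior_a = len(X_a) / (len(X_a) + len(X_b))      # p(A) from sample counts

def classify(x):
    point = np.array([[x]])
    score_a = np.log(prior_a) + kde_a.score_samples(point)[0]
    score_b = np.log(1.0 - prior_a) + kde_b.score_samples(point)[0]
    return "A" if score_a > score_b else "B"

print(classify(1.2))
```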

  17. Commonly Used Algorithms

  18. Practical Constraints • Memory usage • Training time • Classification time

  19. Memory Usage

  20. Training Time

  21. Classification Time

  22. Comparison of Algorithms Linear regression: y = w0 + w1x1 + w2x2 + … + wNxN Logistic regression: y = 1 / (1 + exp(-(w0 + w1x1 + w2x2 + … + wNxN))) • Linear and logistic regression both tend to explicitly construct the decision-region boundaries. • Advantages: easy implementation; easy explanation of the input-output relationship • Disadvantages: limited complexity of the constructed boundary
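A direct rendering of the two formulas above in Python; the weight values are hypothetical.

```python
# The two formulas from this slide, written out directly.
import numpy as np

w0 = 0.1                          # bias weight (hypothetical)
w = np.array([0.5, 1.2, -0.7])    # w1..wN (hypothetical)

def linear(x):
    # y = w0 + w1*x1 + w2*x2 + ... + wN*xN
    return w0 + np.dot(w, x)

def logistic(x):
    # Squashes the linear score into (0, 1); threshold (e.g. 0.5) to classify
    return 1.0 / (1.0 + np.exp(-linear(x)))

x = np.array([1.0, 0.5, 2.0])
print(linear(x), logistic(x))
```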

  23. Comparison of Algorithms (cont.) Binary decision tree: [Figure: a binary tree whose root and internal nodes split on threshold tests such as xi >= c1 vs. xi < c1, xj >= c2 vs. xj < c2, and xk >= c3 vs. xk < c3.] • Binary and linear decision trees also tend to explicitly construct the decision-region boundaries. • Advantages: easy implementation; easy explanation of the input-output relationship • Disadvantages: limited complexity of the constructed boundary; the tree structure may not be globally optimal.
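A sketch of such a threshold-splitting tree with scikit-learn, on a hypothetical toy data set; export_text prints the learned per-node threshold tests.

```python
# A sketch of a binary decision tree that splits on thresholds like xi >= ci.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.array([[1, 5], [2, 4], [6, 1], [7, 2], [3, 8], [8, 7]])  # toy inputs
y = np.array([0, 0, 1, 1, 0, 1])                                 # toy classes

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(tree, feature_names=["x1", "x2"]))  # the threshold tests
```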

  24. Comparison of Algorithms (cont.) Neural networks: • Feedforward neural networks and radial-basis function networks both tend to implicitly construct the decision-region boundaries. • Advantages: they can approximate arbitrarily complex decision boundaries, provided that enough nodes are used. • Disadvantages: long training time
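A sketch of a small feedforward network on the XOR problem, whose classes no linear boundary can separate; the layer size and iteration budget are hypothetical, and convergence may vary with them.

```python
# A sketch of a feedforward neural network bending a nonlinear boundary.
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])   # XOR: not separable by any linear boundary

mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
mlp.fit(X, y)                # training is the slow part for larger problems
print(mlp.predict(X))        # with enough nodes the boundary can fit XOR
```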

  25. Comparison of Algorithms (cont.) • Support vector machine (SVM) • The support vector machine also tends to implicitly construct the decision-region boundaries. • Advantages: this type of classifier has been shown to have good generalization capability.
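A sketch of an SVM classifier with scikit-learn; the RBF kernel choice and the toy data are hypothetical.

```python
# A sketch of a support vector machine; the boundary is defined implicitly
# by the support vectors.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 2.0], [2.0, 1.0], [6.0, 5.0], [7.0, 6.0]])
y = np.array([0, 0, 1, 1])

svm = SVC(kernel="rbf", C=1.0).fit(X, y)
print(svm.predict([[4.0, 4.0]]))
print(svm.support_vectors_)   # the samples that pin down the boundary
```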

  26. Comparison of Algorithms (cont.) Bayes' rule: P(Cj|x) = p(x|Cj)P(Cj) / p(x) Unimodal Gaussian: p(x|Cj) is modeled as a single multivariate Gaussian per class. • The unimodal Gaussian model explicitly constructs the PDF, then computes the prior probability P(Cj) and the posterior probability P(Cj|x). • Advantages: easy implementation; a confidence level can be obtained from the posterior probabilities. • Disadvantages: sample distributions may not be Gaussian.
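A sketch of this classifier: fit one Gaussian per class from hypothetical samples, then apply Bayes' rule to obtain the posterior.

```python
# A sketch of a unimodal-Gaussian classifier with Bayes' rule.
import numpy as np
from scipy.stats import multivariate_normal

X_a = np.array([[1.0, 2.0], [2.0, 1.0], [1.0, 1.0]])  # class A samples (toy)
X_b = np.array([[6.0, 5.0], [7.0, 6.0], [6.0, 6.0]])  # class B samples (toy)

# Fit p(x|Cj): one Gaussian per class via sample mean and covariance
pdf_a = multivariate_normal(X_a.mean(axis=0), np.cov(X_a.T))
pdf_b = multivariate_normal(X_b.mean(axis=0), np.cov(X_b.T))
prior_a = len(X_a) / (len(X_a) + len(X_b))            # P(A)

def posterior_a(x):
    # Bayes' rule: P(A|x) = p(x|A)P(A) / (p(x|A)P(A) + p(x|B)P(B))
    ja = pdf_a.pdf(x) * prior_a
    jb = pdf_b.pdf(x) * (1.0 - prior_a)
    return ja / (ja + jb)

print(posterior_a([2.0, 2.0]))   # confidence that the point is class A
```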

  27. Comparison of Algorithms (cont.) • Gaussian mixtures modify the unimodal Gaussian model so that the PDF is estimated by a weighted average of multiple Gaussians. • Similarly to Gaussian mixtures, Parzen's windows approximate the PDF using a weighted average of radial Gaussians. • Advantage: given enough Gaussian components, the above architectures can approximate arbitrarily complex distributions.
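A sketch of a Gaussian-mixture density model for one class; the component count and the synthetic two-clump data are hypothetical. One such model per class combines with the priors exactly as in the unimodal sketch above.

```python
# A sketch of a Gaussian-mixture PDF estimate for a single class.
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic one-class data drawn from two clumps, to motivate a mixture
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.score_samples([[0.0, 0.0], [5.0, 5.0]]))  # log p(x|C) estimates
```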

  28. Comparison of Algorithms (cont.) k-nearest-neighbor classifier • The k-nearest-neighbor classifier tends to construct the posterior probabilities P(Cj|x). • Advantages: no training is required; a confidence level can be obtained • Disadvantages: classification accuracy is low if a complex decision-region boundary exists; large storage is required.
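A sketch with scikit-learn; k and the toy data are hypothetical. The neighbor vote fractions from predict_proba serve as the confidence levels mentioned above.

```python
# A sketch of a k-nearest-neighbor classifier; "fitting" only stores X.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1.0, 1.0], [1.5, 2.0], [6.0, 6.0], [7.0, 5.5], [6.5, 6.5]])
y = np.array([0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[2.0, 2.0]]))        # majority vote among the 3 neighbors
print(knn.predict_proba([[2.0, 2.0]]))  # vote fractions as P(Cj|x) estimates
```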

  29. Other Useful Classifiers • Projection pursuit aims to decompose the task of high-dimensional modeling into a sequence of low-dimensional modeling tasks. • The algorithm consists of two stages: the first stage projects the input data onto a one-dimensional space, while the second stage constructs the mapping from the projected space to the output space.

  30. Other Useful Classifiers (cont.) • Multivariate adaptive regression splines (MARS) approximate the decision-region boundaries in two stages. • At the first stage, the algorithm partitions the state space into small portions. • At the second stage, the algorithm constructs a low-order polynomial to approximate the decision-region boundary within each partition. • Disadvantage: this algorithm is intractable for problems with high-dimensional (> 10) inputs.

  31. Other Useful Classifiers (cont.) • Group method of data handling (GMDH) also aims to approximate the decision-region boundaries using high-order polynomial functions. • The modeling process begins with a low-order polynomial, and then iteratively combines terms to produce a higher-order polynomial until the modeling accuracy saturates.

  32. Keep The Following In Mind • Use multiple algorithms without bias and let your specific data help determine which model is best suited to your problem. • Occam's razor: "Entities should not be multiplied unnecessarily." When you have two competing models that make exactly the same predictions on the data, the simpler one is the better choice.

  33. A New Member In Our Group
