Strengthening I-ReGEC classifier

G. Attratto, D. Feminiano, and M.R. Guarracino
High Performance Computing and Networking Institute, Italian National Research Council

Supervised learning
• Supervised learning refers to the capability of a system to learn from a set of input/output pairs: the training set.

Classification
• Consists of determining a model that allows elements to be grouped according to given features
• The groups are the classes

Evaluation of classification methods
• Accuracy: an indicator of the model's predictive ability
• Speed: some methods take less time than others
• Robustness: the learned rules and the accuracy do not change considerably across different data sets
• Scalability: the ability to classify large data sets
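As a small illustration of the first two criteria, the sketch below measures accuracy as the fraction of correctly predicted labels and times a single call. The helper names `accuracy` and `timed` are hypothetical and are not taken from the slides.

```python
import time

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
score, seconds = timed(accuracy, y_true, y_pred)
print(score)  # 0.75 (three of the four labels match)
```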

Goals
• Make the choice of examples during training more efficient
• Delete examples that are redundant or contribute too little information
• Strengthen the training set by deleting obsolete knowledge
• Build an efficient, scalable and generalizable model

Classification techniques
• Decision trees (Optimal Tree): based on a tree structure
• Bayesian networks (slow in training): compute posterior probabilities with Bayes' theorem
• Neural networks (slow in training): simulate the behavior of biological systems
• Support Vector Machines (SVM): compute separating hyperplanes

SVM: the state of the art
• Find a set of examples (support vectors) that are representative of the classes
[Figure: linear and nonlinear cases, showing the support vectors, the separation margin and the optimal hyperplane]
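For reference, the optimal hyperplane and separation margin can be written out for the linear, separable case. This is the textbook hard-margin formulation, not a formula taken from the slides; the symbols $x_i$, $y_i \in \{-1, +1\}$, $w$ and $b$ are the usual notation and are assumed here.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\,\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b)\ \ge\ 1,\qquad i = 1,\dots,n

\text{separation margin} \;=\; \frac{2}{\lVert w\rVert}
```

The support vectors are the examples for which the constraint holds with equality; they alone determine the hyperplane.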

ReGEC
• Two hyperplanes, each representative of one class (GEPSVM family)
• Based on the generalized eigenvalue problem
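The link between the two hyperplanes and a generalized eigenvalue problem can be sketched as follows. This is the basic GEPSVM-style formulation, with the regularization terms used in practice omitted; the notation is an assumption and does not come from the slides: the rows of $A$ and $B$ are the examples of the two classes, $e$ is a vector of ones, and the plane is $x^{\top}w - \gamma = 0$.

```latex
\min_{(w,\gamma)\neq 0}\ \frac{\lVert Aw - e\gamma\rVert^{2}}{\lVert Bw - e\gamma\rVert^{2}}
\;=\;
\min_{z\neq 0}\ \frac{z^{\top} G z}{z^{\top} H z},
\qquad
G = [A\ \ {-e}]^{\top}[A\ \ {-e}],\quad
H = [B\ \ {-e}]^{\top}[B\ \ {-e}],\quad
z = \begin{bmatrix} w \\ \gamma \end{bmatrix}

G z = \lambda H z
```

The stationary points of the quotient are the eigenvectors of $Gz = \lambda Hz$. The eigenvector of the smallest eigenvalue gives the plane closest to the points of $A$ and farthest from those of $B$; the eigenvector of the largest eigenvalue gives the plane for the other class.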

I-ReGEC
• Select k points for each class with a clustering technique (K-means), so that |S| = 2 × k
• Classify the test set with the points in S
• Add misclassified points incrementally to the set S
• Proceed until no misclassified points remain
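The incremental loop above can be sketched in a few lines. This is only an illustration under stated assumptions: a nearest-point rule stands in for the ReGEC classifier, the training point closest to each K-means centroid is taken as the initial selection, and the set being classified is the same labeled data from which S is drawn. All function names are hypothetical.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist2(p, centroids[c]))
            groups[j].append(p)
        # Empty clusters keep their previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids

def predict(S, x):
    """Stand-in for ReGEC (assumption): label of the nearest point in S."""
    return min(S, key=lambda s: dist2(s[0], x))[1]

def incremental_select(data, k, seed=0):
    """data is a list of (point, label) pairs; returns the selected set S."""
    S = []
    # Step 1: k points per class, chosen near the K-means centroids.
    for label in sorted({y for _, y in data}):
        pts = [x for x, y in data if y == label]
        for c in kmeans(pts, k, seed=seed):
            nearest = min(pts, key=lambda p: dist2(p, c))
            if (nearest, label) not in S:
                S.append((nearest, label))
    # Steps 2-4: classify, add one misclassified point, repeat until none remain.
    while True:
        wrong = [(x, y) for x, y in data
                 if predict(S, x) != y and (x, y) not in S]
        if not wrong:
            return S
        S.append(wrong[0])

data = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
        ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1)]
S = incremental_select(data, k=1)
print(len(S), "of", len(data), "points selected")
```

With the nearest-point stand-in, every added point is classified correctly afterwards, so the loop stops once S explains the whole data set. A real ReGEC model would be retrained on S at each step instead.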

Strengthening
• Apply I-ReGEC to obtain the training set
• At each iteration, delete a point from the training set
• Apply I-ReGEC at each iteration with the new input set S
• Strengthen the set (save the new S) if the accuracy improves
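One possible reading of the strengthening steps is a single pass over the selected set S that tries deleting each point and keeps the deletion only when accuracy goes up. The sketch below follows that reading. It is not the authors' implementation: the nearest-point `predict` is a stand-in for training and applying I-ReGEC, and `strengthen`, `evaluate` and the toy data are hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def predict(S, x):
    """Stand-in classifier (assumption): label of the nearest point in S."""
    return min(S, key=lambda s: dist2(s[0], x))[1]

def accuracy(S, labeled):
    """Fraction of labeled examples classified correctly using S."""
    return sum(predict(S, x) == y for x, y in labeled) / len(labeled)

def strengthen(S, evaluate):
    """Delete one point at a time; save the new S only if accuracy improves."""
    best = list(S)
    best_acc = evaluate(best)
    i = 0
    while i < len(best) and len(best) > 1:
        candidate = best[:i] + best[i + 1:]
        acc = evaluate(candidate)
        if acc > best_acc:
            best, best_acc = candidate, acc  # index i now holds the next point
        else:
            i += 1
    return best, best_acc

# Toy data: the third selected point carries a misleading label.
S = [((0.0, 0.0), 0), ((1.0, 1.0), 1), ((0.2, 0.2), 1)]
validation = [((0.05, 0.05), 0), ((0.25, 0.25), 0),
              ((0.9, 0.9), 1), ((0.3, 0.2), 0)]

new_S, new_acc = strengthen(S, lambda s: accuracy(s, validation))
print(new_S, new_acc)  # the misleading point is dropped; accuracy rises from 0.5 to 1.0
```

Each candidate set costs one full evaluation, which is why the execution time of this step grows with the size of S.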

Microarray and matrix
[Figure: gene expression matrix; labels: classes, features, examples]

Results

Results and diagrams
[Figure panels: Golub 2D (I-ReGEC, Strengthening); Golub 3D (I-ReGEC, Strengthening)]

Conclusions
• The choice of examples became more efficient
• Redundant or obsolete examples have been deleted
• The training set is "strengthened"

Future work
• To optimize the execution time, the Strengthening technique should be integrated into I-ReGEC.
