Pattern Recognition Project

Contents

  • Data set

    • Iris (Fisher’s data)

    • Ripley’s data set

    • Hand-written numerals

  • Classifier – (MATLAB CODE)

    • Bayesian

    • SVM

    • K-nearest neighbor


Fisher’s Iris Plants Database

  • The iris data published by Fisher (1936) have been widely used for examples in discriminant analysis and cluster analysis.

  • The sepal length, sepal width, petal length, and petal width are measured in centimeters on fifty iris specimens from each of three species, Iris setosa, I. versicolor, and I. virginica.

  • Download the package from http://chien.csie.ncku.edu.tw/web/course/iris_svm.rar


  • Attribute Information:

    1. sepal length

    2. sepal width

    3. petal length

    4. petal width

    5. class:

    -- Iris Setosa = 1

    -- Iris Versicolour = 2

    -- Iris Virginica = 3
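
The class coding above can be reproduced in MATLAB. The following is a minimal sketch, assuming the fisheriris sample data that ships with the Statistics Toolbox; the variable names irisData and classCode are illustrative only.

% Minimal sketch: build a 150x5 matrix of the four measurements plus the class code.

load fisheriris                                % meas: 150x4 measurements, species: 150x1 cell of names

[classCode, classNames] = grp2idx(species);    % setosa -> 1, versicolor -> 2, virginica -> 3

irisData = [meas, classCode];                  % columns 1-4: attributes, column 5: class code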



Example: SVM (MATLAB Tool)

%Load the sample data, which includes Fisher's iris data: 4 measurements on a sample of 150 irises.

load fisheriris

%Create data, a two-column matrix containing sepal length and sepal width measurements for 150 irises.

data = [meas(:,1), meas(:,2)];

%From the species vector, create a new column vector, groups, to classify data into two groups: Setosa and non-Setosa.

groups = ismember(species,'setosa');


%Randomly select training and test sets.

[train, test] = crossvalind('holdOut',groups); cp = classperf(groups);

%Train an SVM classifier using a linear kernel function and plot the grouped data.

svmStruct = svmtrain(data(train,:),groups(train),'showplot',true);

title(sprintf('Kernel Function: %s',func2str(svmStruct.KernelFunction)),'interpreter','none');


%Use the svmclassify function to classify the test set.

classes = svmclassify(svmStruct,data(test,:),'showplot',true);

%Evaluate the performance of the classifier.

classperf(cp,classes,test); cp.CorrectRate


Ripley’s data set

  • The well-known Ripley dataset problem consists of two classes where the data for each class have been generated by a mixture of two Gaussian distributions.

  • This has two real-valued co-ordinates (xs and ys) and a class (yc) which is 0 or 1.

    • riply.tra: 250 training samples

    • riply.tes: 1000 test samples

  • Download the package from http://chien.csie.ncku.edu.tw/web/course/stprtool.rar
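
The STPRtool package above ships the data as riply_trn.mat and riply_tst.mat, which are the files loaded in the classifier examples that follow. A minimal sketch for loading and plotting the training set, assuming STPRtool is on the MATLAB path:

% Minimal sketch: load the Ripley training data and plot the two classes (STPRtool helpers).

trn = load('riply_trn');   % structure with fields X (features, one column per sample) and y (labels)

figure;

ppatterns(trn);            % scatter plot of the training samples, one marker per class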



Evaluation on the testing data

Example: Bayesian classifier

% load input training data

trn = load('riply_trn');

inx1 = find(trn.y==1);

inx2 = find(trn.y==2);

% Estimation of class-conditional distributions by EM

bayes_model.Pclass{1} = emgmm(trn.X(:,inx1),struct('ncomp',2));

bayes_model.Pclass{2} = emgmm(trn.X(:,inx2),struct('ncomp',2));

% Estimation of priors

n1 = length(inx1); n2 = length(inx2);

bayes_model.Prior = [n1 n2]/(n1+n2);

% Evaluation on testing data

tst = load('riply_tst');

ypred = bayescls(tst.X,bayes_model);

cerror(ypred,tst.y)


Example: Binary SVM

trn = load('riply_trn'); % load training data

options.ker = 'rbf'; % use RBF kernel

options.arg = 1; % kernel argument

options.C = 10; % regularization constant

% train SVM classifier

model = smo(trn,options);

% visualization

figure; ppatterns(trn); psvm(model);

tst = load('riply_tst'); % load testing data

ypred = svmclass(tst.X,model); % classify data

cerror(ypred,tst.y) % compute error


Example: K-nearest neighbor classifier

% load training data and setup 8-NN rule

trn = load('riply_trn');

model = knnrule(trn,8);

% visualize decision boundary and training data

figure;

ppatterns(trn);

pboundary(model);

% evaluate classifier

tst = load('riply_tst');

ypred = knnclass(tst.X,model);

cerror(ypred,tst.y)


Hand-written numerals

  • Pen-Based Recognition of Handwritten Digits.

  • Samples of handwritten numerals collected from 44 different writers.

    • The samples written by 30 writers are used for training, cross-validation, and writer-dependent testing.

    • The digits written by the other 14 writers are used for writer-independent testing.

  • Each writer provided 250 digit samples from ’0’ to ’9’.


  • Number of instances

    • pendigits.txt (all data): 10992

    • pendigits.tra (training): 7494

    • pendigits.tes (testing): 3498

  • Number of attributes

    • 16 input attributes + 1 class attribute

  • For each attribute:

    • All input attributes are integers in the range 0..100.

    • The last attribute is the class code, 0..9.



Example

47,100, 27, 81, 57, 37, 26, 0, 0, 23, 56, 53,100, 90, 40, 98, 8

0, 89, 27,100, 42, 75, 29, 45, 15, 15, 37, 0, 69, 2,100, 6, 2

0, 57, 31, 68, 72, 90,100,100, 76, 75, 50, 51, 28, 25, 16, 0, 1

0,100, 7, 92, 5, 68, 19, 45, 86, 34,100, 45, 74, 23, 67, 0, 4

0, 67, 49, 83,100,100, 81, 80, 60, 60, 40, 40, 33, 20, 47, 0, 1

100,100, 88, 99, 49, 74, 17, 47, 0, 16, 37, 0, 73, 16, 20, 20, 6

0,100, 3, 72, 26, 35, 85, 35,100, 71, 73, 97, 65, 49, 66, 0, 4

0, 39, 2, 62, 11, 5, 63, 0,100, 43, 89, 99, 36,100, 0, 57, 0

13, 89, 12, 50, 72, 38, 56, 0, 4, 17, 0, 61, 32, 94,100,100, 5

57,100, 22, 72, 0, 31, 25, 0, 75, 13,100, 50, 75, 87, 26, 85, 0

74, 87, 31,100, 0, 69, 62, 64,100, 79,100, 38, 84, 0, 18, 1, 9

48, 96, 62, 65, 88, 27, 21, 0, 21, 33, 79, 67,100,100, 0, 85, 8

100,100, 72, 99, 36, 78, 34, 54, 79, 47, 64, 13, 19, 0, 0, 2, 5
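
The rows above are comma-separated: 16 integer features followed by the class label. Below is a minimal sketch for reading such a file into MATLAB; the use of dlmread and the variable names are illustrative assumptions, not part of the original package.

% Minimal sketch: read the comma-separated pendigits training file (assumed to be in the current directory).

raw = dlmread('pendigits.tra', ',');   % one row per sample, 17 columns

X = raw(:, 1:16);                      % 16 pen-trajectory attributes, integers in 0..100

y = raw(:, 17);                        % class code 0..9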


Installation for MATLAB code

  • Install MATLAB on your machine.

  • Download the package from http://chien.csie.ncku.edu.tw/web/course/MATLABArsenal.rar

  • Extract the archive into an arbitrary directory, say $MATLABArsenalRoot.

  • Add $MATLABArsenalRoot and its subfolders to the MATLAB search path, either with the addpath command or via the menu File -> Set Path (see the sketch below).
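
A minimal sketch of the path setup; the directory name is only a placeholder for wherever the archive was extracted.

% Add the extracted MATLABArsenal directory and all of its subfolders to the MATLAB path.

MATLABArsenalRoot = 'C:\MATLABArsenal';   % placeholder; replace with your own extraction directory

addpath(genpath(MATLABArsenalRoot));

savepath;                                 % optional: keep the path for future MATLAB sessions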


How to use classifiers

test_classify('classify -t input_file [general_option] [-- EvaluationMethod [evaluation_options]] ... [-- ClassifierWrapper [param]] -- BaseClassifier [param]');

Example 1

test_classify('classify -t pendigits.txt -sf 1 -- LibSVM -Kernel 0 -CostFactor 3');

Prec:0.979803, Rec:0.979803, Err:0.020197

The 10x10 matrix that follows is the confusion matrix over the digit classes 0..9 on the test split:

566 0 10 0 1 0 0 2 0 1

0 547 0 0 0 1 0 0 22 0

10 0 565 1 0 0 0 1 0 0

2 0 0 534 0 4 0 0 0 1

0 0 0 1 557 0 0 0 0 0

0 0 0 1 0 514 1 0 12 3

0 0 0 0 0 0 543 0 1 0

4 0 0 1 1 0 0 562 0 2

0 10 0 0 0 5 0 0 484 1

0 2 0 1 0 8 0 1 0 513

The command above does the following:

  • Classifies pendigits.txt.

  • Shuffles the data before classification ('-sf 1').

  • Uses the default 50%-50% train-test split.

  • Trains a linear-kernel support vector machine (LibSVM).


Example 2

Train a linear-kernel SVM (LibSVM) model on pendigits.tra and save it as pendigits.libSVM.model:

test_classify(strcat('classify -t pendigits.tra -- Train_Only -m pendigits.libSVM.model -- LibSVM -Kernel 0 -CostFactor 3'));

Error = 0.009608

Test the saved pendigits.libSVM.model on the new data in pendigits.tes (linear-kernel SVM, LibSVM):

test_classify(strcat('classify -t pendigits.tes -- Test_Only -m pendigits.libSVM.model -- LibSVM -Kernel 0 -CostFactor 3'));

Error = 0.069754


Example 3

The command below does the following:

  • Classifies pendigits.txt without shuffling the data ('-sf 0').

  • Uses the first 7494 samples for training and the rest for testing.

  • Applies a multi-class classification wrapper.

  • Trains an RBF-kernel SVM_LIGHT support vector machine.

test_classify('classify -t pendigits.txt -sf 0 -- train_test_validate -t 7494 -- train_test_multiple_class -- SVM_LIGHT -Kernel 2 -KernelParam 0.01 -CostFactor 3');

Error = 0.047170

