R for Classification

Jennifer Broughton, Shimadzu Research Laboratory, Manchester, UK (jennifer.broughton@srlab.co.uk), 2nd May 2013.

Presentation Transcript


  1. R for Classification. Jennifer Broughton, Shimadzu Research Laboratory, Manchester, UK. jennifer.broughton@srlab.co.uk. 2nd May 2013

  2. Classification?

     Object Type   Feature1   Feature2   Feature3   …….   Feature n
     Label 1       val[1,1]   val[1,2]   val[1,3]   …….   val[1,n]
     Label 2       val[2,1]   val[2,2]   val[2,3]   …….   val[2,n]
     ……            …….        …….        …….        …….   ………
     Label m       val[m,1]   val[m,2]   val[m,3]   …….   val[m,n]

     Automatic identification of the type (class) of an object from measured variables (features).
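A minimal sketch of this layout in R. It uses the built-in `iris` data set as a stand-in, which is an assumption: the slides do not say which data were used. Each row is an object, the numeric columns are features, and a factor column holds the label.

```r
# iris ships with R: 150 flowers, 4 numeric features, 1 class label.
# (Stand-in data set, not the data used in the talk.)
data(iris)

features <- iris[, 1:4]      # val[i, j]: row i = object, column j = feature
labels   <- iris$Species     # Label i: the known class of object i

str(iris)                    # column types: 4 numeric, 1 factor
table(labels)                # number of objects per class
```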

  3. Example Data

  4. Data Preparation & Investigation
     EDA techniques: Box Plots, PCA, Decision Trees, Clustering
     • Best features to distinguish between classes
     • Relationships between features
     • Feature reduction
     Output: Training Set
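Two of the techniques listed above, decision trees and clustering, can be sketched as follows. This is only an illustration under stated assumptions: `iris` is a stand-in data set, and `rpart` and `kmeans` are one possible choice of tools, not necessarily the ones used in the talk.

```r
library(rpart)   # recommended package distributed with R

data(iris)       # stand-in data set (assumption)

# Decision tree: which features does the tree rely on to separate the classes?
tree <- rpart(Species ~ ., data = iris, method = "class")
print(tree$variable.importance)

# Clustering: do unsupervised groups line up with the known labels?
set.seed(1)                                   # k-means starts from random centres
clusters <- kmeans(scale(iris[, 1:4]), centers = 3, nstart = 20)
print(table(cluster = clusters$cluster, label = iris$Species))
```

Features with high importance in the tree are candidates to keep when reducing the feature set. A cluster table with most counts concentrated in one cell per row suggests the classes are separable from the features alone.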

  5. Box Plots; PCA & Multivariate Analysis: ade4, FactoMineR
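A rough sketch of both plots using base R only. The slide names ade4 and FactoMineR for PCA; `prcomp` from the stats package is swapped in here so the example runs without extra packages, and `iris` is again a stand-in data set.

```r
data(iris)   # stand-in data set (assumption)

# Box plot of one feature split by class: well-separated boxes suggest
# the feature is useful for telling the classes apart.
boxplot(Petal.Length ~ Species, data = iris,
        ylab = "Petal.Length", main = "Petal.Length by class")

# PCA on standardised features (base R prcomp, not ade4/FactoMineR).
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
print(summary(pca))                # proportion of variance per component

# Scores on the first two principal components, coloured by class.
plot(pca$x[, 1:2], col = as.integer(iris$Species), pch = 19)
```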

  6. Example Classifier

  7. Classification Algorithms in R. Rattle: R Analytical Tool To Learn Easily (Rattle: A Data Mining GUI for R, Graham J Williams, The R Journal, 1(2):45-55)

  8. SVM

  9. Ensemble Algorithm

  10. Training and Testing
      • Training Set (labelled)
      • Classification Algorithm: Neural Network, Support Vector Machine, Random Forest
      • Trained Classifier
      • Test Set (unlabelled)
      • Classification Results
      • Prediction Results + Labels
      • Assess Predictions: Confusion Matrix, ROC Curve (2 categories), ….
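The split into a labelled training set and an unlabelled test set could look like the sketch below. The 70/30 ratio, the seed, and the `iris` data are assumptions chosen for illustration.

```r
data(iris)                                     # stand-in data set (assumption)
set.seed(42)                                   # reproducible split

# Hold out roughly 30% of the rows as a test set.
n         <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train     <- iris[train_idx, ]
test      <- iris[-train_idx, ]

# The classifier sees the test features only; the true labels are kept
# aside so the predictions can be assessed afterwards.
test_features <- test[, names(test) != "Species"]
test_labels   <- test$Species
```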

  11. Using Classifiers in R
      Select Training Data
      Build Classifier: classifier <- algorithm(formula, data, options) (boosting and nnet)
      Run Classifier: classifier.pred <- predict(classifier, newdata, options)
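As one concrete instance of the build-then-predict pattern, here is a small sketch with `nnet`, one of the two algorithms the slide names. The data set, split size, and option values (`size`, `decay`, `maxit`) are illustrative assumptions, not settings from the talk.

```r
library(nnet)    # recommended package distributed with R

data(iris)       # stand-in data set (assumption)
set.seed(42)
train_idx <- sample(nrow(iris), 100)
train     <- iris[train_idx, ]
newdata   <- iris[-train_idx, ]

# Build: classifier <- algorithm(formula, data, options)
classifier <- nnet(Species ~ ., data = train,
                   size = 4, decay = 0.01, maxit = 200, trace = FALSE)

# Run: classifier.pred <- predict(classifier, newdata, options)
classifier.pred <- predict(classifier, newdata, type = "class")
print(head(classifier.pred))
```

The same two-step shape applies to most classifiers in R; what changes is the fitting function and the options each `predict` method accepts.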

  12. SVM & Neural Net Tuning

  13. Classifier Feedback: print(classifier), plot(classifier). High Gini Coefficient = high dispersion
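The Gini remark can be made concrete with a short calculation. This sketch assumes the slide means the Gini impurity used by tree-based classifiers; the transcript does not confirm that, and `gini_impurity` is a hypothetical helper written for this example.

```r
# Gini impurity of a set of class labels: 1 - sum(p_k^2).
# 0 means every object has the same class; larger values mean the
# classes are more mixed (higher dispersion).
# Hypothetical helper, not a function from any package.
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)   # class proportions
  1 - sum(p^2)
}

pure  <- c("a", "a", "a", "a")          # made-up labels: one class only
mixed <- c("a", "a", "b", "b")          # made-up labels: evenly mixed

print(gini_impurity(pure))
print(gini_impurity(mixed))
```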

  14. Classifier Prediction Results: predict(type = "class"), predict(type = "prob"), confusion matrix
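A brief sketch of the two prediction types and a confusion matrix, using an `rpart` tree on `iris` as an assumed example. The `type` values accepted by `predict` differ between packages, so `"class"` and `"prob"` here are the `rpart` spellings.

```r
library(rpart)   # recommended package distributed with R

data(iris)       # stand-in data set (assumption)
set.seed(7)
train_idx <- sample(nrow(iris), 100)
train     <- iris[train_idx, ]
test      <- iris[-train_idx, ]

classifier <- rpart(Species ~ ., data = train, method = "class")

pred_class <- predict(classifier, newdata = test, type = "class")  # factor of labels
pred_prob  <- predict(classifier, newdata = test, type = "prob")   # one column per class

# Confusion matrix: rows = predicted class, columns = true class.
confusion <- table(predicted = pred_class, actual = test$Species)
print(confusion)

# Overall accuracy: correct predictions sit on the diagonal.
accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```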

  15. Binary Classification Results

                            Class Present? N    Class Present? Y
      Class Detected? Y     False Positive      True Positive
      Class Detected? N     True Negative       False Negative
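The four outcomes can be counted directly from two vectors of Y/N answers. The vectors below are made up for illustration and are not results from the talk.

```r
# Hypothetical example vectors, made up for illustration.
present  <- c("Y", "Y", "Y", "N", "N", "N", "N", "Y")   # class truly present?
detected <- c("Y", "N", "Y", "N", "Y", "N", "N", "Y")   # class detected by the classifier?

TP <- sum(detected == "Y" & present == "Y")   # true positives
FP <- sum(detected == "Y" & present == "N")   # false positives
FN <- sum(detected == "N" & present == "Y")   # false negatives
TN <- sum(detected == "N" & present == "N")   # true negatives

sensitivity <- TP / (TP + FN)   # true positive rate
specificity <- TN / (TN + FP)   # true negative rate

print(c(TP = TP, FP = FP, FN = FN, TN = TN))
print(c(sensitivity = sensitivity, specificity = specificity))
```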

  16. ROC Curves in R: ROCR package
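To show what a ROC curve computes, the sketch below builds one by hand in base R rather than with ROCR, so it runs without extra packages. The scores and labels are hypothetical values made up for the example.

```r
# Hypothetical scores and true labels (1 = class present), made up for illustration.
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
truth  <- c(1,   1,   0,   1,   0,    1,   0,   0)

# Sweep the decision threshold from high to low; at each threshold an
# object is called positive when its score is at or above the threshold.
thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
tpr <- sapply(thresholds, function(t) sum(scores >= t & truth == 1) / sum(truth == 1))
fpr <- sapply(thresholds, function(t) sum(scores >= t & truth == 0) / sum(truth == 0))

plot(fpr, tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)   # chance line

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
print(auc)
```

With ROCR installed, the equivalent curve would typically come from `prediction(scores, truth)` followed by `performance(pred, "tpr", "fpr")` and `plot()`.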

  17. Example Results