
# Combining Classification and Model Trees for Handling Ordinal Problems

Combining Classification and Model Trees for Handling Ordinal Problems. D. Anyfantis, M. Karagiannopoulos, S. B. Kotsiantis, P. E. Pintelas. Educational Software Development Laboratory and Computers and Applications Laboratory, Department of Mathematics, University of Patras, Greece.




### Combining Classification and Model Trees for Handling Ordinal Problems

D. Anyfantis, M. Karagiannopoulos, S. B. Kotsiantis, P. E. Pintelas

Educational Software Development Laboratory

and

Computers and Applications Laboratory

Department of Mathematics, University of Patras, Greece

Aim Ordinal Problems

• Handling the problem of learning to predict ordinal (i.e., ordered discrete) classes.

• To propose a technique that can be a more robust solution to the problem.

Contents

• Introduction

• Techniques for Dealing with Ordinal Problems

• Proposed Technique

• Experiments

• Conclusions

Ordinal Classification Problems

• A class of problems between classification and regression (discrete classes with a linear ordering)

• Given ordered classes, one is not only interested in maximizing the classification accuracy, but also in minimizing the distances between the actual and the predicted classes.

Simple Techniques for Dealing with Ordinal Problems

• Applying standard classification algorithms, which discards the ordering information in the class attribute.

• Applying regression algorithms after mapping each class to a numeric value.

• Reducing the multi-class ordinal classification problem to a set of binary classification problems using the one-against-all approach.
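The three simple reductions differ mainly in how the class attribute is encoded. The following is a minimal Python sketch, assuming a hypothetical three-level scale (`low`, `medium`, `high`); the helper names are illustrative and do not come from the presentation.

```python
# Hypothetical ordered class scale, used only for illustration.
CLASSES = ["low", "medium", "high"]

def to_numeric(labels):
    """Regression view: map each ordinal class to its rank (0, 1, 2, ...)."""
    return [CLASSES.index(label) for label in labels]

def from_numeric(value):
    """Map a regression output back to the nearest class, clamped to the scale."""
    rank = min(max(round(value), 0), len(CLASSES) - 1)
    return CLASSES[rank]

def one_against_all(labels, positive):
    """Binary view: 1 where the label equals `positive`, 0 elsewhere."""
    return [1 if label == positive else 0 for label in labels]

labels = ["low", "high", "medium"]
numeric_targets = to_numeric(labels)               # targets for a regressor
medium_vs_rest = one_against_all(labels, "medium")  # one of the binary problems
```

The plain classification view needs no encoding at all: the labels are passed to the learner as unordered nominal values, which is exactly where the ordering information is lost.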

A More Sophisticated Technique (ORD)

• Converting the original ordinal class problem into a series of binary problems that also encode the ordering of the original classes. However, to predict the class value of an unseen instance, this variant needs to estimate the probabilities of the k original ordinal classes using k − 1 models.

• For a three-class ordinal problem, the probability of the first ordinal class value depends on a single classifier, P(Target < second value), which equals 1 − P(Target > first value). The same holds for the last ordinal class, whose probability is P(Target > second value). For the class value in the middle of the range, however, the probability depends on a pair of classifiers and is given by

P(Target > first value) * (1 − P(Target > second value))
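The ORD decomposition for three classes can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: classes are encoded as ranks 0, 1, 2, the two probability inputs stand for the outputs of two already trained binary models, and all function names are hypothetical.

```python
def binary_targets(ranks, k):
    """Build the k - 1 binary targets 'Target > class i' from ranks 0..k-1."""
    return [[1 if rank > i else 0 for rank in ranks] for i in range(k - 1)]

def three_class_scores(p_gt_first, p_gt_second):
    """Combine the two binary estimates using the expressions on the slide."""
    first = 1.0 - p_gt_first                    # P(Target < second value)
    middle = p_gt_first * (1.0 - p_gt_second)   # depends on both classifiers
    last = p_gt_second                          # P(Target > second value)
    return [first, middle, last]

def predict_rank(p_gt_first, p_gt_second):
    """Return the rank (0, 1 or 2) with the highest combined score."""
    scores = three_class_scores(p_gt_first, p_gt_second)
    return max(range(len(scores)), key=scores.__getitem__)
```

For example, `predict_rank(0.9, 0.2)` selects the middle class. The three scores need not sum to one in general, so the sketch only compares them and does not treat them as a normalized distribution.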

Proposed Technique (1)

• Combines the predictions of a classification tree and a model tree algorithm.

• When learners are combined using a voting methodology, we expect to obtain good results based on the belief that the majority of classifiers are more likely to be correct in their decision when they agree in their opinion.

Proposed Technique (2)

Proposed Technique (3)

• In the proposed ensemble the sum rule is used - each voter gives the probability of its prediction for each candidate.

• Next all confidence values are added for each candidate and the candidate with the highest sum wins the election.
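The sum rule can be sketched in a few lines of Python. The sketch assumes that each voter, including the model tree, already supplies one probability per class; how the model tree's numeric output is turned into class probabilities is not described on the slides, and the example values are made up.

```python
def sum_rule(distributions):
    """Add per-class probabilities across voters; return (winner, totals)."""
    n_classes = len(distributions[0])
    totals = [sum(dist[c] for dist in distributions) for c in range(n_classes)]
    winner = max(range(n_classes), key=totals.__getitem__)
    return winner, totals

# Made-up class probabilities from the two voters for one instance.
classification_tree = [0.5, 0.3, 0.2]
model_tree = [0.1, 0.6, 0.3]
winner, totals = sum_rule([classification_tree, model_tree])
```

Here the classification tree alone would pick the first class, but the summed confidences favor the second class (index 1), which therefore wins the election.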

Experiments (1)

• To test the hypothesis that the above method improves the generalization performance on ordinal prediction problems, we performed experiments on real-world ordinal datasets donated by Dr. Arie Ben David (http://www.cs.waikato.ac.nz/ml/weka/).

• We also used datasets from UCI repository because of the lack of numerous benchmark datasets involving ordinal class values. These datasets represented numeric prediction problems. We converted the numeric target values into ordinal quantities using equal-size binning (three equal size intervals).
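The conversion of a numeric target into three ordinal classes can be sketched as below. The sketch assumes that "three equal size intervals" means intervals of equal width over the observed range of the target; the function name is hypothetical.

```python
def equal_width_bins(values, n_bins=3):
    """Map numeric targets to ordinal ranks 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All targets are identical, so everything falls into the first bin.
        return [0 for _ in values]
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```

For targets ranging from 0 to 6, each interval has width 2, so `equal_width_bins([0, 1, 2, 3, 4, 5, 6])` yields `[0, 0, 1, 1, 2, 2, 2]`.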

Experiments (2)

• All accuracy estimates were obtained by averaging the results from 10 separate runs of stratified 10-fold cross-validation.

• 26 datasets
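The evaluation protocol (10 runs of stratified 10-fold cross-validation, averaged) can be sketched with the standard library alone. This is only an illustration of the protocol: the fold assignment scheme, the `fit_predict` callback, and the majority-class placeholder learner are assumptions, not the tooling used in the experiments.

```python
import random
from collections import Counter, defaultdict

def stratified_folds(labels, k, seed):
    """Split instance indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for index in members:
            # Deal instances of each class round-robin across the folds.
            folds[position % k].append(index)
            position += 1
    return folds

def repeated_cv_accuracy(labels, fit_predict, runs=10, k=10):
    """Average fold accuracy over `runs` repetitions of stratified k-fold CV."""
    accuracies = []
    for run in range(runs):
        for test_idx in stratified_folds(labels, k, seed=run):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            predicted = fit_predict(train_idx, test_idx)
            correct = sum(predicted[j] == labels[i] for j, i in enumerate(test_idx))
            accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)

# Placeholder data and learner: always predict the training majority class.
labels = [0] * 30 + [1] * 20

def majority_class(train_idx, test_idx):
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return [majority] * len(test_idx)

mean_accuracy = repeated_cv_accuracy(labels, majority_class)
```

Each run uses a different seed, so the fold memberships change between runs while the class proportions inside every fold stay close to those of the full dataset.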

Experiments (3)

• For each data set the algorithms are compared according to:

• classification accuracy (the rate of correct predictions)

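• mean absolute error: MAE = (|p_1 − a_1| + ... + |p_n − a_n|) / n,

where p_i is the predicted value and a_i the actual value for test instance i, and n is the number of test instances.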
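Both measures can be computed directly from class ranks. The sketch below assumes the ordinal classes are encoded as consecutive integers and that the mean absolute error is the usual average of |p − a| over the test instances; the function names are illustrative.

```python
def accuracy(predicted, actual):
    """Rate of correct predictions."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def mean_absolute_error(predicted, actual):
    """Average distance between predicted and actual class ranks."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

actual = [0, 1, 2, 2]
predicted = [0, 2, 2, 0]
acc = accuracy(predicted, actual)
mae = mean_absolute_error(predicted, actual)
```

In this example two of the four predictions are wrong, giving an accuracy of 0.5, but the errors are not equally severe: one is off by one class and the other by two, so the mean absolute error is 0.75. Accuracy alone would not distinguish between the two mistakes.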

Results (1)

• The table shows the summary results for the proposed technique in comparison with:

• C4.5 without any modification

• C4.5 in conjunction with the ordinal classification method (C4.5-ORD)

• M5′ using classification via regression

Statistical Results (root mean square error)

• The presented ensemble is significantly more accurate than M5′ in 4 out of the 26 datasets, whilst it has a significantly higher root mean square error in no dataset.

• The presented ensemble also has a significantly lower root mean square error than both C4.5 and C4.5-ORD in 8 out of the 26 datasets, whereas it is significantly less accurate in no dataset.

Statistical Results (classification accuracy)

• The presented ensemble is significantly more accurate than M5′ in 4 out of the 26 datasets, whilst it has a significantly higher error rate in 2 datasets.

• The presented ensemble also has a significantly lower error rate than C4.5-ORD in 3 out of the 26 datasets, whereas it is significantly less accurate in 1 dataset.

• The proposed method is significantly more accurate than C4.5 in 1 out of the 26 datasets, whilst it has a significantly higher error rate in no dataset.

Discussion

• If the ranking problem is posed as a classification problem, the inherent structure of ranked data goes unused, and the generalization ability of such classifiers is therefore severely limited.

• On the other hand, posing the task of sorting as a regression problem leads to a highly constrained problem.

Conclusion

• According to our experiments in synthetic and real ordinal data sets, the proposed method manages to minimize the distances between the actual and the predicted classes, without harming but actually slightly improving the classification accuracy.

Future work

• More extensive experiments with real ordinal data sets from diverse areas will be needed to establish the precise capabilities and relative advantages of this methodology.

Thank you

• Any questions?