
Performance measurement



  1. Performance measurement

  2. Performance measurement • We must be careful about which performance metric we use • For example, say we have a neural network (NN) classifier with one output unit, where we code ‘1 = YES’ and ‘0 = NO’ • Should we threshold at 0.5, saying that any output > 0.5 is a 1 and any output ≤ 0.5 is a 0?

  3. Performance measurement • Only if the costs of classification/misclassification are the same for each of the two classes • The output threshold of 0.5 is not set in stone • What is the performance if we use a decision threshold of 0.6, or 0.4?
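(Not on the slides: a minimal Python sketch of the idea, using NumPy and made-up scores, showing how moving the decision threshold changes the measured performance.)

```python
import numpy as np

# Hypothetical network outputs and true labels (illustrative only).
outputs = np.array([0.92, 0.61, 0.55, 0.48, 0.40, 0.12, 0.85, 0.33])
labels  = np.array([1,    1,    0,    1,    0,    0,    1,    0])

for threshold in (0.4, 0.5, 0.6):
    predictions = (outputs > threshold).astype(int)
    accuracy = (predictions == labels).mean()
    print(f"threshold={threshold}: accuracy={accuracy:.2f}")
```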

  4. Performance measurement • For example, in predicting consumer creditworthiness: • Is the cost of lending money to someone who then defaults the same as • the cost of refusing a loan to someone who would in fact have repaid it?

  5. Confusion matrix/crosstabs • Calculate four quantities: • True Positives (TP): answer = YES, network said YES • True Negatives (TN): answer = NO, network said NO • False Positives (FP): answer = NO, network said YES • False Negatives (FN): answer = YES, network said NO
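(Not on the slides: a small helper that computes the four quantities, assuming NumPy arrays of labels and thresholded predictions coded as 0/1.)

```python
import numpy as np

def confusion_counts(labels, predictions):
    """Return (TP, TN, FP, FN) for binary labels/predictions coded as 0/1."""
    tp = int(np.sum((labels == 1) & (predictions == 1)))  # answer YES, network YES
    tn = int(np.sum((labels == 0) & (predictions == 0)))  # answer NO,  network NO
    fp = int(np.sum((labels == 0) & (predictions == 1)))  # answer NO,  network YES
    fn = int(np.sum((labels == 1) & (predictions == 0)))  # answer YES, network NO
    return tp, tn, fp, fn
```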

  6. Confusion matrix
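(The slide shows the matrix as an image, which is not reproduced here; the standard 2×2 layout it depicts is:)

                    network: YES   network: NO
    answer: YES         TP             FN
    answer: NO          FP             TN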

  7. Confusion matrix • Calculate a confusion matrix for many different output thresholds (e.g., 0.1, 0.2, …, 0.9) • From each matrix, calculate two rates: • hit rate = true positive rate = sensitivity = TP/(TP+FN) • false alarm rate = false positive rate = FP/(FP+TN) • Plot the Receiver Operating Characteristic (ROC) curve: false alarm rate on the x-axis, hit rate on the y-axis
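(Not on the slides: a sketch of the threshold sweep, again on hypothetical data, collecting one ROC point per threshold.)

```python
import numpy as np

# Hypothetical network outputs and true labels (illustrative only).
outputs = np.array([0.92, 0.61, 0.55, 0.48, 0.40, 0.12, 0.85, 0.33])
labels  = np.array([1,    1,    0,    1,    0,    0,    1,    0])

roc_points = []
for threshold in np.arange(0.1, 1.0, 0.1):
    preds = (outputs > threshold).astype(int)
    tp = np.sum((labels == 1) & (preds == 1))
    fn = np.sum((labels == 1) & (preds == 0))
    fp = np.sum((labels == 0) & (preds == 1))
    tn = np.sum((labels == 0) & (preds == 0))
    hit_rate = tp / (tp + fn)          # sensitivity, TP/(TP+FN)
    false_alarm_rate = fp / (fp + tn)  # FP/(FP+TN)
    roc_points.append((false_alarm_rate, hit_rate))

# Each (false alarm rate, hit rate) pair is one point on the ROC curve.
print(roc_points)
```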

  8. ROC curve

  9. ROC curve

  10. ROC curves • The area under the curve gives an idea of how good the classifier is: 0.5 = no better than chance, approaching 1 = excellent • We can then build the profits/costs of the different correct answers/mistakes into the confusion matrices to build a gains chart, and again look at the area under that curve • The classifier with the highest area on the gains chart is the most profitable
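(Not on the slides: a minimal sketch of both calculations. The ROC points, profit figures, and counts below are made-up illustrations.)

```python
import numpy as np

# Hypothetical ROC points (false alarm rate, hit rate), endpoints included.
roc_points = [(0.0, 0.0), (0.1, 0.5), (0.25, 0.75), (0.5, 0.9), (1.0, 1.0)]
fpr, tpr = zip(*sorted(roc_points))
auc = np.trapz(tpr, fpr)  # trapezoidal area under the ROC curve
print(f"AUC = {auc:.3f}")  # 0.5 = chance level, near 1 = excellent

# Gains-chart idea: weight each confusion-matrix cell by a profit or cost.
profit = {"TP": 100.0, "TN": 0.0, "FP": -500.0, "FN": -20.0}
tp, tn, fp, fn = 40, 45, 5, 10  # counts at one particular threshold
total = (tp * profit["TP"] + tn * profit["TN"]
         + fp * profit["FP"] + fn * profit["FN"])
print(f"profit at this threshold = {total:.0f}")
```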

  11. Values for gains chart

  12. Performance of regression networks • Mean squared error? • Goodness of fit (R-squared) values? • Again, are the costs/benefits of errors the same for all values? • It is useful to ‘eyeball’ the data, to see whether there are regions where the network does well and regions where it does not, and to weigh the relative costs
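(Not on the slides: a sketch of the two standard metrics on hypothetical targets and predictions.)

```python
import numpy as np

# Hypothetical regression targets and network predictions.
targets = np.array([3.0, 1.5, 4.2, 2.8, 5.1])
preds   = np.array([2.8, 1.9, 4.0, 3.1, 4.7])

mse = np.mean((targets - preds) ** 2)  # mean squared error

# R-squared: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((targets - preds) ** 2)
ss_tot = np.sum((targets - targets.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"MSE = {mse:.3f}, R^2 = {r_squared:.3f}")
```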

  13. Summary • When you measure performance, be careful what you are measuring!
