
Tutorial 2



Presentation Transcript


  1. Tutorial 2, LIU Tengfei, 2/19/2009

  2. Contents • Introduction • TP, FP, ROC • Precision, recall • Confusion matrix • Other performance measures • Resources

  3. Classifier output of Weka (1)

  4. Classifier output of Weka (2)

  5. TP rate, FP rate (1) Consider a diagnostic test: • A false positive (FP): the person tests positive but does not actually have the disease. • A false negative (FN): the person tests negative, suggesting they are healthy, but they actually do have the disease. Note: true positives and true negatives are defined analogously (the test result matches the person's actual condition).

  6. TP rate, FP rate (2) • TP rate = true positive rate • FP rate = false positive rate

  7. TP rate, FP rate (3) Definition: TP rate = TP / (TP + FN), FP rate = FP / (FP + TN). Both rates are taken from the actual-value point of view: each denominator counts the instances that are actually positive (TP + FN) or actually negative (FP + TN).
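A minimal sketch of these two definitions in Java (the `actual` and `predicted` arrays are hypothetical example data, not from the slides):

```java
public class Rates {
    public static void main(String[] args) {
        // Hypothetical example data: actual class and classifier prediction.
        boolean[] actual    = { true, true, false, false, true, false };
        boolean[] predicted = { true, false, true, false, true, false };

        int tp = 0, fp = 0, tn = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] && predicted[i])        tp++; // true positive
            else if (!actual[i] && predicted[i])  fp++; // false positive
            else if (!actual[i] && !predicted[i]) tn++; // true negative
            else                                  fn++; // false negative
        }

        // Each denominator counts actual positives or actual negatives.
        double tpRate = (double) tp / (tp + fn);
        double fpRate = (double) fp / (fp + tn);
        System.out.printf("TP rate = %.3f, FP rate = %.3f%n", tpRate, fpRate);
    }
}
```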

  8. ROC curve (1) • ROC = receiver operating characteristic • Y-axis: TP rate, X-axis: FP rate

  9. ROC curve (2) Which method (A or B) is better? Compute the ROC area (the area under each ROC curve); the method with the larger area performs better overall.
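One way to compute the ROC area without drawing the curve is pair counting: the area under the ROC curve equals the probability that a randomly chosen positive instance is scored higher than a randomly chosen negative one. A sketch of that equivalence, using hypothetical scores and labels (not from the slides):

```java
public class RocArea {
    // AUC via pair counting: the fraction of (positive, negative) pairs
    // where the positive instance gets the higher score; ties count 0.5.
    static double auc(double[] scores, boolean[] positive) {
        double wins = 0;
        long pairs = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!positive[i]) continue;
            for (int j = 0; j < scores.length; j++) {
                if (positive[j]) continue;
                pairs++;
                if (scores[i] > scores[j])       wins += 1.0;
                else if (scores[i] == scores[j]) wins += 0.5;
            }
        }
        return wins / pairs;
    }

    public static void main(String[] args) {
        double[] scores    = { 0.9, 0.8, 0.7, 0.4, 0.3 };   // hypothetical
        boolean[] positive = { true, true, false, true, false };
        System.out.printf("AUC = %.3f%n", auc(scores, positive)); // 0.833
    }
}
```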

  10. Precision, Recall (1) • Precision = TP / (TP + FP) • Recall = TP / (TP + FN) Precision is the probability that a retrieved document is relevant; recall is the probability that a relevant document is retrieved by the search.

  11. Precision, Recall (2) • F-measure = 2 * (precision * recall) / (precision + recall), the harmonic mean of precision and recall. • Precision, recall, and the F-measure come from the information retrieval domain.
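A minimal sketch of all three formulas, starting from hypothetical confusion-matrix counts:

```java
public class PrecisionRecall {
    public static void main(String[] args) {
        // Hypothetical counts from a confusion matrix.
        int tp = 40, fp = 10, fn = 20;

        double precision = (double) tp / (tp + fp);   // 40/50 = 0.800
        double recall    = (double) tp / (tp + fn);   // 40/60 = 0.667
        // F-measure: the harmonic mean of precision and recall.
        double f = 2 * precision * recall / (precision + recall);

        System.out.printf("precision = %.3f, recall = %.3f, F = %.3f%n",
                          precision, recall, f);
    }
}
```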

  12. Confusion matrix • Example: running the J48 decision tree learner on iris.arff
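The transcript omits the screenshot of this slide, but a confusion matrix like the one shown can be produced programmatically. This is a sketch against the standard Weka 3 Java API (Evaluation, J48, DataSource), offered as a reconstruction rather than code from the slides:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ConfusionDemo {
    public static void main(String[] args) throws Exception {
        // Load the iris data set; the last attribute is the class.
        Instances data = DataSource.read("iris.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // 10-fold cross-validation of a J48 decision tree.
        J48 tree = new J48();
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(tree, data, 10, new Random(1));

        // Print the summary and confusion matrix, as in Weka's GUI output.
        System.out.println(eval.toSummaryString());
        System.out.println(eval.toMatrixString());
    }
}
```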

  13. Other performance measures • In the formulas on this slide, p denotes the predicted values and a the actual values.
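The slide's formulas are not reproduced in this transcript. Assuming they are the standard numeric-prediction error measures from the cited textbook chapter, such as the mean absolute error and root mean squared error over predictions p and actual values a, a sketch would look like this:

```java
public class ErrorMeasures {
    public static void main(String[] args) {
        // Hypothetical predicted (p) and actual (a) values.
        double[] p = { 2.5, 0.0, 2.0, 8.0 };
        double[] a = { 3.0, -0.5, 2.0, 7.0 };

        double absSum = 0, sqSum = 0;
        for (int i = 0; i < p.length; i++) {
            absSum += Math.abs(p[i] - a[i]);
            sqSum  += (p[i] - a[i]) * (p[i] - a[i]);
        }
        // Mean absolute error and root mean squared error.
        double mae  = absSum / p.length;
        double rmse = Math.sqrt(sqSum / p.length);
        System.out.printf("MAE = %.3f, RMSE = %.3f%n", mae, rmse);
    }
}
```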

  14. Resources 1. Wikipedia page for TP, FP, and ROC 2. Wikipedia page for precision and recall 3. Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques (Second Edition), Chapter 5

  15. Thank you!
