
Classification and Regression Trees


Presentation Transcript


  1. Classification and Regression Trees: JMP Partition Platform

  2. In the News … • Why? • What is it? • How does it work? • JMP Mechanics • Evaluating the model • Assessing usefulness • Understanding results • Applying results

  3. Analyze > Distribution. Data set > Riding Mowers. Begin with a one-way analysis. There is an equal distribution of values across the two levels of the response variable.
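
Outside JMP, the same one-way look at the response can be sketched with pandas. The table below is a small hypothetical stand-in, not the actual Riding Mowers data, and the column names Income, Lot_Size, and Ownership are assumptions:

```python
import pandas as pd

# Hypothetical stand-in for the Riding Mowers table:
# Income in $000s, Lot_Size in 000s of sq ft, Ownership is the two-level response.
df = pd.DataFrame({
    "Income":    [60.0, 64.8, 61.5, 82.8, 87.0, 110.1, 108.0, 75.0],
    "Lot_Size":  [18.4, 21.6, 20.8, 22.4, 23.6, 19.2, 17.6, 19.6],
    "Ownership": ["nonowner", "nonowner", "nonowner", "owner",
                  "owner", "owner", "owner", "nonowner"],
})

# One-way distribution of the response: are the two levels roughly balanced?
print(df["Ownership"].value_counts(normalize=True))
```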

  4. Exploring Predictors. If we put the predictors in a scatterplot matrix, we are looking for the variable whose split gives us the most homogeneous groups.
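
With only two predictors, a single colored scatterplot can stand in for the scatterplot matrix; this sketch uses matplotlib and the same hypothetical columns as above:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Income":    [60.0, 64.8, 61.5, 82.8, 87.0, 110.1, 108.0, 75.0],
    "Lot_Size":  [18.4, 21.6, 20.8, 22.4, 23.6, 19.2, 17.6, 19.6],
    "Ownership": ["nonowner", "nonowner", "nonowner", "owner",
                  "owner", "owner", "owner", "nonowner"],
})

# Color points by the response to eyeball which variable separates
# owners from nonowners most cleanly (i.e. gives the best homogeneity).
plt.scatter(df["Income"], df["Lot_Size"], c=(df["Ownership"] == "owner"), cmap="coolwarm")
plt.xlabel("Income")
plt.ylabel("Lot_Size")
plt.show()
```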

  5. Homogeneity. It looks like at an Income of about 85 we would have a "pure" partition, with only "owner" records above 85.
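
That visual purity check can also be done with a quick count on either side of an assumed Income cutoff of 85 (same hypothetical table as above):

```python
import pandas as pd

df = pd.DataFrame({
    "Income":    [60.0, 64.8, 61.5, 82.8, 87.0, 110.1, 108.0, 75.0],
    "Ownership": ["nonowner", "nonowner", "nonowner", "owner",
                  "owner", "owner", "owner", "nonowner"],
})

# Count response levels below and above the candidate cutoff;
# a side containing only one level is a "pure" partition.
print(df.groupby(df["Income"] > 85)["Ownership"].value_counts())
```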

  6. Launch Partition. Analyze > Modeling > Partition. Identify the X and Y variables in the dialog box. Use the defaults for everything else at this point.
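
JMP's Partition platform is point-and-click, but as a rough open-source analogue a classification tree can be grown with scikit-learn. This is not JMP's algorithm (JMP reports its splits with the G^2 statistic, while scikit-learn defaults to Gini impurity), and the data below are the same hypothetical stand-ins used earlier:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical [Income, Lot_Size] rows; y is 1 for owner, 0 for nonowner.
X = np.array([[60.0, 18.4], [64.8, 21.6], [61.5, 20.8], [82.8, 22.4],
              [87.0, 23.6], [110.1, 19.2], [108.0, 17.6], [75.0, 19.6]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

# Grow a small tree; everything else left at its default, as on the slide.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(tree.score(X, y))  # training accuracy
```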

  7. Starting Point. To begin, click Split. AICc = corrected Akaike information criterion; a smaller value is better. It favors a model that fits the data well while using few parameters. G^2 = a likelihood-ratio chi-square, based on the ratio of observed to expected counts; the larger the value, the more likely there is a statistical difference. http://www.brianomeara.info/tutorials/aic
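
A minimal sketch of those two statistics, assuming the standard definitions (G^2 = 2 * sum(O * ln(O/E)) over the cells of the observed vs. expected table, and AICc = AIC + 2k(k+1)/(n - k - 1) with AIC = 2k - 2 ln L):

```python
import numpy as np

def g_squared(observed, expected):
    """Likelihood-ratio chi-square: G^2 = 2 * sum(O * ln(O / E))."""
    o = np.asarray(observed, dtype=float)
    e = np.asarray(expected, dtype=float)
    return 2.0 * np.sum(o * np.log(o / e))

def aicc(log_likelihood, k, n):
    """Corrected AIC: 2k - 2 ln(L) plus a small-sample penalty; smaller is better."""
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Toy example: counts observed after a split vs. counts expected with no split.
print(g_squared(observed=[9, 3, 3, 9], expected=[6, 6, 6, 6]))
```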

  8. First Split

  9. Splitting on the Income < 85.5 Leaf

  10. Splitting at Income < 85.5

  11. Last Split at Lot Size < 20

  12. Result of Splitting on Lot Size < 20
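
To read the resulting split structure as text outside JMP, scikit-learn's export_text prints the rules of a fitted tree. With the hypothetical data used above, the cutoffs it finds will not necessarily match the Income < 85.5 and Lot Size < 20 splits shown on the slides:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

X = np.array([[60.0, 18.4], [64.8, 21.6], [61.5, 20.8], [82.8, 22.4],
              [87.0, 23.6], [110.1, 19.2], [108.0, 17.6], [75.0, 19.6]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the tree as nested if/else rules with readable feature names.
print(export_text(tree, feature_names=["Income", "Lot_Size"]))
```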

  13. Split History. Hot Spot > Split History. We can see that the last split did not improve R-Square.
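
A rough analogue of the Split History report: refit the tree with an increasing number of leaves (each extra leaf is one more split) and watch whether the fit statistic keeps improving. Training accuracy stands in here for JMP's R-Square, and the data are the same hypothetical values:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[60.0, 18.4], [64.8, 21.6], [61.5, 20.8], [82.8, 22.4],
              [87.0, 23.6], [110.1, 19.2], [108.0, 17.6], [75.0, 19.6]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

# When the curve flattens, later splits are no longer improving the fit.
for leaves in range(2, 6):
    tree = DecisionTreeClassifier(max_leaf_nodes=leaves, random_state=0).fit(X, y)
    print(leaves, "leaves -> training accuracy:", tree.score(X, y))
```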

  14. Showing Fit Details. Focus on the misclassification rate and perhaps RMSE or Mean Abs Dev. For usefulness, focus on the confusion matrix and think about the two types of misclassification (false positives and false negatives).
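
A sketch of those fit details outside JMP: the misclassification rate plus the confusion matrix, whose two off-diagonal cells are the two kinds of error (same hypothetical data as above):

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier

X = np.array([[60.0, 18.4], [64.8, 21.6], [61.5, 20.8], [82.8, 22.4],
              [87.0, 23.6], [110.1, 19.2], [108.0, 17.6], [75.0, 19.6]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
pred = tree.predict(X)

print("misclassification rate:", np.mean(pred != y))
print(confusion_matrix(y, pred))  # rows = actual class, columns = predicted class
```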

  15. Importance of Predictors. A higher G^2 indicates greater importance in predicting the outcome.
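
scikit-learn exposes a related notion of predictor importance (the total impurity decrease credited to each variable); it is an analogue of, not the same number as, JMP's G^2-based contributions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[60.0, 18.4], [64.8, 21.6], [61.5, 20.8], [82.8, 22.4],
              [87.0, 23.6], [110.1, 19.2], [108.0, 17.6], [75.0, 19.6]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Importances sum to 1; larger means the predictor drove more of the splitting.
for name, importance in zip(["Income", "Lot_Size"], tree.feature_importances_):
    print(name, round(float(importance), 3))
```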

  16. Hot Spot > Save Columns > Save Prediction Formula. This saves prediction formula columns such as Prob(Ownership == owner).
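
The saved prediction formula corresponds to asking the fitted tree for class probabilities; in scikit-learn that is predict_proba, where the class-1 column plays the role of Prob(Ownership == owner). The new household's values below are made up for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[60.0, 18.4], [64.8, 21.6], [61.5, 20.8], [82.8, 22.4],
              [87.0, 23.6], [110.1, 19.2], [108.0, 17.6], [75.0, 19.6]])
y = np.array([0, 0, 0, 1, 1, 1, 1, 0])

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Probability of class 1 (owner) for a new, made-up household.
new_household = np.array([[90.0, 21.0]])
print(tree.predict_proba(new_household)[:, 1])
```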

  17. Exercise: Lost Sales
