
Decision Trees




  1. Decision Trees

  2. The “No Free Lunch” Theorem • Is there any representation that is compact (i.e., sub-exponential in n) for all functions? • Function = truth table • n attributes → 2^n rows in the table • The classification/target column is 2^n bits long • Drop one bit and you cut the number of representable functions in half! • 6 attributes → 2^(2^6) = 18,446,744,073,709,551,616 functions
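The counting argument on this slide can be checked directly: a truth table over n binary attributes has 2^n rows, and each row's output bit can be chosen independently, giving 2^(2^n) distinct functions. A quick sketch:

```python
# Count Boolean functions over n binary attributes.
# A truth table has 2**n rows; each row's output can be 0 or 1,
# so there are 2**(2**n) distinct functions -- doubly exponential in n.
def num_rows(n: int) -> int:
    return 2 ** n

def num_functions(n: int) -> int:
    return 2 ** (2 ** n)

print(num_rows(6))       # 64 rows in the truth table
print(num_functions(6))  # 18446744073709551616 functions
```

Each output bit doubles the count, which is exactly the slide's point that dropping one bit halves the number of representable functions.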

  3. Which attribute to select? (slide credit: Witten & Eibe)

  4. A criterion for attribute selection • Which is the best attribute? • The one that results in the smallest tree • Heuristic: choose the attribute that produces the “purest” child nodes • We need a good measure of purity! • Maximal when? • Minimal when?

  5. Information Gain • Which test is more informative: splitting on whether the applicant is Employed vs. Unemployed, or splitting on whether Balance is over vs. less than or equal to 50K?

  6. Information Gain Impurity/Entropy (informal) • Measures the level of impurity in a group of examples

  7. Impurity • Figure: three example groups, ranging from a very impure group, to a less impure one, to one with minimum impurity

  8. Calculating Impurity • Impurity = -Σi pi log2(pi), where pi is the proportion of class i • When examples can belong to one of two classes: Impurity = -p log2(p) - (1-p) log2(1-p) • What is the worst case of impurity?

  9. 2-class Cases • Minimum impurity: What is the impurity of a group in which all examples belong to the same class? Impurity = -1 log2(1) = 0 • Maximum impurity: What is the impurity of a group with 50% in either class? Impurity = -0.5 log2(0.5) - 0.5 log2(0.5) = 1
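The two-class impurity formula and both boundary cases above can be verified with a few lines of Python:

```python
import math

def impurity(p: float) -> float:
    """Two-class entropy, where p is the proportion of one class."""
    if p == 0.0 or p == 1.0:
        return 0.0  # a pure group has zero impurity (0 * log 0 -> 0)
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(impurity(1.0))  # 0.0 -> all examples in one class: minimum impurity
print(impurity(0.5))  # 1.0 -> 50/50 split: maximum impurity
```

Any mixed group falls strictly between these two extremes, which is why entropy works as a purity measure.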

  10. Calculating Information Gain • Information Gain = Impurity(parent) - [weighted average Impurity(children)] • Entire population: 30 instances, split into children of 17 and 13 instances • Weighted average impurity of children = (17/30) Impurity(child1) + (13/30) Impurity(child2) = 0.615 • Information Gain = 0.996 - 0.615 ≈ 0.38
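The gain computation on this slide can be reproduced in code. The slide does not give the per-class counts inside each node, so the counts below (14/16 in the parent, 13/4 and 1/12 in the children) are an assumption chosen to match the slide's stated values of 0.996, 0.615, and 0.38:

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Entropy of a node from its two class counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

# Assumed per-class counts (not given on the slide); they reproduce
# the slide's figures: parent 30 instances, children 17 and 13.
parent = entropy(14, 16)                          # ~0.996
left, right = entropy(13, 4), entropy(1, 12)
children = (17 / 30) * left + (13 / 30) * right   # ~0.615
gain = parent - children                          # ~0.38
```

Note the children's impurity enters as a weighted average, with each child weighted by its share of the parent's instances.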

  11. Decision Trees: Summary • Representation = decision trees • Bias = preference for small decision trees • Search algorithm = greedy top-down construction • Heuristic function = information gain • Overfitting and pruning
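The pieces in this summary fit together in one greedy step: at each node, compute the information gain of every attribute and split on the best one. A minimal sketch, using a hypothetical toy dataset where attribute 0 perfectly predicts the label and attribute 1 is noise:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Gain of splitting the examples on attribute index attr."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

def best_attribute(rows, labels):
    """One greedy step: pick the attribute with the highest gain."""
    return max(range(len(rows[0])), key=lambda a: info_gain(rows, labels, a))

# Hypothetical toy data: attribute 0 determines the label, attribute 1 is noise.
rows = [("yes", "a"), ("yes", "b"), ("no", "a"), ("no", "b")]
labels = ["+", "+", "-", "-"]
print(best_attribute(rows, labels))  # 0
```

Recursing on each resulting subset (and stopping or pruning to avoid overfitting) yields the full tree-building algorithm the summary describes.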
