1 / 24

Breeding Decision Trees Using Evolutionary Techniques

Breeding Decision Trees Using Evolutionary Techniques. Papagelis Athanasios - Kalles Dimitrios Computer Technology Institute & AHEAD RM. Introduction. We use GAs to evolve simple and accurate binary decision trees Simple genetic operators over tree structures

Download Presentation

Breeding Decision Trees Using Evolutionary Techniques

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Breeding Decision Trees Using Evolutionary Techniques Papagelis Athanasios - Kalles DimitriosComputer Technology Institute & AHEAD RM

  2. Introduction • We use GAs to evolve simple and accurate binary decision trees • Simple genetic operators over tree structures • Experiments with UCI datasets • very good size • competitive accuracy results • Experiments with synthetic datasets • Superior accuracy results

  3. Current tree induction algorithms… • .. Use greedy heuristics • To guide search during tree building • To prune the resulting trees • Fast implementations • Accurate results on widely used benchmark datasets (like UCI datasets) • Optimal results ? • No • Good for real world problems? • There are not many real world datasets available for research.

  4. More on greedy heuristics • They can quickly guide us to desired solutions • On the other hand they can substantially deviate from optimal • WHY? • They are very strict • Which means that they are VERY GOOD just for a limited problem space

  5. Why GAs should work ? • GAs are not • Hill climbers • Blind on complex search spaces • Exhaustive searchers • Extremely expensive • They are … • Beam searchers • They balance between time needed and space searched • Application on bigger problem space • Good results for much more problems • No need to tune or derive new algorithms

  6. Another way to see it.. • Biases • Preference bias • Characteristics of output • We should choose about it • e.g small trees • Procedural bias • How we will search? • We should not choose about it • Unfortunately we have to: • Greedy heuristics make strong hypotheses about search space • GAs make weak hypotheses about search space

  7. The real world question… • Are there datasets where hill-climbing techniques are really inadequate ? • e.g unnecessarily big – misguiding output • Yes there are… • Conditionally dependent attributes • e.g XOR • Irrelevant attributes • Many solutions that use GAs as a preprocessor so as to select adequate attributes • Direct genetic search can be proven more efficient for those datasets

  8. The proposed solution • Select the desired decision tree characteristics (e.g small size) • Adopt a decision tree representation with appropriate genetic operators • Create an appropriate fitness function • Produce a representative initial population • Evolve for as long as you wish!

  9. Initialization procedure • Population of minimum decision trees • Simple and fast • Choose a random value as test value • Choose two random classes as leaves A=2 Class=2 Class=1

  10. Genetic operators

  11. Payoff function • Balance between accuracy and size • set x depending on the desired output characteristics. • Small Trees ?  x near 1 • Emphasis on accuracy ?  x grows big

  12. Advanced System Characteristics • Scalled payoff function (Goldberg, 1989) • Alternative crossovers • Evolution towards fit subtrees • Accurate subtrees had less chance to be used for crossover or mutation. • Limited Error Fitness (LEF) (Gathercole & Ross, 1997) • significant CPU timesavings and insignificant accuracy loses

  13. Second Layer GA • Test the effectiveness of all those components • coded information about the mutation/crossover rates and different heuristics as well as a number of other optimizing parameters • Most recurring results: • mutation rate 0.005 • crossover rate 0.93 • use a crowding avoidance technique • Alternative crossover/mutation techniques did not produce better results than basic crossover/mutation

  14. Search space / Induction costs • 10 leaves,6 values,2 classes • Search space >50,173,704,142,848(HUGE!) • Greedy feature selection • O(ak) a=attributes,k=instances (Quinlan 1986) • O(a2k2)one level lookahead (Murthy and Salzberg, 1995) • O(adkd) for d-1 levels of lookahead • Proposed heuristic • O(gen* k2*a). • Extended heuristic • O(gen*k*a)

  15. How it works? An example (a) • An artificial dataset with eight rules (26 possible value, three classes) • First two activation-rules as below: • (15.0 %) c1 A=(a or b or t) & B=(a or h or q or x) • (14.0%) c1 B=(f or l or s or w) & C=(c or e or f or k) • Huge Search Space !!!

  16. How it works? An example (b)

  17. Illustration of greedy heuristics problem • An example dataset (XOR over A1&A2)

  18. A1=t A2=f A2=t t f f t C4.5 result tree A3=t A1=t A2=f A2=t t f f t Totally unacceptable!!!

  19. More experiments towards this direction

  20. Results for artificial datasets

  21. Results for UCI datasets

  22. C4.5 / OneR deficiencies • Similar preference biases • Accurate, small decision trees • This is acceptable • Not optimized procedural biases • Emphasis on accuracy (C4.5) • Not optimized tree’s size • Emphasis on size (OneR) • Trivial search policy • Pruning as a greedy heuristic has similar disadvantages

  23. Average needed re-classification Future work • Minimize evolution time • crossover/mutation operators change the tree from a node downwards • we can classify only the instances that belong to the changed-node’s subtree. • But we need to maintain more node statistics

  24. Future work (2) • Choose the output class using a majority vote over the produced tree forest (experts voting) • Pruning is a greedy heuristic • A GA’s pruning?

More Related