
Fuzzy-Rough Feature Significance for Fuzzy Decision Trees



Presentation Transcript


  1. Richard Jensen, Qiang Shen. Fuzzy-Rough Feature Significance for Fuzzy Decision Trees. Advanced Reasoning Group, Department of Computer Science, The University of Wales, Aberystwyth

  2. Outline • Utility of decision tree induction • Importance of attribute selection • Introduction of fuzzy-rough concepts • Evaluation of the fuzzy-rough metric • Results of F-ID3 vs FR-ID3 • Conclusions

  3. Decision Trees • Popular classification algorithm in data mining and machine learning • Fuzzy decision trees (FDTs) follow similar principles to crisp decision trees • FDTs allow greater flexibility: objects may belong to nodes to a degree, rather than crisply • Partitioning of the instance space; attributes are selected to derive partitions • Hence, attribute selection is an important factor in decision tree quality

  4. Fuzzy Decision Trees • Object membership • Traditionally, node membership of {0,1} • Here, membership is any value in the range [0,1] • Calculated from conjunction of membership degrees along path to the node • Fuzzy tests • Carried out within nodes to determine the membership of feature values to fuzzy sets • Stopping criteria • Measure of feature significance
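To make the path-conjunction concrete, here is a minimal sketch in Python, assuming min as the t-norm for conjunction (product is another common choice); the test functions and object layout are illustrative, not from the paper:

```python
# Membership of an object in a node: conjunction (here: min) of the
# membership degrees of the fuzzy tests along the path from the root.
def node_membership(path_tests, obj):
    """path_tests: list of functions mapping an object to a degree in [0, 1]."""
    degree = 1.0  # membership at the root
    for test in path_tests:
        degree = min(degree, test(obj))  # t-norm; product is an alternative
    return degree

# Example: two fuzzy tests along one path
tests = [lambda o: o["temp_high"], lambda o: o["humidity_low"]]
print(node_membership(tests, {"temp_high": 0.8, "humidity_low": 0.6}))  # 0.6
```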

  5. Decision Tree Algorithm • Input: training set S and (optionally) depth limit l of the decision tree • Start to form the decision tree from the top level • Do loop until the depth of the tree reaches l, or there is no node left to expand: a) Gauge the significance of each attribute of S not already expanded in this branch b) Expand the attribute with the most significance c) Stop expansion of a leaf node if maximum significance is obtained • End do loop
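A self-contained recursive sketch of this loop; the data layout (objects carried with their current node membership) and the tuple tree representation are assumptions for illustration, and `significance` is the pluggable measure the rest of the talk is about:

```python
def induce_tree(S, attributes, fuzzy_sets, significance, depth=0, max_depth=None):
    """S: list of (object, class, membership) triples reaching this node.
    fuzzy_sets: maps each attribute to {term: membership_function}."""
    # Stop when the depth limit l is reached or no attribute is left to expand.
    if (max_depth is not None and depth >= max_depth) or not attributes:
        return ("leaf", S)
    # a) Gauge the significance of each attribute not already expanded here.
    scores = {a: significance(S, a) for a in attributes}
    best = max(scores, key=scores.get)
    # c) Stop expanding this node once no attribute adds significance.
    if scores[best] <= 0.0:
        return ("leaf", S)
    # b) Expand the most significant attribute: one child per fuzzy set,
    #    with node memberships combined by min along the path.
    children = {}
    for term, mu in fuzzy_sets[best].items():
        child = [(obj, cls, min(m, mu(obj))) for obj, cls, m in S]
        children[term] = induce_tree(child, [a for a in attributes if a != best],
                                     fuzzy_sets, significance, depth + 1, max_depth)
    return ("node", best, children)
```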

  6. Feature Significance • Previous FDT inducers use fuzzy entropy • Little research in the area of alternatives • Fuzzy-rough feature significance has been used previously in feature selection with much success • This can also be used to gauge feature importance within FDT construction • The fuzzy-rough measure extends concepts from crisp rough set theory
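For reference, a sketch of the fuzzy entropy baseline in one common formulation, with class proportions weighted by node membership degrees; the exact variant used by earlier FDT inducers is an assumption, as the slide does not give the formula:

```python
import math

def fuzzy_entropy(S):
    """S: list of (class_label, membership_degree) pairs for one node."""
    total = sum(m for _, m in S)
    if total == 0:
        return 0.0
    entropy = 0.0
    for cls in {c for c, _ in S}:
        # Class proportion weighted by membership in the node.
        p = sum(m for c, m in S if c == cls) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy

print(fuzzy_entropy([("yes", 0.9), ("yes", 0.4), ("no", 0.3)]))  # ~0.696
```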

  7. Crisp Rough Sets • Equivalence class [x]B: the set of all points indiscernible from x in terms of feature subset B • Lower approximation of a set X: all points x whose equivalence class [x]B is fully contained in X • Upper approximation of X: all points x whose equivalence class [x]B overlaps X • [Figure: set X with its lower and upper approximations drawn over the equivalence classes]
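A small runnable sketch of the two approximations over a toy table; the data layout is an assumption for illustration:

```python
def eq_class(x, B, data):
    """[x]_B: all objects with the same values as x on the features in B."""
    return {y for y in data if all(data[y][b] == data[x][b] for b in B)}

def lower_upper(X, B, data):
    lower = {x for x in data if eq_class(x, B, data) <= X}  # [x]_B subset of X
    upper = {x for x in data if eq_class(x, B, data) & X}   # [x]_B meets X
    return lower, upper

# Tiny example: three objects, two features
data = {1: {"a": 0, "b": 1}, 2: {"a": 0, "b": 1}, 3: {"a": 1, "b": 0}}
print(lower_upper({1, 3}, ["a", "b"], data))  # lower={3}, upper={1, 2, 3}
```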

  8. Fuzzy Equivalence Classes • At the centre of fuzzy-rough feature selection • Incorporate vagueness • Handle real-valued data • Cope with noisy data • [Figure: a crisp equivalence class vs. a fuzzy equivalence class] Image: Rough Fuzzy Hybridization: A New Trend in Decision Making, S. K. Pal and A. Skowron (eds.), Springer-Verlag, Singapore, 1999

  9. Fuzzy-Rough Significance • Deals with real-valued features via fuzzy sets • Fuzzy lower approximation: $\mu_{\underline{P}X}(F_i) = \inf_{x \in U} \max\{1 - \mu_{F_i}(x), \mu_X(x)\}$, extended to objects by $\mu_{\underline{P}X}(x) = \sup_{F \in U/P} \min(\mu_F(x), \mu_{\underline{P}X}(F))$ • Fuzzy positive region: $\mu_{POS_P(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{P}X}(x)$ • Evaluation function: $\gamma'_P(Q) = \frac{1}{|U|} \sum_{x \in U} \mu_{POS_P(Q)}(x)$ • Feature importance is estimated with this measure
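A compact sketch of the γ′ computation from these definitions, assuming the fuzzy equivalence classes of P and the decision classes of Q are supplied as membership dictionaries over the universe; the representation is illustrative:

```python
def gamma_prime(U, P_classes, Q_classes):
    """U: list of objects; P_classes, Q_classes: lists of dicts
    mapping each object to a membership degree in [0, 1]."""
    def lower(F, X):
        # mu_{PX}(F) = inf_x max(1 - mu_F(x), mu_X(x))
        return min(max(1 - F[x], X[x]) for x in U)
    def pos(x):
        # Positive-region membership: sup over decision classes X and
        # fuzzy equivalence classes F of min(mu_F(x), mu_{PX}(F)).
        return max(min(F[x], lower(F, X)) for F in P_classes for X in Q_classes)
    return sum(pos(x) for x in U) / len(U)

# Toy example: two fuzzy equivalence classes, two decision classes
U = [0, 1, 2]
F1 = {0: 0.9, 1: 0.8, 2: 0.1}; F2 = {0: 0.1, 1: 0.2, 2: 0.9}
X1 = {0: 1.0, 1: 1.0, 2: 0.0}; X2 = {0: 0.0, 1: 0.0, 2: 1.0}
print(gamma_prime(U, [F1, F2], [X1, X2]))  # ~0.833
```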

  10. Evaluation • Is the γ' metric a useful gauge of feature significance? • The γ' metric is compared with leading feature rankers: Information Gain, Gain Ratio, Chi2, Relief, OneR • Applied to test data: 30 features with random values for 400 objects, where only 2 or 3 features determine the classification • Task: locate those features that affect the decision
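A sketch of how such test data might be generated, using one of the decision functions from the next slide; generator details beyond what the slides state are assumptions:

```python
import random

random.seed(0)
n_objects, n_features = 400, 30
data = [[random.random() for _ in range(n_features)] for _ in range(n_objects)]
# Only the first three features matter here, via x*y*z^2 > 0.125; a good
# significance metric should rank exactly these features highest.
labels = [int(r[0] * r[1] * r[2] ** 2 > 0.125) for r in data]
```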

  11. Evaluation… • Results for x*y*z^2 > 0.125 • Results for (x + y)^3 < 0.125 • FR, IG and GR perform best • The FR metric locates the most important features

  12. FDT Experiments • Fuzzy ID3 (F-ID3) compared with Fuzzy-Rough ID3 (FR-ID3) • The only difference between the methods is the choice of feature significance measure • Datasets taken from the machine learning repository • Data split into two equal halves: training and testing • Resulting trees converted to equivalent rulesets (see the sketch below)
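The last step, converting a tree to an equivalent ruleset, can be sketched for the tuple representation used in the `induce_tree` sketch earlier: each root-to-leaf path becomes one rule. The representation is an assumption carried over from that sketch:

```python
def tree_to_rules(tree, conditions=()):
    """Flatten a ("node"/"leaf", ...) tree into (conditions, leaf) rules."""
    if tree[0] == "leaf":
        return [(conditions, tree[1])]  # IF conditions THEN leaf outcome
    _, attr, children = tree
    rules = []
    for term, child in children.items():
        rules += tree_to_rules(child, conditions + ((attr, term),))
    return rules

# Usage on a hand-built toy tree:
tree = ("node", "outlook", {"sunny": ("leaf", "play"), "rainy": ("leaf", "stay")})
print(tree_to_rules(tree))
# [((('outlook', 'sunny'),), 'play'), ((('outlook', 'rainy'),), 'stay')]
```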

  13. Results • Real-valued data • Average ruleset size • 56.7 for F-ID3 • 88.6 for FR-ID3 • F-ID3 performs marginally better than FR-ID3

  14. Results… • Crisp data • Average ruleset size • 30.2 for F-ID3 • 28.8 for FR-ID3 • FR-ID3 performs marginally better than F-ID3

  15. Conclusion • Decision trees are a popular means of classification • The selection of branching attributes is key to resulting tree quality • The use of a fuzzy-rough metric for this purpose looks promising • Future work • Further experimental evaluation • Fuzzy-rough feature reduction pre-processor
