
Feature Selection



Presentation Transcript


  1. Feature Selection Jamshid Shanbehzadeh, Samaneh Yazdani Department of Computer Engineering, Faculty of Engineering, Kharazmi University (Tarbiat Moallem University of Tehran)

  2. Outline

  3. Outline • Part 1: Dimension Reduction • Dimension • Feature Space • Definition & Goals • Curse of dimensionality • Research and Application • Grouping of dimension reduction methods • Part 2: Feature Selection • Parts of the feature set • Feature Selection Approaches • Part 3: Application of Feature Selection and Software

  4. Part 1: Dimension Reduction

  5. Dimension Reduction • Dimension • Dimension (Feature or Variable): • A measurement of a certain aspect of an object • Two features of a person: • weight • height

  6. Dimension Reduction • Feature Space • Feature Space: • An abstract space where each pattern sample is represented as a point

  7. Dimension Reduction • Introduction • Large, high-dimensional data: web documents, etc. • A large amount of resources is needed for • Information Retrieval • Classification tasks • Data Preservation, etc. ⇒ Dimension Reduction

  8. Dimension Reduction • Definition & Goals • Dimensionality reduction: • The study of methods for reducing the number of dimensions describing the object • General objectives of dimensionality reduction: • Reduce the computational cost • Improve the quality of data for efficient data-intensive processing tasks

  9. Dimension Reduction • Definition & Goals [Figure: samples of Class 1 (overweight) and Class 2 (underweight) plotted against Height (cm) and Weight (kg)] • Dimension Reduction • preserves information on the classification of overweight vs. underweight as much as possible • makes classification easier • reduces data size (2 features → 1 feature)

  10. Dimension Reduction • Curse of dimensionality • As the number of dimensions increases, a fixed-size data sample becomes exponentially sparse • Example: observe that the data become more and more sparse in higher dimensions • An effective solution to the problem of the “curse of dimensionality” is: • Dimensionality reduction
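
A minimal sketch of this sparsity effect, assuming points drawn uniformly in the unit hypercube (the setup and numbers are illustrative, not from the slides): with the sample size fixed, the average distance to the nearest neighbor grows as the dimension increases.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_nn_distance(n_points, dim):
    """Average nearest-neighbor distance for points uniform in [0, 1]^dim."""
    x = rng.random((n_points, dim))
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)  # ignore each point's zero distance to itself
    return d.min(axis=1).mean()

# Same sample size, growing dimension: neighbors drift apart.
for dim in (1, 2, 10, 100):
    print(f"dim={dim:>3}  mean NN distance={mean_nn_distance(200, dim):.3f}")
```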

  11. Dimension Reduction • Research and Application • Why has dimension reduction recently become a subject of much research? • Massive data of large dimensionality in: • Knowledge discovery • Text mining • Web mining • and more

  12. Dimension Reduction • Grouping of dimension reduction methods • Dimensionality reduction approaches include • Feature Selection • Feature Extraction

  13. Dimension Reduction • Grouping of dimension reduction methods: Feature Selection • Dimensionality reduction approaches include • Feature Selection: the problem of choosing a small subset of features that ideally is necessary and sufficient to describe the target concept. • Example: • Feature set = {X, Y} • Two classes • Goal: classification • Feature X or feature Y? • Answer: feature X (it separates the two classes; Y does not)
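
One way to make this choice concrete is a per-feature separability score. The sketch below is illustrative only (the synthetic data, names, and the Fisher-style criterion are assumptions, not from the slides): the class-separating feature scores far higher than the noise feature.

```python
import numpy as np

def fisher_score(values, labels):
    """Fisher-style score: between-class separation over within-class spread."""
    c0, c1 = values[labels == 0], values[labels == 1]
    return (c0.mean() - c1.mean()) ** 2 / (c0.var() + c1.var() + 1e-12)

rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 50)
# X separates the two classes; Y is pure noise.
x = np.where(labels == 0, rng.normal(0.0, 1.0, 100), rng.normal(4.0, 1.0, 100))
y = rng.normal(0.0, 1.0, 100)

print("score(X) =", fisher_score(x, labels))  # large -> select X
print("score(Y) =", fisher_score(y, labels))  # near zero -> discard Y
```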

  14. Dimension Reduction • Grouping of dimension reduction methods: Feature Selection • Feature Selection (FS) • Selects features • e.g., keeps the original feature weight

  15. Dimension Reduction • Grouping of dimension reduction methods • Dimensionality reduction approaches include • Feature Extraction: creates new features based on transformations or combinations of the original feature set • e.g., a new feature computed from the original features {X1, X2}

  16. Dimension Reduction • Grouping of dimension reduction methods • Feature Extraction (FE) • Generates features • e.g., preserves information from both weight and height in one derived feature

  17. Dimension Reduction • Grouping of dimension reduction methods • Dimensionality reduction approaches include • Feature Extraction: creates new features based on transformations or combinations of the original feature set • N: number of original features • M: number of extracted features • M < N
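
As one concrete (assumed, not from the slides) instance of M < N extraction, a PCA-style linear projection in NumPy: each of the M new features is a linear combination of all N originals.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))   # 100 samples, N = 5 original features
X -= X.mean(axis=0)             # center the data before projecting

# PCA-style extraction: project onto the top M principal directions.
M = 2
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:M].T                # M = 2 extracted features, each a linear
                                # combination of all N = 5 originals
print(X.shape, "->", Z.shape)   # (100, 5) -> (100, 2)
```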

  18. Dimension Reduction • Question: Feature Selection or Feature Extraction? • It depends on the problem. • Examples: • Pattern recognition: the dimensionality reduction problem is to extract a small set of features that recovers most of the variability of the data • Text mining: the problem is defined as selecting a small subset of words or terms (not new features that are combinations of words or terms) • Image compression: the problem is finding the best extracted features to describe the image

  19. Part 2: Feature selection

  20. Feature selection • Thousands to millions of low-level features: select the most relevant ones to build better, faster, and easier-to-understand learning machines. [Figure: an N × n data matrix reduced to N × m by keeping m of the n feature columns]

  21. Feature selection • Parts of feature set • Irrelevant OR Relevant • Three disjoint categories of features: • Irrelevant • Weakly Relevant • Strongly Relevant

  22. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Two classes: {Lion, Deer} • We use some features to classify a new instance • To which class does this animal belong?

  23. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Two classes: {Lion, Deer} • We use some features to classify a new instance • Feature 1: Number of legs • Q: Number of legs? A: 4 • So, number of legs is an irrelevant feature

  24. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Two classes: {Lion, Deer} • We use some features to classify a new instance • Feature 1: Number of legs • Feature 2: Color • Q: What is its color? A: (shown in the slide image) • So, color is an irrelevant feature

  25. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Two classes: {Lion, Deer} • We use some features to classify a new instance • Feature 1: Number of legs • Feature 2: Color • Feature 3: Type of food • Q: What does it eat? A: Grass • So, Feature 3 is a relevant feature

  26. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Three classes: {Lion, Deer, Leopard} • We use some features to classify a new instance • To which class does this animal belong?

  27. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Three classes: {Lion, Deer, Leopard} • We use some features to classify a new instance • Feature 1: Number of legs • Q: Number of legs? A: 4 • So, number of legs is an irrelevant feature

  28. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Three classes: {Lion, Deer, Leopard} • We use some features to classify a new instance • Feature 1: Number of legs • Feature 2: Color • Q: What is its color? A: (shown in the slide image) • So, color is a relevant feature

  29. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Three classes: {Lion, Deer, Leopard} • We use some features to classify a new instance • Feature 1: Number of legs • Feature 2: Color • Feature 3: Type of food • Q: What does it eat? A: Meat • So, Feature 3 is a relevant feature

  30. Feature selection • Parts of feature set • Irrelevant OR Relevant • Goal: Classification • Three classes: {Lion, Deer, Leopard} • We use some features to classify a new instance • Feature 1: Number of legs • Feature 2: Color • Feature 3: Type of food • Add a new feature: Felidae • It is a weakly relevant feature • Optimal set: {Color, Type of food} or {Color, Felidae}

  31. Feature selection • Parts of feature set • Irrelevant OR Relevant • Traditionally, feature selection research has focused on searching for relevant features. [Figure: the feature set split into relevant and irrelevant features]

  32. Feature selection • Parts of feature set • Irrelevant OR Relevant: an example of the problem • Data set: five Boolean features with C = F1 ∨ F2, F3 = ¬F2, F5 = ¬F4 • Optimal subset: {F1, F2} or {F1, F3}
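
Since the data set is fully specified by these constraints, the optimality claim can be checked mechanically. A small sketch (the `determines` helper is mine, for illustration): it rebuilds the truth table and tests which feature subsets fully determine C.

```python
from itertools import product

# Enumerate the data set implied by the constraints:
# F1, F2, F4 vary freely; F3 = not F2, F5 = not F4, class C = F1 or F2.
rows = [(f1, f2, 1 - f2, f4, 1 - f4, f1 | f2)
        for f1, f2, f4 in product([0, 1], repeat=3)]

def determines(cols, rows):
    """True if C is a function of the feature columns in `cols`."""
    seen = {}
    for r in rows:
        key = tuple(r[i] for i in cols)
        if seen.setdefault(key, r[5]) != r[5]:
            return False   # same feature values, different class -> not determined
    return True

print(determines((0, 1), rows))  # {F1, F2} determines C -> True
print(determines((0, 2), rows))  # {F1, F3} determines C -> True
print(determines((3, 4), rows))  # {F4, F5} does not      -> False
```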

  33. Feature selection • Parts of feature set • Irrelevant OR Relevant • Formal Definition 1 (Irrelevance): • Irrelevance indicates that the feature is not necessary at all. • In the previous example: F4 and F5 are irrelevant [Figure: F4 and F5 placed outside the relevant region of the feature set]

  34. Feature selection • Parts of feature set • Irrelevant OR Relevant • Let F be the full set of features, Fi a feature, and Si = F − {Fi} • Definition 1 (Irrelevance): a feature Fi is irrelevant iff ∀ S′i ⊆ Si, P(C | Fi, S′i) = P(C | S′i) • Irrelevance indicates that the feature is not necessary at all

  35. Feature selection • Parts of feature set • Irrelevant OR Relevant • Categories of relevant features: • Strongly Relevant • Weakly Relevant [Figure: the feature set split into strongly relevant, weakly relevant, and irrelevant features]

  36. Feature selection • Parts of feature set • Irrelevant OR Relevant: an example of the problem • Data set: five Boolean features with C = F1 ∨ F2, F3 = ¬F2, F5 = ¬F4

  37. Feature selection • Parts of feature set • Irrelevant OR Relevant • Formal Definition 2 (Strong relevance): • Strong relevance of a feature indicates that the feature is always necessary for an optimal subset • It cannot be removed without affecting the original conditional class distribution. • In the previous example: feature F1 is strongly relevant [Figure: F1 in the strongly relevant region; F4 and F5 irrelevant]

  38. Feature selection • Parts of feature set • Irrelevant OR Relevant • Definition 2 (Strong relevance): a feature Fi is strongly relevant iff P(C | Fi, Si) ≠ P(C | Si) • A strongly relevant feature cannot be removed without affecting the original conditional class distribution

  39. Feature selection • Parts of feature set • Irrelevant OR Relevant • Formal Definition 3 (Weak relevance): • Weak relevance suggests that the feature is not always necessary but may become necessary for an optimal subset under certain conditions. • In the previous example: F2 and F3 are weakly relevant [Figure: F2 and F3 in the weakly relevant region; F1 strongly relevant; F4 and F5 irrelevant]

  40. Feature selection • Parts of feature set • Irrelevant OR Relevant • Definition 3 (Weak relevance): a feature Fi is weakly relevant iff P(C | Fi, Si) = P(C | Si) and ∃ S′i ⊂ Si such that P(C | Fi, S′i) ≠ P(C | S′i) • Weak relevance suggests that the feature is not always necessary but may become necessary for an optimal subset under certain conditions.
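
Definitions 1–3 can be tested brute-force on the Boolean example by comparing empirical conditional distributions of C over every subset of the remaining features. A sketch (helper names are assumptions; the subset loop is exponential, so this only works for tiny feature sets):

```python
from itertools import combinations, product
from collections import defaultdict

# Boolean example again: F3 = not F2, F5 = not F4, C = F1 or F2.
rows = [(f1, f2, 1 - f2, f4, 1 - f4, f1 | f2)
        for f1, f2, f4 in product([0, 1], repeat=3)]

def cond_dist(cols, rows):
    """Empirical P(C | features in cols), treating rows as equally likely."""
    counts = defaultdict(lambda: [0, 0])
    for r in rows:
        counts[tuple(r[i] for i in cols)][r[5]] += 1
    return {k: (v[0] / sum(v), v[1] / sum(v)) for k, v in counts.items()}

def adds_information(i, subset, rows):
    """True if P(C | Fi, subset) != P(C | subset) for some assignment."""
    base = cond_dist(subset, rows)
    extended = cond_dist(subset + (i,), rows)   # Fi appended as the last column
    return any(extended[k] != base[k[:-1]] for k in extended)

for i in range(5):
    Si = tuple(j for j in range(5) if j != i)
    if adds_information(i, Si, rows):
        label = "strongly relevant"   # Definition 2: Fi matters given all the rest
    elif any(adds_information(i, s, rows)
             for n in range(len(Si)) for s in combinations(Si, n)):
        label = "weakly relevant"     # Definition 3: Fi matters given some proper subset
    else:
        label = "irrelevant"          # Definition 1: Fi never matters
    print(f"F{i + 1}: {label}")
# Expected: F1 strongly relevant; F2, F3 weakly relevant; F4, F5 irrelevant.
```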

  41. Feature selection • Parts of feature set • Optimal Feature Subset • Example: • In order to determine the target concept (C = g(F1, F2)): • F1 is indispensable • One of F2 and F3 can be disposed of • Both F4 and F5 can be discarded • Optimal subset: either {F1, F2} or {F1, F3} • The goal of feature selection is to find either of them.

  42. Feature selection • Parts of feature set • Optimal Feature Subset • Optimal subset: either {F1, F2} or {F1, F3} • Conclusion • An optimal subset should include all strongly relevant features, none of the irrelevant features, and a subset of the weakly relevant features. • Open question: which of the weakly relevant features should be selected, and which removed?

  43. Feature selection • Parts of feature set • Redundancy • Solution • Defining Feature Redundancy

  44. Feature selection • Parts of feature set • Redundancy • Redundancy • It is widely accepted that two features are redundant to each other if their values are completely correlated • In the previous example: F2 and F3 (F3 = ¬F2)

  45. Feature selection • Parts of feature set • Redundancy • Markov blanket: used when one feature is correlated with a set of features • Given a feature Fi, let Mi ⊆ F with Fi ∉ Mi; Mi is said to be a Markov blanket for Fi iff P(F − Mi − {Fi}, C | Fi, Mi) = P(F − Mi − {Fi}, C | Mi) • The Markov blanket condition requires that Mi subsume not only the information that Fi has about C, but also about all of the other features.
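
A sketch of checking this condition on the same Boolean example (the `dist` helper and the equal-weight assumption on rows are mine): {F3} should pass as a Markov blanket for F2, since F3 = ¬F2 carries everything F2 says about the class and the other features, while {F1} should fail.

```python
from itertools import product
from collections import defaultdict

rows = [(f1, f2, 1 - f2, f4, 1 - f4, f1 | f2)
        for f1, f2, f4 in product([0, 1], repeat=3)]

def dist(target, given, rows):
    """Empirical P(target | given) as {given-value: {target-value: prob}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for r in rows:
        g = tuple(r[i] for i in given)
        counts[g][tuple(r[i] for i in target)] += 1
    return {g: {t: n / sum(c.values()) for t, n in c.items()}
            for g, c in counts.items()}

def is_markov_blanket(i, M, rows):
    """Check P(rest, C | Fi, M) == P(rest, C | M) on all observed assignments."""
    rest = tuple(j for j in range(5) if j != i and j not in M) + (5,)  # F - M - {Fi}, plus C
    with_fi = dist(rest, (i,) + M, rows)   # conditioned on Fi and M
    without = dist(rest, M, rows)          # conditioned on M alone
    return all(with_fi[g] == without[g[1:]] for g in with_fi)

print(is_markov_blanket(1, (2,), rows))  # {F3} is a Markov blanket for F2 -> True
print(is_markov_blanket(1, (0,), rows))  # {F1} is not                     -> False
```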

  46. Feature selection • Parts of feature set • Redundancy • The redundancy definition further divides weakly relevant features into redundant and non-redundant ones [Figure: feature set split into strongly relevant, weakly relevant (region II: redundant; region III: non-redundant), and irrelevant features] • II: weakly relevant and redundant features • III: weakly relevant but non-redundant features • Optimal subset: strongly relevant features + weakly relevant but non-redundant features

  47. Feature selection • Approaches

  48. Feature selection • Approaches: Subset Evaluation (Feature Subset Selection) • Framework of feature selection via subset evaluation

  49. Feature selection • Approaches: Subset Evaluation (Feature Subset Selection) • Subset Generation • Generates subsets of features for evaluation • Can start with: • no features • all features • a random subset of features [Flowchart: (1) subset generation from the original feature set → (2) subset evaluation, measuring the goodness of the subset → (3) stopping criterion: if not met, loop back to generation; if met → (4) validation of the selected subset]

  50. Feature selection • Approaches: Subset Evaluation (Feature Subset Selection) • Subset search method: exhaustive search • Examines all combinations of feature subsets. • Example: • {f1, f2, f3} ⇒ { {f1}, {f2}, {f3}, {f1, f2}, {f1, f3}, {f2, f3}, {f1, f2, f3} } • Order of the search space: O(2^d), where d is the number of features. • The optimal subset is achievable. • Too expensive if the feature space is large.
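
A sketch of exhaustive search (the function names and the toy scoring function are assumptions, not from the slides): it walks all 2^d − 1 non-empty subsets and keeps the best-scoring one, which is exactly why it is optimal but intractable for large d.

```python
from itertools import combinations, product

def exhaustive_search(n_features, evaluate):
    """Evaluate every non-empty subset (2^d - 1 of them) and keep the best."""
    best_subset, best_score = None, float("-inf")
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = evaluate(subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

# Toy evaluation on the earlier Boolean example: a subset scores well if it
# determines C, and smaller subsets are preferred.
rows = [(f1, f2, 1 - f2, f4, 1 - f4, f1 | f2)
        for f1, f2, f4 in product([0, 1], repeat=3)]

def evaluate(subset):
    seen = {}
    for r in rows:
        key = tuple(r[i] for i in subset)
        if seen.setdefault(key, r[5]) != r[5]:
            return -len(subset)       # inconsistent: C is not determined
    return 10 - len(subset)           # consistent: reward smaller subsets

print(exhaustive_search(5, evaluate))  # -> ((0, 1), 8), i.e. the subset {F1, F2}
```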
