
Incorporating Game Theory in Feature Selection for Text Categorization

Nouman Azam and JingTao Yao, Department of Computer Science, University of Regina, Canada S4S 0A2. azam200n@cs.uregina.ca, jtyao@cs.uregina.ca. http://www.cs.uregina.ca/~azam200n http://www.cs.uregina.ca/~jtyao


Presentation Transcript


  1. Incorporating Game Theory in Feature Selection for Text Categorization. Nouman Azam and JingTao Yao, Department of Computer Science, University of Regina, Canada S4S 0A2. azam200n@cs.uregina.ca, jtyao@cs.uregina.ca. http://www.cs.uregina.ca/~azam200n http://www.cs.uregina.ca/~jtyao

  2. Acknowledgement • Thanks to Dr. Dominik Slezak for presenting this work on our behalf. Incorporating Game Theory in Feature Selection for TC

  3. Introduction • Feature selection. • Selecting a subset of important features. • Text categorization. • Assigning textual documents to predefined categories. • Text categorization and high imbalance. • The number of instances per category varies significantly. • The importance of features varies accordingly. • It is hard to apply feature selection techniques directly.

  4. Feature Selection in Text Categorization • Assigning positive or negative values to features. • The values indicate the importance of features. • Positive values indicate importance for the positive category. • Negative values indicate importance for the negative category.

  5. Existing Feature Selection Approaches • One-sided approaches. • Selecting features with high positive values. • Two-sided approaches. • Selecting features with high absolute values. • Explicit combinational approach. • Selecting features with high positive or high negative values generated by a one-sided method.
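
The three selection strategies on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the cut-offs `k`, `k_pos`, `k_neg`, and the scores below are hypothetical and do not come from the presentation.

```python
def one_sided(scores, k):
    """Keep the k features with the highest signed scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def two_sided(scores, k):
    """Keep the k features with the highest absolute scores."""
    return sorted(scores, key=lambda w: abs(scores[w]), reverse=True)[:k]


def explicit_combination(scores, k_pos, k_neg):
    """Keep the k_pos most positive and the k_neg most negative features.

    Assumes k_pos + k_neg does not exceed the number of features,
    so the two slices cannot overlap.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k_pos] + ranked[::-1][:k_neg]


# Hypothetical signed scores produced by some one-sided metric.
scores = {"w1": 0.9, "w2": 0.4, "w3": -0.7, "w4": -0.2, "w5": 0.1}

print(one_sided(scores, 2))
print(two_sided(scores, 2))
print(explicit_combination(scores, 1, 1))
```

With these made-up scores, the one-sided rule never reaches the strongly negative `w3`, while the other two rules do.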

  6. Limitations of Existing Approaches • They favour features indicative of either the positive or the negative category. • There may be features that indicate both categories. • It is plausible to include such features in some applications. • Dilemma: positive features vs. negative features. • However, we need to find a way to select these features. • Incorporating game theory in feature selection to deal with this issue.

  7. Incompetence of Existing Approaches • An example. • Consider an imbalanced data set with 10 documents in the positive category and 100 in the negative category. • There are eight words in these documents. • Consider four methods. • One-sided approaches: correlation coefficient and GSS coefficient. • Two-sided approaches: chi-square and Gini index.
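
To make the one-sided versus two-sided distinction concrete, the sketch below computes three of the four metrics from a 2x2 contingency table, using the commonly cited textbook forms (not necessarily the exact variants used in the presentation). The counts follow the A, B, C, D convention defined later on slide 18; the example counts are hypothetical, chosen only to match the 10 versus 100 class sizes. The Gini index is left out because several variants exist and the transcript does not say which one was used.

```python
import math


def contingency_scores(A, B, C, D):
    """Score a word from document counts.

    A: positive documents containing the word
    B: negative documents containing the word
    C: positive documents not containing the word
    D: negative documents not containing the word
    Assumes the word occurs in some but not all documents,
    so the denominator is non-zero.
    """
    N = A + B + C + D
    diff = A * D - C * B
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    return {
        # Two-sided: the squared difference hides the direction.
        "chi_square": N * diff ** 2 / denom,
        # One-sided: the sign shows which category the word indicates.
        "correlation_coefficient": math.sqrt(N) * diff / math.sqrt(denom),
        "gss": diff / N ** 2,
    }


# Hypothetical word found in 8 of 10 positive and 20 of 100 negative documents.
print(contingency_scores(A=8, B=20, C=2, D=80))
# Hypothetical word found in 1 of 10 positive and 90 of 100 negative documents.
print(contingency_scores(A=1, B=90, C=9, D=10))
```

In this form the chi-square score is the square of the correlation coefficient, which is why chi-square ranks strongly negative words as highly as strongly positive ones.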

  8. Probabilities of Words in Categories • Meaning of the probabilities. • Each refers to the fraction of documents from a category that contain the word.

  9. Scores of Words

  10. Rankings of Words • Observations. • w7 and w8 are not considered important by any method. • They will be ignored if we select three features.

  11. A Simple Solution • Using an explicit combinational approach. • Probabilities in the respective categories are used for rankings. • The new rankings. • Considering the positive category twice as important as the negative category, we may select w1, w8 and w4. • Note that w8, which indicates both categories, is selected.
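
One way to read this simple solution is sketched below: rank words by their probability in each category, take twice as many from the positive ranking as from the negative one, and skip words already chosen. The function name, the duplicate-skipping rule, and the probabilities are assumptions made for illustration; the original ranking table is not preserved in the transcript.

```python
def combine_by_probability(p_pos, p_neg, k_pos, k_neg):
    """Pick k_pos words by P(w|cat+), then k_neg further words by P(w|cat-)."""
    by_pos = sorted(p_pos, key=p_pos.get, reverse=True)
    by_neg = sorted(p_neg, key=p_neg.get, reverse=True)
    selected = by_pos[:k_pos]
    for w in by_neg:
        if len(selected) == k_pos + k_neg:
            break
        if w not in selected:  # a word already kept is not counted twice
            selected.append(w)
    return selected


# Hypothetical probabilities; word "b" is frequent in both categories.
p_pos = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.7}
p_neg = {"a": 0.1, "b": 0.9, "c": 0.8, "d": 0.2}

# Positive category weighted twice as heavily: 2 positive picks, 1 negative pick.
print(combine_by_probability(p_pos, p_neg, k_pos=2, k_neg=1))
```

Because the rankings use raw probabilities rather than a signed score, a word such as `b` that is common in both categories can be selected.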

  12. Conclusion from the Simple Solution • A feature may be considered good for: • the positive category, • the negative category, • both of them, or • neither of them. • We are trying to find a systematic method that finds the best decision choice. • Game theory may be useful for formulating such a method.

  13. Game Theory • Game theory is a core subject in decision sciences. • Prisoner's Dilemma. • A classical example in game theory.

  14. Feature Selection with Game Theory • Formulating a problem with game theory requires us to: • Identify the player set. • Identify the strategy set. • Determine the payoff functions. • Implement a competition.

  15. The Player Set • Two players were selected. • The players represent the positive and negative categories. • The player C+ represents the positive category. • The player C- represents the negative category. • Each player determines the features' utility for its respective category.

  16. The Strategy Set • Two actions were formulated for each player. • Action a1 for keeping a feature. • Action a2 for discarding a feature. • To differentiate the actions of the two players: • […] denote the actions of C+. • […] denote the actions of C-.

  17. The Payoff Functions • Notation for a payoff function. • The payoff of player i performing action j, given action k of the opponent, is denoted as […]. • The payoff sets.

  18. Defining the Payoff Functions • Let cat+ and cat- represent the positive and negative categories. • A and B represent the number of documents from cat+ and cat-, respectively, that contain word w. • C and D represent the number of documents from cat+ and cat-, respectively, that do not contain w. • The conditional probabilities of w in cat+ and cat- are P(w|cat+) = A / (A + C) and P(w|cat-) = B / (B + D).
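
Under the definitions on this slide, each conditional probability is just the fraction of a category's documents that contain the word: A / (A + C) for the positive category and B / (B + D) for the negative one. A minimal sketch, with a hypothetical function name and hypothetical counts that match the 10 versus 100 class sizes of the earlier example:

```python
def conditional_probabilities(A, B, C, D):
    """Return (P(w|cat+), P(w|cat-)) from document counts.

    A, C: positive documents with / without the word
    B, D: negative documents with / without the word
    Assumes each category holds at least one document.
    """
    p_pos = A / (A + C)
    p_neg = B / (B + D)
    return p_pos, p_neg


# Hypothetical word found in 8 of 10 positive and 20 of 100 negative documents.
p_pos, p_neg = conditional_probabilities(A=8, B=20, C=2, D=80)
print(p_pos, p_neg)
```

Because each probability is normalised by its own category size, the 10 to 1 imbalance between categories does not by itself favour either player.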

  19. Payoff Functions for Players • Both players decide to keep a feature. • The payoffs of the players are calculated as an average: […]. • Both players decide to discard a feature. • The payoffs are calculated as […]. • C+ decides to keep while C- discards. • The payoffs are […] and […] respectively. • C+ decides to discard while C- keeps. • The payoffs are […] and […] respectively.

  20. Action Scenarios for Players

  21. Implementing Competition • Representing the game in a payoff table. • Determining Nash equilibrium for finding the actions of players.
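
Finding a pure-strategy Nash equilibrium in a payoff table amounts to checking each cell for whether either player could gain by changing only their own action. The sketch below is a generic illustration, not the authors' implementation. Since the feature payoff tables are not preserved in the transcript, it is demonstrated on the Prisoner's Dilemma with commonly used textbook payoffs (negative years in prison), which are assumed values.

```python
from itertools import product


def pure_nash_equilibria(payoff_row, payoff_col):
    """Return all (row_action, col_action) cells where neither player
    can improve their own payoff by deviating unilaterally."""
    n_rows, n_cols = len(payoff_row), len(payoff_row[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        row_best = all(payoff_row[r][c] >= payoff_row[r2][c]
                       for r2 in range(n_rows))
        col_best = all(payoff_col[r][c] >= payoff_col[r][c2]
                       for c2 in range(n_cols))
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


# Prisoner's Dilemma; action 0 = stay silent, action 1 = confess.
# Entry [r][c] is the payoff when the row player plays r and the column player plays c.
row_payoffs = [[-1, -10], [0, -5]]
col_payoffs = [[-1, 0], [-10, -5]]

print(pure_nash_equilibria(row_payoffs, col_payoffs))
```

In the feature selection game the rows would correspond to the actions of C+ and the columns to the actions of C-, with one table per word.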

  22. Selected Feature Set • Defining two feature sets. • FS+ as the set of features representing the positive category. • FS- as the set of features representing the negative category. • The game will determine the inclusion or exclusion of features in these sets. • The final selected feature set is the union of FS+ and FS-.
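
The bookkeeping on this slide can be sketched as follows. The mapping rule is an assumption: a word enters FS+ when C+ plays the keep action a1 in equilibrium, and enters FS- when C- plays a1. The function name and the equilibrium actions in the example are hypothetical.

```python
KEEP, DISCARD = "a1", "a2"


def build_feature_sets(equilibrium_actions):
    """Map each word's equilibrium actions (C+ action, C- action)
    to the sets FS+, FS- and their union FS."""
    fs_pos = {w for w, (act_pos, _) in equilibrium_actions.items()
              if act_pos == KEEP}
    fs_neg = {w for w, (_, act_neg) in equilibrium_actions.items()
              if act_neg == KEEP}
    return fs_pos, fs_neg, fs_pos | fs_neg


# Hypothetical equilibrium outcomes for four words.
actions = {
    "x": (KEEP, DISCARD),     # kept only by C+
    "y": (DISCARD, KEEP),     # kept only by C-
    "z": (KEEP, KEEP),        # kept by both players
    "u": (DISCARD, DISCARD),  # kept by neither player
}

fs_pos, fs_neg, fs = build_feature_sets(actions)
print(sorted(fs_pos), sorted(fs_neg), sorted(fs))
```

Taking the union means a word kept by both players appears once in the final set.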

  23. A Demonstrative Example • Considering the earlier example.

  24. Payoff Tables for Words • The bold cells represent the Nash equilibrium. • Considering w1. • The actions of the players in equilibrium are […] for C+ and […] for C-. • These actions lead to w1 being included in FS+.

  25. Payoff Tables for Words

  26. Selected Features • Result of implementing the game for the features. • FS+ = {w1, w7, w8} and FS- = {w4, w7, w8}. • FS = {w1, w4, w7, w8}. • Observation. • The words w7 and w8 are selected. • The suggested approach selects features that indicate both categories.

  27. Conclusion • Limitations of existing approaches. • Preference is given to features indicating the positive or the negative category. • They may not be suitable for selecting features indicating both categories. • Game theory based method. • Implements a game between categories. • Importance of the method. • Useful in selecting features indicating the positive category, the negative category, or both of them.
