
Rare Category Detection



Presentation Transcript


  1. Rare Category Detection Jingrui He Machine Learning Department Carnegie Mellon University Joint work with Jaime Carbonell

  2. What’s Rare Category Detection? • Start de-novo • Very skewed classes • Majority classes • Minority classes • Labeling oracle • Goal: discover the minority classes with a few label requests Machine Learning Lunch

  3. Comparison with Outlier Detection • Rare classes • A group of points • Clustered • Non-separable from the majority classes • Outliers • A single point • Scattered • Separable

  4. Comparison with Active Learning • Rare category detection • Initial condition: NO labeled examples • Goal: discover the minority classes with the fewest label requests • Active learning • Initial condition: labeled examples from each class • Goal: improve the performance of the current classifier with the fewest label requests

  5. Applications • Fraud detection • Network intrusion detection • Astronomy • Spam image detection

  6. The Big Picture • Raw data (spatial, relational, temporal) → Feature Extraction → Unbalanced Unlabeled Data Set → Rare Category Detection → Learning in Unbalanced Settings → Classifier

  7. Outline • Problem definition • Related work • Rare category detection for spatial data • Prior-dependent rare category detection • Prior-free rare category detection • Conclusion

  8. Related Work • Pelleg & Moore 2004 • Mixture model • Different selection criteria • Fine & Mansour 2006 • Generic consistency algorithm • Upper bounds and lower bounds • Papadimitriou et al 2003 • LOCI algorithm for groups of outliers • All assume separable or near-separable classes

  9. Outline • Problem definition • Related work • Rare category detection for spatial data • Prior-dependent rare category detection • Prior-free rare category detection • Conclusion

  10. Notations • Unlabeled examples drawn from m classes • m-1 rare (minority) classes, each with a small prior • One majority class • Goal: find at least ONE example from each rare class by requesting a few labels

  11. Assumptions • The distribution of the majority class is sufficiently smooth • Examples from the minority classes form compact clusters in the feature space

  12. Overview of the Algorithms • Nearest-neighbor-based methods • Methodology: local density differential sampling • Intuition: select examples according to the change in local density

  13. Two Classes: NNDB 1. Calculate the class-specific radius 2. Compute the local densities and scores 3. Select the highest-scoring example not yet queried 4. Query its label 5. If it belongs to the rare class, go to step 6; otherwise increase t by 1 and return to step 2 6. Output the discovered rare-class example

  14. NNDB: Calculate Class-Specific Radius • Number of examples from the minority class: estimated from its prior • For each example, calculate the distance between it and its nearest neighbors • The class-specific radius: the minimum of these distances

  15. NNDB: Calculate Nearest Neighbors

  16. NNDB: Calculate the Scores • Query the top-scoring example

  17. NNDB: Pick the Next Candidate • Increase t by 1 and query again

  18. Why NNDB Works • Theoretically • Theorem 1 [He & Carbonell 2007]: under certain conditions, with high probability, after a few iteration steps, NNDB queries at least one example whose probability of coming from the minority class is at least 1/3 • Intuitively • The score measures the change in local density
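The local-density-differential idea behind NNDB can be sketched as follows. This is an illustrative sketch, not the exact algorithm from the paper: the function name, the `n_queries` parameter, and the precise radius and score definitions are assumptions, chosen to show how a sharp change in local density flags candidate rare-class points.

```python
import numpy as np

def density_differential_candidates(X, prior, n_queries=5):
    """Sketch of local-density-differential sampling (the idea behind
    NNDB); radius and score definitions here are illustrative choices."""
    n = len(X)
    K = max(1, int(round(prior * n)))            # expected rare-class size
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    kth = np.sort(D, axis=1)[:, K]               # distance to K-th neighbor
    r = kth.min()                                # class-specific radius
    counts = (D <= r).sum(axis=1)                # local density around each point
    scores = np.empty(n)
    for i in range(n):
        nbrs = np.where(D[i] <= r)[0]            # neighbors within the radius
        scores[i] = counts[i] - counts[nbrs].min()  # sharpest density drop nearby
    return np.argsort(-scores)[:n_queries]       # most promising label requests
```

On data with a tight minority cluster sitting inside a sparse majority, the top-ranked candidates land in or at the boundary of the cluster, where local density changes fastest.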

  19. Multiple Classes: ALICE • m-1 rare classes and one majority class • For each rare class c: if we have already found examples from class c, skip it; otherwise run NNDB with the prior of class c

  20. Why ALICE Works • Theoretically • Theorem 2 [He & Carbonell 2008]: under certain conditions, with high probability, in each outer loop of ALICE, after a few iteration steps in NNDB, ALICE queries at least one example whose probability of coming from one minority class is at least 1/3

  21. Implementation Issues • ALICE • Problem: repeatedly sampling from the same rare class • MALICE • Solution: relevance feedback based on the class-specific radius
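The outer loop and the relevance-feedback fix can be sketched as below. Everything here is an illustrative stand-in: `oracle` (returns the label of an index), `score_fn` (ranks candidates for a given prior, best first), and the exact feedback rule (masking a class-specific neighborhood around each discovery) are assumptions, not the exact MALICE procedure.

```python
import numpy as np

def alice_outer_loop(X, priors, oracle, score_fn, budget=30):
    """Sketch of the ALICE outer loop with MALICE-style relevance feedback:
    once a class is discovered, its neighborhood is masked so later queries
    do not keep sampling from the same rare class."""
    n = len(X)
    discovered, queried = {}, set()
    excluded = np.zeros(n, dtype=bool)          # relevance-feedback mask
    for p_c in sorted(priors):                  # handle rarest classes first
        k = max(1, int(round(p_c * n)))         # expected class size
        for i in score_fn(X, p_c):
            if i in queried or excluded[i]:
                continue
            if len(queried) >= budget:
                return discovered
            queried.add(i)
            label = oracle(i)                   # one label request
            if label not in discovered:
                discovered[label] = i
                # feedback: mask the class-specific neighborhood of the hit
                d = np.linalg.norm(X - X[i], axis=1)
                excluded |= d <= np.partition(d, k)[k]
                break
    return discovered
```

With any reasonable scoring function plugged in, the mask keeps the label budget from being spent on repeated hits in an already-discovered class.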

  22. Results on Synthetic Data Sets

  23. Summary of Real Data Sets • Abalone • 4177 examples • 7-dimensional features • 20 classes • Largest class: 16.50% • Smallest class: 0.34% • Shuttle • 4515 examples • 9-dimensional features • 7 classes • Largest class: 75.53% • Smallest class: 0.13%

  24. Results on Real Data Sets • Abalone and Shuttle: MALICE vs. interleave vs. random sampling

  25. Imprecise Priors • Results on Abalone and Shuttle

  26. Outline • Problem definition • Related work • Rare category detection for spatial data • Prior-dependent rare category detection • Prior-free rare category detection • Conclusion

  27. Overview of the Algorithm • Density-based method • Methodology: specially designed exponential families • Intuition: select examples according to the change in local density • Difference from NNDB (ALICE): NO prior information needed

  28. Specially Designed Exponential Families [Efron & Tibshirani 1996] • Favorable compromise between parametric and nonparametric density estimation • Estimated density: a carrier density multiplied by an exponential term in a parameter vector, a normalizing parameter, and a vector of sufficient statistics
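A minimal one-dimensional sketch of this family, assuming a Gaussian-kernel carrier, the sufficient statistic t(x) = x, and numerical normalization on a grid (all illustrative choices, not the estimator fitted by SEDER):

```python
import numpy as np

def sde_density(x_grid, data, theta, bandwidth=0.3):
    """Sketch of a specially designed exponential family in 1-D: a
    kernel-density carrier tilted by exp(theta * t(x)), with the
    normalizing constant (exp(-psi)) computed numerically."""
    dx = x_grid[1] - x_grid[0]
    diffs = (x_grid[:, None] - data[None, :]) / bandwidth
    carrier = np.exp(-0.5 * diffs ** 2).sum(axis=1)   # Gaussian-kernel carrier
    carrier /= carrier.sum() * dx                     # normalize on the grid
    tilted = carrier * np.exp(theta * x_grid)         # exponential tilt
    return tilted / (tilted.sum() * dx)               # exp(-psi) absorbed here
```

Setting theta = 0 recovers the nonparametric carrier; a nonzero theta bends it parametrically, which is the compromise the slide refers to.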

  29. SEDER Algorithm • Carrier density: kernel density estimator • To decouple the estimation of different parameters • Decompose the estimation into one problem per parameter • Relax the normalization constraint so that the parameters can be estimated independently

  30. Parameter Estimation • Theorem 3 [To appear]: the maximum likelihood estimates of the parameters satisfy a set of closed-form conditions

  31. Parameter Estimation cont. • The resulting estimates have a closed form • The estimated parameter is positive in most cases

  32. Scoring Function • The estimated density • Scoring function: the norm of the gradient of the estimated density
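The gradient-norm scoring can be sketched as follows. As a simplifying assumption, a plain Gaussian KDE stands in for SEDER's fitted exponential-family density; the gradient is the analytic gradient of that KDE, evaluated at each data point.

```python
import numpy as np

def gradient_norm_scores(X, bandwidth=0.5):
    """Sketch of SEDER-style scoring: score each example by the norm of
    the gradient of an estimated density at that point.  Boundaries of
    compact rare clusters change density fastest and score highest."""
    n, d = X.shape
    diffs = (X[:, None, :] - X[None, :, :]) / bandwidth   # (n, n, d) pairwise
    w = np.exp(-0.5 * (diffs ** 2).sum(axis=-1))          # kernel weights
    # analytic gradient of the Gaussian KDE evaluated at each data point
    grad = -(w[:, :, None] * diffs).sum(axis=1) / (n * bandwidth ** (d + 1))
    return np.linalg.norm(grad, axis=1)                   # higher = query first
```

No class prior enters this computation, which is the key contrast with NNDB and ALICE.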

  33. Results on Synthetic Data Sets

  34. Summary of Real Data Sets • Moderately skewed vs. extremely skewed

  35. Moderately Skewed Data Sets • Ecoli and Glass, compared with MALICE

  36. Extremely Skewed Data Sets • Page Blocks, Abalone, and Shuttle, compared with MALICE

  37. Conclusion • Rare category detection • Open challenge • Lack of effective methods • Nearest-neighbor-based methods • Prior-dependent • Local density differential sampling • Density-based method • Prior-free • Specially designed exponential families

  38. Thank You!
