
# Part II: Practical Implementations.



### Presentation Transcript

1. Part II: Practical Implementations.

2. Modeling the Classes: Stochastic Discrimination

3. Algorithm for Training an SD Classifier
   • Generate a projectable weak model
   • Evaluate the model w.r.t. the training set, check enrichment
   • Check uniformity w.r.t. the existing collection
   • Add to the discriminant
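The four steps above can be sketched as a simple accept/reject loop. This is only a control-flow illustration, not the reference implementation: the names `train_sd`, `generate_interval`, `is_enriched`, and `keeps_uniformity` are hypothetical, the 1-D toy data is invented for the example, and the uniformity check is a placeholder that accepts every model.

```python
import random

def train_sd(generate_model, is_enriched, keeps_uniformity, n_models):
    """Collect weak models that pass the enrichment and uniformity checks."""
    collection = []
    while len(collection) < n_models:
        model = generate_model()                      # 1. generate a projectable weak model
        if not is_enriched(model):                    # 2. evaluate on training set, check enrichment
            continue
        if not keeps_uniformity(model, collection):   # 3. check uniformity w.r.t. the collection
            continue
        collection.append(model)                      # 4. add to the discriminant
    return collection

def coverage(x, collection):
    """Y(x): fraction of collected models that contain x."""
    return sum(model(x) for model in collection) / len(collection)

# Toy 1-D training data (hypothetical): class 1 lies left of 0.5, class 2 right of it.
class1 = [0.05, 0.15, 0.25, 0.35, 0.45]
class2 = [0.55, 0.65, 0.75, 0.85, 0.95]
rng = random.Random(0)

def generate_interval():
    """Assumed model form: a random interval of length 0.5 (covers about half the space)."""
    a = rng.uniform(0.0, 0.5)
    return lambda x, a=a: a <= x < a + 0.5

def is_enriched(model):
    """Enriched if the model covers more class-1 than class-2 training points."""
    return sum(map(model, class1)) > sum(map(model, class2))

def keeps_uniformity(model, collection):
    """Placeholder only: a real check would compare against the existing collection."""
    return True

discriminant = train_sd(generate_interval, is_enriched, keeps_uniformity, n_models=200)
print("Y(0.3) =", coverage(0.3, discriminant))
print("Y(0.7) =", coverage(0.7, discriminant))
```

Passing the three checks in as functions keeps the loop independent of any particular model family, which is convenient because the slides introduce enrichment, projectability, and uniformity one at a time.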

4. Dealing with Data Geometry: SD in Practice

5. 2D Example • Adapted from [Kleinberg, PAMI, May 2000]

6. An “r=1/2” random subset in the feature space that covers ½ of all the points

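7. Coverage of one point as models are added one at a time (Y = fraction of the models so far that contain the point)
   • Model 1: out. It's in 0/1 models, Y = 0/1 = 0.0
   • Model 2: in. It's in 1/2 models, Y = 1/2 = 0.5
   • Model 3: in. It's in 2/3 models, Y = 2/3 = 0.67
   • Model 4: in. It's in 3/4 models, Y = 3/4 = 0.75
   • Model 5: in. It's in 4/5 models, Y = 4/5 = 0.8
   • Model 6: in. It's in 5/6 models, Y = 5/6 = 0.83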

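8. Coverage of the same point as further models are added
   • Model 7: out. It's in 5/7 models, Y = 5/7 = 0.71
   • Model 8: in. It's in 6/8 models, Y = 6/8 = 0.75
   • Model 9: in. It's in 7/9 models, Y = 7/9 = 0.78
   • Model 10: in. It's in 8/10 models, Y = 8/10 = 0.8
   • Model 11: out. It's in 8/11 models, Y = 8/11 = 0.73
   • Model 12: out. It's in 8/12 models, Y = 8/12 = 0.67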
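The running fraction Y on the two slides above is a cumulative average of in/out outcomes. A minimal sketch that reproduces the arithmetic, with the in/out sequence read off the slides (the function name `running_coverage` is made up for this example):

```python
def running_coverage(memberships):
    """Return (hits, n) after each model: hits = models containing the point so far."""
    hits = 0
    history = []
    for n, inside in enumerate(memberships, start=1):
        hits += int(inside)
        history.append((hits, n))
    return history

# In/out outcome of the tracked point for 12 successive models.
memberships = [False, True, True, True, True, True,
               False, True, True, True, False, False]

history = running_coverage(memberships)
for hits, n in history:
    print(f"It's in {hits}/{n} models, Y = {hits / n:.2f}")
```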

9. Fraction of “r=1/2” random subsets covering point (2,17) as more such subsets are generated

10. Fractions of “r=1/2” random subsets covering several selected points as more such subsets are generated

11. Distribution of model coverage for all points in space, with 100 models

12. Distribution of model coverage for all points in space, with 200 models

13. Distribution of model coverage for all points in space, with 300 models

14. Distribution of model coverage for all points in space, with 400 models

15. Distribution of model coverage for all points in space, with 500 models

16. Distribution of model coverage for all points in space, with 1000 models

17. Distribution of model coverage for all points in space, with 2000 models

18. Distribution of model coverage for all points in space, with 5000 models
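The narrowing of the coverage distribution in the preceding slides can be reproduced with a small simulation. This is a sketch under assumptions: the space is taken to be 400 discrete points (an arbitrary choice), each model is a uniformly random subset containing exactly half of them, and `coverage_spread` is a hypothetical helper. Each point's count is then binomial with probability 1/2, so the spread of Y should shrink roughly like 0.5/sqrt(n).

```python
import random
import statistics

def coverage_spread(n_points, n_models, rng):
    """Std. dev. across points of Y = fraction of r=1/2 subsets containing each point."""
    counts = [0] * n_points
    indices = range(n_points)
    for _ in range(n_models):
        # One "r=1/2" model: a random subset covering exactly half of the points.
        for i in rng.sample(indices, n_points // 2):
            counts[i] += 1
    return statistics.pstdev([c / n_models for c in counts])

rng = random.Random(0)
spread = {n: coverage_spread(400, n, rng) for n in (100, 500, 2000)}
for n, s in spread.items():
    print(f"{n:5d} models: std. dev. of Y across points = {s:.4f}")
```

Because every subset covers exactly half of the points, the mean of Y stays at 0.5 throughout; only the spread around it changes as models are added.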

19. Introducing enrichment: For any discrimination to happen, the models must have some difference in coverage for different classes.

20. Enforcing enrichment (adding in a bias): require each subset to cover more points of one class than another. (Panels: class distribution; a biased (enriched) weak model)
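One way to enforce enrichment is rejection sampling: keep drawing random subsets and retain only those that cover a larger fraction of class 1 than of class 2. The sketch below assumes a 20 x 20 grid split into two classes by a vertical line; the grid, the `margin` parameter, and both function names are illustrative choices, not taken from the slides.

```python
import random

def enrichment(model, class1, class2):
    """Fraction of class-1 points covered minus fraction of class-2 points covered."""
    cover1 = sum(p in model for p in class1) / len(class1)
    cover2 = sum(p in model for p in class2) / len(class2)
    return cover1 - cover2

def random_enriched_subset(points, class1, class2, rng, margin=0.0):
    """Draw r=1/2 random subsets until one favours class 1 by more than `margin`."""
    while True:
        model = set(rng.sample(points, len(points) // 2))
        if enrichment(model, class1, class2) > margin:
            return model

# Hypothetical 2-D layout: class 1 occupies the left half of the grid.
points = [(x, y) for x in range(20) for y in range(20)]
class1 = [p for p in points if p[0] < 10]
class2 = [p for p in points if p[0] >= 10]

rng = random.Random(0)
model = random_enriched_subset(points, class1, class2, rng)
print("model size:", len(model))
print("enrichment:", enrichment(model, class1, class2))
```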

21. Distribution of model coverage for points in each class, with 100 enriched weak models

22. Distribution of model coverage for points in each class, with 200 enriched weak models

23. Distribution of model coverage for points in each class, with 300 enriched weak models

24. Distribution of model coverage for points in each class, with 400 enriched weak models

25. Distribution of model coverage for points in each class, with 500 enriched weak models

26. Distribution of model coverage for points in each class, with 1000 enriched weak models

27. Distribution of model coverage for points in each class, with 2000 enriched weak models

28. Distribution of model coverage for points in each class, with 5000 enriched weak models

29. Error rate decreases as the number of models increases. Decision rule: if Y < 0.5 then class 2, else class 1.
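The decision rule, and the way its error shrinks with more models, can be illustrated with an idealized simulation. The assumptions here are strong and labeled as such: every model is taken to cover a class-1 point with probability 0.55 and a class-2 point with probability 0.45, independently. These probabilities and the helper `simulated_error` are invented for the sketch.

```python
import random

def classify(y, threshold=0.5):
    """Decision rule from the slide: Y < 0.5 -> class 2, otherwise class 1."""
    return 2 if y < threshold else 1

def simulated_error(n_models, rng, points_per_class=200, p1=0.55, p2=0.45):
    """Error rate when each model covers class-1 points w.p. p1 and class-2 points w.p. p2."""
    errors = 0
    for label, p in ((1, p1), (2, p2)):
        for _ in range(points_per_class):
            hits = sum(rng.random() < p for _ in range(n_models))
            if classify(hits / n_models) != label:
                errors += 1
    return errors / (2 * points_per_class)

rng = random.Random(0)
error = {n: simulated_error(n, rng) for n in (100, 500, 2000)}
for n, e in error.items():
    print(f"{n:5d} models: error rate = {e:.3f}")
```

Even a small per-model bias becomes a reliable decision once Y is averaged over enough models, because the spread of Y around 0.55 or 0.45 shrinks while the gap between the two stays fixed.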

30. Sparse training data: incomplete knowledge about class distributions. (Panels: Training Set, Test Set)

31. Distribution of model coverage for points in each class, with 100 enriched weak models (panels: Training Set, Test Set)

32. Distribution of model coverage for points in each class, with 200 enriched weak models (panels: Training Set, Test Set)

33. Distribution of model coverage for points in each class, with 300 enriched weak models (panels: Training Set, Test Set)

34. Distribution of model coverage for points in each class, with 400 enriched weak models (panels: Training Set, Test Set)

35. Distribution of model coverage for points in each class, with 500 enriched weak models (panels: Training Set, Test Set)

36. Distribution of model coverage for points in each class, with 1000 enriched weak models (panels: Training Set, Test Set)

37. Distribution of model coverage for points in each class, with 2000 enriched weak models (panels: Training Set, Test Set)

38. No discrimination! Distribution of model coverage for points in each class, with 5000 enriched weak models (panels: Training Set, Test Set)

39. Models of this type, when enriched for the training set, are not necessarily enriched for the test set. (Panels: Training Set, Test Set; model shown: random model with 50% coverage of space)
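The failure described above can be checked numerically. In this sketch, which is an assumption-laden toy rather than the experiment from the slides, each model decides membership by an independent coin flip per point, so being selected for enrichment on the training points says nothing about the disjoint test points. The sizes (100 points per class per set, 400 retained models) are arbitrary.

```python
import random

rng = random.Random(0)
N = 100  # points per class in each of the training and test sets (arbitrary)

def pointwise_model(n_points, rng):
    """Independent coin flip per point: membership says nothing about neighbours."""
    return [rng.random() < 0.5 for _ in range(n_points)]

def enrichment(member, class1_idx, class2_idx):
    """Fraction of class-1 points covered minus fraction of class-2 points covered."""
    c1 = sum(member[i] for i in class1_idx) / len(class1_idx)
    c2 = sum(member[i] for i in class2_idx) / len(class2_idx)
    return c1 - c2

# Point indices: train class 1, train class 2, test class 1, test class 2.
train1, train2 = range(0, N), range(N, 2 * N)
test1, test2 = range(2 * N, 3 * N), range(3 * N, 4 * N)

train_gain, test_gain = [], []
while len(train_gain) < 400:
    m = pointwise_model(4 * N, rng)
    g = enrichment(m, train1, train2)
    if g > 0:  # keep only models enriched on the training set
        train_gain.append(g)
        test_gain.append(enrichment(m, test1, test2))

mean_train = sum(train_gain) / len(train_gain)
mean_test = sum(test_gain) / len(test_gain)
print(f"mean enrichment on training set: {mean_train:+.3f}")
print(f"mean enrichment on test set:     {mean_test:+.3f}")
```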

40. Introducing projectability: Maintain local continuity of class interpretations. Neighboring points of the same class should share similar model coverage.

41. Allow some local continuity in model membership, so that the interpretation of a training point can generalize to its immediate neighborhood. (Panels: class distribution; a projectable model)
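A simple way to get local continuity is to define each model by geometry instead of by a per-point lottery. The sketch below uses randomly placed axis-aligned squares in the unit square. That model family, the side length 0.7 (area close to one half), and the two sample points are assumptions for illustration; the slides do not specify the model form.

```python
import random

def random_rectangle(rng, side=0.7):
    """A weak model defined by geometry: a box of the given side placed at random in [0, 1]^2."""
    x0 = rng.uniform(0.0, 1.0 - side)
    y0 = rng.uniform(0.0, 1.0 - side)
    return (x0, y0, x0 + side, y0 + side)

def contains(box, point):
    """True if the point lies inside the box (boundaries included)."""
    x0, y0, x1, y1 = box
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1

rng = random.Random(0)
models = [random_rectangle(rng) for _ in range(1000)]

train_point = (0.20, 0.80)
neighbour = (0.21, 0.81)  # an unseen point close to the training point

covered = sum(contains(b, train_point) for b in models) / len(models)
agree = sum(contains(b, train_point) == contains(b, neighbour) for b in models) / len(models)
print(f"fraction of models covering the training point: {covered:.3f}")
print(f"fraction of models treating both points alike:  {agree:.3f}")
```

Since nearly every box treats the two points the same way, whatever the collection says about the training point carries over to its neighbour, which is the property the slide calls projectability.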

42. Distribution of model coverage for points in each class, with 100 enriched, projectable weak models (panels: Training Set, Test Set)

43. Distribution of model coverage for points in each class, with 300 enriched, projectable weak models (panels: Training Set, Test Set)

44. Distribution of model coverage for points in each class, with 400 enriched, projectable weak models (panels: Training Set, Test Set)

45. Distribution of model coverage for points in each class, with 500 enriched, projectable weak models (panels: Training Set, Test Set)

46. Distribution of model coverage for points in each class, with 1000 enriched, projectable weak models (panels: Training Set, Test Set)

47. Distribution of model coverage for points in each class, with 2000 enriched, projectable weak models (panels: Training Set, Test Set)

48. Distribution of model coverage for points in each class, with 5000 enriched, projectable weak models (panels: Training Set, Test Set)

49. Promoting uniformity: All points in the same class should be equally likely to be covered by a model of each particular rating. Retain models that cover the points that are less covered by the current collection.
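One possible reading of this retention rule, offered as a sketch and not as the original criterion: accept a candidate model only if the points it covers have, on average, no more coverage so far than the class as a whole. The function name `favours_undercovered` and the coverage counts below are hypothetical.

```python
def favours_undercovered(model_members, counts):
    """True if the points the candidate covers are, on average, no more covered
    by the current collection than the class as a whole."""
    covered = [counts[i] for i in model_members]
    if not covered:
        return False
    return sum(covered) / len(covered) <= sum(counts) / len(counts)

def add_model(model_members, counts):
    """Update per-point coverage counts after a model is retained."""
    for i in model_members:
        counts[i] += 1

# Hypothetical coverage counts for four training points of one class.
counts = [5, 5, 2, 1]

candidate_a = {2, 3}   # covers the two least-covered points
candidate_b = {0, 1}   # covers the two most-covered points

for name, candidate in (("A", candidate_a), ("B", candidate_b)):
    if favours_undercovered(candidate, counts):
        add_model(candidate, counts)
        print(f"candidate {name}: retained, counts now {counts}")
    else:
        print(f"candidate {name}: rejected")
```

Retaining only candidates like A pulls the per-point counts toward each other, which is the intent of uniformity: no point of a class should be systematically covered more or less often than the others.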
