
Re-active Learning: Active Learning with Re-labeling

This paper discusses the limitations of standard active learning algorithms and proposes a re-active learning approach that combines uncertainty sampling with relabeling. It introduces the concept of impact sampling and demonstrates its effectiveness at reducing label uncertainty and improving data quality. The paper also presents experiments on two datasets (a synthetic Gaussian dataset and the Arrhythmia dataset) to validate the proposed approach.


Presentation Transcript


  1. Re-active Learning: Active Learning with Re-labeling Christopher H. Lin University of Washington Mausam IIT Delhi Daniel S. Weld University of Washington

  2. *Speaker not paid by Oracle Corporation

  3. CROWDSOURCING

  4. (Labeling) Mistakes Were Made (human annotators err)

  5. Majority Vote: Parrot, Parakeet, Parrot, Parrot → Parrot
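As a minimal illustration of the aggregation on this slide, here is a majority-vote sketch in Python; the function name and the use of collections.Counter are my own, the slide only shows the example votes:

```python
from collections import Counter

def majority_vote(labels):
    """Aggregate noisy crowd labels by taking the most common one."""
    return Counter(labels).most_common(1)[0][0]

# The slide's example: three workers say Parrot, one says Parakeet.
print(majority_vote(["Parrot", "Parakeet", "Parrot", "Parrot"]))  # -> Parrot
```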

  6. Relabel an ambiguous example (current votes: Parakeet vs. Parrot)? VS. Get a first label for a new example (Parakeet)?

  7. MORE, NOISY DATA vs. LESS, BETTER DATA

  8. MORE, NOISY DATA vs. LESS, BETTER DATA [Sheng et al. 2008, Lin et al. 2014]

  9. Re-active Learning Contributions. Standard Active Learning Algorithms Fail: Uncertainty Sampling [Lewis and Catlett 1994], Expected Error Reduction [Roy and McCallum 2001]. Re-active Learning Algorithms: Extensions of Uncertainty Sampling, Impact Sampling.

  10. Standard active learning algorithms fail!

  11. h* True Hypothesis

  12. h* h Current Hypothesis

  13. h* h Uncertainty Sampling [Lewis and Catlett (1994)]
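Uncertainty sampling can be sketched as follows. This assumes a binary task and a scikit-learn-style classifier exposing predict_proba, neither of which the slide specifies:

```python
import numpy as np

def uncertainty_sample(model, X_unlabeled):
    """Classic uncertainty sampling: query the unlabeled example whose
    predicted label the classifier is least sure about (probability
    nearest 0.5 in the binary case)."""
    proba = model.predict_proba(X_unlabeled)   # shape (n, 2)
    margin = np.abs(proba[:, 1] - 0.5)         # distance from the decision boundary
    return int(np.argmin(margin))              # index of the most uncertain example
```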

  14. h* h Suppose these two examples have been labeled many times already!

  15. h* h Uncertainty Sampling labels these two examples infinitely many times!

  16. Fundamental Problem: uncertainty sampling does not use all sources of information. h* h It labels these two examples infinitely many times!

  17. Re-active Learning Contributions. Standard Active Learning Algorithms Fail: Uncertainty Sampling [Lewis and Catlett 1994], Expected Error Reduction [Roy and McCallum 2001]. Re-active Learning Algorithms: Extensions of Uncertainty Sampling, Impact Sampling.

  18. Expected Error Reduction (EER) [Roy and McCallum (2001)] Also suffers from infinite looping!

  19. Re-active Learning Contributions. Standard Active Learning Algorithms Fail: Uncertainty Sampling [Lewis and Catlett 1994], Expected Error Reduction [Roy and McCallum 2001]. Re-active Learning Algorithms: Extensions of Uncertainty Sampling, Impact Sampling.

  20. How to fix? Consider the aggregate label uncertainty!

  21. How to fix? Consider the aggregate label uncertainty! h* h High # annotations = LOW UNCERTAINTY

  22. How to fix? Consider the aggregate label uncertainty! h* h Low # annotations = HIGH UNCERTAINTY; High # annotations = LOW UNCERTAINTY

  23. Alpha-weighted uncertainty sampling: (1 − α) · classifier uncertainty + α · aggregate label uncertainty
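A hedged sketch of the alpha-weighted score from this slide. The entropy-based aggregate-label uncertainty below is one plausible instantiation, not necessarily the paper's exact measure, and the function names are my own:

```python
import numpy as np

def alpha_weighted_score(clf_uncertainty, label_uncertainty, alpha):
    """The slide's combination: (1 - alpha) * classifier uncertainty
    + alpha * aggregate-label uncertainty. Higher = more worth querying."""
    return (1 - alpha) * clf_uncertainty + alpha * label_uncertainty

def label_entropy(n_pos, n_neg):
    """One illustrative aggregate-label uncertainty: the entropy of the
    empirical distribution of the annotations an example has received."""
    n = n_pos + n_neg
    if n == 0:
        return 1.0  # an example with no annotations is maximally uncertain
    p = n_pos / n
    if p in (0.0, 1.0):
        return 0.0  # unanimous annotations: no label uncertainty
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
```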

  24. Fixed-Relabeling Uncertainty Sampling: • Pick a new unlabeled example using classifier uncertainty • Get a fixed number of labels for that example
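A minimal sketch of fixed-relabeling uncertainty sampling; `get_label` and the budget `k` are hypothetical stand-ins for the crowdsourcing call and the fixed relabeling count, which the slide does not specify:

```python
import numpy as np

def fixed_relabeling_step(model, X_unlabeled, get_label, k=5):
    """Fixed-relabeling uncertainty sampling (sketch): choose the example
    the classifier is most uncertain about, then buy a fixed number k of
    crowd labels for it instead of a single one."""
    proba = model.predict_proba(X_unlabeled)        # binary task: shape (n, 2)
    i = int(np.argmin(np.abs(proba[:, 1] - 0.5)))   # most uncertain example
    labels = [get_label(i) for _ in range(k)]       # k noisy annotations
    return i, labels
```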

  25. Re-active Learning Contributions Standard Active Learning Algorithms Fail Uncertainty Sampling [Lewis and Catlett 1994] Expected Error Reduction [Roy and McCallum 2001] Re-active Learning Algorithms Extensions of Uncertainty Sampling Impact Sampling

  26. Impact (Ψ) Sampling

  27. h Current Hypothesis

  28. Labeled Labeled h

  29. Labeled Labeled h What is the impact of labeling this example?

  30. Labeled Labeled h Impact of labeling this example a diamond

  31. Labeled Labeled h Ψ_diamond(x): the impact of labeling this example a diamond

  32. Labeled Labeled h Impact of labeling this example a circle

  33. Labeled Labeled h Ψ_circle(x): the impact of labeling this example a circle

  34. Total Expected Impact: Ψ_diamond(x), the impact if the example is labeled a diamond

  35. Total Expected Impact: Ψ_diamond(x) and Ψ_circle(x), the impact if the example is labeled a circle

  36. Total Expected Impact: Ψ(x) = P(x = diamond) · Ψ_diamond(x) + P(x = circle) · Ψ_circle(x)

  37. Use the classifier's belief as the prior; perform a Bayesian update using the annotations. Ψ(x) = P(x = diamond) · Ψ_diamond(x) + P(x = circle) · Ψ_circle(x)

  38. Assuming annotation accuracy > 0.5: as the number of annotations for x goes to infinity, Ψ(x) goes to 0.
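Slides 36-38 might be made concrete with the following minimal sketch. It assumes binary classes (the slides' diamond vs. circle), annotators who are correct independently with known accuracy `acc`, and it takes the per-outcome impacts Ψ_diamond and Ψ_circle as given inputs; none of these implementation details come from the slides themselves:

```python
def label_posterior(prior_diamond, n_diamond, n_circle, acc=0.75):
    """Slide 37's Bayesian update (sketch): the classifier's belief
    P(x = diamond) is the prior; each annotation is assumed correct
    independently with probability `acc` (0.75 matches the experiments)."""
    like_d = prior_diamond * acc ** n_diamond * (1 - acc) ** n_circle
    like_c = (1 - prior_diamond) * acc ** n_circle * (1 - acc) ** n_diamond
    return like_d / (like_d + like_c)

def expected_impact(p_diamond, psi_diamond, psi_circle):
    """Slide 36's equation:
    Psi(x) = P(x = diamond) * Psi_diamond(x) + P(x = circle) * Psi_circle(x)."""
    return p_diamond * psi_diamond + (1 - p_diamond) * psi_circle

# Illustrative numbers: classifier prior 0.6, then 4 diamond vs. 1 circle votes.
p = label_posterior(0.6, n_diamond=4, n_circle=1)   # posterior ≈ 0.98
print(expected_impact(p, psi_diamond=0.0, psi_circle=0.8))
```

Consistent with slide 38, when annotator accuracy exceeds 0.5 this posterior concentrates on one class as annotations accumulate, so an additional label almost never changes the learned hypothesis and Ψ(x) shrinks toward 0, which is what stops the infinite relabeling loop.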

  39. Theorem: In many noiseless settings, when relabeling is unnecessary, impact sampling = uncertainty sampling.

  40. Theorem: In many noiseless settings, when relabeling is unnecessary, impact sampling = uncertainty sampling. When relabeling is necessary, impact sampling ≠ uncertainty sampling.

  41. Consider an example with the following labels; the aggregated label is computed via majority vote.

  42. Before vs. after adding one more label: NO CHANGE in the aggregate label, so a strict one-step lookahead assigns this example zero impact.

  43. Pseudolookahead Let r be the minimum number of labels to flip the aggregate label.

  44. Pseudolookahead Let r be the minimum number of labels to flip the aggregate label.

  45. Pseudolookahead Let r be the minimum number of labels to flip the aggregate label. r = 3

  46. Pseudolookahead: redefine Ψ(x) = Ψ(x) / r

  47. Pseudolookahead: redefine Ψ(x) = Ψ(x) / r. Careful optimism!
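A sketch of the pseudolookahead under majority vote; the vote-count bookkeeping is my own framing of the slide's definition of r, and the example vote counts are illustrative rather than the slide's (unseen) figure:

```python
def min_labels_to_flip(n_majority, n_minority):
    """r from the slide: the fewest additional labels that could flip the
    majority-vote aggregate (all of them would have to agree with the
    current minority, and a tie does not flip the label)."""
    return (n_majority - n_minority) + 1

def pseudolookahead_impact(psi, n_majority, n_minority):
    """Slide 46's redefinition: Psi(x) / r. Dividing by r ("careful
    optimism") discounts hard-to-flip examples without zeroing them out
    the way a strict one-step lookahead would."""
    return psi / min_labels_to_flip(n_majority, n_minority)

# For instance, a two-vote gap (3 vs. 1) gives r = 3, as on slide 45.
assert min_labels_to_flip(3, 1) == 3
```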

  48. Experimental setup: budget = 1000 labels; label accuracy = 75%; number of features ∈ {10, 30, 50, 70, 90}

  49. [Plot: learning curves on the Gaussian dataset (num features = 90), comparing EER, impact, alpha-uncertainty, fixed-uncertainty, uncertainty, and passive]

  50. [Plot: learning curves on the Arrhythmia dataset (num features = 279), comparing impact, uncertainty, and passive]
