
Active Learning: Class Questions


Presentation Transcript


  1. Active Learning: Class Questions Meeting 15 — Mar 5, 2013 CSCE 6933 Rodney Nielsen

  2. Space of Active Learning

  3. Space of Active Learning

  4. Active Learning Query Types

  5. Your Questions • Is the query strategy framework determined by the algorithm we apply to a problem?

  6. Your Questions • Regarding Membership Query Synthesis, how does it deal with the influence of the data distribution? For example, if we applied it with an SVM, using slack variables in a soft margin, the distribution would have a great influence on the results.

  7. Membership Query Synthesis • Dynamically construct query instances based on expected informativeness • Applications • Character recognition • Robot scientist: find the optimal growth medium for yeast • 3× cost decrease vs. the next-cheapest strategy • 100× cost decrease vs. random selection
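
The idea above can be sketched in miniature: the learner synthesizes its own query instances rather than drawing them from a pool. Below is a hedged toy example (a 1-D binary search for a decision boundary; the oracle and boundary value are illustrative assumptions, not from the slides).

```python
# Toy membership query synthesis: the learner constructs each query
# itself, bisecting between a known-positive and a known-negative
# point, and asks the oracle to label the synthesized midpoint.

def synthesize_queries(pos, neg, oracle, n_queries=10):
    """Binary-search the decision boundary between pos and neg."""
    for _ in range(n_queries):
        query = (pos + neg) / 2.0       # synthesize a new instance
        if oracle(query):               # membership query to the oracle
            pos = query                 # boundary lies in [query, neg]
        else:
            neg = query
    return (pos + neg) / 2.0            # boundary estimate

# Hypothetical oracle with a true boundary at x = 0.37.
boundary = synthesize_queries(0.0, 1.0, lambda x: x < 0.37)
```

Each query halves the uncertain interval, so ten queries pin the boundary to within about 0.001 — far fewer labels than labeling random draws would need.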

  8. Stream-based Selective Sampling • Informativeness measure • Region of uncertainty / version space • Applications • POS tagging • Sensor scheduling • IR ranking • WSD
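
A minimal sketch of the stream-based setting: instances arrive one at a time, and the learner queries the oracle only when an instance falls inside its region of uncertainty. The confidence function and threshold below are toy stand-ins, not from the slides.

```python
import random

def confidence(model, x):
    """Toy confidence: distance from the current boundary estimate."""
    return abs(x - model["boundary"])

def stream_sample(stream, model, threshold=0.1):
    """Query each streamed instance that falls in the uncertain region."""
    queried = []
    for x in stream:
        if confidence(model, x) < threshold:   # region of uncertainty
            queried.append(x)                  # ask the oracle for a label
        # else: discard the instance (or trust the model's prediction)
    return queried

random.seed(0)
stream = [random.random() for _ in range(1000)]
model = {"boundary": 0.5}
queries = stream_sample(stream, model, threshold=0.1)
```

The key contrast with pool-based sampling is that each query decision is made immediately, without seeing the rest of the data.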

  9. Pool-based Active Learning • Informativeness measure • Applications • Cancer diagnosis • Text classification • IE • Image classification & retrieval • Video classification & retrieval • Speech recognition

  10. Pool-based Active Learning Loop
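
A minimal sketch of the loop named above: train on the labeled set, score the whole unlabeled pool by informativeness, query the single best instance, and repeat until the label budget is spent. The 1-D model, scorer, and oracle are illustrative assumptions.

```python
def fit(labeled):
    """Toy model: boundary = midpoint between the two class means."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def informativeness(boundary, x):
    return -abs(x - boundary)      # closer to the boundary = more informative

def pool_based_loop(pool, labeled, oracle, budget):
    for _ in range(budget):
        boundary = fit(labeled)                                    # 1. train
        x = max(pool, key=lambda c: informativeness(boundary, c))  # 2. select
        pool.remove(x)
        labeled.append((x, oracle(x)))                             # 3. query
    return fit(labeled)

oracle = lambda x: 1 if x < 0.42 else 0     # hypothetical true boundary
pool = [i / 100.0 for i in range(101)]
labeled = [(0.0, 1), (1.0, 0)]              # small seed set
boundary = pool_based_loop(pool, labeled, oracle, budget=15)
```

With only 15 queries the estimate homes in near the true boundary, because each query is spent where the current model is least certain.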

  11. Your Questions • Which of the three settings is the most effective for active learning? What parameters would decide that?

  12. Questions • ???

  13. Paper Selection • Based on reading the paper abstracts, find the single paper you would most like to read on Active Learning • A paper of any length that looks really interesting • Email it to me by Friday

  14. Space of Active Learning

  15. Uncertainty Sampling • Uncertainty sampling • Select examples based on confidence in prediction • Least confident • Margin sampling • Entropy-based models
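
The three measures on this slide can be sketched for a probabilistic classifier's posterior P(y|x) over the classes (each function returns a score where larger means more uncertain):

```python
import math

def least_confident(probs):
    """Uncertainty = 1 - P(most likely label)."""
    return 1.0 - max(probs)

def margin(probs):
    """Negated margin between the top two labels: a smaller margin
    means more uncertainty, so we negate it for ranking."""
    top, second = sorted(probs, reverse=True)[:2]
    return -(top - second)

def entropy(probs):
    """Shannon entropy of the posterior, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A near-uniform posterior is more uncertain than a peaked one
# under all three measures.
uncertain = [0.34, 0.33, 0.33]
confident = [0.90, 0.05, 0.05]
```

The measures agree on two-class problems but can rank multi-class instances differently: least-confident looks only at the top label, margin at the top two, and entropy at the whole distribution.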

  16. Heat Map

  17. Query by Committee • Train a committee of hypotheses • Representing different regions of the version space • Obtain some measure of (dis)agreement on the instances in the dataset (e.g., vote entropy) • Assume the most informative instance is the one on which the committee has the most disagreement • Goal: minimize the version space • No agreement on size of committee, but even 2-3 provides good results
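
The vote-entropy disagreement measure mentioned above can be sketched directly: each committee member casts a hard vote, and the entropy of the vote distribution measures disagreement (soft variants use KL divergence to the mean instead).

```python
import math
from collections import Counter

def vote_entropy(votes):
    """H = -sum_y (V(y)/C) * log2(V(y)/C), where V(y) is the number of
    committee members voting for label y and C is the committee size."""
    c = len(votes)
    counts = Counter(votes)
    return -sum((v / c) * math.log2(v / c) for v in counts.values())

# Full agreement yields zero entropy; an even split maximizes it, so
# the evenly-split instance is the one the committee would query.
agree = vote_entropy(["A", "A", "A", "A"])
split = vote_entropy(["A", "A", "B", "B"])
```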

  18. Competing Hypotheses

  19. Expected Model Change • Query the instance that would result in the largest expected change to the hypothesis h, based on the current model and its expectations • E.g., the instance that would produce the largest gradient step in the model parameters (expected gradient length) • Prefer the instance x that leads to the most significant change in the model
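
Since the candidate's true label is unknown, the expected gradient length weights the gradient norm under each possible label by the model's current posterior. A hedged sketch for 1-D logistic regression (all values illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_norm(w, x, y):
    """|gradient| of the log-loss for one example with 1-D features:
    d/dw of -log P(y|x) is (sigmoid(w*x) - y) * x."""
    p = sigmoid(w * x)
    return abs((p - y) * x)

def expected_gradient_length(w, x):
    """Weight the gradient norm for each label by P(y|x) under the
    current model, since the true label is unknown before querying."""
    p1 = sigmoid(w * x)
    return p1 * grad_norm(w, x, 1) + (1 - p1) * grad_norm(w, x, 0)

# Rank candidates: query the one with the largest expected change.
w = 0.5
candidates = [0.1, 1.0, 3.0]
best = max(candidates, key=lambda x: expected_gradient_length(w, x))
```

Note the known caveat: because the gradient scales with the feature magnitude, this criterion can favor large-valued instances (including outliers) unless features are normalized.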

  20. Variance Reduction • Regression problems • E[error²] = noise + bias² + variance • The learner can't change the noise or the bias, so it minimizes the variance • The Fisher Information Ratio is used for classification
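
The decomposition on this slide can be checked numerically. Below is a Monte-Carlo sketch under assumed toy conditions (true f(x) = 2x, Gaussian noise, and a learner deliberately biased by shrinking its fitted weight): the empirical expected squared error at a fixed test point matches noise + bias² + variance.

```python
import random

random.seed(42)
F = lambda x: 2.0 * x            # assumed true regression function
SIGMA = 0.5                      # noise standard deviation
X0 = 1.0                         # fixed test point

def train_and_predict():
    """Draw a fresh training set, fit y = w*x, predict at X0."""
    xs = [random.uniform(0, 2) for _ in range(20)]
    ys = [F(x) + random.gauss(0, SIGMA) for x in xs]
    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 0.5 * w * X0          # shrinking w injects bias on purpose

preds = [train_and_predict() for _ in range(5000)]
mean_pred = sum(preds) / len(preds)
variance = sum((p - mean_pred) ** 2 for p in preds) / len(preds)
bias_sq = (mean_pred - F(X0)) ** 2
noise = SIGMA ** 2
total = noise + bias_sq + variance

# Empirical expected squared error at X0, for comparison with `total`.
emp_error = sum((F(X0) + random.gauss(0, SIGMA) - p) ** 2
                for p in preds) / len(preds)
```

Only the variance term depends on which training examples were seen, which is why variance-reduction strategies target it.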

  21. Estimated Error Reduction • Other methods approximate the goal of minimizing future error by minimizing a proxy (e.g., uncertainty, variance, …) • Estimated Error Reduction attempts to minimize E[error] directly

  22. Density-Weighted Methods • Uncertainty sampling and Query by Committee might be hindered by querying many outliers • Density-weighted methods overcome this potential problem by also considering whether the example is representative of the input distribution • Tend to work better than the base query strategies on their own
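
A sketch of the information-density idea: weight each candidate's base informativeness by its average similarity to the rest of the pool, so a dense, representative instance beats an isolated outlier even if the outlier is slightly more informative. The 1-D kernel and scores are illustrative assumptions.

```python
import math

def similarity(a, b):
    """Toy kernel similarity on 1-D inputs."""
    return math.exp(-abs(a - b))

def information_density(x, pool, base_info, beta=1.0):
    """Base informativeness times (average similarity to the pool)^beta."""
    density = sum(similarity(x, u) for u in pool) / len(pool)
    return base_info(x) * density ** beta

# Pool clustered near 0.5 plus one outlier at 5.0; raw informativeness
# is highest at the outlier, but density weighting overrules it.
pool = [0.4, 0.45, 0.5, 0.55, 0.6, 5.0]
base_info = lambda x: 1.0 if x == 5.0 else 0.8
best = max(pool, key=lambda x: information_density(x, pool, base_info))
```

The beta parameter trades off the two terms: beta = 0 recovers plain uncertainty sampling, while larger values weight representativeness more heavily.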

  23. Your Questions • Density-weighted methods can help avoid the negative influence of noise points, but how do we deal with outliers? Just ignore them?

  24. Your Questions • Can the density-weighted approach be applied to all applications? • If so, why is most of the research done using uncertainty sampling (especially entropy)?

  25. Questions • ???

  26. Your Questions • How do we deal with overfitting when applying selective sampling?

  27. Diversity • Naïve selection by the earlier methods results in queries that are very similar to one another • We must factor this in and look for diversity in the queries
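
One common way to enforce diversity in a batch is greedy farthest-first selection: seed with the most informative instance, then repeatedly pick the candidate whose minimum distance to the instances already chosen is largest. A hedged 1-D sketch (pool and scores are illustrative):

```python
def diverse_batch(pool, scores, k):
    """Greedy max-min selection of a diverse batch of size k."""
    remaining = list(pool)
    batch = [max(remaining, key=lambda x: scores[x])]   # seed: top score
    remaining.remove(batch[0])
    while len(batch) < k and remaining:
        # Pick the candidate farthest from everything already chosen.
        x = max(remaining, key=lambda c: min(abs(c - b) for b in batch))
        batch.append(x)
        remaining.remove(x)
    return batch

# Three near-duplicates around 0.1 have the top scores, but the batch
# spreads out rather than querying all three.
pool = [0.10, 0.11, 0.12, 0.50, 0.90]
scores = {0.10: 0.9, 0.11: 0.89, 0.12: 0.88, 0.50: 0.5, 0.90: 0.4}
batch = diverse_batch(pool, scores, k=3)
```

Without the diversity step, the three clustered points would all be queried, wasting two labels on redundant information.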

  28. Questions • ???

  29. Your Questions • There exists a huge amount of unlabeled data. Is there a chance that all of the data gets labeled? • If not, for how long does the learner query? • Is there a stage where the learner has sufficient labels and stops querying?

  30. Your Questions • How do we stop the iteration? Stop when the accuracy changes by less than a certain threshold?

  31. Early Stopping

  32. Early Stopping • A theoretically sound method is to stop training when the examples in the margin are exhausted. • To check whether there are still unseen training instances in the margin, the distance of the newly selected instance to the hyperplane is compared with that of the support vectors of the current model. • If the newly selected instance (the one closest to the hyperplane) is no closer to the hyperplane than any of the support vectors, we conclude that the margin is exhausted. • A practical implementation of this idea is to count the number of support vectors during the active learning process. • If the number of support vectors stabilizes, it implies that all possible support vectors have been selected by the active learning method.
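
The practical criterion on this slide can be sketched as a simple stability check: record the support-vector count after each active-learning round and stop once it has been unchanged for a few rounds. The per-round counts below are illustrative, not from a real SVM run.

```python
def should_stop(sv_counts, patience=3):
    """True if the last `patience` rounds all reported the same
    number of support vectors (i.e., the count has stabilized)."""
    if len(sv_counts) < patience:
        return False
    return len(set(sv_counts[-patience:])) == 1

# Hypothetical support-vector counts reported after each round.
history = []
for n_sv in [5, 8, 11, 13, 14, 14, 14]:
    history.append(n_sv)
    if should_stop(history):
        break      # margin likely exhausted: stop querying
```

The `patience` window guards against stopping on a single coincidental repeat; in practice it trades a few extra queries for robustness.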

  33. Questions • ???

  34. Your Questions • What is log-loss? What is loss? • Related terms: risk, regret • Common loss functions: • Absolute loss: |err| • Squared loss: err² (pro: differentiable; con: can be skewed by a few large values)
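
The loss functions from this slide, plus the log-loss the question asks about, can be written out directly. Note how differently they treat a confident mistake: absolute loss grows linearly, squared loss quadratically, and log-loss without bound as the predicted probability of the true label approaches zero.

```python
import math

def absolute_loss(y, p):
    return abs(y - p)

def squared_loss(y, p):
    return (y - p) ** 2          # differentiable, but large errors dominate

def log_loss(y, p):
    """Binary log-loss: y in {0, 1}, p = predicted P(y = 1).
    Unbounded as p -> 0 for a true positive (or p -> 1 for a negative)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A confident mistake (true label 1, predicted probability 0.01)
# under each of the three losses.
mistakes = (absolute_loss(1, 0.01), squared_loss(1, 0.01), log_loss(1, 0.01))
```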

  35. Your Questions • What is the relationship between entropy and log-loss?
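
The relationship asked about: the expected log-loss of predictions q when the true labels follow distribution p is the cross-entropy H(p, q), and when the predictions match the true distribution (q = p), cross-entropy reduces to the Shannon entropy H(p). A short numerical check (natural log throughout; the distributions are illustrative):

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Expected log-loss of predicting q when labels follow p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.3]                   # assumed true label distribution
q = [0.5, 0.5]                   # a mismatched prediction

h = entropy(p)
hq = cross_entropy(p, q)         # Gibbs' inequality: H(p, q) >= H(p)
hp = cross_entropy(p, p)         # equals entropy(p) exactly
```

So minimizing expected log-loss drives the predicted distribution toward the true one, and the irreducible floor it converges to is the entropy of the labels.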

  36. Your Questions • What is the relationship between entropy and log-loss?

  37. Your Questions • What is the difference between information-theoretic learning and decision-theoretic learning?

  38. Your Questions • What is the relationship between entropy and log-loss?

  39. Questions • ???
