Self-Training & Co-Training Overview

Presentation Transcript


  1. Self-Training & Co-Training Overview
  Meeting 17 — Mar 19, 2013, CSCE 6933, Rodney Nielsen

  2. Self-Training
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h)
    • L ← L0 + <U*, h(U*)>
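
A minimal Python sketch of this loop, not from the slides: it assumes a scikit-learn-style base learner (LogisticRegression stands in for f) and top-k confidence selection; the function name and parameters are illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X0, Y0, U, n_rounds=10, k=10):
    """Basic self-training: each round, retrain on the seed data plus the
    unlabeled examples the current classifier labels most confidently."""
    L_X, L_y = X0, Y0
    h = None
    for _ in range(n_rounds):                       # stopping criterion: T rounds
        h = LogisticRegression().fit(L_X, L_y)      # h(x) <- f(L)
        conf = h.predict_proba(U).max(axis=1)       # confidence in each unlabeled example
        idx = np.argsort(conf)[-k:]                 # U* <- select(U, h): top-k most confident
        L_X = np.vstack([X0, U[idx]])               # L <- L0 + <U*, h(U*)>
        L_y = np.concatenate([Y0, h.predict(U[idx])])
    return h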

  3. Base Learner
  • The textbook assumes a hard label
  • But the learner must also output some means of generating a classification confidence
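
One way to read this requirement, sketched with an assumed scikit-learn-style learner (GaussianNB is just one convenient probabilistic choice): predict-style hard labels are what the textbook assumes, while the class probabilities supply the confidence that select() needs.

from sklearn.naive_bayes import GaussianNB

def fit_base_learner(X_seed, y_seed):
    """Any classifier can serve as the base learner as long as it can score
    its own predictions in some way."""
    return GaussianNB().fit(X_seed, y_seed)

def hard_labels_and_confidence(h, X_unlabeled):
    proba = h.predict_proba(X_unlabeled)          # class-probability estimates
    labels = h.classes_[proba.argmax(axis=1)]     # the hard label the textbook assumes
    return labels, proba.max(axis=1)              # plus a confidence for select() to use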

  4. Example Selection
  • Probability
  • Probability ratio or probability margin
  • Entropy
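
An illustrative sketch of these selection scores, computed from a matrix of class-probability estimates; the function and method names are assumptions, not from the slides.

import numpy as np

def confidence_scores(proba, method="probability"):
    """Turn an (n_examples, n_classes) matrix of class probabilities into one
    confidence score per example; higher always means more confident."""
    p = np.sort(proba, axis=1)[:, ::-1]            # per row, descending
    if method == "probability":                    # probability of the top class
        return p[:, 0]
    if method == "ratio":                          # ratio of the top two classes
        return p[:, 0] / (p[:, 1] + 1e-12)
    if method == "margin":                         # gap between the top two classes
        return p[:, 0] - p[:, 1]
    if method == "entropy":                        # low entropy => high confidence
        return (proba * np.log(proba + 1e-12)).sum(axis=1)
    raise ValueError(f"unknown method: {method}")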

  5. Stopping Criteria
  • T rounds,
  • Repeat until convergence,
  • Use held-out validation data, or
  • k-fold cross-validation
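
A small illustrative helper combining two of these criteria; the signature is hypothetical and the validation-based criteria are only described in the comment.

import numpy as np

def should_stop(round_no, max_rounds, prev_labels, new_labels):
    """Two of the slide's criteria: a fixed budget of T rounds, or convergence
    (the self-assigned labels stop changing between rounds). Held-out validation
    or k-fold cross-validation would instead stop when the estimated accuracy
    of h stops improving."""
    out_of_rounds = round_no >= max_rounds
    converged = prev_labels is not None and np.array_equal(prev_labels, new_labels)
    return out_of_rounds or converged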

  6. Seed
  • Seed data vs. seed classifier
  • Training on seed data does not necessarily result in a classifier that perfectly labels the seed data
  • Training on data output by a seed classifier does not necessarily result in the same classifier
  • Constraints

  7. Indelibility
  Indelible:
  • L ← <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h)
    • L ← L + <U*, h(U*)>
    • U ← U – U*
  Original (Y(U) can change):
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h)
    • L ← L0 + <U*, h(U*)>
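
A sketch of the indelible variant under the same assumptions as the earlier self-training sketch: once an example is self-labeled it keeps that label permanently and is removed from U.

import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train_indelible(X0, Y0, U, n_rounds=10, k=10):
    """Indelible self-training: self-labeled examples are added to L permanently,
    keep the label they were given, and are removed from U."""
    L_X, L_y = X0, Y0
    h = None
    for _ in range(n_rounds):
        h = LogisticRegression().fit(L_X, L_y)          # h(x) <- f(L)
        if len(U) == 0:
            break
        conf = h.predict_proba(U).max(axis=1)
        idx = np.argsort(conf)[-k:]                     # U* <- select(U, h)
        L_X = np.vstack([L_X, U[idx]])                  # L <- L + <U*, h(U*)>
        L_y = np.concatenate([L_y, h.predict(U[idx])])
        U = np.delete(U, idx, axis=0)                   # U <- U - U*
    return h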

  8. Persistence
  Indelible:
  • L ← <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h)
    • L ← L + <U*, h(U*)>
    • U ← U – U*
  Persistent (X(L) can't change):
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← U* + select(U, h)
    • L ← L0 + <U*, h(U*)>
    • U ← U – U*
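
A sketch of the persistent variant under the same assumptions: the set U* only grows and U shrinks, so which examples sit in L is fixed, but their labels are re-predicted by the current h each round.

import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train_persistent(X0, Y0, U, n_rounds=10, k=10):
    """Persistent self-training: once selected, an example stays in L (U* only
    grows and U shrinks), but its label is re-predicted by the current h each round."""
    L_X, L_y = X0, Y0
    U_star = np.empty((0, X0.shape[1]))
    h = None
    for _ in range(n_rounds):
        h = LogisticRegression().fit(L_X, L_y)          # h(x) <- f(L)
        if len(U) == 0:
            break
        conf = h.predict_proba(U).max(axis=1)
        idx = np.argsort(conf)[-k:]
        U_star = np.vstack([U_star, U[idx]])            # U* <- U* + select(U, h)
        U = np.delete(U, idx, axis=0)                   # U <- U - U*
        L_X = np.vstack([X0, U_star])                   # L <- L0 + <U*, h(U*)>
        L_y = np.concatenate([Y0, h.predict(U_star)])   # labels of U* may be revised
    return h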

  9. Throttling
  Throttled:
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h, k)
    • L ← L0 + <U*, h(U*)>
  • Select the k examples from U with the greatest confidence
  Original (threshold):
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h, θ)
    • L ← L0 + <U*, h(U*)>
  • Select all examples from U with confidence > θ
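
The two selection rules side by side, as illustrative helpers that return indices into U and assume a predict_proba-style learner.

import numpy as np

def select_throttled(h, U, k):
    """Throttled: indices of the k unlabeled examples h is most confident about."""
    conf = h.predict_proba(U).max(axis=1)
    return np.argsort(conf)[-k:]

def select_threshold(h, U, theta):
    """Original thresholded rule: every example whose confidence exceeds theta."""
    conf = h.predict_proba(U).max(axis=1)
    return np.where(conf > theta)[0]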

  10. Balanced
  Balanced (& throttled):
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h, k)
    • L ← L0 + <U*, h(U*)>
  • Select k+ positive and k– negative examples; often k+ = k–, or they are proportional to N+ and N–
  Throttled:
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h, k)
    • L ← L0 + <U*, h(U*)>
  • Select the k examples from U with the greatest confidence
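
An illustrative balanced-and-throttled selector for a binary task; it assumes h.classes_[1] is the positive class, which is not stated on the slide.

import numpy as np

def select_balanced(h, U, k_pos, k_neg):
    """Balanced (and throttled) selection for a binary task: the k_pos most
    confidently positive and the k_neg most confidently negative predictions."""
    pos_score = h.predict_proba(U)[:, 1]          # assumes h.classes_[1] is the positive class
    order = np.argsort(pos_score)
    return np.concatenate([order[-k_pos:],        # most confidently positive
                           order[:k_neg]])        # most confidently negative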

  11. Preselection
  Preselect a subset of U:
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U' ← select(U, φ)
    • U* ← select(U', h, θ)
    • L ← L0 + <U*, h(U*)>
  • Select examples from U', a subset of U (typically chosen at random)
  Original (test all of U):
  • L ← L0 = <X(0), Y(0)>
  • Until stopping-criteria:
    • h(x) ← f(L)
    • U* ← select(U, h, θ)
    • L ← L0 + <U*, h(U*)>
  • Select examples from all of U
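
An illustrative preselection helper: score only a random pool U' drawn from U, which is cheaper than scoring all of U, then apply the usual confidence threshold within that pool. The pool size and default random seed are assumptions.

import numpy as np

def select_with_preselection(h, U, theta, pool_size, rng=None):
    """Preselection: threshold-based selection applied to a random subset U' of U."""
    rng = rng or np.random.default_rng(0)
    pool = rng.choice(len(U), size=min(pool_size, len(U)), replace=False)  # U' <- select(U, phi)
    conf = h.predict_proba(U[pool]).max(axis=1)
    return pool[conf > theta]                                              # U* <- select(U', h, theta)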

  12. Co-Training
  • X = X1 × X2; two different views of the data
  • x = (x1, x2); i.e., each instance comprises two distinct sets of features and values
  • Assume each view is sufficient for correct classification

  13. Co-Training Algorithm (Table 1: Blum and Mitchell, 1998)
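
A simplified co-training sketch in the spirit of Blum and Mitchell's Table 1, not a reproduction of it: one classifier per view, and each round each classifier labels the unlabeled examples it is most confident about, adding them to the shared labeled set. The original algorithm selects p positive and n negative examples per view and replenishes the pool from U; those details are omitted here, and all names are illustrative.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(L_X1, L_X2, L_y, U_X1, U_X2, n_rounds=10, k=2):
    """Co-training sketch: L_X1/L_X2 are the two views of the labeled data,
    U_X1/U_X2 the two views of the same unlabeled pool."""
    h1 = h2 = None
    for _ in range(n_rounds):
        h1 = GaussianNB().fit(L_X1, L_y)                 # trained on view 1 only
        h2 = GaussianNB().fit(L_X2, L_y)                 # trained on view 2 only
        if len(U_X1) == 0:
            break
        # each classifier nominates and labels its k most confident examples
        picks1 = np.argsort(h1.predict_proba(U_X1).max(axis=1))[-k:]
        picks2 = np.argsort(h2.predict_proba(U_X2).max(axis=1))[-k:]
        picks2 = np.setdiff1d(picks2, picks1)            # avoid adding an example twice
        new_y = h1.predict(U_X1[picks1])
        if len(picks2):
            new_y = np.concatenate([new_y, h2.predict(U_X2[picks2])])
        all_picks = np.concatenate([picks1, picks2])
        L_X1 = np.vstack([L_X1, U_X1[all_picks]])        # add both views to L
        L_X2 = np.vstack([L_X2, U_X2[all_picks]])
        L_y = np.concatenate([L_y, new_y])
        U_X1 = np.delete(U_X1, all_picks, axis=0)        # remove from the pool
        U_X2 = np.delete(U_X2, all_picks, axis=0)
    return h1, h2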

  14. Questions • ???
