

Active Learning 02-750

Jaime Carbonell,

Language Technologies Institute

Carnegie Mellon University

www.cs.cmu.edu/~{jgc | pinard | jinruih | vamshi}

27 September 2010

Active Learning
  • Training data:
    • Special case:
  • Functional space:
  • Fitness Criterion:
    • a.k.a. loss function
  • Sampling Strategy:
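
The formulas for these components did not survive in the transcript, so the following is only a minimal sketch of how the pieces fit together, under assumptions that are not from the slide: a 1-D pool of points, a function space of threshold classifiers, 0/1 loss on the labeled data, and uncertainty sampling (query the point nearest the current boundary) as the sampling strategy. The names `fit_threshold`, `query_most_uncertain`, `active_learn`, and `oracle` are hypothetical.

```python
def fit_threshold(labeled):
    """Return a threshold with zero 0/1 loss on (separable) labeled data.

    The function space is {f_t(x) = 1 if x >= t else 0}; we place t midway
    between the largest labeled negative and the smallest labeled positive.
    """
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    lo = max(neg) if neg else 0.0
    hi = min(pos) if pos else 1.0
    return (lo + hi) / 2.0


def query_most_uncertain(unlabeled, threshold):
    """Sampling strategy: pick the unlabeled point closest to the boundary."""
    return min(unlabeled, key=lambda x: abs(x - threshold))


def active_learn(pool, oracle, budget):
    """Pool-based loop: query, label, refit, repeat until the budget is spent."""
    unlabeled = list(pool)
    labeled = []  # starts empty: no labeled training data at all
    threshold = fit_threshold(labeled)
    for _ in range(min(budget, len(unlabeled))):
        x = query_most_uncertain(unlabeled, threshold)
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # pay for one label
        threshold = fit_threshold(labeled)
    return threshold, labeled


# Illustrative data: 101 evenly spaced points, true boundary at 0.6.
pool = [i / 100 for i in range(101)]
oracle = lambda x: int(x >= 0.6)
threshold, labeled = active_learn(pool, oracle, budget=10)
```

In this toy setting the loop behaves like a binary search, so a handful of queries is enough to classify the whole pool correctly, whereas labeling points at random would typically need many more.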

Jaime G. Carbonell, Language Technologies Institute

Cost-Sensitive Active Learning (pp37-39 Settles)
  • Suppose not all instances cost the same to label
    • Cytoplasmic vs membrane proteins for structure prediction via X-ray crystallography
    • Books vs web pages for topic labels
    • Near-misses vs clear examples
  • Suppose labelers vary in costs
    • Crystallography vs MRI for protein structures
    • Linguists vs Turkers for Machine Translation
  • How to cope with cost-accuracy tradeoffs?
    • Proactive learning (coming later)
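
One simple way to cope with unequal labeling costs is to rank candidates by estimated utility per unit cost and buy labels greedily until a budget runs out. The sketch below assumes this heuristic; it is not taken from the slide or from Settles, and the function name `select_cost_sensitive`, the dictionary keys, and the numbers are all hypothetical.

```python
def select_cost_sensitive(candidates, budget):
    """Greedily pick candidates with the best utility-to-cost ratio.

    candidates: list of dicts with 'name', 'utility' (estimated benefit of
    labeling, e.g. model uncertainty) and 'cost' (price of that label).
    Returns the chosen names and the total amount spent.
    """
    ranked = sorted(candidates,
                    key=lambda c: c["utility"] / c["cost"],
                    reverse=True)
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c["cost"] <= budget:  # skip anything we cannot afford
            chosen.append(c["name"])
            spent += c["cost"]
    return chosen, spent


# Made-up values: 'a' is informative but expensive, 'c' is cheap.
candidates = [
    {"name": "a", "utility": 0.9, "cost": 10.0},
    {"name": "b", "utility": 0.6, "cost": 2.0},
    {"name": "c", "utility": 0.3, "cost": 0.5},
]
chosen, spent = select_cost_sensitive(candidates, budget=3.0)
```

The same ratio can be extended to labelers that differ in price by treating each (instance, labeler) pair as a candidate.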

Active Learning Beyond Instances
  • Active Class Selection (p33 Settles)
    • Given a class, query instances thereof
    • Typical vs boundary instances
  • Active Feature Selection
    • Query values of features across many instances
    • Enables meaningful “batch” experiments
    • Generalized to Instance-Feature matrix
  • Active Clustering (p33-34 Settles)
    • Semi-supervised: new classes can spawn
    • Subsampling for effective unsupervised clustering

Batch-Mode Active Learning
  • Why would we want Q-batch vs Q-1?
    • Amortize experimental set up
    • Keep human labeler efficiently busy
      • “Staleness” vs utilization (Ringer, 2010)
    • Crowd sourcing → parallelizable AL
  • How do we select batches? (pp 35-36 Settles)
    • Instance diversity in batch as part of sampling (Brinker 2003, Donmez & Carbonell, 2008)
    • Modular and submodular functions (Hoi 2006)
    • Need a joint optimization criterion
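
To illustrate why a joint criterion matters, the sketch below greedily builds a batch by scoring each candidate on its own uncertainty plus its distance to the nearest instance already in the batch. This is a generic diversity heuristic under assumed 1-D features, not the specific method of any of the cited papers; `select_batch` and `diversity_weight` are hypothetical names.

```python
def select_batch(pool, uncertainty, batch_size, diversity_weight=1.0):
    """Greedy batch selection trading off uncertainty against diversity.

    pool: list of 1-D feature values.
    uncertainty: per-instance uncertainty scores (higher = more informative).
    Returns the indices of the selected instances, in selection order.
    """
    chosen = []
    remaining = list(range(len(pool)))
    while remaining and len(chosen) < batch_size:

        def score(i):
            if not chosen:
                return uncertainty[i]
            # Distance to the closest instance already in the batch.
            nearest = min(abs(pool[i] - pool[j]) for j in chosen)
            return uncertainty[i] + diversity_weight * nearest

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Three near-duplicates with high uncertainty and one distant instance.
pool = [0.50, 0.51, 0.52, 0.90]
uncertainty = [0.9, 0.89, 0.88, 0.5]
top_q = select_batch(pool, uncertainty, batch_size=2, diversity_weight=0.0)
diverse = select_batch(pool, uncertainty, batch_size=2, diversity_weight=2.0)
```

With `diversity_weight=0.0` the batch is just the two most uncertain instances, which here are near-duplicates; with a positive weight the second pick moves to the distant instance.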

Noisy Labelers or Experiments (pp37-39 Settles)
  • Labeling noise → version-space learning flawed
    • E.g. cannot apply SVM shrinking-margin
    • Underlying ML algorithm must be noise resistant
  • Reducing noisy labels if p(correct) > 0.5
    • Repeated labeling (if random noise)
    • Majority vote (if semi-independent labelers)
    • Tradeoffs in repeat vs new labels
    • Cost vs accuracy tradeoffs
  • What if the labeler accuracy is not known?
    • Learn/estimate labeler accuracy as part of AL
    • → Proactive Learning (later class)
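
The condition p(correct) > 0.5 can be checked numerically. Assuming k fully independent labelers who are each correct with the same probability p (a simplification of the "semi-independent" case above), the chance that a majority vote is correct is a binomial tail sum. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p, k):
    """Probability that a strict majority of k independent labelers is correct.

    p: probability that a single labeler gives the correct label.
    k: number of labelers; use an odd k so that ties cannot occur.
    """
    needed = k // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(k, j) * p**j * (1 - p) ** (k - j)
               for j in range(needed, k + 1))


single = majority_vote_accuracy(0.7, 1)
triple = majority_vote_accuracy(0.7, 3)
weak_triple = majority_vote_accuracy(0.4, 3)
```

For p = 0.7, three votes are more reliable than one; for p = 0.4, voting makes things worse. Each extra vote also costs a label that could have been spent on a new instance, which is the repeat-vs-new tradeoff listed above.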

Readings
  • Burr Settles – Comprehensive Survey of AL http://www.cs.cmu.edu/~bsettles/pub/settles.activelearning.pdf
  • Donmez, P., Carbonell, J. and Bennett, P. “Dual-Strategy Active Learning” http://www.cs.cmu.edu/~jgc/publication/Dual_Strategy_ECML_2007.pdf
  • Cohn, Ghahramani and Jordan, “Active Learning with Statistical Models” http://dspace.mit.edu/bitstream/handle/1721.1/7192/AIM-1522.pdf;jsessionid=13C2A9BF0DEC1567B9CA33F0C43BC3C3?sequence=2

THANK YOU!
