Generative Models for Crowdsourced Data


Outline
  • What is Crowdsourcing?
  • Modeling the labeling process
  • Example with real data
  • Extensions
  • Future Directions
What is Crowdsourcing?
  • Human-based computation.
  • Outsourcing certain steps of a computation to humans.
  • "Artificial artificial intelligence."
  • Data science:
    • Making an immediate decision.
    • Creating a labeled data set for learning.
Funny enough …
  • Not everybody agrees on the gender of a Twitter profile.
  • Difficult Instances
  • Worker Ability / Motivation
  • Worker Bias
  • Adversarial Behaviour
Disagreements
  • When some workers say “male” and some workers say “female”, what to do?
Majority Rules Heuristic
  • Assign label l to item x if a majority of workers agree.
  • Otherwise item x remains unlabeled.
  • Ignores prior worker data.
  • Introduces bias into the labeled data.
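The heuristic above fits in a few lines; this is a sketch, and `majority_label` is a name chosen here, not from the talk. A tie means no strict majority, so the item stays unlabeled.

```python
from collections import Counter

def majority_label(worker_labels):
    """Return the majority label for an item, or None on a tie
    (in which case the item remains unlabeled)."""
    counts = Counter(worker_labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no strict majority: leave unlabeled
    return counts[0][0]
```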
Train on all labels
  • For the labeled data set workflow.
  • Add all item-label pairs to the data set.
  • Equivalent to a cost vector of:
    • P(l | {l_w}) = (1/n_w) Σ_w 1{l = l_w}
  • Ignores prior worker data.
  • Models the crowd, not the "ground truth."
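The cost vector above is just the empirical vote distribution. A minimal sketch (function name is illustrative):

```python
from collections import Counter

def vote_distribution(worker_labels, label_set):
    """P(l | {l_w}) = (1/n_w) * sum_w 1{l = l_w}: the fraction of the
    n_w workers who voted for each candidate label l."""
    n_w = len(worker_labels)
    counts = Counter(worker_labels)
    return {l: counts[l] / n_w for l in label_set}
```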
What is Ground Truth?
  • Different theoretical approaches.
    • PAC learning with noisy labels.
    • Fully-adversarial active learning.
  • Bayesians have been very active.
    • “Easy” to posit a functional form and quickly develop inference algorithms.
    • Issue of model correctness is ultimately empirical.
Bayesian Literature
  • (2009) Whitehill et al. GLAD framework.
    • (1979) Dawid and Skene. Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm.
  • (2010) Welinder et al. The Multidimensional Wisdom of Crowds.
  • (2010) Raykar et al. Learning from Crowds.
Bayesian Approach
  • Define ground truth via a generative model which describes how “ground truth” is related to the observed output of crowdsource workers.
  • Fit to observed data.
  • Extract posterior over ground truth.
  • Make decision or train classifier.
Example: Binary Classification
  • Each worker has a confusion parameter matrix:

        α = ( −1     α_01 )
            ( α_10   −1   )

  • Each item has a scalar difficulty β > 0.
  • P(l_w = j | z = i) = exp(−β α_ij) / Σ_k exp(−β α_ik)
  • α_ij ~ N(μ_ij, 1);  μ_ij ~ N(0, 1)
  • log β ~ N(ρ, 1);  ρ ~ N(0, 1)
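The emission probabilities above are a row-wise softmax over −βα, with the diagonal fixed at −1. A sketch, with illustrative parameter values:

```python
import numpy as np

def worker_label_probs(alpha, beta):
    """Row i gives P(l_w = j | z = i) = exp(-beta*alpha[i,j]) / sum_k exp(-beta*alpha[i,k])."""
    logits = -beta * alpha
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# A mildly biased binary worker: diagonal fixed at -1 per the model.
alpha = np.array([[-1.0, 0.5],
                  [0.3, -1.0]])
probs = worker_label_probs(alpha, beta=1.0)  # rows sum to 1, diagonal dominates
```

Larger β sharpens the rows toward the diagonal, so β acts as an item-level temperature on worker reliability.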
Other Problems
  • Multiclass classification:
    • Same as binary with larger confusion matrix.
  • Ordinal classification: (“Hot or not”)
    • Confusion matrix has special form.
    • O(L) parameters instead of O(L²).
  • Multilabel classification:
    • Reduce to multiclass on power set.
    • Assume low-rank confusion matrix.
EM
  • Initially all workers are assumed moderately accurate and without bias.
    • The initial estimate of the ground-truth distribution therefore favors consensus.
    • Disagreeing with the majority is treated as a likely error.
  • Workers consistently in the minority have their confusion probabilities increased.
  • Workers with higher confusion probabilities contribute less to the distribution of ground truth.
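The dynamic above can be seen in a minimal batch EM of the Dawid-and-Skene flavor. This is a sketch under assumptions made here (the smoothing constant and all names are choices of this example, not the talk's exact parameterization):

```python
import numpy as np

def em_confusion(labels, n_workers, n_items, n_classes, iters=20):
    """labels: list of (worker, item, observed_label) triples.
    Returns q (posterior over each item's true class) and per-worker
    confusion matrices conf[w, true, observed]."""
    # Initialize q with a soft majority vote, so consensus is favored.
    q = np.ones((n_items, n_classes)) / n_classes
    for w, i, l in labels:
        q[i, l] += 1.0
    q /= q.sum(axis=1, keepdims=True)

    for _ in range(iters):
        # M-step: confusion counts weighted by current belief q.
        conf = np.full((n_workers, n_classes, n_classes), 1e-2)  # smoothing
        for w, i, l in labels:
            conf[w, :, l] += q[i]
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: workers with high confusion probabilities contribute less.
        log_q = np.zeros((n_items, n_classes))
        for w, i, l in labels:
            log_q[i] += np.log(conf[w, :, l])
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
    return q, conf
```

With two reliable workers and one consistent contrarian, the contrarian's confusion matrix becomes anti-diagonal and his votes stop moving q.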
“Different” workers are marginalized
  • Workers that are consistently in the minority will not contribute strongly to the posterior distribution over ground truth.
    • Even if they are actually more accurate.
  • Can correct when accurate workers are paired with some inaccurate workers.
  • Good for breaking ties.
  • Raykar et al.
Online EM
  • Given a set of worker-label pairs for a single item:
  • (Inference) Using current α, find most likely β* and distribution q* over ground truth.
  • (Training) Do SGD update of α with respect to EM auxiliary function evaluated at β* and q*.
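A sketch of one such online step, with the item difficulty β fixed at 1 and the diagonal constraint on α ignored for brevity. All names are illustrative, not from the talk's software:

```python
import numpy as np

def softmax_rows(logits):
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def online_em_step(alpha, item_labels, lr=0.1):
    """alpha: dict worker -> parameter matrix, P(l_w=j | z=i) = softmax_j(-alpha[i, :]).
    item_labels: (worker, observed_label) pairs for a single item.
    Inference: compute q* over the true label under current parameters.
    Training: one gradient-ascent step on each worker's expected log-likelihood."""
    n_classes = next(iter(alpha.values())).shape[0]
    # Inference: log q(z) proportional to sum_w log P(l_w | z)
    log_q = np.zeros(n_classes)
    for w, l in item_labels:
        log_q += np.log(softmax_rows(-alpha[w])[:, l])
    q = np.exp(log_q - log_q.max())
    q /= q.sum()
    # Training: d/d alpha[i,j] of log P(l | z=i) is probs[i,j] - 1{j == l}
    for w, l in item_labels:
        probs = softmax_rows(-alpha[w])
        grad = q[:, None] * probs
        grad[:, l] -= q
        alpha[w] += lr * grad
    return q
```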
Things to do with q*
  • Take an immediate cost-sensitive decision:
    • d* = argmin_d E_{z~q*}[f(z, d)]
  • Train an (importance-weighted) classifier:
    • cost vector c_d = E_{z~q*}[f(z, d)]
    • e.g. 0/1 loss: c_d = (1 − q*_d)
    • e.g. binary 0/1 loss: |c_1 − c_0| = |1 − 2 q*_1|
    • No need to decide what the true label is!
  • Raykar et al.: why not jointly estimate the classifier and worker confusion?
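The immediate-decision rule d* = argmin_d E_{z~q*}[f(z, d)] is a one-liner over a cost matrix. A sketch (the cost matrix and names are illustrative):

```python
import numpy as np

def cost_sensitive_decision(q, cost):
    """d* = argmin_d E_{z~q*}[ f(z, d) ].
    q: posterior over true labels; cost[z, d] is the loss of
    deciding d when the truth is z."""
    expected = q @ cost  # expected[d] = sum_z q[z] * cost[z, d]
    return int(np.argmin(expected)), expected

q = np.array([0.7, 0.3])       # q* from inference
cost_01 = 1.0 - np.eye(2)      # 0/1 loss: c_d = 1 - q*_d
decision, c = cost_sensitive_decision(q, cost_01)
```

With 0/1 loss, c = 1 − q*, so the binary importance weight is |c_1 − c_0| = |1 − 2 q*_1|, matching the slide.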
Raykar et al. insight
  • Cost vector is constructed by estimating worker confusion matrices.
  • Subsequently, the classifier is trained; it will sometimes disagree with workers.
  • Would be nice to use that disagreement to inform the worker confusion matrices.
  • Circular dependency suggests joint estimation.
Online Joint Estimation
  • Initially the classifier will output an uninformative prior and therefore will be trained to follow consensus of workers.
  • Eventually, workers that disagree with the classifier will have their confusion probabilities increased.
  • Workers consistently in the minority can contribute strongly to the posterior if they tend to agree with the classifier.
Additional Resources
  • Software
    • http://code.google.com/p/nincompoop
  • Blog
    • http://machinedlearnings.com/