Basics of Statistical Estimation


  1. Basics of Statistical Estimation

  2. Learning Probabilities: Classical Approach
  Simplest case: flipping a thumbtack, which lands either heads or tails. The true probability θ of heads is unknown. Given i.i.d. data, estimate θ with an estimator that has good properties: low bias, low variance, consistency (e.g., the maximum likelihood estimate).

  3. Maximum Likelihood Principle
  Choose the parameter values that maximize the probability of the observed data: θ_ML = argmax_θ p(D | θ).

  4. Maximum Likelihood Estimation
  For i.i.d. tosses yielding #h heads and #t tails, the likelihood is p(D | θ) = θ^#h (1 − θ)^#t (the number of heads follows a binomial distribution).

  5. Computing the ML Estimate
  • Use the log-likelihood: log p(D | θ) = #h log θ + #t log(1 − θ)
  • Differentiate with respect to the parameter(s)
  • Set the derivative to zero and solve
  • Solution: θ_ML = #h / (#h + #t)
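
A minimal Python sketch of the closed-form solution (the data below are made up for illustration):

```python
import numpy as np

def ml_estimate(flips):
    """ML estimate of theta: the fraction of heads, #h / (#h + #t)."""
    flips = np.asarray(flips)
    return flips.sum() / len(flips)

flips = [1, 0, 1, 1, 0, 1, 0, 1]   # hypothetical tosses: 1 = heads, 0 = tails
print(ml_estimate(flips))          # 0.625 (5 heads out of 8 tosses)
```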

  6. Sufficient Statistics
  (#h, #t) are sufficient statistics: the likelihood depends on the data only through the counts of heads and tails.

  7. Bayesian Estimation
  The true probability θ is unknown; represent uncertainty about it with a Bayesian probability density p(θ) for θ ∈ [0, 1]. [Figure: a density p(θ) plotted over θ from 0 to 1, next to the heads/tails thumbtack.]

  8. Use of Bayes’ Theorem
  posterior ∝ likelihood × prior: p(θ | D) = p(D | θ) p(θ) / p(D)

  9. Example: Application to Observing a Single “Heads”
  Prior p(θ), likelihood p(heads | θ) = θ, posterior p(θ | heads) ∝ θ · p(θ). [Figure: three panels over θ ∈ [0, 1] showing prior, likelihood, and posterior.]

  10. Probability of Heads on Next Toss
  Average over all parameter values: p(heads | D) = ∫ θ p(θ | D) dθ = E[θ | D], the posterior mean.
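
A numerical sketch of slides 8–10, assuming a uniform prior and a simple grid approximation (both choices are mine, not the slides'):

```python
import numpy as np

theta = np.linspace(0.0, 1.0, 1001)      # grid over parameter values
d_theta = theta[1] - theta[0]

prior = np.ones_like(theta)              # assumed uniform prior p(theta)
likelihood = theta                       # p(heads | theta) = theta
posterior = prior * likelihood           # Bayes: posterior ∝ likelihood × prior
posterior /= posterior.sum() * d_theta   # normalize so it integrates to 1

# Probability of heads on the next toss: the posterior mean E[theta | D]
p_next_heads = (theta * posterior).sum() * d_theta
print(p_next_heads)                      # ≈ 2/3 after one observed head
```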

  11. MAP Estimation
  • Approximation: instead of averaging over all parameter values, consider only the most probable value (i.e., the value with the highest posterior probability)
  • Usually a very good approximation, and much simpler
  • MAP value ≠ expected value
  • MAP → ML for infinite data (as long as the prior is nonzero everywhere)
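
Continuing the same toy example, a sketch contrasting the MAP value with the expected value (after one observed head under a uniform prior, MAP = 1 while the posterior mean ≈ 2/3):

```python
import numpy as np

theta = np.linspace(0.0, 1.0, 1001)
d_theta = theta[1] - theta[0]
posterior = theta / (theta.sum() * d_theta)       # posterior after one head, uniform prior

theta_map = theta[np.argmax(posterior)]           # most probable value: 1.0
theta_mean = (theta * posterior).sum() * d_theta  # expected value: ≈ 2/3
print(theta_map, theta_mean)                      # MAP value ≠ expected value
```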

  12. Prior Distributions for θ
  • Direct assessment
  • Parametric distributions
  • Conjugate distributions (for convenience)
  • Mixtures of conjugate distributions

  13. Conjugate Family of Distributions
  Beta distribution: p(θ) = Beta(θ | α_h, α_t) ∝ θ^(α_h − 1) (1 − θ)^(α_t − 1)
  Resulting posterior distribution, after observing #h heads and #t tails: p(θ | D) = Beta(θ | α_h + #h, α_t + #t)
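
A sketch of the conjugate update with scipy (the hyperparameters and counts are made-up examples):

```python
from scipy.stats import beta

alpha_h, alpha_t = 2.0, 2.0   # hypothetical prior hyperparameters
n_heads, n_tails = 5, 3       # hypothetical observed counts

# Conjugacy: Beta prior + binomial likelihood -> Beta posterior
posterior = beta(alpha_h + n_heads, alpha_t + n_tails)   # Beta(7, 5)
print(posterior.mean())       # posterior mean = 7/12 ≈ 0.583
```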

  14. Estimates Compared
  • Prior prediction: p(heads) = α_h / (α_h + α_t)
  • Posterior prediction: p(heads | D) = (α_h + #h) / (α_h + α_t + #h + #t)
  • MAP estimate: θ_MAP = (α_h + #h − 1) / (α_h + α_t + #h + #t − 2)
  • ML estimate: θ_ML = #h / (#h + #t)
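
The four estimates side by side, using the same made-up prior and counts as above; note how close the posterior-based estimates already are to the ML estimate:

```python
alpha_h, alpha_t = 2.0, 2.0   # hypothetical prior hyperparameters
h, t = 5, 3                   # hypothetical counts of heads and tails

prior_pred = alpha_h / (alpha_h + alpha_t)                        # 0.5
post_pred  = (alpha_h + h) / (alpha_h + alpha_t + h + t)          # 7/12 ≈ 0.583
map_est    = (alpha_h + h - 1) / (alpha_h + alpha_t + h + t - 2)  # 6/10 = 0.6
ml_est     = h / (h + t)                                          # 5/8 = 0.625
print(prior_pred, post_pred, map_est, ml_est)
```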

  15. Intuition
  • The hyperparameters α_h and α_t can be thought of as imaginary counts of heads and tails from our prior experience, starting from “pure ignorance”
  • Equivalent sample size = α_h + α_t
  • The larger the equivalent sample size, the more confident we are about the true probability

  16. Beta Distributions
  [Figure: four example Beta densities — Beta(0.5, 0.5), Beta(1, 1), Beta(3, 2), Beta(19, 39).]
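
A sketch that reproduces the four panels (plotting choices are mine):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

theta = np.linspace(0.001, 0.999, 500)   # avoid the endpoints, where Beta(0.5, 0.5) diverges
for a, b in [(0.5, 0.5), (1, 1), (3, 2), (19, 39)]:
    plt.plot(theta, beta.pdf(theta, a, b), label=f"Beta({a}, {b})")
plt.xlabel("theta")
plt.ylabel("density")
plt.legend()
plt.show()
```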

  17. Assessment of a Beta Distribution
  Method 1: equivalent sample
  - assess α_h and α_t directly, or
  - assess the equivalent sample size α_h + α_t and the mean α_h / (α_h + α_t)
  Method 2: imagined future samples
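
Method 1 in code, as a small sketch (the function name and example numbers are illustrative, not from the slides):

```python
def beta_from_assessment(equivalent_sample_size, mean_prob_heads):
    """Convert an assessed equivalent sample size and mean into (alpha_h, alpha_t)."""
    alpha_h = equivalent_sample_size * mean_prob_heads
    alpha_t = equivalent_sample_size * (1.0 - mean_prob_heads)
    return alpha_h, alpha_t

print(beta_from_assessment(10, 0.3))   # (3.0, 7.0)
```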

  18. Generalization to m Outcomes (Multinomial Distribution)
  Dirichlet distribution: p(θ_1, …, θ_m) = Dir(θ | α_1, …, α_m) ∝ ∏_k θ_k^(α_k − 1)
  Properties: E[θ_k] = α_k / (α_1 + … + α_m); given counts N_1, …, N_m, the posterior is Dir(θ | α_1 + N_1, …, α_m + N_m)
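
A sketch of the Dirichlet update (hyperparameters and counts are made up):

```python
import numpy as np

alpha = np.array([1.0, 1.0, 1.0])      # hypothetical Dirichlet hyperparameters
counts = np.array([4, 2, 1])           # hypothetical observed counts per outcome

alpha_post = alpha + counts            # posterior: Dirichlet(alpha + counts)
pred = alpha_post / alpha_post.sum()   # predictive probability of each outcome
print(pred)                            # [0.5, 0.3, 0.2]
```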

  19. Other Distributions
  Likelihoods from the exponential family, which all admit conjugate priors:
  • Binomial
  • Multinomial
  • Poisson
  • Gamma
  • Normal
