1 / 18

Statistical Estimation

Statistical Estimation. Vasileios Hatzivassiloglou University of Texas at Dallas. Obama contract at intrade.com. Instance profiles.

sydnee
Download Presentation

Statistical Estimation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Statistical Estimation Vasileios Hatzivassiloglou University of Texas at Dallas

  2. Obama contract at intrade.com

  3. Instance profiles • Given k observations of maximum length n, construct a |Σ|×n matrix A (profile) where entry Aij is the estimated probability that the ith letter occurs in position j • One way to estimate Aij is to count each letter occuring at this position (cij); then • This is maximum likelihood estimation (MLE) • Estimate becomes better as k increases

  4. Example data • 23 sample motif instances for the cyclic AMP receptor transcription factor (positions 3-9) TTGTGGC TTTTGAT AAGTGTC ATTTGCA CTGTGAG ATGCAAA GTGTTAA ATTTGAA TTGTGAT ATTTATT ACGTGAT ATGTGAG TTGTGAG CTGTAAC CTGTGAA TTGTGAC GCCTGAC TTGTGAT TTGTGAT GTGTGAA CTGTGAC ATGAGAC TTGTGAG

  5. Calculated profile

  6. Probability of a motif • Suppose that we consider M as a candidate motif consensus • How do we find the best M given the observations in A? • Assuming independence of positions,

  7. Maximum likelihood estimation • General method for estimating unknown parameters when we have • a sample of values that depend on these parameters • a formula specifying the probability of obtaining these values given the parameters

  8. MLE example: three coins • Suppose we have three coins with probability of heads ⅓, ½, and ⅔ • One of them is used to generate a series of 20 tosses and we observe 11 heads • θ = the heads probability of the coin used in the experiment • Binomial distribution for the number of heads

  9. Binomial distribution • Count of one of two possible outcomes in a series of independent events • The probabilities of the two outcomes are constant across events • An example of iid events (independent, identically distributed)

  10. Binomial probability mass • If the probability of one outcome (let’s call it A) is p and there are n events • The probability of the other outcome is 1-p • The probability of obtaining a particular sequence of outcomes with m A’s is • There are sequences with the same number m of outcomes A • Overall

  11. MLE example: three coins • Result: Choose θ = ½

  12. MLE example: unknown coins • θ can take any value between 0 and 1 • m heads in n tosses • Solve the differential equation

  13. Solving the differential equation

  14. MLE for binomial • Of the three solutions, θ = 0 andθ = 1 result in P(X1,X2,...,Xn | θ) = 0, i.e., local minima • On the other hand, for 0<θ<1, P(X1,X2,...,Xn | θ) > 0, so θ = m/n must be a local maximum • Therefore the MLE estimate is

  15. Properties of estimators • The estimation error for a given sample is where x is the unknown true value • An estimator is a random variable • because it depends on the sample • The meansquare error represents the overall quality of the estimation across all samples

  16. Expected values • Recall that the expected value of a discrete random variable X is defined as • The expected value of a dependent random variable f(X) is • For continuous distributions, replace the sum with an integral

  17. Bias in estimation • An estimator is unbiased if • MLE is not necessarily unbiased • Example: standard deviation • Is the most commonly used measure of dispersion in a data set • For a random variable X, it is defined as

  18. Estimators of standard deviation • MLE estimator where • “Almost unbiased” estimator ( is an unbiased estimator of σ2) biased

More Related