
Inference About a Mean Vector II



Presentation Transcript


  1. Inference About a Mean Vector II BMTRY 726, 1/31/2014

  2. Large n Given X1, X2, …, Xn ~ NID(μ, Σ), where E(Xi) = μ and V(Xi) = Σ. Consider when n is large. By the multivariate central limit theorem, √n(X̄ − μ) converges in distribution to Np(0, Σ), and S converges in probability to Σ. Therefore n(X̄ − μ)′ S⁻¹ (X̄ − μ) converges in distribution to χ²p.

  3. Large n In the case where n is large, we worry less about the normality assumption. The distribution of T² can be approximated by a χ²p distribution. So if we want to test H0: μ = μ0 vs. H1: μ ≠ μ0, we reject H0 if T² = n(X̄ − μ0)′ S⁻¹ (X̄ − μ0) > χ²p(α). We can use this to get an approximate confidence region for μ. The only assumption is that the Xi are iid and that μ and Σ exist.
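As a concrete illustration (not part of the original slides), here is a minimal Python sketch of the large-sample test. The function name and the simulated data are my own for illustration; the statistic itself is the T² defined above.

```python
import numpy as np

def hotelling_t2(X, mu0):
    """Large-sample Hotelling T^2 statistic for H0: mu = mu0.

    Only assumes the rows of X are iid with finite mean and covariance;
    for large n, T^2 is approximately chi-square with p degrees of freedom.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)           # unbiased sample covariance
    d = xbar - np.asarray(mu0, dtype=float)
    return n * d @ np.linalg.solve(S, d)  # n (xbar - mu0)' S^{-1} (xbar - mu0)

# Example with p = 2: reject H0 at alpha = 0.05 if T^2 > chi^2_{2}(0.05) = 5.991
rng = np.random.default_rng(1)
X = rng.normal(loc=[0.0, 0.0], size=(500, 2))
t2 = hotelling_t2(X, [0.0, 0.0])
```

Note that nothing here uses normality of the Xi, only that n is large enough for the χ² approximation to hold.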

  4. Missing Observations What happens if we have the following data? Types of missing data:
  -MCAR: missing completely at random - missing observations are independent of the actual measurements
  -MAR: missing at random - missingness depends on observed values but not the missing values
  -NMAR: not missing at random - informative; we can't ignore why the data are missing

  5. Missing Observations There is no single uniform method to deal with missing data:
  -complete case analysis
  -replace missing values with the sample mean
  -multiple imputation
  It is easiest to work with the MCAR assumption. Imputation: sometimes we do impute missing values. However, this can be "dangerous": the variance will often be underestimated, which can bias the results toward rejecting the null.
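The variance-underestimation point can be seen directly with a tiny numeric example (my own toy data, not from the slides): mean-imputed values sit exactly at the mean, so they add no spread.

```python
import numpy as np

# Mean imputation: fill each missing value with the observed sample mean.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, np.nan, np.nan])
obs = x[~np.isnan(x)]
filled = np.where(np.isnan(x), obs.mean(), x)

# The filled-in sample understates the observed variance:
# obs.var() = 2.0, filled.var() = 10/7 ~ 1.43
print(obs.var(), filled.var())
```

Smaller variance means smaller standard errors, which is exactly why naive imputation pushes tests toward rejecting the null.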

  6. Basic EM Framework EM is useful when maximum likelihood calculations would be easy IF we knew some things we don't know…
  -We have a model for complete data X with associated pdf f(x; θ) with unknown parameter θ
  -BUT we don't observe all of X
  -We want to maximize the observed-data likelihood with respect to θ
  *assuming data are missing at random!

  7. Key Ideas of EM • Compute the expectation of the complete data log-likelihood conditional on the current estimate of parameters • Maximize the resulting log-likelihood to obtain the next estimate of the parameters • Iterate to some level of convergence

  8. Key Ideas of EM EM for exponential families: -compute the expectation of the sufficient statistics conditional on the current estimate of the parameters -use the resulting estimates of the sufficient statistics to re-estimate the parameters -iterate to some level of convergence.

  9. EM and Univariate Normal Suppose X1, …, Xn ~ N(μ, σ²) and some of the Xi are missing. Begin by making an initial guess (μ(0), σ²(0)).

  10. EM and Univariate Normal E-step (iteration k): compute the expectations of the sufficient statistics, conditional on Xobs and the current parameter estimate. M-step (iteration k): maximize the resulting log-likelihood.
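The univariate case can be sketched in a few lines of Python. This is my own minimal implementation of the two steps above, assuming MCAR data encoded as NaN: the E-step replaces the sufficient statistics of each missing value with E[X] = μ and E[X²] = μ² + σ², and the M-step re-estimates (μ, σ²) from the completed statistics.

```python
import numpy as np

def em_univariate_normal(x, mu0=0.0, sig20=1.0, n_iter=200):
    """EM for N(mu, sigma^2) when some entries of x are missing (np.nan).

    mu0, sig20 are the initial guesses; returns (mu, sigma^2) estimates.
    """
    x = np.asarray(x, dtype=float)
    obs = x[~np.isnan(x)]
    n, n_mis = x.size, int(np.isnan(x).sum())
    mu, sig2 = mu0, sig20
    for _ in range(n_iter):
        # E-step: expected sufficient statistics given current (mu, sig2)
        t1 = obs.sum() + n_mis * mu                   # E[sum X_i]
        t2 = (obs ** 2).sum() + n_mis * (mu ** 2 + sig2)  # E[sum X_i^2]
        # M-step: re-estimate parameters from the completed statistics
        mu = t1 / n
        sig2 = t2 / n - mu ** 2
    return mu, sig2

mu_hat, sig2_hat = em_univariate_normal([1.0, 2.0, 3.0, np.nan, np.nan])
```

In this univariate MCAR setting the iterations converge to the observed-data MLE (here the mean and population variance of {1, 2, 3}), which is a useful sanity check.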

  11. Multivariate Normal Say we have: If we impute missing values of BP and LDL separately, we lose the correlation structure. Multiple imputation using the EM algorithm allows us to include the correlation among the X's. We make an initial guess about θ and apply our guess to the missing data. We repeat this until convergence.

  12. EM Algorithm for MVN • In MVN data, the EM algorithm is based on the sufficient statistics T1 = Σj xj and T2 = Σj xj xj′

  13. EM Algorithm for MVN First estimate μ̃ and Σ̃ based on the observed data. E-Step: for each vector xj with missing components, use the mean of the conditional distribution of the missing components given the observed components (and the current μ̃, Σ̃) to estimate the missing values. These estimates can be calculated based on properties of a MVN distribution…

  14. EM Algorithm for MVN E-Step: Estimates of the missing values are the conditional means, x̃j,mis = μ̃_mis + Σ̃_mis,obs Σ̃_obs,obs⁻¹ (xj,obs − μ̃_obs). These are used to find T1 and T2.

  15. EM Algorithm for MVN M-Step: Compute the revised maximum likelihood estimates from the sufficient statistics found in the E-step: μ̃ = T1/n and Σ̃ = (1/n)T2 − μ̃ μ̃′.
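Putting the E- and M-steps together, here is a minimal Python sketch of the MVN EM algorithm (my own implementation, assuming MAR data encoded as NaN). The E-step uses the standard MVN conditional mean and covariance; note that T2 must include the conditional covariance of the missing block, not just the outer product of the imputed vector.

```python
import numpy as np

def em_mvn(X, n_iter=50):
    """EM for multivariate normal data with missing entries (np.nan)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Initial guess from the available entries in each column
    mu = np.nanmean(X, axis=0)
    Sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        T1 = np.zeros(p)
        T2 = np.zeros((p, p))
        for x in X:
            m = np.isnan(x)                  # missing components
            o = ~m
            xhat = x.copy()
            C = np.zeros((p, p))             # conditional covariance term
            if m.any():
                Soo_inv = np.linalg.inv(Sigma[np.ix_(o, o)])
                # E[x_mis | x_obs]: conditional mean of the MVN
                xhat[m] = mu[m] + Sigma[np.ix_(m, o)] @ Soo_inv @ (x[o] - mu[o])
                # Cov[x_mis | x_obs]: goes into T2 alongside the outer product
                C[np.ix_(m, m)] = (Sigma[np.ix_(m, m)]
                                   - Sigma[np.ix_(m, o)] @ Soo_inv @ Sigma[np.ix_(o, m)])
            T1 += xhat
            T2 += np.outer(xhat, xhat) + C
        mu = T1 / n                          # M-step
        Sigma = T2 / n - np.outer(mu, mu)
    return mu, Sigma

X = np.array([[1.0, 2.0], [2.0, np.nan], [3.0, 4.0], [np.nan, 6.0], [5.0, 8.0]])
mu_hat, Sigma_hat = em_mvn(X)
```

With no missing entries the loop reproduces the usual MLEs (x̄ and the biased sample covariance), which makes it easy to verify the implementation on complete data first.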

  16. Example Consider an observed sample

  17. Example Update missing values in row 1

  18. Example Update missing values in row 3

  19. Example Now estimate the sufficient statistics. Let's start with T1

  20. Example Now find T2

  21. Example Revise the MLE estimates using the sufficient statistics Use these estimates and go through the algorithm again

  22. Programming an EM algorithm Start with pseudo-code… what information do you need to capture and what output do you expect?
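One way to answer that question is to sketch a generic driver first and fill in the model-specific pieces later. The skeleton below is my own illustration: the E- and M-steps are passed in as functions, and the toy model (three observed values {1, 2, 3} plus two missing, mean-only) is invented purely to exercise the loop.

```python
import numpy as np

def run_em(e_step, m_step, theta0, tol=1e-10, max_iter=500):
    """Generic EM driver: alternate E- and M-steps until the parameter
    estimate stops changing (or max_iter is hit). Returns (theta, iters)."""
    theta = np.asarray(theta0, dtype=float)
    for it in range(1, max_iter + 1):
        stats = e_step(theta)                       # expected sufficient stats
        theta_new = np.asarray(m_step(stats), dtype=float)
        if np.max(np.abs(theta_new - theta)) < tol:  # convergence check
            return theta_new, it
        theta = theta_new
    return theta, max_iter

# Toy model: estimate mu from obs {1, 2, 3} plus 2 missing values (n = 5)
e_step = lambda mu: 6.0 + 2.0 * mu[0]   # E[T1] = sum(obs) + n_mis * mu
m_step = lambda t1: [t1 / 5.0]          # mu = E[T1] / n
mu_hat, iters = run_em(e_step, m_step, [0.0])
```

Capturing the inputs (data, initial guess, tolerance, iteration cap) and the outputs (final estimate, number of iterations) up front, as the slide suggests, makes the model-specific code a drop-in replacement for the two lambdas.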
