
Biostat 2065 Review


marin

Presentation Transcript


  1. Biostat 2065 Review November 2, 2009

  2. Data with missing values

  3. Taxonomy

  4. Complete-cases based methods (i)

  5. Complete-cases based methods (ii)

  6. Available-case analysis
     • Lack of self-consistency.
     • Not useful.
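One common reading of "lack of self-consistency" (an assumption here, since the slide does not elaborate) is that statistics computed from different subsets of cases need not fit together. The sketch below uses made-up data in which each pair of variables is observed on a different set of cases; the resulting pairwise correlation matrix has a negative eigenvalue, so it is not a valid correlation matrix. The helper `pairwise_corr` is hypothetical and written only for this illustration.

```python
import numpy as np

nan = np.nan
# Toy data (made up): each pair of variables is jointly observed on 4 cases.
data = np.array([
    [1, 1, nan], [2, 2, nan], [3, 3, nan], [4, 4, nan],   # x1, x2 observed
    [1, nan, 1], [2, nan, 2], [3, nan, 3], [4, nan, 4],   # x1, x3 observed
    [nan, 1, 4], [nan, 2, 3], [nan, 3, 2], [nan, 4, 1],   # x2, x3 observed
], dtype=float)

def pairwise_corr(a):
    """Correlation matrix where each entry uses all cases observed for that pair."""
    p = a.shape[1]
    r = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            both = ~np.isnan(a[:, j]) & ~np.isnan(a[:, k])
            r[j, k] = r[k, j] = np.corrcoef(a[both, j], a[both, k])[0, 1]
    return r

corr = pairwise_corr(data)
min_eig = np.linalg.eigvalsh(corr).min()
print(corr)
print("smallest eigenvalue:", min_eig)   # negative: not positive semidefinite
```

Here x1 correlates perfectly with both x2 and x3, yet x2 and x3 are perfectly negatively correlated, which no single joint distribution can produce.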

  7. Single imputation methods (i)
     • Explicit modeling:
       • Unconditional mean imputation.
       • Conditional mean imputation.
       • Stochastic regression imputation.
     Stochastic regression imputation is the best of these three because it yields consistent estimates of higher moments and covariances. However, standard errors computed from the filled-in data are still not correct.
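The contrast among the three explicit methods can be seen in a small simulation. This is a minimal sketch under assumptions that are not in the slides: simulated bivariate data with x fully observed, y missing completely at random, and a linear regression of y on x as the imputation model. The function name `impute` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (assumption): y = 1 + 2x + noise, so the true Var(y) is 5.
n = 2000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
missing = rng.random(n) < 0.4          # about 40% of y is set missing (MCAR)
obs = ~missing

# Regression of y on x fitted to the complete cases only.
slope, intercept = np.polyfit(x[obs], y[obs], 1)
resid_sd = np.std(y[obs] - (intercept + slope * x[obs]), ddof=2)

def impute(kind):
    """Return a copy of y with the missing entries filled in."""
    filled = y.copy()
    if kind == "mean":                 # unconditional mean imputation
        filled[missing] = y[obs].mean()
    elif kind == "regression":         # conditional mean imputation
        filled[missing] = intercept + slope * x[missing]
    elif kind == "stochastic":         # conditional mean plus a random residual
        noise = rng.normal(scale=resid_sd, size=missing.sum())
        filled[missing] = intercept + slope * x[missing] + noise
    return filled

var_by_method = {k: impute(k).var(ddof=1)
                 for k in ("mean", "regression", "stochastic")}
for k, v in var_by_method.items():
    print(f"{k:>10}: Var(y) estimate = {v:.2f}")
```

In this setup the two mean-based methods understate the variance of y, while the stochastic version restores the residual spread and lands near the true value.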

  8. Single imputation methods (ii)
     • Implicit modeling:
       • Hot-deck imputation.
       • Nearest-neighbor hot deck.
     These are ad hoc approaches. They are intuitive and easy to carry out, but their performance is difficult to evaluate.
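A nearest-neighbor hot deck can be sketched in a few lines. The version below assumes a single fully observed covariate x and absolute distance as the matching metric; both choices, and the function name `nearest_neighbor_hot_deck`, are illustrative assumptions rather than anything specified in the slides.

```python
import numpy as np

def nearest_neighbor_hot_deck(x, y):
    """Fill each missing y (NaN) with the y of the donor whose x is closest."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    donors = np.flatnonzero(~np.isnan(y))      # cases with y observed
    filled = y.copy()
    for i in np.flatnonzero(np.isnan(y)):      # cases needing a value
        nearest = donors[np.argmin(np.abs(x[donors] - x[i]))]
        filled[i] = y[nearest]
    return filled

x = [1.0, 2.2, 3.0, 4.4, 5.0]
y = [10.0, np.nan, 30.0, np.nan, 50.0]
print(nearest_neighbor_hot_deck(x, y))   # fills with 30.0 and 50.0
```

Because every imputed value is an actually observed value, the filled-in data stay within the observed range, which is part of the method's intuitive appeal.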

  9. Inference on single imputation methods
     • Explicit standard error estimates are available under certain sampling/imputation schemes, e.g., when unbiased imputation is carried out within each ultimate cluster.
     • Bootstrap standard errors.
     • Jackknife standard errors.
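For the resampling approaches, the key point is that the imputation is repeated inside every resample, so that the standard error reflects the imputation step. The following is a rough sketch assuming simulated MCAR data, conditional mean imputation, and the mean of y as the target; `imputed_mean` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)

def imputed_mean(x, y):
    """Mean of y after conditional mean (regression) imputation of NaNs."""
    obs = ~np.isnan(y)
    slope, intercept = np.polyfit(x[obs], y[obs], 1)
    filled = np.where(obs, y, intercept + slope * x)
    return filled.mean()

# Simulated data (assumption): about 30% of y missing completely at random.
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[rng.random(n) < 0.3] = np.nan

estimate = imputed_mean(x, y)

# Bootstrap: resample whole cases (including incomplete ones), then re-impute.
B = 500
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = imputed_mean(x[idx], y[idx])
se_boot = boot.std(ddof=1)

# Jackknife: drop one case at a time, then re-impute.
keep = ~np.eye(n, dtype=bool)
jack = np.array([imputed_mean(x[k], y[k]) for k in keep])
se_jack = np.sqrt((n - 1) / n * np.sum((jack - jack.mean()) ** 2))

print(f"estimate = {estimate:.3f}, bootstrap SE = {se_boot:.3f}, "
      f"jackknife SE = {se_jack:.3f}")
```

The two standard errors should roughly agree for a smooth statistic like this one.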

  10. Multiple imputation
     • Multiple imputation accounts for the between-imputation variation.
     • It is very effective when the estimate is normally distributed. Only a handful of imputations are necessary.
     • However, variability in the prediction model still needs to be taken into account. For example, for bivariate data with MCAR nonresponse, this can be done by bootstrapping the complete cases before sampling from the predictive distribution.
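The bootstrap idea in the last bullet can be sketched as follows. The assumptions here are not from the slides: simulated bivariate MCAR data, a linear regression as the prediction model, M = 5 imputations, and the standard combining rule in which the total variance is the average within-imputation variance plus (1 + 1/M) times the between-imputation variance.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated bivariate data with MCAR nonresponse on y (assumption).
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
missing = rng.random(n) < 0.4
cc = np.flatnonzero(~missing)              # indices of the complete cases

M = 5                                      # a handful of imputations
estimates = np.empty(M)
within = np.empty(M)
for m in range(M):
    # Bootstrap the complete cases so the fitted model varies across imputations.
    idx = rng.choice(cc, size=cc.size, replace=True)
    slope, intercept = np.polyfit(x[idx], y[idx], 1)
    resid_sd = np.std(y[idx] - (intercept + slope * x[idx]), ddof=2)
    # Draw the missing values from the fitted predictive distribution.
    filled = y.copy()
    filled[missing] = (intercept + slope * x[missing]
                       + rng.normal(scale=resid_sd, size=missing.sum()))
    estimates[m] = filled.mean()           # completed-data estimate of E[y]
    within[m] = filled.var(ddof=1) / n     # its completed-data variance

q_bar = estimates.mean()                   # combined point estimate
w_bar = within.mean()                      # within-imputation variance
b = estimates.var(ddof=1)                  # between-imputation variance
total_var = w_bar + (1 + 1 / M) * b

print(f"estimate = {q_bar:.3f}, SE = {np.sqrt(total_var):.3f}")
```

Refitting on a bootstrap sample for each imputation is what lets the between-imputation term pick up uncertainty about the regression coefficients, not just residual noise.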

  11. Likelihood-based Inference

  12. When data are MAR and ……

  13. Factor likelihood method for monotone data with an ignorable mechanism

  14. Sweep operator
     • Useful for analysis of multivariate normal data.
     • By applying sweep and reverse sweep operators, the parameters for a multivariate normal distribution can be derived from the marginal distribution of a “baseline” variable and a sequence of conditional normal regressions.
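A minimal sketch of the two operators, using the usual textbook definition for a symmetric matrix (the function names and the 2 × 2 example are illustrative assumptions). Sweeping a covariance matrix on the first variable turns the remaining entries into the regression slope and residual variance of the second variable given the first; the reverse sweep undoes this.

```python
import numpy as np

def sweep(g, k):
    """Sweep the symmetric matrix g on pivot position k."""
    g = np.asarray(g, dtype=float)
    pivot = g[k, k]
    h = g - np.outer(g[:, k], g[k, :]) / pivot   # h_jl = g_jl - g_jk g_kl / g_kk
    h[:, k] = g[:, k] / pivot                    # h_jk = g_jk / g_kk
    h[k, :] = g[k, :] / pivot
    h[k, k] = -1.0 / pivot                       # h_kk = -1 / g_kk
    return h

def reverse_sweep(g, k):
    """Inverse of sweep: reverse_sweep(sweep(g, k), k) returns g."""
    g = np.asarray(g, dtype=float)
    pivot = g[k, k]
    h = g - np.outer(g[:, k], g[k, :]) / pivot
    h[:, k] = -g[:, k] / pivot                   # sign flipped relative to sweep
    h[k, :] = -g[k, :] / pivot
    h[k, k] = -1.0 / pivot
    return h

# Covariance matrix of (x, y): Var(x) = 4, Cov(x, y) = 2, Var(y) = 3.
cov = np.array([[4.0, 2.0],
                [2.0, 3.0]])
swept = sweep(cov, 0)
print(swept)                     # [[-0.25, 0.5], [0.5, 2.0]]
print(reverse_sweep(swept, 0))   # recovers cov
```

In the swept matrix, 0.5 is the slope of y on x (2 / 4) and 2.0 is the residual variance (3 − 2² / 4).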

  15. Computation algorithms
     • Newton-Raphson
       • Fast convergence rate in the neighborhood of the MLE.
       • Unstable.
     • EM
       • Designed for analysis of data with missing values.
       • Stable: each iteration increases the likelihood function.
       • Slow convergence.
       • Extensions: ECM, ECME, and PXEM.
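The stability property of EM can be checked numerically. The sketch below is an illustration under stated assumptions (simulated bivariate normal data, x fully observed, y missing completely at random, and a deliberately poor starting value); it tracks the observed-data log-likelihood across iterations. The helper `observed_loglik` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated bivariate normal data (assumption); y is used only where obs is True.
n = 400
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
obs = rng.random(n) >= 0.4

def observed_loglik(mu, cov):
    """Log-likelihood of the observed data: all x, plus y given x where observed."""
    beta = cov[0, 1] / cov[0, 0]
    res_var = cov[1, 1] - beta * cov[0, 1]
    ll_x = -0.5 * np.sum(np.log(2 * np.pi * cov[0, 0])
                         + (x - mu[0]) ** 2 / cov[0, 0])
    r = y[obs] - (mu[1] + beta * (x[obs] - mu[0]))
    ll_y = -0.5 * np.sum(np.log(2 * np.pi * res_var) + r ** 2 / res_var)
    return ll_x + ll_y

mu = np.array([0.0, 0.0])        # crude starting values
cov = np.eye(2)
logliks = [observed_loglik(mu, cov)]
for _ in range(50):
    # E-step: expected y and y^2 for the missing cases given x and current values.
    beta = cov[0, 1] / cov[0, 0]
    res_var = cov[1, 1] - beta * cov[0, 1]
    ey = np.where(obs, y, mu[1] + beta * (x - mu[0]))
    ey2 = np.where(obs, y ** 2, ey ** 2 + res_var)
    # M-step: complete-data ML estimates from the expected sufficient statistics.
    mu = np.array([x.mean(), ey.mean()])
    sxx = np.mean(x ** 2) - mu[0] ** 2
    sxy = np.mean(x * ey) - mu[0] * mu[1]
    syy = np.mean(ey2) - mu[1] ** 2
    cov = np.array([[sxx, sxy], [sxy, syy]])
    logliks.append(observed_loglik(mu, cov))

print("log-likelihood never decreases:", bool(np.all(np.diff(logliks) >= -1e-6)))
print("estimated mean of y:", mu[1], "slope:", cov[0, 1] / cov[0, 0])
```

The log-likelihood trace rises quickly at first and then flattens, which matches the slide's summary of EM as stable but slow near convergence.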

  16. Large-sample inference

  17. SEM and Other methods
