Multiple Imputation using SAS

Multiple Imputation using SAS. Don Miller 812 Oswald Tower miller@pop.psu.edu 814-863-3155. Introduction. Missing values occur often in research: refused/don’t know, attrition, skip patterns…


Presentation Transcript


  1. Multiple Imputation using SAS Don Miller 812 Oswald Tower miller@pop.psu.edu 814-863-3155

  2. Introduction
     • Missing values occur often in research: refused/don’t know responses, attrition, skip patterns…
     • Dropping cases with missing values may bias results (e.g. women and overweight respondents tend to disclose their weight less often than others)
     • Imputation attempts to “fill in” the missing values
     • Single imputation (e.g. with the mean) is biased and gives no measure of the uncertainty introduced by imputing

  3. Multiple Imputation: Simple Procedure
     • For categorical variables: construct binary dummy variables, leaving out the reference category (e.g. Race: 1=“white”, 2=“black”, 3=“other” becomes the variables Black and Other)
     • Impute using PROC MI
     • Round off imputed dummies if you want plausible values (this will bias your results)
     • Do the analysis (PROC REG, LOGISTIC, etc.), adding by _imputation_; to the procedure
     • Combine results using PROC MIANALYZE
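
The dummy-variable step above can be sketched in a DATA step. This is a minimal illustration, not part of the original slides: the input data set name rawdat0 and the variable name race are assumptions, and the coding follows the 1/2/3 example on the slide.

```sas
/* Hypothetical input: rawdat0, with race coded 1=white, 2=black, 3=other */
data rawdat;
  set rawdat0;
  if missing(race) then do;
    /* Keep the dummies missing so PROC MI can impute them */
    black = .;
    other = .;
  end;
  else do;
    black = (race = 2);   /* 1 if black, else 0 */
    other = (race = 3);   /* 1 if other, else 0 */
  end;
run;
```

White is the omitted reference category, so a respondent with black=0 and other=0 is white.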

  4. PROC MI
     • Typical syntax:
         proc mi data=rawdat seed=8633155 out=impdat;
           var sex black other age drivesfast;
         run;
     • data= one copy of the data, with missing values
     • out= five copies of the data with imputed values (the imputed values differ across copies)
     • seed= random seed; keep it the same to reproduce your results
     • var: variables with missing values you need imputed, the other variables in your model, and any that may help with the imputation

  5. PROC MI Sample Output

  6. PROC MI Sample Output

  7. PROC MI Options
     • nimpute=5: number of imputations, default=5; nimpute=0 gives the missing-data patterns only
     • minimum=0 0 0 0 and maximum=1 1 1 90: set min & max for imputed values; sometimes doesn’t converge as well
     • round=1 1 1 0.01: round-off option
     • alpha=0.05: significance level for the confidence limits
     • mu0=0.5 0.5 0.5 25: null hypothesis μ=μ0 for the t test
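
As a rough sketch of how these options fit together in one call, using the data set and variables from slide 4. The slide lists four values per option while the VAR statement has five variables, so the fifth value in each list (bounds of 0 and 1 and rounding to 1 for drivesfast, treated here as a 0/1 variable) is an assumption added for illustration.

```sas
/* Step 1: nimpute=0 prints the missing-data patterns without imputing */
proc mi data=rawdat nimpute=0;
  var sex black other age drivesfast;
run;

/* Step 2: impute with bounds and rounding.
   Values are matched to the VAR statement in order;
   the fifth value in each list is an assumption (drivesfast as 0/1). */
proc mi data=rawdat seed=8633155 nimpute=5 out=impdat
        minimum=0 0 0 0 0
        maximum=1 1 1 90 1
        round=1 1 1 0.01 1;
  var sex black other age drivesfast;
run;
```

If the bounded run has trouble converging, as the slide warns can happen, dropping minimum= and maximum= and checking the imputed ranges afterwards is one alternative.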

  8. PROC MI Statements
     • em maxiter=200 out=emdata; EM algorithm, MLE with missing data
     • freq fweight; weights observations by the frequency weight
     • mcmc (options); modifies the imputation method
     • class sex race; specifies categorical variables, so you don’t need dummies (new / experimental)
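
A minimal sketch of the em and mcmc statements inside a PROC MI step, reusing the data set and variables from slide 4. The specific MCMC settings shown (chain=multiple, nbiter=200) are illustrative choices, not recommendations from the slides.

```sas
proc mi data=rawdat seed=8633155 nimpute=5 out=impdat;
  /* EM: allow up to 200 iterations; OUT= saves a data set in which
     missing values are replaced by their EM-based expected values */
  em maxiter=200 out=emdata;
  /* MCMC: a separate chain for each imputation, 200 burn-in iterations */
  mcmc chain=multiple nbiter=200;
  var sex black other age drivesfast;
run;
```

Note that emdata holds a single EM-filled copy of the data, while impdat holds the multiply imputed copies used in the later analysis steps.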

  9. Regression
     • Fit your model as if the data had no missing values, using by _imputation_;
         proc reg data=impdat outest=parmcov covout;
           model drivesfast=sex black other age;
           by _imputation_;
         run;
     • You’ll get nimpute (usually 5) sets of output
     • Estimates, covariances, and standard errors will be combined in MIANALYZE (R² is just the mean across imputations)
     • You need to generate a data set of parameter estimates and covariances (how varies by procedure)

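  10. Parameter Est. & Covariance Matrix
     • PROC LOGISTIC:
         proc logistic data=impdat descending;
           model drivesfast=sex black other age /covb;
           by _imputation_;
           ods output ParameterEstimates=parmsdat CovB=covbdat;
         run;
     • PROC MIXED (save the fixed-effects table SolutionF and CovB; in PROC MIANALYZE, read this CovB table with covb(effectvar=rowcol)=covbdat):
         proc mixed data=impdat;
           model drivesfast=sex black other age /solution covb;
           by _imputation_;
           ods output SolutionF=parmsdat CovB=covbdat;
         run;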

  11. Parameter Est. & Covariance Matrix
     • PROC GENMOD:
         proc genmod data=impdat;
           model drivesfast=sex black other age /covb;
           by _imputation_;
           ods output ParameterEstimates=parmsdat CovB=covbdat;
         run;
     • PROC GLM:
         proc glm data=impdat;
           model drivesfast=sex black other age /inverse;
           by _imputation_;
           ods output ParameterEstimates=parmsdat InvXPX=xpxidat;
         run;

  12. PROC MIANALYZE
     • Syntax depends on which procedure you used in the previous step. Start with one of:
         proc mianalyze data=parmcov;
       or
         proc mianalyze parms=parmsdat covb=covbdat;
       or
         proc mianalyze parms=parmsdat xpxi=xpxidat;
       then continue with:
           modeleffects intercept sex black other age;
         run;
     • Note the “var” statement is now “modeleffects”
     • Note that the dependent variable is omitted
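
For reference, the combination step follows the standard multiple-imputation combining rules (Rubin's rules). The notation below is not from the slides and is assumed for illustration: m is the number of imputations, \hat{Q}_i is a parameter estimate from imputed data set i, and \hat{U}_i is its estimated variance.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m} \hat{Q}_i
  && \text{(combined point estimate)} \\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m} \hat{U}_i
  && \text{(within-imputation variance)} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m} \left(\hat{Q}_i - \bar{Q}\right)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \left(1 + \frac{1}{m}\right) B
  && \text{(total variance)}
\end{align*}
```

The combined standard error is \sqrt{T}. The between-imputation term B is what single imputation leaves out, which is why it understates uncertainty.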

  13. PROC MIANALYZE Output
