Practical Missing Data Analysis in SPSS (v17 onwards)

Practical Missing Data Analysis in SPSS (v17 onwards). Peter T. Donnan, Professor of Epidemiology and Biostatistics. Objectives: how to impute missing values in SPSS, specifically MI; how to implement analyses with multiple imputed values; interpretation of the output; practical tips.

Presentation Transcript


  1. Practical Missing Data Analysis in SPSS (v17 onwards) Peter T. Donnan Professor of Epidemiology and Biostatistics

  2. Objectives • How to impute missing values in SPSS, specifically MI • How to implement analyses with multiple imputed values • Interpretation of the output • Practical tips

  3. Example data • From trial of pedometers + advice vs advice vs controls in sedentary elderly women • Follow-up at 3 and 6 months • Main outcome measure of activity from accelerometer counts • 210 randomised / 170 at 3 months

  4. Example data – Pedometer trial • Read in data ‘SPSS Study databse.sav’ • Main outcome is 3-month activity: AccelVM2 • Baseline activity: AccelVM1a • Trial arm represented by two dummy variables: Grp1 = pedometer vs. control, Grp2 = advice vs. control

  5. Main analysis – Pedometer trial Regression on 3 months activity adjusting for baseline activity and two dummy variables representing trial arm contrasts
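
A minimal sketch of this kind of model, written in Python rather than SPSS and run on made-up data (not the trial data). The variable names `baseline`, `grp1` and `grp2` stand in for AccelVM1a, Grp1 and Grp2; the coefficients used to simulate the outcome are arbitrary assumptions.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data for illustration only.
rng = random.Random(1)
rows, y = [], []
for i in range(210):
    arm = i % 3                          # 0 = control, 1 = pedometer, 2 = advice
    grp1 = 1.0 if arm == 1 else 0.0      # dummy: pedometer vs. control
    grp2 = 1.0 if arm == 2 else 0.0      # dummy: advice vs. control
    baseline = rng.uniform(50, 150)
    rows.append([1.0, baseline, grp1, grp2])
    y.append(10 + 0.8 * baseline + 5 * grp1 + 2 * grp2 + rng.gauss(0, 10))

# Order of coefficients: intercept, baseline, grp1, grp2
coef = ols(rows, y)
print(coef)
```

The coefficients on `grp1` and `grp2` are the baseline-adjusted contrasts of each active arm against control.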

  6. Main analysis – Pedometer trial Note that the complete case analysis uses n = 170, with 40 of the 210 randomised missing, so there is potential for bias

  7. Missing at Random (MAR) • Prob(Missing): 1) is independent of the unobserved data but 2) may depend on the observed data • Essentially, the observed data are a random sample of the full data within each stratum defined by the observed data • MAR is a weaker assumption than MCAR (missing completely at random) • If MAR is assumed, many methods are available to impute data using the observed data
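
In symbols, using notation that is not on the slide: let R be the missingness indicator and split the data into an observed part and an unobserved part. The two assumptions can then be written as:

```latex
% R = missingness indicator; Y_obs, Y_mis = observed and unobserved data
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}})
\end{align*}
```

MCAR implies MAR, which is why MAR is the weaker assumption.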

  8. Comparison of completers at 3 months and drop-outs

  9. Execution of MI in SPSS So, assuming MAR, we can use the available data to predict missing values in SPSS: Analyze > Multiple Imputation > Impute Missing Data Values

  10. Execution of MI in SPSS • Enter ALL variables you think are associated with missingness • Note default number of imputations = 5 • Create new dataset to store results • Note icon indicating procedures that allow MI analysis

  11. Execution of MI in SPSS • Automatic method lets SPSS choose • Custom gives more flexibility • Can include all 2-way interactions • Linear Regression model prediction
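
To illustrate the idea of imputing from a linear regression prediction, here is a simplified Python sketch on made-up data. It is not SPSS's algorithm, which is more elaborate and is not reproduced here. One deliberate simplification: the regression parameters are held fixed at their estimates, whereas a proper MI procedure would also draw them afresh for each imputation. The functions `fit_line` and `impute` are hypothetical helpers.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus residual SD."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sd = (sum(e * e for e in resid) / (len(x) - 2)) ** 0.5
    return intercept, slope, sd


def impute(baseline, outcome, m=5, seed=0):
    """Return m completed copies of `outcome`; None marks a missing value."""
    rng = random.Random(seed)
    obs = [(b, o) for b, o in zip(baseline, outcome) if o is not None]
    intercept, slope, sd = fit_line([b for b, _ in obs], [o for _, o in obs])
    completed = []
    for _ in range(m):
        # Observed values are kept; missing ones get prediction + random noise.
        completed.append([
            o if o is not None else intercept + slope * b + rng.gauss(0, sd)
            for b, o in zip(baseline, outcome)
        ])
    return completed


# Synthetic data for illustration only: 210 subjects, last 40 outcomes missing.
rng = random.Random(42)
baseline = [rng.uniform(50, 150) for _ in range(210)]
outcome = [5 + 0.9 * b + rng.gauss(0, 10) for b in baseline]
for i in range(170, 210):
    outcome[i] = None

datasets = impute(baseline, outcome)   # m = 5, matching the SPSS default
print(len(datasets), len(datasets[0]))
```

The random noise term is what makes the five completed datasets differ from one another, so that the uncertainty about the missing values carries through to the pooled analysis.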

  12. Execution of MI in SPSS • List of variables chosen • Define each variable as imputed, as a predictor, or BOTH • N.B. Recommend including the OUTCOME as both a predictor and an imputed variable

  13. Output of MI in SPSS Note main interest in outcome VM2 but other factors with missing values also imputed

  14. Step 2 - Using imputed datasets in analysis • The new dataset has the IMPUTATION number as its first column • It contains the original dataset (n = 210) with IMPUTATION = 0, followed by 5 further datasets (each n = 210) with imputed values, IMPUTATION = 1 to 5 • Most analyses can now be run wherever the fossil-shell spiral symbol is present
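
The stacked layout can be sketched in Python as follows. The column name `imputation`, the helper `stack` and the toy rows are assumptions for illustration; they are not taken from the SPSS output.

```python
def stack(original, completed):
    """Stack the original data (imputation 0) above the completed copies (1..m)."""
    stacked = [{"imputation": 0, **row} for row in original]
    for k, data in enumerate(completed, start=1):
        stacked.extend({"imputation": k, **row} for row in data)
    return stacked


# Toy example: 2 subjects, one with a missing outcome (None), 5 imputations.
original = [{"id": 1, "vm2": 120.0}, {"id": 2, "vm2": None}]
completed = [
    [{"id": 1, "vm2": 120.0}, {"id": 2, "vm2": 98.5 + k}]  # made-up imputed values
    for k in range(5)
]

stacked = stack(original, completed)
print(len(stacked))  # (m + 1) * n rows
```

With n = 210 and 5 imputations, the same layout gives 6 × 210 = 1260 rows.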

  15. Repeat main analysis – Need pooled results • Procedure exactly the same as before • SPSS will do the pooled analysis if the icon (above) is present in the drop-down menu

  16. Pooled Analysis in SPSS Results presented for the original data and for each imputed dataset separately

  17. Results of pooled analysis from 5 imputed datasets • Larger effect sizes in both groups than in the complete case analysis • Greater power, since all 210 randomised are analysed, gives smaller p-values
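
Pooling across imputed datasets is conventionally done with Rubin's rules: average the estimates, then combine the average within-imputation variance with the between-imputation variance. A minimal Python sketch follows; the estimates and standard errors are made-up numbers, not the trial results, and `pool` is a hypothetical helper.

```python
import statistics


def pool(estimates, variances):
    """Pool m estimates and their squared standard errors using Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)           # pooled estimate
    within = statistics.fmean(variances)          # average within-imputation variance
    between = statistics.variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between        # total variance
    return q_bar, total ** 0.5                    # pooled estimate and its standard error


# Made-up coefficient estimates and standard errors from 5 imputed datasets.
estimates = [4.8, 5.3, 5.1, 4.6, 5.2]
std_errors = [1.9, 2.0, 1.8, 2.1, 1.9]

est, se = pool(estimates, [s ** 2 for s in std_errors])
print(est, se)
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation term adds the uncertainty due to the missing values.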

  18. Interpretation • Compare pooled results with the original as a form of sensitivity analysis • If the results are similar, this suggests the original results are fairly robust • Consider whether MAR is a reasonable assumption • Consider whether you have included all factors related to the missingness (including the outcome) in the imputation model, as this is a crucial assumption

  19. Summary • SPSS now includes multiple imputation in its armoury • Consider the assumptions of MI • Compare results under different assumptions to assess robustness of results • If the MAR assumption is reasonable, then MI provides results that are less biased than complete case analysis
