Practical Missing Data Analysis in SPSS (v17 onwards) Peter T. Donnan Professor of Epidemiology and Biostatistics.


1 Practical Missing Data Analysis in SPSS (v17 onwards) Peter T. Donnan Professor of Epidemiology and Biostatistics

2 Objectives
- How to impute missing values in SPSS, specifically multiple imputation (MI)
- How to implement analyses with multiply imputed values
- Interpretation of the output
- Practical tips

3 Example data
- From a trial of pedometers + advice vs. advice vs. controls in sedentary elderly women
- Follow-up at 3 and 6 months
- Main outcome measure of activity from accelerometer counts
- 210 randomised / 170 at 3 months

4 Example data – Pedometer trial
- Read in data ‘SPSS Study databse.sav’
- Main outcome is 3-month activity: AccelVM2
- Baseline activity: AccelVM1a
- Trial arm represented by two dummy variables: Grp1 = pedometer vs. control; Grp2 = advice vs. control

5 Main analysis – Pedometer trial Regression on 3 months activity adjusting for baseline activity and two dummy variables representing trial arm contrasts

6 Main analysis – Pedometer trial Note that n = 170 in the complete case analysis, with 40 missing, so there is potential for bias

7 Missing at Random (MAR)
- Prob(Missing) is: 1) independent of the unobserved data, given the observed data, but 2) may depend on the observed data
- Essentially, the observed data are a random sample of the full data within each stratum
- MAR is a weaker assumption than MCAR (missing completely at random)
- If MAR is assumed, many methods are available to impute missing values from the observed data
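The MAR and MCAR conditions can be written compactly. The notation below is an assumption, since the slide defines no symbols: R is the missingness indicator, and Y_obs and Y_mis are the observed and unobserved parts of the data.

```latex
% MAR: missingness may depend on observed data, but not on unobserved data
P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}})

% MCAR (stronger): missingness depends on neither
P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R)
```

MCAR implies MAR, which is why MAR is the weaker of the two assumptions.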

8 Comparison of completers at 3 months and drop-outs

Characteristic                            Completers (n = 172)   Dropped out at 3 months (n = 32)   p-value (chi-squared or t-test)
Age, mean (SD)                            77.1 (5.0)             78.5 (5.6)                         0.137
Accelerometer VM, mean (SD)               130695 (47991)         113381 (50444)                     0.065
Limb function, mean (SD)                  8.69 (2.25)            7.41 (2.86)                        0.028
NHS costs previous 3 months, mean (SD)    £199.59 (306.74)       £404.29 (1289.54)                  0.402
Pedometer group, N (%)                    58 (85.3%)             10 (14.7%)                         0.052
BCI group, N (%)                          52 (77.6%)             15 (22.4%)
Control group, N (%)                      62 (92.5%)             5 (7.5%)
Stairs difficult: Yes                     48 (76.2%)             15 (23.8%)                         0.033
Stairs difficult: No                      124 (87.9%)            17 (12.1%)
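The categorical comparisons on slide 8 can be checked by hand. The Python sketch below uses only the counts printed on the slide and assumes a Pearson chi-squared test without continuity correction; the slide does not say which variant produced its p-values. The function names are illustrative, and the p-value helper covers only the two closed-form cases needed here (df = 1 and df = 2).

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi2_p_value(stat, df):
    """Upper-tail p-value using the closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Counts copied from the slide: (completers, dropped out) per row.
trial_arm = [[58, 10], [52, 15], [62, 5]]   # pedometer, BCI, control
stairs = [[48, 15], [124, 17]]              # stairs difficult: yes, no

for label, table in [("Trial arm", trial_arm), ("Stairs difficult", stairs)]:
    stat, df = pearson_chi2(table)
    print(f"{label}: chi2 = {stat:.2f}, df = {df}, p = {chi2_p_value(stat, df):.3f}")
```

Under these assumptions the p-values round to the 0.052 and 0.033 shown on the slide. This suggests 0.052 is an overall test across the three trial arms, not a test of the pedometer group alone.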

9 Execution of MI in SPSS So, assuming MAR, we can use the available data to predict missing values in SPSS: Analyze > Multiple Imputation > Impute Missing Data Values

10 Execution of MI in SPSS
- Enter ALL variables you think are associated with missingness
- Note the default number of imputations = 5
- Create a new dataset to store the results
- Note the icon indicating procedures that allow MI analysis

11 Execution of MI in SPSS
- Automatic method lets SPSS choose
- Custom gives more flexibility
- Can include all 2-way interactions
- Linear regression model prediction
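To make the idea of linear regression model prediction concrete, here is a minimal Python sketch of stochastic regression imputation. It is not SPSS's algorithm. It fits a one-predictor regression to the complete cases and fills each missing follow-up value with the prediction plus random residual noise, repeated five times. A proper MI procedure would also draw the regression parameters from their posterior distribution, which is omitted here. The data are made up for illustration and the function names are hypothetical.

```python
import random
import statistics

def fit_simple_regression(x, y):
    """Least-squares intercept, slope and residual SD for y ~ x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    resid_sd = (sum(r ** 2 for r in residuals) / (len(x) - 2)) ** 0.5
    return intercept, slope, resid_sd

def impute_once(baseline, followup, rng):
    """Return a copy of followup with each None replaced by prediction + noise."""
    observed = [(b, f) for b, f in zip(baseline, followup) if f is not None]
    intercept, slope, resid_sd = fit_simple_regression(
        [b for b, _ in observed], [f for _, f in observed]
    )
    return [
        f if f is not None else intercept + slope * b + rng.gauss(0, resid_sd)
        for b, f in zip(baseline, followup)
    ]

# Made-up illustrative values (NOT the trial data); None marks a missing follow-up.
baseline = [100.0, 120.0, 90.0, 150.0, 130.0, 110.0, 95.0, 140.0]
followup = [105.0, 118.0, None, 160.0, 128.0, None, 99.0, 150.0]

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
imputed_datasets = [impute_once(baseline, followup, rng) for _ in range(5)]
```

Each of the five lists keeps the observed values unchanged and differs only in the imputed positions. That between-dataset variation is what carries the uncertainty about the missing values into the pooled analysis.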

12 Execution of MI in SPSS
- List of variables chosen
- Define each variable as imputed, as a predictor, or BOTH
- N.B. Recommend setting the OUTCOME to BOTH (imputed and used as a predictor)

13 Output of MI in SPSS Note main interest in outcome VM2 but other factors with missing values also imputed

14 Step 2 - Using imputed datasets in analysis
- Note the new dataset has the IMPUTATION number as its first column
- It contains, in order, the original dataset (n = 210) with IMPUTATION = 0 and, concatenated below it, a further 5 new datasets (each n = 210) with imputed values, IMPUTATION = 1 to 5
- Most analyses can now be run on the imputed data, provided the procedure shows the spiral (fossil shell) symbol
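The stacked layout can be sketched in a few lines of Python. This example assumes the rows have been exported as dictionaries and that the imputation column is named `Imputation_`; both are assumptions for illustration. The rows are made up, and only two imputations are shown for brevity where the slide has five.

```python
from collections import defaultdict

def split_by_imputation(rows, key="Imputation_"):
    """Split stacked rows into the original data (0) and imputed datasets (1..m)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    original = groups.pop(0, [])
    return original, dict(sorted(groups.items()))

# Made-up stacked rows: two participants, original data plus two imputations.
stacked = [
    {"Imputation_": 0, "id": 1, "AccelVM2": 120000.0},
    {"Imputation_": 0, "id": 2, "AccelVM2": None},       # missing in original
    {"Imputation_": 1, "id": 1, "AccelVM2": 120000.0},
    {"Imputation_": 1, "id": 2, "AccelVM2": 101500.0},   # imputed value
    {"Imputation_": 2, "id": 1, "AccelVM2": 120000.0},
    {"Imputation_": 2, "id": 2, "AccelVM2": 97300.0},    # a different imputed value
]

original, imputed = split_by_imputation(stacked)
print(len(original), sorted(imputed))
```

The analysis model is then fitted once per imputed dataset and the results are combined, which is what SPSS does automatically for procedures that support pooling.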

15 Repeat main analysis – Need pooled results
- Procedure exactly the same as before
- SPSS will do the pooled analysis if the icon (above) is present in the drop-down menu

16 Pooled Analysis in SPSS Results presented for the original data and for each imputed dataset separately

17 Results of pooled analysis from 5 imputed datasets

Model (pooled)     B        SE      t        Sig.    Fraction missing
Constant           15607    7808    1.999    0.047   0.173
AccelVM1a          0.852    0.051   16.630   0.000   0.124
Pedometer group    11310    6131    1.845    0.066   0.138
Advice only        17536    6526    2.687    0.009   0.266

- Larger effect sizes in both groups (compared with the complete case analysis)
- Greater power gives more significance
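Pooled estimates of this kind are conventionally obtained with Rubin's rules. The Python sketch below shows the basic calculation for one coefficient, assuming the per-imputation estimates and standard errors are already available. The input numbers are made up, not the trial's, and `pool_rubin` is a hypothetical helper name. The `missing_fraction` it returns is the simple share of total variance due to between-imputation variation. The "Fraction missing" column SPSS reports may include a degrees-of-freedom adjustment, so the two need not match exactly.

```python
import math
import statistics

def pool_rubin(estimates, std_errors):
    """Combine one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)                       # pooled estimate
    within = statistics.fmean([se ** 2 for se in std_errors])  # mean squared SE
    between = statistics.variance(estimates)                   # variance across imputations
    total = within + (1 + 1 / m) * between                     # total variance
    se = math.sqrt(total)
    return {
        "estimate": pooled,
        "se": se,
        "t": pooled / se,
        "missing_fraction": (1 + 1 / m) * between / total,
    }

# Made-up per-imputation results for a single coefficient (m = 5).
estimates = [10.0, 12.0, 11.0, 13.0, 9.0]
std_errors = [4.0, 4.0, 4.0, 4.0, 4.0]

result = pool_rubin(estimates, std_errors)
print({name: round(value, 3) for name, value in result.items()})
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates differ across imputations. That is how the uncertainty about the missing values enters the final inference.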

18 Interpretation
- Compare pooled results with the original as a form of sensitivity analysis
- If the results are similar, this suggests the original results are fairly robust
- Consider whether MAR is a reasonable assumption
- Consider whether you have included all factors (including the outcome) related to missingness in the imputation model; this is a crucial assumption

19 Summary
- SPSS now includes multiple imputation in its armoury
- Consider the assumptions of MI
- Compare results under different assumptions to assess the robustness of results
- If the MAR assumption is reasonable, MI provides results that are less biased than complete case analysis

