Basic output through to regression models.
Title:
  Stata2Mplus conversion for ego_ghq12_id.dta.dta
  List of variables converted shown below

  ghq01 : ghq time1 item 1
  ghq02 : ghq time1 item 2
  ghq03 : ghq time1 item 3
  ghq04 : ghq time1 item 4
  ghq05 : ghq time1 item 5
  ghq06 : ghq time1 item 6
  ghq07 : ghq time1 item 7
  ghq08 : ghq time1 item 8
  ghq09 : ghq time1 item 9
  ghq10 : ghq time1 item 10
  ghq11 : ghq time1 item 11
  ghq12 : ghq time1 item 12
  f1 : Scores for factor 1
  id :

Data:
  File is "C:\work\courses\mar09_course\ego\ego_ghq12_id.dta.dat" ;
  listwise is on;

Variable:
  Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  !usevariables = ghq01 ghq03 ghq05 ghq07 ghq09 ghq11;
  usevariables = ghq02 ghq04 ghq06 ghq08 ghq10 ghq12;
  idvariable = id;

Analysis:
  Type = basic ;

output:
  !sampstat;

plot:
  type is plot3;

savedata:
  file is "C:\work\courses\mar09_course\ego\ego_odd.dat" ;
Output

INPUT READING TERMINATED NORMALLY

Stata2Mplus conversion for ego_ghq12_id.dta.dta
List of variables converted shown below

ghq01 : ghq time1 item 1
ghq02 : ghq time1 item 2
ghq03 : ghq time1 item 3
ghq04 : ghq time1 item 4
ghq05 : ghq time1 item 5
ghq06 : ghq time1 item 6
ghq07 : ghq time1 item 7
ghq08 : ghq time1 item 8
ghq09 : ghq time1 item 9
ghq10 : ghq time1 item 10
ghq11 : ghq time1 item 11
ghq12 : ghq time1 item 12
f1 : Scores for factor 1
id :
SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                        1119

Number of dependent variables                                    6
Number of independent variables                                  0
Number of continuous latent variables                            0

Observed dependent variables

  Continuous
   GHQ02       GHQ04       GHQ06       GHQ08       GHQ10       GHQ12

Variables with special functions

  ID variable           ID

Estimator                                                       ML
Information matrix                                        OBSERVED
Maximum number of iterations                                  1000
Convergence criterion                                    0.500D-04
Maximum number of steepest descent iterations                   20
Maximum number of iterations for H1                           2000
Convergence criterion for H1                             0.100D-03

Input data file(s)
  C:\work\courses\mar09_course\ego\ego_ghq12_id.dta.dat

Input data format  FREE
SUMMARY OF DATA

     Number of missing data patterns             1

SUMMARY OF MISSING DATA PATTERNS

     MISSING DATA PATTERNS (x = not missing)

           1
 GHQ02     x
 GHQ04     x
 GHQ06     x
 GHQ08     x
 GHQ10     x
 GHQ12     x

     MISSING DATA PATTERN FREQUENCIES

    Pattern   Frequency
          1        1119

COVARIANCE COVERAGE OF DATA

Minimum covariance coverage value   0.100

     PROPORTION OF DATA PRESENT

           Covariance Coverage
              GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
              ________  ________  ________  ________  ________  ________
 GHQ02        1.000
 GHQ04        1.000     1.000
 GHQ06        1.000     1.000     1.000
 GHQ08        1.000     1.000     1.000     1.000
 GHQ10        1.000     1.000     1.000     1.000     1.000
 GHQ12        1.000     1.000     1.000     1.000     1.000     1.000
RESULTS FOR BASIC ANALYSIS

     ESTIMATED SAMPLE STATISTICS

           Means
              GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
              ________  ________  ________  ________  ________  ________
      1       2.161     2.123     2.060     2.195     1.987     2.223

           Covariances
              GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
              ________  ________  ________  ________  ________  ________
 GHQ02        0.768
 GHQ04        0.152     0.373
 GHQ06        0.350     0.229     0.653
 GHQ08        0.211     0.199     0.271     0.387
 GHQ10        0.387     0.264     0.439     0.305     0.873
 GHQ12        0.266     0.196     0.312     0.250     0.380     0.520

           Correlations
              GHQ02     GHQ04     GHQ06     GHQ08     GHQ10     GHQ12
              ________  ________  ________  ________  ________  ________
 GHQ02        1.000
 GHQ04        0.284     1.000
 GHQ06        0.494     0.465     1.000
 GHQ08        0.387     0.525     0.538     1.000
 GHQ10        0.473     0.464     0.581     0.524     1.000
 GHQ12        0.421     0.445     0.535     0.556     0.564     1.000
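The correlations in the basic output follow directly from the covariances: each one is the covariance divided by the product of the two standard deviations. The short Python sketch below is only an arithmetic check on the printed values, not part of the Mplus run. Because the covariances are rounded to three decimals, the recomputed correlations can differ from the printed ones in the third decimal.

```python
import math

# Lower-triangular covariances copied from ESTIMATED SAMPLE STATISTICS.
names = ["GHQ02", "GHQ04", "GHQ06", "GHQ08", "GHQ10", "GHQ12"]
cov_lower = [
    [0.768],
    [0.152, 0.373],
    [0.350, 0.229, 0.653],
    [0.211, 0.199, 0.271, 0.387],
    [0.387, 0.264, 0.439, 0.305, 0.873],
    [0.266, 0.196, 0.312, 0.250, 0.380, 0.520],
]


def correlation(i, j):
    """Correlation between variables i and j (indices into names)."""
    if i < j:
        i, j = j, i  # the matrix is stored as a lower triangle
    variance_i = cov_lower[i][i]
    variance_j = cov_lower[j][j]
    return cov_lower[i][j] / math.sqrt(variance_i * variance_j)


# Print the recomputed lower triangle, one row per variable.
for i, name in enumerate(names):
    row = " ".join(f"{correlation(i, j):6.3f}" for j in range(i + 1))
    print(f"{name}  {row}")
```

For example, `correlation(1, 0)` recovers the GHQ02/GHQ04 correlation of about 0.284 reported by Mplus.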
PLOT INFORMATION

The following plots are available:

  Histograms (sample values)
  Scatterplots (sample values)

SAVEDATA INFORMATION

  Order and format of variables

    GHQ02          F10.3
    GHQ04          F10.3
    GHQ06          F10.3
    GHQ08          F10.3
    GHQ10          F10.3
    GHQ12          F10.3
    ID             I5

  Save file
    C:\work\courses\mar09_course\ego\ego_odd.dat

  Save file format
    6F10.3 I5
Define new variables

Variable:
  Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  idvariable = id;

Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;

Analysis:
  Type = basic ;

Etc.
[Figure: histograms drawn with 8, 10 and 12 bins]
[Figure: plots for the full sample and for a random sample of 250]
Linear Regression

Variable:
  Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  idvariable = id;

Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;

Analysis:
  estimator = ML;

Model:
  sumodd on sumeven;

output:
  sampstat cinterval;

plot:
  type is plot3;

savedata:
  file is "C:\work\courses\mar09_course\ego\ego_oddeven_regress.dat" ;
  SAVE = MAHALANOBIS COOKS INFLUENCE;
Output

TESTS OF MODEL FIT

Chi-Square Test of Model Fit

          Value                              0.000
          Degrees of Freedom                     0
          P-Value                           0.0000

Chi-Square Test of Model Fit for the Baseline Model

          Value                           1635.553
          Degrees of Freedom                     1
          P-Value                           0.0000

CFI/TLI

          CFI                                1.000
          TLI                                1.000

Loglikelihood

          H0 Value                       -5155.247
          H1 Value                       -5155.247

Information Criteria

          Number of Free Parameters              3
          Akaike (AIC)                   10316.495
          Bayesian (BIC)                 10331.555
          Sample-Size Adjusted BIC       10322.027
            (n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)

          Estimate                           0.000
          90 Percent C.I.                    0.000  0.000
          Probability RMSEA <= .05           0.000

SRMR (Standardized Root Mean Square Residual)

          Value                              0.000
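The information criteria in this output can be rebuilt from the H0 loglikelihood, the number of free parameters (3) and the sample size (1119). The Python sketch below is an arithmetic check only. It uses the standard definitions AIC = -2LL + 2p and BIC = -2LL + p ln(n), with n replaced by n* = (n + 2) / 24 for the sample-size adjusted BIC, as printed in the output. Small differences in the third decimal come from the loglikelihood being rounded.

```python
import math

# Values copied from the TESTS OF MODEL FIT output.
loglik_h0 = -5155.247
n_params = 3
n_obs = 1119

deviance = -2.0 * loglik_h0

aic = deviance + 2 * n_params
bic = deviance + n_params * math.log(n_obs)

# Sample-size adjusted BIC swaps n for n* = (n + 2) / 24.
n_star = (n_obs + 2) / 24
sabic = deviance + n_params * math.log(n_star)

print(f"AIC   {aic:.3f}")
print(f"BIC   {bic:.3f}")
print(f"SABIC {sabic:.3f}")
```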
MODEL RESULTS

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 SUMODD   ON
    SUMEVEN            0.890      0.015     60.886      0.000

 Intercepts
    SUMODD             1.941      0.193     10.051      0.000

 Residual Variances
    SUMODD             2.868      0.121     23.654      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%    Estimate  Upper 2.5%   Upper .5%

 SUMODD   ON
    SUMEVEN           0.852       0.861       0.890       0.919       0.928

 Intercepts
    SUMODD            1.444       1.563       1.941       2.320       2.439

 Residual Variances
    SUMODD            2.556       2.631       2.868       3.106       3.181
Compare with Stata

      Source |       SS       df       MS              Number of obs =    1119
-------------+------------------------------           F(  1,  1117) = 3700.56
       Model |  10633.9457     1  10633.9457           Prob > F      =  0.0000
    Residual |  3209.82016  1117  2.87360802           R-squared     =  0.7681
-------------+------------------------------           Adj R-squared =  0.7679
       Total |  13843.7659  1118  12.3826171           Root MSE      =  1.6952

------------------------------------------------------------------------------
      sumodd |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     sumeven |   .8900851   .0146318    60.83   0.000     .8613762    .9187941
       _cons |   1.941059      .1933    10.04   0.000     1.561787    2.320332
------------------------------------------------------------------------------

OLS versus ML estimation: the slope and intercept are identical. The Mplus residual variance (2.868) is the residual SS divided by n = 1119, whereas Stata's residual MS (2.874) divides by the residual df, 1117. This is why the standard errors and test statistics differ slightly (60.886 against 60.83).
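The small gap between the Mplus and Stata results can be reproduced by hand. The Python sketch below assumes the only difference is the divisor for the residual variance: n for ML and n - 2 for OLS. All inputs are copied from the two outputs, and the rescaled standard error is a back-of-envelope check rather than a value either program prints.

```python
import math

# Values copied from the Stata regression table.
ss_residual = 3209.82016
n_obs = 1119
n_coef = 2  # slope and intercept
slope = 0.8900851
se_slope_ols = 0.0146318

# OLS divides the residual SS by the residual df; ML divides by n.
var_ols = ss_residual / (n_obs - n_coef)
var_ml = ss_residual / n_obs

# Standard errors scale with the square root of the variance estimate.
se_slope_ml = se_slope_ols * math.sqrt(var_ml / var_ols)
z_ml = slope / se_slope_ml

print(f"OLS residual variance {var_ols:.4f}")
print(f"ML residual variance  {var_ml:.4f}")
print(f"ML-scaled z for slope {z_ml:.3f}")
```

The ML residual variance comes out at about 2.868 and the rescaled test statistic at about 60.89, in line with the Mplus MODEL RESULTS.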
Logistic regression 1 – continuous predictor

Variable:
  Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  categorical are sumodd;
  idvariable = id;

Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
  cut sumodd (16);

Analysis:
  estimator = ML;

Model:
  sumodd on sumeven;

output:
  sampstat;
  cinterval;
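The Define block does two things: it builds the sum scores, then dichotomises sumodd at 16. The Python sketch below mimics both steps for two made-up respondents. It assumes that CUT with a single cutpoint assigns 0 to values at or below the cutpoint and 1 to values above it, which is how the Stata comparison recodes the variable (0/16=0, 17/24=1). The function names `sum_scores` and `cut` and the item responses are hypothetical.

```python
def sum_scores(items):
    """Return (sumodd, sumeven) for a list of 12 item scores, ghq01..ghq12."""
    sumodd = sum(items[0::2])   # ghq01, ghq03, ..., ghq11
    sumeven = sum(items[1::2])  # ghq02, ghq04, ..., ghq12
    return sumodd, sumeven


def cut(value, cutpoint):
    """Assumed CUT behaviour: 0 if value <= cutpoint, otherwise 1."""
    return 0 if value <= cutpoint else 1


# Two hypothetical respondents (item scores are invented for illustration).
low_scorer = [2] * 12
high_scorer = [3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2]

for label, items in [("low", low_scorer), ("high", high_scorer)]:
    sumodd, sumeven = sum_scores(items)
    print(label, sumodd, sumeven, cut(sumodd, 16))
```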
MODEL RESULTS

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 SUMODD   ON
    SUMEVEN            0.970      0.070     13.856      0.000

 Thresholds
    SUMODD$1          15.665      1.080     14.499      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%    Estimate  Upper 2.5%   Upper .5%

 SUMODD   ON
    SUMEVEN           0.790       0.833       0.970       1.107       1.150

 Thresholds
    SUMODD$1         12.882      13.547      15.665      17.783      18.448

CONFIDENCE INTERVALS FOR THE LOGISTIC REGRESSION ODDS RATIO RESULTS

 SUMODD   ON
    SUMEVEN           2.203       2.300       2.638       3.026       3.159
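The odds-ratio line is the exponential of the logit-scale estimate and its confidence limits. The Python sketch below checks this using the three-decimal values printed in the output, so the last digit can differ slightly from what Mplus computes from unrounded estimates.

```python
import math

# Logit-scale estimate and 95% limits for SUMODD ON SUMEVEN, as printed.
estimate = 0.970
lower_95 = 0.833
upper_95 = 1.107

# Exponentiating a log-odds coefficient gives the odds ratio.
odds_ratio = math.exp(estimate)
or_lower = math.exp(lower_95)
or_upper = math.exp(upper_95)

print(f"OR {odds_ratio:.3f}  95% CI {or_lower:.3f} to {or_upper:.3f}")
```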
Compare with Stata

. gen sumodd_g = sumodd
. recode sumodd_g 0/16=0 17/24=1
(sumodd_g: 1119 changes made)

. tab sumodd_g

   sumodd_g |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        916       81.86       81.86
          1 |        203       18.14      100.00
------------+-----------------------------------
      Total |      1,119      100.00

. logistic sumodd_g sumeven

Logistic regression                               Number of obs   =       1119
                                                  LR chi2(1)      =     666.85
                                                  Prob > chi2     =     0.0000
Log likelihood = -196.45269                       Pseudo R2       =     0.6292

------------------------------------------------------------------------------
    sumodd_g | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     sumeven |   2.638126    .184695    13.86   0.000     2.299868    3.026134
------------------------------------------------------------------------------
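Stata reports a standard error on the odds-ratio scale, while Mplus reports it on the logit scale (0.070). The two are linked by the delta method: SE(OR) is approximately OR times SE(log OR). The Python sketch below is an arithmetic check that takes the odds ratio from the Stata table and the rounded logit-scale standard error from the Mplus output, so agreement is only to about three decimals.

```python
import math

odds_ratio = 2.638126  # Stata odds ratio for sumeven
se_logit = 0.070       # Mplus S.E. for SUMODD ON SUMEVEN (rounded)

# Delta method: SE on the odds-ratio scale.
se_odds_ratio = odds_ratio * se_logit

# The z statistic is computed on the logit scale in both programs.
z = math.log(odds_ratio) / se_logit

print(f"SE(OR) {se_odds_ratio:.4f}")
print(f"z      {z:.2f}")
```

This gives roughly 0.185 and 13.86, in line with the Stata columns Std. Err. and z.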
Logistic regression 2 – binary predictor

Variable:
  Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd sumeven;
  categorical are sumodd;
  idvariable = id;

Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
  cut sumeven (16);
  cut sumodd (16);

Analysis:
  estimator = ML;

Model:
  sumodd on sumeven;

output:
  sampstat;
  cinterval;

Don't put sumeven in the categorical list: that list is for dependent variables, and sumeven is a predictor here, even though it is now binary.
MODEL RESULTS

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 SUMODD   ON
    SUMEVEN            4.647      0.273     17.020      0.000

 Thresholds
    SUMODD$1           2.687      0.132     20.307      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%    Estimate  Upper 2.5%   Upper .5%

 SUMODD   ON
    SUMEVEN           3.944       4.112       4.647       5.182       5.350

 Thresholds
    SUMODD$1          2.346       2.428       2.687       2.946       3.028

CONFIDENCE INTERVALS FOR THE LOGISTIC REGRESSION ODDS RATIO RESULTS

 SUMODD   ON
    SUMEVEN          51.618      61.069     104.289     178.096     210.706
Compare with Stata

. gen sumeven_g = sumeven
. recode sumeven_g 0/16=0 17/24=1
(sumeven_g: 1119 changes made)

. tab sumeven_g

  sumeven_g |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        957       85.52       85.52
          1 |        162       14.48      100.00
------------+-----------------------------------
      Total |      1,119      100.00

. xi: logistic sumodd_g i.sumeven_g

Logistic regression                               Number of obs   =       1119
                                                  LR chi2(1)      =     484.77
                                                  Prob > chi2     =     0.0000
Log likelihood = -287.49045                       Pseudo R2       =     0.4574

------------------------------------------------------------------------------
    sumodd_g | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Isumeven_~1 |   104.2885   28.47434    17.02   0.000      61.0702    178.0917
------------------------------------------------------------------------------
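The Mplus threshold and slope for the binary-predictor model can be turned into predicted probabilities. The Python sketch below assumes the usual Mplus parameterisation for a binary outcome, logit P(sumodd = 1) = -threshold + slope × sumeven. As a sanity check it combines the two predicted probabilities with the group sizes from the Stata tabulation (957 and 162) and compares the expected number of cases with the 203 observed in the earlier tabulation of sumodd_g. The estimates are the rounded values from the output, so the match is approximate.

```python
import math

threshold = 2.687  # SUMODD$1 from the Mplus output
slope = 4.647      # SUMODD ON SUMEVEN from the Mplus output


def prob_high(sumeven_g):
    """P(sumodd above the cut) under the assumed logit parameterisation."""
    logit = -threshold + slope * sumeven_g
    return 1.0 / (1.0 + math.exp(-logit))


p_low_group = prob_high(0)
p_high_group = prob_high(1)

# Group sizes from the Stata tabulation of sumeven_g.
expected_cases = 957 * p_low_group + 162 * p_high_group

print(f"P(case | sumeven_g = 0) = {p_low_group:.4f}")
print(f"P(case | sumeven_g = 1) = {p_high_group:.4f}")
print(f"Expected cases          = {expected_cases:.1f}")
```

The expected count comes out very close to 203, which is what a logistic model with a single binary predictor should give, since it reproduces the observed proportions in each group.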
Logistic regression 3 – ordinal predictor

Variable:
  Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12 f1 id;
  Missing are all (-9999) ;
  usevariables = sumodd ghq02_1 ghq02_2;
  categorical are sumodd;
  idvariable = id;

Define:
  sumodd = ghq01 + ghq03 + ghq05 + ghq07 + ghq09 + ghq11;
  !sumeven = ghq02 + ghq04 + ghq06 + ghq08 + ghq10 + ghq12;
  cut sumodd (16);
  ghq02_1 = ghq02;
  ghq02_2 = ghq02;
  cut ghq02_1 (1);
  cut ghq02_2 (2);
  if ghq02_2 eq 1 then ghq02_1 = 0;

Analysis:
  estimator = ML;

Model:
  sumodd on ghq02_1 ghq02_2;
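The Define statements turn the four-category item ghq02 into two dummy variables. The Python sketch below traces the same steps for each possible response. It assumes ghq02 is coded 1 to 4 and that CUT assigns 0 to values at or below the cutpoint and 1 to values above it. The helper names `cut` and `make_dummies` are hypothetical.

```python
def cut(value, cutpoint):
    """Assumed CUT behaviour: 0 if value <= cutpoint, otherwise 1."""
    return 0 if value <= cutpoint else 1


def make_dummies(ghq02):
    """Mirror the Define block: returns (ghq02_1, ghq02_2)."""
    ghq02_1 = cut(ghq02, 1)  # 1 for responses 2, 3, 4
    ghq02_2 = cut(ghq02, 2)  # 1 for responses 3, 4
    if ghq02_2 == 1:
        ghq02_1 = 0          # so ghq02_1 flags response 2 only
    return ghq02_1, ghq02_2


for response in (1, 2, 3, 4):
    print(response, make_dummies(response))
```

Response 1 is the reference category, ghq02_1 flags response 2, and ghq02_2 flags responses 3 and 4 together.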
MODEL RESULTS

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 SUMODD   ON
    GHQ02_1            2.103      0.524      4.015      0.000
    GHQ02_2            3.786      0.515      7.348      0.000

 Thresholds
    SUMODD$1           4.182      0.504      8.301      0.000

CONFIDENCE INTERVALS OF MODEL RESULTS

                  Lower .5%  Lower 2.5%    Estimate  Upper 2.5%   Upper .5%

 SUMODD   ON
    GHQ02_1           0.754       1.076       2.103       3.129       3.452
    GHQ02_2           2.459       2.776       3.786       4.796       5.113

 Thresholds
    SUMODD$1          2.884       3.195       4.182       5.170       5.480

CONFIDENCE INTERVALS FOR THE LOGISTIC REGRESSION ODDS RATIO RESULTS

 SUMODD   ON
    GHQ02_1           2.125       2.933       8.188      22.853      31.550
    GHQ02_2          11.691      16.056      44.075     120.987     166.159
Compare with Stata

. recode ghq02 4=3
(ghq02: 88 changes made)

. xi: logistic sumodd_g i.ghq02
i.ghq02           _Ighq02_1-3         (naturally coded; _Ighq02_1 omitted)

Logistic regression                               Number of obs   =       1119
                                                  LR chi2(2)      =     190.38
                                                  Prob > chi2     =     0.0000
Log likelihood = -434.68929                       Pseudo R2       =     0.1796

------------------------------------------------------------------------------
    sumodd_g | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Ighq02_2 |     8.1875   4.287567     4.02   0.000     2.933598    22.85083
   _Ighq02_3 |   44.07477    22.7058     7.35   0.000     16.05759    120.9761
------------------------------------------------------------------------------