
Full example


Presentation Transcript


  1. Full example: Simple Regression

  2. Data
     • Howell's hassles dataset
     • Hassles: number of minor daily hassles reported during a month (a measure of stress)
     • Symptoms: psychological symptomatology

  3. First descriptives

     Pearson r = 0.61

                 n    mean      sd  median  trimmed     mad  min  max  range  skew  kurtosis     se
     Hassles    65  156.43  122.14     143   136.41  100.82    4  717    713  1.98      5.85  15.15
     Symptoms   66   88.79   20.27      86    86.33   17.79   58  177    119  1.41      3.71   2.50

     Ouch! (note the skew, kurtosis, and extreme maximum for Hassles)
     Big difference! (mean vs. trimmed mean)
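
   The table above matches the output format of describe() from the psych package; a minimal sketch to reproduce it, assuming a data frame named hassles with the columns used in the R code slide:

     library(psych)
     # n, mean, sd, median, trimmed mean (10% by default), mad, min, max,
     # range, skew, kurtosis, and se for each variable
     describe(hassles[, c("Hassles", "Symptoms")])
     # the differing n's suggest one missing Hassles value,
     # hence use = "complete.obs" for the correlation
     cor(hassles$Hassles, hassles$Symptoms, use = "complete.obs")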

  4. Caveman say “more graph good” But data big problem! Me smash!
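
   The graphs themselves didn't survive the transcript; a minimal sketch of the kind of quick graphical look the slide is joking about (same names as above):

     hist(hassles$Hassles)                     # strong right skew, a few extreme scores
     hist(hassles$Symptoms)
     plot(Symptoms ~ Hassles, data = hassles)  # the raw relationship, outliers visible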

  5. On with the analysis
     • Assumptions are likely violated, and something will definitely have to be done about the outliers
     • But we can’t test them without running the model first

  6. Graphical examination
     • Homoscedasticity violated
       Breusch-Pagan test: BP = 18.984, df = 1, p-value = 1.318e-05
     • No serial correlation
       Durbin-Watson test: DW = 2.3085, p-value = 0.8924
     • Linearity OK
       RESET test: RESET = 1.0903, df1 = 2, df2 = 61, p-value = 0.3426
     • Normality violated (not surprising)
       Shapiro-Wilk normality test: W = 0.9459, p-value = 0.006662
     (The corresponding calls appear on the R code slide below.)

  7. Robust comparison

     The coefficient and intercept are notably different, with a corresponding difference in the test statistics. This makes sense given that, after compensating for the most extreme score, the next one is high leverage but not influential on the coefficients, and not down-weighted all that much (~.90). The robust fit helps bring us back toward homoscedasticity, but not normality. Bootstrapped coefficients using the robust weights are provided for comparison and, as expected, are noticeably different. In the slide's figure, the robust fit is the blue line.

     Regular coefficients:
                  Estimate Std. Error t value Pr(>|t|)
     (Intercept)  72.70703    3.29603  22.059  < 2e-16 ***
     Hassles       0.10238    0.01666   6.147 5.96e-08 ***

     Robust coefficients:
                  Estimate Std. Error t value Pr(>|t|)
     (Intercept)  74.26659    3.01780   24.61  < 2e-16 ***
     Hassles       0.07828    0.01747    4.48 3.21e-05 ***

     Intervals (normal theory):
                   2.5 %  97.5 %
     (Intercept)  66.12   79.29
     Hassles       0.069   0.136

     Intervals (bootstrap):
                   2.5 %  97.5 %
     (Intercept)  67.23   81.84
     Hassles       0.047   0.114
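
   The comparison plot isn't reproduced in the transcript; a minimal sketch of such a figure, using the RegModel.1 and modelrob objects defined on the R code slide:

     plot(Symptoms ~ Hassles, data = hassles)
     abline(reg = RegModel.1)                     # ordinary least-squares fit
     abline(coef = coef(modelrob), col = "blue")  # robust fit (the blue line)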

  8. Validation
     • The original model's R² is .375, and the robust procedure I used previously automatically matches it.
     • However, given the issues, and the fact that another robust procedure that doesn't force that match showed a drop (as did an ordinary weighted regression using the robust weights), we'd probably want a bias-adjusted R².
     • Validating the robust model gave a bias-adjusted R² of .316.

  9. R code

     # Getting started with the original model (also reachable via the Rcmdr menus)
     RegModel.1 <- lm(Symptoms ~ Hassles, data = hassles)
     summary(RegModel.1)

     # Diagnostics
     library(car)     # for influencePlot()
     library(lmtest)  # for bptest(), dwtest(), resettest()
     plot(RegModel.1)
     influencePlot(RegModel.1)
     bptest(Symptoms ~ Hassles, varformula = ~ fitted.values(RegModel.1), data = hassles)
     dwtest(Symptoms ~ Hassles, alternative = "greater", data = hassles)
     resettest(Symptoms ~ Hassles, power = 2:3, type = "regressor", data = hassles)
     shapiro.test(RegModel.1$residuals)

     # Time to robustify
     library(robustbase)
     modelrob <- lmrob(Symptoms ~ Hassles, data = hassles)
     summary(modelrob)

     # Bootstrapping the coefficients. This uses my own (borrowed) function
     # called regcoef, which I tailor to the dataset I'm working on.
     library(boot)
     # 'hasssmooth' is presumably the hassles data after handling the extreme score
     boot.regcoef <- boot(hasssmooth, regcoef, 500)
     boot.ci(boot.regcoef, type = "bca")             # constant
     boot.ci(boot.regcoef, index = 2, type = "bca")  # predictor

     # Validate the model (the Design package has since been superseded by rms)
     library(Design)
     valmod <- ols(Symptoms ~ Hassles, data = hassles, x = TRUE, y = TRUE,
                   weights = modelrob$weights)
     validate(valmod)
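
   The regcoef function itself isn't shown anywhere in the slides; a minimal sketch of what a boot() statistic function of that name typically looks like (a hypothetical reconstruction; the version used here also folded in the robust weights, which this sketch omits):

     # boot() calls the statistic with the data and a vector of resampled row
     # indices, and collects whatever it returns (here, both coefficients)
     regcoef <- function(data, indices) {
       fit <- lm(Symptoms ~ Hassles, data = data[indices, ])
       coef(fit)
     }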
