STATS 330: Lecture 6
Inference for the Multiple Regression Model
Aim of today’s lecture:
To discuss how we assess the significance of variables in the regression
Key concepts:
Standard errors
Confidence intervals for the coefficients
Reference: Coursebook Section 3.2
Call:
lm(formula = volume ~ diameter + height, data = cherry.df)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -57.9877     8.6382  -6.713 2.75e-07 ***
diameter      4.7082     0.2643  17.816  < 2e-16 ***
height        0.3393     0.1302   2.607   0.0145 *
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 3.882 on 28 degrees of freedom
Multiple R-Squared: 0.948, Adjusted R-squared: 0.9442
F-statistic: 255 on 2 and 28 DF, p-value: < 2.2e-16

The Std. Error column gives the standard errors of the coefficients.
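As a minimal sketch of where these standard errors come from, the code below recomputes them as the square roots of the diagonal of s^2 (X'X)^(-1), where s^2 is the residual variance estimate. It assumes that R's built-in trees data set (columns Girth, Height, Volume) holds the same measurements as cherry.df, which these slides do not state; the names fit, X, s2 and se are illustrative only.

```r
# Assumption: the built-in `trees` data stand in for cherry.df
# (Girth = diameter, Height = height, Volume = volume).
fit <- lm(Volume ~ Girth + Height, data = trees)

X  <- model.matrix(fit)                      # design matrix, including the constant
s2 <- sum(resid(fit)^2) / df.residual(fit)   # residual variance estimate
se <- sqrt(diag(s2 * solve(t(X) %*% X)))     # standard errors of the coefficients

# Compare with the "Std. Error" column reported by summary()
print(cbind(by.hand = se, summary = coef(summary(fit))[, "Std. Error"]))
```

The square root of s2 is the "Residual standard error" line of the summary output.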
A 95% confidence interval for a regression coefficient is of the form
Estimated coefficient ± t × standard error
where t is the 97.5% point of the appropriate t-distribution. The degrees of freedom are n - k - 1, where n is the number of cases (observations) in the regression and k is the number of variables (assuming we have a constant term).
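The formula can be applied by hand, as in the sketch below. It again assumes the built-in trees data set is the same as cherry.df (an assumption, not something the slides state), and the object names are illustrative.

```r
# Assumption: the built-in `trees` data stand in for cherry.df.
fit <- lm(Volume ~ Girth + Height, data = trees)

est <- coef(fit)                          # estimated coefficients
se  <- sqrt(diag(vcov(fit)))              # their standard errors
tq  <- qt(0.975, df = df.residual(fit))   # 97.5% point of t with n - k - 1 = 28 df

# Estimated coefficient -/+ t x standard error
ci <- cbind(lower = est - tq * se, upper = est + tq * se)
print(ci)
```

Here n = 31 and k = 2, so the t-distribution has 28 degrees of freedom.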
Use the function confint. Its argument (here cherry.lm) is the object created by lm.
> confint(cherry.lm)
                   2.5%       97.5%
(Intercept) -75.68226247 -40.2930554
diameter      4.16683899   5.2494820
height        0.07264863   0.6058538
t-values and p-values
Call:
lm(formula = volume ~ diameter + height, data = cherry.df)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -57.9877     8.6382  -6.713 2.75e-07 ***
diameter      4.7082     0.2643  17.816  < 2e-16 ***
height        0.3393     0.1302   2.607   0.0145 *
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 3.882 on 28 degrees of freedom
Multiple R-Squared: 0.948, Adjusted R-squared: 0.9442
F-statistic: 255 on 2 and 28 DF, p-value: < 2.2e-16

The t value column gives the t-values and the Pr(>|t|) column gives the p-values.
All variables are required, since all the p-values are small (< 0.05).
[Figure: density curve for t with 28 degrees of freedom. The p-value is the total area in the two tails, below -2.607 and above 2.607, which is 0.0145.]
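A short sketch of how the t-values and p-values arise: each t-value is the estimate divided by its standard error, and each p-value is the total two-tail area under the t density with 28 degrees of freedom. The code assumes the built-in trees data set matches cherry.df (not stated in the slides); the names tab, t.val and p.val are illustrative.

```r
# Assumption: the built-in `trees` data stand in for cherry.df.
fit <- lm(Volume ~ Girth + Height, data = trees)
tab <- coef(summary(fit))

# t-value = estimate / standard error
t.val <- tab[, "Estimate"] / tab[, "Std. Error"]

# p-value = area below -|t| plus area above |t| for t with n - k - 1 df
p.val <- 2 * pt(-abs(t.val), df = df.residual(fit))

print(cbind(t.val, p.val))
```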
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -57.9877     8.6382  -6.713 2.75e-07 ***
diameter      4.7082     0.2643  17.816  < 2e-16 ***
height        0.3393     0.1302   2.607   0.0145 *
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 3.882 on 28 degrees of freedom
Multiple R-Squared: 0.948, Adjusted R-squared: 0.9442
F-statistic: 255 on 2 and 28 DF, p-value: < 2.2e-16

The last line of the output gives the F-value (255) and its p-value (< 2.2e-16).
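The F-statistic in the last line of a summary tests whether all the coefficients other than the constant are zero. As a hedged sketch, it can be recovered from R-squared as F = (R^2 / k) / ((1 - R^2) / (n - k - 1)). The code assumes the built-in trees data set matches cherry.df, which the slides do not state; object names are illustrative.

```r
# Assumption: the built-in `trees` data stand in for cherry.df.
fit <- lm(Volume ~ Girth + Height, data = trees)

r2  <- summary(fit)$r.squared
k   <- length(coef(fit)) - 1    # number of variables, excluding the constant
df2 <- df.residual(fit)         # n - k - 1

F.val <- (r2 / k) / ((1 - r2) / df2)
p.val <- pf(F.val, k, df2, lower.tail = FALSE)   # upper-tail area of F(k, n - k - 1)

print(c(F = F.val, p.value = p.val))
```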
Full model: model with all the variables
Sub-model: model with a set of variables deleted.
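A minimal sketch of comparing a full model with a sub-model, using the standard F-statistic based on residual sums of squares: F = ((RSS_sub - RSS_full) / d) / (RSS_full / (n - k - 1)), where d is the number of variables deleted. The choice of models here (dropping height from the cherry tree regression) and the use of the built-in trees data in place of cherry.df are assumptions made for illustration.

```r
# Assumption: the built-in `trees` data stand in for cherry.df.
full <- lm(Volume ~ Girth + Height, data = trees)   # full model
sub  <- lm(Volume ~ Girth, data = trees)            # sub-model: height deleted

rss.full <- sum(resid(full)^2)
rss.sub  <- sum(resid(sub)^2)
d        <- df.residual(sub) - df.residual(full)    # number of variables deleted

F.val <- ((rss.sub - rss.full) / d) / (rss.full / df.residual(full))
p.val <- pf(F.val, d, df.residual(full), lower.tail = FALSE)

print(c(F = F.val, p.value = p.val))
print(anova(sub, full))   # the same test, done by R
```

When a single variable is deleted, this F-value is the square of that variable's t-value in the full model, and the two p-values agree.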
  ffa  age  weight  skinfold
0.759  105      67      0.96
0.274  107      70      0.52
0.685  100      54      0.62
0.526  103      60      0.76
0.859   97      61      1.00
0.652  101      62      0.74
0.349   99      71      0.76
1.120  101      48      0.62
1.059  107      59      0.56
1.035  100      51      0.44
… 20 observations in all
> model.full <- lm(ffa ~ age + weight + skinfold, data = fatty.df)
> summary(model.full)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.95777    1.40138   2.824  0.01222 *
age         -0.01912    0.01275  -1.499  0.15323
weight      -0.02007    0.00613  -3.274  0.00478 **
skinfold    -0.07788    0.31377  -0.248  0.80714

The large p-values for age and skinfold suggest that neither is needed once the other variables are in the model.
Can we get away with just weight?
> model.sub <- lm(ffa ~ weight, data = fatty.df)
> anova(model.sub, model.full)
Analysis of Variance Table
Model 1: ffa ~ weight
Model 2: ffa ~ age + weight + skinfold
  Res.Df     RSS Df Sum of Sq F Pr(>F)
1     18 0.91007

A small F and a large p-value suggest that weight alone is adequate. But the test should be interpreted with caution, as we “pretested”: the sub-model was chosen after looking at the p-values from the full model.
log(V) = b0 + b1 log(D) + b2 log(H)
> cherry.lm = lm(log(volume) ~ log(diameter) + log(height), data = cherry.df)
> cc = c(0, 1, 1)
> c = 3
> test.lc(cherry.lm, cc, c)
$est
[1] 3.099773

$std.err
[1] 0.1765222

$t.stat
[1] 0.5652165

$df
[1] 28

$p.val
[1] 0.5764278
The function test.lc is in the R330 package, which must be loaded first:
library(R330)
test.lc tests the hypothesis
c0 b0 + c1 b1 + c2 b2 = c
(in our example c0 = 0, c1 = 1, c2 = 1, c = 3, so the hypothesis is b1 + b2 = 3)
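To show what such a test involves, here is a hand-rolled sketch. It is not the R330 implementation of test.lc; the function lin_comb_test and its argument target are hypothetical names introduced for illustration. It estimates the linear combination, takes its standard error from the estimated covariance matrix of the coefficients, and refers (estimate - target) / standard error to a t-distribution. The built-in trees data set is assumed to match cherry.df.

```r
# Hypothetical helper, written for illustration only (not the R330 code).
# Tests H0: sum(cc * b) = target for the coefficients b of a fitted lm.
lin_comb_test <- function(fit, cc, target) {
  est    <- sum(cc * coef(fit))                               # estimated combination
  se     <- sqrt(as.numeric(t(cc) %*% vcov(fit) %*% cc))      # its standard error
  t.stat <- (est - target) / se
  df     <- df.residual(fit)
  p.val  <- 2 * pt(-abs(t.stat), df)                          # two-sided p-value
  list(est = est, std.err = se, t.stat = t.stat, df = df, p.val = p.val)
}

# Assumption: the built-in `trees` data stand in for cherry.df.
fit <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)
res <- lin_comb_test(fit, cc = c(0, 1, 1), target = 3)        # H0: b1 + b2 = 3
print(res)
```

The returned list uses the same component names as the test.lc output shown earlier, so the two can be compared directly.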