AAEC 4302 STATISTICAL METHODS IN AGRICULTURAL RESEARCH
Chapter 7 (7.1 & 7.2): Theory and Application of the Multiple Regression Model
Introduction
The multiple regression model aims to, and must, include all of the independent variables X1, X2, X3, …, Xk that are believed to affect Y.
Yi = β0 + β1X1i + β2X2i + β3X3i +…+ βkXki + ui
where i=1,…,n represents the observations, k is the total number of independent variables in the model, β0, β1,…, βk are the parameters to be estimated and ui is the disturbance term, with the same properties as in the simple regression model
Yi = β0 + β1X1i + β2X2i + β3X3i +β4X4i + ui
E[ Yi ]= β0 + β1X1i + β2X2i + β3X3i +…+ βkXki
Yi = E[Yi]+ ui , the systematic (explainable) and
unsystematic (random) components of Yi
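$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \hat{\beta}_4 X_{4i}$$
$$SSR = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} - \hat{\beta}_3 X_{3i} - \hat{\beta}_4 X_{4i} \right)^2$$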
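The least-squares idea above (choose the estimates that minimize SSR) can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own procedure: the function names `ols_fit` and `ssr` and the small data set are hypothetical, and the coefficients are obtained by solving the normal equations with Gauss-Jordan elimination.

```python
def ols_fit(X, y):
    """Return [b0, b1, ..., bk] that minimize SSR, via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[a] * r[b] for r in rows) for b in range(p)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def ssr(X, y, b):
    """Sum of squared residuals for coefficients b = [b0, b1, ..., bk]."""
    total = 0.0
    for r, yi in zip(X, y):
        fitted = b[0] + sum(bj * xj for bj, xj in zip(b[1:], r))
        total += (yi - fitted) ** 2
    return total


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2 (no disturbance)
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9, 8, 19, 18, 26]
b = ols_fit(X, y)
print(b, ssr(X, y, b))
```

Because the illustrative data contain no disturbance term, the fitted coefficients reproduce the generating values and SSR is zero up to rounding.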
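[Figure: Regression surface (plane) E[Y] = β0 + β1X1 + β2X2, drawn over the X1 and X2 axes. The intercept is β0, the X1 slope is measured by β1, the X2 slope is measured by β2, and ui is the deviation of an observation from the plane.]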
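$$R^2 = 1 - \left\{ \sum_{i=1}^{n} e_i^2 \Big/ \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \right\}$$
$$\bar{R}^2 = 1 - \left[ \left\{ \sum_{i=1}^{n} e_i^2 / (n-k-1) \right\} \Big/ \left\{ \sum_{i=1}^{n} (Y_i - \bar{Y})^2 / (n-1) \right\} \right]$$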
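As a rough sketch of the two goodness-of-fit measures, the function below computes R2 and the degrees-of-freedom-adjusted version from observed and fitted values. The function name `r_squared` and the five data points are hypothetical and only serve to show the arithmetic; k is the number of independent variables.

```python
def r_squared(y, y_hat, k):
    """Return (R2, adjusted R2) for observed y, fitted y_hat, k regressors."""
    n = len(y)
    y_bar = sum(y) / n
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # sum of e_i^2
    sst = sum((yi - y_bar) ** 2 for yi in y)               # sum of (Y_i - Ybar)^2
    r2 = 1 - ssr / sst
    adj_r2 = 1 - (ssr / (n - k - 1)) / (sst / (n - 1))
    return r2, adj_r2


# Hypothetical observed and fitted values from a one-regressor model
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
r2, adj_r2 = r_squared(y, y_hat, k=1)
print(round(r2, 3), round(adj_r2, 3))
```

The adjusted measure is smaller than R2 here because it penalizes the degrees of freedom used up by the estimated parameters.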
Chapters 6.3
Variables & Model Specifications
For example, a farmer’s current year investment
decisions might be based on the previous year prices,
since the current year prices are not known when
making these decisions.
Suppose we want to estimate cotton acres planted in the US (Y) as a function of the price of cotton lint (Xt, cents/lb) in each of the last 3 years.
What's the interpretation of an estimated coefficient of 1.2 on the price three years ago?
It means that if the price of cotton lint three years ago (t−3) had been 1 cent per pound higher, the number of acres of cotton planted today (time t) would be higher by 1.2 acres, holding all the other X's constant.
Suppose you wanted to estimate the function where investment is a function of the change in GNP (i.e. first difference).
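Lagged variables and first differences are simple transformations of the original series. The sketch below shows one way to build them in plain Python; the helper names `lag` and `first_difference` and the price and GNP figures are hypothetical, and `None` marks observations that are lost because no earlier value exists.

```python
def lag(series, periods):
    """Shift a series back by `periods`; the first `periods` entries are None."""
    return [None] * periods + list(series[:len(series) - periods])


def first_difference(series):
    """Change from the previous period: X_t - X_(t-1); the first entry is None."""
    return [None] + [curr - prev for prev, curr in zip(series, series[1:])]


# Hypothetical annual cotton lint prices (cents/lb) and GNP levels
price = [60, 65, 70, 72, 68]
gnp = [100, 104, 103]

price_lag3 = lag(price, 3)        # price three years earlier
d_gnp = first_difference(gnp)     # change in GNP
print(price_lag3, d_gnp)
```

Each lag or difference shortens the usable sample, so a model with a three-year lag loses its first three observations.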
Chapters 6.4-6.5, 7.4
Variables & Model Specifications
INFLi = 1.984+ 22.234*UINVi
R2= 0.549 SER=0.956
LnMi= 3.948 + 0.215 LnGNPi
R2 = 0.78 SER=0.0305
lnM = 3.948 + 0.215*6.908 = 5.433
Antilog of 5.433 = 228.8 bill $
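The prediction step for the log-log model can be checked with a short calculation: plug the log of GNP into the estimated equation, then take the antilog (exponential) to return to the original units. The function name `predict_money` is hypothetical, and the GNP level of 1000 is an assumption inferred from the fact that ln(1000) is approximately 6.908.

```python
import math


def predict_money(gnp, b0=3.948, b1=0.215):
    """Return (ln M, M) predicted by ln M = b0 + b1 * ln GNP."""
    ln_m = b0 + b1 * math.log(gnp)
    return ln_m, math.exp(ln_m)  # exp() is the antilog of a natural log


ln_m, m = predict_money(1000.0)  # assumed GNP level, ln(1000) ~ 6.908
print(round(ln_m, 3), round(m, 1))
```

In this specification the slope 0.215 is an elasticity: a 1 percent increase in GNP is associated with roughly a 0.215 percent increase in M.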
An advantage of the polynomial model specification is that it can combine situations in which some of the independent variables are nonlinearly related to Y while others are linearly related to Y
Multiple regression:
Cross-sectional DB with 100 observations
Estimated EANRS function:
EANRSi = 9.791 + 0.995 EDi + 0.471 EXPi − 0.00751 EXPSQi
R2 = 0.329 SER = 4.267
β̂1 = 0.995: holding the level of experience constant, one additional year of education increases earnings by $995 (earnings are measured in thousands of dollars).
EANRSi = constant + 0.471 EXPi − 0.00751 EXPSQi
where the “constant” depends on the particular value chosen for ED
slope = 0.471 + (2)(−0.00751)(5) = 0.396 thou $
A man with 5 years of experience will have his earnings increase by about $396 after gaining one additional year of experience.
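In the polynomial specification the effect of experience is not constant: it is the derivative of the earnings function with respect to EXP, that is, the EXP coefficient plus twice the EXPSQ coefficient times the experience level. A small sketch using the estimates quoted above (the function names are hypothetical):

```python
def experience_slope(exp_years, b_exp=0.471, b_expsq=-0.00751):
    """Marginal effect of one more year of experience, in thousand $."""
    return b_exp + 2 * b_expsq * exp_years


def peak_experience(b_exp=0.471, b_expsq=-0.00751):
    """Experience level at which the estimated slope falls to zero."""
    return -b_exp / (2 * b_expsq)


slope_at_5 = experience_slope(5)
peak = peak_experience()
print(round(slope_at_5, 3), round(peak, 1))
```

Because the EXPSQ coefficient is negative, the slope shrinks as experience accumulates, and the estimated earnings profile flattens out at the level returned by `peak_experience`.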
Chapter 7.3 Dummy Variables
D11: 1 if sex = Male, 0 otherwise
D21: 1 if species = 1, 0 otherwise
D22: 1 if species = 2, 0 otherwise
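The dummy definitions above translate directly into a coding rule. The sketch below assumes two sexes and three species, so Female and species 3 are the omitted (baseline) categories; the function name `encode_dummies` is hypothetical.

```python
def encode_dummies(sex, species):
    """Code one observation into the dummy variables D11, D21, D22."""
    return {
        "D11": 1 if sex == "Male" else 0,  # sex dummy, Female is the baseline
        "D21": 1 if species == 1 else 0,   # species 1 vs. baseline species 3
        "D22": 1 if species == 2 else 0,   # species 2 vs. baseline species 3
    }


male_sp3 = encode_dummies("Male", 3)
female_sp2 = encode_dummies("Female", 2)
print(male_sp3, female_sp2)
```

A category with m levels needs only m − 1 dummies; including a dummy for every level together with an intercept would make the regressors perfectly collinear.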
[Figure: Y (mm) plotted against age, with separate lines for Male of Species 3 and Female of Species 3; the value 3.05 is marked on the graph.]
The Normal and t Distributions
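A random variable Z is said to have a standard normal distribution if its probability distribution is of the form:
$$p(Z) = \frac{1}{\sqrt{2\pi}} e^{-Z^2/2}, \quad -\infty < Z < \infty$$
The area under p(Z) is equal to 1
Z has μ = 0 and σ = 1 (page 210)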
α is the probability Pr(Z ≥ Z*)
If Z* = 1.5, then from the table α = 0.067
Pr(Z ≤ −Z*) = Pr(Z ≥ Z*)
Pr(Z ≤ −1.5) = 0.067
|Z| ≥ Z* means Z ≤ −Z* and Z ≥ Z* taken together (both tails)
Pr(|Z| ≥ Z*) = 2Pr(Z ≥ Z*), the area in Fig. 10.1b
The probability of not being in either tail is the unshaded area, or:
Pr(|Z| ≤ Z*) = 1 − Pr(|Z| ≥ Z*)
If Pr(Z ≥ 1.5) = 0.067, then Pr(|Z| ≥ 1.5) = 0.134 and Pr(|Z| ≤ 1.5) = 0.866
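The table lookups above can be reproduced with Python's standard library, which provides the standard normal distribution through `statistics.NormalDist`. The helper names below are hypothetical; this is only a sketch of the one-tail, two-tail, and central-area relationships.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_tail(z_star):
    """Pr(Z >= z_star)."""
    return 1 - Z.cdf(z_star)


def two_tail(z_star):
    """Pr(|Z| >= z_star), using the symmetry of the distribution."""
    return 2 * upper_tail(z_star)


def central_area(z_star):
    """Pr(|Z| <= z_star), the area between the two tails."""
    return 1 - two_tail(z_star)


print(round(upper_tail(1.5), 3), round(two_tail(1.5), 3), round(central_area(1.5), 3))
```

Small differences from hand calculations in the third decimal place can arise because printed tables round the one-tail probability before it is doubled.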
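A random variable X is said to have a normal distribution if its probability distribution is of the form:
$$p(X) = \frac{1}{b\sqrt{2\pi}} e^{-(X-a)^2/(2b^2)}, \quad -\infty < X < \infty$$
where b>0 and a can be any value.
X has μ = a and σ = b
Pr(X ≥ 6)?
With Z = (X − μ)/σ = 0.5 at X = 6, from Table A.1 we find Pr(X ≥ 6) = Pr(Z ≥ 0.5) = 0.309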
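Probabilities for any normal variable are found by standardizing, Z = (X − μ)/σ, and then using the standard normal distribution. In the sketch below the values μ = 5 and σ = 2 are hypothetical, chosen only because they give Z = 0.5 at X = 6; the function name is also an assumption.

```python
from statistics import NormalDist


def normal_upper_tail(x, mu, sigma):
    """Return (z, Pr(X >= x)) for X ~ N(mu, sigma^2) by standardizing."""
    z = (x - mu) / sigma
    return z, 1 - NormalDist().cdf(z)


# Hypothetical mean and standard deviation that yield Z = 0.5 at X = 6
z, prob = normal_upper_tail(6, mu=5, sigma=2)
print(z, round(prob, 3))
```

Any other pair of μ and σ with (6 − μ)/σ = 0.5 would give the same probability, since only the standardized value matters.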
p(t) = f(t; df), −∞ < t < ∞
Find α such that Pr(t ≥ t*) = α
Table A.2 can be used to find the probability
df = 5: Pr(t ≥ 1.5) = 0.097 and Pr(t ≥ 2.5) = 0.027
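Python's standard library has no t distribution, but the upper-tail area can be approximated by integrating the t density numerically. The sketch below uses Simpson's rule on the interval from 0 to t* and subtracts the result from 0.5; the function names and the number of integration steps are assumptions, and the approach is meant for t* ≥ 0 only.

```python
import math


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t_star, df, steps=2000):
    """Approximate Pr(t >= t_star) for t_star >= 0 (steps must be even)."""
    h = t_star / steps
    total = t_pdf(0.0, df) + t_pdf(t_star, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)  # Simpson weights
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # half the total area lies above zero


print(round(t_upper_tail(1.5, 5), 3), round(t_upper_tail(2.5, 5), 3))
```

As df grows, these tail areas approach the corresponding standard normal values.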
$$\chi^2 = \sum_{i=1}^{d} Z_i^2, \quad df = d$$
where the Zi are independent standard normal variables
Figure 10.6 page 222
χ2 has μ = d and σ = √(2d)
Find (χ2)c such that Pr(χ2 ≥ (χ2)c) = α
Table A.4: for df = 10 and α = 0.10, (χ2)c = 15.99
Fc = 3.33 for α = 0.05
Chapter 11:
Sampling Theory in Regression Analysis
and its standard deviation is σ(Yi) = σu
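[Figure: normal density P(Yi) for Yi ~ N[67, (5)²]. The curve is centered at the mean E[Yi] = β0 + β1X1 = 67, with σ(Yi) = σ = 5; the Yi axis marks 62 (−σ), 67, and 72 (+σ).]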
$$\hat{\beta}_0 \sim N\left( \beta_0,\ \frac{\sigma^2 \sum X_i^2}{n \sum (X_i - \bar{X})^2} \right)$$
where ~ means “is distributed as”, N means normal, the first element in parentheses is the mean or expected value of the estimator, and the second element is the formula for calculating the variance of the estimator.
Example:
What is the chance that β̂1 is between 11 and 13?
α = Pr(11 ≤ β̂1 ≤ 13)
= 1 − 2Pr(β̂1 ≥ 13)
= 1 − 2Pr(Z ≥ Zk), where Zk = (13 − β1)/σ(β̂1)
= 1 − 2Pr(Z ≥ 0.6) = 1 − (2)(0.274) = 0.452
Thus, the probability α is about 45 percent.
α = Pr(11 ≤ β̂1 ≤ 13) = 0.68
When the standard error is smaller, there is a greater probability that β̂1 will take on a value in some interval centered on the true value of β1. The smaller the standard error, the more precise β̂1 is as an estimator of β1.
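The link between the standard error and the probability of landing near the true parameter can be illustrated numerically. The sketch below assumes the estimator is normally distributed around a true β1 of 12 (inferred from the interval 11 to 13 being treated as symmetric) and compares two hypothetical standard errors: 1/0.6 ≈ 1.667, which gives a standardized value of 0.6 at 13, and a smaller value of 1.0. The function name is an assumption.

```python
from statistics import NormalDist


def prob_in_interval(low, high, beta, se):
    """Pr(low <= estimator <= high) when the estimator is N(beta, se^2)."""
    dist = NormalDist(mu=beta, sigma=se)
    return dist.cdf(high) - dist.cdf(low)


# Assumed true slope of 12; the two standard errors are hypothetical
p_large_se = prob_in_interval(11, 13, beta=12, se=1 / 0.6)
p_small_se = prob_in_interval(11, 13, beta=12, se=1.0)
print(round(p_large_se, 3), round(p_small_se, 3))
```

Shrinking the standard error concentrates the sampling distribution around β1, so the same interval captures a larger share of the possible estimates.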