PSYC 3030 Review Session


Gigi Luk

December 7, 2004

Overview
• Matrix
• Multiple Regression
• Indicator variables
• Polynomial Regression
• Regression Diagnostics
• Model Building
Matrix: Basic Operation
• Addition, Subtraction: possible only when dimensions are the same
• Multiplication: possible only when inside dimensions are the same (e.g., 2x3 & 3x2)
• Inverse: exists when
• |A| ≠ 0
• A is non-singular
• All rows (columns) are linearly independent
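The dimension rules for addition and multiplication can be checked mechanically. The following is a minimal sketch in plain Python; the helper names `shape`, `add` and `matmul` and the matrices `A` and `B` are illustrative assumptions, not part of the course material.

```python
def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])

def add(a, b):
    """Element-wise sum; defined only when both shapes are identical."""
    if shape(a) != shape(b):
        raise ValueError("addition needs identical dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    """Matrix product; defined only when columns of a equal rows of b."""
    if shape(a)[1] != shape(b)[0]:
        raise ValueError("inside dimensions must match")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2x3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3x2

AB = matmul(A, B)    # (2x3)(3x2) gives a 2x2 result
A_plus_A = add(A, A) # same dimensions, so addition is allowed
```

Calling `add(A, B)` raises `ValueError` here, because a 2x3 and a 3x2 matrix do not have the same dimensions.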

Matrix: Inverse

Linearly independent:

Linearly Dependent:
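A quick way to see the difference between the two cases is to compute the determinant: it is non-zero for linearly independent rows and zero for dependent ones. The sketch below uses cofactor expansion, which is fine for small matrices; the two example matrices are made up for illustration and are not the ones from the original slide.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, value in enumerate(m[0]):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(minor)
    return total

independent = [[1, 2],
               [3, 4]]   # no row is a multiple of the other
dependent = [[1, 2],
             [2, 4]]     # second row = 2 x first row

det_independent = det(independent)  # non-zero, so the inverse exists
det_dependent = det(dependent)      # zero, so the matrix is singular
```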

Some notations
• n = sample size
• p = number of parameters
• c = number of distinct values (levels) of x (cf. LOF, p. 85)
• g = number of family members (simultaneous statements) in a Bonferroni test (cf. p. 92)
• J = n x n matrix of 1s; I = identity matrix; H = x(x’x)⁻¹x’ (hat matrix)
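Matrix: estimates & residuals

LS estimates

Normal equations: x’y = (x’x)b

Solution: b = (x’x)⁻¹x’y

[Worked matrices for x’x, x’y and (x’x)⁻¹: not recoverable from the slide]

Residuals

e = y – y-hat

= y – xb

= [I – H]y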
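The normal equations and the two residual formulas can be verified on a tiny data set. This is a sketch under stated assumptions: the four (X, Y) points are invented, the model is a simple regression (so x’x is 2x2 and can be inverted in closed form), and exact fractions are used so the two residual vectors can be compared without rounding.

```python
from fractions import Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)]
            for row in a]

def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix; fails if the determinant is 0."""
    (p, q), (r, s) = m
    determinant = p * s - q * r
    if determinant == 0:
        raise ValueError("matrix is singular")
    return [[s / determinant, -q / determinant],
            [-r / determinant, p / determinant]]

# Hypothetical data: a column of 1s for the intercept plus one predictor.
X = [[Fraction(1), Fraction(v)] for v in (1, 2, 3, 4)]
y = [[Fraction(v)] for v in (2, 3, 5, 6)]

Xt = transpose(X)
XtX_inv = inverse_2x2(matmul(Xt, X))
b = matmul(XtX_inv, matmul(Xt, y))        # b = (x'x)^-1 x'y

# Hat matrix H = x (x'x)^-1 x'
H = matmul(matmul(X, XtX_inv), Xt)
n = len(X)
I_minus_H = [[Fraction(int(i == j)) - H[i][j] for j in range(n)]
             for i in range(n)]

e_direct = [[yi[0] - fi[0]] for yi, fi in zip(y, matmul(X, b))]  # y - xb
e_hat = matmul(I_minus_H, y)                                     # [I - H]y
```

Both routes give the same residual vector, which is the point of the identity e = y – xb = [I – H]y.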
Matrix: Application in Regression

Sums of squares, with df and MS:
• SSE = e’e = y’y – b’x’y; df = n–p; MS = SSE/(n–p)
• SSM = y’Jy/n = n(y-bar)²; df = 1
• SSR = b’x’y – SSM; df = p–1; MS = SSR/(p–1)
• SST = y’y; df = n
• SSTO = y’(I – J/n)y = y’y – SSM; df = n–1
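To make the decomposition concrete, the sketch below computes each sum of squares for a small invented data set and a simple regression (p = 2). It takes SSM to be the sum of squares for the mean, n(y-bar)², which is the reading consistent with SSTO = y’y – SSM; the data and variable names are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical data for a simple regression.
x = [Fraction(v) for v in (1, 2, 3, 4)]
y = [Fraction(v) for v in (2, 3, 5, 6)]
n, p = len(y), 2

x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# x'y = [sum(y), sum(x*y)], so b'x'y = b0*sum(y) + b1*sum(x*y)
b_xty = b0 * sum(y) + b1 * sum(xi * yi for xi, yi in zip(x, y))

SST = sum(yi ** 2 for yi in y)   # y'y
SSM = n * y_bar ** 2             # sum of squares for the mean
SSE = SST - b_xty                # y'y - b'x'y
SSR = b_xty - SSM
SSTO = SST - SSM

MSE = SSE / (n - p)
MSR = SSR / (p - 1)
```

With these numbers SSR + SSE reproduces SSTO exactly, as the table requires.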

Matrix: Variance-Covariance

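Var-cov (Y) = σ²(Y) = n x n matrix with the variances σ²(Yi) on the diagonal and the covariances σ(Yi, Yj) off the diagonal

var-cov (b) = est σ²(b) = s²(b) = MSE (x’x)⁻¹ = p x p matrix with s²(bk) on the diagonal and s(bj, bk) off the diagonal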
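As an illustration of s²(b) = MSE (x’x)⁻¹, the sketch below builds the 2x2 matrix for a simple regression on invented data. For a model with an intercept and one predictor, x’x has entries n, ΣX and ΣX², so its inverse can be written out directly; everything else (data, names) is an assumption for the example.

```python
from fractions import Fraction

# Hypothetical data for a simple regression.
x = [Fraction(v) for v in (1, 2, 3, 4)]
y = [Fraction(v) for v in (2, 3, 5, 6)]
n, p = len(y), 2

sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# x'x = [[n, sum_x], [sum_x, sum_x2]]; invert it in closed form.
det = n * sum_x2 - sum_x ** 2
xtx_inv = [[sum_x2 / det, -sum_x / det],
           [-sum_x / det, n / det]]

# b = (x'x)^-1 x'y with x'y = [sum_y, sum_xy]
b0 = xtx_inv[0][0] * sum_y + xtx_inv[0][1] * sum_xy
b1 = xtx_inv[1][0] * sum_y + xtx_inv[1][1] * sum_xy

SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
MSE = SSE / (n - p)

# Estimated variance-covariance matrix of b.
s2_b = [[MSE * entry for entry in row] for row in xtx_inv]
```

The diagonal of `s2_b` holds s²(b0) and s²(b1); the off-diagonal entry is the estimated covariance s(b0, b1).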

Multiple Regression
• Model with two or more independent variables, e.g. Yi = β0 + β1Xi1 + β2Xi2 + εi
MR: R-square

Coefficient of multiple determination:

R² = SSR/SSTO, 0 ≤ R² ≤ 1

alternative: R² = 1 – SSE/SSTO

Coefficient of partial determination, e.g.:

R²(Y2|1) = SSR(X2|X1)/SSE(X1)

[Diagram relating SSTO, SSR(X1), SSR(X2), SSR(X1,X2), SSR(X1|X2), SSR(X2|X1), SSE(X1), SSE(X2) and SSE(X1,X2)]
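The quantities in this diagram combine into R² and the coefficient of partial determination with simple arithmetic. The sketch below uses made-up sums of squares (not course data) and two hypothetical helper functions to show the relationships.

```python
def r_squared(ssr, ssto):
    """Coefficient of multiple determination, SSR/SSTO."""
    return ssr / ssto

def partial_r_squared(sse_reduced, sse_full):
    """Share of the reduced model's SSE removed by the added predictor(s)."""
    return (sse_reduced - sse_full) / sse_reduced

# Hypothetical sums of squares.
SSTO = 100.0
SSE_x1 = 40.0        # SSE(X1): error after fitting X1 only
SSE_x1_x2 = 30.0     # SSE(X1,X2): error after fitting X1 and X2

SSR_x1_x2 = SSTO - SSE_x1_x2            # SSR(X1,X2)
R2 = r_squared(SSR_x1_x2, SSTO)         # SSR/SSTO
R2_alt = 1 - SSE_x1_x2 / SSTO           # 1 - SSE/SSTO, same value
SSR_x2_given_x1 = SSE_x1 - SSE_x1_x2    # SSR(X2|X1)
R2_partial = partial_r_squared(SSE_x1, SSE_x1_x2)  # SSR(X2|X1)/SSE(X1)
```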

MR: Hypothesis testing
• Test for regression relation (the overall test):

Ho: β1 = β2 = … = βp-1 = 0 Ha: not all βk = 0

F* = MSR/MSE

If F* ≤ F(1-α; p-1, n-p), conclude Ho; otherwise conclude Ha.

• Test for βk:

Ho: βk = 0 Ha: βk ≠ 0

t* = bk/s(bk)

If |t*| ≤ t(1-α/2; n-p), conclude Ho; otherwise conclude Ha.

Equivalent F test: (t*)² = F* = MSR(xk|all others)/MSE
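The link between the t test and the F test is easy to confirm numerically. The sketch below fits a simple regression to invented data, where the only slope is b1, so the overall F* and the squared t* for b1 should coincide. Critical values are not computed here; they would come from an F or t table (or a statistics library), which this example deliberately leaves out.

```python
import math

# Hypothetical data for a simple regression (p = 2 parameters).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 6.0]
n, p = len(y), 2

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

SSE = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
SSR = sum((fi - y_bar) ** 2 for fi in fitted)
MSE = SSE / (n - p)
MSR = SSR / (p - 1)

F_star = MSR / MSE               # overall test statistic
s_b1 = math.sqrt(MSE / s_xx)     # s(b1) in simple regression
t_star = b1 / s_b1               # test statistic for b1
```

Up to floating-point rounding, `t_star ** 2` equals `F_star`.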

MR: Hypothesis Testing (cont’)
• Test for LOF:

Ho: E{Y} = βo + β1X1 + β2X2 + … + βp-1Xp-1

Ha: E{Y} ≠ βo + β1X1 + β2X2 + … + βp-1Xp-1

F* = [SSLF/(c-p)]/[SSPE/(n-c)]

If F* ≤ F(1-α; c-p, n-c), conclude Ho.

• Test whether some βk = 0:

Ho: βh = βh+1 = … = βp-1 = 0 Ha: not all βk in Ho = 0

F* = MSR(xh…xp-1|x1…xh-1)/MSE, where MSR(xh…xp-1|x1…xh-1) = SSR(xh…xp-1|x1…xh-1)/(p-h)

If F* ≤ F(1-α; p-h, n-p), conclude Ho.
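The lack-of-fit statistic needs replicate observations at some X levels, so that pure error (SSPE) can be separated from lack of fit (SSLF = SSE – SSPE). The sketch below works through an invented data set with c = 3 distinct X levels and a straight-line model (p = 2); the data and names are assumptions for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical data: two replicates at each of three X levels.
x = [1, 1, 2, 2, 3, 3]
y = [Fraction(v) for v in (2, 4, 5, 5, 9, 7)]
n, p = len(y), 2

x_bar = Fraction(sum(x), n)
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Pure error: deviations of each Y from the mean of its own X level.
groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)
c = len(groups)                      # number of distinct X levels
SSPE = sum((yi - sum(g) / len(g)) ** 2
           for g in groups.values() for yi in g)

SSLF = SSE - SSPE                    # lack-of-fit sum of squares
F_star = (SSLF / (c - p)) / (SSPE / (n - c))
```

`F_star` would then be compared with F(1-α; c-p, n-c).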

MR: Extra SS (p. 141, CK)
• Full: y = βo + β1X1 + β2X2 → SSR(x1,x2)
• Red: y = βo + β1X1 → SSR(x1)
• SSR(x2|x1) = SSR(x1,x2) – SSR(x1)

= SSE(x1) – SSE(x1,x2)

= Effect of X2 adjusted for X1

• General Linear Test

Ho: β2 = 0 Ha: β2 ≠ 0

F* = [(SSE(R) – SSE(F))/(dfR – dfF)]/[SSE(F)/dfF] = [SSR(x2|x1)/1]/[SSE(x1,x2)/(n – 3)]
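The general linear test compares the error sums of squares of the reduced and full models. A minimal sketch follows; the function name `general_linear_test` and all numbers are hypothetical, chosen only to show how the extra sum of squares enters the statistic.

```python
def general_linear_test(sse_reduced, df_reduced, sse_full, df_full):
    """F* = [(SSE(R) - SSE(F)) / (dfR - dfF)] / [SSE(F) / dfF]."""
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator

# Hypothetical example with n = 20 cases.
n = 20
SSE_x1 = 40.0       # reduced model: intercept + X1, df = n - 2
SSE_x1_x2 = 30.0    # full model: intercept + X1 + X2, df = n - 3

SSR_x2_given_x1 = SSE_x1 - SSE_x1_x2    # extra sum of squares for X2
F_star = general_linear_test(SSE_x1, n - 2, SSE_x1_x2, n - 3)
```

Here the numerator is SSR(x2|x1)/1 and the denominator is MSE of the full model, so the same function covers the test of Ho: β2 = 0.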

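Indicator variables

[Figure: Y = expressive vocabulary against X = receptive vocabulary; two parallel fitted lines, labelled girls and boys]

y-hat = bo + b1X1 + b2X2 (X2 = indicator variable for gender)

• Group with X2 = 0: y-hat = bo + b1X1, intercept bo
• Group with X2 = 1: intercept bo + b2
• Both groups: slope = b1

[Figure: same axes; fitted lines for boys and girls with different slopes]

y-hat = bo + b1X1 + b2X2 + b12X1X2

If b12 ≠ 0, then there is an interaction: boys and girls have different slopes in the relation of X and Y.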
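How the indicator and the interaction term change each group's line can be shown with a few lines of arithmetic. In this sketch the coefficient values are invented, and the 0/1 coding of the groups is an assumption (the slides do not say which group is coded 1).

```python
def group_line(x2, b0, b1, b2, b12=0.0):
    """Return (intercept, slope) of the fitted line for the group coded x2."""
    return b0 + b2 * x2, b1 + b12 * x2

# Hypothetical coefficients.
b0, b1, b2, b12 = 10.0, 2.0, 5.0, 0.5

# Without interaction: different intercepts, common slope b1.
line_group0 = group_line(0, b0, b1, b2)
line_group1 = group_line(1, b0, b1, b2)

# With interaction: the slope for the group coded 1 becomes b1 + b12.
line_group0_int = group_line(0, b0, b1, b2, b12)
line_group1_int = group_line(1, b0, b1, b2, b12)
```

The group coded 0 keeps intercept bo and slope b1 in both models; only the group coded 1 is shifted by b2 and, with interaction, tilted by b12.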

Polynomial Regression
• 2nd Order: Y = βo + β1X + β2X² + εi
• 3rd Order: Y = βo + β1X + β2X² + β3X³ + εi
• Interaction (2nd order, two predictors):

Y = βo + β1X1 + β2X2 + β11X1² + β22X2² + β12X1X2 + εi

linear: β1X1 + β2X2; quadratic: β11X1² + β22X2²; interaction: β12X1X2
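Fitting a polynomial model is still ordinary multiple regression once the squared and cross-product columns have been added to the design matrix. The sketch below builds those columns for the second-order model with two predictors; the function name and the data points are hypothetical.

```python
def second_order_row(x1, x2):
    """Design-matrix row: intercept, linear, quadratic and interaction terms."""
    return [1, x1, x2, x1 ** 2, x2 ** 2, x1 * x2]

# Hypothetical predictor values for three cases.
cases = [(2, 3), (1, 0), (-1, 4)]
design = [second_order_row(x1, x2) for x1, x2 in cases]
```

Each row has six entries, one per β in the second-order model, so p = 6 for this design.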

PR: Partial F-test (p.303, 5th ed.)
• Test whether a 1st order model would be sufficient:

Ho: β11 = β22 = β12 = 0 Ha: not all βs in Ho = 0

F* = [SSR(x1², x2², x1x2 | x1, x2)/3]/MSE

(To obtain this SSR, you need the sequential SS; see top of p. 304 in text. This test is a modified test for extra SS.)

Regression Diagnostics
• Collinearity:
• Effects: (1) poor numerical accuracy; (2) poor precision of estimates
• Danger sign: several large s(bk)
• Determinant of x’x ≈ 0
• Eigenvalues of c: # of eigenvalues ≈ 0 = # of linear dependencies
• Condition #: (λmax/λi)^(1/2)
• 15-30 watch out
• > 30 trouble
• > 100 disaster
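The condition numbers can be computed directly once the eigenvalues are known. The sketch below takes a list of eigenvalues as given (they are invented here) and applies the thresholds listed above; the function names and the "ok" label for values below 15 are assumptions added for the example.

```python
import math

def condition_indices(eigenvalues):
    """Return sqrt(lambda_max / lambda_i) for each eigenvalue."""
    lam_max = max(eigenvalues)
    return [math.inf if lam == 0 else math.sqrt(lam_max / lam)
            for lam in eigenvalues]

def severity(condition_number):
    """Label a condition number using the rule-of-thumb cut-offs."""
    if condition_number > 100:
        return "disaster"
    if condition_number > 30:
        return "trouble"
    if condition_number >= 15:
        return "watch out"
    return "ok"

# Hypothetical eigenvalues; one of them is close to zero.
eigenvalues = [4.0, 0.01, 1.0]
indices = condition_indices(eigenvalues)
labels = [severity(k) for k in indices]
```

The near-zero eigenvalue produces the largest condition number, which is what flags a near linear dependency among the predictors.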
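Regression Diagnostics (cont’)
• VIF (Variance Inflation Factor)

VIFi = 1/(1 – Ri²), where Ri² is the R² from regressing Xi on the other predictors

When to worry? When VIF ≈ 10 or larger

• TOL (Tolerance)

TOLi = 1/VIFi = 1 – Ri²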
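With only two predictors, the R² from regressing one predictor on the other is just their squared correlation, which makes VIF and TOL easy to compute by hand. The sketch below does this for two invented predictor columns, using exact fractions; with more predictors each Ri² would need its own auxiliary regression.

```python
from fractions import Fraction

# Hypothetical values of two predictors.
x1 = [Fraction(v) for v in (1, 2, 3, 4, 5)]
x2 = [Fraction(v) for v in (1, 3, 2, 5, 4)]
n = len(x1)

m1 = sum(x1) / n
m2 = sum(x2) / n
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))

# R-squared from regressing one predictor on the other (squared correlation).
r2 = s12 ** 2 / (s11 * s22)

vif = 1 / (1 - r2)   # variance inflation factor
tol = 1 / vif        # tolerance, equal to 1 - r2
```

A correlation of 0.8 between the predictors gives a VIF below 3 here, well under the value of about 10 that signals trouble.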

Model Building
• Goals:
• Make R² large or MSE small
• Keep cost of data collection, s(b) small
• Selection Criteria:
• R²: look at ∆R²
• MSE: can increase or decrease as variables are added

Model Building (cont’)
• Cp ≈ p, where Cp = est. of

(1/σ²) Σ{var(yhat) + [yhattrue – yhatp]²}

(var(yhat): random error; [yhattrue – yhatp]²: bias)

= SSEp/MSEall – (n – 2p)

= p + (m + 1 – p)(Fp – 1)

m: # available predictors

Fp: incremental F for predictors omitted
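The two computing forms of Cp give the same number, which can be checked with a small example. In the sketch below all sums of squares are invented, and Fp is taken to be the incremental F for the predictors left out of the subset model, computed against the MSE of the model with all m predictors.

```python
def mallows_cp(sse_p, mse_all, n, p):
    """Cp = SSEp / MSEall - (n - 2p)."""
    return sse_p / mse_all - (n - 2 * p)

# Hypothetical setting: n = 20 cases, m = 4 available predictors.
n, m = 20, 4
SSE_all = 30.0                      # SSE of the model with all predictors
MSE_all = SSE_all / (n - (m + 1))   # df = n - (m + 1)

p = 3                               # subset model: intercept + 2 predictors
SSE_p = 36.0

Cp = mallows_cp(SSE_p, MSE_all, n, p)

# Incremental F for the (m + 1 - p) omitted predictors.
F_p = ((SSE_p - SSE_all) / (m + 1 - p)) / MSE_all
Cp_alt = p + (m + 1 - p) * (F_p - 1)
```

Here Cp comes out at 4 for a model with p = 3, so the subset is close to, but slightly above, the Cp ≈ p target.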

Model Building (cont’)
• Variable Selection Procedure
• Choose min MSE & Cp≈ p
• SAS tools:
• Forward
• Backward
• Stepwise
• Guided selection: key vars, promising vars, haystack
• Substantive knowledge of the area
• Examination of each var: expected sign & magnitude of coefficients