15: Linear Regression


PowerPoint Slideshow about '15: Linear Regression' - ziva



Expected change in Y per unit X

Introduction (p. 15.1)
  • X = independent (explanatory) variable
  • Y = dependent (response) variable
  • Use instead of correlation when:
    • distribution of X is fixed by researcher (i.e., set number at each level of X)
    • studying functional dependency between X and Y

Illustrative data (bicycle.sav) (p. 15.1)
  • Same as prior chapter
  • X = percent receiving reduced or free meals (RFM)
  • Y = percent using helmets (HELM)
  • n = 12 (outlier removed to study linear relation)

How formulas determine best line (p. 15.2)
  • Distance of points from line = residuals (dotted)
  • Best line minimizes the sum of squared residuals
  • Least squares regression line
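
The least squares idea can be sketched in a few lines of Python. This is a minimal illustration, not the course's SPSS procedure: the data below are hypothetical toy values (not bicycle.sav), and the function names `least_squares` and `sum_sq_residuals` are made up for this example. It uses the standard formulas b = SSxy / SSxx and a = ȳ – b·x̄, then checks that nudging the slope away from the fitted value makes the sum of squared residuals larger.

```python
from statistics import mean


def least_squares(xs, ys):
    """Return (a, b): intercept and slope of the least squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx          # slope
    a = y_bar - b * x_bar      # intercept
    return a, b


def sum_sq_residuals(xs, ys, a, b):
    """Sum of squared vertical distances of the points from the line a + bx."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data, for illustration only (NOT the bicycle.sav values)
xs = [10, 20, 30, 40, 50]
ys = [42, 35, 33, 24, 21]

a, b = least_squares(xs, ys)
best = sum_sq_residuals(xs, ys, a, b)
worse = sum_sq_residuals(xs, ys, a, b + 0.05)  # any other line fits worse

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"SS residuals at fit = {best:.2f}, with nudged slope = {worse:.2f}")
```

Any other choice of intercept or slope gives a larger sum of squared residuals than the fitted pair, which is what "least squares" refers to.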

Interpretation of Slope (b) (p. 15.3)
  • b = expected change in Y per unit X
  • Keep track of units!
    • Y = helmet users per 100
    • X = % receiving free lunch
  • e.g., b of –0.54 predicts a decrease of 0.54 units of Y for each one-unit increase in X

Predicting Average Y
  • ŷ = a + bx

Predicted Y = intercept + (slope)(x)

HELM = 47.49 + (–0.54)(RFM)

  • What is predicted HELM when RFM = 50?

ŷ = 47.49 + (–0.54)(50) = 20.5

Average HELM predicted to be 20.5 in neighborhood where 50% of children receive reduced or free meal

  • What is predicted average HELM when RFM = 20?

ŷ = 47.49 +(–0.54)(20) = 36.7
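
The two predictions above can be reproduced with a short Python sketch. The intercept and slope are taken from the slide; the function name `predict_helm` is only an illustrative choice.

```python
# Fitted coefficients as reported on the slide
INTERCEPT = 47.49  # a
SLOPE = -0.54      # b


def predict_helm(rfm):
    """Predicted average HELM (%) for a given RFM (%): y-hat = a + b*x."""
    return INTERCEPT + SLOPE * rfm


for rfm in (50, 20):
    print(f"RFM = {rfm}: predicted HELM = {predict_helm(rfm):.1f}")
```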

Confidence Interval for Slope Parameter (p. 15.4)

95% confidence interval for β = b ± (tn–2,.975)(seb)

where

b = point estimate for slope

tn–2,.975 = 97.5th percentile of the t distribution with n – 2 degrees of freedom (from t table or StaTable)

seb = standard error of slope estimate (formula 5), which is based on the standard error of the regression

Illustrative Example (bicycle.sav)
  • 95% confidence interval for β

= –0.54 ± (t10,.975)(0.1058)

= –0.54 ± (2.23)(0.1058)

= –0.54 ± 0.24

= (–0.78, –0.30)
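
The same arithmetic as a minimal Python sketch. The inputs (b = –0.54, seb = 0.1058, t = 2.23) come from the slide; the critical value is hard-coded from the t table rather than computed, and the function name `slope_confidence_interval` is an illustrative assumption.

```python
def slope_confidence_interval(b, se_b, t_crit):
    """Return (lower, upper) limits of b +/- t_crit * se_b."""
    margin = t_crit * se_b
    return b - margin, b + margin


# Values from the bicycle.sav example; 2.23 is the tabled t value for df = 10
lower, upper = slope_confidence_interval(b=-0.54, se_b=0.1058, t_crit=2.23)
print(f"95% CI for slope: ({lower:.2f}, {upper:.2f})")
```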

Interpret 95% confidence interval

Model:

Point estimate for slope (b) = –0.54

Standard error of slope (seb) = 0.1058

Margin of error = (2.23)(0.1058) = 0.24

95% confidence interval for β = (–0.78, –0.30)

Interpretation:

slope estimate = –0.54 ± 0.24

We are 95% confident the slope parameter falls between –0.78 and –0.30

Significance Test (p. 15.5)
  • H0: β = 0
  • tstat (formula 7) with df = n – 2
  • Convert tstat to p value

df =12 – 2 = 10

p = 2×area beyond tstat on t10

Use t table and StaTable
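
For readers without a t table at hand, the test can be sketched in Python. This assumes the usual form of the statistic for H0: β = 0, tstat = b / seb (the slides cite "formula 7" without showing it), and uses the slope and standard error from the bicycle.sav example. The p value is found by numerically integrating the t density with Simpson's rule, which stands in for the t table or StaTable; `t_pdf` and `two_sided_p` are illustrative names.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """p = 2 x area beyond |t_stat|; steps must be even (Simpson's rule)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return 1 - central_area


# Slope and standard error from the bicycle.sav example, n = 12
b, se_b, n = -0.54, 0.1058, 12
t_stat = b / se_b  # assumed form of the test statistic under H0: beta = 0
df = n - 2
p = two_sided_p(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df}, two-sided p = {p:.5f}")
```

With these inputs the statistic is about –5.10, far beyond the tabled critical value of 2.23 for df = 10, so the two-sided p value is well below .05.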

Regression ANOVA (not in Reader & NR)
  • SPSS also does an analysis of variance on regression model
  • Sum of squares of fitted values around grand mean = Σ(ŷi – ȳ)²
  • Sum of squares of residuals around line = Σ(yi – ŷi)²
  • Fstat provides same p value as tstat
  • Want to learn more about relation between ANOVA and regression? (Take regression course)
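
Why the F and t tests agree can be seen in a small Python sketch: with a single predictor, Fstat equals tstat squared, so the two tests give the same p value. The data are hypothetical toy values (not bicycle.sav), and the computation follows the standard textbook sums of squares rather than SPSS output.

```python
import math
from statistics import mean

# Hypothetical toy data, for illustration only (NOT the bicycle.sav values)
xs = [10, 20, 30, 40, 50]
ys = [42, 35, 33, 24, 21]
n = len(xs)

# Least squares fit
x_bar, y_bar = mean(xs), mean(ys)
ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b = ss_xy / ss_xx
a = y_bar - b * x_bar
fitted = [a + b * x for x in xs]

# ANOVA sums of squares
ss_reg = sum((f - y_bar) ** 2 for f in fitted)           # fitted values around grand mean
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # residuals around line
ss_total = sum((y - y_bar) ** 2 for y in ys)

# F statistic: regression mean square (1 df) over residual mean square (n - 2 df)
ms_res = ss_res / (n - 2)
f_stat = (ss_reg / 1) / ms_res

# t statistic for the slope, using se_b = sqrt(ms_res / ss_xx)
se_b = math.sqrt(ms_res / ss_xx)
t_stat = b / se_b

print(f"SS regression + SS residual = {ss_reg + ss_res:.2f}, SS total = {ss_total:.2f}")
print(f"F = {f_stat:.2f}, t squared = {t_stat ** 2:.2f}")
```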

Distributional Assumptions (p. 15.5)
  • Linearity
  • Independence
  • Normality
  • Equal variance

Validity Assumptions (p. 15.6)
  • Data = farr1852.sav

X = mean elevation above sea level

Y = cholera mortality per 10,000

  • Scatterplot (right) shows negative correlation
  • Correlation and regression computations reveal:

r = -0.88

ŷ = 129.9 + (-1.33)x

p = .009

  • Farr used these results to support miasma theory and refute contagion theory
    • But the results are not valid: the association is confounded by “polluted water source”