Medical Statistics (full English class)


# Medical Statistics (full English class)

Ji-Qian Fang, School of Public Health, Sun Yat-Sen University. Chapter 12: Linear Correlation and Linear Regression.



12.3 Linear regression

Initial meaning of “regression”:

Galton noted that if a father is tall, his son will be relatively tall; if a father is short, his son will be relatively short.

• But if a father is very tall, his son is usually not taller than he is; if a father is very short, his son is usually not shorter than he is.

Otherwise, ……?!

• Galton called this phenomenon “regression to the mean”.

[Figure: son’s height (cm) plotted against father’s height (cm); both axes run from 100 to 220.]

What is regression in statistics?

To find the track of the means.

Given the value of chest circumference (X), the vital capacity (Y) varies around a center ($\mu_{y|x}$).

• All the centers lie on a line: the regression line. The relationship between the center $\mu_{y|x}$ and X is the regression equation.
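
As a rough illustration of “the track of the means”, the sketch below groups vital capacity readings by chest circumference and takes the mean of Y at each X. The numbers are made up for illustration and do not come from the lecture; the names `readings` and `centers` are hypothetical.

```python
from statistics import mean

# Hypothetical data (not from the lecture): vital capacity Y (L)
# observed at three chest circumferences X (cm).
readings = {
    80: [2.8, 3.0, 3.2],
    85: [3.3, 3.5, 3.7],
    90: [3.9, 4.0, 4.1],
}

# The "center" at each X is the mean of the Y values observed there.
centers = {x: mean(ys) for x, ys in readings.items()}

for x in sorted(centers):
    print(f"X = {x} cm: mean Y = {centers[x]:.2f} L")
```

In this toy data the centers rise by about 0.5 L for each 5 cm step in X, so they fall on a straight line.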

1. Linear regression equation

• Linear regression

$$\mu_{y|x} = \alpha + \beta X$$

Try to estimate $\alpha$ and $\beta$, getting

$$\hat{Y} = a + bX$$

• Where

a -- estimate of $\alpha$, intercept

b -- estimate of $\beta$, slope

$\hat{Y}$ -- estimate of $\mu_{y|x}$

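Least squares method

To find suitable a and b such that

$$\sum (Y - \hat{Y})^2 = \sum (Y - a - bX)^2$$

is minimized.

By calculus,

Slope b

$$b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$$

Intercept a

$$a = \bar{Y} - b\bar{X}$$

Regression equation

$$\hat{Y} = a + bX$$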
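
A minimal sketch of the least squares calculation, assuming the usual closed-form solution: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of X, and the intercept makes the line pass through the point of means. The function name `fit_line` and the data are hypothetical and not taken from the lecture.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y-hat = a + b*X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of X and sum of cross-products.
    l_xx = sum((xi - x_bar) ** 2 for xi in x)
    l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = l_xy / l_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

a, b = fit_line(x, y)
print(f"Y-hat = {a:.2f} + {b:.2f} X")
```

Because a is computed from the two means, the fitted line always passes through the point of means of X and Y.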

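2. t test for regression coefficient

• b is the sample regression coefficient; it changes from sample to sample
• There is a population regression coefficient, denoted by $\beta$
• Question: is $\beta = 0$ or not?
• $H_0$: $\beta = 0$, $H_1$: $\beta \neq 0$, $\alpha = 0.05$

Statistic

$$t_b = \frac{b - 0}{s_b}, \quad \nu = n - 2$$

Standard deviation of regression coefficient

$$s_b = \frac{s_{Y \cdot X}}{\sqrt{\sum (X - \bar{X})^2}}$$

Standard deviation of residual

$$s_{Y \cdot X} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}$$

Sum of squared residuals

$$\sum (Y - \hat{Y})^2 = \sum (Y - \bar{Y})^2 - \frac{\left[\sum (X - \bar{X})(Y - \bar{Y})\right]^2}{\sum (X - \bar{X})^2}$$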
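
The test can be sketched in a few lines, assuming the standard form of the statistic: b divided by its standard error, with n − 2 degrees of freedom. The function name `slope_t_test` and the data are hypothetical; the critical value 3.182 is the tabulated two-sided 0.05 point of the t distribution with 3 degrees of freedom.

```python
import math


def slope_t_test(x, y):
    """t statistic for H0: beta = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    l_xx = sum((xi - x_bar) ** 2 for xi in x)
    l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = l_xy / l_xx
    a = y_bar - b * x_bar
    # Sum of squared residuals around the fitted line.
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(ss_res / (n - 2))   # standard deviation of residual
    s_b = s_yx / math.sqrt(l_xx)         # standard error of b
    return {"b": b, "ss_res": ss_res, "s_b": s_b, "t": b / s_b, "df": n - 2}


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

result = slope_t_test(x, y)
T_CRIT_DF3 = 3.182  # two-sided critical value, alpha = 0.05, df = 3
reject = abs(result["t"]) > T_CRIT_DF3
print(f"t = {result['t']:.2f}, df = {result['df']}, reject H0: {reject}")
```

With other sample sizes the critical value has to be looked up for the corresponding degrees of freedom.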

3. Application of regression

1) To describe how the value of Y depends on X

2) To estimate or predict the value of Y through a value of X (known)

-- based on the regression of Y on X.

3) To control the value of X through a value of Y (known)

-- If X is not a random variable,

based on the regression of Y on X.

-- If X is also a random variable,

based on the regression of X on Y.
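
The difference between the two control strategies can be seen in a small sketch. It assumes ordinary least squares for both lines; the helper `slope_intercept`, the data, and the values `x0` and `y0` are hypothetical.

```python
def slope_intercept(u, v):
    """Least-squares line v-hat = a + b*u (regression of v on u)."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    l_uu = sum((ui - u_bar) ** 2 for ui in u)
    l_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    b = l_uv / l_uu
    return v_bar - b * u_bar, b


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

# Prediction: regression of Y on X, evaluated at a known x0.
a_yx, b_yx = slope_intercept(x, y)
x0 = 3.0
y_pred = a_yx + b_yx * x0

# Control when X is not random: invert the Y-on-X line at the target y0.
y0 = 5.0
x_inverted = (y0 - a_yx) / b_yx

# Control when X is also random: use the regression of X on Y.
a_xy, b_xy = slope_intercept(y, x)
x_needed = a_xy + b_xy * y0

print(f"predicted Y at X = {x0}: {y_pred:.3f}")
print(f"X for Y = {y0} (inverting Y on X): {x_inverted:.3f}")
print(f"X for Y = {y0} (regression of X on Y): {x_needed:.3f}")
```

The two answers for X differ unless the points lie exactly on a line, which is why the choice between the two regressions matters.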

12.4 The relationship between Regression and Correlation

1. Distinction and connection

• Distinction:

Correlation: Both X and Y are random

Regression: Y is random

X is not random – Type I regression

X is also random – Type II regression

• Connection: When both X and Y are random

1) Same sign for correlation coefficient and regression coefficient

2) t tests are equivalent

$t_r = t_b$
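
The equivalence of the two t tests can be checked numerically. The sketch assumes the usual formulas, t for r equal to r·√(n − 2)/√(1 − r²) and t for b equal to b divided by its standard error; the data and variable names are hypothetical.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
l_xx = sum((xi - x_bar) ** 2 for xi in x)
l_yy = sum((yi - y_bar) ** 2 for yi in y)
l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Correlation coefficient and its t statistic.
r = l_xy / math.sqrt(l_xx * l_yy)
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Regression coefficient and its t statistic.
b = l_xy / l_xx
a = y_bar - b * x_bar
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_b = math.sqrt(ss_res / (n - 2)) / math.sqrt(l_xx)
t_b = b / s_b

print(f"r = {r:.4f}, b = {b:.4f}")
print(f"t_r = {t_r:.4f}, t_b = {t_b:.4f}")
```

Both statistics share the numerator term from the cross-products, so r and b also always have the same sign.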

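3) Coefficient of determination
• Without regression, given the value of $X_i$ we can only predict $\bar{Y}$; the sum of squared residuals is $\sum (Y_i - \bar{Y})^2$
• After regression, given the value of $X_i$ we can predict $\hat{Y}_i$; the sum of squared residuals is $\sum (Y_i - \hat{Y}_i)^2$
• Contribution of regression: $\sum (Y_i - \bar{Y})^2 - \sum (Y_i - \hat{Y}_i)^2$
• It can be proved that the proportion contributed by regression equals $r^2$:

$$\frac{\sum (Y_i - \bar{Y})^2 - \sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2} = r^2$$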
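
A numerical check of the coefficient of determination, assuming it is defined as the share of the total sum of squares removed by the regression. The data and variable names are hypothetical.

```python
import math

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
l_xx = sum((xi - x_bar) ** 2 for xi in x)
l_yy = sum((yi - y_bar) ** 2 for yi in y)
l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = l_xy / l_xx
a = y_bar - b * x_bar

# Without regression: predict Y-bar for every X.
ss_total = sum((yi - y_bar) ** 2 for yi in y)
# After regression: predict a + b*X.
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

contribution = ss_total - ss_res
r_squared_from_ss = contribution / ss_total

r = l_xy / math.sqrt(l_xx * l_yy)
print(f"(SS_total - SS_res) / SS_total = {r_squared_from_ss:.4f}")
print(f"r squared = {r ** 2:.4f}")
```

The two printed values agree, which is the identity stated on the slide.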
2. Cautions for regression and correlation

1) Don’t put any two variables together for correlation and regression; they must have some relation in subject matter;

2) Correlation does not necessarily mean causality -- sometimes the relation may be indirect, or there may be no real relation at all;

3) A big value of r does not necessarily mean a big regression coefficient b;

4) Rejecting $H_0$: $\rho = 0$ does not necessarily mean the correlation is strong; it only means $\rho \neq 0$;

5) A scatter diagram is useful before working with linear correlation and linear regression;

6) The regression equation should not be applied beyond the range of the data set.
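
One of the cautions above, that a big r does not imply a big b, can be illustrated by changing the unit of X: r stays the same while b is rescaled. The sketch uses made-up data and a hypothetical helper `r_and_b`.

```python
import math


def r_and_b(x, y):
    """Return (r, b): correlation and regression coefficient of Y on X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    l_xx = sum((xi - x_bar) ** 2 for xi in x)
    l_yy = sum((yi - y_bar) ** 2 for yi in y)
    l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return l_xy / math.sqrt(l_xx * l_yy), l_xy / l_xx


# Hypothetical data: chest circumference in cm, vital capacity in L.
x_cm = [80.0, 85.0, 90.0, 95.0, 100.0]
y = [3.0, 3.4, 3.9, 4.1, 4.6]

# The same measurements with X expressed in mm.
x_mm = [10.0 * v for v in x_cm]

r_cm, b_cm = r_and_b(x_cm, y)
r_mm, b_mm = r_and_b(x_mm, y)

print(f"X in cm: r = {r_cm:.4f}, b = {b_cm:.4f}")
print(f"X in mm: r = {r_mm:.4f}, b = {b_mm:.4f}")
```

The correlation is close to 1 in both cases, yet b is small and shrinks by a factor of 10 when X is recorded in mm, because b carries the units of Y per unit of X.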