slide1
Download
Skip this Video
Download Presentation
MVS 250: V. Katch

Loading in 2 Seconds...

play fullscreen
1 / 38

MVS 250: V. Katch - PowerPoint PPT Presentation


  • 57 Views
  • Uploaded on

S TATISTICS. Chapter 5 Correlation/Regression. MVS 250: V. Katch. Paired Data is there a relationship if so, what is the equation use the equation for prediction. Overview. Correlation. Correlation

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' MVS 250: V. Katch' - lane-cherry


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
slide1

STATISTICS

Chapter 5 Correlation/Regression

MVS 250: V. Katch

overview
Paired Data

is there a relationship

if so, what is the equation

use the equation for prediction

Overview
definition
Correlation

exists between two variables when one of them is related to the other in some way

Definition
assumptions
1. The sample of paired data (x,y) is a random sample.

2. The pairs of (x,y) data have a bivariate normal distribution.

Assumptions
definition1
Scatterplot (or scatter diagram)

is a graph in which the paired (x,y) sample data are plotted with a horizontal x axis and a vertical y axis. Each individual (x,y) pair is plotted as a single point.

Definition
positive linear correlation
Positive Linear Correlation

y

y

y

x

x

x

(a) Positive

(b) Strong

positive

(c) Perfect

positive

Scatter Plots

negative linear correlation
Negative Linear Correlation

y

y

y

x

x

x

(d) Negative

(e) Strong

negative

(f) Perfect

negative

Scatter Plots

no linear correlation
No Linear Correlation

y

y

x

x

(h) Nonlinear Correlation

(g) No Correlation

Scatter Plots

slide12

xy/n - (x/n)(y/n)

r =

(SDx) (SDy)

  • Definition
  • Linear Correlation Coefficient r

measures strength of the linear relationship between paired x and y values in a sample

Where xy/n is the mean of the cross products; (x/n) is the mean of the x variable; (y/n) is the mean of the y variable; SDx is the standard deviation of the x variable and SDy is the standard deviation of the x variable

notation for the linear correlation coefficient
n number of pairs of data presented

 denotes the addition of the items indicated.

x/ndenotes the mean of all x values.

y/ndenotes the mean of all y values.

xy/ndenotes the mean of the cross products [x times y, summed; divided by n]

r linear correlation coefficient for a sample

 linear correlation coefficient for a population

Notation for the Linear Correlation Coefficient
slide14
Round to three decimal places

Use calculator or computer if possible

Rounding the Linear Correlation Coefficient r

properties of the linear correlation coefficient r
1. -1 r 1

2. Value of r does not change if all values of either variable are converted to a different scale.

3. The r is not affected by the choice of x and y. Interchange x and y and the value of r will not       change.

4. r measures strength of a linear relationship.

Properties of the Linear Correlation Coefficient r
interpreting the linear correlation coefficient
If the absolute value of r exceeds the value in Sig. Table, conclude that there is a significant linear correlation.

Otherwise, there is not sufficient evidence to support the conclusion of significant linear correlation.

Remember to use n-2

Interpreting the Linear Correlation Coefficient
common errors involving correlation
1. Causation: It is wrong to conclude that correlation implies causality.

2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.

3. Linearity: There may be some relationship between x and y even when there is no significant linear correlation.

Common Errors Involving Correlation
common errors involving correlation1

250

200

150

Distance

(feet)

100

50

0

1

2

3

4

5

6

7

8

0

Time (seconds)

Common Errors Involving Correlation
correlation calculations
Correlation Calculations

Rank Order Correlation -RhoPearson’s - r

rank order correlation cont
Rank Order Correlation, cont

Rho = 1- [6 (∑D2) / N (N2-1)]

Rho = 1- [6(58)/10(102-1)]

Rho = 1- [348 / 10 (100 -1)]

Rho = 1- [348 / 990]

Rho = 1- 0.352

Rho = 0.648

(∑D2 = 58)

N=10

pearson s r

xy/n - (x/n)(y/n)

r =

(SDx) (SDy)

Pearson’s r

r = 32.86 - (5.5) (5.5)/(3.03) (3.03)

r = 35.86 - 30.25 / 9.09

r = 5.61 / 9.09

r = 0.6172

pearson s r1
Pearson’s r

Excel Demonstration

slide25

Data from the Garbage Project

x Plastic (lb)

0.27

2

1.41

3

2.19

3

2.83

6

2.19

4

1.81

2

0.85

1

3.05

5

y Household

Is there a significant linear correlation?

slide26

Data from the Garbage Project

x Plastic (lb)

0.27

2

1.41

3

2.19

3

2.83

6

2.19

4

1.81

2

0.85

1

3.05

5

y Household

Is there a significant linear correlation?

slide27

Data from the Garbage Project

x Plastic (lb)

0.27

2

1.41

3

2.19

3

2.83

6

2.19

4

1.81

2

0.85

1

3.05

5

y Household

r = 0.842

R2 = 0.71

Is there a significant linear correlation?

slide28

= .01

= .05

n

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

25

30

35

40

45

50

60

70

80

90

100

.950

.878

.811

.754

.707

.666

.632

.602

.576

.553

.532

.514

.497

.482

.468

.456

.444

.396

.361

.335

.312

.294

.279

.254

.236

.220

.207

.196

.999

.959

.917

.875

.834

.798

.765

.735

.708

.684

.661

.641

.623

.606

.590

.575

.561

.505

.463

.430

.402

.378

.361

.330

.305

.286

.269

.256

Is there a significant linear correlation?

n = 8  = 0.05 H0:  = 0

H1 : 0

Test statistic is r = 0.842

Critical values are r = - 0.707 and 0.707

(Table R with n = 8 and = 0.05)

TABLE R Critical Values of the Pearson Correlation Coefficient r

slide29

Is there a significant linear correlation?

0.842 > 0.707, That is the test statistic does fall within the

critical region.

Therefore, we REJECT H0:  = 0 (no correlation) and conclude

there is a significant linear correlation between the weights of

discarded plastic and household size.

Reject

= 0

Fail to reject

 = 0

Reject

= 0

1

- 1

r = - 0.707

0

r = 0.707

Sample data:

r = 0.842

slide30

Method 1: Test Statistic is t

(follows format of earlier chapters)

formal hypothesis test
To determine whether there is a significant linear correlation between two variables

Two methods

Both methods let H0: =

(no significant linear correlation)

H1: 

(significant linear correlation)

Formal Hypothesis Test
slide32
Test statistic: r

Critical values: Refer to Table R

(no degrees of freedom)

Method 2: Test Statistic is r

(uses fewer calculations)

slide33
Test statistic: r

Critical values: Refer to Table A-6

(no degrees of freedom)

Method 2: Test Statistic is r

(uses fewer calculations)

Reject

= 0

Fail to reject

 = 0

Reject

= 0

r = 0.811

1

r = - 0.811

0

-1

Sample data:

r = 0.828

slide34

Method 1: Test Statistic is t

(follows format of earlier chapters)

Test statistic:

r

t =

1 - r 2

n - 2

Critical values:

use Table T with

degrees of freedom = n - 2

testing for a linear correlation

Select a

significance

level 

Calculate r using

Formula 9-1

r

The test statistic is

The test statistic is

r

t =

Critical values of t are from

Table A-6

1 - r 2

n -2

Critical values of t are from Table A-3 with n-2 degrees of freedom

If the absolute value of the

test statistic exceeds the

critical values, reject H0:  = 0

Otherwise fail to reject H0

If H0 is rejected conclude that there

is a significant linear correlation.

If you fail to reject H0, then there is

not sufficient evidence to conclude

that there is linear correlation.

Start

Let H0:  = 0

H1:  0

METHOD 1

METHOD 2

Testing for a

Linear Correlation

why does the critical value of r increase as sample size decreases
Why does the critical value of r increase as sample size decreases?

A correlation by chance is more likely.

coefficient of determination effect size
Coefficient of Determination(Effect Size)

r2

The part of variance of one variable that can be explained by the variance of a related variable.

slide38

Justification for r Formula

 (x -x) (y -y)

r=

(x, y) centroid of sample points

(n -1) Sx Sy

x = 3

y

x - x = 7- 3 = 4

(7, 23)

24

20

y - y = 23 - 11 = 12

Quadrant 1

Quadrant 2

16

12

y = 11

(x, y)

8

Quadrant 3

Quadrant 4

4

x

0

0

1

2

3

4

5

6

7

ad