# Chapter 4 - PowerPoint PPT Presentation

1 / 26

Chapter 4. Scatterplots and Correlation. Explanatory and Response Variables. Studying the relationship between two variables. Measuring both variables on the same individuals. a response variable measures an outcome of a study

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Chapter 4

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

## Chapter 4

Scatterplots and Correlation

Chapter 4

### Explanatory and Response Variables

• Studying the relationship between two variables.

• Measuring both variables on the same individuals.

• a response variable measures an outcome of a study

• an explanatory variable explains or influences changes in a response variable

• sometimes there is no distinction

Chapter 4

### Case study

♦ In a study to determine whether surgery

or chemotherapy results in higher survival

rates for some types of cancer.

♦ Whether or not the patient survived is one

surgery or chemotherapy is the other variable.

♦ Which is the explanatory variable and

which is the response variable?

Chapter 4

### Scatterplot

• Graphs the relationship between two quantitative (numerical) variables measured on the same individuals.

• Usually plot the explanatory variable on the horizontal (x) axis and plot the response variable on the vertical (y) axis.

Chapter 4

Scatterplot

Relationship between mean SAT verbal score and percent of high school grads taking SAT

Chapter 4

### Scatterplot

• Look for overall pattern and deviations from this pattern

• Describe pattern by form, direction, and strength of the relationship

• Look for outliers

Chapter 4

### Linear Relationship

Some relationships are such that the points of a scatterplot tend to fall along a straight line -- linear relationship

Chapter 4

### Direction

• Positive association

◙ A positive value for the correlation implies a positive association.

◙ large values of X tend to be associated with large values of Y and small valuesof X tend to be associated with small values of Y.

• Negative association

◙ A negative value for the correlation implies a negative or inverse association

◙ large values of X tend to be associated with small values of Y and vice versa.

Chapter 4

### Examples

From a scatterplot of college students, there is a positive associationbetween verbal SAT score and GPA.

For used cars, there is a negative association between the age of the car and the selling price.

Chapter 4

Chapter 4

### Measuring Strength & Directionof a Linear Relationship

• How closely does a straight line (non-horizontal) fit the points of a scatterplot?

• The correlation coefficient (often referred to as just correlation): r

• measure of the strength of the relationship: the stronger the relationship, the larger the magnitude of r.

• measure of the direction of the relationship: positive r indicates a positive relationship, negative r indicates a negative relationship.

Chapter 4

### Correlation Coefficient

• special values for r :

• a perfect positive linear relationship would have r = +1

• a perfect negative linear relationship would have r = -1

• if there is no linear relationship, or if the scatterplot points are best fit by a horizontal line, then r = 0

• Note: r must be between -1 and +1, inclusive

• both variables must be quantitative; no distinction between response and explanatory variables

• r has no units; does not change when measurement units are changed (ex: ft. or in.)

Chapter 4

Chapter 4

### Examples of Correlations

• Husband’s versus Wife’s ages

• r = .94

• Husband’s versus Wife’s heights

• r = .36

• Professional Golfer’s Putting Success: Distance of putt in feet versus percent success

• r = -.94

• Chapter 4

### Not all Relationships are Linear Miles per Gallon versus Speed

• Linear relationship?

• Correlation is close to zero.

Chapter 4

### Not all Relationships are Linear Miles per Gallon versus Speed

• Curved relationship.

Chapter 4

### Problems with Correlations

• Outliers can inflate or deflate correlations (see next slide)

• Groups combined inappropriately may mask relationships (a third variable)

• groups may have different relationships when separated

Chapter 4

### Outliers and Correlation

A

B

For each scatterplot above, how does the outlier affect the correlation?

A: outlier decreases the correlation

B: outlier increases the correlation

Chapter 4

### Correlation Calculation

• Suppose we have data on variables X and Y for n individuals:

x1, x2, … , xnand y1, y2, … , yn

• Each variable has a mean and std dev:

Chapter 4

### Case Study

Per Capita Gross Domestic Product

and Average Life Expectancy for

Countries in Western Europe

Chapter 4

Chapter 4

Chapter 4

Chapter 4

### Summary

• Explanatory variable Vs responsible variable

• Scatterplot: display the relationship between two quantitative variables measured on the same individuals

• Overall pattern of scatterplot showing:

☻ Direction☻Form☻ Strength

♦ Correlation coefficientr:Measures only straight-line

linear relationship

♦ R indicate the direction of a linear relationship by sign

R > 0, positive association R< 0, negative association

♦ The range of R:-1 ≤ R ≤ 1,

Chapter 4

### Scatterplots

Which of the following scatterplots displays the stronger linear relationship?

• Plot A

• Plot B

• Same for both

### Correlation

If two quantitative variables, X and Y, have a correlation coefficient r = 0.80, which graph could be a

scatterplot of the two variables?

• Plot A

• Plot B

• Plot C

Plot A ? Plot B? Plot C?