
# Correlation and Regression - PowerPoint PPT Presentation



## PowerPoint Slideshow about 'Correlation and Regression' - liana

Presentation Transcript

### Correlation and Regression

Statistics 2126

• Means, etc., are of course useful

• We might also wonder, “how do variables go together?”

• IQ is a great example

• It goes together with so much stuff

• You tend to put the predictor on the x axis and the predicted on the y, though this is not a hard and fast rule

• A scatterplot is a pretty good EDA tool too eh

• Pick an appropriate scale for your axes

• Plot the (x,y) pairs

• If, as one variable increases, the other variable increases, we have a positive association

• If, as one goes up, the other goes down, we have a negative association

• There could be no association at all

• BTW, I am only talking about straight line relationships

• Not curvilinear

• Take the Yerkes-Dodson Law: as far as the stuff we will talk about is concerned, there is no relationship, yet we know there is
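To see why a curved relationship can look like "no relationship" to a straight-line measure, here is a minimal sketch. The inverted-U data are made up for illustration, and `pearson_r` is a hypothetical helper written out by hand, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical inverted-U data: performance peaks at moderate arousal.
arousal = list(range(-5, 6))                  # centred arousal scores (made up)
performance = [25 - a ** 2 for a in arousal]  # perfectly determined by arousal

r_curve = pearson_r(arousal, performance)
print(round(r_curve, 3))  # r is 0 by symmetry, despite the perfect curve
```

Performance here is completely determined by arousal, yet the straight-line measure comes out at zero.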

• The more the points cluster around a line, the stronger the relationship is

• Height and weight vs. height in cm and height in inches: the second pair falls exactly on a line, the first only clusters around one

• We need something that ignores the units though, so if I did IQ and your income in real money or IQ and your income in that worthless stuff they use across the river, the numbers would be the same
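A quick sketch of the "ignores the units" idea, assuming made-up heights and weights and a hand-written `pearson_r` helper (both are illustrations, not data or code from the lecture). Converting cm to inches or kg to pounds should leave the correlation unchanged.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up heights and weights, for illustration only.
height_cm = [152.0, 160.0, 165.0, 170.0, 178.0, 185.0]
weight_kg = [50.0, 58.0, 63.0, 66.0, 80.0, 84.0]

# Same people, different units.
height_in = [h / 2.54 for h in height_cm]
weight_lb = [w * 2.20462 for w in weight_kg]

r_metric = pearson_r(height_cm, weight_kg)
r_imperial = pearson_r(height_in, weight_lb)
r_same_thing = pearson_r(height_cm, height_in)  # one variable in two units

print(round(r_metric, 4), round(r_imperial, 4), round(r_same_thing, 4))
```

The first two values agree, and height in cm against height in inches gives a correlation of 1, since one is just a rescaling of the other.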

• -1.00 <= r <= +1.00

• The sign indicates ONLY the direction (think of it as going uphill or downhill)

• |r| indicates the strength

• So, r = -.77 is a stronger correlation than r = .40
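One standard way to write r, which the slides do not show, is as an average product of standardized scores. The notation here is an assumption: n is the number of (x, y) pairs, and s_x and s_y are the sample standard deviations.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Dividing each deviation by its standard deviation is what removes the units, and the sign of the products is what gives the uphill or downhill direction.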

• All of these have the same correlation

• r = .7 in each case

• Note the problem of outliers

• Note the problem of two subpopulations
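Both problems can be sketched with a few made-up numbers. Everything below (the data and the `pearson_r` helper) is hypothetical and only meant to show the direction of the effect, not the plots from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Outlier: eight made-up points with no linear association at all...
base_x = [1, 2, 3, 4, 5, 6, 7, 8]
base_y = [5, 3, 6, 4, 5, 3, 6, 4]
r_without = pearson_r(base_x, base_y)

# ...plus a single extreme point far away from the rest.
r_with = pearson_r(base_x + [20], base_y + [20])

# Subpopulations: two made-up groups, each with a perfectly negative trend.
group_a_x, group_a_y = [1, 2, 3, 4], [4, 3, 2, 1]
group_b_x, group_b_y = [11, 12, 13, 14], [14, 13, 12, 11]
r_group_a = pearson_r(group_a_x, group_a_y)
r_group_b = pearson_r(group_b_x, group_b_y)
r_pooled = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)

print(round(r_without, 3), round(r_with, 3))
print(round(r_group_a, 3), round(r_group_b, 3), round(r_pooled, 3))
```

One point drags r from zero to a strong positive value, and pooling two groups that each trend downhill produces a strong uphill correlation. That is a good reason to look at the scatterplot before trusting the number.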

• Correlation is not causation

• I said, correlation is not causation

• Let me say it again, correlation is not causation

• Birth control and the toaster method

• If we could predict y from x

• You know, like an equation

• Remember that in school, you would get an equation, plug in the x and get the y

• Well surprise surprise, there is a method like this in statistics

• Well, we will make mistakes

• We will want to minimize those mistakes

• Those prediction errors or residuals (e) sum to 0

• Damn

• Though guess what we could do…

• Why, square them of course

• So we get a line that minimizes squared residuals

• Regression equation (labels from the slide): Y hat (predicted y) = Y intercept + slope × x
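A minimal sketch of fitting that line by least squares, using the usual closed-form formulas for the slope and intercept. The data are made up, and the names `fit_line` and `sum_squared_error` are hypothetical helpers, not anything from the lecture.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_error(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y_hat = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, roughly y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

best = sum_squared_error(xs, ys, intercept, slope)
nudged_slope = sum_squared_error(xs, ys, intercept, slope + 0.1)
nudged_intercept = sum_squared_error(xs, ys, intercept + 0.1, slope)

print(round(intercept, 3), round(slope, 3))
print(abs(sum(residuals)) < 1e-9)                     # residuals sum to ~0
print(best < nudged_slope, best < nudged_intercept)   # any nudge is worse
```

The raw residuals cancel out, which is why they are useless as a criterion on their own, while the squared residuals get larger as soon as the slope or the intercept is moved away from the fitted values.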

• With a regression line you can predict y from x

• Just because the equation says that some value = a linear combination of numbers, it does not mean that there is necessarily a causal link

• Don’t go outside the range of the x values you used to fit the line

• Linear only
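One way to respect the "stay in range" caution in code is to have the prediction function refuse to extrapolate. This is only a sketch: `predict`, its parameters, and the intercept and slope values below are hypothetical, chosen for illustration.

```python
def predict(x, intercept, slope, x_min, x_max):
    """Predict y from x, but only inside the range of x used for the fit."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x={x} is outside the fitted range [{x_min}, {x_max}]"
        )
    return intercept + slope * x

# Hypothetical fitted line, valid for x between 1 and 5.
INTERCEPT, SLOPE = 0.05, 1.99
X_MIN, X_MAX = 1, 5

inside = predict(3, INTERCEPT, SLOPE, X_MIN, X_MAX)
print(round(inside, 2))

try:
    predict(10, INTERCEPT, SLOPE, X_MIN, X_MAX)
except ValueError as err:
    print("refused:", err)
```

Raising an error is a design choice. A warning would also work, but an error makes it harder to extrapolate by accident.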