
LSP 121


mdrake

Presentation Transcript


  1. LSP 121 Introduction to Correlation

  2. Correlation • Correlation – when a relationship exists between two sets of data • The news is filled with examples of correlation • If you eat so many helpings of tomatoes… • One alcoholic beverage a day… • Driving faster than the speed limit… • Women who smoke during pregnancy… • If you eat only fast food for 30 days… • If your parents did not have offspring, then you won’t either (huh?)

  3. How Do You Calculate Correlation in Excel? • Make an XY scatterplot of the data, putting one variable on the x-axis and one variable on the y-axis. • Insert a linear trendline on the graph and include the R² value • Interpret the results
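For readers without Excel, the trendline steps above can be sketched in plain Python. This is only an illustration, assuming an ordinary least-squares line; the function name `linear_trendline` and the sample numbers are hypothetical and do not come from the course files.

```python
def linear_trendline(xs, ys):
    """Return (slope, intercept, r_squared) for the least-squares line
    y = slope * x + intercept through the points (xs[i], ys[i])."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no defined R^2")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / syy
    return slope, intercept, r_squared


# Hypothetical illustrative data (made up for this sketch)
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 155, 160, 180]
slope, intercept, r2 = linear_trendline(heights_in, weights_lb)
print(f"y = {slope:.3f}x + {intercept:.3f}, R^2 = {r2:.3f}")
```

The printed equation and R² play the same role as the trendline label on the scatterplot.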

  4. Interpreting the Results • The higher the R² value, the more closely the data points follow the trendline • If you only have a few data points, then you need a higher R² value in order to conclude there is a correlation • Crude estimate: if R² > 0.5, most people say there is a correlation; if R² < 0.3, the correlation is essentially non-existent • R² between 0.3 and 0.5? Gray area!
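The crude cutoffs on this slide can be written as a small helper. The function name `interpret_r_squared` is hypothetical, and treating exactly 0.3 and exactly 0.5 as "gray area" is an assumption, since the slide does not say how to handle the boundaries. The sketch also ignores the slide's caveat about small samples.

```python
def interpret_r_squared(r_squared):
    """Apply the slide's crude rule of thumb to an R^2 value."""
    if not 0 <= r_squared <= 1:
        raise ValueError("R^2 must be between 0 and 1")
    if r_squared > 0.5:
        return "correlation"
    if r_squared < 0.3:
        return "essentially no correlation"
    # Assumption: the boundary values 0.3 and 0.5 fall in the gray area
    return "gray area"


for value in (0.74, 0.41, 0.12):
    print(value, "->", interpret_r_squared(value))
```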

  5. Examples • Look at: • CigarettesBirthweight.xls • SpeedLimits.xls • HeightWeight.xls • Grades.xls • WineConsumption.xls • BreastCancerTemperature.xls

  6. How Do We Calculate Correlation in SPSS/PASW? • In SPSS, click on Analyze -> Correlate -> Bivariate • Select the two columns of data you want to analyze (move them from the left box to the right box) • You can actually pick more than two columns, but we’ll keep it simple for now

  7. How Do We Calculate Correlation in SPSS/PASW? • Make sure the checkbox for Pearson Correlation Coefficients is checked • Click OK to run the correlation • You should get an output window something like the following slide

  8. The correlation between height and weight is 0.861. The Pearson Correlation value is not the same as Excel’s R-squared value; it can be positive or negative.

  9. Positive and Negative Correlation • Positive correlation: as the values of one variable increase, the values of a second variable increase (values from 0 to 1.0) • Negative correlation: as the values of one variable increase, the values of a second variable decrease (values from 0 to -1.0) • Note: Unless R is 0, 1.0, or -1.0, the size of the SPSS R value will be greater than Excel’s R² value, because R² is R squared: R = 0.5 is equivalent to R² = 0.25
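The relationship between Pearson's R and R² can be checked with a few lines of Python. This is a minimal sketch using the standard Pearson formula; the function name `pearson_r` and the sample values are hypothetical, chosen so that y falls as x rises.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y tends to decrease as x increases
x = [1, 2, 3, 4, 5, 6]
y = [92, 88, 85, 79, 70, 68]
r = pearson_r(x, y)
print(f"R = {r:.3f} (the sign shows the direction)")
print(f"R^2 = {r * r:.3f} (always between 0 and 1, the sign is lost)")
```

Squaring R discards the sign, which is why an R² value alone cannot show whether a correlation is positive or negative.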

  10. Positive and Negative Correlation • There is a negative correlation between TV viewing and class grades—students who spend more time watching TV tend to have lower grades (or, students with higher grades tend to spend less time watching TV).

  11. Positive and Negative Correlation [Figure: two panels, labeled "Positive correlation" and "Negative correlation"]

  12. Positive and Negative Correlation • When looking for correlation, a positive correlation is not necessarily stronger than a negative one; strength depends on how far the value is from 0, not on its sign • Which correlation is the strongest? -.34 .72 -.81 .40 -.12
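One way to compare the values on this slide is to rank them by absolute value, so that the sign is ignored. A short sketch using only built-in functions:

```python
# The five correlation values listed on the slide
values = [-0.34, 0.72, -0.81, 0.40, -0.12]

# Strength is the distance from 0, so compare absolute values
strongest = max(values, key=abs)
ranked = sorted(values, key=abs, reverse=True)

print(strongest)  # -0.81
print(ranked)     # [-0.81, 0.72, 0.4, -0.34, -0.12]
```

The strongest relationship here is the negative one, -0.81, even though 0.72 is the largest number.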

  13. What Can We Conclude? • If two variables are correlated, then we can predict one based on the other • But correlation does NOT imply causation! • It might be the case that having more education causes a person to earn a higher income. It might be the case that having higher income allows a person to go to school more. There could also be a third variable. Or a fourth. Or a fifth…

  14. What Can We Conclude? • Causality – one variable, say A, actually causes the change in B. In the absence of any other evidence, data from observational studies simply cannot be used to establish causation.

  15. What Can We Conclude? • Common underlying cause or causes – most important one – A is correlated to B, but there is a third factor C (the common underlying cause) that causes the changes in both A and B. • Example: as ice cream sales go up, so do crime rates.
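A common underlying cause can be illustrated with made-up numbers. In the sketch below, A and B are each computed from a third factor C plus a small offset, and neither is computed from the other, yet they come out strongly correlated. All values and names are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical common cause C and small fixed offsets standing in for noise
c = [1, 2, 3, 4, 5, 6, 7, 8]
offsets_a = [0.5, -0.5, 0.3, -0.3, 0.4, -0.4, 0.2, -0.2]
offsets_b = [-0.3, 0.4, -0.2, 0.5, -0.5, 0.3, -0.4, 0.2]

# A and B both depend on C; A never appears in B's formula or vice versa
a = [3 * ci + na for ci, na in zip(c, offsets_a)]
b = [2 * ci + nb for ci, nb in zip(c, offsets_b)]

r_ab = pearson_r(a, b)
print(f"R between A and B = {r_ab:.3f}")
```

A high R between A and B in this setup reflects their shared dependence on C, not any effect of A on B.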

  16. What Can We Conclude? • Sheer coincidence – the two variables have nothing in common, but they produce a strong R or R² value • Both variables are changing over time – divorce rates are going up and so are drug offenses. Is an increase in divorce causing more people to use drugs (and get caught)?
