Understanding Relationships through Scatterplots and Correlation Analysis
Learn how to examine relationships between variables using scatterplots and correlation analysis. Understand the strength, direction, and patterns in data to interpret results effectively.
When there is more than just one variable… • Ask • What individuals? • What variables? How are they measured? • All quantitative? Or at least one categorical? • Simply explore? • Or do you think one variable explains or causes changes in another?
SCATTERPLOTS… • The most effective way to show the relationship between two quantitative variables measured on the same individuals • The values of the explanatory variable appear on the horizontal axis and the values of the response variable on the vertical axis • Each point represents one individual
Remember …. You already know this…. • Explanatory Variable – attempts to explain the observed outcomes • INDEPENDENT • Response Variable – measures an outcome of a study • DEPENDENT
Problem 3.1 • The amount of time a student spends studying for a statistics exam and the grade on the exam. • The weight and height of a person • The amount of yearly rainfall and the yield of a crop • A student’s grades in statistics and in French • The occupational class of a father and of a son.
Interpreting a Scatterplot • Look for overall pattern and distribution • Describe by • FORM – clusters, linear, curved, etc. • DIRECTION – positive, negative, none • STRENGTH – strong, medium, weak, etc. • OUTLIERS – points that fall outside the overall pattern
Drawing Scatterplots • Scale both axes • Label both axes • Don’t compress the plot; make it large enough that the points use the whole grid
Homework: Friday • 3, 7, 10, 12, 15, 16, 17
Correlation – measures the direction and strength of the linear relationship between two quantitative variables. Usually written as r.
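The slides refer to a formula for r but do not show it. The usual textbook definition, assumed here, averages the products of the standardized x and y values for the n individuals, where x̄, ȳ are the means and s_x, s_y the standard deviations:

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Because each factor is a standardized value, the units of x and y cancel, which is why r is just a number.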
Facts to know about correlation… • Makes no distinction between explanatory and response variables (it doesn’t matter which variable you call x or y in the formula) • Requires that both variables be quantitative • r uses standardized values of the observations, so r does not change if the units of measurement are changed (correlation itself has NO unit of measurement; it is JUST a number)
Continued… • Positive r means positive association; negative r means negative association • r is always a number between -1 and 1. r near zero means a VERY weak linear relationship. The strength increases as r moves closer to either -1 or 1. The extreme values r = 1 and r = -1 occur only when there is a perfectly linear relationship • ONLY measures linear relationships; does not measure curved relationships • NOT resistant: r is STRONGLY affected by a few outliers
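The facts above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `pearson_r` and all of the data are made up for illustration, and r is computed from the standard standardized-values definition.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Hypothetical data (assumed for illustration): hours studied and exam score.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 60, 61, 70, 74, 83]

r = pearson_r(hours, score)
r_swapped = pearson_r(score, hours)                      # swap x and y
r_minutes = pearson_r([h * 60 for h in hours], score)    # hours -> minutes
r_outlier = pearson_r(hours + [7], score + [20])         # add one unusual point
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # perfect curve y = x^2

print(f"r = {r:.3f}, swapped = {r_swapped:.3f}, minutes = {r_minutes:.3f}")
print(f"with one outlier = {r_outlier:.3f}, curved data = {r_curve:.3f}")
```

Swapping the variables or changing the units leaves r unchanged, a single outlier moves it a long way, and a perfect but curved relationship can still give r close to zero.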
Remember that correlation is not a complete description of two-variable data, even when the relationship is linear. You should also give the means and standard deviations of both x and y. (Means and standard deviations, because these are the measures used in the formula for r.)
Finding correlation on your calculator… • MAKE sure your Diagnostics are ON (otherwise r is not displayed) • Enter the two data sets into L1 and L2 • Stat – Calc – 8:LinReg(a+bx) • LinReg(a+bx) L1, L2, Y1
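For checking calculator output by hand, here is a rough Python counterpart of the LinReg(a+bx) step. It is only a sketch: `linreg_a_bx` is a hypothetical helper, the lists named `L1` and `L2` hold made-up data, and the slope and intercept use the standard least-squares relations b = r·s_y/s_x and a = ȳ − b·x̄.

```python
from statistics import mean, stdev


def linreg_a_bx(xs, ys):
    """Return (a, b, r) for the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # Correlation from the sum of cross-deviations.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b, r


# Made-up data standing in for the calculator lists L1 and L2.
L1 = [1, 2, 3, 4, 5]
L2 = [3, 5, 7, 9, 11]  # lies exactly on y = 1 + 2x

a, b, r = linreg_a_bx(L1, L2)
print(f"a = {a:.4f}, b = {b:.4f}, r^2 = {r * r:.4f}, r = {r:.4f}")
```

With real data, the printed a, b, r², and r should agree with the calculator's regression screen up to rounding.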
Tuesday’s Homework • 12, 25, 26, 30, 37