
Understanding Correlation and Linear Regression in AP Statistics

This note card summarizes the key concepts of correlation and linear regression as covered in Chapter 3 of AP Statistics. It explores the direction (positive and negative associations), form (linear, exponential, quadratic), and strength (strong, moderate, weak) of relationships between variables. It highlights the importance of residual analysis and discusses the correlation coefficient (r) and the coefficient of determination (r²). Additionally, it identifies the limitations of regression analysis, including extrapolation and the influence of lurking variables.

giulio

Presentation Transcript


  1. AP Statistics Chapter 3 Notecards

  2. Interpreting Correlation
     1) Direction
        • Positively associated – above-average values of the explanatory variable tend to go with above-average values of the response
        • Negatively associated – above-average values of the explanatory variable tend to go with below-average values of the response
     2) Form – (linear, exponential, quadratic)
     3) Strength – how closely the points follow the form (strong, moderate, weak)
     4) Outliers – individual observations outside the overall pattern (know the difference between an outlier and an influential point)
     5) IN CONTEXT!

  3. Linear Regression/Residuals
     ŷ = ax + b
     a = slope = average change in predicted y for every one-unit change in x
     b = predicted y-intercept
     ŷ = predicted y for a given x
     (x̄, ȳ) is a point on the regression line
     r – correlation coefficient (-1 ≤ r ≤ 1)
     - The sign of r gives the direction: a negative r is a negative association, a positive r a positive one
     - The closer r is to 1 or -1, the stronger the linear correlation. An r of 0 implies no linear correlation between x and y
     - Remember, correlation does NOT imply causation
     r² – coefficient of determination: the proportion of the variation in y that is explained by the linear relationship with x
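As a minimal sketch of how r and r² come out of raw data, the Python below computes r as the sum of the products of z-scores divided by n - 1. The function names and the five data points are made up for illustration and are not part of the notecards.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """r = sum of the products of z-scores, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (len(xs) - 1)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(f"r   = {r:.3f}")
print(f"r^2 = {r ** 2:.3f}")
```

For these points r is about 0.775 and r² is about 0.6, so roughly 60% of the variation in y is accounted for by the linear relationship with x.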

  4. An r near 1 or -1 is not enough to ensure the relationship is linear – you must look at the residual plot. If the residual plot shows a pattern, a linear model is not a good assumption.
     Residuals are scattered, with no apparent pattern - LINEAR
     Residuals show a definite pattern - NONLINEAR
     • Every x in the data has two y's associated with it: the observed value and the value the equation predicts. The residual is observed – predicted (y – ŷ).
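The point about residual patterns can be sketched in a few lines of Python. The data here are hypothetical (exactly y = x², chosen because it is clearly curved), and `fit_line` and `residuals` are illustrative helper names, not anything from the course materials.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y-hat = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    return a, b

def residuals(xs, ys, a, b):
    """Residual = observed y minus predicted y."""
    return [y - (a * x + b) for x, y in zip(xs, ys)]

# Hypothetical curved data: y = x^2
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
res = residuals(xs, ys, a, b)
print(f"y-hat = {a:.1f}x + {b:.1f}")
print("residuals:", [round(e, 2) for e in res])
```

The residuals come out as 2, -1, -2, -1, 2: positive, then negative, then positive again. That U-shape is a definite pattern, even though r for these points is about 0.98, so a line is not an appropriate model.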

  5. Linear equation without data
     We can find the equation of the regression line if we know x̄, s_x, ȳ, s_y, and r.
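A short sketch of that calculation, using the standard least-squares facts that the slope is r·s_y/s_x and that the line passes through (x̄, ȳ). The function name and the summary statistics passed to it are hypothetical.

```python
def line_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return slope a and intercept b of y-hat = a*x + b."""
    a = r * s_y / s_x          # slope
    b = y_bar - a * x_bar      # the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical summary statistics, for illustration only.
a, b = line_from_summary(x_bar=10, s_x=2, y_bar=50, s_y=8, r=0.5)
print(f"y-hat = {a}x + {b}")
```

With these made-up values the slope is 0.5 · 8 / 2 = 2 and the intercept is 50 - 2 · 10 = 30.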

  6. Exponential/Power Transformations
     1) Look at the scatterplot – if linear, run the regression and check the residual plot
     If linear is not appropriate, try:
     2) log y vs x
        If log y vs x is linear: exponential relationship, log y = ax + b
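To illustrate the exponential case, here is a minimal sketch that regresses log y on x and then back-transforms the result. It assumes base-10 logs and uses hypothetical data that follow y = 3 · 2^x exactly; `fit_line` is an illustrative helper.

```python
from math import log10

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y-hat = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    return a, y_bar - a * x_bar

# Hypothetical data following y = 3 * 2^x exactly.
xs = [0, 1, 2, 3, 4]
ys = [3 * 2 ** x for x in xs]

# Regress log y on x: log y = a*x + b
log_ys = [log10(y) for y in ys]
a, b = fit_line(xs, log_ys)

# Back-transform: y = 10^(a*x + b) = (10^b) * (10^a)^x
coefficient = 10 ** b
base = 10 ** a
print(f"log y = {a:.4f}x + {b:.4f}")
print(f"y = {coefficient:.3f} * {base:.3f}^x")
```

Because the made-up data are exactly exponential, the back-transformed model recovers a coefficient of about 3 and a base of about 2. Real data would scatter around the line, so the residual plot of log y vs x still needs checking.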

  7. Exponential/Power Transformations (continued)
     3) log y vs log x
        If log y vs log x is linear: power relationship, log y = a log x + b
     4) Run the appropriate regression
     5) Plot the resulting equation for y on the original data to check
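The power case works the same way, except that both variables are logged. The sketch below assumes base-10 logs and hypothetical data that follow y = 2x³ exactly; the last lines compare the back-transformed predictions with the original y values, in the spirit of the final check.

```python
from math import log10

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y-hat = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    return a, y_bar - a * x_bar

# Hypothetical data following y = 2 * x^3 exactly (x must be positive).
xs = [1, 2, 3, 4, 5]
ys = [2 * x ** 3 for x in xs]

# Regress log y on log x: log y = a*log x + b
log_xs = [log10(x) for x in xs]
log_ys = [log10(y) for y in ys]
a, b = fit_line(log_xs, log_ys)

# Back-transform: y = (10^b) * x^a
coefficient = 10 ** b
print(f"y = {coefficient:.3f} * x^{a:.3f}")

# Check the back-transformed model against the original data.
predicted = [coefficient * x ** a for x in xs]
for x, y, y_hat in zip(xs, ys, predicted):
    print(x, y, round(y_hat, 2))
```

Here the slope of the log-log line is the exponent (about 3) and 10^b is the coefficient (about 2).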

  8. Limitations of regression
     • Describes only linear relationships, so nonlinear data must be transformed first
     • Strongly influenced by extreme observations (non-resistant)
     • Extrapolation – prediction outside the range of observed x values (can yield incorrect predictions)
     • Lurking variables – variables that have an important effect on the relationship but are not included in the study
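The non-resistance point can be seen by refitting a line after adding one extreme observation. This is only a sketch with made-up numbers: the five base points and the added point (20, 0) are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y-hat = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    return a, y_bar - a * x_bar

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a_without, b_without = fit_line(xs, ys)

# Add a single extreme observation far from the other x values.
a_with, b_with = fit_line(xs + [20], ys + [0])

print(f"without extreme point: y-hat = {a_without:.2f}x + {b_without:.2f}")
print(f"with extreme point:    y-hat = {a_with:.2f}x + {b_with:.2f}")
```

One added point is enough to flip the slope from positive (0.6) to negative, which is why an influential point should always be flagged when describing a regression.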
