
Statistics


lori

Presentation Transcript


  1. Statistics Correlation and regression

  2. Introduction • Some methods involve one variable • Is Treatment A as effective in relieving arthritic pain as Treatment B? • Correlation and regression are used to investigate relationships between variables • most commonly linear relationships • between two variables • Is BMD related to dietary calcium level?

  3. Contents • Coefficients of correlation • meaning • values • role • significance • Regression • line of best fit • prediction • significance

  4. Introduction • Correlation • the strength of the linear relationship between two variables • Regression analysis • determines the nature of the relationship • Is there a relationship between the number of units of alcohol consumed and the likelihood of developing cirrhosis of the liver?

  5. Pearson’s coefficient of correlation • r • Measures the strength of the linear relationship between one dependent and one independent variable • curvilinear relationships need other techniques • Values lie between +1 and -1 • perfect positive correlation r = +1 • perfect negative correlation r = -1 • no linear relationship r = 0
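
The three reference values on this slide can be checked with a short calculation. The following is a minimal sketch in plain Python, assuming the usual product-moment formula for r; the function name `pearson_r` and the small data sets are illustrative and do not come from the presentation.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                      # illustrative values only
print(pearson_r(x, [2, 4, 6, 8, 10]))    # perfect positive linear relationship
print(pearson_r(x, [10, 8, 6, 4, 2]))    # perfect negative linear relationship
print(pearson_r(x, [4, 1, 0, 1, 4]))     # U-shaped: related, but not linearly
```

The last call illustrates the point about curvilinear relationships: the two variables are clearly related, yet r is zero because the relationship is not linear.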

  6. Pearson’s coefficient of correlation • [Figure: scatter plots illustrating r = +1, r = -1, r = 0 and r = 0.6]

  7. Scatter plot • BMD: dependent variable • make inferences about • Calcium intake: independent variable • make inferences from • controlled in some cases

  8. Non-Normal data

  9. Normalised

  10. Calculating r • The value and significance of r are calculated by SPSS

  11. SPSS output: scatter plot

  12. SPSS output: correlations

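  13. Interpreting correlation • A large or statistically significant r does not necessarily imply: • a strong relationship • significance depends on sample size: with a large sample even a small r can be significant, and with a small sample a large r can arise by chance • cause and effect • e.g. a strong correlation between the number of televisions sold and the number of cases of paranoid schizophrenia does not mean that watching TV causes paranoid schizophrenia • may be due to an indirect relationship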
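
One way to see why the size of r has to be read alongside the sample size is the t statistic commonly used to test r, t = r·sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom. The sketch below assumes that test (which relies on the Normality assumptions listed later in the presentation); the function name `t_statistic` and the chosen values of r and n are illustrative.

```python
import math

def t_statistic(r, n):
    """t statistic for testing 'no linear relationship', given r and sample size n."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1:
        raise ValueError("the statistic is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The same modest r looks very different at different sample sizes.
for n in (5, 20, 100):
    print(n, round(t_statistic(0.3, n), 2))
```

The statistic grows with n for a fixed r, so a weak correlation can be statistically significant in a large sample, while a fairly large r from a handful of observations may not be.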

  14. Interpreting correlation • Variation in dependent variable due to: • relationship with independent variable: r² • random factors: 1 - r² • r² is the Coefficient of Determination • e.g. r = 0.661 • r² = 0.661² ≈ 0.44 • less than half of the variation in the dependent variable is due to the independent variable
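
The arithmetic in the example on this slide can be reproduced in a couple of lines. This is only a sketch; the function name `coefficient_of_determination` is illustrative.

```python
def coefficient_of_determination(r):
    """Proportion of variation in the dependent variable associated with the independent variable."""
    return r ** 2

r = 0.661                                  # value used on the slide
explained = coefficient_of_determination(r)
unexplained = 1 - explained
print(f"r^2     = {explained:.2f}")        # 0.44
print(f"1 - r^2 = {unexplained:.2f}")      # 0.56
```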

  15. Agreement • Correlation should never be used to determine the level of agreement between repeated measures: • measuring devices • users • techniques • It measures the degree of linear relationship • 1, 2, 3 and 2, 4, 6 are perfectly positively correlated
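
The 1, 2, 3 versus 2, 4, 6 example can be worked through directly. The sketch below computes r by the usual product-moment formula and then looks at the paired differences, which is one simple way (an assumption here, not something the slides prescribe) to show that the two sets of readings do not agree even though they are perfectly correlated.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

first = [1, 2, 3]    # readings from the slide
second = [2, 4, 6]   # second set of readings from the slide

r = pearson_r(first, second)
differences = [b - a for a, b in zip(first, second)]
mean_difference = sum(differences) / len(differences)

print(r)                 # 1.0: perfectly positively correlated
print(differences)       # [1, 2, 3]: the readings never agree
print(mean_difference)   # 2.0
```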

  16. Assumptions • Errors are differences of predicted values of Y from actual values • To ascribe significance to r: • distribution of errors is Normal • variance is same for all values of independent variable X

  17. Non-parametric correlation • Make no assumptions • Carried out on ranks • Spearman’s ρ (rho) • easy to calculate • Kendall’s τ (tau) • has some advantages over ρ • distribution has better statistical properties • easier to identify concordant / discordant pairs • Usually both lead to same conclusions
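
Both rank-based coefficients are short enough to sketch by hand. The code below assumes two standard definitions: Spearman's coefficient as Pearson's r applied to the ranks (ties given average ranks), and the simplest form of Kendall's coefficient (often called tau-a, with no correction for ties) as the difference between concordant and discordant pairs divided by the number of pairs. All function names and data are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs; no tie correction."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]          # illustrative values only
y = [1, 4, 9, 16, 25]        # increases with x, but not linearly
print(pearson_r(x, y))       # below 1: the relationship is not linear
print(spearman_rho(x, y))    # the ranks agree perfectly
print(kendall_tau_a(x, y))   # every pair is concordant
```

Because both coefficients use only the ordering of the data, they reach their maximum for any steadily increasing relationship, whereas Pearson's r does so only for a straight line.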

  18. Calculation of value and significance • Computer does it!

  19. Role of regression • Shows how one variable changes with another • By determining the line of best fit • linear • curvilinear

  20. Line of best fit • Simplest case: linear • Line of best fit between: • dependent variable Y • BMD • independent variable X • dietary intake of Calcium • Y = a + bX • a: value of Y when X = 0 • b: change in Y when X increases by 1
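
For the linear case Y = a + bX, the line of best fit is usually taken to be the least-squares line. The sketch below assumes that criterion (the slides do not name one) and uses made-up points that lie exactly on Y = 1 + 2X so the result is easy to check; `fit_line` is an illustrative name.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("slope is undefined when all X values are equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx               # change in Y when X increases by 1
    a = mean_y - b * mean_x     # value of Y when X = 0
    return a, b

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])   # illustrative points on Y = 1 + 2X
print(a, b)                                    # 1.0 2.0
```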

  21. Role of regression • Used to predict • the value of the dependent variable • when value of independent variable(s) known • within the range of the known data • extrapolation risky! • relation between age and bone age • Does not imply causality
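
The advice to predict only within the range of the known data can be built into the prediction step itself. The following is a sketch under the assumption of a fitted straight line Y = a + bX; the function name `predict` and the choice to refuse, rather than merely warn about, out-of-range requests are illustrative design decisions, not part of the presentation.

```python
def predict(a, b, x_new, x_min, x_max):
    """Predict Y = a + b * x_new, but only inside the observed range of X."""
    if not (x_min <= x_new <= x_max):
        raise ValueError(
            f"x = {x_new} is outside the observed range [{x_min}, {x_max}]; "
            "extrapolation is risky"
        )
    return a + b * x_new

# Illustrative line Y = 1 + 2X, fitted from X values between 0 and 3
print(predict(1.0, 2.0, 2.5, x_min=0, x_max=3))   # 6.0

try:
    predict(1.0, 2.0, 10, x_min=0, x_max=3)
except ValueError as err:
    print(err)
```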

  22. SPSS output: regression

  23. Assumptions • Only if statistical inferences are to be made • significance of regression • values of slope and intercept

  24. Assumptions • If values of independent variable are randomly chosen then no further assumptions necessary • Otherwise • as in correlation, assumptions based on errors • balance out (mean=0) • variances equal for all values of independent variable • not related to magnitude of independent variable • seek advice / help

  25. Multivariate regression • More than one independent variable • BMD dependent on: • age • gender • calorific intake • etc

  26. Logistic regression • The dependent variable is binary • yes / no • predict whether a patient with Type 1 diabetes will undergo limb amputation given history of prior ulcer, time diabetic etc • result is a probability • Can be extended to more than two categories • Outcome after treatment • recovered, in remission, died
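
To show how the result comes out as a probability, the sketch below applies the logistic function to a linear combination of predictors, which is the standard form of a binary logistic model. The coefficients and predictor values are entirely hypothetical, chosen only to make the example run; they are not estimates from any data and do not come from the presentation.

```python
import math

def logistic_probability(intercept, coefficients, predictors):
    """Probability of the 'yes' outcome from a binary logistic model."""
    if len(coefficients) != len(predictors):
        raise ValueError("one coefficient is needed per predictor")
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Use the numerically stable form for either sign of z
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# HYPOTHETICAL model: predictors are prior ulcer (1 = yes, 0 = no) and years diabetic
intercept = -3.0
coefficients = [1.2, 0.05]

p_with_ulcer = logistic_probability(intercept, coefficients, [1, 20])
p_without_ulcer = logistic_probability(intercept, coefficients, [0, 20])
print(round(p_with_ulcer, 3), round(p_without_ulcer, 3))
```

Whatever the predictor values, the output always lies between 0 and 1, which is what allows it to be read as a probability.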

  27. Summary • Correlation • strength of linear relationship between two variables • Pearson’s - parametric • Spearman’s / Kendall’s - non-parametric • Interpret with care! • Regression • line of best fit • prediction • multivariate • logistic
