Association

Association. Predicting One Variable from Another. Correlation. Usually refers to Pearson’s r computed on two interval/ratio scale variables. It measures the degree to which variance in one variable is “explained” by a second variable

yanni

Presentation Transcript


  1. Association Predicting One Variable from Another

  2. Correlation • Usually refers to Pearson’s r computed on two interval/ratio scale variables. • It measures the degree to which variance in one variable is “explained” by a second variable • It measures the strength of a linear relationship between the variables

  3. Definition of r
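The formula on this slide was an image and did not survive in the transcript. As a stand-in, the usual sample definition of Pearson's r is given below in standard notation, which may differ from the notation the slide used:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the sample means and n is the number of paired observations.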

  4. Properties of r • r is symmetrical and varies from -1 to +1 • 0 indicates no linear relationship (a non-linear relationship may still be present) • ±1 indicates a perfect linear correlation (knowledge of one variable makes it possible to predict the second one without any error).

  5. Properties of r² • r² is symmetrical and varies from 0 to 1 • r² is the proportion of the variability in one variable that is “explained by” the other variable • cor.test(x, y, method="pearson") • cor(x, y, method="pearson")
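A minimal sketch of the two calls on this slide, using made-up vectors `x` and `y` (hypothetical data, not from the presentation). It also checks two of the stated properties: r is symmetrical, and r² matches the R-squared of a simple linear regression.

```r
# Hypothetical data, invented for illustration only
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

# Pearson's r and its square
r <- cor(x, y, method = "pearson")
r_squared <- r^2

# r is symmetrical: swapping the variables gives the same value
stopifnot(isTRUE(all.equal(r, cor(y, x, method = "pearson"))))

# r^2 equals the proportion of variance explained in a simple regression
fit <- lm(y ~ x)
stopifnot(isTRUE(all.equal(r_squared, summary(fit)$r.squared)))

# cor.test() returns the same estimate plus a significance test
test <- cor.test(x, y, method = "pearson")
print(test$estimate)
print(test$p.value)
```

`cor()` returns only the coefficient, while `cor.test()` also reports a test statistic, p-value, and confidence interval.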

  6. Spearman’s rho • For rank/ordinal data. • Pearson correlation computed on ranks • If the Spearman coefficient is larger than Pearson’s r, it may indicate a monotonic but non-linear relationship • Ties make it difficult to compute exact p-values
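The first two points can be checked directly in R. The sketch below uses an invented monotonic but curved relationship (hypothetical data) to show that Spearman's rho equals Pearson's r computed on ranks, and that rho can exceed r when the relationship is non-linear.

```r
# Hypothetical data: y increases with x, but not linearly
x <- 1:10
y <- exp(x / 2)

# Spearman's rho as reported by cor()
rho <- cor(x, y, method = "spearman")

# The same value obtained as Pearson's r on the ranks
r_on_ranks <- cor(rank(x), rank(y), method = "pearson")

# Pearson's r on the raw values is lower because the relationship is curved
r <- cor(x, y, method = "pearson")

print(c(spearman = rho, pearson_on_ranks = r_on_ranks, pearson = r))
```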

  7. Kendall’s tau • For rank/ordinal data • Evaluate pairs of observations (x_i, y_i) and (x_j, y_j) • Concordant – (x_i > x_j) and (y_i > y_j) OR (x_i < x_j) and (y_i < y_j) • Discordant – (x_i > x_j) and (y_i < y_j) OR (x_i < x_j) and (y_i > y_j)

  8. Kendall’s tau-a
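The formula on this slide is missing from the transcript. The standard definition of tau-a, with C the number of concordant pairs, D the number of discordant pairs, and n the number of observations, is:

```latex
\tau_a = \frac{C - D}{n(n-1)/2}
```

The denominator is the total number of pairs, so tau-a makes no adjustment for ties.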

  9. Kendall’s tau-b • Divide by the total number of pairs, adjusted for ties in each variable
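A rough sketch of how tau-a and tau-b differ, computed by hand from the concordant/discordant definitions. The vectors are invented (hypothetical data with ties in both variables). The final comparison assumes that base R's `cor(..., method = "kendall")` returns tau-b when ties are present, which is the documented behaviour as far as I am aware.

```r
# Hypothetical ranked data with ties in both variables
x <- c(1, 2, 2, 3, 4, 4, 5)
y <- c(1, 3, 2, 3, 5, 4, 4)

n <- length(x)
pairs <- combn(n, 2)                       # every pair of observations (i < j)
dx <- sign(x[pairs[1, ]] - x[pairs[2, ]])  # direction of the difference in x
dy <- sign(y[pairs[1, ]] - y[pairs[2, ]])  # direction of the difference in y

concordant <- sum(dx * dy > 0)             # differences point the same way
discordant <- sum(dx * dy < 0)             # differences point opposite ways
n_pairs <- n * (n - 1) / 2

# tau-a: divide by the total number of pairs, ignoring ties
tau_a <- (concordant - discordant) / n_pairs

# tau-b: remove pairs tied on x from one factor and pairs tied on y from the other
tied_x <- sum(dx == 0)
tied_y <- sum(dy == 0)
tau_b <- (concordant - discordant) /
  sqrt((n_pairs - tied_x) * (n_pairs - tied_y))

print(c(tau_a = tau_a, tau_b = tau_b, cor_kendall = cor(x, y, method = "kendall")))
```

Because ties shrink the tau-b denominator, tau-b is never smaller in absolute value than tau-a on the same data.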

  10. Kendall’s tau-c • For grouped (tabled) data where the table is not square (rows ≠ columns)

  11. Nominal Measures • Measures based on Chi-Square: • Phi coefficient • Cramer’s V • Contingency coefficient • Odds ratio

  12. Phi and Cramer’s V • Phi ranges from 0 to 1 in a 2x2 table but can exceed 1 in larger tables. Cramer’s V adds a correction to keep the maximum value at 1 or less:
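The formula that followed on this slide is missing from the transcript. In standard notation, with χ² the Pearson chi-square statistic, n the total count, and r and c the numbers of rows and columns, the two measures are usually written as:

```latex
\phi = \sqrt{\frac{\chi^2}{n}}, \qquad
V = \sqrt{\frac{\chi^2}{n \cdot \min(r - 1,\; c - 1)}}
```

For a 2x2 table min(r - 1, c - 1) = 1, so V and phi coincide.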

  13. Contingency Coefficient • Ranges from 0 to an upper bound below 1 that depends on the number of rows and columns, with values near the upper bound indicating a strong relationship and 0 indicating no relationship

  14. Odds Ratio • For 2 x 2 tables it is the odds of an outcome in one group divided by the odds of that outcome in the other group; a value of 1 indicates no association

  15. > Table <- xtabs(~Sex+Goods, data=EWG2)
      > Table
              Goods
      Sex      Absent Present
        Female     38      28
        Male       16      30
      > ChiSq <- chisq.test(Table)
      > ChiSq

              Pearson's Chi-squared test with Yates' continuity correction

      data:  Table
      X-squared = 4.7644, df = 1, p-value = 0.02905

  16. > library(vcd)
      > assocstats(Table)
                          X^2 df P(> X^2)
      Likelihood Ratio 5.7073  1 0.016894
      Pearson          5.6404  1 0.017552

      Phi-Coefficient   : 0.224
      Contingency Coeff.: 0.219
      Cramer's V        : 0.224
      > cor(as.numeric(EWG2$Sex), as.numeric(EWG2$Goods), use="complete.obs")
      [1] 0.2244111
      > oddsratio(Table, log=FALSE)
      [1] 2.544643
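The values reported on this slide can be reproduced in base R from the four cell counts shown on slide 15, without the `vcd` package or the `EWG2` data set. The sketch below rebuilds the table by hand (the object is constructed here for illustration; the original analysis used `xtabs()` on the data) and applies the standard formulas for each measure.

```r
# Cell counts copied from the Sex x Goods table on slide 15
Table <- matrix(c(38, 16, 28, 30), nrow = 2,
                dimnames = list(Sex = c("Female", "Male"),
                                Goods = c("Absent", "Present")))

n <- sum(Table)

# Pearson chi-square without Yates' continuity correction
chi2 <- unname(chisq.test(Table, correct = FALSE)$statistic)

# Chi-square based measures of association
phi <- sqrt(chi2 / n)
contingency <- sqrt(chi2 / (chi2 + n))
cramers_v <- sqrt(chi2 / (n * min(dim(Table) - 1)))

# Odds ratio: cross-product ratio of the 2x2 table
odds_ratio <- (Table["Female", "Absent"] * Table["Male", "Present"]) /
  (Table["Female", "Present"] * Table["Male", "Absent"])

print(round(c(chi2 = chi2, phi = phi, C = contingency,
              V = cramers_v, OR = odds_ratio), 4))
```

Note that these measures are based on the uncorrected Pearson statistic (5.6404), not the Yates-corrected value (4.7644) that `chisq.test()` reports by default for 2x2 tables.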
