This lecture explores the use of cross-tabulations to analyze the association between two variables, using the relationship between time of delivery (preterm or term) and housing tenure as the main example. We state the null hypothesis that there is no association between the variables, and test it with the chi-squared test for large samples and Fisher's exact test for small samples. The odds ratio is then introduced as a measure of the difference in the occurrence of an event between two groups, together with its confidence interval.
The analysis of cross-tabulations Lecture 2
Cross-tabulations • Tables of counts or frequencies • Made to analyze the association, relationship, or connection between two variables • This association is difficult to describe statistically • The null hypothesis “There is no association between the two variables” can be tested • Analysis of cross-tabulations with large samples
Delivery and housing tenure • Expected number without any association between delivery and housing tenure
Delivery and housing tenure: if the null hypothesis is true • 899/1443 = 62.3% are house owners. • 62.3% of the 99 preterm deliveries should be among house owners: 99*899/1443 = 61.7
Delivery and housing tenure: if the null hypothesis is true • 899/1443 = 62.3% are house owners. • 62.3% of the 1344 term deliveries should be among house owners: 1344*899/1443 = 837.3
Delivery and housing tenure: if the null hypothesis is true • 258/1443 = 17.9% are council tenants. • 17.9% of the 99 preterm deliveries should be among council tenants: 99*258/1443 = 17.7
Delivery and housing tenure: if the null hypothesis is true • In general: expected count = row total * column total / grand total
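The expected counts on the preceding slides can be reproduced with a few lines of Python. This is a minimal sketch that uses only the marginal totals quoted in the slides; the function name `expected_count` is illustrative and not part of the lecture.

```python
def expected_count(row_total, col_total, grand_total):
    """Expected cell count if the row and column variables are not associated."""
    return row_total * col_total / grand_total


# Marginal totals quoted in the slides
grand_total = 1443
owners, council_tenants = 899, 258
preterm, term = 99, 1344

cells = [
    ("house owner, preterm", owners, preterm),
    ("house owner, term", owners, term),
    ("council tenant, preterm", council_tenants, preterm),
]
for label, row_total, col_total in cells:
    value = expected_count(row_total, col_total, grand_total)
    print(f"{label}: {value:.1f}")
```

The three printed values correspond to the 61.7, 837.3 and 17.7 worked out by hand above.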
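Delivery and housing tenure: if the null hypothesis is true • In general, observed (O) and expected (E) counts are compared with the test statistic: sum over all cells of (O - E)^2 / E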
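Delivery and housing tenure: test for association • If the expected numbers are large, the chi-squared test statistic approximately follows a chi-squared distribution under the null hypothesis. • The degrees of freedom are (r-1)(c-1) = 4 • From Table 13.3, P lies between 1% and 5%: if delivery and housing tenure were not associated, a statistic at least this large would occur in only 1 to 5% of samples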
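The table lookup can be checked numerically. The sketch below rests on two assumptions that are not stated in the slides: that the table has 5 rows and 2 columns (the slides give only the result 4), and that a closed-form tail probability is acceptable, which exists only when the degrees of freedom are even. The function names are illustrative.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for an r x c cross-tabulation."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-squared variable X; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    terms = sum(half**k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


df = degrees_of_freedom(5, 2)  # assumed 5 x 2 layout, giving df = 4
for statistic in (9.488, 13.277):  # standard 5% and 1% critical values for df = 4
    print(statistic, round(chi2_upper_tail_even_df(statistic, df), 3))
```

A statistic between roughly 9.49 and 13.28 therefore gives a P value between 1% and 5% on 4 degrees of freedom.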
Delivery and housing tenure: if the null hypothesis is true • The test indicates whether an association exists, but it is difficult to say anything about the nature of the association.
Chi-squared test for small samples • The test is still valid if the expected values satisfy: • More than 80% are greater than 5 • All are greater than 1
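The rule of thumb for expected values is easy to automate. The following sketch reads the slide literally (strictly more than 80% of the expected values above 5, and all above 1); the function name and the second example list are hypothetical and serve only as illustration.

```python
def chi_square_approximation_ok(expected):
    """Apply the rule of thumb: more than 80% of expected counts > 5, all > 1."""
    values = list(expected)
    share_above_5 = sum(v > 5 for v in values) / len(values)
    return share_above_5 > 0.8 and all(v > 1 for v in values)


# Expected counts worked out earlier in the lecture
print(chi_square_approximation_ok([61.7, 837.3, 17.7]))

# Hypothetical small table: every expected count is below 5
print(chi_square_approximation_ok([2.5, 1.5, 2.5, 1.5]))
```

When the check fails, an exact method such as Fisher's exact test is the usual alternative.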
Fisher’s exact test • An example
Fisher’s exact test • Survivors: • a, b, c, d, e • Deaths: • f, g, h • Table 1 can be made in 5 ways • Table 2: 30 • Table 3: 30 • Table 4: 5 • 70 ways in total
Fisher’s exact test • The probability of finding Table 2 or a more extreme table is (5 + 30)/70 = 0.5
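The counts 5, 30, 30 and 5 can be reproduced by counting combinations. The sketch assumes that the eight subjects are split into two groups of four, which is not stated on the slides but is consistent with the total of 70 ways; indexing the tables by the number of deaths in the first group is also an assumption.

```python
from math import comb

survivors, deaths = 5, 3  # subjects a-e survive, f-h die
group_size = 4            # assumption: two groups of four subjects

# Number of ways to fill the first group with k deaths and (group_size - k) survivors
ways = [comb(deaths, k) * comb(survivors, group_size - k) for k in range(deaths + 1)]
total = comb(survivors + deaths, group_size)

# Probability of the second table or the more extreme first table
p_one_sided = (ways[0] + ways[1]) / total

print(ways, total, p_one_sided)
```

Each table's probability under the null hypothesis is its number of ways divided by the total, which is the basis of Fisher's exact test.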
Yates’ correction for 2x2 • Yates’ correction: sum over all cells of (|O - E| - 1/2)^2 / E
Yates’ correction for 2x2 • Table 13.7 • Fisher: p = 0.001455384362148 • ‘Two-sided’ p = 0.0029 • χ2: p = 0.001121814118023 • Yates’ p = 0.0037
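The effect of the continuity correction can be seen on any 2x2 table. Table 13.7 is not reproduced in these notes, so the sketch below uses the bronchitis and cough counts from the odds section of this lecture instead; its P values therefore differ from those listed above. The function name is illustrative, and the P value uses the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal.

```python
import math


def chi_square_2x2(table, yates=False):
    """Chi-squared statistic and P value (1 df) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)  # Yates' continuity correction
            statistic += diff**2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))  # upper tail for 1 df
    return statistic, p_value


# Rows: history of bronchitis yes / no; columns: cough yes / no
counts = [[26, 247], [44, 1002]]
plain = chi_square_2x2(counts)
corrected = chi_square_2x2(counts, yates=True)
print(plain, corrected)
```

The corrected statistic is always the smaller of the two, so Yates' P value is the larger one, as in the comparison on the slide.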
Odds and odds ratios • Odds: o = p/(1 - p), where p is the probability of an event • Log odds / logit: ln(o) = ln(p/(1 - p))
Odds • Coughs in kids with a history of bronchitis: p = 26/273 = 0.095, o = 26/247 = 0.105 • Coughs in kids without a history of bronchitis: p = 44/1046 = 0.042, o = 44/1002 = 0.044
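As a quick check of the arithmetic, odds and log odds can be computed from the probabilities directly. This is a minimal sketch using the counts on the slide; the function names `odds` and `logit` are illustrative.

```python
import math


def odds(p):
    """Odds of an event that has probability p."""
    return p / (1 - p)


def logit(p):
    """Log odds of an event that has probability p."""
    return math.log(odds(p))


p_bronchitis = 26 / 273      # coughs among kids with a history of bronchitis
p_no_bronchitis = 44 / 1046  # coughs among kids without such a history

for p in (p_bronchitis, p_no_bronchitis):
    print(f"p = {p:.3f}, odds = {odds(p):.3f}, log odds = {logit(p):.3f}")
```

For rare events the odds and the probability are close, as the two pairs of values show.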
Odds ratio • The odds ratio is the ratio of the odds of experiencing coughs in kids with and kids without a history of bronchitis: or = (26/247)/(44/1002) = 2.40
Is the odds ratio different from 1? • We could take ln of the odds ratio. Is ln(or) different from zero? • 95% confidence interval (assuming normality), with SE(ln(or)) = sqrt(1/26 + 1/247 + 1/44 + 1/1002) = 0.257
Confidence interval of the odds ratio • ln(or) ± 1.96*SE(ln(or)) = 0.37 to 1.38 • Returning to the odds ratio itself: • e^0.370 to e^1.379 = 1.45 to 3.97 • The interval does not contain 1, indicating a statistically significant difference
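The interval can be reproduced from the four cell counts. The sketch assumes the usual large-sample standard error of the log odds ratio, the square root of the sum of the reciprocals of the four counts, which matches the interval reported on the slide; the function name is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a/b)/(c/d) with a confidence interval built on the log scale."""
    odds_ratio = (a / b) / (c / d)
    log_or = math.log(odds_ratio)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_low, log_high = log_or - z * se_log_or, log_or + z * se_log_or
    return odds_ratio, (log_low, log_high), (math.exp(log_low), math.exp(log_high))


# a, b: cough / no cough with a history of bronchitis
# c, d: cough / no cough without a history of bronchitis
odds_ratio, log_ci, ci = odds_ratio_ci(26, 247, 44, 1002)
print(f"odds ratio = {odds_ratio:.2f}")
print(f"ln scale: {log_ci[0]:.2f} to {log_ci[1]:.2f}")
print(f"odds ratio scale: {ci[0]:.2f} to {ci[1]:.2f}")
```

Working on the log scale keeps the interval symmetric around ln(or); exponentiating the limits gives an interval for the odds ratio that is not symmetric around 2.40.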
Chi-square for goodness of fit • df = 4-1-1 = 2 (number of categories, minus 1, minus the number of parameters estimated from the data)