Categorical Outcomes Making Comparisons

Presentation Transcript

  1. Categorical Outcomes Making Comparisons Chapter 4

  2. Outline • Describing: Numerical summaries Graphical summaries • One-sample comparisons: Historical controls • Multiple-sample comparisons: Dichotomous outcome Categorical outcomes • Measures of association

  3. Categorical Outcomes • Gaps: Only limited number of values/categories possible Nothing “in-between” • Examples: Dichotomous (two categories) Nominal (categories without order) Ordinal (categories with order)

  4. Learning Objectives • How do I describe categorical data? • How do I make comparisons? • How do I investigate associations?

  5. Public Health Application • More than three-quarters of global malaria deaths occur in under-five children living in malarious countries in sub-Saharan Africa. 25% of all childhood mortality below the age of five is attributable to malaria. About 30–40% of all fevers seen in health centers in Africa are due to malaria with huge seasonal variability between rainy and dry seasons.

  6. Data Description • Cross-sectional study conducted to investigate factors related to insecticide-treated net (ITN) use: 1876 households with an ITN: • Demographic variables (age of the head of household, household wealth, miles to the nearest healthcare facility, rural/urban, family size, etc.) • Children under the age of five? • Was an ITN used the previous night?

  7. Research Question What factors are associated with ITN use?

  8. DESCRIBING THE DATA

  9. Describing the Data • Numerical summaries: Counts, proportions, and percentages • Graphical summaries: Pie charts Bar graphs

  10. Most Important Step in Data Analysis • Describe the data: Before making conclusions or inferences, an investigator needs to fully understand what the data looks like. • Numerical and graphical summaries cannot be skipped! Need this information to choose the most appropriate statistical method Need this information for valid statistical inferences

  11. Graphical Summaries

  12. Bar Graphs • Provide a visual comparison among groups. Vertical axis represents the number of subjects. • The higher the bar, the more the subjects. Horizontal axis represents categories. • Ordinal: Order matters. • Nominal: Order does not matter.

  13. Bar Graphs Ordinal Variable Nominal Variable

  14. Bar Graphs • Graphically compare groups for some categorical outcome.

  15. Pie Charts • Provide a visual description of how parts compare to a whole.

  16. Numerical Summaries

  17. Numerical Summaries • Categorical variables are described by reporting the number of subjects within each category. Counts Proportions Percentages

  18. Proportion • The fraction of the subjects belonging to a particular category. • The proportion of the population is a parameter. • The proportion of the sample is a statistic.
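As a small illustration of these summaries, the sketch below computes counts, proportions, and percentages for ITN use. The figures (500 of 1876 households using an ITN) are the ones reported elsewhere in these slides; the function name `summarize` and the category labels are illustrative assumptions, not part of the presentation.

```python
from collections import Counter

def summarize(values):
    """Return {category: (count, proportion, percentage)} for a categorical variable."""
    counts = Counter(values)
    n = sum(counts.values())
    return {cat: (k, k / n, 100 * k / n) for cat, k in counts.items()}

# Counts from the slides: 500 of 1876 households used an ITN the previous night.
itn_use = ["used"] * 500 + ["not used"] * 1376
summary = summarize(itn_use)

for category, (count, proportion, percent) in summary.items():
    print(f"{category}: n={count}, proportion={proportion:.3f}, percent={percent:.1f}%")
```

The proportion computed here is a statistic, because it describes the sample of 1876 households rather than the whole population.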

  19. Commonly Used Numerical Summaries

  20. One-Sample Comparisons

  21. Description of the Sample • A sample of 1876 households living in a tropical region where malaria is problematic: • The majority (51%) of the households are more than 50 miles from a healthcare facility and live in a rural area (53%). • Almost half (44%) of the households have a child under the age of five. • The average age for the head of the household is 48 (SD = 7.4). • Median family size of 6 with a range of 1–12. • Most (73%) of the households did not use an ITN the previous night.

  22. Why a One-Sample Study? • Obtaining an additional group or sample for comparisons may not be practical. Comparisons involve historical control(s).

  23. Historical Controls • Want to compare what you found in the sample to something: Do your results differ from what has been previously published/reported? • Historical controls: Control data are not collected concurrently within the same study. • different time period • different region • different population • different kind of exposure • Seems economical—why not use historical controls all the time?

  24. One-Sample Study: ITN Utilization • Data for this study were collected during the rainy season. How do the results compare with those of the dry season? • Is the season (rainy or dry) associated with the utilization of ITN?

  25. Inference for the One-Sample Study • Hypothesis tests • Assume the null parameter is the true parameter Historical control study: Null parameter = Historical value • Decide whether the data support this assumption • Confidence intervals • Estimate the true parameter using interval • Can use the interval estimate to determine if assumptions about the parameter are reasonable

  26. Inference for the One-Sample Study: Historical Controls • Research hypothesis: The true proportion (p) in the rainy season is not 0.20. • Null hypothesis: The true proportion (p) in the rainy season is 0.20.
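The hypotheses above can be checked with a large-sample z test and a confidence interval. The sketch below is one possible approach under stated assumptions: the transcript does not show which procedure the slides used, so the normal-approximation test and the Wald interval are assumed here, and the function name is hypothetical. The inputs (500 ITN users among 1876 households, historical value 0.20) come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_proportion_test(successes, n, p0, conf_level=0.95):
    """Large-sample z test of H0: p = p0, plus a Wald confidence interval."""
    p_hat = successes / n
    # Under H0 the standard error is computed from the null value p0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # The interval uses the sample proportion in its standard error.
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, z, p_value, (p_hat - half_width, p_hat + half_width)

# Rainy-season sample from the slides versus the historical value of 0.20.
p_hat, z, p_value, ci = one_sample_proportion_test(500, 1876, 0.20)
print(f"p_hat={p_hat:.3f}, z={z:.2f}, p-value={p_value:.3g}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

If the interval excludes 0.20, the interval estimate and the hypothesis test point to the same conclusion about the historical value.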

  27. Inference for the One-Sample Study: ITN Utilization

  28. Planning • Estimation: Width of the interval Estimate of the proportion • Comparison of proportions: Power Significance level Effect size
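For the estimation side of planning, a common textbook formula relates the sample size to the desired interval width: n = z² p(1 − p) / E², where E is the half-width of the interval. The slides do not state a formula, so this is an assumed standard approach, and the planning values below (anticipated proportion 0.20, half-width 0.02) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, half_width, conf_level=0.95):
    """Households needed so a Wald interval has roughly the requested half-width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil(z**2 * p_guess * (1 - p_guess) / half_width**2)

# HYPOTHETICAL planning values: anticipated proportion 0.20, half-width 0.02.
n_needed = sample_size_for_proportion(0.20, 0.02)
print(n_needed)
```

Halving the half-width roughly quadruples the required sample size, which is why the target width has to be fixed before data collection. Planning for a comparison of proportions additionally needs the power, significance level, and effect size, which this sketch does not cover.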

  29. Exact Tests • When the sample size is large (and the proportion is not too small), the normal approximation is used. What if this is not reasonable? • Exact tests allow for comparisons without using the normal distribution. Use binomial distribution.
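A minimal sketch of an exact test built directly on the binomial distribution follows. The small sample (8 ITN users among 20 households) is hypothetical and chosen only to show a case where the normal approximation is doubtful. The two-sided rule used here, which sums the probabilities of all outcomes no more likely than the observed one, is one common convention and is an assumption, since the slides do not specify one.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def exact_binomial_p_value(x, n, p0):
    """Two-sided exact p-value: total probability of outcomes no more likely than x under H0."""
    observed = binom_pmf(x, n, p0)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if binom_pmf(k, n, p0) <= observed * (1 + 1e-7)
    )

# HYPOTHETICAL small sample: 8 users among 20 households, null value 0.20.
p_exact = exact_binomial_p_value(8, 20, 0.20)
print(f"exact two-sided p-value: {p_exact:.4f}")
```

For large samples such as the 1876 households in this study, the factorials become very large, so a library routine or a log-scale calculation would be the practical choice.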

  30. Multiple-Sample Comparisons: Comparing a dichotomous outcome between two groups

  31. Description of the Sample • Households with children under five (n = 833) and without (n = 1043): • Similar with respect to age and family size. • Those with children under five in the household report more net use than those without children under five (34% vs 21%).

  32. Description of the Sample Households using ITN (n = 500) • Report a higher percentage of children under five • Are more likely to live under a thatched roof • Have a higher percentage of households living within 15 miles of a healthcare facility • Are more likely to live in a rural area • Have, on average, younger household heads • Have larger families

  33. Why a Two-Sample Study? • Provides an independent comparator group: Treatment vs control Exposed vs unexposed • Different outcomes between the groups may mean that the group is associated with the outcome.

  34. 2 x 2 Contingency Table

  35. Contingency Table

  36. Conditional Probabilities • Proportion of subjects with a category given some other condition is true • Really an issue of what is the denominator • Makes a difference how you interpret Row proportion Column proportion

  37. Total, Row, and Column Proportions
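The three kinds of proportion differ only in the denominator, which the sketch below makes explicit. The group sizes (833 and 1043) and the 500 ITN users are from the slides, but the individual cell counts are an assumption: they are reconstructed from the rounded percentages (34% and 21%) and may not match the original table exactly. The function names are illustrative.

```python
# Rows: child under five in the household. Columns: ITN used the previous night.
# ASSUMPTION: cell counts are reconstructed from the rounded percentages in the slides.
table = {
    "under five":    {"used": 283, "not used": 550},
    "no under five": {"used": 217, "not used": 826},
}

def total_proportions(tbl):
    """Each cell divided by the grand total."""
    total = sum(v for cells in tbl.values() for v in cells.values())
    return {r: {c: v / total for c, v in cells.items()} for r, cells in tbl.items()}

def row_proportions(tbl):
    """Condition on the row: each cell divided by its row total."""
    return {r: {c: v / sum(cells.values()) for c, v in cells.items()}
            for r, cells in tbl.items()}

def column_proportions(tbl):
    """Condition on the column: each cell divided by its column total."""
    col_totals = {}
    for cells in tbl.values():
        for c, v in cells.items():
            col_totals[c] = col_totals.get(c, 0) + v
    return {r: {c: v / col_totals[c] for c, v in cells.items()}
            for r, cells in tbl.items()}

tot = total_proportions(table)
row = row_proportions(table)
col = column_proportions(table)
print(f"P(used | under five) = {row['under five']['used']:.2f}")
print(f"P(under five | used) = {col['under five']['used']:.2f}")
```

The row proportion answers "among households with a child under five, what fraction used an ITN?", while the column proportion answers "among ITN users, what fraction have a child under five?". The same cell count gives different answers because the denominators differ.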

  38. Difference in Proportions • The statistical test gives the same result whether you compare column proportions or row proportions. • A difference in proportions translates to the two categorical variables being dependent.

  39. Two-Sample Study • Does having a child under the age of five impact the utilization of ITN?

  40. Inference for the Two-Sample Study Hypothesis tests • Assume the null parameter is the true parameter • The groups have the same proportion. • The true difference between proportions is 0. • The two categorical variables are independent. • Decide whether the data support this assumption

  41. Inference for the Two-Sample Study pU5 = The true proportion of ITN use in households with children under five pO5 = The true proportion of ITN use in households with no children under five • Null hypothesis pU5 = pO5 Using ITN and having children under the age of five are independent • Research hypothesis pU5 ≠ pO5 Using ITN and having children under the age of five are dependent.
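One way to test the null hypothesis pU5 = pO5 is a pooled two-proportion z test, sketched below. The slides do not show which procedure was used, so this choice is an assumption, as is the function name. The group sizes are from the slides, while the numbers of ITN users in each group (283 and 217) are reconstructed from the rounded percentages and are therefore approximate.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z test of H0: p1 = p2 for two independent groups."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 both groups share one proportion, estimated by pooling.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p1 - p2, z, p_value

# ASSUMPTION: 283 of 833 (under five) and 217 of 1043 (no under five) used an ITN.
diff, z, p_value = two_proportion_z_test(283, 833, 217, 1043)
print(f"difference={diff:.3f}, z={z:.2f}, p-value={p_value:.3g}")
```

For a 2 x 2 table, the square of this z statistic equals the Pearson chi-square statistic without continuity correction, so testing equal proportions and testing independence lead to the same decision.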

  42. Inference for the Two-Sample Study

  43. Planning • Balanced design? • Overall test or comparison between groups? • Estimation: Width of the interval Amount of variability • Comparison of proportions: Power Significance level Effect size

  44. Comparing categorical outcomes between two or more groups Multiple-Sample Comparisons

  45. Categorical Variables • Different research questions result in different types of categorical variables. The outcome does not have to be dichotomous. There can be more than two groups to compare.

  46. R x C Contingency Table

  47. Contingency Table

  48. Conditional Probabilities • Proportion of subjects with a category given some other condition is true • Really an issue of what is the denominator • Makes a difference how you interpret Row proportion Column proportion • Same as when there were only two groups and only two categories in the outcome

  49. Inference • As categorical variables can be dichotomous, nominal, or ordinal, different hypotheses are possible. May require different tests • Hypothesis tests Assume the null hypothesis is true Decide whether the data support this assumption

  50. Two Nominal Variables • Is there an association between the type of roof and the type of net used?
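For two nominal variables, a Pearson chi-square test of independence is a common choice; the transcript does not name the test used, so that choice is an assumption. The sketch below computes the statistic and its degrees of freedom for an R x C table. The counts are entirely hypothetical, because the roof-type and net-type data are not given in the transcript.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for an R x C table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# HYPOTHETICAL counts: rows are three roof types, columns are three net types.
observed = [
    [30, 20, 10],
    [20, 30, 10],
    [10, 10, 40],
]
stat, df = chi_square_statistic(observed)
print(f"chi-square={stat:.2f}, df={df}")
```

The statistic is compared with a chi-square distribution on (R − 1)(C − 1) degrees of freedom; with 4 degrees of freedom the 5% critical value is about 9.49. The approximation is usually considered reasonable only when the expected counts are not too small.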