Categorical Outcomes Making Comparisons

Categorical Outcomes Making Comparisons Chapter 4

Outline • Describing: Numerical summaries Graphical summaries • One-sample comparisons: Historical controls • Multiple-sample comparisons: Dichotomous outcome Categorical outcomes • Measures of association

Categorical Outcomes • Gaps: Only limited number of values/categories possible Nothing “in-between” • Examples: Dichotomous (two categories) Nominal (categories without order) Ordinal (categories with order)

Learning Objectives • How do I describe categorical data? • How do I make comparisons? • How do I investigate associations?

Public Health Application • More than three-quarters of global malaria deaths occur in children under five living in malarious countries in sub-Saharan Africa. 25% of all childhood mortality below the age of five is attributable to malaria. About 30–40% of all fevers seen in health centers in Africa are due to malaria, with huge seasonal variability between rainy and dry seasons.

Data Description • Cross-sectional study conducted to investigate factors related to insecticide-treated net (ITN) use: 1876 households with an ITN: • Demographic variables (age of the head of household, household wealth, miles to the nearest healthcare facility, rural/urban, family size, etc.) • Children under the age of five? • Was an ITN used the previous night?

Research Question What factors are associated with ITN use?

DESCRIBING THE DATA

Describing the Data • Numerical summaries: Counts, proportions, and percentages • Graphical summaries: Pie charts Bar graphs

Most Important Step in Data Analysis • Describe the data: Before making conclusions or inferences, an investigator needs to fully understand what the data looks like. • Numerical and graphical summaries cannot be skipped! Need this information to choose the most appropriate statistical method Need this information for valid statistical inferences

Graphical Summaries

Bar Graphs • Provide a visual comparison among groups. Vertical axis represents the number of subjects. • The higher the bar, the more the subjects. Horizontal axis represents categories. • Ordinal: Order matters. • Nominal: Order does not matter.

Bar Graphs Ordinal Variable Nominal Variable

Bar Graphs • Graphically compare groups for some categorical outcome.

Pie Charts • Provide a visual description of how parts compare to a whole.

Numerical Summaries

Numerical Summaries • Categorical variables are described by reporting the number of subjects within each category. Counts Proportions Percentages

Proportion • The fraction of the subjects belonging to a particular category. • The proportion of the population is a parameter. • The proportion of the sample is a statistic
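As a minimal sketch of these summaries, the Python below computes the count, proportion, and percentage for each category of ITN use. It assumes the figures reported elsewhere in these slides (500 ITN-using households in a sample of 1876); the function name `summarize` is illustrative only.

```python
from collections import Counter

def summarize(values):
    """Return {category: (count, proportion, percentage)} for a categorical variable."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, k / n, 100 * k / n) for cat, k in counts.items()}

# Assumed data: 500 of the 1876 sampled households used an ITN the previous night.
itn_use = ["yes"] * 500 + ["no"] * 1376

summary = summarize(itn_use)
for category, (count, prop, pct) in summary.items():
    print(f"{category}: count={count}, proportion={prop:.3f}, percentage={pct:.1f}%")
```

The proportions computed here are sample statistics; the corresponding population proportions are the parameters they estimate.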

Commonly Used Numerical Summaries

One-Sample Comparisons

Description of the Sample • A sample of 1876 households living in a tropical region where malaria is problematic: • The majority (51%) of the households are more than 50 miles from a healthcare facility and live in a rural area (53%). • Almost half (44%) of the households have a child under the age of five. • The average age of the head of the household is 48 (SD = 7.4). • Median family size is 6, with a range of 1–12. • Most (73%) of the households did not use an ITN the previous night.

Why a One-Sample Study? • Obtaining an additional group or sample for comparisons may not be practical. Comparisons involve historical control(s).

Historical Controls • Want to compare what you found in the sample to something: Do your results differ from what has been previously published/reported? • Historical controls: Control data are not collected concurrently within the same study. • different time period • different region • different population • different kind of exposure • Seems economical—why not use historical controls all the time?

One-Sample Study: ITN Utilization • Data for this study were collected during the rainy season. How do the results compare with those of the dry season? • Is the season (rainy or dry) associated with the utilization of ITN?

Inference for the One-Sample Study • Hypothesis tests • Assume the null parameter is the true parameter Historical control study: Null parameter = Historical value • Decide whether the data support this assumption • Confidence intervals • Estimate the true parameter using interval • Can use the interval estimate to determine if assumptions about the parameter are reasonable
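The interval-estimate idea can be sketched as follows. This is a minimal example of a large-sample (Wald) interval, one common choice rather than necessarily the method used in the chapter. It assumes the sample figures reported in these slides (500 ITN users among 1876 households); `wald_ci` is an illustrative name.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Large-sample (Wald) interval for a proportion; z = 1.96 gives roughly 95% coverage."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error of p_hat
    return p_hat - z * se, p_hat + z * se

# Assumed data: 500 ITN-using households out of 1876.
low, high = wald_ci(500, 1876)
print(f"Approximate 95% CI for ITN use: ({low:.3f}, {high:.3f})")
```

A historical value that falls outside the interval would be judged an unreasonable value for the true proportion at the corresponding significance level.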

Inference for the One-Sample Study: Historical Controls • Research hypothesis: The true proportion (p) in the rainy season is not 0.20. • Null hypothesis: The true proportion (p) in the rainy season is 0.20.
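A minimal sketch of how these hypotheses could be tested with the normal approximation, assuming the sample figures from these slides (500 ITN users among 1876 households) and the historical value 0.20. The function name is illustrative, and the standard error is computed under the null hypothesis.

```python
import math

def one_sample_prop_ztest(successes, n, p0):
    """Two-sided z-test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)           # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal tail area
    return z, p_value

# Assumed data: 500 of 1876 households used an ITN; historical (dry-season) value 0.20.
z, p_value = one_sample_prop_ztest(500, 1876, 0.20)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2g}")
```

A small p-value indicates that the data do not support the assumption that the rainy-season proportion equals the historical value.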

Inference for the One-Sample Study: ITN Utilization

Planning • Estimation: Width of the interval Estimate of the proportion • Comparison of proportions: Power Significance level Effect size
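For the estimation case, the link between interval width and sample size can be sketched with the usual large-sample formula n = z² p(1 − p) / E², where E is the desired margin of error (half the interval width). The planning value 0.20 and the margin 0.02 below are hypothetical choices for illustration, not figures from the study.

```python
import math

def sample_size_for_margin(p_guess, margin, z=1.96):
    """Households needed so a ~95% Wald interval has half-width <= margin."""
    return math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

# Hypothetical planning inputs: expected proportion 0.20, margin of error 0.02.
n = sample_size_for_margin(0.20, 0.02)
print(f"Required sample size: {n}")
```

Using 0.5 as the planning value gives the most conservative (largest) sample size when no prior estimate of the proportion is available.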

Exact Tests • When the sample size is large (and the proportion is not too small), the normal approximation is used. What if this is not reasonable? • Exact tests allow for comparisons without using the normal distribution. Use binomial distribution.
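A minimal sketch of an exact test built directly on the binomial distribution. The small sample (7 ITN users among 15 households) is hypothetical, chosen because the normal approximation would be doubtful at this size. The two-sided p-value here sums the probabilities of all outcomes no more likely than the observed one, which is one common convention among several.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def exact_binomial_test(successes, n, p0):
    """Two-sided exact p-value: total probability of outcomes no more likely than observed."""
    observed = binom_pmf(successes, n, p0)
    cutoff = observed * (1 + 1e-7)  # small tolerance for floating-point ties
    total = sum(binom_pmf(k, n, p0) for k in range(n + 1) if binom_pmf(k, n, p0) <= cutoff)
    return min(1.0, total)

# Hypothetical small sample: 7 of 15 households used an ITN; null value 0.20.
p_value = exact_binomial_test(7, 15, 0.20)
print(f"Exact two-sided p-value = {p_value:.4f}")
```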

MULTIPLE-SAMPLE COMPARISONS: Comparing a dichotomous outcome between two groups

Description of the Sample • Households with children under five (n = 833) and without (n = 1043): • Similar with respect to age and family size. • Those with children under five in the household report more net use than those without children under 5 (34% vs 21%).

Description of the Sample Households using ITN (n = 500): • Report a higher percentage of children under five • Are more likely to live under a thatched roof • Have a higher percentage of households living within 15 miles of a healthcare facility • Are more likely to live in a rural area • Have, on average, younger household heads • Have larger families

Why a Two-Sample Study? • Provides an independent comparator group: Treatment vs control Exposed vs unexposed • Different outcomes between the groups may mean that the group is associated with the outcome.

2 x 2 Contingency Table

Contingency Table

Conditional Probabilities • Proportion of subjects with a category given some other condition is true • Really an issue of what is the denominator • Makes a difference how you interpret Row proportion Column proportion

Total, Row, and Column Proportions
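The role of the denominator can be sketched with a 2 x 2 table of children under five by ITN use. The cell counts below are approximations reconstructed from the rounded percentages reported in these slides (34% of 833 and 21% of 1043), so they are assumptions and do not sum exactly to the 500 ITN users reported.

```python
# Rows: children under five in the household; columns: ITN used the previous night.
# Counts are approximate, reconstructed from rounded percentages (an assumption).
table = {
    "under5": {"itn": 283, "no_itn": 550},
    "no_under5": {"itn": 219, "no_itn": 824},
}
columns = ("itn", "no_itn")

total = sum(sum(row.values()) for row in table.values())
row_totals = {r: sum(cells.values()) for r, cells in table.items()}
col_totals = {c: sum(table[r][c] for r in table) for c in columns}

# Same cell count, three different denominators.
total_props = {r: {c: table[r][c] / total for c in columns} for r in table}
row_props = {r: {c: table[r][c] / row_totals[r] for c in columns} for r in table}
col_props = {r: {c: table[r][c] / col_totals[c] for c in columns} for r in table}

print(f"P(ITN | under five)  = {row_props['under5']['itn']:.2f}")   # row proportion
print(f"P(under five | ITN)  = {col_props['under5']['itn']:.2f}")   # column proportion
print(f"P(under five and ITN) = {total_props['under5']['itn']:.2f}") # total proportion
```

The row proportion answers "among households with a child under five, what fraction used an ITN?", while the column proportion answers "among ITN users, what fraction have a child under five?".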

Difference in Proportions • Statistical test does not care if you are comparing differences between column proportions and row proportions. • A difference in proportions translates to the two categorical variables being dependent.

Two-Sample Study • Does having a child under the age of five impact the utilization of ITN?

Inference for the Two-Sample Study Hypothesis tests • Assume the null parameter is the true parameter • The groups have the same proportion. • The true difference between proportions is 0. • The two categorical variables are independent. • Decide whether the data support this assumption

Inference for the Two-Sample Study pU5 = The true proportion of ITN use in households with children under five pO5 = The true proportion of ITN use in households with no children under five • Null hypothesis pU5 = pO5 Using ITN and having children under the age of five are independent • Research hypothesis pU5 ≠ pO5 Using ITN and having children under the age of five are dependent.
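One way to test these hypotheses is a Pearson chi-square test of independence on the 2 x 2 table, sketched below without a continuity correction. The cell counts are approximations reconstructed from the rounded percentages in these slides (34% of 833 and 21% of 1043), so the resulting statistic is illustrative rather than the chapter's reported result.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n  # expected count under independence
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value

# Approximate counts (assumed): under five = 283 ITN / 550 no ITN; no under five = 219 / 824.
chi2, p_value = chi_square_2x2(283, 550, 219, 824)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.2g}")
```

For a 2 x 2 table this statistic equals the square of the two-proportion z statistic computed with a pooled standard error, so both approaches lead to the same conclusion about pU5 = pO5.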

Inference for the Two-Sample Study

Planning • Balanced design? • Overall test or comparison between groups? • Estimation: Width of the interval Amount of variability • Comparison of proportions: Power Significance level Effect size

Multiple-Sample Comparisons: Comparing categorical outcomes between two or more groups

Categorical Variables • Different research questions result in different types of categorical variables. The outcome does not have to be dichotomous. There can be more than two groups to compare.

R x C Contingency Table

Contingency Table

Conditional Probabilities • Proportion of subjects with a category given some other condition is true • Really an issue of what is the denominator • Makes a difference how you interpret Row proportion Column proportion • Same as when there were only two groups and only two categories in the outcome

Inference • As categorical variables can be dichotomous, nominal, or ordinal, different hypotheses are possible. May require different tests • Hypothesis tests Assume the null hypothesis is true Decide whether the data support this assumption

Two Nominal Variables • Is there an association between the type of roof and the type of net used?
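For two nominal variables, the association is commonly assessed with the Pearson chi-square statistic on the R x C table. The sketch below computes the statistic and its degrees of freedom. The 3 x 3 counts are entirely hypothetical (the slides do not list the roof or net categories), and the function name is illustrative.

```python
def chi_square_stat(table):
    """Pearson chi-square statistic and degrees of freedom for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # expected count under independence
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = three roof types, columns = three net types.
roof_by_net = [
    [40, 25, 15],
    [30, 45, 25],
    [20, 30, 50],
]
chi2, df = chi_square_stat(roof_by_net)
print(f"chi-square = {chi2:.2f} on {df} degrees of freedom")
```

The p-value comes from comparing the statistic with a chi-square distribution on (R − 1)(C − 1) degrees of freedom; a statistics library routine such as `scipy.stats.chi2_contingency` can supply it.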
