Chapter 10

1 / 54

# Chapter 10 - PowerPoint PPT Presentation

Chapter 10. Hypothesis testing: Categorical Data Analysis. Learning Objectives. Comparison of binomial proportion using Z and  2 Test. Explain  2 Test for Independence of 2 variables Explain The Fisher’s test for independence McNemar’s tests for correlated data Kappa Statistic

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Chapter 10' - sadie

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Chapter 10

Hypothesis testing: Categorical Data Analysis

EPI809/Spring 2008

Learning Objectives
• Comparison of binomial proportion using Z and 2 Test.
• Explain 2 Test for Independence of 2 variables
• Explain The Fisher’s test for independence
• McNemar’s tests for correlated data
• Kappa Statistic
• Use of SAS Proc FREQ

EPI809/Spring 2008

Data Types

EPI809/Spring 2008

Qualitative Data
• Qualitative Random Variables Yield Responses That Can Be Put In Categories. Example: Gender (Male, Female)
• Measurement or Count Reflect # in Category
• Nominal (no order) or Ordinal Scale (order)
• Data can be collected as continuous but recoded to categorical data. Example (Systolic Blood Pressure - Hypotension, Normal tension, hypertension )

EPI809/Spring 2008

### Z Test for Differences in Two Proportions

EPI809/Spring 2008

Z Test for Difference in Two Proportions

1. Assumptions

• Populations Are Independent
• Populations Follow Binomial Distribution
• Normal Approximation Can Be Used for large samples (All Expected Counts  5)
• Z-Test Statistic for Two Proportions

EPI809/Spring 2008

Z Test for Two Proportions Thinking Challenge

MA

• You’re an epidemiologist for the US Department of Health and Human Services. You’re studying the prevalence of disease X in two states (MA and CA). In MA, 74 of 1500people surveyed were diseased and in CA, 129 of 1500 were diseased. At .05 level, does MA have a lower prevalence rate?

CA

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0:

Ha:

 =

nMA = nCA=

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0: pMA - pCA = 0

Ha: pMA - pCA < 0

=

nMA = nCA=

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0: pMA - pCA = 0

Ha: pMA - pCA < 0

= .05

nMA= 1500 nCA= 1500

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0: pMA - pCA = 0

Ha: pMA - pCA < 0

= .05

nMA= 1500 nCA= 1500

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0: pMA - pCA = 0

Ha: pMA - pCA < 0

= .05

nMA= 1500 nCA= 1500

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

Z = -4.00

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0: pMA - pCA = 0

Ha: pMA - pCA < 0

= .05

nMA= 1500 nCA= 1500

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

Z = -4.00

Reject at  = .05

EPI809/Spring 2008

Z Test for Two Proportions Solution*

H0: pMA - pCA = 0

Ha: pMA - pCA < 0

= .05

nMA= 1500 nCA= 1500

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

Z = -4.00

Reject at  = .05

There is evidence MA is less than CA

EPI809/Spring 2008

### 2 Test of Independence Between 2 Categorical Variables

EPI809/Spring 2008

2 Test of Independence

1. Shows If a Relationship Exists Between 2 Qualitative Variables, but does Not Show Causality

2. Assumptions

Multinomial Experiment

All Expected Counts  5

3. Uses Two-Way Contingency Table

EPI809/Spring 2008

2 Test of Independence Contingency Table
• 1. Shows # Observations From 1 Sample Jointly in 2 Qualitative Variables

EPI809/Spring 2008

2 Test of Independence Contingency Table

1. Shows # Observations From 1 Sample Jointly in 2 Qualitative Variables

Levels of variable 2

Levels of variable 1

EPI809/Spring 2008

2 Test of Independence Hypotheses & Statistic

1. Hypotheses

• H0: Variables Are Independent
• Ha: Variables Are Related (Dependent)

EPI809/Spring 2008

2 Test of Independence Hypotheses & Statistic

1. Hypotheses

H0: Variables Are Independent

Ha: Variables Are Related (Dependent)

2. Test Statistic

Observed count

Expected count

EPI809/Spring 2008

2 Test of Independence Hypotheses & Statistic

1. Hypotheses

H0: Variables Are Independent

Ha: Variables Are Related (Dependent)

2. Test Statistic

Degrees of Freedom: (r - 1)(c - 1)

Observed count

Expected count

Rows Columns

EPI809/Spring 2008

2 Test of Independence Expected Counts

1. Statistical Independence Means Joint Probability Equals Product of Marginal Probabilities

2. Compute Marginal Probabilities & Multiply for Joint Probability

3. Expected Count Is Sample Size Times Joint Probability

EPI809/Spring 2008

Expected Count Example

EPI809/Spring 2008

Expected Count Example

112 160

Marginal probability =

EPI809/Spring 2008

Expected Count Example

112 160

Marginal probability =

78 160

Marginal probability =

EPI809/Spring 2008

112 160

78 160

Expected Count Example

112 160

Joint probability =

Marginal probability =

78 160

Marginal probability =

EPI809/Spring 2008

112 160

78 160

112 160

78 160

Expected count = 160·

Expected Count Example

112 160

Joint probability =

Marginal probability =

78 160

Marginal probability =

= 54.6

EPI809/Spring 2008

Expected Count Calculation

EPI809/Spring 2008

Expected Count Calculation

EPI809/Spring 2008

Expected Count Calculation

112x78 160

112x82 160

48x78 160

48x82 160

EPI809/Spring 2008

2 Test of Independence Example on HIV
• You randomly sample 286 sexually active individuals and collect information on their HIV status and History of STDs. At the .05 level, is there evidence of a relationship?

EPI809/Spring 2008

2 Test of Independence Solution

H0:

Ha:

 =

df =

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

2 Test of Independence Solution

H0: No Relationship

Ha: Relationship

 =

df =

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

2 Test of Independence Solution

H0: No Relationship

Ha: Relationship

 = .05

df = (2 - 1)(2 - 1) = 1

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

EPI809/Spring 2008

2 Test of Independence Solution

H0: No Relationship

Ha: Relationship

 = .05

df = (2 - 1)(2 - 1) = 1

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

 = .05

EPI809/Spring 2008

2 Test of Independence Solution

E(nij) 5 in all cells

116x132 286

154x116 286

170x132 286

170x154 286

EPI809/Spring 2008

2 Test of Independence Solution

H0: No Relationship

Ha: Relationship

 = .05

df = (2 - 1)(2 - 1) = 1

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

2 = 54.29

 = .05

EPI809/Spring 2008

2 Test of Independence Solution

H0: No Relationship

Ha: Relationship

 = .05

df = (2 - 1)(2 - 1) = 1

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

2 = 54.29

Reject at  = .05

 = .05

EPI809/Spring 2008

2 Test of Independence Solution

H0: No Relationship

Ha: Relationship

 = .05

df = (2 - 1)(2 - 1) = 1

Critical Value(s):

Test Statistic:

Decision:

Conclusion:

2 = 54.29

Reject at  = .05

 = .05

There is evidence of a relationship

EPI809/Spring 2008

2 Test of Independence SAS CODES

Data dis;

input STDs HIV count;

cards;

1 1 84

1 2 32

2 1 48

2 2 122

;

run;

Procfreq data=dis order=data;

weight Count;

tables STDs*HIV/chisq;

run;

EPI809/Spring 2008

2 Test of Independence SAS OUTPUT

Statistics for Table of STDs by HIV

Statistic DF Value Prob

------------------------------------------------------- Chi-Square 1 54.1502 <.0001

Likelihood Ratio Chi-Square 1 55.7826 <.0001

Continuity Adj. Chi-Square 1 52.3871 <.0001

Mantel-Haenszel Chi-Square 1 53.9609 <.0001

Phi Coefficient 0.4351

Contingency Coefficient 0.3990

Cramer's V 0.4351

EPI809/Spring 2008