Statistics 303

Chapter 9

Two-Way Tables

Relationships Between Two Categorical Variables
• Relationships between two categorical variables
• Depending on the situation, one of the variables is the explanatory variable and the other is the response variable.
• In this case, we look at the percentages of one variable for each level of the other variable.
• Examples:
• Gender and Soda Preference
• Country of Origin and Marital Status
• Smoking Habits and Socioeconomic Status
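As a minimal sketch of "percentages of one variable for each level of the other", the snippet below computes the percent of each response category within each level of the explanatory variable. The counts and category names are hypothetical and are not taken from the slides.

```python
# Hypothetical counts (not from the slides): soda preference by gender.
counts = {
    "Female": {"Cola": 30, "Lemon-lime": 20, "Other": 10},
    "Male": {"Cola": 24, "Lemon-lime": 8, "Other": 8},
}

def row_percentages(table):
    """Percent of each response category within each level of the explanatory variable."""
    result = {}
    for level, row in table.items():
        total = sum(row.values())
        result[level] = {category: 100 * n / total for category, n in row.items()}
    return result

percents = row_percentages(counts)
for level, row in percents.items():
    print(level, {category: round(p, 1) for category, p in row.items()})
```

Comparing the two printed rows side by side is what suggests whether the response distribution changes with the explanatory variable.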
Relationships Between Two Categorical Variables
• Relationships between two categorical variables
• A two-way table can summarize the data for relationships between two categorical variables.
• Example: Gender and Highest Degree Obtained
SPSS OUTPUT
[SPSS output for the example: two-way table with counts and percents]

Review of Two-Way Tables
• Two-way tables come about when we are interested in the relationship between two categorical variables.
• One of the variables is the row variable.
• The other is the column variable.
• The combination of a row variable and a column variable is a cell.

[Figure: layout of a two-way table, labeling the row variable, the column variable, the cells, the row totals, the column totals, and the overall total]
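The parts of a two-way table can be illustrated with a short sketch that tallies cells, row totals, column totals, and the overall total from raw data. The records below are hypothetical (one made-up gender and degree pair per person), and the variable names are chosen only for illustration.

```python
from collections import Counter

# Hypothetical raw data (not from the slides): one (gender, degree) pair per person.
records = [
    ("Female", "High school"), ("Female", "Bachelor"), ("Female", "Bachelor"),
    ("Male", "High school"), ("Male", "High school"), ("Male", "Graduate"),
    ("Female", "Graduate"), ("Male", "Bachelor"),
]

cell_counts = Counter(records)                 # one count per (row, column) cell
row_totals = Counter(r for r, _ in records)    # totals for the row variable
col_totals = Counter(c for _, c in records)    # totals for the column variable
overall_total = len(records)

rows = sorted(row_totals)
cols = sorted(col_totals)

# Print the table with a final "Total" column and a final "Total" row.
print("".ljust(10) + "".join(c.ljust(13) for c in cols) + "Total")
for r in rows:
    cells = "".join(str(cell_counts[(r, c)]).ljust(13) for c in cols)
    print(r.ljust(10) + cells + str(row_totals[r]))
print("Total".ljust(10) + "".join(str(col_totals[c]).ljust(13) for c in cols) + str(overall_total))
```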

Chi-Squared Test for Independence
• To test whether there is a relationship between the row variable and the column variable, we use the chi-square statistic (X²), which can be calculated by computer.
• The null hypothesis (H0) is that there is no relationship between the two variables, i.e. the variables are independent.
• The alternative hypothesis (HA) is that there is a relationship, i.e. the variables are not independent.
• The test relies on a chi-square approximation that is accurate only when the expected cell counts are large enough. For 2x2 tables, we require that all four expected cell counts be 5 or more.
• For tables larger than 2x2, we will use this approximation whenever the average of the expected counts is 5 or more and the smallest expected count is 1 or more.
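The conditions on expected counts can be checked with a short sketch like the one below. It assumes the usual expected count for a cell, (row total × column total) / overall total, which the slides do not state explicitly. The function names and both example tables are hypothetical.

```python
def expected_counts(observed):
    """Expected count for each cell: (row total * column total) / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def approximation_ok(observed):
    """Apply the rule of thumb from the bullets above to the expected counts."""
    expected = [e for row in expected_counts(observed) for e in row]
    if len(observed) == 2 and len(observed[0]) == 2:
        # 2x2 table: all four expected counts must be 5 or more.
        return min(expected) >= 5
    # Larger table: average expected count >= 5 and smallest expected count >= 1.
    return sum(expected) / len(expected) >= 5 and min(expected) >= 1

# Hypothetical tables (not from the slides).
small_2x2 = [[3, 7], [6, 4]]
larger_3x2 = [[12, 8], [9, 11], [2, 3]]
print(approximation_ok(small_2x2))
print(approximation_ok(larger_3x2))
```

Note that the conditions apply to the expected counts, not the observed counts, so a table with a small observed cell can still pass.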
Chi-Squared Test for Independence
• A comparison of the proportion of “successes” in two populations leads to a 2x2 table.
• We can compare two population proportions either by the chi-square test or by the two-sample z test from section 8.2.
• For the two-sided alternative, these tests always give exactly the same result.
• The chi-square statistic is equal to the square of the z statistic, and χ²(1) critical values are equal to the squares of the corresponding N(0,1) critical values.
• Advantage of the z test: We can test either one-sided or two-sided alternatives
• Chi-square test always tests the two-sided alternative
• Advantage of chi-square: We can compare more than two populations
• z-Test compares only two populations
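The claim that the chi-square statistic equals the square of the z statistic can be checked numerically. This is a sketch under two assumptions: the z statistic uses the pooled sample proportion in its standard error, and the chi-square statistic is computed without a continuity correction. The counts are hypothetical.

```python
import math

def two_sample_z(x1, n1, x2, n2):
    """Two-sample z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic (no continuity correction) for the same data as a 2x2 table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    return sum(
        (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(2) for j in range(2)
    )

# Hypothetical counts (not from the slides): 45 successes out of 100 vs. 30 out of 100.
z = two_sample_z(45, 100, 30, 100)
x2 = chi_square_2x2(45, 100, 30, 100)
print(z ** 2, x2)  # the two values agree up to floating-point rounding
```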
Chi-Squared Test for Independence
• The chi-square statistic compares the observed cell counts with the expected cell counts
• The chi-square statistic is a measure of how much the observed cell counts in a two-way table diverge from the expected cell counts.
• If the expected counts and the observed counts are very different, a large value of X² will result. Large values of X² provide evidence against the null hypothesis.
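A minimal sketch of the comparison described above, assuming the standard form of the statistic: the sum over all cells of (observed − expected)² / expected, with expected = (row total × column total) / overall total. Both tables are hypothetical and share the same row and column totals.

```python
def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    return statistic

# Hypothetical tables (not from the slides).
close_to_expected = [[20, 30], [40, 60]]   # both rows have the same proportions
far_from_expected = [[40, 10], [20, 80]]   # rows have very different proportions
print(chi_square_statistic(close_to_expected))
print(chi_square_statistic(far_from_expected))
```

The first table matches its expected counts exactly, so the statistic is zero; the second diverges from them in every cell and produces a large value.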
Chi-Square Test
• Like the t distributions, the χ² distributions are described by a single parameter, degrees of freedom (df).
• The degrees of freedom for the chi-square test are

df = (r – 1)(c – 1) = (#rows – 1)(#columns – 1).

• For a 2x2 table, we have df = (2 – 1)(2 – 1) = 1.
• The p-value is determined by looking in Table F.
• The p-value is P(χ² ≥ X²). Notice that Table F gives probabilities to the right.

Also, note that χ² distributions take only positive values and are skewed to the right.
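Instead of a table lookup, the right-tail probability can be computed directly. The sketch below uses only the Python standard library: it starts from the closed forms for df = 1 and df = 2 and steps up two degrees of freedom at a time. The function names are hypothetical, and the code assumes df is a positive integer.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """df = (#rows - 1)(#columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def chi_square_p_value(statistic, df):
    """Right-tail area P(chi-square with df degrees of freedom >= statistic), df >= 1."""
    if statistic <= 0:
        return 1.0
    half = statistic / 2
    # Closed forms: df = 1 uses erfc, df = 2 uses exp.
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    # Each step adds the term that moves the tail area from k to k + 2 degrees of freedom.
    while k < df:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail

print(degrees_of_freedom(2, 2))
print(round(chi_square_p_value(3.84, 1), 3))
print(round(chi_square_p_value(9.49, degrees_of_freedom(3, 3)), 3))
```

The values 3.84 (df = 1) and 9.49 (df = 4) are the familiar 0.05 critical values, so both computed tail areas should come out close to 0.05.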

We are interested in this row:

Analysis in SPSS gives us:

The p-value is 0.103. Because this is larger than 0.05, we fail to reject H0: the data do not show a significant relationship between gender and tomato enjoyment.

Link between Diabetes and Heart Disease?
• Background:

•  1. A diabetic’s risk of dying after a first heart attack is the same as that of someone without diabetes. There is no link between diabetes and heart disease.

vs.

• 2. Diabetes takes a heavy toll on the body and diabetes patients often suffer heart attacks and strokes or die from cardiovascular complications at a much younger age.
• So we use a hypothesis test based on the latest data to see which conclusion is right.
• There are a total of 5167 managed-care patients, of whom 1131 are non-diabetics and 4036 are diabetics. Among the non-diabetic patients, 42% had their blood pressure properly controlled (475 of 1131), while among the diabetic patients only 20% had their blood pressure controlled (807 of 4036).
Link between Diabetes and Heart Disease?
Data:
Diabetes: 1 = Does not have diabetes, 2 = Has diabetes
Control: 1 = Controlled, 2 = Uncontrolled
Link between Diabetes and Heart Disease?

Hypothesis test:

1) H0: There is no link between diabetes and heart disease. (There is no relationship between diabetes and heart disease. Diabetes and heart disease are independent.)

2) HA: There is a link between diabetes and heart disease. (There is a relationship between diabetes and heart disease. Diabetes and heart disease are dependent.)

3) Assume a significance level of .05

Link between Diabetes and Heart Disease?

4) The computer gives us a Chi-Square Statistic of 229.268

5) The computer gives us a p-value of .000 (that is, less than .0005 after rounding)

6) Because our p-value is less than alpha, we would reject the null hypothesis.

7) There IS sufficient evidence that there is a link between diabetes and heart disease.
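The reported statistic can be reproduced from the counts given in the background. This sketch assumes the chi-square statistic is computed without a continuity correction; the uncontrolled counts are obtained by subtraction, and the df = 1 tail area uses the closed form erfc(√(X²/2)).

```python
import math

# Counts stated in the example: 475 of 1131 non-diabetics and 807 of 4036 diabetics
# had controlled blood pressure; the uncontrolled counts follow by subtraction.
observed = [
    [475, 1131 - 475],   # non-diabetic: controlled, uncontrolled
    [807, 4036 - 807],   # diabetic: controlled, uncontrolled
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over the four cells.
chi_square = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2) for j in range(2)
)

# With df = (2 - 1)(2 - 1) = 1, the right-tail area is erfc(sqrt(X^2 / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 3))
print(p_value < 0.05)
```

The computed statistic agrees with the 229.268 reported in step 4, and the p-value is far below the 0.05 significance level.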

Is there a relationship between exposure to R-rated movies and adolescent smoking?
• The study examined the relationship between exposure to R-rated movies and smoking habits among adolescents. Smoking appears more often in R-rated movies than in any other movie-rating category. Therefore, the objective of this study was to determine whether an association existed between parental restrictions on movies and adolescent cigarette use.
Is there a relationship between exposure to R-rated movies and adolescent smoking?
• Hypothesis Test

1) H0: There is no relationship between exposure to R-rated movies and tobacco use among adolescents

2) HA: There is a relationship between the occurrence of tobacco use and the exposure to R-rated movies among adolescents

3) alpha = 0.05

SPSS Output

4) The computer gives us a chi-square test statistic of 469.003

5) The computer output gives us a p-value of 0.000 (that is, less than 0.0005 after rounding)

Is there a relationship between exposure to R-rated movies and adolescent smoking?

6) Decision Rule:

• If p-value ≤ alpha, we reject H0
• If p-value > alpha, we fail to reject H0

Because our p-value is less than our significance level (alpha), we would reject the null hypothesis

7) Because we rejected H0, we can conclude that there IS significant evidence that a relationship between exposure to R-rated movies and adolescent tobacco use exists.