
Chi-Square Procedures



Chi-Square Test for Goodness of Fit, Independence of Variables, and Homogeneity of Proportions

The chi-square goodness of fit test: you have only one set of data on a single characteristic, and you want to know if it matches an expected distribution based on the laws of probability (1 variable, 1 population)

In a chi-square goodness of fit test, the null hypothesis is always

Ho: The data follow a specified distribution

The alternative hypothesis is always

Ha: The data do not follow the specified distribution

The idea behind testing these types of claims is to compare actual counts to the counts we would expect if the null hypothesis were true. If a significant difference between the actual counts and expected counts exists, we would take this as evidence against the null hypothesis.

The method for obtaining the expected counts requires that we determine the number of observations within each cell under the assumption the null hypothesis is true.

Test Statistic for the Test of Goodness of Fit

Let O_i represent the observed number of counts in the ith cell, E_i represent the expected number of counts in the ith cell. Then,

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

approximately follows the chi-square distribution with (number of cells - 1) degrees of freedom
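As a minimal sketch, the goodness of fit statistic can be computed in a few lines of Python. The die-roll counts below are made-up illustration values, not data from this presentation, and the function name `chi_square_statistic` is an assumption of this sketch.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, tested against a fair-die distribution.
observed = [8, 12, 9, 11, 6, 14]
expected = [60 / 6] * 6                  # 10 expected rolls per face under Ho

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1   # number of cells - 1

print(statistic, degrees_of_freedom)
```

For these hypothetical counts the statistic is 42/10 = 4.2 on 5 degrees of freedom.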

The Chi-Square Test for Goodness of Fit

If a claim is made regarding the data following a certain distribution, we can use the following steps to test the claim, provided

1. the data is randomly selected

2. all expected frequencies are greater than or equal to 1.

3. at least 80% of the expected cell counts are greater than or equal to 5.
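Conditions 2 and 3 can be checked mechanically once the expected counts are known; condition 1 (random selection) depends on how the data was collected and cannot be checked from the counts. The sketch below assumes the expected counts are given as a flat list, and the function name and the sample lists are hypothetical.

```python
def chi_square_conditions_met(expected_counts):
    """Check conditions 2 and 3 on a flat list of expected counts."""
    # Condition 2: every expected count is at least 1.
    all_at_least_1 = all(e >= 1 for e in expected_counts)
    # Condition 3: the share of expected counts that are at least 5 is 80% or more.
    share_at_least_5 = sum(e >= 5 for e in expected_counts) / len(expected_counts)
    return all_at_least_1 and share_at_least_5 >= 0.80


print(chi_square_conditions_met([80, 80, 80, 80, 80]))   # all cells large
print(chi_square_conditions_met([12, 9, 6, 4, 0.5]))     # one cell below 1
print(chi_square_conditions_met([12, 9, 6, 4, 3]))       # only 60% of cells >= 5
```

Only the first list satisfies both conditions.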

EXAMPLE: Testing for Goodness of Fit

In consumer marketing, a common problem that any marketing manager faces is the selection of appropriate colors for package design. Assume that a marketing manager wishes to compare five different colors of package design. He is interested in knowing whether consumers prefer one of the five colors so that the preferred design can be introduced in the market. A random sample of 400 consumers reveals the following. Do the consumer preferences for package colors show any significant difference?

Step 1: A claim is made regarding the fit of the data to a certain distribution.

Ho: the number of customers who prefer each color is the same.

Ha: the number of customers who prefer each color is not the same.

Step 2: Calculate the expected frequencies (counts) for each cell in the contingency table.

Observed Counts

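Expected Counts: under Ho the 400 consumers are spread equally over the five colors, so each expected count is 400/5 = 80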

Step 3: Verify the requirements for the chi-square test for goodness of fit are satisfied.

(1) data is randomly selected

(2) all expected frequencies are greater than or equal to 1

(3) at least 80% of the expected cell counts are greater than or equal to 5.

Step 4: Select a proper level of significance α (here α = 0.05)

Step 5: Compute the test statistic and P-value

P-value = cdf(min, max, df), the area under the chi-square curve from min (the test statistic) to max (a very large number standing in for infinity)

χ² = 11.4, P-value = 0.0224

If P-value < α, reject the null hypothesis

11.4 > 9.49 and 0.0224 < 0.05. Therefore I would reject the null hypothesis. The result is statistically significant, and I am led to believe that there is a difference in preference among the package colors
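The P-value quoted here can be reproduced without a calculator. The sketch below is a hand-rolled series for the chi-square upper-tail area, not the method used in the slides; the function name `chi2_upper_tail` is hypothetical, and df = 4 is taken from five color categories minus one. If SciPy is installed, `scipy.stats.chi2.sf(x, df)` computes the same area.

```python
import math


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a * e^(-z) / Gamma(a) * sum_n z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower_area = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower_area


statistic, df, alpha = 11.4, 4, 0.05     # values quoted in the example
p_value = chi2_upper_tail(statistic, df)
reject_null = p_value < alpha

print(round(p_value, 4), reject_null)    # 0.0224 True
```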

The chi-square independence test: you have two characteristics of a population, and you want to see if there is any association between the characteristics (2 variables, 1 population)

In a chi-square independence test, the null hypothesis is always

Ho: the variables are independent

The alternative hypothesis is always

Ha: the variables are dependent

The idea behind testing these types of claims is to compare actual counts to the counts we would expect if the null hypothesis were true (if the variables are independent). If a significant difference between the actual counts and expected counts exists, we would take this as evidence against the null hypothesis.

The method for obtaining the expected counts requires that we determine the number of observations within each cell under the assumption the null hypothesis is true.

Expected Frequencies in a Chi-Square Independence Test

To find the expected frequency in a cell when performing a chi-square independence test, multiply the row total of the row containing the cell by the column total of the column containing the cell and divide this result by the table total. That is,

$$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$
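The row-total-times-column-total rule can be sketched in Python as follows. The 2 x 3 table is hypothetical illustration data, not the table from this presentation, and `expected_counts` is a name chosen for this sketch.

```python
def expected_counts(observed):
    """Expected count for each cell = row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in column_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
expected = expected_counts(observed)
print(expected)   # [[25.0, 30.0, 45.0], [25.0, 30.0, 45.0]]
```

Both rows have the same total (100), so under independence they get the same expected counts.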

Test Statistic for the Test of Independence

Let O_i represent the observed number of counts in the ith cell, E_i represent the expected number of counts in the ith cell. Then,

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

approximately follows the chi-square distribution with (r - 1)(c - 1) degrees of freedom, where r is the number of rows and c is the number of columns in the contingency table
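For a contingency table, the statistic and its degrees of freedom can be computed together. This is a minimal sketch assuming the observed counts are given as a list of rows; the table values and the function name `chi_square_for_table` are hypothetical.

```python
def chi_square_for_table(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / table_total   # expected count
            statistic += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(column_totals) - 1)        # (r - 1)(c - 1)
    return statistic, df


# Hypothetical 2 x 3 table of observed counts.
statistic, df = chi_square_for_table([[20, 30, 50], [30, 30, 40]])
print(round(statistic, 4), df)   # 3.1111 2
```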

The Chi-Square Test for Independence

If a claim is made regarding the association between (or independence of) two variables in a contingency table, we can use the following steps to test the claim, provided

1. the data is randomly selected

2. all expected frequencies are greater than or equal to 1.

3. at least 80% of the expected cell counts are greater than or equal to 5.

EXAMPLE: Testing for Independence

Step 1: A claim is made regarding the independence of two variables.

Ho: there is no association between gender and lifestyle choice; the variables are independent

Ha: there is an association between gender and lifestyle choice; the variables are dependent

Step 2: Calculate the expected frequencies (counts) for each cell in the contingency table.

Observed Counts

Expected Counts

Step 3: Verify the requirements for the chi-square test for independence are satisfied.

(1) data is randomly selected

(2) all expected frequencies are greater than or equal to 1

(3) at least 80% of the expected cell counts are greater than or equal to 5.

Step 4: Select a proper level of significance α (here α = 0.05)

Step 5: Compute the test statistic and P-value

P-value = cdf(min, max, df), the area under the chi-square curve from min (the test statistic) to max (a very large number standing in for infinity)

χ² = 36.84, P-value = 0.00000001

If P-value < α, reject the null hypothesis

36.84 > 5.99 and 0.00000001 < 0.05. Therefore I would reject the null hypothesis. The result is statistically significant, and I am led to believe that there is an association between gender and lifestyle choice, that is, that these variables are dependent
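Both numbers can be checked by hand. The critical value 5.99 at the 0.05 level corresponds to 2 degrees of freedom, so df = 2 is inferred here rather than stated in the slides. For 2 degrees of freedom the chi-square upper-tail area has a simple closed form, which gives the following worked check:

```latex
P(\chi^2_{2} \ge x) = e^{-x/2}
\quad\Longrightarrow\quad
P\text{-value} = e^{-36.84/2} = e^{-18.42} \approx 1.0 \times 10^{-8},
\qquad
\chi^2_{0.05} = -2\ln(0.05) \approx 5.99
```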

The chi-square test for homogeneity: you take samples from different populations, and you want to test whether the proportions in various categories are the same for each population (1 variable, multiple populations)

In a chi-square homogeneity test, the null hypothesis is always

Ho: populations have the same proportion of individuals with some characteristic.

The alternative hypothesis is always

Ha: populations have different proportions of individuals with some characteristic.

The idea behind testing these types of claims is to compare actual counts to the counts we would expect if the null hypothesis were true (proportions are equal). If a significant difference between the actual counts and expected counts exists, we would take this as evidence against the null hypothesis.

The method for obtaining the expected counts requires that we determine the number of observations within each cell under the assumption the null hypothesis is true.

Expected Frequencies in a Chi-Square Homogeneity Test

To find the expected frequency in a cell when performing a chi-square homogeneity test, multiply the row total of the row containing the cell by the column total of the column containing the cell and divide this result by the table total. That is,

$$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$

Test Statistic for the Test of Homogeneity

Let O_i represent the observed number of counts in the ith cell, E_i represent the expected number of counts in the ith cell. Then,

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

approximately follows the chi-square distribution with (r - 1)(c - 1) degrees of freedom, where r is the number of rows and c is the number of columns in the contingency table

The Chi-Square Test for Homogeneity

If a claim is made that different populations have the same proportion of individuals with some characteristic, we can use the following steps to test the claim, provided

1. the data is randomly selected

2. all expected frequencies are greater than or equal to 1.

3. at least 80% of the expected cell counts are greater than or equal to 5.

EXAMPLE: A Test of Homogeneity of Proportions

The following question was asked of a random sample of individuals in 1992, 1998, and 2001: “Would you tell me if you feel being a teacher is an occupation of very great prestige?” The results of the survey are presented below:

Step 1: A claim is made regarding the equality of proportions across populations.

Ho: the proportions of individuals who feel teaching is an occupation of very great prestige in each year are equal

Ha: the proportions of individuals who feel teaching is an occupation of very great prestige in each year are not equal

Step 2: Calculate the expected frequencies (counts) for each cell in the contingency table.

Observed Counts

Expected Counts

Step 3: Verify the requirements for the chi-square test for homogeneity are satisfied.

(1) data is randomly selected

(2) all expected frequencies are greater than or equal to 1

(3) at least 80% of the expected cell counts are greater than or equal to 5.

Step 4: Select a proper level of significance α (here α = 0.01)

Step 5: Compute the test statistic and P-value

P-value = cdf(min, max, df), the area under the chi-square curve from min (the test statistic) to max (a very large number standing in for infinity)

χ² = 2.26, P-value = 0.3228

If P-value < α, reject the null hypothesis

2.26 < 9.21 and 0.3228 > 0.01. Therefore I would fail to reject the null hypothesis. The result is not statistically significant, and I cannot conclude that the proportion of individuals who feel teaching is an occupation of very great prestige differs from year to year
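The decision rule for this example can be sketched in Python. The critical value 9.21 at the 0.01 level corresponds to 2 degrees of freedom, so df = 2 is inferred rather than stated in the slides; for that special case the upper-tail area is exp(-x/2), which keeps the sketch short. The function names are hypothetical. The P-value comes out as 0.323 rather than 0.3228 because the statistic is quoted only to two decimals.

```python
import math


def chi2_upper_tail_df2(x):
    """P(X >= x) for a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Value cut off by an upper-tail area of alpha, for 2 degrees of freedom."""
    return -2 * math.log(alpha)


statistic, alpha = 2.26, 0.01            # values quoted in the example
p_value = chi2_upper_tail_df2(statistic)
critical_value = chi2_critical_df2(alpha)
reject_null = p_value < alpha            # same decision as statistic > critical_value

print(round(p_value, 3), round(critical_value, 2), reject_null)   # 0.323 9.21 False
```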