
A PowerPoint Presentation Package to Accompany

Applied Statistics in Business & Economics, 4th edition David P. Doane and Lori E. Seward

Prepared by Lloyd R. Jaisingh

15.1 Chi-Square Test for Independence

15.2 Chi-Square Tests for Goodness-of-Fit

15.3 Uniform Goodness-of-Fit Test

15.4 Poisson Goodness-of-Fit Test

15.5 Normal Chi-Square Goodness-of-Fit Test

15.6 ECDF Tests (Optional)

Chapter 15: Chi-Square Tests

LO15-1: Recognize a contingency table.

LO15-2: Find degrees of freedom and use the chi-square table of critical values.

LO15-3: Perform a chi-square test for independence on a contingency table.

LO15-4: Perform a goodness-of-fit (GOF) test for a uniform distribution.

LO15-5: Explain the GOF test for a Poisson distribution.

LO15-6: Use computer software to perform a chi-square GOF test for normality.

LO15-7: State advantages of ECDF tests as compared to chi-square GOF tests.

15.1 Chi-Square Test for Independence

Contingency Tables

LO15-1: Recognize a contingency table.

- A contingency table is a cross-tabulation of n paired observations into categories.
- Each cell shows the count of observations that fall into the category defined by its row (r) and column (c) heading.
- For example:
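As a minimal sketch of cross-tabulation, the following Python builds a 2 × 2 contingency table from paired observations. The category labels and the data are hypothetical, invented only for illustration, and the helper name `cross_tabulate` is not from the text.

```python
from collections import Counter

# Hypothetical paired observations: (row category, column category).
pairs = [
    ("Under 40", "Yes"), ("Under 40", "No"), ("Under 40", "Yes"),
    ("40 and over", "No"), ("40 and over", "No"), ("40 and over", "Yes"),
    ("Under 40", "Yes"), ("40 and over", "No"),
]

def cross_tabulate(pairs):
    """Return (row_labels, col_labels, table); table[j][k] is a cell count."""
    counts = Counter(pairs)
    row_labels = sorted({r for r, _ in pairs})
    col_labels = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

rows, cols, table = cross_tabulate(pairs)
print("columns:", cols)
for label, cell_counts in zip(rows, table):
    print(label, cell_counts)
```

Each inner list is one row of the table, and the cell counts add up to the number of pairs n.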


Chi-Square Test

LO15-3: Perform a chi-square test for independence on a contingency table.

- In a test of independence for an r × c contingency table, the hypotheses are:
  - H0: Variable A is independent of variable B
  - H1: Variable A is not independent of variable B
- Use the chi-square test for independence to test these hypotheses.
- This non-parametric test is based on frequencies.
- The n data pairs are classified into c columns and r rows, and then the observed frequency $f_{jk}$ is compared with the expected frequency $e_{jk}$.
- The critical value comes from the chi-square probability distribution with $\nu$ degrees of freedom (see Appendix E for table values).

LO15-2: Find degrees of freedom and use the chi-square table of critical values.

$\nu$ = d.f. = degrees of freedom = (r – 1)(c – 1), where r = number of rows in the table and c = number of columns in the table.


Expected Frequencies

- Assuming that H0 is true, the expected frequency of row j and column k is:

$$e_{jk} = \frac{R_j C_k}{n}$$

where $R_j$ = total for row j (j = 1, 2, …, r), $C_k$ = total for column k (k = 1, 2, …, c), and n = sample size.
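The expected-frequency formula can be sketched in a few lines of Python. The observed counts below are hypothetical, and `expected_frequencies` is an illustrative helper name rather than anything defined in the text.

```python
def expected_frequencies(observed):
    """Expected counts under independence: e_jk = R_j * C_k / n."""
    row_totals = [sum(row) for row in observed]        # R_j
    col_totals = [sum(col) for col in zip(*observed)]  # C_k
    n = sum(row_totals)                                # sample size
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[30, 10], [20, 40]]  # hypothetical 2 x 2 table
expected = expected_frequencies(observed)
print(expected)
```

The expected table has the same row and column totals as the observed table, which is a quick way to check the arithmetic.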

Steps in Testing the Hypotheses

- Step 1: State the Hypotheses.
  - H0: Variable A is independent of variable B
  - H1: Variable A is not independent of variable B
- Step 2: Specify the Decision Rule.
  - Calculate d.f. = (r – 1)(c – 1)
  - For a given $\alpha$, look up the right-tail critical value ($\chi^2_R$) from Appendix E or by using Excel.
- Step 3: Calculate the Expected Frequencies.
  - $e_{jk} = R_j C_k / n$


- Step 4: Calculate the Test Statistic.
  - The chi-square test statistic is $\chi^2_{calc} = \sum_{j=1}^{r} \sum_{k=1}^{c} \frac{(f_{jk} - e_{jk})^2}{e_{jk}}$
- Step 5: Make the Decision.
  - Reject H0 if $\chi^2_{calc} > \chi^2_R$ or if the p-value ≤ $\alpha$.
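The steps above can be sketched as follows, under stated assumptions: the observed table is hypothetical, the significance level is fixed at .05, and the small dictionary holds standard right-tail chi-square table values for 1 to 4 degrees of freedom (in place of Appendix E or Excel).

```python
# Right-tail critical values at alpha = .05 (standard chi-square table values).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_square_independence(observed):
    """Return (test statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for j, row in enumerate(observed):
        for k, f in enumerate(row):
            e = row_totals[j] * col_totals[k] / n  # expected frequency e_jk
            stat += (f - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

observed = [[30, 10], [20, 40]]  # hypothetical counts
stat, df = chi_square_independence(observed)
reject = stat > CRITICAL_05[df]  # decision rule: reject H0 if stat > critical value
print(round(stat, 3), df, reject)
```

For tables with more than 4 degrees of freedom, or another significance level, the critical value would have to come from a fuller table or from software.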

Small Expected Frequencies

- The chi-square test is unreliable if the expected frequencies are too small.
- Rules of thumb:
  - Cochran’s Rule requires that $e_{jk} > 5$ for all cells.
  - A less strict rule allows up to 20% of the cells to have $e_{jk} < 5$.
  - Most agree that a chi-square test is infeasible if $e_{jk} < 1$ in any cell.
- If this happens, try combining adjacent rows or columns to enlarge the expected frequencies.
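A rough way to apply these rules of thumb is to scan the table of expected frequencies before running the test. The sketch below uses the thresholds exactly as stated above (some texts phrase Cochran's Rule as "at least 5"); the expected counts and the function name `small_frequency_report` are hypothetical.

```python
def small_frequency_report(expected):
    """Summarize how the expected frequencies fare against the rules of thumb."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(1 for e in cells if e < 5) / len(cells)
    return {
        "cochran_ok": all(e > 5 for e in cells),   # every cell above 5
        "share_below_5": share_below_5,
        "relaxed_ok": share_below_5 <= 0.20,       # at most 20% of cells below 5
        "any_below_1": any(e < 1 for e in cells),  # test considered infeasible
    }

expected = [[12.0, 8.0, 4.0], [18.0, 12.0, 6.0]]  # hypothetical expected counts
report = small_frequency_report(expected)
print(report)
```

Here one cell in six falls below 5, so the table fails the strict rule but passes the relaxed one.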


Test of Two Proportions

- Chi-square tests for independence can also be used to analyze quantitative variables by coding them into categories.

- For a 2 × 2 contingency table, the chi-square test is equivalent to a two-tailed z test for two proportions, if the samples are large enough to ensure normality.
- The hypotheses are:
  - H0: $\pi_1 = \pi_2$
  - H1: $\pi_1 \neq \pi_2$
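The equivalence for a 2 × 2 table can be checked numerically: the square of the pooled two-proportion z statistic equals the chi-square statistic. The sketch below assumes the usual pooled-proportion form of the z test; the counts (30 successes out of 40 and 20 out of 60) are hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: pi1 = pi2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2 x 2 table of successes and failures."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    return sum(
        (observed[j][k] - row_totals[j] * col_totals[k] / n) ** 2
        / (row_totals[j] * col_totals[k] / n)
        for j in range(2) for k in range(2)
    )

z = two_proportion_z(30, 40, 20, 60)
chi2 = chi_square_2x2(30, 40, 20, 60)
print(round(z ** 2, 6), round(chi2, 6))  # the two values agree
```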

Cross-Tabulating Raw Data

Why Do a Chi-Square Test on Numerical Data?

- The researcher may believe there’s a relationship between X and Y, but doesn’t want to use regression.
- There are outliers or anomalies that prevent us from assuming that the data came from a normal population.
- The researcher has numerical data for one variable but not the other.

Figure 14.6

15.2 Chi-Square Tests for Goodness-of-Fit

Purpose of the Test

- The goodness-of-fit (GOF) test helps you decide whether your sample resembles a particular kind of population.
- The chi-square test will be used because it is versatile and easy to understand.

Multinomial GOF Test

- A multinomial distribution is defined by any k probabilities $\pi_1, \pi_2, \dots, \pi_k$ that sum to unity. For example,

H0: $\pi_1 = .13$, $\pi_2 = .13$, $\pi_3 = .24$, $\pi_4 = .20$, $\pi_5 = .16$, $\pi_6 = .14$

H1: At least one of the $\pi_j$ differs from the hypothesized value.

- No parameters are estimated (m = 0) and there are c = 6 classes, so the degrees of freedom will be d.f. = c – m – 1 = 6 – 0 – 1 = 5.
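As an illustration of the multinomial case, the sketch below uses the six hypothesized probabilities from the example together with hypothetical observed counts (n = 200, invented here). The critical value 11.070 is the standard table value for d.f. = 5 at a .05 significance level.

```python
def gof_statistic(observed, probs):
    """Chi-square GOF statistic with expected counts e_j = n * p_j."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    return stat, expected

probs = [.13, .13, .24, .20, .16, .14]  # hypothesized probabilities from H0
observed = [30, 22, 50, 38, 34, 26]     # hypothetical counts, n = 200
m = 0                                   # no parameters estimated
stat, expected = gof_statistic(observed, probs)
df = len(probs) - m - 1                 # c - m - 1 = 5
reject = stat > 11.070                  # table value for d.f. = 5, alpha = .05
print(round(stat, 3), df, reject)
```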


Hypotheses for GOF

- The hypotheses are:

H0: The population follows a _____ distribution

H1: The population does not follow a _____ distribution

- The blank may contain the name of any theoretical distribution (e.g., uniform, Poisson, normal).

Test Statistic and Degrees of Freedom for GOF

$$\chi^2_{calc} = \sum_{j=1}^{c} \frac{(f_j - e_j)^2}{e_j}$$

where $f_j$ = the observed frequency of observations in class j and $e_j$ = the expected frequency in class j if H0 were true.

- The test statistic follows the chi-square distribution with degrees of freedom d.f. = c – m – 1, where c is the number of classes used in the test and m is the number of parameters estimated.

15.3 Uniform Goodness-of-Fit Test

Uniform Distribution

LO15-4: Perform a goodness-of-fit (GOF) test for a uniform distribution.

- The uniform goodness-of-fit test is a special case of the multinomial in which every value has the same chance of occurrence.
- The chi-square test for a uniform distribution compares all c groups simultaneously.
- The hypotheses are:

H0: $\pi_1 = \pi_2 = \dots = \pi_c = 1/c$

H1: Not all $\pi_j$ are equal

- The test can be performed on data that are already tabulated into groups.
- Calculate the expected frequency $e_j$ for each cell.
- The degrees of freedom are d.f. = c – 1 since there are no parameters for the uniform distribution.
- Obtain the critical value $\chi^2_\alpha$ from Appendix E for the desired level of significance $\alpha$.
- The p-value can be obtained from Excel.
- Reject H0 if p-value ≤ $\alpha$.
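The grouped-data procedure can be sketched in Python without a statistics package. The helper `chi2_sf` below is an assumption of this sketch, standing in for the Excel p-value: it computes the upper-tail chi-square probability from the finite-sum closed forms that hold for integer degrees of freedom. The observed counts are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-tailed normal area plus a finite correction sum.
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total

def uniform_gof(observed):
    """Return (statistic, d.f., p-value) for H0: all c groups equally likely."""
    n, c = sum(observed), len(observed)
    e = n / c  # same expected frequency in every group
    stat = sum((f - e) ** 2 / e for f in observed)
    df = c - 1
    return stat, df, chi2_sf(stat, df)

observed = [18, 22, 25, 15, 20]  # hypothetical counts in c = 5 groups
stat, df, p_value = uniform_gof(observed)
reject = p_value <= 0.05
print(round(stat, 3), df, round(p_value, 4), reject)
```

As a sanity check on the helper, `chi2_sf(9.488, 4)` and `chi2_sf(3.841, 1)` both come out close to .05, matching the usual table values.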


Uniform GOF Test: Raw Data

- First form c bins of equal width and create a frequency distribution.
- Calculate the observed frequency fj for each bin.
- Define $e_j = n/c$.
- Perform the chi-square calculations.
- The degrees of freedom are d.f. = c – 1 since there are no parameters for the uniform distribution.
- Obtain the critical value from Appendix E for a given significance level $\alpha$ and make the decision.
- Maximize the test’s power by choosing the bin width so that the expected frequencies are as large as possible.
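A minimal sketch of the raw-data version follows. It assumes the c equal-width bins span the sample range from the smallest to the largest observation; that choice is an assumption of this example, as are the data values and the helper name `bin_counts`.

```python
def bin_counts(data, c):
    """Count observations in c equal-width bins spanning the sample range."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / c
    counts = [0] * c
    for x in data:
        j = min(int((x - lo) / width), c - 1)  # the maximum goes in the last bin
        counts[j] += 1
    return counts, width

# Hypothetical raw data.
data = [0, 7, 12, 18, 25, 31, 36, 44, 49, 55, 63, 68, 77, 81, 92, 100]
c = 4
observed, width = bin_counts(data, c)
e = len(data) / c                          # e_j = n / c
stat = sum((f - e) ** 2 / e for f in observed)
df = c - 1
reject = stat > 7.815                      # table value for d.f. = 3, alpha = .05
print(observed, width, round(stat, 3), df, reject)
```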


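- Calculate the mean and standard deviation of the uniform distribution on [a, b] as $\mu = (a + b)/2$ and $\sigma = \sqrt{(b - a)^2/12}$.
- If the data are not skewed and the sample size is large (n > 30), then the sample mean is approximately normally distributed.
- So, test the hypothesized uniform mean using $z_{calc} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.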
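The z test of the uniform mean can be sketched as below, using the standard mean and standard deviation of a uniform distribution on [a, b]. The limits, sample mean, and sample size are hypothetical values chosen for illustration.

```python
import math

def uniform_mean_z(x_bar, n, a, b):
    """z statistic for testing the mean of a hypothesized uniform on [a, b]."""
    mu = (a + b) / 2                 # mean of the uniform distribution
    sigma = (b - a) / math.sqrt(12)  # standard deviation of the uniform distribution
    return (x_bar - mu) / (sigma / math.sqrt(n))

# Hypothetical sample: mean 54 from n = 48 observations, H0 uniform on [0, 100].
z = uniform_mean_z(x_bar=54.0, n=48, a=0, b=100)
reject = abs(z) > 1.96  # two-tailed test at alpha = .05
print(round(z, 3), reject)
```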

15.4 Poisson Goodness-of-Fit Test

Poisson Data-Generating Situations

LO15-5: Explain the GOF test for a Poisson distribution.

- In a Poisson distribution model, X represents the number of events per unit of time or space.
- X is a discrete nonnegative integer (X = 0, 1, 2, …).
- Event arrivals must be independent of each other.
- Sometimes called a model of rare events because X typically has a small mean.

Poisson Goodness-of-Fit Test

- The mean $\lambda$ is the only parameter.
- If $\lambda$ is unknown, it must be estimated from the sample.
- Use the estimated $\lambda$ to find the Poisson probability P(X) for each value of X.
- Compute the expected frequencies.
- Perform the chi-square calculations.
- Make the decision.
- You may need to combine classes until expected frequencies become large enough for the test (at least until $e_j > 2$).


Poisson GOF Test: Tabulated Data

- Calculate the sample mean as $\hat{\lambda} = \dfrac{\sum_{j=1}^{c} x_j f_j}{n}$.
- Using this estimated mean, calculate the Poisson probabilities either by using the Poisson formula $P(x) = \lambda^x e^{-\lambda}/x!$ or Excel.
- For c classes with m = 1 parameter estimated, the degrees of freedom are d.f. = c – m – 1 = c – 2.
- Obtain the critical value for a given $\alpha$ from Appendix E.
- Make the decision.
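The tabulated-data procedure can be sketched as follows. The frequencies are hypothetical, the classes are assumed to be consecutive values starting at 0 with the last class open-ended ("4 or more"), and the open-ended class is counted at its lower value when estimating the mean. Those are all assumptions of this example, as is the helper name `poisson_gof`.

```python
import math

def poisson_gof(values, freqs, min_expected=2.0):
    """Chi-square GOF for a Poisson model from tabulated data.

    values must be 0, 1, 2, ... with the last class treated as open-ended.
    """
    n = sum(freqs)
    lam = sum(x * f for x, f in zip(values, freqs)) / n  # estimated mean (m = 1)
    probs = [lam ** x * math.exp(-lam) / math.factorial(x) for x in values[:-1]]
    probs.append(1.0 - sum(probs))                       # open-ended last class
    observed = list(freqs)
    expected = [n * p for p in probs]
    # Combine classes from the upper tail while the last expected count is too small.
    while len(expected) > 2 and expected[-1] <= min_expected:
        expected[-2] += expected.pop()
        observed[-2] += observed.pop()
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = len(expected) - 1 - 1                           # c - m - 1 with m = 1
    return lam, expected, stat, df

values = [0, 1, 2, 3, 4]     # 4 stands for "4 or more"
freqs = [28, 36, 22, 10, 4]  # hypothetical counts
lam, expected, stat, df = poisson_gof(values, freqs)
reject = stat > 7.815        # table value for d.f. = 3, alpha = .05
print(round(lam, 2), df, round(stat, 3), reject)
```

Because one parameter is estimated from the sample, the degrees of freedom drop to c – 2 after any classes have been combined.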

15.5 Normal Chi-Square Goodness-of-Fit Test

Normal Data-Generating Situations

LO15-6: Use computer software to perform a chi-square GOF test for normality.

- Two parameters, the mean $\mu$ and the standard deviation $\sigma$, fully describe the normal distribution.
- Unless $\mu$ and $\sigma$ are known a priori, they must be estimated from a sample.
- Using these statistics, the chi-square goodness-of-fit test can be used.

Method 1: Standardizing the Data

- Transform the sample observations $x_1, x_2, \dots, x_n$ into standardized values $z_i = (x_i - \bar{x})/s$.


Method 2: Equal Bin Widths

- To obtain equal-width bins, divide the exact data range into c groups of equal width.
  - Step 1: Count the sample observations in each bin to get observed frequencies $f_j$.
  - Step 2: Convert the bin limits into standardized z-values by using the formula $z = (x - \bar{x})/s$.
  - Step 3: Find the normal area within each bin assuming a normal distribution.
  - Step 4: Find expected frequencies $e_j$ by multiplying each normal area by the sample size n.
- Classes may need to be collapsed from the ends inward to enlarge expected frequencies.
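The four steps can be sketched with Python's standard library. The data are hypothetical, and this sketch treats the first and last bins as open-ended so that the expected frequencies sum to n; that is a choice made here rather than something stated above. With the mean and standard deviation both estimated, m = 2.

```python
from statistics import NormalDist, mean, stdev

def normal_gof_equal_width(data, c):
    """Chi-square GOF for normality using c equal-width bins."""
    n = len(data)
    x_bar, s = mean(data), stdev(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / c
    # Step 1: observed frequencies.
    observed = [0] * c
    for x in data:
        observed[min(int((x - lo) / width), c - 1)] += 1
    # Step 2: standardize the interior bin limits.
    z_cuts = [(lo + width * j - x_bar) / s for j in range(1, c)]
    # Step 3: normal area in each bin (end bins open-ended).
    cdf = [0.0] + [NormalDist().cdf(z) for z in z_cuts] + [1.0]
    areas = [cdf[j + 1] - cdf[j] for j in range(c)]
    # Step 4: expected frequencies.
    expected = [n * a for a in areas]
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = c - 2 - 1  # c - m - 1 with m = 2 estimated parameters
    return observed, expected, stat, df

# Hypothetical sample of n = 20 observations.
data = [41, 45, 47, 48, 50, 51, 52, 53, 54, 55,
        55, 56, 57, 58, 59, 60, 62, 63, 65, 69]
observed, expected, stat, df = normal_gof_equal_width(data, c=4)
reject = stat > 3.841  # table value for d.f. = 1, alpha = .05
print(observed, [round(e, 2) for e in expected], round(stat, 3), df, reject)
```

In this small example the end bins have expected frequencies below 5, which is the situation where collapsing classes or gathering more data would be considered.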


Method 3: Equal Expected Frequencies

- Define histogram bins in such a way that an equal number of observations would be expected within each bin under the null hypothesis.
- Define bin limits so that $e_j = n/c$.
- A normal area of 1/c in each of the c bins is desired.
- The first and last classes must be open-ended for a normal distribution, so to define c bins, we need c – 1 cut-points.
- The upper limit of bin j can be found directly by using Excel.
- Alternatively, find $z_j$ for bin j using Excel and then calculate the upper limit for bin j as $\bar{x} + z_j s$.
- Once the bins are defined, count the observations $f_j$ within each bin and compare them with the expected frequencies $e_j = n/c$.
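The cut-point calculation can be sketched with `statistics.NormalDist` in place of Excel. The sample mean, standard deviation, and data values below are hypothetical, and an observation that falls exactly on a cut-point is assigned to the lower bin, which is a convention chosen for this example.

```python
import bisect
from statistics import NormalDist

def equal_frequency_cutpoints(x_bar, s, c):
    """Return the c - 1 upper limits that give each bin normal area 1/c."""
    return [x_bar + NormalDist().inv_cdf(j / c) * s for j in range(1, c)]

def count_in_bins(data, cuts):
    """Observed frequencies; the first and last bins are open-ended."""
    counts = [0] * (len(cuts) + 1)
    for x in data:
        counts[bisect.bisect_left(cuts, x)] += 1  # x equal to a cut goes in the lower bin
    return counts

cuts = equal_frequency_cutpoints(x_bar=50.0, s=10.0, c=4)
data = [38, 42, 47, 49, 51, 53, 58, 61]  # hypothetical observations
observed = count_in_bins(data, cuts)
expected = len(data) / 4                 # e_j = n / c
print([round(cut, 2) for cut in cuts], observed, expected)
```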

15.6 ECDF Tests

LO15-7: State advantages of ECDF tests as compared to chi-square GOF tests.

- There are many alternatives to the chi-square test based on the Empirical Cumulative Distribution Function (ECDF).
- The Kolmogorov-Smirnov (K-S) test uses the largest absolute difference between the actual and expected cumulative relative frequency of the n data values.
- The K-S test is not recommended for grouped data.
- The K-S test assumes that no parameters are estimated.
- If parameters are estimated, use a Lilliefors test.
- Both of these tests are done by computer.
- The Anderson-Darling (A-D) test is widely used to test for non-normality because of its power.
- The A-D test is based on a probability plot.
- When the data fit the hypothesized distribution closely, the probability plot will be close to a straight line.
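To make the K-S idea concrete, the sketch below computes the usual one-sample K-S statistic: the largest gap between the empirical cumulative distribution and a fully specified hypothesized CDF, checked just before and at each data point. The data and the uniform(0, 1) hypothesis are hypothetical, and critical values or p-values are left to software, as noted above.

```python
def ks_statistic(data, cdf):
    """Largest absolute difference between the ECDF and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The ECDF jumps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def uniform01_cdf(x):
    """CDF of the uniform distribution on [0, 1] (no parameters estimated)."""
    return min(max(x, 0.0), 1.0)

d = ks_statistic([0.80, 0.10, 0.50, 0.35], uniform01_cdf)
print(d)
```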
