Prepared by Lloyd R. Jaisingh

A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 4th edition, David P. Doane and Lori E. Seward. Prepared by Lloyd R. Jaisingh. Chapter Contents: 15.1 Chi-Square Test for Independence, 15.2 Chi-Square Tests for Goodness-of-Fit

A PowerPoint Presentation Package to Accompany

Applied Statistics in Business & Economics, 4th edition David P. Doane and Lori E. Seward

Prepared by Lloyd R. Jaisingh

Chapter Contents

15.1 Chi-Square Test for Independence

15.2 Chi-Square Tests for Goodness-of-Fit

15.3 Uniform Goodness-of-Fit Test

15.4 Poisson Goodness-of-Fit Test

15.5 Normal Chi-Square Goodness-of-Fit Test

15.6 ECDF Tests (Optional)

Chi-Square Tests

Chapter 15

Chapter Learning Objectives

LO15-1: Recognize a contingency table.

LO15-2: Find degrees of freedom and use the chi-square table of critical values.

LO15-3: Perform a chi-square test for independence on a contingency table.

LO15-4: Perform a goodness-of-fit (GOF) test for a uniform distribution.

LO15-5: Explain the GOF test for a Poisson distribution.

LO15-6: Use computer software to perform a chi-square GOF test for normality.

LO15-7: State advantages of ECDF tests as compared to chi-square GOF tests.

15.1 Chi-Square Test for Independence

LO15-1: Recognize a contingency table.

Contingency Tables

  • A contingency table is a cross-tabulation of n paired observations into categories.
  • Each cell shows the count of observations that fall into the category defined by its row (r) and column (c) heading.
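
As an illustration of cross-tabulating paired observations into a contingency table, here is a minimal Python sketch. The data, the category labels, and the helper name `cross_tabulate` are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical paired observations: (row category, column category).
pairs = [
    ("Cash", "Weekday"), ("Card", "Weekday"), ("Card", "Weekend"),
    ("Cash", "Weekend"), ("Card", "Weekday"), ("Card", "Weekend"),
    ("Cash", "Weekday"), ("Card", "Weekend"),
]

def cross_tabulate(pairs):
    """Return (row_labels, col_labels, table); table[j][k] is the cell count."""
    counts = Counter(pairs)
    row_labels = sorted({a for a, _ in pairs})
    col_labels = sorted({b for _, b in pairs})
    table = [[counts[(a, b)] for b in col_labels] for a in row_labels]
    return row_labels, col_labels, table

row_labels, col_labels, table = cross_tabulate(pairs)
for label, row in zip(row_labels, table):
    print(label, row)
```

Each inner list is one row of the table, and the cell counts add up to the number of pairs n.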

LO15-3: Perform a chi-square test for independence on a contingency table.

Chi-Square Test

  • In a test of independence for an r × c contingency table, the hypotheses are:
    H0: Variable A is independent of variable B
    H1: Variable A is not independent of variable B
  • Use the chi-square test for independence to test these hypotheses.
  • This non-parametric test is based on frequencies.
  • The n data pairs are classified into c columns and r rows, and then the observed frequency $f_{jk}$ is compared with the expected frequency $e_{jk}$.

LO15-2: Find degrees of freedom and use the chi-square table of critical values.

  • The critical value comes from the chi-square probability distribution with $\nu$ degrees of freedom (see Appendix E for table values):

$\nu$ = d.f. = degrees of freedom = (r – 1)(c – 1)

where r = number of rows in the table and c = number of columns in the table.

Expected Frequencies

  • Assuming that H0 is true, the expected frequency of row j and column k is:

$$e_{jk} = \frac{R_j C_k}{n}$$

where $R_j$ = total for row j (j = 1, 2, …, r), $C_k$ = total for column k (k = 1, 2, …, c), and n = sample size.
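
The expected-frequency formula can be sketched in a few lines of Python. The observed table below is hypothetical, and `expected_frequencies` is an illustrative helper name rather than anything defined in the slides.

```python
def expected_frequencies(observed):
    """Expected count for each cell under independence: R_j * C_k / n."""
    row_totals = [sum(row) for row in observed]          # R_j
    col_totals = [sum(col) for col in zip(*observed)]    # C_k
    n = sum(row_totals)                                  # sample size
    return [[rj * ck / n for ck in col_totals] for rj in row_totals]

# Hypothetical 2 x 3 table of observed frequencies.
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_frequencies(observed)
print(expected)
```

Here both rows have the same total (60), so both rows of expected frequencies are identical: 15, 20, and 25.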

Steps in Testing the Hypotheses

  • Step 1: State the Hypotheses.
    • H0: Variable A is independent of variable B
    • H1: Variable A is not independent of variable B
  • Step 2: Specify the Decision Rule.
    • Calculate d.f. = (r – 1)(c – 1)
    • For a given $\alpha$, look up the right-tail critical value $\chi^2_R$ from Appendix E or by using Excel.
  • Step 3: Calculate the Expected Frequencies.
    • $e_{jk} = R_j C_k / n$
  • Step 4: Calculate the Test Statistic.
    • The chi-square test statistic is

$$\chi^2_{calc} = \sum_{j=1}^{r} \sum_{k=1}^{c} \frac{(f_{jk} - e_{jk})^2}{e_{jk}}$$

  • Step 5: Make the Decision.
    • Reject H0 if $\chi^2_{calc} > \chi^2_R$ or if the p-value ≤ $\alpha$.
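
The testing steps can be traced end to end with a short Python sketch. It assumes the usual Pearson statistic (the sum of (observed − expected)² / expected over all cells), uses a hypothetical observed table, and hard-codes a few right-tail critical values for a significance level of 0.05 copied from a standard chi-square table in place of Appendix E.

```python
# Right-tail critical values for alpha = 0.05 from a standard chi-square table,
# keyed by degrees of freedom (only a few rows are included here).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for j, row in enumerate(observed):
        for k, f in enumerate(row):
            e = row_totals[j] * col_totals[k] / n   # expected frequency
            chi2 += (f - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical observed frequencies (2 rows, 3 columns).
observed = [[10, 20, 30],
            [20, 20, 20]]
chi2, df = chi_square_independence(observed)
reject = chi2 > CRITICAL_05[df]
print(f"chi2 = {chi2:.3f}, d.f. = {df}, reject H0: {reject}")
```

For this table the statistic is 16/3 ≈ 5.333 with 2 degrees of freedom, which is below the critical value 5.991, so H0 is not rejected at the 0.05 level.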

Small Expected Frequencies

  • The chi-square test is unreliable if the expected frequencies are too small.
  • Rules of thumb:
    • Cochran’s Rule requires that $e_{jk} > 5$ for all cells.
    • Up to 20% of the cells may have $e_{jk} < 5$.
  • Most agree that a chi-square test is infeasible if $e_{jk} < 1$ in any cell.
  • If this happens, try combining adjacent rows or columns to enlarge the expected frequencies.
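
One possible way to screen a table of expected frequencies against these rules of thumb is sketched below. The function name `assess_expected` and its return labels are illustrative assumptions; the thresholds (5, 20%, and 1) are the ones listed above.

```python
def assess_expected(expected):
    """Classify a table of expected frequencies using the rules of thumb."""
    cells = [e for row in expected for e in row]
    if min(cells) < 1:
        return "infeasible"            # some cell has e_jk < 1
    if all(e > 5 for e in cells):
        return "cochran"               # Cochran's Rule is satisfied
    share_small = sum(e < 5 for e in cells) / len(cells)
    if share_small <= 0.20:
        return "relaxed"               # at most 20% of cells below 5
    return "combine"                   # try combining rows or columns

print(assess_expected([[15, 20, 25], [15, 20, 25]]))   # cochran
print(assess_expected([[4, 10, 10], [10, 10, 10]]))    # relaxed
print(assess_expected([[4, 4], [10, 10]]))             # combine
print(assess_expected([[0.5, 10], [10, 10]]))          # infeasible
```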

Cross-Tabulating Raw Data

  • Chi-square tests for independence can also be used to analyze quantitative variables by coding them into categories.

Why Do a Chi-Square Test on Numerical Data?

  • The researcher may believe there’s a relationship between X and Y, but doesn’t want to use regression.
  • There are outliers or anomalies that prevent us from assuming that the data came from a normal population.
  • The researcher has numerical data for one variable but not the other.

Test of Two Proportions

  • For a 2 × 2 contingency table, the chi-square test is equivalent to a two-tailed z test for two proportions, if the samples are large enough to ensure normality.
  • The hypotheses are:
    H0: $\pi_1 = \pi_2$
    H1: $\pi_1 \neq \pi_2$
    where $\pi_1$ and $\pi_2$ are the two population proportions.
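
The equivalence between the 2 × 2 chi-square test and the two-tailed z test for two proportions (stated under Test of Two Proportions) can be checked numerically. The sketch below assumes the pooled-proportion form of the z statistic and a chi-square statistic without continuity correction; the counts are hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2 x 2 table of successes and failures."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    n = n1 + n2
    chi2 = 0.0
    for j in range(2):
        for k in range(2):
            e = row_totals[j] * col_totals[k] / n
            chi2 += (observed[j][k] - e) ** 2 / e
    return chi2

# Hypothetical data: 30 of 50 successes in sample 1, 20 of 50 in sample 2.
z = two_proportion_z(30, 50, 20, 50)
chi2 = chi_square_2x2(30, 50, 20, 50)
print(f"z^2 = {z ** 2:.4f}, chi2 = {chi2:.4f}")
```

The squared z statistic matches the chi-square statistic (both equal 4 here), which is why the two tests lead to the same two-tailed decision.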

Figure 14.6

15.2 Chi-Square Tests for Goodness-of-Fit

Purpose of the Test

  • The goodness-of-fit (GOF) test helps you decide whether your sample resembles a particular kind of population.
  • The chi-square test will be used because it is versatile and easy to understand.

Multinomial GOF Test

  • A multinomial distribution is defined by any k probabilities $\pi_1, \pi_2, \dots, \pi_k$ that sum to unity. For example,

H0: $\pi_1$ = .13, $\pi_2$ = .13, $\pi_3$ = .24, $\pi_4$ = .20, $\pi_5$ = .16, $\pi_6$ = .14
H1: At least one of the $\pi_j$ differs from the hypothesized value.

  • Here no parameters are estimated (m = 0) and there are c = 6 classes, so the degrees of freedom are d.f. = c – m – 1 = 6 – 0 – 1 = 5.
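
A minimal Python sketch of the multinomial GOF calculation follows. The hypothesized probabilities are the ones in H0 above; the observed counts are hypothetical, and the critical value 11.070 (α = 0.05, 5 degrees of freedom) is taken from a standard chi-square table.

```python
# Hypothesized class probabilities from H0 (they sum to 1).
probs = [0.13, 0.13, 0.24, 0.20, 0.16, 0.14]

# Hypothetical observed frequencies for the six classes (n = 100).
observed = [10, 15, 25, 22, 14, 14]

n = sum(observed)
expected = [n * p for p in probs]                    # e_j = n * p_j
chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))

m = 0                                                # no parameters estimated
df = len(probs) - m - 1                              # c - m - 1 = 5
critical_05 = 11.070                                 # table value, alpha = 0.05
reject = chi2 > critical_05
print(f"chi2 = {chi2:.3f}, d.f. = {df}, reject H0: {reject}")
```

With these made-up counts the statistic is about 1.49, far below the critical value, so the sample would be judged consistent with the hypothesized probabilities.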

Hypotheses for GOF

  • The hypotheses are:

H0: The population follows a _____ distribution
H1: The population does not follow a _____ distribution

  • The blank may contain the name of any theoretical distribution (e.g., uniform, Poisson, normal).

Test Statistic and Degrees of Freedom for GOF

  • The chi-square test statistic is

$$\chi^2_{calc} = \sum_{j=1}^{c} \frac{(f_j - e_j)^2}{e_j}$$

where $f_j$ = the observed frequency of observations in class j and $e_j$ = the expected frequency in class j if H0 were true.

  • The test statistic follows the chi-square distribution with degrees of freedom d.f. = c – m – 1, where c is the number of classes used in the test and m is the number of parameters estimated.

15.3 Uniform Goodness-of-Fit Test

LO15-4: Perform a goodness-of-fit (GOF) test for a uniform distribution.

Uniform Distribution

  • The uniform goodness-of-fit test is a special case of the multinomial in which every value has the same chance of occurrence.
  • The chi-square test for a uniform distribution compares all c groups simultaneously.
  • The hypotheses are:

H0: $\pi_1 = \pi_2 = \dots = \pi_c = 1/c$
H1: Not all $\pi_j$ are equal

  • The test can be performed on data that are already tabulated into groups.
  • Calculate the expected frequency $e_j$ for each cell.
  • The degrees of freedom are d.f. = c – 1 since there are no parameters for the uniform distribution.
  • Obtain the critical value $\chi^2_\alpha$ from Appendix E for the desired level of significance $\alpha$.
  • The p-value can be obtained from Excel.
  • Reject H0 if p-value ≤ $\alpha$.

Uniform GOF Test: Raw Data

  • First form c bins of equal width and create a frequency distribution.
  • Calculate the observed frequency $f_j$ for each bin.
  • Define $e_j = n/c$.
  • Perform the chi-square calculations.
  • The degrees of freedom are d.f. = c – 1 since there are no parameters for the uniform distribution.
  • Obtain the critical value from Appendix E for a given significance level $\alpha$ and make the decision.
  • Maximize the test’s power by defining bin width as $(x_{max} - x_{min})/c$. (As a result, the expected frequencies will be as large as possible.)
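
The raw-data procedure can be sketched as follows. This is an illustration under stated assumptions: the bin width is taken to be the data range divided by c, the largest observation is placed in the last bin, and the data and the helper name `uniform_gof` are hypothetical.

```python
def uniform_gof(data, c):
    """Bin raw data into c equal-width bins and compute the chi-square statistic."""
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / c                       # assumed bin width: range / c
    observed = [0] * c
    for x in data:
        j = min(int((x - lo) / width), c - 1)   # keep the maximum in the last bin
        observed[j] += 1
    e = n / c                                   # same expected frequency per bin
    chi2 = sum((f - e) ** 2 / e for f in observed)
    df = c - 1                                  # no parameters estimated
    return observed, chi2, df

# Hypothetical raw data (n = 20).
data = [0, 5, 12, 18, 22, 27, 31, 38, 41, 45,
        52, 55, 59, 63, 68, 74, 79, 85, 93, 100]
observed, chi2, df = uniform_gof(data, c=4)
print(observed, round(chi2, 3), df)
```

For this sample the observed counts are 5, 5, 6, and 4 against an expected 5 per bin, giving a statistic of 0.4 with 3 degrees of freedom, well below the 0.05 table value of 7.815.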

  • Calculate the mean and standard deviation of the uniform distribution on the interval [a, b] as:

$$\mu = \frac{a + b}{2}, \qquad \sigma = \sqrt{\frac{(b - a)^2}{12}}$$

  • If the data are not skewed and the sample size is large (n > 30), then the sample mean is approximately normally distributed.
  • So, test the hypothesized uniform mean using

$$z_{calc} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
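
A short sketch of the test of the uniform mean is given below. It assumes a continuous uniform distribution on [a, b], so that the mean is (a + b)/2 and the standard deviation is the square root of (b − a)²/12, and it uses the statistic z = (sample mean − μ) / (σ/√n). The data and the function name `uniform_mean_z` are hypothetical.

```python
import math
from statistics import mean

def uniform_mean_z(data, a, b):
    """z statistic for H0: the data come from a uniform distribution on [a, b]."""
    mu = (a + b) / 2                        # hypothesized uniform mean
    sigma = math.sqrt((b - a) ** 2 / 12)    # hypothesized uniform std. deviation
    n = len(data)
    return (mean(data) - mu) / (sigma / math.sqrt(n))

# Hypothetical sample of n = 36 values with sample mean 60.
data = [40, 60, 80] * 12
z = uniform_mean_z(data, a=0, b=100)
print(round(z, 3))
```

A two-tailed decision would compare |z| with the standard normal critical value for the chosen significance level (1.96 for 0.05).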

15.4 Poisson Goodness-of-Fit Test

LO15-5: Explain the GOF test for a Poisson distribution.

Poisson Data-Generating Situations

  • In a Poisson distribution model, X represents the number of events per unit of time or space.
  • X is a discrete nonnegative integer (X = 0, 1, 2, …).
  • Event arrivals must be independent of each other.
  • Sometimes called a model of rare events because X typically has a small mean.

Poisson Goodness-of-Fit Test

  • The mean $\lambda$ is the only parameter.
  • If $\lambda$ is unknown, it must be estimated from the sample.
  • Use the estimated $\lambda$ to find the Poisson probability P(X) for each value of X.
  • Compute the expected frequencies.
  • Perform the chi-square calculations.
  • Make the decision.
  • You may need to combine classes until expected frequencies become large enough for the test (at least until $e_j > 2$).

Poisson GOF Test: Tabulated Data

  • Calculate the sample mean as:

$$\hat{\lambda} = \frac{\sum_{j=1}^{c} x_j f_j}{n}$$

where $x_j$ is the value of X in class j and $f_j$ is its observed frequency.

  • Using this estimated mean, calculate the Poisson probabilities either by using the Poisson formula $P(x) = \lambda^x e^{-\lambda} / x!$ or Excel.
  • For c classes with m = 1 parameter estimated, the degrees of freedom are d.f. = c – m – 1.
  • Obtain the critical value for a given $\alpha$ from Appendix E.
  • Make the decision.
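
The tabulated-data procedure might look like the following in Python. The frequencies are hypothetical, the mean is estimated as the frequency-weighted average of the observed values, and the top classes are merged into an open-ended "3 or more" class so that every expected frequency exceeds 2.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability P(X = x) for mean lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Hypothetical tabulated data: freq[x] = number of observations equal to x.
freq = [36, 37, 20, 5, 2]                  # x = 0, 1, 2, 3, 4
n = sum(freq)
lam_hat = sum(x * f for x, f in enumerate(freq)) / n   # estimated mean

# Classes 0, 1, 2 and an open-ended "3 or more" class.
top = 3
observed = freq[:top] + [sum(freq[top:])]
probs = [poisson_pmf(x, lam_hat) for x in range(top)]
probs.append(1 - sum(probs))               # P(X >= 3)
expected = [n * p for p in probs]

chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
m = 1                                      # one parameter estimated
df = len(observed) - m - 1                 # c - m - 1
print(f"lambda = {lam_hat}, chi2 = {chi2:.3f}, d.f. = {df}")
```

With 4 classes and one estimated parameter there are 2 degrees of freedom, so at the 0.05 level the statistic would be compared with the table value 5.991.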

15.5 Normal Chi-Square Goodness-of-Fit Test

LO15-6: Use computer software to perform a chi-square GOF test for normality.

Normal Data Generating Situations

  • Two parameters, the mean $\mu$ and the standard deviation $\sigma$, fully describe the normal distribution.
  • Unless $\mu$ and $\sigma$ are known a priori, they must be estimated from a sample.
  • Using these statistics, the chi-square goodness-of-fit test can be used.

Method 1: Standardizing the Data

  • Transform the sample observations $x_1, x_2, \dots, x_n$ into standardized values:

$$z_i = \frac{x_i - \bar{x}}{s}$$

Method 2: Equal Bin Widths

  • To obtain equal-width bins, divide the exact data range into c groups of equal width.
  • Step 1: Count the sample observations in each bin to get observed frequencies $f_j$.
  • Step 2: Convert the bin limits into standardized z-values by using the formula $z_j = (x_j - \bar{x})/s$, where $x_j$ is a bin limit.
  • Step 3: Find the normal area within each bin assuming a normal distribution.
  • Step 4: Find expected frequencies $e_j$ by multiplying each normal area by the sample size n.
  • Classes may need to be collapsed from the ends inward to enlarge expected frequencies.
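
The equal-bin-width steps can be sketched with Python's standard library. This is an illustration under assumptions, not the textbook's own procedure: the mean and standard deviation are estimated from the sample (so m = 2), the two end bins are treated as open-ended so that the normal areas sum to 1, and the data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample (n = 20).
data = [48, 52, 55, 57, 58, 60, 61, 62, 63, 64,
        65, 66, 67, 68, 70, 71, 73, 75, 78, 82]

def normal_gof_equal_width(data, c):
    n = len(data)
    xbar, s = mean(data), stdev(data)        # estimated parameters
    lo, hi = min(data), max(data)
    width = (hi - lo) / c
    # Step 1: observed frequencies in c equal-width bins.
    observed = [0] * c
    for x in data:
        j = min(int((x - lo) / width), c - 1)
        observed[j] += 1
    # Step 2: standardize the interior bin limits.
    inner_limits = [lo + j * width for j in range(1, c)]
    z_values = [(x - xbar) / s for x in inner_limits]
    # Step 3: normal area in each bin (end bins assumed open-ended).
    std_normal = NormalDist()
    cumulative = [0.0] + [std_normal.cdf(z) for z in z_values] + [1.0]
    areas = [cumulative[j + 1] - cumulative[j] for j in range(c)]
    # Step 4: expected frequencies.
    expected = [n * a for a in areas]
    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = c - 2 - 1                           # c - m - 1 with m = 2
    return observed, expected, chi2, df

observed, expected, chi2, df = normal_gof_equal_width(data, c=4)
print(observed, [round(e, 2) for e in expected], round(chi2, 3), df)
```

With a sample this small the end bins have low expected frequencies, which is the situation where collapsing classes from the ends inward would be considered.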

Method 3: Equal Expected Frequencies

  • Define histogram bins in such a way that an equal number of observations would be expected within each bin under the null hypothesis.
  • Define bin limits so that $e_j = n/c$.
  • A normal area of 1/c in each of the c bins is desired.
  • The first and last classes must be open-ended for a normal distribution, so to define c bins, we need c – 1 cut-points.
  • The upper limit of bin j can be found directly by using Excel.
  • Alternatively, find $z_j$ for bin j using Excel and then calculate the upper limit for bin j as $\bar{x} + z_j s$.
  • Once the bins are defined, count the observations $f_j$ within each bin and compare them with the expected frequencies $e_j = n/c$.
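
The cut-points for equal expected frequencies can be computed without Excel. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library in place of the Excel lookup, computes each upper limit as the sample mean plus z times the sample standard deviation, and assigns an observation that falls exactly on a cut-point to the higher bin (an arbitrary choice). The data are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, stdev

# Hypothetical sample (n = 20).
data = [48, 52, 55, 57, 58, 60, 61, 62, 63, 64,
        65, 66, 67, 68, 70, 71, 73, 75, 78, 82]

def equal_expected_bins(data, c):
    n = len(data)
    xbar, s = mean(data), stdev(data)
    std_normal = NormalDist()
    # c - 1 cut-points: z_j leaves a cumulative normal area of j/c below it.
    cuts = [xbar + std_normal.inv_cdf(j / c) * s for j in range(1, c)]
    observed = [0] * c
    for x in data:
        observed[bisect_right(cuts, x)] += 1   # index of the bin containing x
    e = n / c                                  # equal expected frequency
    chi2 = sum((f - e) ** 2 / e for f in observed)
    return cuts, observed, chi2

cuts, observed, chi2 = equal_expected_bins(data, c=4)
print([round(x, 2) for x in cuts], observed, round(chi2, 3))
```

With c = 4 the middle cut-point is the sample mean itself, since half of the normal area lies below the mean.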

15.6 ECDF Tests

LO15-7: State advantages of ECDF tests as compared to chi-square GOF tests.

  • There are many alternatives to the chi-square test based on the Empirical Cumulative Distribution Function (ECDF).
  • The Kolmogorov-Smirnov (K-S) test uses the largest absolute difference between the actual and expected cumulative relative frequency of the n data values.
  • The K-S test is not recommended for grouped data.
  • The K-S test assumes that no parameters are estimated.
  • If parameters are estimated, use a Lilliefors test.
  • Both of these tests are done by computer.
  • The Anderson-Darling (A-D) test is widely used for non-normality because of its power.
  • The A-D test is based on a probability plot.
  • When the data fit the hypothesized distribution closely, the probability plot will be close to a straight line.
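
To make the K-S idea concrete, the largest absolute difference between the empirical and hypothesized cumulative distributions can be computed directly from ungrouped data. This sketch assumes the hypothesized distribution is fully specified in advance (no parameters estimated, as the K-S test requires); the function name `ks_statistic` and the data are hypothetical, and no critical values are included.

```python
from statistics import NormalDist

def ks_statistic(data, cdf):
    """Largest absolute gap between the ECDF of data and a hypothesized CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The ECDF jumps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical data tested against a normal distribution specified a priori.
hypothesized = NormalDist(mu=65, sigma=8)
data = [52, 57, 60, 63, 64, 66, 68, 71, 75, 82]
print(round(ks_statistic(data, hypothesized.cdf), 4))
```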