Mobile Computing Group

A quick-and-dirty tutorial on the chi2 test for goodness-of-fit testing

The presentation follows the pyramid schema

Chi2 tests for GoF

Goodness-of-fit (GoF)

Background concepts

- Descriptive vs. inferential statistics
- Descriptive : data are used only for descriptive purposes (tables, graphs, measures of variability etc.)
- Inferential : data are used for drawing inferences, making predictions etc.

- Sample vs. population
- A sample is drawn from a population, assumed to have some characteristics.
- The sample is often used to make inferences about the population (inferential statistics) :
- Hypothesis testing
- Estimation of population parameters

- Statistic vs. parameter
- A statistic is related to (computed from) a sample. It can be used for both descriptive and inferential purposes
- A parameter refers to the whole population. A sample statistic is often used to infer a population parameter
- Example : the sample mean may be used to infer the population mean (expected value)
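
As a small illustration of the statistic vs. parameter distinction, the sketch below draws a sample from a made-up population and compares the sample mean with the population mean. The population (10,000 normal values with mean 50) and the sample size of 100 are assumptions chosen only for the example.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values from a normal distribution.
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]
population_mean = statistics.mean(population)   # the parameter

# A sample of 100 members drawn without replacement.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)           # the statistic

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

In practice the population mean is unknown; the sample mean is what is available to estimate it.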

- Hypothesis testing
- A procedure where sample data are used to evaluate a hypothesis regarding the population
- A hypothesis may refer to several things : properties of a single population, relation between two populations etc.
- Two statistical hypotheses are defined: a null H0 and an alternative H1
- H0 is often a statement of no effect or no difference. It is the hypothesis the researcher seeks to reject

- Inferential statistical test
- Hypothesis testing is carried out via an inferential statistical test :
- Sample data are manipulated to yield a test statistic
- The obtained value of the test statistic is evaluated with respect to a sampling distribution, i.e., a theoretical probability distribution for the possible values of the test statistic
- The theoretical values of the statistic are usually tabulated and let the researcher assess the statistical significance of the test result

- Goodness-of-fit testing is a type of hypothesis testing
- Devise an inferential statistical test, apply it to the sample, and infer whether a theoretical distribution matches the population distribution

- Hypothesis H0:
- The sample is derived from a theoretical distribution F()

- The sample data are manipulated to derive a test statistic
- In the case of the chi2 statistic this includes aggregation of data into bins and some computations

- The statistic, as computed from data, is checked against the sampling distribution
- For the chi2 test, the sampling distribution is the chi2 distribution, hence the name
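
The three steps (hypothesis, test statistic from binned data, comparison against the chi2 distribution) can be sketched end to end in Python. Everything specific here is an assumption made for illustration: the theoretical distribution F() is taken to be exponential with a rate known a priori, the sample is simulated, and the helper names are hypothetical.

```python
import math
import random

def exponential_inverse_cdf(u, rate):
    """Inverse CDF of the exponential distribution: -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate

def equiprobable_edges(num_bins, rate):
    """Interior bin borders giving each bin probability 1/num_bins under F()."""
    return [exponential_inverse_cdf(i / num_bins, rate) for i in range(1, num_bins)]

def bin_counts(sample, edges):
    """Observed frequency per bin, given sorted interior borders."""
    counts = [0] * (len(edges) + 1)
    for x in sample:
        index = sum(1 for e in edges if x > e)  # number of borders below x
        counts[index] += 1
    return counts

rng = random.Random(7)
rate = 2.0                 # assumed known a priori, not estimated from the sample
n, num_bins = 200, 8
sample = [rng.expovariate(rate) for _ in range(n)]

# Step 1: H0 says the sample comes from the exponential distribution F().
# Step 2: aggregate the data into bins and compute the test statistic.
edges = equiprobable_edges(num_bins, rate)
observed = bin_counts(sample, edges)
expected = n / num_bins    # n * p_i with p_i = 1 / num_bins
x2 = sum((o - expected) ** 2 / expected for o in observed)

# Step 3: x2 is checked against the chi2 distribution with num_bins - 1 df.
df = num_bins - 1
print(observed, round(x2, 3), df)
```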

- Statistical tests and statistics : the big picture

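- Goodness-of-fit tests fall into three families :
- EDF-based tests, e.g., KS test, Anderson-Darling test
- Chi2 type tests
- Specialized tests, e.g., Shapiro-Wilk test for normality

- Chi2 type tests use either generalized chi2 statistics or classical chi2 statistics
- Classical chi2 statistics : log-likelihood ratio statistic, modified chi2 statistic, Pearson chi2 statistic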

Pearson chi2 statistic

- M : number of bins
- Oi (Ni) : observed frequency in bin i
- n : sample size
- Ei (npi) : expected frequency in bin i according to the theoretical distribution F()

If X1, X2, X3, …, Xn is the random sample and F() is the theoretical distribution under test,

the Pearson chi2 statistic is computed as:

$$X^2 = \sum_{i=1}^{M} \frac{(O_i - E_i)^2}{E_i} = \sum_{i=1}^{M} \frac{(N_i - n p_i)^2}{n p_i}$$
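
A minimal Python sketch of the Pearson chi2 statistic, i.e., the sum over bins of (Oi - Ei)^2 / Ei. The function name and the three-bin numbers in the usage lines are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def pearson_chi2(observed, expected):
    """Sum over bins of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have one entry per bin")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: n = 60 observations in M = 3 bins with p_i = 1/3 each.
observed = [18, 22, 20]
n = sum(observed)
expected = [n / 3] * 3            # E_i = n * p_i = 20
x2 = pearson_chi2(observed, expected)
print(x2)                         # (4 + 4 + 0) / 20
```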

- Theory says that the Pearson chi2 statistic follows a chi2 distribution, whose degrees of freedom (df) are
- M-1, when the parameters of the fitted distribution are given a priori (case 0 test)
- Somewhere between M-1 and M-1-q, when the q parameters of the distribution are estimated from the sample data
- Usually, the df for this case are taken to be M-1-q

- Having computed the value of the chi2 statistic X2, I check the chi2 distribution with M-1 (or M-1-q) df to find
- the probability of getting a value equal to or greater than the computed value X2, called the p-value
- If p < a, where a is the significance level of my test, the hypothesis H0 is rejected; otherwise it is retained
- Standard values for a are 0.1, 0.05, 0.01; the lower a is, the more conservative I am in rejecting the hypothesis H0
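
The p-value lookup and the decision rule can be sketched without statistical tables. The code below is an illustrative stdlib-only implementation: `chi2_sf` evaluates the upper-tail probability of the chi2 distribution through the standard power series for the regularized incomplete gamma function, and the function names are assumptions, not part of the tutorial. A statistics library would normally be used instead.

```python
import math

def degrees_of_freedom(num_bins, num_estimated_params=0):
    """M - 1 for a case 0 test, M - 1 - q when q parameters are estimated."""
    return num_bins - 1 - num_estimated_params

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi2 variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def reject_h0(p_value, alpha):
    """Reject H0 when the p-value falls below the significance level."""
    return p_value < alpha

# Hypothetical case: M = 6 bins, no estimated parameters, computed X2 = 11.07.
df = degrees_of_freedom(6)
p = chi2_sf(11.07, df)
for alpha in (0.1, 0.05, 0.01):
    print(alpha, round(p, 4), reject_h0(p, alpha))
```

With 2 df the upper tail reduces to exp(-x/2), which gives a quick sanity check of the series.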

- A die is rolled 120 times
- 1 comes up 20 times, 2 comes up 14 times, 3 comes up 18 times, 4 comes up 17 times, 5 comes up 22 times and 6 comes up 29 times
- The question is: “Is the die biased?” or, better: “Do these data suggest that the die is biased?”
- Hypothesis H0 : the die is not biased
- Therefore, according to the null hypothesis these numbers should be distributed uniformly
- F() : the discrete uniform distribution

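- Interpretation
- The distribution of the test statistic has 5 df (M-1 = 6-1, no parameters estimated)
- The probability of getting a value greater than or equal to 6.7 under a chi2 distribution with 5 df (the p-value) is about 0.25, which is > a for all a in {0.01, 0.05, 0.1}. Equivalently, the probability of a value smaller than or equal to 6.7 is about 0.75, which is < 1-a.
- Therefore the hypothesis that the die is not biased cannot be rejected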

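- Computations: Ei = 120/6 = 20 for every face, so X2 = [(20-20)^2 + (14-20)^2 + (18-20)^2 + (17-20)^2 + (22-20)^2 + (29-20)^2] / 20 = 134/20 = 6.7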
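
The die example can be reproduced in a few lines of Python. The observed counts and the critical values for 5 df (9.24, 11.07, 15.09) are the ones quoted in this tutorial; the variable names are illustrative.

```python
observed = [20, 14, 18, 17, 22, 29]   # counts for faces 1..6
n = sum(observed)                     # 120 rolls
expected = [n / 6] * 6                # 20 per face under H0 (uniform)

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical values of the chi2 distribution with 5 df, per significance level.
critical = {0.10: 9.24, 0.05: 11.07, 0.01: 15.09}
decisions = {alpha: x2 > value for alpha, value in critical.items()}

print(round(x2, 2))   # 6.7
print(decisions)      # H0 is not rejected at any of the three levels
```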

- Graphical illustration

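- At the 10% significance level, I would reject the hypothesis if the computed X2 > 9.24 (10% of the area under the curve lies to the right of 9.24)

[Figure: chi2 density with 5 df, horizontal axis z. Marked values and the corresponding p-values: 6.7 (0.25), 9.24 (0.1), 11.07 (0.05), 15.09 (0.01)]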

- The Pearson chi2 statistic can be computed for both discrete and continuous variables
- This holds for all chi2 statistics: maximum flexibility, but binning fails to use all the available information for continuous variables

- It is perhaps the simplest statistic from a computational point of view
- As with all chi2 statistics, one needs to define the number and borders of the bins
- These are generally a function of the sample size and the theoretical distribution under test

- How many bins, and which ones?
- Different opinions in the literature, no rigorous proof of optimality

- There seems to be convergence on the following aspects
- Probability of bins
- The bins should be chosen equiprobable with respect to the theoretical distribution under test

- Minimum expected frequencies npi :
- (Cramer, 46) : npi > 10, for all bins
- (Cochran, 54) : npi > 1 for all bins, npi >= 5 for 80% of bins
- (Roscoe and Byars,71)
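
The Cramer and Cochran rules on minimum expected frequencies are easy to check programmatically. The sketch below is one reading of the two rules as stated above (the 80% in Cochran's rule is interpreted as "at least 80% of bins"); the function names and the sample list of expected frequencies are hypothetical.

```python
def meets_cramer(expected):
    """Cramer: every expected frequency n*p_i exceeds 10."""
    return all(e > 10 for e in expected)

def meets_cochran(expected):
    """Cochran: every n*p_i exceeds 1 and at least 80% of bins have n*p_i >= 5."""
    if not all(e > 1 for e in expected):
        return False
    bins_at_least_five = sum(1 for e in expected if e >= 5)
    # Integer comparison avoids floating-point issues with the 80% threshold.
    return 5 * bins_at_least_five >= 4 * len(expected)

# Hypothetical expected frequencies for a five-bin test.
expected = [12.0, 8.0, 6.0, 5.0, 1.5]
print(meets_cramer(expected))    # False: several bins are at or below 10
print(meets_cochran(expected))   # True: all above 1, four of five bins >= 5
```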


- Relation of the number of bins M to the sample size n
- (Mann and Wald, 42), (Schorr, 74) : for large sample sizes
$1.88\,n^{2/5} < M < 3.76\,n^{2/5}$

- (Koehler and Larntz, 80) : for small sample sizes
$M \ge 3$, $n \ge 10$ and $n^2/M \ge 10$

- (Roscoe and Byars, 71)
- Equi-probable bins : n > M when a = 0.01 and a = 0.05
- Non-equiprobable bins : n > 2M (a = 0.05) and n > 4M (a = 0.01)
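
The two sample-size rules can be turned into small helpers. This sketch assumes the bounds read 1.88 n^(2/5) < M < 3.76 n^(2/5) for large samples and M >= 3, n >= 10, n^2/M >= 10 for small samples; the function names and the sample sizes used below are hypothetical.

```python
def mann_wald_bin_range(n):
    """Lower and upper bounds on the number of bins M for a large sample of size n."""
    scale = n ** (2 / 5)
    return 1.88 * scale, 3.76 * scale

def koehler_larntz_ok(n, num_bins):
    """Small-sample conditions: M >= 3, n >= 10 and n^2 / M >= 10."""
    return num_bins >= 3 and n >= 10 and n * n / num_bins >= 10

lower, upper = mann_wald_bin_range(1000)
print(f"n = 1000: {lower:.1f} < M < {upper:.1f}")

print(koehler_larntz_ok(20, 5))    # 400 / 5 = 80, conditions hold
print(koehler_larntz_ok(10, 12))   # 100 / 12 is below 10, conditions fail
```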

- Bins vs. sample size according to Mann and Wald

[Figure: vertical axis from 0.1 to 1.0, horizontal axis "Bin i" from 1 to 7; annotations: "Equi-probable bins easy to select" and "Less straightforward to define equi-probable bins"]

Textbooks

- D.J. Sheskin, Handbook of parametric and nonparametric statistical procedures
- Introduction (descriptive vs. inferential statistics, hypothesis testing, concepts and terminology)
- Test 8 (chap. 8) – The Chi-Square Goodness-of-Fit Test (high-level description with examples and discussion on several aspects)

- R. D'Agostino, M. Stephens, Goodness-of-fit techniques
- Chapter 3 – Tests of Chi-square type
- Reviews the theoretical background and looks more generally at chi2 tests, not only the Pearson test.


Papers

- S. Horn, Goodness-of-Fit tests for discrete data: A review and an Application to a Health Impairment scale
- Good discussion of the properties and pros/cons of most goodness-of-fit tests for discrete data
- accessible, tutorial-like