
# Mobile Computing Group



### Mobile Computing Group

A quick-and-dirty tutorial on the chi2 test for goodness-of-fit testing

The presentation follows a pyramid structure with three levels, from top to bottom: chi2 tests for GoF, goodness-of-fit (GoF), and background concepts.

• Descriptive vs. inferential statistics

• Descriptive : data used only for descriptive purposes (use tables, graphs, measures of variability etc.)

• Inferential : data used for drawing inferences, making predictions etc.

• Sample vs. population

• A sample is drawn from a population, assumed to have some characteristics.

• The sample is often used to make inferences about the population (inferential statistics) :

• Hypothesis testing

• Estimation of population parameters

• Statistic vs. parameter

• A statistic is related to (computed from) a sample. It can be used for both descriptive and inferential purposes

• A parameter refers to the whole population. A sample statistic is often used to infer a population parameter

• Example : the sample mean may be used to infer the population mean (expected value)
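The sample-mean example can be made concrete with a small simulation. This is only an illustrative sketch: the fair six-sided die as the population, the function name `sample_mean_of_die`, and the fixed seed are assumptions chosen for the illustration, not part of the slides.

```python
import random
import statistics

# Population parameter: expected value of a fair six-sided die (assumed population).
POPULATION_MEAN = sum(range(1, 7)) / 6  # 3.5

def sample_mean_of_die(n_rolls, seed=0):
    """Roll a simulated fair die n_rolls times and return the sample mean (a statistic)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return statistics.mean(rolls)

# The statistic is computed from the sample; the parameter belongs to the population.
for n in (10, 1000, 100000):
    print(n, sample_mean_of_die(n), POPULATION_MEAN)
```

Larger samples tend to give a sample mean closer to the population mean, which is why the statistic can be used to infer the parameter.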

• Hypothesis testing

• A procedure where sample data are used to evaluate a hypothesis regarding the population

• A hypothesis may refer to several things : properties of a single population, relation between two populations etc.

• Two statistical hypotheses are defined: a null H0 and an alternative H1

• H0 is often a statement of no effect or no difference. It is the hypothesis the researcher seeks to reject

• Inferential statistical test

• Hypothesis testing is carried out via an inferential statistical test :

• Sample data are manipulated to yield a test statistic

• The obtained value of the test statistic is evaluated with respect to a sampling distribution, i.e., a theoretical probability distribution for the possible values of the test statistic

• The theoretical values of the statistic are usually tabulated and let the researcher assess the statistical significance of the test result

• Goodness-of-fit testing is a type of hypothesis testing

• Devise inferential statistical tests, apply them to the sample, and infer how well a theoretical distribution matches the population distribution

• Hypothesis H0:

• The sample is derived from a theoretical distribution F()

• The sample data are manipulated to derive a test statistic

• In the case of the chi2 statistic this includes aggregation of data into bins and some computations

• The statistic, as computed from data, is checked against the sampling distribution

• For the chi2 test, the sampling distribution is the chi2 distribution, hence the name

• Statistical tests and statistics : the big picture

• EDF-based tests, e.g., KS test, Anderson-Darling test

• Chi2 type tests

• Generalized chi2 statistics

• Classical chi2 statistics : log-likelihood ratio statistic, modified chi2 statistic, Pearson chi2 statistic

• Specialized tests, e.g., Shapiro-Wilk test for normality

If X1, X2, X3, …, Xn is the random sample and F() is the theoretical distribution under test, the Pearson chi2 statistic is computed as:

$$X^2 = \sum_{i=1}^{M} \frac{(O_i - E_i)^2}{E_i} = \sum_{i=1}^{M} \frac{(N_i - n p_i)^2}{n p_i}$$

• M : number of bins

• Oi (Ni) : observed frequency in bin i

• n : sample size

• Ei (npi) : expected frequency in bin i according to the theoretical distribution F()
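In code, the statistic is the sum over the M bins of (Oi - Ei)^2 / Ei, with Ei = n·pi. The following is a minimal sketch; the function name `pearson_chi2` and the coin-flip numbers are made up for illustration and do not come from the slides.

```python
def pearson_chi2(observed, probs):
    """Pearson chi2 statistic from observed bin counts and theoretical bin probabilities."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have one entry per bin")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("bin probabilities must sum to 1")
    n = sum(observed)                  # sample size n
    expected = [n * p for p in probs]  # E_i = n * p_i
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up illustration: 100 coin flips, 55 heads, tested against a fair coin.
x2 = pearson_chi2([55, 45], [0.5, 0.5])
print(x2)  # (5**2)/50 + (5**2)/50 = 1.0
```

The function takes bin probabilities rather than expected counts so that the same theoretical distribution can be reused for samples of different sizes.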

• Theory says that, under H0 and for large samples, the Pearson chi2 statistic approximately follows a chi2 distribution, whose degrees of freedom (df) are

• M-1, when the parameters of the fitted distribution are given a priori (case 0 test)

• Somewhere between M-1 and M-1-q, when the q parameters of the distribution are estimated from the sample data

• Usually, the df for this case are taken to be M-1-q

• Having computed the value of the chi2 statistic X2, I check the chi2 distribution with M-1 (or M-1-q) df to find

• The probability of getting a value equal to or greater than the computed value X2; this probability is called the p-value

• If p < a, where a is the significance level of my test, the hypothesis H0 is rejected, otherwise it is retained

• Standard values for a are 0.1, 0.05, 0.01 – the lower a is, the more conservative I am in rejecting the hypothesis H0
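The decision procedure can be sketched without any statistics library. The helper `chi2_sf` below approximates the upper-tail probability through the power series of the regularized lower incomplete gamma function; that numerical method, both function names, and the value X2 = 12.0 are assumptions made for this illustration, not something taken from the slides.

```python
import math

def chi2_sf(x, df):
    """Approximate P(X >= x) for a chi2 distribution with df degrees of freedom.

    Series expansion of the regularized lower incomplete gamma function;
    adequate for the moderate x and df of a classroom example.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:  # stop once further terms are negligible
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def gof_decision(x2, n_bins, alpha, n_estimated=0):
    """Return (df, p_value, reject_h0), taking df = M - 1 - q."""
    df = n_bins - 1 - n_estimated
    if df < 1:
        raise ValueError("need at least one degree of freedom")
    p_value = chi2_sf(x2, df)
    return df, p_value, p_value < alpha  # reject H0 when p < a

# Hypothetical statistic X2 = 12.0 from M = 6 bins, no estimated parameters.
for alpha in (0.1, 0.05, 0.01):
    print(alpha, gof_decision(12.0, n_bins=6, alpha=alpha))
```

With this made-up value, H0 is rejected at a = 0.1 and a = 0.05 but retained at a = 0.01, which shows that a smaller a makes rejection harder.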

• A die is rolled 120 times

• 1 comes 20 times, 2 comes 14, 3 comes 18, 4 comes 17, 5 comes 22 and 6 comes 29 times

• The question is: “Is the die biased?” –or better: “Do these data suggest that the die is biased?”

• Hypothesis H0 : the die is not biased

• Therefore, according to the null hypothesis these numbers should be distributed uniformly

• F() : the discrete uniform distribution

Example – cont.

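• Computations:

• Expected frequency under H0 : Ei = 120/6 = 20 for every face

• X2 = [(20-20)^2 + (14-20)^2 + (18-20)^2 + (17-20)^2 + (22-20)^2 + (29-20)^2] / 20 = 134/20 = 6.7

• Interpretation

• The distribution of the test statistic has M-1 = 5 df

• The probability of getting a value greater than or equal to 6.7 under a chi2 distribution with 5 df (the p-value) is about 0.25 (more precisely 0.244), which is > a for all a in {0.01, 0.05, 0.1}

• Therefore the hypothesis that the die is not biased cannot be rejected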
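The die example can be reproduced in a few lines. This sketch uses the observed counts from the example and the 5-df critical values quoted in the slides (9.24, 11.07, 15.09); the variable names are arbitrary.

```python
observed = [20, 14, 18, 17, 22, 29]             # counts for faces 1..6, from the example
n = sum(observed)                               # 120 rolls
expected = [n / len(observed)] * len(observed)  # 20 per face under H0 (fair die)

# Pearson chi2 statistic: sum of (O_i - E_i)^2 / E_i over the 6 bins.
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical values of the chi2 distribution with 5 df, as quoted in the slides.
critical = {0.10: 9.24, 0.05: 11.07, 0.01: 15.09}
reject = {alpha: x2 > c for alpha, c in critical.items()}

print(round(x2, 2))  # 6.7
print(reject)        # False at every significance level
```

Since 6.7 is below even the 10% critical value, these data do not suggest that the die is biased.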

• Graphical illustration

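• At the 10% significance level, I would reject the hypothesis if the computed X2 > 9.24

[Figure: chi2 density with 5 df plotted against z; 10% of the area under the curve lies to the right of 9.24. Marked values of z and their p-values (upper-tail areas): 6.7 → 0.25, 9.24 → 0.1, 11.07 → 0.05, 15.09 → 0.01]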

• The Pearson chi2 statistic can be computed for both discrete and continuous variables

• This holds for all chi2 statistics. It gives maximum flexibility, but fails to make use of all available information for continuous variables

• It is perhaps the simplest chi2 statistic from a computational point of view

• As with all chi2 statistics, one needs to define number and borders of bins

• These are generally a function of sample size and the theoretical distribution under test

• How many and which?

• Different opinions in the literature, no rigorous proof of optimality

• There seems to be convergence on the following aspects

• Probability of bins

• The bins should be chosen equiprobable with respect to the theoretical distribution under test

• Minimum expected frequencies npi :

• (Cramer, 46) : npi > 10, for all bins

• (Cochran, 54) : npi > 1 for all bins, npi >= 5 for 80% of bins

• (Roscoe and Byars,71)
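The two fully stated rules on minimum expected frequencies are easy to check mechanically. This is a sketch under assumptions: the function names are invented, "for 80% of bins" is read as "at least 80% of the bins", and the sample sizes and probabilities in the usage lines are made up.

```python
def expected_frequencies(n, probs):
    """Expected frequencies n * p_i for a sample of size n."""
    return [n * p for p in probs]

def meets_cramer(expected):
    """(Cramer, 46) as quoted: n*p_i > 10 for all bins."""
    return all(e > 10 for e in expected)

def meets_cochran(expected):
    """(Cochran, 54) as quoted: n*p_i > 1 for all bins, n*p_i >= 5 for 80% of bins.

    Assumption: '80% of bins' is read as 'at least 80% of the bins'.
    """
    all_above_one = all(e > 1 for e in expected)
    share_at_least_five = sum(e >= 5 for e in expected) / len(expected)
    return all_above_one and share_at_least_five >= 0.8

# Made-up illustration: 40 observations over five unequal bins.
skewed = expected_frequencies(40, [0.4, 0.3, 0.15, 0.1, 0.05])
print(skewed, meets_cramer(skewed), meets_cochran(skewed))

# Six equiprobable bins with 120 observations (20 expected per bin).
uniform = expected_frequencies(120, [1 / 6] * 6)
print(meets_cramer(uniform), meets_cochran(uniform))
```

When a rule fails, the usual remedies are to merge sparse bins or to redraw the bin borders so that the expected frequencies are more even.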

• Relation between the number of bins M and the sample size n

• (Mann and Wald, 42), (Schorr, 74) : for large sample sizes

$1.88\,n^{2/5} < M < 3.76\,n^{2/5}$

• (Koehler and Larntz,80) : for small sample size

$M \ge 3$, $n \ge 10$ and $n^2/M \ge 10$

• (Roscoe and Byars, 71)

• Equiprobable bins : n > M for both a = 0.01 and a = 0.05

• Non-equiprobable bins : n > 2M (a = 0.05) and n > 4M (a = 0.01)
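The guidelines relating the number of bins to the sample size reduce to simple arithmetic. Below is a minimal sketch of the two rules that are stated in full, the large-sample range attributed to (Mann and Wald, 42) and (Schorr, 74) and the small-sample conditions attributed to (Koehler and Larntz, 80). The function names are invented for illustration.

```python
def mann_wald_range(n):
    """Large-sample guideline quoted above: 1.88 * n**(2/5) < M < 3.76 * n**(2/5)."""
    scale = n ** 0.4
    return 1.88 * scale, 3.76 * scale

def koehler_larntz_ok(n, m):
    """Small-sample conditions quoted above: M >= 3, n >= 10 and n**2 / M >= 10."""
    return m >= 3 and n >= 10 and n * n / m >= 10

low, high = mann_wald_range(1000)
print(round(low, 1), round(high, 1))  # roughly 29.8 and 59.6 bins for n = 1000

print(koehler_larntz_ok(10, 3))   # True: 100 / 3 >= 10
print(koehler_larntz_ok(10, 11))  # False: 100 / 11 < 10
```

Both rules only bound M; the bin borders still have to be chosen, for instance so that the bins are equiprobable.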

• Bins vs. sample size according to Mann and Wald

[Figure: two panels with a vertical axis running from 0.1 to 1.0 and a horizontal axis labelled "Bin i" (1 to 7). Panel annotations: "Equi-probable bins easy to select" and "Less straightforward to define equi-probable bins"]
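For a continuous theoretical distribution, one way to obtain equiprobable bins is to cut at the quantiles i/M of that distribution. The sketch below does this for a normal distribution with Python's `statistics.NormalDist`; the choice of the standard normal, the helper names, and the five sample values are assumptions made for illustration.

```python
import bisect
from statistics import NormalDist

def equiprobable_edges(dist, m):
    """Interior bin edges that split dist into m bins of probability 1/m each."""
    return [dist.inv_cdf(i / m) for i in range(1, m)]

def bin_counts(sample, edges):
    """Observed counts per bin; the outer bins extend to -infinity and +infinity."""
    counts = [0] * (len(edges) + 1)
    for x in sample:
        counts[bisect.bisect_right(edges, x)] += 1
    return counts

# Theoretical distribution under test (assumed here): standard normal, M = 4 bins.
edges = equiprobable_edges(NormalDist(mu=0.0, sigma=1.0), 4)
print([round(e, 3) for e in edges])  # the three quartiles of the standard normal

# Made-up sample, only to show the counting step.
print(bin_counts([-1.2, -0.3, 0.1, 0.5, 2.0], edges))  # [1, 1, 2, 1]
```

With these edges every bin has expected frequency n/M, so the minimum-frequency rules reduce to a condition on n/M. For a discrete distribution the quantile cuts generally do not give exactly equal probabilities.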

Textbooks

• D.J. Sheskin, Handbook of parametric and nonparametric statistical procedures

• Introduction (descriptive vs. inferential statistics, hypothesis testing, concepts and terminology)

• Test 8 (chap. 8) – The Chi-Square Goodness-of-Fit Test (high-level description with examples and discussion on several aspects)

• R. D'Agostino, M. Stephens, Goodness-of-Fit Techniques

• Chapter 3 – Tests of Chi-square type

• Reviews the theoretical background and looks more generally at chi2 tests, not only the Pearson test.

Papers

• S. Horn, Goodness-of-Fit tests for discrete data: A review and an Application to a Health Impairment scale

• Good discussion of the properties and pros/cons of most goodness-of-fit tests for discrete data

• accessible, tutorial-like