Chapter 14



PowerPoint Slideshow about 'Chapter 14' - erika


Chapter 14

Chi Square - χ²

Chi square

Chi Square
• Chi Square is a non-parametric statistic used to test the null hypothesis.
• It is used for nominal data.
• It plays the same role for frequency data that the F test played in single factor and factorial analysis.


… Chi Square
• Nominal data puts each participant in a category. Categories are best when mutually exclusive and exhaustive. This means that each and every participant fits in one and only one category.
• Chi Square looks at frequencies in the categories.


Expected frequencies and the null hypothesis ...
• Chi Square compares the expected frequencies in categories to the observed frequencies in categories.
• “Expected frequencies” are the frequencies in each cell predicted by the null hypothesis.


… Expected frequencies and the null hypothesis ...

The null hypothesis:

• H0: fo = fe
• There is no difference between the observed frequency and the frequency predicted (expected) by the null.

The experimental hypothesis:

• H1: fo ≠ fe
• The observed frequency differs significantly from the frequency predicted (expected) by the null.


Calculating χ²

For each cell:

• Calculate the deviations of the observed from the expected.
• Square the deviations.
• Divide the squared deviations by the expected value.
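Written as a formula, the per-cell steps above, added up over all cells, give the statistic. The symbols f_o and f_e follow the chapter's notation for observed and expected frequencies; the cell index i and the cell count k are added here only for illustration.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(f_{o,i} - f_{e,i})^2}{f_{e,i}}
```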

Calculating χ²
• Sum the results over all cells to obtain χ².
• Then, look up χ² in the Chi Square Table.
• df = k - 1 (one sample χ²)
• OR df = (Columns-1) * (Rows-1) (2 or more samples)
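The calculation steps can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package; the function names are hypothetical and the frequencies in the usage lines are made up purely to exercise the code.

```python
def chi_square(observed, expected):
    """Return the sum over cells of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one value per cell")
    total = 0.0
    for o, e in zip(observed, expected):
        deviation = o - e              # deviation of observed from expected
        total += deviation ** 2 / e    # square it, divide by expected, accumulate
    return total


def df_one_sample(k):
    """df = k - 1 for a one-sample chi square with k categories."""
    return k - 1


def df_multi_sample(rows, columns):
    """df = (Columns - 1) * (Rows - 1) for two or more samples."""
    return (columns - 1) * (rows - 1)


# Made-up frequencies, used only to exercise the functions.
example_chi_sq = chi_square([12, 8], [10, 10])
print(round(example_chi_sq, 2), df_one_sample(2), df_multi_sample(3, 4))
```

The resulting value is then compared with the tabled critical value for the matching df.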

Critical values of χ²

The df row gives the degrees of freedom. The .05 and .01 rows give the critical values at α = .05 and α = .01.

df     1       2       3       4       5       6       7       8
.05    3.84    5.99    7.82    9.49    11.07   12.59   14.07   15.51
.01    6.63    9.21    11.34   13.28   15.09   16.81   18.48   20.09

df     9       10      11      12      13      14      15      16
.05    16.92   18.31   19.68   21.03   22.36   23.68   25.00   26.30
.01    21.67   23.21   24.72   26.22   27.69   29.14   30.58   32.00

df     17      18      19      20      21      22      23      24
.05    27.59   28.87   30.14   31.41   32.67   33.92   35.17   36.42
.01    33.41   34.81   36.19   37.57   38.93   40.29   41.64   42.98

df     25      26      27      28      29      30
.05    37.65   38.89   40.11   41.34   42.56   43.77
.01    44.31   45.64   46.96   48.28   49.59   50.89

Example

If there were 5 degrees of freedom, how big would χ² have to be for significance at the .05 level?

From the table: with df = 5, χ² must reach 11.07.

Another example

If there were 2 degrees of freedom, how big would χ² have to be for significance at the .05 level?

From the table: with df = 2, χ² must reach 5.99.

Note: Unlike most other tables you have seen, the critical values for Chi Square get larger as df increase. This is because you are summing over more cells, each of which usually contributes to the total observed value of chi square.
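Looking up a critical value can be sketched as a small dictionary lookup. The sketch below holds only df 1 through 8, using the standard tabled values (7.82 for df = 3 at the .05 level); the names CRITICAL, critical_value, and is_significant are hypothetical, chosen for illustration.

```python
# Standard tabled critical values of chi square for df 1 through 8.
CRITICAL = {
    0.05: {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49,
           5: 11.07, 6: 12.59, 7: 14.07, 8: 15.51},
    0.01: {1: 6.63, 2: 9.21, 3: 11.34, 4: 13.28,
           5: 15.09, 6: 16.81, 7: 18.48, 8: 20.09},
}


def critical_value(df, alpha=0.05):
    """Return the tabled critical value for the given df and alpha level."""
    return CRITICAL[alpha][df]


def is_significant(chi_sq, df, alpha=0.05):
    """True when the observed chi square reaches the critical value."""
    return chi_sq >= critical_value(df, alpha)


print(critical_value(5), critical_value(2))
# The critical value grows as df grows, as the note above says.
print([critical_value(df) for df in range(1, 9)])
```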


One sample example from the CPE: At a party, 75% of the people are male and 25% are female. There are 40 swimmers. Since 75% of the people at the party are male, 75% of the swimmers should be male. So the expected value for males is .750 × 40 = 30.00. For females it is .250 × 40 = 10.00.

         Observed   Expected   O-E   (O-E)²   (O-E)²/E
Male     20         30         -10   100      3.33
Female   20         10         10    100      10.00

χ² = 13.33

df = k-1 = 2-1 = 1
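The swimmer example can be reproduced with a short script. This is a sketch under the slide's stated numbers (75%/25% party makeup, 20 male and 20 female swimmers); the variable names are illustrative only.

```python
# Proportions at the party and observed swimmers, as stated in the example.
party_proportions = {"Male": 0.75, "Female": 0.25}
observed = {"Male": 20, "Female": 20}

n = sum(observed.values())  # 40 swimmers

# One-sample case: expected frequency = hypothesized proportion x n.
expected = {group: prop * n for group, prop in party_proportions.items()}

chi_sq = sum(
    (observed[group] - expected[group]) ** 2 / expected[group]
    for group in observed
)
df = len(observed) - 1

print(expected)
print(round(chi_sq, 2), df)
```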

χ²(1, n=40) = 13.33

Exceeds critical value at α = .01 (6.63 for df = 1).

Reject the null hypothesis.

Gender does affect who goes swimming.

Women go swimming more than expected.

Men go swimming less than expected.

2 sample example

Freshmen and sophomores who like horror movies.

                        Freshmen   Sophomores
Likes horror films      150        50
Dislikes horror films   100        200

… CPE 15.2.1 Freshmen and sophomores and horror movies.

There are 500 altogether. 200 (a proportion of .400) like horror films and 300 (.600) dislike them. (Proportions appear in parentheses in the margins.) Multiplying each row proportion by each column total yields the expected frequency for each cell. (This time we use the formula: PropROW × nCOL = Expected Frequency.) For the first cell, .400 × 250 = 100.

(EF appears in parentheses in each cell.)

                        Freshmen    Sophomores   Row n (prop.)
Likes horror films      150 (100)   50 (100)     200 (.400)
Dislikes horror films   100 (150)   200 (150)    300 (.600)
Column n                250         250          500

Computing χ²

                 Observed   Expected   O-E   (O-E)²   (O-E)²/E
Fresh Likes      150        100        50    2500     25.00
Fresh Dislikes   100        150        -50   2500     16.67
Soph Likes       50         100        -50   2500     25.00
Soph Dislikes    200        150        50    2500     16.67

χ² = 83.33

df = (C-1)(R-1) = (2-1)(2-1) = 1
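The two-sample computation can be sketched as follows: take each row's proportion of the total, multiply by each column's n to get the expected frequency, then sum (O-E)²/E over the four cells. The observed counts are the slide's; the variable names are illustrative.

```python
# Observed frequencies: rows are Likes / Dislikes horror films,
# columns are Freshmen / Sophomores.
observed = [[150, 50],
            [100, 200]]

row_totals = [sum(row) for row in observed]          # 200, 300
col_totals = [sum(col) for col in zip(*observed)]    # 250, 250
n = sum(row_totals)                                  # 500
row_props = [total / n for total in row_totals]      # .400, .600

# Expected frequency for a cell = row proportion x column n.
expected = [[prop * col_n for col_n in col_totals] for prop in row_props]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(col_totals) - 1) * (len(row_totals) - 1)

print(round(chi_sq, 2), df)
```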

χ²(1, n=500) = 83.33

Exceeds critical value at α = .01 (6.63 for df = 1).

Reject the null hypothesis.

Fresh/Soph dimension does affect liking for horror movies.

Proportionally, more freshmen than sophomores like horror movies.

The only (slightly) hard part is computing expected frequencies

In the one sample case, multiply n by the hypothetical proportion based on the random model.

The random model says that the proportion in each category in the sample should be the same as in the population.

Simple Example - 100 teenagers listen to radio stations

H1: Some stations are more popular with teenagers than others.

H0: Radio stations do not differ in popularity with teenagers.

NOTE: YOU ALWAYS TEST H0

Expected frequencies are the frequencies predicted by the null hypothesis. In this case, the problem is simple because the null predicts an equal proportion of teenagers will prefer each of the four radio stations.

                  Station A   Station B   Station C   Station D
Observed Values   40          30          20          10
Expected Values   25          25          25          25

Is the observed significantly different from the expected?

            Observed   Expected   O-E   (O-E)²   (O-E)²/E
Station A   40         25         15    225      9.00
Station B   30         25         5     25       1.00
Station C   20         25         -5    25       1.00
Station D   10         25         -15   225      9.00

χ² = 20.00

df = k-1 = (4-1) = 3

χ²(3, n=100) = 20.00, p<.01
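When the null hypothesis predicts equal proportions, every expected frequency is simply n divided by the number of categories. A minimal sketch with the radio station counts (the dictionary keys are just labels chosen for illustration):

```python
# Observed listeners per station, from the example.
observed = {"A": 40, "B": 30, "C": 20, "D": 10}

n = sum(observed.values())   # 100 teenagers
k = len(observed)            # 4 stations
expected_each = n / k        # equal split under the null hypothesis

chi_sq = sum((o - expected_each) ** 2 / expected_each for o in observed.values())
df = k - 1

print(expected_each, chi_sq, df)
```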

Example - Admissions to Psychiatric Hospitals Close to a Once/Year Final

H1: More students are admitted to psychiatric hospitals when it is near their final exam.

H0: Time from final exam does not have an effect on admissions.

Category 1: Within 7 days of final. (11 admitted)

Category 2: Between 8 and 30 days. (24 admitted)

Category 3: Between 31 and 90 days. (69 admitted)

Category 4: More than 90 days. (96 admitted)

Number of days
• Expected frequency = expected proportion of days × n
• There are 365 days and 1 final each year, and 200 patients are admitted each year.
• The proportion of each kind of day is computed for each category: Category 1 (within 7), Category 2 (8-30), Category 3 (31-90), and Category 4 (rest of year).

Expected Frequencies

To obtain expected frequencies with 200 admissions, multiply the proportion of days of each type by n=200. This time the proportions are not equal.

Category 1 (within 7): 8

Category 2 (8-30): 26

Category 3 (31-90): 66

Category 4 (rest of year): 100

Closeness to final exam   Observed   Expected   O-E   (O-E)²   (O-E)²/E
Category 1                11         8          3     9        1.12
Category 2                24         26         -2    4        0.15
Category 3                69         66         3     9        0.14
Category 4                96         100        -4    16       0.16

χ² = 1.57

df = k-1 = (4-1) = 3

χ²(3, n=200) = 1.57, n.s.
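The admissions result can be checked with the expected frequencies exactly as the slides give them (8, 26, 66, 100). The sketch below does not recompute those from day counts. Carried at full precision the sum is about 1.58 rather than 1.57, because the slide adds components already rounded to two places; either way it is far below the .05 critical value of 7.82 for df = 3.

```python
# Observed admissions and the expected frequencies given on the slide.
observed = [11, 24, 69, 96]
expected = [8, 26, 66, 100]

# Per-category contribution: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)
df = len(observed) - 1

critical_05_df3 = 7.82  # standard tabled value for df = 3, alpha = .05
significant = chi_sq >= critical_05_df3

print([round(c, 3) for c in contributions])
print(round(chi_sq, 3), df, significant)
```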

The only (slightly) hard part is computing expected frequencies

In the multi-sample case, multiply the proportion in each row by n in each column to obtain EF in each cell.

Vit C and flu study
• Sixty randomly chosen participants.
• Thirty get Vitamin C.
• Of that 30, 10 get the flu and 20 do not.
• Thirty get placebo.
• Of that 30, 15 get the flu and 15 do not.

Expected frequency = proportionROW × nCOL

              got flu   no flu   row n (prop.)
Vit C         10        20       30 (.500)
No Vit C      15        15       30 (.500)
Col. Totals   25        35       n=60

Expected frequencies

Observed values, with (expected) values in parentheses:

            Influenza    No influenza
Vitamin C   10 (12.50)   20 (17.50)
Placebo     15 (12.50)   15 (17.50)

Multiply the proportion in each row times the number in each column. Here the Vitamin C row has 30 research participants. Total n = 60.

So the proportion in that row = 30/60 = .500. Same for the placebo group.

Number in each column: Twenty-five got influenza. So 25 × .500 = 12.50 should come from the Vitamin C group. Same for placebo.

Thirty-five did not get influenza, so 35 × .500 = 17.50 of each group should not have gotten the flu.

Are the observed significantly different from the expected?

Computing χ²

                  Observed   Expected   O-E     (O-E)²   (O-E)²/E
VitC-got flu      10         12.50      -2.50   6.25     .50
VitC-no flu       20         17.50      2.50    6.25     .36
Placebo-got flu   15         12.50      2.50    6.25     .50
Placebo-no flu    15         17.50      -2.50   6.25     .36

χ² = 1.72

df = (C-1)(R-1) = (2-1)(2-1) = 1
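A cell-by-cell sketch of the Vitamin C computation, printing each expected frequency and its contribution. The counts come from the study description; the labels and variable names are illustrative. At full precision the total is about 1.71; the slide's 1.72 comes from adding the rounded components. Both fall below 3.84, the .05 critical value for df = 1.

```python
rows = ["Vitamin C", "Placebo"]
cols = ["got flu", "no flu"]
observed = [[10, 20],
            [15, 15]]

n = sum(sum(row) for row in observed)               # 60 participants
col_totals = [sum(col) for col in zip(*observed)]   # 25 got flu, 35 did not

chi_sq = 0.0
for r, group in enumerate(rows):
    row_prop = sum(observed[r]) / n                 # 30/60 = .500
    for c, outcome in enumerate(cols):
        e = row_prop * col_totals[c]                # expected frequency
        contribution = (observed[r][c] - e) ** 2 / e
        chi_sq += contribution
        print(group, outcome, e, round(contribution, 2))

critical_05_df1 = 3.84  # standard tabled value for df = 1, alpha = .05
significant = chi_sq >= critical_05_df1
print(round(chi_sq, 2), significant)
```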

Differences are not significant

χ²(1, n=60) = 1.72, n.s.

Vit C consumption is not significantly related to getting the flu in this study.

A 3 x 4 Chi Square

Women, stress, and seating preferences (perimeter vs. interior, front vs. back).

                              Front Perim   Front Inter   Back Perim   Back Inter   Row n
Very Stressed Females         10            70            5            15           100
Moderately Stressed Females   15            50            10           25           100
Control Group Females         35            30            15           20           100
Column n                      60            150           30           60           n=300

Proportion in each row

nROW/n = 100/300 = .333

All the expected frequencies

Women, stress, and perimeter versus interior seating preferences. Expected frequencies appear in parentheses. Every row has the same proportion (.333), so every cell in a column has the same expected frequency.

                              Front Perim   Front Inter   Back Perim   Back Inter   Row n
Very Stressed Females         10 (20)       70 (50)       5 (10)       15 (20)      100
Moderately Stressed Females   15 (20)       50 (50)       10 (10)      25 (20)      100
Control Group Females         35 (20)       30 (50)       15 (10)      20 (20)      100
Column n                      60            150           30           60           300

                             Observed   Expected   O-E   (O-E)²   (O-E)²/E
Very Stressed FrontP         10         20         -10   100      5.00
Very Stressed FrontI         70         50         20    400      8.00
Very Stressed BackP          5          10         -5    25       2.50
Very Stressed BackI          15         20         -5    25       1.25
Moderately Stressed FrontP   15         20         -5    25       1.25
Moderately Stressed FrontI   50         50         0     0        0.00
Moderately Stressed BackP    10         10         0     0        0.00
Moderately Stressed BackI    25         20         5     25       1.25
Control Group FrontP         35         20         15    225      11.25
Control Group FrontI         30         50         -20   400      8.00
Control Group BackP          15         10         5     25       2.50
Control Group BackI          20         20         0     0        0.00

χ² = 41.00

df = (C-1)(R-1) = (4-1)(3-1) = 6

χ²(6, N=300) = 41.00
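For a table of any size the same procedure can be wrapped in two small functions. This is a sketch with hypothetical function names; it computes each expected frequency as row total × column total / n, which is algebraically the same as the row proportion times the column n used in the chapter.

```python
def expected_frequencies(table):
    """Expected frequency for each cell of a rows x columns table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # row_total * col_total / n equals (row proportion) x (column n).
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square_table(table):
    """Return (chi square, df) for a rows x columns table of frequencies."""
    expected = expected_frequencies(table)
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table[0]) - 1) * (len(table) - 1)
    return chi_sq, df


# Seating data: rows are Very Stressed, Moderately Stressed, Control;
# columns are Front Perim, Front Inter, Back Perim, Back Inter.
seating = [[10, 70, 5, 15],
           [15, 50, 10, 25],
           [35, 30, 15, 20]]

chi_sq, df = chi_square_table(seating)
print(expected_frequencies(seating)[0])
print(round(chi_sq, 2), df)
```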

Exceeds critical value at α = .01 (16.81 for df = 6).

Reject the null hypothesis.

There is a relationship between stress level and seating position.

Very stressed women avoid the perimeter and prefer the front interior.

The control group prefers the perimeter and avoids the front interior.

Summary: Different Ways of Computing the Frequencies Predicted by the Null Hypothesis
• One sample:
• Expect subjects to be distributed equally in each cell. OR
• Expect subjects to be distributed proportionally in each cell. OR
• Expect subjects to be distributed in each cell based on prior knowledge, such as previous research.
• Multi-sample:
• Expect subjects in different conditions to be distributed similarly to each other. To do so, find the proportion in each row and multiply by the number in each column.

Conclusion - Chi Square
• Chi Square is a non-parametric statistic, used for nominal data.
• It plays the same role for frequency data that the F test played in single factor and factorial analysis.
• Chi Square compares the expected frequencies in categories to the observed frequencies in categories.


… Conclusion - Chi Square

The null hypothesis:

• H0: fo = fe
• There is no difference between the observed frequency and frequency predicted by the null hypothesis.

The experimental hypothesis:

• H1: fo ≠ fe
• The observed frequency differs significantly from the frequency expected by the null hypothesis.


The end.
