STAT131 W6La Association from Contingency Tables


STAT131 W6La Association from Contingency Tables, by Anne Porter [email protected]

Presentation Transcript


STAT131 W6La Association from Contingency Tables

by

Anne Porter

[email protected]


Null and Alternative hypotheses: Activity

  • Card game


Activity Outcomes

  • We draw a card from a pack until such time as there is a protest.

  • The cards have been stacked such that all red come first (or all black)

  • The draw is meant to be random, i.e. a mix of red and black.

  • At some point students reject the idea of fairness

  • The proportion of reds is higher than expected by chance (or blacks, depending on which was drawn first)

  • Students are in fact rejecting the null hypothesis that the proportion of red cards is 0.5 (or that the proportions of red and black cards are equal)

  • They are accepting the alternative hypothesis that the proportion is not equal to 0.5
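
A rough sketch of why the protest comes after only a few cards. It assumes the null hypothesis p = 0.5 and treats the draws as independent, which is a simplification: drawing without replacement from a real pack gives slightly smaller probabilities. The function name is illustrative only.

```python
def prob_run_of_reds(n_draws: int, p_red: float = 0.5) -> float:
    """Probability that n_draws independent draws are all red when P(red) = p_red."""
    return p_red ** n_draws


# Print how quickly an unbroken run of reds becomes implausible under p = 0.5.
for n in range(1, 8):
    print(n, prob_run_of_reds(n))
```

Under this simplified model, five reds in a row has probability 0.03125, which is already below 0.05.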


Null and Alternative hypotheses

  • Null hypothesis is that the proportion of red cards (females) is 0.5 (or that the proportion of red and black cards is equal)

  • Alternative hypothesis is that the proportion of red cards (females) is not equal to 0.5


Null and Alternative hypotheses: formal

  • H0: p = 0.5 and

  • HA: p ≠ 0.5

  • The p we refer to is the population proportion

  • We do not hypothesise about a sample proportion

  • We make inference about a population parameter p

Tests of proportions


Lecture Outline

  • Test hypotheses about association between categorical variables

  • Testing Hypotheses (5 steps)

    • Null and alternative hypotheses

    • α level of significance

    • Select test and state decision rule

    • Perform experiment

    • Draw conclusions

  • test of association AND

  • model fit

  • p values


Contingency tables

  • For this contingency table what is

  • P(Male) = 40/70

  • P(Support) = 20/70


Contingency tables

  • If event male is independent of event support then

  • P(Male and Support) = P(Male) x P(Support) = 40/70 x 20/70 = 0.1632


Contingency tables

  • Given 70 observed people, if P(Male & Support)=0.1632

  • How many are expected to be male and support given independence?

0.1632 x 70 ≈ 11.43 if the events Male and Support are independent


Contingency tables

  • Knowing the expected frequency for (male and support), we have no more degrees of freedom; the remaining values are fixed.

Male and support: 11.43

Female and support: 20 - 11.43 = 8.57

Female and not support: 30 - 8.57 = 21.43

Male and not support: 40 - 11.43 = 28.57

Note: We had 1 degree of freedom
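
The same expected counts can be reproduced from the margins alone. This is a minimal sketch using the totals that appear in the subtractions above (support 20, male 40, female 30, grand total 70); the "not support" total of 50 is taken as 70 - 20, and the function name is illustrative.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / grand total."""
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: support (20), not support (70 - 20 = 50); columns: male (40), female (30).
expected = expected_counts([20, 50], [40, 30])
for row in expected:
    print([round(e, 2) for e in row])
```

Every row and column of the result sums back to its margin, which is why fixing one cell fixes the other three.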


Contingency tables

  • If we observe a sample of data we may ask if the variables sex and level of support are associated? To test this we formally test the hypotheses…

E(male, support) = 11.43

E(female, support) = 8.57

E(male, not support) = 28.57

E(female, not support) = 21.43


Hypotheses: no association

  • Ho: Under the model of independence, the expected counts are E = (row total x column total)/grand total

  • Ha: The expected counts are not E = (row total x column total)/grand total

E(male, support) = 11.43

E(female, support) = 8.57

E(male, not support) = 28.57

E(female, not support) = 21.43


2. Assign α

  • α is determined such that we have a desired level of confidence in our procedures (i.e. in our results).

  • For the chi-square test for association we will use α = 0.05

  • We will examine choosing alpha (α) later


Degrees of freedom

  • Knowing the expected frequency for (male and support), we have no more degrees of freedom; the remaining values are fixed.

Male and support: 11.43

Female and support: 20 - 11.43 = 8.57

Female and not support: 21.43

Male and not support: 28.57

Note: We had 1 degree of freedom


Degrees of freedom

  • The degrees of freedom for a table with r rows and c columns may be calculated as (r-1)x(c-1) = (2-1)x(2-1) = 1

  • r is the number of rows and c is the number of columns

Male and support: 11.43

Female and support: 8.57

Female and not support: 21.43

Male and not support: 28.57

Note: We had 1 degree of freedom
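
The degrees-of-freedom rule is easy to check in code. This is a small sketch with an illustrative function name.

```python
def degrees_of_freedom(n_rows: int, n_cols: int) -> int:
    """Degrees of freedom for a test of association in an r x c table."""
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(2, 2))  # the 2 x 2 sex by support table
print(degrees_of_freedom(3, 2))  # a table with 3 rows and 2 columns
```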



3. Select a test statistic and... determine the rejection region

To test for association in contingency tables we calculate

\chi^2 = \sum_i \sum_j \frac{(o_{ij} - e_{ij})^2}{e_{ij}}

  • and determine the region of rejection, i.e. how big chi-square has to be before we conclude that the observed counts are sufficiently different from the expected counts to reject the null hypothesis

  • o_{ij} is the observed count and e_{ij} is the expected count for the ith row and jth column of the table


3... determine the rejection region

For our contingency table, df = 1 and α = 0.05.

If the calculated chi-square > 3.841, then reject Ho: there is evidence that the variables are not independent.

4. Calculate

E(male, support) = 11.43

E(female, support) = 8.57

E(male, not support) = 28.57

E(female, not support) = 21.43
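
A sketch of the calculation step. The observed table is not shown in this transcript, so the counts below are an assumption: 13 and 7 come from the Decision slide (13/40 males and 7/30 females support), and 27 and 23 are inferred from the totals of 40 males and 30 females. The function name is illustrative.

```python
def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: support, not support; columns: male, female (inferred counts, see above).
observed = [[13, 7], [27, 23]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

statistic = pearson_chi_square(observed, expected)
print(round(statistic, 3))
```

With these counts the statistic agrees with the value reported in the SPSS output later in the lecture.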


Decision

  • As the calculated value of 0.706 < 3.841 (tabulated value), there is insufficient evidence to reject the model that sex and level of support are independent. That is, there is no evidence of an association between sex and level of support. The profile of support by males is similar to the profile of support for females: 13/40 (32.5%) of males support and 7/30 (23.3%) of females support.


SPSS: data entry looks like

  • Data, weight cases by freq has been selected

  • Analyse, Descriptives, Crosstabs and options have been selected


SPSS output: contingency table


SPSS output: Pearson Chi-Square

Value of chi-square

Assumption of expected frequencies > 5 holds


SPSS output: Pearson Chi-Square

The probability of getting a statistic of 0.706 or greater is 0.401. This is high (> 0.05), therefore retain Ho: a chi-square value this large can easily arise by chance under independence.

Value of chi-square
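
The p value for a chi-square statistic with 1 degree of freedom can be sketched with the standard library alone, since P(chi-square > x) equals P(|Z| > sqrt(x)) for a standard normal Z. This shortcut only applies to df = 1, and the function name is illustrative.

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # P(|Z| > sqrt(x)) = erfc(sqrt(x) / sqrt(2)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(statistic / 2))


p_value = chi2_p_value_df1(0.706)
print(round(p_value, 3))
```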


Example from Utts p. 528: SPSS data

  • Ear infection: Yes / No

  • P: placebo gum

  • X: xylitol gum

  • L: xylitol lozenge

  • Is there an association between ear infection and gum used?


Under Independence: Expected frequency


Under Independence: Expected

Degrees of freedom = 2


Hypotheses

  • Ho: Under the model of independence, the expected counts are E = (row total x column total)/grand total

  • Ha: The expected counts are not E = (row total x column total)/grand total

    If

    p1 = proportion who get an infection in a population given placebo gum

    p2 = proportion who get an infection in a population given xylitol gum

    p3 = proportion who get an infection in a population given xylitol lozenges

    Ho: p1 = p2 = p3

    Ha: p1, p2 and p3 are not all the same


5 step hypothesis test

  • Ho: Under the model of independence, the expected counts are E = (row total x column total)/grand total

  • Ha: E not distributed in this manner

  • α = 0.05

  • df = (3-1)x(2-1) = 2

  • Statistic and region of rejection: if the calculated chi-square > 5.991, reject Ho; there is evidence that the variables are not independent
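
For 2 degrees of freedom the tabulated cut-off has a closed form, because the chi-square upper-tail probability is exp(-x/2); solving exp(-c/2) = α gives c = -2 ln(α). A minimal sketch that applies to df = 2 only, with an illustrative function name:

```python
import math


def chi2_critical_df2(alpha: float) -> float:
    """Upper-tail critical value of chi-square with 2 degrees of freedom."""
    return -2 * math.log(alpha)


critical = chi2_critical_df2(0.05)
print(round(critical, 3))
```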


Conclusion: using decision rule & SPSS

  • Chi-square = 6.690 > 5.991, therefore there is evidence that the data do not fit the model of independence


P values (sig)

For the chi-square test (one tailed) the p value is

  • the probability of getting this statistic or greater if Ho is true


Conclusion using p value from SPSS

The probability of getting a chi-square as high as this or higher, if H0 were true, is 0.035. This is a small probability (< 0.05). There is evidence of an association between infection and gum used.

  • Chi-square = 6.690

Assumptions re expected frequency>5 OK
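
The reported p value can be checked by hand for this table, because with 2 degrees of freedom the chi-square upper-tail probability is exactly exp(-x/2). A sketch that is valid for df = 2 only, with an illustrative function name:

```python
import math


def chi2_p_value_df2(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2)


p_value = chi2_p_value_df2(6.690)
print(round(p_value, 3))
```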


Significance Tests - Formal

1. Null and alternative hypotheses

2. Assign α

3. Select a statistic and determine the rejection region

4. Perform the experiment and calculate the observed value of χ² or T or Z or… other statistic

5. Draw conclusions in context of problem


Previous hypothesis testing situations

Model fit

1. Ho: Expected distributed Binomial(2, 0.5)

Ha: Expected not distributed Binomial(2, 0.5)

2. Ho: Expected distributed Poisson(0.4)

Ha: Expected not distributed Poisson(0.4)

3. Ho: Expected distributed as per the random stopping model

Ha: Expected not distributed as per the random stopping model
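
For the first model-fit situation, the expected frequencies come from the binomial probabilities multiplied by the number of observations. The sketch below uses a hypothetical sample of 100 observations, which is not a figure from the lecture, and an illustrative function name.

```python
from math import comb


def binomial_expected(n_obs: int, trials: int, p: float):
    """Expected frequency of 0, 1, ..., trials successes under Binomial(trials, p)."""
    return [
        n_obs * comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]


# Hypothetical sample size of 100; the model is Binomial(2, 0.5).
expected = binomial_expected(100, 2, 0.5)
print(expected)
```

These expected frequencies would then be compared with the observed frequencies using the same chi-square statistic as in the test of association.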


Future hypothesis testing situations

  • Tests of proportions: the null hypothesis may be proportion = 0.5 and the alternative hypothesis proportion ≠ 0.5

  • Tests of means: the null hypothesis may be μ = 0 and the alternative hypothesis μ ≠ 0

