Chi-Square and Analysis of Variance (ANOVA)

Lecture 9


The Chi-Square Distribution and Test for Independence

Hypothesis testing between two or more categorical variables


Chi-square Test of Independence

  • Tests the association between two nominal (categorical) variables.

    • Null hypothesis: the two variables are independent.

  • It is really just a comparison between the expected frequencies and the observed frequencies across the cells of a crosstabulation table (see the sketch below).
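
A minimal sketch of that comparison (assuming numpy is available); the 2x2 gender-by-answer counts below are invented purely for illustration:

```python
import numpy as np

# Hypothetical crosstab: rows = gender (male, female), columns = answer (yes, no)
observed = np.array([[30, 20],
                     [10, 40]])

# Expected frequency for each cell = (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals @ col_totals / observed.sum()

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected
chi_square = ((observed - expected) ** 2 / expected).sum()
print(expected)      # expected frequencies under independence
print(chi_square)    # large values are evidence against independence
```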


Example Crosstab: gender x binary question


Degrees of freedom

  • Chi-square degrees of freedom

    • df = (r - 1)(c - 1)

      • Where r = # of rows and c = # of columns

      • Thus, in any 2x2 contingency table, df = 1; a 3x4 table, for example, has df = 2 × 3 = 6.

      • As the degrees of freedom increase, the distribution shifts to the right and the critical values of chi-square become larger (illustrated in the sketch below).
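
A quick sketch of both points, assuming scipy is installed; the table shapes are arbitrary examples:

```python
from scipy.stats import chi2

# df = (r - 1)(c - 1); the 0.05 critical value grows as df increases
for rows, cols in [(2, 2), (3, 3), (3, 4)]:
    df = (rows - 1) * (cols - 1)
    critical = chi2.ppf(0.95, df)   # critical chi-square at alpha = 0.05
    print(f"{rows}x{cols} table: df = {df}, critical value = {critical:.2f}")
```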


Chi-Square Distribution

  • The chi-square distribution results when independent standard normal random variables are squared and summed; the number of variables summed is the distribution's degrees of freedom.
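
A small simulation of that definition (a sketch, assuming numpy): squaring and summing k independent standard normal draws gives values whose mean is k and whose variance is 2k, as a chi-square distribution with k degrees of freedom requires.

```python
import numpy as np

rng = np.random.default_rng(42)
k = 3                                      # number of independent standard normals
draws = rng.standard_normal((100_000, k))

# Square each standard normal and sum across the k variables
simulated = (draws ** 2).sum(axis=1)

print(simulated.mean())   # approximately k = 3
print(simulated.var())    # approximately 2k = 6
```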


Requirements for Chi-Square test

  • The data must be a random sample from the population

  • Data must be in the form of raw frequencies (counts, not percentages)

  • Observations must be independent

  • Categories for each variable must be mutually exclusive and exhaustive


Using the Chi-Square Test

  • Often used with contingency tables (i.e., crosstabulations)

    • E.g., gender x race

  • Basically, the chi-square test of independence tests whether the columns are contingent on the rows in the table.

    • In this case, the null hypothesis is that there is no relationship between row and column frequencies.
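
In practice the test is usually run with scipy.stats.chi2_contingency; the gender-by-race counts below are made up just to show the call:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = gender, columns = race (raw frequencies)
observed = np.array([[25, 30, 15],
                     [20, 35, 25]])

chi2_stat, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2_stat:.2f}, df = {dof}, p = {p_value:.3f}")
print(expected)   # expected frequencies under the null of independence
# A small p-value would lead us to reject the null of no relationship
# between row and column frequencies.
```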


Practical Example:

  • Expected frequencies versus observed frequencies

  • General Social Survey Example


ANOVA and the F-distribution

Hypothesis testing between a 3+ category variable and a metric variable


Analysis of Variance

  • In its simplest form, it is used to compare means for three or more categories.

    • Example:

      • Life Happiness scale and Marital Status (married, never married, divorced)

  • Relies on the F-distribution

    • Just like the t-distribution and chi-square distribution, there is a separate sampling distribution for each possible value of df (for F, a pair of df values: numerator and denominator).
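
A minimal one-way ANOVA sketch using scipy.stats.f_oneway and the marital-status example above; the happiness scores are invented for illustration only:

```python
from scipy.stats import f_oneway

# Hypothetical life-happiness scores (higher = happier) by marital status
married       = [7, 8, 6, 9, 7, 8]
never_married = [6, 5, 7, 6, 5, 6]
divorced      = [5, 4, 6, 5, 4, 5]

f_stat, p_value = f_oneway(married, never_married, divorced)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests at least one group mean differs from the others.
```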


What is ANOVA?

  • If we have a categorical variable with 3+ categories and a metric/scale variable, we could just run 3 t-tests.

    • The problem is that the three tests would not be independent of each other (they reuse the same group information), and running multiple tests inflates the chance of a false positive.

  • A better approach: compare the variability between groups (treatment variance + error) to the variability within the groups (error)


The F-ratio

  • F = MS_bg / MS_wg

    • MS = mean square, bg = between groups, wg = within groups

  • The numerator is the “effect” and the denominator is the “error”

  • Numerator df = k - 1 (where k = # of categories); denominator df = N - k


Between-Group Sum of Squares (Numerator)

  • Between-group sum of squares = total variability - residual (within-group) variability (worked through in the sketch below this list)

  • Total variability is quantified as the sum of the squares of the differences between each value and the grand mean.

    • Also called the total sum-of-squares

  • Variability within groups is quantified as the sum of squares of the differences between each value and its group mean

    • Also called residual sum-of-squares
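
A hand-computed sketch of that decomposition (reusing the invented marital-status scores from above), checking that between-group SS = total SS - within-group SS and that F = MS_bg / MS_wg:

```python
import numpy as np
from scipy.stats import f_oneway

groups = [np.array([7, 8, 6, 9, 7, 8]),   # married (hypothetical scores)
          np.array([6, 5, 7, 6, 5, 6]),   # never married
          np.array([5, 4, 6, 5, 4, 5])]   # divorced

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Total SS: squared differences between each value and the grand mean
ss_total = ((all_values - grand_mean) ** 2).sum()

# Within-group (residual) SS: squared differences between each value and its group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Between-group SS is whatever variability is left over
ss_between = ss_total - ss_within

k, n = len(groups), len(all_values)
ms_between = ss_between / (k - 1)   # numerator df = k - 1
ms_within = ss_within / (n - k)     # denominator df = N - k

print(ms_between / ms_within)           # F computed by hand
print(f_oneway(*groups).statistic)      # matches scipy's F
```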


Null Hypothesis in ANOVA

  • If there is no difference between the group means, the between-group mean square should be about the same size as the within-group mean square, so the F-ratio should be close to 1.


F-distribution

  • The F-test is always a one-tailed test.

    • Why? (Hint: the F-ratio is a ratio of variances, so it can never be negative; only unusually large values count as evidence against the null.)


Logic of the ANOVA

  • Conceptual Intro to ANOVA


Bringing it all together: Choosing the appropriate bivariate statistic


Reminder About Causality

  • Remember from earlier lectures: bivariate statistics do not test causal relationships; they only show that a relationship exists.

  • Even if you plan to use more sophisticated causal tests, you should always run simple bivariate statistics on your key variables to understand their relationships.


Choosing the Appropriate Statistical Test

  • General rules for choosing a bivariate test (a scipy cheat sheet follows this list):

    • Two categorical variables

      • Chi-Square (crosstabulations)

    • Two metric variables

      • Correlation

    • One 3+ categorical variable, one metric variable

      • ANOVA

    • One binary categorical variable, one metric variable

      • t-test
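
As a rough scipy-based cheat sheet for these rules (placeholder data; the functions named are common choices, not the only options):

```python
from scipy.stats import chi2_contingency, f_oneway, pearsonr, ttest_ind

# Two categorical variables -> chi-square test on the crosstab
print(chi2_contingency([[30, 20], [10, 40]]))

# Two metric variables -> Pearson correlation
print(pearsonr([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))

# One 3+ category variable, one metric variable -> one-way ANOVA
print(f_oneway([7, 8, 6], [6, 5, 7], [5, 4, 6]))

# One binary categorical variable, one metric variable -> independent-samples t-test
print(ttest_ind([7, 8, 6, 9], [5, 4, 6, 5]))
```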


Assignment #2

  • Online (course website)

  • Due next Monday in class (April 10th)

