Analysis of Variance (ANOVA)


# Analysis of Variance (ANOVA) - PowerPoint PPT Presentation



## PowerPoint Slideshow about 'Analysis of Variance (ANOVA)' - arleen


### Analysis of Variance (ANOVA)

Agenda
• Lab Stuff
• Intro to Analysis of Variance (ANOVA)
This Thursday: Lab 4
• Final lab will be distributed on Thursday
• Very similar to lab 3, but with different data
• You will be expected to find appropriate variables for three major tests (correlation, t-test, chi-square test of independence)
• You will be expected to interpret the findings from each test (one short paragraph per test).
• We will use the first 15 minutes of class to return lab 3 and discuss common issues and questions
Analysis of Variance
• In its simplest form, it is used to compare means for three or more categories.
• Example:
• Income (metric) and Marital Status (many categories)
• Relies on the F-distribution
• Just like the t-distribution and chi-square distribution, the F-distribution is a family of sampling distributions: there is a different one for each combination of degrees of freedom.
What is ANOVA?
• If we have a categorical variable with 3+ categories and a metric/scale variable, we could just run a t-test for every pair of categories (3 t-tests for 3 categories).
• One problem is that the 3 tests would not be independent of each other, because each group's data is reused in more than one comparison.
• As the number of comparisons grows, some significant differences are expected by chance alone; they do not necessarily indicate an overall difference.
• A better approach: compare the variability between groups (treatment variance + error) to the variability within the groups (error)
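The multiple-comparisons problem can be made concrete with a rough calculation. The sketch below is not from the slides: it assumes a significance level of 0.05 and treats the pairwise t-tests as if they were independent (which, as noted above, they are not), so the numbers are only an approximation of how quickly the chance of at least one false positive grows.

```python
from math import comb

def familywise_error_rate(k_groups, alpha=0.05):
    """Return (number of pairwise t-tests, approx. chance of >= 1 false positive)."""
    m = comb(k_groups, 2)  # every pair of groups gets its own t-test
    # Assumption: the m tests are independent. Pairwise t-tests are not,
    # so treat this as a rough approximation, not an exact rate.
    fwer = 1 - (1 - alpha) ** m
    return m, fwer

for k in (3, 4, 5):
    m, fwer = familywise_error_rate(k)
    print(f"{k} groups -> {m} t-tests, familywise error rate ~ {fwer:.3f}")
```

With only 3 groups the approximate chance of a spurious "significant" pair is already well above the nominal 0.05, which is the motivation for a single overall test.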
The F-ratio
• F = MS_bg / MS_wg
• MS = mean square (a sum of squares divided by its degrees of freedom)
• bg = between groups
• wg = within groups
• The numerator and denominator have their own degrees of freedom
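• Numerator (between groups): df = # of categories – 1 (k – 1)
• Denominator (within groups): df = # of cases – # of categories (N – k)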
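As an illustration of how the pieces of the F-ratio fit together, here is a minimal sketch in plain Python. The function name `one_way_f` and the income figures (in $1000s, by marital status) are hypothetical and made up for this example; they are not taken from the class data set.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group means around the grand mean
    ss_bg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: observations around their own group mean
    ss_wg = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_bg = k - 1          # numerator degrees of freedom
    df_wg = n_total - k    # denominator degrees of freedom
    ms_bg = ss_bg / df_bg  # mean square between groups
    ms_wg = ss_wg / df_wg  # mean square within groups
    return ms_bg / ms_wg, df_bg, df_wg

# Hypothetical incomes in $1000s, invented for illustration only
incomes = {
    "married": [52, 48, 60, 55],
    "never married": [38, 42, 35, 45],
    "divorced": [40, 47, 43, 50],
}
f, df_bg, df_wg = one_way_f(list(incomes.values()))
print(f"F({df_bg}, {df_wg}) = {f:.2f}")
```

The resulting F would then be compared against the F-distribution with those two degrees of freedom to obtain a p-value.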
Interpreting the F-ratio
• Generally, an F-ratio is a measure of how different the means are relative to the variability within each sample
• Larger values → stronger evidence that the differences between means are not due to chance alone
Null Hypothesis in ANOVA
• If there is no difference between the means, then the between-group mean square should be approximately equal to the within-group mean square, so the F-ratio should be close to 1.
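One way to see what the null hypothesis implies is to simulate it. The sketch below is an illustration, not part of the lab materials: it assumes three groups of 20 cases, all drawn from the same normal population (mean 50, standard deviation 10, values chosen arbitrarily), and checks that the F-ratio averages out near 1 when there is no real difference between the means.

```python
import random

def f_ratio(groups):
    """F = MS_bg / MS_wg for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_bg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_wg = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_bg / (k - 1)) / (ss_wg / (n_total - k))

rng = random.Random(0)  # fixed seed so the run is repeatable

def simulate_null(n_sims=2000, k=3, n_per_group=20):
    """Draw every group from the same population and collect the F-ratios."""
    return [
        f_ratio([[rng.gauss(50, 10) for _ in range(n_per_group)]
                 for _ in range(k)])
        for _ in range(n_sims)
    ]

f_values = simulate_null()
average_f = sum(f_values) / len(f_values)
print(f"average F under the null: {average_f:.2f}")
```

Individual simulated F-ratios still vary a good deal, which is why an observed F is judged against the whole F-distribution rather than against 1 directly.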
F-distribution
• A right-skewed distribution
• It is the ratio of two independent chi-square variables, each divided by its own degrees of freedom
F-distribution
• F-test for ANOVA is a one-tailed test.
Visual ANOVA and F-ratio

http://tinyurl.com/271ANOVA

ANOVA and t-test
• How do we know where the differences exist once we know that we have an overall difference between groups?
• t-tests become important after an ANOVA so that we can find out which pairs are significantly different (post-hoc tests).
• Certain ‘corrections’ can be applied to such post-hoc t-tests so that we account for multiple comparisons (e.g., Bonferroni correction, which divides the significance level, alpha, by the number of comparisons being made)
• There are many means-comparison tests available (Tukey, Sidak, Bonferroni, etc.). All are basically modified means comparisons.
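The Bonferroni idea can be sketched in a few lines. In the example below the helper name `bonferroni` and the three p-values are hypothetical, invented purely to show the arithmetic; in practice the p-values would come from the post-hoc t-tests themselves.

```python
def bonferroni(p_values, alpha=0.05):
    """Return ({pair: still significant?}, per-test threshold)."""
    m = len(p_values)
    threshold = alpha / m  # each test is held to the stricter level alpha / m
    decisions = {pair: p < threshold for pair, p in p_values.items()}
    return decisions, threshold

# Hypothetical post-hoc p-values, made up for illustration only
p_values = {
    ("married", "never married"): 0.004,
    ("married", "divorced"): 0.030,
    ("never married", "divorced"): 0.210,
}

decisions, threshold = bonferroni(p_values)
print(f"per-test threshold: {threshold:.4f}")
for pair, significant in decisions.items():
    label = "significant" if significant else "not significant"
    print(f"{pair[0]} vs. {pair[1]}: {label}")
```

Note that the second comparison would pass an uncorrected 0.05 cutoff but not the corrected one, which is the sense in which the correction guards against chance findings.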
Logic of the ANOVA
• Conceptual Intro to ANOVA
• Class Example:
• anova.do
• GSS96_small.dta