Using the Bayesian Information Criterion to Judge Models and Statistical Significance

Paul Millar

University of Calgary

Problems
• Choosing the “best” model
• Aside from OLS, few recognized standards
• Few ways to judge if adding an explanatory variable is justified by the additional explained variance
• Conventional p-values are problematic
• Behave poorly when N is very large or very small
• Potential unrecognized relationships between explanatory variables
• Random associations not always detected

Judging Models

• Explanatory Framework
• Need to find the “best” or most likely model, given the data
• Two aspects
• Which variables should comprise the model?
• Which form should the model take?
• Predictive Framework
• Of the potential variables and model forms, which best predicts the outcome?
Bayesian Approach
• Origins (Bayes 1763)
• Bayes Factors (Jeffreys 1935)
• BIC (Schwarz 1978)
• Variable Significance (Raftery 1995)
• Judging Variables and Models
• Stata Commands

Bayes Law

Joint distribution: (A, B) or (A ∩ B)

[Venn diagram of overlapping events: A = Low Education, B = High Income]
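
For reference, Bayes' law in its standard form relates the conditional probability of A given B to the joint distribution. The statement below is the textbook identity, written with the slide's events A (low education) and B (high income); it is not recovered from the slide itself.

```latex
P(A \mid B) \;=\; \frac{P(A \cap B)}{P(B)} \;=\; \frac{P(B \mid A)\, P(A)}{P(B)}
```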

Bayes Law and Model Probability

Assume: Two Models

Assume: Equal Priors
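
Under these two assumptions, Bayes' law gives the posterior probability of a model in terms of the Bayes factor. The derivation below is a standard sketch; the symbols M_1, M_2 (the two models), D (the data), and B_{12} (the Bayes factor) are notation assumed here.

```latex
P(M_1 \mid D)
  = \frac{P(D \mid M_1)\, P(M_1)}{P(D \mid M_1)\, P(M_1) + P(D \mid M_2)\, P(M_2)}

% Equal priors: P(M_1) = P(M_2) = 1/2, so the priors cancel.
P(M_1 \mid D)
  = \frac{P(D \mid M_1)}{P(D \mid M_1) + P(D \mid M_2)}
  = \frac{B_{12}}{1 + B_{12}},
\qquad
B_{12} = \frac{P(D \mid M_1)}{P(D \mid M_2)}
```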

Bayes Law and Model Probability

• Jeffreys (1935)
• Allows comparison of any two models
• Nesting not required
• Explanatory framework
• Problem
• Complexity
• Challenging to solve
An Approximation: BIC
• Bayesian Information Criterion (BIC)
• Function of N, df, deviance or χ² from the LRT
• Readily obtainable from most model output
• Allows approximation of the Bayes Factor
• Two versions
• relative to saturated model (BIC) or null model (BIC’)
• Assumptions
• “large” N
• Nested Models
• Prior expectation of model parameters is multivariate normal
• Attributed to Schwarz (1978)
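
As a rough sketch of how these quantities fit together, the Python below computes both versions of the BIC from ordinary model output and converts a BIC difference into an approximate Bayes factor. The formulas follow the form given in Raftery (1995); the function names and the example numbers are illustrative assumptions, not part of the presentation or of any Stata command.

```python
import math


def bic_saturated(deviance: float, resid_df: int, n_obs: int) -> float:
    """BIC relative to the saturated model: deviance - df * ln(N)."""
    return deviance - resid_df * math.log(n_obs)


def bic_prime(lr_chi2: float, n_params: int, n_obs: int) -> float:
    """BIC' relative to the null model: -chi2 + p * ln(N).

    lr_chi2 is the likelihood-ratio chi-square against the null model and
    n_params is the number of explanatory variables in the model.
    """
    return -lr_chi2 + n_params * math.log(n_obs)


def approx_bayes_factor(bic_a: float, bic_b: float) -> float:
    """Approximate Bayes factor for model A over model B.

    Uses 2 * ln(B_ab) ~= BIC_b - BIC_a, so a lower BIC for A gives B_ab > 1.
    """
    return math.exp((bic_b - bic_a) / 2.0)


def posterior_prob(bic_a: float, bic_b: float) -> float:
    """Posterior probability of model A versus B, assuming equal priors."""
    bf = approx_bayes_factor(bic_a, bic_b)
    return bf / (1.0 + bf)


# Hypothetical example: N = 1000; model A has 3 variables and chi2 = 60,
# model B adds a fourth variable and reaches chi2 = 64.
bic_a = bic_prime(60.0, 3, 1000)
bic_b = bic_prime(64.0, 4, 1000)
print(bic_a, bic_b, approx_bayes_factor(bic_a, bic_b), posterior_prob(bic_a, bic_b))
```

In this made-up example the extra variable raises the chi-square by 4, which a conventional test would call significant at the 5% level, yet the penalty of ln(1000) per parameter leaves the smaller model with the lower BIC'.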
An Alternative to the t-test
• The t-test produces over-confident results for large datasets
• Random relationships sometimes pass the test
• Widely varying results are possible when it is combined with stepwise regression
• The only other significance-testing method (re-sampling) provides no guidance on the form or content of the model
BIC-based Significance
• Raftery (1995)
• Examines all possible models with the given variables (2^k models)
• For each model calculates a BIC-based probability
• Computationally intensive
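
The all-subsets idea can be sketched in a few lines of Python. This is a minimal illustration for ordinary least squares only, assuming equal prior probability for every model and using the linear-regression form BIC' = N ln(1 - R²) + p ln(N); it is not the -bic- command, and the function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


def inclusion_probabilities(X, y):
    """Enumerate all 2^k subsets and return (subsets, model probs, variable probs)."""
    n, k = X.shape
    subsets, bics = [], []
    for size in range(k + 1):
        for cols in combinations(range(k), size):
            # The empty subset is the null model, whose R^2 is 0 by definition.
            r2 = r_squared(X[:, list(cols)], y) if cols else 0.0
            bics.append(n * np.log(1.0 - r2) + size * np.log(n))
            subsets.append(cols)
    bics = np.array(bics)
    # exp(-BIC/2) is proportional to each model's posterior probability;
    # subtracting the minimum first avoids numerical underflow.
    weights = np.exp(-0.5 * (bics - bics.min()))
    model_probs = weights / weights.sum()
    # A variable's probability is the total probability of the models containing it.
    var_probs = np.array(
        [sum(p for s, p in zip(subsets, model_probs) if j in s) for j in range(k)]
    )
    return subsets, model_probs, var_probs


# Simulated data (an assumption for illustration): only the first column matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + rng.normal(size=500)
subsets, model_probs, var_probs = inclusion_probabilities(X, y)
print(var_probs)
```

Because the loop visits every subset, the work doubles with each added variable, which is the computational burden noted above.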
A Further Approximation
• Compare the model with all variables to the model without a specific variable
• Only requires a model for each IV (k)
• Experiment: k=10, n=100,000
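
A minimal sketch of this drop-one comparison, again for OLS and with equal priors assumed for the two models in each pair. The BIC' formula is the linear-regression form N ln(1 - R²) + p ln(N); the function names are hypothetical and the code is not the bicdrop1 command itself.

```python
import numpy as np


def bic_prime_ols(X, y):
    """BIC' of an OLS fit of y on X plus an intercept: N ln(1 - R^2) + p ln(N)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return n * np.log(1.0 - r2) + X.shape[1] * np.log(n)


def drop_one_probabilities(X, y):
    """For each variable, compare the full model with the model omitting it.

    Fits k + 1 models in total instead of 2^k. Returns, per variable, the
    approximate probability that the full model is preferred to the reduced one.
    """
    full = bic_prime_ols(X, y)
    probs = []
    for j in range(X.shape[1]):
        reduced = bic_prime_ols(np.delete(X, j, axis=1), y)
        delta = reduced - full  # positive when dropping the variable hurts
        probs.append(1.0 / (1.0 + np.exp(-delta / 2.0)))
    return np.array(probs)


# Simulated data (an assumption for illustration): only the first column matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + rng.normal(size=500)
drop_probs = drop_one_probabilities(X, y)
print(drop_probs)
```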
-pre-
• Prediction only
• The reduction in errors for categorical variables
• logistic, probit, mlogit, cloglog
• Allows calculation of “best” cutoff
• The reduction in squared errors for continuous variables
• regress, etc.
• Allows comparison of prediction capability across model forms
• Ex. mlogit vs. ologit vs. nbreg vs. poisson
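
The two reductions can be illustrated with the usual proportional-reduction-in-error definition. The sketch below assumes a binary outcome for the categorical case, a baseline of always predicting the modal category, and a baseline of predicting the mean for the continuous case. These are assumptions about what such a measure typically computes, not a description of the -pre- command's internals.

```python
import numpy as np


def pre_categorical(y, p_hat, cutoff=0.5):
    """Proportional reduction in classification errors for a 0/1 outcome."""
    y = np.asarray(y)
    p_hat = np.asarray(p_hat)
    # Baseline: always predict the more common category.
    null_errors = min(y.sum(), len(y) - y.sum())
    model_errors = np.sum((p_hat >= cutoff).astype(int) != y)
    return (null_errors - model_errors) / null_errors


def best_cutoff(y, p_hat, grid=None):
    """Scan candidate cutoffs and return (cutoff, PRE) for the best one."""
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    scores = [pre_categorical(y, p_hat, c) for c in grid]
    i = int(np.argmax(scores))
    return float(grid[i]), float(scores[i])


def pre_continuous(y, y_hat):
    """Proportional reduction in squared errors relative to predicting the mean."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


# Hypothetical outcomes and predicted probabilities.
y_obs = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p_pred = np.array([0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9])
print(pre_categorical(y_obs, p_pred), best_cutoff(y_obs, p_pred))
```

Because both measures are expressed as a share of the baseline error, they can be compared across model forms fitted to the same outcome.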
-bicdrop1-
• Used when -bic- takes too long or when comparisons to the AIC are desired
-bic-
• Reports probability for each variable using Raftery’s procedure
• Also reports pseudo-R², -pre-, and -bicdrop1- results
• Reports most likely models, given the theory and data (hence a form of stepwise)
Further Development
• “-pre-”-wise regression
• Find the combination of IVs and model specification that best predict the outcome variable
• Variable significance ignored
• Bayesian cross-model comparisons
• Safer than stepwise
• Bayes Factors
• Requires development of reasonable empirical solutions to integrals