1 / 161

1.63k likes | 2.2k Views

Bayesian data analysis 1 using Bugs 2 and R 3. 1 Fit complicated models 2 clearly 3 display and understand results. Bayesian data analysis using BUGS and R. Prof. Andrew Gelman Dept. of Statistics and Political Science Columbia University.

Download Presentation
## Bayesian data analysis 1 using Bugs 2 and R 3

**An Image/Link below is provided (as is) to download presentation**
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.
Content is provided to you AS IS for your information and personal use only.
Download presentation by click this link.
While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

**Bayesian data analysis1 using Bugs2 and R3**1Fit complicated models 2clearly 3display and understand results**Bayesian data analysisusing BUGS and R**Prof. Andrew Gelman Dept. of Statistics and Political Science Columbia University Joint Program in Survey Methodology, University of Maryland, College Park January 17-18, 2008**Goals**• Should be able to… • Fit multilevel models and display and understand the results • Use Bugs from R • Theoretical • Connection between classical linear models and GLM, multilevel models, and Bayesian inference • Practical • Dealing with data and graphing in R • Reformatting models in Bugs to get them to work**Realistic goals**• You won’t be able to go out and fit Bayesian models • But . . . you’ll have a sense of what Bayesian data analysis “feels like”: • Model building • Model fitting • Model checking**Overview**• What is Bayesian data analysis? Why Bayes? • Why R and Bugs [schools example] • Hierarchical linear regression [radon example] • Bayesian data analysis: what it is and what it is not [several examples] • Model checking [several examples] • Hierarchical Poisson regression [police stops example] • Understanding the Gibbs sampler and Metropolis algorithm • Troubleshooting: simulations don’t work, run too slowly, make no sense • Advanced computation [social networks example] • Accounting for design of surveys and experiments [polling example] • Prior distributions [logistic-regression separation example] • If there is time: hierarchical logistic regression with varying intercepts and slopes [red/blue states example] • Loose ends**Structure of this course**• Main narrative on PowerPoint • Formulas on blackboard • On my computer and yours: • Bugs, R, and a text editor • Data, Bugs models, and R scripts for the examples • We do computer demos together • Pause to work on your own to play with the models—we then discuss together • Lecturing for motivating examples • Interrupt to ask questions**Structure of this course**• Several examples. For each: • Set up the Bugs model • Use R to set up the data and initial values, and to run Bugs • Use R to understand the inferences • Sometimes fit model directly in R • Play with variations on your own computer • Understanding how Bugs works and how to work with Bugs • More examples • Working around problems in Bugs • Using the inferences to answer real questions • Not much theory. If you want more theory, let me know! • Emphasis on general methods for model construction, expansion, patching, and understanding**What is Bayesian data analysis? Why Bayes?**• Effective and flexible • Modular (“Tinkertoy”) approach • Combines information from different sources • Estimate inferences without overfitting for models with large number of parameters • Examples from sample surveys and public health include: • Small-area estimation, longitudinal data analysis, and multilevel regression**What are BUGS and R?**• Bugs • Fit Bayesian statistical models • No programming required • R • Open source language for statistical computing and graphics • Bugs from R • Offers flexibility in data manipulation before the analysis and display of inferences after • Avoids tedious issues of working with Bugs directly [Open R: schools example]**Open R: schools example**• Open R: • setwd (“c:/bugs.course/schools”) • Open RWinEdt • Open file “schools/schools.R” • Open file “schools.bug” • Click on Window, Tile Horizontally • Copy the commands from schools.R and paste them into the R window**8 schools example**• Motivates hierarchical (multilevel) modeling • Easy to do using Bugs • Background: educational testing experiments • Classical analyses (load data into R) • No pooling • Complete pooling • Hierarchical analysis using Bugs • Talk through the 8-schools model • Fit the model from R • Look at the inferences, compare to classical**Playing with the 8 schools**• Try it with different n.chains, n.iter • Different starting values • Changing the data values y • Changing the data variances sigma.y • Changing J=#schools • Changing the model**A brief prehistory of Bayesian data analysis**• Bayes (1763) • Links statistics to probability • Laplace (1800) • Normal distribution • Many applications, including census [sampling models] • Gauss (1800) • Least squares • Applications to astronomy [measurement error models] • Keynes, von Neumann, Savage (1920’s-1950’s) • Link Bayesian statistics to decision theory • Applied statisticians (1950’s-1970’s) • Hierarchical linear models • Applications to animal breeding, education [data in groups]**A brief history of Bayesian data analysis, Bugs, and R**• “Empirical Bayes” (1950’s-1970’s) • Estimate prior distributions from data • Hierarchical Bayes (from 1970) • Include hyperparameters as part of the full model • Markov chain simulation (from 1940’s [physics] and 1980’s [statistics]) • Computation with general probability models • Iterative algorithms that give posterior simulations (not point estimates) • Bugs (from 1994) • Bayesian inference Using Gibbs Sampling (David Spiegelhalter et al.) • Can fit general models with modular structure • S (from 1980’s) • Statistical language with modeling, graphics, and programming • S-Plus (commercial version) • R (open-source) • lme() and lmer() functions by Doug Bates for fitting hierarchical linear and generalized linear models**My own experiences with Bayes, Bugs, and R**• Bayesian methods really work, especially for models with lots of parameters • Use R for data manipulations and various preprogrammed models • Use Bugs (from R) as first try in fitting complex models • Use R to make graphs to check that fitted model makes sense • When Bugs doesn’t work, program in R directly**R and Bugs for classical inference**• Radon example • Fitting a linear regression in Bugs • Displaying the results in R • Complete-pooling and no-pooling models • Model extensions [open R: radon example]**Open R: radon example**• In RWinEdt • Open file “radon/radon_setup.R” • We’ll talk through the code • In R: • setwd (“c:/bugs.course/radon”) • source (“radon_setup.R”) • Regression output will appear on the R console and graphics windows**Fitting a linear regression in R and Bugs**• The Bugs model • Setting up data and initial values in R • Running Bugs and checking convergence • Displaying the fitted line and residuals in R**Radon example in R(complete pooling)**• In RWinEdt • Open file “radon/radon.classical1.R” • In other window, open file “radon.classical.bug” • Copy the commands (one paragraph at a time) from radon.classical1.R and paste them into the R window • Fit the model • Plot the data and estimates • Plot the residuals (two versions) • Label the plot**Simple extensions of the radon model**• Separate variances for houses with and without basements • t instead of normal errors • Fitting each model, then understanding it**Radon regression with 2 variance parameters**• Separate variances for houses with and without basements • Bugs model • Setting it up in R • Using the posterior inferences to compare the two variance parameters**Radon example in R (regression with 2 variance parameters)**• In RWinEdt • Open file “radon/radon.extend.1.R” • Other window, open file “radon.classical.2vars.bug” • Copy from radon.extend.1.R and paste into the R window • Fit the model • Display a posterior inference • Compute a posterior probability • STOP**Robust regression for radon data**• t instead of normal errors • Issues with the degrees-of-freedom parameter • Looking at the posterior simulations and comparing to the prior distribution • Interpreting the scale parameter**Radon example in R (robust regression using the t model)**• In RWinEdt • Stay with “radon/radon.extend.1.R” • Other window: “radon.classical.t.bug” • Copy rest of radon.extend.1.R and paste into the R window • Fit the t model • Run again with n.iter=1000 iterations • Make some plots**Fit your own model to the radon data**• Alter the Bugs model • Change the setup in R • Run the model and check convergence • Display the posterior inferences in R [Suggestions of alternative models?]**Fitting several regressions in R and Bugs**• Back to the radon example • Complete pooling [we just did] • No pooling [do now: see next slide] • Bugs model is unchanged • Fit separate model in each county and save them as a list • Displaying data and fitted lines from 11 counties • Next step: hierarchical regression**Radon example in R (no pooling)**• In RWinEdt • Open file “radon/radon.classical2.R” • In other window, open file “radon.classical.bug” • Copy from radon.classical2.R and paste into the R window • Fit the model (looping through n.county=11 counties) • Plot the data and estimates • Plot the residuals (two versions) • Label the plot**Hierarchical linear regression: models**[Show the models on blackboard] • Generalizing the Bugs model • Varying intercepts • Varying intercepts, varying slopes • Adding predictors at the group level • Also write each model using algebra • More than one way to write (or program) each model**Hierarchical linear regression: fitting and understanding**[working on blackboard] • Displaying results in R • Comparing models • Interpreting coefficients and variance parameters • Going beyond R-squared**Hierarchical linear regression: varying intercepts**• Example: county-level variation for radon • Recall classical models • Complete pooling • No pooling (separate regressions) • Including county effects hierarchically • Bugs model • Written version • More than one way to write the model • Fitting and understanding • Interpreting the parameters • Displaying results in R**Radon example in Bugs and R (varying intercepts)**• In RWinEdt • “radon/radon.multilevel.1.R” • Other window: “radon.multilevel.1.bug” • Copy from radon.multilevel.1.R and paste into the R window • Fit the model (debug=TRUE) • Close the Bugs window • Run for 100, then 1000 iterations • Plot the data and estimates for 11 counties • Plot all 11 regression lines at once**Fitting the radon model using lmer()**• lmer(): an R function for multilevel (hierarchical) regression and glm • Not as flexible as Bugs • Has problems when group-level variance parameters are near zero (as in 8-schools example) • But . . . is easy to use and pretty fast • (Similar routines available in Stata and SAS)**Radon example using lmer() (varying intercepts)**• In RWinEdt • “radon/radon.multilevel.lmer.1.R” • Copy from radon.multilevel.lmer.1.R and paste into the R window • Fit the model using lmer() • Display the results • Pull out the intercepts and slopes • Plot the data and estimates for 11 counties • Get posterior simulations and compare to Bugs**Hierarchical linear regression: varying intercepts, varying**slopes • The model • Bugs model • Written version • Setup in R • Running Bugs, checking convergence • Displaying the fitted model in R • Understanding the new parameters**Radon example in R (varying intercepts and**slopes) • In RWinEdt • “radon/radon.multilevel.2.R” • Other window: “radon.multilevel.2.bug” • Copy from radon.multilevel.2.R and paste into the R window • Fit the model (debug=TRUE) • Close the Bugs window • Run for 500 iterations • Plot the data and estimates for 11 counties • Plot all 11 regression lines at once • Plot slopes vs. intercepts**Radon example using lmer() (varying intercepts and slopes)**• In RWinEdt • “radon/radon.multilevel.lmer.2.R” • Copy from radon.multilevel.lmer.2.R and paste into the R window • Fit the model using lmer() • Display the results • Play around with centering and rescaling of x • Pull out the intercepts and slopes • Plot the data and estimates for 11 counties • Get posterior simulations and compare to Bugs**Structure of lmer()**• Data are grouped, can have nonnested groups (see book for examples) • Coefficients can be “fixed” (not varying by group) or “random” (varying by group) • lmer (y ~ fixed + (random | group)) • Also glm version: lmer (y ~ fixed + (random | group), family=binomial (link=“logistic”))**Simple lmer() examples**• From R window • ?mcsamp • Scroll to the bottom of the help window and paste the example into the R window • Simulate some fake multilevel data • Fit and display a varying-intercept model • Varying-intercept, varying-slope model • Logistic model: varying intercept • Logistic: varying-intercept, varying-slope • Non-nested varying-intercept, varying-slope model**Playing with hierarchical modeling for the radon example**• Uranium level as a county-level predictor • Changing the sample sizes • Perturbing the data • Fitting nonlinear models**Radon example in R (adding a county-level**predictor) • In RWinEdt • radon/radon.multilevel.3.R • Other window: radon.multilevel.3.bug • Copy from radon.multilevel.3.R and paste into the R window • Fit the model • Plot the data and estimates for 11 counties • Plot all 11 regression lines at once • Plot county parameters vs. county uranium level**Varying intercepts and slopes with correlated group-level**errors • The model • Statistical notation • Bugs model • Show on blackboard • Correlation and group-level predictors • More in chapters 13 and 17 of Gelman and Hill (2006)**Bayesian data analysis: what it is and what it is not**• Popular view of Bayesian statistics • Subjective probability • Elicited prior distributions • Bayesian data analysis as we do it • Hierarchical modeling • Many applications • Conceptual framework • Fit a probability model to data • Check fit, ride the model as far as it will take you • Improve/expand/extend model

More Related