WinBUGS Examples

WinBUGS Examples “I confess that I have been a blind mole, but it is better to learn wisdom late than never to learn it at all.” Sherlock Holmes The Man with the Twisted Lip James B. Elsner Department of Geography, Florida State University Portions of this presentation are taken from Getting Started with WinBUGS, Gillian Raab, June 1999.

We practice the WinBUGS language format using a simple example of estimating a mean and variance of 20 observations using a diffuse prior for the mean and variance (precision). In other words, we do not want to bring in any prior information. In WinBUGS we specify normal distributions in terms of their mean and precision. The precision that is usually represented by tau is 1/sigma^2. WinBUGS requires all priors to be proper. That is, they integrate to a finite number. For the mean of a distribution, the usual choice is a normal distribution. If we don’t want to bring in any prior information, then we allow the standard deviation to cover a very wide range of values. The default in WinBUGS is a value of 1.0E-6 for the precision. Thus we are allowing a 95% credible interval between –2,000,000 and +2,000,000. That is pretty vague, but you can always make it even more vague. Priors for the precision are less obvious. One way to specify the prior on the precision is to use a uniform distribution over a fixed range, which gives a Pareto distribution for the precision. Another is to use a member of the Gamma family of distributions for the precision, which is a conjugate prior for the precision in a normal distribution. Conjugate priors are choices of prior distributions that are naturally connected with a particular likelihood (Poisson, Gamma, etc.). The conjugate prior has the same form as the likelihood, except that the role of the “data” and the “estimate” are switched, and the conjugate prior and the likelihood have different normalizing constants.

It is recommended that you specify a vague prior for tau as Gamma(0.001,0.001). This is the default when you write your model with a doodle. The first parameter of the distribution is labeled a and the second is labeled q. The mean of the distribution is given by aq and the variance by aq^2. We will use doodle. Open WinBUGS > Doodle > New… We have 20 values of the random variable x, so we need to set up a plate in our doodle.

Doodle Quick Help

Node Plate Edge Select Doodle > Write Code model { mean ~ dnorm(0,0.00001) # Prior on normal mean tau ~dgamma(0.01, 0.01) # Prior on normal precision sd <-sqrt(1/tau) # Variance = 1/precision for (i in 1:N) { x[i] ~ dnorm(mean,tau)} } Notice that because we use tau to define the variability, we calculate the standard deviation from tau, with the sd being a logical node. In the Code window add the data. list(N=20, x=c(98,160,136,128,130,114,123,134,128,107,123,125,129,132,154,115,126,132,136,130))

X[i]~dnorm(mean,tau) Select Model > Specification check model load data gen inits shape parameter of gamma tau too small – cannot sample. To overcome this, we can specify an initial value for tau—any reasonable value will do, even if it is wrong, the sampler will bring them back into the correct range after a few samples.

Recheck the model, load the data, and compile. Then select the list(tau=1) and click load inits. Then click gen inits to generate the other initial values (in this case, mean). Select Inference > Sample to bring up the Sample Monitor Tool. Type mean and set, then sd and select. Select Model > Update to bring up the Update Tool. Choose 1000 updates, then go back to the Sample Monitor Tool and select stats.

The summary statistics from one WinBUGS run shows a posterior mean for the mean node of 128.0. The posterior sd for the mean is 3.41. The 95% credible interval for the mean is (121.2, 134.9). This is compared with a frequentist result using the t distribution. To see this using R, type: >mean(x)+1.96*sd(x)/sqrt(length(x)) > [1] 134.1 >mean(x)-1.96*sd(x)/sqrt(length(x)) > [1] 121.9 The mean is 128.0 with a 95% confidence interval (121.9, 134.1). The posterior mean for the sd node is 14.49, this compares with a frequentist result of 13.9. Note the posterior credible limits (10.58, 20.46). Using frequentist approach, we would use the c-squared distribution.

The MC error tells you to what extent simulation error contributes to the uncertainty in the estimation of the mean. This can be reduced by generating additional samples. The simulation errors on the 95% credible interval values will be considerably larger because of the small numbers in the tails of the distribution. You should examine the trace of all your samples. To do this select the history button on the Sample Monitor Tool. If the samples are largely independent, it will look very jagged indicating little autocorrelation between samples. You can address the question of sample autocorrelation directly by selecting the auto cor button. Autocorrelation is not a problem in this example. If autocorrelation exists, you need to generate additional samples. The results from the WinBUGS Bayesian approach are very similar to the results from a frequentist approach. Why? This is a complicated analysis for a simple problem, but it forms the building blocks for more complex problems.

We continue practicing with WinBUGS. Here is an ordinary least squares regression example. The following data need to be fit by linear regression. They concern the fuel consumption (reponse variable, y) and weight (explanatory variable, x) of ten cars. Y[] X[] 3.4 5.5 3.8 5.9 9.1 6.5 2.2 3.3 2.6 3.6 2.9 4.6 2.0 2.9 2.7 3.6 1.9 3.1 3.4 4.9 The plot shows the data along with the OLS regression line. We can see that there is what appears to be a single influential point that may be an outlier.

The regression fit using a frequentist approach is described by the follow summary, as given in R.

Alternatively, we can set up a Bayesian regression model. Start with Doodle. Add a plate. index: i from: 1 up to: N Add nodes. On the plate. name: Y[i] type: stochastic density: dnorm mean: mu[i] precision tau name: mu[i] type: logical density: identity value: a+b*X[i] name: X[i] type: constant Off the plate. name: a type: stochastic density: dnorm mean: 0.0 precision 1.0E-6 name: b type: stochastic density: dnorm mean: 0.0 precision 1.0E-6

Add nodes. Off the plate. name: tau type: stochastic density: dgamma shape: 1.0E-3 scale: 1.0E-3 name: sd type: logical link: identity value: 1/sqrt(tau) Connect the nodes using edges to produce the following linear regression doodle. Select Doodle > Write Code, then add data and initial value for tau.

Use the Sample Monitor Tool to set the node values for a and b. The select Model > Update … to generate samples. Check for autocorrelation among the samples. What do you find? How do the results compare with the frequentist approach?

What about the outlier? It is certainly an influential point in the regression. A frequentist approach might eliminate it. Classical regression results with that point removed are:

In ordinary regression we assume normally distributed errors. If we think that there might be outliers, then it is better to assume an error distribution with longer tails than the normal. The t distribution is better, for example. We can do this easily in WinBUGS by replacing the normal distribution assumption on Y[] by a t distribution with some small number of degrees of freedom. The smaller the degrees of freedom, the longer the tails. This makes the regression robust to outliers. Run the sampler and generate 5000 samples. It is always a good idea to discard the first few hundred or so samples so that the final solution has not “memory” of it’s initial values (burn-in). In the Sample Monitor Tool dialog set beg to 1001 and end to 5000. Print out the statistics for the slope and intercept nodes.

Plot the trace and autcorrelation to check for correlation in the samples. Some weak autocorrelation in the samples. Examine the posterior density for the slope parameter. Compare it to the original model. Select coda to print out the successive sample values. You can cut/paste or export to another program.

With the t distribution, we get a very different result. The regression is now what we call resistant regression, meaning that it is not unduly influenced by outliers. It works because the likelihood now recognizes the possibility of finding an odd point in the tails of the distribution. You can experiment with different values of the degrees of freedom to see how the posterior distribution on the slope parameter changes. How do the results from the Bayesian model compare with those from the leave-it-out frequentist model? Discussion questions: How do you choose a value for the degrees of freedom in the t distribution? Would we want to assume this is also a parameter with a distribution? Model choice is part of the prior, so should it be chosen before we see the data? Next: Hierarchical models.

WinBUGS Examples