Stan for Bayesian Analysis


Bayesian analysis has been growing in popularity among ecologists recently, largely due to accessible books such as Models for Ecological Data: An Introduction, Introduction to WinBUGS for Ecologists, and Bayesian Methods for Ecology. Most ecologists with limited programming background have chosen to implement Bayesian analyses using WinBUGS, OpenBUGS, or JAGS, all of which can be run through R packages. JAGS is the best of these, works on most operating systems, and is flexible enough for most Bayesian analyses in ecology. However, recently due to inherent inefficiencies in these programs, which all implement MCMC using a Gibbs sampler (version of the Metropolis-Hastings Algorithm), a group of statisticians and computer programmers got together to solve these problems. The result was Stan.

Stan uses a version of a Hamiltonian Monte Carlo, specifically the No U-Turn (NUTS) sampler. NUTS in combination with Stan being written in C++ makes it faster than the above-mentioned Gibbs samplers, particularly for complex problems. Models should converge faster and in cases when Gibbs samplers can’t converge at all, Stan might provide a solution. For those of us who don’t work in C++ there is a bit of a learning curve but there is a nice R package (rstan) for implementation, a great manual, and a very active users group.

I decided to go through Marc Kery’s excellent book, and run all of the examples using Stan via rstan for comparison. Apparently, I am not the only one. You can find an example from Chapter 7 (t-test) on Shige’s useful blog. I recently finished translating chapter 8 into Stan code. Running the model was relatively easy, but it took me a while to figure out how to do all the post-processing model assessment (e.g. plots, posterior predictive checks, etc.). Now that I’ve done it though, it should be fairly easy to repeat. You can find the code below for running the model for Chapter 8, exercise 3 below and the full code for the chapter and exercises as a word document here: Kery_Chp8_Stan

bm8_3 <- '
data{
int n;
vector[n] y;
vector[n] x;
}

parameters{ 
real alpha; // these will be monitored and saved
real beta;
real sigma;
}

model{
vector[n] mu; // mu is defined in the model so won't be reported

// Priors
alpha ~ normal(0, 100);
beta ~ normal(0, 100);
sigma ~ uniform(0, 30);

// Likelihood
mu y ~ normal(mu, sigma);
}
'
data_8_3 <- list(n = length(mmd.grass$year), y = mmd.grass$mmd, 
                 x = mmd.grass$year)

fit.8.3 <- stan(model_code = bm8_3, data = data_8_3,
  chains = 3,
  iter = 50000,
  warmup = 25000,
  thin = 1)

print(fit.8.3, dig = 3)

traceplot(fit.8.3)

7 thoughts on “Stan for Bayesian Analysis

  1. hi Dan,

    interesting post. I have heard of STAN, of course, but admit to being some kind of a parasite, in that I probably wait with using it till other people, more clever in coding and stats, have worked out a couple of examples that I find interesting for my work.

    Most importantly, from what I’ve heard, it is at present not possible to fit models with discrete random effect (or only with complications ?). Almost all models that I find really interesting and important for my work (site-occupancy, Nmix etc; see our BPA book) have discrete random effects.

    I’d be very interested to know whether you succeed in translating the last two chapters of my book into STAN and if so how.

    Kind regards — Marc

    PS: And thanks for the nice words about my blue bug book

    • Hi Marc,

      Thanks for the message. I wish I realized that STAN had issues with discrete random effects before I spent so much time figuring it out. Probably over 90% of the Bayesian analyses that I will do have discrete random effects. Similar to you, I primarily use Bayesian computation for hierarchical models to account for detection probability.

      After reading your message, I was hoping that better ways of dealing with discrete random effects would just be coming along in later versions of STAN. Unfortunately, after looking through the google user group history, I found that Bob Carpenter said these quantities don’t work well with Hamiltonian Monte Carlo. It looks like STAN will probably never be of use for types of models. Maybe if someone develops a variation of HMC that works better with these parameters, but I don’t even know if that’s possible. I suppose it’s back to JAGS for me. I will just have to work on field methods to increase the detection probability and consistency of detection of terrestrial amphibians. That seems to be the big challenge for implementing these models in some of my work.

      Best,
      Dan

  2. Hi Dan,

    Thank you for this post. I have been exploring the possibility of using stan to fit dynamic occupancy models, but Marc Kéry mentions that stan does not support discrete random effects. Do you know if there have been any changes recently in the code? I don’t use discrete random effects in my JAGS code, but not sure if stan will work with these types of problems.
    Thanks for the feedback!

    Jorge

    • Hi Jorge,

      I don’t believe Stan will ever support discrete random effects because they cause serious problems with the HMC (Hamiltonian Monte Carlo) jump/iteration process. HMC is the basis of Stan, so that is unlikely to change. It seems like it should work for dynamic occupancy models without discrete random effects, but I have never tried it. When I realized it wouldn’t work for those, I stopped using Stan because I use them too much. I’m apparently stuck with the Gibbs sampler. I’ve been fairly happy with the JAGS implementation via rjags in R, but sometimes I have to go back to WinBUGS or OpenBUGS for a specific model (or between the three programs I can figure out coding errors from the various messages).

      Let me know how things work out if you do use Stan.
      Good luck,
      Dan

  3. Pingback: 2013 in review | Daniel J. Hocking

  4. Pingback: Year in Review: 2014 | Daniel J. Hocking

  5. Right. We still don’t have plans to add discrete random effects. On the other hand, the effects can usually be marginalized out. I know Marc says it’s too complicated for most ecologists, but if you drop me or our mailing list an e-mail, I can help you formulate the models.

    You’ll see in the latest Stan manual that we show how to implement the CJS model in Stan. It can be extended to random effects.

Leave a comment