Blog Archives

Installing and Running JAGS on Mac OS 10.5.8

JAGS is an alternative to BUGS (WinBUGS or OpenBUGS) for conducting a Bayesian analysis. It stands for Just Another Gibbs Sampler, and like WinBUGS, it is essentially an MCMC machine that employs a Gibbs sampler so you don’t have to write your own for every analysis. JAGS code is very similar to the more popular BUGS code, so it is an easy transition. JAGS has the advantage of running on multiple platforms (Windows, Mac, Linux). It is also open source and written in C++, so it will likely see more continued development than the more well-established BUGS software. Unlike WinBUGS, JAGS has no user interface and you will not see it in your Programs/Applications folder. It has to be run from another program, most commonly R using the rjags package. R2jags is an R wrapper for JAGS and rjags that provides some additional features.

I have not yet updated my operating system or all of my software, and as a result I’ve had some difficulty installing and running JAGS/rjags. I finally got it working after two long days and thought I’d post my solution in case anyone finds themselves in the same situation. Hopefully when I do update to Snow Leopard in the next month I won’t have any problems just using the most up-to-date versions. For now, here is a solution using:

Mac OS 10.5.8
R 2.13.2
JAGS 2.2.0
rjags 2.2

1. Go to https://sourceforge.net/projects/mcmc-jags/files/ and download JAGSdist-2.2.0.dmg. Follow the normal install procedures.

2. From the same site, download the rjags_2.2.0-1.tar.gz file to your desktop.

3. Install the rjags package. I tried install.packages("/Users/Dan/Desktop/rjags_2.2.0-1.tar.gz", repos = NULL, type = "source") and it seemed to work, but when I typed library(rjags) I got the following error message:
Error: package 'rjags' is not installed for 'arch=x86_64'

4. If that happens, I found that installing from the Terminal with additional instructions worked like a charm. If like me you are not experienced with the Mac Terminal and command line entry, I will provide explicit instructions that I found:

  • Open Terminal (/Applications/Utilities/Terminal.app)
  • Navigate to where you downloaded the source package. In my case that was the desktop, so I typed “cd /Users/Dan/Desktop/” (without the quotes). The prompt should now show that directory.
  • Now that the Terminal knows where to find the file, have it tell R to install the package as a 64-bit version by typing the following: R --arch x86_64 CMD INSTALL rjags_2.2.0-1.tar.gz

5. Open R64 back up and rjags should be installed. Load the library with “library(rjags)” and it should work. At least it worked for me. Good luck!
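
A quick way to check the result of the steps above from inside R is sketched below. It uses only base R functions; the variable names (arch, rjags_ok) are my own, and require() is used because it returns FALSE instead of stopping when a package cannot be loaded.

```r
# Which architecture is this R session running under?
# R.version$arch is "x86_64" in a 64-bit session and "i386" in a 32-bit one.
arch <- R.version$arch
cat("R architecture:", arch, "\n")

# Try to load rjags. require() returns FALSE (rather than throwing an error)
# if the package is missing or was not installed for this architecture.
rjags_ok <- suppressWarnings(
  require("rjags", character.only = TRUE, quietly = TRUE)
)

if (rjags_ok) {
  cat("rjags loaded for arch =", arch, "\n")
} else {
  cat("rjags did not load for arch =", arch, "\n")
}
```

If the second message appears in R64 but not in the 32-bit R, the package was most likely built for only one architecture, which is the situation the Terminal install is meant to fix.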

GLMM Hell

I have started analyzing data from repeated counts of salamanders on 5 plots over 4 years. I am trying to develop a predictive model of salamander nighttime surface activity as a function of weather variables. The repeated counting leads to the need for Generalized Linear Mixed Models (GLMMs). Count data are often best described by a Poisson distribution rather than a normal one, hence the “generalized” term. Because the counts were repeated on the same plots, plot needs to be treated as a random effect, hence the “mixed” term. If the plot term were not included in this way, the model would treat all the counts as independent, but in reality counts on one plot over time are likely to be correlated, and that correlation needs to be accounted for to avoid pseudoreplication. So I am stuck with a GLMM. The problem with GLMMs in a frequentist statistical framework is that they are difficult to analyze. Bolker and colleagues give the best overview of the analysis process and its challenges in: Generalized Linear Mixed Models: A Practical Guide for Ecology and Evolution. They also have an online supplement to that paper that provides a worked example complete with R code using the lme4 package. I HIGHLY recommend everyone read Bolker’s paper if considering using GLMMs. Personally, I like the idea of analyzing GLMMs with Bayesian statistics rather than traditional frequentist stats. Below are a few emails that I’ve recently been exchanging with colleagues regarding GLMMs. Let me know what you think.
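
To make the model structure described above concrete, here is a minimal sketch of a Poisson GLMM with plot as a random intercept. The data are simulated, not my salamander counts; the column names (temp, rain, night) and all parameter values are made up for illustration. The fit uses glmer() from current versions of lme4 and is skipped if lme4 is not installed.

```r
set.seed(1)

# Hypothetical layout: 5 plots x 4 years x 10 survey nights per plot-year
dat <- expand.grid(night = 1:10, year = 1:4, plot = factor(1:5))
dat$temp <- rnorm(nrow(dat), mean = 15, sd = 4)        # made-up temperature
dat$rain <- rbinom(nrow(dat), size = 1, prob = 0.3)    # made-up rain indicator

# A plot-level random intercept makes counts from the same plot correlated
plot_effect <- rnorm(nlevels(dat$plot), mean = 0, sd = 0.5)
log_mu <- 0.5 + 0.05 * dat$temp + 0.8 * dat$rain +
  plot_effect[as.integer(dat$plot)]
dat$count <- rpois(nrow(dat), lambda = exp(log_mu))

# Fit the GLMM only if lme4 can be loaded
if (suppressWarnings(require("lme4", character.only = TRUE, quietly = TRUE))) {
  fit <- glmer(count ~ temp + rain + (1 | plot),
               data = dat, family = poisson)
  print(summary(fit))
}
```

The (1 | plot) term is what distinguishes this from an ordinary Poisson glm(): it lets each plot have its own baseline count. With only 5 plots the variance of that random effect is estimated from very little information, so warnings from the fit would not be surprising.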

Question About Selection of Correlated Predictor Variables and Model Selection:
 How much correlation among independent variables is too much in a GLMM? If I have correlation in the variables does it affect the interpretation or model selection?

Answer from a Statistician Friend:
0.8 and above is high and often one variable can be replaced by the other, and
both are not necessary in the model.

Below 0.7 typically both variables are needed for a good model fit.
I usually use stepAIC (from the MASS package in R) for model selection.

The difficulty comes in interpreting the regression coefficients: with correlation in the predictor variables, the variable that appears first
in the model statement usually gets the larger absolute value, whereas
the other variable has a smaller (in absolute value) coefficient.
Remember the interpretation of regression coefficients: the change
in the response per unit increase GIVEN ALL THE OTHER VARIABLES IN THE
MODEL.

If you want coefficients that represent “additive” contributions to the
variation in the response (regardless of the order in which predictors
appear in the model statement), and if you have considerable multicollinearity
you might want to consider doing a principal component regression with all
or perhaps with only a subgroup of similar predictor variables.

As with most issues in statistics, there is not a clear-cut hard-fact simple
answer. Life would be simpler if there were….
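
The two suggestions in this answer (check pairwise correlations, and consider a principal component regression for highly correlated predictors) can be sketched in base R as follows. The data are simulated and the variable names (temp, humidity, wind) are my own placeholders, not variables from either of our datasets; the response is fitted with a plain Poisson glm() to keep the example short.

```r
set.seed(42)
n <- 200

# Two made-up, strongly correlated predictors plus an unrelated one
temp     <- rnorm(n, mean = 15, sd = 4)
humidity <- 60 + 2 * temp + rnorm(n, sd = 3)
wind     <- rnorm(n, mean = 5, sd = 2)
count    <- rpois(n, lambda = exp(0.2 + 0.05 * temp + 0.01 * humidity))

dat <- data.frame(count, temp, humidity, wind)

# Step 1: pairwise correlations among the predictors
r <- cor(dat[, c("temp", "humidity", "wind")])
print(round(r, 2))

# Step 2: principal components of the correlated pair (centred and scaled)
pc <- prcomp(dat[, c("temp", "humidity")], center = TRUE, scale. = TRUE)
dat$pc1 <- pc$x[, 1]

# Step 3: regress on the first component instead of both raw variables
fit_pc <- glm(count ~ pc1 + wind, family = poisson, data = dat)
print(coef(summary(fit_pc)))
```

The cost of this approach is interpretability: the coefficient on pc1 describes a combined temperature-humidity gradient rather than either variable on its own.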

Question of GLMM Bayesian Approach:
Hey Dan – I’m using GLMM b/c I have a repeated-measures design, count data response (negative binomial distribution), etc. I’m finding admb in R is doing the job – and I read the article you mentioned a few months back, when I started considering GLMMs…

I have never worked with Bayesian stats and wouldn’t even know where to begin. Do you have any recommendations for overview reading, and can I analyze a repeated-measures design (i.e., is there a way to cope with random factors)?

My Response:
My data sound very similar to yours. I usually use lmer in the lme4 package. Right now I am essentially just copying the code in Bolker et al. 2009 from the online supplements to the TREE paper previously mentioned. I have never seen the admb package and will have to check it out. I’ve tried glmmPQL and glmmML but there are more examples for lmer and its S-PLUS predecessor. I am annoyed that in Zuur et al. “Mixed Effects Models and Extensions in Ecology with R” they don’t spend much time on model assumptions or model comparison. I feel like they show users how to do the analysis but not how to evaluate it. Pinheiro and Bates do a better job in “Mixed-Effects Models in S and S-Plus” but they focus on linear mixed models and non-linear mixed models and less on GLMMs. Plus the code is similar to R but differs enough that it can be challenging to use at times. The “SAS for Mixed Models” book is good but SAS isn’t free and the code isn’t as transparent to me. Plus it doesn’t have good graphics so I prefer R.

Anyway, Bayesian stats have their own can of worms but I find them more intuitively appealing and I like the transparency of the code when using WinBUGS (no Mac version) called from R. There are two very good, practical books to get started. McCarthy presents a good overview and introduction to Bayesian stats in “Bayesian Methods for Ecology” but the examples don’t get very advanced. Personally I recommend getting that from the library and reading the first few chapters. I would then buy Marc Kéry’s excellent book, “Introduction to WinBUGS for Ecologists.” It is very well written and has a wider range of examples that relate to many animal ecology studies. Clark and Gelfand have a decent modeling book with Bayesian analysis examples in R but it’s more ecosystem/environmentally oriented than animal ecology.

Bayesian analysis treats all factors sort of like random variables from population distributions. Therefore there is no need for an explicit random vs. fixed delineation. You get estimates and credibility intervals for all variables. You can essentially write the same GLMM and then analyze it in a Bayesian framework. The big difference is in the philosophy behind frequentist vs. Bayesian statistics. Bayesians use prior information (even noninformative priors contain information on the underlying distributions). Some scientists are opposed to this but for various reasons that I won’t go into now, I like it. Some people do want a sensitivity analysis to go along with a Bayesian analysis to determine the influence of the priors. I might go as far as to say that in the case of GLMM-type data Bayesian statistics are more sound (robust?) than frequentist methods but they differ significantly from a philosophical standpoint.
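
As a rough illustration of “writing the same GLMM and analyzing it in a Bayesian framework,” here is a sketch of a Poisson model with a plot-level intercept written in the BUGS language and run through JAGS via rjags (the WinBUGS model code would look much the same). The priors, the simulated data, and every variable name are assumptions chosen for the example, not recommendations; the sampler is only run if rjags and JAGS are installed.

```r
# Model description in the BUGS/JAGS language, stored as an R string
model_string <- "
model {
  for (i in 1:N) {
    count[i] ~ dpois(lambda[i])
    log(lambda[i]) <- beta0 + beta1 * temp[i] + u[plot[i]]
  }
  for (j in 1:J) {
    u[j] ~ dnorm(0, tau)      # second argument of dnorm is a precision
  }
  beta0 ~ dnorm(0, 0.001)     # vague priors (an assumption, not a default)
  beta1 ~ dnorm(0, 0.001)
  sigma ~ dunif(0, 10)
  tau <- pow(sigma, -2)
}
"

# Simulated data: 5 plots, 20 counts each, one standardized covariate
set.seed(7)
J <- 5
N <- 100
plot_id <- rep(1:J, each = N / J)
temp    <- as.numeric(scale(rnorm(N, mean = 15, sd = 4)))
u_true  <- rnorm(J, mean = 0, sd = 0.5)
count   <- rpois(N, lambda = exp(1 + 0.3 * temp + u_true[plot_id]))

jags_data <- list(count = count, temp = temp, plot = plot_id, N = N, J = J)

# Run the sampler only if rjags (and therefore JAGS) can be loaded
if (suppressWarnings(require("rjags", character.only = TRUE, quietly = TRUE))) {
  jm <- jags.model(textConnection(model_string), data = jags_data,
                   n.chains = 3, quiet = TRUE)
  update(jm, n.iter = 1000)                               # burn-in
  post <- coda.samples(jm, variable.names = c("beta0", "beta1", "sigma"),
                       n.iter = 5000)
  print(summary(post))
}
```

Note that the plot effects u[j] and the regression coefficients are all given distributions in the same way, which is the sense in which the fixed vs. random distinction becomes less explicit. Rerunning with different priors on sigma and the betas is one simple form of the sensitivity analysis mentioned above.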

Anyway, I hope that helps.

JAGS – Bayesian Analysis

JAGS is used for Bayesian analysis using MCMC and stands for Just Another Gibbs Sampler. It is an alternative to WinBUGS and can be accessed through R just like WinBUGS (via R2jags or rjags). Unlike WinBUGS, it runs on a Mac. The only problem is that most books include WinBUGS examples and not JAGS examples. However, much of the language is similar. I haven’t had time to try it out yet but plan to in the future. An updated version, JAGS 2.2.0, was just released if you want to check it out.

JAGS Wiki Guide

GLMM and R issues

I have been trying to run a Generalized Linear Mixed Model (GLMM) for some count data with repeated measures on sub-sampled sites and fixed effects at the site level with covariates at both the sub-plot and time levels.  Plus there are different numbers of sub-plots within each site and not all sub-plots are sampled the same number of times.  It’s quite the gnarly dataset.  I tried to use binomial-mixture models for the analysis to account for differences in detection probability but unfortunately I didn’t have a sufficient number of independent sites to differentiate between detection and abundance.  Plus the sampling may have violated assumptions of closure.  So, anyway, I am back to GLMM and the troubles with that will have to wait for another day.

I use Mac OS 10.5.8 (Leopard) to run R unless I need to run WinBUGS (via R2WinBUGS). Unfortunately, the R package “lme4” won’t load in R on my Mac. It works fine on my Windows shell. I have been looking up solutions. I think the problem is that I need a Fortran add-on for Xcode. Now the problem is that I don’t understand anything about compiling R or Fortran or any of the other things on this website. It does look like it might also be part of the solution (maybe?) to my problems with running WinBUGS through Wine on my Mac. So here is the solution I’ve found: http://r.research.att.com/tools/

This is the specific part that I hope will work:
GNU Fortran 4.2.4 for Mac OS X 10.5 (Leopard):
Download: gfortran-42-5577.pkg (for Xcode 3.1.4 only!)
This package adds GNU Fortran 4.2.4 to Apple’s Xcode 3.1.4 gcc 4.2 (build 5577) compilers on Mac OS X 10.5 (Leopard). It does NOT work on Snow Leopard. This binary has been built the Apple way with the gcc_42 (build 5577) sources (by adding the Fortran directories from gcc 4.2.4 release), so it features full Apple driver (i.e. all special flags work) and works directly with the gcc 4.2 system compiler. You have to install Xcode 3.1.4 first (from ADC). 
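
Before and after installing the Fortran package, a check along these lines can show whether anything changed. This is only a sketch in base R: it assumes the compiler is installed under the name gfortran and is on the PATH, and it does not prove that a missing compiler is the reason lme4 fails to load.

```r
# Sys.which() returns the full path to an executable, or "" if it is not found
gfortran_path <- Sys.which("gfortran")

if (nzchar(gfortran_path)) {
  cat("gfortran found at:", gfortran_path, "\n")
} else {
  cat("gfortran not found on the PATH\n")
}

# require() returns FALSE instead of stopping if lme4 cannot be loaded
lme4_ok <- suppressWarnings(
  require("lme4", character.only = TRUE, quietly = TRUE)
)
cat("lme4 loads:", lme4_ok, "\n")
```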

Another boring blog

I recently decided to create two blogs as outlets for my research.  The first (The Richness of Life) focuses more on the organisms I work with as an ecologist and my general interest as a student of natural history.  This blog on Quantitative Ecology stems from my recent obsessive frustration with analyzing various data sets.  I have a decent background in the design of ecological experiments but have recently been trying to increase my statistical fluency (see Ellison and Dennis 2010 – Frontiers in Ecology and the Environment).  While searching for information on coding in R and WinBUGS, I have utilized a variety of sources including forums and blogs where people have shared their experiences and deciphered cryptic error messages.  I also came across two articles on the benefits of blogging as an academic (here and here).  Without duplicating everything they wrote, I’ll say that my desire to blog about my research comes from a few different perspectives.  First, this is what I spend my time thinking about and it’s nice to share it with like-minded individuals.  Second, I hope that this could contribute to fun and productive collaborations.  Third, I hope to help people on their own (sometimes painful) journeys in the realm of experimental design and analysis (including statistics and inference).  Finally, I believe it will help me as a teacher if I practice articulating my thoughts and questions on these complex subjects.

I will start this first blog with a recommendation of one of my favorite books on data analysis that I’ve come across.  Introduction to WinBUGS for Ecologists: Bayesian approach to regression, ANOVA, mixed models and related analyses is an exceptional book for self-teaching and offers a nice introduction to using WinBUGS for analyzing ecological data in a Bayesian framework.
