Blog Archives

Mapping Abundance in Streams in R

I haven’t worked with raster or spatial polygon data in R much before, but I want to create maps to show the results of a spatiotemporal model. Specifically, I want to depict the changes in abundance of fish in a stream network over time and space. I decided to play around with various packages for manipulating spatial data and then using base `plot` and `ggplot2` functions to make the maps. Rather than post the code and output here, it was easier to publish it to RPubs and link to it:

Fish abundance in a stream network

The scale, colors, projection, and background will have to be adjusted based on the purpose of the map, but hopefully the code and explanations will help people get started. This was intended as a learning tool for me, rather than strictly as a tutorial, but I figured I would share in case it proved useful to others.

Here are some links to additional information I found useful along the way:

Add a scale bar

Add a north arrow and scale bar

R GIS Tutorial

R CRS System by NCEAS

ggplot2 mapping in R by Zev Ross (he has tons of great info on his site!!!)

ggplot2 cheat sheet

rasterVis – haven’t tried it yet but looks very useful

Raster data in R by NEON


High Resolution Figures in R

As I was recently preparing a manuscript for PLOS ONE, I realized the default resolution of R and RStudio images is insufficient for publication. PLOS ONE requires 300 ppi images in TIFF or EPS (encapsulated postscript) format, but R exports plots at 72 ppi by default. I love RStudio but was disappointed to find that it has no option for exporting figures at high resolution.

PLOS ONE has extensive instructions for scaling, compressing, and converting image files to meet their standards. Unfortunately, there is no good way to go from low resolution to high resolution (i.e. rescaling in Photoshop), as my friend Liam Revell, author of phytools, pointed out with an entertaining illustration from PhD Comics (upper right panel). The point is that if the original image isn’t created at a high enough resolution, artificially increasing the resolution in Photoshop produces graininess, because Photoshop can only guess how to interpolate between the points. This might not be a big problem with simple plots created in R, because interpolating between points along a line shouldn’t be difficult, particularly when starting from a PDF.

Even if scaling up from a low resolution PDF would work, it would be better to have a direct solution in R.

[UPDATE: In the original post, I wrote about my trial and error when first trying this out. It was helpful for me but less helpful for others. Since this post gets a fair amount of traffic, I now have the solution first and the rest of the original post below that]

If you need a font not included in R, such as the Arial family required by a publisher like PLOS, the extrafont package is very useful. It takes a long time to run, but should only need to be run once (although you may have to rerun it after updating R).

library(extrafont)
font_import()   # imports fonts installed anywhere on your computer, most commonly
                # those from an MS Office install. It takes a LONG while, but only
                # needs to run once.
loadfonts(device = "postscript")   # then register the imported fonts with the postscript device

Now, to make a nice-looking plot: a high resolution postscript (EPS) file is best, but you may need to download another program to view it on your computer. TIFF is good for raster data, including photos and color gradients. PDF should be good for line- and point-based plots. JPEG is not good for high resolution figures because of compression and detail loss, but it is easy to view and use in presentations, and PNG is a lower-quality format useful for websites. Here are some high resolution figure examples. They all start by telling R to open a connection (device) to a PDF, TIFF, or EPS file rather than printing to the default R/RStudio plot device. Then you make the plot and close the device with dev.off(), at which point the file is saved. The plot won’t show up on the screen but will be saved to your working directory.

x <- 1:20
y <- x * 2

setwd('/Users/.../Folder/') # place to save the file - can be overridden by
                            # giving a full path in the file = "" argument
                            # of the functions below

pdf(file = "FileName.pdf", width = 12, height = 17, family = "Helvetica") # inches; defaults to 7 x 7
plot(x, y, type = "l")
dev.off() # close the device to write the file

# Alternatives: open one of these instead of pdf(), then plot and dev.off() as above
postscript("FileName.eps", width = 12, height = 17, horizontal = FALSE, 
           onefile = FALSE, paper = "special", colormodel = "cmyk", 
           family = "Courier")

bitmap("FileName.tiff", height = 12, width = 17, units = 'cm', 
       type = "tifflzw", res = 300)

tiff("FileName.tiff", height = 12, width = 17, units = 'cm', 
     compression = "lzw", res = 300)


It took some time to figure out but here are some trials and the ultimate solution I came up with:

x <- 1:100
y <- 1:100
tiff("Plot1.tiff")
plot(x, y) # Make plot
dev.off()

Nothing shows up on the screen in this case. However, because the tiff device was set up before making the plot, the resulting TIFF file is saved to your working directory when the device is closed, and is 924 KB, 72 ppi, 480 x 480 pixels.

To increase the resolution I tried the following:

tiff("Plot2.tiff", res = 300)
plot(x, y) # Make plot
dev.off()

but in RStudio the plot could not be printed, and hence not saved, because it was too large for the print area. Therefore, I had to open R directly and run the code. Interestingly, a blank TIFF file was created, the same size as Plot1.tiff. This is where I got hung up for a while. I eventually found that R can’t determine the other image parameters when the resolution is changed (the default size becomes too large to print), so they have to be specified directly:

tiff("Plot3.tiff", width = 4, height = 4, units = 'in', res = 300)
plot(x, y) # Make plot
dev.off()

This creates a TIFF file that is 5,800 KB, 300 ppi, 4 inches by 4 inches. Surprisingly, Preview on a Mac still reports it as only 72 ppi. The much larger file size, though, indicates that it is actually 300 ppi. I ran the same code but explicitly specified res = 72 and the file was only 334 KB, further suggesting that Preview is incorrect and the file really is 300 ppi. I played with the various compressions: lzw and none gave the same size, while rle resulted in a larger file (less compression). That seems odd again.
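The file sizes make more sense once you work out the pixel dimensions, since pixels = inches x resolution. A quick sanity check on the 4 x 4 inch example:

```r
in_side <- 4
px_300 <- in_side * 300   # pixels per side at 300 ppi -> 1200
px_72  <- in_side * 72    # pixels per side at 72 ppi  -> 288
# Uncompressed pixel counts differ by (300/72)^2, roughly 17x
ratio <- (px_300^2) / (px_72^2)
```

So the 300 ppi version is a 1200 x 1200 pixel image, which is why its file is so much larger than the 288 x 288 pixel version at res = 72.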

Finally, I tried using the bitmap function to create a TIFF:

bitmap("Plot7.tiff", height = 4, width = 4, units = 'in', type = "tifflzw", res = 300)
plot(x, y)
dev.off()

Interestingly, this file is only 9 KB but is listed as 300 dpi, 1200 x 1200 pixels. I’m really not sure why these functions don’t seem to be working as smoothly as expected but hopefully this can help get high resolution images directly from R for publication. I plan to use the bitmap function in the future to create high resolution TIFF files for publication. This is what is desired by outlets such as PLOS ONE. It’s easier than dealing with postscript. I also don’t know if EPS files from R or RStudio are created with LaTeX. I know that can be a problem for PLOS ONE.

[UPDATE: Here is the code for my final figure for PLOS ONE with both postscript and tiff options plus it uses the extrafonts package to allow Arial fonts in postscript figures as required by PLOS ONE]

library(extrafont)
font_import() # imports fonts installed anywhere on your computer,
# most commonly those from an MS Office install.
# It takes a long while but only needs to run once.

# Open ONE of the following two devices (TIFF or EPS), not both at once
bitmap("CummulativeCaptures.tiff", height = 12, width = 17, 
       units = 'cm', type = "tifflzw", res = 300)

postscript("CummulativeCaptures.eps", width = 8, height = 8, 
           horizontal = FALSE, onefile = FALSE, paper = "special", 
           colormodel = "cmyk", family = "Arial")

par(mar = c(3.5, 3.5, 1, 1), mgp = c(2, 0.7, 0), tck = -0.01)
plot(refmCumm$date, refmCumm$cumm, type = "n", xaxt = "n",
     xlab = "Date",
     ylab = "Cumulative number of salamanders per plot",
     xlim = c(r[1], r[2]),
     ylim = ylims)
axis.POSIXct(1, at = seq(r[1], r[2], by = "year"), format = "%b %Y")
lines(refmCumm$date, refmCumm$cumm, lty = 1, pch = 19)
lines(depmCumm$date, depmCumm$cumm, lty = 2, pch = 24)
arrows(refmCumm$date, refmCumm$cumm + refmCumm$se, refmCumm$date,
       refmCumm$cumm - refmCumm$se, angle = 90, code = 3, length = 0)
arrows(depmCumm$date, depmCumm$cumm + depmCumm$se, depmCumm$date,
       depmCumm$cumm - depmCumm$se, angle = 90, code = 3, length = 0)
dev.off() # close the device to save the figure

EDIT: Just found a nice blog post with recommendations on device outputs on Revolutions here.

Below is all the code that includes comparison of sizes with PDF, PNG, JPEG, and EPS files as well.

tiff("Plot1.tiff")
plot(x, y) # Make plot
dev.off()

tiff("Plot2.tiff", res = 300)
plot(x, y) # Make plot
dev.off()

tiff("Plot3.tiff", width = 4, height = 4, units = 'in', res = 300)
plot(x, y) # Make plot
dev.off()

tiff("Plot3.72.tiff", width = 4, height = 4, units = 'in', res = 72)
plot(x, y) # Make plot
dev.off()

tiff("Plot4.tiff", width = 4, height = 4, units = 'in', res = 300, compression = 'lzw')
plot(x, y) # Make plot
dev.off()

tiff("Plot5.tiff", width = 4, height = 4, units = 'in', res = 300, compression = 'none')
plot(x, y) # Make plot
dev.off()

tiff("Plot6.tiff", width = 4, height = 4, units = 'in', res = 300, compression = 'rle')
plot(x, y) # Make plot
dev.off()

bitmap("Plot7.tiff", height = 4, width = 4, units = 'in', type = "tifflzw", res = 300)
plot(x, y)
dev.off()

bitmap("Plot8.tiff", height = 480, width = 480, type = "tifflzw", res = 300)
plot(x, y)
dev.off()

jpeg("Plot3.jpeg", width = 4, height = 4, units = 'in', res = 300)
plot(x, y) # Make plot
dev.off()

tiff("Plot4b.tiff", width = 4, height = 4, pointsize = 1/300, units = 'in', res = 300)
plot(x, y) # Make plot
dev.off()

png("Plot3.png", width = 4, height = 4, units = 'in', res = 300)
plot(x, y) # Make plot
dev.off()

pdf("Plot3.pdf", width = 4, height = 4)
plot(x, y) # Make plot
dev.off()

postscript("Plot3.eps", width = 480, height = 480)
plot(x, y) # Make plot
dev.off()

Plotting 95% Confidence Bands in R

I am comparing estimates from subject-specific GLMMs and population-average GEE models as part of a publication I am working on. As part of this, I want to visualize predictions of each type of model including 95% confidence bands.

First I had to make a new data set for prediction. I could have compared fitted values with confidence intervals, but I am specifically interested in comparing predictions for particular variables while holding others constant. For example, soil temperature is especially important for salamanders, so I am interested in the predicted effects of soil temperature from the different models. I used the expand.grid and model.matrix functions in R to generate a new data set where soil temperature varied from 0 to 30 C. The other variables were held constant at their mean levels during the study. Because of the nature of the contrasts argument in the model.matrix function, I had to include more than one level of the factor “season”. I then removed all seasons except spring. In effect I am asking: what is the effect of soil temperature on salamander activity during the spring when the other conditions are held constant (e.g. windspeed = 1.0 m/s, rain in the past 24 hours at its study mean)? This code is based on code from Ben Bolker.

# Compare effects of SoilT with 95% CIs
newdat.soil <- expand.grid(
  SoilT = seq(0, 30, 1),
  RainAmt24 = mean(RainAmt24),
  RH = mean(RH),
  windspeed = mean(windspeed),
  season = c("spring", "summer", "fall"),
  droughtdays = mean(droughtdays),
  count = 0
)
newdat.soil$SoilT2 <- newdat.soil$SoilT^2

# Spring only
newdat.soil.spring <- newdat.soil[newdat.soil$season == 'spring', ]

mm <- model.matrix(terms(glmm1), newdat.soil)
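The same expand.grid/model.matrix pattern on a self-contained toy model (hypothetical variable names, not the salamander data; note the placeholder response, which plays the role of count = 0 above):

```r
set.seed(42)
# Toy data and model standing in for glmm1
toy <- data.frame(temp = runif(50, 0, 30),
                  season = factor(rep(c("spring", "summer"), 25)))
toy$y <- rnorm(50, mean = 0.2 * toy$temp)
fit <- lm(y ~ temp + season, data = toy)

# Prediction grid: temp varies, season is crossed, response is a placeholder
newdat <- expand.grid(temp = seq(0, 30, 1),
                      season = c("spring", "summer"),
                      y = 0)
mm <- model.matrix(terms(fit), newdat)  # design matrix for the grid
newdat$pred <- as.vector(mm %*% coef(fit))
dim(mm)  # 62 x 3: 31 temps x 2 seasons; intercept + temp + seasonsummer
```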

Next I calculated the 95% confidence intervals for both the GLMM and GEE models. For the GLMM, plo and phi are the lower and upper confidence limits for the fixed effects assuming zero effect of the random sites; tlo and thi additionally account for the uncertainty in the random effects.

newdat.soil$count <- mm %*% fixef(glmm1)
pvar1 <- diag(mm %*% tcrossprod(vcov(glmm1), mm))
tvar1 <- pvar1 + VarCorr(glmm1)$plot[1]  # add the site-level (plot) variance
newdat.soil <- data.frame(
  newdat.soil
  , plo = newdat.soil$count - 2*sqrt(pvar1)
  , phi = newdat.soil$count + 2*sqrt(pvar1)
  , tlo = newdat.soil$count - 2*sqrt(tvar1)
  , thi = newdat.soil$count + 2*sqrt(tvar1)
)

mm.geeEX <- model.matrix(terms(geeEX), newdat.soil)
newdat.soil$count.gee <- mm.geeEX %*% coef(geeEX)
tvar1.gee <- diag(mm.geeEX %*% tcrossprod(geeEX$geese$vbeta, mm.geeEX))
newdat.soil <- data.frame(
  newdat.soil
  , tlo.gee = newdat.soil$count.gee - 2*sqrt(tvar1.gee)
  , thi.gee = newdat.soil$count.gee + 2*sqrt(tvar1.gee)
)

The standard errors of the fixed effects are larger in the GEE model than in the GLMM, but when the variation associated with the random effects is accounted for, the uncertainty (95% CI) around the estimates is greater in the GLMM. This is especially evident where the estimated values are large, since predictions are exponentiated back to the original scale. This can be seen in the plots below.
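A toy calculation illustrates the point (the numbers here are made up): adding the random-effect variance to the fixed-effect variance widens the interval on the link scale, and exponentiating back to the count scale stretches that widening most, in absolute terms, where the estimate is large.

```r
est  <- c(0.5, 2.5)  # small and large estimates on the log (link) scale
pvar <- 0.04         # fixed-effects-only variance
tvar <- pvar + 0.25  # plus a site-level random-effect variance
# Interval widths after back-transforming with exp()
w_fixed <- exp(est + 2 * sqrt(pvar)) - exp(est - 2 * sqrt(pvar))
w_total <- exp(est + 2 * sqrt(tvar)) - exp(est - 2 * sqrt(tvar))
w_total / w_fixed  # ~3x wider at both estimates on a relative basis,
                   # but the absolute widening is far larger at the big estimate
```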

Although this plot does the job, it isn’t an efficient use of space, nor is it easy to compare exactly where the different lines fall. It would be nice to plot everything on one set of axes. The only trouble is that the lines could be difficult to distinguish using only solid and dashed/dotted styles. To help with this, I combined the plots but added color and shading using the polygon function. The code and plot are below.

plot(newdat.soil.spring$SoilT, exp(newdat.soil.spring$count.gee),
     xlab = "Soil temperature (C)",
     ylab = 'Predicted salamander observations',
     type = 'l',
     ylim = c(0, 25))
polygon(c(newdat.soil.spring$SoilT
          , rev(newdat.soil.spring$SoilT))
        , c(exp(newdat.soil.spring$thi.gee)
          , rev(exp(newdat.soil.spring$tlo.gee)))
        , col = 'grey'
        , border = NA)
lines(newdat.soil.spring$SoilT, exp(newdat.soil.spring$thi.gee),
type = 'l',
lty = 2)
lines(newdat.soil.spring$SoilT, exp(newdat.soil.spring$tlo.gee),
type = 'l',
lty = 2)
lines(newdat.soil.spring$SoilT, exp(newdat.soil.spring$count.gee),
type = 'l',
lty = 1,
col = 2)
lines(newdat.soil.spring$SoilT, exp(newdat.soil.spring$count),
col = 1)
lines(newdat.soil.spring$SoilT, exp(newdat.soil.spring$thi),
type = 'l',
lty = 2)
lines(newdat.soil.spring$SoilT, exp(newdat.soil.spring$tlo),
type = 'l',
lty = 2)

GLMM vs GEE plot with 95% confidence intervals

Now you can directly compare the results of the GLMM and GEE models. The predicted (population-averaged) values for the GEE are represented by the red line, while the average from the GLMM (random effects = 0, just fixed effects) is represented by the solid black line. The dashed lines are the 95% confidence intervals for the GLMM, and the shaded area is the 95% confidence envelope for the GEE model. As you can see, the GEE has much higher confidence in its prediction of soil temperature effects on salamander surface activity than the GLMM. This would not be apparent without visualizing the predictions with confidence intervals, because the standard errors of the fixed effects were lower in the GLMM than in the GEE. The GEE SEs include the site-level (random effect) variation, while the GLMM SEs of the covariates do not; the latter are interpreted as the effect of a one-unit change in X on Y at a given site.

Plotting grouped data vs time with error bars in R

This is my first blog since joining R-bloggers. I’m quite excited to be part of this group and apologize if I bore any experienced R users with my basic blogs for learning R or offend programmers with my inefficient, sloppy coding. Hopefully writing for this most excellent community will help improve my R skills while helping other novice R users.

I have a dataset from my dissertation research where I repeatedly counted salamanders from some plots and removed salamanders from other plots. I was interested in plotting the captures over time (sensu Hairston 1987). As with all statistics and graphs in R, there are a variety of ways to create the same or similar output. One challenge I always have is dealing with dates. Another is plotting error bars. Now I had the challenge of plotting the average captures per night or month ± SE for two groups (reference and depletion plots – 5 of each on each night) vs. time. This is the type of thing that could be done in 5 minutes on graph paper by hand, but it took me a while in R and I’m still tweaking various plots. Below I explore a variety of ways to handle and plot these data. I hope it’s helpful for others. Chime in with comments if you have any suggestions or similar experiences.
> ####################################################################
> # Summary of Count and Removal Data
> # Dissertation Project 2011
> # October 25, 2011
> # Daniel J. Hocking
> ####################################################################
> Data <- read.table('/Users/Dan/…/AllCountsR.txt', header = TRUE, na.strings = "NA")
> str(Data)
'data.frame':            910 obs. of  32 variables:
 $ dates            : Factor w/ 91 levels "04/06/09","04/13/11",..: 12 14 15 16 17 18 19 21 22 23 …
You’ll notice that when importing a text file created in Excel with the default date format, R treats the date variable as a Factor in the data frame. We need to convert it to a date class that R can recognize. Two such built-in options are as.Date and as.POSIXct. The latter is the more common format and the one I chose to use (the two are very similar but not fully interchangeable). To get the data into POSIXct format in this case, I use the strptime function as seen below. I also create a few new columns of month and year in case they become useful for aggregating or summarizing the data later.
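For example, a minimal sketch of that conversion with made-up dates in the same m/d/y format:

```r
d <- factor(c("05/17/08", "05/18/08"))            # how read.table imports them
dates <- as.POSIXct(strptime(as.character(d), "%m/%d/%y"))
format(dates[1], "%Y-%m-%d")  # "2008-05-17"
```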
> dates <- strptime(as.character(Data$dates), "%m/%d/%y")  # change date from Excel storage to internal R format
> dim(Data)
[1] 910  32
> Data = Data[,2:32]               # remove the original dates factor column
> Data = data.frame(date = dates, Data)      #this is now the date in useful fashion
> Data$mo <- strftime(Data$date, "%m")
> Data$mon <- strftime(Data$date, "%b")
> Data$yr <- strftime(Data$date, "%Y")
> monyr <- function(x)
+ {
+     x <- as.POSIXlt(x)
+     x$mday <- 1
+     as.POSIXct(x)
+ }
> Data$mo.yr <- monyr(Data$date)
> str(Data)
'data.frame':            910 obs. of  36 variables:
 $ date             : POSIXct, format: "2008-05-17" "2008-05-18" …
 $ mo               : chr  "05" "05" "05" "05" …
 $ mon              : chr  "May" "May" "May" "May" …
 $ yr               : chr  "2008" "2008" "2008" "2008" …
 $ mo.yr            : POSIXct, format: "2008-05-01 00:00:00" "2008-05-01 00:00:00" …
As you can see the date is now in the internal R date form of POSIXct (YYYY-MM-DD).
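The monyr() function defined above works by setting mday (day of month) to 1, so every date collapses to the first of its month. A quick standalone check of the same trick:

```r
monyr <- function(x) {
  x <- as.POSIXlt(x)
  x$mday <- 1            # force day-of-month to 1
  as.POSIXct(x)
}
format(monyr(as.POSIXct("2008-05-17")), "%Y-%m-%d")  # "2008-05-01"
```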
Now I use a custom function to summarize each night of counts and removals. I forgot offhand how to call a custom function stored elsewhere, so I lazily pasted it into my script. I found this nice little function online, but I apologize to the author because I don’t remember where I found it.
> library(ggplot2)
> library(doBy)
> ## Summarizes data.
> ## Gives count, mean, standard deviation, standard error of the mean, and confidence interval (default 95%).
> ## If there are within-subject variables, calculate adjusted values using method from Morey (2008).
> ##   measurevar: the name of a column that contains the variable to be summarized
> ##   groupvars: a vector containing names of columns that contain grouping variables
> ##   na.rm: a boolean that indicates whether to ignore NAs
> ##   conf.interval: the percent range of the confidence interval (default is 95%)
> summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE, conf.interval=.95) {
+     require(doBy)
+     # New version of length which can handle NAs: if na.rm == TRUE, don't count them
+     length2 <- function (x, na.rm=FALSE) {
+         if (na.rm) sum(!is.na(x))
+         else       length(x)
+     }
+     # Collapse the data
+     formula <- as.formula(paste(measurevar, paste(groupvars, collapse=” + “), sep=” ~ “))
+     datac <- summaryBy(formula, data=data, FUN=c(length2,mean,sd), na.rm=na.rm)
+     # Rename columns
+     names(datac)[ names(datac) == paste(measurevar, “.mean”,    sep=””) ] <- measurevar
+     names(datac)[ names(datac) == paste(measurevar, “.sd”,      sep=””) ] <- “sd”
+     names(datac)[ names(datac) == paste(measurevar, “.length2″, sep=””) ] <- “N”
+     datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean
+     # Confidence interval multiplier for standard error
+     # Calculate t-statistic for confidence interval:
+     # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
+     ciMult <- qt(conf.interval/2 + .5, datac$N-1)
+     datac$ci <- datac$se * ciMult
+     return(datac)
+ }
> # summarySE provides the standard deviation, standard error of the mean, and a (default 95%) confidence interval
> gCount <- summarySE(Count, measurevar="count", groupvars=c("date","trt"))
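If you would rather avoid the doBy dependency, the same mean, SE, and CI summary can be sketched in base R with aggregate (toy data with hypothetical column names):

```r
set.seed(1)
toy <- data.frame(trt = rep(c("Reference", "Depletion"), each = 10),
                  count = rpois(20, 5))
# Group sizes, means, and SDs, one aggregate() call each
n <- aggregate(count ~ trt, data = toy, FUN = length)$count
m <- aggregate(count ~ trt, data = toy, FUN = mean)
s <- aggregate(count ~ trt, data = toy, FUN = sd)$count
summ <- data.frame(trt = m$trt, N = n, count = m$count,
                   sd = s, se = s / sqrt(n))
summ$ci <- summ$se * qt(0.975, summ$N - 1)  # 95% CI half-width
summ
```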
Now I’m ready to plot. I’ll start with a line graph of the two treatment plot types.
> ### Line Plot ###
> # Ref: Quick-R
> # convert factor to numeric for convenience
> gCount$trtcode <- as.numeric(gCount$trt)
> ntrt <- max(gCount$trtcode)
> # ranges for x and y axes
> xrange <- range(gCount$date)
> yrange <- range(gCount$count)
> # Set up blank plot
> plot(gCount$date, gCount$count, type = "n",
+      xlab = "Date",
+      ylab = "Mean number of salamanders per night")
> # add lines
> color <- seq(1:ntrt)
> line <- seq(1:ntrt)
> for (i in 1:ntrt){
+   Treatment <- subset(gCount, gCount$trtcode==i)
+   lines(Treatment$date, Treatment$count, col = color[i])#, lty = line[i])
+ }

As you can see this is not attractive, but it does show that I generally caught more salamanders in the reference (red line) plots. It also makes apparent that a line graph is probably a bit messy for these data, and might even be a bit misleading because data were not collected continuously or at even intervals (collection was weather and season dependent). So, I tried a plot of just the points using the scatterplot function from the “car” package.
> ### package car scatterplot by groups ###
> library(car)
> # Plot
> scatterplot(count ~ date + trt, data = gCount,
+             smooth = FALSE, grid = FALSE, reg.line = FALSE,
+    xlab="Date", ylab="Mean number of salamanders per night")

This was nice because it was very simple to code. It includes points from every night but I would still like to summarize it more. Before I get to that, I would like to try having breaks between the years. The lattice package should be useful for this.
> library(lattice)
> # Add year, month, and day to dataframe
> chardate <- as.character(gCount$date)
> splitdate <- strsplit(chardate, split = "-")
> gCount$year <- as.numeric(unlist(lapply(splitdate, "[", 1)))
> gCount$month <- as.numeric(unlist(lapply(splitdate, "[", 2)))
> gCount$day <- as.numeric(unlist(lapply(splitdate, "[", 3)))
> # Plot
> xyplot(count ~ trt + date | year,
+ data = gCount,
+ ylab="Daily salamander captures", xlab="date",
+ pch = seq(1:ntrt),
+ scales=list(x=list(alternating=c(1, 1, 1))),
+ between=list(y=1),
+ par.strip.text=list(cex=0.7),
+ par.settings=list(axis.text=list(cex=0.7)))

Obviously there is a problem with this. I am not getting proper overlaying of the two treatments. I tried adjusting the equation (e.g. count ~ month | year*trt), but nothing was that enticing and I decided to go back to other plotting functions. The lattice package is great for trellis plots and visualizing mixed effects models.
I now decided to summarize the data by month rather than by day and add standard error bars. This goes back to using the base plot function.
> ### Line Plot ###
> # summarySE provides the standard deviation, standard error of the mean, and a (default 95%) confidence interval
> mCount <- summarySE(Count, measurevar="count", groupvars=c("mo.yr","trt"))
> refmCount <- subset(mCount, mCount$trt == "Reference")
> depmCount <- subset(mCount, mCount$trt == "Depletion")
> daterange=c(as.POSIXct(min(mCount$mo.yr)),as.POSIXct(max(mCount$mo.yr)))
> # determine the lowest and highest months
> ylims <- c(0, max(mCount$count + mCount$se))
> r <- as.POSIXct(range(refmCount$mo.yr), "month")
> plot(refmCount$mo.yr, refmCount$count, type = "n", xaxt = "n",
+      xlab = "Date",
+      ylab = "Mean number of salamanders per night",
+      xlim = c(r[1], r[2]),
+      ylim = ylims)
> axis.POSIXct(1, at = seq(r[1], r[2], by = "month"), format = "%b")
> points(refmCount$mo.yr, refmCount$count, type = "p", pch = 19)
> points(depmCount$mo.yr, depmCount$count, type = "p", pch = 24)
> arrows(refmCount$mo.yr, refmCount$count+refmCount$se, refmCount$mo.yr, refmCount$count-refmCount$se, angle=90, code=3, length=0)
> arrows(depmCount$mo.yr, depmCount$count+depmCount$se, depmCount$mo.yr, depmCount$count-depmCount$se, angle=90, code=3, length=0)

Now that’s a much better visualization of the data, and that’s the whole goal of a figure for publication. The only thing I might change is plotting by year with Month-Year labels (format = "%b %Y"). I might add a legend, but with only two treatments I might just include the info in the figure caption.
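The %b %Y labels mentioned above come out like this (a small standalone check; month names depend on your locale):

```r
# Month-start sequence formatted as "Month Year" axis labels
r <- as.POSIXct(c("2008-05-01", "2008-08-01"))
labs <- format(seq(r[1], r[2], by = "month"), "%b %Y")
labs  # e.g. "May 2008" "Jun 2008" "Jul 2008" "Aug 2008" in an English locale
```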
Although that is probably going to be my final product for my current purposes, I wanted to explore a few other graphing options for visualizing this data. The first is to use box plots. I use the add = TRUE option to add a second group after subsetting the data.
> ### Boxplot ###
> #    as.POSIXlt(date)$mon     #gives the months in numeric order mod 12 with January = 0 and December = 11
> refboxplot <- boxplot(count ~ date, data = Count, subset = trt == "Reference",
+                       ylab = "Mean number of salamanders per night",
+                       xlab = "Date")   # show the graph and save the data
> depboxplot <- boxplot(count ~ date, data = Count, subset = trt == "Depletion", col = 2, add = TRUE)

Clearly this is a mess and not useful. But you can imagine that with some work and summarizing by month or season it could be a useful and efficient way to present the data. Next I tried the popular package ggplot2.
> ### ggplot ###
> library(ggplot2)
> ggplot(data = gCount, aes(x = date, y = count, group = trt)) +
+     #geom_point(aes(shape = factor(trt))) +
+     geom_point(aes(colour = factor(trt), shape = factor(trt)), size = 3) +
+     #geom_line() +
+     geom_errorbar(aes(ymin=count-se, ymax=count+se), width=.1) +
+     #scale_shape_manual(values=c(24,21)) + # e.g. fillable triangle and fillable circle
+     #scale_fill_manual(values=c("white","black")) + # e.g. white and black fills
+     xlab("Date") +
+     ylab("Mean number of salamander captures per night") +
+     scale_colour_hue(name="Treatment", # legend label
+                      l=40) +           # use darker colors, lightness = 40
+     theme_bw() + # black-and-white theme rather than grey (do this before other theme changes, or it overrides them)
+     opts(legend.position=c(.2, .9), # position legend inside plot; must go after theme_bw
+           panel.grid.major = theme_blank(), # switch off major gridlines
+           panel.grid.minor = theme_blank(), # switch off minor gridlines
+          legend.title = theme_blank(), # switch off the legend title
+          legend.key = theme_blank()) # switch off the rectangle around legend symbols
+ # Note: opts() and theme_blank() are from older ggplot2; current versions use theme() and element_blank()
This plot could work with some fine tuning, especially with the legend(s) but you get the idea. It wasn’t as easy for me as the plot function but ggplot is quite versatile and probably a good package to have in your back pocket for complicated graphing.
Next up was the gplots package for the plotCI function.
> library(gplots)
> plotCI(
+   x = refmCount$mo.yr,
+   y = refmCount$count,
+   uiw = refmCount$se, # error bar length (this much above and below the point)
+   pch = 24, # symbol (plotting character) type: see help(pch); 24 = filled triangle point-up
+   pt.bg = "white", # fill colour for symbol
+   cex = 1.0, # symbol size multiplier
+   lty = 1, # error bar line type: see help(par)
+   type = "p", # p=points, l=lines, b=both, o=overplotted points/lines; see help(plot.default)
+   gap = 0, # distance from symbol to error bar
+   sfrac = 0.005, # width of error bar as proportion of x plotting region (default 0.01)
+   xlab = "Year", # x axis label
+   ylab = "Mean number of salamanders per night",
+   las = 1, # axis labels horizontal (default is 0, always parallel to axis)
+   font.lab = 1, # 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol
+   xaxt = "n") # don't print x-axis
> axis.POSIXct(1, at = seq(r[1], r[2], by = "year"), format = "%b %Y") # label the x axis by month-years
> plotCI(
+   x = depmCount$mo.yr,
+   y = depmCount$count,
+   uiw = depmCount$se, # error bar length
+   pch = 21, # 21 = circle
+   pt.bg = "grey", # fill colour for symbol
+   cex = 1.0,
+   lty = 1,
+   type = "p",
+   gap = 0,
+   sfrac = 0.005,
+   xaxt = "n",
+   add = TRUE # ADD this plot to the previous one
+ )

Now this is a nice figure – I must say I like it. It is very similar to the standard plot code and graph, but it was a little easier to add the error bars.
So that’s it, what a fun weekend I had! Let me know what you think or if you have any suggestions. I’m new to this and love to learn new ways of coding things in R.
