We recently installed the gputools package for R on the Flux cluster. gputools provides functions that enable offloading of a limited set of analysis procedures to a general purpose Graphical Programming Unit (GPU). The GPU has many processors of its own and some mathematical operations can be run in parallel there, often many times faster than they will run on the CPU. This post will show a couple of short examples of using gputools.

The examples will use some of the sample data sets included with the R distribution, so you should load the R library that includes those as well as gputools. Here is an example of calculating Euclidean distance, modified from Bobby Yuen’s Speeding up R with multicore and gputools page.

library(datasets)
library(gputools)
set.seed(5446)
p <- 20
X <- matrix(rnorm(2^p),ncol = 2^(p/2))

# Run dist on CPU and GPU to compare processing time
system.time(d <- dist(X))
system.time(gpud <- gpuDist(X))

When this was run on Flux using an older M2070 GPU, we got the following results.

> system.time(d  system.time(gpud <- gpuDist(X))
   user  system elapsed
  0.248   0.352   0.655

As you can see, there is significant speed-up using the GPU: it ran in about 3% of the time.

Here is another, slightly shortened, example from the gputools documentation, this time running a simple ANOVA model.

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
anova(lm.D9 <- gpuLm(weight ~ group))
summary(lm.D90 <- gpuLm(weight ~ group - 1)) # omitting intercept
summary(resid(lm.D9) - resid(lm.D90)) # - residuals almost identical

Here is an example, modified from the example given with the anscombe data included with the R datasetslibrary. The only change required to use the GPU was to change the line

  mods[[i]] <- lmi <- lm(ff, data = anscombe)

to

  mods[[i]] <- lmi <- gpuLm(ff, data = anscombe)

Here is the full example.

require(stats); require(graphics)
summary(anscombe)
##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- gpuLm(ff, data = anscombe)
  print(anova(lmi))
}
## See how close they are (numerically!)
sapply(mods, coef)
lapply(mods, function(fm) coef(summary(fm)))

# Set up the Postscript graphics device
postscript("anscombe.eps", height=4, width=4, horizontal=F)
## Now, do what you should have done in the first place: PLOTS
op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma =  c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13))
  abline(mods[[i]], col = "blue")
}
mtext("Anscombe's 4 Regression data sets", outer = TRUE, cex = 1.5)
par(op)

# Close the Postscript graphics device
dev.off()

If you put the above code into a file called anscombe.R, to run that in batch and save the results to anscombe.out, you would use the command

R CMD BATCH --vanilla anscombe.R anscombe.out