Availability

R is a general-purpose statistical programming environment designed to enable a high degree of interactivity with your data. R can also be run in a batch mode that is suitable for use on the Flux cluster. You can see the available versions of R by running

$ module spider R

A specific version can be loaded by specifying the name and version number on the load request.

$ module load R/3.3.0

Running R interactively

You should only run R interactively on the login hosts to test syntax against small data files. If you need to run R interactively on larger data sets, you should submit an interactive job (see our documentation on submitting an interactive job for details).

Here is a short example of starting and stopping R interactively.

$ module load R/3.3.0
$ R

R version 3.3.0 (2016-05-03) -- "Supposedly Educational"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
    [ . . . ]
Type 'q()' to quit R.

> q()
Save workspace image? [y/n/c]: n

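As you work, R keeps the objects in your current session, such as data and the functions you have defined, in a workspace, and it offers to save that workspace when you quit. You should generally not save your workspace on exiting R, because a saved workspace is restored the next time R starts in that directory and leftover objects can silently affect later results. You may also find your output easier to read if you use the --quiet option, which suppresses the R version and copyright information printed at startup.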
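
As a minimal sketch of the options discussed above, the following shell function starts R without the startup banner and without saving or restoring a workspace. The name rq is hypothetical, chosen only for illustration; --quiet, --no-save, and --no-restore are standard R command-line options.

```shell
# rq: hypothetical convenience wrapper around R (not part of R itself).
# --quiet       suppress the version and copyright banner
# --no-save     do not save the workspace on exit (q() will not prompt)
# --no-restore  do not load a previously saved workspace at startup
rq() {
    R --quiet --no-save --no-restore "$@"
}
```

After the module is loaded and the function is defined, typing rq at the shell prompt should start an interactive session that goes straight to the > prompt.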

Running R in batch mode

The preferred method for running R is batch mode: you create a text file containing the R commands you wish to run and give R the name of that file when you invoke it.

Put the following R commands in a file called test.R:

library(datasets)
data(iris)
summary(iris)

To run test.R, you would invoke R this way

$ R CMD BATCH --no-restore --no-save test.R

This runs R as a batch job. Unless you provide the name of an output file, R appends “out” to the input file name and writes the output there; in this case, the output goes to test.Rout. If you prefer, say, test.out, then you would use

$ R CMD BATCH --no-restore --no-save test.R test.out
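
The naming rule above can be sketched as a small shell function. This is an illustration rather than part of R: run_r_batch is a hypothetical helper name, and the success check assumes that R CMD BATCH returns a non-zero exit status when the script stops on an error.

```shell
# run_r_batch: hypothetical helper that runs an R script in batch mode
# and reports where the output went.
#   $1  input file, e.g. test.R
#   $2  optional output file; defaults to the input name with a
#       trailing ".R" replaced by ".Rout" (test.R -> test.Rout)
run_r_batch() {
    infile="$1"
    outfile="${2:-${infile%.R}.Rout}"
    if R CMD BATCH --no-restore --no-save "$infile" "$outfile"; then
        echo "OK: output written to $outfile"
    else
        # Assumption: a failing script makes R CMD BATCH exit non-zero
        echo "FAILED: see the end of $outfile" >&2
        return 1
    fi
}
```

For example, run_r_batch test.R would write to test.Rout, while run_r_batch test.R test.out would write to test.out.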

Running R from PBS

Running R from a PBS script is just like running it in batch mode, except that you put the commands into a file, here called test.pbs, that is submitted to PBS. (For more information on PBS, see the Torque web page.) Here is an example PBS script to run R.

####  PBS preamble
#PBS -N R_test
#PBS -M uniqname@umich.edu
#PBS -m abe

#PBS -l procs=1,mem=1gb
#PBS -j oe
#PBS -V

#PBS -A example_flux
#PBS -l qos=flux
#PBS -q flux

####  End PBS preamble

#  Put your job commands after this line

#  List nodes and processors used
if [ -e "${PBS_NODEFILE}" ] ; then
   uniq -c "$PBS_NODEFILE"
fi

#  Change to the work directory
if [ -d "$PBS_O_WORKDIR" ] ; then
    cd "$PBS_O_WORKDIR"
fi

R CMD BATCH --no-restore --no-save test.R test.out

To submit the job, you would run

$ qsub test.pbs
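
When a submission succeeds, qsub prints the identifier of the new job, which you can pass to qstat to check on it. The following is a minimal sketch of that pattern; submit_and_show is a hypothetical helper name, not a PBS command.

```shell
# submit_and_show: hypothetical helper that submits a PBS script and
# then shows the status of the job that was created.
#   $1  PBS script to submit, e.g. test.pbs
submit_and_show() {
    script="$1"
    # qsub prints the job identifier on standard output
    jobid=$(qsub "$script") || return 1
    echo "Submitted $script as job $jobid"
    # Show the current state of that job in the queue
    qstat "$jobid"
}
```

You would call it as submit_and_show test.pbs. With the -N R_test and -j oe settings in the script above, Torque typically writes the job's combined standard output and error to a file named R_test.o followed by the job number, in the directory from which you submitted; the output of R itself still goes to test.out.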

Creating graphics files

During batch runs, there is no way to display graphics on screen, but there may be times when you want to create a graph during a batch run. R does this through a “graphics device” that writes the plot to a file of a particular type. Supported formats include PostScript/EPS, PNG, TIFF, and BMP, among others. Here are two examples, one for EPS and one for PNG, that each plot a normal density curve.

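# EPS output: onefile=FALSE and paper="special" make the file a true EPS
postscript("test1.eps", height=4, width=4, horizontal=FALSE,
           onefile=FALSE, paper="special")
seq.norm <- seq(from=-4, to=4, length.out=100)
plot(dnorm(seq.norm, mean=0, sd=0.5) ~ seq.norm, type="l", xlab="", ylab="")
dev.off()   # close the device so the file is written

# PNG output
png("test1.png")
seq.norm <- seq(from=-4, to=4, length.out=100)
plot(dnorm(seq.norm, mean=0, sd=0.5) ~ seq.norm, type="l", xlab="", ylab="")
dev.off()   # close the device so the file is written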

Additional help with R

Consulting for Statistics, Computing and Analytics Research (CSCAR) at the University of Michigan offers help using R. If you are having trouble with R itself, you can contact them for assistance. They can be reached by e-mail at ds.consulting@umich.edu. They also have telephone and walk-in support, and you can make an appointment with a consultant if your problem is statistical or complex. See the CSCAR website for contact information, hours, and location.