Description

CellProfiler is free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.

Vendor website: http://cellprofiler.org/

Accessing

CellProfiler is part of the LSA contributed software library. To use it, you must first load the lsa module, then the cellprofiler module.

You can either load the two modules separately, as in

$ module load lsa
$ module load cellprofiler

or load both with a single command, as in

$ module load lsa cellprofiler
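After loading the modules, it can be useful to confirm that the CellProfiler launcher is actually reachable. The following is a minimal sketch; check_cellprofiler is a hypothetical helper name introduced here for illustration and is not part of CellProfiler or the module system.

```shell
# Sketch: report whether the CellProfiler launcher is reachable on PATH.
# check_cellprofiler is a hypothetical helper name, not part of CellProfiler.
check_cellprofiler() {
    if command -v CellProfiler.py >/dev/null 2>&1; then
        echo "CellProfiler.py found: $(command -v CellProfiler.py)"
    else
        echo "CellProfiler.py not found; run 'module load lsa cellprofiler' first" >&2
        return 1
    fi
}
```

Define the function in your shell, run "module load lsa cellprofiler", and then run "check_cellprofiler".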

Running CellProfiler interactively

Since Flux is designed and optimized for batch computing, running interactive programs with a graphical user interface on Flux can be very slow. It is therefore strongly recommended that you do not run CellProfiler interactively on Flux. Instead, generate an analysis pipeline in CellProfiler on your local workstation, export that pipeline, copy it to Flux, and then run it using the batch mode of CellProfiler on the cluster. Instructions for doing this are in the “Running CellProfiler from PBS” section below.

If you do need to start an interactive graphical CellProfiler session on Flux, make sure you are running an X Windows server on your local workstation, log in to Flux using X Windows forwarding (from a Linux or Mac workstation, use the -X option with your ssh command), load the cellprofiler module, and then run “CellProfiler.py --do-not-build --do-not-fetch” as shown below, replacing my-uniqname with your uniqname. Note that it may take 1-2 minutes for the CellProfiler window to appear; this is normal.

[my-uniqname@flux-login2 ~]$ module load lsa cellprofiler
[my-uniqname@flux-login2 ~]$ CellProfiler.py --do-not-build --do-not-fetch
Plugin directory doesn't point to valid folder: /home2/my-uniqname/plugins
Pipeline saved with CellProfiler version 20130807122121
Pipeline saved with CellProfiler version 20130807122121
Pipeline saved with CellProfiler version 20130807122121
Pipeline saved with CellProfiler version 20130807122121
Version: 2013-08-07T12:21:21 f17ee88 / 20130807122121
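Before launching the graphical session, it can help to confirm that X forwarding is active. When you log in with ssh -X, ssh normally sets the DISPLAY environment variable on the remote side, so an empty DISPLAY suggests that forwarding is not in effect. The following is a sketch only; check_display is a hypothetical helper name.

```shell
# Sketch: check whether an X display is available to this login session.
# check_display is a hypothetical helper name used for illustration.
check_display() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X display available: ${DISPLAY}"
    else
        echo "DISPLAY is not set; log in again using 'ssh -X'" >&2
        return 1
    fi
}
```

Run "check_display" after logging in. A set DISPLAY does not guarantee that the X server on your workstation is reachable, but an unset one means no window can appear.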

As always, please do not run long or intensive computations on the cluster login nodes, as this can cause problems for other users. For anything other than project file setup and quick tests, run CellProfiler from PBS as described below, or run an interactive graphical job on the Flux compute nodes by adding the -X option to the “qsub -I” command described in the “How to submit an interactive job” section at http://cac.engin.umich.edu/resources/systems/flux/pbs
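As a sketch of what such an interactive request might look like, the snippet below assembles a “qsub -I” command with the -X option added. The allocation name, queue, QOS, and resource values are copied from the batch example later on this page and are assumptions to replace with your own; the authoritative form of the command is the one given at the URL above.

```shell
# Hypothetical allocation name taken from the batch example; use your own.
ALLOCATION="example_flux"

# Resource values mirror the batch example on this page; adjust as needed.
RESOURCES="nodes=1:ppn=2,mem=4000mb,walltime=00:15:00"

# Build the interactive request: -I for interactive, -X for X forwarding,
# -V to pass the current environment (including loaded modules) to the job.
INTERACTIVE_CMD="qsub -I -X -V -A ${ALLOCATION} -q flux -l qos=flux -l ${RESOURCES}"

# Print the command for review; run it on a login node once it looks right.
echo "${INTERACTIVE_CMD}"
```

Load the lsa and cellprofiler modules before submitting so that -V carries them into the job, then start CellProfiler.py from the prompt you get on the compute node.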

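If, when you start CellProfiler in interactive graphical mode, no window appears and you see output like the following instead of the output shown above, the most likely cause is that X Windows forwarding is not working. Check that an X Windows server is running on your local workstation and that you logged in to Flux with X Windows forwarding enabled: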

[my-uniqname@flux-login2 ~]$ CellProfiler.py --do-not-build --do-not-fetch
Load ilastik Core
stopping worker thread  0
stopping worker thread  1
stopping worker thread  2
stopping worker thread  3
stopping worker thread  4
stopping worker thread  5
stopping worker thread  6
stopping worker thread  7
stopping worker thread  8
stopping worker thread  9
stopping worker thread  10
stopping worker thread  11
stopping worker thread  12
[my-uniqname@flux-login2 ~]$

Running CellProfiler from PBS

The following is an example of how to run CellProfiler in batch (headless) mode via PBS.

For this example, we will use the “Human cells” basic pipeline example that is available from http://cellprofiler.org/examples.shtml. Start by logging into Flux, downloading the example files, and unpacking them. The resulting ExampleHumanImages directory should look like this:

[my-uniqname@flux-login2 ExampleHumanImages]$ ls -l
total 784
-rw-rw-r-- 1 my-uniqname lsa    326 Jul 20  2010 AboutTheseImages.rtf
-rwxrwxr-x 1 my-uniqname lsa 233336 Jul 20  2010 AS_09125_050116030001_D03f00d0.tif
-rwxrwxr-x 1 my-uniqname lsa 240924 Jul 20  2010 AS_09125_050116030001_D03f00d1.tif
-rw-rw-r-- 1 my-uniqname lsa 293652 Jul 20  2010 AS_09125_050116030001_D03f00d2.tif
-rw-rw-r-- 1 my-uniqname lsa  12981 Mar 11  2011 ExampleHuman.cp
[my-uniqname@flux-login2 ExampleHumanImages]$

The files ending in “.tif” are the images that will be analyzed, and ExampleHuman.cp is the CellProfiler pipeline file that contains the steps that will be performed during the analysis. Normally, you will want to run CellProfiler on your local workstation in order to create a pipeline, export the pipeline in order to create a .cp file (via the File -> Export -> Pipeline… menu), and then transfer the pipeline file and image files to Flux (using scp, SFTP, Globus, or a Value Storage share).

Next, also in the ExampleHumanImages directory, create the following PBS file, naming the file example.pbs. When you create the PBS file, change example_flux in the “#PBS -A” line to the name of your Flux allocation, and change uniqname@umich.edu in the “#PBS -M” line to your email address.

#!/bin/bash

####  PBS preamble

#PBS -N cellprofiler_example
#PBS -M uniqname@umich.edu
#PBS -m abe 

#PBS -l nodes=1:ppn=2,mem=4000mb,walltime=00:15:00
#PBS -j oe 
#PBS -V

#PBS -A example_flux
#PBS -l qos=flux
#PBS -q flux

####  End PBS preamble

#  Show list of CPUs you ran on, if you're running under PBS
if [ -n "$PBS_NODEFILE" ]; then cat "$PBS_NODEFILE"; fi

#  Change to the directory you submitted from
if [ -n "$PBS_O_WORKDIR" ]; then cd "$PBS_O_WORKDIR"; fi

#  Put your job commands after this line

# Create a subdirectory for CellProfiler's output files:
mkdir output-${PBS_JOBID}
echo "Putting output files into the subdirectory output-${PBS_JOBID}"

# Run CellProfiler:
CellProfiler.py --do-not-build --do-not-fetch --run-headless --run \
    --image-directory="./" --output-directory="./output-${PBS_JOBID}" \
    --pipeline="ExampleHuman.cp"

The above PBS file requests 2 cores on a single node and 4 GB of memory for 15 minutes. You should adjust these resources as necessary for non-example jobs that you run. However, note that CellProfiler cannot take advantage of cores spread out across multiple nodes; all cores must be on the same node in order for CellProfiler to be able to see and use them.

Note: in the CellProfiler.py command at the end of the PBS file, each line except the last must end with a backslash, and there must be no spaces after that backslash. Alternatively, you can omit the backslashes and put the entire CellProfiler.py command onto a single line, like this:

CellProfiler.py --do-not-build --do-not-fetch --run-headless --run --image-directory="./" --output-directory="./output-${PBS_JOBID}" --pipeline="ExampleHuman.cp"
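Trailing spaces after a backslash are invisible in most editors, so a quick automated check can save a failed job. The following is a sketch; check_continuations is a hypothetical helper name, and it relies only on standard grep.

```shell
# Sketch: find lines where a backslash is followed by trailing whitespace.
# check_continuations is a hypothetical helper name used for illustration.
check_continuations() {
    # grep prints offending lines with their line numbers and succeeds
    # only when at least one such line exists.
    if grep -n '\\[[:space:]][[:space:]]*$' "$1"; then
        echo "Remove the trailing whitespace on the lines listed above." >&2
        return 1
    fi
    echo "No trailing whitespace after backslashes in $1"
}
```

Run it as "check_continuations example.pbs" before submitting the job.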

To submit the job, run the commands:

$ module load lsa cellprofiler
$ qsub example.pbs
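qsub prints the full job ID when it accepts a job, and that ID determines the names of the files the job produces. The following sketch captures the ID and reports where to look afterwards. submit_job is a hypothetical helper name, and the log file name assumes the job name cellprofiler_example set by the “#PBS -N” line in the example PBS file.

```shell
# Sketch: submit a PBS file and report where its results will appear.
# submit_job is a hypothetical helper name used for illustration.
submit_job() {
    # qsub prints the full job ID, e.g. 10898746.nyx.engin.umich.edu
    job_id=$(qsub "$1") || return 1
    echo "Submitted job: ${job_id}"
    # The log file name uses only the numeric part of the job ID.
    echo "Job log: cellprofiler_example.o${job_id%%.*}"
    # The example PBS file names its output directory after the full job ID.
    echo "Pipeline output: output-${job_id}"
}
```

Run it as "submit_job example.pbs" after loading the lsa and cellprofiler modules.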

After your job runs, you will have a file named cellprofiler_example.oXXXXXXXX in the ExampleHumanImages directory, where XXXXXXXX is the job number you got when you ran qsub. This file will contain any messages (including error messages) produced by CellProfiler when it ran; you should check this file first to see if your job ran correctly. Files created by the CellProfiler pipeline will then be in the subdirectory output-XXXXXXXX.nyx.engin.umich.edu:

[my-uniqname@flux-login2 ExampleHumanImages]$ ls -l output-10898746.nyx.engin.umich.edu/
total 260
-rw------- 1 my-uniqname lsa 120639 Aug 20 13:53 AS_09125_050116030001_D03f00d0outline.tiff
-rw------- 1 my-uniqname lsa  40355 Aug 20 13:53 DefaultOUT_Cells.csv
-rw------- 1 my-uniqname lsa  40214 Aug 20 13:53 DefaultOUT_Cytoplasm.csv
-rw------- 1 my-uniqname lsa   2201 Aug 20 13:53 DefaultOUT_Image.csv
-rw------- 1 my-uniqname lsa  39998 Aug 20 13:53 DefaultOUT_Nuclei.csv
[my-uniqname@flux-login2 ExampleHumanImages]$
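A quick sanity check on the results is to count the measurement rows in each CSV file. The following sketch does that; summarize_output is a hypothetical helper name, and it assumes that each CSV file has exactly one header line, which may not hold for every pipeline's export settings.

```shell
# Sketch: count data rows in each CSV file of a CellProfiler output directory.
# summarize_output is a hypothetical helper name used for illustration.
# Assumes one header line per CSV file.
summarize_output() {
    for csv in "$1"/*.csv; do
        # Skip the unexpanded pattern when the directory has no CSV files.
        [ -e "$csv" ] || continue
        rows=$(( $(wc -l < "$csv") - 1 ))
        echo "$(basename "$csv"): ${rows} data rows"
    done
}
```

Run it as "summarize_output output-XXXXXXXX.nyx.engin.umich.edu", substituting the name of your own output subdirectory.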

Additional information

Additional information is available on the CellProfiler web site, http://cellprofiler.org/. For any Flux-specific assistance running CellProfiler, contact hpc-support@umich.edu.