Managing software with Lmod

Why software needs managing

Almost all software requires that you modify your environment in some way. Your environment consists of the running shell, typically bash on Flux, and the set of environment variables that are set. The environment variable most familiar to most people is PATH, which lists the directories in which the shell will search for a command, but there may be many others, depending on the particular software package.
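
For example, you can see the list of directories the shell currently searches by printing PATH:

$ echo $PATH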

Beginning in July 2016, Flux uses a program called Lmod to manage the environment changes needed when many versions of the same software are installed. We use Lmod to help manage conflicts among the environment variables across the spectrum of software packages. Lmod can be used to modify your own default environment settings, and it is also useful if you install software for your own use.

Basic Lmod usage

Listing, loading, and unloading modules

Lmod provides the module command, an easy mechanism for changing the environment as needed to add or remove software packages from your environment.

This should be done before submitting a job to the cluster and not from within a PBS submit script.

A module is a collection of environment variable settings that can be loaded or unloaded. When you first log into Flux, a default set of modules, collected in a module called StdEnv, is loaded. To see which modules are currently loaded, you can use the command

$ module list

Currently Loaded Modules:
  1) intel/16.0.3   2) openmpi/1.10.2/intel/16.0.3   3) StdEnv

We try to make the names of the modules as close to the official name of the software as we can, so you can see what is available by using, for example,

$ module av matlab

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   matlab/R2016a

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

where av is short for avail (available). To make the software available for use, load it with

$ module load matlab

(You can also use add instead of load, if you prefer.) If you need to use software that is incompatible with Matlab, you can remove the Matlab module using

$ module unload matlab

More ways to find modules

In the output from module av matlab, module suggests a couple of alternate ways to search for software. When you use module av, it will match the search string anywhere in the module name; for example,

$ module av gcc

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   fftw/3.3.4/gcc/4.8.5                          hdf5-par/1.8.16/gcc/4.8.5
   fftw/3.3.4/gcc/4.9.3                   (D)    hdf5-par/1.8.16/gcc/4.9.3 (D)
   gcc/4.8.5                                     hdf5/1.8.16/gcc/4.8.5
   gcc/4.9.3                                     hdf5/1.8.16/gcc/4.9.3     (D)
   gcc/5.4.0                              (D)    openmpi/1.10.2/gcc/4.8.5
   gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3        openmpi/1.10.2/gcc/4.9.3
   gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0 (D)    openmpi/1.10.2/gcc/5.4.0  (D)

  Where:
   D:  Default Module

However, if you are looking for just gcc, that is more than you really want. So, you can use one of two commands. The first is

$ module spider gcc

----------------------------------------------------------------------------
  gcc:
----------------------------------------------------------------------------
    Description:
      GNU compiler suite

     Versions:
        gcc/4.8.5
        gcc/4.9.3
        gcc/5.4.0

     Other possible modules matches:
        fftw/3.3.4/gcc  gromacs/5.1.2/openmpi/1.10.2/gcc  hdf5-par/1.8.16/gcc  ...

----------------------------------------------------------------------------
  To find other possible module matches do:
      module -r spider '.*gcc.*'

----------------------------------------------------------------------------
  For detailed information about a specific "gcc" module (including how to load
the modules) use the module's full name.
  For example:

     $ module spider gcc/5.4.0
----------------------------------------------------------------------------

That is probably more like what you are looking for if you really are searching just for gcc. That also gives suggestions for alternate searching, but let us return to the first set of suggestions, and see what we get with keyword searching.

At the time of writing, if you were to use module av to look for Python, you would get this result.

[bennet@flux-build-centos7 modulefiles]$ module av python

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   python-dev/3.5.1

However, some of the Python distributions that are installed do not have python as part of the module name, and in that case module spider will not help either. Instead, you can use

$ module keyword python

----------------------------------------------------------------------------
The following modules match your search criteria: "python"
----------------------------------------------------------------------------

  anaconda2: anaconda2/4.0.0
    Python 2 distribution.

  anaconda3: anaconda3/4.0.0
    Python 3 distribution.

  epd: epd/7.6-1
    Enthought Python Distribution

  python-dev: python-dev/3.5.1
    Python is a general purpose programming language

----------------------------------------------------------------------------
To learn more about a package enter:

   $ module spider Foo

where "Foo" is the name of a module

To find detailed information about a particular package you
must enter the version if there is more than one version:

   $ module spider Foo/11.1
----------------------------------------------------------------------------

That displays all the modules that have been tagged with the python keyword or where python appears in the module name.

More about software versions

Note that Lmod marks the default version in the output from module av; the default version is what will be loaded if you do not specify a version.

$ module av gromacs

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3
   gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0 (D)

  Where:
   D:  Default Module

When loading modules with complex names, for example, gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0, you can specify up to the second-from-last element to load the default version. That is,

$ module load gromacs/5.1.2/openmpi/1.10.2/gcc

will load gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0

To load a version other than the default, specify the version as it is displayed by the module av command; for example,

$ module load gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3

When unloading a module, only the base name need be given; for example, if you loaded either gromacs module,

$ module unload gromacs

Module prerequisites and named sets

Some modules rely on other modules. For example, the gromacs module has many dependencies, some of which conflict with the default modules. To load it, you might first clear all modules with module purge, then load the dependencies, then finally load gromacs.

$ module list
Currently Loaded Modules:
  1) intel/16.0.3   2) openmpi/1.10.2/intel/16.0.3   3) StdEnv

$ module purge
$ module load gcc/5.4.0 openmpi/1.10.2/gcc/5.4.0 boost/1.61.0 mkl/11.3.3
$ module load gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
$ module list
Currently Loaded Modules:
  1) gcc/5.4.0                  4) mkl/11.3.3
  2) openmpi/1.10.2/gcc/5.4.0   5) gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
  3) boost/1.61.0

That’s a lot to do each time. Lmod provides a way to store a set of modules and give it a name. So, once you have the above list of modules loaded, you can use

$ module save my_gromacs

to save the whole list under the name my_gromacs. We recommend that you make each set fully self-contained and that you use the full name/version for each module (to prevent problems if the default version of one of them changes). To restore the set, use the combination

$ module purge
$ module restore my_gromacs
Restoring modules to user's my_gromacs

To see a list of the named sets you have (they are stored in ${HOME}/.lmod.d), use

$ module savelist
Named collection list:
  1) my_gromacs

and to see which modules are in a set, use

$ module describe my_gromacs
Collection "my_gromacs" contains: 
   1) gcc/5.4.0                   4) mkl/11.3.3
   2) openmpi/1.10.2/gcc/5.4.0    5) gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
   3) boost/1.61.0

How to get more information about the module and the software

We try to provide some helpful information about the modules. For example,

$ module help openmpi/1.10.2/gcc/5.4.0
------------- Module Specific Help for "openmpi/1.10.2/gcc/5.4.0" --------------

OpenMPI consists of a set of compiler 'wrappers' that include the appropriate
settings for compiling MPI programs on the cluster.  The most commonly used
of these are

    mpicc
    mpic++
    mpif90

Those are used in the same way as the regular compiler program, for example,

    $ mpicc -o hello hello.c

will produce an executable program file, hello, from C source code in hello.c.

In addition to adding the OpenMPI executables to your path, the following
environment variables are set by the openmpi module.

    $MPI_HOME

For some generic information about the program you can use

$ module whatis openmpi/1.10.2/gcc/5.4.0
openmpi/1.10.2/gcc/5.4.0      : Name: openmpi
openmpi/1.10.2/gcc/5.4.0      : Description: OpenMPI implementation of the MPI protocol
openmpi/1.10.2/gcc/5.4.0      : License information: https://www.open-mpi.org/community/license.php
openmpi/1.10.2/gcc/5.4.0      : Category: Utility, Development, Core
openmpi/1.10.2/gcc/5.4.0      : Package documentation: https://www.open-mpi.org/doc/
openmpi/1.10.2/gcc/5.4.0      : ARC examples: /scratch/data/examples/openmpi/
openmpi/1.10.2/gcc/5.4.0      : Version: 1.10.2

and for information about what the module will set in the environment (in addition to the help text), you can use

$ module show openmpi/1.10.2/gcc/5.4.0
[ . . . .  Help text edited for space -- see above . . . . ]
whatis("Name: openmpi")
whatis("Description: OpenMPI implementation of the MPI protocol")
whatis("License information: https://www.open-mpi.org/community/license.php")
whatis("Category: Utility, Development, Core")
whatis("Package documentation: https://www.open-mpi.org/doc/")
whatis("ARC examples: /scratch/data/examples/openmpi/")
whatis("Version: 1.10.2")
prereq("gcc/5.4.0")
prepend_path("PATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin")
prepend_path("MANPATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/share/man")
prepend_path("LD_LIBRARY_PATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/lib")
setenv("MPI_HOME","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0")

where the lines to attend to are prepend_path(), setenv(), and prereq(). You may also see an append_path() function. The prereq() function lists the other modules that must be loaded before the one being displayed. The rest set or modify the environment variable named as the first argument; for example,

prepend_path("PATH", "/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin")

adds /sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin to the beginning of the PATH environment variable.
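
As a quick check, you can load the module and confirm the changes yourself (the paths shown come from the module file above):

$ module load gcc/5.4.0 openmpi/1.10.2/gcc/5.4.0
$ echo $MPI_HOME
/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0
$ which mpicc
/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin/mpicc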

Accessing the Internet from ARC-TS compute nodes

Normally, compute nodes on ARC-TS clusters cannot directly access the Internet because they have private IP addresses. This increases cluster security while reducing the costs (IPv4 addresses are limited, and ARC-TS clusters do not currently support IPv6). However, this also means that jobs cannot install software, download files, or access databases on servers located outside of University of Michigan networks: the private IP addresses used by the cluster are routable on-campus but not off-campus.

If your work requires these tasks, there are three ways to allow jobs running on ARC-TS clusters to access the Internet, described below. The best method to use depends to a large extent on the software you are using. If your software supports HTTP proxying, that is the best method. If not, SOCKS proxying or SSH tunneling may be suitable.

HTTP proxying

HTTP proxying, sometimes called “HTTP forward proxying,” is the simplest and most robust way to access the Internet from ARC-TS clusters. However, there are two main limitations:

  • Some software packages do not support HTTP proxying.
  • HTTP proxying only supports HTTP, HTTPS and FTP protocols.

If either of these conditions applies (for example, if your software needs a database protocol such as MySQL), you should explore SOCKS proxying or SSH tunneling, described below.

Many popular software packages support HTTP proxying; pip, used in the example below, is one of them.

HTTP proxying is automatically set up when you log in to ARC-TS clusters, and it will be used by any software that supports it without any special action on your part.

Here is an example that shows installing the Python package pyvcf from within an interactive job running on a Flux compute node:


[markmont@flux-login1 ~]$ module load anaconda2/latest
[markmont@flux-login1 ~]$ qsub -I -V -A example_flux -q flux -l nodes=1:ppn=2,pmem=3800mb,walltime=04:00:00,qos=flux
qsub: waiting for job 18927162.nyx.arc-ts.umich.edu to start
qsub: job 18927162.nyx.arc-ts.umich.edu ready

[markmont@nyx5792 ~]$ pip install --user pyvcf
Collecting pyvcf
Downloading PyVCF-0.6.7.tar.gz
Collecting distribute (from pyvcf)
Downloading distribute-0.7.3.zip (145kB)
100% |████████████████████████████████| 147kB 115kB/s
Requirement already satisfied (use --upgrade to upgrade): setuptools>=0.7 in
/usr/cac/rhel6/lsa/anaconda2/latest/lib/python2.7/site-packages/setuptools-19.6.2-py2.7.egg (from distribute->pyvcf)
Building wheels for collected packages: pyvcf, distribute
Running setup.py bdist_wheel for pyvcf ... done
Stored in directory: /home/markmont/.cache/pip/wheels/68/93/6c/fb55ca4381dbf51fb37553cee72c62703fd9b856eee8e7febf
Running setup.py bdist_wheel for distribute ... done
Stored in directory: /home/markmont/.cache/pip/wheels/b2/3c/64/772be880a32a0c41e64b56b13c25450ff31cf363670d3bc576
Successfully built pyvcf distribute
Installing collected packages: distribute, pyvcf
Successfully installed distribute pyvcf
[markmont@nyx5792 ~]$

If HTTP proxying were not supported by pip (or were otherwise not working), you would be unable to access the Internet to install the pyvcf package and would receive “Connection timed out”, “No route to host”, or “Connection failed” error messages when you tried to install it.

Information for advanced users

HTTP proxying is controlled by the following environment variables which are automatically set on each compute node:

export http_proxy="http://proxy.arc-ts.umich.edu:3128/"
export https_proxy="http://proxy.arc-ts.umich.edu:3128/"
export ftp_proxy="http://proxy.arc-ts.umich.edu:3128/"
export no_proxy="localhost,127.0.0.1,.localdomain,.umich.edu"
export HTTP_PROXY="${http_proxy}"
export HTTPS_PROXY="${https_proxy}"
export FTP_PROXY="${ftp_proxy}"
export NO_PROXY="${no_proxy}"

Once these are set in your environment, you can access the Internet from compute nodes, for example to install Python and R libraries. There is no need to start any daemons, as there is with the other two solutions described below. The HTTP proxy server proxy.arc-ts.umich.edu supports HTTPS but does not terminate the TLS session at the proxy; traffic is encrypted by the software you run and is not decrypted until it reaches the destination server on the Internet.

To prevent software from using HTTP proxying, run the following command:

unset http_proxy https_proxy ftp_proxy no_proxy HTTP_PROXY HTTPS_PROXY FTP_PROXY NO_PROXY

The above command will only affect software started from the current shell.  If you start a new shell (for example, if you open a new window or log in again) you’ll need to re-run the command above each time.  To permanently disable HTTP proxying for all software, add the command above to the end of your ~/.bashrc file.
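
For example, this one-time command appends that line to your ~/.bashrc (a sketch; adjust it if you manage your shell startup files differently):

$ echo 'unset http_proxy https_proxy ftp_proxy no_proxy HTTP_PROXY HTTPS_PROXY FTP_PROXY NO_PROXY' >> ~/.bashrc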

Finally, note that HTTP proxying (which is forward proxying) should not be confused with reverse proxying.  Reverse proxying, which is done by the ARC Connect service, allows researchers to start web applications (including Jupyter notebooks, RStudio sessions, and Bokeh apps) on compute nodes and then access those web applications through the ARC Connect.

SOCKS

A second solution is available for any software that either supports the SOCKS protocol or that can be “made to work” with SOCKS. Most software does not support SOCKS, but here is an example using curl (which does have built-in support for SOCKS) to download a file from the Internet from inside an interactive job running on a Flux compute node. We use “ssh -D” to set up a “quick and dirty” SOCKS proxy server for curl to use:

[markmont@flux-login1 ~]$ qsub -I -V -A example_flux -q flux -l nodes=1:ppn=2,mem=8000mb,walltime=04:00:00,qos=flux
qsub: waiting for job 18927190.nyx.arc-ts.umich.edu to start
qsub: job 18927190.nyx.arc-ts.umich.edu ready

[markmont@nyx5441 ~]$ ssh -f -N -D 1080 flux-xfer.arc-ts.umich.edu
[markmont@nyx5441 ~]$ curl --socks5 localhost -O ftp://ftp.gnu.org/pub/gnu/bc/bc-1.06.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  272k  100  272k    0     0   368k      0 --:--:-- --:--:-- --:--:-- 1789k
[markmont@nyx5441 ~]$ ls -l bc-1.06.tar.gz
-rw-r--r-- 1 markmont lsa 278926 Feb 10 17:11 bc-1.06.tar.gz
[markmont@nyx5441 ~]$

A limitation of “ssh -D” is that it only handles TCP traffic, not UDP traffic (including DNS lookups, which happen over UDP). However, if you have a real SOCKS proxy accessible to you elsewhere on the U-M network (such as on a server in your lab), you can specify its hostname instead of “localhost” above and omit the ssh command in order to have UDP traffic handled.
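
For example, if your lab ran a SOCKS proxy on a hypothetical host socks.example.com listening on the standard port 1080, the download above could be done without the ssh command:

[markmont@nyx5441 ~]$ curl --socks5 socks.example.com:1080 -O ftp://ftp.gnu.org/pub/gnu/bc/bc-1.06.tar.gz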

For software that does not have built-in support for SOCKS, it’s possible to wrap the software with a library that intercepts networking calls and routes the traffic via the “ssh -D” SOCKS proxy (or a real SOCKS proxy, if you have one accessible to you on the U-M network). This will allow most software running on compute nodes to access the Internet. ARC-TS clusters provide one such SOCKS wrapper, socksify, by default:

[markmont@nyx5441 ~]$ telnet towel.blinkenlights.nl 666  # this won't work...
Trying 94.142.241.111...
telnet: connect to address 94.142.241.111: No route to host
Trying 2a02:898:17:8000::42...
[markmont@nyx5441 ~]$ ssh -f -N -D 1080 flux-xfer.arc-ts.umich.edu # if it's not still running from above
[markmont@nyx5441 ~]$ socksify telnet towel.blinkenlights.nl 666

=== The BOFH Excuse Server ===
the real ttys became pseudo ttys and vice-versa.

Connection closed by foreign host.
[markmont@nyx5441 ~]$

You can even surf the web in text mode from a compute node:

[markmont@nyx5441 ~]$ socksify links http://xsede.org/

socksify is the client part of the Dante SOCKS server.

Local SSH tunneling (“ssh -L”)

A final option for accessing the Internet from an ARC-TS compute node is to set up a local SSH tunnel using the “ssh -L” command. This provides a local port on the compute node that processes can connect to in order to reach a single specific remote port on a single specific host on a non-UM network.
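
The general pattern, which both examples below follow, is (LOCAL_PORT, remote.example.com, and REMOTE_PORT are placeholders):

ssh -N -L LOCAL_PORT:remote.example.com:REMOTE_PORT flux-xfer.arc-ts.umich.edu &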

MongoDB example

Here is an example that shows how to use a local tunnel to access a MongoDB database hosted off-campus from inside a job running on a compute node. First, on a cluster login node, run the following command in order to get the keys for flux-xfer.arc-ts.umich.edu added to your ~/.ssh/known_hosts file. This needs to be done interactively so that you can respond to the prompt that the ssh command gives you:

[markmont@flux-login1 ~]$ ssh flux-xfer.arc-ts.umich.edu
The authenticity of host 'flux-xfer.arc-ts.umich.edu (141.211.22.200)' can't be established.
RSA key fingerprint is 6f:8c:67:df:43:4f:e0:fc:80:5b:49:1a:eb:81:cc:54.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'flux-xfer.arc-ts.umich.edu' (RSA) to the list of known hosts.
---------------------------------------------------------------------
Advanced Research Computing - Technology Services
University of Michigan
hpc-support@umich.edu

This machine is intended for transferring data to and from the cluster
flux using scp and sftp only. Use flux-login.engin.umich.edu for
interactive use.

For usage information, policies, and updates, please see:
arc-ts.umich.edu

Thank you for using U-M information technology resources responsibly.
---------------------------------------------------------------------

^CConnection to flux-xfer.arc-ts.umich.edu closed.
[markmont@flux-login1 ~]$

After you see the login banner, the connection will hang, so press Control-C to terminate it and get your shell prompt back.

You can now run the following commands in a job, either interactively or in a PBS script, in order to access the MongoDB database at db.example.com from a compute node:

# Start the tunnel so that port 27017 on the compute node connects to port 27017 on db.example.com:
ssh -N -L 27017:db.example.com:27017 flux-xfer.arc-ts.umich.edu &
# Give the tunnel time to completely start up:
sleep 5
# You can now access the MongoDB database at db.example.com by connecting to localhost instead.
# For example, if you have the "mongo" command installed in your current directory, you could run the
# following command to view the collections available in the "admin" database:
./mongo --username MY_USERNAME --password "MY_PASSWORD" localhost/admin --eval 'db.getCollectionNames();'
# When you are all done using it, tear down the tunnel:
kill %1

scp example

Here is an example that shows how to use a local tunnel to copy a file using scp from a remote system (residing on a non-UM network) named “far-away.example.com” onto an ARC-TS cluster from inside a job running on a compute node.

You should run the following commands inside an interactive PBS job the first time so that you can respond to prompts to accept various keys, as well as enter your password for far-away.example.com when prompted.

# Start the tunnel so that port 2222 on the compute node connects to port 22 on far-away.example.com:
ssh -N -L 2222:far-away.example.com:22 flux-xfer.arc-ts.umich.edu &
# Give the tunnel time to completely start up:
sleep 5
# Copy the file "my-data-set.csv" from far-away.example.com to the compute node:
# Replace "your-user-name" with the username by which far-away.example.com knows you.
# If you don't have public key authentication set up from the cluster for far-away.example.com, you'll
# be prompted for your far-away.example.com password
scp -P 2222 your-user-name@localhost:my-data-set.csv .
# When you are all done using it, tear down the tunnel:
kill %1

Once you have run these commands interactively from a compute node, they can then be used in non-interactive PBS batch jobs, provided you have also set up public key authentication for far-away.example.com.
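
If you have not set up public key authentication before, a minimal sketch (assuming an RSA key and the tunnel from the example above still running on port 2222) is:

# Generate a key pair on the cluster if you do not already have one (accept the defaults):
ssh-keygen -t rsa
# Copy your public key to far-away.example.com through the tunnel; you will be prompted for
# your far-away.example.com password one last time:
ssh-copy-id -p 2222 your-user-name@localhost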

Interactive PBS jobs

You can request an interactive PBS job for any activity for which the environment needs to be the same as for a batch job. Interactive jobs are also what you should use if you have something that needs more resources than are appropriate to use on a login node. When the interactive job starts, you will get a prompt, and you will have access to all the resources assigned to the job. This is required, for example, to test or debug MPI jobs that run across many nodes.

Submitting an interactive job

There are two ways you can submit an interactive job to PBS: by including all of the options on the command line, or by listing the options in a PBS script and submitting the script with the command-line option specifying an interactive job.

Submitting an interactive job from the command line

To submit from the command line, you need to specify all of the PBS options that would normally be specified by directives in a PBS script. The translation from script to command line is simply to take a line, say,

#PBS -A example_flux

remove the #PBS, and what remains is the option you should put on the command line. More options will be needed, but that line alone would lead to

$ qsub -A example_flux

For an interactive job, several options that are appropriate in a PBS script may be left off. Since you will have a prompt, you probably don’t need the options that send you mail about job status. The options that must be included are the accounting options, the resource options for the number of nodes, processors, memory, and walltime, and the -V option to ensure that all the nodes get the correct environment. The -I flag signals that the job should run as an interactive job. (Note: in the example that follows, the \ character indicates that the following line is a continuation of the one on which it appears.)

$ qsub -I -V -A example_flux -q flux \
   -l nodes=2:ppn=2,pmem=1gb,walltime=4:00:00,qos=flux

The above example requests an interactive job, using the account example_flux and two nodes with two processors, each processor with 1 GB of memory, for four hours. The prompt will change to something that says the job is waiting to start, followed by a prompt on the first of the assigned nodes.

qsub: waiting for job 12345678.nyx.engin.umich.edu to start
[grundoon@nyx5555 ~]$
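
Once the job has started, you can confirm which nodes and cores were assigned; the scheduler lists them, one line per core, in the file named by the PBS_NODEFILE environment variable:

[grundoon@nyx5555 ~]$ cat $PBS_NODEFILE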

If at some point before the interactive job has started you decide you do not want to use it, Ctrl-C will cancel it, as in

^CDo you wish to terminate the job and exit (y|[n])? y
Job 12345678.nyx.engin.umich.edu is being deleted

When you have completed the work for which you requested the interactive job, you can simply log out of the compute node, either with exit or with logout, and you will return to the login node prompt.

[grundoon@nyx5555 ~]$ exit
[grundoon@flux-login1 ~]$

Submitting an interactive job using a file

To recreate the same interactive job as above, you could create a file, say interactive.pbs, with the following lines in it

#!/bin/bash
#PBS -V
#PBS -A example_flux
#PBS -l qos=flux
#PBS -q flux

#PBS -l nodes=2:ppn=2,pmem=1gb,walltime=4:00:00

then submit the job using

$ qsub -I interactive.pbs

Connecting Flux and XSEDE

XSEDE is an open scientific discovery infrastructure combining leadership class resources at eleven partner sites to create an integrated, persistent computational resource. It is the successor to TeraGrid.

For general information on XSEDE, visit the XSEDE home page.

This page describes how to connect an XSEDE allocation with a Flux allocation.

The XSEDE Client Toolkit allows Flux users to connect to XSEDE resources using the GSI interface. This eases logins and file transfers between the two sets of resources. The toolkit provides the commands myproxy-logon, gsissh, gsiscp, globus-url-copy, and uberftp. Refer to XSEDE’s Data Transfers page for details on these commands.

Loading the XSEDE Client Toolkit

Load the toolkit module:

module load xsede

Getting your XSEDE User Portal ticket

Before connecting to any XSEDE resource using the GSI interface you must get an XSEDE User Portal ticket. This proves your identity and lasts for 12 hours by default. Use your XSEDE portal username and password.

myproxy-logon -l portalusername

Logins to XSEDE Resources

The command gsissh works the same as normal ssh but uses your portal ticket to authenticate.

gsissh Xsedeloginhost
gsissh gordon.sdsc.edu

Connect to the resource you have access to through your startup or XSEDE research allocation.

File Transfers

The XSEDE Client Toolkit allows file transfers between XSEDE and Flux resources and between XSEDE and XSEDE resources using gsiscp or the more complex and powerful globus-url-copy.

GSISCP

gsiscp uses the same options as normal scp allowing transfer of files between Flux resources and XSEDE resources:

Transfer the file fluxfile into the folder folder1 on the XSEDE resource xsedehost:

gsiscp fluxfile xsedehost:folder1/

Transfer the folder xfolder from xsedehost to Flux:

gsiscp -r xsedehost:xfolder .

GridFTP support: globus-url-copy

GridFTP is a powerful system for moving files between XSEDE sites. You can use the command globus-url-copy to initiate transfers from a Flux resource. Please refer to XSEDE’s Data Transfers page for details on its use.
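
As an illustrative sketch only (the gsiftp endpoint and paths here are hypothetical; check the XSEDE documentation for your site's actual GridFTP endpoint), a transfer from an XSEDE site to your Flux home directory might look like:

# -vb shows transfer performance; -p 4 uses four parallel streams
globus-url-copy -vb -p 4 gsiftp://gridftp.example-xsede-site.org/path/to/file file:///home/uniqname/file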

XSEDE Available Resources:

To find available XSEDE resources when choosing where to request your allocation see the Resource Catalog, or contact hpc-support@umich.edu.

XSEDE Training Opportunities:

XSEDE offers training throughout the year listed on the XSEDE Course Calendar. XSEDE also maintains a collection of online training resources.

Data Science Platform (Hadoop)

The ARC-TS Data Science Platform is an upgraded Hadoop cluster currently available as a technology preview with no associated charges to U-M researchers. The ARC-TS Hadoop cluster is an on-campus resource that provides a different service level than most cloud-based Hadoop offerings, including:

  • high-bandwidth data transfer to and from other campus data storage locations with no data transfer costs
  • very high-speed inter-node connections using 40Gb/s Ethernet

The cluster provides 112TB of total usable disk space, 40GbE inter-node networking, Hadoop (version listed in the table below), and several additional data science tools.

Aside from Hadoop and its Distributed File System, the ARC-TS data science service includes:

  • Pig, a high-level language that enables substantial parallelization, allowing the analysis of very large data sets.
  • Hive, data warehouse software that facilitates querying and managing large datasets residing in distributed storage using a SQL-like language called HiveQL.
  • Sqoop, a tool for transferring data between SQL databases and the Hadoop Distributed File System.
  • Rmr, an extension of the R Statistical Language to support distributed processing of large datasets stored in the Hadoop Distributed File System.
  • Spark, a general processing engine compatible with Hadoop data.
  • mrjob, a library that allows MapReduce jobs written in Python to run on Hadoop.

The software versions are as follows:

Title         Version
Hadoop        2.5.0
Hive          0.13.1
Sqoop         1.4.5
Pig           0.12.0
R/rhdfs/rmr   3.0.3
Spark         1.2.0
mrjob         0.4.3-dev, commit 226a741548cf125ecfb549b7c50d52cda932d045
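
As a brief sketch of working with the cluster (the file, directory, and table names here are hypothetical; see the documentation linked below for the actual workflow), a session might look like:

# Copy a local file into your HDFS home directory
hdfs dfs -mkdir -p data
hdfs dfs -put mydata.csv data/
# Run an ad hoc HiveQL query against a table you have already defined
hive -e 'SELECT COUNT(*) FROM my_table;'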

If a cloud-based system is more suitable for your research, ARC-TS can support your use of Amazon cloud resources through MCloud, the UM-ITS cloud service.

For more information on the Hadoop cluster, please see this documentation or contact us at data-science-support@umich.edu.

A Flux account is required to access the Hadoop cluster. Visit the Establishing a Flux allocation page for more information.

Policy on commercial use of Flux

Flux is intended only for non-commercial, academic research and instruction. Commercial use of some of the software on Flux is prohibited by software licensing terms. Prohibited uses include product development or validation, any service for which a fee is charged, and, in some cases, research involving proprietary data that will not be made available publicly.

Please contact hpc-support@umich.edu if you have any questions about this policy, or about whether your work may violate these terms.

Acknowledging Flux in Published Papers

Researchers are urged to acknowledge ARC in any publication, presentation,
report, or proposal on research that involved ARC hardware (Flux) and/or
staff expertise.

“This research was supported in part through computational resources and
services provided by Advanced Research Computing at the University of
Michigan, Ann Arbor.”

Researchers are asked to annually submit, by October 1, a list of materials
that reference ARC, and inform its staff whenever any such research receives
professional or press exposure (arc-contact@umich.edu). This information is
extremely important in enabling ARC  to continue supporting U-M researchers
and obtain funding for future system and service upgrades.

Security on Flux / Use of Sensitive Data

The Flux high-performance computing system at the University of Michigan has been built to provide a flexible and secure HPC environment. Flux is an extremely scalable, flexible, and reliable platform that enables researchers to match their computing capability and costs with their needs while maintaining the security of their research.

Built-in Security Features

Applications and data are protected by secure physical facilities and infrastructure as well as a variety of network and security monitoring systems. These systems provide basic but important security measures including:

  • Secure access – All access to Flux is via ssh or Globus. SSH has a long history of high security. Globus provides basic security and supports additional security if you need it.
  • Built-in firewalls – All of the Flux computers have firewalls that restrict access to only what is needed.
  • Unique users – Flux adheres to the University guideline of one person per login ID and one login ID per person.
  • Multi-factor authentication (MFA) – For all interactive sessions, Flux requires both a UM Kerberos password and Duo authentication. File transfer sessions require a Kerberos password.
  • Private Subnets – Other than the login and file transfer computers that are part of Flux, all of the computers are on a network that is private within the University network and are unreachable from the Internet.
  • Flexible data storage – Researchers can control the security of their own data storage by securing their storage as they require and having it mounted via NFSv3 or NFSv4 on Flux. Another option is to make use of Flux’s local scratch storage, which is considered secure for many types of data. Note: Flux is not considered secure for data covered by HIPAA.

Flux/Globus & Sensitive Data

To find out what types of data may be processed in Flux or Globus, visit the U-M Sensitive Data Guide to IT Resources.

Additional Security Information

If you require more detailed information on Flux’s security or architecture to support your data management plan or technology control plan, please contact the Flux team at hpc-support@umich.edu.

We know that it’s important for you to understand the protection measures that are used to guard the Flux infrastructure. But since you can’t physically touch the servers or walk through the data centers, how can you be sure that the right security controls are in place?

The answer lies in the third-party certifications and evaluations that Flux has undergone. IIA has evaluated the system, network, and storage practices of Flux and Globus. The evaluation for Flux is published at http://safecomputing.umich.edu/dataguide/?q=node/151 and the evaluation for Globus is published at http://safecomputing.umich.edu/dataguide/?q=node/155.

Shared Security and Compliance Responsibility

Because you’re managing your data in the Flux high-performance computing environment, the security responsibilities will be shared.

Flux operators have secured the underlying infrastructure; you are obligated to secure anything you put on the infrastructure yourself, as well as to meet any other compliance requirements. These requirements may be derived from your grant or funding agency, from data owners or stewards other than yourself, or from state or federal laws and regulations.

The Flux support staff is available to help manage user lists for data access, and information on how to manage file system permissions is publicly available; see http://en.wikipedia.org/wiki/File_system_permissions.

Contacting Flux Support

The Flux Support Team encourages communications, including for security-related questions. Please email us at hpc-support@umich.edu.

We have created a PGP key for especially sensitive communications you may need to send.

-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1

mQENBFNEDlUBCACvXwy9tYzuD3BqSXrxcAEcIsmmH52066R//RMaoUbS7AcoaF12
k+Quy/V0mEQGv5C4w2IC8Ls2G0RHMJ2PYjndlEOVVQ/lA8HpaGhrSxhY1bZzmbkr
g0vGzOPN87dJPjgipSCcyupKG6Jnnm4u0woAXufBwjN2wAP2E7sqSZ2vCRyMs4vT
TGiw3Ryr2SFF98IJCzFCQAwEwSXZ2ESe9fH5+WUxJ6OM5rFk7JBkH0zSV/RE4RLW
o2E54gkF6gn+QnLOfp2Y2W0CmhagDWYqf5XHAr0SZlksgDoC14AN6rq/oop1M+/T
C/fgpAKXk1V/p1SlX7xL230re8/zzukA5ETzABEBAAG0UEhQQyBTdXBwb3J0IChV
bml2ZXJzaXR5IG9mIE1pY2hpZ2FuIEhQQyBTdXBwb3J0IEdQRyBrZXkpIDxocGMt
c3VwcG9ydEB1bWljaC5lZHU+iQE+BBMBAgAoBQJTRA5VAhsDBQkJZgGABgsJCAcD
AgYVCAIJCgsEFgIDAQIeAQIXgAAKCRDHwuoUZnHdimrSB/4m6P7aQGnsbYVFspJ8
zquGRZd3fDU/IaCvLyjsUN4Qw1KFUmqQjvvfTxix7KjlNMcGy1boUCWKNNk1sFtb
E9Jr2p6Z/M7pm4XWhZIs1UIfHr3XgLdfbeYgXpt4Md2G6ttaXv44D10xL2LYCHE8
DnSVv+2SIG9PhaV+h+aBUo4yKwTwVBZsguU1Z1fsbiu6z6iDrzU2dlQp0NLmw73G
v5HUdYdu/YJdh5frp/2XorLXynrEyCk1SxViXrHY6dc9Y3bUjwl0MOJypLuRhQmj
kVwHIsNsRg1YJ6iyJzom33C7YdRktBiPpstkYDHJf/PVRAw1G4dkyjfUfG2pIoQd
WjOxuQENBFNEDlUBCADNwZ5edW/e08zYFWSGVsdpY4HM2CdsVqkuQru2puHhJqg4
eWS9RAdJ6fWp3HJCDsDkuQr19B3G5gEWyWOMgPJ9yW2tFVCrVsb9UekXAWh6C6hL
Tj+pgVVpNDTYrErYa2nlll0oSyplluVBRlzDfuf4YkHDy2TFd7Kam2C2NuQzLQX3
THhHkgMV+4SQZ+HrHRSoYPAcPb4+83dyQUo9lEMGcRA2WqappKImGhpccQ6x3Adj
/HFaDrFT7itEtC8/fx4UyaIeMszNDjD1WIGBJocOdO7ClIEGyCshwKn5z1cCUt72
XDjun0f1Czl6FOzkG+CHg5mf1cwgNUNx7TlVBFdTABEBAAGJASUEGAECAA8FAlNE
DlUCGwwFCQlmAYAACgkQx8LqFGZx3YrcqggAlKZhtrMDTHNki1ZTF7c7RLjfN17H
Fb342sED1Y3y3Dm0RVSQ2SuUWbezuDwov6CllgQR8SjBZ+D9G6Bt05WZgaILD7H0
LR9+KtBNYjxoVIdNHcGBf4JSL19nAI4AMWcOOjfasGrn9C60SwiiZYzBtwZa9VCi
+OhZRbmcBejBfIAWC9dGtIcPHBVcObT1WVqAWKlBOGmEsj/fcpHKkDpbdS7ksLip
YLoce2rmyjXhFH4GXZ86cQD1nvOoPmzocIOK5wpIm6YxXtYLP07T30022fOV7YxT
mbiKKL2LmxN1Nb/+mf+wIZ5w2ZdDln1bbdIKRHoyS2HyhYuLd1t/vAOFwg==
=yAEg
-----END PGP PUBLIC KEY BLOCK-----

May I process sensitive data using Flux?

Yes, but only if you use a secure storage solution like Mainstream Storage and Flux’s scratch storage. Flux’s home directories are provided by Value Storage, which is not an appropriate location to store sensitive institutional data. One possible workflow is to use sftp or Globus to move data between a secure solution and Flux’s scratch storage, which is secure, bypassing your home directory or any of your own Value Storage directories. Keep in mind that compliance is a shared responsibility. You must also take any steps required by your role or unit to comply with relevant regulatory requirements.
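
As a sketch of that workflow (the scratch path is illustrative; use your own allocation's scratch directory and uniqname), copying a file from a secure system directly into scratch with scp, which flux-xfer.arc-ts.umich.edu supports, might look like:

scp sensitive-data.csv uniqname@flux-xfer.arc-ts.umich.edu:/scratch/example_flux/uniqname/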

For more information on specific types of data that can be stored and analyzed on Flux, Value Storage, and other U-M services, please see the “Sensitive Data Guide to IT Services” web page on the Safe Computing website: http://safecomputing.umich.edu/dataguide/

Terms of Usage and User Responsibilities

  1. Data is not backed up. None of the data on Flux is backed up. The data that you keep in your home directory, /tmp or any other filesystem is exposed to immediate and permanent loss at all times. You are responsible for mitigating your own risk. We suggest you store copies of hard-to-reproduce data on systems that are backed up, for example, the AFS filesystem maintained by ITS.
  2. Your usage is tracked and may be used for reports. We track a lot of job data and store it for a long time. We use this data to generate usage reports and look at patterns and trends. We may report this data, including your individual data, to your adviser, department head, dean, or other administrator or supervisor.
  3. Maintaining the overall stability of the system is paramount to us. While we make every effort to ensure that every job completes in the most efficient and accurate way possible, the good of the whole is more important to us than the good of an individual. This may affect you, but mostly we hope it benefits you. System availability is based on our best efforts. We are staffed to provide support during normal business hours. We try very hard to provide support as broadly as possible, but cannot guarantee support on a 24-hour-per-day basis. Additionally, we perform system maintenance on a periodic basis, driven by the availability of software updates, staffing availability, and input from the user community. We do our best to schedule around your needs, but there will be times when the system is unavailable. We will announce scheduled outages at least one month in advance on the ARC-TS home page; we will announce unscheduled outages as quickly as we can, with as much detail as we have, on that same page. You can also track ARC-TS on Twitter under the name ARC-TS.
  4. Flux is intended only for non-commercial, academic research and instruction. Commercial use of some of the software on Flux is prohibited by software licensing terms. Prohibited uses include product development or validation, any service for which a fee is charged, and, in some cases, research involving proprietary data that will not be made available publicly. Please contact hpc-support@umich.edu if you have any questions about this policy, or about whether your work may violate these terms.
  5. You are responsible for the security of sensitive codes and data. If you will be storing export-controlled or other sensitive or secure software, libraries, or data on the cluster, it is your responsibility to ensure that it is secured to the standards set by the most restrictive governing rules. We cannot reasonably monitor everything that is installed on the cluster and cannot be responsible for it; the responsibility remains with you, the end user.
  6. Data subject to HIPAA regulations may not be stored or processed on the cluster. For assistance with HIPAA-related computational research please contact Jeremy Hallum, ARC liaison to the Medical School, at jhallum@med.umich.edu.

User Responsibilities

Users should make requests by email to hpc-support@umich.edu:

  • Renew allocations at least 2 days before your current allocation expires so that the new allocation can be provisioned before the old one expires.
  • Request that users be added to your allocations at least a day in advance.


Users are responsible for maintaining MCommunity groups used for MReport authorizations.

Users must manage data appropriately in their various locations:

  • /home, /home2
  • /scratch
  • /tmp and /var/tmp
  • customer-provided NFS