Flux Configuration

Hardware

Computing

The standard Flux hardware is:

  • 109 Haswell architecture compute nodes, each configured with 24 cores (two 12-core 2.5 GHz Intel Xeon E5-2680v3 processors) and 128 GB RAM.
  • 124 Ivybridge architecture compute nodes, each configured with 20 cores (two 10-core 2.8 GHz Intel Xeon E5-2680v2 processors) and 96 GB RAM.
  • 139 Sandybridge architecture compute nodes, each configured with 16 cores (two 8-core 2.60 GHz Intel Xeon E5-2670 processors) and 64 GB RAM.
  • 88 Nehalem architecture compute nodes, each configured with 12 cores (two 6-core 2.67 GHz Intel Xeon X5650 processors) and 48 GB RAM.

All of the standard compute nodes are treated identically for purposes of Flux allocations and job scheduling. The 12- and 16-core standard Flux compute nodes are packaged in Dell C6100 4-node chassis, and the 20-core nodes are housed in IBM NeXtScale System M4 n1200 chassis.

The larger memory Flux hardware comprises 8 compute nodes:

  • 5 Haswell architecture compute nodes, each configured with 56 cores (four 14-core 2.2 GHz Intel Xeon E7-4850v3 processors) and 1.5 TB RAM.
  • 3 Sandybridge architecture compute nodes, each configured with 32 cores (four 8-core 2.4 GHz Intel Xeon E5-4640 processors) and 1 TB RAM.

The five 56-core nodes are housed in Lenovo x3850 X6 chassis; the three 32-core nodes are Dell R820 servers.

The Flux on Demand hardware is:

  • 22 Sandybridge architecture compute nodes, each configured with 16 cores (two 8-core 2.60 GHz Intel Xeon E5-2670 processors) and 64 GB RAM.
  • 296 Nehalem architecture compute nodes, each configured with 12 cores (two 6-core 2.67 GHz Intel Xeon X5650 processors) and 48 GB RAM.
  • 3 Nehalem architecture compute nodes, each configured with 40 cores (four 10-core 2.27 GHz Intel Xeon E7-4860 processors) and 1 TB RAM.

The 12- and 16-core compute nodes are packaged in Dell C6100 4-node chassis, and the three 40-core larger memory Flux compute nodes are packaged in Dell R910 server chassis.

Networking

The compute nodes are all interconnected with InfiniBand networking. The InfiniBand fabric is based on the Mellanox quad-data rate (QDR) platform in the Voltaire Grid Director 4700, which provides 40 Gbps of bandwidth and sub-5μs latency per host. Four Grid Director 4700 switches are connected to each other with 240 Gbps of bandwidth each.

In addition to the InfiniBand networking, there is a gigabit Ethernet network that also connects all of the nodes. This is used for node management and NFS file system access.

For those who may need to adjust firewall rules to allow traffic from the Flux cluster, the following list shows the network ranges in use.

  • Login nodes (both Flux and Armis): 141.211.22.192/27 (Ethernet) and 141.211.19.0/27 (Ethernet)
  • Flux compute nodes: 10.164.0.0/21 (Ethernet) and 10.224.72.0/21 (high-speed IPoIB)
  • Armis compute nodes: 10.224.37.0/24 (Ethernet) and 10.246.1.0/24 (high-speed IPoIB)
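
As a convenience when checking firewall rules, the short C program below reports whether a given IPv4 address falls inside one of the ranges listed above. It is only an illustrative sketch, not part of any Flux tooling, and the hard-coded ranges should be re-verified against the list above before being relied on.

    /* cidr_check.c -- report whether an IPv4 address falls inside one of the
     * Flux/Armis network ranges listed above.  Illustrative only; the ranges
     * are copied from the list and should be re-verified before use.
     * Build:  cc -o cidr_check cidr_check.c
     * Run:    ./cidr_check 10.164.3.7
     */
    #include <arpa/inet.h>
    #include <stdint.h>
    #include <stdio.h>

    static const char *ranges[] = {
        "141.211.22.192/27", "141.211.19.0/27",  /* login nodes (Flux and Armis) */
        "10.164.0.0/21",     "10.224.72.0/21",   /* Flux compute nodes           */
        "10.224.37.0/24",    "10.246.1.0/24",    /* Armis compute nodes          */
    };

    /* Return 1 if addr lies inside the CIDR block "a.b.c.d/len", else 0. */
    static int in_cidr(struct in_addr addr, const char *cidr)
    {
        char net[INET_ADDRSTRLEN];
        int len;
        struct in_addr base;

        if (sscanf(cidr, "%15[^/]/%d", net, &len) != 2 ||
            inet_pton(AF_INET, net, &base) != 1 || len < 0 || len > 32)
            return 0;
        uint32_t mask = (len == 0) ? 0 : htonl(0xffffffffu << (32 - len));
        return (addr.s_addr & mask) == (base.s_addr & mask);
    }

    int main(int argc, char **argv)
    {
        struct in_addr addr;

        if (argc != 2 || inet_pton(AF_INET, argv[1], &addr) != 1) {
            fprintf(stderr, "usage: %s <ipv4-address>\n", argv[0]);
            return 2;
        }
        for (size_t i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
            if (in_cidr(addr, ranges[i])) {
                printf("%s is inside %s\n", argv[1], ranges[i]);
                return 0;
            }
        }
        printf("%s is not in any of the listed ranges\n", argv[1]);
        return 1;
    }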

To discuss high-speed connections to the Flux or Armis clusters, please contact hpc-support@umich.edu.

Storage

The high-speed scratch file system is based on Lustre v2.5 and DDN SFA10000 storage, served by the hardware described in the following table:

Server Type    Network Connection    Disk Capacity (raw/usable)
Dell R610      40 Gbps InfiniBand    520 TB / 379 TB
Dell R610      40 Gbps InfiniBand    530 TB / 386 TB
Dell R610      40 Gbps InfiniBand    530 TB / 386 TB
Dell R610      40 Gbps InfiniBand    520 TB / 379 TB
Totals         160 Gbps              2100 TB / 1530 TB

Operation

Computing jobs on Flux are managed through a combination of the Moab Scheduler, the Terascale Open-Source Resource and QUEue Manager (Torque), and the Gold Allocation Manager from Adaptive Computing.

Software

There are three layers of software on Flux.

Operating Software

The Flux cluster runs CentOS 7. We update the operating system on Flux as CentOS releases new versions and as our library of third-party applications adds support for them. Because we must support several types of drivers (AFS and Lustre file system drivers, InfiniBand network drivers, and NVIDIA GPU drivers) and dozens of third-party applications, we are cautious about upgrading and can lag CentOS releases by months.

Compilers and Parallel and Scientific Libraries

Flux supports the GNU Compiler Collection, the Intel compilers, and the PGI compilers for C and Fortran. The Flux cluster's parallel library is OpenMPI; the default version is 1.10.2, and a limited set of earlier versions is also available. Flux provides the Intel Math Kernel Library (MKL) set of high-performance mathematical libraries. Other common scientific libraries, including HDF5, NetCDF, FFTW3, and Boost, are compiled from source.
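
A minimal MPI program, such as the sketch below, is a quick way to confirm that a compiler and the OpenMPI library are working together. The build and run commands in the comments assume that the OpenMPI wrapper compiler (mpicc) and launcher (mpirun) are on your path; they are illustrative rather than a prescribed Flux workflow.

    /* hello_mpi.c -- minimal check that a compiler and OpenMPI work together.
     * Build and run commands are illustrative:
     *   mpicc -O2 -o hello_mpi hello_mpi.c
     *   mpirun -np 4 ./hello_mpi
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank         */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes   */
        MPI_Get_processor_name(host, &len);     /* node the rank is running on */

        printf("rank %d of %d on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }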

Please contact us if you have questions about the availability of, or support for, any other compilers or libraries.

Application Software

Flux supports a wide range of application software. We license common engineering simulation software, for example Ansys, Abaqus, and VASP, and we compile other packages for use on Flux, for example OpenFOAM and Abinit. We also provide software for statistics, mathematics, debugging, profiling, and more. Please contact us if you wish to inquire about the current availability of a particular application.

GPUs

Flux has 24 NVIDIA K20X GPUs connected to 3 compute nodes, 24 K40 GPUs connected to 6 nodes, and 12 TITAN V GPUs connected to 3 nodes.

Each GPU allocation comes with 2 compute cores and 8 GB of CPU RAM.

FluxG GPU Specifications

GPU model                            NVIDIA K20X        NVIDIA K40         NVIDIA TITAN V
Number and type of GPU               one Kepler GK110   one Kepler GK110B  one GV100
Peak double-precision performance    1.31 Tflops        1.43 Tflops        7.5 Tflops
Peak single-precision performance    3.95 Tflops        4.29 Tflops        15 Tflops
Tensor performance (deep learning)   n/a                n/a                110 Tflops
Memory bandwidth (ECC off)           250 GB/sec         288 GB/sec         652.8 GB/sec
Memory size                          6 GB GDDR5         12 GB GDDR5        12 GB HBM2
CUDA cores (single precision)        2688               2880               5120
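
To confirm which of the GPUs described above are visible to a job, a small device-query program such as the sketch below can be compiled and run on a GPU node. It uses the CUDA runtime C API and assumes a CUDA toolkit (for example, nvcc) is available; it is an illustrative sketch, not official Flux tooling.

    /* gpu_query.c -- list the GPUs visible to a job via the CUDA runtime API.
     * Illustrative sketch; assumes a CUDA toolkit is available, e.g.:
     *   nvcc -o gpu_query gpu_query.c
     */
    #include <cuda_runtime_api.h>
    #include <stdio.h>

    int main(void)
    {
        int count = 0;

        if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
            fprintf(stderr, "no CUDA-capable device is visible to this job\n");
            return 1;
        }
        for (int i = 0; i < count; i++) {
            struct cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            /* one line per visible device: name, memory size, compute capability */
            printf("GPU %d: %s, %.1f GiB memory, compute capability %d.%d\n",
                   i, prop.name,
                   prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0),
                   prop.major, prop.minor);
        }
        return 0;
    }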

If you have questions, please send email to hpc-support@umich.edu.

Order Service

For information on determining the size of a Flux allocation, please see our pages on How Flux Works, Sizing a Flux Order, and Managing a Flux Project.

To order:

Email hpc-support@umich.edu with the following information:

  • the number of cores needed
  • the start date and number of months for the allocation
  • the shortcode for the funding source
  • the list of people who should have access to the allocation
  • the list of people who can change the user list and augment or end the allocation.

For information on costs, visit our Rates page.
