Winter HPC maintenance scheduled for Jan. 6-9

To accommodate updates to software, hardware, and operating systems, Flux, Beta, Armis, Cavium, and ConFlux, and their storage systems (/home and /scratch) will be unavailable starting at 6 a.m. Sunday, January 6th and returning to service on Wednesday, January 9th.  These updates will improve the performance and stability of ARC-TS services. We try to encapsulate the required changes into two maintenance periods per year and work to complete these tasks quickly, as we understand the impact of the maintenance on your research.

During this time, the following maintenance tasks are planned:

  • Preventative maintenance at the Modular Data Center (MDC) which requires a full power outage
  • InfiniBand networking updates (firmware and software)
  • Ethernet networking updates (datacenter distribution layer switches)
  • Operating system and software updates
  • Potential updates to job scheduling software
  • Migration of Turbo networking to new switches (affects /home and /sw)
  • Perform consistency checks on the Lustre file systems that provide /scratch
  • Update firmware and software of the GPFS file systems (ConFlux, starting 9 a.m., Monday, Jan. 7)
  • Perform consistency checks on the GPFS file systems that provide /gpfs (ConFlux, starting 9 a.m., Monday, Jan. 7) 

You can use the command “maxwalltime” to discover the amount of time remaining until the beginning of the maintenance. Jobs requesting more walltime than remains before the maintenance will be queued and started after the maintenance is completed.

All filesystems will be unavailable during the maintenance. We encourage you to copy any data that might be needed during that time from Flux prior to the start of the maintenance.

We will post status updates on our Twitter feed ( https://twitter.com/arcts_um ) throughout the course of the maintenance and send an email to all users when the maintenance has been completed.  Please contact hpc-support@umich.edu if you have any questions.