Spark SQL is a way to query your data with a familiar SQL-like language while taking advantage of the speed of Spark, a fast, general-purpose engine for data processing that runs on Hadoop. I wanted to test this out on a dataset I found from Walmart with their stores' weekly sales numbers. I put the CSV into our cluster's HDFS (in /var/walmart), making it accessible to all Flux Hadoop users.
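Here's a minimal sketch of the kind of query this enables, run through PySpark. The path matches the HDFS location above, but the column names (Store, Weekly_Sales) are assumptions about the CSV's header, so adjust them to the actual schema:

```python
# Minimal Spark SQL sketch: total weekly sales per store.
# Assumes the CSV has a header row with Store and Weekly_Sales columns
# (hypothetical names; check the actual file).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("walmart-sales").getOrCreate()

# Read the CSV from the HDFS location mentioned above.
df = spark.read.csv("hdfs:///var/walmart", header=True, inferSchema=True)
df.createOrReplaceTempView("sales")

# Standard SQL over the registered view: top 10 stores by total sales.
spark.sql("""
    SELECT Store, SUM(Weekly_Sales) AS total_sales
    FROM sales
    GROUP BY Store
    ORDER BY total_sales DESC
""").show(10)
```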
Advanced Research Computing – Technology Services (ARC-TS) is pleased to announce an expanded data science computing platform, giving all U-M researchers new capabilities to host structured and unstructured databases, and to ingest, store, query and analyze large datasets.
The new platform features a flexible, robust and scalable database environment, and a set of data pipeline tools that can ingest and process large amounts of data from sensors, mobile devices and wearables, and other sources of streaming data. The platform leverages the advanced virtualization capabilities of ARC-TS’s Yottabyte Research Cloud (YBRC) infrastructure, and is supported by U-M’s Data Science Initiative launched in 2015. YBRC was created through a partnership between Yottabyte and ARC-TS announced last fall.
The following functionalities are immediately available:
- Structured databases: MySQL/MariaDB and PostgreSQL.
- Unstructured databases: Cassandra, MongoDB, InfluxDB, and ElasticSearch, along with Grafana for visualization.
- Data ingestion: Redis, Kafka, RabbitMQ.
- Data processing: Apache Flink, Apache Storm, Node.js and Apache NiFi.
Other types of databases can be created upon request.
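To illustrate the ingestion side, here is a minimal sketch of publishing a sensor reading to Kafka with the kafka-python client; the broker address and topic name are placeholders, not actual YBRC endpoints:

```python
# Hedged sketch: send one JSON-encoded sensor reading to a Kafka topic.
# Broker address and topic name are placeholders; actual connection
# details would come with your provisioned YBRC ingestion service.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka.example.edu:9092",  # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send("sensor-readings", {"sensor_id": 42, "temp_c": 21.7})
producer.flush()  # block until the message is delivered
```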
These tools are offered to all researchers at the University of Michigan free of charge, provided that certain usage limits are not exceeded. Large-scale users who outgrow the no-cost allotment may purchase additional YBRC resources. All interested parties should contact firstname.lastname@example.org.
At this time, the YBRC platform only accepts unrestricted data. The platform is expected to accommodate restricted data within the next few months.
ARC-TS also operates a separate data science computing cluster available for researchers using the latest Hadoop components. This cluster also will be expanded in the near future.
XSEDE Allocations award eligible users access to compute, visualization, and/or storage resources as well as extended support services.
XSEDE offers several types of allocations, from short-term exploratory requests to year-long projects. You must have an allocation to access XSEDE resources. Submit your allocation requests via the XSEDE Resource Allocation System (XRAS) in the XSEDE User Portal.
ARC-TS consultants can help researchers navigate the XSEDE resources and allocation process. Contact them at email@example.com.
Advanced Research Computing – Technology Services (ARC-TS) will be deploying a new research storage service aimed at serving faculty in the Big Data era. This service, called Locker, complements our existing general-purpose storage service, Turbo, and our planned archive service, Data Den.
Locker is cost-optimized storage for large files and is not suited to general-purpose or small-file use.
Faculty can now buy in at a one-time cost to bootstrap the service. Faculty interested in this option ahead of the general service will need to commit to at least 200 TB un-replicated or 100 TB replicated of space, at a one-time cost of $175/TB un-replicated or $350/TB replicated, covering 5 years. This works out to a minimum purchase of $35,000, with no further costs for 5 years.
If you are interested, please contact ARC-TS by July 10, 2017, at firstname.lastname@example.org.
Q: When will Locker be ready if I contribute funds to its launch?
A: Locker aims to be on site and ready for data by Fall semester (2017).
Q: When will Locker be ready as a monthly service?
A: The current timeline aims for early November 2017.
Q: What if I need less than 100TB replicated or 200TB un-replicated?
A: After the early period, smaller allocations will be available. Contact us at email@example.com to discuss your needs.
Q: Can I keep data beyond 5 years?
A: Yes. Options will exist beyond the first 5 years with new ongoing costs for support of the system.
Q: What is a large file for Locker?
A: Locker is optimized for files of roughly 1 MB or larger; datasets with typical file sizes below that are better served by general-purpose storage such as Turbo.
Q: With what methods does one access Locker?
A: Locker will support NFS (v3/v4) and CIFS/SMB to workstations, servers, and clusters.
Q: Can I use Locker with Sensitive Data such as HIPAA/PHI?
A: Locker comes with encryption at rest and will eventually support HIPAA/PHI data and more. It will NOT support sensitive data during the early user period. Sensitive data clearance work will start once the system is in place, and should be ready 2-3 months later.
Q: Can I pay for Locker monthly rather than up front?
A: Locker will eventually be a monthly service similar to Turbo, but during the early period we are looking for faculty to commit to a minimum amount of storage at a one-time cost (hardware only) to bootstrap the service and keep future prices low.
Q: Can I add more capacity?
A: Yes, you can request more capacity at any time. Because of the design, larger requests will require a few weeks' lead time. To keep costs low, Locker does not keep significant extra capacity idle, but it can grow at any time to sizes in the tens of PB.
Q: What optional features exist?
A: The features are:
- Optional Geographic Replication
- Optional Snapshots
Q: Can I use this for clinical care / enterprise use cases?
A: No. Locker has a 9:00 a.m. – 5:00 p.m. support window and is not architected for enterprise availability. We recommend MiStorage for enterprise use cases, or comparable services from HITS for clinical care.
Q: Does Locker include backups?
A: Locker does not include backups. It does include optional geographic replication and snapshots, which provide some protection against user deletion and major disaster but do not protect against software or administrator error the same way backups do. For backups we recommend MiBackup.
Q: Who should use Locker?
A: Researchers whose datasets' typical file size exceeds 1 MB can use Locker to store their data more cost-efficiently than other options at the University.
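If you are unsure whether your data fits that profile, a quick standard-library Python check like the one below reports the number of files and their median size; the directory path is a placeholder:

```python
# Estimate a dataset's typical file size to gauge Locker suitability.
# The directory path is a placeholder; point it at your own data.
import statistics
from pathlib import Path

sizes = [p.stat().st_size for p in Path("/path/to/dataset").rglob("*") if p.is_file()]
if sizes:
    print(f"{len(sizes)} files, median size {statistics.median(sizes) / 1e6:.2f} MB")
```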
Q: Why should I contribute to the launch of Locker?
A: Locker aims to provide a cost-effective solution for big-data storage. To do this, a minimum amount of space needs to be allocated. By contributing, you secure the option of low-cost storage for your research going forward.
The University of Michigan is beginning the process of building our next generation HPC platform, “Big House.” Flux, the shared HPC cluster, has reached the end of its useful life. Flux has served us well for more than five years, but as we move forward with replacement, we want to make sure we’re meeting the needs of the research community.
ARC-TS will be holding a series of town halls to take input from faculty and researchers on the next HPC platform to be built by the University. These town halls are open to anyone and will be held at:
College of Engineering, Johnson Room, Tuesday, June 20th, 9:00a – 10:00a
NCRC Bldg 300, Room 376, Wednesday, June 21st, 11:00a – 12:00p
LSA #2001, Tuesday, June 27th, 10:00a – 11:00a
3114 Med Sci I, Wednesday, June 28th, 2:00p – 3:00p
Your input will help to ensure that U-M is on course for providing HPC, so we hope you will make time to attend one of these sessions. If you cannot attend, please email firstname.lastname@example.org with any input you want to share.
A series of training workshops in high performance computing will be held May 15, May 17 and May 24, 2017, presented by CSCAR in conjunction with Advanced Research Computing – Technology Services (ARC-TS). All sessions are held at East Hall, Room B254, 530 Church St.
Introduction to the Linux Command Line
This course will familiarize students with the basics of accessing and interacting with Linux computers through the GNU/Linux operating system's Bash shell, also known as the "command line."
• Monday, May 15, 9 a.m. – noon.
Introduction to the Flux cluster and batch computing
This workshop will provide a brief overview of the components of the Flux cluster, including the resource manager and scheduler, and will offer students hands-on experience.
• Wednesday, May 17, 1 – 4:30 p.m.
Advanced batch computing on the Flux cluster
This course will cover advanced areas of cluster computing on the Flux cluster, including common parallel programming models, dependent and array scheduling, and a brief introduction to scientific computing with Python, among other topics.
• Wednesday, May 24, 1 – 5 p.m.
NOTE: Additional workshops may be scheduled if demand warrants. Please sign up for the waiting list if the workshops are full, and you will be given first priority for any additional sessions.
The Center for Human Growth and Development (CHGD) held a workshop on functional near-infrared spectroscopy (fNIRS), a form of neuroimaging, with a special focus on pediatric applications. The workshop was sponsored by units at U-M, as well as units from Eastern Michigan University and Gallaudet University. It was attended by 50 people from as far away as Texas, and included research talks, instructional sessions, and hands-on experience with fNIRS data processing. The workshop was the first of its kind at U-M.
CHGD, ARC-TS, and LSA IT staff collaborated to provide a remote neuroimaging computing environment, including a graphical interface, access to Matlab, and a suite of fNIRS software, which participants accessed from their laptops during the workshop. The attendees rated the practice exercises done via the computing environment one of the most important components of the workshop.
ARC-TS was pleased to be able to contribute to training in computational tools needed for emerging methods in neuroimaging. For more information about the fNIRS workshop, please see http://chgd.umich.edu/facilities-resources/developmental-neuroscience-laboratories/fnirs/fnirs-workshop/
The Institute for Healthcare Policy and Innovation (IHPI) is partnering with Advanced Research Computing (ARC) to bring two commercial claims datasets to campus researchers.
The OptumInsight and Truven MarketScan datasets contain nearly complete insurance claims and other health data on tens of millions of people representing the US private insurance population. Within each dataset, records can be linked longitudinally for over 5 years.
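As a rough illustration of that longitudinal structure, the pandas sketch below measures how long each enrollee is observed in a claims extract; the file name and column names (enrollee_id, service_date) are hypothetical and do not reflect the actual OptumInsight or MarketScan schemas:

```python
# Hypothetical sketch of longitudinal linkage in a claims extract.
# Column and file names are illustrative, not the real dataset schema.
import pandas as pd

claims = pd.read_csv("claims_extract.csv", parse_dates=["service_date"])

# Sort each enrollee's claims chronologically, then measure the span
# of time over which that person appears in the data.
claims = claims.sort_values(["enrollee_id", "service_date"])
span = claims.groupby("enrollee_id")["service_date"].agg(["min", "max"])
span["years_observed"] = (span["max"] - span["min"]).dt.days / 365.25

print(span["years_observed"].describe())
```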
To begin working with the data, researchers should submit a brief analysis plan for review by IHPI staff, who will create extracts or grant access to primary data as appropriate.
CSCAR consultants are available to provide guidance on computational and analytic methods for a variety of research aims, including use of Flux and other U-M computing infrastructure for working with these large and complex repositories.
The data acquisition and availability was funded by IHPI and the U-M Data Science Initiative.