Using machine learning and the Great Lakes HPC Cluster for COVID-19 research

By | General Interest, Great Lakes, HPC, News, Research, Uncategorized

A researcher in the College of Literature, Science, and the Arts (LSA) is pioneering two separate, ongoing efforts for measuring and forecasting COVID-19: pandemic modeling and a risk tracking site.

The projects are led by Sabrina Corsetti, a senior undergraduate student pursuing dual degrees in honors physics and mathematical sciences, and supervised by Thomas Schwarz, Ph.D., associate professor of physics. 

The modeling uses a machine learning algorithm that can forecast future COVID-19 cases and deaths. The weekly predictions are made using the ARC-TS Great Lakes High-Performance Computing Cluster, which provides the speed and dexterity to run the modeling algorithms and data analysis needed for data-informed decisions that affect public health. 

Each week, 51 processes (one for each state and one for the U.S.) are run in parallel (at the same time). “Running all 51 analyses on our own computers would take an extremely long time. The analysis places heavy demands on the hardware running the computations, which makes crashes somewhat likely on a typical laptop. We get all 51 done in the time it would take to do 1,” said Corsetti. “It is our goal to provide accurate data that helps our country.”
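As a rough illustration of what running one analysis per region at the same time looks like, here is a minimal Python sketch. It is not the team's code: `forecast_region` and `run_all` are hypothetical names, the forecast itself is a placeholder, and on a cluster such as Great Lakes this kind of work would more likely be submitted through the job scheduler than through a single Python script.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

# 50 state abbreviations plus one national entry = 51 regions.
STATES = (
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY"
).split()
REGIONS = STATES + ["US"]


def forecast_region(region):
    """Placeholder for one region's weekly forecast (hypothetical).

    A real analysis would load the region's case data and fit a model here.
    """
    return region, f"forecast for {region}"


def run_all(regions, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Run one forecast per region concurrently and return results by region.

    ThreadPoolExecutor can be passed as executor_cls for a quick local check.
    """
    with executor_cls(max_workers=max_workers) as pool:
        return dict(pool.map(forecast_region, regions))
```

When using the default process pool, call `run_all(REGIONS)` from under an `if __name__ == "__main__":` guard so that worker processes can import the script safely.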

The predictions for the U.S. at the national and state levels are fed into the COVID-19 Forecasting Hub, which is led by the UMass-Amherst Influenza Forecasting Center of Excellence based at the Reich Lab. The weekly predictions generated by the hub are then read out by the Centers for Disease Control and Prevention (CDC) for their weekly forecast updates.

The second project, a risk tracking site, involves COVID-19 data-acquisition from a Johns Hopkins University repository and the Michigan Safe Start Map. This is done on a daily basis, and the process runs quickly. It only takes about five minutes, but the impact is great. The data populates the COVID-19 risk tracking site for the State of Michigan that shows by county the total number of COVID-19 cases, the average number of new cases in the past week, and the risk level.
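The "average number of new cases in the past week" can be derived from a series of cumulative case totals. The sketch below shows the arithmetic under stated assumptions: the input is a list of daily cumulative counts for one county, oldest first, and the function names and sample numbers are made up for illustration. It does not reproduce the site's actual pipeline or its risk levels.

```python
def daily_new_cases(cumulative):
    """Convert cumulative totals into day-over-day new case counts."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def average_new_cases_past_week(cumulative, days=7):
    """Average the most recent `days` daily new-case counts."""
    new_cases = daily_new_cases(cumulative)
    if len(new_cases) < days:
        raise ValueError(f"need at least {days + 1} cumulative totals")
    return sum(new_cases[-days:]) / days


# Eight made-up cumulative totals give seven daily differences.
sample_totals = [100, 110, 125, 130, 150, 160, 175, 205]
print(average_new_cases_past_week(sample_totals))  # 15.0
```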

“Maintaining the risk tracking site requires us to reliably update its data every day. We have been working on implementing these daily updates using Great Lakes so that we can ensure that they happen at the same time each day. These updates consist of data pulls from the Michigan Safe Start Map (for risk assessments) and the Johns Hopkins COVID-19 data repository (for case counts),” remarked Corsetti.

“We are proud to support this type of impactful research during the global pandemic,” said Brock Palen, director of Advanced Research Computing – Technology Services. “Great Lakes provides quicker answers and optimized support for simulation, machine learning, and more. It is designed to meet the demands of the University of Michigan’s most intensive research.”

ARC-TS is a division of Information and Technology Services (ITS). 

SEAS study wildlife refuge wetlands habitats using machine learning

By | General Interest, News, Research

This article was written by Taylor Gribble, the ARC-TS summer 2020 intern. 

A U-M School for Environment and Sustainability (SEAS) student team is working with the Shiawassee National Wildlife Refuge to study how fish move through different wetland habitats. Their work is primarily dependent on being in the field, but in March the pandemic delayed fieldwork. In June, the team of SEAS master’s students was allowed to begin socially distant fieldwork. But the question was: How? 

With the help of the ARC-TS Scientific Computing and Research Consulting Services, the SEAS students were able to pivot their research methodology and develop advanced analysis approaches for hydroacoustic data using strategically placed cameras and machine learning.

The Shiawassee refuge is divided into separately managed wetland units, which are created through human-made dikes and water control structures. These wetland units can be connected or cut off from one another and the Shiawassee River. An Adaptive Resolution Imaging Sonar (ARIS) camera has been placed at the connection point between the refuge’s “control” wetland units and the river to track fish movements between these two ecosystems.

In order to find answers about fish movement, the SEAS team is divided into three separate parts: 

  1. In-the-field monitoring of fish, macroinvertebrates, water quality, and vegetation
  2. ARIS camera work: understanding how to use ARIS footage to answer ecological questions using machine learning facilitated by the ARC-TS Data Consultation Service
  3. Community education and outreach regarding restoration work at the refuge

Meghan Richey, machine learning specialist, and Armand Burks, research data scientist, are part of the ARC-TS Data Science Consultation team. Together they are working to see the project through by understanding the needs of the SEAS research team and providing the necessary coding expertise. In addition, they are working to provide the SEAS team with tools to become independent programmers so they can implement programming/coding into their future research endeavors.  

Richey works with the machine learning team. Machine learning is a tool for turning information into knowledge. It automatically finds patterns in complex data that are difficult for a human to find. While traditional problem solving uses data and rules to find an answer, machine learning uses data and answers to find the rules that apply to a problem. Together, Richey and the SEAS team count the number of fish moving in front of the camera, which was originally placed in mid-March but removed in mid-May due to flooding from the dam breaches in Midland, Mich. The camera has since been placed back into the “avenue” between one of the managed wetland pools and the river. A single camera in the water takes underwater images of the fish as they swim by, and the team captures these frames. With the help of a machine learning algorithm, Richey and the SEAS team are able to count the number of fish they see in the camera feed.
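The contrast between "data and rules" and "data and answers" can be made concrete with a toy Python sketch. Everything here is assumed for illustration: the measurements, the labels, and the simple threshold search are invented, and this is not the algorithm used to count fish in the project.

```python
def rule_based(value, threshold=10.0):
    """Traditional approach: a person supplies the rule (the threshold)."""
    return value >= threshold


def learn_threshold(examples):
    """Learning approach: find the rule from data plus known answers.

    `examples` is a list of (value, label) pairs. Each midpoint between
    neighboring values is tried as a threshold, and the one that makes
    the fewest mistakes on the labeled examples is returned.
    """
    values = sorted(value for value, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def mistakes(threshold):
        return sum((value >= threshold) != label for value, label in examples)

    return min(candidates, key=mistakes)


# Made-up labeled measurements: True marks the examples of interest.
labeled = [(2.0, False), (4.0, False), (5.0, False),
           (9.0, True), (12.0, True), (15.0, True)]
print(learn_threshold(labeled))  # 7.0
```

With the hand-written rule, the 9.0 example would be misclassified. The learned threshold separates all six examples because it was derived from the answers themselves.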

Burks is responsible for the data conversion stages of the project. “They have a large amount of data that’s generated from the underwater camera. These aren’t the typical cameras as we would think of; they work with sonar which is based on sound. It is generating a lot of data in this sound-based sonar format that needs to be converted into something that is usable by the machine learning model.”

In order for the program to run smoothly and be able to count the fish, Burks and the SEAS team had to develop a tool that allows them to turn the raw data into an actual video feed. This requires a large data conversion to transform the raw sonar data into videos. Once this is completed, the SEAS research team watches a series of pre-recorded videos that are saved to files. From there, the machine learning algorithms can be built and analyzed.

The ARC-TS team plans to continue working with the students and the team at the refuge to refine their methods and test with recently collected footage.

Beta tool helps researchers manage IT services

By | General Interest, News, Research, Uncategorized

Since August 2019, ARC-TS has been developing a tool that would give researchers and their delegates the ability to directly manage the IT research services they consume from ARC-TS, such as user access and usage stats.

The ARC-TS Resource Management Portal (RMP) beta tool is now available for U-M researchers.

The RMP is a self-service-only user portal with tools and APIs for research managers, unit support staff, and delegates to manage their ARC-TS IT resources. Its functions include common activities such as managing user access (adding and removing users), viewing historical usage to make informed decisions about lab resource needs, and determining volume capacity at a glance.

The portal currently provides tools for use with Turbo Research Storage, a high-capacity, reliable, secure, and fast storage solution, and it is currently read-only. Longer-term, RMP will scale to include the other storage and computing services offered by ARC-TS.

To get started or find help, contact arcts-support@umich.edu.

MIDAS announces winners of 2018 poster competition

By | Educational, General Interest, Happenings, Research

The Michigan Institute for Data Science (MIDAS) is pleased to announce the winners of its 2018 poster competition, which is held in conjunction with the MIDAS annual symposium.

The symposium was held on Oct. 9-10, 2018, and the student poster competition had more than 60 entries. The winners, judged by a panel of faculty members, received cash prizes.

Best Overall

Arthur Endsley, “Comparing and timing business cycles and land development trends in U.S. metropolitan housing markets”

Most likely health impact

  • Yehu Chen, Yingsi Jian, Qiucheng Wu, Yichen Yang, “Compressive Big Data Analytics – CBDA: Applications to Biomedical and Health Studies”
  • Jinghui Liu, “An Information Retrieval System with an Iterative Pattern for TREC Precision Medicine”

Most likely transformative science impact

  • Prashant Rajaram, “Bingeability and Ad Tolerance: New Metrics for the Streaming Media Age”
  • Mike Ion, “Learning About the Norms of Teaching Practice: How Can Machine Learning Help Analyze Teachers’ Reactions to Scenarios?”

Most interesting methodological advancement

  • Nina Zhou and Qiucheng Wu, “DataSifter: Statistical Obfuscation of Electronic Health Records and Other Sensitive Datasets”
  • Aniket Deshmukh, “Simple Regret Minimization for Contextual Bandits”

Most likely societal impact

  • Ece Sanci, “Optimization of Food Pantry Locations to Address Food Scarcity in Toledo, OH”
  • Rohail Syed, “Human Perception of Surprise: A User Study”

Most innovative use of data

  • Lan Luo, “Renewable Estimation and Incremental Inference in Generalized Linear Models with Streaming Datasets”
  • Danaja Maldeniya, “Psychological Response of Communities Affected by Natural Disasters in Social Media”

MICDE to provide data analysis and dissemination support for $18 million tobacco research center

By | General Interest, Happenings, News, Research

The University of Michigan School of Public Health will house a new, multi-institutional center focusing on modeling and predicting the impact of tobacco regulation, funded with an $18 million federal grant from the National Institutes of Health and the Food and Drug Administration.

The Center for the Assessment of the Public Health Impact of Tobacco Regulations will be part of the NIH and FDA’s Tobacco Centers of Regulatory Science, the centerpiece of an ongoing partnership formed in 2013 to generate critical research that informs the regulation of tobacco products.

The Michigan Institute for Computational Discovery and Engineering (MICDE) will support the center’s Data Analysis and Dissemination core by collecting national and regional survey data, conducting analysis of the use of tobacco products including vaping and e-cigarettes, and disseminating the resulting tobacco modeling parameters to other research centers and the Food and Drug Administration.

The center is led by MICDE-affiliated faculty member Rafael Meza, associate professor of Epidemiology, and David Levy, professor of Oncology at Georgetown University.

For more on the center, see the press release from the U-M School of Public Health: https://sph.umich.edu/news/2018posts/tcors-091718.html

MDST group wins KDD best paper award

By | General Interest, Happenings, MDSTPosts, Research

A paper by members and faculty leaders of the Michigan Data Science Team (co-authors: Jacob Abernethy, Alex Chojnacki, Arya Farahi, Eric Schwartz, and Jared Webb) won the Best Student Paper award in the Applied Data Science track at the KDD 2018 conference in August in London.

The paper, ActiveRemediation: The Search for Lead Pipes in Flint, Michigan, details the group’s ongoing work in Flint to detect pipes made of lead and other hazardous material.

For more on the team’s work, see this recent U-M press release.

U-M part of new software institute on high-energy physics

By | General Interest, Happenings, News, Research

The University of Michigan is part of an NSF-supported 17-university coalition dedicated to creating next-generation computing power to support high-energy physics research.

Led by Princeton University, the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP) will focus on developing software and expertise to enable a new era of discovery at the Large Hadron Collider (LHC) at CERN in Geneva, Switzerland.

Shawn McKee, Research Scientist in the U-M Department of Physics, is a co-PI of the institute. His work will focus on integrating and extending the Open Science Grid networking activities with similar efforts at the LHC.

For more information, see Princeton’s press release, and the NSF’s announcement.

MIDAS researchers’ papers accepted at ACM KDD data science conference in London

By | General Interest, Happenings, News, Research

Several U-M faculty affiliated with MIDAS will participate in the KDD2018 Conference in London in August. The meeting is held by the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining (KDD).

U-M researchers had the following papers accepted:

Learning Adversarial Networks for Semi-Supervised Text Classification via Policy Gradient
Yan Li (U-M); Jieping Ye (U-M)

TINET: Learning Invariant Networks via Knowledge Transfer
Chen Luo (Rice University); Zhengzhang Chen (NEC Laboratories America); Lu-An Tang (NEC Laboratories America); Anshumali Shrivastava (Rice University); Zhichun Li (NEC Laboratories America); Haifeng Chen (NEC Laboratories America); Jieping Ye (U-M)

Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
Jiaqi Ma (U-M); Zhe Zhao (Google); Xinyang Yi (Google); Jilin Chen (Google); Lichan Hong (Google); Ed Chi (Google)

Learning Credible Models
Jiaxuan Wang (U-M); Jeeheh Oh (U-M); Haozhu Wang (U-M); Jenna Wiens (U-M)

Deep Multi-Output Forecasting: Learning to Accurately Predict Blood Glucose Trajectories
Ian Fox (U-M); Lynn Ang (U-M); Mamta Jaiswal (U-M); Rodica Pop-Busui (U-M); Jenna Wiens (U-M)

ActiveRemediation: The Search for Lead Pipes in Flint, Michigan
Jacob Abernethy (Georgia Institute of Technology); Alex Chojnacki (U-M); Arya Farahi (U-M); Eric Schwartz (U-M); Jared Webb (Brigham Young University)

Career Transitions and Trajectories: A Case Study in Computing
Tara Safavi (U-M); Maryam Davoodi (Purdue University); Danai Koutra (U-M)

In addition, U-M Professor Jieping Ye will present at the event’s Artificial Intelligence in Transportation tutorial, and U-M Assistant Professor Qiaozhu Mei will speak as part of Deep Learning Day.

MICDE awards seven Catalyst Grants

By | General Interest, Happenings, News, Research

The Michigan Institute for Computational Discovery and Engineering has awarded its second round of Catalyst Grants, providing between $80,000 and $90,000 each to seven innovative projects in computational science. The proposals were judged on novelty, likelihood of success at catalyzing larger programs and potential to leverage ARC’s computing resources.

The funded projects are:

Title: Exploring Quantum Embedding Methods for Quantum Computing
Researchers: Emanuel Gull, Physics; Dominika Zgid, Chemistry.
Description: The research team will design quantum embedding algorithms that can make early use of quantum computers in the development of advanced materials, with possible applications in modern batteries, next-generation oxide electronics, or high-temperature superconducting power cables.

Title: Teaching autonomous soft machines to swim
Researchers: Silas Alben, Mathematics; Robert Deegan, Physics; Alex Gorodetsky, Aerospace Engineering.
Description: Self-oscillating gels are polymeric materials that change shape, driven by chemical reactions occurring entirely within the gel. The research team will develop a computational and machine learning program to discover how to configure self-oscillating gels so that they undergo deformations that result in swimming. The long-term goal is to develop a general framework for controlling autonomous soft machines.

Title: Urban Flood Modeling at “Human Action” Scale: Harnessing the Power of Reduced-Order Approaches and Uncertainty Quantification
Researchers: Valeriy Ivanov, Civil and Environmental Engineering; Nikolaos Katopodes, Civil and Environmental Engineering; Darren McKague, Climate and Space Sciences and Engineering; Khachik Sargsyan, Sandia National Labs.
Description: The research team will demonstrate urban flood monitoring and prediction capabilities using NASA Cyclone Global Navigation Satellite System (CYGNSS) data and relying on state-of-the-science uncertainty quantification tools in a proof-of-concept urban flooding problem of high complexity.

Title: Advancing the Computational Frontiers of Solution-Adaptive, Scale-Aware Climate Models
Researchers: Christiane Jablonowski, Climate and Space Sciences and Engineering; Hans Johansen, Lawrence Berkeley National Lab.
Description: Researchers will further develop a 3-D mesh adaptation model for climate modeling, allowing computational resources to be focused on phenomena of interest such as tropical cyclones or other extreme weather events. The project will also introduce data-driven machine learning paradigms into modeling of clouds and precipitation.

Title: Deciphering the meaning of human brain rhythms using novel algorithms and massive, rare datasets
Researchers: Omar Ahmed, Psychology, Neuroscience and Biomedical Engineering.
Description: The team will develop a set of algorithms for use on high performance computers to analyze de-identified brain data from patients in order to better understand what electrical oscillations tell us about rapidly changing behavioral and pathological brain states.

Title: Embedded Machine Learning Systems To Sense and Understand Pollinator Behavior
Researchers: Robert Dick, Electrical Engineering and Computer Science; Fernanda Valdovinos, Ecology and Evolutionary Biology, Center for Complex Systems; Paul Glaum, Ecology and Evolutionary Biology.
Description: To understand the mechanisms driving the population dynamics of pollinators, the research team will develop technologies for deeply embedded hardware/software learning systems capable of remote, long-term, autonomous operation; and will analyze the resulting new data to better understand pollinator activity.

Title: Deep Learning for Phylogenetic Inference
Researchers: Jianzhi Zhang, Ecology and Evolutionary Biology; Yuanfang Guan, Computational Medicine and Bioinformatics.
Description: The research team will use deep neural networks to infer molecular phylogenies and extract phylogenetically useful patterns from amino acid or nucleotide sequences, which will help understand evolutionary mechanisms and build evolutionary models for a variety of analyses.

For more on the Catalyst Grants, see http://micde.umich.edu/catalyst/.