Tag

data analysis

PFAS research in the Michigan mother-infant pairs study, supported by ITS, SPH, MM, AGC

PFAS (per- and polyfluoroalkyl substances) are a class of chemicals that have been around since the 1940s and became more broadly used in the post-war 1960s era. PFAS are in our homes, offices, water, and even our food and blood. PFAS break down slowly and are difficult to process, both in the environment and in our bodies.

Scientific studies have shown that exposure to some PFAS in the environment may be linked to harmful health effects in humans and animals. Because there are thousands of PFAS chemicals found in many different consumer, commercial, and industrial products, it is challenging to study and assess the human health and environmental risks. 

Fortunately, some of the most persistent PFAS are being phased out. The EPA has been working on drinking water protections, scientists are working on ways to break down and eliminate PFAS, and PFAS are being addressed at a national level.

A team of University of Michigan researchers from the School of Public Health DoGoodS-Pi Environmental Epigenetics Lab and Michigan Medicine is working to understand how behaviors and environments during pregnancy can cause changes to the way genes work in offspring. This emerging field is known as toxicoepigenetics.

Jackie Goodrich, Ph.D., research associate professor at the U-M School of Public Health, led the team. “PFAS may impact the development of something we all have called the epigenome. The epigenome is a set of modifications on top of our DNA that controls normal development and function. Environmental exposures like PFAS can alter how the epigenome forms, and this impacts development and health. Our study expands on current knowledge about PFAS and the epigenome by focusing on a type of epigenetic mark that is not usually measured.”

Vasantha Padmanabhan, Ph.D., M.S., professor emerita (in service), Department of Pediatrics, Michigan Medicine, built the Michigan Mother-Infant Pairs study over the past decade with an emphasis on identifying harmful exposures during pregnancy that impact women and their newborns. “I am so grateful to those who engaged in this study. PFAS are complex, and mothers’ and infants’ involvement helped us work toward a solution that impacts us all. I want to acknowledge the contributions of the U-M Department of Obstetrics and Gynecology, Michigan Institute for Clinical & Health Research (MICHR), and the Von Voigtlander Women’s Hospital that made this study possible.” 

Rebekah Petroff, Ph.D., a research fellow with Environmental Health Sciences, led the computation portion of the research. She said that using Turbo to store the raw data and Great Lakes for high-performance computing (HPC) enabled the much faster analysis that a study with this much data required.

Turbo and Great Lakes are services provided by Advanced Research Computing, a division of Information and Technology Services (ITS). ARC facilitates powerful approaches to complex research challenges in fields ranging from physics to linguistics, and from engineering to medicine.

Petroff said, “This analysis would have taken over a month straight of computing time on a regular desktop computer. The first job we submitted to Great Lakes ran so fast—I had results the next morning! Great Lakes made this research possible, and I believe that our study results can be broadly impactful to public health and toxicoepigenetics going forward.”

Support for using this complex technology also came from Dan Barker, a UNIX systems admin with the U-M School of Public Health Biostatistics Department. Barker assisted with the code needed to use Great Lakes. “We started with a test run of a few hundred pairs of genomes. Once we were successful with that, we ran the entire nearly 750,000 epigenetic marks across 141 people and seven different PFAS.”

Barker also helped design and submit array jobs, which are a series of identical, or nearly identical, tasks that are run multiple times. This is a common technique used by researchers when leveraging HPC. Array jobs allow for essential analytical comparisons among the test results. Petroff said, “In our study, we used an array job to split up our computations so that they ran much more efficiently!”
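As a rough illustration of how an array job can split up a large computation, the Python sketch below divides roughly 750,000 marks into contiguous chunks, one per array task. It is not the team's code. It assumes a Slurm-style scheduler that sets the SLURM_ARRAY_TASK_ID environment variable for each task, and the array size of 100 and the helper name task_slice are hypothetical.

```python
import os


def task_slice(n_items: int, n_tasks: int, task_id: int) -> range:
    """Return the contiguous range of item indices handled by one array task."""
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must be in [0, n_tasks)")
    # Spread any remainder over the first `extra` tasks so chunk sizes differ by at most 1.
    base, extra = divmod(n_items, n_tasks)
    start = task_id * base + min(task_id, extra)
    stop = start + base + (1 if task_id < extra else 0)
    return range(start, stop)


N_MARKS = 750_000  # approximate number of epigenetic marks quoted in the article
N_TASKS = 100      # hypothetical array size, chosen only for illustration

# Assumption: the scheduler exports SLURM_ARRAY_TASK_ID; fall back to 0 when run locally.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
marks = task_slice(N_MARKS, N_TASKS, task_id)
print(f"task {task_id}: marks {marks.start} to {marks.stop - 1} ({len(marks)} marks)")
```

Each task runs the same script and differs only in the index range it receives, which is what makes the tasks "identical, or nearly identical."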

The U-M Advanced Genomics Core (AGC) performed the epigenetic assays for this project; these are laboratory techniques that measure marks on DNA. AGC is part of the Biomedical Research Core Facilities (BRCF), campus-wide laboratories that develop and provide state-of-the-art scientific resources to enable biomedical research. Other BRCF cores also worked on this project, including the Epigenomics Core and the Bioinformatics Core.

Genotyping is similar to reading a few words scattered on a page. This process gives researchers small packets of data to compare. Genotyping looks for information at a specific place in the DNA where we know important data will be. This project used a type of genotyping called microarrays (also known as “arrays”), which help researchers understand how the regulation of DNA, including the methylation and hydroxymethylation measured in this study, is impacted by exposures like PFAS.

Brock Palen, ARC director, said, “This research is of human interest and impacts all of us. I’m pleased that ARC assisted their research with staff expertise, equipment, and no-cost allocations from the U-M Research Computing Package.”

Petroff said that follow-up studies are needed to better understand whether the results are universal or specific to this cohort of infants and parents. If the results hold steady, then a significant discovery has been made that will lead to more comprehensive PFAS mitigation solutions. “Although steps are being taken to mitigate PFAS, exposure is still prevalent, and a deeper understanding of how it impacts humans is needed,” said Dana Dolinoy, Ph.D., NSF International Department Chair of Environmental Health Sciences and epigenetics expert.

Read the full article: Mediation effects of DNA methylation and hydroxymethylation on birth outcomes after prenatal per- and polyfluoroalkyl substances (PFAS) exposure in the Michigan mother–infant pairs cohort.

Funding was provided by grants from the National Institutes of Health, the U.S. Environmental Protection Agency, and the National Institute of Environmental Health Sciences Children’s Health Exposure Analysis Resource program.

Introduction to SPSS

Audience: First-time SPSS users who will be using SPSS for Windows.  Those using SPSS for Unix or Macintosh should email the instructor at cpow@umich.edu before enrolling.

Note: Topic order is subject to change.  Participants must sign up for the entire series.

Fundamentals

This portion introduces SPSS for Windows, the menu and the help systems, the three main types of files used, and printing from within SPSS.  It then addresses defining variables, attaching labels, defining missing values, and various ways to enter data into SPSS.  Finally, it covers a brief introduction to obtaining frequency distributions, descriptive statistics, and cross tabulations of variables.

Within-Case Transformations

This portion introduces data management capabilities, including recoding variables (manual and automatic), computing new variables using formulas, and counting occurrences of values within subjects.  Attention then turns to temporary transformations, conditional processing of transformations, and repetitive transformations.  SPSS syntax is also introduced.

Data Management with Multiple Files

This portion begins with a discussion of subsetting data files by drawing samples, selecting groups and excluding groups from analysis.  Then, the two main methods of merging SPSS data files are covered: adding additional variables and adding additional cases.  Next, creating aggregated data sets and applying aggregated data to individuals is covered.  Lastly, importing and exporting data between SPSS and other statistical programs (Excel, dBase, SAS) is demonstrated.

Basic Statistics and Graphics

This portion covers basic exploratory procedures, including obtaining percentiles, frequencies, descriptive statistics, and cross tabulations. Basic comparative procedures including two-sample t-tests, paired t-tests, and one-way analysis of variance are also covered.  Then, simple bivariate correlation analysis is introduced.  Participants are given a basic introduction to commonly used graphical procedures for displaying data, including scatter plots, bar graphs, histograms, and boxplots.

Registration

To register for CSCAR Workshops, call the CSCAR front desk at (734) 764-7828 or come to the office in person with cash, a check, or a U-M 6-digit department shortcode:

OFFICE HOURS

9:00 a.m. – 5:00 p.m., Monday through Friday
Closed 12:00 p.m. – 1:00 p.m. every Tuesday for a staff meeting.
Voice: (734) 764-7828 (4-STAT from a campus phone)
Fax: (734) 647-2440

ADDRESS

Center for Statistical Consultation and Research (CSCAR)
The University of Michigan
3550 Rackham
915 E. Washington St.
Ann Arbor, MI 48109-1070

Image processing III

If you use image data in your work but are not trained to analyze it, this workshop could be for you. This is the third workshop in the series and builds on the material covered in the two previous workshops last semester. We will cover texture analysis, the Hough transform, and frequency domain methods.

If you have not been exposed to Fourier analysis, consider attending the CSCAR workshop ‘Fourier transform and its applications in data analysis’.

Fourier transform and its applications in data analysis

Spectral decomposition of time series (1-D) and image (2-D) data is a commonly used technique across various disciplines that use sensors for data collection. Fourier analysis is the foundation of spectral decomposition methods and provides basis (and intuition) for the more advanced methods in time-frequency analysis such as wavelets and Wigner-Ville decomposition. This workshop will cover 1-D Fourier transform with applications to signals and time series data and will also provide a flavor of applications in image processing.
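For readers new to the idea, the following Python sketch shows what spectral decomposition of a 1-D signal looks like in practice. It is only an illustration, not workshop material: it uses a naive discrete Fourier transform written with the standard library rather than a fast FFT routine, and the test signal, the sample rate, and the function names dft and dominant_frequency are all made up for this example.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def dominant_frequency(signal, sample_rate):
    """Return the frequency (in Hz) of the largest non-DC peak in the one-sided spectrum."""
    n = len(signal)
    spectrum = dft(signal)
    # For a real signal only bins 1..n/2 carry independent information.
    k_peak = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return k_peak * sample_rate / n


# Hypothetical one-second signal: a 5 Hz sine plus a weaker 12 Hz sine, sampled at 64 Hz.
sample_rate = 64
samples = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(sample_rate)
]
print(dominant_frequency(samples, sample_rate))  # prints 5.0
```

The transform separates the two sine components into distinct frequency bins, and the larger-amplitude 5 Hz component produces the largest peak.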

SPSS I: Introduction to SPSS

Note: Topic order is subject to change.

This workshop is designed to introduce participants to SPSS. It will cover the fundamentals of SPSS, within-case transformations, data management with multiple files, and basic statistics and graphics. It is useful for any scholar engaged in quantitative research.

Fundamentals

This portion introduces SPSS, the menu and the help systems, and the three main types of files used.  It then addresses defining variables, attaching labels, defining missing values, and various ways to enter data into SPSS.  Finally, it covers a brief introduction to obtaining frequency distributions, descriptive statistics, and cross tabulations of variables.

Within-Case Transformations

This portion introduces data management capabilities, including recoding variables (manual and automatic), computing new variables using formulas, and counting occurrences of values within subjects.  Attention then turns to temporary transformations, conditional processing of transformations, and repetitive transformations.

Data Management with Multiple Files

This portion begins with a discussion of subsetting data files by drawing samples, selecting groups and excluding groups from analysis.  Then, the two main methods of merging SPSS data files are covered: adding additional variables and adding additional cases.
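The difference between the two merge methods can be sketched outside SPSS. The Python example below is an illustration only, with made-up data and hypothetical function names: adding cases stacks files that share variables (more rows), while adding variables matches rows on a key (more columns). In SPSS syntax these operations correspond to the ADD FILES and MATCH FILES commands.

```python
def add_cases(file_a, file_b):
    """Stack two datasets that share the same variables: the result has more rows."""
    return file_a + file_b


def add_variables(file_a, file_b, key):
    """Match rows on a key variable: the result has more columns per row."""
    lookup = {row[key]: row for row in file_b}
    # Rows in file_a with no match in file_b are kept unchanged.
    return [{**row, **lookup.get(row[key], {})} for row in file_a]


# Hypothetical data: two groups of respondents and a separate file of scores.
group1 = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
group2 = [{"id": 3, "age": 29}]
scores = [{"id": 1, "score": 88}, {"id": 2, "score": 92}]

stacked = add_cases(group1, group2)              # 3 rows, same variables
merged = add_variables(group1, scores, key="id")  # 2 rows, one extra variable
print(len(stacked), merged[0])
```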

Basic Statistical Analysis

This portion includes a brief demonstration of a statistical analysis in SPSS. While not delving deep into statistical theory, we will cover the basics of an analysis, as well as discuss the graphing facilities in SPSS.

New private insurance claims dataset and analytic support now available to health care researchers

The Institute for Healthcare Policy and Innovation (IHPI) is partnering with Advanced Research Computing (ARC) to bring two commercial claims datasets to campus researchers.

The OptumInsight and Truven Marketscan datasets contain nearly complete insurance claims and other health data on tens of millions of people representing the US private insurance population. Within each dataset, records can be linked longitudinally for over 5 years.  

To begin working with the data, researchers should submit a brief analysis plan for review by IHPI staff, who will create extracts or grant access to primary data as appropriate.

CSCAR consultants are available to provide guidance on computational and analytic methods for a variety of research aims, including use of Flux and other UM computing infrastructure for working with these large and complex repositories.

Contact Patrick Brady (pgbrady@umich.edu) at IHPI or James Henderson (jbhender@umich.edu) at CSCAR for more information.

Data acquisition and availability were funded by IHPI and the U-M Data Science Initiative.