The Yottabyte Research Cloud is a partnership between ARC and Yottabyte that provides U-M researchers with high performance, secure and flexible computing environments enabling the analysis of sensitive data sets restricted by federal privacy laws, proprietary access agreements, or confidentiality requirements.

The system is built on Yottabyte’s composable, software-defined infrastructure platform, called Cloud Composer, and represents U-M’s first use of software-defined infrastructure for research, allowing the on-the-fly personalized configuration of any-scale computing resources.

Cloud Composer software inventories the physical CPU, RAM and storage components of Cloud Blox appliances into definable and configurable virtual resource groups that may be used to build multi-tenant, multi-site infrastructure as a service.

See the Sept. 2016 press release for more information.

The YBRC platform can accommodate restricted data. Please see this recent announcement for details.

Capabilities

The Yottabyte Research Cloud supports several existing and planned platforms for researchers at the University of Michigan:

  • Data Pipeline Tools, which include databases, message buses, data processing and storage solutions. This platform is designed for both restricted and unrestricted data; the tools are currently available only to users with unrestricted data.
  • Research Database Hosting, an environment that can house research-focused data stored in a number of different database engines.
  • Glovebox, a virtual desktop service for researchers who have restricted data and require higher security. (planned)
  • Virtual desktops for research. This service is similar to Glovebox but is suitable for unrestricted data. (planned)
  • Docker Container Service, which can deploy any research application that can be containerized. This service will be suitable for restricted and unrestricted data. (planned)

Researchers who need to use Hadoop or Spark for data-intensive work should explore ARC-TS’s separate Hadoop cluster.

Contact arcts-support@umich.edu for more information.

Hardware

The system deploys two types of YottaBlox nodes:

  • 40 high-performance hyperconverged YottaBlox nodes (H2400i-E5), each with:
    • two Intel Xeon E5-2680V4 CPUs (1,120 cores total)
    • 512 GB of DDR4 2400 MHz RAM (20,480 GB total)
    • dual-port 40GbE network adapters (80 total)
    • two 800 GB NVMe SSD DC P3700 drives (64 TB total)
  • 20 storage YottaBlox nodes (S2400i-E5-HDD), each with:
    • two Intel Xeon E5-2620V4 CPUs (320 cores total)
    • 128 GB of DDR4 2133 MHz RAM (2,560 GB total)
    • quad-port 10GbE network adapters (80 total)
    • two 800 GB DC S3610 SSDs (32 TB total)
    • twelve 6 TB 7200 RPM hard drives (1,440 TB total)
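The totals quoted in the hardware description can be reproduced from the per-node figures. The following is a minimal sketch, not an official inventory tool. The per-node counts are taken from this page; the cores-per-CPU values (14 for the E5-2680 v4 and 8 for the E5-2620 v4) are not stated here and are assumed from Intel's published specifications. The dictionary layout and function name are illustrative only.

```python
# Per-node figures are copied from the hardware description on this page.
# cores_per_cpu is NOT stated on this page: 14 (E5-2680 v4) and 8 (E5-2620 v4)
# are assumed from Intel's published specifications.
NODE_TYPES = {
    "hyperconverged": {
        "count": 40, "cpus": 2, "cores_per_cpu": 14,
        "ram_gb": 512, "ssds": 2, "ssd_gb": 800, "hdds": 0, "hdd_tb": 0,
    },
    "storage": {
        "count": 20, "cpus": 2, "cores_per_cpu": 8,
        "ram_gb": 128, "ssds": 2, "ssd_gb": 800, "hdds": 12, "hdd_tb": 6,
    },
}


def cluster_totals(node):
    """Multiply per-node figures by the node count (decimal units, 1 TB = 1000 GB)."""
    n = node["count"]
    return {
        "cores": n * node["cpus"] * node["cores_per_cpu"],
        "ram_gb": n * node["ram_gb"],
        "ssd_tb": n * node["ssds"] * node["ssd_gb"] / 1000,
        "hdd_tb": n * node["hdds"] * node["hdd_tb"],
    }


if __name__ == "__main__":
    for name, node in NODE_TYPES.items():
        print(name, cluster_totals(node))
```

Running the script prints one line of totals per node type, which can be compared against the figures in parentheses above.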

Access

These tools are offered to all researchers at the University of Michigan free of charge, provided that certain usage restrictions are not exceeded. Large-scale users who outgrow the no-cost allotment may purchase additional YBRC resources. All interested parties should contact arcts-support@umich.edu.

Sensitive Data

The U-M Research Ethics and Compliance webpage on Controlled Unclassified Information provides details on handling this type of data. The U-M Sensitive Data Guide to IT Services is a comprehensive guide to sensitive data.

Order Service

The Yottabyte Research Cloud is a pilot program available to all U-M researchers.

To request access to Yottabyte Research Cloud resources, send a single email to arcts-support@umich.edu. Please include:

  • Your name or your advisor’s name
  • Your unit
  • What you would like to use YBRC for
  • Whether you plan to use restricted data.

Someone from your unit IT staff or an ARC-TS IT staff member will contact you to work out the details and determine the best way to make your request work within the Yottabyte Cloud environment.

General Questions

What is the Yottabyte Research Cloud?

The Yottabyte Research Cloud (YBRC) is the University’s private cloud environment for research. It is a collection of processors, memory, storage, and networking that can be subdivided into smaller units and allocated to research projects on an as-needed basis, where they are accessed through virtual machines and containers.

How do I get access to Yottabyte Research Cloud Resources?

To request access to Yottabyte Research Cloud resources, send a single email to arcts-support@umich.edu. Please include:

  • Your name or your advisor’s name
  • Your unit
  • What you would like to use YBRC for
  • Whether you plan to use restricted data.

Someone from your unit IT staff or an ARC-TS IT staff member will contact you to work out the details and determine the best way to make your request work within the Yottabyte Cloud environment.

What class of problems is Yottabyte Research Cloud designed to solve?

Yottabyte Research Cloud resources are aimed squarely at research and at the teaching and training of students involved in research. Yottabyte resources are primarily for sponsored research. The Yottabyte Research Cloud is not for administrative or clinical use (that is, the business of the university or the hospital). Clinical research is acceptable as long as it is sponsored research.

How large is the Yottabyte Research Cloud?

Each of the two Yottabyte Research Cloud (YBRC) clusters, Maize and Blue, has 960 processing cores, 7.5 TB of memory, and roughly 330 TB of scratch storage.

What do Maize Yottabyte Research Cloud and Blue Yottabyte Research Cloud stand for?

Yottabyte resources are divided up between two clusters of computing and storage.    Maize YBRC is for restricted data analyses and storage, and Blue YBRC is for unrestricted data analyses and storage.

What can I do with the Yottabyte Research Cloud?

The initial offering of YBRC is focused on a few different types of use cases:  

  1. Database hosting and data ingestion of streaming data from an external source into a database. We can host many types of databases within Yottabyte, including most structured and unstructured databases.  Examples include MariaDB, PostgreSQL, and MongoDB.
  2. Hosting for applications that you can’t host locally in your lab or that you would like to connect to our HPC and data science clusters, such as Materials Studio, Galaxy, and SAS Studio.
  3. Hosting of virtual desktops and servers for restricted data use cases, such as statistical analysis of health data, or an analytical project involving Controlled Unclassified Information (CUI). Most people in this case need a powerful workstation for an application such as SAS, Stata or R.

Are these the only things I can do with resources in the Yottabyte Research Cloud?

No!  Contact us at arcts-support@umich.edu if you want to learn whether or not your idea can be done within YBRC!  

How do I get help if I have an issue with something in Yottabyte?

The best way to get help is to send an email to arcts-support@umich.edu with a brief description of the issues that you are seeing.  

What are the support hours for the Yottabyte Research Cloud?

Yottabyte is supported from 9 a.m. to 5 p.m., Monday through Friday. Response times for support requests outside of these hours will be longer.

Usage Questions

What’s the biggest machine I can build within Yottabyte Research Cloud?

Because of the way that YBRC divides up resources, the largest virtual machine that can be built within the cluster has 16 processing cores and 128 GB of RAM.

How many Yottabyte Research Cloud resources am I able to access at no cost?

ARC-TS policy is to limit no-cost individual allocations to 100 cores, so that access is always open to multiple research groups.

What if I need more than the no-cost maximum?

If you need to use more than 100 cores of YBRC, we recommend that you purchase YBRC physical infrastructure of your own and add it to the cluster. Physical infrastructure can be purchased in blocks of 96 physical cores, which can be oversubscribed as memory allows. For every block purchased, the researcher also receives 4 years of hardware and OS support for that block in case of failure. For a cost estimate for buying your own blocks of infrastructure and adding them to the cluster, please email arcts-support@umich.edu.
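The limits quoted in the answers above (16 cores and 128 GB of RAM per virtual machine, 100 no-cost cores, and purchases in blocks of 96 physical cores) can be combined into a quick planning check. This is only a sketch under stated assumptions: it assumes that purchased blocks need to cover only the cores beyond the no-cost allotment and that cores are not oversubscribed, neither of which is confirmed here. The function and constant names are hypothetical; contact arcts-support@umich.edu for an actual estimate.

```python
import math

# Limits quoted in the FAQ answers on this page.
MAX_VM_CORES = 16
MAX_VM_RAM_GB = 128
NO_COST_CORE_LIMIT = 100
BLOCK_CORES = 96


def vm_fits(cores, ram_gb):
    """Return True if a single virtual machine is within the stated per-VM maximum."""
    return 1 <= cores <= MAX_VM_CORES and 1 <= ram_gb <= MAX_VM_RAM_GB


def blocks_to_purchase(total_cores):
    """Estimate how many 96-core blocks cover the cores beyond the no-cost limit.

    Assumption (not confirmed by this page): the 100 no-cost cores still
    apply, and purchased cores are not oversubscribed.
    """
    extra = max(0, total_cores - NO_COST_CORE_LIMIT)
    return math.ceil(extra / BLOCK_CORES)


if __name__ == "__main__":
    print(vm_fits(16, 128))         # a VM at the stated maximum
    print(vm_fits(32, 64))          # too many cores for one VM
    print(blocks_to_purchase(250))  # cores needed across all VMs in a project
```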

What is ‘scratch’ storage?

Scratch storage for the Yottabyte Research Cloud is the storage area network that provides OS storage and active data storage for the virtual machines. It is not actively backed up or replicated to a separate infrastructure. As with the scratch storage on Flux, we don’t recommend storing any data solely on the local disk of any machine. Make sure that you have backups elsewhere, such as on Turbo, Locker, or some other service.

HIPAA Compliance Questions

What can I do inside of a HIPAA network enclave?

For researchers with restricted data with a HIPAA classification, we provide a small menu of Linux and Windows workstations to be installed within your enclave.  We do not delegate administrative rights for those workstations to researchers or research staff.  We may delegate administrative rights for workstations and services in your enclaves to IT staff in your unit who have successfully completed the HIPAA IT training coursework given by ITS or HITS, and are familiar with desktop and virtual machine environments.  

Machines in the HIPAA network enclaves are encircled by a deny-first firewall that prevents most traffic from entering the enclaves. Researchers can still visit external-to-campus websites from within a HIPAA network enclave. Researchers within a HIPAA network enclave can use storage services such as Turbo and MiStorage Silver (via CIFS) to host data for longer-term storage.

What are a researcher’s and research group’s responsibilities when they have HIPAA data within YBRC?

All researchers, staff, and students that use YBRC when analyzing restricted data have a shared responsibility in keeping their restricted data secure.

  • Researchers need to be aware of the personnel in their labs who have access to the data in their enclaves.  
    • Each lab should have a process for adding and removing users from enclaves that includes removing departed lab members from access to restricted data as soon as possible after they have left the lab.
    • Each lab should review who has access to their data and enclaves twice a year by checking the memberships of their M-Community and Active Directory groups to ensure that people have been removed as requested.
  • Each lab user must store their restricted data in a specific directory, as discussed during their introductory meeting with YBRC staff.  They must keep the data only in this directory over the life of the data on the system.  
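The twice-yearly membership review described above amounts to comparing who is in an access group with who is currently in the lab. The sketch below illustrates that comparison only. It assumes the group membership and the lab roster have already been exported by hand as lists of usernames; it does not query M-Community or Active Directory, and the function name and sample usernames are hypothetical.

```python
def review_enclave_access(group_members, lab_roster):
    """Compare an access group's membership against the current lab roster.

    Both arguments are iterables of usernames exported by hand
    (an assumption; no directory service is queried here).
    """
    members = {m.strip().lower() for m in group_members if m.strip()}
    roster = {m.strip().lower() for m in lab_roster if m.strip()}
    return {
        # In the access group but no longer in the lab: candidates for removal.
        "remove": sorted(members - roster),
        # In the lab but not in the access group: listed for information only.
        "not_in_group": sorted(roster - members),
    }


if __name__ == "__main__":
    # Hypothetical usernames, for illustration only.
    group = ["alice", "Bob ", "carol"]
    roster = ["alice", "carol", "dave"]
    print(review_enclave_access(group, roster))
```

Anyone listed under "remove" should be raised with unit IT or YBRC staff so that access can be withdrawn as soon as possible.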

CUI Compliance Questions

What can I do inside of a Secure Enclave Service CUI enclave?

Staff will work with researchers using CUI-classified data to determine the types of analysis that can be conducted on YBRC resources that comply with relevant regulations.

What are a researcher’s and research group’s responsibilities when they have CUI data within YBRC?

All researchers, staff, and students that use YBRC when analyzing restricted data have a shared responsibility in keeping their restricted data secure.

  • Researchers need to be aware of the personnel in their labs who have access to the data in their enclaves.  
    • Each lab should have a process for adding and removing users from enclaves that includes removing departed lab members from access to restricted data as soon as possible after they have left the lab.
    • Each lab should review who has access to their data and enclaves twice a year by checking the memberships of their M-Community and Active Directory groups to ensure that people have been removed as requested.
  • Each lab user must store their restricted data in a specific directory, as discussed during their introductory meeting with YBRC staff.  They must keep the data only in this directory over the life of the data on the system.