Configuring Software RAID
Software RAID (redundant array of independent disks) provides fast and resilient storage for your machine learning data. This document shows you how to configure software RAID in your cluster using mdadm.
Install new drives as needed, then power on the machine.
Check that the drives are present with lsblk. Your output should look similar to the following:
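The exact listing depends on your hardware. As a minimal sketch, the helper below filters the lsblk listing down to whole disks so that newly installed drives are easier to spot. The name list_disks is hypothetical and is not part of lsblk.

```shell
# Print the device path of every whole disk.
# list_disks is a hypothetical helper name used for illustration only.
list_disks() {
  # -d: top-level devices only (no partitions)
  # -n: no header row
  # -o: columns to print
  # The awk filter drops non-disk devices such as loop or optical drives.
  lsblk -d -n -o NAME,TYPE | awk '$2 == "disk" { print "/dev/" $1 }'
}

# Usage: list_disks
```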
Use parted to partition and format the drives.
Before running this step, ensure you back up all your data on this drive. Formatting the drive makes your data unrecoverable.
Repeat the previous step for every drive you are adding to the array, as listed by lsblk in step 2.
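The partitioning step can be sketched as follows. This is one possible parted invocation and not necessarily the one this guide originally showed. The function name partition_for_raid and the device names in the usage comment are placeholders. The sketch creates a GPT label with a single partition spanning the drive and sets the raid flag on that partition.

```shell
# DESTRUCTIVE: replaces the partition table of the given drive.
# partition_for_raid is a hypothetical helper name.
partition_for_raid() {
  local drive="$1"
  sudo parted --script "$drive" \
    mklabel gpt \
    mkpart primary 0% 100% \
    set 1 raid on
}

# Usage (placeholder device names, replace with the drives from lsblk):
#   for d in /dev/sdb /dev/sdc /dev/sdd; do partition_for_raid "$d"; done
```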
Use mdadm to create the RAID array with the new drives.
If mdadm is not on the system already, you can install it by running:
ubuntu@ubuntu:~$ sudo apt update && sudo apt install mdadm
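With mdadm installed, creating the array might look like the sketch below. The wrapper create_raid_array is hypothetical, and both /dev/md0 and the partition names in the usage comment are assumed placeholders. Substitute the partitions you created in the previous step.

```shell
# create_raid_array LEVEL PARTITION...
# Builds /dev/md0 from the given partitions. The --raid-devices count is
# derived from the number of partitions passed in.
create_raid_array() {
  local level="$1"
  shift
  sudo mdadm --create /dev/md0 --level="$level" --raid-devices="$#" "$@"
}

# Usage (placeholder partitions):
#   create_raid_array 5 /dev/sdb1 /dev/sdc1 /dev/sdd1
```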
This example uses RAID level 5. To use a different RAID level (see below), set --level to the desired RAID level. Your output should look similar to the following:
Format the RAID array:
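A minimal sketch, assuming the array is /dev/md0 and that you want an ext4 filesystem. Neither is stated in this guide, so adjust both as needed. The helper name format_array is hypothetical.

```shell
# DESTRUCTIVE: creates a new ext4 filesystem on the array.
format_array() {
  local array="${1:-/dev/md0}"
  sudo mkfs.ext4 "$array"
}

# Usage: format_array /dev/md0
```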
Update the mdadm configuration file so that the software RAID persists through reboots:
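On Ubuntu, mdadm reads its configuration from /etc/mdadm/mdadm.conf. A hedged sketch of this step, with persist_array_config as a hypothetical helper name, appends the scanned array definition to that file and then rebuilds the initramfs so the updated configuration is available at boot.

```shell
# persist_array_config [CONFIG_FILE]
persist_array_config() {
  local conf="${1:-/etc/mdadm/mdadm.conf}"
  # Append an ARRAY line for each active array. Running this twice
  # appends duplicate lines, so check the file before re-running.
  sudo mdadm --detail --scan | sudo tee -a "$conf" > /dev/null
  # Rebuild the initramfs so it picks up the updated configuration.
  sudo update-initramfs -u
}

# Usage: persist_array_config
```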
Create a mount point and mount the array:
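For example, assuming the array is /dev/md0 and choosing /mnt/raid as the mount point (both are placeholders, and mount_array is a hypothetical helper name):

```shell
# mount_array [ARRAY] [MOUNT_POINT]
mount_array() {
  local array="${1:-/dev/md0}"
  local mount_point="${2:-/mnt/raid}"
  # Create the directory if needed, then mount the array on it.
  sudo mkdir -p "$mount_point" && sudo mount "$array" "$mount_point"
}

# Usage: mount_array /dev/md0 /mnt/raid
```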
Get the block ID for your RAID array, then add a line to your /etc/fstab so it mounts at boot:
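As a sketch of this step, blkid can print the filesystem UUID of the array, and a small helper can turn it into an fstab entry. The name fstab_line is hypothetical, and the mount point /mnt/raid, the ext4 type, and the defaults option are assumptions to adapt to your setup.

```shell
# fstab_line UUID MOUNT_POINT FSTYPE
# Prints one /etc/fstab entry. The trailing "0 2" disables dump and
# schedules the filesystem check after the root filesystem.
fstab_line() {
  printf 'UUID=%s %s %s defaults 0 2\n' "$1" "$2" "$3"
}

# Usage (placeholder device and mount point):
#   uuid=$(sudo blkid -s UUID -o value /dev/md0)
#   fstab_line "$uuid" /mnt/raid ext4 | sudo tee -a /etc/fstab
```

Referring to the array by UUID rather than by device name keeps the entry valid if the device name changes between boots.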
Understanding RAID Levels
There are several RAID levels, each trading off capacity, performance, and redundancy differently. RAID 5, used in the steps above, stripes data and parity across at least three drives and survives the loss of any one drive, which gives a good balance between performance and availability. Many online sources, Wikipedia for example, provide more information about RAID levels.
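To make the capacity side of that trade-off concrete, the sketch below estimates how many drives' worth of space remain usable for a few common levels, assuming equally sized drives. The helper raid_usable_drives is hypothetical, and the minimum drive counts are the conventional ones rather than limits enforced by mdadm.

```shell
# raid_usable_drives LEVEL DRIVE_COUNT
# Prints the number of drives' worth of usable capacity.
raid_usable_drives() {
  local level="$1" n="$2" min usable
  case "$level" in
    0) min=2; usable=$n ;;          # striping only, no redundancy
    1) min=2; usable=1 ;;           # every drive holds a full copy
    5) min=3; usable=$((n - 1)) ;;  # one drive's worth of parity
    6) min=4; usable=$((n - 2)) ;;  # two drives' worth of parity
    *) echo "unsupported RAID level: $level" >&2; return 1 ;;
  esac
  if [ "$n" -lt "$min" ]; then
    echo "RAID $level conventionally needs at least $min drives" >&2
    return 1
  fi
  echo "$usable"
}

# Usage: raid_usable_drives 5 3
```

For instance, three 4 TB drives in RAID 5 leave two drives' worth of space, or about 8 TB, before filesystem overhead.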