How to set up RAID1 on Linux

RAID stands for Redundant Array of Inexpensive Disks. Depending on the RAID level we set up, we can achieve data replication, data distribution, or both. A RAID setup can be implemented via dedicated hardware or in software. In this tutorial we see how to implement a RAID1 (mirror) in software on Linux, using the mdadm utility.

In this tutorial you will learn:

The peculiarities of the most used RAID levels
How to install mdadm on the major Linux distributions
How to configure a RAID1 with two disks
How to replace a disk in the RAID array

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions
Category	Requirements, Conventions or Software Version Used
System	Distribution independent
Software	mdadm
Other	Root permissions
Conventions	# – requires given Linux commands to be executed with root privileges, either directly as the root user or by use of the `sudo` command; $ – requires given Linux commands to be executed as a regular non-privileged user

A brief overview of the most used RAID levels

Before we start with our tutorial and see how to implement a software RAID1 setup on Linux using mdadm, it is a good idea to briefly recap the most used RAID levels and see what their peculiarities are.

RAID0

Its main goal is to improve performance. In this level of RAID we have two or more disks, which should be of equal size. The data is distributed alternately across the disks in stripes, which decreases read and write times. RAID0 provides no redundancy: if one disk fails, the data in the array is lost.

RAID0 diagram

RAID1

RAID1 (mirroring) is what we will implement in this tutorial: in this RAID level, the data is written simultaneously, and so replicated, on the two or more disks which are part of the array.

RAID1 diagram

RAID5

To create a setup with this RAID level, a minimum of three disks is required; with N disks, the usable capacity equals that of N-1 disks. This setup can handle the failure of one disk without suffering data loss. Just as in RAID0, data is striped, so it is distributed across multiple disks. The main difference is that parity information also exists and is striped as well. What is parity information? Basically, each stripe includes a parity block computed from its data blocks, and the parity blocks are spread across all the disks in the array; this information allows the data to be rebuilt if one of the disks fails.

RAID5 diagram

RAID6

RAID6 works similarly to RAID5; the main difference is that two independent sets of parity information are kept, distributed across the disks, so with this RAID level it’s possible to handle the failure of two disks without suffering data loss. A minimum of four disks is necessary to achieve this configuration.

RAID6 diagram
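
As a quick recap of the levels described above, here is a minimal sketch that prints how many disks' worth of space remains usable for a given level and number of equally sized disks. The function name raid_usable_disks is our own, not part of any tool, and the minimum disk counts are the ones stated in this overview.

```shell
#!/bin/sh
# Print how many disks' worth of space is usable for a given RAID level.
# Arguments: RAID level (0, 1, 5 or 6) and number of equally sized disks.
# raid_usable_disks is a hypothetical helper written for this recap.
raid_usable_disks() {
    level=$1
    disks=$2
    case "$level" in
        0) min=2; usable=$disks ;;            # striping only, no redundancy
        1) min=2; usable=1 ;;                 # every disk holds a full copy
        5) min=3; usable=$((disks - 1)) ;;    # one disk's worth of parity
        6) min=4; usable=$((disks - 2)) ;;    # two disks' worth of parity
        *) echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
    if [ "$disks" -lt "$min" ]; then
        echo "RAID$level needs at least $min disks" >&2
        return 1
    fi
    echo "$usable"
}

raid_usable_disks 1 2   # the two-disk mirror built in this tutorial
raid_usable_disks 5 4
raid_usable_disks 6 4
```

For the two-disk mirror we are about to build, the usable space is that of a single disk.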

Installing mdadm

mdadm is the utility which manages software RAID on Linux. It is available in all the major distributions. On Debian and its derivatives it is possible to install it using the following command:

$ sudo apt-get update && sudo apt-get install mdadm

On the Red Hat family of distributions, we can use the dnf package manager:

$ sudo dnf install mdadm

On Arch Linux we can install the package using the pacman package manager:

$ sudo pacman -Sy mdadm
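
If the same instructions have to work on more than one distribution, the choice between the three commands above can be sketched as follows. This is only an illustration: the helper names are hypothetical, the script merely prints the command it would suggest, and it assumes that the first package manager found in PATH is the one in use.

```shell
#!/bin/sh
# Print the install command from this tutorial for a given package manager.
# Both function names are hypothetical helpers, not part of mdadm.
mdadm_install_cmd() {
    case "$1" in
        apt-get) echo "sudo apt-get update && sudo apt-get install mdadm" ;;
        dnf)     echo "sudo dnf install mdadm" ;;
        pacman)  echo "sudo pacman -Sy mdadm" ;;
        *)       return 1 ;;
    esac
}

# Print the first supported package manager found in PATH.
detect_pkg_manager() {
    for pm in apt-get dnf pacman; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Only print the suggestion; nothing is installed by this sketch.
if pm=$(detect_pkg_manager); then
    mdadm_install_cmd "$pm"
else
    echo "no supported package manager found" >&2
fi
```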

Once the software is installed we can proceed and create our RAID1 setup.

Creating the RAID

For the sake of this tutorial I will work in a virtual environment, using a Debian “Buster” system and two virtual disks I previously created, which will be part of the RAID1 setup. These disks are recognized as vdb and vdc, as you can see from the output of the lsblk command:

sr0     11:0    1 1024M  0 rom
vda    254:0    0    7G  0 disk
├─vda1 254:1    0    6G  0 part /
├─vda2 254:2    0    1K  0 part
└─vda5 254:5    0 1021M  0 part [SWAP]
vdb    254:16   0    1G  0 disk
vdc    254:32   0    1G  0 disk

Partitioning the disks

Although it is possible to create the RAID directly using raw disks, it’s a good idea to avoid that and, instead, create one partition on each of the two disks. To perform this task we will use parted. The first thing we want to do is to create a partition table. For the sake of this example we will use MBR partition tables (called msdos in parted); GPT is needed in real-world scenarios for disks larger than 2 TiB, since MBR cannot address more than that with 512-byte sectors. To initialize a disk, we can run the following command:

$ sudo parted -s /dev/vdb mklabel msdos

Now, we can create a partition which takes all the available space:

$ sudo parted -s /dev/vdb mkpart primary 1MiB 100%

We can now put the RAID flag on the partition (this will set the partition type to fd – “Linux raid autodetect”):

$ sudo parted -s /dev/vdb set 1 raid on

In this case we worked on the /dev/vdb device; we should repeat the same operations on the /dev/vdc disk.
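
Since the same three parted steps apply to every disk, they can be collected into a small dry-run sketch. It only prints the commands, so nothing is written to disk. The helper names and the size-based choice of partition table are our own assumptions: MBR addresses at most 2^32 sectors of 512 bytes, which is 2 TiB, so the sketch switches to gpt above that size.

```shell
#!/bin/sh
# Dry-run sketch: print the parted commands for each disk instead of running them.
# pick_label and print_partition_cmds are hypothetical helper names.

# Choose a partition table type from the disk size in bytes.
# MBR ("msdos" in parted) addresses at most 2^32 sectors of 512 bytes (2 TiB).
pick_label() {
    mbr_limit=$(( 4294967296 * 512 ))
    if [ "$1" -gt "$mbr_limit" ]; then
        echo gpt
    else
        echo msdos
    fi
}

# Print the three partitioning steps used in this section for one disk.
print_partition_cmds() {
    disk=$1
    label=$2
    echo "parted -s $disk mklabel $label"
    echo "parted -s $disk mkpart primary 1MiB 100%"
    echo "parted -s $disk set 1 raid on"
}

one_gib=1073741824   # size of the virtual disks used in this tutorial
for disk in /dev/vdb /dev/vdc; do
    print_partition_cmds "$disk" "$(pick_label "$one_gib")"
done
```

After reviewing the printed commands, each one can be run with sudo exactly as shown earlier in this section.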

Setting up RAID1

Once we have initialized and partitioned the disks, we can use mdadm to create the actual setup. All we have to do is run the following command:

$ sudo mdadm \
  --verbose \
  --create /dev/md0 \
  --level=1 \
  --raid-devices=2 \
  /dev/vdb1 /dev/vdc1

Let’s analyze the command above. First of all we used the --verbose option in order to make the command output more information about the operations that are being performed.

We used mdadm in “create mode”, which is why we passed the --create option, providing the name of the device that should be created (/dev/md0 in this case). We then specified what level to use for the RAID with --level, and the number of devices that should be part of it with --raid-devices. Finally we provided the paths of the devices which should be used.

Once we run the command, we should see the following output:

mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: size set to 1046528K
Continue creating array? y

In this case we can answer affirmatively to the question and continue creating the array:

mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

To view information about the created RAID setup and its state, we can run mdadm with the --detail option, passing the name of the device we want to check. In this case, the output is the following:

$ sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri Apr 23 11:16:44 2021
        Raid Level : raid1
        Array Size : 1046528 (1022.00 MiB 1071.64 MB)
     Used Dev Size : 1046528 (1022.00 MiB 1071.64 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Apr 23 11:17:04 2021
             State : clean
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : debian:0  (local to host debian)
              UUID : 4721f921:bb82187c:487defb8:e960508a
            Events : 17

    Number   Major   Minor   RaidDevice State
       0     254       17        0      active sync   /dev/vdb1
       1     254       33        1      active sync   /dev/vdc1
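
When the state has to be checked from a script rather than read by eye, a single field can be pulled out of this report. The following is a minimal sketch, assuming the "name : value" layout shown above; md_detail_field is a hypothetical helper name, and the sample text is copied from the output above so that the sketch runs without a real array.

```shell
#!/bin/sh
# Read `mdadm --detail` output on stdin and print the value of one field.
# md_detail_field is a hypothetical helper, not an mdadm feature.
md_detail_field() {
    awk -F' : ' -v key="$1" '
        { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }   # trim the field name
        name == key { print $2; exit }
    '
}

# Sample lines copied from the report above.
sample='        Raid Level : raid1
             State : clean
    Active Devices : 2'

printf '%s\n' "$sample" | md_detail_field "State"
printf '%s\n' "$sample" | md_detail_field "Active Devices"
```

On a real system the input would come from sudo mdadm --detail /dev/md0 piped into the function.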

With the --detail option we can gather information about the RAID as a whole. If we want information about each single disk which is a member of the setup, we can use --examine instead, and pass the devices as arguments. In this case, for example, we would run:

$ sudo mdadm --examine /dev/vdb1 /dev/vdc1

The command would produce an output similar to the following:

/dev/vdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 4721f921:bb82187c:487defb8:e960508a
           Name : debian:0  (local to host debian)
  Creation Time : Fri Apr 23 11:16:44 2021
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 2093056 (1022.00 MiB 1071.64 MB)
     Array Size : 1046528 (1022.00 MiB 1071.64 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : a9575594:40c0784b:394490e8:6eb7e9a3

    Update Time : Fri Apr 23 11:30:02 2021
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 51afc54d - correct
         Events : 17


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/vdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 4721f921:bb82187c:487defb8:e960508a
           Name : debian:0  (local to host debian)
  Creation Time : Fri Apr 23 11:16:44 2021
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 2093056 (1022.00 MiB 1071.64 MB)
     Array Size : 1046528 (1022.00 MiB 1071.64 MB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=0 sectors
          State : clean
    Device UUID : b0cf8735:5fe765c0:6c269c2f:3777d11d

    Update Time : Fri Apr 23 11:30:02 2021
  Bad Block Log : 512 entries available at offset 16 sectors
       Checksum : 28c3066f - correct
         Events : 17


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
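
In the report above both members show the same Events value (17), which is what we expect from members that are in step with each other. A minimal sketch of that comparison follows, assuming the layout shown above; the helper names are hypothetical and the sample input is a shortened copy of the report.

```shell
#!/bin/sh
# Read `mdadm --examine` output on stdin and print "device events" pairs.
# member_events and events_in_sync are hypothetical helper names.
member_events() {
    awk '
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }   # remember the current device
        $1 == "Events" { print dev, $3 }
    '
}

# Succeed only if every member reports the same event count.
events_in_sync() {
    [ "$(member_events | awk '{ print $2 }' | sort -u | wc -l | tr -d ' ')" = "1" ]
}

# Shortened copy of the report above.
sample='/dev/vdb1:
         Events : 17
/dev/vdc1:
         Events : 17'

printf '%s\n' "$sample" | member_events
if printf '%s\n' "$sample" | events_in_sync; then
    echo "members agree"
fi
```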

Using the RAID device

In the previous section we created a RAID1 setup using two (virtual) disks: /dev/vdb and /dev/vdc. The RAID device we created is called /dev/md0. To be able to use it we should create a filesystem on it. To use the ext4 filesystem, for example, we would run:

$ sudo mkfs.ext4 /dev/md0

Once the filesystem is created, we should mount it somewhere, and then proceed to use it just as a normal block device. To make the system auto-mount the device at boot we should create an entry for it in the /etc/fstab file. When doing so, we should reference the filesystem by its UUID, since the path of the RAID device may change on reboot. To find the UUID, we can use the lsblk command:

$ lsblk -o UUID /dev/md0
UUID
58ff8624-e122-419e-8538-d948439a8c07
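
To illustrate what the /etc/fstab entry could look like, here is a minimal sketch that prints one. The /mnt/raid mount point and the fstab_line helper are hypothetical, and the options (defaults) and the last two fields (0 and 2) are common choices for a non-root filesystem rather than requirements.

```shell
#!/bin/sh
# Build an /etc/fstab line that mounts a filesystem by UUID.
# fstab_line is a hypothetical helper; it only prints the line.
fstab_line() {
    uuid=$1
    mountpoint=$2
    fstype=$3
    # Fields: device, mount point, type, options, dump, fsck pass order
    printf 'UUID=%s %s %s defaults 0 2\n' "$uuid" "$mountpoint" "$fstype"
}

# UUID taken from the lsblk output above; /mnt/raid is an assumed mount point.
fstab_line 58ff8624-e122-419e-8538-d948439a8c07 /mnt/raid ext4
```

The printed line can be appended to /etc/fstab once the mount point directory exists.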

Replacing a disk in the array

Now, imagine one of the disks in the array fails. How should we proceed? As we will see, we can remove it from the array without losing any data. Supposing the failed hard disk is /dev/vdc, we can issue the following command to mark its partition as faulty:

$ sudo mdadm --manage /dev/md0 --fail /dev/vdc1

The output of the command above will be:

mdadm: set /dev/vdc1 faulty in /dev/md0

We can check the status of the RAID to confirm the device has been marked as faulty:

$ sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri Apr 23 11:16:44 2021
        Raid Level : raid1
        Array Size : 1046528 (1022.00 MiB 1071.64 MB)
     Used Dev Size : 1046528 (1022.00 MiB 1071.64 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Apr 23 15:01:36 2021
             State : clean, degraded
    Active Devices : 1
   Working Devices : 1
    Failed Devices : 1
     Spare Devices : 0

Consistency Policy : resync

              Name : debian:0  (local to host debian)
              UUID : 4721f921:bb82187c:487defb8:e960508a
            Events : 19

    Number   Major   Minor   RaidDevice State
       0     254       17        0      active sync   /dev/vdb1
       -       0        0        1      removed

       1     254       33        -      faulty   /dev/vdc1

As you can see, there is now only one active device, and the state of /dev/vdc1 is “faulty”. Now, to remove the disk from the array, we can run:

$ sudo mdadm --manage /dev/md0 --remove /dev/vdc1

By passing --manage we work with mdadm in “Manage” mode; in this mode we can perform actions like removing faulty disks, or adding new ones. If everything goes as expected the device should be “hot-removed”:

mdadm: hot removed /dev/vdc1 from /dev/md0

We should now partition the new hard disk we will use to replace the faulty one, in the same way we did for the other two at the beginning of this tutorial. As a shortcut, we can use the sfdisk command. If we run it with the -d option (short for --dump), it dumps information about the partitions of the device we pass as argument. This information can be used as a backup and to replicate the setup. We can redirect the output to a file or use it directly in a pipeline. Supposing the new disk is /dev/vdd, we would run:

$ sudo sfdisk -d /dev/vdb | sudo sfdisk /dev/vdd

Once the new disk is partitioned and ready, we can add it to our RAID1 array with the following command:

$ sudo mdadm --manage /dev/md0 --add /dev/vdd1
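
The whole replacement sequence from this section can be summarized in a dry-run sketch that prints the commands in order without executing anything. The function name is hypothetical, and the sketch assumes that the new partition is named by appending 1 to the disk path, which holds for the vdX disks used here but not for every device naming scheme.

```shell
#!/bin/sh
# Dry-run sketch: print the replacement steps from this section in order.
# print_replace_cmds is a hypothetical helper; nothing is executed.
print_replace_cmds() {
    array=$1         # e.g. /dev/md0
    failed=$2        # failed member partition, e.g. /dev/vdc1
    healthy_disk=$3  # surviving disk whose partition table is copied
    new_disk=$4      # replacement disk, e.g. /dev/vdd
    echo "mdadm --manage $array --fail $failed"
    echo "mdadm --manage $array --remove $failed"
    echo "sfdisk -d $healthy_disk | sfdisk $new_disk"
    # Assumption: the first partition is named <disk>1 (true for vdX and sdX).
    echo "mdadm --manage $array --add ${new_disk}1"
}

print_replace_cmds /dev/md0 /dev/vdc1 /dev/vdb /dev/vdd
```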

If we now check the status of the RAID device, we can see that it is “rebuilding” on the spare device we added:

$ sudo mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Fri Apr 23 11:16:44 2021
        Raid Level : raid1
        Array Size : 1046528 (1022.00 MiB 1071.64 MB)
     Used Dev Size : 1046528 (1022.00 MiB 1071.64 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Apr 23 15:29:45 2021
             State : clean, degraded, recovering
    Active Devices : 1
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 1

Consistency Policy : resync

    Rebuild Status : 19% complete

              Name : debian:0  (local to host debian)
              UUID : 4721f921:bb82187c:487defb8:e960508a
            Events : 26

    Number   Major   Minor   RaidDevice State
       0     254       17        0      active sync   /dev/vdb1
       2     254       49        1      spare rebuilding   /dev/vdd1

From the output of the command we can see that the state is reported as “clean, degraded, recovering”, and the /dev/vdd1 partition is reported as a “spare rebuilding”. Once the rebuild process is over it will change into “active sync”.
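
To follow the rebuild from a script, the percentage can be extracted from the “Rebuild Status” line. The following is a minimal sketch, assuming the line format shown above; rebuild_percent is a hypothetical helper, and it prints “done” when the line is absent, which is how the report looks once recovery has finished.

```shell
#!/bin/sh
# Read `mdadm --detail` output on stdin and print the rebuild percentage,
# or "done" when no "Rebuild Status" line is present.
# rebuild_percent is a hypothetical helper, not an mdadm feature.
rebuild_percent() {
    awk '
        /Rebuild Status :/ { sub(/%/, "", $4); print $4; found = 1 }
        END { if (!found) print "done" }
    '
}

# Sample lines in the format of the report above.
printf '    Rebuild Status : 19%% complete\n' | rebuild_percent
printf '             State : clean\n' | rebuild_percent
```

On a real system, the output of sudo mdadm --detail /dev/md0 could be piped into the function at intervals until it prints “done”.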

Conclusions

In this tutorial we saw a brief overview of the most used RAID levels, how to create a software RAID1 with two disks using the mdadm utility, and how to check the status of the RAID device and of each single disk in the array. We also saw how to remove and replace a faulty disk. Always remember that RAID1 lets us achieve data redundancy but must not be considered a backup!