In the article about checking a hard drive's health using smartctl we talked about the smartmontools package, and we saw that it provides two components: a command line utility (smartctl) and a daemon, smartd, which we can use to schedule operations. We focused on the former: we saw which S.M.A.R.T. tests are available and how to actually run them.
This time, we will talk about the smartd daemon: we will see how to schedule tests and how to configure the daemon so that we are notified via email when an error is found on a storage device. In the course of the article I will assume the smartmontools package is already installed. Please refer to the aforementioned article for installation instructions.
In this tutorial you will learn:
- How to configure the smartd daemon
- The meaning of some of the most commonly used smartd directives
- How to configure msmtp to forward email to the Gmail SMTP server so that messages are delivered externally
- How to test the configuration
Software requirements and conventions used
Category | Requirements, Conventions or Software Version Used |
---|---|
System | Distribution independent |
Software | The smartmontools and msmtp packages |
Other | Root permissions |
Conventions | # – requires given linux-commands to be executed with root privileges either directly as a root user or by use of sudo command; $ – requires given linux-commands to be executed as a regular non-privileged user |
The smartd daemon
The smartd daemon, when active, tries to poll ATA and SCSI devices every 30 minutes by default. It can be configured to send an email when some kind of problem is detected: in this article we will see how to create such a setup.
The daemon configuration file is /etc/smartd.conf. If we take a look at it, we can see that it contains a series of commented instructions except for one, DEVICESCAN. When this keyword is used, the smartd daemon scans for all existing ATA and SCSI devices, ignoring the rest of the configuration. For the sake of this tutorial we will comment out the line containing the instruction (line 21) and focus on a single device, /dev/sda. Let’s see some of the directives we can use in the file. Here is a quick recap:
Directive | Use |
---|---|
-d TYPE | Specifies the device type (ata, scsi, etc.) |
-H | Checks the SMART health status of the disk |
-l TYPE | Monitors SMART log (error or selftest) |
-s REGEX | Specifies regular expression to schedule self-tests |
-m ADDRESS | Sends an email notification at the specified address |
-M TYPE | Works only when the -m directive is provided and modifies its behavior |
-f | Monitors the failure of “usage” attributes |
-t | Works like a shortcut for -p and -u, so reports changes in “Prefailure” and “Usage” attributes |
-C ID | Reports if the count of pending sectors is something other than 0 |
-U ID | Reports if the number of offline uncorrectable sectors is not 0 |
-a | Works like a shortcut for -H -f -t -l error -l selftest -C 197 -U 198 |
The -d directive is used to specify the type of device we are dealing with. Some of the available device types are the following:
- auto
- ata
- scsi
- sat (scsi to ATA translation)
- usbcypress (for ATA disks behind a usbcypress USB to PATA bridge)
- usbjmicron (SATA disks behind a JMicron USB to PATA/SATA bridge)
This is not a complete list, and providing one is beyond the scope of this tutorial: you can check the smartd.conf manpage for that. The default value used by the directive is auto, which means that the type of the device is inferred from the information provided by the operating system.
The -H directive applies only to ATA devices. It enables the monitoring of the S.M.A.R.T. health status of the disk. When this option is used, a report is received when any SMART attribute of type Pre-fail is equal to or below its threshold (this could mean an imminent device failure).
The -l directive is used to specify which type of SMART log should be monitored. The most common options are error and selftest. The first checks whether the number of ATA errors in the summary S.M.A.R.T. error log has increased since the last check; the second checks whether the number of failed self-tests has increased.
The -s directive takes a regular expression as its argument, and is used to schedule a self-test. The regex should respect a specific syntax:
T/MM/DD/d/HH
Where T is the type of test that should be run. The options are:
- L for long self-test
- S for short self-test
- C for conveyance test
- O for an Offline immediate Test
MM is used to specify the month of the year as two decimal digits, from 01 (January) to 12 (December). The DD notation specifies the day of the month, also as two digits: values can go from 01 to 31. In the regex syntax, the d stands for the day of the week. We specify it by using a single digit from 1 (Monday) to 7 (Sunday). Finally, HH indicates the hour of the day (hours after midnight): 00 (midnight to just before 1am) to 23 (11pm to just before midnight). To schedule a “long test” each Sunday between 4am and 5am, we would write:
L/../../7/04
Notice that in the above regex each dot (.) matches any single character, so a pair of dots in the month or day-of-month position is basically like saying “every month” or “every day”.
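To get a feel for how such an expression behaves, it can be tried against a few hand-written T/MM/DD/d/HH stamps before putting it in the configuration file. The following is only a sketch: schedule_matches is a hypothetical helper (it is not part of smartmontools), and it assumes, as I understand the smartd.conf manpage, that the expression is an extended regular expression which must match the whole 12-character stamp.

```shell
#!/bin/sh
# schedule_matches REGEX STAMP (hypothetical helper, not a smartmontools tool)
# Succeeds when the whole stamp "T/MM/DD/d/HH" matches REGEX, using
# extended regular expressions (-E) and whole-line matching (-x).
schedule_matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

# The stamp a long test would get right now: %u is 1 (Monday) to 7 (Sunday).
now="L/$(date +%m/%d/%u/%H)"
echo "stamp for a long test right now: $now"

# Fixed stamps, so the results do not depend on when the sketch is run.
schedule_matches 'L/../../7/04' 'L/03/09/7/04' && echo "long test due"
schedule_matches 'L/../../7/04' 'L/03/09/6/04' || echo "day of week is 6: skipped"

# Two schedules can be combined with alternation: a short test every day
# between 2am and 3am, plus the long test each Sunday between 4am and 5am.
schedule_matches '(S/../.././02|L/../../7/04)' 'S/03/10/1/02' && echo "short test due"
```

The stamps used here are made up for illustration; only their shape matters.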
The -f option is needed to check for failures of Old_age (usage) attributes. When the value of one of these attributes is equal to or below its threshold, it doesn’t indicate an imminent disk failure, but only a potential usage anomaly, for example a usage time which has surpassed the designed device life.
The -t directive is used to track changes in Pre-fail and Old_age SMART attributes. It is a shortcut for the -p and -u directives, which perform those tasks, respectively.
The -C and -U directives are needed to report when the current pending sectors and offline uncorrectable sectors counts become something other than 0. Both directives accept an ID argument, which is the id of the SMART attribute they check, usually 197 and 198:
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
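To look at the same two counters by hand, the attribute table printed by smartctl can be filtered with awk. This is a minimal sketch: sector_counts is a hypothetical helper name, and it assumes the raw value is the last column of the row, which is the case for the two rows shown above but not necessarily for every attribute.

```shell
#!/bin/sh
# sector_counts (hypothetical helper): read "smartctl -A"-style attribute rows
# on stdin and print name=raw_value for attributes 197 and 198.
# Assumption: the raw value is the last whitespace-separated column.
sector_counts() {
    awk '$1 == 197 || $1 == 198 { print $2 "=" $NF }'
}

# Sample rows copied from the output shown above. On a real system the input
# would come from something like: sudo smartctl -A /dev/sda | sector_counts
sector_counts <<'EOF'
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
EOF
```

Any value other than 0 printed here is what the -C and -U directives would report.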
Finally, the -a directive is a shortcut; it implies the use of -H, -f, -t, -l error, -l selftest, -C 197 and -U 198. It is important to notice that -a is the default directive: if no other directive is specified, it is assumed.
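Putting some of these directives together, a configuration entry for a single disk could look like the one in the sketch below. It is only an illustration: the entry is written to a scratch file created with mktemp rather than to /etc/smartd.conf, the schedule is an arbitrary choice, and the email address is a placeholder.

```shell
#!/bin/sh
# Scratch file standing in for /etc/smartd.conf, so nothing on the system is touched.
conf=$(mktemp)

cat > "$conf" <<'EOF'
# Monitor /dev/sda with the -a defaults, run a short self-test every day
# between 2am and 3am and a long one each Sunday between 4am and 5am,
# and send notifications to the given (placeholder) address.
/dev/sda -a -s (S/../.././02|L/../../7/04) -m destination.email@gmail.com
EOF

cat "$conf"
```

If I read the smartd manpage correctly, a file like this can be checked without replacing the real configuration by running smartd with -c pointing at it and -q showtests, which should list the upcoming scheduled tests and exit.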
Using msmtp for external email notifications
To be able to send notification emails “externally”, and not only to the mail spool of our machine's users, we can use msmtp. Msmtp is an SMTP client able to forward emails to a third-party SMTP server. It is very easy to configure, let’s see how!
Installation
Installing msmtp is quite simple. The specific command depends, of course, on the distribution we are running on. On Debian and derivatives we can run:
$ sudo apt-get update && sudo apt-get install msmtp
To achieve the same result on Archlinux, we can run:
$ sudo pacman -S msmtp
On Fedora we use the dnf package manager:
$ sudo dnf install msmtp
On Red Hat Enterprise Linux and CentOS, it should be possible to install the software from the third-party EPEL repository, using the same command as above.
Configuring msmtp to work with Gmail using an app-specific password
Msmtp can be configured per-user or with a global configuration file. Each user who wants a specific configuration should use the ~/.msmtprc file. Appropriate permissions should be set on it, so that it is readable and writable only by its owner. To use a global configuration we must use the /etc/msmtprc file instead: for msmtp to work correctly it should have 644 as permissions, so that it is readable by all users. The configuration needed for the application to forward emails to the Gmail SMTP server is the following:
defaults
auth on
tls on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
logfile /var/log/msmtp.log

# Gmail configuration
account gmail
host smtp.gmail.com
port 587
from your-username@gmail.com
user your-username
password app-specific-password

account default: gmail
As you may have noticed, in the password field we used a Google app-specific password. App-specific passwords are meant to be used with programs considered “less secure” by Google, because they don’t use the OAuth2 authentication protocol. To generate such a password we must navigate to the Google app passwords page, log in, select an application to associate with the password (or enter a custom name) and confirm the creation. The created password will be displayed, but you will not be able to recover it if you lose it, so be sure to keep it safe.
By default, smartd sends emails by using the system mail command. For it to work with msmtp, the msmtp-mta package should also be installed: this package creates a sendmail symlink which points to msmtp, and it is available on Debian and Archlinux (I couldn’t find it on Fedora). As an alternative, we can enter the following line into the /etc/mail.rc configuration file:
set sendmail="/usr/bin/msmtp -t"
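Before involving smartd, it can be useful to check that msmtp delivers a message on its own. The sketch below only builds and prints a minimal message; test_message is a hypothetical helper, and the address and texts are placeholders. Sending is left as a commented-out command, since it needs the configuration above and network access.

```shell
#!/bin/sh
# test_message TO SUBJECT BODY (hypothetical helper): print a minimal email,
# with headers and body separated by an empty line.
test_message() {
    printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Print the message first, to inspect what would be sent.
test_message destination.email@gmail.com 'msmtp test' 'Sent by hand through msmtp.'

# To actually send it through the "gmail" account defined in the configuration
# (-a selects the account, -t reads the recipient from the To: header):
#   test_message destination.email@gmail.com 'msmtp test' 'Sent by hand.' | msmtp -a gmail -t
```

If the message does not arrive, the logfile set in the configuration (/var/log/msmtp.log) is the first place to look.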
Testing the setup
With everything in place, we can verify that our setup works as expected. In the /etc/smartd.conf file we comment out all the lines and append the following one:
/dev/sda -a -m destination.email@gmail.com -M test
We focus on the /dev/sda device, and we already saw what the -a, -m and -M options are for. When “test” is passed as the argument of the latter, a test email is sent to the specified address each time the daemon starts. So let’s restart it by running:
$ sudo systemctl restart smartd
At this point, if everything is configured correctly, we should have received a mail!
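If the edit to /etc/smartd.conf described in this section has to be repeated, it could also be scripted. The following is a sketch under a few assumptions: it works on a scratch file with made-up content instead of the real configuration, and it relies on the -i (in-place) option of GNU sed.

```shell
#!/bin/sh
# Scratch file with made-up content, standing in for /etc/smartd.conf.
conf=$(mktemp)
printf '%s\n' '# a comment' 'DEVICESCAN' > "$conf"

# Prefix every non-empty line that does not already start with "#" with "#".
sed -i 's/^[^#]/#&/' "$conf"

# Append the single-device test entry used in this section.
echo '/dev/sda -a -m destination.email@gmail.com -M test' >> "$conf"

cat "$conf"
```

On the real file the same two steps would need root privileges, and keeping a backup copy first is probably a good idea.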
Conclusions
In this article we took a look at smartd, the daemon provided by the smartmontools package, which can be used to schedule S.M.A.R.T. tests and data gathering. We saw how to configure it, and what some of the directives that can be used in its configuration file mean. Finally, we saw how to use msmtp to forward email notifications externally via the Gmail SMTP server.