Smartd.v7


smartd

YUM

yum install smartmontools

systemctl

systemctl status smartd
systemctl start smartd
systemctl enable smartd

Usage

smartctl -H /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-123.9.3.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
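In a script, the exit status of smartctl is easier to act on than the printed text. The smartctl man page documents it as a bit mask, and the sketch below decodes those bits. The function name decode_smartctl_status is made up for this example, and the messages are paraphrased from the man page rather than being smartctl's own output.

```shell
#!/bin/sh
# decode_smartctl_status STATUS
# Decode the smartctl exit status, a bit mask described under "EXIT STATUS"
# in the smartctl man page. decode_smartctl_status is a hypothetical helper
# name; the messages are paraphrased, not printed by smartctl itself.
decode_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems reported"
        return 0
    fi
    [ $((status & 1)) -ne 0 ]   && echo "bit 0: command line did not parse"
    [ $((status & 2)) -ne 0 ]   && echo "bit 1: device open failed or device is in a low-power mode"
    [ $((status & 4)) -ne 0 ]   && echo "bit 2: a SMART or ATA command failed, or a checksum error occurred"
    [ $((status & 8)) -ne 0 ]   && echo "bit 3: SMART status check returned DISK FAILING"
    [ $((status & 16)) -ne 0 ]  && echo "bit 4: prefail attributes at or below threshold"
    [ $((status & 32)) -ne 0 ]  && echo "bit 5: attributes were at or below threshold in the past"
    [ $((status & 64)) -ne 0 ]  && echo "bit 6: device error log contains errors"
    [ $((status & 128)) -ne 0 ] && echo "bit 7: self-test log contains errors"
    return 0
}

# Example (needs root and a real drive):
#   smartctl -H /dev/sda >/dev/null; decode_smartctl_status $?
```

Bit 3 is the one that corresponds to a failed health check, so a monitoring script could test for it alone and treat the other bits as warnings.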

Try also

smartctl -i /dev/sda

Testing

smartctl --test=short /dev/sda

This takes about a minute. Then view the result with

smartctl -a /dev/sda

For a more thorough check, start the long test

smartctl --test=long /dev/sda

This can take several hours. While it is running you can check progress with

smartctl -a /dev/sda

And near the top you will see:

Self-test execution status:      ( 244) Self-test routine in progress...
                                       40% of test remaining.
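To follow progress without reading the whole report, the percentage can be pulled out of that output. This is a minimal sketch that assumes the "% of test remaining" line looks like the sample above; selftest_remaining is a name invented for this example.

```shell
#!/bin/sh
# selftest_remaining
# Read `smartctl -a` output on stdin and print the percentage of a running
# self-test that remains. Prints nothing when no such line is present.
# selftest_remaining is a hypothetical helper name for this sketch.
selftest_remaining() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)% of test remaining.*$/\1/p'
}

# Example (needs root and a real drive):
#   smartctl -a /dev/sda | selftest_remaining
```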

Once the test has completed, the same smartctl -a output lists the result in its self-test log section.

Reporting

emacs /etc/smartmontools/smartd.conf

The standard configuration, followed by the directive summary from the comments in the file, is

DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
  -d TYPE Set the device type: ata, scsi, marvell, removable, 3ware,N, hpt,L/M/N
  -T TYPE set the tolerance to one of: normal, permissive
  -o VAL  Enable/disable automatic offline tests (on/off)
  -S VAL  Enable/disable attribute autosave (on/off)
  -n MODE No check. MODE is one of: never, sleep, standby, idle
  -H      Monitor SMART Health Status, report if failed
  -l TYPE Monitor SMART log.  Type is one of: error, selftest
  -f      Monitor for failure of any 'Usage' Attributes
  -m ADD  Send warning email to ADD for -H, -l error, -l selftest, and -f
  -M TYPE Modify email warning behavior (see man page)
  -s REGE Start self-test when type/date matches regular expression (see man page)
  -p      Report changes in 'Prefailure' Normalized Attributes
  -u      Report changes in 'Usage' Normalized Attributes
  -t      Equivalent to -p and -u Directives
  -r ID   Also report Raw values of Attribute ID with -p, -u or -t
  -R ID   Track changes in Attribute ID Raw value with -p, -u or -t
  -i ID   Ignore Attribute ID for -f Directive
  -I ID   Ignore Attribute ID for -p, -u or -t Directive
  -C ID   Report if Current Pending Sector count non-zero
  -U ID   Report if Offline Uncorrectable count non-zero
  -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
  -v N,ST Modifies labeling of Attribute N (see man page)
  -a      Default: equivalent to -H -f -t -l error -l selftest -C 197 -U 198
  -F TYPE Use firmware bug workaround. Type is one of: none, samsung
  -P TYPE Drive-specific presets: use, ignore, show, showall
  #      Comment: text after a hash sign is ignored
  \      Line continuation character
Attribute ID is a decimal integer 1 <= ID <= 255
except for -C and -U, where ID = 0 turns them off.
All but -d, -m and -M Directives are only implemented for ATA devices

If the test string DEVICESCAN is the first uncommented text
then smartd will scan for devices /dev/hd[a-l] and /dev/sd[a-z]
DEVICESCAN may be followed by any desired Directives.

Comment out the previous DEVICESCAN line with a hash (#) and add

/dev/sda -a -d sat -m root -M test
/dev/sda -a -d sat -o on -S on -s (S/../.././02|L/../../6/03) -m root -M exec /usr/libexec/smartmontools/smartdnotify
/dev/sdb -a -d sat -o on -S on -s (S/../.././02|L/../../6/03) -m root -M exec /usr/libexec/smartmontools/smartdnotify
/dev/sdc -a -d sat -o on -S on -s (S/../.././02|L/../../6/03) -m root -M exec /usr/libexec/smartmontools/smartdnotify

Add a line like this for each hard drive you want to monitor. After smartd is restarted you should also get a test email.

Here’s what each option does:

  • /dev/sda: Replace this with the device file you’ve been using in smartctl commands.
  • -a: This enables some common options. You almost certainly want to use it.
  • -d sat: On my system, smartctl correctly guesses that I have a Serial ATA drive. smartd, on the other hand, does not. If you had to add a “-d TYPE” parameter to the smartctl commands, you’ll almost certainly have to do the same here. If you didn’t, try leaving it out initially. You can add it later if smartd fails to start.
  • -o on, -S on: These have the same meaning as the smartctl equivalents.
  • -s (S/../.././02|L/../../6/03): This schedules the short and long self-tests. In this example, the short self-test will run daily at 2:00 A.M. The long test will run on Saturdays at 3:00 A.M. For more information, see the smartd.conf man page.
  • -m root: If any errors occur, smartd will send email to root. On my system, mail for root is forwarded to my normal email account. If you don’t have a similar setup, replace root with your normal email address. This option also requires a working email setup, so check that the server can actually send mail before relying on it.
  • -M exec PATH: Instead of sending email directly, “-M exec” makes smartd run a different command when errors occur. The lines above use /usr/libexec/smartmontools/smartdnotify. On Debian and Ubuntu the smartmontools packages use /usr/share/smartmontools/smartd-runner instead, which runs each script in /etc/smartmontools/run.d/, one of which emails the user specified by the “-m” option. Check that the script exists on your system. If it doesn’t, remove this option.
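According to the smartd.conf man page, the -s expression is matched against a 12-character string of the form T/MM/DD/d/HH, where T is the test type (S for short, L for long), d is the day of the week (1 is Monday, 7 is Sunday) and HH is the hour. A schedule can be checked by hand before it goes into smartd.conf. The sketch below uses grep -E as a stand-in for smartd's own matching, which is an assumption that should hold for simple patterns like these; both function names are invented for the example.

```shell
#!/bin/sh
# schedule_matches STAMP REGEX
# Succeeds when the whole T/MM/DD/d/HH stamp matches the extended regular
# expression, which is how the smartd.conf man page describes the -s check.
# schedule_matches is a hypothetical helper; grep -E stands in for smartd's
# own regex matching (an assumption, reasonable for simple patterns).
schedule_matches() {
    printf '%s\n' "$1" | grep -Eqx -- "$2"
}

# current_stamp TYPE
# Build the stamp for the current hour: month, day, weekday (1 = Monday), hour.
# current_stamp is also a hypothetical helper.
current_stamp() {
    date "+$1/%m/%d/%u/%H"
}

# Example: would a short test be due right now?
#   schedule_matches "$(current_stamp S)" '(S/../.././02|L/../../6/03)' && echo due
```

With the schedule used above, S/01/15/3/02 matches (any day at 02:00) while L/01/15/3/03 does not, because day 3 is a Wednesday and the long test is restricted to day 6.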

What you can expect.

This email was generated by the smartd daemon running on:

  host name: vmsrv.somewhere.co.nz
 DNS domain: somewhere.co.nz
 NIS domain: (none)

The following warning/error was logged by the smartd daemon: 

TEST EMAIL from smartd for device: /dev/sdc

For details see host's SYSLOG (default: /var/log/messages).

Reporting

You will also have to set up Sendmail to enable reporting.

http://ai.net.nz/wiki/index.php?title=Sendmail.v7

Set up reporting back to our server address

nano /etc/aliases

Add

root:           servers@xx.net.nz

Save and run

newaliases
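To confirm where mail for root will go, the alias can be read back from the file. This is a minimal sketch that assumes the simple "name: target" layout shown above; root_alias_target is a name invented for this example.

```shell
#!/bin/sh
# root_alias_target [FILE]
# Print where mail for root is forwarded according to an aliases file
# (default /etc/aliases). Comment lines are skipped. root_alias_target is
# a hypothetical helper name for this sketch.
root_alias_target() {
    awk -F: '
        /^[[:space:]]*#/ { next }
        $1 == "root" {
            sub(/^[^:]*:[[:space:]]*/, "")   # drop "root:" and leading blanks
            sub(/[[:space:]]+$/, "")         # drop trailing blanks
            print
        }
    ' "${1:-/etc/aliases}"
}

# Example:
#   root_alias_target            # reads /etc/aliases
```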

What is bad?

These two articles help make sense of which SMART errors are serious and when a hard drive needs to be replaced: http://www.computerworld.com/article/2846009/the-5-smart-stats-that-actually-predict-hard-drive-failure.html and https://www.backblaze.com/blog/hard-drive-smart-stats/.

In summary, these are the five attributes to watch:

  • SMART 5 - Reallocated_Sector_Count.
  • SMART 187 - Reported_Uncorrectable_Errors.
  • SMART 188 - Command_Timeout.
  • SMART 197 - Current_Pending_Sector_Count.
  • SMART 198 - Offline_Uncorrectable.
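The five attributes listed above can be picked out of the attribute table that smartctl -A prints. The sketch below assumes the usual ten-column ATA table with the raw value in the last column, and the function name critical_attributes is invented for this example. Some drives may pack several counters into one raw value (Command_Timeout is a common case), so a non-zero result is a prompt to look closer rather than proof of failure.

```shell
#!/bin/sh
# critical_attributes
# Read `smartctl -A` output on stdin and print the ID, name and raw value of
# any of the five attributes (5, 187, 188, 197, 198) whose raw value is
# above zero. critical_attributes is a hypothetical helper name. It assumes
# the usual ten-column ATA attribute table with RAW_VALUE in column 10.
critical_attributes() {
    awk '$1 ~ /^(5|187|188|197|198)$/ && $10 + 0 > 0 { print $1, $2, $10 }'
}

# Example (needs root and a real drive):
#   smartctl -A /dev/sda | critical_attributes
```

No output means all five raw values are zero, or that the drive does not report those attributes at all.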

CentOS 5

For CentOS 5, edit the aliases file

emacs /etc/aliases

Forward root to your server monitor email address.

newaliases

Let's test

mail -s "Testing from my new server" root </root/.bashrc

Check that you received the email.

smartd.conf

If your smartd.conf has

DEVICESCAN -H -m root

add -M test to it, giving

DEVICESCAN -H -m root -M test

Restart smartd

/etc/init.d/smartd restart

and you should get a test email. Now try setting up more extensive testing: comment out the previous DEVICESCAN line with a hash (#) and add the following.

/dev/sda -a -d sat -o on -S on -s (S/../.././02|L/../../6/03) -m root -M test

You should again get a test email. Finally, for ongoing monitoring I would use something like this

/dev/sda -a -d sat -m root -M test
/dev/sda -a -d sat -o on -S on -s (S/../.././02|L/../../6/03) -m root -M exec /usr/libexec/smartmontools/smartdnotify
/dev/sdb -a -d sat -o on -S on -s (S/../.././02|L/../../6/03) -m root -M exec /usr/libexec/smartmontools/smartdnotify

Add a similar line for each additional drive you want to test.

The first line sends me an email every time the daemon is restarted, which I like, and the following lines do the testing.

HP RAID Controller

https://www.thegeekstuff.com/2014/07/hpacucli-examples/

Download and install: https://support.hpe.com/hpsc/swd/public/detail?swItemId=MTX_04bffb688a73438598fef81ddd
