Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#658 closed enhancement (wontfix)

Many (long) HDD default timeouts cause data loss or corruption (silent controller resets)

Reported by: Ch.Ris Owned by:
Priority: minor Milestone:
Component: smartctl Version: 6.4
Keywords: Cc:

Description

The default error correction timeouts of many HDDs cause data loss or corruption: they are so long, or disabled entirely, that the controller hard-resets the drive instead of marking only single blocks as bad. The smartctl utility is the place to adjust the "scterc" timeouts of HDDs.

Please ship the provided scripts and default udev rule with the
smartmontools package, so that they try to configure safe timeouts,
depending on the drive's capabilities, usage and configuration.

The problem with mismatched default timeouts surfaced through repeated
reports on the linux-raid mailing list about drives being dropped from
RAID arrays and about a high rate of unrecoverable errors occurring
during RAID reconstruction, but the problem is not limited to redundant disk setups. Many reported "disk failures" are probably just failures due to mismatched recovery timeout settings.

(The scripts have been posted upstream without response, but it is still
a distro responsibility to ensure that installations will have safe
defaults. Note that the provided udev rules specific to mdadm are only
to be included in the mdadm package.)

RATIONALE

The error recovery (ERC) time of a drive *must* be shorter than the
controller timeout.

Otherwise read errors will cause controller resets, leading to direct
data loss or, if it is a redundant disk, loss of redundancy and a very
high probability of another read error and data loss when
re-establishing the redundancy.

If a drive does not support adjusting its ERC timeout, the controller
timeout must be increased above the drive's maximal error recovery time.
If you don't want that kind of long device timeout, you should look for
a drive with SCT ERC timeout support. (smartctl -l scterc /dev/...)
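
The rule above can be checked from a shell. The following is only a minimal sketch, not one of the attached scripts. It assumes a Linux system where the controller (kernel command) timeout is exposed in seconds at /sys/block/<dev>/device/timeout, and that smartctl reports and accepts ERC limits in tenths of a second. The function name erc_is_safe and the example value 70 (7.0 seconds) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: compare a drive's ERC limit with the controller timeout.
# erc_is_safe is a hypothetical helper name, not part of smartmontools.
#
# $1: ERC limit in deciseconds as used by "smartctl -l scterc"
#     (e.g. 70 = 7.0 s), or 0 when ERC is disabled
# $2: controller timeout in seconds (on Linux typically read from
#     /sys/block/<dev>/device/timeout)
erc_is_safe() {
    erc_ds=$1
    ctl_s=$2
    # A disabled ERC gives no guaranteed upper bound on recovery time.
    [ "$erc_ds" -gt 0 ] || return 1
    # Compare in deciseconds; the ERC limit must be strictly shorter.
    [ "$erc_ds" -lt $((ctl_s * 10)) ]
}

# Manual checks (need root and a real device, so left commented out):
#   smartctl -l scterc /dev/sda          # show the current read/write limits
#   smartctl -l scterc,70,70 /dev/sda    # set 7.0 s for both (assumed value)
#   cat /sys/block/sda/device/timeout    # controller timeout in seconds
```

With these assumptions, a 7.0 second ERC limit against a 30 second controller timeout passes the check, while a disabled ERC never does.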

The files are attached at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=780162

Attachments (1)

smartctl-timeouts_v1.02.zip (10.8 KB) - added by Ch.Ris 3 years ago.
smartctl-timeouts: unzip to /etc/udev/rules.d/ and reboot


Change History (15)

comment:1 Changed 3 years ago by Ch.Ris

Older versions of the files were first posted at https://sourceforge.net/p/smartmontools/mailman/message/33501936/

comment:2 Changed 3 years ago by Christian Franke

Component: all → smartctl
Milestone: undecided
Priority: major → minor
Type: defect → enhancement

This is a smartmontools use case, not a smartmontools bug. Does any distribution already provide a package including these scripts?

comment:3 Changed 3 years ago by Ch.Ris

You're right, the bug actually lies in the HDDs that began shipping with disabled recovery timeouts. But we can't adjust the firmware.

So now users and operating systems need to work around this by adjusting the timeouts properly during hotplug initialization. I don't know of a distro that is officially using the scripts yet, but judging from the mdadm response and the referenced Debian bug report, it's kind of a catch-22. Everybody is waiting: for upstream, for another distro, or, in the case of mdadm, for smartctl to ship the basic scripts before mdadm ships the corresponding udev rules that can adjust the timeouts for the md use case.

I guess, if distros need further initramfs support they will need to adjust the package anyway, so you don't have to deal with that. Just shipping smartctl together with the initial smartctl-timeouts* scripts, the additional udev rule and the systemd-unit-file example (after inspecting them and leaving out the mdadm specific udev rules) and making a release note for the distro maintainers to test and customize in their environment could be enough to trigger a solution process.

comment:4 Changed 3 years ago by Ch.Ris

The systemd-unit-file can also be left out, according to
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=779412
(dirty workaround)

comment:5 Changed 3 years ago by Ch.Ris

Assuming the scripts get installed to /usr/sbin/, the following is an adaptation of the udev rule from the zip file that also includes the machinecheck udev subsystem (it triggers on resume events and avoids the systemd-unit-file workaround):

SUBSYSTEM=="block|machinecheck", ENV{DEVTYPE}=="partition", TEST=="/usr/sbin/smartctl", RUN+="/usr/sbin/smartctl-timeouts_non-redundant-partition.sh $parent"
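
For readers without the attachment, the kind of helper such a rule hands a device to could look roughly like the sketch below. It is not the script from the zip file. The names adjust_timeouts, ERC_DS, FALLBACK_S, SMARTCTL and SYSFS are hypothetical, the values 70 (7.0 seconds ERC) and 180 (seconds of controller timeout) are assumed examples, and the sketch assumes that smartctl exits non-zero when the drive rejects the scterc command and that Linux exposes the controller timeout at /sys/block/<dev>/device/timeout.

```shell
#!/bin/sh
# Hypothetical stand-in for a timeout helper called from a udev RUN+= key.
# Not the attached script. All names and default values are assumptions.
ERC_DS=${ERC_DS:-70}                      # ERC limit in deciseconds (7.0 s)
FALLBACK_S=${FALLBACK_S:-180}             # controller timeout if ERC is unavailable
SMARTCTL=${SMARTCTL:-/usr/sbin/smartctl}  # overridable for dry runs and tests
SYSFS=${SYSFS:-/sys}                      # overridable for dry runs and tests

adjust_timeouts() {
    dev=$1    # kernel device name, e.g. "sda"
    # First try to bound the drive's own recovery time.
    if "$SMARTCTL" -l "scterc,$ERC_DS,$ERC_DS" "/dev/$dev" >/dev/null 2>&1; then
        echo "$dev: ERC set to $ERC_DS deciseconds"
    else
        # Drive cannot limit its recovery time: raise the controller timeout.
        echo "$FALLBACK_S" > "$SYSFS/block/$dev/device/timeout" &&
            echo "$dev: controller timeout raised to $FALLBACK_S s"
    fi
}

# Example call (needs root and a real device, so left commented out):
#   adjust_timeouts sda
```

Trying the drive-side limit first and only then touching the controller timeout follows the rationale in the description: a long controller timeout is the fallback for drives without SCT ERC support. Given the warning about buggy controllers later in this ticket, anything like this should be tested per system before it runs automatically.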

comment:6 Changed 3 years ago by Alex Samorukov

Resolution: wontfix
Status: new → closed

Sorry, but this has nothing to do with the package itself. I use smartctl to set these values on my servers from the OS's normal init scripts. I am really not sure that it is a good idea to distribute such scripts with smartctl; however, we can mention this issue in the wiki, if you want. Applying this automatically could be dangerous, because on some buggy controllers setting these timeouts may cause problems, typically due to a buggy SAT layer implementation.

comment:7 Changed 3 years ago by Ch.Ris

Hm, ok so what you suggest is a separate package. I think it would be good to inform smartmontools users how they can prevent possible data loss with smartctl in the wiki, ideally also in the readme and package metadata (init scripts won't help with suspend/resume and hotplug).

I am not so sure, though, that a separate package should really be necessary. Instead of the shell scripts, smartctl could have an option to "auto adjust the HDD timeouts". The logic could of course be disabled by default, if a safe default for buggy controllers has to be considered more important than a safe default for many HDDs in general. The shell scripts could also be disabled by default.

comment:8 Changed 3 years ago by Christian Franke

Milestone: undecided

comment:9 Changed 3 years ago by Alex Samorukov

My suggestion is to prepare a description for the FAQ; we can put it here. A few notes in the ERC timeout description of the smartctl/smartd man pages would also be a good idea.

comment:10 Changed 3 years ago by Ch.Ris

Sounds good. Looking into the FAQ, I don't know how to fit it in, though. A nice item could be:

Why does smartctl warn about a disabled or too long error correction (erc) timeout?

->Info and pointer to the scripts.

comment:11 Changed 3 years ago by Alex Samorukov

Please also attach your scripts to the ticket and I will mention them in the FAQ. I added information about ERC and RAID in the FAQ, see https://www.smartmontools.org/wiki/FAQ?action=diff&version=55

Changed 3 years ago by Ch.Ris

Attachment: smartctl-timeouts_v1.02.zip added

smartctl-timeouts: unzip to /etc/udev/rules.d/ and reboot

comment:12 Changed 3 years ago by Ch.Ris

Thanks for mentioning them. I've updated the .zip to include the changes mentioned above.

To me, the current FAQ sounds a little as if there were not much of a problem without RAID, but I think that may be a misconception. Non-RAID setups simply have no redundancy at all and no chance to mitigate controller resets resulting from the drive defaults.

I am proposing a revision of the first part of the FAQ item:


Hard disks with error recovery control (ERC), also known as time-limited error recovery (TLER) by Western Digital or command completion time limit (CCTL) by Samsung/Hitachi, allow configuring the amount of time a drive's firmware may spend attempting to recover from a read or write error.

The error recovery (ERC) time of a drive *must* be shorter than the system's controller timeout. Otherwise errors will cause a controller reset and the loss of all unwritten data. Unfortunately, many drives by default have very long or disabled timeouts.

In redundant RAID hardware or software configurations, a drive timeout shorter than the controller's timeout is equally important. Here, resetting an entire drive instead of just retrying the failed block causes entire drives to be marked as unusable, reducing redundancy and performance. Furthermore, errors are highly likely to occur during the re-sync of a drive (seldom-used areas are read), and a drive reset during the re-sync can render the entire array unusable, as all unwritten metadata is lost. Limiting the drives' recovery timeout also allows for improved error handling in hardware or software RAID environments: instead of waiting for one drive to recover the requested data, the data can quickly be read from another (redundant) drive.

...

comment:13 Changed 3 years ago by Ch.Ris

For those who find the scripts useful: I have looked into creating a proper software package, but I'm sorry, I am not able to do it. You are very welcome to create a distributable Linux package.

comment:14 Changed 3 years ago by Alex Samorukov

Thank you for your contribution. I decided not to extend this FAQ item too much, but to link to this ticket instead. Drives without a configured timeout may behave very differently, so it does not always mean that no timeout is in effect; for us the behavior is simply undefined. Also, most NAS-oriented drives (yes, I know it's a buzzword) already come with ERC enabled and configured.
