Opened 5 months ago

Closed 5 months ago

Last modified 5 months ago

#1207 closed defect (invalid)

WD Red 6TB - WDC WD60EFAX-68SHWN0 reports wrong self-test polling time recommendation

Reported by: Bear_
Owned by:
Priority: minor
Milestone:
Component: smartctl
Version: 7.0
Keywords: ata
Cc:

Description

The duration of the long (extended) self-test is reported as 7 minutes.
Running the long self-test with smartctl (under linux) indeed takes 7 minutes. Running the extended self test under W7 with Data Lifeguard Diagnostics took about 4 hours and 30 minutes (same disk of course).

I also see that the conveyance test has the same recommended duration as the short self-test, which is unusual. I didn't run these tests on either system.

Is this a bug in smartctl or in the disk?

Attachments (2)

smartctl-WDC-WD60EFAX-68SHWN0.txt (16.6 KB) - added by Bear_ 5 months ago.
result of smartctl -q noserial -x /dev/sdc > smartctl-VENDOR-MODEL.txt
WD60EFAX-68SHWN0_ataioctl_2.txt (6.9 KB) - added by Bear_ 5 months ago.
result of smartctl -r ataioctl,2 -q noserial -c /dev/sdc


Change History (9)

Changed 5 months ago by Bear_

result of smartctl -q noserial -x /dev/sdc > smartctl-VENDOR-MODEL.txt

comment:1 Changed 5 months ago by Christian Franke

Component: all → smartctl
Keywords: ata added
Milestone: undecided

ATA-8 introduced a new field for drives with extended self-test polling time > 0xff. Either the drive does not set it correctly or smartctl does not interpret it correctly.

Please provide output of:
smartctl -r ataioctl,2 -q noserial -c /dev/sdc

Changed 5 months ago by Bear_

result of smartctl -r ataioctl,2 -q noserial -c /dev/sdc

comment:2 Changed 5 months ago by Christian Franke

Spec for Device SMART data structure from T13/1699-D Revision 6a (ATA8-ACS) up to T13/2161-D Revision 5 (ACS-3):

Offset     Description
372        Short self-test routine recommended polling time (in minutes).
373        Extended self-test routine recommended polling time in minutes. If FFh, use bytes 375 and 376 for the polling time.
374        Conveyance self-test routine recommended polling time in minutes.
375..376   Extended self-test routine recommended polling time in minutes (word).

(ACS-4 and later removed SMART spec and refer to ACS-3)

Observed values:

...
REPORT-IOCTL: Device=/dev/sdc Command=SMART READ ATTRIBUTE VALUES
 Input:   FR=0xd0, SC=0x01, LL=...., LM=0x4f, LH=0xc2, DEV=...., CMD=0xb0 IN
 [Duration: 0.006s]
REPORT-IOCTL: Device=/dev/sdc Command=SMART READ ATTRIBUTE VALUES returned 0
...
368-383: 03 00 01 00 02 07 02 00 00 00 00 00 00 00 00 00 |................|
                              ^^-^^ Extended (word)
                           ^^------ Conveyance
                        ^^--------- Extended < 0xff, if 0xff see above
                     ^^------------ Short
...
General SMART Values:
...
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (   7) minutes.
Conveyance self-test routine
recommended polling time: 	 (   2) minutes.

Conclusion: smartctl prints the values as returned by the device.
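
For illustration, the decoding rule from the table can be applied to the observed bytes in a few lines of Python. This is only a sketch, not smartctl's actual code: the function name `polling_times` is made up here, and the little-endian byte order of the word at 375..376 is an assumption (it is the usual order for ATA words). The buffer is zero-filled outside bytes 368..383, since only those bytes are shown in the dump.

```python
import struct

# Byte offsets within the 512-byte SMART data structure, as listed in the
# table quoted from ATA8-ACS up to ACS-3.
OFF_SHORT = 372
OFF_EXTENDED = 373
OFF_CONVEYANCE = 374
OFF_EXTENDED_WORD = 375  # bytes 375..376


def polling_times(smart_data: bytes) -> dict:
    """Return the recommended polling times in minutes (hypothetical helper)."""
    extended = smart_data[OFF_EXTENDED]
    if extended == 0xFF:
        # Assumption: the word is stored little-endian, like other ATA words.
        (extended,) = struct.unpack_from("<H", smart_data, OFF_EXTENDED_WORD)
    return {
        "short": smart_data[OFF_SHORT],
        "extended": extended,
        "conveyance": smart_data[OFF_CONVEYANCE],
    }


# Bytes 368..383 exactly as shown in the dump; everything else is zero-filled
# only to make the buffer 512 bytes long.
observed = bytes.fromhex("03 00 01 00 02 07 02 00 00 00 00 00 00 00 00 00")
smart_data = bytes(368) + observed + bytes(512 - 368 - len(observed))

print(polling_times(smart_data))
# prints {'short': 2, 'extended': 7, 'conveyance': 2}
```

Byte 373 is 07h rather than FFh, so the word at 375..376 is not consulted and the result matches the 7 minutes printed by smartctl.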

comment:3 Changed 5 months ago by Christian Franke

Running the long self-test with smartctl (under linux) indeed takes 7 minutes. Running the extended self test under W7 with Data Lifeguard Diagnostics took about 4 hours and 30 minutes (same disk of course).

Are you sure that Data Lifeguard Diagnostics actually uses SMART self-tests?
Does smartctl -c report Self-test routine in progress... during such tests?
Do such tests appear in the self-test logs (-l selftest -l xselftest) after completion?

comment:4 Changed 5 months ago by Bear_

Data Lifeguard Diagnostics (DLG) offers two tests, QUICK TEST and EXTENDED TEST (strangely enough no conveyance test). The description in the tool is:

QUICK TEST performs SMART drive quick self-test to gather and verify the Data Lifeguard information contained on the drive.

EXTENDED TEST performs a Full Media Scan to detect bad sectors. This test may take hours for a large drive.

So, previously it escaped my attention that the extended test is not even claimed to be a SMART self-test.

I started both scans. The quick test showed up in smartctl -c as "Self-test routine in progress ... 90% of test remaining" (at some point). It completed after 2 minutes, and the test appeared in the self-test log.

The DLG extended test showed up neither in smartctl -c nor in the self-test log. I canceled the test after a minute, but the earlier DLG extended test had run to completion, and it did not show up in the log either.

Edit: I also did the self-tests with PassMark's DiskCheckup. Both short and extended self-test show up in smartctl (both -c and -l selftest). The short took 2 minutes, and the extended 7 minutes.

I am not a drive expert, but as far as I understand the self-test polling time is indeed not set(?) correctly. I am confused, because the drive seems to do what it says, but I can't imagine that the extended (SMART) self-test does a full surface scan (as I think is usual for extended self-tests) in 7 minutes.

Before opening a ticket here, I contacted WD, and they said that if the DLG extended test completes successfully, the drive is okay.

What can I do, what should I do? Try to convince the WD help desk, or return the drive? I don't feel comfortable with the idea that the SMART functionality is not implemented correctly in a drive that I use to store a lot of data.

Last edited 5 months ago by Bear_

comment:5 in reply to:  4 Changed 5 months ago by Christian Franke

The DLG extended test showed up neither in smartctl -c nor in the self-test log. I canceled the test after a minute, but the earlier DLG extended test had run to completion, and it did not show up in the log either.

This likely means that DLG does the read scan itself. Then the host read counters from device statistics (smartctl -l devstat or -x) should increase quickly during the test:

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
...
0x01  0x028  6     11721284557  ---  Logical Sectors Read
0x01  0x030  6        45788458  ---  Number of Read Commands

SMART self-tests should not affect read counters because no host I/O is done.
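
One way to check this is to take the device statistics output once before and once during the test and compare the two read counters. The Python sketch below parses rows in the layout shown above; the helper names `read_counters` and `delta` are made up for this example, and the "during" values are hypothetical numbers chosen only to show the comparison.

```python
import re

# Matches data rows of the Device Statistics table, e.g.
# "0x01  0x028  6     11721284557  ---  Logical Sectors Read".
ROW = re.compile(
    r"^\s*(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.+?)\s*$"
)

WANTED = {"Logical Sectors Read", "Number of Read Commands"}


def read_counters(devstat_text: str) -> dict:
    """Extract the host read counters from device statistics text."""
    counters = {}
    for line in devstat_text.splitlines():
        m = ROW.match(line)
        if m and m.group(6) in WANTED:
            counters[m.group(6)] = int(m.group(4))
    return counters


def delta(before: dict, after: dict) -> dict:
    """Per-counter increase between two snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


# Snapshot taken from the output quoted in this ticket.
before_text = """\
Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  0x028  6     11721284557  ---  Logical Sectors Read
0x01  0x030  6        45788458  ---  Number of Read Commands
"""

# Hypothetical later snapshot, invented for illustration only.
during_text = """\
0x01  0x028  6     11721808845  ---  Logical Sectors Read
0x01  0x030  6        45790506  ---  Number of Read Commands
"""

print(delta(read_counters(before_text), read_counters(during_text)))
```

If both deltas stay at or near zero while a test is running, the reads are not coming from the host, which is what a SMART self-test would look like.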

Edit: I also did the self-tests with PassMark's DiskCheckup. Both short and extended self-test show up in smartctl (both -c and -l selftest). The short took 2 minutes, and the extended 7 minutes.

This likely means that WD decided to implement an "extended" self-test which does no full read scan. So the polling time is set correctly but the test is implemented in an at least "unusual" way.

What can I do, what should I do? Try to convince the WD help desk, or return the drive?

Try the selective self-test, which allows you to specify LBA ranges (see the man page). This command should perform a read scan of the full LBA range: smartctl -t select,0-max /dev/sdc

comment:6 Changed 5 months ago by Christian Franke

Resolution: invalid
Status: new → closed

The self-test recommended polling times are printed correctly by smartctl.

The extended self-test implemented by the firmware of this drive is far too short for a full read scan. This cannot be fixed by smartctl.
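
A rough back-of-the-envelope calculation shows why. The Python sketch below assumes the nominal capacity of 6 TB means 6 * 10^12 bytes; the function name `required_throughput` is made up for this example.

```python
def required_throughput(capacity_bytes: float, minutes: float) -> float:
    """Average bytes per second needed to read capacity_bytes in the given time."""
    return capacity_bytes / (minutes * 60)


CAPACITY = 6e12  # assumption: nominal 6 TB, decimal units

rate = required_throughput(CAPACITY, 7)
print(f"{rate / 1e9:.1f} GB/s")  # prints "14.3 GB/s"
```

Reading every sector in 7 minutes would need a sustained media rate of roughly 14 GB/s, which is far beyond what a single hard disk can deliver.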

Note that the ATA standards do not specify what an extended self-test should do. Only a selective self-test is required to do a read scan.

comment:7 Changed 5 months ago by Christian Franke

Milestone: undecided