Opened 7 days ago

Last modified 3 days ago

#1153 new defect

Some issues occurred when I used the command "smartctl -C -t short" on HDD test

Reported by: jerrytw168 Owned by:
Priority: critical Milestone: undecided
Component: smartctl Version: 6.6
Keywords: Cc: linjerrytw@…



When I ran "smartctl -C -t short /dev/sdb" to test an SSD, the self-test log (viewed with "smartctl -a") showed "Interrupted (host reset)" as follows.

# 6 Short captive Interrupted (host reset) 70% 2423 -
# 7 Short captive Interrupted (host reset) 70% 2408 -
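Entries like the two above can be picked out of a longer self-test log with a short script. The following is only a sketch: it assumes each entry has the whitespace-separated layout shown in this ticket (entry number, description and status, remaining percentage, lifetime hours, LBA of first error), and the function names `parse_entry` and `host_reset_entries` are hypothetical helpers, not part of smartmontools.

```python
import re

# Assumed layout, taken from the two entries quoted in this ticket:
#   "# <num> <description and status> <remaining>% <lifetime hours> <LBA>"
ENTRY_RE = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<body>.+?)\s+(?P<remaining>\d+)%"
    r"\s+(?P<hours>\d+)\s+(?P<lba>\S+)\s*$"
)


def parse_entry(line):
    """Parse one self-test log line; return None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "num": int(match.group("num")),
        "body": match.group("body"),  # test description plus status text
        "remaining_percent": int(match.group("remaining")),
        "lifetime_hours": int(match.group("hours")),
        "lba": match.group("lba"),
    }


def host_reset_entries(lines):
    """Return the parsed entries whose status is 'Interrupted (host reset)'."""
    parsed = (parse_entry(line) for line in lines)
    return [e for e in parsed if e and "Interrupted (host reset)" in e["body"]]
```

Feeding it the lines printed by "smartctl -a" would return one dictionary per interrupted test; lines that do not look like log entries are skipped.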

The kernel log (/dev/log/dmesg) also showed the error messages below.
However, when I removed -C (captive mode), these issues disappeared. I tried many SSDs (Intel, Samsung, HGST) and got the same symptom each time.

Could you please advise whether the "-C" option cannot be used together with "-t" in the same test, or whether this is a bug in smartctl? I would be grateful for any help you can provide.

[166867.098164] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[166867.098172] ata2.00: failed command: SMART
[166867.098180] ata2.00: cmd b0/d4:00:81:4f:c2/00:00:00:00:00/00 tag 25
         res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

[166867.098184] ata2.00: status: { DRDY }
[166867.098189] ata2: hard resetting link
[166867.403151] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[166867.403678] ata2.00: supports DRM functions and may not be fully accessible
[166867.404715] ata2.00: supports DRM functions and may not be fully accessible
[166867.405200] ata2.00: configured for UDMA/133
[166867.405218] ata2: EH complete

Change History (2)

comment:1 Changed 4 days ago by Christian Franke

The kernel log shows that the SMART command which runs the captive test was aborted by the driver with "timeout". Then the driver resets the device. The device reset aborts the running self test. This is then recorded as "host reset" in the self-test log.
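To see that the failed command really is the captive self-test, the "cmd" field of the kernel log line can be decoded. The sketch below assumes the usual libata layout for that field (command/feature:sector count:LBA low:LBA mid:LBA high/...), and the subcommand table follows the ATA values for SMART EXECUTE OFF-LINE IMMEDIATE as I understand them. The function name `decode_taskfile` is a hypothetical helper for illustration.

```python
import re

# Subcommands of SMART EXECUTE OFF-LINE IMMEDIATE (feature 0xD4),
# passed in the LBA low register. Values at or above 0x80 are captive mode.
SELF_TEST_SUBCOMMANDS = {
    0x01: "short self-test, off-line mode",
    0x02: "extended self-test, off-line mode",
    0x7F: "abort self-test",
    0x81: "short self-test, captive mode",
    0x82: "extended self-test, captive mode",
}

# Assumed field order: command/feature:nsect:lbal:lbam:lbah/...
CMD_RE = re.compile(
    r"cmd (?P<command>[0-9a-f]{2})/(?P<feature>[0-9a-f]{2}):"
    r"(?P<nsect>[0-9a-f]{2}):(?P<lbal>[0-9a-f]{2}):"
    r"(?P<lbam>[0-9a-f]{2}):(?P<lbah>[0-9a-f]{2})/"
)


def decode_taskfile(line):
    """Decode the 'cmd' registers of a libata error line, or return None."""
    match = CMD_RE.search(line)
    if match is None:
        return None
    regs = {name: int(value, 16) for name, value in match.groupdict().items()}
    # SMART commands use opcode 0xB0 with the signature 0x4F/0xC2
    # in the LBA mid and LBA high registers.
    is_smart = (
        regs["command"] == 0xB0 and regs["lbam"] == 0x4F and regs["lbah"] == 0xC2
    )
    subcommand = None
    if is_smart and regs["feature"] == 0xD4:
        subcommand = SELF_TEST_SUBCOMMANDS.get(regs["lbal"], "unknown subcommand")
    return {"registers": regs, "is_smart": is_smart, "subcommand": subcommand}
```

Applied to the "cmd b0/d4:00:81:4f:c2/..." line from the log, this reports a SMART command whose subcommand 0x81 is the short self-test in captive mode, which matches the "-C -t short" invocation.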

The problem is that smartctl does not pass a sufficiently long command timeout to the driver in this case. Some drivers don't even support long timeouts.

Why do you need captive tests?

PS: For future submissions, please do not set a milestone.

comment:2 Changed 3 days ago by jerrytw168

Thanks for your prompt reply.

As for your question, "Why do you need captive tests?":
According to the description under "SMART RUN/ABORT OFFLINE TEST AND self-test OPTIONS", the '-C' option can be used in conjunction with a short or long self-test. That is why I ran the test in captive mode.

To be honest, I don't understand what the purpose of captive mode is. If possible, could you explain when I would need to run an SSD self-test in captive mode?

Thanks for your help.
