Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#1225 closed defect (fixed)

Missed info for SAS SSD after update of smartmontools from 5.43 to 7.0

Reported by: Andrey P. Owned by:
Priority: major Milestone: Release 7.1
Component: smartctl Version: 7.0
Keywords: scsi Cc:

Description

Hello,

I'm trying to get smart info from SAMSUNG MZILS3T8HMLH/007 SAS SSD drive.

smartmontools 5.43 works properly:

~]# smartctl -a /dev/sda
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-042stab138.1] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
Vendor: SAMSUNG
Product: MZILS3T8HMLH/007
Revision: GXL0
User Capacity: 3,840,755,982,336 bytes [3.84 TB]
Logical block size: 512 bytes
Logical Unit id: ...
Serial number: ...
Device type: disk
Transport protocol: SAS
Local Time is: ...
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
SS Media used endurance indicator: 0%
Current Drive Temperature: 33 C
Drive Trip Temperature: 70 C
Manufactured in week 19 of year 2017
Accumulated start-stop cycles: 26
Specified load-unload count over device lifetime: 0
Accumulated load-unload cycles: 0
Elements in grown defect list: 0
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        8         0         8          8          ...           0
write:         0        0         0         0          0          ...           0
Non-medium error count: 1
No self-tests have been logged
Long (extended) Self Test duration: 3600 seconds [60.0 minutes]

smartmontools 7.0 doesn't work:

~]# smartctl -a /dev/sda
smartctl 7.0 2019-05-21 r4916 [x86_64-linux-3.10.0-957.12.2.vz7.86.2] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SAMSUNG
Product: MZILS3T8HMLH/007
Revision: GXL0
Compliance: SPC-4
User Capacity: 3,840,755,982,336 bytes [3.84 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Logical Unit id: ...
Serial number: ...
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: ...
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C
Elements in grown defect list: 0
Error Counter logging not supported
Device does not support Self Test logging

In particular, I need the missing "SS Media used endurance indicator" to be reported by smartmontools 7.0.

While investigating the smartmontools code, I found that this line is printed by the scsiPrintSSMedia function, but in 7.0 that function is called only if is_disk==true:

is_disk = ((SCSI_PT_DIRECT_ACCESS == peripheral_type) ||	
              (SCSI_PT_HOST_MANAGED == peripheral_type));

Is it possible that is_disk==false for such drive? How can I work around this?

Thanks.

Attachments (1)

scsiprint.cpp.diff.html (530.6 KB) - added by Andrey P. 5 years ago.
Diff for scsiprint.cpp 5.43 vs 7.0

Change History (16)

by Andrey P., 5 years ago

Attachment: scsiprint.cpp.diff.html added

Diff for scsiprint.cpp 5.43 vs 7.0

in reply to:  description comment:1 by Christian Franke, 5 years ago

Keywords: sas endurance removed
Milestone: undecided
Priority: critical → major
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-042stab138.1] (local build)
...
Device type: disk
...
Elements in grown defect list: 0

...

smartctl 7.0 2019-05-21 r4916 [x86_64-linux-3.10.0-957.12.2.vz7.86.2] (local build)
...
Device type: disk
...
Elements in grown defect list: 0

...

First of all, both version info lines look strange:
r4347 is a commit between 6.5 and 6.6 and much newer than the 7+ year old 5.43 (2012-06-30 r3573).
r4916 is only a few commits from current SVN HEAD (r4934) and newer than 7.0 (2018-12-30 r4883).
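
The mismatch can also be checked mechanically. The following C++ sketch extracts the SVN revision from a banner line so it can be compared with the release revisions quoted above (r3573 for 5.43, r4883 for 7.0). The helper names `banner_revision` and `matches_release` are illustrative assumptions and not part of smartmontools.

```cpp
// Sketch only: the helper names below are hypothetical, not smartmontools API.

// Extracts the number that follows " r" in a smartctl banner line such as
// "smartctl 7.0 2019-05-21 r4916 [...]". Returns -1 if no revision is found.
constexpr int banner_revision(const char* s) {
    for (; *s != '\0'; ++s) {
        if (s[0] == ' ' && s[1] == 'r' && s[2] >= '0' && s[2] <= '9') {
            int rev = 0;
            for (const char* p = s + 2; *p >= '0' && *p <= '9'; ++p)
                rev = rev * 10 + (*p - '0');
            return rev;
        }
    }
    return -1;
}

// True when the banner carries exactly the revision quoted for a release.
constexpr bool matches_release(const char* banner, int release_rev) {
    return banner_revision(banner) == release_rev;
}
```

Applied to the two banners from the report, neither revision matches its release, which is what prompted the question about the code revision actually used.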

Is it possible that is_disk==false for such drive?

Possibly not:
Device type: disk is only printed if peripheral_type == SCSI_PT_DIRECT_ACCESS (0) which should set is_disk = true.
Elements in grown defect list is only printed if is_disk == true.

The gSSMediaLPage might no longer be set. This would also skip scsiPrintSSMedia().
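
The two conditions discussed here can be condensed into a minimal C++ sketch. The constant names follow the snippet in the description; the numeric values are the SCSI peripheral device type codes (0x00 for direct access, 0x14 for host-managed zoned devices). The functions `peripheral_type_of` and `would_print_ss_media` are hypothetical helpers for illustration, not the actual smartmontools code.

```cpp
#include <cstdint>

// Peripheral device type codes from the standard INQUIRY response.
constexpr std::uint8_t SCSI_PT_DIRECT_ACCESS = 0x00;
constexpr std::uint8_t SCSI_PT_HOST_MANAGED  = 0x14;

// Byte 0 of the INQUIRY data: bits 7..5 hold the peripheral qualifier,
// bits 4..0 hold the peripheral device type.
constexpr std::uint8_t peripheral_type_of(std::uint8_t inquiry_byte0) {
    return inquiry_byte0 & 0x1f;
}

// Same test as the is_disk expression quoted in the description.
constexpr bool is_disk(std::uint8_t peripheral_type) {
    return peripheral_type == SCSI_PT_DIRECT_ACCESS ||
           peripheral_type == SCSI_PT_HOST_MANAGED;
}

// Hypothetical condensation of the gate: the endurance indicator is printed
// only if the device is a disk AND the Solid State Media log page was
// detected (the role played by gSSMediaLPage).
constexpr bool would_print_ss_media(std::uint8_t peripheral_type,
                                    bool ss_media_page_seen) {
    return is_disk(peripheral_type) && ss_media_page_seen;
}
```

With a peripheral type of 0, `is_disk` holds, so under this reading the missing line would have to come from the log page flag rather than from the device type.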

Diff for scsiprint.cpp 5.43 vs 7.0

The diff between r4347 and r4916 looks different.
The diff between r3573 and r4916 is closer.

Please explain which code revision was actually used for the reported outputs. If possible, please try some revision between both revisions.

comment:2 by Andrey P., 5 years ago

Please explain which code revision was actually used for the reported outputs.

Original 5.43 + updated drivedb.h to r4347
Original 7.0 + updated drivedb.h to r4916

comment:3 by Andrey P., 5 years ago

Debug output (7.0).

~]# smartctl -x /dev/sda -r ioctl,2

smartctl 7.0 2019-05-21 r4916 [x86_64-linux-3.10.0-957.12.2.vz7.96.21] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

sda -> /sys/class/scsi_host/host0/proc_name: "mpt3sas"
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 00 00 00 24 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=0
  Incoming data, len=36:
 00     00 00 06 12 c5 01 10 03  53 41 4d 53 55 4e 47 20
 10     4d 5a 49 4c 53 33 54 38  48 4d 4c 48 2f 30 30 37
 20     47 58 4c 30
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 01 00 00 fc 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=236
  Incoming data, len=16:
 00     00 00 00 0c 00 80 83 86  87 88 8d 90 91 b0 b1 b2
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 00 00 00 24 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=0
  Incoming data, len=36:
 00     00 00 06 12 c5 01 10 03  53 41 4d 53 55 4e 47 20
 10     4d 5a 49 4c 53 33 54 38  48 4d 4c 48 2f 30 30 37
 20     47 58 4c 30
=== START OF INFORMATION SECTION ===
Vendor:               SAMSUNG
Product:              MZILS3T8HMLH/007
Revision:             GXL0
Compliance:           SPC-4
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [read capacity(16): 9e 10 00 00 00 00 00 00 00 00 00 00 00 20 00 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=0
  Incoming data, len=32:
 00     00 00 00 01 bf 1f 72 af  00 00 02 00 00 03 c0 00
 10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
User Capacity:        3,840,755,982,336 bytes [3.84 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 01 b2 00 08 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=0
  Incoming data, len=8:
 00     00 b2 00 04 00 e6 01 00
LU is resource provisioned, LBPRZ=1
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 01 b1 00 40 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=0
  Incoming data, len=64:
 00     00 b1 00 3c 00 01 00 03  03 00 00 00 00 00 00 00
 10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 30     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [mode sense(6): 1a 00 1c 00 40 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=40
  Incoming data, len=24:
 00     17 00 10 08 ff ff ff ff  00 00 02 00 9c 0a 31 04
 10     00 00 17 70 00 00 00 00
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [mode sense(6): 1a 00 5c 00 40 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=40
  Incoming data, len=24:
 00     17 00 10 08 ff ff ff ff  00 00 02 00 9c 0a bf 0f
 10     ff ff ff ff ff ff ff ff
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 01 83 00 fc 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=176
  Incoming data, len=76:
 00     00 83 00 48 01 03 00 08  50 02 53 8a 98 81 1f b0
 10     61 93 00 08 50 02 53 8a  98 81 1f b2 61 94 00 04
 20     00 00 00 01 61 a3 00 08  50 02 53 8a 98 81 1f b1
 30     03 28 00 18 6e 61 61 2e  35 30 30 32 35 33 38 41
 40     39 38 38 31 31 46 42 31  00 00 00 00
Logical Unit id:      ...
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [inquiry: 12 01 80 00 fc 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=228
  Incoming data, len=24:
 00     00 80 00 14 53 33 4d 35  4e 58 30 4b 38 30 30 33
 10     36 33 20 20 20 20 20 20
Serial number:        ...
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        ...
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [test unit ready: 00 00 00 00 00 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=0
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [mode sense(6): 1a 00 08 00 40 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=32
  Incoming data, len=32:
 00     1f 00 10 08 ff ff ff ff  00 00 02 00 88 12 04 00
 10     ff ff 00 00 ff ff ff ff  00 ff 00 ff 00 00 00 00
Read Cache is:        Enabled
Writeback Cache is:   Enabled

=== START OF READ SMART DATA SECTION ===
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=0
  Incoming data, len=4:
 00     00 00 00 0e
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=0
  Incoming data, len=18:
 00     00 00 00 0e 00 02 03 05  06 0d 0e 0f 10 11 15 18
 10     1a 2f
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [log sense: 4d 00 40 ff 00 00 00 3e fc 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=1 milliseconds  resid=16112
  Incoming data, len=12:
 00     40 ff 00 04 00 ff 34 ff  00 00 00 00
scsiGetSupportedLogPages: number of unreported (standard) log pages: 0 (sub-pages: 0)
>>>> do_scsi_cmnd_io: sg_io_ver=3
 [request sense: 03 00 00 00 12 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=0
  Incoming data, len=18:
 00     70 00 00 00 00 00 00 10  00 00 00 00 00 00 00 80
 10     00 00
SMART Health Status: OK
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

>>>> do_scsi_cmnd_io: sg_io_ver=3
 [read defect list(12): b7 0c 00 00 00 00 00 00 00 08 00 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=2 milliseconds  resid=0
  Incoming data, len=8:
 00     00 0c 00 01 00 00 00 00
Elements in grown defect list: 0

Error Counter logging not supported

>>>> do_scsi_cmnd_io: sg_io_ver=3
 [mode sense(6): 1a 00 0a 00 40 00 ]
  scsi_status=0x0, sg_transport_status=0x0, sg_driver_status=0x0
  sg_info=0x0  sg_duration=0 milliseconds  resid=40
  Incoming data, len=24:
 00     17 00 10 08 ff ff ff ff  00 00 02 00 8a 0a 00 10
 10     00 00 00 00 00 00 0e 10
Device does not support Self Test logging
Device does not support Background scan results logging
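
One way to read the LOG SENSE responses in the trace above is sketched below in C++. The byte arrays are copied from the trace; the function `lists_page` is a hypothetical helper, not smartmontools code. The first response (page 0x00, subpage 0x00) lists 14 page codes, including 0x0d (Temperature) and 0x11 (Solid State Media). The second response (page 0x00, subpage 0xff) lists only the page/subpage pairs 0x00/0xff and 0x34/0xff. Whether smartctl 7.0 relied on the second list when deciding which pages to read is an assumption here, not something the trace itself proves.

```cpp
#include <cstddef>
#include <cstdint>

// Response to "log sense: 4d 00 40 00 ..." (supported log pages).
constexpr std::uint8_t kSupportedPages[] = {
    0x00, 0x00, 0x00, 0x0e, 0x00, 0x02, 0x03, 0x05, 0x06,
    0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x15, 0x18, 0x1a, 0x2f};

// Response to "log sense: 4d 00 40 ff ..." (supported pages and subpages).
constexpr std::uint8_t kSupportedSubpages[] = {
    0x40, 0xff, 0x00, 0x04, 0x00, 0xff, 0x34, 0xff, 0x00, 0x00, 0x00, 0x00};

// Hypothetical helper: returns true if `page` appears in a supported-pages
// response. Bytes 2..3 hold the big-endian length of the list. When the SPF
// bit (0x40 in byte 0) is set, entries are (page, subpage) pairs; otherwise
// each entry is a single page code.
constexpr bool lists_page(const std::uint8_t* resp, std::size_t len,
                          std::uint8_t page) {
    if (len < 4) return false;
    const bool spf = (resp[0] & 0x40) != 0;
    const std::size_t step = spf ? 2 : 1;
    std::size_t end = 4 + ((std::size_t(resp[2]) << 8) | resp[3]);
    if (end > len) end = len;  // never read past the received data
    for (std::size_t i = 4; i + step <= end; i += step)
        if ((resp[i] & 0x3f) == page) return true;
    return false;
}
```

With these inputs, page 0x11 is found in the first list but not in the second, which would be consistent with the drive reporting an incomplete subpage list.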

comment:4 by Christian Franke, 5 years ago

Possibly related: ticket #1239.

comment:5 by Doug Gilbert, 5 years ago

Subversion revision 4958 (of smartmontools version 7.1) should fix this problem, I hope. The firmware problem that caused this has been reported to Samsung.

comment:6 by Christian Franke, 5 years ago

Milestone: undecided → Release 7.1
Resolution: duplicate
Status: new → closed

Should be fixed in r4958, see ticket #1239.

Feel free to reopen this ticket if the test is not successful.

comment:7 by Andrey P., 4 years ago

I've checked latest smartctl on the drive and it still doesn't report SS Media used endurance indicator:

smartctl 7.0 2019-08-20 r4949 [x86_64-linux-3.10.0-957.12.2.vz7.96.21] (local build)
Copyright © 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===

Vendor:               SAMSUNG
Product:              MZILS3T8HMLH/007
Revision:             GXL0
Compliance:           SPC-4
User Capacity:        3,840,755,982,336 bytes [3.84 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      --
Serial number:        --
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        --
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===

SMART Health Status: OK
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C
Elements in grown defect list: 1
Error Counter logging not supported
Device does not support Self Test logging

comment:8 by Andrey P., 4 years ago

Resolution: duplicate
Status: closed → reopened

in reply to:  7 comment:9 by Christian Franke, 4 years ago

I've checked latest smartctl on the drive and it still doesn't report SS Media used endurance indicator:

smartctl 7.0 2019-08-20 r4949 [x86_64-linux-3.10.0-957.12.2.vz7.96.21] (local build)
...

This again looks like a tweaked version as a build from SVN r4949 snapshot should report smartctl 7.1 2019-08-20 r4949.

This smartctl build does not include r4958 and therefore doesn't fix the problem.

If you don't want to compile the latest source or test a binary from https://builds.smartmontools.org/, you need to wait for Release 7.1 and then for the package maintainer to provide a 7.1 binary.

comment:10 by Christian Franke, 4 years ago

Resolution: duplicate
Status: reopened → closed

Should be fixed in r4958, see ticket #1239.

Before reopening this ticket again, please make sure that the smartctl build used for testing actually includes this fix.

comment:11 by Andrey P., 4 years ago

Resolution: duplicate
Status: closed → reopened

We've backported r4958 patch only. For this reason the version reported by smartctl was not updated.

So either the patch r4958 is not effective in our case, or we need all other changes in the latest master of smartctl compared to 7.0 release for this patch to work properly.

The 7.1 milestone seems complete. Are you about to release 7.1 soon?

comment:12 by Christian Franke, 4 years ago

Milestone: Release 7.1 → undecided

Then we need to address this after Release 7.1.

in reply to:  11 comment:13 by Christian Franke, 4 years ago

We've backported r4958 patch only. ...

Backporting single patches may have undesired side effects. Please repeat the test with smartmontools 7.1 released today.

If the problem persists, please also check which smartmontools version introduced the problem. The range 5.43 ... 7.0 is too coarse.
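
Narrowing the range can be done by bisection over SVN revisions. The C++ sketch below shows the idea under stated assumptions: `first_bad_revision` and `placeholder_probe` are hypothetical names, and the probe stands in for "build this revision and check whether the endurance indicator is printed". The threshold inside the placeholder is arbitrary and does not reflect any finding from this ticket.

```cpp
// Sketch only: names are hypothetical; the probe must be replaced by a real
// build-and-test step for each revision.

// Returns the first revision that shows the problem, given one revision known
// to be good and a later one known to be bad.
constexpr int first_bad_revision(int good, int bad, bool (*shows_problem)(int)) {
    // Invariant: `good` does not show the problem, `bad` does.
    while (bad - good > 1) {
        const int mid = good + (bad - good) / 2;
        if (shows_problem(mid))
            bad = mid;
        else
            good = mid;
    }
    return bad;
}

// Placeholder probe with an ARBITRARY threshold, used only to exercise the
// search; it says nothing about where the regression actually is.
constexpr bool placeholder_probe(int revision) {
    return revision >= 4000;
}
```

Starting from r3573 (5.43) and r4883 (7.0), the search needs about eleven probes to cover the roughly 1300 revisions in between.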

comment:14 by Andrey P., 4 years ago

Resolution: fixed
Status: reopened → closed

Confirmed fix of the issue by 7.1.

Thanks a lot!

comment:15 by Christian Franke, 4 years ago

Milestone: undecided → Release 7.1