Opened 4 years ago

Closed 4 years ago

#1258 closed defect (duplicate)

Error Counter logging not supported (SAMSUNG MZILT800HAHQ0D3)

Reported by: AN
Owned by:
Priority: major
Milestone: Release 7.1
Component: smartctl
Version:
Keywords: scsi
Cc:

Description (last modified by Christian Franke)

smartctl does not report disk-level checks and does not run self-tests. I get the output below:

#smartctl -a -d megaraid,1 /dev/sdb
smartctl 7.0 2018-12-30 r4883  (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SAMSUNG
Product:              MZILT800HAHQ0D3
Revision:             DWF8
Compliance:           SPC-5
User Capacity:        800,166,076,416 bytes [800 GB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed Oct 30 04:17:00 2019 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     33 C
Drive Trip Temperature:        70 C

Elements in grown defect list: 0

Error Counter logging not supported

Device does not support Self Test logging

I have tried enabling this with --smart=on, --saveauto=on and -T permissive, but nothing works.

Change History (12)

comment:1 by AN, 4 years ago

Component: all → smartctl

comment:2 by Christian Franke, 4 years ago

Description: modified (diff)

comment:3 by Christian Franke, 4 years ago

Keywords: scsi added
Milestone: undecided
Priority: critical → major
Summary: Error Counter logging not supported → Error Counter logging not supported (SAMSUNG MZILT800HAHQ0D3)

May be related to the bogus "Supported log pages and subpages" info returned by some SAMSUNG SAS SSDs. See ticket #1239 and changeset r4958 for details.

Please try a recent smartctl version. See https://builds.smartmontools.org/ for src tarballs and various binaries.

comment:4 by AN, 4 years ago

This seems to be a different issue: I have the same SSD on another set of servers and can read all SSD details properly there. The differences I see are the number of disks and the RAID setup. The node with RAID 1 and 6 disks shows the statistics fine, but the node with RAID 5 and 16 disks shows the error above.

Last edited 4 years ago by AN

comment:5 by Christian Franke, 4 years ago

Is there any difference in firmware revision, SCSI compliance level, or power mode? If the disk is spun down, less information might be visible. See ticket #1233 for an example.

If not, please provide outputs of smartctl -r ioctl,2 -d megaraid,N -a /dev/... for both cases as attachments to this ticket.
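Once both outputs are available, the commands sent on each node can be compared mechanically. The following is only a rough sketch, not part of smartmontools: it assumes the bracketed trace format seen in -r ioctl,2 output (for example " [log sense: 4d 00 ... ]"), and the helper names extract_cdbs and only_in_first are hypothetical.

```python
import re

# Matches trace lines such as " [log sense: 4d 00 40 00 00 00 00 00 04 00 ]".
# The format is assumed from the "-r ioctl,2" output pasted in this ticket.
CDB_LINE = re.compile(r"^\s*\[([^:\]]+):\s*((?:[0-9a-fA-F]{2}\s+)+)\]")


def extract_cdbs(trace_text):
    """Return (command name, CDB hex string) pairs in order of appearance."""
    cdbs = []
    for line in trace_text.splitlines():
        match = CDB_LINE.match(line)
        if match:
            name = match.group(1).strip()
            cdb = " ".join(match.group(2).split()).lower()
            cdbs.append((name, cdb))
    return cdbs


def only_in_first(trace_a, trace_b):
    """Return the commands that appear in trace_a but never in trace_b."""
    seen_in_b = set(extract_cdbs(trace_b))
    result = []
    for item in extract_cdbs(trace_a):
        if item not in seen_in_b and item not in result:
            result.append(item)
    return result


# Short excerpts in the same format as the traces attached to this ticket.
FAILING_TRACE = """\
=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
 [log sense: 4d 00 40 ff 00 00 00 3e fc 00 ]
"""

WORKING_TRACE = """\
=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
 [log sense: 4d 00 6f 00 00 00 00 00 04 00 ]
"""

for name, cdb in only_in_first(FAILING_TRACE, WORKING_TRACE):
    print(f"only on failing node: {name}: {cdb}")
```

Swapping the two arguments lists the commands that only the working node sent. In practice the two strings would be read from the saved smartctl outputs rather than pasted inline.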

comment:6 by AN, 4 years ago

Attaching output of both types of nodes:

Node not returning data
==============================================
#smartctl -r ioctl,2 -a -d megaraid,1 /dev/sdb
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-957.21.3.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

Creating /dev/megaraid_sas_ioctl_node = 17
 [inquiry: 12 00 00 00 24 00 ]
Got MegaRAID inquiry.. SAMSUNG MZILT800HAHQ0D3 DWF8
 [inquiry: 12 01 00 00 fc 00 ]
 [inquiry: 12 00 00 00 24 00 ]
=== START OF INFORMATION SECTION ===
Vendor:               SAMSUNG
Product:              MZILT800HAHQ0D3
Revision:             DWF8
Compliance:           SPC-5
 [read capacity(16): 9e 10 00 00 00 00 00 00 00 00 00 00 00 20 00 00 ]
User Capacity:        800,166,076,416 bytes [800 GB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
 [inquiry: 12 01 b2 00 08 00 ]
LU is resource provisioned, LBPRZ=1
 [inquiry: 12 01 b1 00 40 00 ]
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
 [mode sense(6): 1a 00 1c 00 40 00 ]
 [mode sense(6): 1a 00 5c 00 40 00 ]
 [inquiry: 12 01 83 00 fc 00 ]
Logical Unit id:      0x5002538b495cb8b0
 [inquiry: 12 01 80 00 fc 00 ]
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed Nov  6 09:47:14 2019 PST
 [test unit ready: 00 00 00 00 00 00 ]
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
 [log sense: 4d 00 40 ff 00 00 00 3e fc 00 ]
scsiGetSupportedLogPages: number of unreported (standard) log pages: 0 (sub-pages: 0)
 [request sense: 03 00 00 00 12 00 ]
 [log sense: 4d 00 4d 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 4d 00 00 00 00 00 10 00 ]
SMART Health Status: OK

 [log sense: 4d 00 4d 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 4d 00 00 00 00 00 10 00 ]
Current Drive Temperature:     35 C
Drive Trip Temperature:        70 C

 [read defect list(12): b7 0c 00 00 00 00 00 00 00 08 00 00 ]
Elements in grown defect list: 0

Error Counter logging not supported

 [mode sense(6): 1a 00 0a 00 40 00 ]
Device does not support Self Test logging

==============================================
Node returning all data
==============================================
#smartctl -r ioctl,2 -a -d megaraid,5 /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-957.27.2.el7.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Creating /dev/megaraid_sas_ioctl_node = 17
 [inquiry: 12 00 00 00 24 00 ]
Got MegaRAID inquiry.. SAMSUNG MZILT800HAHQ0D3 DWF8
 [inquiry: 12 01 00 00 fc 00 ]
 [inquiry: 12 00 00 00 24 00 ]
=== START OF INFORMATION SECTION ===
Vendor:               SAMSUNG
Product:              MZILT800HAHQ0D3
Revision:             DWF8
Compliance:           SPC-5
 [read capacity(10): 25 00 00 00 00 00 00 00 00 00 ]
User Capacity:        800,166,076,416 bytes [800 GB]
Logical block size:   512 bytes
 [inquiry: 12 01 b2 00 08 00 ]
LU is resource provisioned, LBPRZ=1
 [inquiry: 12 01 b1 00 40 00 ]
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
 [mode sense(6): 1a 00 1c 00 40 00 ]
 [mode sense(6): 1a 00 5c 00 40 00 ]
 [inquiry: 12 01 83 00 fc 00 ]
Logical Unit id:      0x5002538b48a19ef0
 [inquiry: 12 01 80 00 fc 00 ]
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed Nov  6 09:48:53 2019 PST
 [test unit ready: 00 00 00 00 00 00 ]
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
 [log sense: 4d 00 6f 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 6f 00 00 00 00 00 fc 00 ]
 [request sense: 03 00 00 00 12 00 ]
 [log sense: 4d 00 4d 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 4d 00 00 00 00 00 10 00 ]
SMART Health Status: OK

 [log sense: 4d 00 51 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 51 00 00 00 00 00 0c 00 ]
Percentage used endurance indicator: 0%
 [log sense: 4d 00 4d 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 4d 00 00 00 00 00 10 00 ]
Current Drive Temperature:     33 C
Drive Trip Temperature:        70 C

 [log sense: 4d 00 4e 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 4e 00 00 00 00 00 38 00 ]
Manufactured in week 39 of year 2018
Accumulated start-stop cycles:  19
Specified load-unload count over device lifetime:  0
Accumulated load-unload cycles:  0
 [read defect list(12): b7 0c 00 00 00 00 00 00 00 08 00 00 ]
Elements in grown defect list: 0

 [log sense: 4d 00 43 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 43 00 00 00 00 00 58 00 ]
 [log sense: 4d 00 42 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 42 00 00 00 00 00 58 00 ]
 [log sense: 4d 00 45 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 45 00 00 00 00 00 58 00 ]
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0       2011.654           0
write:         0        0         0         0          0         93.234           0
verify:        0        0         0         0          0         20.601           0
 [log sense: 4d 00 46 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 46 00 00 00 00 00 10 00 ]

Non-medium error count:      345

 [mode sense(6): 1a 00 0a 00 40 00 ]
 [request sense: 03 00 00 00 12 00 ]
 [log sense: 4d 00 50 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 50 00 00 00 00 01 94 00 ]
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                  96       1                 - [-   -    -]
# 2  Background short  Completed                  96       1                 - [-   -    -]
 [mode sense(6): 1a 00 0a 00 40 00 ]

Long (extended) Self Test duration: 3600 seconds [60.0 minutes]
Last edited 4 years ago by Christian Franke

comment:7 by Christian Franke, 4 years ago

The smartctl versions differ:

Node not returning data
...
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-957.21.3.el7.x86_64] (local build)
...
=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
 [log sense: 4d 00 40 ff 00 00 00 3e fc 00 ] <-- Samsung SSDs return bogus info here
scsiGetSupportedLogPages: number of unreported (standard) log pages: 0 (sub-pages: 0)
...
==============================================
Node returning all data
...
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-957.27.2.el7.x86_64] (local build)
...
=== START OF READ SMART DATA SECTION ===
 [log sense: 4d 00 40 00 00 00 00 00 04 00 ]
 [log sense: 4d 00 40 00 00 00 00 00 12 00 ]
                                             <-- Info not requested by 6.5
 [log sense: 4d 00 6f 00 00 00 00 00 04 00 ] 
...

So the above comment is still valid: some Samsung SAS SSDs return bogus "Supported log pages and subpages" info, which affects smartctl 7.0. smartctl 6.5 is not affected because it does not request this info.
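For readers who want to see why the marked command differs from the others, the LOG SENSE CDBs in the traces can be decoded by hand. The sketch below is illustrative only and is not smartmontools code. It relies on the standard 10-byte LOG SENSE layout (opcode 0x4d, page control and page code in byte 2, subpage code in byte 3, allocation length in bytes 7 and 8); the function name decode_log_sense and the small page-name table are hypothetical helpers.

```python
# Names for the log pages that appear in this ticket's traces. The table is
# deliberately partial; page codes follow the SCSI SPC/SBC standards.
PAGE_NAMES = {
    0x00: "Supported log pages",
    0x02: "Write error counter",
    0x03: "Read error counter",
    0x05: "Verify error counter",
    0x06: "Non-medium error",
    0x0D: "Temperature",
    0x0E: "Start-stop cycle counter",
    0x10: "Self-test results",
    0x11: "Solid state media",
    0x2F: "Informational exceptions",
}


def decode_log_sense(cdb_hex):
    """Decode a 10-byte LOG SENSE CDB given as a hex string like '4d 00 40 ...'."""
    cdb = bytes.fromhex(cdb_hex.strip())
    if len(cdb) != 10 or cdb[0] != 0x4D:
        raise ValueError("not a 10-byte LOG SENSE CDB")
    page_code = cdb[2] & 0x3F
    return {
        "page_control": cdb[2] >> 6,          # 1 = cumulative values
        "page_code": page_code,
        "page_name": PAGE_NAMES.get(page_code, "unknown"),
        "subpage_code": cdb[3],               # 0xff = list pages and subpages
        "allocation_length": int.from_bytes(cdb[7:9], "big"),
    }


# The request that smartctl 7.0 sends and 6.5 does not:
print(decode_log_sense("4d 00 40 ff 00 00 00 3e fc 00"))
# A request for the read error counter page, as sent on the working node:
print(decode_log_sense("4d 00 43 00 00 00 00 00 58 00"))
```

The first CDB decodes to page 0x00 with subpage 0xff, which is the "Supported log pages and subpages" request discussed above. The second decodes to page 0x03 with subpage 0x00.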

Please try a recent smartctl version on the "Node not returning data". See https://builds.smartmontools.org/ for src tarballs and various binaries.

If this works, then this ticket is a duplicate of #1239.

comment:8 by AN, 4 years ago

The latest package available in the repo is smartmontools-7.0-1.el7, which is the r4883 version. I have many systems to update. Is there an rpm available for the latest build or for smartmontools 6.5?

comment:9 by Christian Franke, 4 years ago

We at upstream do not provide packages for Linux distributions. If you don't want to compile the latest source from SVN or test a binary from https://builds.smartmontools.org/, you need to wait for Release 7.1 and then for the maintainer providing a 7.1 rpm.

comment:10 by AN, 4 years ago

I found an rpm for version 6.5 at http://rpm.pbone.net/index.php3/stat/4/idpl/43950332/dir/scientific_linux_7/com/smartmontools-6.5-1.el7.x86_64.rpm.html#content. I tested it on one node and it shows the results properly. However, when I checked the package signature, I got this output:

$ rpm --checksig smartmontools-6.5-1.el7.x86_64.rpm
smartmontools-6.5-1.el7.x86_64.rpm: (SHA1) DSA sha1 md5 (GPG) NOT OK (MISSING KEYS: GPG#192a7d7d)

comment:11 by Christian Franke, 4 years ago

As explained above, smartctl 6.5 is not affected by the problem.

comment:12 by Christian Franke, 4 years ago

Milestone: undecided → Release 7.1
Resolution: duplicate
Status: new → closed

Should be fixed in r4958, see ticket #1239.
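When many hosts need checking, the first line of smartctl output is enough to tell whether a build predates the fix. The sketch below is a hedged illustration, not an official tool: it assumes the banner format shown in this ticket ("smartctl 7.0 2018-12-30 r4883 ..."), takes r4958 from the comment above as the fixing revision, and treats 7.0 as the first affected version based only on what this ticket reports. The names parse_banner and may_need_update are hypothetical.

```python
import re

# Banner format assumed from the outputs pasted in this ticket.
BANNER = re.compile(r"^smartctl (\d+)\.(\d+) \d{4}-\d{2}-\d{2} r(\d+)")

FIX_REVISION = 4958          # from this ticket: "Should be fixed in r4958"
FIRST_AFFECTED = (7, 0)      # assumption: 6.5 does not send the request, 7.0 does


def parse_banner(line):
    """Return ((major, minor), svn_revision), or None if the line does not match."""
    match = BANNER.match(line)
    if match is None:
        return None
    major, minor, revision = (int(part) for part in match.groups())
    return (major, minor), revision


def may_need_update(line):
    """True if the build may hit this problem, False if not, None if unparsable."""
    parsed = parse_banner(line)
    if parsed is None:
        return None
    version, revision = parsed
    return version >= FIRST_AFFECTED and revision < FIX_REVISION


for banner in (
    "smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-957.21.3.el7.x86_64] (local build)",
    "smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-957.27.2.el7.x86_64] (local build)",
):
    print(may_need_update(banner), banner.split(" [")[0])
```

A result of True only means the build is older than the fix; whether a given drive actually returns the bogus log page list still has to be checked on the host.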
