Opened 3 years ago

Closed 2 years ago

#793 closed defect (invalid)

SMART error (FailedReadSmartSelfTestLog) detected on host: xxx

Reported by: valisann Owned by:
Priority: major Milestone:
Component: all Version: 6.5
Keywords: megaraid linux scsi Cc:

Description

Hi,

I need your help with an issue related to smartmontools.
I have a Linux server (Debian GNU/Linux 8 (jessie)) installed on a Supermicro 2U 6027R-E1R12L with an LSI RAID controller.

root@pm03:~# lspci | egrep -i 'raid|adaptec'
03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)

On the LSI controller I have 2 RAID configurations:

Raid 1 - 2 SSD - Intel SSD DC S3500 Series 120GB
Raid 10 - 7 HDD (6 + 1 spare) - Seagate Constellation ES.3 3.5 2TB SAS

root@pm03:~# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], SCSI device
/dev/bus/0 -d megaraid,4 # /dev/bus/0 [megaraid_disk_04], SCSI device
/dev/bus/0 -d megaraid,5 # /dev/bus/0 [megaraid_disk_05], SCSI device
/dev/bus/0 -d megaraid,6 # /dev/bus/0 [megaraid_disk_06], SCSI device
/dev/bus/0 -d megaraid,7 # /dev/bus/0 [megaraid_disk_07], SCSI device
/dev/bus/0 -d megaraid,15 # /dev/bus/0 [megaraid_disk_15], SCSI device spare
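
A minimal sketch, assuming Python 3 is available on the host, of how lines in the style of the scan listing above could be turned into per-disk `smartctl -a` invocations. The helper names (`parse_scan`, `megaraid_ids`, `smartctl_command`) are hypothetical and not part of smartmontools; the sample text is a shortened copy of the listing.

```python
import re

# Shortened sample in the style of the scan listing above.
SCAN_OUTPUT = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,15 # /dev/bus/0 [megaraid_disk_15], SCSI device
"""

# Device path, then "-d", then the device type (e.g. "megaraid,15").
LINE_RE = re.compile(r"^(?P<dev>\S+)\s+-d\s+(?P<type>\S+)")


def parse_scan(text):
    """Return (device, type) pairs for every line that matches LINE_RE."""
    pairs = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            pairs.append((match.group("dev"), match.group("type")))
    return pairs


def megaraid_ids(pairs):
    """Return the numeric disk IDs of entries typed as megaraid,N."""
    ids = []
    for _dev, dtype in pairs:
        if dtype.startswith("megaraid,"):
            ids.append(int(dtype.split(",", 1)[1]))
    return ids


def smartctl_command(dev, dtype):
    """Build the argument list for a full report on one disk."""
    return ["smartctl", "-a", dev, "-d", dtype]
```

Each list returned by `smartctl_command` could be passed to `subprocess.run` as root to collect one report per physical disk behind the controller.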

Issue: I received an error message "SMART error (FailedReadSmartSelfTestLog) detected on host: xxx" (see the message logs).

  • this was a spare disk from the RAID 10 with ID 08
This message was generated by the smartd daemon running on:

   host name:  XXXX
   DNS domain: mydomain.int

The following warning/error was logged by the smartd daemon:

Device: /dev/bus/0 [megaraid_disk_08], Read SMART Self-Test Log Failed

Device info:
[SEAGATE  ST2000NM0023     0004], lu id: 0x5000c500628f5dcf, S/N: Z1Y2C8G90000C5124838, 2.00 TB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
Another message will be sent in 24 hours if the problem persists.
  • I contacted the vendor and replaced the disk. After that, the SMART test on the new disk (with ID 15) was OK
  • after 1 day the error is back :(
This message was generated by the smartd daemon running on:

   host name:  pm03
   DNS domain: mercury.int

The following warning/error was logged by the smartd daemon:

Device: /dev/bus/0 [megaraid_disk_15], Read SMART Self-Test Log Failed

Device info:
[SEAGATE  ST2000NM0023     0004], lu id: 0x5000c500845a1abf, S/N: Z1X5VF2R0000C6095M9C, 2.00 TB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
Another message will be sent in 24 hours if the problem persists.

Another problem I noticed after detecting this is that I cannot see any information about the spare disk in the -- Disk information -- section when using the commands megaclisas-status and megasasctl

before
https://drive.google.com/open?id=0B5sUwIIchHUgUlA4UGwyZTZaSjQ

after
https://drive.google.com/open?id=0B5sUwIIchHUgMXBDRzExaXhZOE0

  • the rest of the disks from the RAID 10 don't have SMART issues
  • another log, if it helps
root@pm03:~# smartctl -a /dev/bus/0 -d megaraid,15
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.4.35-1-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST2000NM0023
Revision:             0004
Compliance:           SPC-4
User Capacity:        2,000,398,934,016 bytes [2.00 TB]
Logical block size:   512 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c500845a1abf
Serial number:        Z1X5VF2R0000C6095M9C
Device type:          disk

Transport protocol:   SAS (SPL-3)
Local Time is:        Fri Jan 13 13:54:06 2017 EET
SMART support is:     Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

Device does not support Self Test logging
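
A rough sketch, again in Python 3, of how a report like the one above could be checked automatically for the two readings that stand out in it: SMART reported as unavailable and a drive temperature of 0 C. The function names and both heuristics are assumptions made for illustration. They are not smartmontools features, and they do not establish why the drive answers this way.

```python
# Shortened sample taken from the report above.
REPORT = """\
Vendor:               SEAGATE
Product:              ST2000NM0023
SMART support is:     Unavailable - device lacks SMART capability.
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C
"""


def parse_fields(text):
    """Collect "Key: value" lines into a dict, skipping lines without a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def suspicious_readings(fields):
    """Return a list of reasons the report looks implausible (heuristic only)."""
    reasons = []
    if fields.get("SMART support is", "").startswith("Unavailable"):
        reasons.append("SMART reported as unavailable")
    temperature = fields.get("Current Drive Temperature", "")
    if temperature.split()[:1] == ["0"]:
        reasons.append("temperature reads 0 C")
    return reasons
```

Running the same check against a report from one of the working RAID 10 members would give a baseline for comparison.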

megacli 8.07.14-1 amd64 LSI Logic MegaRAID SAS MegaCLI
megaclisas-status 0.15 all get RAID status out of LSI MegaRAID SAS HW RAID controllers
megactl 0.4.1+svn20090725.r6-5 amd64 LSI MegaRAID SCSI/SAS reporting tool
megaraid-status 0.12 all get RAID status out of LSI MegaRAID SCSI/SAS HW RAID controllers

Thanks

Change History (3)

comment:1 Changed 3 years ago by Christian Franke

Keywords: megaraid linux scsi added; FailedReadSmartSelfTestLog removed
Milestone: undecided

Is this disk possibly spun down by the controller because it is a spare disk?

This is a bug tracker, not a support forum. For future support questions, please use the smartmontools-support mailing list instead.

comment:2 Changed 3 years ago by Christian Franke

Possibly related report on smartmontools-support list:
PERC H730 Mini: spun down hot spare: FailedReadSmartSelfTestLog.

comment:3 Changed 2 years ago by Christian Franke

Milestone: undecided
Resolution: invalid
Status: new → closed

No feedback from reporter.
