Opened 2 months ago

Last modified 8 weeks ago

#1515 reopened defect

smartctl crash with aacraid and adaptec 6805 on Windows

Reported by: maclin
Owned by:
Priority: major
Milestone: unscheduled
Component: all
Version: 7.2
Keywords: aacraid windows
Cc:

Description

Strange, I always thought that the 6805 was supported, but now I look at the list at https://www.kernel.org/doc/Documentation/scsi/aacraid.txt and see no support for the 6805. Is it possible to add it?

I am trying to set up SMART monitoring on an old Windows server with version 7.2.1 of smartmontools and I get this (while on another machine with a 5405 controller this command works):

smartctl.exe -a /dev/sda -d "aacraid,0,0,0"
/dev/sda: aacraid: host 0 not found

Attachments (6)

smartctl-output.txt (5.9 KB) - added by maclin 2 months ago.
smartctl-output-5405.txt (4.3 KB) - added by maclin 2 months ago.
smartctl-output-5405-while-one-of-disks-was-pulled-out.txt (31.7 KB) - added by maclin 2 months ago.
smartctl-output-5405-after-rebuild.txt (37.0 KB) - added by maclin 2 months ago.
smartctl-output-5405-r5230.txt (7.3 KB) - added by maclin 2 months ago.
photo_2021-08-30_20-29-05.jpg (26.1 KB) - added by maclin 2 months ago.


Change History (33)

comment:1 in reply to:  description Changed 2 months ago by Christian Franke

Keywords: aacraid added
Milestone: undecided

Linux or Windows?
Please be more specific. Note that RAID controller support for Linux and Windows has to be implemented separately in smartmontools.

Strange, I always thought that the 6805 was supported, but now I look at the list ... and see no support for the 6805. Is it possible to add it?

If the Linux AACRAID driver does not support the Adaptec 6805, this cannot be fixed in smartmontools. See https://storage.microsemi.com/en-us/support/raid/sas_raid/sas-6805/ for a link to "Linux Driver Source Code".

I am trying to set up SMART monitoring on an old Windows server with version 7.2.1 of smartmontools and I get this (while on another machine with a 5405 controller this command works) ...

AACRAID support for Windows was provided in 2015 by developers from pmcs.com (now microsemi.com), see ticket #496. These developers are no longer active in the smartmontools project. Enhancements usually require documentation or sample source code from the driver vendor.

comment:2 Changed 2 months ago by maclin

On Linux the 6805 is supported (at least the 6405 is, 100%: we have a Linux server with this controller and smartctl works perfectly).
I did not specify Windows in the subject, but I mentioned it in the description...

comment:3 Changed 2 months ago by Christian Franke

Keywords: adaptec6805 windows added

comment:4 Changed 2 months ago by Christian Franke

Summary: adaptec 6805 support → adaptec 6805 support on Windows

comment:5 Changed 2 months ago by maclin

Installed the newest version of the driver:
https://storage.microsemi.com/en-us/speed/raid/aac/windows/aacraid_ws08-ws12-ws16-w7-w8-w10-sbs_x64_b52013_cert_zip.php
Now driver date is 30.08.2016 (was 29.11.2012).

Now smartctl shows some info but crashes...

C:\Program Files\smartmontools\bin>smartctl.exe -a /dev/sda -d "aacraid,0,0,0"
smartctl 7.2 2020-12-30 r5155 [x86_64-w64-mingw32-2008r2-sp1] (sf-7.2-1)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HITACHI
Product:              HUC109090CSS600
Revision:             A440
Compliance:           SPC-4
User Capacity:        900 185 481 216 bytes [900 GB]
Logical block size:   512 bytes
Rotation Rate:        10020 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000cca016815478
Serial number:        KPJ93XZF
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed Aug 18 09:20:24 2021 RTZST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
*** stack smashing detected ***:  terminated

It seems that only the -i option works.


comment:6 Changed 2 months ago by Christian Franke

Type: enhancement → defect

Please provide output of
smartctl -r ioctl,2 -d "aacraid,0,0,0" -a /dev/sda
as a plain-text attachment to this ticket.

Changed 2 months ago by maclin

Attachment: smartctl-output.txt added

comment:7 Changed 2 months ago by Christian Franke

Keywords: adaptec6805 removed
Milestone: undecided → unscheduled
Priority: minor → major
Summary: adaptec 6805 support on Windows → stack smashing with aacraid and adaptec 6805 on Windows

If possible, please retry with a SATA disk connected to the 6805.

(while on another machine with 5405 controller this command works)

SAS or SATA disk?

I sent a problem report to the last known email address of the patch author (#496). No reply or bounce so far.

Leaving ticket open as unscheduled until some volunteer developer with access to similar hardware is able to work on this.

Changed 2 months ago by maclin

Attachment: smartctl-output-5405.txt added

comment:8 Changed 2 months ago by maclin

Added the output; it is a SAS disk on the 5405.
"If possible, please retry with a SATA disk connected to the 6805." - I have no server with a 6805 and SATA. If it is really needed, I will try to find a way.

comment:9 Changed 2 months ago by maclin

On the server with the 5405 we found that one of the disks (physically the 3rd, logically it is aacraid,0,0,2) could not finish a self-test, so we decided to replace it. After pulling it out (and even after the RAID rebuild) smartctl shows info only about the first disk; the others crash in the same manner as on the 6805. Added smartctl-output-5405-while-one-of-disks-was-pulled-out.txt

comment:10 Changed 2 months ago by Christian Franke

... I have no server with a 6805 and SATA. If it is really needed, I will try to find a way.

Thanks, but this would only show whether the problem is SAS-specific or not.

Added smartctl-output-5405-while-one-of-disks-was-pulled-out.txt

Did all commands which produced the above outputs also fail with
*** stack smashing detected ***: terminated ?
This message cannot be redirected to a file as the GCC runtime always writes it to console device.

Changed 2 months ago by maclin

comment:11 in reply to:  10 Changed 2 months ago by maclin

Thanks, but this would only show whether the problem is SAS-specific or not.

OK, I will add a SATA disk to the same server (we have an additional slot for this, as I remember) and check it.

Did all commands which produced the above outputs also fail with
*** stack smashing detected ***: terminated ?
This message cannot be redirected to a file as the GCC runtime always writes it to console device.

Added a new one, smartctl-output-5405-after-rebuild.txt; 3 of 4 disks gave this result (the first is OK). I run the command in cmd, then copy the output to the clipboard and paste it into a file.

comment:12 Changed 2 months ago by maclin

Added a SATA disk; at least smartctl does not fail:

C:\Program Files\smartmontools\bin>smartctl.exe -a /dev/sda -d "aacraid,0,0,7"
smartctl 7.2 2020-12-30 r5155 [x86_64-w64-mingw32-2008r2-sp1] (sf-7.2-1)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:
Product:              GB0500EAFYL
Revision:             HPG1
Compliance:           SPC-3
User Capacity:        500 107 862 016 bytes [500 GB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Logical Unit id:      0x50014ee2035558d7
Serial number:        WCASY7345001
Device type:          disk
Local Time is:        Wed Aug 25 13:41:00 2021 RTZST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature:     <not available>
Drive Trip Temperature:        0 C

Read defect list: asked for grown list but didn't get it
Error Counter logging not supported

Device does not support Self Test logging

comment:13 Changed 2 months ago by Christian Franke

Thanks. This is only the SCSI view of the SATA drive, which never shows any ATA SMART features. The stack smashing might not occur because smartctl stops early due to a missing diagnostic feature.

Normally the driver/firmware of a SAS controller should return "ATA " as the Vendor of a SATA drive. This is not the case here. This violates the SAT standard.

Please retry with ... -d "sat+aacraid,0,0,7".
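
The vendor check described above can be sketched as follows. This is a minimal illustration, not smartmontools code: the function name `inquiry_vendor_is_ata` is hypothetical. It relies only on the layout of standard INQUIRY data, where the T10 vendor identification occupies bytes 8 to 15, and on the SAT convention of reporting "ATA" padded with spaces for SATA devices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if the vendor field of standard INQUIRY data (bytes 8..15)
// is "ATA" padded with spaces, as a SAT translation layer should report it.
// Hypothetical helper for illustration only.
bool inquiry_vendor_is_ata(const unsigned char* inq, std::size_t len)
{
    if (inq == nullptr || len < 16)
        return false;                       // too short to hold the vendor field
    return std::memcmp(inq + 8, "ATA     ", 8) == 0;
}

// Usage example: a blank vendor field, as in the output above, is not "ATA".
void example_vendor_check()
{
    unsigned char inq[36] = {0};
    std::memset(inq + 8, ' ', 8);           // blank vendor, like the GB0500EAFYL output
    assert(!inquiry_vendor_is_ata(inq, sizeof(inq)));
    std::memcpy(inq + 8, "ATA     ", 8);    // what a compliant layer would return
    assert(inquiry_vendor_is_ata(inq, sizeof(inq)));
}
```

When the vendor field is blank, as with this controller, a check like this cannot recognise the drive as SATA, which is presumably why the explicit sat+aacraid device type is needed here.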

comment:14 Changed 2 months ago by maclin

Yep -d "sat+aacraid,0,0,7" worked perfectly

comment:15 Changed 2 months ago by maclin

Can I eject the SATA disk already?
Is it an aacraid problem with SAS disks only? What are the next steps to resolve this?

comment:16 Changed 2 months ago by Christian Franke

Milestone: unscheduled → Release 7.3

comment:17 Changed 2 months ago by Christian Franke

Resolution: fixed
Status: new → closed

comment:18 Changed 2 months ago by Christian Franke

A fixed-size buffer is used which is too small for some transfers that are used only with SAS drives. The problem has existed since the early days of the -d aacraid option.

Please test r5230. A CI build is available at https://builds.smartmontools.org/.

Reopen this ticket only if smartctl still crashes. Create a new ticket for other issues.
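
The class of bug described in this comment can be illustrated with a short sketch. It is not the actual r5230 change: the struct `PassThroughRequest`, its 512-byte size, and both helper functions are hypothetical names chosen for the example. It shows two common ways to avoid writing past a fixed-size buffer: reject transfers that do not fit, or size the buffer from the requested transfer length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical pass-through request with a fixed-size data area.
// The name and the 512-byte size are illustrative assumptions.
struct PassThroughRequest {
    static constexpr std::size_t kDataSize = 512;
    unsigned char data[kDataSize];
};

// Copies len bytes into the fixed buffer only if they fit.
// Returns false instead of writing past the end of req.data.
bool copy_to_fixed_buffer(PassThroughRequest& req,
                          const unsigned char* src, std::size_t len)
{
    if (src == nullptr || len > PassThroughRequest::kDataSize)
        return false;                       // would overflow: reject the transfer
    std::memcpy(req.data, src, len);
    return true;
}

// Alternative: allocate exactly as many bytes as the transfer needs.
std::vector<unsigned char> make_transfer_buffer(std::size_t len)
{
    return std::vector<unsigned char>(len); // zero-initialized, len bytes
}

// Usage example: an oversized transfer is refused rather than smashing memory.
void example_bounded_copy()
{
    PassThroughRequest req{};
    std::vector<unsigned char> big(PassThroughRequest::kDataSize + 1);
    assert(!copy_to_fixed_buffer(req, big.data(), big.size()));
    assert(make_transfer_buffer(big.size()).size() == big.size());
}
```

Without such a length check, a larger transfer overwrites whatever lies behind the buffer, which matches the "stack smashing detected" message reported earlier in this ticket.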

comment:19 Changed 2 months ago by maclin

Yep, r5230 works, thanks a lot!
Is it a stable version, or should we wait for the 7.3 release?

Changed 2 months ago by maclin

comment:20 Changed 2 months ago by maclin

Nope, I checked more cases...
On the 6805:
The -t long or -t short option gives the error "Control and Monitoring Utility for SMART disks has stopped working - A problem caused the program to stop working correctly. Please close the program"
-l selftest - the same result
-a works fine

On the 5405, on one server the -a, -t short/long and -l selftest options give all info and no errors. On another server they give an appcrash. Added smartctl-output-5405-r5230.txt

comment:21 Changed 2 months ago by maclin

Resolution: fixed
Status: closed → reopened

comment:22 in reply to:  20 ; Changed 2 months ago by Christian Franke

Milestone: Release 7.3 → unscheduled
Summary: stack smashing with aacraid and adaptec 6805 on Windows → smartctl crash with aacraid and adaptec 6805 on Windows

Thanks for testing.

On 6805:
The -t long or -t short option gives the error "Control and Monitoring Utility for SMART disks has stopped working - A problem caused the program to stop working correctly. Please close the program" ...

Always with SAS disks?

On the 5405, on one server the -a, -t short/long and -l selftest options give all info and no errors. On another server they give an appcrash. Added smartctl-output-5405-r5230.txt

Is this output from the no errors case or from the appcrash case?

Changed 2 months ago by maclin

comment:23 in reply to:  22 Changed 2 months ago by maclin

Always with SAS disks?

With SATA everything is OK: smartctl.exe -a /dev/sda -d "sat+aacraid,0,0,7"

Is this output from the no errors case or from the appcrash case?

Here is how it looks when I launch it:

C:\Program Files\smartmontools\bin>smartctl.exe -a /dev/sda -d "aacraid,0,0,0"
smartctl 7.3 2021-08-27 r5230 [x86_64-w64-mingw32-2008r2-sp1] (CircleCI)
Copyright (C) 2002-21, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:
Product:              GB0750EAMYB
Revision:             HPG1
Compliance:           SPC-3
User Capacity:        750 156 374 016 bytes [750 GB]
Logical block size:   512 bytes
Serial number:        WMAU00105724
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Mon Aug 30 20:31:47 2021 RTZST

Added photo_2021-08-30_20-29-05.jpg
Full description:

Problem signature:
  Problem Event Name:  APPCRASH
  Application Name:  smartctl.exe
  Application Version:  7.3.0.5230
  Application Timestamp:  00000000
  Fault Module Name:  StackHash_f3b0
  Fault Module Version:  6.1.7601.23677
  Fault Module Timestamp:  589c99e1
  Exception Code:  c0000374
  Exception Offset:  00000000000bf3e2
  OS Version:  6.1.7601.2.1.0.272.7
  Locale ID:  1049
  Additional Information 1:  f3b0
  Additional Information 2:  f3b09ded703bc0b8243e5c7f78d438f1
  Additional Information 3:  2fde
  Additional Information 4:  2fde217ce8e9c8fdaae14b989ffb918f


comment:24 Changed 2 months ago by Christian Franke

StackHash_f3b0 suggests that the stack is still being overwritten. This needs to be investigated further by a volunteer developer running smartctl in a debugger on the same or similar hardware. The pass-through API of the AACRAID drivers may now differ slightly from the version from 2015.

comment:25 Changed 2 months ago by maclin

By the way, on the 5405 the driver is the latest:
02.04.2012
5.2.0.19076

comment:26 Changed 8 weeks ago by Christian Franke

No replies (or bounces) so far from last known email address of patch author and all pmcs.com addresses from the CC list of ticket #496.

comment:27 Changed 8 weeks ago by Christian Franke

To avoid system damage, -d aacraid,... is now disabled unless -d aacraid,...,force is specified (r5231).
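
How such an opt-in suffix might be recognised can be sketched like this. It is an illustration under stated assumptions, not the r5231 code: `AacraidSpec` and `parse_aacraid_spec` are hypothetical names, and the meaning of the three numbers (host, LUN, ID) is assumed from the way the option is used in this ticket.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical result of parsing a "-d aacraid,H,L,ID[,force]" argument.
struct AacraidSpec {
    unsigned host, lun, id;
    bool force;                 // true only if the ",force" suffix is present
};

// Parses "aacraid,H,L,ID" with an optional ",force" suffix.
// Returns false for anything else, including trailing garbage.
bool parse_aacraid_spec(const char* type, AacraidSpec& out)
{
    unsigned h = 0, l = 0, id = 0;
    int n1 = -1, n2 = -1;       // characters consumed without/with ",force"
    if (std::sscanf(type, "aacraid,%u,%u,%u%n,force%n",
                    &h, &l, &id, &n1, &n2) != 3)
        return false;
    const int len = static_cast<int>(std::strlen(type));
    if (n2 == len)
        out = AacraidSpec{h, l, id, true};
    else if (n1 == len)
        out = AacraidSpec{h, l, id, false};
    else
        return false;           // e.g. "aacraid,0,0,0,forc"
    return true;
}

// Usage example: the device type is accepted only with the explicit suffix.
void example_force_gate()
{
    AacraidSpec spec{};
    assert(parse_aacraid_spec("aacraid,0,0,0", spec) && !spec.force);
    assert(parse_aacraid_spec("aacraid,0,0,0,force", spec) && spec.force);
}
```

A caller would then refuse to open the device when `force` is false, so that the risky code path runs only when the user explicitly asks for it.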
