Opened 12 years ago

Closed 12 years ago

#219 closed defect (worksforme)

regression: segfault on arm since 5.39

Reported by: joban1
Owned by: somebody
Priority: minor
Milestone:
Component: smartctl
Version: 5.42
Keywords: arm
Cc: Alex Samorukov

Description

I tried to use smartmontools natively built from tar on a QNAP TS-419PII.
All versions from 5.42 down to 5.39 produce a segfault:

[~] # smartctl -d sat -a /dev/sda
smartctl 5.39 2009-12-09 r2995 [armv5tel-unknown-linux-gnueabi] (local build)
Copyright (C) 2002-9 by Bruce Allen, http://smartmontools.sourceforge.net

Segmentation fault

Version 5.38 finally worked:

[~] # smartctl -d sat -a /dev/sda
smartctl version 5.38 [armv5tel-unknown-linux-gnu] Copyright (C) 2002-8 Bruce
Allen
Home page is http://smartmontools.sourceforge.net/

START OF INFORMATION SECTION

Device Model: WDC WD20EARX-00PASB0
...

I am OK with just using 5.38, but if someone wants to fix it I am willing to
help by testing patches and sending logs.
The only problem is that I cannot build from svn (because of a perl issue?):

[~/share/smartmontools] $ ./autogen.sh
This Perl not built to support threads
Compilation failed in require at /opt/bin/automake-1.11 line 139.
...

[~/share/smartmontools] $ perl -v

This is perl, v5.10.0 built for arm-none-linux-gnueabi
...

Change History (10)

comment:1 by joban1, 12 years ago

To answer some questions from the mailing list (Re: [smartmontools-support] regression: segfault on arm since 5.39):

[~] $ gcc --version
gcc (GCC) 4.2.3
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

checking whether byte ordering is bigendian... no

The perl problem is fixed thanks to Christian Franke (threads=0 helped), so I can report the same problem with rev 3515:

[~] # ~joachim/share/smartmontools/smartctl -d sat -i -A /dev/sda
smartctl 5.43 2012-02-27 r3515 [armv5tel-linux-2.6.33.2] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

Segmentation fault

comment:2 by joban1, 12 years ago

@Alex Samorukov:

My builds worked without error, too (after the perl fix also from svn).
This was the detected config:

smartmontools-5.43 configuration:
host operating system: armv5tel-unknown-linux-gnueabi
C++ compiler: g++
C compiler: gcc
preprocessor flags:
C++ compiler flags: -g -O2 -Wall -W
C compiler flags: -g -O2
linker flags:
OS specific modules: os_linux.o cciss.o
binary install path: /usr/local/sbin
man page install path: /usr/local/share/man
doc file install path: /usr/local/share/doc/smartmontools
examples install path: /usr/local/share/doc/smartmontools/examplescripts
drive database file: /usr/local/share/smartmontools/drivedb.h
database update script: /usr/local/sbin/update-smart-drivedb
download tools: curl wget lynx
local drive database: /usr/local/etc/smart_drivedb.h
smartd config file: /usr/local/etc/smartd.conf
smartd initd script: /usr/local/etc/init.d/smartd
smartd save files: [disabled]
smartd attribute logs: [disabled]
libcap-ng support: no
SELinux support: no

[~] # cat /proc/version
Linux version 2.6.33.2 (root@NasX86-6) (gcc version 4.2.1) #1 Sat Nov 26 03:55:10 CST 2011

Here is the strace:

[~] # strace ~joachim/share/smartmontools/smartctl -d sat -i -A /dev/sda
execve("/share/homes/joachim/share/smartmontools/smartctl", ["/share/homes/joachim/share/smart"..., "-d", "sat", "-i", "-A", "/dev/sda"], [/* 25 vars */]) = 0
brk(0) = 0x78000
uname({sys="Linux", node="job9", ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001d000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=23757, ...}) = 0
mmap2(NULL, 23757, PROT_READ, MAP_PRIVATE, 3, 0) = 0x4001e000
close(3) = 0
open("/usr/lib/libstdc++.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\330\323\3\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=867200, ...}) = 0
mmap2(NULL, 919748, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x40026000
mprotect(0x400f5000, 32768, PROT_NONE) = 0
mmap2(0x400fd000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xcf) = 0x400fd000
mmap2(0x40101000, 22724, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40101000
close(3) = 0
open("/lib/libm.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0<2\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=691092, ...}) = 0
mmap2(NULL, 717000, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x40107000
mprotect(0x401ae000, 28672, PROT_NONE) = 0
mmap2(0x401b5000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xa6) = 0x401b5000
close(3) = 0
open("/lib/libgcc_s.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0h\"\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=49012, ...}) = 0
mmap2(NULL, 78580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x401b7000
mprotect(0x401c3000, 28672, PROT_NONE) = 0
mmap2(0x401ca000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xb) = 0x401ca000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0`J\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1243580, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x401cb000
mmap2(NULL, 1257892, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x401cc000
mprotect(0x402f3000, 28672, PROT_NONE) = 0
mmap2(0x402fa000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x126) = 0x402fa000
mmap2(0x402fd000, 8612, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x402fd000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40300000
set_tls(0x403004a0, 0x400256ec, 0x40300b78, 0x40025050, 0x40) = 0
mprotect(0x402fa000, 4096, PROT_READ) = 0
mprotect(0x401b5000, 4096, PROT_READ) = 0
mprotect(0x400fd000, 8192, PROT_READ) = 0
munmap(0x4001e000, 23757) = 0
uname({sys="Linux", node="job9", ...}) = 0
brk(0) = 0x78000
brk(0x99000) = 0x99000
fstat64(1, {st_mode=S_IFCHR|0622, st_rdev=makedev(136, 2), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001e000
write(1, "smartctl 5.43 2012-02-27 r3515 ["..., 71smartctl 5.43 2012-02-27 r3515 [armv5tel-linux-2.6.33.2] (local build)
) = 71
write(1, "Copyright (C) 2002-12 by Bruce A"..., 75Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
) = 75
write(1, "\n", 1
) = 1
access("/usr/local/etc/smart_drivedb.h", F_OK) = -1 ENOENT (No such file or directory)
access("/usr/local/share/smartmontools/drivedb.h", F_OK) = -1 ENOENT (No such file or directory)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault

comment:3 by joban1, 12 years ago

And finally the backtrace:

(gdb) run
Starting program: /share/MD0_DATA/joachim/smartmontools/smartctl -d sat -i -A /dev/sda
smartctl 5.43 2012-02-27 r3515 [armv5tel-linux-2.6.33.2] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net


Program received signal SIGSEGV, Segmentation fault.
0x00031518 in linux_scsi_device (this=0x78020, intf=<value optimized out>, dev_name=<value optimized out>, req_type=<value optimized out>, scanning=false) at os_linux.cpp:108
108           m_flags(flags), m_retry_flags(retry_flags)
(gdb) where
#0  0x00031518 in linux_scsi_device (this=0x78020, intf=<value optimized out>, dev_name=<value optimized out>, req_type=<value optimized out>, scanning=false) at os_linux.cpp:108
#1  0x0003159c in os_linux::linux_smart_interface::get_scsi_device (this=0x77bb8, name=0xbe8a8bac "/dev/sda", type=0x444c4 "scsi") at os_linux.cpp:2921
#2  0x0001ee6c in smart_interface::get_smart_device (this=0x77bb8, name=0xbe8a8bac "/dev/sda", type=0x444c4 "scsi") at dev_interface.cpp:306
#3  0x0001eefc in smart_interface::get_smart_device (this=0x77bb8, name=0xbe8a8bac "/dev/sda", type=0xbe8a8ba2 "sat") at dev_interface.cpp:317
#4  0x0000d868 in main_worker (argc=<value optimized out>, argv=0xbe8a8a44) at smartctl.cpp:1193
#5  0x0000dba4 in main (argc=24, argv=0xd1b44) at smartctl.cpp:1252

comment:4 by joban1, 12 years ago

Alex, for me it also looks like a compiler problem now: smartctl works just fine with optimization level 0!

comment:5 by joban1, 12 years ago

or not. This looks wrong to me:

class linux_scsi_device extends linux_smart_device (which implements smart_device)
and its constructor calls linux_smart_device(flag, retry)
which calls smart_device(never_called)
which throws an exception

but then, why does it work with optlevel 0?

comment:6 by Christian Franke, 12 years ago (in reply to: 5)

Cc: Alex Samorukov added
Keywords: arm added

The optimizer apparently breaks construction code for multiple inheritance hierarchies with virtual bases:

C++ requires that the final implementation class (linux_scsi_device) initializes each virtual base (smart_device) explicitly. This is done by the ctor call here. All other base ctor calls in the inheritance hierarchy are ignored. Therefore I decided to use a dummy ctor with a dummy enum to document that fact here. There is no default ctor in the base class, so a missing base ctor call in the implementation class causes a compile error.

Even the thrown exception is not handled properly. It should arrive here.

Newer smartmontools versions were successfully built even with older GCC 3.x on i686. This is likely an issue specific to arm and may be fixed in later GCC versions.

If building with ./configure CXXFLAGS=-O0 resolves the issue for both smartctl and smartd, I suggest closing this ticket as "worksforme".
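
The construction rule described in this comment can be sketched in a few lines. This is a simplified illustration, not the smartmontools source: the class names (device, scsi_like, os_device, os_scsi_device) and the enum name are hypothetical stand-ins for smart_device, its two intermediate classes, and linux_scsi_device. The intermediate classes name the dummy ctor, the final class names the real one, and a correct compiler runs only the final class's initializer for the virtual base.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Virtual base, loosely modeled on smart_device (hypothetical, simplified).
class device {
public:
  virtual ~device() {}
  const std::string & name() const { return m_name; }

protected:
  // Real ctor: every final (most derived) class must call it explicitly.
  explicit device(const std::string & name) : m_name(name) {}

  // Dummy enum and dummy ctor, for intermediate classes only.
  enum do_not_use_in_final_classes { never_called };
  explicit device(do_not_use_in_final_classes) {
    // Safeguard: in a correctly compiled program this body never runs.
    throw std::logic_error("device: dummy ctor called");
  }
  // No default ctor on purpose: a final class that forgets to
  // initialize the virtual base fails to compile.

private:
  std::string m_name;
};

// Intermediate classes: their device(never_called) initializer is
// skipped whenever they are base subobjects of a more derived class.
class scsi_like : virtual public device {
protected:
  scsi_like() : device(never_called) {}
};

class os_device : virtual public device {
protected:
  explicit os_device(int flags) : device(never_called), m_flags(flags) {}
  int m_flags;
};

// Final class: initializes the virtual base directly.
class os_scsi_device : public scsi_like, public os_device {
public:
  os_scsi_device(const std::string & name, int flags)
    : device(name), os_device(flags) {}
  int flags() const { return m_flags; }
};

// Constructs a final object and returns the name stored in the virtual base.
std::string build_and_get_name() {
  os_scsi_device dev("/dev/sda", 2);
  return dev.name();
}
```

With a conforming compiler, build_and_get_name() returns the name passed by the final class and the dummy ctor never throws. The backtrace in comment:3 points at the corresponding constructor in os_linux.cpp, which is consistent with the optimizer mishandling this construction sequence.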

comment:7 by Alex Samorukov, 12 years ago

My vote is to close this bug as bogus/worksforme. I can't reproduce it on Debian/ARM (installed in QEMU, with a real USB device). The build used -O2. I also tried building with -O6 and found that it works correctly.

So I think it is related to a very specific GCC build that ships only with this distribution. As I mentioned on the mailing list, it is common to see buggy or broken libraries and tools in NAS distributions.

comment:8 by joban1, 12 years ago

Priority: major → minor

Ok, I understand this diamond inheritance now. To me it looks a bit quirky to directly call the virtual base class constructor to avoid passing mandatory args through the middle level constructors but I guess it is valid.

Anyways, I found out using -fno-unit-at-a-time or -fno-toplevel-reorder fixes the sigsegv problem.
Using gcc 4.1 instead of 4.2 fixes the problem, too (didn't find newer versions).
With this I found Andrew Wiggin who had the same problem http://gcc.gnu.org/ml/gcc-help/2010-11/msg00036.html
So I guess it is not this special qnap optware distro, but a general problem with the compiler version on arm.

If you want to add some auto magic and find (in)vulnerable compiler versions: this code from Andrew triggers the sigsegv faster:

class A { public: ~A() {} virtual int e() = 0; };
class B : virtual public A { public: ~B() {} virtual int f() = 0; };
class C : virtual public A { public: ~C(); C(int a); protected: int _c; virtual int e(); };
class D : public C, virtual public B { public: ~D(); D(int a); virtual int f(); };
C::C(int a) { _c = a; }
C::~C() {}
int C::e() { return _c; }
D::D(int a):C(a) {}
D::~D() {}
int D::f() { return _c+1; }
int main() { D* d = new D(1); }

[~/share/smartmontools] $ g++ -O0 testing.cpp -o testing
[~/share/smartmontools] $ ./testing
[~/share/smartmontools] $ g++ -O2 testing.cpp -o testing
[~/share/smartmontools] $ ./testing
Segmentation fault

Or, since it is not really a smartmontools problem, just close the ticket, now that I know a workaround :)

Thank you.
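
Since the reproducer crashes rather than reporting anything, one alternative way to flag a possibly affected toolchain is a version check on GCC's predefined macros. The sketch below is only an assumption-laden illustration: the helper names are hypothetical, and the rule "GCC 4.2.x on 32-bit ARM" is a guess extrapolated from the single failing version (4.2.3) and the single working version (4.1) reported in this ticket. Other releases may or may not be affected.

```cpp
#include <cassert>

// Hypothetical helper: true for the configuration this ticket reports
// as failing. Flagging all of 4.2.x is an assumption; only 4.2.3 was
// observed to fail and only 4.1 was observed to work.
inline bool is_suspect_compiler(bool is_arm, int gcc_major, int gcc_minor) {
  return is_arm && gcc_major == 4 && gcc_minor == 2;
}

// Applies the rule to the compiler building this translation unit.
// Clang is excluded because it also defines __GNUC__ for compatibility.
inline bool build_uses_suspect_compiler() {
#if defined(__GNUC__) && !defined(__clang__) && defined(__arm__)
  return is_suspect_compiler(true, __GNUC__, __GNUC_MINOR__);
#else
  return false;
#endif
}
```

A build could call build_uses_suspect_compiler() at startup to print a hint about the -O0 workaround, or the same macro test could be turned into a preprocessor warning.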

comment:9 by Christian Franke, 12 years ago (in reply to: 8)

Replying to joban1:

> Ok, I understand this diamond inheritance now. To me it looks a bit quirky to directly call the virtual base class constructor to avoid passing mandatory args through the middle level constructors but I guess it is valid.

Please note that this is both valid and required. The construction style common to non-virtual inheritance simply does not work for virtual (even non-multiple) inheritance:

It is not possible to pass arguments from the final class ctor to the base ctor via middle-level ctors, because their calls to ctors of all virtual bases are simply ignored. This is a design decision of C++ itself to avoid ambiguities in construction in multiple inheritance hierarchies. (Bjarne Stroustrup: A virtual base is always constructed (once only) by its ‘most derived’ class.)
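
A minimal sketch of that rule, using hypothetical names (base, middle, most_derived) and single virtual inheritance only: the middle class tries to forward its argument to the virtual base, but once a further class derives from it, only the most derived class's initializer for the base takes effect.

```cpp
#include <cassert>

// Virtual base that records which ctor argument reached it.
struct base {
  int value;
  explicit base(int v) : value(v) {}
  virtual ~base() {}
};

// Middle class: tries to forward its argument to the virtual base.
struct middle : virtual base {
  explicit middle(int v) : base(v) {}
};

// Most derived class: its base(v) initializer is the one that runs.
// The base(-1) that middle(-1) would request is skipped.
struct most_derived : middle {
  explicit most_derived(int v) : base(v), middle(-1) {}
};

// Here middle itself is the most derived class, so its base(v) is used.
int value_when_middle_is_most_derived() {
  middle m(7);
  return m.value;
}

// Here middle is only a base subobject, so its base(v) is ignored.
int value_when_derived_from_middle() {
  most_derived d(42);
  return d.value;
}
```

The first function yields 7 and the second yields 42, not -1, which is why the final class has to name the virtual base ctor itself.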

> Anyways, I found out using -fno-unit-at-a-time or -fno-toplevel-reorder fixes the sigsegv problem.
> ...
> If you want to add some auto magic and find (in)vulnerable compiler versions: this code from Andrew triggers the sigsegv faster:
> ...

Thanks for the detailed info!

comment:10 by Christian Franke, 12 years ago

Resolution: worksforme
Status: new → closed

Bug in GCC 4.2.3 for arm. Workaround: Disable optimization.
