Custom Query (1095 matches)

Results (73 - 75 of 1095)

Ticket Resolution Summary Owner Reporter
#77 wontfix smartd to include fma on opensolaris somebody grooverdan
Description

OpenSolaris has a fault management architecture (FMA). smartd already detects the kind of disk faults that could be reported into this architecture.

There is a simple API for reporting faults: http://www.opensolaris.org/os/community/fm/

Detailed documentation is available: http://docs.sun.com/app/docs/doc/819-3196/gemfu?l=en&a=view

#78 fixed Smartctl segmentation fault and crash followed by kernel invalid opcode trace Christian Franke mhlavink
Description

I got the following bug report from a Fedora user. Let me know if you need any other information.


smartctl crashes with a segmentation fault when asked to run a SMART self-test on a disk behind a Dell MegaRAID controller.

How reproducible:

Always reproducible

Steps to Reproduce:

  1. smartctl -t short /dev/sda -d megaraid,0
  2. segmentation fault and crash

Actual results:

smartctl 5.39.1 2010-01-28 r3054 [x86_64-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Segmentation fault

Message from syslogd@webster at Mar 29 14:45:01 ...

kernel:invalid opcode: 0000 #8 SMP
kernel:last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
kernel:Stack:
kernel:Call Trace:
kernel:Code: 00 08 00 00 49 c1 ef 0b 4c 8b 75 c0 49 81 c6 ff 07 00 00 49 c1 ee 0b 48 81 7d c0 01 10 00 00 45 19 ed 41 83 c5 02 45 85 f6 75 04 <0f> 0b eb fe 48 c7 c7 c0 7a a0 81 45 89 ec e8 a9 10 22 00 49 89

Additional info: Dell PowerEdge R710 with 2 Xeon E5530 CPUs and 8GB of RAM, running F12 x86_64. Controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)


The stack trace of smartmontools at the system call that causes the problem is a little hard to get: the crash happens in the kernel, so you can't just run the debugger up to the error (the stack is gone by that time). It seems, however, that the problem is in an ioctl:

#0 os_linux::linux_megaraid_device::megasas_cmd (this=0x7ffff821a030, cdbLen=<value optimized out>, cdb=0x7fffffffc8f0, dataLen=-134715736, data=0x0) at os_linux.cpp:1112
#1 0x00007ffff7fd682f in os_linux::linux_megaraid_device::scsi_pass_through (this=<value optimized out>, iop=0x7fffffffc870) at os_linux.cpp:1076
#2 0x00007ffff7fcba52 in scsiSendDiagnostic (device=0x7ffff821a030, functioncode=<value optimized out>, pBuf=<value optimized out>, bufLen=<value optimized out>) at scsicmds.cpp:722
#3 0x00007ffff7fcbb9f in scsiSmartExtendSelfTest (device=<value optimized out>) at scsicmds.cpp:1699
#4 0x00007ffff7fd45ad in scsiPrintMain (device=<value optimized out>, options=<value optimized out>) at scsiprint.cpp:1703
#5 0x00007ffff7fbbcf2 in main_worker (argc=<value optimized out>, argv=<value optimized out>) at smartctl.cpp:951
#6 0x00007ffff7fbc049 in main (argc=<value optimized out>, argv=<value optimized out>) at smartctl.cpp:967

Line 1112 of os_linux.cpp is:

rc = ioctl(m_fd, MEGASAS_IOC_FIRMWARE, &uio);

where uio is:

{host_no = 2, pad1 = 0, sgl_off = 48, sge_count = 1, sense_off = 0, sense_len = 0,
 frame = {raw = "\004\000\377\000\000\000\006\001\000\000\000\000\000\000\000\000\020", '\000'<repeats 15 times>, "\035@", '\000' <repeats 93 times>,
  hdr = {cmd = 4 '\004', sense_len = 0 '\000', cmd_status = 255 '\377', scsi_status = 0 '\000', target_id = 0 '\000', lun = 0 '\000', cdb_len = 6 '\006', sge_count = 1 '\001', context = 0, pad_0 = 0, flags = 16, timeout = 0, data_xferlen = 0}},
 sgl = {{iov_base = 0x0, iov_len = 0} <repeats 16 times>}}
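One detail in this dump stands out: sge_count is 1 (both in the outer packet and in hdr), yet the only scatter/gather entry has iov_base = 0x0 and iov_len = 0, and data_xferlen is 0. Whether that combination is what trips the kernel is not established in this report. As a sketch only, assuming it is, a guard like the following would avoid advertising a scatter/gather entry for a command with no data phase. The struct and function names are hypothetical, simplified stand-ins for the fields visible in the dump, not the real smartmontools or megaraid driver definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-ins for the fields shown in the gdb dump.
struct SgEntry {
  void*       iov_base;
  std::size_t iov_len;
};

struct PassThroughRequest {
  std::uint32_t sgl_off;          // offset of the SGL inside the frame
  std::uint32_t sge_count;        // count in the outer ioctl packet
  std::uint8_t  frame_sge_count;  // count in the frame header (hdr.sge_count)
  std::uint32_t data_xferlen;     // hdr.data_xferlen
  SgEntry       sgl[16];
};

// Fill in the data-phase fields. For a command without a data phase
// (data == nullptr or data_len == 0) every field stays zero, so no
// scatter/gather entry is advertised to the driver.
void fill_data_phase(PassThroughRequest& req, void* data,
                     std::size_t data_len, std::uint32_t sgl_offset)
{
  req = PassThroughRequest();  // value-initialize: all fields zero

  if (data == nullptr || data_len == 0)
    return;  // no data phase: leave both sge counts at 0

  assert(data_len <= UINT32_MAX);
  req.sgl_off         = sgl_offset;
  req.sge_count       = 1;
  req.frame_sge_count = 1;
  req.data_xferlen    = static_cast<std::uint32_t>(data_len);
  req.sgl[0].iov_base = data;
  req.sgl[0].iov_len  = data_len;
}
```

With the values from the report (data=0x0, data_xferlen = 0) this helper leaves both counts at 0. With a real buffer it sets one entry pointing at that buffer.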

I don't know if it's useful, but the last non-hardware-specific call up the stack is line 722 of scsicmds.cpp:

if (!device->scsi_pass_through(&io_hdr));

At that point, io_hdr is:

$1 = {cmnd = 0x7fffffffc8f0 "\035@", cmnd_len = 6, dxfer_dir = 0, dxferp = 0x0, dxfer_len = 0, sensep = 0x7fffffffc8d0 "HITACHI ", max_sense_len = 32, timeout = 18000, resp_sense_len = 0, scsi_status = 0 '\000', resid = 0}
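For readers who want to check what command is actually in flight, the cmnd bytes "\035@" can be decoded by hand: \035 is octal for 0x1D (the SCSI SEND DIAGNOSTIC opcode) and '@' is 0x40. In SEND DIAGNOSTIC, bits 7..5 of byte 1 carry the self-test code and bit 2 is the SELFTEST bit. The sketch below does that decoding. The struct and function names are hypothetical helpers written for this note, not smartmontools code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper type for this note; not part of smartmontools.
struct SendDiagnosticInfo {
  bool     is_send_diagnostic;  // opcode byte is 0x1D
  unsigned self_test_code;      // bits 7..5 of byte 1
  bool     self_test_bit;       // bit 2 of byte 1 (default self-test)
};

// Name of a SEND DIAGNOSTIC self-test code as defined by the SCSI standard.
std::string self_test_name(unsigned code)
{
  switch (code) {
    case 0:  return "none (SELFTEST bit applies)";
    case 1:  return "background short";
    case 2:  return "background extended";
    case 4:  return "abort background self-test";
    case 5:  return "foreground short";
    case 6:  return "foreground extended";
    default: return "reserved";
  }
}

// Decode the first two bytes of a 6-byte SEND DIAGNOSTIC CDB.
SendDiagnosticInfo decode_send_diagnostic(const std::uint8_t* cdb,
                                          std::size_t len)
{
  assert(cdb != nullptr);
  SendDiagnosticInfo info = {false, 0, false};
  if (len < 6 || cdb[0] != 0x1D)
    return info;  // not a SEND DIAGNOSTIC CDB
  info.is_send_diagnostic = true;
  info.self_test_code     = (cdb[1] >> 5) & 0x7;
  info.self_test_bit      = (cdb[1] & 0x04) != 0;
  return info;
}
```

Feeding it the six bytes {0x1D, 0x40, 0, 0, 0, 0} from the dump gives self-test code 2, a background extended self-test, which agrees with scsiSmartExtendSelfTest in frame #3 of the backtrace. It is a command with no data phase, which matches dxferp = 0x0 and dxfer_len = 0.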

#79 wontfix smartd opensolaris service definition somebody grooverdan
Description

Attached is an {open}solaris service definition. The configure script should alter some of the path names but otherwise should work flexibly.

A make install target could do something like:

	cp smartd.xml /var/svc/manifest/site/
	chown root:sys /var/svc/manifest/site/smartd.xml
	svccfg -v import /var/svc/manifest/site/smartd.xml

The documentation could tell users to run 'svcadm enable smartd' to get it running.

I'm still getting this error. I'm running out of time to debug it. Tips welcome.

svc:/system/smartd:default (SMART monitoring)
 State: maintenance since June  9, 2010 01:54:26 PM EST
Reason: Start method failed repeatedly, last exited with status 10.
   See: http://sun.com/msg/SMF-8000-KS
   See: man -M /usr/share/man -s 1M smartd
   See: man -M /usr/share/man -s 4 smartd.conf
   See: /var/svc/log/system-smartd:default.log
Impact: This service is not running.

/var/svc/log/system-smartd:default.log

[ Jun  9 13:52:06 Rereading configuration. ]
[ Jun  9 13:54:26 Executing start method ("/usr/sbin/smartd -q never"). ]
[ Jun  9 13:54:26 Method "start" exited with status 10. ]
[ Jun  9 13:54:26 Executing start method ("/usr/sbin/smartd -q never"). ]
[ Jun  9 13:54:26 Method "start" exited with status 10. ]
[ Jun  9 13:54:26 Executing start method ("/usr/sbin/smartd -q never"). ]
[ Jun  9 13:54:26 Method "start" exited with status 10. ]