Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#781 closed defect (invalid)

smartctl for OCZ-AGILITY3 incorrectly reporting temperature of 128C

Reported by: Daniel
Owned by:
Priority: minor
Milestone:
Component: all
Version: 6.4
Keywords:
Cc:

Description

When running smartctl in FreeNAS 10 on an OCZ-AGILITY3 drive, I get a Temperature_Celsius reading of 128. The other drives in my system (Western Digital Reds) report values between 25 and 27 C.

[root@freenas] ~# smartctl -a /dev/ada4
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SandForce Driven SSDs
Device Model:     OCZ-AGILITY3
Serial Number:    OCZ-XXXXXXXXXXXXXXXX
LU WWN Device Id: 5 e83a97 f1fb643ef
Firmware Version: 2.15
User Capacity:    60,022,480,896 bytes [60.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Dec 15 17:11:44 2016 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                ( 2097) seconds.
Offline data collection
capabilities:                    (0x7f) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  48) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x0021) SCT Status supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   088   088   050    Pre-fail  Always       -       0/40025267
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   098   098   000    Old_age   Always       -       2522h+33m+58.310s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       640
171 Program_Fail_Count      0x0032   000   000   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   000   000   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       15
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       3
181 Program_Fail_Count      0x0032   000   000   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0032   000   000   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   128   129   000    Old_age   Always       -       128 (0 127 0 129 0)
195 ECC_Uncorr_Error_Count  0x001c   120   120   000    Old_age   Offline      -       0/40025267
196 Reallocated_Event_Count 0x0033   100   100   003    Pre-fail  Always       -       0
201 Unc_Soft_Read_Err_Rate  0x001c   120   120   000    Old_age   Offline      -       0/40025267
204 Soft_ECC_Correct_Rate   0x001c   120   120   000    Old_age   Offline      -       0/40025267
230 Life_Curve_Status       0x0013   100   100   000    Pre-fail  Always       -       100
231 SSD_Life_Left           0x0013   100   100   010    Pre-fail  Always       -       0
233 SandForce_Internal      0x0000   000   000   000    Old_age   Offline      -       2000
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       2733
241 Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       2733
242 Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       4477

SMART Error Log not supported

SMART Self-test Log not supported

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I can include an actual Serial Number if that's necessary. If you care to view the ticket I originally filed with FreeNAS 10, it's included here.

Change History (3)

comment:1 by Daniel, 7 years ago

Version: 6.5 → 6.4

comment:2 by Christian Franke, 7 years ago

Resolution: invalid
Status: new → closed

This device has no temperature sensor and therefore should not return attribute 194. The firmware author decided to keep the attribute and return 128 (min 127, max 129) instead.

smartctl prints the raw bytes, 128 (0 127 0 129 0), instead of the formatted 128 (Min/Max 127/129) because the values are out of range.

comment:3 by Christian Franke, 7 years ago

Possible workaround: Add a local drive database entry which changes the name and/or clears the raw value of attribute 194.

Example for /etc/smart_drivedb.h:

{
  "",
  "OCZ-AGILITY3",
  "", "",
  "-v 194,raw48:z,No_Temp_Sensor" 
}

This requires that the monitoring tool check either the attribute name or that the raw value is greater than 0 before assuming that this attribute represents a temperature.
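As a rough illustration of such a check on the monitoring side, here is a minimal Python sketch. It assumes the tool reads rows of the attribute table in the column layout shown in the output above; the function name temperature_from_row and the regular expression are hypothetical, and the sketch applies both the name check and the value > 0 check, although either one alone would satisfy the requirement above.

```python
import re

# One row of the smartctl attribute table, in the column layout shown above:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ATTR_ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+"   # ID, name, flag
    r"\s+\d+\s+\d+\s+\S+"                                  # VALUE, WORST, THRESH
    r"\s+\S+\s+\S+\s+\S+"                                  # TYPE, UPDATED, WHEN_FAILED
    r"\s+(?P<raw>.+?)\s*$"                                 # RAW_VALUE (may contain spaces)
)


def temperature_from_row(row):
    """Return the temperature from an attribute 194 row, or None if unusable.

    Hypothetical helper: a row only counts as a temperature if its name
    starts with "Temperature" and its raw value begins with an integer > 0.
    """
    match = ATTR_ROW.match(row)
    if match is None or int(match.group("id")) != 194:
        return None
    # Name check: a renamed attribute (e.g. No_Temp_Sensor) is not a temperature.
    if not match.group("name").startswith("Temperature"):
        return None
    # Value check: a cleared raw value (0) means there is no reading.
    first_number = re.match(r"\d+", match.group("raw"))
    if first_number is None:
        return None
    value = int(first_number.group())
    return value if value > 0 else None
```

Note that the unmodified row from the output above still yields 128 with this check, since the name and value both look valid. The check only helps once the local database entry has renamed the attribute or cleared its raw value.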

If other attributes are also needed, add the required -v options from the "SandForce Driven SSDs" entry in /usr/share/smartmontools/drivedb.h. Do not change that file directly, as it will be overwritten by package upgrades or by /usr/sbin/update-smart-drivedb.

The path of the database files may be distribution-specific. See the -B section of the smartctl man page for further information.
