Opened 7 years ago

Closed 7 years ago

Last modified 6 years ago

#939 closed defect (fixed)

drivedb correction: Innolite Satadom D150QV-L

Reported by: Stoat
Owned by:
Priority: minor
Milestone: Release 7.0
Component: drivedb
Version: 6.6
Keywords:
Cc:

Description

After reading the full datasheet (PDF available on request), I've come up with the following changes:

# diff -u /usr/local/share/smartmontools/drivedb.h /usr/local/share/smartmontools/drivedb.h.new
--- /usr/local/share/smartmontools/drivedb.h    2017-11-14 02:36:32.359088000 +0000
+++ /usr/local/share/smartmontools/drivedb.h.new        2017-11-14 02:33:58.020915000 +0000
@@ -736,7 +736,7 @@
     "-v 241,raw48,Host_Writes"
   },
   { "InnoDisk InnoLite SATADOM D150QV-L SSDs", // tested with InnoLite SATADOM D150QV-L/120319
-    "InnoLite SATADOM D150QV-L",
+    "InnoLite SATADOM D150QV",
     "", "",
   //"-v 1,raw48,Raw_Read_Error_Rate "
   //"-v 2,raw48,Throughput_Performance "
@@ -744,18 +744,18 @@
   //"-v 5,raw16(raw16),Reallocated_Sector_Ct "
   //"-v 7,raw48,Seek_Error_Rate " // from InnoDisk iSMART Linux tool, useless for SSD
   //"-v 8,raw48,Seek_Time_Performance "
-  //"-v 9,raw24(raw8),Power_On_Hours "
+  //"-v 9,raw48,Power_On_Hours "
   //"-v 10,raw48,Spin_Retry_Count "
   //"-v 12,raw48,Power_Cycle_Count "
     "-v 168,raw48,SATA_PHY_Error_Count "
-    "-v 170,raw48,Bad_Block_Count "
-    "-v 173,raw48,Erase_Count "
+    "-v 170,raw16,Bad_Block_Count_New/Tot "
+    "-v 173,raw16,Erase_Count_Max/Avg "
     "-v 175,raw48,Bad_Cluster_Table_Count "
     "-v 192,raw48,Unexpect_Power_Loss_Ct "
   //"-v 194,tempminmax,Temperature_Celsius "
   //"-v 197,raw48,Current_Pending_Sector "
     "-v 229,hex48,Flash_ID "
-    "-v 235,raw48,Later_Bad_Block "
+    "-v 235,raw16,Lat_Bad_Blk_Era/Wri/Rea "
     "-v 236,raw48,Unstable_Power_Count "
     "-v 240,raw48,Write_Head"
   },



Notes:

This device is labelled on the outside as a D150QV-L but reports itself as a D150QV, which is the family name according to the spec sheet. Suffixes indicate power and temperature variants.

"235 - Later bad block": this is the count of bad blocks detected after leaving the factory, broken down by the mode in which they tested faulty. I'm not quite sure what order the write/read counts are in, as this was extracted from iSMART documentation, but it appears to be correct.

"170 - bad block count" lists the later bad blocks and the total including bad blocks from the factory. The actual format is 0x64 0x64 0x00 0x00 [total lsb msb] [later lsb msb] (raw16 gives a bogus trailing zero).

"173 - erase count" has the max/avg order reversed relative to the raw16(avg16) format. raw16 gives a bogus leading zero.

Attributes 9, 12, 168, 175 and 192 are set to raw48, but only the bottom 2 bytes are actually used (0x64 0x64 [lsb msb] 0x00 0x00 0x00 0x00).
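To make the raw16 layout concrete, here is a minimal Python sketch that splits a 48-bit raw value into three 16-bit words, most significant first, which is the order the raw16 format prints them in. The function name raw48_words is made up for illustration; the input numbers are the raw values from the stock smartctl output in comment:5.

```python
def raw48_words(raw):
    """Split a 48-bit raw value into three 16-bit words, most significant first."""
    return [(raw >> shift) & 0xFFFF for shift in (32, 16, 0)]


# Raw values as printed by stock smartctl (no drivedb entry) in comment:5.
print(raw48_words(373674344448))  # attribute 170 -> [87, 186, 0]      later, total, unused
print(raw48_words(305992140))     # attribute 173 -> [0, 4669, 4556]   unused, max, avg
print(raw48_words(5701719))       # attribute 235 -> [0, 87, 87]       one count per failure mode
```

The zero word in 170 and 173 is the bogus trailing/leading zero mentioned above.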

Bogons:

Attribute 01 is fixed at 0x64 0x64 0xff 0xff 0xff 0x00 0x00 0x00.

Attributes 02, 03, 05, 07, 08, 10, 197 and 240 are all fixed at zero.

Attribute 194 has the usual temperature format, but byte 7 (the temperature) is never reported.

As such, these could (and probably should!) all be filtered, particularly the reallocated and pending sector counts: they will never move away from zero, which could be highly misleading to the casual observer (it certainly fooled our vendor!).
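For anyone who wants to drop these on the monitoring side instead of in drivedb, a rough Python sketch follows. It assumes the readings have already been collected as a mapping of attribute ID to integer raw value; FIXED_RAW, is_fixed_placeholder and informative are hypothetical names, not anything smartmontools provides. Attribute 194 is left out because its raw column also carries the Min/Max text.

```python
# Fixed raw values listed under "Bogons" for the D150QV.
# Attribute 01 raw bytes ff ff ff 00 00 00 read little-endian give 0xFFFFFF.
FIXED_RAW = {1: 0xFFFFFF}
FIXED_RAW.update({attr_id: 0 for attr_id in (2, 3, 5, 7, 8, 10, 197, 240)})


def is_fixed_placeholder(attr_id, raw):
    """True if this reading is one of the constant values the drive always returns."""
    return FIXED_RAW.get(attr_id) == raw


def informative(readings):
    """Drop readings that carry no information on this drive."""
    return {i: r for i, r in readings.items() if not is_fixed_placeholder(i, r)}


readings = {1: 16777215, 5: 0, 9: 18843, 12: 122, 197: 0}
print(informative(readings))  # {9: 18843, 12: 122}
```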

Comment:

I hope these help a bit. Trying to figure out what the "unknown attributes" really were was a bit of an adventure.

These devices are far more fragile than their 3000-cycle write endurance rating indicates. They're optimised as read-only industrial controller drives (i.e. don't write logs back to them and don't RAID1 them) but have been widely deployed as RAID1 boot pairs in production NASes (certified by Nexenta, Open-E and ixSystems amongst other vendors), where they break after a couple of years.

Thankfully they appear to be out of production. Even when they were sold, Innodisk made higher-spec devices, but getting hold of those was difficult as distributors took the attitude that "all models are the same".

Attachments (2)

443816-smart.pdf (93.6 KB) - added by Stoat 7 years ago.
extended datasheet smart data structure page
innodisk_error_correction_detection_and_bad_block_management_white_paper_ver1 0_te.pdf (458.1 KB) - added by Stoat 7 years ago.
innodisk bad block management white paper.


Change History (10)

comment:1 by Christian Franke, 7 years ago

Component: all → drivedb
Milestone: Release 6.7

comment:2 by Alex Samorukov, 7 years ago

Thank you for the update and clarification. I will merge your changes into the drivedb; however, I would prefer not to filter out values, as there is a [small] chance they will be fixed in the next firmware release.

by Stoat, 7 years ago

Attachment: 443816-smart.pdf added

extended datasheet smart data structure page

comment:3 by Stoat, 7 years ago

The fixed structures are codified in the extended datasheet (pages attached), whose last revision is over 6 years old.

The chance of a firmware update is vanishingly low. The devices are long gone from Innodisk's catalog _and_ support pages.

I've attached the white paper which allowed deduction of the Later Bad Block structure too. (image on page 4)


by Stoat, 7 years ago

innodisk bad block management white paper.

comment:4 by Alex Samorukov, 7 years ago

Please also provide smartctl -x output for that drive.

comment:5 by Stoat, 7 years ago

Stock smartctl on FreeBSD 10.3:

# smartctl -x /dev/ada0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     InnoLite SATADOM D150QV
Serial Number:    20141231AA1005224296
Firmware Version: 120319
User Capacity:    32,017,047,552 bytes [32.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Tue Nov 14 18:30:41 2017 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Unavailable
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Total time to complete Offline 
data collection:                (   30) seconds.
Offline data collection
capabilities:                    (0x00)         Offline data collection not supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        No General Purpose Logging support.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   050    -    16777215
  2 Throughput_Performance  P-S---   100   100   050    -    0
  3 Spin_Up_Time            POS---   100   100   050    -    0
  5 Reallocated_Sector_Ct   PO--C-   100   100   050    -    0
  7 Unknown_SSD_Attribute   PO-R--   100   100   050    -    0
  8 Unknown_SSD_Attribute   P-S---   100   100   050    -    0
  9 Power_On_Hours          -O--C-   100   100   000    -    18843
 10 Unknown_SSD_Attribute   PO--C-   100   100   050    -    0
 12 Power_Cycle_Count       -O--C-   100   100   000    -    122
168 Unknown_Attribute       -O--C-   100   100   000    -    0
175 Program_Fail_Count_Chip PO----   100   100   010    -    0
192 Power-Off_Retract_Count -O--C-   100   100   000    -    0
194 Temperature_Celsius     -O---K   000   100   000    -    0 (Min/Max 0/100)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
240 Unknown_SSD_Attribute   PO--C-   100   100   050    -    0
170 Unknown_Attribute       PO----   100   100   010    -    373674344448
173 Unknown_Attribute       -O--C-   100   100   000    -    305992140
229 Unknown_Attribute       -O----   100   100   000    -    727108061228
236 Unknown_Attribute       -O----   100   100   000    -    0
235 Unknown_Attribute       -O----   100   000   000    -    5701719
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Read SMART Log Directory failed: Input/output error

General Purpose Log Directory not supported

SMART Extended Comprehensive Error Log (GP Log 0x03) not supported

SMART Error Log not supported

SMART Extended Self-test Log (GP Log 0x07) not supported

SMART Self-test Log not supported

Selective Self-tests/Logging not supported

SCT Commands not supported

Device Statistics (GP/SMART Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11) not supported

With the updated drivedb.h:

# smartctl -B /usr/local/share/smartmontools/drivedb.h.new -x /dev/ada0
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     InnoDisk InnoLite SATADOM D150QV-L SSDs
Device Model:     InnoLite SATADOM D150QV
Serial Number:    20141231AA1005224296
Firmware Version: 120319
User Capacity:    32,017,047,552 bytes [32.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Tue Nov 14 18:41:50 2017 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Unavailable
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Total time to complete Offline 
data collection:                (   30) seconds.
Offline data collection
capabilities:                    (0x00)         Offline data collection not supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        No General Purpose Logging support.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   050    -    16777215
  2 Throughput_Performance  P-S---   100   100   050    -    0
  3 Spin_Up_Time            POS---   100   100   050    -    0
  5 Reallocated_Sector_Ct   PO--C-   100   100   050    -    0
  7 Unknown_SSD_Attribute   PO-R--   100   100   050    -    0
  8 Unknown_SSD_Attribute   P-S---   100   100   050    -    0
  9 Power_On_Hours          -O--C-   100   100   000    -    18843
 10 Unknown_SSD_Attribute   PO--C-   100   100   050    -    0
 12 Power_Cycle_Count       -O--C-   100   100   000    -    122
168 SATA_PHY_Error_Count    -O--C-   100   100   000    -    0
175 Bad_Cluster_Table_Count PO----   100   100   010    -    0
192 Unexpect_Power_Loss_Ct  -O--C-   100   100   000    -    0
194 Temperature_Celsius     -O---K   000   100   000    -    0 (Min/Max 0/100)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
240 Write_Head              PO--C-   100   100   050    -    0
170 Bad_Block_Count_New/Tot PO----   100   100   010    -    87 186 0
173 Erase_Count_Max/Avg     -O--C-   100   100   000    -    0 4669 4556
229 Flash_ID                -O----   100   100   000    -    0x00a94b04882c
236 Unstable_Power_Count    -O----   100   100   000    -    0
235 Lat_Bad_Blk_Era/Wri/Rea -O----   100   000   000    -    0 87 87
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Read SMART Log Directory failed: Input/output error

General Purpose Log Directory not supported

SMART Extended Comprehensive Error Log (GP Log 0x03) not supported

SMART Error Log not supported

SMART Extended Self-test Log (GP Log 0x07) not supported

SMART Self-test Log not supported

Selective Self-tests/Logging not supported

SCT Commands not supported

Device Statistics (GP/SMART Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11) not supported

Attribute 170 has a trailing bogon 0 and 173 has a leading bogon 0 in the "raw value" column.

Both use raw16, as I couldn't see a better way to get the numbers out in a readable format whilst suppressing the 0.

I've got a few of these things in 32/64GB with between 3 and 7 years of power-on time. smartctl gives the same results on Linux (RHEL and Ubuntu).
They really are quite crufty devices with a _very_ low sequential write speed (18/20/40/40MB/s for 8/16/32/64GB) and an endurance of 3000 cycles.

Advertising sheets quote them at 110MB/s - that's the read speed only.

One of our vendors has commented that they tend to simply stop working with little to no warning - but none of the vendor appliances have been monitoring the bad block parameters, on the assumption that the reallocated sector count was valid. That's why I recommended filtering these returns.

What we saw was a dramatic slowdown in write speeds, to something less than 1MB/s, when one of the 64GB units hit 256 bad blocks. None of the units has changed the 0x64 "value" field for either bad block attribute no matter how many blocks have failed, and I'm going to push one through a few write cycles to see if they ever do (I suspect not).
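Since the normalised "value" field never moves, a monitor has to look at the raw column instead. Below is a minimal Python sketch that parses attribute rows as printed with the updated drivedb.h above. The names ROW, WATCHED, parse_rows and bad_block_warning are hypothetical, and the default limit of 256 is only the figure from the one 64GB unit described here, compared against the total count as an assumption; it is not a documented threshold.

```python
import re

# Three rows copied from the smartctl -x output with the updated drivedb.h.
SAMPLE = """\
170 Bad_Block_Count_New/Tot PO----   100   100   010    -    87 186 0
173 Erase_Count_Max/Avg     -O--C-   100   100   000    -    0 4669 4556
235 Lat_Bad_Blk_Era/Wri/Rea -O----   100   000   000    -    0 87 87
"""

# ID, name, flags, value, worst, thresh, fail, raw value (rest of the line).
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.*)$")
WATCHED = {170, 173, 235}  # attributes displayed in raw16 format


def parse_rows(text):
    """Return {attribute id: [w2, w1, w0]} for the watched raw16 attributes."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCHED:
            rows[int(m.group(1))] = [int(x) for x in m.group(8).split()]
    return rows


def bad_block_warning(rows, limit=256):
    """Flag the drive once the total bad block count reaches the (assumed) limit."""
    later, total, _unused = rows[170]
    return total >= limit


rows = parse_rows(SAMPLE)
print(rows[170], bad_block_warning(rows))  # [87, 186, 0] False
```

On a real system the text would come from smartctl itself rather than SAMPLE, and the limit would need tuning per capacity.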


comment:6 by Alex Samorukov, 7 years ago

Resolution: fixed
Status: new → closed

Done in r4627, thank you

comment:7 by Christian Franke, 7 years ago

Merged in r4635.

comment:8 by Christian Franke, 6 years ago

Milestone: Release 6.7 → Release 7.0

Milestone renamed
