Opened 3 weeks ago
Last modified 3 weeks ago
#1905 new defect
smartctl stores system uptime in the LifeTime(hours) field of the SMART self-test log on a Samsung SSD 840 EVO 1TB
Reported by: | asm | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | undecided |
Component: | smartctl | Version: | |
Keywords: | ata | Cc: |
Description
When carrying out some SMART self-tests with smartctl, I wondered why such small values were always stored in the LifeTime(hours) field on my Samsung SSD 840 1 TB, even though the SSD already had over 47,180 operating hours on it. However, with my other SSDs, such as a Samsung SSD 860, the correct value is stored.
During further tests, I noticed that smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-27-amd64] saves the system uptime in this field. That is, the time since the system was booted.
So this is definitely a bug in smartctl.
I am using Debian 12 stable (Bookworm) here and the smartctl binary is from the official Debian 12 stable package repository.
I noticed yesterday that the Uptime value is entered here instead of the Power_On_Hours value.
$ uptime
04:38:45 up 14:19, 11 users, load average: 0.11, 0.25, 0.49
Yesterday the system was running for 14 hours and 19 minutes.
I started the first short smart self-test after the system was running for 4 hours.
Then I started an extended self-test, but it never finished. The estimate said the test would take 4 hours and 10 minutes, but it stalled with 90% remaining and never advanced. This seems to be another bug, although it could also be due to gsmartcontrol, which I used yesterday. The incorrect uptime entry, however, is due to smartctl, because today I called smartctl directly.
After 13 hours of system uptime, I aborted the extended test manually. That's why there is a 13 in this field. That gave me the idea that smartctl stores the uptime here, so I carried out two more short tests, which confirmed the suspicion: the two most recent entries show 14, which matches an uptime of 14 hours.
Here's the output. Take a look at the four most recent tests (#1 to #4), with the values 14, 14, 13 and 4.
I saved the output of "smartctl --all devicename" after the last test.
The LifeTime(hours) field contains the uptime in whole hours instead of the Power_On_Hours value:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-27-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 840 EVO 1TB
Serial Number:    XXXXXXXXXXXXXXXXXXXX
LU WWN Device Id: 5 002538 8a056d6ea
Firmware Version: EXT0CB6Q
User Capacity:    1.000.204.886.016 bytes [1,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
TRIM Command:     Available
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Nov 11 04:36:02 2024 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (15000) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 250) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       47174
 12 Power_Cycle_Count       0x0032   095   095   000    Old_age   Always       -       4347
177 Wear_Leveling_Count     0x0013   096   096   000    Pre-fail  Always       -       42
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   074   059   000    Old_age   Always       -       26
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   099   099   000    Old_age   Always       -       2
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       229
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       103794737444

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        14         -
# 2  Short offline       Completed without error       00%        14         -
# 3  Extended offline    Aborted by host               00%        13         -
# 4  Short offline       Completed without error       00%         4         -
# 5  Extended offline    Completed without error       00%         9         -
# 6  Short offline       Completed without error       00%         2         -
# 7  Short offline       Completed without error       00%         2         -
# 8  Short offline       Completed without error       00%        13         -
# 9  Short offline       Completed without error       00%         3         -
#10  Extended offline    Completed without error       00%         2         -
#11  Short offline       Completed without error       00%         0         -
#12  Short offline       Completed without error       00%        11         -
#13  Extended offline    Completed without error       00%         6         -
#14  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
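The mismatch in the output above can also be checked mechanically. The following is a minimal sketch, assuming the text layout of `smartctl -a` shown in this ticket; the function names and the `SAMPLE` excerpt are made up for illustration and are not part of smartmontools. It reads the raw Power_On_Hours value and the LifeTime(hours) column, then checks two things that should hold for a drive that logs power-on hours: the newest entry must not exceed Power_On_Hours, and the values must never increase when going from newer to older entries (ignoring a possible wrap of the 16-bit counter).

```python
import re

# Matches one self-test log row, e.g. "# 3  Extended offline  Aborted by host  00%  13  -".
# Group 1 is the entry number, group 2 the LifeTime(hours) value.
SELFTEST_ROW = re.compile(r"^#\s*(\d+)\s+.+?\s+\d+%\s+(\d+)\s+\S+\s*$")


def power_on_hours(report):
    """Return the raw value of attribute 9 (Power_On_Hours), or None if absent."""
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "9" and fields[1] == "Power_On_Hours":
            return int(fields[-1])
    return None


def selftest_lifetimes(report):
    """Return the LifeTime(hours) column of the self-test log, newest entry first."""
    hours = []
    for line in report.splitlines():
        match = SELFTEST_ROW.match(line.strip())
        if match:
            hours.append(int(match.group(2)))
    return hours


def log_is_consistent(report):
    """Return True/False for the plausibility check, or None if data is missing."""
    poh = power_on_hours(report)
    hours = selftest_lifetimes(report)
    if poh is None or not hours:
        return None
    # Entries are listed newest first, so the hours should never go up.
    never_increases = all(a >= b for a, b in zip(hours, hours[1:]))
    return never_increases and hours[0] <= poh


# Excerpt of the output above (illustrative subset only).
SAMPLE = """\
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       47174
# 1  Short offline       Completed without error       00%        14         -
# 2  Short offline       Completed without error       00%        14         -
# 3  Extended offline    Aborted by host               00%        13         -
# 4  Short offline       Completed without error       00%         4         -
# 5  Extended offline    Completed without error       00%         9         -
"""

print(power_on_hours(SAMPLE), selftest_lifetimes(SAMPLE), log_is_consistent(SAMPLE))
```

For this excerpt the check fails because entry #5 (9 hours) is larger than the newer entry #4 (4 hours), which cannot happen with a counter that only counts up. The check says nothing about where the logged number comes from.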
This afternoon I ran some more tests after turning on the computer.
Today I only used the cli program smartctl directly.
$ uptime
15:14:41 up 30 min, 12 users, load average: 0,18, 0,50, 0,62
Result after today's first short self-test:
=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 840 EVO 1TB
Serial Number:    XXXXXXXXXXXXXXXXXXXX
...<snip>
Local Time is:    Mon Nov 11 15:21:49 2024 CET
...<snip>
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       47177
 12 Power_Cycle_Count       0x0032   095   095   000    Old_age   Always       -       4348
...<snip>
SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -
# 2  Short offline       Completed without error       00%        14         -
# 3  Short offline       Completed without error       00%        14         -
# 4  Extended offline    Aborted by host               00%        13         -
# 5  Short offline       Completed without error       00%         4         -
# 6  Extended offline    Completed without error       00%         9         -
# 7  Short offline       Completed without error       00%         2         -
# 8  Short offline       Completed without error       00%         2         -
# 9  Short offline       Completed without error       00%        13         -
#10  Short offline       Completed without error       00%         3         -
#11  Extended offline    Completed without error       00%         2         -
#12  Short offline       Completed without error       00%         0         -
#13  Short offline       Completed without error       00%        11         -
#14  Extended offline    Completed without error       00%         6         -
#15  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
... <snip>
Now the value in the LifeTime(hours) field is a 0. This proves that the system uptime is stored here.
Then, after the uptime reached 1 hour, I ran another short test:
$ uptime
16:11:11 up 1:27, 12 users, load average: 1,46, 1,04, 0,87
Result, same drive:
=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 840 EVO 1TB
... <snip>
Local Time is:    Mon Nov 11 18:30:07 2024 CET
... <snip>
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       47180
 12 Power_Cycle_Count       0x0032   095   095   000    Old_age   Always       -       4348
... <snip>
SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         1         -
# 2  Short offline       Completed without error       00%         0         -
# 3  Short offline       Completed without error       00%        14         -
# 4  Short offline       Completed without error       00%        14         -
# 5  Extended offline    Aborted by host               00%        13         -
# 6  Short offline       Completed without error       00%         4         -
# 7  Extended offline    Completed without error       00%         9         -
# 8  Short offline       Completed without error       00%         2         -
# 9  Short offline       Completed without error       00%         2         -
#10  Short offline       Completed without error       00%        13         -
#11  Short offline       Completed without error       00%         3         -
#12  Extended offline    Completed without error       00%         2         -
#13  Short offline       Completed without error       00%         0         -
#14  Short offline       Completed without error       00%        11         -
#15  Extended offline    Completed without error       00%         6         -
#16  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
... <snip>
Now the value 1 is stored.
Then I waited until the uptime had passed 3 full hours and started another test:
$ uptime
18:31:02 up 3:46, 12 users, load average: 1,35, 0,68, 0,33
Result:
Device Model:     Samsung SSD 840 EVO 1TB
... <snip>
Local Time is:    Mon Nov 11 18:34:28 2024 CET
... <snip>
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       47180
 12 Power_Cycle_Count       0x0032   095   095   000    Old_age   Always       -       4348
... <snip>
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         3         -
# 2  Short offline       Completed without error       00%         1         -
# 3  Short offline       Completed without error       00%         0         -
# 4  Short offline       Completed without error       00%        14         -
# 5  Short offline       Completed without error       00%        14         -
# 6  Extended offline    Aborted by host               00%        13         -
# 7  Short offline       Completed without error       00%         4         -
# 8  Extended offline    Completed without error       00%         9         -
# 9  Short offline       Completed without error       00%         2         -
#10  Short offline       Completed without error       00%         2         -
#11  Short offline       Completed without error       00%        13         -
#12  Short offline       Completed without error       00%         3         -
#13  Extended offline    Completed without error       00%         2         -
#14  Short offline       Completed without error       00%         0         -
#15  Short offline       Completed without error       00%        11         -
#16  Extended offline    Completed without error       00%         6         -
#17  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
... <snip>
Now the value is 3, which, as expected, again matches the system uptime in whole hours.
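The comparison made by hand above can be sketched in a few lines. This is only an illustration, assuming the procps-style `uptime` line format seen in this ticket; `uptime_whole_hours` and the `OBSERVED` list are hypothetical names, and the pairs are copied from the four `uptime` calls and the logged values reported here. Matching numbers show a correlation only; they do not by themselves show which counter the value comes from.

```python
import re


def uptime_whole_hours(uptime_line):
    """Return the uptime in whole hours from one line of `uptime` output."""
    # Capture everything between "up" and the user count, e.g. "14:19",
    # "30 min" or "2 days, 3:46".
    match = re.search(r"\bup\s+(.*?),\s+\d+\s+users?", uptime_line)
    if not match:
        raise ValueError("unrecognised uptime line: %r" % uptime_line)
    span = match.group(1)

    hours = 0
    days = re.search(r"(\d+)\s+days?", span)
    if days:
        hours += 24 * int(days.group(1))
    clock = re.search(r"(\d+):(\d+)", span)
    if clock:
        hours += int(clock.group(1))
    # A span such as "30 min" has neither part and therefore counts as 0 hours.
    return hours


# (uptime line, LifeTime(hours) logged for the test run around that time)
OBSERVED = [
    ("04:38:45 up 14:19, 11 users, load average: 0.11, 0.25, 0.49", 14),
    ("15:14:41 up 30 min, 12 users, load average: 0,18, 0,50, 0,62", 0),
    ("16:11:11 up 1:27, 12 users, load average: 1,46, 1,04, 0,87", 1),
    ("18:31:02 up 3:46, 12 users, load average: 1,35, 0,68, 0,33", 3),
]

for line, logged in OBSERVED:
    print(uptime_whole_hours(line), logged)
```

Each pair printed by the loop contains the same number twice for the four observations listed.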
So this is definitely a bug. Normally the value of Power_On_Hours should be stored in this field.
That is how it works with my other SSDs and hard drives. I don't know why it's different with the Samsung SSD 840 EVO.
BTW, the older SMART self-test entries (#8 to #17 in the last output) were made a couple of years ago. At that time, I was running older versions of Debian and smartctl. This means that this bug seems to have existed in smartctl for a very long time.
Regarding the other bug with the extended self-test, which I had to abort yesterday because it didn't continue: I will have to run another test and call smartctl directly as a CLI tool. I won't have time for that today, so I'll do it another time.
Change History (3)
comment:1 by , 3 weeks ago
Keywords: | ata added; Samsung EVO 840 Debian 12 Bookworm uptime removed |
---|---|
Milestone: | → undecided |
follow-up: 3 comment:2 by Christian Franke, 3 weeks ago
Set the system temporarily to suspend to RAM mode. This keeps the OS uptime but power cycles the SSD. Then compare OS uptime and SSD self-test log again.
comment:3 by , 3 weeks ago
Replying to Christian Franke:
Set the system temporarily to suspend to RAM mode. This keeps the OS uptime but power cycles the SSD. Then compare OS uptime and SSD self-test log again.
Thank you for your answer. You were right. Putting the computer in standby mode and waking it up again kept the OS uptime, but power cycled the SSD.
After a short test, the value now stored is 0 hours instead of 10, which was the uptime.
But I also discovered something strange.
Today I ran the extended test only from the command line, and it completed without errors. Only the LifeTime(hours) number looks strange. This was before the standby test:
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         0         -
# 2  Short offline       Completed without error       00%         7         -
But after I did your standby test, the same entry is shown as follows:
SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Aborted by host               00%        10         -
# 2  Short offline       Completed without error       00%         7         -
Here's the output of smartctl -x after the stand-by and short test:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-27-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 840 EVO 1TB
Serial Number:    XXXXXXXXXXXXXXXXXXXX
LU WWN Device Id: 5 002538 8a056d6ea
Firmware Version: EXT0CB6Q
User Capacity:    1.000.204.886.016 bytes [1,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
TRIM Command:     Available
Device is:        In smartctl database 7.3/5319
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Nov 13 00:30:51 2024 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Unexpected SCT status 0x0001 (action_code=4, function_code=2)
Wt Cache Reorder: Unknown (SCT Feature Control command failed)

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (15000) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 250) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  9 Power_On_Hours          -O--CK   090   090   000    -    47202
 12 Power_Cycle_Count       -O--CK   095   095   000    -    4350
177 Wear_Leveling_Count     PO--C-   096   096   000    -    42
179 Used_Rsvd_Blk_Cnt_Tot   PO--C-   100   100   010    -    0
181 Program_Fail_Cnt_Total  -O--CK   100   100   010    -    0
182 Erase_Fail_Count_Total  -O--CK   100   100   010    -    0
183 Runtime_Bad_Block       PO--C-   100   100   010    -    0
187 Uncorrectable_Error_Cnt -O--CK   100   100   000    -    0
190 Airflow_Temperature_Cel -O--CK   074   059   000    -    26
195 ECC_Error_Rate          -O-RC-   200   200   000    -    0
199 CRC_Error_Count         -OSRCK   099   099   000    -    2
235 POR_Recovery_Count      -O--C-   099   099   000    -    229
241 Total_LBAs_Written      -O--CK   099   099   000    -    103827485223
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      16  Device vendor specific log
0xce       GPL,SL  VS      16  Device vendor specific log
0xde           SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -
# 2  Extended offline    Aborted by host               00%        10         -
# 3  Short offline       Completed without error       00%         7         -
# 4  Short offline       Completed without error       00%         3         -
# 5  Short offline       Completed without error       00%         1         -
# 6  Short offline       Completed without error       00%         0         -
# 7  Short offline       Completed without error       00%        14         -
# 8  Short offline       Completed without error       00%        14         -
# 9  Extended offline    Aborted by host               00%        13         -
#10  Short offline       Completed without error       00%         4         -
#11  Extended offline    Completed without error       00%         9         -
#12  Short offline       Completed without error       00%         2         -
#13  Short offline       Completed without error       00%         2         -
#14  Short offline       Completed without error       00%        13         -
#15  Short offline       Completed without error       00%         3         -
#16  Extended offline    Completed without error       00%         2         -
#17  Short offline       Completed without error       00%         0         -
#18  Short offline       Completed without error       00%        11         -
#19  Extended offline    Completed without error       00%         6         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
Device State:                        Active (0)
Current Temperature:                    26 Celsius
Power Cycle Min/Max Temperature:      ?/ ? Celsius
Lifetime    Min/Max Temperature:      ?/ ? Celsius
Under/Over Temperature Limit Count:   0/282907

SCT Temperature History Version:     2
Temperature Sampling Period:         10 minutes
Temperature Logging Interval:        10 minutes
Min/Max recommended Temperature:      ?/ ? Celsius
Min/Max Temperature Limit:            ?/ ? Celsius
Temperature History Size (Index):    128 (1)

Index    Estimated Time   Temperature Celsius
   2    2024-11-12 03:20     ?  -
 ...    ..(125 skipped).    ..  -
   0    2024-11-13 00:20     ?  -
   1    2024-11-13 00:30    30  ***********

SCT Error Recovery Control:
           Read: Disabled
          Write: Disabled

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            2  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
0x000f  2            0  R_ERR response for host-to-device data FIS, CRC
0x0010  2            0  R_ERR response for host-to-device data FIS, non-CRC
0x0012  2            0  R_ERR response for host-to-device non-data FIS, CRC
0x0013  2            0  R_ERR response for host-to-device non-data FIS, non-CRC
Sadly there is no new firmware with a fix available for this SSD. "EXT0CB6Q" is already the newest version.
smartctl only prints what the drive itself reports via the SMART READ LOG command. It never calls any OS API that returns the uptime. The drive apparently stores its own uptime (the time since its last power-on) here instead of its lifetime. This is a harmless drive firmware bug.
Stop using only smartctl -a. It only includes legacy SMART information; also try -x and check the extended self-test log. It's the responsibility of the drive firmware to run the test properly, please see the FAQ.
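To illustrate the point that the hours value comes straight from the drive: the sketch below decodes a raw 512-byte SMART self-test log sector (log address 0x06) the way a tool would after a SMART READ LOG. The layout used here (2-byte revision, then 21 descriptors of 24 bytes, with a 16-bit little-endian life timestamp at descriptor offset 2) is my understanding of the ATA specification and should be checked against it before relying on this. `decode_selftest_log` is a hypothetical helper, not a smartmontools function, and it returns entries in storage order without using the log's index byte to sort them newest first.

```python
import struct

DESCRIPTOR_SIZE = 24   # bytes per self-test descriptor (assumed layout)
NUM_DESCRIPTORS = 21   # descriptors per 512-byte log sector (assumed layout)


def decode_selftest_log(sector):
    """Decode a raw SMART self-test log sector into (revision, entries)."""
    if len(sector) != 512:
        raise ValueError("expected exactly one 512-byte sector")

    (revision,) = struct.unpack_from("<H", sector, 0)
    entries = []
    for index in range(NUM_DESCRIPTORS):
        offset = 2 + index * DESCRIPTOR_SIZE
        raw = sector[offset:offset + DESCRIPTOR_SIZE]
        if not any(raw):
            continue  # unused descriptor, all bytes zero
        number, status, timestamp, checkpoint, lba = struct.unpack_from(
            "<BBHBI", sector, offset
        )
        entries.append({
            "test_number": number,                  # self-test subcommand
            "status_code": status >> 4,             # upper nibble: result code
            "remaining_percent": (status & 0x0F) * 10,
            "lifetime_hours": timestamp,            # written by the drive firmware
            "failure_checkpoint": checkpoint,
            "first_failing_lba": lba,
        })
    return revision, entries


# Build an artificial sector with one descriptor whose timestamp is 14,
# mimicking the entries discussed in this ticket.
sector = bytearray(512)
struct.pack_into("<H", sector, 0, 1)                  # revision number 1
struct.pack_into("<BBHBI", sector, 2, 1, 0x00, 14, 0, 0)

revision, entries = decode_selftest_log(bytes(sector))
print(revision, entries[0]["lifetime_hours"])
```

Nothing in this decoding step consults the host's clock or uptime; whatever 16-bit number the firmware wrote into the descriptor is what ends up in the LifeTime(hours) column.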