#680 closed enhancement (wontfix)
"LifeTime(hours)" wrap around in SMART's self-test log
Reported by: | C_Aulbert | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | smartctl | Version: | 6.4 |
Keywords: | ata | Cc: |
Description (last modified by )
Hi
I've just seen that on a few quite old disks, the 16-bit counter for "LifeTime(hours)" wraps around after a few years, even though the power-on hours don't.
I'm not sure if this is caused by the SMART specs or by a problem with querying the raw values within smartmontools, but it would be nice to have a fix in smartmontools so that the LifeTime values in the logs grow uniformly. This matters to us because we use the difference between the latest entry and attribute no. 9 to determine whether a regular check was not performed.
What do you think?
Cheers
Carsten
smartctl 6.5 2016-04-02 r4276 [x86_64-linux-3.18.26-atlas] (daily-20160402)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Ultrastar A7K1000
Device Model:     Hitachi HUA721075KLA330
Serial Number:    GTE200P8G1Z07E
LU WWN Device Id: 5 000cca 215c0e504
Firmware Version: GK8OA70M
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 T13/1532D revision 1
Local Time is:    Mon Apr 4 13:07:04 2016 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x85) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                (11471) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 191) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   130   130   054    Pre-fail  Offline      -       151
  3 Spin_Up_Time            0x0007   118   118   024    Pre-fail  Always       -       542 (Average 472)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       82
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   132   132   020    Pre-fail  Offline      -       33
  9 Power_On_Hours          0x0012   091   091   000    Old_age   Always       -       66329
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       82
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       868
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       868
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       36 (Min/Max 8/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       676         -
# 2  Extended offline    Completed without error       00%       508         -
# 3  Extended offline    Completed without error       00%       340         -
# 4  Extended offline    Completed without error       00%       172         -
# 5  Extended offline    Completed without error       00%         4         -
# 6  Extended offline    Completed without error       00%     65372         -
# 7  Extended offline    Completed without error       00%     65205         -
# 8  Extended offline    Completed without error       00%     64868         -
# 9  Extended offline    Completed without error       00%     64700         -
#10  Extended offline    Completed without error       00%     64532         -
#11  Extended offline    Completed without error       00%     64484         -
#12  Extended offline    Completed without error       00%     12930         -
#13  Extended offline    Completed without error       00%     12762         -
#14  Extended offline    Completed without error       00%     12594         -
#15  Extended offline    Completed without error       00%     12426         -
#16  Extended offline    Completed without error       00%     12258         -
#17  Extended offline    Completed without error       00%     12090         -
#18  Extended offline    Completed without error       00%     11922         -
#19  Extended offline    Completed without error       00%     11754         -
#20  Extended offline    Completed without error       00%     11586         -
#21  Extended offline    Completed without error       00%     11418         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
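The check described in the ticket (comparing the newest self-test entry against attribute 9) can be made tolerant of the wrap by taking the difference modulo 2^16. The following is only a minimal sketch in Python, not smartmontools code. It assumes that the raw value of attribute 9 is a plain hour count (which, as noted in the comments, does not hold for every drive) and that fewer than 65536 hours have passed since the last test. The function name `hours_since_last_test` is hypothetical.

```python
WRAP = 1 << 16  # the self-test log stores LifeTime(hours) in a 16-bit field


def hours_since_last_test(power_on_hours: int, last_test_lifetime: int) -> int:
    """Hours elapsed since the newest self-test entry, tolerant of 16-bit wrap.

    Assumptions (not guaranteed by every drive):
      * power_on_hours is attribute 9's raw value as a plain hour count
      * fewer than 65536 hours have passed since that self-test
    """
    # Python's % always returns a non-negative result for a positive modulus,
    # so this works whether or not the log counter has wrapped.
    return (power_on_hours - last_test_lifetime) % WRAP


# Values from the smartctl output in this ticket:
# Power_On_Hours = 66329, newest self-test LifeTime(hours) = 676
print(hours_since_last_test(66329, 676))  # 117
```

With the values from this disk, the newest test ran 117 hours ago, which is consistent with the roughly weekly (168 hour) spacing of the earlier entries.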
Change History (6)
comment:1 by , 8 years ago
Component: | all → smartctl |
---|---|
Description: | modified (diff) |
Keywords: | ata added |
Milestone: | → undecided |
follow-up: 4 comment:2 by , 8 years ago
Type: | defect → enhancement |
---|
comment:3 by , 8 years ago
I agree with chrfranke, and I also want to add that relying on attribute 9 is not an option: different drives use different formats for it, and sometimes it is simply buggy or does not work as it should due to firmware issues.
comment:4 by , 8 years ago
OK, if the ATA specs are too old for such a disk age, I guess you are right. Let's ignore this edge case - although the disks are really good and fine.
If you think it would be a more or less trivial patch to check for the overflow in the logs and account for it, I would very much like to see that in the output, but I don't really know how much effort it would take or whether it would be worth it. And I guess I won't be able to persuade Bruce to look into the code again ;)
With respect to the version used: Well, I thought that with the system-installed version (5.41, please don't hit me too hard) I would only create a bad initial reputation here, so I just used the very latest static build you graciously provided.
comment:5 by , 8 years ago
Milestone: | undecided |
---|---|
Resolution: | → wontfix |
Status: | new → closed |
This is a limitation of the ATA self-test logs. A possible workaround has only very limited use cases. See comments above for details.
Actually the ATA spec is the problem: The Life timestamp field is only 16 bits wide in both variants of the SMART self-test log.
Overflow could only be detected if an old test from before the wrap (with a timestamp of up to 65535 hours) is still in the log. I'm not sure whether this is worth the effort for rare disks with a power-on time of at least ~7.5 years (65536 hours).
I leave the ticket open as 'undecided' for now.
BTW:
Thanks for testing our daily builds :-)
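The overflow detection discussed above can be sketched as follows. This is only an illustration in Python, not smartmontools code, and `unwrap_lifetimes` is a hypothetical helper. It assumes the log is listed newest first, as in the smartctl output, and that consecutive tests are less than 65536 hours apart. It can only recover wraps that occur between entries still present in the log, which is exactly the limitation described above.

```python
WRAP = 1 << 16  # width of the Life timestamp field in the self-test log


def unwrap_lifetimes(newest_first):
    """Undo 16-bit wrap-around in a list of LifeTime(hours) values.

    newest_first: raw log values, newest entry first (smartctl order).
    Returns a list in the same order, with WRAP added for every wrap seen.
    Values are relative to the oldest entry, so wraps that happened before
    the oldest entry still in the log cannot be recovered.
    """
    offset = 0
    previous = None
    unwrapped = []
    # Walk from oldest to newest: time only moves forward, so a drop in the
    # raw value means the 16-bit counter wrapped in between.
    for raw in reversed(newest_first):
        if previous is not None and raw < previous:
            offset += WRAP
        unwrapped.append(raw + offset)
        previous = raw
    unwrapped.reverse()
    return unwrapped


# First seven entries of the self-test log from this ticket
log = [676, 508, 340, 172, 4, 65372, 65205]
print(unwrap_lifetimes(log))
# [66212, 66044, 65876, 65708, 65540, 65372, 65205]
```

For this disk the newest unwrapped entry is 66212 hours, 117 hours below the Power_On_Hours raw value of 66329. Once the last pre-wrap entry drops out of the 21-entry log, the same function returns the raw values unchanged.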