Opened 19 months ago
Closed 15 months ago
#1850 closed enhancement (fixed)
Ignore individual NVME temperature sensors
| Reported by: | Matalonder | Owned by: | Christian Franke |
|---|---|---|---|
| Priority: | minor | Milestone: | Release 7.5 |
| Component: | smartd | Version: | |
| Keywords: | nvme | Cc: | |
Description
I have a Kingston Fury Renegade NVMe SSD, SFYRDK4000G. It reports two temperature sensors:
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 62 Celsius
...
Temperature Sensor 2: 67 Celsius
and as sensors output:
nvme-pci-0100
Adapter: PCI adapter
Composite: +61.9°C (low = -20.1°C, high = +83.8°C)
(crit = +88.8°C)
Sensor 2: +66.8°C
The problem is, only Composite is an actual temperature sensor. Sensor 2 seems to be just a "Highest temperature ever seen" tracking value. It's always 66.8, even when Composite is, like, 25.
I track this drive with -W 5,55,65, because I want to get desktop notifications when it goes over 65, and I figured out that passing the notification-creating script to -M works well enough.
This, however, now causes me to get the notification on every boot, because Sensor 2 is stuck at the highest-ever-seen 66.8 and smartd always uses its value:
Jun 30 15:38:29 hostname smartd[18016]: Device: /dev/disk/by-id/nvme-KINGSTON_SFYRDK4000G_..., Temperature 67 Celsius reached critical limit of 65 Celsius (Min/Max 67/67)
This effectively makes the whole -W flag useless.
So it seems like this behaviour, described in the man page, is messing with me:
For NVMe devices, smartd checks the maximum of the Composite Temperature value and all Temperature Sensor values reported by SMART/Health Information log.
Is there a way to instruct smartd to ignore certain temperature sensor values, or use only the Composite one?
If there isn't, could you consider this enhancement? It seems like a valid use case with no other solution. For now I'll have to pass -W 0,0,0 for this SSD to avoid useless notifications and monitor it manually.
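To make the behaviour described above concrete, here is a minimal Python sketch of the two policies: the documented "maximum of Composite and all sensors" check and a composite-only check. It is not smartd's code. The function names, the `composite_only` switch, and the `>=` comparison against the limits are assumptions made for illustration; the numbers are the ones from this ticket (-W 5,55,65, Sensor 2 stuck at 67).

```python
def temperature_to_check(composite_c, sensors_c, composite_only=False):
    """Pick the temperature (Celsius) that is compared against the -W limits.

    composite_only is a hypothetical switch, not an existing smartd option.
    """
    if composite_only:
        return composite_c
    # Documented behaviour: maximum of Composite and all sensor values.
    return max([composite_c, *sensors_c])


def classify(temp_c, info_limit, crit_limit):
    """Classify a temperature; a limit of 0 disables that check (as in -W 0,0,0)."""
    if crit_limit and temp_c >= crit_limit:
        return "critical"
    if info_limit and temp_c >= info_limit:
        return "info"
    return "ok"


# Composite is a cool 25 C, but Sensor 2 is stuck at 67 C.
stuck = temperature_to_check(25, [67])
print(stuck, classify(stuck, 55, 65))

only = temperature_to_check(25, [67], composite_only=True)
print(only, classify(only, 55, 65))
```

With the maximum policy the stuck sensor always wins, so the critical limit trips regardless of the composite value; with the composite-only policy the same readings stay below both limits.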
Change History (5)
comment:1 by , 19 months ago
| Keywords: | nvme added |
|---|---|
| Milestone: | → undecided |
comment:2 by , 19 months ago
Thank you for the quick answer!
Note that an over-temperature event should be reported by bit 1 of the Critical Warning byte which is checked if -H is set.
Is the temperature level used for that set in device firmware, or can it be customized?
I kind of don't trust the device in this. Its spec says the "max work temp" is 70°, but it seems to report that it's happy with up to 85° (which is the "max storage temp" per the spec). And I'd like to have an earlier warning, anyway, which is why I set it to 65°.
But it's good to know I'll get a warning if it decides to fry itself, even without -W!
comment:3 by , 19 months ago
Is the temperature level used for that set in device firmware, or can it be customized?
The current threshold for the composite temperature is reported by smartctl -c as:
Warning Comp. Temp. Threshold: 85 Celsius
According to NVMe Base Specification 2c, a drive may support customization of thresholds for both the composite temperature and individual sensors via the NVMe Get/Set Features commands (Feature ID 0x04). This is not yet supported by smartctl. On Linux, the nvme set-feature command from nvme-cli (man page nvme-set-feature) should support this, for example.
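As a rough illustration of what a Set Features 0x04 request carries, the following Python sketch packs a threshold into the 32-bit value of Command Dword 11. The field layout used here (threshold in Kelvin in bits 15:0, sensor select in bits 19:16 with 0 meaning the composite temperature, over/under select in bits 21:20) reflects my reading of the NVMe specification and should be checked against the spec before use. The function name and the 273 Kelvin offset are assumptions for this example.

```python
KELVIN_OFFSET = 273  # assumed integer Celsius-to-Kelvin conversion


def temp_threshold_dword11(celsius, sensor=0, under=False):
    """Pack a Temperature Threshold (Feature ID 0x04) value.

    sensor: 0 = composite temperature, 1..8 = individual sensors.
    under:  False = over-temperature threshold, True = under-temperature.
    Hypothetical helper; the bit layout is an assumption, see text above.
    """
    if not 0 <= sensor <= 8:
        raise ValueError("sensor must be 0 (composite) or 1..8")
    tmpth = celsius + KELVIN_OFFSET
    if not 0 <= tmpth <= 0xFFFF:
        raise ValueError("threshold does not fit in 16 bits")
    thsel = 1 if under else 0
    return (thsel << 20) | (sensor << 16) | tmpth


# Composite over-temperature threshold of 85 Celsius (the value reported above).
print(hex(temp_threshold_dword11(85)))

# Over-temperature threshold of 70 Celsius for Temperature Sensor 2.
print(hex(temp_threshold_dword11(70, sensor=2)))
```

Whether a given drive accepts such a request, and whether the setting survives a power cycle, depends on the firmware.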
comment:4 by , 15 months ago
| Milestone: | undecided → Release 7.5 |
|---|---|
| Owner: | set to |
| Status: | new → accepted |
| Summary: | Ignore specific NVME temperature sensor → Ignore individual NVME temperature sensors |
Simplifying topic.

Sorry, no. I don't remember any similar report in the 8+ years since the first NVMe-capable version of smartmontools (6.5, May 2016).
Will be decided later. Always using only the composite temperature would be a simpler solution.
Note that an over-temperature event should be reported by bit 1 of the Critical Warning byte, which is checked if -H is set.
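The bit test mentioned in that note can be sketched in a few lines of Python. This is an illustration only, not smartd's implementation; the constant and function names are made up for the example, and the input is the Critical Warning byte as printed by smartctl (0x00 in the report above).

```python
CRITICAL_WARNING_TEMPERATURE = 1 << 1  # bit 1 of the Critical Warning byte


def temperature_warning_flagged(critical_warning):
    """Return True if bit 1 (temperature threshold crossed) is set."""
    return bool(critical_warning & CRITICAL_WARNING_TEMPERATURE)


print(temperature_warning_flagged(0x00))  # value from the report above
print(temperature_warning_flagged(0x02))  # bit 1 set
```

This flag is driven by the drive's own threshold (85 Celsius for the composite temperature on this device), so it is independent of any limits configured with -W.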