Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#944 closed defect (worksforme)

WD Hard Drive smartd[484]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 102 to 100

Reported by: Veek
Owned by:
Priority: minor
Milestone:
Component: smartd
Version: 6.6
Keywords:
Cc:

Description

I have been querying my hard disk every 60s for the last few days and plotting temps using hddtemp/rrdtool. It hovers around the 40-42 C mark. https://imgur.com/a/NkIuP

However, smartd reports spikes of 100-102 C multiple times per day. I thought this was erroneous, but when I run

smartctl -a /dev/sda
a different picture emerges:

194 Temperature_Celsius 0x0022 100 096 000 Old_age Always - 43

WORST shows 96 C, which is unusual, so I am concluding that smartd is correct after all. But how is it able to detect the temperature spike when hddtemp does not?

Is this value (from smartd) erroneous? How is it generated, and how accurate is it? Can someone clarify whether my temperature is really spiking to 96 C? Could you shed some light on how all this works?

(I contacted WD, but they just asked me to run a tool of theirs which is not documented very well. I have no idea whether their Quick Test is destructive, so I have asked them about that and am waiting for their reply.)

Change History (3)

comment:1 by Veek, 6 years ago

It seems unlikely that the overall disk temperature could spike from 42 C to 100 C in 1 minute, unless the spike was localized to the thermal diode that measures the temperature, but I don't know enough about such things.

comment:2 by Christian Franke, 6 years ago

Keywords: "WORST spike temperature" removed
Resolution: worksforme
Status: new → closed

This is as expected. See the FAQ and the info about "Raw" and "Normalized" attributes in the -A section of the smartctl man page.

Add the -I 194 directive to suppress tracking of the normalized value. Add -W ... to track temperature. See the smartd.conf man page for details.

comment:3 by Veek, 6 years ago

Just for completeness - I think I figured it out:

I noticed that as my disk temperature changes, the raw value and the normalized value change together:
RAW_VALUE=41 C corresponds to VALUE=102
RAW_VALUE=42 C corresponds to VALUE=101
RAW_VALUE=43 C corresponds to VALUE=100

So WD apparently has an internal setting for ID# 194 that maps VALUE=100 to 43 C, which they probably consider a normal operating temperature for this region (perhaps). The three readings I observed fit VALUE = 143 - temperature in C. My guess is that they only keep track of VALUE and convert it to Centigrade when needed.

Using this mapping, we can figure out what WORST=096 means.
It is 4 below 100, so it is 4 above 43 C, since the normalized value and the Centigrade value move in opposite directions. Therefore 47 C is the highest temperature recorded for my hard disk, not 96 C.
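
The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the assumption that the mapping seen on this one drive is linear (VALUE=100 at 43 C, one normalized step per degree, in opposite directions). Other drives and vendors may normalize attribute 194 differently. The function names and constants are hypothetical and are not part of smartmontools.

```python
# Sketch only: assumes the linear mapping observed on this one drive.
ANCHOR_VALUE = 100    # normalized VALUE observed ...
ANCHOR_CELSIUS = 43   # ... when the raw value read 43 C


def parse_attribute_line(line: str) -> dict:
    """Split one row of the smartctl attribute table into named fields.

    Column order assumed: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED
    WHEN_FAILED RAW_VALUE (as in the row quoted in the description).
    """
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),   # normalized current value
        "worst": int(fields[4]),   # lowest normalized value recorded
        "thresh": int(fields[5]),
        "raw": int(fields[9]),     # raw value, here degrees Celsius
    }


def normalized_to_celsius(value: int) -> int:
    """Convert a normalized value to Celsius using the assumed mapping."""
    return ANCHOR_CELSIUS + (ANCHOR_VALUE - value)


line = "194 Temperature_Celsius 0x0022 100 096 000 Old_age Always - 43"
attr = parse_attribute_line(line)

print("current:", normalized_to_celsius(attr["value"]), "C")  # current: 43 C
print("hottest:", normalized_to_celsius(attr["worst"]), "C")  # hottest: 47 C
```

Under this assumption, the 100-102 that smartd logged are normalized values corresponding to 43-41 C, which agrees with the hddtemp plot.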
