Custom Query (1560 matches)

Results (457 - 459 of 1560)

Ticket Resolution Summary Owner Reporter
#1016 duplicate let smartd check a device (to be monitored) shortly after it's attached calestyo
Description

Hi.

This is kinda related to #1014...

When a removable device (e.g. external HDD/SSD) is added, and smartd is configured to monitor it (e.g. via DEVICESCAN or by explicitly naming the device), then that device should be checked shortly after it's been attached.

The reason is that it's not uncommon for such removable devices to be attached only for a short period of time, and with the default check interval of 30 mins it may easily happen that the device isn't scanned for a long time, with SMART errors remaining unnoticed.

Cheers, Chris.

#1017 duplicate make smartd usable as temperature monitor / replace hddtemp calestyo
Description

Hi.

It seems to me that hddtemp is more or less dead and unmaintained... and since (AFAIK) its temperature reading is also based on SMART, there is not much sense in having both smartd and hddtemp.

For many of my newer devices like SSDs, hddtemp reports "no temp sensor found" or so... while smartmontools works perfectly on them (and displays the temperature).

smartd already seems to have some limited functionality to monitor a device's temperature, namely via: -W DIFF[,INFO[,CRIT]]

There are a number of problems with it:

1) Most importantly, it seems that warnings are not re-sent as they occur (but only once a day?).

For example, I have had a line in smartd.conf like:

/dev/disk/by-id/ata-Samsung_SSD_850_PRO_1TB_S252NXAG910017F -d auto -d removable -n standby,4 -a -W 0,50,55 -m root -M exec /usr/share/smartmontools/smartd-runner

For testing purposes I changed that to -W 0,20,25 and got an alert. Changed it back to -W 0,50,55 (which is fine for that device) and restarted... and then I repeated this (i.e. going back to something that should trigger a warning). However, no further warning.

This behaviour may be reasonable for other smart values, e.g. things like:

  • Wear_Leveling_Count
  • Uncorrectable_Error_Cnt

would typically get only worse and not better again. And things like:

  • ECC_Error_Rate

may increase pretty fast on some devices (one such value does on Seagate) and it's perfectly fine for them.

But for temperature monitoring it's IMHO bad: my Samsung SSD, for example, supports (I think) up to 70°C. So I'd like to get a warning at, say, 50°C... but not only the first time per day, because the temperature may decrease again after that (or I just decrease the IO load on the device)... only to rise again shortly after (which I wouldn't notice anymore, as no further warning is sent).

Especially on mobile devices like laptops, temperatures can easily go up and down quite regularly. Therefore it makes sense to send temperature warnings every time they occur (i.e. once per check interval).
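The difference between the two reporting policies can be sketched in a few lines of Python. This is only an illustration of the request, not smartd's actual logic: the function names and the sample readings are made up, and "warn once" stands in for the behaviour described above, where only the first crossing is reported.

```python
from typing import List, Sequence


def warn_once(readings: Sequence[int], threshold: int) -> List[int]:
    """Indices of checks that warn if only the first crossing is reported."""
    for i, temp in enumerate(readings):
        if temp >= threshold:
            return [i]
    return []


def warn_every_check(readings: Sequence[int], threshold: int) -> List[int]:
    """Indices of checks that warn if every check at/above threshold is reported."""
    return [i for i, temp in enumerate(readings) if temp >= threshold]


# Hypothetical temperatures (°C), one value per check interval.
readings = [45, 52, 48, 47, 56, 58, 49]

print(warn_once(readings, 50))         # only the first rise is reported
print(warn_every_check(readings, 50))  # the second rise is reported as well
```

With the once-only policy the second rise (56°C, 58°C) goes unreported, which is exactly the case described for laptops above.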

2) Devices typically also have a minimum operating temperature.

This is typically pretty low, so I'm not sure whether it can even be monitored properly (=> do the temp sensors of the disks give reasonable values for such low temps?)... but if they can, it would be nice if smartd also monitored for a minimum temperature.

3) smartmontools should know the max[/min] temperatures of the devices.

*If* smartd were to become a replacement / alternative to hddtemp, it would of course be nice if it came with a database of max[/min] temperatures for known devices. Example: my Samsung SSD (according to Samsung) operates in some range between 0-70°C. My HDDs take much less (~50°C or so? would need to look it up). So it would be nice if there were a DB that automatically selects reasonable values, like for the SSD in my case: INFO at 60°C, CRIT at 70°C.
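A rough sketch of what such a lookup could look like, in Python. Everything here is hypothetical: smartmontools has no such table that I know of, the names TEMP_DB and suggest_w_directive are invented, and the single entry just uses the 0-70°C range quoted above (not a verified datasheet value). INFO is simply placed a fixed margin below the maximum.

```python
import re
from typing import NamedTuple, Optional


class TempLimits(NamedTuple):
    min_c: int  # minimum operating temperature in °C
    max_c: int  # maximum operating temperature in °C


# Hypothetical database: model-name pattern -> operating range.
# The one entry uses the range quoted in this ticket, nothing more.
TEMP_DB = [
    (re.compile(r"Samsung SSD 850 PRO"), TempLimits(0, 70)),
]


def suggest_w_directive(model: str, margin: int = 10) -> Optional[str]:
    """Return a '-W 0,INFO,CRIT' string for a known model, else None."""
    for pattern, limits in TEMP_DB:
        if pattern.search(model):
            info = limits.max_c - margin  # informational message threshold
            crit = limits.max_c           # critical warning threshold
            return f"-W 0,{info},{crit}"
    return None


print(suggest_w_directive("Samsung SSD 850 PRO 1TB"))
print(suggest_w_directive("Some Unknown Disk"))
```

For unknown models the function returns None, so the caller would have to fall back to whatever the user configured by hand.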

Cheers, Chris.

#1018 fixed allow for constant re-sending of warnings Christian Franke calestyo
Description

Hi.

This kinda overlaps with #1017.

Right now, AFAIU, smartd will emit a warning only once per day. E.g. if my temperature rises over the threshold, I get the warning, but no further one if it does so again (or simply if it stays over the threshold all the time).

AFAIU it's the same for other attributes: e.g. if a reallocated block shows up, I'll get a warning (per day)... but if after that first occurrence (on that day) the number of reallocated blocks increases further, I would not get one until the next day.

Now, having one reallocated block may not be considered that alarming, so on the first occurrence I would perhaps simply say... well... forget about it. But if more were to come in a short time, I'd probably start with backups/replacement immediately.

However, due to the behaviour that such a warning comes only once a day... I wouldn't even notice the further incidents.

For that reason, I think it would be nice if one could set the interval in which new warnings are emitted, e.g. "on every case"... or "every 10 mins".
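To make the request concrete, here is a minimal sketch of such a configurable interval in Python. It is not how smartd implements anything; the class name WarningThrottle and its parameter are invented for illustration. An interval of 0 corresponds to "on every case", 600 seconds to "every 10 mins".

```python
from typing import Optional


class WarningThrottle:
    """Allow a warning at most once per min_interval_s seconds."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self._last_sent_s: Optional[float] = None

    def should_send(self, now_s: float) -> bool:
        """Return True (and record the time) if a warning may be sent now."""
        if (self._last_sent_s is None
                or now_s - self._last_sent_s >= self.min_interval_s):
            self._last_sent_s = now_s
            return True
        return False


# The problem persists across five checks, 5 minutes apart (times in seconds).
check_times = [0, 300, 600, 900, 1200]

every_10_min = WarningThrottle(600)
print([every_10_min.should_send(t) for t in check_times])

every_case = WarningThrottle(0)
print([every_case.should_send(t) for t in check_times])
```

Each monitored condition (temperature, reallocated blocks, ...) would need its own throttle instance, so that one noisy attribute doesn't suppress warnings for another.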

Thanks, Chris.
