Opened 12 days ago

Last modified 8 days ago

#1471 new enhancement

Add Raw Format for Seagate Error Rates (ST4000DM004-2CV104)

Reported by: Kendy Kutzner
Owned by:
Priority: minor
Milestone: Release 7.3
Component: drivedb
Version:
Keywords: hdd
Cc:

Description

The Seagate Barracuda ST4000DM004-2CV104 I own reports error rates in attributes 1, 7, and 195 in an unusual way: the raw value holds the number of errors (16 bits) first, followed by the number of operations (32 bits).

The attached patch adds a new RAWFMT and enables it in the drivedb for my disk.
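For readers who want to check the layout by hand, here is a minimal Python sketch of the split described above. It assumes that "first" means the upper 16 bits of the 48-bit raw value hold the error count and the lower 32 bits hold the operation count; the function name `split_error_rate` and the sample value are made up for illustration and are not part of the patch.

```python
def split_error_rate(raw48: int) -> tuple[int, int]:
    """Split a 48-bit SMART raw value into (errors, operations).

    Assumption: errors are in the upper 16 bits, operations in the
    lower 32 bits, as described in this ticket.
    """
    if not 0 <= raw48 < (1 << 48):
        raise ValueError("raw value must fit in 48 bits")
    errors = raw48 >> 32              # upper 16 bits
    operations = raw48 & 0xFFFFFFFF   # lower 32 bits
    return errors, operations


if __name__ == "__main__":
    # Made-up value for illustration, not taken from a real drive.
    print(split_error_rate(0x0003_0000_2710))  # → (3, 10000)
```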

Attachments (2)

smartmontools-add-error-rate.patch (2.3 KB) - added by Kendy Kutzner 12 days ago.
bug-report.txt (13.6 KB) - added by Kendy Kutzner 8 days ago.
a seagate drive showing peculiar reporting in attributes 1, 7, and 195


Change History (7)

Changed 12 days ago by Kendy Kutzner

comment:1 Changed 12 days ago by Kendy Kutzner

Forgot to add: the scarce sources for my statement about the meaning of the bits are below. I've found nothing really authoritative and/or recent.

http://www.users.on.net/~fzabkar/HDD/Seagate_SER_RRER_HEC.html (and links from here, in particular to comp.sys.ibm.pc.hardware.storage in 2009)

https://www.truenas.com/community/threads/seagate-ironwolf-smart-test-raw_read_error_rate-seek_error_rate.68634/post-470741 (linking to the first source)

https://forums.unraid.net/topic/31038-solved-seagate-with-huge-seek-error-rate-rma/?do=findComment&comment=296103 (linking to the first source)

https://serverfault.com/a/495259

https://yksi.ml/

comment:2 Changed 11 days ago by Christian Franke

Keywords: ata added
Milestone: undecided

Thanks for this patch and the information about the Seagate attributes.

I prefer a more generic quotient syntax for error rates (errors/ops), sorry.

This could already be enabled with -v 1,raw24/raw32:543210 or even -v 1,raw24/raw32 if the reserved attribute byte is always zero.

We could possibly add raw16/raw32 for convenience. Like any addition, it would require extra steps for database merges.
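The difference between the two -v variants mentioned above can be sketched in Python. This is an illustration based on my reading of the byte order syntax in the smartctl man page, not smartmontools code: digits 0 to 5 select raw bytes, r selects the reserved byte, and z inserts a zero byte, most significant byte first. That raw24/raw32 defaults to the order r543210 is an assumption inferred from the remark about the reserved byte. The helper names and sample bytes are hypothetical.

```python
def assemble_raw(order: str, raw: bytes, reserved: int = 0) -> int:
    """Build an integer from attribute bytes, most significant byte first."""
    if len(raw) != 6:
        raise ValueError("expected the 6 raw bytes of a SMART attribute")
    value = 0
    for ch in order:
        if ch == "r":
            byte = reserved        # reserved byte of the attribute entry
        elif ch == "z":
            byte = 0               # explicit zero padding
        elif ch in "012345":
            byte = raw[int(ch)]    # raw byte 0 is the least significant
        else:
            raise ValueError(f"unsupported byte order character: {ch!r}")
        value = (value << 8) | byte
    return value


def raw24_div_raw32(order: str, raw: bytes, reserved: int = 0) -> str:
    """Format the assembled value as 'upper bits/lower 32 bits'."""
    value = assemble_raw(order, raw, reserved)
    return f"{value >> 32}/{value & 0xFFFFFFFF}"


if __name__ == "__main__":
    # Made-up bytes: 3 errors, 10000 operations.
    raw = bytes([0x10, 0x27, 0x00, 0x00, 0x03, 0x00])
    print(raw24_div_raw32("543210", raw))               # 3/10000
    print(raw24_div_raw32("r543210", raw, reserved=0))  # 3/10000
    print(raw24_div_raw32("r543210", raw, reserved=1))  # 65539/10000
```

With a zero reserved byte both orders give the same result; a nonzero reserved byte only affects the variant that includes r, which is why the explicit :543210 suffix is the safer choice.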

comment:3 Changed 11 days ago by Kendy Kutzner

Thanks Christian for your quick reply.

> I prefer a more generic quotient syntax for error rates (errors/ops), sorry.

I don't fully understand what you are asking for here. Can you please elaborate? I'm happy to modify my patch.

The reserved attributes on the drives I have here are indeed zero right now, so -v 1,raw24/raw32 does produce output that is seemingly correct (sample size: 2).

Adding a raw16/raw32 format might be more technically correct. I don't understand the database merge requirements.

Please advise on how to proceed.

PS: some background information. I'm using https://github.com/influxdata/telegraf/tree/master/plugins/inputs/smart to graph some disk attributes, which (for better or worse) parses the output of smartctl.

comment:4 Changed 10 days ago by Christian Franke

Component: all → drivedb
Keywords: hdd added; ata removed
Milestone: undecided → Release 7.3
Summary: Add Raw Format for Seagate Error Rates → Add Raw Format for Seagate Error Rates (ST4000DM004-2CV104)

A new output format would (IMO) not provide much benefit for the end user and would definitely break backward compatibility with drivedb.h branches.

This patch is backward compatible with all still-maintained branches. If it works for you, I will apply it soon:

  • drivedb.h

    @@ -4089,7 +4089,11 @@
           // ST4000DM004-2CV104/0001 (TRIM: no), ST4000DM005-2DP166/0001, ST8000DM004-2CX188/0001
         "ST(2000DM00[589]|3000DM007|4000DM00[45]|6000DM003|8000DM004)-.*",
         "", "",
    +    "-v 1,raw24/raw32:543210 "
    +    "-v 7,raw24/raw32:543210 "
    +    "-v 188,raw16 "
    +    "-v 195,raw24/raw32:543210 "
         "-v 200,raw48,Pressure_Limit "
    -    "-v 188,raw16 -v 240,msec24hour32"
    +    "-v 240,msec24hour32"
       },
       { "Seagate Desktop HDD.15", // tested with ST4000DM000-1CD168/CC43, ST5000DM000-1FK178/CC44,
           // ST6000DM001-1XY17Z/CC48

If possible, please also provide a smartctl -x -a ... output of an affected device.

PS: Telegraf developers might want to try smartctl --json (since 7.0) which could significantly ease parsing.
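As a rough illustration of why the JSON output is easier to consume than the text table, the following Python sketch pulls the formatted raw strings out of a smartctl --json document. The key names (ata_smart_attributes, table, id, raw, string) are written from memory of smartctl's JSON output and should be verified against real output; the function name `raw_strings` and the sample document are hand-written for this example and are not from an actual drive.

```python
import json


def raw_strings(smartctl_json: str) -> dict[int, str]:
    """Map attribute id to its formatted raw string.

    Assumption: attributes live under ata_smart_attributes.table and
    each entry carries raw.string, as recalled from smartctl --json.
    """
    doc = json.loads(smartctl_json)
    table = doc.get("ata_smart_attributes", {}).get("table", [])
    return {attr["id"]: attr["raw"]["string"] for attr in table}


# Hand-written sample, not real drive output.
SAMPLE = """
{"ata_smart_attributes": {"table": [
  {"id": 1, "name": "Raw_Read_Error_Rate",
   "raw": {"value": 12884911888, "string": "3/10000"}},
  {"id": 7, "name": "Seek_Error_Rate",
   "raw": {"value": 250, "string": "0/250"}}
]}}
"""

if __name__ == "__main__":
    for attr_id, text in raw_strings(SAMPLE).items():
        errors, operations = (int(part) for part in text.split("/"))
        print(attr_id, errors, operations)
```

In practice the input would come from something like smartctl --json -A on the device in question rather than from an embedded string.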

comment:5 Changed 8 days ago by Kendy Kutzner

Your patch looks sensible to me. (With the obvious caveat that creating a raw24 out of two bytes is not fully right).

I do agree that telegraf parsing the textual output of smartctl is sub-optimal.

Changed 8 days ago by Kendy Kutzner

Attachment: bug-report.txt added

a seagate drive showing peculiar reporting in attributes 1, 7, and 195
