Opened 8 months ago

Last modified 8 months ago

#1300 new enhancement

Man page provides no clue how to interpret NVMe error log

Reported by: xypron
Owned by:
Priority: minor
Milestone: unscheduled
Component: smartctl
Version:
Keywords: nvme
Cc: xypron

Description

sudo smartctl -a /dev/nvme0

reports a lot of errors

=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 960 PRO 512GB
...
Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0       1181     0  0x0015  0x421a  0x028            0     0     -
  1       1180     0  0x0013  0x4212  0x028            0     0     -
  2       1179     0  0x0015  0x421a  0x028            0     0     -
  3       1178     0  0x0013  0x4212  0x028            0     0     -

Unfortunately, neither the smartctl output nor the man page provides any clue how to interpret this output.

I think the documentation should describe each of the error log columns and link to the documents that define the values seen here.
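To illustrate the kind of column description I have in mind, here is a rough Python sketch that splits one row of the table above into named fields and decodes the PELoc column. The column meanings and the PELoc bit layout (low 8 bits = byte offset within the 64-byte command, next 3 bits = bit offset) are my reading of the Error Information log entry in the NVMe specification, not something the smartctl documentation states. The helper names `parse_row` and `decode_peloc` are made up for this example.

```python
# Column names exactly as printed by smartctl in the table above.
# Assumed meanings (my reading of the NVMe spec, not smartctl docs):
#   ErrCount = error count, SQId = submission queue ID, CmdId = command ID,
#   Status = status field, PELoc = parameter error location,
#   NSID = namespace ID, VS = vendor specific information available.
COLUMNS = ["Num", "ErrCount", "SQId", "CmdId", "Status", "PELoc", "LBA", "NSID", "VS"]


def parse_row(line):
    """Split one row of the error table into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d" % (len(COLUMNS), len(values)))
    # int(v, 0) accepts both decimal ("1181") and hex ("0x421a"); "-" means no value.
    return {name: (None if v == "-" else int(v, 0)) for name, v in zip(COLUMNS, values)}


def decode_peloc(peloc):
    """Return (byte offset, bit offset) of the offending command parameter (assumed layout)."""
    return peloc & 0xFF, (peloc >> 8) & 0x7


row = parse_row("  0       1181     0  0x0015  0x421a  0x028            0     0     -")
byte, bit = decode_peloc(row["PELoc"])
# Each command dword is 4 bytes, so byte // 4 gives the dword index.
print("byte %d, bit %d, command dword %d" % (byte, bit, byte // 4))
```

If this reading is right, PELoc 0x028 points at byte 40 of the command, which is the start of command dword 10.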

Change History (4)

comment:1 Changed 8 months ago by xypron

Cc: xypron added

comment:2 Changed 8 months ago by xypron

sudo nvme error-log /dev/nvme0

provides a bit more information

Error Log Entries for device:nvme0 entries:64
.................
 Entry[ 0]   
.................
error_count  : 1181
sqid         : 0
cmdid        : 0x15
status_field : 0x421a(FEATURE_NOT_SAVEABLE: The Feature Identifier specified does not support a saveable value)
parm_err_loc : 0x28
lba          : 0
nsid         : 0
vs           : 0
cs           : 0
.................
 Entry[ 1]   
.................
error_count  : 1180
sqid         : 0
cmdid        : 0x13
status_field : 0x4212(INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested)
parm_err_loc : 0x28
lba          : 0
nsid         : 0
vs           : 0
cs           : 0
.................

The nvme-cli tool is available here: https://github.com/linux-nvme/nvme-cli.

As both smartmontools and nvme-cli are under GPLv2, it should be easy to copy the status field texts to smartmontools.

comment:3 Changed 8 months ago by xypron

Here are some commands that provoke one of the errors I observed:

sudo nvme sanitize-log /dev/nvme0
NVMe status: INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested(0x2109)

sudo nvme self-test-log /dev/nvme0
NVMe status: INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested(0x2109)

sudo nvme endurance-log /dev/nvme0
NVMe status: INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested(0x2109)

sudo nvme ana-log /dev/nvme0
NVMe status: INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested(0x2109)

sudo nvme changed-ns-list-log /dev/nvme0
NVMe status: INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested(0x2109)

sudo nvme telemetry-log /dev/nvme0 -o foo
NVMe status: INVALID_LOG_PAGE: The log page indicated is invalid. This error condition is also returned if a reserved log page is requested(0x2109)
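The 0x2109 printed here looks like the same status as the 0x4212 in the error log, just without the lowest bit. Below is a rough Python sketch of how I believe the Status column can be decoded. The bit layout (bit 0 = phase tag, then 8 bits of status code, 3 bits of status code type, and the More and Do Not Retry flags) is my reading of the NVMe specification. The lookup table contains only the two names that appear in the nvme-cli output quoted in this ticket, and `decode_status` is a made-up helper name.

```python
# Status code names taken from the nvme-cli output quoted in this ticket.
# This is NOT a complete table; it only covers status code type 1
# (command specific), which is what both observed values decode to.
COMMAND_SPECIFIC = {
    0x09: "INVALID_LOG_PAGE",
    0x0D: "FEATURE_NOT_SAVEABLE",
}


def decode_status(raw):
    """Decode the 16-bit Status value from the error log (assumed layout)."""
    status = raw >> 1                 # drop the phase tag in bit 0
    sc = status & 0xFF                # status code
    sct = (status >> 8) & 0x7         # status code type
    name = COMMAND_SPECIFIC.get(sc, "unknown") if sct == 1 else "unknown"
    return {
        "status": status,
        "sc": sc,
        "sct": sct,
        "more": bool(status & 0x2000),  # More flag
        "dnr": bool(status & 0x4000),   # Do Not Retry flag
        "name": name,
    }


for raw in (0x4212, 0x421A):
    d = decode_status(raw)
    print("0x%04x -> 0x%04x sct=%d sc=0x%02x %s" % (raw, d["status"], d["sct"], d["sc"], d["name"]))
```

Under these assumptions, 0x4212 decodes to 0x2109 (INVALID_LOG_PAGE) and 0x421a to 0x210d (FEATURE_NOT_SAVEABLE), which matches the texts nvme-cli prints.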

comment:4 Changed 8 months ago by Christian Franke

Component: changed from all to smartctl
Keywords: nvme added
Milestone: unscheduled
Summary: changed from "Man page provides no clue how to interpret error log" to "Man page provides no clue how to interpret NVMe error log"