Version 7 (modified by Christian Franke, 14 years ago)

Add note about drivedb.h update

Using smartctl with Samsung F4 EcoGreen drives may result in data loss

WARNING: Do not use smartmontools with these drives!

This warning also applies to other tools (like hdparm) which use IDENTIFY DEVICE to obtain drive information.

2010-11-24, updated 2010-11-30: Here are the details as reported by German c't magazine and on our mailing list:

  • Affected disk: SAMSUNG SpinPoint F4 EcoGreen 2TB

Device Model: SAMSUNG HD204UI
Firmware Version: 1AQ10001

  • Problem: If the system writes to this disk and smartctl -a (5.40) is used at the same time, write errors are reported and bad blocks appear on the disk.

This was reported by a reader and could be reproduced in the c't magazine lab on the following system:

Windows 7 x64 Ultimate
Core i3-560
Intel H55 chipset
SATA-AHCI-Driver: Intel Rapid Storage Technology (RST) 9.6

  • It could also be reproduced under Linux (Fedora 14) if AHCI is enabled and the following commands are run in parallel:
    # badblocks -svw -b 4096 /dev/sdd 4000000
    # smartctl -a /dev/sdd 
  • It could not be reproduced if the Intel H55 chipset is set to IDE mode.
  • It could also be reproduced on another system with an AMD chipset under Windows and drivers msahci.sys or amdsata.sys.
  • It could not be reproduced on the same AMD system and amdsata.sys driver with the following other disks:

SAMSUNG HD153WI (F3 EcoGreen)
SAMSUNG SP2504C (P120)

  • It could also be reproduced on a system with Intel P45 chipset.
  • It could also be reproduced on an AMD-based system with NVIDIA nForce 520 chipset.
  • It could also be reproduced if only smartctl -i is used. This command sends only one ATA command to the disk: IDENTIFY DEVICE. No SMART functionality is used then.
  • It could also be reproduced with hdparm -I on Linux.

2010-11-24: The drive database file (drivedb.h) for smartmontools 5.39.X and 5.40 has been updated. A warning is printed if such a drive is detected. Please note that the warning may come too late, because the IDENTIFY DEVICE command used to detect the drive is the actual problem.
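Since a warning derived from IDENTIFY DEVICE data comes after the risky command has already been sent, it may be preferable to look at the model string the kernel cached when the disk was first detected. The following is a minimal sketch, assuming a Linux system that exposes the disk through the SCSI layer with the cached model readable from /sys/block/<dev>/device/model. Reading that file should not require a new command to the drive. The function name is_affected_model and the default device name sda are hypothetical, and only the model is compared, not the firmware version.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given model string names the
# affected drive (SAMSUNG HD204UI).
is_affected_model() {
    case "$1" in
        *"SAMSUNG HD204UI"*) return 0 ;;
        *) return 1 ;;
    esac
}

dev=${1:-sda}                              # assumed device name
model_file="/sys/block/$dev/device/model"  # assumed sysfs location

if [ -r "$model_file" ]; then
    model=$(cat "$model_file")
    if is_affected_model "$model"; then
        echo "WARNING: $dev is a $model; avoid smartctl and hdparm -I on it"
    else
        echo "$dev ($model) is not the affected model"
    fi
else
    echo "cannot read $model_file"
fi
```

The pattern match tolerates leading or trailing padding, which the kernel may add to the cached string.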

2010-11-26, updated 2010-11-30: The info has been published on heise online news (German).

How to reproduce

2010-11-30: We could reproduce the problem.

Tested on an Intel-based system with P35 chipset under Linux (grml 2010.04 Live CD). NCQ and disk write cache are enabled.

# uname -a
Linux grml.somewhere 2.6.33-grml #1 SMP PREEMPT Fri Apr 2 10:16:25 UTC 2010 i686 GNU/Linux

# smartctl -i -q noserial /dev/sda
smartctl 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen,

Device Model:     SAMSUNG HD204UI
Firmware Version: 1AQ10001
User Capacity:    2,000,398,934,016 bytes

# cat /sys/block/sda/device/queue_depth

# hdparm -W /dev/sda
 write-caching =  1 (on)

First run one of these commands in another terminal window:

# watch -n 1 smartctl -i /dev/sda

or

# watch -n 1 hdparm -I /dev/sda

With one of the above commands running concurrently, the problem can be reproduced as follows:

# dd if=/dev/zero of=/dev/sda count=1000000  
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 12.7394 s, 40.2 MB/s

# badblocks -vw -b 512 -t 0x55 /dev/sda 1000000
Checking for bad blocks in read-write mode
From block 0 to 1000000
Testing with pattern 0x55: done                                
Reading and comparing: 36608
Pass completed, 256 bad blocks found.

# od -A x -x -N 100000b /dev/sda
  000000 5555 5555 5555 5555 5555 5555 5555 5555
 1180000 0000 0000 0000 0000 0000 0000 0000 0000
 1188000 5555 5555 5555 5555 5555 5555 5555 5555
 a7c0000 0000 0000 0000 0000 0000 0000 0000 0000
 a7c8000 5555 5555 5555 5555 5555 5555 5555 5555
12810000 0000 0000 0000 0000 0000 0000 0000 0000
12818000 5555 5555 5555 5555 5555 5555 5555 5555
1ab80000 0000 0000 0000 0000 0000 0000 0000 0000
1ab88000 5555 5555 5555 5555 5555 5555 5555 5555

The above suggests that the disk sometimes discards a pending 64-sector write command when an IDENTIFY DEVICE command is received. This data loss occurs silently. There is no error message in the kernel log, SMART Error log, NCQ Command Error log page, or SATA Phy Event Counters log page.

Please note that the badblocks command reported "256 bad blocks" in the above test because the data read differs from the data written before. None of the tests resulted in actual bad (unreadable) blocks on the disk. Testing did not damage the disk itself. The problem is that new data already sent to the disk may not be written. Previously written data is not affected.
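The figures in the test output can be cross-checked with shell arithmetic. The sketch below takes the start and end offsets of each zero-filled region from the od output and converts them to 512-byte sectors. The variable names are illustrative and not part of the original report.

```shell
#!/bin/sh
# Offsets (hex) of the zero-filled regions, copied from the od output.
# Each pair is "start:end", where end is the offset at which the 0x55
# pattern resumes.
gaps="1180000:1188000 a7c0000:a7c8000 12810000:12818000 1ab80000:1ab88000"

sector_size=512   # matches "badblocks -b 512" in the test
total=0
for gap in $gaps; do
    start=${gap%:*}
    end=${gap#*:}
    sectors=$(( (0x$end - 0x$start) / sector_size ))
    echo "gap at 0x$start: $sectors sectors"
    total=$(( total + sectors ))
done
echo "total: $total sectors"
```

Each region comes out at 64 sectors, and the four regions add up to the 256 blocks reported by badblocks.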

The problem could not be reproduced with the above test if any of the following conditions are met:

  • Disk write cache is disabled.
  • NCQ is disabled. This may not hold in all cases, as the c't lab also reported problems with NCQ disabled.
  • A modified test version of smartctl which does not issue IDENTIFY DEVICE commands is used. Then all other SMART and non-SMART commands used by smartctl work without any data loss.
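The first two conditions can be set from the command line. The sketch below is only an illustration: it prints the commands by default and runs them when RUN=1 is set. It assumes the hdparm -W0 option for turning off the disk write cache and the Linux sysfs queue_depth attribute, where a depth of 1 turns off NCQ for the device. The helper name apply, the RUN variable, and the default device name sda are hypothetical. Since hdparm may itself query the drive with IDENTIFY DEVICE, this should only be done while no writes to the disk are in progress.

```shell
#!/bin/sh
# Sketch: put an affected disk into a state in which the test above
# did not reproduce the data loss.
dev=${1:-sda}   # assumed device name

# Print the command, and execute it only when RUN=1 (needs root).
apply() {
    echo "+ $1"
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$1"
    fi
}

# Disable the disk write cache.
apply "hdparm -W0 /dev/$dev"
# Limit the queue depth to 1, which turns off NCQ for this device.
apply "echo 1 > /sys/block/$dev/device/queue_depth"
```

Neither setting is a guaranteed fix: the c't lab reported problems with NCQ disabled as well.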

Christian Franke

If you have additional info, please report it to the smartmontools-support mailing list.
