Changes between Version 3 and Version 4 of SamsungF4EGBadBlocks


Timestamp:
Nov 30, 2010, 4:47:00 PM (13 years ago)
Author:
Christian Franke
Comment:

Add my test results, update warning

  • SamsungF4EGBadBlocks

    v3 v4  
    1 = Using smartctl with Samsung F4 !EcoGreen drives may result in bad blocks =
     1= Using smartctl with Samsung F4 !EcoGreen drives may result in data loss =
    22
    33'''WARNING: Do not use smartmontools with these drives! '''
    44
     5'''This warning also applies to other tools (like hdparm) which use IDENTIFY DEVICE to obtain drive information.'''
     6
    57----
    6 2010-11-24, updated 2010-11-29: Here are the details as reported by German c't magazine and [http://sourceforge.net/mailarchive/forum.php?thread_name=201011251136.03927.michael%40trunner.de&forum_name=smartmontools-support on our mailing list]:
     82010-11-24, updated 2010-11-30: Here are the details as reported by German c't magazine and [http://sourceforge.net/mailarchive/forum.php?thread_name=201011251136.03927.michael%40trunner.de&forum_name=smartmontools-support on our mailing list]:
    79
    810* Affected disk: SAMSUNG !SpinPoint F4 !EcoGreen 2TB
     
    4648* It could also be reproduced if only {{{smartctl -i}}} is used. This command sends only one ATA command to the disk: IDENTIFY DEVICE. No SMART functionality is used then.
    4749
     50* It could also be reproduced with {{{hdparm -I}}} on Linux.
     51
    4852----
    4953
    50 2010-11-26: The info is published on [http://www.heise.de/newsticker/meldung/SMART-Tool-beschaedigt-Daten-auf-Samsung-Festplatte-1143120.html heise online news] (German).
     542010-11-26, updated 2010-11-30: The info is published on [http://www.heise.de/newsticker/meldung/SMART-Tool-beschaedigt-Daten-auf-Samsung-Festplatte-Update-1143120.html heise online news] (German).
     55
     56----
     57
     582010-11-30: We could reproduce the problem.
     59
     60Tested on an Intel-based system with P35 chipset under Linux ([http://grml.org/ grml] 2010.04 Live CD). NCQ and disk write cache are enabled.
     61
     62{{{
     63# uname -a
     64Linux grml.somewhere 2.6.33-grml #1 SMP PREEMPT Fri Apr 2 10:16:25 UTC 2010 i686 GNU/Linux
     65
     66# smartctl -i -q noserial /dev/sda
     67smartctl 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
     68Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
     69
     70=== START OF INFORMATION SECTION ===
     71Device Model:     SAMSUNG HD204UI
     72Firmware Version: 1AQ10001
     73User Capacity:    2,000,398,934,016 bytes
     74...
     75
     76# cat /sys/block/sda/device/queue_depth
     7731
     78
     79# hdparm -W /dev/sda
     80/dev/sda:
     81 write-caching =  1 (on)
     82}}}
     83
     84First run one of these commands in another terminal window:
     85{{{
     86# watch -n 1 smartctl -i /dev/sda
     87}}}
     88or:
     89{{{
     90# watch -n 1 hdparm -I /dev/sda
     91}}}
     92
     93With one of the above commands running concurrently, the problem can be reproduced as follows:
     94{{{
     95# dd if=/dev/zero of=/dev/sda count=1000000 
     961000000+0 records in
     971000000+0 records out
     98512000000 bytes (512 MB) copied, 12.7394 s, 40.2 MB/s
     99
     100# badblocks -vw -b 512 -t 0x55 /dev/sda 1000000
     101Checking for bad blocks in read-write mode
     102From block 0 to 1000000
     103Testing with pattern 0x55: done                               
     104Reading and comparing: 36608
     105...
     10636671
     107107200
     108...
     109107263
     110169984
     111...
     112170047
     113245824
     114...
     115245887
     116321216
     117...
     118343615
     119606336
     120...
     121606399
     122875520
     123...
     124875583
     125done                               
     126Pass completed, 256 bad blocks found.
     127
     128# od -A x -x -N 1000000b /dev/sda
     129  000000 5555 5555 5555 5555 5555 5555 5555 5555
     130*
     131 1180000 0000 0000 0000 0000 0000 0000 0000 0000
     132*
     133 1188000 5555 5555 5555 5555 5555 5555 5555 5555
     134*
     135 a7c0000 0000 0000 0000 0000 0000 0000 0000 0000
     136*
     137 a7c8000 5555 5555 5555 5555 5555 5555 5555 5555
     138*
     13912810000 0000 0000 0000 0000 0000 0000 0000 0000
     140*
     14112818000 5555 5555 5555 5555 5555 5555 5555 5555
     142*
     1431ab80000 0000 0000 0000 0000 0000 0000 0000 0000
     144*
     1451ab88000 5555 5555 5555 5555 5555 5555 5555 5555
     146*
     1471e848000
     148}}}
     149
     150The above suggests that the disk sometimes discards a pending 64-sector write command when an IDENTIFY DEVICE command is received. This data loss occurs silently. There is no error message in the kernel log, SMART Error log, NCQ Command Error log page, or SATA Phy Event Counters log page.
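
The 64-sector figure can be checked directly from the od output above. The following is a minimal sketch, not part of the original test: it takes the hexadecimal byte offsets at which the pattern changes (transcribed by hand from the transcript) and converts each zero-filled region into a start sector and a length, assuming 512-byte sectors as used with `badblocks -b 512`. The names `OD_LINES` and `zero_runs` are made up for this illustration.

```python
SECTOR_SIZE = 512  # bytes per sector, matching `badblocks -b 512` (assumption)

# (offset, pattern) pairs transcribed from the od output above. od prints
# hexadecimal byte offsets because of `-A x`. The last entry is the end offset.
OD_LINES = [
    ("000000", "5555"),
    ("1180000", "0000"),
    ("1188000", "5555"),
    ("a7c0000", "0000"),
    ("a7c8000", "5555"),
    ("12810000", "0000"),
    ("12818000", "5555"),
    ("1ab80000", "0000"),
    ("1ab88000", "5555"),
    ("1e848000", None),
]


def zero_runs(lines):
    """Yield (first_sector, sector_count) for each zero-filled region."""
    for (start_hex, pattern), (end_hex, _) in zip(lines, lines[1:]):
        if pattern == "0000":
            start = int(start_hex, 16)
            end = int(end_hex, 16)
            yield start // SECTOR_SIZE, (end - start) // SECTOR_SIZE


runs = list(zero_runs(OD_LINES))
for first, count in runs:
    print(f"sectors {first}..{first + count - 1}: {count} sectors still zero")
print("total:", sum(count for _, count in runs), "sectors")
```

Each zero-filled region is 0x8000 bytes long, that is 64 sectors, and the four regions add up to the 256 bad blocks reported by badblocks.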
     151
     152The problem could '''not''' be reproduced with the above test if any of the following conditions are met:
     153
     154* Disk write cache is disabled.
     155
     156* NCQ is disabled. This may not hold in general, because the c't lab also reported problems with NCQ disabled.
     157
     158* A modified test version of smartctl which does not issue IDENTIFY DEVICE commands is used. Then all other SMART and non-SMART commands used by smartctl work without any data loss.
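
The first two conditions can be read from the same commands shown in the test setup above. The sketch below is only an illustration: it parses the text of `/sys/block/sda/device/queue_depth` and the output of `hdparm -W` and reports whether a system matches the configuration in which the problem was reproduced. It assumes that a queue depth greater than 1 means NCQ is in use, which is an assumption about the Linux libata driver and not something stated in the report. The function names are hypothetical.

```python
import re


def parse_queue_depth(text):
    """Parse the contents of /sys/block/<dev>/device/queue_depth."""
    return int(text.strip())


def parse_write_cache(hdparm_output):
    """Return True if `hdparm -W` output reports the write cache as on."""
    match = re.search(r"write-caching\s*=\s*(\d+)", hdparm_output)
    if match is None:
        raise ValueError("no write-caching line found")
    return match.group(1) != "0"


def matches_test_conditions(queue_depth_text, hdparm_w_output):
    """True if both NCQ (assumed: queue depth > 1) and write cache are on."""
    ncq_enabled = parse_queue_depth(queue_depth_text) > 1
    cache_enabled = parse_write_cache(hdparm_w_output)
    return ncq_enabled and cache_enabled


# Values taken from the transcript above.
print(matches_test_conditions("31\n", "/dev/sda:\n write-caching =  1 (on)\n"))
```

A `False` result only means that the system differs from the tested configuration. It does not show that the drive is safe, since the c't lab also reported problems with NCQ disabled.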
     159
     160Christian Franke
    51161
    52162----