Changes between Version 18 and Version 19 of BadBlockHowto


Timestamp:
04/11/2017 12:53:40 PM (3 years ago)
Author:
Gabriele Pohl
Comment:

Move section about LVM repairs to chapter "case studies"

  • BadBlockHowto

    v18 v19  
    499499For creating, destroying, resizing, checking and copying partitions, and the file systems on them, GNU's `parted` is worth examining. The [http://www.tldp.org/HOWTO/Large-Disk-HOWTO.html Large Disk HOWTO] is also a useful resource.
    500500
     501=== Bad block reassignment ===
     502
     503The SCSI disk command set and associated disk architecture are assumed in this section. SCSI disks have their own logical to physical mapping allowing a damaged sector (usually carrying 512 bytes of data) to be remapped irrespective of the operating system, file system or software RAID being used.
     504
     505The terms ''block'' and ''sector'' are used interchangeably, although ''block'' tends to be used in higher-level or more abstract contexts such as a ''logical block''.
     506
     507When a SCSI disk is formatted, defective sectors identified during the manufacturing process (the so-called primary list: PLIST), those found during the format itself (the certification list: CLIST), those given explicitly to the format command (the DLIST) and optionally the previous grown list (GLIST) are not used in the logical block map. The number (and low-level addresses) of the unmapped sectors can be found with the SCSI `READ DEFECT DATA` command.
     508
     509SCSI disks tend to be divided into zones which have spare sectors and perhaps spare tracks, to support the logical block address mapping process. The idea is that if a logical block is remapped, the heads do not have to move a long way to access the replacement sector. Note that spare sectors are a scarce resource.
     510
     511Once a SCSI disk format has completed successfully, other problems may appear over time. These fall into two categories:
     512
     513* recoverable: the Error Correction Codes (ECC) detect a problem but it is small enough to be corrected. Optionally other strategies such as retrying the access may retrieve the data.
     514* unrecoverable: try as it may, the disk logic and ECC algorithms cannot recover the data. This is often reported as a ''medium error''.
     515
     516Other things can go wrong, typically associated with the transport, and they will be reported using a term other than ''medium error''. For example, a disk may decide a read operation was successful, but a computer's host bus adapter (HBA) checking the incoming data detects a CRC error due to a bad cable or termination.
     517
     518Depending on the disk vendor, recoverable errors can be ignored. After all, some disks have up to 68 bytes of ECC above the payload size of 512 bytes so why use up spare sectors which are limited in number ^[#footnote8 [8]]^ ? If the disk can recover the data and does decide to re-allocate (reassign) a sector, then first it checks the settings of the `ARRE` and `AWRE` bits in the read-write error recovery mode page. Usually these bits are set ^[#footnote9 [9]]^ enabling automatic (read or write) re-allocation. The automatic re-allocation may also fail if the zone (or disk) has run out of spare sectors.
     519
     520Another consideration with RAIDs, and applications that require a high data rate without pauses, is that the controller logic may not want a disk to spend too long trying to recover an error.
     521
     522Unrecoverable errors will cause a ''medium error'' sense key, perhaps with some useful additional sense information. If the extended background self test includes a full disk read scan, one would expect the self test log to list the bad block, as shown in section [#Repairsinafilesystem Repairs in a file system]. Recent SCSI disks with a periodic background scan should also list unrecoverable read errors (and some recoverable errors as well). The advantage of the background scan is that it runs to completion while self tests will often terminate at the first serious error.
     523
     524SCSI disks expect unrecoverable errors to be fixed manually using the SCSI `REASSIGN BLOCKS` command since loss of data is involved. It is possible that an operating system or a file system could issue the `REASSIGN BLOCKS` command itself but the authors are unaware of any examples. The `REASSIGN BLOCKS` command reassigns one or more blocks: it attempts to (partially?) recover the data (a forlorn hope at this stage), fetches an unused spare sector from the current zone and adds the damaged old sector to the GLIST (hence the name ''grown'' list). The contents of the GLIST may not be that interesting, but `smartctl` prints out the number of entries in the grown list, and if that number grows quickly, the disk may be approaching the end of its useful life.
     525
     526Here is an alternative brute-force technique to consider: if the data on the SCSI or ATA disk has all been backed up (e.g. is held on the other disks in a RAID 5 enclosure), then simply reformatting the disk may be the least cumbersome approach.
     527
     528==== Example ====
     529
     530Given a ''bad block'', it may still be useful to run the `fdisk` command (if the disk has multiple partitions) to find out which partition is involved, then use `debugfs` (or a similar tool for the file system in question) to find out which, if any, file or other part of the file system may have been damaged. This is discussed in section [#Repairsinafilesystem Repairs in a file system].
     531
     532Then a program that can execute the SCSI `REASSIGN BLOCKS` command is required. In Linux (2.4 and 2.6 series), FreeBSD, Tru64(OSF) and Windows the author's `sg_reassign` utility in the `sg3_utils` package can be used. Also found in that package is `sg_verify`, which can be used to check that a block is readable.
     533
     534Assume that logical block address `1193046` (which is `0x123456` in hex) is corrupt ^[#footnote10 [10]]^ on the disk at `/dev/sdb`. A long self-test command like `smartctl -t long /dev/sdb` may result in log results like this:
     535
     536{{{
     537# smartctl -l selftest /dev/sdb
     538smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
     539Home page is http://smartmontools.sourceforge.net/
     540SMART Self-test log
     541Num  Test              Status            segment  LifeTime  LBA_first_err [SK ASC ASQ]
     542     Description                         number   (hours)
     543# 1  Background long   Failed in segment      -     354           1193046 [0x3 0x11 0x0]
     544# 2  Background short  Completed              -     323                 - [-   -    -]
     545# 3  Background short  Completed              -     194                 - [-   -    -]
     546}}}
     547
     548The `sg_verify` utility can be used to confirm that there is a problem at that address:
     549
     550{{{
     551# sg_verify --lba=1193046 /dev/sdb
     552verify (10):  Fixed format, current;  Sense key: Medium Error
     553 Additional sense: Unrecovered read error
     554  Info fld=0x123456 [1193046]
     555  Field replaceable unit code: 228
     556  Actual retry count: 0x008b
     557medium or hardware error, reported lba=0x123456
     558}}}
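
The same address appears in decimal in the self-test log and in hex in the sense data. As a quick cross-check, the shell's `printf` can convert between the two notations. This is only a convenience sketch and is not part of `sg3_utils`:

```shell
# Convert the LBA between hex and decimal notation.
printf '%d\n' 0x123456      # hex to decimal
printf '0x%x\n' 1193046     # decimal to hex
```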
     559
     560Now the GLIST length is checked before the block reassignment:
     561
     562{{{
     563# sg_reassign --grown /dev/sdb
     564>> Elements in grown defect list: 0
     565}}}
     566
     567And now for the actual reassignment followed by another check of the GLIST length:
     568
     569{{{
     570# sg_reassign --address=1193046 /dev/sdb
     571# sg_reassign --grown /dev/sdb
     572>> Elements in grown defect list: 1
     573}}}
     574
     575The GLIST length has grown by one as expected. If the disk was unable to recover any data, then the ''new'' block at LBA `0x123456` has vendor-specific data in it. The `sg_reassign` utility can also do bulk reassigns; see `man sg_reassign` for more information.
     576
     577The `dd` command could be used to read the contents of the ''new'' block:
     578
     579{{{
     580# dd if=/dev/sdb iflag=direct skip=1193046 of=blk.img bs=512 count=1
     581}}}
     582
     583and a hex editor ^[#footnote11 [11]]^ used to view and potentially change the `blk.img` file. An altered `blk.img` file (or `/dev/zero`) could be written back with:
     584
     585{{{
     586# dd if=blk.img of=/dev/sdb seek=1193046 oflag=direct bs=512 count=1
     587}}}
     588
     589More work may be needed at the file system level, especially if the reassigned block held critical file system information such as a superblock or a directory.
     590
     591Even if a full backup of the disk is available, or the disk has been ''ejected'' from a RAID, it may still be worthwhile to reassign the bad block(s) that caused the problem, or simply format the disk (see `sg_format` in the `sg3_utils` package), and re-use the disk later (not unlike the way a replacement disk from a manufacturer might be used).
     592
     593
     594== Case Studies ==
     595
     596This section is intended to collect step-by-step descriptions of some real-life use cases.
     597
     598=== Recovering a (mostly) unreadable sector of a Notebook HDD ===
     599
     600This was done in March 2016 under Windows 7 using ''Cygwin''^[#footnote12 12.]^ ports of ''GNU ddrescue''^[#footnote13 13.]^ and ''The Sleuth Kit (TSK)''^[#footnote14 14.]^. All commands shown should work similarly on other platforms and with other filesystems.
     601
     602==== Determine Logical Block Address of unreadable sector ====
     603
     604Examine smartctl output:
     605{{{
     606root:~# smartctl -x /dev/sdb
     607smartctl 6.5 2016-02-29 r4227 [x86_64-w64-mingw32-win7-sp1] (daily-20160229)
     608...
     609Model Family:     SAMSUNG SpinPoint MP5
     610Device Model:     SAMSUNG HM640JJ
     611...
     612Firmware Version: 2AK10001
     613User Capacity:    640.135.028.736 bytes [640 GB]
     614Sector Size:      512 bytes logical/physical
     615Rotation Rate:    7200 rpm
     616Form Factor:      2.5 inches
     617...
     618ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
     619...
     620  5 Reallocated_Sector_Ct   PO--CK   252   252   010    -    0
     621...
     622  9 Power_On_Hours          -O--CK   100   100   000    -    251  <=== See Self-test Log below
     623...
     624197 Current_Pending_Sector  -O--CK   100   100   000    -    1    <=== At least 1 bad sector
     625...
     626SMART Extended Comprehensive Error Log Version: 1 (2 sectors)
     627Device Error Count: 351 (device log contains only the most recent 8 errors)
     628...
     629Error 351 [6] occurred at disk power-on lifetime: 251 hours (10 days + 11 hours)
     630  When the command that caused the error occurred, the device was active or idle.
     631
     632  After command completion occurred, registers were:
     633  ER -- ST COUNT  LBA_48  LH LM LL DV DC
     634  -- -- -- == -- == == == -- -- -- -- --
     635  40 -- 51 00 01 00 00 33 3f d8 a6 40 00  Error: UNC 1 sectors at LBA = 0x333fd8a6 = 859822246  <=== Its LBA
     636
     637  Commands leading to the command that caused the error were:
     638  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
     639  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
     640  25 00 00 00 01 00 00 33 3f d8 a6 40 00     00:00:06.924  READ DMA EXT
     641...
     642
     643SMART Extended Self-test Log Version: 1 (2 sectors)
     644Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
     645# 1  Short offline       Completed: read failure       90%       176         859822246  <=== Detected 75 power on hours ago
     646}}}
     647
     648A read scan helps to verify the LBA and checks for other possible bad sectors
     649(alternatively, replace `/dev/null` with a file path to create a disk image):
     650
     651{{{
     652root:~# ddrescue --ask --verbose --binary-prefixes --idirect --force /dev/sdb /dev/null disk.map
     653GNU ddrescue 1.21-rc2
     654About to copy 610480 MiBytes from /dev/sdb [SAMSUNG HM640JJ::...] to /dev/null [0].
     655Proceed (y/N)? y
     656...
     657non-tried:        0 B,     errsize:      512 B,      run time:          2h
     658  rescued: 610480 MiB,      errors:        1,  remaining time:         n/a
     659percent rescued:  99.99%      time since last successful read:         20s
     660Finished
     661}}}
     662
     663The `ddrescue` map file now shows byte ranges of good and bad disk areas:
     664
     665{{{
     666root:~# cat disk.map
     667...
     668#      pos        size      status
     669  0x00000000  0x667FB14C00  +
     6700x667FB14C00    0x00000200  -  <=== 512 bytes unreadable
     6710x667FB14E00  0x2E8B541200  +
     672}}}
     673
     674Translate the byte position to the LBA:
     675
     676{{{
     677root:~# echo $((0x667FB14C00/512))
     678859822246
     679}}}
     680
     681Or convert the map file to a `badblocks`-like list with `ddrescuelog` (part of recent versions of the ''ddrescue'' package):
     682
     683{{{
     684root:~# ddrescuelog --list-blocks=- disk.map
     685859822246
     686}}}
     687
     688Both match the LBA reported by `smartctl`.
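
If `ddrescuelog` is not available, a similar list can be derived from the map file by hand. The sketch below is an illustration only: the function name `bad_lbas_from_map` is made up here, it assumes the three-column `pos size status` layout shown above, and it defaults to 512-byte sectors.

```shell
# Hypothetical helper: print the LBA of every sector marked bad ('-')
# in a ddrescue map file. Sector size defaults to 512 bytes (assumption).
bad_lbas_from_map() {
    map_file=$1
    sector_size=${2:-512}
    while read -r pos size status rest; do
        case $pos in \#*|'') continue ;; esac      # skip comments and blank lines
        [ "$status" = "-" ] || continue            # keep only bad-sector ranges
        lba=$(( $pos / $sector_size ))             # first bad LBA of the range
        count=$(( ($size + $sector_size - 1) / $sector_size ))
        while [ "$count" -gt 0 ]; do               # one output line per bad sector
            echo "$lba"
            lba=$(( lba + 1 ))
            count=$(( count - 1 ))
        done
    done < "$map_file"
}
# Usage: bad_lbas_from_map disk.map
```

For a map file with large unreadable areas this prints one line per sector, so the output can get long.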
     689
     690==== Find affected file ====
     691
     692Get start offset of affected partition:
     693
     694{{{
     695root:~# fdisk --list /dev/sdb
     696...
     697Device     Boot Start        End    Sectors   Size Id Type
     698/dev/sdb1          63 1250258624 1250258562 596.2G  7 HPFS/NTFS/exFAT
     699}}}
     700
     701Get filesystem block (cluster) size if unknown (4096 in many cases):
     702
     703{{{
     704root:~# fsstat /dev/sdb1
     705...
     706File System Type: NTFS
     707...
     708Sector Size: 512
     709Cluster Size: 4096
     710...
     711}}}
     712
     713Calculate the number of the bad cluster as `(BAD_LBA - START_LBA) / SECTORS_PER_CLUSTER`:
     714
     715{{{
     716root:~# echo $(((859822246-63)/8))
     717107477772
     718}}}
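
The integer division above discards the remainder, which tells which sector inside the cluster is affected. A small sketch that reports both values follows; the function name `lba_to_cluster` is hypothetical, and it assumes the partition start and the bad LBA are both counted in the same sector size:

```shell
# Hypothetical helper: map a disk LBA to a cluster number and the
# sector offset inside that cluster.
# Arguments: bad LBA, partition start LBA, sectors per cluster.
lba_to_cluster() {
    rel=$(( $1 - $2 ))                        # sector relative to partition start
    echo "$(( rel / $3 )) $(( rel % $3 ))"    # cluster number, sector within cluster
}
lba_to_cluster 859822246 63 8                 # prints "107477772 7"
```

With the values from this case study, the bad sector is the last of the eight sectors in cluster 107477772.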
     719
     720Find inode (here: MFT entry) used by this cluster:
     721
     722{{{
     723root:~# ifind -d 107477772 /dev/sdb1
     724663-128-2
     725}}}
     726
     727Print some info about this inode:
     728
     729{{{
     730root:~# istat /dev/sdb1 663-128-2
     731...
     732Name: Backup_2015-12-17.zip
     733Parent MFT Entry: 30    Sequence: 1
     734Allocated Size: 4660039680      Actual Size: 4660039516
     735Created:        2015-12-17 13:43:30.460000000 (CET)
     736File Modified:  2015-12-17 13:46:19.647000000 (CET)
     737...
     738Type: $DATA (128-2)   Name: N/A   Non-Resident   size: 4660039516  init_size: 4660039516
     739106950180 106950181 ...
     740...
     741107477772  <=== The bad cluster
     742...
     743108087884
     744}}}
     745
     746Find full path of affected file:
     747
     748{{{
     749root:~# ffind /dev/sdb1 663-128-2
     750/Backups/2015/Backup_2015-12-17.zip
     751}}}
     752
     753If the file is no longer needed, it could be overwritten in place and then removed. This is easy with `shred` from ''GNU coreutils'': `shred --iterations=1 --remove /PATH/TO/FILE`. This should reallocate the bad sector in most cases.
     754
     755==== Try to recover the bad sector ====
     756
     757Start with 100 read retries of the bad sector, write to `recovered.bin` if successful:
     758
     759{{{
     760root:~# ddrescue --ask --verbose --binary-prefixes --idirect --retry=100 \
     761                 --input-position=859822246s --output-position=0 --size=1s \
     762                 /dev/sdb recovered.bin recovered.map
     763...
     764Current status
     765     ipos: 419835 MiB, non-trimmed:        0 B,  current rate:      32 B/s
     766     opos:        0 B, non-scraped:        0 B,  average rate:       4 B/s
     767non-tried:        0 B,     errsize:        0 B,      run time:      1m 49s
     768  rescued:      512 B,      errors:        0,  remaining time:         n/a
     769percent rescued: 100.00%      time since last successful read:          0s
     770Finished
     771}}}
     772
     773We were very lucky:
     774
     775{{{
     776root:~# cat recovered.map
     777...
     778#      pos        size      status
     779  0x00000000  0x667FB14C00  ?
     7800x667FB14C00  0x00000200    +  <=== Now OK!
     7810x667FB14E00  0x2E8B541200  ?
     782}}}
     783
     784Check whether the disk firmware took the chance to reallocate the sector using the recovered data:
     785
     786{{{
     787root:~# dd skip=859822246 count=1 iflag=direct if=/dev/sdb of=test.bin
     788dd: error reading ‘/dev/sdb’: Input/output error
     7890+0 records in
     7900+0 records out
     7910 bytes (0 B) copied, 23.5006 s, 0.0 kB/s
     792}}}
     793
     794No luck in this case. So overwrite the sector manually:
     795
     796{{{
     797root:~# dd seek=859822246 count=1 oflag=direct if=recovered.bin of=/dev/sdb
     7981+0 records in
     7991+0 records out
     800512 bytes (512 B) copied, 1.05331 s, 0.5 kB/s
     801}}}
     802
     803Read data back and check:
     804
     805{{{
     806root:~# dd skip=859822246 count=1 iflag=direct if=/dev/sdb of=test.bin
     8071+0 records in
     8081+0 records out
     809512 bytes (512 B) copied, 0.0211745 s, 24.2 kB/s
     810
     811root:~# diff -s recovered.bin test.bin
     812Files recovered.bin and test.bin are identical
     813}}}
     814
     815Finally, run a SMART self-test and check its result:
     816
     817{{{
     818root:~# smartctl -t short /dev/sdb
     819...
     820Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
     821...
     822Please wait 2 minutes for test to complete.
     823
     824root:~# sleep 120 # :-)
     825
     826root:~# smartctl -x /dev/sdb
     827...
     828ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
     829...
     830  5 Reallocated_Sector_Ct   PO--CK   252   252   010    -    0   <=== Interesting...
     831...
     832  9 Power_On_Hours          -O--CK   100   100   000    -    252
     833...
     834197 Current_Pending_Sector  -O--CK   100   100   000    -    0   <=== As expected
     835...
     836SMART Extended Self-test Log Version: 1 (2 sectors)
     837Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
     838# 1  Short offline       Completed without error       00%       252         -          <=== Works again!
     839# 2  Short offline       Completed: read failure       90%       176         859822246
     840}}}
     841
     842Interestingly, the `Reallocated_Sector_Ct` did not increase. Either the firmware did not record the reallocation or it decided to reuse the original sector.
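
To compare attribute values before and after a repair without reading the whole report, the raw value can be picked out of the attribute table. The following is a rough sketch: `smart_raw_value` is a made-up name, and it assumes the eight-column layout printed by `smartctl -x` in this case study (other output formats put the raw value in a different column).

```shell
# Hypothetical helper: print the RAW_VALUE of one SMART attribute.
# Reads attribute table lines on stdin in the layout shown above:
# ID, name, flags, value, worst, thresh, fail, raw value.
smart_raw_value() {
    awk -v id="$1" '$1 == id { print $8; exit }'
}
# Usage: smartctl -x /dev/sdb | smart_raw_value 197
```

The first line whose first field equals the attribute ID wins, so this relies on the attribute table appearing before any other table that might start a line with the same number.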
     843
     844Done!
    501845
    502846=== LVM repairs ===
     
    504848This section was written by Frederic BOITEUX. It was titled: "HOW TO LOCATE AND REPAIR BAD BLOCKS ON AN LVM VOLUME".
    505849
    506 Smartd reports an error in a short test :
     850Smartd reports an error in a short test:
    507851
    508852{{{
     
    516860So the disk has a bad block located at LBA `37383668`.
    517861
    518 In which physical partition is the bad block ?
     862In which physical partition is the bad block?
    519863
    520864{{{
     
    538882We have to find which LVM2 logical partition the block belongs to.
    539883
    540 In which logical partition is the bad block ?
    541 
    542 ''IMPORTANT'' : LVM2 can use different schemes dividing its physical partitions to logical ones : linear, striped, contiguous or not... The following example assumes that allocation is linear !
     884In which logical partition is the bad block?
     885
     886''IMPORTANT'': LVM2 can use different schemes to divide its physical partitions into logical ones: linear, striped, contiguous or not... The following example assumes that allocation is linear!
    543887
    544888The physical partition used by LVM2 is divided into PE (Physical Extent) units of the same size, starting at `pe_start` 512-byte blocks from the beginning of the physical partition.
    545889
    546 The `pvdisplay` command gives the size of the PE (in KB) of the LVM partition :
    547 
    548 {{{
    549 #  part=/dev/hdb3 ; pvdisplay -c $part | awk -F: '{print $8}'
     890The `pvdisplay` command gives the size of the PE (in KB) of the LVM partition:
     891
     892{{{
     893#  part=/dev/hdb3 ; pvdisplay -c $part | awk -F: '{print $8}'
    5508944096
    551895}}}
    552896
    553 To get its size in LBA block size (512 bytes or 0.5 KB), we multiply this number by 2 : `4096 * 2 = 8192 blocks` for each PE.
    554 
    555 To find the offset from the beginning of the physical partition is a bit more difficult : if you have a recent LVM2 version, try :
     897To express this size in LBA blocks (512 bytes or 0.5 KB each), we multiply this number by 2: `4096 * 2 = 8192 blocks` for each PE.
     898
     899Finding the offset from the beginning of the physical partition is a bit more difficult. If you have a recent LVM2 version, try:
    556900
    557901{{{
     
    559903}}}
    560904
    561 Either, you can look in `/etc/lvm/backup` :
     905Alternatively, you can look in `/etc/lvm/backup`:
    562906{{{
    563907# grep pe_start $(grep -l $part /etc/lvm/backup/*)
     
    565909}}}
    566910
    567 Then, we search in which PE is the badblock, calculating the PE rank in which the faulty block of the partition is : `physical partition's bad block number / sizeof(PE)` =
     911Next, we find which PE contains the bad block by calculating the PE rank of the faulty block of the partition: `physical partition's bad block number / sizeof(PE)` =
    568912{{{
    56991336194858 / 8192 = 4418.3176
    570914}}}
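
The same calculation can be done with shell integer arithmetic, which drops the fractional part and leaves the PE rank directly. This is a sketch using the numbers from this example; the variable names are illustrative. It follows the calculation above and does not subtract `pe_start` before dividing, which could matter for a block very close to a PE boundary.

```shell
# PE size as reported by pvdisplay, in KB (value from the example above).
pe_size_kb=4096
# One KB holds two 512-byte blocks.
pe_size_blocks=$(( pe_size_kb * 2 ))
# Bad block number relative to the start of the physical partition.
bad_block=36194858
# Integer division yields the rank of the PE that contains the block.
pe_index=$(( bad_block / pe_size_blocks ))
echo "$pe_index"
```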
    571915
    572 So we have to find in which LVM2 logical partition is used the PE number 4418 (count starts from 0) :
     916So we have to find which LVM2 logical partition uses PE number 4418 (counting starts from 0):
    573917{{{
    574918# lvdisplay --maps |egrep 'Physical|LV Name|Type'
     
    605949So the PE #4418 is in the `/dev/WDC80Go/ext1` LVM logical partition.
    606950
    607 Size of logical block of file system on `/dev/WDC80Go/ext1` :
    608 
    609 It's a ext3 fs, so I get it like this :
     951Size of the logical block of the file system on `/dev/WDC80Go/ext1`:
     952
     953It's an ext3 fs, so I get it like this:
    610954{{{
    611955# dumpe2fs /dev/WDC80Go/ext1 | grep 'Block size'
     
    614958}}}
    615959
    616 bad block number for the file system :
    617 
    618 The logical partition begins on `PE 3072` :
     960Bad block number for the file system:
     961
     962The logical partition begins on `PE 3072`:
    619963{{{
    620964 (# PE's start of partition * sizeof(PE)) + partition offset[pe_start] =
     
    622966}}}
    623967
    624 512b block of the physical partition, so the bad block number for the file system  is :
     968This is counted in 512-byte blocks of the physical partition, so the bad block number for the file system is:
    625969{{{
    626970(36194858 - 25166208) / (sizeof(fs block) / 512)
     
    628972}}}
    629973
    630 Test of the fs bad block :
     974Test of the fs bad block:
    631975{{{
    632976dd if=/dev/WDC80Go/ext1 of=block1378581 bs=4096 count=1 skip=1378581
    633977}}}
    634978
    635 If this `dd` command succeeds, without any error message in console or syslog, then the block number calculation is probably wrong ! *Don't* go further, re-check it and if you don't find the error, please renounce !
    636 
    637 Search / correction follows the same scheme as for simple partitions :
     979If this `dd` command succeeds, without any error message in console or syslog, then the block number calculation is probably wrong! *Don't* go further: re-check it, and if you don't find the error, please stop here!
     980
     981Search / correction follows the same scheme as for simple partitions:
    638982* find possible impacted files with `debugfs` (`icheck <fs block nb>`, then `ncheck <icheck nb>`).
    639 * reallocate bad block writing zeros in it, ''using the fs block size'' :
     983* reallocate the bad block by writing zeros to it, ''using the fs block size'':
    640984
    641985{{{
     
    643987}}}
    644988
    645 Et voilà !
    646 
    647 === Bad block reassignment ===
    648 
    649 The SCSI disk command set and associated disk architecture are assumed in this section. SCSI disks have their own logical to physical mapping allowing a damaged sector (usually carrying 512 bytes of data) to be remapped irrespective of the operating system, file system or software RAID being used.
    650 
    651 The terms ''block and sector'' are used interchangeably, although block tends to get used in higher level or more abstract contexts such as a ''logical block''.
    652 
    653 When a SCSI disk is formatted, defective sectors identified during the manufacturing process (the so called primary list: PLIST), those found during the format itself (the certification list: CLIST), those given explicitly to the format command (the DLIST) and optionally the previous grown list (GLIST) are not used in the logical block map. The number (and low level addresses) of the unmapped sectors can be found with the `READ DEFECT DATA SCSI` command.
    654 
    655 SCSI disks tend to be divided into zones which have spare sectors and perhaps spare tracks, to support the logical block address mapping process. The idea is that if a logical block is remapped, the heads do not have to move a long way to access the replacement sector. Note that spare sectors are a scarce resource.
    656 
    657 Once a SCSI disk format has completed successfully, other problems may appear over time. These fall into two categories:
    658 
    659 * recoverable: the Error Correction Codes (ECC) detect a problem but it is small enough to be corrected. Optionally other strategies such as retrying the access may retrieve the data.
    660 * unrecoverable: try as it may, the disk logic and ECC algorithms cannot recover the data. This is often reported as a ''medium error''.
    661 
    662 Other things can go wrong, typically associated with the transport and they will be reported using a term other than ''medium error''. For example a disk may decide a read operation was successful but a computer's host bus adapter (HBA) checking the incoming data detects a CRC error due to a bad cable or termination.
    663 
    664 Depending on the disk vendor, recoverable errors can be ignored. After all, some disks have up to 68 bytes of ECC above the payload size of 512 bytes so why use up spare sectors which are limited in number ^[#footnote8 [8]]^ ? If the disk can recover the data and does decide to re-allocate (reassign) a sector, then first it checks the settings of the `ARRE` and `AWRE` bits in the read-write error recovery mode page. Usually these bits are set ^[#footnote9 [9]]^ enabling automatic (read or write) re-allocation. The automatic re-allocation may also fail if the zone (or disk) has run out of spare sectors.
    665 
    666 Another consideration with RAIDs, and applications that require a high data rate without pauses, is that the controller logic may not want a disk to spend too long trying to recover an error.
    667 
    668 Unrecoverable errors will cause a ''medium error'' sense key, perhaps with some useful additional sense information. If the extended background self test includes a full disk read scan, one would expect the self test log to list the bad block, as shown in section [#Repairsinafilesystem Repairs in a file system]. Recent SCSI disks with a periodic background scan should also list unrecoverable read errors (and some recoverable errors as well). The advantage of the background scan is that it runs to completion while self tests will often terminate at the first serious error.
    669 
    670 SCSI disks expect unrecoverable errors to be fixed manually using the `REASSIGN BLOCKS SCSI` command since loss of data is involved. It is possible that an operating system or a file system could issue the `REASSIGN BLOCKS` command itself but the authors are unaware of any examples. The `REASSIGN BLOCKS` command will reassign one or more blocks, attempting to (partially ?) recover the data (a forlorn hope at this stage), fetch an unused spare sector from the current zone while adding the damaged old sector to the GLIST (hence the name ''grown'' list). The contents of the GLIST may not be that interesting but `smartctl` prints out the number of entries in the grown list and if that number grows quickly, the disk may be approaching the end of its useful life.
    671 
    672 Here is an alternate brute force technique to consider: if the data on the SCSI or ATA disk has all been backed up (e.g. is held on the other disks in a RAID 5 enclosure), then simply reformatting the disk may be the least cumbersome approach.
    673 
    674 ==== Example ====
    675 
    676 Given a ''bad block'', it still may be useful to look at the `fdisk` command (if the disk has multiple partitions) to find out which partition is involved, then use `debugfs` (or a similar tool for the file system in question) to find out which, if any, file or other part of the file system may have been damaged. This is discussed in section [#Repairsinafilesystem Repairs in a file system].
    677 
    678 Then a program that can execute the `REASSIGN BLOCKS SCSI` command is required. In Linux (2.4 and 2.6 series), FreeBSD, Tru64(OSF) and Windows the author's `sg_reassign` utility in the `sg3_utils` package can be used. Also found in that package is `sg_verify` which can be used to check that a block is readable.
    679 
    680 Assume that `logical block address 1193046` (which is `123456` in hex) is corrupt ^[#footnote10 [10]]^ on the disk at `/dev/sdb`. A long selftest command like `smartctl -t long /dev/sdb` may result in log results like this:
    681 
    682 {{{
    683 # smartctl -l selftest /dev/sdb
    684 smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
    685 Home page is http://smartmontools.sourceforge.net/
    686 SMART Self-test log
    687 Num  Test              Status            segment  LifeTime  LBA_first_err [SK ASC ASQ]
    688      Description                         number   (hours)
    689 # 1  Background long   Failed in segment      -     354           1193046 [0x3 0x11 0x0]
    690 # 2  Background short  Completed              -     323                 - [-   -    -]
    691 # 3  Background short  Completed              -     194                 - [-   -    -]
    692 }}}
    693 
    694 The `sg_verify` utility can be used to confirm that there is a problem at that address:
    695 
    696 {{{
    697 # sg_verify --lba=1193046 /dev/sdb
    698 verify (10):  Fixed format, current;  Sense key: Medium Error
    699  Additional sense: Unrecovered read error
    700   Info fld=0x123456 [1193046]
    701   Field replaceable unit code: 228
    702   Actual retry count: 0x008b
    703 medium or hardware error, reported lba=0x123456
    704 }}}
    705 
    706 Now the GLIST length is checked before the block reassignment:
    707 
    708 {{{
    709 # sg_reassign --grown /dev/sdb
    710 >> Elements in grown defect list: 0
    711 }}}
    712 
    713 And now for the actual reassignment followed by another check of the GLIST length:
    714 
    715 {{{
    716 # sg_reassign --address=1193046 /dev/sdb
    717 # sg_reassign --grown /dev/sdb
    718 >> Elements in grown defect list: 1
    719 }}}
    720 
    721 The GLIST length has grown by one as expected. If the disk was unable to recover any data, then the ''new'' block at lba `0x123456` has vendor specific data in it. The `sg_reassign` utility can also do bulk reassigns, see `man sg_reassign` for more information.
    722 
    723 The `dd` command could be used to read the contents of the ''new'' block:
    724 
    725 {{{
    726 # dd if=/dev/sdb iflag=direct skip=1193046 of=blk.img bs=512 count=1
    727 }}}
    728 
    729 and a hex editor ^[#footnote11 [11]]^ used to view and potentially change the `blk.img` file. An altered `blk.img` file (or `/dev/zero`) could be written back with:
    730 
    731 {{{
    732 # dd if=blk.img of=/dev/sdb seek=1193046 oflag=direct bs=512 count=1
    733 }}}
    734 
    735 More work may be needed at the file system level, especially if the reassigned block held critical file system information such as a superblock or a directory.
    736 
    737 Even if a full backup of the disk is available, or the disk has been ''ejected'' from a RAID, it may still be worthwhile to reassign the bad block(s) that caused the problem, or simply to format the disk (see `sg_format` in the `sg3_utils` package), and re-use the disk later, much as a replacement disk from a manufacturer might be used.
    738 
    739 
    740 == Case Studies ==
    741 
    742 This section is intended to collect step-by-step descriptions of some real-life use cases.
    743 
    744 === Recovering a (mostly) unreadable sector of a Notebook HDD ===
    745 
    746 This was done in March 2016 under Windows 7 using ''Cygwin''^[#footnote12 12.]^ ports of ''GNU ddrescue''^[#footnote13 13.]^ and ''The Sleuth Kit (TSK)''^[#footnote14 14.]^. All commands shown should work similarly on other platforms and with other filesystems.
    747 
    748 ==== Determine Logical Block Address of unreadable sector ====
    749 
    750 Examine smartctl output:
    751 {{{
    752 root:~# smartctl -x /dev/sdb
    753 smartctl 6.5 2016-02-29 r4227 [x86_64-w64-mingw32-win7-sp1] (daily-20160229)
    754 ...
    755 Model Family:     SAMSUNG SpinPoint MP5
    756 Device Model:     SAMSUNG HM640JJ
    757 ...
    758 Firmware Version: 2AK10001
    759 User Capacity:    640.135.028.736 bytes [640 GB]
    760 Sector Size:      512 bytes logical/physical
    761 Rotation Rate:    7200 rpm
    762 Form Factor:      2.5 inches
    763 ...
    764 ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
    765 ...
    766   5 Reallocated_Sector_Ct   PO--CK   252   252   010    -    0
    767 ...
    768   9 Power_On_Hours          -O--CK   100   100   000    -    251  <=== See Self-test Log below
    769 ...
    770 197 Current_Pending_Sector  -O--CK   100   100   000    -    1    <=== At least 1 bad sector
    771 ...
    772 SMART Extended Comprehensive Error Log Version: 1 (2 sectors)
    773 Device Error Count: 351 (device log contains only the most recent 8 errors)
    774 ...
    775 Error 351 [6] occurred at disk power-on lifetime: 251 hours (10 days + 11 hours)
    776   When the command that caused the error occurred, the device was active or idle.
    777 
    778   After command completion occurred, registers were:
    779   ER -- ST COUNT  LBA_48  LH LM LL DV DC
    780   -- -- -- == -- == == == -- -- -- -- --
    781   40 -- 51 00 01 00 00 33 3f d8 a6 40 00  Error: UNC 1 sectors at LBA = 0x333fd8a6 = 859822246  <=== Its LBA
    782 
    783   Commands leading to the command that caused the error were:
    784   CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
    785   -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
    786   25 00 00 00 01 00 00 33 3f d8 a6 40 00     00:00:06.924  READ DMA EXT
    787 ...
    788 
    789 SMART Extended Self-test Log Version: 1 (2 sectors)
    790 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    791 # 1  Short offline       Completed: read failure       90%       176         859822246  <=== Detected 75 power on hours ago
    792 }}}
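
The LBA in the error log can be cross-checked against the register dump. This sketch assumes that the six `LBA_48` bytes are printed most significant byte first, as they appear to be in the sample above; `lba48_from_bytes` is a hypothetical helper name.

```shell
# Hypothetical helper: join the six LBA_48 register bytes (hex, most
# significant first, as printed in the sample above) into a decimal LBA.
lba48_from_bytes() {
    printf '%d\n' "0x$1$2$3$4$5$6"
}

lba48_from_bytes 00 00 33 3f d8 a6   # prints 859822246

# Power-on hours between the failed self-test (176) and now (251):
echo $((251 - 176))                  # prints 75
```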
    793 
    794 A read scan helps to verify the LBA and to check for other possible bad sectors
    795 (alternatively, replace `/dev/null` with a file path to create a disk image):
    796 
    797 {{{
    798 root:~# ddrescue --ask --verbose --binary-prefixes --idirect --force /dev/sdb /dev/null disk.map
    799 GNU ddrescue 1.21-rc2
    800 About to copy 610480 MiBytes from /dev/sdb [SAMSUNG HM640JJ::...] to /dev/null [0].
    801 Proceed (y/N)? y
    802 ...
    803 non-tried:        0 B,     errsize:      512 B,      run time:          2h
    804   rescued: 610480 MiB,      errors:        1,  remaining time:         n/a
    805 percent rescued:  99.99%      time since last successful read:         20s
    806 Finished
    807 }}}
    808 
    809 The `ddrescue` map file now shows byte ranges of good and bad disk areas:
    810 
    811 {{{
    812 root:~# cat disk.map
    813 ...
    814 #      pos        size      status
    815   0x00000000  0x667FB14C00  +
    816 0x667FB14C00    0x00000200  -  <=== 512 bytes unreadable
    817 0x667FB14E00  0x2E8B541200  +
    818 }}}
    819 
    820 Translate the byte position to the LBA:
    821 
    822 {{{
    823 root:~# echo $((0x667FB14C00/512))
    824 859822246
    825 }}}
    826 
    827 Or convert the map file to a `badblocks`-like list with `ddrescuelog` (part of recent versions of the ''ddrescue'' package):
    828 
    829 {{{
    830 root:~# ddrescuelog --list-blocks=- disk.map
    831 859822246
    832 }}}
    833 
    834 Both match the LBA reported by `smartctl`.
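
Where `ddrescuelog` is not available, a rough equivalent can be sketched in plain shell. This assumes the three-column `pos size status` layout shown above, with `-` marking bad areas; `bad_lbas_from_map` is a hypothetical helper and handles none of the other features of `ddrescuelog`.

```shell
# Hypothetical helper: read a ddrescue map file on stdin and print the LBA
# of every sector that lies in an area with status "-" (bad sector).
# $1 is the sector size in bytes (default 512).
bad_lbas_from_map() {
    sector_size=${1:-512}
    while read -r pos size status rest; do
        # Skip comment lines and anything not starting with a hex position.
        case $pos in 0x*) ;; *) continue ;; esac
        [ "$status" = "-" ] || continue
        first=$(( $pos / $sector_size ))
        count=$(( ($size + $sector_size - 1) / $sector_size ))
        i=0
        while [ "$i" -lt "$count" ]; do
            echo $((first + i))
            i=$((i + 1))
        done
    done
}

# Intended use:
#   bad_lbas_from_map < disk.map
```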
    835 
    836 ==== Find affected file ====
    837 
    838 Get start offset of affected partition:
    839 
    840 {{{
    841 root:~# fdisk --list /dev/sdb
    842 ...
    843 Device     Boot Start        End    Sectors   Size Id Type
    844 /dev/sdb1          63 1250258624 1250258562 596.2G  7 HPFS/NTFS/exFAT
    845 }}}
    846 
    847 Get filesystem block (cluster) size if unknown (4096 in many cases):
    848 
    849 {{{
    850 root:~# fsstat /dev/sdb1
    851 ...
    852 File System Type: NTFS
    853 ...
    854 Sector Size: 512
    855 Cluster Size: 4096
    856 ...
    857 }}}
    858 
    859 Calculate the number of the bad cluster as `(BAD_LBA - START_LBA) / SECTORS_PER_CLUSTER`:
    860 
    861 {{{
    862 root:~# echo $(((859822246-63)/8))
    863 107477772
    864 }}}
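
The same calculation can be written with the sector and cluster sizes as parameters, so that it carries over to other filesystems. This is only an illustrative sketch; `lba_to_cluster` is a hypothetical helper that also prints which sector inside the cluster is affected.

```shell
# Hypothetical helper: map a disk LBA to a filesystem cluster number.
# Arguments: BAD_LBA START_LBA [SECTOR_SIZE=512] [CLUSTER_SIZE=4096]
# Prints: "<cluster number> <sector index inside that cluster>"
lba_to_cluster() {
    bad_lba=$1; start_lba=$2
    sector_size=${3:-512}; cluster_size=${4:-4096}
    spc=$((cluster_size / sector_size))   # sectors per cluster
    rel=$((bad_lba - start_lba))          # sector offset inside the partition
    echo "$((rel / spc)) $((rel % spc))"
}

lba_to_cluster 859822246 63   # prints "107477772 7"
```

In this example the bad sector is the last of the eight sectors in cluster 107477772.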
    865 
    866 Find inode (here: MFT entry) used by this cluster:
    867 
    868 {{{
    869 root:~# ifind -d 107477772 /dev/sdb1
    870 663-128-2
    871 }}}
    872 
    873 Print some info about this inode:
    874 
    875 {{{
    876 root:~# istat /dev/sdb1 663-128-2
    877 ...
    878 Name: Backup_2015-12-17.zip
    879 Parent MFT Entry: 30    Sequence: 1
    880 Allocated Size: 4660039680      Actual Size: 4660039516
    881 Created:        2015-12-17 13:43:30.460000000 (CET)
    882 File Modified:  2015-12-17 13:46:19.647000000 (CET)
    883 ...
    884 Type: $DATA (128-2)   Name: N/A   Non-Resident   size: 4660039516  init_size: 4660039516
    885 106950180 106950181 ...
    886 ...
    887 107477772  <=== The bad cluster
    888 ...
    889 108087884
    890 }}}
    891 
    892 Find full path of affected file:
    893 
    894 {{{
    895 root:~# ffind /dev/sdb1 663-128-2
    896 /Backups/2015/Backup_2015-12-17.zip
    897 }}}
    898 
    899 If the file is no longer needed, it can be overwritten in place and then removed. This is easy with `shred` from ''GNU coreutils'': `shred --iterations=1 --remove /PATH/TO/FILE`. In most cases this makes the drive reallocate the bad sector.
    900 
    901 ==== Try to recover the bad sector ====
    902 
    903 Start with 100 read retries of the bad sector, write to `recovered.bin` if successful:
    904 
    905 {{{
    906 root:~# ddrescue --ask --verbose --binary-prefixes --idirect --retry=100 \
    907                  --input-position=859822246s --output-position=0 --size=1s \
    908                  /dev/sdb recovered.bin recovered.map
    909 ...
    910 Current status
    911      ipos: 419835 MiB, non-trimmed:        0 B,  current rate:      32 B/s
    912      opos:        0 B, non-scraped:        0 B,  average rate:       4 B/s
    913 non-tried:        0 B,     errsize:        0 B,      run time:      1m 49s
    914   rescued:      512 B,      errors:        0,  remaining time:         n/a
    915 percent rescued: 100.00%      time since last successful read:          0s
    916 Finished
    917 }}}
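
As a sanity check, the `ipos` value reported by `ddrescue` should correspond to the LBA passed via `--input-position`. A small sketch, assuming 512-byte sectors; `lba_to_mib` is a hypothetical helper name:

```shell
# Hypothetical helper: byte position of an LBA, in whole MiB.
# Arguments: LBA [SECTOR_SIZE=512]
lba_to_mib() { echo $(( $1 * ${2:-512} / 1048576 )); }

lba_to_mib 859822246   # prints 419835, matching "ipos: 419835 MiB" above
```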
    918 
    919 We were very lucky:
    920 
    921 {{{
    922 root:~# cat recovered.map
    923 ...
    924 #      pos        size      status
    925   0x00000000  0x667FB14C00  ?
    926 0x667FB14C00  0x00000200    +  <=== Now OK!
    927 0x667FB14E00  0x2E8B541200  ?
    928 }}}
    929 
    930 Check whether the disk firmware took the chance to reallocate the sector using the recovered data:
    931 
    932 {{{
    933 root:~# dd skip=859822246 count=1 iflag=direct if=/dev/sdb of=test.bin
    934 dd: error reading ‘/dev/sdb’: Input/output error
    935 0+0 records in
    936 0+0 records out
    937 0 bytes (0 B) copied, 23.5006 s, 0.0 kB/s
    938 }}}
    939 
    940 No luck in this case. So overwrite the sector manually:
    941 
    942 {{{
    943 root:~# dd seek=859822246 count=1 oflag=direct if=recovered.bin of=/dev/sdb
    944 1+0 records in
    945 1+0 records out
    946 512 bytes (512 B) copied, 1.05331 s, 0.5 kB/s
    947 }}}
    948 
    949 Read data back and check:
    950 
    951 {{{
    952 root:~# dd skip=859822246 count=1 iflag=direct if=/dev/sdb of=test.bin
    953 1+0 records in
    954 1+0 records out
    955 512 bytes (512 B) copied, 0.0211745 s, 24.2 kB/s
    956 
    957 root:~# diff -s recovered.bin test.bin
    958 Files recovered.bin and test.bin are identical
    959 }}}
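
The write, read-back and compare steps above could be combined into one function. This is a sketch, not a tested recovery tool: `rewrite_and_verify` is a hypothetical name, it assumes GNU `dd` (for `iflag`/`oflag`) and 512-byte sectors, and writing to the wrong device or LBA destroys data.

```shell
# Hypothetical helper: write one 512-byte sector image to a given LBA,
# read the sector back and compare it with the image.
# Arguments: DEVICE LBA IMAGE [MODE]
# MODE "direct" (default) bypasses the page cache as in the commands above;
# pass "buffered" when trying the function on an ordinary file.
rewrite_and_verify() {
    dev=$1; lba=$2; img=$3; mode=${4:-direct}
    oflag=; iflag=
    if [ "$mode" = direct ]; then
        oflag="oflag=direct"; iflag="iflag=direct"
    fi
    tmp=$(mktemp) || return 1
    # conv=notrunc keeps a regular file from being truncated;
    # it is harmless on a block device.
    dd if="$img" of="$dev" bs=512 seek="$lba" count=1 conv=notrunc $oflag 2>/dev/null &&
    dd if="$dev" of="$tmp" bs=512 skip="$lba" count=1 $iflag 2>/dev/null &&
    cmp -s "$img" "$tmp"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Intended use (as root, after double-checking device and LBA):
#   rewrite_and_verify /dev/sdb 859822246 recovered.bin && echo "sector OK"
```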
    960 
    961 Finally, run a SMART self-test and check its result:
    962 
    963 {{{
    964 root:~# smartctl -t short /dev/sdb
    965 ...
    966 Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    967 ...
    968 Please wait 2 minutes for test to complete.
    969 
    970 root:~# sleep 120 # :-)
    971 
    972 root:~# smartctl -x /dev/sdb
    973 ...
    974 ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
    975 ...
    976   5 Reallocated_Sector_Ct   PO--CK   252   252   010    -    0   <=== Interesting...
    977 ...
    978   9 Power_On_Hours          -O--CK   100   100   000    -    252
    979 ...
    980 197 Current_Pending_Sector  -O--CK   100   100   000    -    0   <=== As expected
    981 ...
    982 SMART Extended Self-test Log Version: 1 (2 sectors)
    983 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    984 # 1  Short offline       Completed without error       00%       252         -          <=== Works again!
    985 # 2  Short offline       Completed: read failure       90%       176         859822246
    986 }}}
    987 
    988 Interestingly, the `Reallocated_Sector_Ct` did not increase. Either the firmware did not record the reallocation or it decided to reuse the original sector.
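
To compare the two counters before and after the repair without reading the whole report, the raw values can be picked out of the attribute table. The sketch below assumes the column layout of the `smartctl -x` output shown above, where the raw value is the eighth field; `smart_raw_value` is a hypothetical helper, and other `smartctl` output formats may need a different field number.

```shell
# Hypothetical helper: read `smartctl -x` output on stdin and print the
# RAW_VALUE of the attribute with the given ID.
# Assumes the layout: ID# NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
smart_raw_value() {
    awk -v id="$1" '$1 == id { print $8; exit }'
}

# Intended use:
#   smartctl -x /dev/sdb | smart_raw_value 197   # Current_Pending_Sector
#   smartctl -x /dev/sdb | smart_raw_value 5     # Reallocated_Sector_Ct
```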
    989 
    990 Done!
    991990
    992991== Footnotes ==