Changes between Version 27 and Version 28 of BadBlockHowto


Timestamp:
May 21, 2023, 12:02:47 PM (11 months ago)
Author:
Artoria2e5
Comment:

AWRE works on unrecoverable too

  • BadBlockHowto

    v27 v28  
    542542=== Bad block reassignment ===
    543543
    544 The SCSI disk command set and associated disk architecture are assumed in this section. SCSI disks have their own logical to physical mapping allowing a damaged sector (usually carrying 512 bytes of data) to be remapped irrespective of the operating system, file system or software RAID being used.
    545 
    546 The terms ''block'' and ''sector'' are used interchangeably, although block tends to get used in higher level or more abstract contexts such as a ''logical block''.
    547 
    548 When a SCSI disk is formatted, defective sectors identified during the manufacturing process (the so-called primary list: PLIST), those found during the format itself (the certification list: CLIST), those given explicitly to the format command (the DLIST) and optionally the previous grown list (GLIST) are not used in the logical block map. The number (and low level addresses) of the unmapped sectors can be found with the `READ DEFECT DATA` SCSI command.
     544The SCSI disk command set and associated disk architecture are assumed in this section. SCSI disks have their own logical to physical mapping allowing a damaged sector (usually carrying 512 bytes of data) to be remapped irrespective of the operating system, file system or software RAID being used. The terms ''block'' and ''sector'' are used interchangeably, although block tends to get used in higher level or more abstract contexts such as a ''logical block''.
     545
     546When a SCSI disk is formatted (see the `sg_format` command), defective sectors identified during the manufacturing process (the so-called primary list: PLIST), those found during the format itself (the certification list: CLIST), those given explicitly to the format command (the DLIST) and optionally the previous grown list (GLIST) are not used in the logical block map. The number (and low level addresses) of the unmapped sectors can be found with the `READ DEFECT DATA` SCSI command.
    549547
    550548SCSI disks tend to be divided into zones which have spare sectors and perhaps spare tracks, to support the logical block address mapping process. The idea is that if a logical block is remapped, the heads do not have to move a long way to access the replacement sector. Note that spare sectors are a scarce resource.
     
    563561Unrecoverable errors will cause a ''medium error'' sense key, perhaps with some useful additional sense information. If the extended background self test includes a full disk read scan, one would expect the self test log to list the bad block, as shown in section [#Repairsinafilesystem Repairs in a file system]. Recent SCSI disks with a periodic background scan should also list unrecoverable read errors (and some recoverable errors as well). The advantage of the background scan is that it runs to completion while self tests will often terminate at the first serious error.
    564562
    565 SCSI disks expect unrecoverable errors to be fixed manually using the `REASSIGN BLOCKS SCSI` command since loss of data is involved. It is possible that an operating system or a file system could issue the `REASSIGN BLOCKS` command itself but the authors are unaware of any examples. The `REASSIGN BLOCKS` command will reassign one or more blocks, attempting to (partially ?) recover the data (a forlorn hope at this stage), fetch an unused spare sector from the current zone while adding the damaged old sector to the GLIST (hence the name ''grown'' list). The contents of the GLIST may not be that interesting but `smartctl` prints out the number of entries in the grown list and if that number grows quickly, the disk may be approaching the end of its useful life.
     563SCSI disks expect unrecoverable errors to be fixed in one of two ways (SBC-4, section 4.13.1):
     564
     5651. Simple overwriting, if the `AWRE` bit is set. If the write succeeds, no remapping happens. If it fails, the disk reassigns the block automatically (automatic write reassignment, AWRE). SCSI specifically mentions that writing should only be done with valid data (SBC-4, section 6.5.10), e.g. data recovered from RAID.
     5662. The `REASSIGN BLOCKS` SCSI command. This command will reassign one or more blocks, attempting to (partially?) recover the data (a forlorn hope at this stage).
     567
     568In either case, remapping will fetch an unused spare sector from the current zone while adding the damaged old sector to the GLIST (hence the name ''grown'' list). The difference is in the `REASSIGN STATUS` field from Background Scan Results, which describes how a reassignment happened. The contents of the GLIST may not be that interesting but `smartctl` prints out the number of entries in the grown list and if that number grows quickly, the disk may be approaching the end of its useful life.
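The two repair paths can be illustrated with a toy model. This is only a sketch of the bookkeeping described above, not how any real drive firmware works: the `ToyDisk` class, its fields, and the returned status strings are all hypothetical names made up for illustration, and a single zone with an identity logical-to-physical map is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDisk:
    """Hypothetical toy model of one zone: logical blocks plus a pool of spares."""
    num_blocks: int
    spares: list                                # physical addresses of unused spare sectors
    awre: bool = True                           # models the AWRE bit
    bad: set = field(default_factory=set)       # physical sectors on which writes fail
    remap: dict = field(default_factory=dict)   # LBA -> spare physical sector
    glist: list = field(default_factory=list)   # grown defect list (physical addresses)

    def physical(self, lba):
        # A remapped LBA resolves to its spare; otherwise the mapping is identity.
        return self.remap.get(lba, lba)

    def _reassign(self, lba):
        if not self.spares:
            raise RuntimeError("no spare sectors left in this zone")
        self.glist.append(self.physical(lba))   # retire the damaged sector
        self.remap[lba] = self.spares.pop(0)    # fetch an unused spare

    def write(self, lba):
        """Way 1: overwrite; remap only if the write fails and AWRE is set."""
        if self.physical(lba) not in self.bad:
            return "written in place"
        if not self.awre:
            return "write failed"
        self._reassign(lba)
        return "reassigned by AWRE"

    def reassign_blocks(self, lbas):
        """Way 2: explicit reassignment, in the spirit of REASSIGN BLOCKS."""
        for lba in lbas:
            self._reassign(lba)
```

With `disk = ToyDisk(num_blocks=100, spares=[100, 101], bad={7})`, calling `disk.write(7)` returns `"reassigned by AWRE"` and leaves `disk.glist == [7]`; a following `disk.reassign_blocks([9])` uses the second spare and grows the list to `[7, 9]`. Both paths consume a spare and add a GLIST entry, which is why a fast-growing GLIST count is a warning sign.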
     569
     570In the ATA command set, the OS is not given access to such fine-grained control as in SCSI. The equivalent of AWRE nearly always happens, so all you need to do is write over the defective sector.
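Writing over a single defective sector can be sketched as follows. This is a minimal illustration, assuming a Unix-like system, a 512-byte logical block size, and that the caller already knows the LBA and has valid replacement data; the `overwrite_block` helper is a hypothetical name, not part of any existing tool. On a real block device the path would be something like a device node and the call would need root privileges, and whatever was stored in that sector is lost.

```python
import os


def overwrite_block(path, lba, data, block_size=512):
    """Overwrite one logical block at `lba` with `data` (hypothetical helper).

    `path` may be a block device or, for testing, an ordinary file.
    Returns the number of bytes written.
    """
    if len(data) != block_size:
        raise ValueError("data must be exactly one block long")
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite writes at an absolute byte offset without moving the file position.
        written = os.pwrite(fd, data, lba * block_size)
        os.fsync(fd)  # ask the OS to flush the write instead of leaving it cached
    finally:
        os.close(fd)
    return written
```

Trying the helper on a scratch file first, rather than on a disk, is a cheap way to check the offset arithmetic before touching real data.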
    566571
    567572Here is an alternate brute force technique to consider: if the data on the SCSI or ATA disk has all been backed up (e.g. is held on the other disks in a RAID 5 enclosure), then simply reformatting the disk may be the least cumbersome approach. Make sure to disable "quick format" so that the format actually writes to the entire disk!