Changes between Version 2 and Version 3 of Ticket #658, comment 12


Timestamp: Feb 26, 2016, 3:35:11 PM
Author: Ch.Ris

  • Ticket #658, comment 12

    v2 v3  
    11 11 The error recovery (ERC) time of a drive *must* be shorter than the system's controller timeout. Otherwise, errors will cause a controller reset and the loss of all unwritten data. Unfortunately, many drives by default have very long or disabled timeouts.
    12 12
    13    With redundant RAID hardware or software configurations, a drive timeout shorter than the controller timeout is equally important. Here, resetting an entire drive instead of just retrying the failed block causes entire drives to be marked as unusable, reducing redundancy and performance. Furthermore, errors are highly likely to occur during the re-sync of a drive (in seldom-used areas), and a drive reset during the re-sync can render the entire array unusable. Limiting the drives' recovery timeout also allows for improved error handling in hardware or software RAID environments: instead of waiting for one drive to recover the requested data, it can quickly be read from another (redundant) drive.
       13 With redundant RAID hardware or software configurations, a drive timeout shorter than the controller timeout is equally important. Here, resetting an entire drive instead of just retrying the failed block causes entire drives to be marked as unusable, reducing redundancy and performance. Furthermore, errors are highly likely to occur during the re-sync of a drive (in seldom-used areas), and a drive reset during the re-sync can render the entire array unusable, as all unwritten metadata is lost. Limiting the drives' recovery timeout also allows for improved error handling in hardware or software RAID environments: instead of waiting for one drive to recover the requested data, it can quickly be read from another (redundant) drive.
    14 14
    15 15 ...
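
The rule stated in the comment (drive ERC time strictly shorter than the controller timeout, with a disabled ERC treated as unsafe) could be checked roughly as follows. This is only a minimal sketch: the `DriveTimeouts` type, the function names, the drive names, and the 7 s / 30 s / 120 s values are all hypothetical illustrations, not values taken from the ticket, and how the two timeouts are actually read from a system is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DriveTimeouts:
    """Hypothetical record of the two timeouts discussed in the comment."""
    name: str
    erc_seconds: Optional[float]       # None means ERC is disabled (no upper bound)
    controller_timeout_seconds: float  # time after which the controller resets the drive


def erc_is_safe(drive: DriveTimeouts) -> bool:
    """Return True only if the drive gives up before the controller does."""
    if drive.erc_seconds is None:
        # A disabled ERC means recovery may take arbitrarily long,
        # so the controller timeout can always be exceeded.
        return False
    return drive.erc_seconds < drive.controller_timeout_seconds


def unsafe_drives(drives: List[DriveTimeouts]) -> List[str]:
    """Names of drives whose recovery may outlast the controller timeout."""
    return [d.name for d in drives if not erc_is_safe(d)]


# Illustrative values only (assumptions, not measurements).
drives = [
    DriveTimeouts("disk0", 7.0, 30.0),    # recovers well within the timeout
    DriveTimeouts("disk1", None, 30.0),   # ERC disabled
    DriveTimeouts("disk2", 120.0, 30.0),  # very long ERC
]
print(unsafe_drives(drives))  # prints ['disk1', 'disk2']
```

A strict comparison is used because a drive whose recovery time equals the controller timeout could still be reset before it reports the failed block.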