
My understanding of SCSI timeouts is that read, write, flush, and other commands each have a limited time to complete. If that time is exceeded, the command is aborted and an error is reported to the upper layer. While waiting for the command to complete, any application depending on that I/O will stall.

My next layer is mdraid, the Linux software RAID. From what I have read, mdraid has no timeouts of its own but relies on the lower layer to time out commands.

The default SCSI timeout value is 90 seconds for Kernel 3.2 (Debian).
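On Linux, that per-command timer is exposed per device in sysfs. A minimal sketch; the device name `sda` and the 7-second value are assumptions, and the sysfs root is passed as a parameter here only so the helpers can be tried out without real hardware (on a real system it is `/sys`):

```shell
#!/bin/sh
# Read and set the SCSI command timer (in seconds) via sysfs.
# $1 = sysfs root (normally /sys), $2 = block device name, e.g. sda
get_scsi_timeout() { cat "$1/block/$2/device/timeout"; }
set_scsi_timeout() { echo "$3" > "$1/block/$2/device/timeout"; }

# Typical use on a real system (as root); not persistent across reboots:
#   get_scsi_timeout /sys sda
#   set_scsi_timeout /sys sda 7
```

Note the setting is per device and reverts on reboot, so it would need to be reapplied at boot (e.g. from an init script or udev rule).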

A hard disk encountering a read error will try hard to correct it within a time frame defined by its firmware. That timeout is set high for desktop drives (typically stand-alone, so error correction has high priority) and low for server drives (typically in a RAID, so the drive reports the bad sector quickly and lets another drive answer). On some drives it can be adjusted via smartctl (SCT ERC, TLER, etc.).
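For drives that support SCT ERC, the current setting can be queried and changed with smartctl. A sketch, assuming a drive at /dev/sda; the values are in tenths of a second, and not every drive accepts the set command:

```shell
# Query the current SCT ERC read/write timeouts (if supported by the drive)
smartctl -l scterc /dev/sda

# Set both read and write ERC to 7.0 seconds (70 tenths of a second);
# on many drives this does not persist across power cycles
smartctl -l scterc,70,70 /dev/sda
```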

So I guess that if an HDD is set to a high ERC timeout, the kernel will wait 90 seconds by default before aborting the request. Only then will mdraid redirect the application's request to another disk.

90 seconds is a loooong time for a webpage to load.

Is it correct to assume the default SCSI timeout is meant for desktop purposes or for non-HDD SCSI equipment (tape drives and tape libraries come to mind), and is it safe to tune it down to, say, 7 seconds for RAID usage?

korkman

1 Answer


Suitability depends on your needs. For you, it sounds like 90 seconds is not a good fit.

I have seen vendor documentation in the past recommending that HBA timeouts be set above 60 seconds in order to better handle things like array failover, controller firmware updates, and the like. The downside is, as you point out, that it can lead to very long lags before the storage returns a result.

And actually that's not a bad thing. Many operating systems will forcibly dismount a LUN if it gets HBA timeouts on it, which can be far more disruptive than an occasional long lag to return a block. The trick is to balance the following:

  • Your storage stack's likelihood of producing long lags
  • Your tolerance of late data
  • Your tolerance of dismounted LUNs

In general, the disks you put into a RAID array should have a low timeout value since it lets the RAID controller know to handle the block request elsewhere. This is one big reason why consumer-grade drives are a bad idea when used with hardware RAID cards; their timeouts are very long, which can lead to just the problem you don't want.
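For Linux mdraid specifically, the kernel-side half of that tuning can be scripted across all members of an array. A sketch; the array name `md0` and the 7-second value are assumptions, and the sysfs root is a parameter only so the function can be dry-run against a fake tree:

```shell
#!/bin/sh
# Lower the SCSI command timer for every member disk of an md array.
# $1 = sysfs root (normally /sys), $2 = md array name, $3 = timeout in seconds
lower_md_member_timeouts() {
    sysfs=$1; md=$2; secs=$3
    for slave in "$sysfs/block/$md/slaves/"*; do
        [ -e "$slave" ] || continue
        disk=$(basename "$slave")
        disk=${disk%%[0-9]*}   # strip a partition suffix, e.g. sda1 -> sda
        timeout_file="$sysfs/block/$disk/device/timeout"
        [ -w "$timeout_file" ] && echo "$secs" > "$timeout_file"
    done
}

# Typical use on a real system (as root); not persistent across reboots:
#   lower_md_member_timeouts /sys md0 7
```

The partition-suffix stripping is a simple heuristic for sdX-style names; a hook like this would need to run at every boot since the sysfs value does not survive a reboot.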

sysadmin1138
  • Reading this http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/task_controlling-scsi-command-timer-onlining-devices.html carefully, I think I underestimated the scope of SCSI command timeouts. Setting a low timeout would turn bad sectors ultimately into HBA resets, affecting the whole array attached. Right? – korkman Apr 17 '12 at 20:53
  • @korkman It's called a loop-reset and they do happen. Linux has been pretty good about it the last several years, but losing all storage attached to that SCSI-bus is one of the possible failure modes. – sysadmin1138 Apr 17 '12 at 21:18
  • I did see controller resets happen, but only with RAID cards (Adaptec 5xx5, 6xx5, LSI 9825), and those take minutes to boot. Is this different with HBAs? I feel resetting the bus is an overreaction to communication loss with one device ... – korkman Apr 18 '12 at 16:51
  • **You _don't_ necessarily want your disks to have a low timeout value as it could make you _lose_ data!** In a degraded RAID for example, see [this post](http://forums.storagereview.com/index.php/topic/29208-h/?p=266337) I'm quoting: "The danger in TLER lies that if you lost your redundancy, then if a weak sector occurs that COULD be recovered, TLER will force the drive to STOP TRYING after 7 seconds. If it didn't fix it by then, and you lost your redundancy, then TLER is a harmful property instead of a useful one." – Totor Nov 02 '13 at 16:31