Linux swap robustness and/or mirroring - kernel 3.2

2

I have 8x 2TB disks in RAID6 with mdadm, and I decided to add a little "slack" partition to each drive in case a replacement drive is not exactly the same size as the drives I use now. The purpose is to avoid the risk of being unable to add a disk to the array because it is too small.

For the slack partitions I simply designated them as swap partitions (can't hurt). From what I understand, Linux does a round robin over those disks when it allocates pages for swap. To the best of my knowledge this means that all my swap space is roughly equivalent to a RAID0 stripe set. If one disk should fail, would it mean that...

A: My entire swap space is corrupted or in an invalid state?

B: Any program that has pages on the swap device that failed is now compromised (or terminated?!)

C: I would be better off running an mdadm RAID10 on the swap partitions and either creating a swapfile on it or swapping directly to the mdX device?!

I would appreciate it if someone could shed some light on how Linux handles swap in case of a failure.

Waxhead

Posted 2013-05-07T18:57:34.677

Reputation: 1 092

Similar to http://serverfault.com/questions/195839/where-should-my-swap-partition-s-live-when-using-software-raid1-performance-lv, but not a duplicate.

– Jonathan Ben-Avraham – 2013-05-07T22:14:06.960

Answers

2

There are no dependencies in the kernel between swap partitions or files, only a low-level mapping of pages within each swap area. Reading pages from swap is a low-level read of several contiguous sectors from the disk. So if one swap area fails, the others are not affected.
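To see how the kernel treats each swap area as a separate entry, you can look at /proc/swaps, which lists every active area with its own priority. As far as I know from swapon(2), pages are only allocated round-robin between areas that share the same priority; otherwise the higher-priority area is used first. The following is a minimal sketch that groups areas by priority. The function name and the sample device names are made up for illustration, not taken from the question.

```python
from collections import defaultdict

# Hypothetical /proc/swaps content; device names and sizes are invented.
SAMPLE = """Filename                Type        Size    Used    Priority
/dev/sda2               partition   1048572 0       5
/dev/sdb2               partition   1048572 0       5
/dev/sdc2               partition   1048572 0       -1
"""

def group_swap_by_priority(proc_swaps_text):
    """Return {priority: [swap area names]} from /proc/swaps-style text."""
    groups = defaultdict(list)
    lines = proc_swaps_text.strip().splitlines()
    for line in lines[1:]:          # first line is the column header
        fields = line.split()
        if len(fields) < 5:         # skip anything that is not a full row
            continue
        name = fields[0]            # first column: device or file path
        priority = int(fields[-1])  # last column: swap priority
        groups[priority].append(name)
    return dict(groups)

groups = group_swap_by_priority(SAMPLE)
for priority in sorted(groups, reverse=True):
    print(priority, groups[priority])
```

On a real machine you would pass the contents of /proc/swaps instead of SAMPLE. Areas that end up alone in their priority group are filled one after another rather than striped.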

If a swap partition fails, nothing will notice until some process takes a hard page fault on a page stored in the failed partition. The likely outcome is that the memory page allocated to receive the swapped-in page will remain zero-filled and the process will segfault. You will likely see "Read-error on swap-device ..." in the kernel log, but other than logging the error the kernel does not mark the swap area as bad. On a failed write to swap, the kernel prints "Write-error on swap-device" in the kernel log and re-dirties the page so that it isn't written to again, but the damage is already done as far as the current process is concerned. There is no code to retry the bad write at another position in the partition or in a different swap partition.
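Since the kernel only logs these errors, the practical way to notice a failing swap area is to watch the kernel log for them. Here is a rough sketch that filters log lines for the two messages quoted above. The function name and the sample log lines (timestamps and the text after "swap-device") are invented for illustration; only the "Read-error on swap-device" and "Write-error on swap-device" prefixes come from the kernel messages mentioned here.

```python
import re

# Matches the two kernel messages quoted in the answer.
SWAP_ERROR = re.compile(r"(Read|Write)-error on swap-device")

# Hypothetical log lines; everything except the message prefix is made up.
SAMPLE_LOG = [
    "[  100.000000] EXT4-fs (md0): mounted filesystem",
    "[  200.000000] Read-error on swap-device (8:2:1234)",
    "[  300.000000] Write-error on swap-device (8:18:5678)",
]

def find_swap_errors(log_lines):
    """Return a list of (kind, line) for swap I/O errors, kind is 'read' or 'write'."""
    hits = []
    for line in log_lines:
        match = SWAP_ERROR.search(line)
        if match:
            hits.append((match.group(1).lower(), line.rstrip("\n")))
    return hits

for kind, line in find_swap_errors(SAMPLE_LOG):
    print(kind, line)
```

In practice you could feed it the output of dmesg split into lines, or run it periodically over the kernel log file your distribution writes.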

There are some folks who swap on RAID1; see the Server Fault post linked in my comment on your question. I find it difficult to believe that this will not negatively affect performance on a system that swaps heavily. Perhaps they don't see it because they have a lot of RAM and their applications rarely swap. In any event, the purpose of RAID is to protect your persistent data, not your swap. It's a little like mounting /tmp on RAID5 and doing a nightly incremental backup. My recommendation is to buy "RAID Edition" (i.e. highest quality) disks and swap on the raw partitions as you are currently doing.

Jonathan Ben-Avraham

Posted 2013-05-07T18:57:34.677

Reputation: 936