RAID 0 (stripe) set cannot be booted: `write same failed` error

1

I have a Dell precision T5400 workstation with a Dell hardware SAS raid controller and 2 SAS drives in striping.

Since kernel 3.9 was released I seem to be unable to boot properly and get a "Write Same Failed" error that floods my screen. The issue is reported here by another user.

I wanted to change the value of max_write_same_blocks as described in the link above, but I was unable to locate this file anywhere in /sys/devices/.

I do have two entries in the /sys/devices/pci0000:00 folder, 0000:00:1f.1 and 0000:00:1f.2, which contain eight ATA entries in total, but the entries keep moving around between boots.

On one boot, 0000:00:1f.1 might contain the folders ATA1 and ATA2, with host0 and host1 listed inside those folders respectively. On the next boot it might contain ATA7 and ATA8, with an entirely different host listed inside each.

I tried creating an /etc/tmpfiles.d/scsi.conf file that writes the value 0 to max_write_same_blocks, but since the device paths keep changing between boots, it still fails.
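Because the ATA and host numbering changes between boots, a hard-coded path is fragile. The following is only a minimal sketch of a wildcard-based alternative. It assumes the kernel exposes the attribute as /sys/class/scsi_disk/<H:C:T:L>/max_write_same_blocks, which may not hold for every driver; on my machine the search may simply come back empty. The function names and the `sysfs_root` parameter are made up for illustration, and the script has to run as root.

```python
import glob
import os


def find_write_same_files(sysfs_root="/sys"):
    """List every max_write_same_blocks attribute under class/scsi_disk.

    Assumption: the attribute lives at
    <sysfs_root>/class/scsi_disk/<H:C:T:L>/max_write_same_blocks.
    """
    pattern = os.path.join(
        sysfs_root, "class", "scsi_disk", "*", "max_write_same_blocks"
    )
    return sorted(glob.glob(pattern))


def disable_write_same(sysfs_root="/sys"):
    """Write 0 to each attribute found and return the paths that were changed."""
    changed = []
    for path in find_write_same_files(sysfs_root):
        with open(path, "w") as attr:
            attr.write("0")
        changed.append(path)
    return changed


if __name__ == "__main__":
    paths = disable_write_same()
    if not paths:
        print("no max_write_same_blocks attribute found")
    for path in paths:
        print("wrote 0 to", path)
```

Globbing on the SCSI address instead of the PCI/ATA path means the script does not care which host number the controller received on a given boot. An empty result suggests the driver does not expose the attribute at all, in which case this approach cannot help.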

I also tried the following.

  • I tried installing the drivers from Dell (with much difficulty, as they are RPMs and not built for my Arch system).
  • I tried the 3.10 kernel from the testing repos as well; same deal.
  • I tried a mirror instead of a stripe set, but nothing changed.
  • I tried upgrading the firmware of the RAID controller via Windows (I was desperate).

I am desperate at this point, as the only machine left to me is a little Atom netbook :-) The only way to access my machine is via a chroot from the install DVD.

Is there a way to disable write_same globally if the machine doesn't support it? Is this actually a kernel bug that is going to be fixed, or was something permanently changed from kernel 3.9 onwards that breaks my install?

I would appreciate any insight or suggestion you can give me.

Grand Master Tux

Posted 2013-07-08T11:11:10.083

Reputation: 11

Which RAID system are you using? The onboard one? The optional PERC 6? https://patchwork.kernel.org/patch/1898441/ patches drivers/md/dm-table.c, so I assume onboard (ATA?) Intel fake RAID. Is this correct?

– Hennes – 2013-07-08T11:47:43.203

I am using the PERC 6 controller that was fitted to the unit when I bought it; the motherboard has no built-in SAS RAID controller as far as I know. – Grand Master Tux – 2013-07-08T12:17:56.843

Can you boot an older kernel (one which worked) and then try a new kernel with this patch? I know manually patching a kernel for each new version is not a great solution, but it will allow you to work on something better than your little netbook until a proper solution is found.

– Hennes – 2013-07-08T12:49:56.010

I can indeed use the LTS kernel and boot, but that was not what I was looking for. I will try the patch though. – Grand Master Tux – 2013-07-08T12:59:12.083

The idea is to boot a working kernel (e.g. the LTS one), download the tarball of a modern kernel, patch it, and boot the patched modern kernel until a fix is found. Given that this problem affects several configurations (not just the PERC 6, but also at least some H200s and some 3ware cards), there should be a fix 'soon'. – Hennes – 2013-07-08T15:27:48.313

No answers