Is there any way, in Linux, to deliberately make a block device report an I/O error, or to simulate one for testing purposes?
Are you simulating a disk failure? Perhaps you could mount a directory and then unmount it while it was in use. – Shef Apr 12 '13 at 20:50
I'd write a little kernel module that you could load with `modprobe`, behaving like a block device, and then another little program that sends `ioctl()`s to the driver to make it return the value you want. – ott-- Apr 12 '13 at 21:06
Same question on [Stack Overflow](http://stackoverflow.com/questions/11228509/generate-a-read-error) and on [Unix and Linux](http://unix.stackexchange.com/questions/77492/special-file-that-causes-i-o-error). – Gilles 'SO- stop being evil' May 29 '13 at 21:30
To follow up on the comment @Gilles made, this was also asked on http://stackoverflow.com/questions/1361518/how-can-i-simulate-a-failed-disk-during-testing (several different fault injection answers) and http://stackoverflow.com/questions/1870696/simulate-a-faulty-block-device-with-read-errors (use device mapper). – Anon Jun 14 '14 at 05:30
5 Answers
Yes, there's a very plausible way to do this with device mapper.
The device mapper can recombine block devices into a new mapping or order of your choosing; LVM does this. It also supports other targets (some of which are quite novel), such as 'flakey' to simulate a failing disk and 'error' to simulate failed regions of a disk.
One can construct a device that deliberately has I/O black holes on it, which report I/O errors whenever they are accessed.
First, create some virtual volume to use as a target and make it addressable as a block device.
dd if=/dev/zero of=/var/lib/virtualblock.img bs=512 count=1048576
losetup /dev/loop0 /var/lib/virtualblock.img
To start, this creates a 512 MiB file that is the basis of our virtual block device, which we will punch a 'hole' in. No hole exists yet, though. If you were to run `mkfs.ext4 /dev/loop0`, you'd get a perfectly valid filesystem.
So, let's use dmsetup, which will take this block device and create a new device with some holes in it. Here is an example first:
dmsetup create errdev0
0 261144 linear /dev/loop0 0
261144 5 error
261149 787427 linear /dev/loop0 261149
This will create a device called 'errdev0' (typically under /dev/mapper). When you type `dmsetup create errdev0`, it waits for the table on stdin and finishes when you enter ^D.
In the example above, we've made a 5-sector hole (2.5 KiB) starting at sector 261144 of the loop device. We then continue through the loop device as normal.
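The same pattern (good region, hole, good region) can be generated for any hole position. Here is a minimal sketch, assuming 512-byte sectors; `hole_table` and its arguments are names I made up for illustration, not part of dmsetup. Each linear segment uses an offset equal to its own start sector, so the data on either side of the hole stays where it was on the loop device.

```shell
#!/bin/bash
# hole_table TOTAL HOLE_START HOLE_LEN [DEVICE]
# Print a device-mapper table that maps DEVICE linearly, except for
# HOLE_LEN sectors starting at HOLE_START, which use the 'error' target.
# (hole_table is a hypothetical helper, not a dmsetup feature.)
hole_table() {
  local total=$1 hole_start=$2 hole_len=$3 dev=${4:-/dev/loop0}
  local hole_end=$((hole_start + hole_len))
  if (( hole_len < 1 || hole_start < 0 || hole_end > total )); then
    echo "hole_table: hole does not fit in the device" >&2
    return 1
  fi
  # Good region before the hole (omitted when the hole starts at sector 0).
  if (( hole_start > 0 )); then
    echo "0 ${hole_start} linear ${dev} 0"
  fi
  echo "${hole_start} ${hole_len} error"
  # Good region after the hole; its offset into DEVICE equals its start.
  if (( hole_end < total )); then
    echo "${hole_end} $((total - hole_end)) linear ${dev} ${hole_end}"
  fi
}

# A 5-sector hole at sector 261144 of a 1048576-sector (512 MiB) device.
hole_table 1048576 261144 5
```

The output can be fed to `dmsetup create errdev0` on stdin (as root) instead of typing the table by hand.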
This script will generate a table that places holes at random locations, spaced roughly 16 MiB apart on average (although it's pretty random).
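#!/bin/bash
# Emit a device-mapper table for /dev/loop0 with one-sector 'error' holes
# at random sectors (each sector has a 1-in-32768 chance of being a hole).
total_sectors=1048576   # 512 MiB in 512-byte sectors
start_sector=0          # first sector of the current good run
good_sector_size=0      # length of the current good run
for ((sector = 0; sector < total_sectors; sector++)); do
  if [[ ${RANDOM} == 0 ]]; then
    # Skip empty good runs: a zero-length target is not a valid table line.
    if (( good_sector_size > 0 )); then
      echo "${start_sector} ${good_sector_size} linear /dev/loop0 ${start_sector}"
    fi
    echo "${sector} 1 error"
    start_sector=$((sector + 1))
    good_sector_size=0
  else
    good_sector_size=$((good_sector_size + 1))
  fi
done
# Map whatever remains after the last hole.
if (( good_sector_size > 0 )); then
  echo "${start_sector} ${good_sector_size} linear /dev/loop0 ${start_sector}"
fi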
The script assumes that you have created a 512 MiB device and that your virtual block device is /dev/loop0.
You can write this output to a text file as a table and pipe it into `dmsetup create errdev0`.
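Before loading a generated table, it can be worth checking that it is well formed: as far as I know, device mapper refuses tables that contain gaps or zero-length targets. The following is only a sketch of such a check; `check_table` is a hypothetical helper that reads a table on stdin and never calls dmsetup itself.

```shell
#!/bin/bash
# check_table: read a device-mapper table on stdin and verify that every
# target starts where the previous one ended and has a non-zero length.
# (check_table is a hypothetical helper; it does not touch any device.)
check_table() {
  awk '
    BEGIN          { expected = 0; bad = 0 }
    $1 != expected { printf "line %d: starts at %s, expected %d\n", NR, $1, expected; bad = 1 }
    $2 < 1         { printf "line %d: zero-length target\n", NR; bad = 1 }
                   { expected = $1 + $2 }
    END            { if (!bad) printf "ok: %d sectors\n", expected; exit bad }
  '
}

printf '%s\n' \
  '0 261144 linear /dev/loop0 0' \
  '261144 5 error' \
  '261149 787427 linear /dev/loop0 261149' | check_table
```

With the table saved to a file (say `table.txt`, a name chosen here for illustration), something like `check_table < table.txt && dmsetup create errdev0 < table.txt` would load it only if the check passes.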
Once you have created the device, you can use it like a normal block device, first by formatting it and then by placing files on it. At some point you should run into I/O problems when you hit sectors that are really I/O holes in the virtual device.
Once you have finished, use `dmsetup remove errdev0` to remove the device.
If you want to make an I/O error more likely, you can add holes more frequently or change the size of the holes you create. Note that putting errors in certain regions is likely to cause problems from the get-go; e.g., with a hole 32 MiB into the device, mkfs can't write a superblock that ext normally puts there, so the format won't work.
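As a rough sketch of what "more frequent" and "bigger" holes could look like, here is a table generator with both as parameters. `random_holes`, its argument names, and the example values are my own assumptions for illustration. `ODDS` must not exceed 32768 because `$RANDOM` only ranges from 0 to 32767.

```shell
#!/bin/bash
# random_holes TOTAL ODDS HOLE_LEN [DEVICE]
# Print a table for DEVICE in which, on average, one sector in ODDS begins
# a hole of HOLE_LEN sectors. (Hypothetical helper; names are illustrative.)
random_holes() {
  local total=$1 odds=$2 hole_len=$3 dev=${4:-/dev/loop0}
  local sector=0 run_start=0 len
  while (( sector < total )); do
    if (( RANDOM % odds == 0 )); then
      # Flush the good run that precedes this hole, if there is one.
      if (( sector > run_start )); then
        echo "${run_start} $((sector - run_start)) linear ${dev} ${run_start}"
      fi
      # Clip the hole so it never extends past the end of the table.
      len=$(( sector + hole_len > total ? total - sector : hole_len ))
      echo "${sector} ${len} error"
      sector=$((sector + len))
      run_start=$sector
    else
      sector=$((sector + 1))
    fi
  done
  # Map whatever remains after the last hole.
  if (( sector > run_start )); then
    echo "${run_start} $((sector - run_start)) linear ${dev} ${run_start}"
  fi
}

# Demo on the first 64 MiB only (131072 sectors) so it finishes quickly:
# about one 8-sector hole per MiB. Pass 1048576 to cover the whole 512 MiB
# loop device; the loop then takes noticeably longer.
random_holes 131072 2048 8
```

Smaller `ODDS` values give more holes, and a larger `HOLE_LEN` makes each one more likely to be hit by a given read or write.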
For added fun, you can just `losetup`, then `mkfs.ext4 /dev/loop0`, and fill it with data. Once you've got a nice working filesystem on there, simply unmount the filesystem, add some holes using dmsetup, and remount that!
To check a program's robustness when its output fails, you can use the pseudo-device `/dev/full`, which always returns ENOSPC when written to.
$ dd if=/dev/zero of=/dev/full
dd: writing to `/dev/full': No space left on device
1+0 records in
0+0 records out
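In a test script, the same idea can be turned into a pass/fail check: point the program's stdout at /dev/full and see whether it exits non-zero. This is only a sketch, and `detects_write_error` is a hypothetical helper name.

```shell
#!/bin/bash
# detects_write_error CMD [ARGS...]
# Run CMD with stdout pointed at /dev/full and succeed only if CMD notices
# the failed write and exits non-zero. (Hypothetical helper for illustration.)
detects_write_error() {
  if "$@" > /dev/full 2> /dev/null; then
    echo "FAIL: '$*' ignored the write error"
    return 1
  fi
  echo "PASS: '$*' reported the write error"
}

# bash's builtin echo returns a non-zero status when its write fails.
detects_write_error echo hello
```

A command that never writes anything (for example `true`) would be reported as FAIL here, so the check is only meaningful for programs that actually produce output.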
Hi, I tried your suggestion and it worked, but it only displays the error message on the command line and not in the dmesg log. Is there a way to make such error messages appear in dmesg or the messages log file? – nick_g Dec 14 '20 at 11:00
nick_g: why wouldn't you want to get the error at the point where the system call is executed, rather than asynchronously and mixed in with a lot of other possible messages from all around your system? Maybe you need to improve the error-handling mechanisms throughout your codebase? – Raúl Salinas-Monteagudo Jan 11 '21 at 13:03
Depends on what you want to test. Using an `LD_PRELOAD`ed library, you can trick applications into thinking things like 'all writes fail with `ENOSPC` or `EIO`', for instance.
You can do that in oh so many interesting ways. See https://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt
Could you highlight the relevant "interesting" ways that are specific to disk requests (`fail_make_request`)? It would also be great to prevent link rot. – Deer Hunter Apr 13 '13 at 03:57
Maybe you could change the partition table and make the partition bigger than it really is. That would probably cause an I/O error. Or, if your disks are hot-pluggable, you could just pull one out.