9

I am going to be testing xfs_repair on some large file systems (around 50 TB), because its memory usage has been high in the past. I could test the program only on intact file systems, but it would be good to test it on a corrupted one as well.

So what would be the best way to corrupt a file system? Extra credit if the method produces the same corruption every time.

To give people an idea of what I mean, this guidance dates from around 2006:

"To successfully check or run repair on a multi-terabyte filesystem, you need:

  • a 64bit machine
  • a 64bit xfs_repair/xfs_check binary
  • ~2GB RAM per terabyte of filesystem
  • 100-200MB of RAM per million inodes in the filesystem.

xfs_repair will usually use less memory than this, but these numbers give you a ballpark figure for what a large filesystem that is > 80% full can require to repair.

FWIW, last time this came up internally, the 29TB filesystem in question took ~75GB of RAM+swap to repair."
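As a rough illustration of the quoted rule of thumb, the arithmetic can be sketched as a small bash function. The function name and the 100 million inode figure are made up for the example, and the result is only a ballpark upper range, not a prediction of what xfs_repair will actually use.

```shell
#!/usr/bin/env bash
# Ballpark RAM estimate from the quoted guidance:
#   ~2 GB per TB of filesystem + 100-200 MB per million inodes.
# estimate_repair_ram_gb is a hypothetical helper, not part of xfsprogs.
estimate_repair_ram_gb() {
    local tb=$1        # filesystem size in TB
    local minodes=$2   # inode count, in millions
    local low_mb=$((  tb * 2000 + minodes * 100 ))
    local high_mb=$(( tb * 2000 + minodes * 200 ))
    echo "$(( low_mb / 1000 ))-$(( high_mb / 1000 )) GB"
}

# 50 TB with an assumed 100 million inodes
estimate_repair_ram_gb 50 100   # prints "110-120 GB"
```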

James

3 Answers

13

xfs_db has a blocktrash command, documented as follows:

Trash randomly selected filesystem metadata blocks. Trashing occurs to randomly selected bits in the chosen blocks. This command is available only in debugging versions of xfs_db. It is useful for testing xfs_repair(8) and xfs_check(8).

For example

xfs_db -x -c blockget -c "blocktrash -s 512109 -n 1000" /dev/xfstest/testfs

James
2

Use dd to write blocks to the device where the filesystem resides. You can script this so that it is repeatable. Overwrite just a few random blocks at random locations, then move on.
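One way to make the dd approach repeatable is to derive the block offsets from a fixed seed. The sketch below is one possible take, assuming bash and GNU coreutils: it hashes "seed:index" with sha256sum to pick each block and overwrites it with 0xFF bytes. The function name corrupt_blocks and the fill pattern are choices made for this example. Try it on a scratch image file before pointing it at a real device.

```shell
#!/usr/bin/env bash
# Overwrite COUNT blocks of TARGET at offsets derived from SEED.
# The same seed, count, block size and target size give the same offsets.
# corrupt_blocks is a hypothetical helper written for this example.
corrupt_blocks() {
    local target=$1 seed=$2 count=$3 bs=${4:-4096}
    local size nblocks i hex blk

    # Size in bytes: blockdev for block devices, stat for image files.
    if [ -b "$target" ]; then
        size=$(blockdev --getsize64 "$target")
    else
        size=$(stat -c %s "$target")
    fi
    nblocks=$(( size / bs ))

    for (( i = 0; i < count; i++ )); do
        # First 15 hex digits (60 bits) of the hash, reduced to a block index.
        hex=$(printf '%s:%s' "$seed" "$i" | sha256sum | cut -c1-15)
        blk=$(( 16#$hex % nblocks ))
        echo "$blk"   # log the block so the run can be audited
        # Fill one block with 0xFF; conv=notrunc keeps an image file's size.
        head -c "$bs" /dev/zero | LC_ALL=C tr '\0' '\377' |
            dd of="$target" bs="$bs" seek="$blk" count=1 \
               iflag=fullblock conv=notrunc status=none
    done
}
```

For example, `corrupt_blocks /path/to/scratch.img 42 20` would overwrite 20 blocks of 4096 bytes each. As the comments point out, on a mostly empty filesystem many of these writes will land in free space, so the count may need to be fairly large.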

Posipiet
  • In a 50TB file system which is mostly empty, surely you would have to be quite lucky to corrupt the system? – James Jul 14 '09 at 07:48
  • Well, you just have to use enough random blocks :-). Either way, a "collision" is probably more likely than you think, due to the Birthday Paradox: http://en.wikipedia.org/wiki/Birthday_Paradox – sleske Jul 14 '09 at 08:51
1

You could try overwriting the first 512 bytes (the MBR and partition table) of the block device.

Back it up first:

dd if=/dev/device bs=512 count=1 of=backup.bin

And zero it later:

dd if=/dev/zero bs=512 count=1 of=/dev/device
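The backup is only useful if it can be written back afterwards. A minimal sketch of the full cycle, with function names invented for this example, could look like the following. The conv=notrunc flag matters when the target is an image file rather than a device, since dd would otherwise truncate the file to 512 bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helpers wrapping the dd commands above.
# $1 = device or image file, $2 = backup file.
backup_first_sector() {
    dd if="$1" of="$2" bs=512 count=1 status=none
}
zero_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc status=none
}
restore_first_sector() {
    dd if="$2" of="$1" bs=512 count=1 conv=notrunc status=none
}
```

A typical run would be `backup_first_sector /dev/device backup.bin`, then `zero_first_sector /dev/device`, then the repair test, and finally `restore_first_sector /dev/device backup.bin` if the original contents are wanted back.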

Your machine should not boot after that; you can then test XFS repair from a live CD.

Karolis T.
  • I want to have a relatively small corruption, as run time and memory usage depend on the number of files and the size of the filesystem – James Jul 14 '09 at 07:57
  • This is just 512 bytes of corruption. It only checks whether the filesystem can recover without any information on what the filesystem should look like, unless xfs has stowed away some spare superblocks somewhere. – towo Jul 14 '09 at 11:41