I have a Linux server with many 2 TB disks, all currently in a single LVM volume group, giving about 10 TB of space. I use all of this space for one ext4 partition, which currently holds about 8.8 TB of data.
The problem is that I often get errors on my disks. I replace them as soon as errors appear (that is, I copy the old disk to a new one with dd, then put the new one in the server), but I still often end up with about 100 MB of corrupted data on the new disk. That makes e2fsck go crazy every time, and it often takes a week to get the ext4 filesystem back into a sane state.
So the question is: what filesystem would you recommend I use on my LVM? Or what would you recommend I do instead (I don't really need the LVM)?
Profile of my filesystem:
- many folders of different total sizes (some totalling 2 TB, some totalling 100 MB)
- almost 200,000 files of different sizes (3/4 of them about 10 MB, 1/4 between 100 MB and 4 GB; I can't currently get more statistics on the files, as my ext4 partition has been completely wrecked for some days)
- many reads but few writes
- and I need fault tolerance (I stopped using mdadm RAID because it does not tolerate even ONE error on a whole disk; I sometimes have failing disks, which I replace as soon as I can, but that means I can still get corrupted data on my filesystem)
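For reference, once the filesystem is readable again, the file-size part of the profile above could be gathered with something like the following. This is only a minimal sketch: the bucket boundaries are my assumption based on the rough figures in the list, and `size_profile` and `bucket_for` are hypothetical helper names, not part of any existing tool.

```python
import os
import stat
from collections import Counter

# Bucket boundaries are assumptions taken from the rough profile above.
BUCKETS = [
    (100 * 1024**2, "under 100 MB"),
    (4 * 1024**3, "100 MB to 4 GB"),
]

def bucket_for(size):
    """Return the label of the first bucket whose upper limit exceeds size."""
    for limit, label in BUCKETS:
        if size < limit:
            return label
    return "4 GB and over"

def size_profile(root):
    """Count regular files under root per size bucket.

    Files that cannot be stat'ed (for example on a damaged filesystem)
    are counted as "unreadable" instead of aborting the walk.
    """
    counts = Counter()
    # onerror swallows directory listing errors so the walk keeps going.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                counts["unreadable"] += 1
                continue
            if stat.S_ISREG(info.st_mode):
                counts[bucket_for(info.st_size)] += 1
    return counts
```

Calling `size_profile("/mnt/data")` (the mount point is a placeholder) returns a `Counter` mapping each bucket label to a file count.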
The major problem is failing disks; I can lose some files, but I can't afford to lose everything at the same time.
If I continue to use ext4, I have heard that it would be best to make smaller filesystems and "merge" them somehow, but I don't know how.
I heard btrfs would be nice, but I can't find any information on how it handles losing part of a disk (or a whole disk) when data is NOT replicated (mkfs.btrfs -d single?).
Any advice on this question is welcome. Thanks in advance!