I run a number of CentOS 6 64-bit servers with ext3/ext4 file systems. As far as I can tell, none of them have been shut down improperly, but all of them have accumulated some file system errors that fsck now reports.
A few drives (not file systems) have I/O errors that will probably lead to hard drive failures (we run RAID 1). Could that be causing the file system errors? I wouldn't expect device-level errors to be allowed to propagate up to the file system.
At least one server shows no signs of hard drive failure but still has fsck errors.
So, do ext3/4 file systems accumulate errors naturally over time or is something bad going on?
Why would you think an I/O error wouldn't interact with a file system error? If the I/O error occurs while reading a file, what do you think the file system will do? It's going to report an error if it can't read the file, no matter the cause. – djsmiley2k TMW – 2017-01-16T16:16:56.923
Without more details it's difficult to say exactly what happened. ext3 is quite mature; I haven't seen a file system accumulate errors naturally through use in years. Unrecoverable I/O errors (unlikely with RAID 1) will lead to FS errors if they hit the FS structures themselves. If RAID 1 somehow botches error recovery (I don't have personal experience with that), that could also lead to FS errors. I'd look closely at which blocks had errors, how the RAID behaved, and which blocks led to FS errors. – dirkt – 2017-01-16T16:19:37.633
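As a rough starting point for "which blocks had errors", the sketch below pulls device names and sector numbers out of kernel log text. It assumes the `I/O error, dev <name>, sector <n>` message form that the block layer prints; the sample log is illustrative, not output from the servers in question, and `failing_sectors` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Illustrative sample only; real input would be the saved output of dmesg.
SAMPLE_LOG = """\
end_request: I/O error, dev sdb, sector 1953519880
end_request: I/O error, dev sdb, sector 1953519888
EXT4-fs (md0): mounted filesystem with ordered data mode
end_request: I/O error, dev sdb, sector 1953519880
"""

# Matches the device name and sector number in a block-layer I/O error line.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def failing_sectors(log_text):
    """Map each device to the sorted list of distinct sectors with I/O errors."""
    sectors = defaultdict(set)
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            sectors[match.group(1)].add(int(match.group(2)))
    return {dev: sorted(found) for dev, found in sectors.items()}
```

Calling `failing_sectors(SAMPLE_LOG)` groups the repeated errors per device, which makes it easier to compare the failing sectors against the blocks fsck complains about.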
Thanks for the replies, @djsmiley2k, @dirkt. The IO errors reported by dmesg are at the device level, and only on one device, so I figured raid1 would do the right thing from the good device. Also, at least one server doesn't have any drive errors but does have file system errors. – Shovas – 2017-01-16T16:39:24.657
So I presume you're using mdadm or some software raid, not hardware raid? – djsmiley2k TMW – 2017-01-16T16:43:08.660
@djsmiley2k Yes, mdadm software raid1 mirror. – Shovas – 2017-01-16T16:50:33.740
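With mdadm in play, one way to see how the mirror itself is behaving is to check whether any array is running degraded. The following is a minimal sketch that parses text in the format of `/proc/mdstat`; the sample content and the `degraded_arrays` name are assumptions for illustration, and on a real server the text would come from reading `/proc/mdstat`.

```python
import re

# Illustrative /proc/mdstat-style text; not taken from the servers discussed.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0](F)
      976630336 blocks super 1.1 [2/1] [_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: True if fewer members are active than configured}."""
    results = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like "[2/1] [_U]": configured/active counts, then one
        # character per member (U = up, _ = missing or failed).
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            configured, active = int(status.group(1)), int(status.group(2))
            results[current] = active < configured
            current = None
    return results
```

For the sample above, `degraded_arrays(SAMPLE_MDSTAT)` flags `md1` and not `md0`. A healthy `[UU]` status on a server that still has fsck errors would suggest looking beyond a simple failed member.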