
I have been having electrical problems for a few months, mostly the power going out suddenly. A UPS mostly solves that, but I am still worried about filesystem corruption and data loss.

Is XFS worse or less reliable than ext3 when the computer loses power, crashes, or hits other filesystem problems?

Is having a UPS and a good backup strategy (I have a 1.5 TB disk that I want to use to back up all the critical data) enough, so that I shouldn't worry?
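For the backup part, this is roughly what I am planning; only a sketch, with placeholder paths, assuming the 1.5 TB disk is mounted at /mnt/backup:

```
# Hypothetical layout: the 1.5 TB backup disk mounted at /mnt/backup.
# -a preserves permissions and timestamps; --delete mirrors removals.
rsync -a --delete /home/ /mnt/backup/home/
rsync -a --delete /etc/  /mnt/backup/etc/
```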

I have read that XFS zeroes out data when the power goes off (although I think this has been fixed), and similar claims that XFS is not safe against data corruption.

With write barriers enabled and XFS tuned correctly, plus a UPS and backups, could XFS be as reliable as ext3, or at least acceptably close?
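To be concrete about what I mean by enabling barriers, here is a sketch of the /etc/fstab entries I have in mind (device names are placeholders, and I understand barriers are already the default for XFS on recent kernels):

```
# /etc/fstab sketch -- device names are placeholders.
# 'barrier' is the XFS default on recent kernels; spelled out for clarity.
/dev/sda2  /      xfs  defaults,barrier  0  0
/dev/sda3  /home  xfs  defaults,barrier  0  0
```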

If I use XFS for / and /home to get more performance (mostly with large files), will I be risking my data more than with ext3?

– Abel Coto (edited by mattdm)

2 Answers


My personal experience with XFS has been that its fsck is not as good as ext3's. We ran our mirror server on XFS, with around 3TB of space for the Linux distros and the like that we mirror. At one point it developed a problem (we didn't have a power failure; IIRC it just started reporting errors), so on reboot it wanted to do an fsck. However, the fsck needed more RAM than the 2GB we had in the system, so it started swapping, and after 3 days it ran out of memory. I maxed out the box at 3GB and it was then able to complete the fsck fairly quickly. I know 3GB isn't much RAM these days, but at the time that was a pretty sizable box.

I also tried XFS on my laptop for a while. This is closer to your "power failure" situation, because I was having problems with the laptop locking up and had to hard power cycle it fairly often. I ran into several cases where files I had been working on before a crash reverted to copies several hours old: I would edit a file and save it several times while working on it, then the system would lock up and I'd be back to a version from hours earlier.

Because of these issues, I tend to avoid XFS. It seems the XFS fsck isn't as mature as ext3's, probably because XFS almost never has to fsck while ext2/3/4 do so regularly.

But I'll admit that these experiences were probably 5 years ago. Hopefully things are better now. Just thought I'd pass along my experience.

In retrospect, I realize that ext3 at that time also had corruption issues. But I run hundreds of ext3-based servers now and can't remember the last time a hard power cycle caused corruption.

Really, what soured me on XFS was the fsck taking so much RAM and thrashing the system. Our mirror server could afford to be down; for a business server, knowing that an fsck would take a while, but not days, would have been critical.

– Sean Reifschneider
  • Thanks, I will look into xfs fsck a bit, to see whether the same thing still happens nowadays. – Abel Coto Nov 28 '10 at 22:39
  • The new xfs_repair (2008 or so) rocks. I had a rather old server fsck a 10TB volume on a SATA RAID in about 10 minutes. – Ryan Bair Nov 29 '10 at 01:00
  • Nice pointer, thanks for the information EvilRyry. – Sean Reifschneider Nov 29 '10 at 05:25
  • If xfs_repair is taking too much memory, try passing something like `-o bhash=1024` per "xfs_repair sometimes hangs during repair" - http://oss.sgi.com/archives/xfs/2008-03/msg01014.html . See also http://blog.jcuff.net/2011/10/try-not-to-be-so-worried-about-xfs-100t.html and https://plus.google.com/107770072576338242009/posts/KboCG9XPAXN . I found this answer after searching for "Is XFS reliable enough for production?" :) – Philip Durbin Jun 09 '12 at 11:33
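Building on that comment, a hedged sketch of how one might apply it (the device name is a placeholder; run it on an unmounted filesystem, and check `man xfs_repair` for your version, since the memory-related options have changed over the years):

```
# Filesystem must be unmounted; /dev/sdb1 is a placeholder device.
# Dry run first: report problems without changing anything.
xfs_repair -n /dev/sdb1

# The bhash tweak from the linked thread, for memory-constrained boxes;
# it trades repair speed for a smaller buffer cache.
xfs_repair -o bhash=1024 /dev/sdb1
```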

Nobody can answer this question better than yourself.

In other words: test it.

It's not hard to do: take a typical machine, make it perform a load similar to what you want to do (in my case it was copying small files between two SAN volumes), and while it's under heavy load, make it fail. Try every failure you can imagine (in my case, mostly pulling the plug on one volume, on the other, on the server, and on the SAN switch).

Repeat with all candidate filesystems; in my case those were ext3, XFS, ReiserFS, and JFS. Today I would test ext4 and btrfs instead of ReiserFS and JFS. A bare-bones sketch of the kind of harness I mean follows.
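This is only a sketch, with placeholder paths and a placeholder file count; the "make it fail" step is literally pulling the plug while the copy runs:

```
#!/bin/sh
# Crash-test harness sketch. SRC, DST and N are placeholders for your
# volumes and file count. Create small files with known checksums,
# copy them, cut power mid-copy, then count the damage after reboot.
SRC=/mnt/vol1/testdata
DST=/mnt/vol2/testdata
N=100000

mkdir -p "$SRC" "$DST"
i=1
while [ "$i" -le "$N" ]; do
    head -c 4096 /dev/urandom > "$SRC/f$i"
    i=$((i + 1))
done

# Record checksums of the source population.
( cd "$SRC" && find . -type f | xargs md5sum ) > /root/manifest.md5

cp -a "$SRC/." "$DST/"    # <-- pull the plug while this is running

# After reboot and remount, count lost or corrupted files:
( cd "$DST" && md5sum -c /root/manifest.md5 2>/dev/null | grep -c 'FAILED' )
```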

What I found is that ext3 lost around 5-10 files out of every million, XFS around 5-30 per million, and both ReiserFS and JFS lost several hundred per million, up to a thousand in at least one case.

So in my test case, yes: ext3 was the most resilient filesystem, but XFS wasn't as far behind as I had feared. And given that I was approaching ext3's 8TB limit, the clear answer was XFS.

I plan to use the slow holiday season to repeat the tests with more modern filesystems. I have high hopes for ext4, but I won't bet my data on it until I see how it performs under real failures. btrfs will be a fun test, but I don't think it's mature enough yet.

– Javier