
There was a power failure recently that took down one of my servers. On reboot, the main storage volume - a 7TB JFS filesystem on a 9x1TB RAID6 array - needed an fsck before it could be mounted read-write. After starting the fsck, I watched it for a while in top: memory usage rose steadily (but not too rapidly), and CPU usage was pegged at or near 100%.

Now, about 12 hours in, the fsck process has consumed almost 94% of the system's 4GB of memory and its CPU usage has dropped to around 2%. The process is still running, and gives no indication of how much longer it will take.

First off: is this indicative of a problem? The dramatic drop in CPU usage worries me - it looks as though the process has become memory-bound, and the fsck will take forever to complete because it's spending all its time swapping. (I noticed that kswapd0 is floating uncomfortably close to the top of the list in top, actually beating out the fsck process for CPU usage more than half the time.) If this isn't the case - if fsck just slows down CPU-wise near the end of its run - that's fine; I just need to know that.
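One minimal way to confirm whether a Linux box is actively swapping (a sketch, assuming `/proc/meminfo` is available; watching the si/so columns of `vmstat 5` over time tells the same story):

```shell
# Print swap totals from /proc/meminfo; if SwapFree keeps shrinking while
# the fsck's CPU usage stays low, the process is likely thrashing in swap.
awk '/^SwapTotal:|^SwapFree:/ {printf "%s %d MB\n", $1, $2/1024}' /proc/meminfo
```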

If this is a problem, what can I do to improve fsck performance? I'm open to almost anything, up to and including "buy more memory for the system."

The relevant line from top:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
 5201 root      20   0 58.1g 3.6g  128 D    2 93.8   1071:27 fsck.jfs

And the result of free -m:

             total       used       free     shared    buffers     cached 
Mem:          3959       3932         26          0          0          6 
-/+ buffers/cache:       3925         33 
Swap:          964        482        482
Cristian Ciupitu
Tim
  • Buy a UPS and/or generator for the server so it doesn't happen again? – Andrew Jun 25 '10 at 06:45
  • I've definitely thought about that :) the issue there is that I would have to power down the server again to switch its power source over to the UPS, which would kill the fsck if it's still running. – Tim Jun 25 '10 at 06:48

2 Answers


Correct me if I'm wrong, but JFS is not a full journaling file system: it only journals metadata. This means the fsck command will take a long time to complete if you have lots of data.

I suggest you investigate switching to a fully journaled file system (ext3/ext4): that should remove the need to run the command after an abrupt failure.
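For reference, ext3/ext4 also journal only metadata by default (data=ordered); full data journaling is opt-in via a mount option. A hypothetical sketch - the device and mount point names are placeholders, not from the original post, and mkfs is destructive:

```shell
# Example only: /dev/md0 and /mnt/storage are placeholder names.
mkfs.ext4 /dev/md0                            # create the ext4 filesystem (destroys existing data)
mount -o data=journal /dev/md0 /mnt/storage   # mount with full data journaling
```

With data=journal, file data as well as metadata goes through the journal, at a write-performance cost.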

Stephane
  • What filesystem might you recommend? From the looks of the Wiki articles, XFS does the same thing and journals metadata only. I'm hesitant about both Reiser3 and 4, for various reasons, and I dunno about ext3's performance over such a large filesystem. Are either btrfs or ext4 stable enough yet for large-system (read: up to 15TB) production use? – Tim Jun 25 '10 at 19:51
  • I don't have enough experience with various FS and this kind of data size so I'll refrain from recommending anything. Your problem, however, is supposed to be solved by the use of a fully journaling file system. Of course, this comes with a performance cost as well so you might want to run your own tests anyway to see if you can meet your requirements. – Stephane Jun 28 '10 at 15:05

Based on the virtual memory usage, I figured it'd be impossible to run a full fsck on the volume in any reasonable amount of time (even with extra RAM), so I backed up all the files on the volume and reformatted with XFS.
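For anyone taking the same route, a hypothetical outline of that migration (device names, mount points, and the use of rsync are illustrative assumptions, not details from the original post; mkfs.xfs is destructive):

```shell
# Example only: /dev/md0, /mnt/storage, and /mnt/backup are placeholders.
rsync -aHAX /mnt/storage/ /mnt/backup/   # copy everything off the JFS volume
umount /mnt/storage
mkfs.xfs -f /dev/md0                     # reformat the array as XFS (destroys existing data)
mount /dev/md0 /mnt/storage
rsync -aHAX /mnt/backup/ /mnt/storage/   # restore the files
```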

Tim
  • By the way, `xfs_check` tends to use a lot of memory too. On a 2TB filesystem (with 16489 inodes), top reports that it uses 2.9GB of memory. – Cristian Ciupitu Jul 31 '10 at 13:32
  • That's why it's generally advised to keep several small filesystems instead of a single large one. – cnst Jun 29 '14 at 18:40