
I have a client on cloud hosting with some sizable drives on the VPS. Twice in the past year, after a forced reboot, the server was down for a few hours because fsck ran automatically when the server came back up. Since this is on a cloud hosting platform, I assume there are other integrity checks performed on the disks at the hardware level. The host indicated that there are differences between an integrity check at the infrastructure level and one at the OS level, and strongly recommended that fsck not be disabled. However, the client is concerned because extended downtime results in lost customers and revenue.

It looks as if I can disable those checks by running the following commands:

tune2fs -c 0 -i 0 /dev/mapper/vgArchiveStorage-archive
tune2fs -c 0 -i 0 /dev/mapper/vg_maindisc-lv_root
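
For context, the current settings can be inspected with tune2fs -l before changing anything, so they could be restored later (same device paths as above):

# show the mount-count and time-based check settings on each volume
tune2fs -l /dev/mapper/vgArchiveStorage-archive | grep -E 'Mount count|Maximum mount count|Check interval|Next check after'
tune2fs -l /dev/mapper/vg_maindisc-lv_root | grep -E 'Mount count|Maximum mount count|Check interval|Next check after'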

Would this fix the issue, and if so, how bad an idea is it? Is there a way to run similar periodic checks without bringing the whole server offline? (Both disks are integral to the operation of his website.)

1 Answer


In short: if your host says you shouldn't do it, don't do it, unless you want even longer downtime because you have to restore the server from backup.

It is usually a bad idea to go against the system defaults when it comes to integrity checks, especially for filesystems. I would therefore not change the fsck settings.

If the customer does not want to be down for longer periods, set up a failover system. In this case you can't have your cake and eat it too. Think of it as the classic triangle where you can only pick two of the corners: "safe", "cheap", and "little downtime". A system that cannot tolerate downtime needs redundancy anyway, to allow for maintenance.
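
As an aside, since both devices appear to be LVM logical volumes, one common way to run a periodic check without taking the server offline is to run the check against a read-only snapshot instead of the live filesystem (this is essentially what e2scrub from e2fsprogs automates). A rough sketch, assuming the volume group has some free space for a snapshot and that /dev/mapper/vgArchiveStorage-archive corresponds to /dev/vgArchiveStorage/archive:

# create a temporary snapshot of the live volume (1G is just example headroom)
lvcreate --snapshot --size 1G --name archive_check /dev/vgArchiveStorage/archive
# check the snapshot without modifying it; -f forces the check, -n answers "no" to any repair
e2fsck -fn /dev/vgArchiveStorage/archive_check
# remove the snapshot when done
lvremove -f /dev/vgArchiveStorage/archive_check

This only tells you whether the filesystem is clean; an actual repair still requires the volume to be unmounted, but it at least lets you schedule that downtime instead of being surprised at boot.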

Christopher Perrin
  • Assume you were running an unmanaged VPS you had set up on Amazon AWS and you do not have a host to advise you. This is not a bare metal server, and Amazon has their own integrity checks that they run on their disks. If a disk goes bad, it's not your disk that goes bad; it's a disk that serves an unknown number of clients, each with their own hardware abstraction layer, and if Amazon needs to replace a disk you might never know due to the nature of their redundancy. In this scenario would you still need to have `fsck` run every x number of reboots? If so, why? – Michael VanDeMar Dec 20 '18 at 21:36