I'm developing for a headless embedded appliance running CentOS 6.2. The user can connect a keyboard but not a monitor, and a serial console would require opening the case, which we don't want the user to have to do. This pretty much rules out booting from a recovery USB drive, unless all it does is blindly reimage the hard drive. I would like to provide some recovery facilities, so I have written a tool that comes up on /dev/tty1 in place of getty to provide these functions.

One such function is fsck. I have found out how to remount the root and other file systems read-only. Now that they are read-only, it should be safe to fsck them and then reboot. Unfortunately, fsck complains to me that the filesystems are mounted and refuses to do anything.

How can I force fsck to run on a read-only mounted partition?

Based on my research, this is going to have to be something obscure. "-f" just means to force repair of a clean (but unmounted) partition. I need to repair a clean or unclean mounted partition. From what I read, this is something "only experts" should do, but no one has bothered to explain how the experts do it. I'm hoping someone can reveal this to me.

BTW, I've noticed that e2fsck 1.42.4 on Gentoo will let you fsck a mounted partition, even one mounted read-write, but apparently only when it is run from a terminal, so that it can ask the user whether they're sure they want to do something so dangerous. I'm not sure if the CentOS version behaves the same way. So it appears that fsck CAN repair a mounted partition; it just flatly refuses to when not run from a terminal.

One last-resort option is for me to compile my own hacked fsck. But I'm afraid I'll mess it up in some unexpected way.

Thanks!

Note: Originally posted here.

Update: I didn't think it would matter at the time I wrote this, but in order to remount the fs read-only, I had to do this:

echo s > /proc/sysrq-trigger
echo s > /proc/sysrq-trigger
echo u > /proc/sysrq-trigger

That was the only way I could find to do this. Everything else complained about the file system being busy. As far as I know, this is 'safe', but it probably remounts a bit differently from the usual approach. And this may be a reason why fsck doesn't want to repair it. It still thinks it's mounted read-write.
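One way to test that last suspicion is to ask the kernel directly. On a system like this, /etc/mtab is normally an ordinary file maintained by mount(8), so it can be stale after a SysRq remount, whereas /proc/mounts reflects the kernel's current view. The following is a minimal sketch, not a definitive tool; `mount_mode` is a hypothetical helper name, and the optional second argument exists only so the parser can be pointed at a different mounts table.

```shell
# Print "ro" or "rw" for a mount point, based on a mounts table in the
# /proc/mounts format (device, mount point, type, options, ...).
# Usage: mount_mode MOUNTPOINT [MOUNTS_FILE]
# Returns non-zero if the mount point is not listed.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            # The fourth field is a comma-separated option list.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw")
                    mode = opts[i]   # the last matching entry wins
        }
        END {
            if (mode != "") print mode
            else exit 1
        }
    ' "${2:-/proc/mounts}"
}
```

Running `mount_mode /` after the SysRq sequence shows what the kernel reports for the root filesystem; `mount_mode / /etc/mtab` shows what the userspace table says, and a mismatch between the two would support the theory above.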

Mark Henderson
Timothy Miller
  • Voted to close as off-topic, should probably be on [unix.stackexchange.com](http://unix.stackexchange.com). – Jimmy Sawczuk Jul 05 '12 at 14:25
  • Even `fsck` on a R/O partition is dangerous, since it can change the structures under the OS. – Ignacio Vazquez-Abrams Jul 05 '12 at 15:01
  • It's not THAT dangerous, if you intend to reboot immediately when it's done. –  Jul 05 '12 at 16:20
  • Why don't you reboot and force fsck? The filesystem wouldn't be mounted at all: `shutdown -rF now` (but be careful: if anything goes wrong, the OS will drop you to single-user mode). –  Dec 28 '12 at 12:41

3 Answers

6

You can fsck a read-only filesystem, because mounting read-only doesn't mark it as "dirty" the way read-write mounting does. There are no changes sitting in a write cache that might be only partially flushed to disk, so all the on-disk structures are consistent and safe for fsck to modify.

However, if fsck makes any changes, the kernel's filesystem driver might become confused, because things that it expected to remain constant have instead changed out from under it. This won't affect the integrity of the filesystem itself (since the driver isn't writing to it), but it may make the running system unstable. To avoid that, you should reboot if fsck made any changes to your filesystem.
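Whether fsck made changes can be read from its exit status, which is a bitmask: bit 1 means errors were corrected and bit 2 means the system should be rebooted. A minimal sketch of that check follows; `fsck_modified_fs` is a hypothetical helper name, and `/dev/sdXN` in the comment is a placeholder, not a real device.

```shell
# Succeed (return 0) if an fsck exit status indicates that the filesystem
# was modified, i.e. bit 1 (errors corrected) or bit 2 (reboot needed) is set.
# Usage: fsck_modified_fs STATUS
fsck_modified_fs() {
    [ $(( $1 & 3 )) -ne 0 ]
}

# Example use (device name is a placeholder):
#   e2fsck -p /dev/sdXN
#   fsck_modified_fs $? && reboot
```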

Wyzard
  • I do intend to reboot immediately afterwards. The problem is getting the fsck to start in the first place. – Timothy Miller Jul 06 '12 at 14:05
  • you can, but fsck will call you a BONEHEAD. no, I'm not making that up, check the source. It literally calls you a BONEHEAD. – stew Jul 06 '12 at 14:12
  • Interesting, e2fsck actually refuses to run on a read-only mounted filesystem even if `-f` is specified. I hadn't noticed that, and I'm pretty sure I've done it in the past. And most systems have startup scripts that check the root filesystem early in the boot process while it's still mounted read-only, so it's allowed in that case... – Wyzard Jul 06 '12 at 23:42
  • @Wyzard, works for me http://pastie.org/6365035 – poige Mar 02 '13 at 11:20
3

Having been on an "appliance" type project in the past, I've done a few things which partially work around this sort of problem.

One appliance had enough memory, so the root filesystem ran directly from initrd. The initrd had enough to fsck (force), then mount "/mounts/persistent" and "/mount/static"; almost all files needed after that were on one of these two filesystems.

This had the advantage that the root filesystem never needed "fixing" - if anything went wrong, it would reboot, and the initrd came up clean (since the one being used was not the one on disk). Any updates to the initrd were just put in place (with previous ones being available, for booting); any files not on the original "static" needed after a "firmware upgrade" (=new initrd) went on the initrd from then on. The "static" filesystem was read-only in any case. Only the persistent filesystem needed to be backed up, and the "current firmware version". I had copies of all the firmwares before they were sent out.

0

Have you tried the -p or -y switches? I always do that on a headless Debian machine and it works.

From fsck.ext2 man page:

   -p     Automatically repair ("preen") the  file  system.   This  option
          will  cause  e2fsck to automatically fix any filesystem problems
          that can be safely fixed without human intervention.  If  e2fsck
          discovers  a  problem which may require the system administrator
          to take  additional  corrective  action,  e2fsck  will  print  a
          description  of the problem and then exit with the value 4 logi-
          cally or'ed into the exit code.  (See the  EXIT  CODE  section.)
          This  option  is normally used by the system's boot scripts.  It
          may not be specified at the same time as the -n or -y options.

   -y     Assume  an answer of `yes' to all questions; allows e2fsck to be
          used non-interactively.  This option may not be specified at the
          same time as the -n or -p options.

Remember that you have to reboot before remounting read-write!
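Because the exit code is a bitmask (the excerpt above mentions the value 4 being or'ed in), an unattended tool has to test individual bits rather than compare against single values. The sketch below translates a status into one message per set bit; `describe_fsck_status` is a hypothetical helper name, and the messages paraphrase the EXIT CODE section of the e2fsck man page.

```shell
# Print one line per bit set in an e2fsck exit status.
# Usage: describe_fsck_status STATUS
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $(( status & 1 )) -ne 0 ]; then echo "errors corrected"; fi
    if [ $(( status & 2 )) -ne 0 ]; then echo "errors corrected, reboot required"; fi
    if [ $(( status & 4 )) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $(( status & 8 )) -ne 0 ]; then echo "operational error"; fi
    if [ $(( status & 16 )) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $(( status & 32 )) -ne 0 ]; then echo "cancelled by user request"; fi
    if [ $(( status & 128 )) -ne 0 ]; then echo "shared library error"; fi
    return 0
}
```

For example, a status of 5 yields both "errors corrected" and "errors left uncorrected", which is the case where -p gave up and someone still has to intervene.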

mmoya
xOneca