13

Ubuntu 12.04

File system goes to read-only mode frequently. First of all, I have already read the question "file system is going into read only mode frequently". But I need to know whether this could be caused by something other than a dying hard drive. This is a server provided by my client; I am just running some node.js workers and one node.js server on it, and I am using mongodb.

From time to time (every 20-50 h) the system suddenly makes the file system read-only, the mongodb process fails (due to the read-only FS), and my node workers/server (which are started by forever) are simply killed.

Here is the log from dmesg. I can see some errors there, messages that the FS is going read-only, and also a JOURNAL error, but I would like to find the cause of those errors.

http://speedy.sh/Ux2VV/dmesg.log.txt


edit

smartctl -t long /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-23-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

SMART support is: Unavailable - device lacks SMART capability.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

What am I doing wrong? The same happens for sda2.

Moreover, whenever I now type a command that does not exist in the shell, I get this:

Sorry, command-not-found has crashed! Please file a bug report at:
https://bugs.launchpad.net/command-not-found/+filebug
Please include the following information with the report:

edit2

I just got word that this server is actually a VPS. They told me that the hard drives are OK and are on RAID 10, and that "forcing fsck in fstab should help"...


edit3

Here is the output from the mount command:

/dev/sda2 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /media/psf type prl_fs (rw,nosuid,nodev,sync,noatime,share,_netdev)

So there is actually no sda drive? Only sda2?


edit4

Output from the fsck -N command:

root@ubuntu:~# fsck -N sda
fsck from util-linux 2.20.1
[/sbin/fsck.ext4 (1) -- /] fsck.ext4 sda /dev/sda2 
user606521
  • I am having the same issue. My Ubuntu machine runs a NodeJS app, MongoDB, Chrome, VSCode, Robomongo, the tilix terminal, Mattermost, Thunderbird and Postman daily. – Ankur Loriya Dec 21 '18 at 07:28

6 Answers

9
[26729.124569] Write(10): 2a 00 03 96 5a b0 00 00 08 00
[26729.124576] end_request: I/O error, dev sda, sector 60185264
[26729.125298] Buffer I/O error on device sda2, logical block 4593494
[26729.125986] lost page write due to I/O error on sda2

For me, that's pretty strong evidence that your /dev/sda is on its way out. You could run a smartctl test on it for confirmation (smartctl -t long /dev/sda), but I'd be inclined to replace it as soon as possible.

Edit: the smartctl command I gave is correct as written. Thanks for showing the failure mode in your question; this looks like either you have very old hardware, or there's some kind of translation layer in the way: either virtualisation, or a hardware RAID controller. Can you clarify?

May I repeat my assertion that your HDD is on its way out? Testing's all very well, but getting the hardware replaced before your system packs up and your data are lost should be your priority now. Please, at the very least make sure that your backups are completely up-to-date before wasting any more time on smartctl.

Edit 2: it's certainly worth trying what they've suggested - fscking the file system - but I have little hope that that will fix the problem because your FS isn't dropping to ro mode because of FS inconsistencies, it's dropping to ro mode because of problems talking to the underlying hardware.
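For reference, "forcing fsck in fstab" most likely refers to the sixth field of each /etc/fstab entry (the fsck pass number): 0 means the file system is never checked at boot, and 1 is the usual value for the root file system. The sketch below only reads that field; the helper name fstab_passno is made up for illustration, and the commented-out /forcefsck line assumes the boot scripts of this Ubuntu release honour that flag file.

```shell
#!/bin/sh
# Print the fsck pass number (sixth fstab field) for a mount point.
# fstab_passno is a hypothetical helper name, not a standard tool.
# A missing sixth field is reported as 0 (the documented default).
fstab_passno() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print ($6 == "" ? 0 : $6) }' "${2:-/etc/fstab}"
}

# Show the pass number for the root file system, if fstab is readable.
[ -r /etc/fstab ] && fstab_passno /

# Assumption: on this release, creating /forcefsck requests a one-off
# check at the next boot. Left commented out because it needs root.
# sudo touch /forcefsck
```

A result of 0 for / would mean the root file system is never checked at boot.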

If they have confidence that the underlying hardware is fine, then it's an issue between the kernel and the hardware, ie, the virtualisation layer. You should probably get your VPS provider to confirm that the distro, and the exact kernel version, that you're running are fully supported on their VPS system.

MadHatter
2

I have had this problem on my computer for over a year and tried everything to solve it. Linux suddenly goes into read-only mode. If you are editing something, you cannot save it and have to run fsck and reboot the computer. The computer is also very slow and freezes all the time. I removed the dual boot and left only Ubuntu, then upgraded Ubuntu from 18.04 LTS to 20.04 LTS, and neither helped. What was crucial to solving the problem was the dmesg command, which prints the kernel's log messages. Trial and error got me nowhere; only this command did.

In my case, the problem was an incompatibility between the SSD and Ubuntu. I had been using an HDD, and the problem appeared after I switched to an SSD. It was solved by updating the SSD firmware, which was only possible from a Windows partition, because Kingston does not provide a firmware update tool for Linux. I also set up dual boot with Windows and Linux again (first installing Windows over the entire SSD, then freeing up space from within Windows and installing Ubuntu), but it is very unlikely that this was what solved the problem.

2

A better way to find the exact error may be to run dmesg while the file system is read-only and look for any errors. You may also try running fsck in dry-run mode to find out what the issue is. (Sorry, due to access restrictions I am unable to view your attachment. If it was captured during the issue period, I will check it later.)
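A rough sketch of both suggestions follows. The grep pattern list is a guess at the messages that typically accompany a read-only remount, not an exhaustive catalogue, and io_error_lines is a made-up helper name. The dry run is left commented out because it needs root, and results from checking a mounted file system can be misleading.

```shell
#!/bin/sh
# Keep only kernel log lines that commonly accompany a read-only remount.
# The pattern list is an assumption; extend it as needed.
io_error_lines() {
    grep -iE 'I/O error|remount|journal|EXT4-fs error'
}

# Filter the current kernel log (may require root on some systems).
dmesg | io_error_lines || true

# Dry run: -n opens the file system read-only and answers "no" to all
# prompts. /dev/sda2 is the root device from the question's mount output.
# fsck.ext4 -n /dev/sda2
```

Lines such as "end_request: I/O error, dev sda" point at the block device rather than at file system inconsistencies.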

rootslash
2

I faced the same issue, where the server's FS was going read-only. Check the inodes; they may well be full:

df -i
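To avoid eyeballing the IUse% column, the output can be filtered. This is only a sketch: inodes_over is a made-up helper name, the 90% threshold is arbitrary, and mount points containing spaces are not handled.

```shell
#!/bin/sh
# Print mount points whose inode usage (IUse%) is at or above a threshold.
# Reads `df -iP`-style output on stdin: header line first, then
# Filesystem, Inodes, IUsed, IFree, IUse%, Mounted on.
# inodes_over is a hypothetical helper; 90 is an arbitrary default.
inodes_over() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 && $5 != "-" {
            use = $5
            sub(/%/, "", use)          # strip the percent sign
            if (use + 0 >= t) print $6, $5
        }'
}

# Check the live system.
df -iP | inodes_over 90
```

No output means no file system is at or above the threshold.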

0

I faced the same issue after installing wine. I removed Ubuntu 16.04 and tried Ubuntu 18.04 and Ubuntu 20.04, but with no luck; the issue kept coming back. Then I replaced my hard drive, and the issue was fixed.

0

If this were a physical machine, I would suspect a dying hard drive. If sda is a RAID, then it makes sense for it not to support smartctl, and you should check the RAID logs next. If this were a physical drive that was supposed to support smartctl, then this message would mean that the drive controller and/or media is failing so badly that smartctl no longer works and the drive needs to be replaced immediately.

Considering that this is a VPS, the I/O error on the drive might mean that the network connection between the virtual machine and the software-defined storage drive was interrupted by something. This may be a "harmless" transient condition with no danger of the drive dying on you, but it is a sign of poor-quality service from the provider, especially if it is causing crashes.
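Because the root file system in the question is mounted with errors=remount-ro, even a transient interruption like this would flip it to read-only. A small check along the following lines could be run from cron or a monitoring agent to catch that early. It is only a sketch, and is_mounted_ro is a made-up helper name.

```shell
#!/bin/sh
# Succeed (exit 0) if the given mount point currently carries the "ro"
# option. Reads /proc/mounts-style text on stdin:
#   device mountpoint fstype options dump pass
# If a mount point appears more than once, the last entry wins.
# is_mounted_ro is a hypothetical helper name.
is_mounted_ro() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }
        END {
            # Wrap in commas so "ro" matches only as a whole option,
            # not as part of "errors=remount-ro".
            if (("," opts ",") ~ /,ro,/) exit 0
            exit 1
        }'
}

if is_mounted_ro / < /proc/mounts; then
    echo "root is read-only"
else
    echo "root is writable"
fi
```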

user10489